UCSY's Research Repository

Myanmar Language Continuous Speech Recognition Using Convolutional Neural Network (CNN)

dc.contributor.author Mon, Aye Nyein
dc.date.accessioned 2019-09-23T05:14:17Z
dc.date.available 2019-09-23T05:14:17Z
dc.date.issued 2019-01
dc.identifier.uri http://onlineresource.ucsy.edu.mm/handle/123456789/2255
dc.description.abstract Researchers in many nations have developed automatic speech recognition (ASR) systems for their own languages to demonstrate national progress in information and communication technology. This dissertation aims to develop a good-quality automatic speech recognition system for read Myanmar speech. Myanmar is considered a low-resourced language, and no speech corpus for it is available, either freely or commercially, for ASR research. Therefore, a speech corpus named “University of Computer Studies Yangon - Speech Corpus (UCSY-SC1)”, which is essential for Myanmar ASR research, is constructed. The corpus covers two domains: web news and daily conversations. The news is collected from the Internet, and we recorded the conversational data ourselves. This corpus is used to build the Myanmar ASR system. Myanmar is a tonal language, in which different tones convey different meanings. Therefore, as in other tonal languages such as Mandarin, Vietnamese and Thai, tone information plays a significant role in improving Myanmar ASR performance. Moreover, the syllable is the basic unit of the Myanmar language. Thus, this work explores the effect of tones on both syllable-based and word-based ASR models and also compares the two. The Myanmar ASR system is built with a state-of-the-art acoustic model, the Convolutional Neural Network (CNN). Under low-resourced conditions, a CNN is better than a Deep Neural Network (DNN) because the fully connected nature of the DNN can cause overfitting, which degrades ASR performance for low-resourced languages with limited training data. A CNN can alleviate this problem, which makes it very useful for a low-resourced language such as Myanmar. Furthermore, a CNN can model tone patterns well because it can reduce spectral variations and model the spectral correlations present in the signal. In this task, the CNN outperformed both the DNN and the Gaussian Mixture Model (GMM)-Hidden Markov Model (HMM), and the best accuracy in Myanmar ASR is achieved with the CNN-based model. en_US
dc.language.iso en_US en_US
dc.publisher University of Computer Studies, Yangon en_US
dc.title Myanmar Language Continuous Speech Recognition Using Convolutional Neural Network (CNN) en_US
dc.type Thesis en_US

