UCSY's Research Repository

Joint Word Segmentation and Part-of-Speech Tagging for Myanmar Language

dc.contributor.author Cing, Dim Lam
dc.contributor.author Soe, Khin Mar
dc.date.accessioned 2020-08-08T05:49:56Z
dc.date.available 2020-08-08T05:49:56Z
dc.date.issued 2020-08
dc.identifier.uri http://onlineresource.ucsy.edu.mm/handle/123456789/2530
dc.description.abstract Word segmentation and POS tagging are active research areas, and the two tasks have been approached with a variety of methods. Separate word segmenters and POS taggers are available for the Myanmar language, based on computational methods such as Neural Networks (NN) and Hidden Markov Models (HMM), but there has been no research on joint word segmentation and POS tagging for Myanmar. This research therefore aims to develop a joint Myanmar word segmentation and POS tagging system based on a Hidden Markov Model and morphological rules. A systematic linguistic study of the morphology of the language is important for revealing words that are significant to users such as historians and linguists. Because Myanmar writing does not require explicit spaces between words, the first processing step is to break the text into units called tokens, each of which is either a word or something else, such as a number. In word segmentation and POS tagging, the morphological structure of words is the main source of information for tagging correctly: it allows irrelevant tags to be eliminated and the suitable tag to be found for each word. Morphological analysis is therefore an important part of language engineering applications, especially for a morphologically rich and complex language like Myanmar. Most current research on the Myanmar language uses a lexicon, dictionary, or corpus that lists all the words as the initial stage of word segmentation. The proposed system uses an HMM and morphological rules for word segmentation and POS tagging. The evaluation shows that the system achieves 94% accuracy. en_US
dc.language.iso en en_US
dc.publisher University of Computer Studies, Yangon en_US
dc.title Joint Word Segmentation and Part-of-Speech Tagging for Myanmar Language en_US
dc.type Book en_US
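
The abstract describes joint segmentation and tagging with an HMM, with morphological rules used to eliminate irrelevant tags. The record gives no implementation details, so the following is only a minimal sketch of the general technique (Viterbi search over a word lattice), not the authors' system. Everything in it is an assumption made for illustration: the tag set, the probabilities in `EMIT` and `TRANS`, the `BLOCKED` rule table, the function names, and the Latin-letter placeholder strings that stand in for unspaced Myanmar text.

```python
import math

# Hypothetical toy model; all words, tags and probabilities are invented.
# EMIT[tag][word] = P(word | tag), TRANS[prev_tag][tag] = P(tag | prev_tag).
EMIT = {
    "DET": {"the": 1.0},
    "N": {"cat": 0.5, "he": 0.4, "at": 0.1},
    "V": {"sat": 0.9, "at": 0.1},
}
TRANS = {
    "<s>": {"DET": 0.6, "N": 0.3, "V": 0.1},
    "DET": {"DET": 0.05, "N": 0.9, "V": 0.05},
    "N": {"DET": 0.1, "N": 0.2, "V": 0.7},
    "V": {"DET": 0.5, "N": 0.4, "V": 0.1},
}
# Stand-in for morphological rules: (word, tag) pairs that a rule rules out.
BLOCKED = {("at", "N")}
MAX_WORD_LEN = 4  # longest candidate word considered, in characters


def allowed_tags(word):
    """Return the tags a candidate word may take after rule-based pruning."""
    return [
        tag for tag, words in EMIT.items()
        if word in words and (word, tag) not in BLOCKED
    ]


def joint_segment_and_tag(text):
    """Viterbi search over (end position, tag) states of the word lattice."""
    n = len(text)
    # best[(pos, tag)] = (log probability, backpointer (start, prev_tag, word))
    best = {(0, "<s>"): (0.0, None)}
    for end in range(1, n + 1):
        for start in range(max(0, end - MAX_WORD_LEN), end):
            word = text[start:end]
            for tag in allowed_tags(word):
                for (pos, prev), (score, _) in list(best.items()):
                    if pos != start:
                        continue
                    p = TRANS.get(prev, {}).get(tag, 0.0) * EMIT[tag][word]
                    if p <= 0.0:
                        continue
                    cand = score + math.log(p)
                    key = (end, tag)
                    if key not in best or cand > best[key][0]:
                        best[key] = (cand, (start, prev, word))
    # Pick the best state that covers the whole input, then trace back.
    finals = [(score, key) for key, (score, _) in best.items() if key[0] == n]
    if not finals:
        return []  # no segmentation covers the input
    _, key = max(finals)
    result = []
    while best[key][1] is not None:
        start, prev, word = best[key][1]
        result.append((word, key[1]))
        key = (start, prev)
    return result[::-1]
```

With this toy model, `joint_segment_and_tag("thecatsat")` returns `[("the", "DET"), ("cat", "N"), ("sat", "V")]`. Segmentation and tagging are decided together because each lattice state records both where a word ends and which tag it carries, so a word boundary is kept only if some tag sequence supports it.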

