UCSY's Research Repository

Building Large Scale Text Corpus for Joint Word Segmentation and Part-of-Speech Tagging of Myanmar Language

dc.contributor.author Dim Lam, Cing
dc.contributor.author Soe, Khin Mar
dc.date.accessioned 2020-03-12T06:37:57Z
dc.date.available 2020-03-12T06:37:57Z
dc.date.issued 2020-02-28
dc.identifier.isbn 978-981-14-4787-7
dc.identifier.uri http://onlineresource.ucsy.edu.mm/handle/123456789/2497
dc.description.abstract Word segmentation and part-of-speech (POS) tagging are fundamental tasks in Natural Language Processing (NLP). POS information is also needed as a preprocessing step for NLP applications such as machine translation (MT) and information retrieval (IR). Much current research develops word segmentation and POS tagging separately, using different methods to achieve high accuracy. However, while numerous models are available for other languages, little work has been done for the Myanmar language. This paper describes the building of a Myanmar corpus for joint word segmentation and POS tagging of the Myanmar language. The corpus contains 51,207 sentences and 839,161 words and is annotated with 12 tags. To evaluate the corpus, an HMM model is trained on different data sizes and tested with both a closed test and an open test. The experiments yield 94% accuracy, indicating that the built corpus is effective. en_US
dc.language.iso en_US en_US
dc.publisher Proceedings of the 10th International Workshop on Computer Science and Engineering en_US
dc.subject Natural Language Processing en_US
dc.subject POS en_US
dc.subject HMM en_US
dc.subject Corpus en_US
dc.title Building Large Scale Text Corpus for Joint Word Segmentation and Part-of-Speech Tagging of Myanmar Language en_US
dc.type Article en_US

