Abstract:
In Natural Language Processing (NLP), word segmentation and Part-of-Speech (POS) tagging are fundamental tasks. POS information is also
needed in the preprocessing stage of NLP applications such as machine
translation (MT) and information retrieval (IR). Currently, there are many
research efforts that develop word segmentation and POS tagging separately,
using different methods to achieve high accuracy. For the Myanmar
language, there are also separate word segmenters and POS taggers based on
statistical approaches such as Neural Networks (NNs) and Hidden Markov
Models (HMMs). However, because of the Myanmar language's complex morphological
structure, the out-of-vocabulary (OOV) problem still exists. To avoid errors and improve
segmentation by using POS information, segmentation and tagging should be
performed at the same time. The main goal of developing a POS tagger for any
language is to improve tagging accuracy and resolve the ambiguity in
sentences that arises from language structure. This paper focuses on developing a word
segmenter and Part-of-Speech (POS) tagger for the Myanmar language. It
presents a comparison of separate word segmentation and POS
tagging with joint word segmentation and POS tagging.