Abstract:
Word segmentation and part-of-speech (POS) tagging are fundamental tasks in natural language processing (NLP). POS information is also required by NLP-based applications such as machine translation (MT) and information retrieval (IR). Most current research develops word segmentation and POS tagging separately, using different approaches to reach high performance and accuracy. For the Myanmar language, there are likewise separate word segmenters and POS taggers based on statistical approaches such as neural networks (NN) and hidden Markov models (HMM). However, because of the complex morphological structure of the Myanmar language, the out-of-vocabulary (OOV) problem still exists. We therefore aim to develop joint Myanmar word segmentation and POS tagging based on morphological analysis. In this paper, we describe word segmentation and POS tagging using the proposed stemming algorithm and morphological rules.