Basic Word Identification for Part-of-Speech Tagging of Myanmar Language

Myint, Phyu Hninn; Htwe, Tin Myat; Thein, Ni Lar

Eleventh International Conference on Computer Applications (ICCA 2013)

URI: http://onlineresource.ucsy.edu.mm/handle/123456789/404

Date: 2012-02-28

Abstract:

Basic word identification is an essential preprocessing step in Part-of-Speech tagging. Before a tagger can disambiguate among the multiple Part-of-Speech tags of a basic (root) word, word boundaries must be identified, because basic words in Myanmar sentences are not consistently separated by delimiters and there is no standard break between them. This paper therefore proposes a word identification (segmentation) approach for Myanmar sentences. A Myanmar lexicon is used to identify each basic word by longest word length matching, and rules are generated to identify reduplicated words. The proposed approach achieves high performance when evaluated on two test corpora, and it is a useful tool for many Myanmar Natural Language Processing applications.
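
The two steps named in the abstract, lexicon lookup by longest word length matching and a rule for reduplicated words, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `segment_longest_match` and `merge_reduplicated`, the single-character fallback for unknown text, and the "merge two adjacent identical words" rule are all assumptions made for illustration, and the toy lexicon uses Latin letters as stand-ins for Myanmar words. The abstract does not say which reduplication rules the paper generates.

```python
def segment_longest_match(text, lexicon):
    """Greedy left-to-right longest matching against a lexicon (illustrative)."""
    max_len = max((len(w) for w in lexicon), default=1)
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking until a lexicon entry matches.
        for length in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + length]
            if candidate in lexicon:
                break
        else:
            # Assumption: with no match, emit one character as an unknown word.
            candidate = text[i]
        words.append(candidate)
        i += len(candidate)
    return words


def merge_reduplicated(words):
    """Join two adjacent identical words into one token (hypothetical rule)."""
    merged = []
    i = 0
    while i < len(words):
        if i + 1 < len(words) and words[i] == words[i + 1]:
            merged.append(words[i] + words[i + 1])
            i += 2
        else:
            merged.append(words[i])
            i += 1
    return merged


# Toy lexicon: Latin placeholders, not real Myanmar entries.
toy_lexicon = {"a", "ab", "abc", "d"}
tokens = segment_longest_match("ababx", toy_lexicon)
result = merge_reduplicated(tokens)
```

Here `tokens` holds the basic words found by the lexicon pass and `result` holds them after the reduplication rule is applied. The sketch indexes the string by Unicode code point; a system for real Myanmar text would probably match over syllables or grapheme clusters instead, which is also an assumption and not something the abstract states.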
