Abstract:
Lexical analysis, in the context of language
processing, connects each word with its
corresponding label in a lexicon.
However, many words are ambiguous, having more
than one meaning, which may make it impossible
to choose the correct meaning by considering
only the word in question. In addition, there
are unknown words, which do not appear in the
lexicon; these must be tagged and added to the
lexicon to improve its coverage.
For this reason, this paper proposes a
lexical analyzer to resolve the ambiguity of
known words and to tag unknown words in the
Myanmar language, using a rule-based approach
and decision tree induction.
Moreover, to support the lexical analyzer, a
segmentation and pattern merging algorithm
based on the Myanmar-English computational
lexicon is also proposed. The proposed system is
effective for Myanmar language lexical analysis
and can improve the coverage of the lexicon.