Abstract:
Word segmentation is a fundamental task in natural
language processing. In Myanmar text, words consist of
one or more syllables and are usually not separated by
white space.
Segmentation must therefore determine word boundaries
that the orthography does not mark. This system uses a
two-step longest-matching approach. The first step is
syllable segmentation; the second is a hybrid of
left-to-right syllable maximum matching and hierarchical
expectation maximization approach. The system is
intended as a pre-processing tool for Myanmar text
processing applications such as machine translation,
information retrieval, and search engines for the
Myanmar language. Experimental results show 96%
accuracy in word segmentation.
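
As a rough illustration of the left-to-right syllable
maximum matching step, the sketch below greedily groups
an already syllable-segmented input into the longest
words found in a lexicon. It is a minimal sketch and not
the system's implementation: the function name max_match,
the max_len parameter, the toy LEXICON, and the romanized
placeholder syllables are all assumptions made for
illustration, and the hierarchical expectation
maximization component is not shown.

```python
def max_match(syllables, lexicon, max_len=4):
    """Greedy left-to-right longest matching over a syllable sequence.

    `lexicon` is a set of syllable tuples (hypothetical format).
    `max_len` caps the number of syllables tried per word (assumed value).
    """
    words = []
    i = 0
    while i < len(syllables):
        # Try the longest candidate first, then shrink one syllable at a time.
        for n in range(min(max_len, len(syllables) - i), 0, -1):
            candidate = tuple(syllables[i:i + n])
            # A single syllable is always accepted as a fallback word.
            if n == 1 or candidate in lexicon:
                words.append("".join(candidate))
                i += n
                break
    return words


# Hypothetical toy lexicon of multi-syllable words (romanized placeholders).
LEXICON = {("myan", "mar"), ("sa", "kar", "lone")}

if __name__ == "__main__":
    print(max_match(["myan", "mar", "sa", "kar", "lone", "pya"], LEXICON))
```

Syllables that match no lexicon entry fall through as
single-syllable words, which is one common way to handle
unknown material in a purely dictionary-based matcher.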