UCSY's Research Repository

A Study of Myanmar Word Segmentation Schemes for Statistical Machine Translation

dc.contributor.author Thu, Ye Kyaw
dc.contributor.author Finch, Andrew
dc.contributor.author Sagisaka, Yoshinori
dc.contributor.author Sumita, Eiichiro
dc.date.accessioned 2019-10-23T13:25:31Z
dc.date.available 2019-10-23T13:25:31Z
dc.date.issued 2013-02-26
dc.identifier.uri http://onlineresource.ucsy.edu.mm/handle/123456789/2335
dc.description.abstract Myanmar sentences are written as contiguous sequences of syllables with no characters delimiting the words. In statistical machine translation (SMT), word segmentation is a necessary step for languages that do not naturally delimit words. Myanmar is a low-resource language and therefore it is difficult to develop a good word segmentation tool based on machine learning techniques. In this paper, we examine various word segmentation schemes and their effect on the translation from Myanmar to seven other languages. We performed experiments based on character segmentation, syllable segmentation, human lexical/phrasal segmentation, and unsupervised/supervised word segmentation. The results show that the highest quality machine translation was attained with syllable segmentation, and we found this effect to be greatest for translation into subject-object-verb (SOV) structured languages such as Japanese and Korean. Approaches based on machine learning were unable to match this performance for most language pairs, and we believe this was due to the lack of linguistic resources. However, a machine learning approach that extended syllable segmentation produced promising results and we expect this can be developed into a viable method as more data becomes available in the future. en_US
dc.language.iso en_US en_US
dc.publisher Eleventh International Conference On Computer Applications (ICCA 2013) en_US
dc.title A Study of Myanmar Word Segmentation Schemes for Statistical Machine Translation en_US
dc.type Article en_US

