Abstract:
This paper presents the first large-scale
evaluation of the quality of a syllable-based Neural
Machine Translation (Syllable-NMT) system for the
Myanmar-English pair. Neural Machine Translation
(NMT) systems have reached state-of-the-art results
on some languages. However, one of the main
challenges that NMT still faces is dealing with very
large vocabularies and morphologically rich
languages. Like other low-resource languages,
the Myanmar language carries a lot of morphological
information. This increases
ambiguity and decreases the quality of translation
results. Moreover, rule-based and phrase-based
techniques were used in existing research on
Myanmar translation, with only a small amount of
parallel data. Therefore, we prepare a large
parallel corpus and introduce an NMT
model that maps a source syllable sequence to a
target word sequence to address the morphological
problems. In addition, this paper reports
experimental results and compares them. Our results
show that the Syllable-NMT system surpasses
the character-based and word-based NMT
systems by 5 BLEU.
Description:
This work is partially supported by the Institute for
Infocomm Research (I2R), Singapore. We are thankful
to Aw Ai Ti and Wu Kui, Department of Human
Language Technology, I2R, who provided expertise
that greatly assisted our research. We also thank
the reviewers for their valuable comments and
suggestions.