Abstract:
Statistical word alignment models have been
widely used for various Natural Language
Processing (NLP) problems. In statistical machine
translation (SMT), word alignment models are trained on
bilingual corpora. To build an SMT system, we
require bitext and a word alignment of that bitext, as
well as language models built from target language
data. A word alignment for a parallel sentence pair
represents the correspondence between words in a
source language and their translations in a target
language. The system described here addresses the word
alignment step and uses the IBM model, which is based
on the Expectation Maximization (EM) algorithm. In
this paper, a C# implementation of a word alignment
algorithm is used to test source and target sentences.
The system also uses an English-Myanmar dictionary to
bootstrap the EM algorithm.
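
The idea of bootstrapping EM with a dictionary can be illustrated with a
minimal sketch. It is not the C# implementation described in this paper. It
assumes IBM Model 1 specifically (the abstract names only "the IBM model"),
omits the NULL word, and uses a hypothetical `boost` factor to give dictionary
pairs a larger initial translation probability. The function names and the
placeholder target tokens (`w_the`, `w_book`, ...) are illustrative only and
stand in for real Myanmar words.

```python
from collections import defaultdict


def init_translation_table(pairs, dictionary, boost=10.0):
    """Initialise t(f|e): uniform over co-occurring words, with extra
    weight (hypothetical `boost`) for pairs listed in the dictionary."""
    t = defaultdict(dict)
    for src, tgt in pairs:
        for e in src:
            for f in tgt:
                t[e][f] = boost if f in dictionary.get(e, ()) else 1.0
    # Normalise so that, for each source word e, sum_f t(f|e) = 1.
    for row in t.values():
        total = sum(row.values())
        for f in row:
            row[f] /= total
    return t


def train_ibm1(pairs, dictionary=None, iterations=10):
    """EM training of IBM Model 1 translation probabilities (no NULL word)."""
    t = init_translation_table(pairs, dictionary or {})
    for _ in range(iterations):
        count = defaultdict(lambda: defaultdict(float))
        total = defaultdict(float)
        # E-step: collect expected alignment counts.
        for src, tgt in pairs:
            for f in tgt:
                z = sum(t[e][f] for e in src)
                for e in src:
                    c = t[e][f] / z
                    count[e][f] += c
                    total[e] += c
        # M-step: re-estimate t(f|e) from the expected counts.
        for e, row in count.items():
            for f, c in row.items():
                t[e][f] = c / total[e]
    return t


def align(src, tgt, t):
    """For each target position j, pick the source position i that
    maximises t(tgt[j] | src[i]); returns a list of (j, i) pairs."""
    return [
        (j, max(range(len(src)), key=lambda i: t[src[i]].get(f, 0.0)))
        for j, f in enumerate(tgt)
    ]


# Toy bitext with placeholder target tokens (not real Myanmar data).
pairs = [
    (["the", "house"], ["w_the", "w_house"]),
    (["the", "book"], ["w_the", "w_book"]),
    (["a", "book"], ["w_a", "w_book"]),
]
dictionary = {"book": {"w_book"}}

t = train_ibm1(pairs, dictionary)
print(align(["the", "book"], ["w_the", "w_book"], t))
```

Seeding the table from a dictionary only changes the starting point of EM.
The E-step and M-step are unchanged, so word pairs that are missing from the
dictionary can still be learned from the bitext.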