Abstract:
Speech synthesis is a popular field of natural
language processing in computer science, studied for
various languages. Speech synthesis produces
human-like speech from text in the corresponding
language, and it can be divided into two key phases:
high-level and low-level text-to-speech
synthesis. In high-level synthesis, the input text is
converted into a form from which the low-level
synthesizer can produce the output speech. Speech
synthesis techniques are of three types: articulatory,
formant, and concatenative synthesis. Concatenative
speech synthesis is the most popular of the three
and the easiest way to synthesize speech. The
diphone-concatenative method is popular today because
its speech quality is better and its complexity lower
than those of other techniques. This paper analyzes
two types of diphone concatenation, word-based and
unit-based synthesis, for the Myanmar language,
applying the Pitch Synchronous Overlap and Add
(PSOLA) algorithm to smooth the joints of the speech
signals. This paper describes the Time-Domain Pitch
Synchronous Overlap and Add method in
diphone-concatenation speech synthesis, which is used
to maintain
the consistency and accuracy of the pitch marks of
the speech signal and the diphone database with
integrated vowels and consonants of the Myanmar
language. This paper also describes the building of
a Myanmar diphone database for the concatenation
method. A diphone database for Myanmar
pronunciations is constructed in this paper to reduce
ambiguity in pronunciation.
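
To illustrate how overlap-add can smooth a joint between two concatenated units, the following is a minimal sketch and not the implementation used in this paper. It assumes that each unit is a list of samples, that one pitch mark per unit has already been located, and that the local pitch period is known. The names `crossfade_weights` and `join_at_pitch_marks` are hypothetical. The sketch covers only the pitch-aligned overlap-add at the joint; a full TD-PSOLA system also modifies pitch and duration, which is omitted here.

```python
import math


def crossfade_weights(period):
    """Complementary raised-cosine weights spanning one pitch period.

    fade_out falls from 1 towards 0, fade_in rises from 0 towards 1,
    and the two always sum to 1 at every sample.
    """
    fade_out = [0.5 + 0.5 * math.cos(math.pi * i / period) for i in range(period)]
    fade_in = [1.0 - w for w in fade_out]
    return fade_out, fade_in


def join_at_pitch_marks(left, left_mark, right, right_mark, period):
    """Join two units by overlap-adding one pitch period at their pitch marks.

    left, right  : lists of samples (the two units to concatenate)
    left_mark    : sample index of the pitch mark in `left` where the joint starts
    right_mark   : sample index of the matching pitch mark in `right`
    period       : assumed local pitch period in samples
    """
    if period <= 0:
        raise ValueError("period must be positive")
    if left_mark < 0 or left_mark + period > len(left):
        raise ValueError("left unit is too short for the overlap region")
    if right_mark < 0 or right_mark + period > len(right):
        raise ValueError("right unit is too short for the overlap region")

    fade_out, fade_in = crossfade_weights(period)
    # Overlap region: one pitch period from each unit, aligned at the marks.
    overlap = [
        left[left_mark + i] * fade_out[i] + right[right_mark + i] * fade_in[i]
        for i in range(period)
    ]
    return list(left[:left_mark]) + overlap + list(right[right_mark + period:])


# Usage with two synthetic "units": sine waves with a 20-sample period.
PERIOD = 20
unit_a = [math.sin(2 * math.pi * n / PERIOD) for n in range(100)]
unit_b = [math.sin(2 * math.pi * n / PERIOD) for n in range(100)]
joined = join_at_pitch_marks(unit_a, 60, unit_b, 20, PERIOD)
print(len(joined))  # 60 samples + 20 overlapped + 60 samples = 140
```

Because both marks in the usage example fall at the same phase of the waveform, the joined signal remains a continuous sine wave across the joint. Aligning the overlap on pitch marks aims at exactly this effect, and it is why the consistency of the pitch marks matters for the quality of the joint.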