Abstract:
Speech signal processing plays a crucial role in
any speech-related system, whether automatic
speech recognition, speaker recognition, speech
synthesis, or another application. Burmese can be
considered an under-resourced language because
few linguistic resources are available for it.
Collecting a sufficient amount of speech data to
build a Burmese speaker identification system in a
short time is very challenging. To enlarge the data
set, this paper examines increasing the total
duration of speech data by combining the recordings
with various noises encountered in everyday
surroundings. For the noise-augmented speech data,
we also applied voice activity detection (VAD) to
retain only the segments that carry speaker-specific
information. For feature extraction, we used MFCC,
filter bank, and PLP features. The experiments
used i-vector methods built on a GMM-UBM together
with PLDA. We report the performance on the
different data sets as equal error rate (EER) for
two models, one trained on clean data and one on
noisy data, to show that the developed speaker
identification system is noise-robust.
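
As a rough illustration of two ideas mentioned above, the sketch below mixes a noise recording into a speech signal at a chosen signal-to-noise ratio and computes an approximate EER from trial scores. It is only a minimal sketch under stated assumptions, not the pipeline used in this work: the function names mix_at_snr and equal_error_rate, the SNR value, and the synthetic signals are all hypothetical, and plain Python lists are used so that the example has no dependencies.

```python
import math
import random


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech so that the mixture has the requested SNR (dB).

    Hypothetical helper for illustration; signals are lists of float samples.
    """
    if not speech or not noise:
        raise ValueError("speech and noise must be non-empty")
    # Loop the noise so that it covers the whole utterance.
    tiled = [noise[i % len(noise)] for i in range(len(speech))]
    speech_power = sum(s * s for s in speech) / len(speech)
    noise_power = sum(n * n for n in tiled) / len(tiled)
    if noise_power == 0.0:
        raise ValueError("noise is silent")
    # Pick the gain so that speech_power / (gain^2 * noise_power)
    # equals 10^(snr_db / 10).
    gain = math.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, tiled)]


def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping the threshold over all observed scores."""
    best_gap, eer = float("inf"), None
    for thr in sorted(set(target_scores) | set(nontarget_scores)):
        # False rejection: a target trial scored below the threshold.
        frr = sum(s < thr for s in target_scores) / len(target_scores)
        # False acceptance: a non-target trial scored at or above the threshold.
        far = sum(s >= thr for s in nontarget_scores) / len(nontarget_scores)
        if abs(far - frr) < best_gap:
            best_gap, eer = abs(far - frr), (far + frr) / 2
    return eer


if __name__ == "__main__":
    random.seed(0)
    # Synthetic stand-ins for a clean utterance and a noise recording.
    speech = [math.sin(0.01 * i) for i in range(1000)]
    noise = [random.gauss(0.0, 1.0) for _ in range(300)]
    noisy = mix_at_snr(speech, noise, snr_db=10.0)  # 10 dB is an assumed value
    print(len(noisy))
    print(equal_error_rate([2.0, 3.0], [0.0, 1.0]))
```

Mixing each clean utterance with several noise types and SNR levels yields multiple noisy copies, which is one way the total amount of training audio can grow without new recordings. The threshold sweep gives only a coarse EER estimate; evaluation toolkits typically interpolate between operating points.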