Abstract:
Automatic optical character recognition (OCR)
of machine-printed text is highly desirable for
a multitude of modern IT applications,
including digital library software.
However, state-of-the-art OCR systems cannot
handle Myanmar script, because the language poses
many challenges for document understanding.
Therefore, we design an Optical Character
Recognition System for Myanmar Printed
Documents (OCRMPD), with several proposed
techniques that can automatically recognize
printed Myanmar text from document images. In
order to obtain a more accurate system, we propose
a method for isolating character images
that uses not only projection methods but also
structural analysis of wrongly segmented
characters. To demonstrate the effectiveness of our
segmentation technique, we apply a new hybrid
feature extraction method and choose an SVM
classifier for recognition of the character images.
The proposed algorithms have been tested on a
variety of Myanmar printed documents, and the
experimental results indicate that the
methods increase both segmentation accuracy
and recognition rates.
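
To illustrate the projection methods mentioned above, the following is a minimal sketch of generic projection-profile segmentation, assuming a binarized page image in which 1 marks ink and 0 marks background. It is not the OCRMPD algorithm itself: the function names `find_runs` and `segment_by_projection` are hypothetical, and the structural analysis step for wrongly segmented characters is not shown.

```python
import numpy as np


def find_runs(profile):
    """Return (start, end) pairs of consecutive non-zero entries; end is exclusive."""
    runs = []
    start = None
    for i, value in enumerate(profile):
        if value > 0 and start is None:
            start = i                      # a run of ink begins
        elif value == 0 and start is not None:
            runs.append((start, i))        # the run ends at a blank row/column
            start = None
    if start is not None:
        runs.append((start, len(profile)))  # run reaches the end of the profile
    return runs


def segment_by_projection(binary):
    """Split a binary image (1 = ink) into (top, bottom, left, right) boxes.

    Assumed illustration only: text lines come from the horizontal
    projection, then candidate characters from the vertical projection
    of each line.
    """
    boxes = []
    for top, bottom in find_runs(binary.sum(axis=1)):    # row sums -> text lines
        line = binary[top:bottom]
        for left, right in find_runs(line.sum(axis=0)):  # column sums -> characters
            boxes.append((top, bottom, left, right))
    return boxes


# Toy page: two blobs on the first line, one blob on the second line.
page = np.zeros((7, 9), dtype=int)
page[1:3, 1:3] = 1
page[1:3, 5:8] = 1
page[4:6, 2:6] = 1
boxes = segment_by_projection(page)
```

A sketch like this can only split where a completely blank row or column exists, so components that overlap along the projected axis end up in one box. Cases of that kind would need a further step such as the structural analysis described in the abstract.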