Abstract:
Text extraction and character recognition are computer vision tasks that have grown in importance since smartphones with good cameras became widespread. Character recognition from scene text images remains a challenging area because camera-captured text images contain varied background noise, and the text itself varies in shape, font, and color. In this research, a text extraction and character recognition system is developed for camera-captured Myanmar warning text images. One major challenge is that Myanmar OCR on camera-captured images has been greatly under-researched. This research therefore addresses these challenges and proposes a new algorithm for segmentation and recognition of Myanmar script. For character segmentation, zone-based segmentation is performed using the position and size of connected components. For recognition, a combination of chain code features, pixel density features, and new shape-based features, namely boundary-centroid distance and centroid-boundary distance, is explored. The system uses K-Nearest Neighbors (KNN) classification to recognize the segmented characters. In experiments, the system achieves satisfactory results: 93.9% segmentation accuracy and 92.77% classification accuracy.
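To make the recognition stage more concrete, the following is a minimal sketch of one possible reading of a centroid-to-boundary distance feature followed by KNN voting. The abstract does not define the features precisely, so everything here is an assumption for illustration: the function names (`boundary_pixels`, `centroid_boundary_features`, `knn_predict`), the 4-neighbour boundary rule, the eight angular sectors, the normalisation, and the Euclidean distance are hypothetical choices, not the authors' method.

```python
import numpy as np

def boundary_pixels(glyph):
    """Return (row, col) coordinates of foreground pixels that touch background."""
    g = np.asarray(glyph, dtype=bool)
    padded = np.pad(g, 1, constant_values=False)
    # A foreground pixel is interior if all four 4-neighbours are foreground.
    interior = (padded[:-2, 1:-1] & padded[2:, 1:-1] &
                padded[1:-1, :-2] & padded[1:-1, 2:])
    return np.argwhere(g & ~interior)

def centroid_boundary_features(glyph, n_bins=8):
    """Mean centroid-to-boundary distance per angular sector (assumed definition)."""
    g = np.asarray(glyph, dtype=bool)
    centroid = np.argwhere(g).mean(axis=0)
    offsets = boundary_pixels(g) - centroid
    dist = np.hypot(offsets[:, 0], offsets[:, 1])
    angle = np.arctan2(offsets[:, 0], offsets[:, 1])  # in [-pi, pi]
    bins = ((angle + np.pi) / (2 * np.pi) * n_bins).astype(int) % n_bins
    feats = np.zeros(n_bins)
    for b in range(n_bins):
        in_sector = bins == b
        if np.any(in_sector):
            feats[b] = dist[in_sector].mean()
    # Divide by the largest sector value so the vector is independent of glyph size.
    peak = feats.max()
    return feats / peak if peak > 0 else feats

def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training vectors closest to the query."""
    d = np.linalg.norm(np.asarray(train_x) - np.asarray(query), axis=1)
    nearest = np.argsort(d)[:k]
    labels, counts = np.unique(np.asarray(train_y)[nearest], return_counts=True)
    return labels[np.argmax(counts)]
```

In a fuller pipeline, a vector like this would presumably be concatenated with the chain code and pixel density features before being passed to the KNN step; the sketch keeps only the shape feature to stay short.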