Abstract:
This paper presents a Myanmar printed character
recognition system that preserves document
formatting. The system has two parts, recognition
and formatting. It takes Myanmar documents in
Portable Document Format (.pdf) as input and
recognizes formatting attributes such as font size,
font style, alignment and tables. It converts the
existing document into a machine-editable Word
document (.doc). The system classifies content into
paragraphs and tables; for tables, both
recognition and formatting are performed.
The extraction of text format, paragraph format and
table format can be done efficiently. The system is
based on MICR (Myanmar Intelligent Character
Recognition), which is a kind of ICR (Intelligent
Character Recognition). MICR uses statistical and
semantic information, including the width-to-height
ratio, black stroke counts, number of loops, open
directions and histogram values. The final
decision is made by a voting system. The system
is implemented using image processing techniques
in Matlab.
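
As an illustration of feature-based recognition with a voting decision, the following is a minimal sketch in Python rather than Matlab. It is not the MICR implementation. It assumes a glyph is a small binary grid (1 = black) containing at least one black pixel, it uses a black pixel count as a stand-in for the stroke count, and it assumes that each feature casts one vote for the template whose feature value is nearest. The abstract does not specify the actual voting rule, and all function and variable names here are hypothetical.

```python
from collections import Counter, deque


def bounding_box_ratio(glyph):
    """Width-to-height ratio of the bounding box of the black pixels."""
    rows = [r for r, row in enumerate(glyph) if any(row)]
    cols = [c for c in range(len(glyph[0])) if any(row[c] for row in glyph)]
    return (cols[-1] - cols[0] + 1) / (rows[-1] - rows[0] + 1)


def count_loops(glyph):
    """Count enclosed white regions by flood-filling the background."""
    h, w = len(glyph), len(glyph[0])
    # Pad with a white border so the outer background forms one region.
    padded = [[0] * (w + 2)]
    padded += [[0] + list(row) + [0] for row in glyph]
    padded += [[0] * (w + 2)]
    seen = [[False] * (w + 2) for _ in range(h + 2)]
    regions = 0
    for r in range(h + 2):
        for c in range(w + 2):
            if padded[r][c] == 0 and not seen[r][c]:
                regions += 1
                seen[r][c] = True
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                        inside = 0 <= ny < h + 2 and 0 <= nx < w + 2
                        if inside and padded[ny][nx] == 0 and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
    # Every white region except the outer background is a loop.
    return regions - 1


def black_pixel_count(glyph):
    """Number of black pixels (a simplified stand-in for stroke counts)."""
    return sum(sum(row) for row in glyph)


# Assumed feature set; the real system also uses open directions and histograms.
FEATURES = (bounding_box_ratio, count_loops, black_pixel_count)


def classify(glyph, templates):
    """Each feature votes for its nearest template; the majority label wins."""
    votes = Counter()
    for feature in FEATURES:
        value = feature(glyph)
        nearest = min(
            templates,
            key=lambda label: abs(feature(templates[label]) - value),
        )
        votes[nearest] += 1
    return votes.most_common(1)[0][0]
```

With `templates` given as a dictionary that maps each label to a reference glyph, `classify(glyph, templates)` returns the label that receives the most votes. Ties are broken by insertion order in this sketch, which a real system would need to handle more carefully.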