DocumentCode :
2529231
Title :
Converting Myanmar printed document image into machine understandable text format
Author :
Win, Htwe Pa Pa ; Khine, Phyo Thu Thu ; Tun, Khin Nwe Ni
Author_Institution :
Univ. of Comput. Studies, Yangon, Myanmar
fYear :
2011
fDate :
26-28 Sept. 2011
Firstpage :
96
Lastpage :
101
Abstract :
Large numbers of Myanmar document images are being archived by digital libraries, so an efficient strategy is needed to convert document images into machine-understandable text. State-of-the-art OCR systems do not work for Myanmar script, because the language poses many challenges for document understanding. This paper therefore proposes an OCR system for Myanmar Printed Documents (OCRMPD), built from several proposed methods, that automatically converts Myanmar printed text to machine-understandable text. First, the input image is enhanced by correcting several kinds of noise. Then the characters are segmented with a novel segmentation method. Features of the isolated characters are extracted with a hybrid feature extraction method to overcome the similarity problems of Myanmar script. Finally, a hierarchical mechanism is applied to an SVM classifier to recognize each character image. Experiments on a variety of Myanmar printed documents show the efficiency of the proposed algorithms.
Keywords :
digital libraries; document image processing; optical character recognition; pattern classification; support vector machines; Myanmar printed document image conversion; OCR systems; OCRMPD; SVM classifier; character image recognition; machine understandable text format; Accuracy; Character recognition; Feature extraction; Image segmentation; Optical character recognition software; Text recognition; Myanmar scripts; segmentation
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Digital Information Management (ICDIM), 2011 Sixth International Conference on
Conference_Location :
Melbourne, QLD
ISSN :
Pending
Print_ISBN :
978-1-4577-1538-9
Type :
conf
DOI :
10.1109/ICDIM.2011.6093371
Filename :
6093371