DocumentCode
1636624
Title
Markov Random Field Based Text Identification from Annotated Machine Printed Documents
Author
Peng, Xujun ; Setlur, Srirangaraj ; Govindaraju, Venu ; Sitaram, Ramachandrula ; Bhuvanagiri, Kiran
Author_Institution
Dept. of Comput. Sci. & Eng., SUNY at Buffalo, Amherst, NY, USA
fYear
2009
Firstpage
431
Lastpage
435
Abstract
In this paper, we describe an approach to segment handwritten text, machine printed text and noise from annotated machine printed documents. Three categories of word-level features are extracted. We use a modified K-Means clustering algorithm for classification, followed by a relabeling procedure using a Markov Random Field (MRF) based on a concept of neighboring patches and Belief Propagation (BP) rules. Experimental results on an imbalanced data set show that our approach achieves an overall recall of 96.33%.
Keywords
Markov processes; document image processing; feature extraction; image classification; image segmentation; pattern clustering; random processes; text analysis; Markov random field; annotated machine printed document; belief propagation; feature extraction; k-mean clustering algorithm; machine printed text; segment handwritten text; text identification; Classification algorithms; Feature extraction; Gabor filters; Handwriting recognition; Hidden Markov models; Image segmentation; Markov random fields; Optical character recognition software; Text analysis; Text recognition;
fLanguage
English
Publisher
IEEE
Conference_Titel
Document Analysis and Recognition, 2009. ICDAR '09. 10th International Conference on
Conference_Location
Barcelona
ISSN
1520-5363
Print_ISBN
978-1-4244-4500-4
Electronic_ISBN
1520-5363
Type
conf
DOI
10.1109/ICDAR.2009.237
Filename
5277639