Title :
A CRF Based Scheme for Overlapping Multi-colored Text Graphics Separation
Author :
Garg, Ritu ; Hassan, Ehtesham ; Chaudhury, Santanu ; Gopal, M.
Author_Institution :
Dept. of Electr. Eng., Indian Inst. of Technol. Delhi, New Delhi, India
Abstract :
In this paper, we propose a novel framework for segmenting documents with complex layouts. Segmentation is performed by a combination of clustering and conditional random field (CRF) based modeling. The bottom-up stage assigns each pixel to a cluster plane based on color intensity. A CRF-based discriminative model is then learned to extract local neighborhood information in the different cluster/color planes. The final category assignment is made by a top-level CRF based on the semantic correlation learned across clusters. The proposed framework has been extensively tested on multi-colored document images in which text overlaps graphics or images.
Keywords :
computer graphics; document image processing; image segmentation; text analysis; CRF based scheme; color intensity; conditional random fields; discriminative model; document segmentation; multicolored document image; overlapping multicolored text graphics separation; semantic correlation; Feature extraction; Image color analysis; Image segmentation; Layout; Support vector machines; Text analysis; Complex layout analysis; Conditional random fields; Document image analysis
Conference_Title :
Document Analysis and Recognition (ICDAR), 2011 International Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-4577-1350-7
ISSN :
1520-5363
DOI :
10.1109/ICDAR.2011.245