Title :
Fuzzy Rule Based Document Image Segmentation for Component Labeling
Author :
Kapoor, Aditi ; Pandey, Parul ; Biswas, K.K.
Author_Institution :
Amar Nath & Shashi Khosla Sch. of Inf. Technol., Indian Inst. of Technol., New Delhi, India
Abstract :
In this paper, we propose a fuzzy rule based method for extensive segmentation of documents, with a view to labeling components such as the page header, title, figures and tables, column dividers, and text. The fuzzy rules are learned using evolutionary computation from labeled document images that serve as ground truth. The rules exploit the placement heuristics among the various components of a technical paper. A genetic algorithm that combines the individual fitness functions of all components selects the best set of rules. The effectiveness of the proposed method is validated through a case study involving 300 documents. It is also shown that the approach can be extended to the segmentation of magazine pages.
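The idea of tuning a fuzzy placement rule with a genetic algorithm can be illustrated with a minimal sketch. It is not the authors' implementation: the single "page header" rule, the two membership functions, the 0.5 decision threshold, the toy ground truth in SAMPLES, and every function name and GA setting are assumptions made for illustration only. The paper combines per-component fitness functions, whereas this sketch scores one component.

```python
import random

# Hypothetical toy ground truth (not from the paper):
# (normalized top y of a component, normalized width, is it a page header?)
SAMPLES = [
    (0.02, 0.90, True), (0.04, 0.80, True), (0.03, 0.85, True),
    (0.05, 0.20, False), (0.30, 0.90, False),
    (0.55, 0.45, False), (0.80, 0.88, False),
]

def near_top(y, top_cut):
    """Fuzzy degree of 'near the top': 1 at y = 0, falling to 0 at y = top_cut."""
    return max(0.0, 1.0 - y / top_cut)

def wide(w, width_full):
    """Fuzzy degree of 'wide': rises linearly and saturates at 1 when w >= width_full."""
    return min(1.0, w / width_full)

def header_degree(y, w, chrom):
    """Assumed rule: IF near top AND wide THEN header (AND taken as min)."""
    top_cut, width_full = chrom
    return min(near_top(y, top_cut), wide(w, width_full))

def fitness(chrom):
    """Fraction of labeled samples the rule classifies correctly at threshold 0.5."""
    correct = sum((header_degree(y, w, chrom) >= 0.5) == label
                  for y, w, label in SAMPLES)
    return correct / len(SAMPLES)

def evolve(pop_size=20, generations=30, seed=0):
    """Small elitist GA over chromosomes [top_cut, width_full], genes in [0.05, 1.0]."""
    rng = random.Random(seed)
    pop = [[rng.uniform(0.05, 1.0), rng.uniform(0.05, 1.0)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]          # keep the better half unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            p, q = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(p, q)]   # uniform crossover
            child = [min(1.0, max(0.05, g + rng.gauss(0.0, 0.05)))
                     for g in child]                           # clipped Gaussian mutation
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)

if __name__ == "__main__":
    best = evolve()
    print("best chromosome:", best, "fitness:", fitness(best))
```

Because the better half of each generation is carried over unchanged, the best fitness never decreases from one generation to the next. A fuller version along the lines of the abstract would hold one rule set per component type and combine their fitness values.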
Keywords :
document image processing; fuzzy set theory; image segmentation; knowledge based systems; column divider; component labeling; figure; fuzzy rule based document image segmentation; ground truth; magazine pages; page header; tables; title; Biological cells; Equations; Genetic algorithms; Image segmentation; Labeling; Layout; Text analysis; Document image; chromosomes; connected components; fuzzy rules; genetic algorithm
Conference_Title :
Computer Vision, Pattern Recognition, Image Processing and Graphics (NCVPRIPG), 2011 Third National Conference on
Conference_Location :
Hubli, Karnataka
Print_ISBN :
978-1-4577-2102-1
DOI :
10.1109/NCVPRIPG.2011.10