• DocumentCode
    3060019
  • Title

    Fuzzy Rule Based Document Image Segmentation for Component Labeling

  • Author

    Kapoor, Aditi ; Pandey, Parul ; Biswas, K.K.

  • Author_Institution
    Amar Nath & Shashi Khosla Sch. of Inf. Technol., Indian Inst. of Technol., New Delhi, India
  • fYear
    2011
  • fDate
    15-17 Dec. 2011
  • Firstpage
    11
  • Lastpage
    14
  • Abstract
    In this paper, we propose a fuzzy rule-based method for extensive segmentation of documents, with a view to labeling components such as page headers, titles, figures and tables, column dividers, and text. The fuzzy rules are learned using evolutionary computation from labeled document images that serve as ground truth. The proposed fuzzy rules exploit the placement heuristics among the various components of a technical paper. A genetic algorithm that uses the individual fitness functions of all components selects the best set of rules. The effectiveness of the proposed method is validated through a case study involving 300 documents. It is also shown that this approach can be extended to the segmentation of magazine pages.
  • Keywords
    document image processing; fuzzy set theory; image segmentation; knowledge based systems; column divider; component labeling; figure; fuzzy rule based document image segmentation; ground truth; magazine pages; page header; tables; title; Biological cells; Equations; Genetic algorithms; Image segmentation; Labeling; Layout; Text analysis; Document image; chromosomes; connected components; fuzzy rules; genetic algorithm
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Vision, Pattern Recognition, Image Processing and Graphics (NCVPRIPG), 2011 Third National Conference on
  • Conference_Location
    Hubli, Karnataka
  • Print_ISBN
    978-1-4577-2102-1
  • Type

    conf

  • DOI
    10.1109/NCVPRIPG.2011.10
  • Filename
    6132989