• DocumentCode
    479819
  • Title
    Text and Non-text Segmentation and Classification from Document Images
  • Author
    Ibrahim, Zaidah ; Isa, Dino ; Rajkumar, Rajprasad
  • Author_Institution
    Fac. of Inf. Technol. & Quantitative Sci., Univ. Technol. MARA, Shah Alam
  • Volume
    1
  • fYear
    2008
  • fDate
    12-14 Dec. 2008
  • Firstpage
    973
  • Lastpage
    976
  • Abstract
    Text and non-text segmentation and classification is an important step in a document layout analysis system before the document image is presented to an OCR system. Heuristic rules have been used to segment and classify the text and non-text blocks. This research focuses on classifying non-text blocks in technical documents into tables, graphs, and figures. A comparative study between a backpropagation neural network and a support vector machine shows that the support vector machine classifies better than the backpropagation neural network.
  • Keywords
    backpropagation; image classification; image segmentation; neural nets; support vector machines; text analysis; OCR system; backpropagation neural network; document images; document layout analysis system; nontext classification; nontext segmentation; support vector machine; text classification; text segmentation; Backpropagation; Computer science; Image segmentation; Labeling; Neural networks; Pixel; Software engineering; Support vector machine classification; Support vector machines; Text analysis; Backpropagation neural network; non-text segmentation; support vector machine; zoning;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Science and Software Engineering, 2008 International Conference on
  • Conference_Location
    Wuhan, Hubei
  • Print_ISBN
    978-0-7695-3336-0
  • Type
    conf
  • DOI
    10.1109/CSSE.2008.1516
  • Filename
    4721913