DocumentCode
479819
Title
Text and Non-text Segmentation and Classification from Document Images
Author
Ibrahim, Zaidah ; Isa, Dino ; Rajkumar, Rajprasad
Author_Institution
Fac. of Inf. Technol. & Quantitative Sci., Univ. Technol. MARA, Shah Alam
Volume
1
fYear
2008
fDate
12-14 Dec. 2008
Firstpage
973
Lastpage
976
Abstract
Text and non-text segmentation and classification is an important step in a document layout analysis system, performed before the document image is passed to an OCR system. Heuristic rules have been used to segment and classify the text and non-text blocks. This research focuses on classifying non-text blocks in technical documents into tables, graphs, and figures. A comparative study between a backpropagation neural network and a support vector machine shows that the support vector machine classifies better than the backpropagation neural network.
Keywords
backpropagation; image classification; image segmentation; neural nets; support vector machines; text analysis; OCR system; backpropagation neural network; document images; document layout analysis system; nontext classification; nontext segmentation; support vector machine; text classification; text segmentation; Backpropagation; Computer science; Image segmentation; Labeling; Neural networks; Pixel; Software engineering; Support vector machine classification; Support vector machines; Text analysis; Backpropagation neural network; non-text segmentation; support vector machine; zoning;
fLanguage
English
Publisher
IEEE
Conference_Titel
Computer Science and Software Engineering, 2008 International Conference on
Conference_Location
Wuhan, Hubei
Print_ISBN
978-0-7695-3336-0
Type
conf
DOI
10.1109/CSSE.2008.1516
Filename
4721913