DocumentCode :
1994113
Title :
Automated detection and segmentation of table of contents page from document images
Author :
Mandal, S. ; Chowdhury, S.P. ; Das, A.K. ; Chanda, Bhabatosh
Author_Institution :
Bengal Eng. Coll., Howrah, India
fYear :
2003
fDate :
3-6 Aug. 2003
Firstpage :
398
Abstract :
Extracting structural information from the table of contents (TOC) helps in developing a digital document library, and doing so first requires identifying and segmenting the TOC page. The objective of a digital document library is to provide a non-labour-intensive, cheap and flexible way of storing, representing and managing paper documents in electronic form, so as to facilitate indexing, viewing, printing and extraction of the intended portions. Information extracted from the TOC pages is used in a document database for effective retrieval of the required pages. We present a fully automatic method for identifying and segmenting a table of contents (TOC) page from a scanned document.
Keywords :
character recognition; digital libraries; document image processing; image segmentation; information retrieval; visual databases; TOC page identification; automated detection; automatic identification; digital document library development; document database; document image segmentation; document images; electronic form; information extraction; information retrieval; nonlabour intensive document storage; page segmentation; paper document; scanned document; structural information; table of contents detection;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Document Analysis and Recognition, 2003. Proceedings. Seventh International Conference on
Print_ISBN :
0-7695-1960-1
Type :
conf
DOI :
10.1109/ICDAR.2003.1227697
Filename :
1227697