DocumentCode
3497808
Title
Detection and segmentation of table of contents and index pages from document images
Author
Mandal, S. ; Chowdhury, S.P. ; Das, A.K. ; Chanda, Bhabatosh
Author_Institution
Dept. of Comput. Sci. & Technol., Bengal Eng. & Sci. Univ., Howrah
fYear
2006
fDate
27-28 April 2006
Lastpage
81
Abstract
Identification and segmentation of the table of contents (TOC) and index pages for the development of a digital library is an obvious task. A digital document library is created to provide a non-labour-intensive, cheap and flexible way of storing, representing and managing paper documents in electronic form, to facilitate indexing, viewing, printing and extracting the intended portions. Using document image analysis techniques, information from the TOC and index pages may be extracted for use in a document database for effective retrieval of the required pieces of information. In this paper, we present fully automatic identification and segmentation of TOC and index pages from scanned documents.
Keywords
digital libraries; document image processing; image segmentation; indexing; information retrieval; digital document library; document database; document image analysis; document images; electronic document storage; fully automatic table of contents detection; fully automatic table of contents identification; index pages detection; index pages identification; index pages segmentation; information extraction; information retrieval; paper document management; paper document representation; paper document storage; scanned documents; table of contents segmentation
fLanguage
English
Publisher
IEEE
Conference_Titel
Document Image Analysis for Libraries, 2006. DIAL '06. Second International Conference on
Conference_Location
Lyon
Print_ISBN
0-7695-2531-8
Type
conf
DOI
10.1109/DIAL.2006.13
Filename
1612948