• DocumentCode
    3497808
  • Title
    Detection and segmentation of table of contents and index pages from document images
  • Author
    Mandal, S. ; Chowdhury, S.P. ; Das, A.K. ; Chanda, Bhabatosh
  • Author_Institution
    Dept. of Comput. Sci. & Technol., Bengal Eng. & Sci. Univ., Howrah
  • fYear
    2006
  • fDate
    27-28 April 2006
  • Lastpage
    81
  • Abstract
    Identification and segmentation of the table of contents (TOC) and index pages for the development of a digital library is an obvious task. A digital document library is created to provide a non-labour-intensive, cheap and flexible way of storing, representing and managing paper documents in electronic form, to facilitate indexing, viewing, printing and extracting the intended portions. Using document image analysis techniques, information from the TOC and index pages may be extracted for use in a document database for effective retrieval of the required pieces of information. In this paper, we present fully automatic identification and segmentation of TOC and index pages from scanned documents.
  • Keywords
    digital libraries; document image processing; image segmentation; indexing; information retrieval; digital document library; document database; document image analysis; document images; electronic document storage; fully automatic table of contents detection; fully automatic table of contents identification; index pages detection; index pages identification; index pages segmentation; information extraction; paper document management; paper document representation; paper document storage; scanned documents; table of contents segmentation
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Document Image Analysis for Libraries, 2006. DIAL '06. Second International Conference on
  • Conference_Location
    Lyon
  • Print_ISBN
    0-7695-2531-8
  • Type
    conf
  • DOI
    10.1109/DIAL.2006.13
  • Filename
    1612948