Title :
Identifying contents page of documents
Author :
Luo, Qin ; Watanabe, Toyohide ; Nakayama, Takeshi
Author_Institution :
Fac. of Eng., Toyama Univ., Japan
Abstract :
Contents pages of a document include useful information such as the list of contents, the hierarchical organization of the document, and the location of each component (page number). Identifying contents pages is therefore an effective way to establish the reference information of documents automatically. In this paper, we propose a recognition method for contents pages of documents: it not only extracts the meaningful data from contents page images and classifies them into distinct items, but also distinguishes contents pages from other pages. As many studies reported to date indicate, successful methods for document analysis depend on the effective application of object-specific information. In this sense, the recognition of document types or classes is an important complement to document analysis.
Keywords :
document image processing; image recognition; contents page identification; contents page recognition; documents; hierarchical organization; object-specific information; Connectors; Data mining; Image analysis; Information analysis; Software libraries; Testing; Text analysis;
Conference_Title :
Pattern Recognition, 1996., Proceedings of the 13th International Conference on
Conference_Location :
Vienna
Print_ISBN :
0-8186-7282-X
DOI :
10.1109/ICPR.1996.547035