Title :
Page classification through logical labelling
Author :
Liang, Jian ; Doermann, David ; Ma, Matthew ; Guo, Jinhong K.
Author_Institution :
Language & Media Process. Lab., Maryland Univ., College Park, MD, USA
Abstract :
We propose an integrated approach to page classification and logical labelling. Layout is represented by a fully connected attributed relational graph that is matched to the graph of an unknown document, achieving classification and labelling simultaneously. By incorporating global constraints in an integrated fashion, ambiguity at the zone level can be reduced, providing robustness to noise and variation. Models are automatically trained from sample documents. Experimental results show promise for the classification and labelling of technical article title pages, and support the idea of a hierarchical model base.
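The matching idea in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm. It assumes, purely for illustration, that each zone is reduced to a normalised (x, y) centre, that node cost is the distance between a model zone and a page zone, that edge cost is the difference in relative offset for every pair of zones (the fully connected part), and that the best assignment is found by brute force. The names `match_cost` and `classify`, the model names, and the coordinates are all hypothetical.

```python
from itertools import permutations
from math import hypot


def match_cost(model, zones):
    """Best assignment of a model's labelled nodes to page zones.

    model: dict mapping label -> (x, y) expected zone centre (assumed attributes)
    zones: list of (x, y) centres found on the unknown page
    Returns (cost, labelling), where labelling maps label -> zone index.
    """
    labels = list(model)
    best = (float("inf"), None)
    if len(labels) > len(zones):
        return best  # not enough zones to host every model node
    for perm in permutations(range(len(zones)), len(labels)):
        cost = 0.0
        # Node term: how far each zone sits from where the model expects it.
        for lab, zi in zip(labels, perm):
            cost += hypot(model[lab][0] - zones[zi][0],
                          model[lab][1] - zones[zi][1])
        # Edge term: every pair of nodes is compared (fully connected graph),
        # which acts as a global constraint on the whole layout.
        for i in range(len(labels)):
            for j in range(i + 1, len(labels)):
                ax, ay = model[labels[i]]
                bx, by = model[labels[j]]
                cx, cy = zones[perm[i]]
                dx, dy = zones[perm[j]]
                cost += hypot((bx - ax) - (dx - cx), (by - ay) - (dy - cy))
        if cost < best[0]:
            best = (cost, dict(zip(labels, perm)))
    return best


def classify(models, zones):
    """Pick the model with the lowest matching cost; its assignment is the labelling."""
    results = {name: match_cost(m, zones) for name, m in models.items()}
    name = min(results, key=lambda n: results[n][0])
    return name, results[name][1]


# Hypothetical models and page zones, for illustration only.
models = {
    "article_title_page": {"title": (0.5, 0.1), "author": (0.5, 0.2),
                           "abstract": (0.5, 0.4)},
    "two_column_form": {"left": (0.2, 0.5), "right": (0.8, 0.5)},
}
zones = [(0.5, 0.41), (0.5, 0.1), (0.52, 0.2)]
page_class, labelling = classify(models, zones)
print(page_class, labelling)
```

Because the page class and the zone labels come out of the same minimisation, classification and labelling happen in one step, as the abstract describes. The brute-force search is only practical for a handful of zones, and the raw cost is not normalised by model size, so a real system would need a more careful cost function and search strategy.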
Keywords :
document image processing; graph theory; image classification; optical character recognition; OCR; attributed relational graph; document images; experimental results; global constraints; hierarchical model base; labelling; logical labelling; noise; page classification; technical article title pages; unknown document; Companies; Educational institutions; Hardware; Image databases; Labeling; Laboratories; Noise level; Noise reduction; Noise robustness; Optical character recognition software;
Conference_Title :
Pattern Recognition, 2002. Proceedings. 16th International Conference on
Print_ISBN :
0-7695-1695-X
DOI :
10.1109/ICPR.2002.1047980