Title :
A method for document zone content classification
Author :
Wang, Yalin ; Phillips, Ihsin T. ; Haralick, Robert M.
Author_Institution :
Dept. of Electr. Eng., Washington Univ., Seattle, WA, USA
Abstract :
This paper describes an algorithm that classifies each given document zone into one of nine classes, and it provides a protocol for evaluating the algorithm's performance. The classification scheme uses an optimized binary decision tree and the Viterbi algorithm for a hidden Markov model (HMM) to find the optimal solution. The algorithm was trained and tested on a total of 24,177 zones from the 1,600 images in the UWCDROM III database. It achieves an accuracy rate of 98.45% with a mean false alarm rate of 0.50%.
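The Viterbi step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the decision tree and the HMM are combined, so this example assumes that some per-zone classifier supplies a log-score for each class and that class-to-class transition probabilities are known. The function name `viterbi`, the two-class toy setup, and all numbers are hypothetical.

```python
import math

def viterbi(obs_scores, log_start, log_trans):
    """Return the most likely class sequence for a run of zones.

    obs_scores[t][k]: log-score of zone t under class k (assumed input)
    log_start[k]:     log prior of class k for the first zone
    log_trans[j][k]:  log probability that class k follows class j
    """
    if not obs_scores:
        return []
    n = len(log_start)
    best = [log_start[k] + obs_scores[0][k] for k in range(n)]
    back = []  # back[t-1][k] = best predecessor of class k at zone t
    for t in range(1, len(obs_scores)):
        prev, best, pointers = best, [], []
        for k in range(n):
            j = max(range(n), key=lambda s: prev[s] + log_trans[s][k])
            pointers.append(j)
            best.append(prev[j] + log_trans[j][k] + obs_scores[t][k])
        back.append(pointers)
    # Trace the best path backwards from the best final class.
    state = max(range(n), key=lambda s: best[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path

# Toy data (hypothetical): two classes, three zones in reading order.
log = math.log
demo_start = [log(0.5), log(0.5)]
demo_trans = [[log(0.9), log(0.1)], [log(0.1), log(0.9)]]
demo_scores = [[log(0.9), log(0.1)], [log(0.4), log(0.6)], [log(0.9), log(0.1)]]
demo_path = viterbi(demo_scores, demo_start, demo_trans)
```

In this toy case the middle zone, taken alone, slightly favors class 1, but the transition model makes the sequence `[0, 0, 0]` more likely overall. This is the kind of contextual correction a sequence model can add on top of per-zone decisions.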
Keywords :
binary decision diagrams; decision trees; document image processing; hidden Markov models; image classification; image segmentation; performance evaluation; visual databases; HMM; UWCDROM III database; Viterbi algorithm; document zone content classification; false alarm rate; hidden Markov model; optimized binary decision tree; visual database; Classification tree analysis; Context modeling; Educational institutions; Image databases; Optimization methods; Spatial databases; Testing
Conference_Title :
Pattern Recognition, 2002. Proceedings. 16th International Conference on
Print_ISBN :
0-7695-1695-X
DOI :
10.1109/ICPR.2002.1047828