DocumentCode :
2143512
Title :
Efficient Cut-Off Threshold Estimation for Word Spotting Applications
Author :
Kesidis, A.L. ; Gatos, B.
Author_Institution :
Dept. of Surveying Eng., Technol. Educ. Instn. of Athens, Athens, Greece
fYear :
2011
fDate :
18-21 Sept. 2011
Firstpage :
279
Lastpage :
283
Abstract :
Word spotting is an alternative methodology for document indexing based on spotting words directly on document images with the help of efficient word matching while avoiding the conventional OCR procedure. The result of the word spotting procedure is a list of word images ranked according to a certain similarity criterion. In this paper, we propose an efficient method to cut off the ranked list in order to provide the best tradeoff between recall and precision rates. Our aim is to filter the most relevant results based on a threshold that corresponds to an approximate maximization of the expected F-Measure. This is achieved by introducing an estimator that combines the distance of each ranked word with its cumulative moving average. Experimental results on a database of representative historical printed documents demonstrate the efficiency of the proposed approach.
Keywords :
document image processing; image matching; F-measure; OCR procedure; document images; document indexing; efficient cut-off threshold estimation; similarity criterion; word matching; word spotting application; Approximation methods; Image segmentation; Indexing; Measurement; Text analysis; Vectors; cut-off threshold; document indexing; word spotting;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Document Analysis and Recognition (ICDAR), 2011 International Conference on
Conference_Location :
Beijing
ISSN :
1520-5363
Print_ISBN :
978-1-4577-1350-7
Electronic_ISBN :
1520-5363
Type :
conf
DOI :
10.1109/ICDAR.2011.64
Filename :
6065319