DocumentCode :
591985
Title :
A Coarse-to-Fine Approach for Handwritten Word Spotting in Large Scale Historical Documents Collection
Author :
Almazan, Jon ; Fernandez, David ; Fornes, Alicia ; Llados, Josep ; Valveny, Ernest
Author_Institution :
Dept. Cienc. de la Computació, Univ. Autònoma de Barcelona, Barcelona, Spain
fYear :
2012
fDate :
18-20 Sept. 2012
Firstpage :
455
Lastpage :
460
Abstract :
In this paper we propose an approach for word spotting in handwritten document images. We state the problem from a focused retrieval perspective, i.e. locating instances of a query word in a large-scale dataset of digitized manuscripts. We combine two approaches: one based on word segmentation and one that is segmentation-free. The first approach uses a hashing strategy to coarsely prune word images that are unlikely to be instances of the query word. This process is fast but has low precision because of the errors introduced in the segmentation step. The regions containing candidate words are then passed to the second process, which is based on a state-of-the-art technique from the visual object detection field. This discriminative model represents the appearance of the query word and computes a similarity score. In this way we propose a coarse-to-fine approach that achieves a compromise between efficiency and accuracy. The model is validated on a collection of old handwritten manuscripts. We observe a substantial improvement in precision over the previously proposed method, at a small increase in computational cost.
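The two-stage idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the quantization-based hash, the linear scoring function, and every name here (`hash_key`, `build_index`, `coarse_candidates`, `fine_score`, `spot`, `bin_width`) are assumptions made for illustration, and each word image is assumed to be already reduced to a small numeric feature vector. The paper's second stage scores regions with an appearance model from visual object detection; a plain dot product stands in for it here.

```python
from collections import defaultdict


def hash_key(descriptor, bin_width=0.5):
    """Quantize a feature vector into a coarse bucket key (hypothetical scheme)."""
    return tuple(int(v // bin_width) for v in descriptor)


def build_index(word_descriptors, bin_width=0.5):
    """Coarse stage, offline: group segmented word images by their hash key."""
    index = defaultdict(list)
    for word_id, desc in word_descriptors.items():
        index[hash_key(desc, bin_width)].append(word_id)
    return index


def coarse_candidates(index, query_desc, bin_width=0.5):
    """Coarse stage, online: keep only words that share the query's bucket."""
    return list(index.get(hash_key(query_desc, bin_width), []))


def fine_score(weights, descriptor):
    """Fine stage: linear similarity score (stand-in for the discriminative model)."""
    return sum(w * x for w, x in zip(weights, descriptor))


def spot(index, word_descriptors, query_desc, weights, bin_width=0.5):
    """Prune with the hash index, then rank the survivors by the fine score."""
    candidates = coarse_candidates(index, query_desc, bin_width)
    scored = [(fine_score(weights, word_descriptors[c]), c) for c in candidates]
    return [c for _, c in sorted(scored, reverse=True)]


if __name__ == "__main__":
    # Toy descriptors for three segmented word images (made-up values).
    words = {"w1": [1.0, 2.0], "w2": [1.1, 2.2], "w3": [5.0, 0.1]}
    index = build_index(words)
    print(spot(index, words, query_desc=[1.2, 2.1], weights=[1.0, 1.0]))
```

The expensive scorer only runs on the words that survive the hash lookup, which is where the efficiency of a coarse-to-fine design comes from; the price is that anything wrongly pruned in the first stage can never be recovered by the second.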
Keywords :
cryptography; document image processing; handwritten character recognition; image retrieval; image segmentation; object detection; query processing; coarse-to-fine approach; digitized manuscript; discriminative model; focused retrieval perspective; handwritten document image; handwritten manuscript; handwritten word spotting; hashing strategy; historical document collection; query word; similarity score; visual object detection; word segmentation; Accuracy; Computational modeling; Histograms; Image segmentation; Training; Vectors; Visualization; appearance models; historical documents; word indexation; word spotting;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Frontiers in Handwriting Recognition (ICFHR), 2012 International Conference on
Conference_Location :
Bari
Print_ISBN :
978-1-4673-2262-1
Type :
conf
DOI :
10.1109/ICFHR.2012.151
Filename :
6424435