Title :
Spot It! Finding Words and Patterns in Historical Documents
Author :
Dovgalecs, Vladislavs ; Burnett, Alexandre ; Tranouez, Pierrick ; Nicolas, S. ; Heutte, Laurent
Author_Institution :
LITIS, Univ. de Rouen, St. Étienne du Rouvray, France
Abstract :
We propose a system designed to spot either words or patterns, based on a user-made query. Employing a two-stage approach, it takes advantage of the descriptive power of the Bag of Visual Words (BOVW) representation and the discriminative power of the proposed Longest Weighted Profile (LWP) algorithm. First, we identify the zones of images that share common characteristics with the query, as summed up in a BOVW. Then, we filter these zones using the LWP, which introduces spatial constraints extracted from the query. We have validated our system on the George Washington handwritten document database for word spotting, and on medieval manuscripts from the DocExplore project for pattern spotting.
Keywords :
document image processing; filtering theory; image retrieval; BOVW representation; DocExplore project; George Washington handwritten document database; LWP algorithm; bag of visual words representation; historical documents; longest weighted profile algorithm; medieval manuscripts; pattern spotting; user made query; word spotting; Clustering algorithms; Databases; Feature extraction; Hidden Markov models; Robustness; Training; Visualization; document image analysis; document understanding; segmentation free
Conference_Title :
Document Analysis and Recognition (ICDAR), 2013 12th International Conference on
Conference_Location :
Washington, DC
DOI :
10.1109/ICDAR.2013.208