Title :
A Simple and Fast Word Spotting Method
Author :
Kovalchuk, Alon ; Wolf, Lior ; Dershowitz, Nachum
Author_Institution :
Blavatnik Sch. of Comput. Sci., Tel Aviv Univ., Tel Aviv, Israel
Abstract :
A simple and efficient pipeline for word spotting in handwritten documents is proposed. The method allows for extremely rapid querying while still maintaining high accuracy. The dataset images to be queried are preprocessed by a simple binarization operation, followed by the extraction of multiple overlapping candidate targets. Each binary target, as well as the binarized query, is resized to fit a fixed-size rectangle and represented by conventional image descriptors. A cosine similarity operator, followed by maximum pooling over random groups, is then used to represent each target or query as a concise 250D vector. Retrieval is performed in a fraction of a second by nearest-neighbor search within that space, followed by a simple suppression of extra overlapping candidates.
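The encoding and retrieval steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not say what the cosine similarity is computed against, so the sketch assumes a hypothetical set of reference descriptors (`exemplars`). The descriptor length, group size, Euclidean distance for the nearest-neighbor search, and the intersection-over-union threshold used for suppression are likewise assumptions, and all function names are illustrative.

```python
import math
import random

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def make_groups(n_exemplars, n_groups=250, group_size=4, seed=0):
    """Draw random groups of exemplar indices; each group gives one output dimension."""
    rng = random.Random(seed)
    return [rng.sample(range(n_exemplars), group_size) for _ in range(n_groups)]

def encode(descriptor, exemplars, groups):
    """Cosine similarity to every exemplar, then max pooling within each group."""
    sims = [cosine(descriptor, e) for e in exemplars]
    return [max(sims[i] for i in group) for group in groups]

def sq_dist(u, v):
    """Squared Euclidean distance (assumed metric for the nearest-neighbor search)."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def iou(a, b):
    """Intersection-over-union of two boxes given as (x0, y0, x1, y1)."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def spot(query_vec, target_vecs, boxes, top_k=10, max_overlap=0.5):
    """Rank candidates by distance to the query, greedily skipping overlapping boxes."""
    order = sorted(range(len(target_vecs)),
                   key=lambda i: sq_dist(query_vec, target_vecs[i]))
    kept = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= max_overlap for j in kept):
            kept.append(i)
        if len(kept) == top_k:
            break
    return kept

# Toy data: random 16-D vectors stand in for real image descriptors.
rng = random.Random(1)
exemplars = [[rng.random() for _ in range(16)] for _ in range(40)]
groups = make_groups(len(exemplars))
targets = [[rng.random() for _ in range(16)] for _ in range(5)]
boxes = [(0, 0, 10, 10), (1, 0, 11, 10), (50, 0, 60, 10),
         (100, 0, 110, 10), (150, 0, 160, 10)]
target_vecs = [encode(t, exemplars, groups) for t in targets]
query_vec = encode(targets[0], exemplars, groups)  # query identical to target 0
hits = spot(query_vec, target_vecs, boxes, top_k=3)
```

In this toy run the query is a copy of the first candidate, so that candidate ranks first, and the second candidate is dropped because its box overlaps the first almost entirely. A real system would replace the random vectors with descriptors computed from the binarized, resized word images.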
Keywords :
document image processing; handwritten character recognition; image representation; image retrieval; vectors; 250D vector; binarization operation; conventional image descriptor representation; cosine similarity operator; dataset image querying; handwritten documents; nearest-neighbor search; word spotting method; Accuracy; Benchmark testing; Image segmentation; Pipelines; Standards; Training; Vectors
Conference_Title :
Frontiers in Handwriting Recognition (ICFHR), 2014 14th International Conference on
Conference_Location :
Heraklion
Print_ISBN :
978-1-4799-4335-7
DOI :
10.1109/ICFHR.2014.9