Title :
Word-Graph Based Handwriting Key-Word Spotting: Impact of Word-Graph Size on Performance
Author :
Toselli, Alejandro Hector ; Vidal, Enrique
Author_Institution :
Univ. Politec. de Valencia, Valencia, Spain
Abstract :
Key-Word Spotting (KWS) in handwritten documents is approached here by means of Word Graphs (WG) obtained using segmentation-free handwritten text recognition technology based on N-gram Language Models and Hidden Markov Models. Linguistic context significantly boosts KWS performance with respect to methods that ignore word contexts and/or rely on image matching with pre-segmented isolated words. On the other hand, WG-based KWS can be significantly faster than other KWS approaches that work directly on the original images, where computational demands are generally exceedingly high. A large WG contains most of the information from the original text (line) image that is needed for KWS but, if it is too large, the computational advantages over traditional, image matching-based KWS are diminished. Conversely, if it is too small, relevant information may be lost, leading to degraded KWS precision/recall performance. We study the trade-off between WG size and KWS information retrieval performance. Results show that small, computationally cheap WGs can be used without losing the excellent KWS performance achieved with huge WGs.
Keywords :
computational linguistics; data structures; document image processing; handwritten character recognition; hidden Markov models; image matching; image segmentation; information retrieval; text analysis; KWS information retrieval performance; N-gram language models; WG-based KWS; handwritten documents; hidden Markov models; image matching-based KWS; linguistic context; presegmented isolated words; segmentation-free handwritten text recognition technology; text image; word-graph based handwriting key-word spotting; word-graph size; Decoding; Hidden Markov models; Image segmentation; Training; Vectors; Viterbi algorithm; Vocabulary; KWS performance effect; Word-Graph size; word-graph generation cost;
Conference_Title :
Document Analysis Systems (DAS), 2014 11th IAPR International Workshop on
Conference_Location :
Tours
Print_ISBN :
978-1-4799-3243-6
DOI :
10.1109/DAS.2014.65