DocumentCode :
3134472
Title :
Script Independent Word Spotting in Offline Handwritten Documents Based on Hidden Markov Models
Author :
Wshah, S. ; Kumar, Girish ; Govindaraju, Vengatesan
fYear :
2012
fDate :
18-20 Sept. 2012
Firstpage :
14
Lastpage :
19
Abstract :
Keyword spotting aims to retrieve all instances of a given keyword from a document in any language. In this paper, we propose a novel script-independent, line-based word spotting framework for offline handwritten documents based on Hidden Markov Models. The methodology represents keywords in model space as sequences of character models and uses filler models to better represent background or non-keyword text. We propose a two-stage spotting framework in which the candidate keywords are further pruned using character-based and lexicon-based background models. The system handles large vocabularies without the need for word or character segmentation. The system has been evaluated on several public datasets in different languages: IAM for English, AMA for Arabic, and LAW for Devanagari. The system outperforms the modern line-based approach on the English, Arabic, and Devanagari datasets.
Keywords :
handwritten character recognition; hidden Markov models; AMA; Arabic; Devanagari; English; IAM; LAW; background model; hidden Markov model; keyword spotting; offline handwritten document; script independent line; script independent word spotting; vocabulary; word spotting framework; Computational modeling; Context; Context modeling; Feature extraction; Hidden Markov models; Testing; Training; Filler and Background Models; Handwriting Recognition; Hidden Markov Models; Script Independent; Spotting;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Frontiers in Handwriting Recognition (ICFHR), 2012 International Conference on
Conference_Location :
Bari
Print_ISBN :
978-1-4673-2262-1
Type :
conf
DOI :
10.1109/ICFHR.2012.264
Filename :
6424364