DocumentCode
2142391
Title
Keyword Spotting in Offline Chinese Handwritten Documents Using a Statistical Model
Author
Huang, Liang ; Yin, Fei ; Chen, Qing-Hu ; Liu, Cheng-Lin
Author_Institution
Sch. of Electron. Inf., Wuhan Univ., Wuhan, China
fYear
2011
fDate
18-21 Sept. 2011
Firstpage
78
Lastpage
82
Abstract
This paper proposes a method for keyword spotting in offline Chinese handwritten documents using a statistical model. Given a text query word, the method measures the similarity between the query word and every candidate word in the document by combining a character classifier with four classifiers that characterize the geometric contexts. Text lines are over-segmented into primitive segments, candidate characters and words are generated by concatenating consecutive segments, and a beam search strategy is used to search the candidate words. The character classifier and the model combining weights are trained by optimizing a one-vs-all discrimination objective so as to maximize the similarity of true words and minimize the similarity of imposters. In experiments on a test dataset containing 1,015 pages by 180 writers, the proposed method yields promising performance. For retrieving four-character words, the recall, precision and F-measure are 92.47%, 83.76% and 87.90%, respectively.
Keywords
document handling; pattern classification; query processing; beam search strategy; character classifier; consecutive segment concatenation; geometric context characterization; keyword spotting method; offline Chinese handwritten documents; one-vs-all discrimination objective; statistical model; text query word; Context; Context modeling; Feature extraction; Image segmentation; Prototypes; Support vector machines; Training; Chinese handwritten documents; Keyword spotting; statistical model
fLanguage
English
Publisher
IEEE
Conference_Titel
Document Analysis and Recognition (ICDAR), 2011 International Conference on
Conference_Location
Beijing
ISSN
1520-5363
Print_ISBN
978-1-4577-1350-7
Electronic_ISBN
1520-5363
Type
conf
DOI
10.1109/ICDAR.2011.25
Filename
6065280