DocumentCode :
2509749
Title :
An Information Extraction Model for Unconstrained Handwritten Documents
Author :
Thomas, Simon ; Chatelain, Clément ; Heutte, Laurent ; Paquet, Thierry
Author_Institution :
LITIS, Univ. de Rouen, St. Etienne du Rouvray, France
fYear :
2010
fDate :
23-26 Aug. 2010
Firstpage :
3412
Lastpage :
3415
Abstract :
In this paper, a new information extraction system based on statistical shallow parsing of unconstrained handwritten documents is introduced. Unlike classical approaches found in the literature, such as keyword spotting or full document recognition, our approach relies on a strong global handwriting model. An entire text line is considered as an indivisible entity and is modeled with Hidden Markov Models. In this way, text line shallow parsing allows fast extraction of the relevant information in any document while at the same time rejecting irrelevant information. First results are promising and show the interest of the approach.
Keywords :
document handling; handwriting recognition; hidden Markov models; information retrieval; statistical analysis; full document recognition; information extraction model; keyword spotting; statistical shallow parsing; text line shallow parsing; unconstrained handwritten documents; Data mining; Databases; Feature extraction; Numerical models; Postal services; information extraction; shallow parsing model
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Pattern Recognition (ICPR), 2010 20th International Conference on
Conference_Location :
Istanbul
ISSN :
1051-4651
Print_ISBN :
978-1-4244-7542-1
Type :
conf
DOI :
10.1109/ICPR.2010.833
Filename :
5597527