DocumentCode
2145986
Title
Handwritten and Typewritten Text Identification and Recognition Using Hidden Markov Models
Author
Cao, Huaigu ; Prasad, Rohit ; Natarajan, Prem
Author_Institution
Raytheon BBN Technologies, Cambridge, MA, USA
fYear
2011
fDate
18-21 Sept. 2011
Firstpage
744
Lastpage
748
Abstract
We present a system for identifying and recognizing handwritten and typewritten text in document images using hidden Markov models (HMMs). Our text type identification uses OCR decoding to generate word boundaries, followed by word-level handwritten/typewritten identification using HMMs. We show that the contextual constraints from the HMM significantly improve identification performance over the conventional Gaussian mixture model (GMM)-based method. Type identification is then used to estimate the frame sample rate and frame width of the feature sequences for the HMM OCR system, for each type independently. This type-dependent approach to computing the frame sample rate and frame width shows significant improvement in OCR accuracy over type-independent approaches.
Keywords
Gaussian processes; document image processing; feature extraction; handwritten character recognition; hidden Markov models; image recognition; text analysis; word processing; Gaussian mixture model-based method; HMM OCR system; OCR accuracy; OCR decoding; contextual constraint; document image; feature sequence; handwritten text identification; handwritten text recognition; hidden Markov model; typewritten text identification; typewritten text recognition; word boundary; Adaptation models; Classification algorithms; Error analysis; Feature extraction; Hidden Markov models; Optical character recognition software; Training; Gaussian mixture model; hidden Markov model; optical character recognition;
fLanguage
English
Publisher
IEEE
Conference_Titel
Document Analysis and Recognition (ICDAR), 2011 International Conference on
Conference_Location
Beijing
ISSN
1520-5363
Print_ISBN
978-1-4577-1350-7
Electronic_ISBN
1520-5363
Type
conf
DOI
10.1109/ICDAR.2011.155
Filename
6065410