Title :
DVHMM: variable length text recognition error model
Author :
Takasu, Atsuhiro ; Aihara, Kenro
Author_Institution :
Nat. Inst. of Informatics, Tokyo, Japan
Abstract :
This paper proposes a text recognition error model called the dual variable length output hidden Markov model (DVHMM) and gives a parameter estimation algorithm for it based on the EM algorithm. Whereas existing probabilistic error models are limited to substitution (1, 1), insertion (1, 0), and deletion (0, 1) errors, the DVHMM can handle error patterns with any pair of lengths (i, j), with substitution, insertion, and deletion as special cases.
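The length-pair notation in the abstract can be illustrated with a minimal sketch. This is not the DVHMM itself, and it involves no probabilities or parameter estimation. It only labels a pair of aligned string fragments by their lengths (i, j), using the three labels the abstract gives. The function name `classify_pattern` and the neutral argument names `a` and `b` are hypothetical, and the abstract does not say which side is the original text and which is the recognized text, so the sketch makes no assumption about that.

```python
def classify_pattern(a: str, b: str) -> str:
    """Label a pair of aligned fragments by its length pair (i, j).

    Labels follow the abstract: (1, 1) substitution, (1, 0) insertion,
    (0, 1) deletion. Any other pair is reported as a general pattern.
    Which fragment is the original and which is the recognized text
    is left open here, as it is in the abstract.
    """
    i, j = len(a), len(b)
    if (i, j) == (0, 0):
        # An empty pair carries no error information.
        raise ValueError("at least one fragment must be non-empty")
    if (i, j) == (1, 1):
        return "substitution"
    if (i, j) == (1, 0):
        return "insertion"
    if (i, j) == (0, 1):
        return "deletion"
    return f"general ({i}, {j})"


if __name__ == "__main__":
    # A one-to-two pattern such as "m" against "rn" falls outside
    # the three classical error types.
    for pair in [("a", "b"), ("x", ""), ("", "x"), ("m", "rn")]:
        print(pair, classify_pattern(*pair))
```

A pair such as ("m", "rn") has lengths (1, 2), which is the kind of pattern that models restricted to (1, 1), (1, 0), and (0, 1) cannot express as a single error.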
Keywords :
document image processing; errors; hidden Markov models; optical character recognition; parameter estimation; probability; DVHMM; OCR; deletion; document recognition; dual variable length output hidden Markov model; insertion; probabilistic error models; substitution; variable length text recognition error model; Automata; Character recognition; Error correction; Hidden Markov models; Informatics; Matrices; Optical character recognition software; Pattern recognition; Speech recognition; Text recognition
Conference_Title :
Pattern Recognition, 2002. Proceedings. 16th International Conference on
Print_ISBN :
0-7695-1695-X
DOI :
10.1109/ICPR.2002.1047807