DocumentCode
2397273
Title
Language-independent OCR using a continuous speech recognition system
Author
Schwartz, Richard ; LaPre, Christopher ; Makhoul, John ; Raphael, Christopher ; Zhao, Ying
Author_Institution
BBN Syst. & Technol. Corp., Cambridge, MA, USA
Volume
3
fYear
1996
fDate
25-29 Aug 1996
Firstpage
99
Abstract
In this paper we show how continuous speech recognition methods can be used for character recognition, resulting in a technology that is language independent and does not require presegmentation of the data at the character and word levels. In multifont experiments on the ARPA Arabic OCR Corpus, an average character error rate of 1.9% is obtained using the BBN BYBLOS continuous speech recognition system with no modifications. A first experiment using the identical system and procedures, trained and tested on a subset of the English Univ. of Washington OCR corpus, resulted in 1.4% character error
Keywords
hidden Markov models; optical character recognition; speech recognition; ARPA Arabic OCR Corpus; BBN BYBLOS continuous speech recognition system; continuous speech recognition system; language-independent OCR; multifont experiments; Character recognition; Error analysis; Error correction; Hidden Markov models; Image segmentation; Natural languages; Optical character recognition software; Pattern recognition; Speech recognition; System testing
fLanguage
English
Publisher
IEEE
Conference_Titel
Pattern Recognition, 1996., Proceedings of the 13th International Conference on
Conference_Location
Vienna
ISSN
1051-4651
Print_ISBN
0-8186-7282-X
Type
conf
DOI
10.1109/ICPR.1996.546802
Filename
546802