DocumentCode :
2397273
Title :
Language-independent OCR using a continuous speech recognition system
Author :
Schwartz, Richard ; LaPre, Christopher ; Makhoul, John ; Raphael, Christopher ; Zhao, Ying
Author_Institution :
BBN Syst. & Technol. Corp., Cambridge, MA, USA
Volume :
3
fYear :
1996
fDate :
25-29 Aug 1996
Firstpage :
99
Abstract :
In this paper we show how continuous speech recognition methods can be used for character recognition, resulting in a technology that is language independent and does not require presegmentation of the data at the character and word levels. In multifont experiments on the ARPA Arabic OCR Corpus, an average character error rate of 1.9% is obtained using the BBN BYBLOS continuous speech recognition system with no modifications. A first experiment using the identical system and procedures, trained and tested on a subset of the English University of Washington OCR corpus, resulted in a 1.4% character error rate.
Keywords :
hidden Markov models; optical character recognition; speech recognition; ARPA Arabic OCR Corpus; BBN BYBLOS continuous speech recognition system; continuous speech recognition system; language-independent OCR; multifont experiments; Character recognition; Error analysis; Error correction; Hidden Markov models; Image segmentation; Natural languages; Optical character recognition software; Pattern recognition; Speech recognition; System testing
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Pattern Recognition, 1996., Proceedings of the 13th International Conference on
Conference_Location :
Vienna
ISSN :
1051-4651
Print_ISBN :
0-8186-7282-X
Type :
conf
DOI :
10.1109/ICPR.1996.546802
Filename :
546802