Title :
Language-independent OCR using a continuous speech recognition system
Author :
Schwartz, Richard ; LaPre, Christopher ; Makhoul, John ; Raphael, Christopher ; Zhao, Ying
Author_Institution :
BBN Syst. & Technol. Corp., Cambridge, MA, USA
Abstract :
In this paper we show how continuous speech recognition methods can be used for character recognition, resulting in a technology that is language independent and does not require presegmentation of the data at the character and word levels. In multifont experiments on the ARPA Arabic OCR Corpus, an average character error rate of 1.9% is obtained using the BBN BYBLOS continuous speech recognition system with no modifications. A first experiment using the identical system and procedures, trained and tested on a subset of the English University of Washington OCR corpus, resulted in a 1.4% character error rate.
Keywords :
hidden Markov models; optical character recognition; speech recognition; ARPA Arabic OCR Corpus; BBN BYBLOS continuous speech recognition system; continuous speech recognition system; language-independent OCR; multifont experiments; Character recognition; Error analysis; Error correction; Hidden Markov models; Image segmentation; Natural languages; Optical character recognition software; Pattern recognition; Speech recognition; System testing
Conference_Title :
Proceedings of the 13th International Conference on Pattern Recognition, 1996
Conference_Location :
Vienna
Print_ISBN :
0-8186-7282-X
DOI :
10.1109/ICPR.1996.546802