DocumentCode :
2261591
Title :
Multi-font recognition of printed Arabic using the BBN BYBLOS speech recognition system
Author :
LaPre, Christopher ; Zhao, Ying ; Raphael, Christopher ; Schwartz, Richard ; Makhoul, John
Author_Institution :
BBN Syst. & Technol. Corp., Cambridge, MA, USA
Volume :
4
fYear :
1996
fDate :
7-10 May 1996
Firstpage :
2136
Abstract :
We use a hidden Markov model (HMM) based continuous speech recognition system to perform off-line character recognition (OCR) of Arabic printed text. The HMM trainer and recognizer are used without change; however, we modify the feature extraction stage to compute features relevant to OCR. Although we begin by segmenting the page into a collection of lines, no further segmentation is necessary for either recognition or training. Experiments on the ARPA Arabic data corpus yield a range of character error rates from under one percent for a single computer font to 2.8% for multiple-font recognition of a wide range of material from books, magazines and newspapers.
Keywords :
feature extraction; hidden Markov models; image segmentation; optical character recognition; speech recognition; ARPA Arabic data corpus; BBN BYBLOS speech recognition system; HMM; HMM recognizer; HMM trainer; books; character error rates; continuous speech recognition system; experiments; feature extraction; hidden Markov model; magazines; multifont recognition; newspapers; off-line character recognition; page segmentation; printed Arabic; single computer font; training; Character recognition; Error analysis; Feature extraction; Handwriting recognition; Hidden Markov models; Histograms; Optical character recognition software; Optical materials; Speech recognition; Text recognition;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech, and Signal Processing, 1996. ICASSP-96. Conference Proceedings., 1996 IEEE International Conference on
Conference_Location :
Atlanta, GA
ISSN :
1520-6149
Print_ISBN :
0-7803-3192-3
Type :
conf
DOI :
10.1109/ICASSP.1996.545738
Filename :
545738