DocumentCode :
2563802
Title :
Autonomously normalized horizontal differentials as features for HMM-based Omni font-written OCR systems for cursively scripted languages
Author :
Attia, Mohamed ; Rashwan, Mohsen A A ; El-Mahallawy, Mohamed S M
Author_Institution :
Eng. Co. for the Dev. of Comput. Syst., RDI, Egypt
fYear :
2009
fDate :
18-19 Nov. 2009
Firstpage :
185
Lastpage :
190
Abstract :
Automatic font-written Optical Character Recognition (OCR) is highly desirable for numerous modern information technology (IT) applications. Reliable font-written OCR systems for Latin scripts have long been in use. For cursively scripted languages, which are the mother tongues of over one fourth of the world population, OCR systems with robust and reliable performance are not yet available. The main challenge is the mandatory connectivity of characters/ligatures (i.e. graphemes), which has to be resolved simultaneously with the recognition of these graphemes. Among the various approaches tried over decades, Hidden Markov Model (HMM)-based OCR systems seem to be the most promising, as they capitalize on the ability of HMM decoders to achieve segmentation and recognition simultaneously, as in the widely used HMM-based automatic speech recognition (ASR). Unlike ASR systems, HMM-based OCR systems still lack a rigorously founded feature vector capable of robustly achieving minimal “font type/size-independent” (omnifont) word error rates comparable to those realized with Latin scripts. This paper contributes such a feature vector design and experimentally shows its superiority in this regard.
Keywords :
hidden Markov models; image segmentation; optical character recognition; HMM-based omni font; cursively scripted languages; font-written OCR systems; grapheme recognition; image recognition; normalized horizontal differentials; Application software; Automatic speech recognition; Character recognition; Error analysis; Image processing; Optical character recognition software; Robustness; Signal processing; Writing
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Signal and Image Processing Applications (ICSIPA), 2009 IEEE International Conference on
Conference_Location :
Kuala Lumpur
Print_ISBN :
978-1-4244-5560-7
Type :
conf
DOI :
10.1109/ICSIPA.2009.5478619
Filename :
5478619