Regional Information Center for Science and Technology - Autonomously normalized horizontal differentials as features for HMM-based Omni font-written OCR systems for cursively scripted languages

DocumentCode :

2563802

Title :

Autonomously normalized horizontal differentials as features for HMM-based Omni font-written OCR systems for cursively scripted languages

Author :

Attia, Mohamed ; Rashwan, Mohsen A A ; El-Mahallawy, Mohamed S M

Author_Institution :

Eng. Co. for the Dev. of Comput. Syst., RDI, Egypt

Year :

2009

Date :

18-19 Nov. 2009

Firstpage :

185

Lastpage :

190

Abstract :

Automatic font-written Optical Character Recognition (OCR) is highly desirable for numerous modern information technology (IT) applications. Reliable font-written OCR systems for Latin scripts have long been in use. For cursively scripted languages, which are the mother tongues of over one fourth of the world population, such OCR systems are, however, not yet available with robust and reliable performance. The main challenge is the mandatory connectivity of characters/ligatures (i.e. graphemes), which must be resolved at the same time as the graphemes are recognized. Among the various approaches tried over decades, Hidden Markov Model (HMM)-based OCR systems seem the most promising, as they capitalize on the ability of HMM decoders to perform segmentation and recognition simultaneously, as in the widely used HMM-based automatic speech recognition (ASR). Unlike in ASR, what is missing in HMM-based OCR is a rigorously founded feature vector capable of robustly achieving minimal “font type/size-independent” (omnifont) word error rates comparable to those realized with Latin scripts. This paper contributes such a feature vector design and experimentally shows its superiority in this regard.

Keywords :

hidden Markov models; image segmentation; optical character recognition; HMM-based omni font; cursively scripted languages; font-written OCR systems; grapheme recognition; image recognition; normalized horizontal differentials; Application software; Automatic speech recognition; Character recognition; Error analysis; Image processing; Optical character recognition software; Robustness; Signal processing; Writing

Language :

English

Publisher :

IEEE

Conference_Title :

Signal and Image Processing Applications (ICSIPA), 2009 IEEE International Conference on

Conference_Location :

Kuala Lumpur

Print_ISBN :

978-1-4244-5560-7

Type :

conf

DOI :

10.1109/ICSIPA.2009.5478619

Filename :

5478619

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2563802