Identifying Devnagri characters

Author

Karnik, R.R.

fYear

1999

fDate

20-22 Sep 1999

Firstpage

669

Lastpage

672

Abstract

Text in Devnagri script, used for the Sanskrit, Hindi, Marathi, Nepali and Konkani languages, presents a single solid flank: there is no separation between the characters, and in Sanskrit there may not be any separation between words either. Coupled with this, each of the thirty-six consonants has more than twelve forms, which add marks to the principal character in the vertical direction, some above and some below the main character. There are also compound consonants. The net result is that several thousand different forms or patterns must be identified, and these may in addition be connected to each other without any visible separation. Prima facie the task appears impossible, but Devnagri script has some built-in rules which make recognising the individual characters not just possible but the simplest of tasks. The paper introduces the approach and analysis by which the OCR program was developed. Results obtained from the field tests are presented.

Keywords

document image processing; optical character recognition; Devnagri character identification; Devnagri script; OCR; rules; Acoustic devices; Character recognition; Handwriting recognition; Magnetic heads; Natural languages; Optical character recognition software; Solids; Speech synthesis; Testing; Writing

fLanguage

English

Publisher

IEEE

Conference_Title

Document Analysis and Recognition, 1999. ICDAR '99. Proceedings of the Fifth International Conference on

Conference_Location

Bangalore

Print_ISBN

0-7695-0318-7

Type

conf

DOI

10.1109/ICDAR.1999.791876

Filename

791876