Title :
Identifying Devnagri characters
Abstract :
Text in Devnagri script, used for the Sanskrit, Hindi, Marathi, Nepali and Konkani languages, presents a single solid flank: there is no separation between the characters, and in Sanskrit there may be no separation between words either. In addition, each of the thirty-six consonants has more than twelve forms, which add marks to the principal character in the vertical direction, some above and some below it. There are also compound consonants. The net result is several thousand different forms or patterns that must be identified, and these may in addition be connected with each other without any visible separation. Prima facie the task appears impossible, but Devnagri script has some built-in rules which make recognising the individual characters not just possible but the simplest of tasks. The paper introduces the approach and analysis by which the OCR program was developed. Results obtained from the field tests are presented.
Keywords :
document image processing; optical character recognition; Devnagri character identification; Devnagri script; OCR; rules; Acoustic devices; Character recognition; Handwriting recognition; Magnetic heads; Natural languages; Optical character recognition software; Solids; Speech synthesis; Testing; Writing
Conference_Title :
Document Analysis and Recognition, 1999. ICDAR '99. Proceedings of the Fifth International Conference on
Conference_Location :
Bangalore
Print_ISBN :
0-7695-0318-7
DOI :
10.1109/ICDAR.1999.791876