DocumentCode
3020090
Title
Zone identification in the printed Gujarati text
Author
Dholakia, Jignesh ; Negi, Atul ; Mohan, S. Rama
Author_Institution
Dept. of Appl. Math., M. S. Univ. of Baroda, Gujarat, India
fYear
2005
fDate
29 Aug.-1 Sept. 2005
Firstpage
272
Abstract
Gujarati is a language from the Indo-Aryan family of languages, used by 50 million people in the western part of India. The Gujarati script, used to write the Gujarati language, is a multilevel script written in three zones: the base character zone, the upper modifier zone and the lower modifier zone. Several characters are distinguished by specific modifiers that appear in the upper and lower zones. Hence, detecting the zone boundaries is an important task in Gujarati OCR. Although the Gujarati script is in some respects related to the Devanagari script, certain peculiar differences prevent the use of zone boundary detection techniques already known for scripts such as Bengali, Assamese and Devanagari, for which mature OCR systems already exist. There is only one previously documented effort on Gujarati OCR, which discussed an approach to recognizing a small subset of the Gujarati alphabet. The present paper proposes a method for accurate zone detection in images of printed Gujarati text. This approach is expected to smooth the way for the design and development of Gujarati OCR systems for complete character sets.
Keywords
document image processing; edge detection; natural languages; optical character recognition; text analysis; Devanagari script; Gujarati OCR systems; Gujarati language; character sets; printed Gujarati text; zone boundary detection; Character recognition; Computational efficiency; Detection algorithms; Natural languages; Optical character recognition software; Shape; Text analysis;
fLanguage
English
Publisher
IEEE
Conference_Titel
Document Analysis and Recognition, 2005. Proceedings. Eighth International Conference on
ISSN
1520-5263
Print_ISBN
0-7695-2420-6
Type
conf
DOI
10.1109/ICDAR.2005.258
Filename
1575552