Title :
Automatic recognition of printed Oriya script
Author :
Chaudhuri, B.B. ; Pal, U. ; Mitra, M.
Author_Institution :
Comput. Vision & Pattern Recognition Unit, Indian Stat. Inst., Calcutta, India
fDate :
2001
Abstract :
The paper presents an optical character recognition (OCR) system for printed Oriya, a popular Indian script. Developing OCR for this script is difficult because a large number of characters have to be recognized. In the proposed system, the digitized document image is first passed through preprocessing modules such as skew correction, line segmentation, zone detection, and word and character segmentation. These modules have been developed by combining some conventional techniques with some newly proposed ones. Next, individual characters are recognized using a combination of stroke and run-number based features, along with features obtained from the concept of a water reservoir. The feature detection methods are simple and robust. A prototype of the system has been tested on a variety of printed Oriya material, and currently achieves 96.3% character-level accuracy on average.
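As an illustration of two steps named in the abstract, the sketch below shows line segmentation by horizontal projection profile and a simple run count over a row or column of pixels. This is a minimal sketch under stated assumptions, not the authors' implementation: the projection profile is a conventional technique chosen here for illustration, the run count is one plausible reading of "run-number based features", and the names `segment_lines` and `count_runs` and the list-of-lists image format are hypothetical.

```python
from typing import List, Tuple

# Assumed input format: a binary image as rows of pixels, 1 = ink, 0 = background.
Image = List[List[int]]


def segment_lines(image: Image) -> List[Tuple[int, int]]:
    """Return inclusive (top, bottom) row ranges of text lines.

    Uses a horizontal projection profile: rows with no ink separate lines.
    """
    profile = [sum(row) for row in image]  # ink pixels per row
    lines: List[Tuple[int, int]] = []
    start = None
    for y, ink in enumerate(profile):
        if ink > 0 and start is None:
            start = y                      # first inked row of a new line
        elif ink == 0 and start is not None:
            lines.append((start, y - 1))   # blank row closes the line
            start = None
    if start is not None:                  # line touching the bottom edge
        lines.append((start, len(profile) - 1))
    return lines


def count_runs(pixels: List[int]) -> int:
    """Count maximal runs of consecutive ink pixels in one row or column."""
    runs = 0
    previous = 0
    for p in pixels:
        if p == 1 and previous == 0:       # background-to-ink transition
            runs += 1
        previous = p
    return runs


page = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
    [1, 0, 0, 1],
    [0, 0, 0, 0],
]
line_ranges = segment_lines(page)
runs_in_row_1 = count_runs(page[1])
```

On this toy page, `line_ranges` holds one row range per detected text line and `runs_in_row_1` is the number of separate ink runs in the second row. A real page would need skew correction before the profile is computed, since tilted lines blur the blank gaps between them.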
Keywords :
document image processing; feature extraction; natural languages; optical character recognition; OCR; automatic recognition; character segmentation; digitized document image; feature detection methods; line segmentation; optical character recognition system; popular Indian script; preprocessing modules; printed Oriya script; run-number based features; skew correction; water reservoir; word segmentation; zone detection; Character recognition; Computer vision; Image segmentation; Materials testing; Optical character recognition software; Prototypes; Reservoirs; Robustness; System testing; Water resources;
Conference_Titel :
Document Analysis and Recognition, 2001. Proceedings. Sixth International Conference on
Conference_Location :
Seattle, WA
Print_ISBN :
0-7695-1263-1
DOI :
10.1109/ICDAR.2001.953897