DocumentCode :
3776384
Title :
Determination of optimal features database for OCR of printed Telugu text
Author :
C. Vasantha Lakshmi;Sarika Singh;C. Patvardhan
Author_Institution :
Dept. of Physics & Computer Science, Dayalbagh Educational Institute, Agra, U.P.
fYear :
2015
Firstpage :
1
Lastpage :
6
Abstract :
Optical Character Recognition (OCR) systems are being developed for their numerous applications, even for Indian scripts such as Telugu, which are complicated by their large number of symbols. OCR systems typically store pre-computed features of the symbols to be recognized in a database. An unknown symbol is recognized by finding the symbol in the database that is nearest to it in feature space. Design of an appropriate database is, therefore, a critical step. This is especially so when the OCR system targets recognition of numerous symbols in multiple fonts and sizes. The goal is an OCR system with short recognition times and high recognition accuracy. The naive approach of putting the features of all symbols in all fonts and sizes in the database might be counterproductive on both counts. Experimental results on text document images with multiple fonts and sizes show that the strategy for database design for OCR of printed Telugu text proposed in this paper achieves both objectives. This is the first reported approach to such a database design for Telugu OCR.
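The nearest-in-feature-space lookup described in the abstract can be illustrated with a minimal sketch. The abstract does not name the features or the distance measure, so the Euclidean metric, the three-element feature vectors, the symbol labels, and the function name `nearest_symbol` are all assumptions made for illustration only, not the authors' method.

```python
import math

def nearest_symbol(features, database):
    """Return (label, distance) of the stored entry closest to `features`.

    `database` is a list of (label, feature_vector) pairs. Euclidean
    distance is an assumed metric; the abstract does not specify one.
    """
    best_label, best_dist = None, math.inf
    for label, prototype in database:
        dist = math.dist(features, prototype)  # Euclidean distance (Python 3.8+)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label, best_dist

# Hypothetical toy database: labels and feature values are invented.
database = [
    ("ka", (0.0, 0.0, 1.0)),
    ("ga", (1.0, 0.0, 0.0)),
    ("ma", (0.0, 1.0, 0.0)),
]

label, dist = nearest_symbol((0.9, 0.1, 0.0), database)
print(label, round(dist, 4))
```

Because the loop visits every stored entry, lookup time grows linearly with the number of entries. That is one way to see why storing every symbol in every font and size could lengthen recognition time.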
Keywords :
"Databases","Optical character recognition software","Feature extraction","Classification algorithms","Target recognition","Histograms","Prototypes"
Publisher :
IEEE
Conference_Title :
2015 39th National Systems Conference (NSC)
Type :
conf
DOI :
10.1109/NATSYS.2015.7489112
Filename :
7489112