DocumentCode
735875
Title
Page-level handwritten script identification using modified log-Gabor filter based features
Author
Singh, Pawan Kumar ; Chatterjee, Iman ; Sarkar, Ram
Author_Institution
Dept. of Comput. Sci. & Eng., Jadavpur Univ., Kolkata, India
fYear
2015
fDate
9-11 July 2015
Firstpage
225
Lastpage
230
Abstract
Automatic script identification, an important research problem over the last few decades, poses many challenges in any multi-script environment. Because India is a multilingual country, text documents containing more than one language are common. However, before such multilingual documents can be digitized with an Optical Character Recognition (OCR) engine, the scripts in which they are written must first be recognized. This paper proposes a page-level script identification technique for eight popular handwritten scripts: Bangla, Devanagari, Gurumukhi, Oriya, Tamil, Telugu, Urdu, and Roman. First, texture features based on modified log-Gabor filters are extracted from each document page. The proposed model is then evaluated using multiple classifiers, and a comparison of their identification accuracies shows that Simple Logistic performs best. The results demonstrate the usefulness of modified log-Gabor filter based features for recognizing handwritten Indic scripts. The experiment uses a total of 240 document pages and achieves 95.57% accuracy in identifying the scripts of the documents. Although the proposed method is assessed on a limited dataset, the result can be considered reasonably acceptable given the intricacies of the scripts.
Keywords
Gabor filters; document image processing; feature extraction; handwritten character recognition; image classification; optical character recognition; text analysis; Bangla; Devanagari; Gurumukhi; OCR engine; Oriya; Roman; Tamil; Telugu; Urdu; automatic script identification; handwritten Indic script recognition; modified log-Gabor filter based texture features; multilingual documents; multiple classifiers; multiscript environment; optical character recognition; page-level handwritten script identification technique; simple logistic; text documents; Feature extraction; Gabor filters; Information filters; Logistics; Optical character recognition software; Visualization; Handwritten Indic scripts; Modified log-Gabor filter; Optical Character Recognition; Page-level script identification
fLanguage
English
Publisher
IEEE
Conference_Titel
Recent Trends in Information Systems (ReTIS), 2015 IEEE 2nd International Conference on
Conference_Location
Kolkata
Type
conf
DOI
10.1109/ReTIS.2015.7232882
Filename
7232882