DocumentCode
117630
Title
Structural feature based approach for script identification from printed Indian document
Author
Obaidullah, Sk Md ; Mondal, Aniruddha ; Roy, Kaushik
Author_Institution
Dept. of Comput. Sc. & Eng., Aliah Univ., Kolkata, India
fYear
2014
fDate
20-21 Feb. 2014
Firstpage
120
Lastpage
124
Abstract
Script identification is a complex real-life problem in automating the processing of printed or handwritten documents. The task becomes more challenging in a multi-script, multilingual country like India. To develop OCR for a particular language, the script needs to be identified first, which is why a script identification system is a pressing need. To date, no work is available that considers all 13 official Indian scripts. In this paper we present a scheme for script identification from printed documents for 10 official Indian scripts, namely Bangla, Devnagari, Roman, Oriya, Urdu, Gujarati, Telegu, Kannada, Malayalam and Kashmiri. A total of 459 document pages are considered, and a 62-dimensional feature set is computed. Finally, using a simple logistic classifier with 5-fold cross-validation, an average identification rate of 98.9% is obtained.
Keywords
document image processing; handwriting recognition; natural language processing; optical character recognition; Bangla; Devnagari; Gujarati; India; Indian scripts; Kannada; Kashmiri; Malayalam; Oriya; Roman; Telegu; Urdu; document pages; handwritten document processing; multiscript-lingual country; printed Indian document; printed document processing; script identification; script identification system; structural feature based approach; Computers; Databases; Educational institutions; Feature extraction; Logistics; Optical character recognition software; Signal processing; Feature Set; OCR; Printed Script Identification; Simple Logistic Classifier
fLanguage
English
Publisher
IEEE
Conference_Titel
Signal Processing and Integrated Networks (SPIN), 2014 International Conference on
Conference_Location
Noida
Print_ISBN
978-1-4799-2865-1
Type
conf
DOI
10.1109/SPIN.2014.6776933
Filename
6776933