DocumentCode :
643374
Title :
Comparison of different classifiers for script identification from handwritten document
Author :
Obaidullah, Sk Md ; Roy, Kaushik ; Das, Niladri
fYear :
2013
fDate :
26-28 Sept. 2013
Firstpage :
1
Lastpage :
6
Abstract :
For a multi-script, multilingual country like India, script identification is a complex real-life problem in the automation of document processing. Handwritten script identification is considerably more complex than printed script identification. Here, scripts are identified from multi-script handwritten documents, and performance is then compared across several well-known classifiers. We follow a two-stage approach. First, we identified six scripts, used for writing six official languages of India, whose handwritten samples were readily available to us. A 41-dimensional feature set is prepared at the document level using abstract/mathematical features, structure-based features, and script-dependent features. Then a series of classifiers, namely Logistic Model Tree, Random Forest, Multi Layer Perceptron, Sequential Minimal Optimization, LibLINEAR, RBFNetwork, and Fuzzy Unordered Rule Induction Algorithm, is applied to the feature set to classify the six handwritten scripts, and the results are compared. Among these classifiers, Logistic Model Tree shows the highest accuracy, 91.2% with 5-fold cross-validation, whereas the SMO model has the lowest convergence time, 0.05 s.
Keywords :
document image processing; fuzzy set theory; image classification; multilayer perceptrons; radial basis function networks; text analysis; India; LibLINEAR; RBFNetwork; SMO model; abstract-mathematical features; classifier comparison; dimensional feature set; document processing automation; fuzzy unordered rule induction algorithm; handwritten script identification; logistic model tree; multilayer perceptron; multilingual country; multiscript country; multiscript handwritten documents; random forest; script dependent features; script identification; sequential minimal optimization; structure based features; Accuracy; Computers; Feature extraction; Fractals; Logistics; Neurons; Optical character recognition software; Classifier; Handwritten Script Identification; Optical Character Recognizer; Weka;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Signal Processing, Computing and Control (ISPCC), 2013 IEEE International Conference on
Conference_Location :
Solan
Print_ISBN :
978-1-4673-6188-0
Type :
conf
DOI :
10.1109/ISPCC.2013.6663388
Filename :
6663388