Title :
Entropy based Script Identification of a multilingual Document Image
Author :
Bashir, Rumaan ; Quadri, S.M.K.
Author_Institution :
Dept. of Comput. Sci., Islamic Univ. of Sci. & Technol., Awantipora, India
Abstract :
Automatic document image analysis has been a major field of research over the past few decades, and script identification is an essential part of it. A script is the writing system in which the text of a document is written; languages are written using scripts. A large number of techniques have been proposed, and many scripts, foreign and domestic, have been identified. So far, however, little work has been reported on the identification of Kashmiri script. In this paper we propose and experimentally test the identification of Kashmiri script together with three other related scripts, viz. Roman, Devanagari and Urdu, using entropy. First, a set of training images is used to prepare the knowledge base; the actual samples are then evaluated. The proposed system achieves an accuracy of 98.50%.
Keywords :
document image processing; entropy; Devanagari script; Kashmiri script identification; Roman script; Urdu script; automatic document image analysis; domestic scripts; entropy based script identification; foreign scripts; languages; multilingual document image; written document; Accuracy; Classification algorithms; Entropy; Feature extraction; Image analysis; Text analysis; Automatic Document Image Analysis; Column Entropy; Devanagari; Energy; Entropy; Kashmiri; Multilingual; Quadra-lingual; Roman; Script; Script Identification; Urdu
Conference_Title :
Computing for Sustainable Global Development (INDIACom), 2014 International Conference on
Conference_Location :
New Delhi
Print_ISBN :
978-93-80544-10-6
DOI :
10.1109/IndiaCom.2014.6828005