Title :
Mixed Thai-English Character Classification Based on Histogram of Oriented Gradient Feature
Author :
Siriteerakul, Teera
Author_Institution :
Fac. of Sci., King Mongkut's Inst. of Technol. Ladkrabang, Bangkok, Thailand
Abstract :
Classifying mixed Thai-English characters is challenging because of the number and complexity of the characters. This paper proposes and empirically investigates a classification system that uses Histogram of Oriented Gradient (HOG) as the image feature and a Support Vector Machine (SVM) as the classifier. The experiments were conducted on datasets provided by NECTEC, which consist of over 600,000 printed images of individual characters from 142 distinct classes. The proposed method achieves an accuracy of 97% without a lookup dictionary or any post-processing system.
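The abstract does not give the descriptor parameters, the SVM kernel, or the multi-class scheme, so the following is only a minimal sketch of the general idea (gradient-orientation histograms fed to a margin-based classifier), not the paper's implementation. It assumes a simplified Histogram of Oriented Gradient descriptor (per-cell histograms with one global L2 normalization instead of block normalization) and a binary linear Support Vector Machine trained by hinge-loss subgradient descent. All function names, the 8x8 toy bar images, and the hyperparameters are hypothetical.

```python
import math

def hog_cell_histogram(img, r0, c0, size, bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations for one cell."""
    hist = [0.0] * bins
    h, w = len(img), len(img[0])
    for r in range(r0, r0 + size):
        for c in range(c0, c0 + size):
            # Central differences, clamped at the image border.
            gx = img[r][min(c + 1, w - 1)] - img[r][max(c - 1, 0)]
            gy = img[min(r + 1, h - 1)][c] - img[max(r - 1, 0)][c]
            mag = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0  # unsigned orientation
            hist[int(angle / (180.0 / bins)) % bins] += mag
    return hist

def hog(img, cell=4, bins=9):
    """Concatenate cell histograms and L2-normalize (simplified: no block normalization)."""
    feats = []
    for r0 in range(0, len(img) - cell + 1, cell):
        for c0 in range(0, len(img[0]) - cell + 1, cell):
            feats.extend(hog_cell_histogram(img, r0, c0, cell, bins))
    norm = math.sqrt(sum(v * v for v in feats)) + 1e-9
    return [v / norm for v in feats]

def score(w, b, x):
    return sum(wj * xj for wj, xj in zip(w, x)) + b

def train_linear_svm(X, y, epochs=50, lr=0.1, lam=0.01):
    """Binary linear SVM (labels +1/-1) via subgradient descent on the hinge loss."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * score(w, b, xi)
            w = [wj - lr * lam * wj for wj in w]  # L2 regularization step
            if margin < 1:                         # hinge loss is active
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b

def predict(w, b, x):
    return 1 if score(w, b, x) >= 0 else -1

def bar_image(vertical, pos, size=8):
    """Toy stand-in for a character image: one bright vertical or horizontal bar."""
    img = [[0.0] * size for _ in range(size)]
    for i in range(size):
        if vertical:
            img[i][pos] = 1.0
        else:
            img[pos][i] = 1.0
    return img

# Toy training set: vertical bars are class +1, horizontal bars are class -1.
X, y = [], []
for pos in (1, 2, 5, 6):
    X.append(hog(bar_image(True, pos)))
    y.append(1)
    X.append(hog(bar_image(False, pos)))
    y.append(-1)

w, b = train_linear_svm(X, y)

# Classify bars at positions that were held out of training.
for pos in (3, 4):
    print(pos, predict(w, b, hog(bar_image(True, pos))),
          predict(w, b, hog(bar_image(False, pos))))
```

A 142-class problem such as the one described would need a multi-class extension (for example one-vs-rest over classifiers like this one) and, in practice, an established HOG and SVM library rather than this pure-Python toy.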
Keywords :
character recognition; image classification; natural language processing; support vector machines; NECTEC; classification tool; histogram of oriented gradient feature; mixed Thai-English character classification; support vector machine; Accuracy; Feature extraction; Histograms; Training; Vectors; Character classification; Histogram of Oriented Gradient; Thai OCR
Conference_Title :
Document Analysis and Recognition (ICDAR), 2013 12th International Conference on
Conference_Location :
Washington, DC
DOI :
10.1109/ICDAR.2013.173