Title :
An efficient character recognition scheme based on k-means clustering
Author :
Pourmohammad, Sajjad ; Soosahabi, Reza ; Maida, Anthony S.
Author_Institution :
Center for Adv. Comput. Studies, Univ. of Louisiana at Lafayette, Lafayette, LA, USA
Abstract :
Handwritten character recognition has long been an active area of research. With the recent spread of mobile devices that have limited memory and computational power, simple and efficient algorithms for both online and offline character recognition have become more appealing. In this work, an efficient character recognition system is proposed that uses linear discriminant analysis (LDA) followed by a Bayesian discriminator function based on the Mahalanobis distance. Because LDA is tailored to Gaussian-distributed data and the dimensionality of the samples is high, several preprocessing steps are applied to reduce dimensionality and cluster the data into semi-Gaussian subclasses. First, affine transformations are applied to the training samples to make the scheme robust against distortion; scaling and rotation are the common distortions considered in this work. Next, inactive pixels are removed using a simple algorithm. Then principal component analysis (PCA) and k-means clustering are applied. The preprocessing results show great potential for dimensionality reduction using transformations that preserve useful information. Numerical results on the MNIST dataset reach a 3% error rate, which is lower than that of other linear approaches. The proposed linear techniques are presented in a way that makes it easier to understand the method and why it works, compared with other classification methods.
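The pipeline described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it assumes small synthetic "images" instead of MNIST, omits the affine augmentation and the LDA projection, and scores each k-means subclass with a Gaussian discriminant built on the squared Mahalanobis distance. All function names, the regularization constant `reg`, and the parameter values are hypothetical choices made for illustration.

```python
import numpy as np

def drop_inactive_pixels(X, min_active=0.0):
    """Boolean mask of pixels (columns) that ever exceed min_active."""
    return X.max(axis=0) > min_active

def pca_fit(X, n_components):
    """Return (mean, components) from an SVD of the centered data."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def kmeans(X, k, n_iter=20, seed=0):
    """Plain Lloyd's k-means; returns one cluster label per row."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def fit_subclasses(Z, y, k, reg=1e-3):
    """Fit one Gaussian per k-means subclass of every class."""
    models = []
    for c in np.unique(y):
        Zc = Z[y == c]
        sub = kmeans(Zc, k)
        for j in range(k):
            Zs = Zc[sub == j]
            if len(Zs) < 2:  # too few points to estimate a covariance
                continue
            # reg is an assumed ridge term that keeps the covariance invertible
            cov = np.cov(Zs, rowvar=False) + reg * np.eye(Z.shape[1])
            models.append((c, Zs.mean(axis=0), np.linalg.inv(cov),
                           np.linalg.slogdet(cov)[1],
                           np.log(len(Zs) / len(Z))))
    return models

def predict(Z, models):
    """Assign each row the class of its best-scoring subclass."""
    scores = []
    for _, mu, prec, logdet, logprior in models:
        diff = Z - mu
        maha2 = np.einsum("ij,jk,ik->i", diff, prec, diff)  # squared Mahalanobis
        scores.append(-0.5 * maha2 - 0.5 * logdet + logprior)
    best = np.argmax(np.stack(scores, axis=1), axis=1)
    return np.array([models[b][0] for b in best])

# Synthetic stand-in for image data: 16 "pixels", the last 4 always inactive.
rng = np.random.default_rng(0)
n = 200
X = np.vstack([rng.normal(0.3, 0.05, (n, 16)),
               rng.normal(0.7, 0.05, (n, 16))])
X[:, 12:] = 0.0
y = np.repeat([0, 1], n)

mask = drop_inactive_pixels(X)            # step: cut off inactive pixels
mean, comps = pca_fit(X[:, mask], 3)      # step: PCA
Z = (X[:, mask] - mean) @ comps.T
models = fit_subclasses(Z, y, k=2)        # step: k-means subclasses
accuracy = (predict(Z, models) == y).mean()
print(f"active pixels: {mask.sum()}, training accuracy: {accuracy:.3f}")
```

Splitting each class into subclasses before fitting Gaussians reflects the abstract's motivation: a single class may not be Gaussian, but its k-means clusters can be closer to Gaussian, which suits a Mahalanobis-based discriminant.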
Keywords :
Bayes methods; Gaussian distribution; affine transforms; error statistics; handwritten character recognition; pattern clustering; principal component analysis; Bayesian discriminator function; Gaussian distributed data; LDA analysis; MNIST dataset; Mahalanobis distance; PCA; affine transformation; computational power; data cluster; dimensionality reduction; error rate; inactive pixel; k-means clustering; linear technique; memory; mobile device; offline character recognition; online character recognition; rotation distortion; samples dimensionality; scaling distortion; semi-Gaussian subclass; Character recognition; Computational complexity; Error analysis; Interpolation; Neural networks; Training
Conference_Title :
Modeling, Simulation and Applied Optimization (ICMSAO), 2013 5th International Conference on
Conference_Location :
Hammamet
Print_ISBN :
978-1-4673-5812-5
DOI :
10.1109/ICMSAO.2013.6552640