Title :
Binarization of Color Character Strings in Scene Images Using K-Means Clustering and Support Vector Machines
Author :
Wakahara, Toru ; Kita, Kohei
Author_Institution :
Fac. of Comput. & Inf. Sci., Hosei Univ., Koganei, Japan
Abstract :
This paper addresses the problem of binarizing multicolored character strings in scene images subject to heavy image degradations and complex backgrounds. The proposed method consists of four steps. The first step generates tentatively binarized images via every dichotomization of K clusters obtained by K-means clustering of constituent pixels of a given image in the HSI color space. The total number of tentatively binarized images equals 2^K - 2. The second step divides each binarized image into a sequence of "single-character-like" images using an average aspect ratio of a character. The third step uses support vector machines (SVM) to determine whether each "single-character-like" image represents a character or a non-character. We feed the SVM with the mesh feature to output the degree of "character-likeness." The fourth step selects the single binarized image with the maximum average "character-likeness" as the optimal binarization result. Experiments using a total of 1000 character strings extracted from the ICDAR 2003 robust word recognition dataset show that the proposed method achieves a correct binarization rate of 80.8%.
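The count of tentatively binarized images in the first step can be illustrated with a small sketch. It is not the authors' implementation: the names `dichotomizations` and `binarize` are hypothetical, the label map is made-up toy data standing in for K-means output, and the segmentation and SVM scoring steps are omitted. Choosing any non-empty proper subset of the K clusters as foreground gives 2^K - 2 candidates; a subset and its complement are counted separately because they yield images of opposite polarity.

```python
from itertools import combinations

def dichotomizations(k):
    """Yield every non-empty proper subset of k cluster ids as a foreground set."""
    for size in range(1, k):  # skip the empty set (size 0) and the full set (size k)
        for fg in combinations(range(k), size):
            yield frozenset(fg)

def binarize(labels, foreground):
    """Map a 2-D grid of cluster labels to 1 (foreground) or 0 (background)."""
    return [[1 if lab in foreground else 0 for lab in row] for row in labels]

# Toy 2x3 label map standing in for K-means output with K = 3 (assumed data).
labels = [[0, 1, 2],
          [2, 1, 0]]
candidates = [binarize(labels, fg) for fg in dichotomizations(3)]
print(len(candidates))  # 2**3 - 2 = 6
```

In a full pipeline along the lines the abstract describes, each candidate would then be split into single-character-like pieces and scored, and the candidate with the highest average score would be kept.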
Keywords :
character recognition; pattern clustering; support vector machines; SVM; binarized image; character-likeness; color character strings; correct binarization rate; dichotomization; image degradation; k-means clustering; multicolored character strings; optimal binarization; robust word recognition dataset; scene images; single-character-like image; support vector machine; Character recognition; Feature extraction; Image color analysis; Image recognition; Image segmentation; Robustness; Support vector machines; K-means clustering; binarization of multicolored character strings; support vector machines;
Conference_Title :
Document Analysis and Recognition (ICDAR), 2011 International Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-4577-1350-7
Electronic_ISBN :
1520-5363
DOI :
10.1109/ICDAR.2011.63