Title :
A study for high performance character extraction from color scene images
Author :
Shirai, Keiichiro ; Wakabayashi, Masanori ; Okamoto, Masayuki ; Yamamoto, Hiroaki
Abstract :
This paper describes a method for extracting character strings from scene images. Most characters in scene images share the same color and font size within each word or text line. In our algorithm, a scene image is first divided into several blocks based on edges in the color space. Blobs consisting of similarly colored pixels are then extracted for each block by clustering in the color space. These blobs correspond to either characters or background patterns. After they are connected using their aspect ratios and pitches, an SVM (Support Vector Machine) applied to several textural features classifies each connected blob as a character or a background pattern. Tests on 251 images from the ICDAR 2003 Text Locating Competition show the effectiveness of our algorithm.
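The blob-connection step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual rules or thresholds, so everything here is an assumption for illustration only: the `Blob` type, the `link_blobs` function, and the parameters `max_height_ratio` and `max_gap_factor` are hypothetical names, and the criteria (similar height, small horizontal gap, vertical overlap) merely stand in for the paper's aspect-ratio and pitch tests.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Blob:
    """Bounding box of one color blob (hypothetical type, not from the paper)."""
    x: int  # left edge
    y: int  # top edge
    w: int  # width, must be > 0
    h: int  # height, must be > 0


def link_blobs(blobs: List[Blob],
               max_height_ratio: float = 1.5,
               max_gap_factor: float = 1.0) -> List[List[Blob]]:
    """Greedily chain blobs from left to right into candidate text lines.

    A blob joins an existing line when, compared with that line's last blob,
    it has a similar height, a small horizontal gap (relative to the taller
    of the two), and some vertical overlap. Thresholds are illustrative
    assumptions, not values from the paper.
    """
    lines: List[List[Blob]] = []
    for blob in sorted(blobs, key=lambda b: b.x):
        for line in lines:
            last = line[-1]
            height_ratio = max(blob.h, last.h) / min(blob.h, last.h)
            gap = blob.x - (last.x + last.w)
            overlap = (min(blob.y + blob.h, last.y + last.h)
                       - max(blob.y, last.y))
            if (height_ratio <= max_height_ratio
                    and gap <= max_gap_factor * max(blob.h, last.h)
                    and overlap > 0):
                line.append(blob)
                break
        else:
            # No compatible line was found, so this blob starts a new one.
            lines.append([blob])
    return lines


if __name__ == "__main__":
    sample = [
        Blob(0, 0, 10, 20),
        Blob(14, 1, 10, 20),
        Blob(28, 0, 10, 21),
        Blob(200, 100, 50, 50),  # far away and much larger
    ]
    for line in link_blobs(sample):
        print(line)
```

In this sketch the three small, evenly spaced blobs end up in one candidate line and the large distant blob forms its own. In the method described by the abstract, such connected groups would then be passed to the SVM for the character versus background decision.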
Keywords :
Computational intelligence; Data mining; Discrete cosine transforms; Filters; Image edge detection; Image segmentation; Layout; Machine learning; Pulse modulation; Text analysis; Character extraction; Clustering; SVM;
Conference_Title :
Document Analysis Systems, 2008. DAS '08. The Eighth IAPR International Workshop on
Conference_Location :
Nara
Print_ISBN :
978-0-7695-3337-7
DOI :
10.1109/DAS.2008.57