DocumentCode
3340569
Title
Kanji Character Detection from Complex Real Scene Images based on Character Properties
Author
Xu, Lianli ; Nagayoshi, Hiroto ; Sako, Hiroshi
Author_Institution
Economic Dept. of the French Embassy in China, High Technol. Sect., Beijing
fYear
2008
fDate
16-19 Sept. 2008
Firstpage
278
Lastpage
285
Abstract
Character recognition in complex real scene images is a very challenging undertaking. The most popular approach is to segment the text area using extra prior knowledge, such as "characters are in a signboard". This approach makes it possible to construct a very time-consuming method, but generality is still a problem. In this paper, we propose a more general method that utilizes only character features. Our algorithm consists of five steps: pre-processing to extract connected components, initial classification using primitive rules, strong classification using AdaBoost, Markov random field (MRF) clustering to combine connected components with similar properties, and post-processing using optical character recognition (OCR) results. Experiments using 11 images containing 1691 characters (including characters in bad condition) indicated the effectiveness of the proposed system: 52.9% of characters were extracted correctly, with 625 noise components extracted as characters.
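The reported figures can be turned into rough counts with a few lines of arithmetic. The sketch below is only an illustration, not part of the paper: it assumes the 52.9% rate applies to all 1691 characters, and it treats each of the 625 noise components as one false detection, which the abstract does not state. The variable names are hypothetical.

```python
# Figures reported in the abstract.
total_characters = 1691
extraction_rate = 0.529
noise_components = 625

# Assumption: the rate is a fraction of all 1691 characters.
extracted = round(total_characters * extraction_rate)

# Assumption: each noise component counts as one false detection,
# so this is only a rough precision-style estimate.
precision_estimate = extracted / (extracted + noise_components)

print(extracted)
print(round(precision_estimate, 3))
```

Under these assumptions, roughly 895 characters were extracted correctly, and a little under 60% of everything the system labelled as a character was a true character.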
Keywords
Markov processes; image segmentation; optical character recognition; text analysis; AdaBoost; Kanji character detection; Markov random field clustering; character properties; complex real scene images; text area segmentation; time-consuming method; Background noise; Character recognition; Clustering algorithms; Image analysis; Layout; Markov random fields; Optical character recognition software; Optical noise; Poles and towers; Character Detection; MRF; OCR
fLanguage
English
Publisher
IEEE
Conference_Titel
Document Analysis Systems, 2008. DAS '08. The Eighth IAPR International Workshop on
Conference_Location
Nara
Print_ISBN
978-0-7695-3337-7
Type
conf
DOI
10.1109/DAS.2008.34
Filename
4669971