DocumentCode :
2021568
Title :
Recognition of Fragmented Characters Using Multiple Feature-Subset Classifiers
Author :
Chou, Chien-Hsing ; Guo, Chien-Yang ; Chang, Fu
Author_Institution :
Acad. Sinica, Taipei
Volume :
1
fYear :
2007
fDate :
23-26 Sept. 2007
Firstpage :
198
Lastpage :
202
Abstract :
This paper addresses the problem of recognizing fragmented characters in printed documents whose poor printing quality often causes characters to break up. To enhance the recognition accuracy of such characters, most existing approaches attempt to improve the quality of character images by means of mending techniques. We propose an alternative approach that adopts a bagging-predictor method to build classifiers, using only intact characters as training samples. The resultant classifiers can classify both intact and fragmented characters with a high degree of accuracy. Applying this approach to characters in archived Chinese newspapers, we extract two types of features from character images and form bagging predictors, each of which takes a subset of features as input. As a result, we achieve drastic improvements in the recognition of fragmented characters.
Keywords :
document image processing; image classification; information retrieval systems; optical character recognition; archived Chinese newspapers; bagging-predictor method; character image quality; fragmented characters recognition; multiple feature-subset classifiers; printed documents; Bagging; Character recognition; Feature extraction; Information science; Ink; Nearest neighbor searches; Neodymium; Printing; Shape; Variable speed drives
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Document Analysis and Recognition, 2007. ICDAR 2007. Ninth International Conference on
Conference_Location :
Parana
ISSN :
1520-5363
Print_ISBN :
978-0-7695-2822-9
Type :
conf
DOI :
10.1109/ICDAR.2007.4378703
Filename :
4378703