DocumentCode :
2057218
Title :
Binarization of historical documents using self-learning classifier based on K-Means and SVM
Author :
Djema, Amina ; Chibani, Youcef
Author_Institution :
Speech Commun. & Signal Process. Lab., Univ. of Sci. & Technol. Houari Boumediene, Algiers, Algeria
fYear :
2013
fDate :
9-13 Sept. 2013
Firstpage :
1
Lastpage :
5
Abstract :
This article presents a new binarization method for degraded historical document images. The algorithm combines K-Means classification with a classical binarization method to generate a pure learning set and a conflict class. An SVM classifier, trained on the pure learning set, then resolves the conflict class to produce the final binarization, which classifies each pixel of the document image as foreground or background. Experiments are conducted on the standard datasets DIBCO 2009 and DIBCO 2011. The results are very promising and leave a wide margin for further investigation.
Keywords :
history; image classification; learning (artificial intelligence); support vector machines; SVM classifier; historical document image binarization method; k-means classification; self-learning classifier; standard dataset Dibco 2009; standard dataset Dibco 2011; Degradation; Ink; Noise; Support vector machines; Text analysis; Training; Historical documents; K-means; SVM; binarization; classification;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Signal Processing Conference (EUSIPCO), 2013 Proceedings of the 21st European
Conference_Location :
Marrakech
Type :
conf
Filename :
6811583