DocumentCode
2057218
Title
Binarization of historical documents using self-learning classifier based on K-Means and SVM
Author
Djema, Amina ; Chibani, Youcef
Author_Institution
Speech Commun. & Signal Process. Lab., Univ. of Sci. & Technol. Houari Boumediene, Algiers, Algeria
fYear
2013
fDate
9-13 Sept. 2013
Firstpage
1
Lastpage
5
Abstract
This article presents a new binarization method for degraded historical document images. The algorithm combines K-Means classification with a classical binarization method to generate a pure learning set and a conflict class. An SVM classifier then resolves the conflict class to produce the final binarization, which labels each pixel of the document image as foreground or background. Experiments are conducted on the standard DIBCO 2009 and DIBCO 2011 datasets. The results are promising and leave considerable room for further investigation.
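The pipeline outlined in the abstract can be illustrated with a minimal sketch. The record does not give the paper's details, so everything specific here is an assumption: Otsu's threshold stands in for the unnamed "classical binarization method", the only feature is normalized pixel intensity, the SVM uses an RBF kernel and is trained on the pixels where both methods agree, and the names `otsu_threshold`, `binarize`, and `max_train` are hypothetical. It relies on NumPy and scikit-learn.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC


def otsu_threshold(gray):
    """Otsu's global threshold (assumed stand-in for the 'classical' method)."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                  # probability of levels <= t
    mu = np.cumsum(prob * np.arange(256))    # cumulative mean up to t
    denom = omega * (1.0 - omega)
    denom[denom == 0] = np.nan               # skip levels with an empty class
    sigma_b = (mu[-1] * omega - mu) ** 2 / denom  # between-class variance
    return int(np.nanargmax(sigma_b))


def binarize(gray, max_train=5000, seed=0):
    """Return a boolean mask, True where a pixel is labeled foreground (ink)."""
    gray = np.asarray(gray, dtype=np.uint8)
    values = gray.reshape(-1, 1) / 255.0     # assumed feature: intensity only

    # Vote 1: two-cluster K-Means; the darker cluster is taken as ink.
    km = KMeans(n_clusters=2, n_init=10, random_state=seed).fit(values)
    ink_cluster = int(np.argmin(km.cluster_centers_.ravel()))
    km_fg = km.labels_ == ink_cluster

    # Vote 2: the classical threshold; dark pixels are ink.
    otsu_fg = gray.ravel() <= otsu_threshold(gray)

    # Pixels where both votes agree form the pure learning set;
    # the remaining pixels form the conflict class.
    agree = km_fg == otsu_fg
    conflict = ~agree
    labels = km_fg.astype(int)               # 1 = foreground, 0 = background

    if conflict.any():
        # Subsample the learning set so SVM training stays tractable.
        rng = np.random.default_rng(seed)
        pure_idx = np.flatnonzero(agree)
        train_idx = rng.choice(pure_idx, size=min(max_train, pure_idx.size),
                               replace=False)
        if np.unique(labels[train_idx]).size == 2:
            svm = SVC(kernel="rbf", gamma="scale")
            svm.fit(values[train_idx], labels[train_idx])
            labels[conflict] = svm.predict(values[conflict])

    return labels.reshape(gray.shape).astype(bool)
```

Calling `binarize(img)` on an 8-bit grayscale array returns a mask of the same shape. With intensity as the only feature, the conflict class is simply the band of gray levels lying between the two thresholds; a realistic implementation would likely use richer local features (for example neighborhood statistics), which this sketch does not attempt.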
Keywords
history; image classification; learning (artificial intelligence); support vector machines; SVM classifier; historical document image binarization method; k-means classification; self-learning classifier; standard dataset Dibco 2009; standard dataset Dibco 2011; Degradation; Ink; Noise; Support vector machines; Text analysis; Training; Historical documents; K-means; SVM; binarization; classification;
fLanguage
English
Publisher
IEEE
Conference_Titel
Signal Processing Conference (EUSIPCO), 2013 Proceedings of the 21st European
Conference_Location
Marrakech
Type
conf
Filename
6811583