• DocumentCode
    2057218
  • Title

    Binarization of historical documents using self-learning classifier based on K-Means and SVM

  • Author

    Djema, Amina ; Chibani, Youcef

  • Author_Institution
    Speech Commun. & Signal Process. Lab., Univ. of Sci. & Technol. Houari Boumediene, Algiers, Algeria
  • fYear
    2013
  • fDate
    9-13 Sept. 2013
  • Firstpage
    1
  • Lastpage
    5
  • Abstract
    This article presents a new binarization method for degraded historical document images. The algorithm combines K-Means classification with a classical binarization method to generate a pure learning set and a conflict class. An SVM classifier then resolves the conflict class to produce the final binarization, which classifies each pixel of the document image as foreground or background. Experiments are conducted on the standard DIBCO 2009 and DIBCO 2011 datasets. The results are promising and leave considerable room for further investigation.
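
The pipeline described in the abstract might be sketched roughly as follows. This is a minimal illustration under several assumptions that the record does not confirm: pixels are a flat list of grayscale values, the "classical binarization method" is replaced by a global mean threshold, the pure learning set is taken to be the pixels on which K-Means and that threshold agree, the conflict class is the pixels on which they disagree, and the SVM is a tiny linear one fit by subgradient descent on a single intensity feature. All function names are hypothetical and none of this is taken from the paper itself.

```python
def kmeans_two_clusters(values, iters=50):
    """Lloyd's k-means with k=2 on scalar intensities; returns (dark, light) centroids."""
    dark_c, light_c = float(min(values)), float(max(values))
    for _ in range(iters):
        split = (dark_c + light_c) / 2.0  # nearest-centroid boundary in 1-D
        dark = [v for v in values if v <= split]
        light = [v for v in values if v > split]
        if not dark or not light:
            break
        new_dark, new_light = sum(dark) / len(dark), sum(light) / len(light)
        if (new_dark, new_light) == (dark_c, light_c):
            break  # centroids stopped moving
        dark_c, light_c = new_dark, new_light
    return dark_c, light_c


def train_linear_svm(xs, ys, lam=0.01, lr=0.1, epochs=200):
    """Hinge-loss linear SVM on one scalar feature, fit by subgradient descent.

    Assumption: a stand-in for whatever SVM configuration the authors used.
    """
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            if y * (w * x + b) < 1:  # margin violated: hinge term is active
                w += lr * (y * x - lam * w)
                b += lr * y
            else:  # only the L2 regularizer contributes
                w -= lr * lam * w
    return w, b


def binarize(pixels):
    """Return (labels, conflict_indices); label 1 = foreground, 0 = background."""
    dark_c, light_c = kmeans_two_clusters(pixels)
    split = (dark_c + light_c) / 2.0
    mean = sum(pixels) / len(pixels)  # assumed stand-in for the classical method
    km = [1 if p <= split else 0 for p in pixels]
    classical = [1 if p < mean else 0 for p in pixels]
    pure = [i for i in range(len(pixels)) if km[i] == classical[i]]
    conflict = [i for i in range(len(pixels)) if km[i] != classical[i]]
    labels = list(km)  # pure pixels keep the label both methods agree on
    # Train only when the pure set contains both classes; otherwise the
    # conflict pixels simply keep their K-Means label.
    if conflict and len({km[i] for i in pure}) == 2:
        xs = [pixels[i] / 255.0 for i in pure]
        ys = [1 if km[i] else -1 for i in pure]
        w, b = train_linear_svm(xs, ys)
        for i in conflict:
            labels[i] = 1 if w * pixels[i] / 255.0 + b > 0 else 0
    return labels, conflict


# Toy "image": dark ink, a few faded strokes, and a bright background.
pixels = [30] * 4 + [130] * 2 + [220] * 14
labels, conflict = binarize(pixels)
print("conflict pixel indices:", conflict)
```

In this toy input the two faded pixels fall between the K-Means boundary and the mean threshold, so they form the conflict class and are the only pixels labelled by the classifier. A realistic implementation would likely use richer per-pixel features (for example local neighbourhood statistics) and an established SVM library.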
  • Keywords
    history; image classification; learning (artificial intelligence); support vector machines; SVM classifier; historical document image binarization method; k-means classification; self-learning classifier; standard dataset Dibco 2009; standard dataset Dibco 2011; Degradation; Ink; Noise; Support vector machines; Text analysis; Training; Historical documents; K-means; SVM; binarization; classification
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Signal Processing Conference (EUSIPCO), 2013 Proceedings of the 21st European
  • Conference_Location
    Marrakech
  • Type

    conf

  • Filename
    6811583