• DocumentCode
    3695244
  • Title
    Page segmentation of historical document images with convolutional autoencoders
  • Author
    Kai Chen;Mathias Seuret;Marcus Liwicki;Jean Hennebert;Rolf Ingold
  • Author_Institution
    DIVA (Document, Image and Voice Analysis) research group, Department of Informatics, University of Fribourg, Switzerland
  • fYear
    2015
  • Firstpage
    1011
  • Lastpage
    1015
  • Abstract
    In this paper, we present an unsupervised feature learning method for page segmentation of historical handwritten documents available as color images. We consider page segmentation as a pixel labeling problem, i.e., each pixel is classified as either periphery, background, text block, or decoration. Traditional methods in this area rely on carefully hand-crafted features or large amounts of prior knowledge. In contrast, we apply convolutional autoencoders to learn features directly from pixel intensity values. Then, using these features to train an SVM, we achieve high quality segmentation without any assumption of specific topologies and shapes. Experiments on three public datasets demonstrate the effectiveness and superiority of the proposed approach.
  • Keywords
    "Support vector machines","Robustness","Image segmentation"
  • Publisher
    IEEE
  • Conference_Titel
    Document Analysis and Recognition (ICDAR), 2015 13th International Conference on
  • Type
    conf
  • DOI
    10.1109/ICDAR.2015.7333914
  • Filename
    7333914