• DocumentCode
    3134579
  • Title
    Learning Text-Line Segmentation Using Codebooks and Graph Partitioning
  • Author
    Kang, Le; Kumar, Jayant; Ye, Peng; Doermann, David
  • Author_Institution
    Inst. for Adv. Comput. Studies, Univ. of Maryland, College Park, MD, USA
  • fYear
    2012
  • fDate
    18-20 Sept. 2012
  • Firstpage
    63
  • Lastpage
    68
  • Abstract
    In this paper, we present a codebook-based method for handwritten text-line segmentation which uses image patches in the training data to learn a graph-based similarity for clustering. We first construct a codebook of image patches using K-medoids, and obtain exemplars which encode local evidence. We then obtain the corresponding codewords for all patches extracted from a given image and construct a similarity graph using the learned evidence; this graph is then partitioned to obtain text-lines. Our learning-based approach performs well on a field dataset containing degraded and unconstrained handwritten Arabic document images. Results on the ICDAR 2009 segmentation contest dataset show that the method is competitive with previous approaches.
  • Keywords
    graph theory; handwriting recognition; image segmentation; learning (artificial intelligence); natural language processing; pattern clustering; ICDAR 2009 segmentation contest dataset; K-medoids; codebooks; graph partitioning; graph-based similarity; handwritten Arabic document images; handwritten text-line segmentation; image-patches; learning; pattern clustering; Accuracy; Context; Fellows; Image segmentation; Support vector machines; Training; Training data; codebook; learning; segmentation; text-line
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Frontiers in Handwriting Recognition (ICFHR), 2012 International Conference on
  • Conference_Location
    Bari
  • Print_ISBN
    978-1-4673-2262-1
  • Type
    conf
  • DOI
    10.1109/ICFHR.2012.228
  • Filename
    6424371
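
The codebook step summarized in the abstract (cluster training patches with K-medoids, keep the medoids as exemplars, then map each new patch to its nearest codeword) can be illustrated roughly as follows. This is only a minimal sketch and not the authors' implementation: the flattened-list patch representation, the squared Euclidean distance, the first-k initialisation, and the function names `k_medoids` and `nearest` are all assumptions made for illustration. The learned similarity graph and its partitioning are not shown, since the record does not specify them.

```python
from typing import List, Sequence

Patch = Sequence[float]  # assumed representation: a flattened image patch


def sq_dist(a: Patch, b: Patch) -> float:
    """Squared Euclidean distance between two flattened patches."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(patch: Patch, codebook: List[Patch]) -> int:
    """Index of the codeword (exemplar) closest to the given patch."""
    return min(range(len(codebook)), key=lambda j: sq_dist(patch, codebook[j]))


def k_medoids(patches: List[Patch], k: int, max_iter: int = 100) -> List[int]:
    """Return the indices of k medoids found by alternating assign/update steps."""
    medoids = list(range(k))  # naive initialisation: the first k patches (assumption)
    for _ in range(max_iter):
        # Assignment step: attach every patch to its closest current medoid.
        clusters: List[List[int]] = [[] for _ in range(k)]
        current = [patches[m] for m in medoids]
        for i, p in enumerate(patches):
            clusters[nearest(p, current)].append(i)

        # Update step: the new medoid of a cluster is the member with the
        # smallest total distance to all other members of that cluster.
        new_medoids = []
        for c, members in enumerate(clusters):
            if not members:  # can happen with duplicate patches; keep old medoid
                new_medoids.append(medoids[c])
                continue
            best = min(
                members,
                key=lambda m: sum(sq_dist(patches[m], patches[o]) for o in members),
            )
            new_medoids.append(best)

        if new_medoids == medoids:  # converged: medoids no longer change
            break
        medoids = new_medoids
    return medoids


if __name__ == "__main__":
    # Toy "patches": two well-separated groups of 2-D points.
    toy = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
    exemplar_ids = k_medoids(toy, k=2)
    codebook = [toy[i] for i in exemplar_ids]
    print("exemplars:", codebook)
    print("codeword for [9, 9]:", nearest([9, 9], codebook))
```

Unlike K-means centroids, medoids are always actual training patches, which is what allows them to serve as exemplars.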