Title :
Learning Text-Line Segmentation Using Codebooks and Graph Partitioning
Author :
Le Kang ; Kumar, Jayant ; Peng Ye ; Doermann, David
Author_Institution :
Inst. for Adv. Comput. Studies, Univ. of Maryland, College Park, MD, USA
Abstract :
In this paper, we present a codebook-based method for handwritten text-line segmentation which uses image patches in the training data to learn a graph-based similarity for clustering. We first construct a codebook of image patches using K-medoids, and obtain exemplars which encode local evidence. We then obtain the corresponding codewords for all patches extracted from a given image and construct a similarity graph using the learned evidence; this graph is then partitioned to obtain text-lines. Our learning-based approach performs well on a field dataset containing degraded and unconstrained handwritten Arabic document images. Results on the ICDAR 2009 segmentation contest dataset show that the method is competitive with previous approaches.
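The codebook step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the patch size, the squared Euclidean distance, the synthetic data, and the helper names `pairwise_sq_dists`, `k_medoids`, and `assign_codewords` are all assumptions made for illustration. It shows only how K-medoids picks exemplar patches and how new patches are mapped to their nearest codeword; the learned similarity graph and its partitioning are not sketched, since the abstract does not specify them.

```python
import numpy as np


def pairwise_sq_dists(a, b):
    # Squared Euclidean distance between every row of a and every row of b.
    diff = a[:, None, :] - b[None, :, :]
    return (diff ** 2).sum(axis=2)


def k_medoids(points, k, n_iter=20, seed=0):
    """Plain alternating K-medoids; returns the row indices of the medoids."""
    rng = np.random.default_rng(seed)
    dists = pairwise_sq_dists(points, points)
    medoids = rng.choice(len(points), size=k, replace=False)
    for _ in range(n_iter):
        # Assign every point to its nearest current medoid.
        labels = dists[:, medoids].argmin(axis=1)
        new_medoids = medoids.copy()
        for c in range(k):
            members = np.flatnonzero(labels == c)
            if members.size == 0:
                continue  # keep the old medoid if a cluster is empty
            # New medoid: the member with the smallest total distance
            # to the other members of its cluster.
            within = dists[np.ix_(members, members)].sum(axis=1)
            new_medoids[c] = members[within.argmin()]
        if np.array_equal(new_medoids, medoids):
            break
        medoids = new_medoids
    return medoids


def assign_codewords(patches, codebook):
    # Index of the nearest exemplar (codeword) for each patch.
    return pairwise_sq_dists(patches, codebook).argmin(axis=1)


# Synthetic stand-in for flattened 4x4 image patches (assumed data).
rng = np.random.default_rng(1)
dark = rng.normal(0.0, 0.05, size=(30, 16))
bright = rng.normal(1.0, 0.05, size=(30, 16))
patches = np.vstack([dark, bright])

medoid_idx = k_medoids(patches, k=2)
codebook = patches[medoid_idx]  # exemplars are actual training patches
codes = assign_codewords(patches, codebook)
```

Unlike K-means centroids, each codebook entry here is a real patch from the training set, which matches the abstract's description of exemplars that encode local evidence.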
Keywords :
graph theory; handwriting recognition; image segmentation; learning (artificial intelligence); natural language processing; pattern clustering; ICDAR 2009 segmentation contest dataset; K-medoids; codebooks; graph partitioning; graph-based similarity; handwritten Arabic document images; handwritten text-line segmentation; image-patches; learning; Accuracy; Context; Fellows; Image segmentation; Support vector machines; Training; Training data; codebook; segmentation; text-line
Conference_Title :
Frontiers in Handwriting Recognition (ICFHR), 2012 International Conference on
Conference_Location :
Bari
Print_ISBN :
978-1-4673-2262-1
DOI :
10.1109/ICFHR.2012.228