• DocumentCode
    178445
  • Title

    A Coarse-to-Fine Word Spotting Approach for Historical Handwritten Documents Based on Graph Embedding and Graph Edit Distance

  • Author

    Peng Wang ; Eglin, V. ; Garcia, C. ; Largeron, C. ; Llados, J. ; Fornes, A.

  • Author_Institution
    LIRIS, INSA-Lyon, Villeurbanne, France
  • fYear
    2014
  • fDate
    24-28 Aug. 2014
  • Firstpage
    3074
  • Lastpage
    3079
  • Abstract
    Effective information retrieval from handwritten document images, especially historical ones, has always been a challenging task. In this paper, we propose a coarse-to-fine handwritten word spotting approach based on graph representation. The presented model captures both the topological and morphological signatures of the handwriting. Skeleton-based graphs with Shape Context labelled vertexes are built for connected components, and each word image is represented as a sequence of graphs. To obtain a practical and efficient word spotting approach for large-scale historical handwritten documents, a fast, coarse comparison based on graph embedding is first applied to prune the regions that are not similar to the query. Afterwards, the query and the remaining regions of interest are compared by graph edit distance based on Dynamic Time Warping alignment. The proposed approach is evaluated on a public dataset containing 50 pages of historical marriage license records. The results show that the proposed approach achieves a compromise between efficiency and accuracy.
  • Keywords
    document image processing; graph theory; history; image representation; image retrieval; coarse-to-fine handwritten word spotting approach; dynamic time warping alignment; graph edit distance; graph embedding methodology; graph representation; handwritten document images; historical handwritten documents; historical marriage license records; information retrieval; public dataset; shape context labelled vertexes; skeleton-based graph; Context; Image segmentation; Labeling; Pattern recognition; Shape; Skeleton; Vectors; coarse-to-fine mechanism; graph edit distance; graph embedding; graph-based representation; word spotting;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Pattern Recognition (ICPR), 2014 22nd International Conference on
  • Conference_Location
    Stockholm
  • ISSN
    1051-4651
  • Type

    conf

  • DOI
    10.1109/ICPR.2014.530
  • Filename
    6977242
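
The coarse-to-fine scheme outlined in the abstract (prune with a cheap comparison, then align the survivors with Dynamic Time Warping) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes each word is already given as a sequence of fixed-length numeric vectors, and it uses plain Euclidean distance as a stand-in both for the graph embedding comparison and for the graph edit distance between two graphs. The names `dtw`, `euclidean`, `mean_vector`, `spot` and `coarse_threshold` are hypothetical and chosen only for this illustration.

```python
import math


def euclidean(u, v):
    # Stand-in local cost; the record describes graph edit distance here.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def mean_vector(seq):
    # Assumed coarse signature: the average of a word's vectors.
    dim = len(seq[0])
    return [sum(v[i] for v in seq) / len(seq) for i in range(dim)]


def dtw(seq_a, seq_b, cost):
    # Classic Dynamic Time Warping: accumulated cost of the best
    # monotonic alignment between the two sequences.
    n, m = len(seq_a), len(seq_b)
    d = [[math.inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            c = cost(seq_a[i - 1], seq_b[j - 1])
            d[i][j] = c + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]


def spot(query, candidates, coarse_threshold):
    # Coarse stage: discard candidates whose signature is far from the query's.
    q_mean = mean_vector(query)
    survivors = [
        (name, seq)
        for name, seq in candidates.items()
        if euclidean(q_mean, mean_vector(seq)) <= coarse_threshold
    ]
    # Fine stage: rank the survivors by DTW cost, best match first.
    ranked = sorted((dtw(query, seq, euclidean), name) for name, seq in survivors)
    return [(name, score) for score, name in ranked]


query = [[0, 0], [1, 1]]
candidates = {
    "a": [[0, 0], [1, 1]],
    "b": [[0, 0], [1, 0], [1, 1]],
    "c": [[10, 10], [11, 11]],
}
result = spot(query, candidates, coarse_threshold=2.0)
```

In this toy run, candidate "c" is removed by the coarse stage, so the more expensive DTW comparison runs only on "a" and "b". The threshold controls the trade-off between efficiency and accuracy that the abstract mentions: a tighter value prunes more regions but risks discarding true matches.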