• DocumentCode
    183419
  • Title
    An Approach of Strike-Through Text Identification from Handwritten Documents
  • Author
    Adak, Chandranath ; Chaudhuri, Bidyut B.
  • Author_Institution
    Dept. of Comput. Sci. & Eng., Univ. of Kalyani, Kalyani, India
  • fYear
    2014
  • fDate
    1-4 Sept. 2014
  • Firstpage
    643
  • Lastpage
    648
  • Abstract
    A handwritten document may contain strike-through texts. If such texts are fed into an OCR system, the output will be garbage. In this paper, we propose a scheme to detect such strike-through texts/words. Using a graph-based model, we represent a textual connected component as a graph. The start/end and intersection points of the ink-strokes of a component are marked as graph nodes. There exists an edge between two nodes if they are connected by object (ink) pixels. By eliminating parallel edges and self loops, we obtain a simple, undirected, edge-weighted graph of the text-component. The edge-weight is found by adding horizontal/vertical moves weighted by 1 and diagonal moves weighted by √2. In this graph, we find the shortest path which is nearly as long as the width of the text component and maintains a reasonable degree of straightness. This path, if it exists, is identified as the strike-through line. Here we deal with handwritten documents in English, Bengali and Devanagari scripts. Our approach delivers fairly good results.
  • Keywords
    document image processing; graph theory; handwriting recognition; handwritten character recognition; natural languages; optical character recognition; text analysis; Bengali script; Devanagari script; English script; OCR system; edge-weighted graph; graph based model; handwritten document; ink-strokes; parallel edges; self loops; strike-through text identification; textual connected component; undirected graph; Character recognition; Databases; Handwriting recognition; Hidden Markov models; Image edge detection; Ink; Optical character recognition software; Document image analysis; Handwritten document; Optical character recognition; Strike-through text;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Frontiers in Handwriting Recognition (ICFHR), 2014 14th International Conference on
  • Conference_Location
    Heraklion
  • ISSN
    2167-6445
  • Print_ISBN
    978-1-4799-4335-7
  • Type
    conf
  • DOI
    10.1109/ICFHR.2014.113
  • Filename
    6981092
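
The edge-weight rule stated in the abstract (horizontal/vertical moves count 1, diagonal moves count √2) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name edge_weight and the representation of an ink stroke as a list of (row, col) pixel coordinates are assumptions made only for illustration.

```python
import math

def edge_weight(pixels):
    """Length of an 8-connected pixel path (hypothetical helper).

    `pixels` is an assumed representation: a list of (row, col) tuples
    tracing the ink between two graph nodes. Horizontal/vertical moves
    add 1 and diagonal moves add sqrt(2), as described in the abstract.
    """
    total = 0.0
    for (r0, c0), (r1, c1) in zip(pixels, pixels[1:]):
        dr, dc = abs(r1 - r0), abs(c1 - c0)
        if max(dr, dc) != 1:
            raise ValueError("consecutive pixels must be 8-connected neighbours")
        total += math.sqrt(2) if (dr == 1 and dc == 1) else 1.0
    return total

# One horizontal move followed by one diagonal move.
w = edge_weight([(0, 0), (0, 1), (1, 2)])
```

Here w is the sum of one unit move and one diagonal move. Comparing such weights against the width of the text component, as the abstract describes, would require thresholds that the record does not state.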