• DocumentCode
    2530487
  • Title
    Fast handwriting recognition for indexing historical documents
  • Author
    Govindaraju, Venu ; Xue, Hanhong
  • Author_Institution
    Center of Excellence for Document Analysis and Recognition, State University of New York, Buffalo, NY, USA
  • fYear
    2004
  • fDate
    2004
  • Firstpage
    314
  • Lastpage
    320
  • Abstract
    Handwriting recognition (HR) has been successfully used in several applications such as postal address interpretation [S. Srihari et al., (1997)], bank check reading [S. Impedovo et al., (1997)], and forms reading [S. Madhvanath et al., (1995)]. These applications are all characterized by small or fixed lexicons afforded by contextual knowledge. Machine recognition of handwriting in historical documents presents two primary challenges: (i) large lexicons (over 10,000 words), which lead to low recognition accuracy (less than 50%), and (ii) a need for high-speed HR, given the millions of handwritten manuscripts in digital library repositories and the fact that recognition speed is usually inversely proportional to lexicon size. We address the issue of speed when dealing with large lexicons. We present several techniques that improve processing speed, yielding a gain of up to 7 times in matching time, and describe a method whereby the large lexicon is divided into smaller sets that are processed in parallel. With 4 processors, an 18-fold speedup of the matching phase is achieved.
  • Keywords
    digital libraries; document handling; handwriting recognition; handwritten character recognition; history; indexing; digital library repositories; fast handwriting recognition; historical document indexing; lexicon size; machine recognition; postal address interpretation; Background noise; Handwriting recognition; History; Image analysis; Image recognition; Indexing; Noise figure; Software libraries; Text analysis
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Document Image Analysis for Libraries, 2004. Proceedings. First International Workshop on
  • Print_ISBN
    0-7695-2088-X
  • Type
    conf
  • DOI
    10.1109/DIAL.2004.1263260
  • Filename
    1263260