• DocumentCode
    3487097
  • Title
    Improving HMM-Based Keyword Spotting with Character Language Models
  • Author
    Fischer, Andreas ; Frinken, Volkmar ; Bunke, Horst ; Suen, Ching
  • Author_Institution
    CENPARMI, Concordia Univ., Montreal, QC, Canada
  • fYear
    2013
  • fDate
    25-28 Aug. 2013
  • Firstpage
    506
  • Lastpage
    510
  • Abstract
    Given the high error rates and slow recognition speed of full-text transcription for unconstrained handwriting images, keyword spotting is a promising alternative for locating specific search terms within scanned document images. We have previously proposed a learning-based method for keyword spotting using character hidden Markov models that showed high performance when compared with traditional template image matching. In the lexicon-free approach pursued, only the text appearance was taken into account for recognition. In this paper, we integrate character n-gram language models into the spotting system in order to provide additional language context. On the modern IAM database as well as the historical George Washington database, we demonstrate that character language models significantly improve the spotting performance.
  • Keywords
    document image processing; handwriting recognition; hidden Markov models; image matching; learning (artificial intelligence); HMM-based keyword spotting; IAM database; character hidden Markov models; character language models; character n-gram language models; full text transcription; historical George Washington database; learning-based method; lexicon-free approach; scanned document images; template image matching; text appearance; unconstrained handwriting images; Character recognition; Databases; Handwriting recognition; Hidden Markov models; Mathematical model; Text recognition; Viterbi algorithm; handwriting recognition; hidden Markov models; keyword spotting; language models;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Document Analysis and Recognition (ICDAR), 2013 12th International Conference on
  • Conference_Location
    Washington, DC
  • ISSN
    1520-5363
  • Type
    conf
  • DOI
    10.1109/ICDAR.2013.107
  • Filename
    6628672