• DocumentCode
    2841055
  • Title
    Modelling polyfont printed characters with HMMs and a shift invariant Hamming distance
  • Author
    Elms, A.J.; Illingworth, J.
  • Author_Institution
    Dept. of Electronic & Electrical Engineering, University of Surrey, Guildford, UK
  • Volume
    1
  • fYear
    1995
  • fDate
    14-16 Aug 1995
  • Firstpage
    504
  • Abstract
    Rumours of the death of the problem of machine-printed text recognition have been greatly exaggerated. Reported results can be good enough to lead one to believe that this is a “solved problem”. Closer analysis reveals test data that is often limited in its range of fonts and point sizes. Worse still, results are commonly quoted for noise-free images, ignoring the problems of recognising “real” documents such as faxes. Various methods have been proposed for modelling characters with Hidden Markov Models (HMMs). The authors, amongst others, have suggested representing a character by analysing the pixel pattern in columns of its image, and linking sequential column patterns together with an HMM. In this paper we propose a method of quantising the patterns by means of a Shift Invariant Hamming Distance. A full experimental evaluation (45 fonts, 5 point sizes) in typical noise results in a recognition accuracy of 99% in the top-3 choices, and 94% top-choice for the best font. The method has a significant advantage in recognising noisy word images, due to classification being achieved without a prior segmentation of the word into characters.
  • Keywords
    Hamming codes; hidden Markov models; optical character recognition; vector quantisation; machine-printed text recognition; noise-free images; noisy word images; polyfont printed characters; sequential column patterns; shift invariant Hamming distance; Data analysis; Hamming distance; Hidden Markov models; Image analysis; Image recognition; Joining processes; Pattern analysis; Pixel; Testing; Text recognition
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Proceedings of the Third International Conference on Document Analysis and Recognition, 1995
  • Conference_Location
    Montreal, Que.
  • Print_ISBN
    0-8186-7128-9
  • Type
    conf
  • DOI
    10.1109/ICDAR.1995.599044
  • Filename
    599044
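
The abstract names a Shift Invariant Hamming Distance for quantising column pixel patterns but does not define it. The sketch below shows one plausible reading, stated purely as an assumption: the distance between two binary columns is taken as the minimum ordinary Hamming distance over a small range of vertical shifts, and a column is quantised to its nearest codebook entry. The function names, the `max_shift` parameter, the zero (background) padding, and the toy codebook are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Ordinary Hamming distance between two equal-length binary columns."""
    return sum(x != y for x, y in zip(a, b))


def shift(col: Sequence[int], k: int) -> List[int]:
    """Shift a column by k pixels (positive = down), padding with 0.

    Assumption: pixels shifted out are dropped and background (0) is
    shifted in; the paper may handle borders differently.
    """
    n = len(col)
    m = min(abs(k), n)
    if k >= 0:
        return [0] * m + list(col[: n - m])
    return list(col[m:]) + [0] * m


def shift_invariant_hamming(a: Sequence[int], b: Sequence[int],
                            max_shift: int = 2) -> int:
    """Minimum Hamming distance over vertical shifts of b in [-max_shift, max_shift].

    Assumption: this is an illustrative definition, not the authors' own.
    """
    if len(a) != len(b):
        raise ValueError("columns must have the same height")
    return min(hamming(a, shift(b, k))
               for k in range(-max_shift, max_shift + 1))


def quantise(col: Sequence[int], codebook: Sequence[Sequence[int]],
             max_shift: int = 2) -> int:
    """Return the index of the codebook pattern closest to col."""
    return min(range(len(codebook)),
               key=lambda i: shift_invariant_hamming(col, codebook[i], max_shift))


# Toy usage: the column is the second codebook pattern moved up by one pixel.
column = [0, 1, 1, 0, 0, 0]
codebook = [[1, 1, 1, 1, 1, 1], [0, 0, 1, 1, 0, 0], [0, 0, 0, 0, 0, 0]]
symbol = quantise(column, codebook)
```

Under this reading, the resulting sequence of codebook indices (one per image column) would serve as the discrete observation sequence for an HMM, which matches the abstract's description of linking sequential column patterns together.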