• DocumentCode
    2633182
  • Title
    Extended character defect model for recognition of text from maps
  • Author
    Pezeshk, Aria ; Tutwiler, Richard L.
  • Author_Institution
    Appl. Res. Lab., Pennsylvania State Univ., State College, PA, USA
  • fYear
    2010
  • fDate
    23-25 May 2010
  • Firstpage
    85
  • Lastpage
    88
  • Abstract
    Topographic maps contain a small amount of text compared to other forms of printed documents. Furthermore, the text and graphical components typically intersect with one another thus making the extraction of text a very difficult task. Creating training sets with a suitable size from the actual characters in maps would therefore require the laborious processing of many maps with similar features and the manual extraction of character samples. This paper extends the types of defects represented by Baird's document image degradation model in order to create pseudo randomly generated training sets that closely mimic the various artifacts and defects encountered in characters extracted from maps. Two Hidden Markov Models are then trained and used to recognize the text. Tests performed on extracted street labels show an improvement in performance from 88.4% when only the original Baird's model is used to a character recognition rate of 93.2% when the extended defect model is used for training.
  • Keywords
    cartography; document image processing; hidden Markov models; learning (artificial intelligence); text analysis; document image degradation model; extended character defect model; pseudo randomly generated training sets; text recognition; topographic maps; Artificial neural networks; Character recognition; Data mining; Degradation; Feature extraction; Graphics; Image recognition; Optical character recognition software
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Image Analysis & Interpretation (SSIAI), 2010 IEEE Southwest Symposium on
  • Conference_Location
    Austin, TX
  • Print_ISBN
    978-1-4244-7801-9
  • Type
    conf
  • DOI
    10.1109/SSIAI.2010.5483913
  • Filename
    5483913