• DocumentCode
    3141284
  • Title

    Multifont classification using typographical attributes

  • Author

    Jung, Min-Chul ; Shin, Yong-Chul ; Srihari, Sargur N.

  • Author_Institution
    Center of Excellence for Document Analysis and Recognition, State Univ. of New York, Buffalo, NY, USA
  • fYear
    1999
  • fDate
    20-22 Sep 1999
  • Firstpage
    353
  • Lastpage
    356
  • Abstract
    This paper introduces a multifont classification scheme to aid the recognition of multifont and multisize characters. The scheme uses typographical attributes such as ascenders, descenders, and serifs obtained from a word image. These attributes are input to a neural network classifier, which produces the font classification. The scheme can classify 7 commonly used fonts at all point sizes from 7 to 18 and can handle a wide range of image quality, even with severely touching characters. Detecting the font can improve character segmentation as well as character recognition, because the identified font provides information on the structure and typographical design of the characters. The algorithm can therefore help a machine-printed OCR system maintain good recognition rates regardless of font and size. Experiments show that font classification accuracy reaches about 95 percent even with severely touching characters. The technique developed for the 7 selected fonts can be applied to any other font.
  • Keywords
    character sets; document image processing; image classification; image segmentation; neural nets; optical character recognition; ascenders; character recognition; character segmentation; descenders; font detection; font identification; image quality; machine printed OCR system; multifont character recognition; multifont classification; multisize character recognition; neural network classifier; serifs; severely touching characters; typographical attributes; typographical design; word image; Character recognition; Electronic switching systems; Image quality; Image segmentation; Optical character recognition software; Read only memory; Shape; Text analysis
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Document Analysis and Recognition, 1999. ICDAR '99. Proceedings of the Fifth International Conference on
  • Conference_Location
    Bangalore
  • Print_ISBN
    0-7695-0318-7
  • Type

    conf

  • DOI
    10.1109/ICDAR.1999.791797
  • Filename
    791797