  • DocumentCode
    2629848
  • Title
    Perfect metrics
  • Author
    Ho, Tin Kam ; Baird, Henry S.
  • Author_Institution
    AT&T Bell Lab., Murray Hill, NJ, USA
  • fYear
    1993
  • fDate
    20-22 Oct 1993
  • Firstpage
    593
  • Lastpage
    597
  • Abstract
    The authors describe an experiment in the construction of perfect metrics for minimum-distance classification of character images. A perfect metric is one that, with high probability, is zero for correct classifications and non-zero for incorrect classifications. Such metrics promise excellent reject behavior in addition to good rank ordering. The approach is to infer from the training data faithful but concise representations of the empirical class-conditional distributions. In doing this, the authors have abandoned many of the usual simplifying assumptions about the distributions, e.g., that they are simply-connected, unimodal, convex, or parametric (e.g., Gaussian). The method requires unusually large and representative training sets, which the authors provide through pseudorandom generation of training samples using a realistic model of printing and imaging distortions. The authors illustrate the method on a challenging recognition problem: 3755 character classes of machine-print Chinese, in four typefaces, over a range of text sizes. In a test on over three million images, the perfect-metric classifier achieved better than 99% top-choice accuracy. It is also shown to be superior to a conventional parametric classifier.
  • Keywords
    character sets; image classification; optical character recognition; OCR; character images; class-conditional distributions; correct classifications; imaging distortions; incorrect classifications; machine-print Chinese; minimum-distance classification; perfect metrics; perfect-metric classifier; printing; probability; pseudorandom generation; rank ordering; reject behavior; training sets; typefaces; Character generation; Character recognition; Image generation; Image recognition; Parametric statistics; Pixel; Printing; Testing; Tin; Training data;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Document Analysis and Recognition, 1993., Proceedings of the Second International Conference on
  • Conference_Location
    Tsukuba Science City
  • Print_ISBN
    0-8186-4960-7
  • Type
    conf
  • DOI
    10.1109/ICDAR.1993.395665
  • Filename
    395665