  • DocumentCode
    419628
  • Title
    Probability table compression using distributional clustering for scanning n-tuple classifiers
  • Author
    Hu, Jianying; Ratzlaff, Eugene
  • Author_Institution
    IBM T.J. Watson Res. Center, Yorktown Heights, NY, USA
  • Volume
    2
  • fYear
    2004
  • fDate
    23-26 Aug. 2004
  • Firstpage
    533
  • Abstract
    A method for compressing tables of probability distributions using distributional clustering is presented and applied to shrink the look-up tables of a scanning n-tuple handwritten character recognizer. Lossy compression is realized by clustering n-tuples that are observed to induce similar class probability distributions. A new distance metric called "weighted mean KL divergence" is introduced to assess similarity and account for the cumulative effect of merging two distributions. After compression, cluster membership is rebalanced in an annealing-like process. The proposed method is evaluated on three isolated-character subsets of the UNIPEN database. Compression ratios in excess of 2000:1 are demonstrated for 5-tuple classifiers.
  • Keywords
    data compression; handwritten character recognition; statistical distributions; UNIPEN database; distributional clustering; handwritten character recognizer; lossy compression; probability distribution table compression; scanning N-tuple classifiers; weighted mean KL divergence; Annealing; Character generation; Character recognition; Costs; Databases; Handwriting recognition; Maximum likelihood decoding; Maximum likelihood estimation; Merging; Probability distribution
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Proceedings of the 17th International Conference on Pattern Recognition (ICPR 2004)
  • ISSN
    1051-4651
  • Print_ISBN
    0-7695-2128-2
  • Type
    conf
  • DOI
    10.1109/ICPR.2004.1334293
  • Filename
    1334293
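
Note on the abstract: the merging criterion it describes can be illustrated with a short sketch. This record does not reproduce the paper's exact definition of "weighted mean KL divergence", so the form below is an assumption: each n-tuple's class distribution is compared with the count-weighted mean of the pair, and the two KL terms are weighted by observation counts so that merging frequently seen n-tuples costs more. The names kl, merge, and weighted_mean_kl are hypothetical.

```python
import math


def kl(p, q):
    """KL divergence D(p || q) in nats; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def merge(p, n_p, q, n_q):
    """Count-weighted mean of two class distributions (the merged cluster)."""
    total = n_p + n_q
    return [(n_p * a + n_q * b) / total for a, b in zip(p, q)]


def weighted_mean_kl(p, n_p, q, n_q):
    """Assumed form of the merge cost: count-weighted KL of each member
    distribution to the merged mean. Not taken verbatim from the paper."""
    m = merge(p, n_p, q, n_q)
    return n_p * kl(p, m) + n_q * kl(q, m)


# Two n-tuples with similar class distributions are cheap to merge;
# two with opposite distributions are expensive.
similar = weighted_mean_kl([0.7, 0.3], 10, [0.6, 0.4], 10)
opposite = weighted_mean_kl([1.0, 0.0], 10, [0.0, 1.0], 10)
```

Under this assumed form, a greedy clustering pass would repeatedly merge the pair with the smallest cost until the table reaches the target size; the annealing-like rebalancing mentioned in the abstract is not sketched here.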