• DocumentCode
    3165768
  • Title
    General Averaged Divergence Analysis
  • Author
    Tao, Dacheng ; Li, Xuelong ; Wu, Xindong ; Maybank, Stephen J.
  • Author_Institution
    Hong Kong Polytech. Univ., Hong Kong
  • fYear
    2007
  • fDate
    28-31 Oct. 2007
  • Firstpage
    302
  • Lastpage
    311
  • Abstract
    Subspace selection is a powerful tool in data mining. An important subspace method is the Fisher-Rao linear discriminant analysis (LDA), which has been successfully applied in many fields such as biometrics, bioinformatics, and multimedia retrieval. However, LDA has a critical drawback: the projection to a subspace tends to merge those classes that are close together in the original feature space. If the separated classes are sampled from Gaussian distributions, all with identical covariance matrices, then LDA maximizes the mean value of the Kullback-Leibler (KL) divergences between the different classes. We generalize this point of view to obtain a framework for choosing a subspace by 1) generalizing the KL divergence to the Bregman divergence and 2) generalizing the arithmetic mean to a general mean. The framework is named the general averaged divergence analysis (GADA). Under this GADA framework, a geometric mean divergence analysis (GMDA) method based on the geometric mean is studied. A large number of experiments based on synthetic data show that our method significantly outperforms LDA and several representative LDA extensions.
  • Keywords
    Gaussian distribution; arithmetic; covariance matrices; data mining; statistical analysis; Bregman divergence; Fisher-Rao linear discriminant analysis; Gaussian distributions; Kullback-Leibler divergences; arithmetic mean; general averaged divergence analysis; general mean; geometric mean divergence analysis; subspace selection; synthetic data; Biometrics; Computer science; Covariance matrix; Information systems; Linear discriminant analysis; Merging; USA Councils
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Seventh IEEE International Conference on Data Mining (ICDM 2007)
  • Conference_Location
    Omaha, NE
  • ISSN
    1550-4786
  • Print_ISBN
    978-0-7695-3018-5
  • Type
    conf
  • DOI
    10.1109/ICDM.2007.105
  • Filename
    4470254
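
The abstract contrasts two ways of averaging pairwise KL divergences when choosing a subspace: the arithmetic mean (which, per the abstract, is what LDA maximizes for Gaussian classes with identical covariance matrices) and the geometric mean (the basis of GMDA). The sketch below is a toy illustration of that contrast only, not the authors' algorithm. It assumes three hypothetical 2-D class means, unit-variance isotropic Gaussians (so the KL divergence after projection reduces to half the squared distance between projected means), a 1-D projection, and a plain grid search over directions; all names and values are illustrative.

```python
import math
from itertools import combinations

# Hypothetical class means in 2-D: A and B are close together, C is far away.
MEANS = {"A": (0.0, 0.0), "B": (0.0, 1.0), "C": (10.0, 0.0)}


def pairwise_kl(theta):
    """KL divergence for every class pair after projecting onto the unit
    vector (cos theta, sin theta).

    Assumption: every class is Gaussian with identity covariance, so the
    projected variance is 1 and KL = 0.5 * (projected mean difference)^2.
    """
    w = (math.cos(theta), math.sin(theta))
    proj = {k: w[0] * m[0] + w[1] * m[1] for k, m in MEANS.items()}
    return {
        (a, b): 0.5 * (proj[a] - proj[b]) ** 2
        for a, b in combinations(sorted(proj), 2)
    }


def arithmetic_mean(values):
    values = list(values)
    return sum(values) / len(values)


def geometric_mean(values):
    values = list(values)
    # Any zero divergence (two merged classes) drives the whole mean to zero.
    return math.prod(values) ** (1.0 / len(values))


def best_direction(mean_fn, steps=1800):
    """Grid search over directions in [0, pi) for the angle that maximizes
    the chosen mean of the pairwise divergences."""
    thetas = [math.pi * i / steps for i in range(steps)]
    return max(thetas, key=lambda t: mean_fn(pairwise_kl(t).values()))


theta_arith = best_direction(arithmetic_mean)
theta_geo = best_direction(geometric_mean)

# Smallest pairwise divergence under each criterion: a value near zero
# means two classes have been merged by the projection.
min_kl_arith = min(pairwise_kl(theta_arith).values())
min_kl_geo = min(pairwise_kl(theta_geo).values())

print(f"arithmetic mean: theta={math.degrees(theta_arith):.1f} deg, "
      f"smallest pairwise KL={min_kl_arith:.4f}")
print(f"geometric mean:  theta={math.degrees(theta_geo):.1f} deg, "
      f"smallest pairwise KL={min_kl_geo:.4f}")
```

In this toy setup the arithmetic-mean direction is dominated by the distant class C and leaves the divergence between A and B close to zero, which is the class-merging drawback the abstract describes. The geometric-mean direction gives up some separation from C to keep A and B apart, because a single near-zero divergence pulls the whole geometric mean down.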