Ensemble Kernel Mean Matching

Author

Yun-Qian Miao;Ahmed K. Farahat;Mohamed S. Kamel

Author_Institution

Univ. of Waterloo, Waterloo, ON, Canada

fYear

2015

Firstpage

330

Lastpage

338

Abstract

Kernel Mean Matching (KMM) is an elegant algorithm that produces density ratios between training and test data by minimizing their maximum mean discrepancy in a kernel space. The applicability of KMM to large-scale problems is, however, hindered by the quadratic complexity of calculating and storing the kernel matrices over training and test data. To address this problem, this paper proposes a novel ensemble algorithm for KMM, which divides the test samples into smaller partitions, estimates a density ratio for each partition, and then fuses these local estimates with a weighted sum. Our theoretical analysis shows that the ensemble KMM has a lower error bound than the centralized KMM, which uses all the test data at once to estimate the density ratio. Because it is well suited to distributed implementation, the proposed algorithm is also favorable in terms of time and space complexity. Experiments on benchmark datasets confirm the superiority of the proposed algorithm in terms of estimation accuracy and running time.
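
The partition, estimate, and fuse steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: samples are 1-D scalars, the kernel is Gaussian, the per-partition KMM problem is solved approximately by projected gradient descent with only the box constraint (the usual constraint on the mean of the weights is omitted), partitions are formed by strided slicing, and each partition's fusion weight is taken to be proportional to its size. All function and parameter names (`rbf`, `kmm_weights`, `ensemble_kmm_weights`, `sigma`, `upper`, `steps`, `n_parts`) are hypothetical.

```python
import math

def rbf(x, y, sigma=1.0):
    """Gaussian kernel between two scalar samples."""
    return math.exp(-((x - y) ** 2) / (2.0 * sigma ** 2))

def kmm_weights(train, test, sigma=1.0, upper=10.0, steps=200):
    """Approximate KMM weights for the training samples.

    Minimizes 0.5 * b'Kb - kappa'b subject to 0 <= b_i <= upper, where
    K is the train/train kernel matrix and
    kappa_i = (n_tr / n_te) * sum_j k(train_i, test_j).
    Assumption: the constraint on the mean of b is left out for brevity.
    """
    n_tr, n_te = len(train), len(test)
    K = [[rbf(a, b, sigma) for b in train] for a in train]
    kappa = [n_tr / n_te * sum(rbf(a, t, sigma) for t in test) for a in train]
    # The largest row sum upper-bounds the top eigenvalue of K,
    # so its inverse is a safe gradient step size.
    lr = 1.0 / max(sum(row) for row in K)
    beta = [1.0] * n_tr
    for _ in range(steps):
        grad = [sum(K[i][j] * beta[j] for j in range(n_tr)) - kappa[i]
                for i in range(n_tr)]
        # Gradient step followed by projection onto the box [0, upper].
        beta = [min(upper, max(0.0, b - lr * g)) for b, g in zip(beta, grad)]
    return beta

def ensemble_kmm_weights(train, test, n_parts=4, **kwargs):
    """Run KMM per test partition and fuse the results by a weighted sum."""
    # Assumption: strided slicing stands in for whatever partitioning is used.
    parts = [p for p in (test[k::n_parts] for k in range(n_parts)) if p]
    fused = [0.0] * len(train)
    for part in parts:
        w = len(part) / len(test)  # assumption: weight by partition size
        local = kmm_weights(train, part, **kwargs)
        fused = [f + w * b for f, b in zip(fused, local)]
    return fused

train = [-3.0 + 0.5 * i for i in range(13)]  # grid on [-3, 3]
test = [0.5 + 0.05 * j for j in range(20)]   # concentrated near 1
beta = ensemble_kmm_weights(train, test, n_parts=4)
```

In this sketch each call to `kmm_weights` only touches the kernel values between the training set and one partition, so the partitions could be processed independently, which is the property the abstract points to for distributed implementation.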

Keywords

"Kernel","Training","Partitioning algorithms","Estimation","Complexity theory","Density functional theory","Algorithm design and analysis"

Publisher

IEEE

Conference_Titel

Data Mining (ICDM), 2015 IEEE International Conference on

ISSN

1550-4786

Type

conf

DOI

10.1109/ICDM.2015.127

Filename

7373337