DocumentCode :
67622
Title :
Adaptive Noisy Clustering
Author :
Chichignoud, Michael ; Loustau, Sebastien
Author_Institution :
ETH Zurich, Zurich, Switzerland
Volume :
60
Issue :
11
fYear :
2014
fDate :
Nov. 2014
Firstpage :
7279
Lastpage :
7292
Abstract :
The problem of adaptive noisy clustering is investigated. Given a set of noisy observations Z_i = X_i + ε_i, i = 1, …, n, the goal is to design clusters associated with the law of the X_i, which has unknown density f with respect to the Lebesgue measure. Since we observe a corrupted sample, a direct approach such as the popular k-means is not suitable in this case. In this paper, we propose a noisy k-means minimization, which is based on the k-means loss function and a deconvolution estimator of the density f. In particular, this approach suffers from its dependence on the bandwidth of the deconvolution kernel. Fast rates of convergence for the excess risk are obtained for a particular choice of the bandwidth, which depends on the smoothness of the density f. We then turn to the main issue of this paper: the data-driven choice of the bandwidth. We state an adaptive upper bound using a modified version of Lepski's method, called Empirical Risk Comparison, in which empirical risks associated with different bandwidths are compared. Finally, we illustrate that the selection rule can be used in many statistical problems of M-estimation where the empirical risk depends on a nuisance parameter.
Keywords :
adaptive estimation; deconvolution; minimisation; nonparametric statistics; pattern clustering; Lebesgue measure; Lepski method; M-estimation; adaptive noisy clustering; adaptive upper bound; data-driven choice; deconvolution estimator; deconvolution kernel; empirical risk comparison; excess risk; k-means loss function; noisy k-means minimization; noisy observations; selection rule; statistical problems; Bandwidth; Convergence; Deconvolution; Estimation; Kernel; Noise measurement; Standards; Adaptivity; M-estimation; deconvolution; errors-in-variables; fast rates; statistical learning;
fLanguage :
English
Journal_Title :
Information Theory, IEEE Transactions on
Publisher :
IEEE
ISSN :
0018-9448
Type :
jour
DOI :
10.1109/TIT.2014.2356577
Filename :
6898023