Regional Information Center for Science and Technology

DocumentCode :

67622

Title :

Adaptive Noisy Clustering

Author :

Chichignoud, Michael ; Loustau, Sebastien

Author_Institution :

ETH Zurich, Zurich, Switzerland

Volume :

Issue :

fYear :

2014

fDate :

Nov. 2014

Firstpage :

7279

Lastpage :

7292

Abstract :

The problem of adaptive noisy clustering is investigated. Given a set of noisy observations Z_i = X_i + ε_i, i = 1, …, n, the goal is to design clusters associated with the law of the X_i, which has unknown density f with respect to the Lebesgue measure. Since we observe a corrupted sample, a direct approach such as the popular k-means is not suitable in this case. In this paper, we propose a noisy k-means minimization, which is based on the k-means loss function and a deconvolution estimator of the density f. A drawback of this approach is its dependence on the bandwidth involved in the deconvolution kernel. Fast rates of convergence for the excess risk are obtained for a particular choice of the bandwidth, which depends on the smoothness of the density f. Then, we turn to the main issue of this paper: the data-driven choice of the bandwidth. We state an adaptive upper bound using a modified version of Lepski's method, called Empirical Risk Comparison, in which empirical risks associated with different bandwidths are compared. Finally, we illustrate that the selection rule can be used in many statistical problems of M-estimation where the empirical risk depends on a nuisance parameter.

Keywords :

adaptive estimation; deconvolution; minimisation; nonparametric statistics; pattern clustering; Lebesgue measure; Lepski method; M-estimation; adaptive noisy clustering; adaptive upper bound; data-driven choice; deconvolution estimator; deconvolution kernel; empirical risk comparison; excess risk; k-means loss function; noisy k-means minimization; noisy observations; selection rule; statistical problems; Bandwidth; Convergence; Deconvolution; Estimation; Kernel; Noise measurement; Standards; Adaptivity; M-estimation; deconvolution; errors-in-variables; fast rates; statistical learning;

fLanguage :

English

Journal_Title :

Information Theory, IEEE Transactions on

Publisher :

IEEE

ISSN :

0018-9448

Type :

jour

DOI :

10.1109/TIT.2014.2356577

Filename :

6898023

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=67622