Title of article :
On the layered nearest neighbour estimate, the bagged nearest neighbour estimate and the random forest method in regression and classification
Author/Authors :
Biau, Gérard and Devroye, Luc
Issue Information :
Biannual journal, serial issue number, year 2010
Abstract :
Let $X_1, \dots, X_n$ be identically distributed random vectors in $\mathbb{R}^d$, independently drawn according to some probability density. An observation $X_i$ is said to be a layered nearest neighbour (LNN) of a point $x$ if the hyperrectangle defined by $x$ and $X_i$ contains no other data points. We first establish consistency results on $L_n(x)$, the number of LNN of $x$. Then, given a sample $(X, Y), (X_1, Y_1), \dots, (X_n, Y_n)$ of independent identically distributed random vectors from $\mathbb{R}^d \times \mathbb{R}$, one may estimate the regression function $r(x) = \mathbb{E}[Y \mid X = x]$ by the LNN estimate $r_n(x)$, defined as an average over the $Y_i$'s corresponding to those $X_i$ which are LNN of $x$. Under mild conditions on $r$, we establish the consistency of $\mathbb{E}\,|r_n(x) - r(x)|^p$ towards 0 as $n \to \infty$, for almost all $x$ and all $p \ge 1$, and discuss the links between $r_n$ and the random forest estimates of Breiman (2001) [8]. We finally show the universal consistency of the bagged (bootstrap-aggregated) nearest neighbour method for regression and classification.
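The LNN definition and the LNN estimate described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `in_box`, `layered_nearest_neighbours` and `lnn_estimate` are hypothetical, the hyperrectangle is taken to be the closed axis-aligned box with opposite corners $x$ and $X_i$, and the data points are assumed to be distinct (which holds with probability one when they are drawn from a density).

```python
from typing import List, Sequence

Point = Sequence[float]


def in_box(z: Point, a: Point, b: Point) -> bool:
    """True if z lies in the closed axis-aligned box with opposite corners a and b."""
    return all(min(ai, bi) <= zi <= max(ai, bi) for zi, ai, bi in zip(z, a, b))


def layered_nearest_neighbours(x: Point, data: Sequence[Point]) -> List[int]:
    """Indices i such that the box spanned by x and data[i] holds no other data point.

    Assumption: data points are distinct (illustrative brute-force O(n^2 d) check).
    """
    lnn = []
    for i, xi in enumerate(data):
        blocked = any(j != i and in_box(xj, x, xi) for j, xj in enumerate(data))
        if not blocked:
            lnn.append(i)
    return lnn


def lnn_estimate(x: Point, data: Sequence[Point], responses: Sequence[float]) -> float:
    """Average of the responses Y_i over the layered nearest neighbours of x."""
    if len(data) == 0 or len(data) != len(responses):
        raise ValueError("data and responses must be non-empty and of equal length")
    idx = layered_nearest_neighbours(x, data)
    return sum(responses[i] for i in idx) / len(idx)


# Toy example in the plane (made-up numbers, for illustration only).
data = [(1.0, 1.0), (2.0, 2.0), (-1.0, 3.0), (3.0, -1.0), (0.5, -2.0)]
responses = [1.0, 10.0, 2.0, 3.0, 4.0]
x = (0.0, 0.0)

print(layered_nearest_neighbours(x, data))  # [0, 2, 3, 4]
print(lnn_estimate(x, data, responses))     # 2.5
```

In this toy example the point (2, 2) is not an LNN of the origin because the box it spans with the origin contains (1, 1). Its response is therefore left out of the average.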
Keywords :
Bagging, random forests, one nearest neighbour estimate, regression estimation, layered nearest neighbours
Journal title :
Journal of Multivariate Analysis