DocumentCode :
44937
Title :
Outlier-Robust PCA: The High-Dimensional Case
Author :
Xu, Huan ; Caramanis, Constantine ; Mannor, Shie
Author_Institution :
Dept. of Mech. Eng., Nat. Univ. of Singapore, Singapore, Singapore
Volume :
59
Issue :
1
fYear :
2013
fDate :
Jan. 2013
Firstpage :
546
Lastpage :
572
Abstract :
Principal component analysis plays a central role in statistics, engineering, and science. Because of the prevalence of corrupted data in real-world applications, much research has focused on developing robust algorithms. Perhaps surprisingly, these algorithms are unequipped, indeed unable, to deal with outliers in the high-dimensional setting, where the number of observations is of the same magnitude as the number of variables of each observation and the dataset contains some (arbitrarily) corrupted observations. We propose a high-dimensional robust principal component analysis algorithm that is efficient, robust to contaminated points, and easily kernelizable. In particular, our algorithm achieves maximal robustness: it has a breakdown point of 50% (the best possible), while all existing algorithms have a breakdown point of zero. Moreover, our algorithm recovers the optimal solution exactly in the case where the number of corrupted points grows sublinearly in the dimension.
Keywords :
data handling; principal component analysis; contaminated points; corrupted data; corrupted points; dataset; high-dimensional case; high-dimensional robust principal component analysis algorithm; optimal solution; outlier-robust PCA; real-world applications; robust algorithms; Approximation algorithms; Electric breakdown; Matrix decomposition; Noise; Principal component analysis; Robustness; Standards; Dimension reduction; outlier; principal component analysis (PCA); robustness; statistical learning
fLanguage :
English
Journal_Title :
Information Theory, IEEE Transactions on
Publisher :
IEEE
ISSN :
0018-9448
Type :
jour
DOI :
10.1109/TIT.2012.2212415
Filename :
6307864