DocumentCode :
2006139
Title :
Simultaneously Removing Noise and Selecting Relevant Features for High Dimensional Noisy Data
Author :
Byeon, Boseon ; Rasheed, Khaled
Author_Institution :
Comput. Sci., Univ. of Georgia, Athens, GA
fYear :
2008
fDate :
11-13 Dec. 2008
Firstpage :
147
Lastpage :
152
Abstract :
Classification from noisy, high-dimensional training data suffers simultaneously from noise and from irrelevant or redundant features. Noise corrupts the training data, while irrelevant or redundant features keep the classifier from picking out the relevant features when it builds the model; both can reduce classification accuracy. This paper introduces a novel approach that improves the quality of training data sets with a noisy dependent variable and high dimensionality by simultaneously removing noisy instances and selecting relevant features for classification. The approach relies on two genetic algorithms, one for noise detection and the other for feature selection, which exchange their results periodically at certain generation intervals. In the noise detection method, prototype selection is used together with the genetic algorithm to improve performance. The paper shows that the approach enhances the quality of noisy, high-dimensional training data sets and substantially increases classification accuracy.
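The cooperation between the two genetic algorithms described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the bit-mask encoding, the leave-one-out 1-NN fitness, the 0.5 penalty for discarded instances, the GA operators, and every name used here (`loo_1nn_accuracy`, `evolve`, `coevolve`, `exchange_every`) are assumptions made for illustration, and the prototype selection step is omitted. It only shows one plausible way for an instance-mask population and a feature-mask population to evolve separately and share their best masks every few generations.

```python
import random

def loo_1nn_accuracy(X, y, keep_rows, keep_cols):
    """Leave-one-out 1-NN accuracy over the kept rows, using only the kept columns."""
    rows = [i for i, k in enumerate(keep_rows) if k]
    cols = [j for j, k in enumerate(keep_cols) if k]
    if len(rows) < 2 or not cols:
        return 0.0
    correct = 0
    for i in rows:
        nearest = min((r for r in rows if r != i),
                      key=lambda r: sum((X[i][j] - X[r][j]) ** 2 for j in cols))
        correct += y[nearest] == y[i]
    return correct / len(rows)

def evolve(pop, fitness, rng, mutation_rate=0.05):
    """One generation: elitism, binary tournaments, one-point crossover, bit-flip mutation."""
    nxt = [max(pop, key=fitness)[:]]  # carry the current best mask over unchanged
    while len(nxt) < len(pop):
        a = max(rng.sample(pop, 2), key=fitness)
        b = max(rng.sample(pop, 2), key=fitness)
        cut = rng.randrange(1, len(a))
        child = a[:cut] + b[cut:]
        nxt.append([bit ^ (rng.random() < mutation_rate) for bit in child])
    return nxt

def coevolve(X, y, generations=20, exchange_every=5, pop_size=10, seed=0):
    """Evolve an instance mask and a feature mask, exchanging the best masks periodically."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])

    def random_mask(size):
        return [rng.randint(0, 1) for _ in range(size)]

    row_pop = [[1] * n] + [random_mask(n) for _ in range(pop_size - 1)]
    col_pop = [[1] * d] + [random_mask(d) for _ in range(pop_size - 1)]
    # Masks each GA received from the other at the last exchange (initially: keep all).
    best_rows, best_cols = [1] * n, [1] * d

    def row_fitness(mask):
        # Assumed fitness: accuracy with the shared feature mask, minus a
        # penalty for discarding instances (the 0.5 weight is arbitrary).
        return loo_1nn_accuracy(X, y, mask, best_cols) - 0.5 * (1 - sum(mask) / n)

    def col_fitness(mask):
        # Assumed fitness: accuracy on the instances kept by the shared instance mask.
        return loo_1nn_accuracy(X, y, best_rows, mask)

    for g in range(1, generations + 1):
        row_pop = evolve(row_pop, row_fitness, rng)
        col_pop = evolve(col_pop, col_fitness, rng)
        if g % exchange_every == 0:  # periodic exchange of results
            best_rows, best_cols = (max(row_pop, key=row_fitness),
                                    max(col_pop, key=col_fitness))
    return best_rows, best_cols  # masks as of the last exchange

# Toy data (made up): column 0 separates the classes, column 1 is noise,
# and the label of row 3 is deliberately wrong.
X = [[0.0, 0.9], [0.1, 0.2], [0.2, 0.7], [0.3, 0.1],
     [1.0, 0.8], [1.1, 0.3], [1.2, 0.6], [1.3, 0.0]]
y = [0, 0, 0, 1, 1, 1, 1, 1]
rows, cols = coevolve(X, y)
```

Here `rows` marks the instances retained as clean and `cols` marks the selected features. Evaluating each population against the other's most recently shared mask is one simple way to couple the two searches; the exchange interval and the penalty weight would need tuning on real data.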
Keywords :
feature extraction; genetic algorithms; learning (artificial intelligence); pattern classification; genetic algorithm; high dimensional noisy data classification; machine learning; noise detection; noise removal; prototype selection; relevant feature selection; Application software; Computer science; Filtering; Filters; Genetic algorithms; Machine learning; Nearest neighbor searches; Noise generators; Prototypes; Training data; Feature Selection; Genetic Algorithm; Noise Detection; Outlier Detection; Prototype Selection
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Machine Learning and Applications, 2008. ICMLA '08. Seventh International Conference on
Conference_Location :
San Diego, CA
Print_ISBN :
978-0-7695-3495-4
Type :
conf
DOI :
10.1109/ICMLA.2008.87
Filename :
4724968