DocumentCode :
3024925
Title :
A Cluster-Based Noise Detection Algorithm
Author :
Yin, Hua ; Dong, Hongbin ; Li, Yuxuan
fYear :
2009
fDate :
25-26 April 2009
Firstpage :
386
Lastpage :
389
Abstract :
For a classification problem, noise in real-world data can dramatically lower a learner's predictive accuracy and increase the time needed to build a model. Researchers have shown that handling noise in a preprocessing step before learning is advantageous. Because attribute noise is difficult to detect, previous work has mostly focused on class noise detection. In this paper, we present a cluster-based noise detection algorithm that jointly considers attribute and class noise detection and can handle different types of datasets. Our algorithm detects class noise and attribute noise separately by computing each instance's deviation from the center of its cluster. We test its effect by adding different types and levels of noise to datasets from the UCI repository. Our approach shows significant effectiveness in improving the predictive accuracy of classification.
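The abstract does not spell out the algorithm, so the following is only a minimal sketch of one possible reading of "deviation to the center in the same cluster", not the authors' method. It assumes numeric attributes, cluster assignments supplied by some earlier clustering step, Euclidean distance, a mean-plus-k-standard-deviations rule for attribute noise, and a majority-class rule for class noise. The function name `detect_noise` and the threshold parameter `k` are hypothetical.

```python
from collections import Counter, defaultdict
from math import sqrt
from statistics import mean, pstdev


def detect_noise(points, labels, clusters, k=2.0):
    """Return (attribute_noise, class_noise) as sorted lists of row indices.

    points   : list of numeric feature tuples, one per instance
    labels   : class label per instance
    clusters : cluster id per instance (assumed given by a prior clustering step)
    k        : hypothetical threshold, in standard deviations of the
               within-cluster distances
    """
    # Group row indices by cluster id.
    members = defaultdict(list)
    for i, c in enumerate(clusters):
        members[c].append(i)

    attribute_noise, class_noise = [], []
    for idx in members.values():
        dim = len(points[idx[0]])
        # Cluster center: per-attribute mean of the cluster's members.
        center = [mean(points[i][d] for i in idx) for d in range(dim)]
        # Euclidean distance of each member to its cluster center.
        dist = {
            i: sqrt(sum((points[i][d] - center[d]) ** 2 for d in range(dim)))
            for i in idx
        }
        values = list(dist.values())
        mu, sigma = mean(values), pstdev(values)
        # Assumed attribute-noise rule: unusually far from the center.
        if sigma > 0:
            attribute_noise += [i for i in idx if dist[i] > mu + k * sigma]
        # Assumed class-noise rule: label disagrees with the cluster majority.
        majority = Counter(labels[i] for i in idx).most_common(1)[0][0]
        class_noise += [i for i in idx if labels[i] != majority]
    return sorted(attribute_noise), sorted(class_noise)
```

For example, calling `detect_noise` on a single cluster of six two-dimensional points in which one point lies far from the rest and another carries a minority label would return those two row indices in the first and second list respectively. The threshold `k` trades sensitivity against false alarms and would need tuning per dataset.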
Keywords :
data mining; education; noise (working environment); pattern classification; cluster based noise detection algorithm; learner predictive accuracy; modelling; pattern classification; Accuracy; Algorithm design and analysis; Application software; Clustering algorithms; Databases; Detection algorithms; Filters; Noise level; Noise reduction; Predictive models; classification; cluster; data mining; noise detection;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Database Technology and Applications, 2009 First International Workshop on
Conference_Location :
Wuhan, Hubei
Print_ISBN :
978-0-7695-3604-0
Type :
conf
DOI :
10.1109/DBTA.2009.39
Filename :
5207734