Title :
A Cluster-Based Noise Detection Algorithm
Author :
Yin, Hua ; Dong, Hongbin ; Li, Yuxuan
Abstract :
In classification problems, noise in real-world data can dramatically lower a learner's predictive accuracy and increase the time needed to build a model. Researchers have shown that handling noise in a preprocessing step before learning is beneficial. Previous work mostly focuses on class noise detection because attribute noise is harder to detect. In this paper, we present a cluster-based noise detection algorithm that considers attribute noise and class noise together and can handle different types of datasets. The algorithm detects class noise and attribute noise separately by computing the deviation from the center of the same cluster. We test its effect by adding different types and levels of noise to datasets from the UCI repository. Our approach is significantly effective in improving the predictive accuracy of classification.
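The abstract does not give the clustering method, the distance measure, or the decision thresholds, so the following is only a minimal sketch of the general idea of flagging instances by their deviation from a cluster center, not the authors' algorithm. Everything named here is an assumption made for illustration: the function `detect_noise`, the precomputed `cluster_ids`, the Euclidean distance, the `mu + k * sigma` cutoff for attribute noise, and the majority-label rule for class noise.

```python
from collections import Counter, defaultdict
from math import dist
from statistics import mean, pstdev


def detect_noise(points, labels, cluster_ids, k=2.0):
    """Return (attribute_noise, class_noise) as sorted lists of row indices.

    Hypothetical helper: cluster assignments are assumed to be given,
    and both decision rules below are illustrative assumptions.
    """
    members = defaultdict(list)
    for i, c in enumerate(cluster_ids):
        members[c].append(i)

    attribute_noise, class_noise = [], []
    for idxs in members.values():
        # Cluster center: per-attribute mean of the member points.
        dims = len(points[idxs[0]])
        center = [mean([points[i][d] for i in idxs]) for d in range(dims)]

        # Deviation of each member from the center (Euclidean distance).
        devs = {i: dist(points[i], center) for i in idxs}
        values = list(devs.values())
        mu, sigma = mean(values), pstdev(values)

        # Assumed rule: attribute noise if the deviation exceeds mu + k * sigma.
        if sigma > 0:
            attribute_noise += [i for i in idxs if devs[i] > mu + k * sigma]

        # Assumed rule: class noise if the label differs from the cluster majority.
        majority = Counter(labels[i] for i in idxs).most_common(1)[0][0]
        class_noise += [i for i in idxs if labels[i] != majority]

    return sorted(attribute_noise), sorted(class_noise)


# Toy data: one cluster with a far-away point and one disagreeing label.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (10, 10)]
labels = ["a", "a", "a", "b", "a", "a"]
attr_noise, cls_noise = detect_noise(points, labels, [0] * 6, k=1.5)
```

On this toy input the far-away point (index 5) is flagged as attribute noise and the disagreeing label (index 3) as class noise. In practice the cluster assignments would come from whatever clustering step is used, and `k` would need tuning per dataset.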
Keywords :
data mining; education; noise (working environment); pattern classification; cluster based noise detection algorithm; learner predictive accuracy; modelling; Accuracy; Algorithm design and analysis; Application software; Clustering algorithms; Databases; Detection algorithms; Filters; Noise level; Noise reduction; Predictive models; classification; cluster; noise detection
Conference_Title :
Database Technology and Applications, 2009 First International Workshop on
Conference_Location :
Wuhan, Hubei
Print_ISBN :
978-0-7695-3604-0
DOI :
10.1109/DBTA.2009.39