DocumentCode :
2214000
Title :
An Efficient Outlier Mining Algorithm for Large Dataset
Author :
Yang, Peng ; Huang, Biao
Author_Institution :
Chongqing Univ. of Arts & Sci., Chongqing
Volume :
1
fYear :
2008
fDate :
19-21 Dec. 2008
Firstpage :
199
Lastpage :
202
Abstract :
Because outliers often carry useful information, outlier detection has become an active topic in data mining. This paper proposes an efficient outlier mining algorithm based on k-nearest neighbors (KNN). It finds outliers more accurately by defining a correlation matrix that accounts for the importance of attributes and the correlation between them. In addition, the algorithm uses an R-tree data structure and a pruning scheme to drastically reduce computation time. Experimental results show that the algorithm is more efficient than the traditional KNN algorithm. It provides an effective solution for outlier mining in large datasets.
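The abstract does not give the definition of the correlation matrix or the details of the pruning scheme, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a quadratic-form distance sqrt((x-y)^T M (x-y)) with a caller-supplied matrix M standing in for the correlation matrix, scores each point by its k-th nearest-neighbour distance, and prunes a point as soon as that distance can no longer exceed the weakest score among the current top n. The names quad_dist and top_outliers are hypothetical, and a brute-force scan replaces the R-tree.

```python
import heapq
import math


def quad_dist(x, y, m):
    """Distance sqrt((x-y)^T M (x-y)); M is assumed symmetric positive semi-definite."""
    d = [a - b for a, b in zip(x, y)]
    size = len(d)
    total = sum(d[i] * m[i][j] * d[j] for i in range(size) for j in range(size))
    return math.sqrt(max(total, 0.0))


def top_outliers(points, m, k, n):
    """Return up to n (score, index) pairs with the largest k-th NN distance.

    Brute-force scan with a cutoff prune; this sketch does not use an R-tree.
    """
    best = []     # min-heap of (score, index) holding the current top-n outliers
    cutoff = 0.0  # weakest score in `best` once it holds n entries
    for i, p in enumerate(points):
        nearest = []  # negated distances: a max-heap of the k smallest seen so far
        pruned = False
        for j, q in enumerate(points):
            if i == j:
                continue
            dist = quad_dist(p, q, m)
            if len(nearest) < k:
                heapq.heappush(nearest, -dist)
            elif dist < -nearest[0]:
                heapq.heapreplace(nearest, -dist)
            # -nearest[0] is an upper bound on the final k-th NN distance,
            # so the point cannot enter the top n once it falls to the cutoff.
            if len(nearest) == k and len(best) == n and -nearest[0] <= cutoff:
                pruned = True
                break
        if pruned or len(nearest) < k:
            continue
        score = -nearest[0]
        if len(best) < n:
            heapq.heappush(best, (score, i))
        elif score > best[0][0]:
            heapq.heapreplace(best, (score, i))
        if len(best) == n:
            cutoff = best[0][0]
    return sorted(best, reverse=True)


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
    identity = [[1, 0], [0, 1]]  # with the identity matrix this is Euclidean distance
    print(top_outliers(data, identity, k=2, n=1))
```

With the identity matrix the distance reduces to plain Euclidean distance; a matrix with off-diagonal terms would let correlated attributes influence the score, which is the role the abstract assigns to its correlation matrix.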
Keywords :
data mining; matrix algebra; tree data structures; correlation matrix; data structure R-tree; efficient outlier mining algorithm; Art; Clustering algorithms; Data structures; Euclidean distance; Industrial engineering; Information management; Innovation management; Kernel; Principal component analysis
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Information Management, Innovation Management and Industrial Engineering, 2008. ICIII '08. International Conference on
Conference_Location :
Taipei
Print_ISBN :
978-0-7695-3435-0
Type :
conf
DOI :
10.1109/ICIII.2008.220
Filename :
4737527