DocumentCode :
3769643
Title :
On classification of biological data using outlier detection
Author :
Yushan Qiu;Xiaoqing Cheng;Wenpin Hou;Wai-Ki Ching
Author_Institution :
Advanced Modeling and Applied Computing Laboratory, Department of Mathematics, University of Hong Kong, Hong Kong
fYear :
2015
fDate :
8/1/2015
Firstpage :
1
Lastpage :
7
Abstract :
With the rapid development of information technology, the number of datasets, as well as their complexity and dimension, has been growing dramatically. This growth of biological data and non-biological commercial databases poses a challenge for data mining. Classification is one of the major tools in this research area. However, classification performance may degrade when the captured databases contain noise. Outlier detection is therefore urgently needed, and how to integrate outlier detection with classification techniques is an important and challenging issue. In this paper, we propose a novel and effective approach based on k-means clustering to identify outliers in the databases. For classification, we employ a well-known technique, the Support Vector Machine (SVM), owing to its ability to handle high-dimensional data sets. We also compare the classification results with those obtained using the multivariate outlier detection method. Numerical results on two different data sets indicate that classification after removing the outliers identified by our proposed method is much better than classification after applying the multivariate outlier detection method.
Publisher :
IET
Conference_Titel :
Operations Research and its Applications in Engineering, Technology and Management (ISORA 2015), 12th International Symposium on
Print_ISBN :
978-1-78561-085-1
Type :
conf
DOI :
10.1049/cp.2015.0617
Filename :
7456010