• DocumentCode
    3769643
  • Title
    On classification of biological data using outlier detection
  • Author
    Yushan Qiu;Xiaoqing Cheng;Wenpin Hou;Wai-Ki Ching
  • Author_Institution
    Advanced Modeling and Applied Computing Laboratory, Department of Mathematics, University of Hong Kong, Hong Kong
  • fYear
    2015
  • fDate
    8/1/2015 12:00:00 AM
  • Firstpage
    1
  • Lastpage
    7
  • Abstract
    With the rapid development of information technology, the number of datasets, as well as their complexity and dimensionality, has been growing dramatically. This growth of biological data and non-biological commercial databases poses a challenge for data mining. Classification is one of the major tools in this research area. However, classification performance may be degraded when the databases contain noise. Outlier detection is therefore needed, and how to integrate outlier detection methods with classification techniques is an important and challenging issue. In this paper, we propose a novel and effective approach based on k-means clustering to identify outliers in the databases. In particular, we employ a well-known classification technique, the Support Vector Machine (SVM), owing to its ability to handle high-dimensional data sets. We also compare the classification results with those obtained using the multivariate outlier detection method. Numerical results on two different data sets indicate that the classification results after removing the outliers with our proposed method are much better than those obtained with the multivariate outlier detection method.
  • Publisher
    IET
  • Conference_Titel
    Operations Research and its Applications in Engineering, Technology and Management (ISORA 2015), 12th International Symposium on
  • Print_ISBN
    978-1-78561-085-1
  • Type
    conf
  • DOI
    10.1049/cp.2015.0617
  • Filename
    7456010
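
The abstract describes removing outliers with a k-means-based method before classification, but it does not state the exact outlier criterion. The sketch below is only one plausible reading, under stated assumptions: points are clustered with plain Lloyd's k-means, and a point is flagged when its distance to its own centroid exceeds the mean distance plus two standard deviations. The function names (`kmeans`, `find_outliers`), the evenly spaced initialisation, the threshold of 2.0, and the sample data are all hypothetical and are not taken from the paper. The cleaned data would then be passed to a classifier such as an SVM, which is not shown here.

```python
import math
import statistics


def kmeans(points, k, iterations=100):
    """Plain Lloyd's k-means; returns (centroids, labels)."""
    n = len(points)
    # Assumption: deterministic initialisation from k evenly spaced data points.
    centroids = [points[i * n // k] for i in range(k)]
    labels = None
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
    return centroids, labels


def find_outliers(points, k, threshold=2.0):
    """Return indices of points unusually far from their own centroid.

    Assumption: "unusually far" means more than `threshold` population
    standard deviations above the mean point-to-centroid distance.
    """
    centroids, labels = kmeans(points, k)
    dists = [math.dist(p, centroids[lab]) for p, lab in zip(points, labels)]
    cutoff = statistics.mean(dists) + threshold * statistics.pstdev(dists)
    return [i for i, d in enumerate(dists) if d > cutoff]


# Hypothetical toy data: two tight groups plus one distant point at the end.
points = [
    (0, 0), (0, 1), (1, 0), (1, 1),
    (10, 10), (10, 11), (11, 10), (11, 11),
    (20, 20),
]
outliers = find_outliers(points, k=2)
cleaned = [p for i, p in enumerate(points) if i not in outliers]
```

Note that a distance-to-centroid rule like this depends on the initialisation: if an extreme point happens to become its own cluster, its distance is zero and it is not flagged, so a practical version would likely also inspect very small clusters.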