• DocumentCode
    2008006
  • Title

    Distance Metric Learning and Support Vector Machines for Classification of Mass Spectrometry Proteomics Data

  • Author

    Liu, Qingzhong ; Qiao, Mengyu ; Sung, Andrew H.

  • Author_Institution
    Comput. Sci. Dept., New Mexico Inst. of Min. & Technol., Socorro, NM, USA
  • fYear
    2008
  • fDate
    11-13 Dec. 2008
  • Firstpage
    631
  • Lastpage
    636
  • Abstract
    Mass spectrometry has become the most widely used measurement technique in proteomics research. High feature dimensionality and small dataset size are two major limitations that restrict classification accuracy in mass spectrum data analysis. To improve data mining results, two major issues must be addressed: feature extraction and feature selection. The quality of the feature set determines the reliability of the prediction of disease status. A well-known approach is to detect peak values and apply support vector machine recursive feature elimination (SVMRFE) to choose feature sets for classification. In this article, we apply distance metric learning to classify proteomics mass spectrometry data. Experimental results show that distance metric learning can be applied successfully to the classification of proteomics data, and that the results are comparable to the best results obtained by applying SVM to feature sets chosen with SVMRFE.
  • Keywords
    biology computing; data analysis; data mining; diseases; feature extraction; mass spectroscopy; pattern classification; proteomics; recursive estimation; support vector machines; disease status; distance metric learning; feature selection; mass spectrometry proteomics data; mass spectrum data analysis; proteomics mass spectrometry data; support vector machine recursive feature elimination; chemicals; machine learning; noise generators; noise reduction; support vector machine classification; classification; mass spectrum
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Machine Learning and Applications, 2008. ICMLA '08. Seventh International Conference on
  • Conference_Location
    San Diego, CA
  • Print_ISBN
    978-0-7695-3495-4
  • Type

    conf

  • DOI
    10.1109/ICMLA.2008.91
  • Filename
    4725041