DocumentCode
2008006
Title
Distance Metric Learning and Support Vector Machines for Classification of Mass Spectrometry Proteomics Data
Author
Liu, Qingzhong ; Qiao, Mengyu ; Sung, Andrew H.
Author_Institution
Comput. Sci. Dept., New Mexico Inst. of Min. & Technol., Socorro, NM, USA
fYear
2008
fDate
11-13 Dec. 2008
Firstpage
631
Lastpage
636
Abstract
Mass spectrometry has become the most widely used measurement technique in proteomics research. High feature dimensionality and small sample sizes are two major limitations that restrict classification accuracy in mass spectrum data analysis. To improve data mining results, two issues need to be addressed: feature extraction and feature selection. The quality of the feature set determines the reliability of disease status prediction. A well-known approach is to detect peak values and apply support vector machine recursive feature elimination (SVMRFE) to choose feature sets for classification. In this article, we apply distance metric learning to classify proteomics mass spectrometry data. Experimental results show that distance metric learning can be applied successfully to the classification of proteomics data, and that the results are comparable to the best results obtained by applying SVM to feature sets chosen with SVMRFE.
Keywords
biology computing; data analysis; data mining; diseases; feature extraction; mass spectroscopy; pattern classification; proteomics; recursive estimation; support vector machines; disease status; distance metric learning; feature selection; for classification; mass spectrometry proteomics data; mass spectrum data analysis; proteomics mass spectrometry data; support vector machine recursive feature elimination; Chemicals; Machine learning; Noise generators; Noise reduction; Support vector machine classification; classification; mass spectrum
fLanguage
English
Publisher
IEEE
Conference_Titel
Machine Learning and Applications, 2008. ICMLA '08. Seventh International Conference on
Conference_Location
San Diego, CA
Print_ISBN
978-0-7695-3495-4
Type
conf
DOI
10.1109/ICMLA.2008.91
Filename
4725041