Title :
Distance Metric Learning and Support Vector Machines for Classification of Mass Spectrometry Proteomics Data
Author :
Liu, Qingzhong ; Qiao, Mengyu ; Sung, Andrew H.
Author_Institution :
Comput. Sci. Dept., New Mexico Inst. of Min. & Technol., Socorro, NM, USA
Abstract :
Mass spectrometry has become the most widely used measurement technique in proteomics research. High feature dimensionality and small dataset size are two major limitations that restrict classification accuracy in mass spectrum data analysis. To improve data mining results, two issues need particular attention: feature extraction and feature selection. The quality of the feature set determines the reliability of disease status prediction. A well-known approach is to detect peak values and apply support vector machine recursive feature elimination (SVMRFE) to choose feature sets for classification. In this article, we apply distance metric learning to classify proteomics mass spectrometry data. Experimental results show that distance metric learning can be applied successfully to the classification of proteomics data, with results comparable to the best results obtained by applying SVM to feature sets chosen with SVMRFE.
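The abstract does not say which distance metric learning algorithm is used, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a diagonal metric in which each feature is weighted by the inverse of its pooled within-class variance (a simplified relative of relevant component analysis), followed by 1-nearest-neighbour classification. The function names and the toy data are hypothetical: feature 0 stands in for an informative peak intensity and feature 1 for a noisy one.

```python
from collections import defaultdict


def within_class_variances(X, y):
    """Pooled per-feature within-class variance of the rows of X."""
    groups = defaultdict(list)
    for row, label in zip(X, y):
        groups[label].append(row)
    n_features = len(X[0])
    pooled = [0.0] * n_features
    for rows in groups.values():
        for j in range(n_features):
            mean_j = sum(r[j] for r in rows) / len(rows)
            pooled[j] += sum((r[j] - mean_j) ** 2 for r in rows)
    return [v / len(X) for v in pooled]


def learn_diagonal_metric(X, y, eps=1e-9):
    """Weight each feature by the inverse of its within-class variance.

    Features that vary little inside a class get large weights, so
    same-class samples end up close together under the learned metric.
    """
    return [1.0 / (v + eps) for v in within_class_variances(X, y)]


def weighted_sq_distance(a, b, weights):
    """Squared distance under a diagonal metric (weights all 1 = Euclidean)."""
    return sum(w * (p - q) ** 2 for w, p, q in zip(weights, a, b))


def predict_1nn(X, y, query, weights):
    """Label of the training sample nearest to `query` under the metric."""
    best = min(range(len(X)),
               key=lambda i: weighted_sq_distance(X[i], query, weights))
    return y[best]


# Hypothetical toy data: feature 0 separates the classes, feature 1 is noise.
X_train = [[0.0, 0.0], [0.1, 10.0], [1.0, 2.0], [1.1, 12.0]]
y_train = [0, 0, 1, 1]
query = [0.2, 2.0]  # close to class 0 on the informative feature

weights = learn_diagonal_metric(X_train, y_train)
euclidean_label = predict_1nn(X_train, y_train, query, [1.0, 1.0])
learned_label = predict_1nn(X_train, y_train, query, weights)
print("Euclidean 1-NN:", euclidean_label)
print("Learned-metric 1-NN:", learned_label)
```

On this toy set the plain Euclidean distance is dominated by the noisy feature, while the learned weights emphasise the informative one. Real spectra would need a richer (typically full-matrix) metric and cross-validated evaluation.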
Keywords :
biology computing; data analysis; data mining; diseases; feature extraction; mass spectroscopy; pattern classification; proteomics; recursive estimation; support vector machines; disease status; distance metric learning; feature selection; mass spectrometry proteomics data; mass spectrum data analysis; proteomics mass spectrometry data; support vector machine recursive feature elimination; Chemicals; Machine learning; Noise generators; Noise reduction; Support vector machine classification; classification; mass spectrum;
Conference_Title :
Machine Learning and Applications, 2008. ICMLA '08. Seventh International Conference on
Conference_Location :
San Diego, CA
Print_ISBN :
978-0-7695-3495-4
DOI :
10.1109/ICMLA.2008.91