Title :
An adaptive feature reduction algorithm for cancer classification using wavelet decomposition of serum proteomic and DNA microarray data
Author :
Rashid, Sabrina ; Maruf, Golam Morshed
Author_Institution :
Dept. of ECE, Univ. of British Columbia, Vancouver, BC, Canada
Abstract :
A significant challenge in DNA microarray and mass spectrometric data analysis is that the data sets contain a large number of features but only a small number of samples (patients). Such data require particular care, because the low classification accuracy that a model attains with a small number of samples may indicate poor predictive capability. To overcome these challenges, proper approaches to data preprocessing, feature reduction and identification of the optimal set of features are critical. In this paper, a novel technique is proposed for feature reduction and cancer classification that is applicable to two different types of biological data. The proposed method has been implemented on surface-enhanced laser desorption/ionization time-of-flight mass spectrometric (SELDI-TOF-MS) and DNA microarray data sets. The technique is self-adaptive and independent of the type of data set. We have developed a two-step strategy for feature reduction: (1) data preprocessing, which includes merging and t-testing, and (2) wavelet decomposition. For classification, a support vector machine (SVM) is used. Evaluation on the two types of data sets shows that the features selected by the proposed method consistently give excellent classification accuracy, sensitivity and specificity.
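The two-step feature reduction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the merging procedure, the t-test variant, the wavelet family, the decomposition level or the SVM settings. The sketch assumes Welch's t statistic for ranking features and a Haar wavelet whose approximation coefficients are kept as the reduced features; the function names (welch_t, select_by_t, haar_approx, reduce_features) are hypothetical. The merging step and the SVM classifier are omitted, and only the Python standard library is used.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's two-sample t statistic (assumed variant; the abstract only says 't-testing')."""
    denom = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return 0.0 if denom == 0 else (mean(a) - mean(b)) / denom


def select_by_t(samples, labels, keep):
    """Return the indices of the `keep` features with the largest |t|, in original order."""
    pos = [s for s, y in zip(samples, labels) if y == 1]
    neg = [s for s, y in zip(samples, labels) if y == 0]
    n_features = len(samples[0])
    scores = [
        abs(welch_t([s[j] for s in pos], [s[j] for s in neg]))
        for j in range(n_features)
    ]
    top = sorted(range(n_features), key=lambda j: scores[j], reverse=True)[:keep]
    return sorted(top)


def haar_approx(signal, levels=1):
    """Haar approximation coefficients after `levels` decompositions (assumed wavelet)."""
    out = list(signal)
    for _ in range(levels):
        if len(out) < 2:
            break
        if len(out) % 2:
            out.append(out[-1])  # pad odd-length signals by repeating the last value
        out = [(out[i] + out[i + 1]) / math.sqrt(2) for i in range(0, len(out), 2)]
    return out


def reduce_features(samples, labels, keep, levels=1):
    """Step 1: t-statistic filtering. Step 2: wavelet approximation of the kept features."""
    idx = select_by_t(samples, labels, keep)
    return [haar_approx([s[j] for j in idx], levels) for s in samples]


# Toy data: 4 samples x 4 features; labels 1 = cancer, 0 = control (illustrative only).
toy_samples = [
    [1.0, 5.0, 0.2, 9.0],
    [1.1, 5.2, 0.1, 9.1],
    [0.9, 4.9, 0.1, 1.0],
    [1.0, 5.1, 0.2, 1.2],
]
toy_labels = [1, 1, 0, 0]
reduced = reduce_features(toy_samples, toy_labels, keep=2)
```

Each row of reduced is a shorter feature vector that could then be passed to an SVM classifier; each Haar level roughly halves the number of retained features.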
Keywords :
cancer; feature extraction; lab-on-a-chip; mass spectroscopy; medical diagnostic computing; proteins; proteomics; signal classification; signal processing; time of flight mass spectra; time of flight mass spectroscopy; DNA microarray data; adaptive feature reduction algorithm; cancer classification; data preprocessing; feature reduction; mass spectrometric data analysis; serum proteomic; support vector machine; surface enhanced laser desorption-ionization time-of-flight mass spectrometric data set; t-testing; wavelet decomposition; Accuracy; Algorithm design and analysis; Cancer; Kernel; Least squares approximation; Proteomics; Support vector machines; SELDI-TOF; microarray; support vector machine; t-test; wavelet decomposition;
Conference_Title :
Bioinformatics and Biomedicine Workshops (BIBMW), 2011 IEEE International Conference on
Conference_Location :
Atlanta, GA
Print_ISBN :
978-1-4577-1612-6
DOI :
10.1109/BIBMW.2011.6112391