Title :
NSC-NSGA2: Optimal search for finding multiple thresholds for nearest shrunken centroid
Author :
Vinh Quoc Dang ; Chiou-Peng Lam
Author_Institution :
Sch. of Comput. & Security Sci., Edith Cowan Univ., Joondalup, WA, Australia
Abstract :
The Nearest Shrunken Centroid (NSC) method, with Prediction Analysis for Microarrays being its best-known implementation, has been widely used as a classification method for high-dimensional biomedical data. The method requires a shrinkage threshold value as input, and this is normally selected manually on a “trial and error” basis by executing the NSC method many times using a number of predetermined shrinkage threshold values. The optimal value is then obtained by minimizing the cross-validated error on the training data. This process can be time-consuming, and the optimal threshold value may be limited by the granularity of the predetermined values. In this paper, an approach incorporating the NSC method and a multi-objective evolutionary algorithm, the Non-dominated Sorting Genetic Algorithm 2 (NSGA2), is proposed for obtaining the optimal shrinkage threshold value automatically. The NSC method acts as the fitness evaluator in the evolutionary process. Multiple objectives can be incorporated for determining the threshold values, and a number of optimal solutions are obtained, each on the basis of tradeoffs between the objectives. By providing multiple potential solutions, the approach allows biomedical experts to better explore the joint behaviors of features in their data. The proposed approach also overcomes limitations normally associated with single-objective approaches: a single optimum and the need to determine weightings associated with the various objective functions in an aggregated objective function. The proposed approach was evaluated using the Alzheimer's Disease, Colon and Leukemia cancer datasets.
Keywords :
cancer; data analysis; evolutionary computation; feature selection; lab-on-a-chip; medical information systems; sorting; Alzheimer disease dataset; NSC-NSGA2 method; aggregated objective function; biomedical experts; classification method; colon cancer dataset; feature selection; high dimensional biomedical data; leukemia cancer dataset; microarray prediction analysis; multiobjective evolutionary algorithm; nearest shrunken centroid method; nondominated sorting algorithm 2; optimal shrinkage threshold value; Accuracy; Biological cells; Classification algorithms; Colon; Sociology; Sorting; Statistics; Nearest shrunken centroid; feature selection; multi-objective; non-dominated sorting; shrinkage thresholds;
Conference_Title :
Bioinformatics and Biomedicine (BIBM), 2013 IEEE International Conference on
Conference_Location :
Shanghai
DOI :
10.1109/BIBM.2013.6732520