DocumentCode
1080367
Title
Significance of Gene Ranking for Classification of Microarray Samples
Author
Chaolin Zhang ; Xuesong Lu ; Xuegong Zhang
Author_Institution
Dept. of Biomed. Eng., State Univ. of New York, Stony Brook, NY
Volume
3
Issue
3
fYear
2006
Firstpage
312
Lastpage
320
Abstract
Many methods for classification and gene selection with microarray data have been developed. These methods usually give a ranking of genes. Evaluating the statistical significance of the gene ranking is important for understanding the results and for further biological investigations, but this question has not been well addressed for machine learning methods in existing works. Here, we address this problem by formulating it in the framework of hypothesis testing and propose a solution based on resampling. The proposed r-test methods convert gene ranking results into position p-values to evaluate the significance of genes. The methods are tested on three real microarray data sets and three simulation data sets with support vector machines as the method of classification and gene selection. The obtained position p-values help to determine the number of genes to be selected and enable scientists to analyze selection results by sophisticated multivariate methods under the same statistical inference paradigm as for simple hypothesis testing methods.
Keywords
biology computing; cellular biophysics; genetics; learning (artificial intelligence); molecular biophysics; statistical analysis; support vector machines; gene ranking; gene selection; hypothesis testing; machine learning methods; microarray sample classification; position p-values; r-test methods; resampling; sophisticated multivariate methods; statistical inference paradigm; support vector machines; Cancer; Chaos; Data analysis; Filtering; Learning systems; Statistical analysis; Statistical distributions; Support vector machine classification; Support vector machines; Testing; Significance of gene ranking; classification; gene selection; microarray data analysis; Algorithms; Artificial Intelligence; Cluster Analysis; Gene Expression Profiling; Oligonucleotide Array Sequence Analysis; Sample Size
fLanguage
English
Journal_Title
Computational Biology and Bioinformatics, IEEE/ACM Transactions on
Publisher
IEEE
ISSN
1545-5963
Type
jour
DOI
10.1109/TCBB.2006.42
Filename
1668029