Title :
A theoretical analysis of gene selection
Author :
Mukherjee, Sach ; Roberts, Stephen J.
Author_Institution :
Dept. of Eng. Sci., Oxford Univ., UK
Abstract :
A great deal of recent research has focused on the challenging task of selecting differentially expressed genes from microarray data ('gene selection'). Numerous gene selection algorithms have been proposed in the literature, but it is often unclear exactly how these algorithms respond to conditions like small sample sizes or differing variances. Choosing an appropriate algorithm can therefore be difficult in many cases. In this paper we propose a theoretical analysis of gene selection, in which the probability of successfully selecting relevant genes, using a given gene ranking function, is explicitly calculated in terms of population parameters. The theory developed is applicable to any ranking function which has a known sampling distribution, or one which can be approximated analytically. In contrast to empirical methods, the analysis can easily be used to examine the behaviour of gene selection algorithms under a wide variety of conditions, even when the number of genes involved runs into the tens of thousands. The utility of our approach is illustrated by comparing three well-known gene ranking functions.
Keywords :
biology computing; genetics; molecular biophysics; differentially expressed genes; gene ranking function; gene selection; microarray data; population parameters; Algorithm design and analysis; Bioinformatics; Cells (biology); Data analysis; Data engineering; Diseases; Needles; Probability; Sampling methods; Testing
Conference_Title :
Computational Systems Bioinformatics Conference, 2004. CSB 2004. Proceedings. 2004 IEEE
Print_ISBN :
0-7695-2194-0
DOI :
10.1109/CSB.2004.1332425