DocumentCode :
945706
Title :
Identifying differentially expressed genes in microarray experiments with model-based variance estimation
Author :
Cai, Xiaodong ; Giannakis, Georgios B.
Author_Institution :
Dept. of Electr. & Comput. Eng., Univ. of Miami, Coral Gables, FL, USA
Volume :
54
Issue :
6
Year :
2006
Date :
6/1/2006
Firstpage :
2418
Lastpage :
2426
Abstract :
Statistical tests are used to identify genes that are differentially expressed under different conditions from microarray data. Many of these tests require the variance of gene expression levels; however, because the number of replicates is small, the sample variance is an inaccurate estimate, which causes large false-positive and false-negative errors. More accurate and robust variance estimation is therefore highly desirable for improving the performance of statistical tests. In this paper, cluster analysis is performed on the microarray data using a model-based clustering method, and the variance of each gene is then estimated from the cluster variances. Because cluster variances are estimated from multiple genes whose microarray data have similar variance, the proposed method pools the relevant genes together; this effectively increases the number of samples available for variance estimation and thereby improves its accuracy. Using simulated data, it is shown that the proposed variance estimation improves the performance of the t-test, the regularized t-test, and a variant of the SAM test, called the S-test here. Using the colon microarray data of Alon et al., it is demonstrated that the proposed method performs better than or comparably to other gene pooling methods. Using the IHF microarray data of Arfin et al., it is shown that the proposed variance estimation decreases the significance of genes that have a small fold change but are assigned a high significance score by the t-test using the sample variance, which potentially reduces the false positive probability.
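The pooling idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes, for illustration only, that genes are clustered by fitting a one-dimensional Gaussian mixture (via EM) to their log sample variances, and that each gene's variance is the posterior-weighted average of the cluster variances. The function name `cluster_pooled_t` and the parameters `n_clusters` and `n_iter` are hypothetical.

```python
import numpy as np


def cluster_pooled_t(x, y, n_clusters=3, n_iter=50):
    """Two-sample t-like statistic using cluster-pooled variances.

    x, y: arrays of shape (genes, replicates), one per condition.
    Illustrative assumption: genes are clustered on log sample variance
    with a 1-D Gaussian mixture; this is a stand-in, not the paper's model.
    """
    nx, ny = x.shape[1], y.shape[1]
    # Per-gene sample variance, pooled across the two conditions.
    s2 = ((nx - 1) * x.var(axis=1, ddof=1)
          + (ny - 1) * y.var(axis=1, ddof=1)) / (nx + ny - 2)
    z = np.log(s2 + 1e-12)

    # Initialise mixture parameters from quantiles of z.
    mu = np.quantile(z, (np.arange(n_clusters) + 0.5) / n_clusters)
    sd = np.full(n_clusters, z.std() + 1e-6)
    w = np.full(n_clusters, 1.0 / n_clusters)

    for _ in range(n_iter):
        # E-step: posterior probability of each cluster for each gene.
        logp = np.log(w) - np.log(sd) - 0.5 * ((z[:, None] - mu) / sd) ** 2
        logp -= logp.max(axis=1, keepdims=True)
        r = np.exp(logp)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: update weights, means, and standard deviations.
        nk = r.sum(axis=0) + 1e-12
        w = nk / nk.sum()
        mu = (r * z[:, None]).sum(axis=0) / nk
        sd = np.sqrt((r * (z[:, None] - mu) ** 2).sum(axis=0) / nk) + 1e-6

    # Cluster variance: responsibility-weighted mean of the sample variances,
    # so each cluster estimate draws on many genes instead of a few replicates.
    cluster_var = (r * s2[:, None]).sum(axis=0) / nk
    # Gene variance: posterior-weighted average of the cluster variances.
    gene_var = r @ cluster_var

    t = (x.mean(axis=1) - y.mean(axis=1)) / np.sqrt(gene_var * (1 / nx + 1 / ny))
    return t, gene_var
```

Calling `cluster_pooled_t(x, y)` on two `(genes, replicates)` arrays returns the per-gene statistic and the pooled variance estimates. Because each estimate is a weighted average over many genes, a gene with a small fold change and an accidentally tiny sample variance no longer receives an inflated score.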
Keywords :
genetic engineering; modelling; statistical testing; colon microarray data; gene expression; microarray experiments; model-based clustering method; model-based variance estimation; statistical tests; Bayesian methods; Bioinformatics; Gene expression; Genomics; Probability; Robustness; Statistical analysis; Statistical distributions; Statistics; Testing; Clustering; microarray; mixture model; statistical test; variance estimation
Language :
English
Journal_Title :
Signal Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
1053-587X
Type :
Journal article
DOI :
10.1109/TSP.2006.873733
Filename :
1634844