DocumentCode :
1526325
Title :
Robust Bayesian Clustering for Replicated Gene Expression Data
Author :
Sun, Jianyong ; Garibaldi, Jonathan M. ; Kenobi, Kim
Author_Institution :
Centre for Plant Integrative Biology (CPIB), University of Nottingham, Nottingham, UK
Volume :
9
Issue :
5
Year :
2012
Firstpage :
1504
Lastpage :
1514
Abstract :
Experimental scientific data sets, especially biology data, usually contain replicated measurements. The replicated measurements for the same object are correlated, and this correlation must be carefully dealt with in scientific analysis. In this paper, we propose a robust Bayesian mixture model for clustering data sets with replicated measurements. The model aims not only to accurately cluster the data points taking the replicated measurements into consideration, but also to find the outliers (i.e., scattered objects) which are possibly required to be studied further. A tree-structured variational Bayes (VB) algorithm is developed to carry out model fitting. Experimental studies showed that our model compares favorably with the infinite Gaussian mixture model, while maintaining computational simplicity. We demonstrate the benefits of including the replicated measurements in the model, in terms of improved outlier detection rates in varying measurement uncertainty conditions. Finally, we apply the approach to clustering biological transcriptomics mRNA expression data sets with replicated measurements.
Keywords :
Bayes methods; Gaussian processes; RNA; biology computing; genetics; molecular biophysics; trees (mathematics); biological transcriptomics mRNA expression data sets; biology data; computational simplicity; experimental scientific data sets; infinite Gaussian mixture model; replicated gene expression data; robust Bayesian clustering; robust Bayesian mixture model; scientific analysis; tree-structured variational Bayes algorithm; Approximation methods; Bayesian methods; Biological system modeling; Clustering algorithms; Data models; Robustness; Tin; Replicated measurement; clustering; gene expression data; outlier detection; robust clustering; variational Bayes; Bayes Theorem; Cluster Analysis; Gene Expression; Gene Expression Profiling; Normal Distribution; RNA, Messenger; Transcriptome;
Language :
English
Journal_Title :
Computational Biology and Bioinformatics, IEEE/ACM Transactions on
Publisher :
IEEE
ISSN :
1545-5963
Type :
jour
DOI :
10.1109/TCBB.2012.85
Filename :
6205736