DocumentCode
72021
Title
Integrative Clustering by Nonnegative Matrix Factorization Can Reveal Coherent Functional Groups From Gene Profile Data
Author
Brdar, Sanja ; Crnojevic, Vladimir ; Zupan, Blaz
Author_Institution
Fac. of Tech. Sci., Univ. of Novi Sad, Novi Sad, Serbia
Volume
19
Issue
2
fYear
2015
fDate
Mar-15
Firstpage
698
Lastpage
708
Abstract
Recent developments in molecular biology and techniques for genome-wide data acquisition have resulted in an abundance of data to profile genes and predict their function. These datasets may come from diverse sources and it is an open question how to commonly address them and fuse them into a joint prediction model. A prevailing technique to identify groups of related genes that exhibit similar profiles is profile-based clustering. Cluster inference may benefit from consensus across different clustering models. In this paper, we propose a technique that develops separate gene clusters from each of the available data sources and then fuses them by means of nonnegative matrix factorization. We use gene profile data on the budding yeast S. cerevisiae to demonstrate that this approach can successfully integrate heterogeneous datasets and yield high-quality clusters that could otherwise not be inferred by simply merging the gene profiles prior to clustering.
Keywords
bioinformatics; cellular biophysics; data integration; genetics; matrix decomposition; microorganisms; molecular biophysics; pattern clustering; S. cerevisiae; budding yeast; clustering model consensus; coherent functional group; data source; gene cluster development; gene cluster fusion; gene cluster inference; gene dataset fusion; gene function prediction; gene profile data; gene profile merging; genome-wide data acquisition; heterogeneous dataset integration; high cluster quality; integrative clustering; joint prediction model; molecular biology; nonnegative matrix factorization; profile-based clustering; related gene group identification; similar gene profile; Bioinformatics; Clustering algorithms; DH-HEMTs; Data integration; Gene expression; Informatics; Matrix decomposition; Clustering; data fusion; gene profiling; gene set enrichment; nonnegative matrix factorization (NMF)
fLanguage
English
Journal_Title
Biomedical and Health Informatics, IEEE Journal of
Publisher
IEEE
ISSN
2168-2194
Type
jour
DOI
10.1109/JBHI.2014.2316508
Filename
6786303