DocumentCode
3460565
Title
Clustering of SNPs by a Structural EM Algorithm
Author
Zhang, Yulong ; Ji, Liang
Author_Institution
Dept. of Autom., Tsinghua Univ., Beijing, China
fYear
2009
fDate
3-5 Aug. 2009
Firstpage
147
Lastpage
150
Abstract
In population-based human genetic studies, unrelated individuals are collected and their SNPs are measured. Several kinds of generative models have been proposed for modeling data containing a large number of SNP loci according to the characteristics of the human genome. However, such models can only deal with ordered loci. In this paper, we try to model the same data without using the order information. First, we present a clustering model for SNPs obtained by modifying the multi-block model used in GERBIL. It is a two-layer Bayesian network with multiple latent variables, and it does not use the order information of the loci. Second, we fit the model by employing a structural EM algorithm combined with a simulated annealing mechanism. A real data set was analyzed with the model. The results show that the SNPs can be clustered effectively. Such a model is potentially useful for clustering distantly correlated SNP loci.
Keywords
belief networks; biology computing; genetics; genomics; molecular biophysics; clustering model; human genetic method; human genome; multiblock model; multiple latent variables; simulated annealing mechanism; structural EM algorithm; two-layer Bayesian network; Bayesian methods; Bioinformatics; Clustering algorithms; Data analysis; Genetics; Genomics; Hidden Markov models; Humans; Sequences; Simulated annealing; Bayesian network; EM algorithm; block; generative model; latent variable; simulated annealing;
fLanguage
English
Publisher
IEEE
Conference_Titel
Bioinformatics, Systems Biology and Intelligent Computing, 2009. IJCBS '09. International Joint Conference on
Conference_Location
Shanghai
Print_ISBN
978-0-7695-3739-9
Type
conf
DOI
10.1109/IJCBS.2009.97
Filename
5260711