DocumentCode :
2034553
Title :
SNPs and entropy based hierarchical clustering method for genetic phylogeny analysis
Author :
Wang, Jun ; Guo, Mao-zu
Author_Institution :
Sch. of Comput. Sci. & Technol., Harbin Inst. of Technol., Harbin, China
Volume :
5
fYear :
2010
fDate :
10-12 Aug. 2010
Firstpage :
2229
Lastpage :
2233
Abstract :
Cluster analysis is widely used in genetic research, especially in phylogeny analysis. However, inferring an evolutionary dendrogram from large biological data sets is time-consuming. In this paper, single nucleotide polymorphisms (SNPs), which characterize genetic variation, are mined from the genetic sequences to reduce the dimensionality of the original data in phylogeny analysis. The mining algorithm can reduce the cost of phylogeny analysis and eliminate noise. Commonly used measures of genetic divergence between subpopulations, such as the Euclidean distance, often lose important genetic variation information during clustering. Therefore, relative information entropy is used to evaluate the subpopulation genetic diversity of a given species. A new genetic distance is defined to measure subpopulation divergence by combining the genetic diversity evaluation value with the sequence structure similarity among subpopulations. A hierarchical clustering algorithm employs the new genetic distance to infer the dendrogram of the given species in genetic phylogeny analysis. Experimental results on human data show that the method can accurately evaluate the genetic divergence among subgroups of a given species and produce a reasonable evolutionary dendrogram in less time.
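The pipeline summarized in the abstract (mine SNP sites, score subpopulation divergence with relative entropy, then cluster hierarchically) can be illustrated with a minimal Python sketch. The record does not give the paper's actual distance formula, so `subpop_distance` below is a hypothetical stand-in: a symmetrized relative entropy of smoothed allele frequencies, averaged over SNP columns. The function names, the pseudocount, the average-linkage rule, and the toy sequences are likewise assumptions made for illustration, not the authors' method.

```python
import math
from collections import Counter
from itertools import combinations

ALPHABET = "ACGT"

def snp_columns(sequences):
    """Indices of aligned positions where more than one nucleotide occurs."""
    return [i for i in range(len(sequences[0]))
            if len({s[i] for s in sequences}) > 1]

def allele_freqs(seqs, col, pseudo=1e-3):
    """Smoothed nucleotide frequencies at one column of a subpopulation."""
    counts = Counter(s[col] for s in seqs)
    total = len(seqs) + pseudo * len(ALPHABET)
    return [(counts.get(a, 0) + pseudo) / total for a in ALPHABET]

def relative_entropy(p, q):
    """Kullback-Leibler divergence D(p || q) in bits."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q))

def subpop_distance(a, b, cols):
    """Assumed distance: symmetrized relative entropy averaged over SNP sites."""
    if not cols:
        return 0.0
    total = 0.0
    for c in cols:
        p, q = allele_freqs(a, c), allele_freqs(b, c)
        total += relative_entropy(p, q) + relative_entropy(q, p)
    return total / len(cols)

def cluster(subpops):
    """Average-linkage agglomerative clustering; returns a nested-tuple dendrogram."""
    all_seqs = [s for seqs in subpops.values() for s in seqs]
    cols = snp_columns(all_seqs)  # keep only the variable (SNP) columns
    names = list(subpops)
    base = {frozenset((x, y)): subpop_distance(subpops[x], subpops[y], cols)
            for x, y in combinations(names, 2)}
    # Each cluster is (dendrogram node, list of member subpopulation names).
    clusters = [(n, [n]) for n in names]

    def linkage(c1, c2):
        pairs = [(x, y) for x in c1[1] for y in c2[1]]
        return sum(base[frozenset(p)] for p in pairs) / len(pairs)

    while len(clusters) > 1:
        # Merge the two clusters with the smallest average pairwise distance.
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
        merged = ((clusters[i][0], clusters[j][0]),
                  clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters[0][0]

# Toy aligned sequences (made up for illustration only).
toy = {
    "A": ["ACGT", "ACGT"],
    "B": ["ACGT", "ACGA"],
    "C": ["TCGA", "TCGA"],
}
print(cluster(toy))
```

Restricting the distance to SNP columns is what reduces the dimensionality: invariant positions carry no information about divergence and are skipped entirely.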
Keywords :
biology computing; entropy; evolution (biological); genetics; pattern clustering; biological data; cluster analysis; entropy based hierarchical clustering; evolutionary dendrogram; genetic diversity; genetic phylogeny analysis; single nucleotide polymorphisms; Bioinformatics; Clustering algorithms; Diversity reception; Genomics; Phylogeny; SNPs; entropy; genetic distance; genetic diversity; hierarchical clustering; phylogeny analysis;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Fuzzy Systems and Knowledge Discovery (FSKD), 2010 Seventh International Conference on
Conference_Location :
Yantai, Shandong
Print_ISBN :
978-1-4244-5931-5
Type :
conf
DOI :
10.1109/FSKD.2010.5569539
Filename :
5569539