Title :
Hierarchical Clustering of Lung Cancer Related Genes
Author :
Yu Wei ; Zhang Huajia ; Wu Kuanheng ; Lin Qiangqian ; He Miao
Author_Institution :
Life Sci. Sch., Sun Yat-sen Univ., Guangzhou
Abstract :
The study of lung cancer genes with data mining techniques is still at an early stage. In this paper, hierarchical clustering was applied to lung cancer-related genes. A total of 367 lung cancer-associated gene sequences were downloaded from GenBank and Ensembl. The nucleotide content distribution of each gene sequence was first calculated in Matlab. Each gene sequence was then represented as a point in an 84-dimensional vector space, based on its nucleotide content distribution. A similarity matrix over all pairs of genes was calculated using the Pearson correlation. Finally, hierarchical clustering of the 367 gene sequences was performed with an agglomerative method. The data were divided into nine clusters by cutting the hierarchy at a height of 0.01. Comparison with the Gene Ontology (GO) annotation (http://www.geneontology.org) shows some correlation between the clusters and the GO functional classification, which indicates a certain correlation between the base content of a gene sequence and its function.
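The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration in Python, not the authors' Matlab code. Several details are assumptions, since the abstract does not state them: the 84 dimensions are read here as the relative frequencies of all 1-, 2- and 3-mers over A, C, G, T (4 + 16 + 64 = 84), the distance between two genes is taken as 1 minus the Pearson correlation, and average linkage is used for merging. The function names `kmer_profile`, `pearson` and `cluster` are hypothetical.

```python
from itertools import product
from math import sqrt

ALPHABET = "ACGT"
# Assumed feature set: all 1-, 2- and 3-mers, i.e. 4 + 16 + 64 = 84 features.
KMERS = ["".join(p) for k in (1, 2, 3) for p in product(ALPHABET, repeat=k)]


def kmer_profile(seq):
    """Return the 84 relative k-mer frequencies (k = 1, 2, 3) of a DNA sequence."""
    seq = seq.upper()
    counts = dict.fromkeys(KMERS, 0)
    totals = {1: 0, 2: 0, 3: 0}
    for k in (1, 2, 3):
        for i in range(len(seq) - k + 1):
            word = seq[i:i + k]
            if word in counts:  # skip windows containing N or other symbols
                counts[word] += 1
                totals[k] += 1
    return [counts[w] / totals[len(w)] if totals[len(w)] else 0.0 for w in KMERS]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def cluster(profiles, height):
    """Average-linkage agglomerative clustering on distance 1 - r.

    Merging stops once the closest pair of clusters is farther apart than
    `height`, which corresponds to cutting the dendrogram at that height.
    """
    n = len(profiles)
    dist = [[1.0 - pearson(profiles[i], profiles[j]) for j in range(n)]
            for i in range(n)]
    clusters = [[i] for i in range(n)]
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                total = sum(dist[i][j] for i in clusters[a] for j in clusters[b])
                d = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or d < best[0]:
                    best = (d, a, b)
        if best[0] > height:
            break
        _, a, b = best
        clusters[a] += clusters.pop(b)
    return clusters
```

Under these assumptions, calling `cluster([kmer_profile(s) for s in sequences], height=0.01)` on a list of sequence strings returns lists of sequence indices, one list per cluster. The pairwise search is quadratic per merge, which is acceptable for a few hundred sequences but would need a dedicated library for larger sets.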
Keywords :
biology computing; cancer; cellular biophysics; data mining; genetics; lung; molecular biophysics; pattern clustering; Ensembl; GO function classification; GenBank; Matlab; Pearson correlation; agglomerative method; data mining techniques; gene sequences; hierarchical clustering analysis; lung cancer related genes; nucleotide; similarity matrix; Cancer; Clustering methods; Gene expression; Humans; Lungs; Mathematics; Metastasis; Ontologies; Proteins; Sequences;
Conference_Title :
Bioinformatics and Biomedical Engineering, 2008. ICBBE 2008. The 2nd International Conference on
Conference_Location :
Shanghai
Print_ISBN :
978-1-4244-1747-6
Electronic_ISBN :
978-1-4244-1748-3
DOI :
10.1109/ICBBE.2008.22