DocumentCode :
844510
Title :
Poisson-Based Self-Organizing Feature Maps and Hierarchical Clustering for Serial Analysis of Gene Expression Data
Author :
Wang, Haiying ; Zheng, Huiru ; Azuaje, Francisco
Author_Institution :
Sch. of Comput. & Mathematics, Ulster Univ.
Volume :
4
Issue :
2
fYear :
2007
Firstpage :
163
Lastpage :
175
Abstract :
Serial analysis of gene expression (SAGE) is a powerful technique for global gene expression profiling, allowing simultaneous analysis of thousands of transcripts without prior structural and functional knowledge. Pattern discovery and visualization have become fundamental approaches to analyzing such large-scale gene expression data. From the pattern discovery perspective, clustering techniques have received great attention. However, due to the statistical nature of SAGE data (i.e., underlying distribution), traditional clustering techniques may not be suitable for SAGE data analysis. Based on the adaptation and improvement of self-organizing maps and hierarchical clustering techniques, this paper presents two new clustering algorithms, namely, PoissonS and PoissonHC, for SAGE data analysis. Tested on synthetic and experimental SAGE data, these algorithms demonstrate several advantages over traditional pattern discovery techniques. The results indicate that, by incorporating statistical properties of SAGE data, PoissonS and PoissonHC, as well as a hybrid approach (neuro-hierarchical approach) based on the combination of PoissonS and PoissonHC, offer significant improvements in pattern discovery and visualization for SAGE data. Moreover, a user-friendly platform, which may improve and accelerate SAGE data mining, was implemented. The system is freely available on request from the authors for nonprofit use.
Keywords :
biology computing; data mining; genetics; human computer interaction; molecular biophysics; pattern clustering; self-organising feature maps; stochastic processes; Poisson-based self-organizing feature maps; PoissonHC; PoissonS; clustering techniques; data mining; gene expression; hierarchical clustering; neurohierarchical approach; pattern discovery; pattern visualization; serial analysis; Acceleration; Clustering algorithms; Data analysis; Data mining; Data visualization; Gene expression; Large-scale systems; Pattern analysis; Self organizing feature maps; Testing; Pattern discovery and visualization; Poisson distribution; hybrid machine learning; self-organizing maps; serial analysis of gene expression; Algorithms; Cluster Analysis; Data Interpretation, Statistical; Databases, Protein; Gene Expression Profiling; Information Storage and Retrieval; Pattern Recognition, Automated; Poisson Distribution; Proteome; Software; User-Computer Interface
fLanguage :
English
Journal_Title :
Computational Biology and Bioinformatics, IEEE/ACM Transactions on
Publisher :
IEEE
ISSN :
1545-5963
Type :
jour
DOI :
10.1109/TCBB.2007.070204
Filename :
4196529