• DocumentCode
    844510
  • Title

    Poisson-Based Self-Organizing Feature Maps and Hierarchical Clustering for Serial Analysis of Gene Expression Data

  • Author

    Wang, Haiying ; Zheng, Huiru ; Azuaje, Francisco

  • Author_Institution
    Sch. of Comput. & Mathematics, Ulster Univ.
  • Volume
    4
  • Issue
    2
  • fYear
    2007
  • Firstpage
    163
  • Lastpage
    175
  • Abstract
    Serial analysis of gene expression (SAGE) is a powerful technique for global gene expression profiling, allowing simultaneous analysis of thousands of transcripts without prior structural and functional knowledge. Pattern discovery and visualization have become fundamental approaches to analyzing such large-scale gene expression data. From the pattern discovery perspective, clustering techniques have received great attention. However, due to the statistical nature of SAGE data (i.e., underlying distribution), traditional clustering techniques may not be suitable for SAGE data analysis. Based on the adaptation and improvement of self-organizing maps and hierarchical clustering techniques, this paper presents two new clustering algorithms, namely, PoissonS and PoissonHC, for SAGE data analysis. Tested on synthetic and experimental SAGE data, these algorithms demonstrate several advantages over traditional pattern discovery techniques. The results indicate that, by incorporating statistical properties of SAGE data, PoissonS and PoissonHC, as well as a hybrid approach (neuro-hierarchical approach) based on the combination of PoissonS and PoissonHC, offer significant improvements in pattern discovery and visualization for SAGE data. Moreover, a user-friendly platform, which may improve and accelerate SAGE data mining, was implemented. The system is freely available on request from the authors for nonprofit use.
  • Keywords
    biology computing; data mining; genetics; human computer interaction; molecular biophysics; pattern clustering; self-organising feature maps; stochastic processes; Poisson-based self-organizing feature maps; PoissonHC; PoissonS; clustering techniques; data mining; gene expression; hierarchical clustering; neurohierarchical approach; pattern discovery; pattern visualization; serial analysis; Acceleration; Clustering algorithms; Data analysis; Data mining; Data visualization; Gene expression; Large-scale systems; Pattern analysis; Self organizing feature maps; Testing; Pattern discovery and visualization; Poisson distribution; hybrid machine learning; self-organizing maps; serial analysis of gene expression; Algorithms; Cluster Analysis; Data Interpretation, Statistical; Databases, Protein; Gene Expression Profiling; Information Storage and Retrieval; Pattern Recognition, Automated; Poisson Distribution; Proteome; Software; User-Computer Interface
  • fLanguage
    English
  • Journal_Title
    Computational Biology and Bioinformatics, IEEE/ACM Transactions on
  • Publisher
    IEEE
  • ISSN
    1545-5963
  • Type

    jour

  • DOI
    10.1109/TCBB.2007.070204
  • Filename
    4196529