  • DocumentCode
    1992149
  • Title
    Cluster Analysis of Regulatory Sequences with a Log Likelihood Ratio Statistics-based Similarity Measure
  • Author
    Zheng, Huiru ; Wang, Haiying ; Hu, Jinglu
  • Author_Institution
    Ulster Univ., Belfast
  • fYear
    2007
  • fDate
    14-17 Oct. 2007
  • Firstpage
    1220
  • Lastpage
    1224
  • Abstract
    Upstream regions in the DNA sequence are characterized by the presence of short regulatory motifs, which function as target binding sites for transcription factors. Finding two genes with common motifs in their regulatory regions may aid users in identifying co-regulated genes or inferring regulatory modules. By modelling pattern occurrences in the regulatory regions with Poisson statistics, this paper presents a log likelihood ratio statistics-based distance measure to calculate pair-wise similarities between sequences. To perform cluster analysis of regulatory sequences, this paper introduces two clustering algorithms on the basis of the incorporation of the log likelihood ratio statistics-based distance into hierarchical clustering and Self-Organizing Map. The proposed approach has been tested on a synthetic dataset and a real biological example. The results indicate that, in comparison to traditional distance functions, the log likelihood ratio statistics-based similarity measure offers considerable improvements in the process of regulatory sequence-based gene classification.
  • Keywords
    DNA; biology computing; cellular biophysics; genetics; molecular biophysics; molecular configurations; self-organising feature maps; stochastic processes; DNA sequence; Poisson statistics; cluster analysis; coregulated genes; gene classification; hierarchical clustering; log likelihood ratio statistics; regulatory sequences; self-organizing map; similarity measure; Algorithm design and analysis; Clustering algorithms; Gene expression; Mathematics; Performance analysis; Proteins; Sequences; Statistics; Testing; Poisson distribution; log likelihood ratio; regulatory sequence
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Bioinformatics and Bioengineering, 2007. BIBE 2007. Proceedings of the 7th IEEE International Conference on
  • Conference_Location
    Boston, MA
  • Print_ISBN
    978-1-4244-1509-0
  • Type
    conf
  • DOI
    10.1109/BIBE.2007.4375719
  • Filename
    4375719
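
  • Illustrative sketch (not part of the original record)
    The abstract describes a distance built from a log likelihood ratio statistic under a Poisson model of pattern occurrences, but this record does not give the formula. The sketch below is therefore only one plausible reading, under stated assumptions: patterns are taken to be overlapping k-mers, each sequence's exposure is its number of k-mer windows, and the per-pattern statistic is the standard two-sample Poisson likelihood ratio (shared rate versus separate rates), summed over patterns. The names count_kmers, poisson_llr and llr_distance are hypothetical and do not come from the paper.

```python
import math


def count_kmers(seq, k):
    """Count overlapping k-mers in seq (assumed stand-in for 'pattern occurrences')."""
    counts = {}
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        counts[kmer] = counts.get(kmer, 0) + 1
    return counts


def poisson_llr(x1, n1, x2, n2):
    """Two-sample Poisson likelihood ratio statistic.

    H0: both counts share one rate; H1: each count has its own rate.
    x1, x2 are occurrence counts; n1, n2 are exposures (window counts).
    Returns 2 * (logL(H1) - logL(H0)), clamped at 0 against rounding error.
    """
    total = x1 + x2
    if total == 0:
        return 0.0
    pooled_rate = total / (n1 + n2)
    llr = 0.0
    for x, n in ((x1, n1), (x2, n2)):
        if x > 0:  # a zero count contributes nothing to this sum
            llr += x * math.log(x / (n * pooled_rate))
    return max(0.0, 2.0 * llr)


def llr_distance(seq_a, seq_b, k=2):
    """Sum the per-pattern statistic over every k-mer seen in either sequence."""
    counts_a, counts_b = count_kmers(seq_a, k), count_kmers(seq_b, k)
    n_a = max(len(seq_a) - k + 1, 1)
    n_b = max(len(seq_b) - k + 1, 1)
    return sum(
        poisson_llr(counts_a.get(p, 0), n_a, counts_b.get(p, 0), n_b)
        for p in set(counts_a) | set(counts_b)
    )
```

    A pairwise matrix of llr_distance values could then be handed to any hierarchical clustering routine that accepts precomputed distances. Whether the paper normalises or thresholds the statistic before clustering is not stated in this record.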