• DocumentCode
    2377647
  • Title

    Using graph modularity analysis to identify transcription factor binding sites

  • Author

    Francisco, Alexandre P. ; Schbath, Sophie ; Freitas, Ana T. ; Oliveira, Arlindo L.

  • Author_Institution
    INESC-ID/IST, Portugal
  • fYear
    2010
  • fDate
    18 Dec. 2010
  • Firstpage
    19
  • Lastpage
    26
  • Abstract
    Despite the remarkable success of computational biology methods in some areas of application, such as gene finding and sequence alignment, there are still topics for which no definitive approaches have been proposed. One of these is the accurate detection of biologically significant cis-regulatory motifs, which remains an open problem despite intensive research in the field. Probabilistic motif finders are the most popular, mainly because combinatorial motif finders generate extensive and hard-to-understand lists of potential motifs. In this work, we present Needle, a method for de novo motif discovery that works by post-processing the output of a combinatorial motif finder using graph analysis techniques. The method builds a graph in which nodes correspond to motifs and two nodes are connected if their motifs are co-located in the sequences under analysis; it then identifies highly connected modules in that graph. We have tested this method against several well-known motif finders, using a recently published large-scale compendium of transcription factors derived from diverse high-throughput experiments in several metazoans. Preliminary results show that the method is highly competitive with state-of-the-art methods that use much more extensive information. We expect that future versions of the algorithm, which will include a number of improvements, will become one of the methods of choice for identifying significant cis-regulatory motifs that include only a small conserved core.
  • Keywords
    bioinformatics; graph theory; molecular biophysics; molecular configurations; proteins; Needle method; cis-regulatory motif detection; combinatorial motif finder output post processing; computational biology methods; de novo motif discovery; graph analysis techniques; graph modularity analysis; highly connected module identification; metazoans; probabilistic motif finders; transcription factor binding site identification; transcription factors; binding sites
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Bioinformatics and Biomedicine Workshops (BIBMW), 2010 IEEE International Conference on
  • Conference_Location
    Hong Kong
  • Print_ISBN
    978-1-4244-8303-7
  • Electronic_ISBN
    978-1-4244-8304-4
  • Type

    conf

  • DOI
    10.1109/BIBMW.2010.5703767
  • Filename
    5703767