  • DocumentCode
    464291
  • Title
    Efficient and Scalable Motif Discovery using Graph-based Search
  • Author
    Sinha, Amit U. ; Bhatnagar, Raj
  • Author_Institution
    Dept. of ECECS, Cincinnati Univ., OH
  • fYear
    2007
  • fDate
    1-5 April 2007
  • Firstpage
    197
  • Lastpage
    204
  • Abstract
    Identifying short repeated patterns (motifs) in genomic sequences is key to many problems in bioinformatics. The promoter regions of genes are an important search target for such motifs (transcription factor binding sites). We present a new algorithm, Mortice, for detecting potential binding sites present in a given set of genomic sequences. It performs an informed search by organizing the input patterns and their variants in a graph, a strategy that leads efficiently to the desired solutions. The background is modeled as a Markov process, and a composite score function is used. We demonstrate the algorithm's performance on real-life data sets of yeast and human promoter sequences. Comparing it with several popular algorithms, we found that the others work well for lower organisms such as yeast, but only a couple of them work well on human data. We show that our algorithm scales linearly with the size of the input dataset, and that it runs faster than the other algorithms across different datasets and motif sizes.
  • Keywords
    biology computing; data mining; genetics; graph theory; search problems; Mortice algorithm; binding sites; bioinformatics; efficient motif discovery; genomic sequences; graph based search; scalable motif discovery; short repeated patterns; Bioinformatics; Computational biology; Computational intelligence; DNA; Fungi; Genomics; Humans; Libraries; Proteins; Sequences
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computational Intelligence and Bioinformatics and Computational Biology, 2007. CIBCB '07. IEEE Symposium on
  • Conference_Location
    Honolulu, HI
  • Print_ISBN
    1-4244-0710-9
  • Type
    conf
  • DOI
    10.1109/CIBCB.2007.4221224
  • Filename
    4221224