• DocumentCode
    1234306
  • Title

    A Sequential Monte Carlo Method for Motif Discovery

  • Author

    Liang, Kuo-Ching ; Wang, Xiaodong ; Anastassiou, Dimitris

  • Volume
    56
  • Issue
    9
  • fYear
    2008
  • Firstpage
    4496
  • Lastpage
    4507
  • Abstract
    We propose a sequential Monte Carlo (SMC)-based motif discovery algorithm that can efficiently detect motifs in datasets containing a large number of sequences. The statistical distribution of the motifs is modeled by an underlying position weight matrix (PWM), and both the PWM and the positions of the motifs within the sequences are estimated by the SMC algorithm. The proposed SMC motif discovery technique can locate motifs under a number of scenarios, including the single-block model, two-block model with unknown gap length, motifs of unknown lengths, motifs with unknown abundance, and sequences with multiple unique motifs. The accuracy of the SMC motif discovery algorithm is shown to be superior to that of the existing methods based on MCMC or EM algorithms. Furthermore, it is shown that the proposed method can be used to improve the results of existing motif discovery algorithms by using their results as the priors for the SMC algorithm.
  • Keywords
    genomic sequence; motif discovery; resampling; sequential Monte Carlo (SMC)
  • fLanguage
    English
  • Journal_Title
    Signal Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1053-587X
  • Type

    jour

  • DOI
    10.1109/TSP.2008.926194
  • Filename
    4531368