• DocumentCode
    1071224
  • Title
    Sequential modeling for identifying CpG island locations in human genome
  • Author
    Dasgupta, Nilanjan ; Lin, Simon ; Carin, Lawrence
  • Author_Institution
    Dept. of Electr. & Comput. Eng., Duke Univ., Durham, NC, USA
  • Volume
    9
  • Issue
    12
  • fYear
    2002
  • Firstpage
    407
  • Lastpage
    409
  • Abstract
    We consider several sequential processing algorithms for identifying genes in human DNA, based on detecting CpG ("C precedes G") islands. The algorithms are designed to capture the underlying statistical structure in a DNA sequence. Sequential processing using a Markov model and a hidden Markov model is shown to identify most CpG islands in annotated (marked) DNA subsequences drawn from publicly available DNA datasets. We also consider a wavelet-based hidden Markov tree (HMT). In the context of the HMT, we address the design of adaptive wavelets matched to CpG islands, accomplished via lifting and genetic-algorithm optimization.
  • Keywords
    DNA; genetic algorithms; genetics; hidden Markov models; medical signal processing; wavelet transforms; CpG island locations; DNA sequence; HMT; Markov model; adaptive wavelets; annotated DNA subsequences; genes; genetic-algorithm; hidden Markov model; human DNA; human genome; optimization; sequential modeling; sequential processing algorithms; statistical structure; wavelet-based hidden Markov tree; Algorithm design and analysis; Bioinformatics; Chemicals; Design optimization; Genomics; Humans; Sequences; Signal processing algorithms
  • fLanguage
    English
  • Journal_Title
    Signal Processing Letters, IEEE
  • Publisher
    IEEE
  • ISSN
    1070-9908
  • Type
    jour
  • DOI
    10.1109/LSP.2002.806062
  • Filename
    1159624