DocumentCode
1071224
Title
Sequential modeling for identifying CpG island locations in human genome
Author
Dasgupta, Nilanjan ; Lin, Simon ; Carin, Lawrence
Author_Institution
Dept. of Electr. & Comput. Eng., Duke Univ., Durham, NC, USA
Volume
9
Issue
12
fYear
2002
Firstpage
407
Lastpage
409
Abstract
We consider several sequential processing algorithms for identifying genes in human DNA, based on detecting CpG ("C precedes G") islands. The algorithms are designed to capture the underlying statistical structure in a DNA sequence. Sequential processing with a Markov model and with a hidden Markov model is shown to identify most CpG islands in annotated (marked) DNA subsequences from publicly available DNA datasets. We also consider a wavelet-based hidden Markov tree (HMT). In the context of the HMT, we address the design of adaptive wavelets matched to CpG islands, accomplished via lifting and genetic-algorithm optimization.
Keywords
DNA; genetic algorithms; genetics; hidden Markov models; medical signal processing; wavelet transforms; CpG island locations; DNA sequence; HMT; Markov model; adaptive wavelets; annotated DNA subsequences; genes; genetic-algorithm; hidden Markov model; human DNA; human genome; optimization; sequential modeling; sequential processing algorithms; statistical structure; wavelet-based hidden Markov tree; Algorithm design and analysis; Bioinformatics; Chemicals; DNA; Design optimization; Genomics; Hidden Markov models; Humans; Sequences; Signal processing algorithms;
fLanguage
English
Journal_Title
Signal Processing Letters, IEEE
Publisher
IEEE
ISSN
1070-9908
Type
jour
DOI
10.1109/LSP.2002.806062
Filename
1159624