DocumentCode :
1544984
Title :
Optimal structure for automatic processing of DNA sequences
Author :
Davies, Stephen W. ; Eizenman, Moshe ; Pasupathy, Subbarayan
Author_Institution :
AT&T Bell Labs., Holmdel, NJ, USA
Volume :
46
Issue :
9
fYear :
1999
Firstpage :
1044
Lastpage :
1056
Abstract :
The faithful recovery of the base sequence in automatic DNA sequencing fundamentally depends on the underlying statistics of the DNA electrophoresis time series. Current DNA sequencing algorithms are heuristic in nature and modest in their use of statistical information. In this paper, a formal statistical model of the DNA time series is presented and then used to construct the optimal maximum-likelihood (ML) processor. The derived DNA-ML algorithm features Kalman prediction of peak locations, peak parameter estimation, whitened waveform comparison, and multiple hypothesis processing using the M-algorithm. Properties of the algorithm are examined using both simulated and real data. Model parameters of critical importance and their impact on different types of error mechanisms, such as insertions and deletions, are pointed out. The statistical model of the DNA time series and the structure of the DNA-ML algorithm provide a basis for future investigation and refinement of DNA sequencing techniques.
Keywords :
DNA; Kalman filters; biological techniques; electrophoresis; maximum likelihood sequence estimation; molecular biophysics; molecular configurations; parameter estimation; time series; DNA sequences; Kalman prediction; M-algorithm; MLSE; automatic processing; base sequence recovery; deletions; electrophoresis time series; error mechanisms; formal statistical model; insertions; multiple hypothesis processing; nuisance parameters; optimal maximum-likelihood processor; optimal structure; peak locations; peak parameter estimation; timing jitter; underlying statistics; whitened waveform comparison; Biomedical engineering; Chemical sensors; DNA; Electrokinetics; Fluorescence; Heuristic algorithms; Intersymbol interference; Kalman filters; Sequences; Statistics; Algorithms; Animals; Electrophoresis; Electrophoresis, Polyacrylamide Gel; Lens, Crystalline; Likelihood Functions; Models, Genetic; Models, Statistical; Reproducibility of Results; Sequence Analysis, DNA; Time Factors;
fLanguage :
English
Journal_Title :
Biomedical Engineering, IEEE Transactions on
Publisher :
IEEE
ISSN :
0018-9294
Type :
jour
DOI :
10.1109/10.784135
Filename :
784135