• DocumentCode
    1544984
  • Title

    Optimal structure for automatic processing of DNA sequences

  • Author

    Davies, Stephen W. ; Eizenman, Moshe ; Pasupathy, Subbarayan

  • Author_Institution
    AT&T Bell Labs., Holmdel, NJ, USA
  • Volume
    46
  • Issue
    9
  • fYear
    1999
  • Firstpage
    1044
  • Lastpage
    1056
  • Abstract
    The faithful recovery of the base sequence in automatic DNA sequencing fundamentally depends on the underlying statistics of the DNA electrophoresis time series. Current DNA sequencing algorithms are heuristic in nature and modest in their use of statistical information. In this paper, a formal statistical model of the DNA time series is presented and then used to construct the optimal maximum-likelihood (ML) processor. The DNA-ML algorithm derived features Kalman prediction of peak locations, peak parameter estimation, whitened waveform comparison and multiple hypothesis processing using the M-algorithm. Properties of the algorithm are examined using both simulated and real data. Model parameters of critical importance and their impact on different types of error mechanisms, such as insertions and deletions, are pointed out. The statistical model of the DNA time series and the structure of the DNA-ML algorithm provide a basis for future investigation and refinement of DNA sequencing techniques.
  • Keywords
    DNA; Kalman filters; biological techniques; electrophoresis; maximum likelihood sequence estimation; molecular biophysics; molecular configurations; parameter estimation; time series; DNA sequences; Kalman prediction; M-algorithm; MLSE; automatic processing; base sequence recovery; deletions; electrophoresis time series; error mechanisms; formal statistical model; insertions; multiple hypothesis processing; nuisance parameters; optimal maximum-likelihood processor; optimal structure; peak locations; peak parameter estimation; timing jitter; underlying statistics; whitened waveform comparison; Biomedical engineering; Chemical sensors; DNA; Electrokinetics; Fluorescence; Heuristic algorithms; Intersymbol interference; Kalman filters; Sequences; Statistics; Algorithms; Animals; Electrophoresis; Electrophoresis, Polyacrylamide Gel; Lens, Crystalline; Likelihood Functions; Models, Genetic; Models, Statistical; Reproducibility of Results; Sequence Analysis, DNA; Time Factors;
  • fLanguage
    English
  • Journal_Title
    Biomedical Engineering, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0018-9294
  • Type

    jour

  • DOI
    10.1109/10.784135
  • Filename
    784135