Title :
Feature extraction for DNA base-calling using NNLS
Author :
Andrade-Cetto, L. ; Manolakos, Elias S.
Author_Institution :
Dept. of Electr. & Comput. Eng., Northeastern Univ., Boston, MA
Abstract :
We present new algorithms that can be used to extract features from a DNA chromatogram prior to base-calling. The algorithms assume that the inter-base distance has already been equalized using methods such as those presented in L. Andrade-Cetto and E. Manolakos (2005). We first show how a good estimate of the peak diffusion (spread) can be calculated from the raw trace without having to know the underlying base sequence. Using the estimated inter-peak distance and peak spread parameters, a non-negative least squares (NNLS) problem can be formulated to find the weight factors of the multiple shapes immersed in broad peaks, which are typically found towards the end of the chromatogram. The two algorithms combined provide peak hypotheses that are tested by the subsequent base decisions and scoring stage of the base-caller using probabilistic methods.
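As a rough illustration of the non-negative least squares step described in the abstract, the sketch below decomposes a synthetic broad peak into weighted peak shapes placed at a known inter-peak distance. Everything specific here is an assumption made for illustration and is not taken from the paper: the Gaussian peak shape, the `spacing` and `sigma` values, the helper names, and the simple coordinate-descent solver, which stands in for whatever NNLS solver the authors used.

```python
import math


def gaussian(t, center, sigma):
    """Assumed peak shape: unit-height Gaussian with spread sigma."""
    return math.exp(-0.5 * ((t - center) / sigma) ** 2)


def build_design_matrix(n_samples, centers, sigma):
    """One column per candidate peak, one row per trace sample."""
    return [[gaussian(t, c, sigma) for c in centers] for t in range(n_samples)]


def nnls_coordinate_descent(A, b, n_iter=500):
    """Minimize ||A x - b||^2 subject to x >= 0 by projected coordinate descent.

    This is a simple stand-in solver, not the method from the paper.
    """
    m, n = len(A), len(A[0])
    # Precompute the normal-equation terms H = A^T A and g = A^T b.
    H = [[sum(A[i][j] * A[i][k] for i in range(m)) for k in range(n)]
         for j in range(n)]
    g = [sum(A[i][j] * b[i] for i in range(m)) for j in range(n)]
    x = [0.0] * n
    for _ in range(n_iter):
        for j in range(n):
            grad_j = sum(H[j][k] * x[k] for k in range(n)) - g[j]
            # Exact minimization along coordinate j, clipped at zero.
            x[j] = max(0.0, x[j] - grad_j / H[j][j])
    return x


# Synthetic broad peak: three candidate positions, the middle one empty.
spacing = 10.0   # assumed equalized inter-peak distance (samples)
sigma = 6.0      # assumed estimated peak spread (samples)
centers = [20.0 + k * spacing for k in range(3)]
true_weights = [1.0, 0.0, 0.7]

A = build_design_matrix(60, centers, sigma)
trace = [sum(w * a for w, a in zip(true_weights, row)) for row in A]

weights = nnls_coordinate_descent(A, trace)
print([round(w, 4) for w in weights])
```

On this noiseless synthetic trace the recovered weights should match `true_weights` closely, with the empty middle position held at zero by the non-negativity constraint. In practice a library routine such as `scipy.optimize.nnls` could replace the hand-written solver.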
Keywords :
DNA; biological techniques; chromatography; feature extraction; least squares approximations; probability; DNA base-calling; DNA chromatogram; base sequence; nonnegative least squares problem; probabilistic methods; DNA computing; Digital signal processing; Distortion; Electrokinetics; Labeling; Sequences; Shape; Signal processing algorithms
Conference_Title :
Statistical Signal Processing, 2005 IEEE/SP 13th Workshop on
Conference_Location :
Novosibirsk
Print_ISBN :
0-7803-9403-8
DOI :
10.1109/SSP.2005.1628816