DocumentCode :
2498999
Title :
DNA base-calling using polynomial classifiers
Author :
Mohammed, Omniyah G. ; Assaleh, Khaled T. ; Husseini, Ghaleb A. ; Majdalawieh, Amin F. ; Woodward, Scott R.
Author_Institution :
Electr. Eng. Dept., American Univ. of Sharjah, Sharjah, United Arab Emirates
fYear :
2010
fDate :
18-23 July 2010
Firstpage :
1
Lastpage :
5
Abstract :
Base-calling is one of many problems that can be solved using pattern recognition, the act of classifying raw data into various classes based on prior or statistical information extracted from the data. In this paper, we propose a new framework that uses polynomial classifiers to model electropherogram traces obtained from ABI sequencing machines in order to perform base-calling. First, pre-processing, which includes segmented normalization and peak sharpening, is performed to reduce the imperfections introduced into a trace by the sequencing chemistry. Discriminative feature vectors are then extracted from the chromatogram traces and expanded into a higher-dimensional space by second-order polynomial expansion. A linear classifier is then trained on the expanded vectors and the bases are classified accordingly. The chromatogram traces chosen for analysis belong to Homo sapiens, Saccharomyces mikatae and Drosophila melanogaster. Simulation results indicated an accuracy of up to 99.2% when testing three different chromatogram traces of about 600 to 800 bases each. The proposed model's performance was compared to the existing standards, ABI and PHRED, in terms of insertion, deletion and substitution errors. Simulation evidence indicated that the designed model performs comparably to or slightly better than ABI in terms of deletion and insertion errors. Moreover, the polynomial classifier produced negligible substitution errors compared to ABI. The polynomial classifier was also observed to perform comparably to PHRED in terms of deletion and substitution errors. The results obtained demonstrate the potential of this model to perform base-calling.
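The expansion-then-linear-classifier step described in the abstract can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the function names, the one-hot least-squares training rule, and the choice of a feature vector with one value per dye channel are all assumptions, since the abstract does not specify them.

```python
import numpy as np
from itertools import combinations_with_replacement

BASES = "ACGT"  # assumed class order; not specified in the abstract


def expand_second_order(X):
    """Map each row x to [1, x_i, x_i * x_j for i <= j]."""
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    cols = [np.ones(n)]                          # constant term
    cols += [X[:, i] for i in range(d)]          # first-order terms
    cols += [X[:, i] * X[:, j]                   # second-order terms
             for i, j in combinations_with_replacement(range(d), 2)]
    return np.column_stack(cols)


def train(X, labels):
    """Fit one linear model per base on the expanded features.

    Assumption: least-squares regression onto one-hot targets.
    """
    P = expand_second_order(X)
    T = np.eye(len(BASES))[np.asarray(labels)]   # one-hot target matrix
    W, *_ = np.linalg.lstsq(P, T, rcond=None)
    return W


def predict(W, X):
    """Call the base whose linear model scores highest."""
    scores = expand_second_order(X) @ W
    return "".join(BASES[k] for k in scores.argmax(axis=1))
```

With a four-dimensional feature vector, the expansion yields 1 + 4 + 10 = 15 terms per sample, so the classifier stays linear in its weights while its decision boundary is quadratic in the original features.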
Keywords :
DNA; biology computing; chromatography; pattern classification; ABI sequencing machines; DNA base-calling; Drosophila melanogaster; Homo sapiens; Saccharomyces mikatae; chromatogram traces; deletion error; discriminative feature vectors; electropherogram traces; linear classifier; pattern recognition; peak sharpening; polynomial classifiers; second order polynomial expansion; segmented normalization; substitution errors; Bayesian methods; DNA; Hidden Markov models; Polynomials; Support vector machine classification; Testing; Training;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Neural Networks (IJCNN), The 2010 International Joint Conference on
Conference_Location :
Barcelona
ISSN :
1098-7576
Print_ISBN :
978-1-4244-6916-1
Type :
conf
DOI :
10.1109/IJCNN.2010.5596983
Filename :
5596983