Title :
Multiple DNA sequence approximate matching
Author :
Kaplan, Kathleen M. ; Kaplan, John J.
Author_Institution :
Dept. of Syst. & Comput. Sci., Howard Univ., Washington, DC, USA
Abstract :
DNA matching is key to understanding genomes, evolution, relationships between organisms, and other concepts in genomics. Yet comparing DNA is unlike matching typed words against a dictionary, because there is no "true" spelling for DNA; approximate matching algorithms must therefore be used. Many algorithms can compare two DNA sequences, but matching becomes more difficult when multiple sequences are to be compared. Known methods for comparing multiple DNA sequences include dynamic programming, star alignments, tree alignments, and others, which are themselves usually based on dynamic programming. The method proposed here is a novel way to compare all strings with one another, not merely all strings to one as in the star alignment method, and it does so in a different way. The proposed method, the Kaplan multiple sequence algorithm (KMS), separates the multidimensional search area so that comparisons can be performed in parallel. This paper discusses the problem of comparing multiple sequences and introduces this new method for matching multiple DNA strings.
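As a rough illustration of the dynamic-programming baseline the abstract refers to, the sketch below scores two DNA strings by edit distance and then applies that score to every pair in a set. It is not the KMS algorithm, whose details are not given in this record; the function names `edit_distance` and `all_pairs_distances` and the unit costs for substitutions, insertions, and deletions are assumptions made only for this example.

```python
from itertools import combinations


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between a and b (unit costs, an assumption)."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


def all_pairs_distances(seqs):
    """Distance for every unordered pair of sequences, keyed by index pair."""
    return {
        (i, j): edit_distance(seqs[i], seqs[j])
        for i, j in combinations(range(len(seqs)), 2)
    }


if __name__ == "__main__":
    sequences = ["ACGT", "ACGA", "AGGT"]
    for pair, dist in all_pairs_distances(sequences).items():
        print(pair, dist)
```

Each pairwise comparison here is independent of the others, so the pairs could in principle be scored in parallel; a star alignment, by contrast, would compare every sequence only against one chosen center sequence.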
Keywords :
DNA; biology computing; dynamic programming; molecular biophysics; DNA matching; DNA sequence; Kaplan multiple sequence algorithm; approximate matching; biomedical computing; computation theory; evolution; genomes; multidimensional search area; polynomial approximation; set theory; star alignments; tree alignments; Approximation methods; Bioinformatics; Databases; Dictionaries; Genomics; Organisms; Sequences
Conference_Title :
Computational Intelligence in Bioinformatics and Computational Biology, 2004. CIBCB '04. Proceedings of the 2004 IEEE Symposium on
Print_ISBN :
0-7803-8728-7
DOI :
10.1109/CIBCB.2004.1393937