DocumentCode :
2695406
Title :
The genetic algorithm scheme for consensus sequences
Author :
Gilkerson, Joshua W. ; Jaromczyk, Jerzy W.
Author_Institution :
Univ. of Kentucky, Lexington
fYear :
2007
fDate :
25-28 Sept. 2007
Firstpage :
3870
Lastpage :
3878
Abstract :
A consensus sequence is a single sequence that represents the characteristics of a family of sequences. Such synopses are most commonly used in bioinformatics for sequence analysis. For example, algorithms that determine high-quality consensus sequences are useful for constructing a multiple alignment and, consequently, a sequence logo (another representation that attempts to capture the important features of sequences). The determination of optimal consensus sequences is NP-hard (Gusfield). We present two new algorithms and compare them to earlier, published methods of determining consensus sequences. The first, CONSENSIZE, is an application of the genetic algorithm scheme (GAS). The other is a simple steepest descent search, an approach that is usually not very useful for NP-hard problems but is surprisingly successful for this application. We discuss both algorithms and experimentally compare their accuracy and efficiency with the simulated annealing, multiple alignment, and center string approaches. Test results are presented on both synthetic data and biological sequences.
Keywords :
biology computing; computational complexity; genetic algorithms; sequences; simulated annealing; CONSENSIZE; NP-hard problems; bioinformatics; biological sequences; center string; consensus sequences; genetic algorithm scheme; multiple alignment; sequence analysis; steepest descent search; synthetic data
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Evolutionary Computation, 2007. CEC 2007. IEEE Congress on
Conference_Location :
Singapore
Print_ISBN :
978-1-4244-1339-3
Electronic_ISBN :
978-1-4244-1340-9
Type :
conf
DOI :
10.1109/CEC.2007.4424975
Filename :
4424975