Title :
Pareto Optimal Pairwise Sequence Alignment
Author :
DeRonne, Kevin W. ; Karypis, George
Author_Institution :
Dept. of Comput. Sci. & Eng., Univ. of Minnesota, Minneapolis, MN, USA
Abstract :
Sequence alignment using evolutionary profiles is a commonly employed tool when investigating a protein. Many profile-profile scoring functions have been developed for use in such alignments, but there has not yet been a comprehensive study of Pareto optimal pairwise alignments that combine multiple such functions. We show that the problem of generating Pareto optimal pairwise alignments has an optimal substructure property, and we develop an efficient algorithm for generating Pareto optimal frontiers of pairwise alignments. All possible sets of two, three, and four profile scoring functions drawn from a pool of 11 functions are applied to 588 pairs of proteins in the ce_ref data set. The objective combinations that perform best on ce_ref are also evaluated on an independent set of 913 protein pairs extracted from the BAliBASE RV11 data set. Our dynamic-programming-based heuristic approach produces approximate Pareto optimal frontiers containing alignments comparable to those on the exact frontier, while taking on average less than 1/58th of the time in the case of four objectives. Our results show that the Pareto frontiers contain alignments of higher quality than those obtained with single objectives. However, identifying a single high-quality alignment among those in the Pareto frontier remains challenging.
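The optimal substructure property mentioned in the abstract can be illustrated with a small sketch. This is not the authors' algorithm or their heuristic. It is a minimal exact dynamic program under simplifying assumptions: plain residue sequences instead of profiles, a linear gap penalty shared by all objectives, and two toy scoring functions (`identity_score` and `class_score`, both hypothetical). Each cell stores the set of non-dominated score vectors of the prefix alignments ending there. Extending every alignment in a cell by the same step adds the same vector to each, which preserves dominance, so dominated prefixes can be discarded safely.

```python
from typing import Callable, List, Sequence, Tuple

Score = Tuple[float, ...]
Scorer = Callable[[str, str], float]


def dominates(a: Score, b: Score) -> bool:
    """True if a is at least as good as b in every objective and better in one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points: Sequence[Score]) -> List[Score]:
    """Return the non-dominated score vectors, with exact duplicates collapsed."""
    unique = list(dict.fromkeys(points))
    return [p for p in unique if not any(dominates(q, p) for q in unique)]


def add(a: Score, b: Score) -> Score:
    """Component-wise sum of two score vectors."""
    return tuple(x + y for x, y in zip(a, b))


def pareto_align_scores(s: str, t: str, scorers: Sequence[Scorer],
                        gap: float) -> List[Score]:
    """Pareto frontier of global-alignment score vectors for s and t.

    Assumption: a linear gap penalty `gap` (negative) applied to every objective.
    """
    k = len(scorers)
    gap_vec = (gap,) * k
    n, m = len(s), len(t)
    # front[i][j] holds the non-dominated score vectors for s[:i] versus t[:j].
    front = [[[] for _ in range(m + 1)] for _ in range(n + 1)]
    front[0][0] = [(0.0,) * k]
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            candidates: List[Score] = []
            if i > 0 and j > 0:  # align s[i-1] with t[j-1]
                step = tuple(f(s[i - 1], t[j - 1]) for f in scorers)
                candidates += [add(v, step) for v in front[i - 1][j - 1]]
            if i > 0:  # gap in t
                candidates += [add(v, gap_vec) for v in front[i - 1][j]]
            if j > 0:  # gap in s
                candidates += [add(v, gap_vec) for v in front[i][j - 1]]
            front[i][j] = pareto_front(candidates)
    return front[n][m]


# Toy objectives (hypothetical, for illustration only).
HYDROPHOBIC = set("AVLIMFWC")


def identity_score(a: str, b: str) -> float:
    return 1.0 if a == b else -1.0


def class_score(a: str, b: str) -> float:
    return 1.0 if (a in HYDROPHOBIC) == (b in HYDROPHOBIC) else -2.0


if __name__ == "__main__":
    print(pareto_align_scores("ACG", "AG", [identity_score, class_score], -1.0))
```

The sketch returns only score vectors. Recovering the alignments themselves would require storing back-pointers alongside each vector. The per-cell frontiers can grow quickly as objectives are added, which is the kind of cost that motivates an approximate approach such as the heuristic the abstract describes.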
Keywords :
Pareto optimisation; bioinformatics; dynamic programming; evolutionary computation; heuristic programming; proteins; proteomics; BAliBASE RV11 data set; Pareto optimal frontiers; Pareto optimal pairwise sequence alignment; ce_ref data set; dynamic-programming-based heuristic approach; efficient algorithm; evolutionary profile; optimal substructure property; profile-profile scoring function; protein pairs; single high-quality alignment; Amino acids; Heuristic algorithms; Linear programming; Pareto optimization; Proteins; Vectors; Pareto; optimization; pairwise sequence alignment; Algorithms; Amino Acid Sequence; Computational Biology; Sequence Alignment; Software;
Journal_Title :
Computational Biology and Bioinformatics, IEEE/ACM Transactions on
DOI :
10.1109/TCBB.2013.2