Title :
Clustered Memetic Algorithm With Local Heuristics for Ab Initio Protein Structure Prediction
Author :
Islam, Md. Kamrul ; Chetty, Madhu
Author_Institution :
Gippsland Sch. of Inf. Technol., Monash Univ., Churchill, VIC, Australia
Abstract :
Low-resolution protein models are often used within a hierarchical framework for structure prediction. However, even with these simplified but realistic protein models, the search for the optimal solution remains NP-complete. The complexity is further compounded by the multimodal nature of the search space. In this paper, we propose a systematic design of an evolutionary search technique, namely the memetic algorithm (MA), that searches the vast search space effectively by exploiting domain-specific knowledge and taking cognizance of the multimodal nature of the search space. The proposed MA achieves this by incorporating several novel features: 1) a modified fitness function that includes two additional terms to account for the hydrophobic and polar nature of the residues; 2) a systematic (rather than random) generation of the population that automatically prevents the occurrence of invalid conformations; 3) a generalized nonisomorphic encoding scheme that implicitly eliminates the generation of twins (similar conformations) in the population; 4) the identification of memes (protein substructures) from different basins of attraction during optimization, a process that is equivalent to an implicit application of threading principles; 5) a clustering of the population that corresponds to basins of attraction, which allows evolution to overcome the complexity of the multimodal search space and keeps the search from getting trapped in a local optimum; and 6) a 2-stage framework that gathers domain knowledge (i.e., substructures or memes) from different basins of attraction for a combined execution in the second stage. Experiments conducted with different lattice models using known benchmark protein sequences, and comparisons with recently reported approaches in this journal, show that the proposed algorithm is robust, fast, and accurate, with superior performance. The approach is generic and can easily be extended to other classes of problems.
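For readers unfamiliar with the lattice setting the abstract refers to, the following is a minimal sketch of the standard hydrophobic-polar (HP) model on a 2D square lattice. It is an illustration only: the abstract does not give the paper's modified fitness function, its encoding scheme, or the lattices used, so the move alphabet `U/D/L/R` and the function names `fold`, `is_self_avoiding`, and `hp_energy` are assumptions made for this example.

```python
# Illustrative sketch: the standard 2D square-lattice HP model.
# This is NOT the paper's modified fitness function or encoding; all names are hypothetical.
MOVES = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}


def fold(moves):
    """Turn a string of absolute moves into lattice coordinates, starting at the origin."""
    coords = [(0, 0)]
    for m in moves:
        dx, dy = MOVES[m]
        x, y = coords[-1]
        coords.append((x + dx, y + dy))
    return coords


def is_self_avoiding(coords):
    """A conformation is valid only if no lattice site is visited twice."""
    return len(set(coords)) == len(coords)


def hp_energy(sequence, coords):
    """Score -1 for each pair of H residues that are lattice neighbours but not chain neighbours.

    Assumes len(sequence) == len(coords) and that coords is a self-avoiding walk.
    """
    position = {c: i for i, c in enumerate(coords)}
    energy = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look right and up only, so each neighbouring pair is counted once.
        for dx, dy in ((1, 0), (0, 1)):
            j = position.get((x + dx, y + dy))
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                energy -= 1
    return energy
```

In this sketch a conformation that revisits a site is rejected by `is_self_avoiding`, which corresponds to the "invalid conformations" the abstract says the systematic population generation avoids; lower `hp_energy` values indicate more hydrophobic contacts.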
Keywords :
ab initio calculations; encoding; evolutionary computation; optimisation; pattern clustering; proteins; search problems; 2-stage framework; MA; NP complete solution; ab initio protein structure prediction; benchmark protein sequences; clustered memetic algorithm; domain knowledge; domain-specific knowledge; evolutionary search technique; hierarchical framework; invalid conformations; lattice models; local heuristics; low-resolution protein models; modified fitness function; multimodal search space complexity; nonisomorphic encoding scheme; optimal solution; realistic protein models; systematic design; threading principles; Clustering algorithms; Encoding; Heuristic algorithms; Lattices; Memetics; Proteins; Surface acoustic waves; Clustered memetic algorithm; meme incorporation; pull move; reverse pull move; self-avoiding walk (SAW);
Journal_Title :
Evolutionary Computation, IEEE Transactions on
DOI :
10.1109/TEVC.2012.2213258