Author_Institution :
Key Lab. of Adv. Control for Light Ind. (Minist. of China), Jiangnan Univ., Wuxi, China
Abstract :
Hidden Markov models (HMMs) are powerful tools for multiple sequence alignment (MSA), an important problem in bioinformatics that is known to be NP-complete. Learning HMMs is a difficult task, and many meta-heuristic methods, including particle swarm optimization (PSO), have been applied to it. In this paper, a new variant of PSO, called random drift particle swarm optimization (RDPSO), is proposed for HMM learning in MSA problems. The RDPSO algorithm, inspired by the free electron model of metal conductors placed in an external electric field, employs a novel set of evolution equations that can enhance the algorithm's global search ability. To further improve the performance of the RDPSO, we incorporate a diversity control method into the algorithm and thus propose an RDPSO with diversity-guided search (RDPSO-DGS). The RDPSO, RDPSO-DGS and other algorithms are tested and compared by learning HMMs for MSA on two well-known benchmark data sets. The experimental results show that the HMMs learned by the RDPSO and RDPSO-DGS generate better alignments for the benchmark data sets than those learned by other commonly used HMM learning methods, such as the Baum-Welch algorithm and other PSO algorithms. A performance comparison with well-known MSA programs, such as ClustalW and MAFFT, also shows that the proposed methods have advantages in multiple sequence alignment.
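The abstract does not state the evolution equations themselves. As a rough illustration only, the sketch below assumes the update form commonly given for RDPSO in the literature: a random "thermal" term scaled by each particle's distance to the mean of the personal best positions, plus a "drift" term toward a local attractor between the personal and global bests. The function name `rdpso_minimize`, the fixed coefficients `alpha` and `beta`, the bound clamping, and the toy `sphere` objective are all assumptions made for this example; it minimizes a simple test function rather than training an HMM, and it omits the diversity-guided search of RDPSO-DGS.

```python
import random


def rdpso_minimize(objective, dim, bounds, n_particles=20, n_iters=200,
                   alpha=0.7, beta=1.45, seed=0):
    """Minimize `objective` with an assumed RDPSO-style update (illustrative)."""
    rng = random.Random(seed)
    lo, hi = bounds

    # Random initial positions; each particle starts as its own personal best.
    xs = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    pbest = [x[:] for x in xs]
    pbest_val = [objective(x) for x in xs]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iters):
        # Mean of all personal best positions (recomputed once per iteration).
        mbest = [sum(p[j] for p in pbest) / n_particles for j in range(dim)]
        for i in range(n_particles):
            for j in range(dim):
                # Local attractor: random point between personal and global best.
                u = rng.random()
                attractor = u * pbest[i][j] + (1.0 - u) * gbest[j]
                # Assumed thermal (random) and drift (directed) components.
                thermal = alpha * abs(mbest[j] - xs[i][j]) * rng.gauss(0.0, 1.0)
                drift = beta * (attractor - xs[i][j])
                # Move the particle and clamp it to the search bounds.
                xs[i][j] = min(hi, max(lo, xs[i][j] + thermal + drift))
            val = objective(xs[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = xs[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = xs[i][:], val
    return gbest, gbest_val


def sphere(x):
    """Toy objective (sum of squares), standing in for an HMM training score."""
    return sum(v * v for v in x)


best, best_val = rdpso_minimize(sphere, dim=3, bounds=(-5.0, 5.0))
```

For HMM learning, a particle would presumably encode the model's transition and emission parameters and the objective would be a training score such as the likelihood of the sequences, but that mapping is not described in the abstract and is not attempted here.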
Keywords :
bioinformatics; hidden Markov models; learning (artificial intelligence); particle swarm optimization; Baum-Welch algorithms; ClustalW programs; HMM learning tasks; MAFFT programs; MSA problems; RDPSO-DGS algorithm; benchmark data sets; diversity control method; diversity-guided search; external electric field; free electron model; meta-heuristic methods; metal conductors; multiple sequence alignment; random drift particle swarm optimization; convergence; equations; mathematical model; training; parameter learning