DocumentCode :
1336598
Title :
DNA Sequence Compression Using Adaptive Particle Swarm Optimization-Based Memetic Algorithm
Author :
Zhu, Zexuan ; Zhou, Jiarui ; Ji, Zhen ; Shi, Yu-hui
Author_Institution :
Shenzhen City Key Lab. of Embedded Syst. Design, Shenzhen Univ., Shenzhen, China
Volume :
15
Issue :
5
Year :
2011
Firstpage :
643
Lastpage :
658
Abstract :
With the rapid development of high-throughput DNA sequencing technologies, the amount of DNA sequence data is accumulating exponentially. This huge influx of data creates new challenges for storage and transmission. This paper proposes a novel adaptive particle swarm optimization-based memetic algorithm (POMA) for DNA sequence compression. POMA combines comprehensive learning particle swarm optimization (CLPSO) with a local search based on an adaptive intelligent single particle optimizer (AdpISPO). It takes advantage of both CLPSO and AdpISPO to optimize the design of the approximate repeat vector (ARV) codebook for DNA sequence compression. ARV is first introduced in this paper to represent the repeated fragments across multiple sequences in direct, mirror, pairing, and inverted patterns. In POMA, candidate ARV codebooks are encoded as particles, and the optimal solution, which covers the most approximate repeated fragments with the fewest base variations, is identified through the exploration and exploitation of POMA. In each iteration of POMA, the leader particles in the swarm are selected based on weighted fitness values, and each leader particle is fine-tuned with an AdpISPO-based local search, so that the convergence of the search in the local region is accelerated. A detailed comparison study between POMA and the counterpart algorithms is performed on 29 (23 basic and 6 composite) benchmark functions and 11 real DNA sequences. POMA is observed to obtain better or competitive performance with a limited number of function evaluations. POMA also attains lower bits-per-base than other state-of-the-art DNA-specific algorithms on DNA sequence data. The experimental results suggest that the cooperation of CLPSO and AdpISPO in the framework of a memetic algorithm is capable of searching the ARV codebook space efficiently.
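The four repeat patterns named in the abstract can be illustrated with a minimal Python sketch. It assumes the definitions common in the DNA compression literature (mirror = reversed, pairing = base-wise complement, inverted = reverse complement); the abstract itself does not spell these out. The function names, the Hamming-distance measure of "base variations", and `best_pattern` are hypothetical illustrations, not the paper's ARV encoding or the POMA search.

```python
# Assumed Watson-Crick complement table (not given in the abstract).
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def direct(fragment: str) -> str:
    """Direct repeat: the fragment as-is."""
    return fragment


def mirror(fragment: str) -> str:
    """Mirror repeat (assumed): the fragment read backwards."""
    return fragment[::-1]


def pairing(fragment: str) -> str:
    """Pairing repeat (assumed): each base replaced by its complement."""
    return "".join(COMPLEMENT[base] for base in fragment)


def inverted(fragment: str) -> str:
    """Inverted repeat (assumed): reverse complement of the fragment."""
    return pairing(fragment)[::-1]


PATTERNS = {
    "direct": direct,
    "mirror": mirror,
    "pairing": pairing,
    "inverted": inverted,
}


def mismatches(a: str, b: str) -> int:
    """Count differing positions between two equal-length fragments."""
    if len(a) != len(b):
        raise ValueError("fragments must have equal length")
    return sum(x != y for x, y in zip(a, b))


def best_pattern(reference: str, target: str) -> tuple[str, int]:
    """Return the (pattern name, mismatch count) under which `reference`
    approximates `target` with the fewest base variations."""
    scored = ((name, mismatches(fn(reference), target))
              for name, fn in PATTERNS.items())
    return min(scored, key=lambda item: item[1])


if __name__ == "__main__":
    print(best_pattern("AACG", "CGTA"))
```

Here `best_pattern("AACG", "CGTA")` picks the inverted pattern, because the reverse complement `CGTT` differs from the target in only one base. An approximate repeat of this kind is what a codebook entry would need to record: a reference position, a pattern, and the few bases that vary.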
Keywords :
bioinformatics; biology computing; convergence; data communication; evolutionary computation; learning (artificial intelligence); molecular biophysics; particle swarm optimisation; sequences; storage management; ARV codebook space; AdpISPO; CLPSO; DNA sequence compression; POMA; adaptive intelligent single particle optimizer; adaptive particle swarm optimization; comprehensive learning particle swarm optimization; convergence; data storage; data transmission; local search; memetic algorithm; Bioinformatics; DNA; Encoding; Genomics; Memetics; Optimization; Particle swarm optimization; Approximate repeat vector; DNA sequence compression; memetic algorithm; particle swarm optimization;
Language :
English
Journal_Title :
Evolutionary Computation, IEEE Transactions on
Publisher :
IEEE
ISSN :
1089-778X
Type :
jour
DOI :
10.1109/TEVC.2011.2160399
Filename :
6031913