DocumentCode
2823563
Title
Short adjacent repeat identification based on Chemical Reaction Optimization
Author
Xu, Jin ; Lam, Albert Y S ; Li, Victor O K ; Li, Qiwei ; Fan, Xiaodan
Author_Institution
Dept. of Electr. & Electron. Eng., Univ. of Hong Kong, Hong Kong, China
fYear
2012
fDate
10-15 June 2012
Firstpage
1
Lastpage
8
Abstract
The analysis of short tandem repeats (STRs) in DNA sequences has become an attractive method for determining the genetic profile of an individual. Here we focus on a more general and practical problem, the short adjacent repeats identification problem (SARIP), which extends STR analysis by allowing short gaps between neighboring repeat units. At present, the best available solution to SARIP is BASARD, which uses Markov chain Monte Carlo algorithms to determine the posterior estimate. However, its computational complexity and its tendency to get stuck in a local mode lower the efficiency of BASARD and impede its wide application. In this paper, we prove that SARIP is NP-hard and solve it with Chemical Reaction Optimization (CRO), a recently developed metaheuristic. CRO mimics the interactions of molecules in a chemical reaction and can explore the solution space efficiently to find optimal or near-optimal solutions. We test the CRO algorithm on both synthetic and real data and compare its mode-searching performance with that of BASARD. Simulation results show that CRO's computational time is dozens of times, or even a hundred times, shorter than BASARD's. They also show that CRO obtains the global optimum most of the time. Moreover, CRO is more stable across runs, which is of great importance in practical use. Thus, CRO is by far the best method for SARIP.
Keywords
Bayes methods; DNA; Markov processes; Monte Carlo methods; biology computing; computational complexity; estimation theory; genetics; optimisation; BASARD; CRO algorithm; DNA sequences; Markov chain Monte Carlo algorithms; NP-hard; SARIP; STR; adjacent repeat identification; attractive method; chemical reaction optimization; genetic profile; hundred times shorter computational time; metaheuristic approach; mode searching; near optimal solution; posterior estimate; real data; short adjacent repeats identification problem; short tandem repeats; solution space; synthetic data; Chemicals; Optimization; Polynomials; Silicon; Tin; Vectors; Chemical Reaction Optimization; Short adjacent repeats; maximum a posteriori
fLanguage
English
Publisher
IEEE
Conference_Titel
Evolutionary Computation (CEC), 2012 IEEE Congress on
Conference_Location
Brisbane, QLD
Print_ISBN
978-1-4673-1510-4
Electronic_ISBN
978-1-4673-1508-1
Type
conf
DOI
10.1109/CEC.2012.6256614
Filename
6256614