DocumentCode :
2190759
Title :
Maximum likelihood phylogenetic reconstruction using gene order encodings
Author :
Hu, Fei ; Gao, Nan ; Zhang, Meng ; Tang, Jijun
Author_Institution :
Dept. of Comput. Sci. & Eng., Univ. of South Carolina, Columbia, SC, USA
fYear :
2011
fDate :
11-15 April 2011
Firstpage :
1
Lastpage :
6
Abstract :
Gene order changes under rearrangement events such as inversions and transpositions have attracted increasing attention as a new type of data for phylogenetic analysis. Since these events are rare, they allow the reconstruction of evolutionary history far back in time. Many software packages have been developed for the inference of gene order phylogenies, including widely used maximum parsimony methods such as GRAPPA and MGR. However, these methods face great difficulties in dealing with emerging large nuclear genomes. In this study, we propose three simple yet powerful maximum likelihood (ML) based methods for phylogenetic reconstruction that first encode the gene orders into binary or multistate strings based on the gene adjacency information present in the given genomes and then convert these strings into molecular sequences. RAxML is then used to compute the maximum likelihood phylogeny. We conducted extensive experiments using simulated datasets and found that although the multistate encoding is more complex and more time-consuming, it did not improve accuracy over the methods using simpler binary encodings. Among all methods tested in our experiments, MLBE is the most accurate in most cases and often returns phylogenies without errors. The ML methods are also fast: even in the most difficult case they take at most three days to analyze datasets with 40 genomes, making them very suitable for large-scale analysis. We give three simple and robust phylogenetic reconstruction methods using different encodings based on maximum likelihood, an approach that had not previously been applied successfully to gene orders. Our development of these ML methods shows great potential for gene order analysis with respect to high accuracy and stability, although formal mathematical and statistical analysis of these methods is much desired.
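The abstract does not spell out the encoding, so the following is only a minimal sketch of one plausible binary adjacency encoding, not the authors' implementation. It assumes signed gene orders over a shared gene set, treats genomes as circular by default, and creates one column per adjacency seen in any input genome (1 if the genome contains it, 0 otherwise). The function names and the 1 to "A" / 0 to "C" character mapping are hypothetical choices made for illustration; the record does not say how the strings are turned into molecular sequences.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

Extremity = Tuple[int, str]        # (gene id, "h" for head or "t" for tail)
Adjacency = FrozenSet[Extremity]   # unordered pair of touching extremities


def adjacencies(order: List[int], circular: bool = True) -> Set[Adjacency]:
    """Collect the adjacencies of one signed gene order."""
    ends = []
    for gene in order:
        # A positive gene is read tail -> head, a negative gene head -> tail.
        left, right = ("t", "h") if gene > 0 else ("h", "t")
        ends.append(((abs(gene), left), (abs(gene), right)))
    n = len(ends)
    last = n if circular else n - 1
    # The right end of gene i touches the left end of the next gene.
    return {frozenset((ends[i][1], ends[(i + 1) % n][0])) for i in range(last)}


def binary_encode(genomes: Dict[str, List[int]],
                  circular: bool = True) -> Dict[str, str]:
    """Encode each genome as a 0/1 string, one column per observed adjacency."""
    per_genome = {name: adjacencies(o, circular) for name, o in genomes.items()}
    # Sort the columns so the encoding is deterministic across runs.
    columns = sorted(set().union(*per_genome.values()), key=lambda a: sorted(a))
    return {name: "".join("1" if col in adjs else "0" for col in columns)
            for name, adjs in per_genome.items()}


def to_pseudo_sequence(bits: str, present: str = "A", absent: str = "C") -> str:
    """Map a 0/1 string to sequence characters (arbitrary, assumed mapping)."""
    return "".join(present if b == "1" else absent for b in bits)
```

The resulting character strings could then be written to an alignment file and passed to an ML tree-search program such as RAxML; the appropriate file format and substitution model are not stated in this record and would have to be taken from the paper itself.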
Keywords :
binary codes; encoding; evolution (biological); genetics; genomics; maximum likelihood estimation; statistical analysis; GRAPPA; MGR; ML method; MLBE; formal mathematical analysis; gene adjacency information; gene order encoding; large scale analysis; maximum likelihood phylogenetic reconstruction; maximum parsimony method; molecular sequence; multistate string; nuclear genome; statistical analysis; Accuracy; Amino acids; Bioinformatics; Encoding; Genomics; Phylogeny; Radio frequency;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Computational Intelligence in Bioinformatics and Computational Biology (CIBCB), 2011 IEEE Symposium on
Conference_Location :
Paris
Print_ISBN :
978-1-4244-9896-3
Type :
conf
DOI :
10.1109/CIBCB.2011.5948459
Filename :
5948459