• DocumentCode
    2481396
  • Title
    Exploring FPGAs for accelerating the phylogenetic likelihood function
  • Author
    Alachiotis, N. ; Sotiriades, E. ; Dollas, A. ; Stamatakis, A.
  • Author_Institution
    Dept. of Electron. & Comput. Eng., Tech. Univ. of Crete, Chania, Greece
  • fYear
    2009
  • fDate
    23-29 May 2009
  • Firstpage
    1
  • Lastpage
    8
  • Abstract
    Driven by novel biological wet lab techniques such as pyrosequencing, there has been an unprecedented molecular data explosion. The growth of biological sequence data has significantly outpaced Moore's law. This development also poses new computational and architectural challenges for the field of phylogenetic inference, i.e., the reconstruction of evolutionary histories (trees) for a set of organisms which are represented by respective molecular sequences. Phylogenetic trees are increasingly reconstructed from multiple genes or even whole genomes. The introduced term "phylogenomics" reflects this development. Hence, there is an urgent need to develop and deploy new techniques and computational solutions to calculate the computationally intensive scoring functions for phylogenetic trees. In this paper, we propose a dedicated computer architecture to compute the phylogenetic maximum likelihood (ML) function. The ML criterion represents one of the most accurate statistical models for phylogenetic inference and accounts for 85% to 95% of total execution time in all state-of-the-art ML-based phylogenetic inference programs. We present the implementation of our architecture on an FPGA (field programmable gate array) and compare the performance to an efficient C implementation of the ML function on a high-end multi-core architecture with 16 cores. Our results are two-fold: (i) the initial exploratory implementation of the ML function for trees comprising 4 up to 512 sequences on an FPGA yields speedups of a factor of 8.3 on average compared to execution on a single core and is faster than the OpenMP-based parallel implementation on up to 16 cores in all but one case; and (ii) we are able to show that current FPGAs are capable of efficiently executing floating-point-intensive computational kernels.
  • Keywords
    biology computing; computer architecture; field programmable gate arrays; genetics; molecular biophysics; FPGA; biological sequence data; biological wet lab techniques; dedicated computer architecture; intensive scoring functions; molecular data explosion; molecular sequences; multicore architecture; phylogenetic inference; phylogenetic maximum likelihood function; phylogenetic trees; phylogenomics; pyrosequencing; statistical models; Acceleration; Biology computing; Computer architecture; Explosions; Field programmable gate arrays; Genomics; History; Moore's Law; Organisms; Phylogeny
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel & Distributed Processing, 2009. IPDPS 2009. IEEE International Symposium on
  • Conference_Location
    Rome
  • ISSN
    1530-2075
  • Print_ISBN
    978-1-4244-3751-1
  • Electronic_ISBN
    1530-2075
  • Type
    conf
  • DOI
    10.1109/IPDPS.2009.5160929
  • Filename
    5160929