  • DocumentCode
    3022207
  • Title
    High-performance direct pairwise comparison of large genomic sequences
  • Author
    Mueller, Christopher ; Dalkilic, Mehmet ; Lumsdaine, Andrew
  • Author_Institution
    Dept. of Comput. Sci., Indiana Univ., Bloomington, IN, USA
  • fYear
    2005
  • fDate
    4-8 April 2005
  • Abstract
    Many applications in comparative genomics lend themselves to implementations that take advantage of common high-performance features in modern microprocessors. However, the common suggestion that a data-parallel, multithreaded, or high-throughput implementation is possible often ignores the complexity of actually creating such software. In this paper, we present a data-parallel version of a classic comparative genomics algorithm, the dot plot, along with a multiprocessor extension. For large genomic comparisons, these new algorithms achieve speedups of up to 14.4x over the sequential version. This speedup opens the opportunity to perform full pairwise comparisons of entire genomes on a much larger scale than previously possible. We also present the experimental, model-driven approach used to develop the algorithm, which allowed us to carefully study and evaluate implementation options and fully understand the parameters affecting its performance.
  • Keywords
    biology computing; genetics; parallel algorithms; sequences; Altivec; data-parallel algorithm; dot plot algorithm; genomics algorithm; high-performance direct pairwise comparison; large genomic sequences; sequence alignment; vector processor; Application software; Bioinformatics; Computer science; Databases; Filtering algorithms; Genomics; Microprocessors; Open systems; Parallel processing; Vector processors; comparative genomics; data-parallel; dot plot; high-performance computing; pairwise comparison
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel and Distributed Processing Symposium, 2005. Proceedings. 19th IEEE International
  • Print_ISBN
    0-7695-2312-9
  • Type
    conf
  • DOI
    10.1109/IPDPS.2005.246
  • Filename
    1420094