• DocumentCode
    1086599
  • Title

    Distributed Sequence Alignment Applications for the Public Computing Architecture

  • Author

    Pellicer, S. ; Guihai Chen ; Chan, K.C.C. ; Yi Pan

  • Author_Institution
    Georgia State Univ., Atlanta
  • Volume
    7
  • Issue
    1
  • fYear
    2008
  • fDate
    3/1/2008
  • Firstpage
    35
  • Lastpage
    43
  • Abstract
    The public computer architecture shows promise as a platform for solving fundamental problems in bioinformatics, such as global gene sequence alignment and data mining with tools such as the Basic Local Alignment Search Tool (BLAST). Our implementation of these two problems on the Berkeley Open Infrastructure for Network Computing (BOINC) platform demonstrates a runtime reduction factor of 1.15 for sequence alignment and 16.76 for BLAST. Although the runtime reduction factor for the global gene sequence alignment application is modest, it is measured against a theoretical sequential runtime extrapolated from the calculation of a smaller problem. Because that extrapolation assumes the entire calculation runs in memory, the corresponding sequential run would require 37.3 GB of memory on a single system. The BOINC implementation therefore offers not only a reduced runtime but also the aggregated memory of all participant nodes. Compared against an actual sequential run of the problem, the reduction in runtime would be more dramatic, because a practical single system would incur additional secondary-storage I/O overhead. Despite the limitations of the public computer architecture, most notably its communication overhead, it represents a practical platform for grid- and cluster-scale bioinformatics computations today and shows great potential for future implementations.
  • Keywords
    biology computing; data mining; extrapolation; genetics; BLAST; BOINC platform; Berkeley open infrastructure-for-network computing; basic local alignment search tool; bioinformatics; data mining; distributed sequence alignment applications; global gene sequence alignment; public computing architecture; runtime reduction factor; secondary storage I/O overhead; sequential runtime extrapolation; Application software; Bioinformatics; Computer architecture; Computer networks; Computer science; Concurrent computing; Data mining; Distributed computing; Grid computing; Runtime; Basic local alignment search tool (BLAST); Berkeley Open infrastructure for network computing (BOINC); gene sequence alignment; public computer; Algorithms; Database Management Systems; Databases, Factual; Internet; Sequence Alignment; Sequence Analysis; Software;
  • fLanguage
    English
  • Journal_Title
    NanoBioscience, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1536-1241
  • Type

    jour

  • DOI
    10.1109/TNB.2008.2000148
  • Filename
    4459723
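
The abstract's runtime reduction factor and in-memory extrapolation can be illustrated with a minimal sketch. The record does not say how the authors extrapolated the sequential runtime or laid out the alignment matrix, so everything here is an assumption: the function names are hypothetical, the quadratic scaling assumes a standard dynamic-programming global alignment of two sequences, and the 4-byte cell size is chosen only for illustration.

```python
def runtime_reduction_factor(sequential_seconds: float, distributed_seconds: float) -> float:
    """Ratio of sequential to distributed runtime; values above 1 mean the distributed run is faster."""
    if sequential_seconds <= 0 or distributed_seconds <= 0:
        raise ValueError("runtimes must be positive")
    return sequential_seconds / distributed_seconds


def extrapolate_quadratic_runtime(small_seconds: float, small_len: int, target_len: int) -> float:
    """Scale a runtime measured on a small problem to a larger one.

    Assumption: cost grows as O(n^2) in sequence length, as in a full
    dynamic-programming alignment matrix. The paper's method may differ.
    """
    return small_seconds * (target_len / small_len) ** 2


def full_matrix_bytes(len_a: int, len_b: int, bytes_per_cell: int = 4) -> int:
    """Memory needed to hold a full (len_a + 1) x (len_b + 1) score matrix.

    Assumption: one fixed-size cell per matrix entry (hypothetical layout).
    """
    return (len_a + 1) * (len_b + 1) * bytes_per_cell


if __name__ == "__main__":
    # Hypothetical inputs, not figures from the paper.
    t_seq = extrapolate_quadratic_runtime(small_seconds=2.0, small_len=1_000, target_len=10_000)
    print("extrapolated sequential runtime (s):", t_seq)
    print("reduction factor:", runtime_reduction_factor(t_seq, distributed_seconds=100.0))
    print("matrix bytes:", full_matrix_bytes(999, 999))
```

Under these assumptions, matrix memory grows with the product of the sequence lengths, which is why an in-memory sequential baseline can exceed a single machine's RAM while the same work is spread across many participant nodes.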