• DocumentCode
    2160532
  • Title
    Modeling tryptic digestion on the Cell BE processor
  • Author
    Green, James R. ; Mahmoud, Hanan ; Dumontier, Michel

  • Author_Institution
    Dept. of Syst. & Comput. Eng., Carleton Univ., Ottawa, ON
  • fYear
    2009
  • fDate
    3-6 May 2009
  • Firstpage
    701
  • Lastpage
    705
  • Abstract
    The Cell BE is a heterogeneous multi-core processor offering multiple levels of parallelism. When these are properly leveraged, the Cell BE demonstrates impressive performance acceleration for several high-performance computing applications, including exact string matching on streaming data. The present study investigates the suitability of the Cell BE for a string matching problem of relevance to proteomics: the identification of tryptic digest points based on the presence of a short sequence motif. Three string matching algorithms are implemented and evaluated over several proteomic datasets. In its first application to bioinformatics, Parabix, a method of high-throughput XML stream processing which relies on bit transposition and the effective use of single-instruction multiple-data (SIMD) instructions, is applied here with great success. This method performs very well when the protein database is pre-processed into the form of parallel bit streams. Double buffering is also critical to hide the latency of DMA data transfers. Performance results are reported for both the cycle-accurate Cell BE simulator and real hardware. This problem is also placed in the larger context of using the Cell BE to achieve hypothesis-driven protein identification.
  • Keywords
    XML; bioinformatics; file organisation; microprocessor chips; parallel processing; string matching; Cell BE processor; data streaming; double buffering; heterogeneous multicore processor; high-throughput XML stream processing; hypothesis-driven protein identification; memory memory access; parallel bit streams; protein database; proteomic datasets; single-instruction multiple-data instruction; tryptic digestion modeling; Acceleration; Databases; Delay; High performance computing; Multicore processing; Proteins; Proteomics
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Electrical and Computer Engineering, 2009. CCECE '09. Canadian Conference on
  • Conference_Location
    St. John's, NL
  • ISSN
    0840-7789
  • Print_ISBN
    978-1-4244-3509-8
  • Electronic_ISBN
    0840-7789
  • Type

    conf

  • DOI
    10.1109/CCECE.2009.5090220
  • Filename
    5090220