DocumentCode
2160532
Title
Modeling tryptic digestion on the Cell BE processor
Author
Green, James R. ; Mahmoud, Hanan ; Dumontier, Michel
Author_Institution
Dept. of Syst. & Comput. Eng., Carleton Univ., Ottawa, ON
fYear
2009
fDate
3-6 May 2009
Firstpage
701
Lastpage
705
Abstract
The Cell BE is a heterogeneous multi-core processor offering multiple levels of parallelism. When these are properly leveraged, the Cell BE demonstrates impressive performance acceleration for several high-performance computing applications, including exact string matching on streaming data. The present study investigates the suitability of the Cell BE for a string matching problem of relevance to proteomics: the identification of tryptic digest points based on the presence of a short sequence motif. Three string matching algorithms are implemented and evaluated over several proteomic datasets. Parabix, a method of high-throughput XML stream processing that relies on bit transposition and the effective use of single-instruction multiple-data (SIMD) instructions, is applied here to bioinformatics for the first time, with great success. This method performs very well when the protein database is pre-processed into parallel bit streams. Double buffering is also critical for hiding the latency of DMA data transfers. Performance results are reported for both the cycle-accurate Cell BE simulator and real hardware. The problem is also placed in the larger context of using the Cell BE to achieve hypothesis-driven protein identification.
Keywords
XML; bioinformatics; file organisation; microprocessor chips; parallel processing; string matching; Cell BE processor; data streaming; double buffering; heterogeneous multicore processor; high-throughput XML stream processing; hypothesis-driven protein identification; memory access; parallel bit streams; protein database; proteomic datasets; single-instruction multiple-data instruction; tryptic digestion modeling; Acceleration; Databases; Delay; High performance computing; Multicore processing; Proteins; Proteomics
fLanguage
English
Publisher
IEEE
Conference_Titel
Electrical and Computer Engineering, 2009. CCECE '09. Canadian Conference on
Conference_Location
St. John's, NL
ISSN
0840-7789
Print_ISBN
978-1-4244-3509-8
Electronic_ISBN
0840-7789
Type
conf
DOI
10.1109/CCECE.2009.5090220
Filename
5090220