Title :
Characterization and analysis of HMMER and SVM-RFE parallel bioinformatics applications
Author :
Srinivasan, Uma ; Chen, Peng-Sheng ; Diao, Qian ; Lim, Chu-Cheow ; Li, Eric ; Chen, Yongjian ; Ju, Roy ; Zhang, Yimin
Author_Institution :
Microprocessor Technol. Lab, Intel Corp., Santa Clara, CA, USA
Abstract :
Bioinformatics applications constitute an emerging data-intensive, high-performance computing (HPC) domain. While there is much research on algorithmic improvements (2004), the actual performance of an application also depends on how well the program maps to the target hardware. This paper presents a performance study of two parallel bioinformatics applications, HMMER (sequence alignment) and SVM-RFE (gene expression analysis), on Intel x86-based hyperthread-capable (2002) shared-memory multiprocessor systems. The performance characteristics varied according to the application and the target hardware. For instance, HMMER is compute intensive and showed better scalability on a 3.0 GHz system than on a 2.2 GHz system. SVM-RFE, by contrast, is memory intensive and showed better absolute performance on the 2.2 GHz machine, which has better memory bandwidth. Performance is also affected by processor features, e.g. hyperthreading (HT) (2002) and prefetching. With HMMER, enabling HT yielded approximately 75% of the performance gained by doubling the number of CPUs. While load-balancing optimizations can provide a speedup of approximately 30% for HMMER on a hyperthreading-enabled system, the load balancing has to adapt to the target number of processors and threads. SVM-RFE benefits differently from the same load-balancing and thread-scheduling tuning. We conclude that compiler and runtime optimizations play an important role in achieving the best performance for a given bioinformatics algorithm.
Keywords :
biology computing; multi-threading; processor scheduling; resource allocation; shared memory systems; storage management; HMMER; Intel x86 based hyperthread-capable shared-memory multiprocessor systems; SVM-RFE; compiler optimization; gene expression analysis; hyperthreading; load balancing optimizations; parallel bioinformatics applications; parallel processing; prefetching; runtime optimization; sequence alignment; thread scheduling tuning; workload analysis; Bandwidth; Bioinformatics; Gene expression; Hardware; Hidden Markov models; Load management; Multiprocessing systems; Performance analysis; Scalability; Yarn;
Conference_Title :
Workload Characterization Symposium, 2005. Proceedings of the IEEE International
Print_ISBN :
0-7803-9461-5
DOI :
10.1109/IISWC.2005.1526004