DocumentCode
3737335
Title
Distributed computing of all-to-all comparison problems in heterogeneous systems
Author
Yi-Fan Zhang;Yu-Chu Tian;Wayne Kelly;Colin Fidge
Author_Institution
School of Electrical Engineering and Computer Science, Queensland University of Technology, GPO Box 2434, Brisbane, QLD 4001, Australia
fYear
2015
Firstpage
2053
Lastpage
2058
Abstract
Distributed computing of all-to-all comparison (ATAC) problems in heterogeneous systems is increasingly important in various domains. Though Hadoop-based solutions are widely used, they are inefficient for the ATAC pattern, which is fundamentally different from the MapReduce pattern for which Hadoop is designed. They exhibit poor data locality and unbalanced allocation of comparison tasks, particularly in heterogeneous systems. This results in massive data movement at runtime and ineffective utilization of computing resources, significantly degrading overall computing performance. To address these problems, this paper presents a scalable and efficient data and task distribution strategy for processing large-scale ATAC problems in heterogeneous systems. It not only saves storage space but also achieves load balancing and good data locality for all comparison tasks. Experiments on bioinformatics examples show that the approach achieves about 89% of the ideal performance capacity of the multiple machines.
Keywords
"Distributed databases","Processor scheduling","Resource management","Runtime","Load management","Distribution strategy"
Publisher
IEEE
Conference_Titel
Industrial Electronics Society, IECON 2015 - 41st Annual Conference of the IEEE
Type
conf
DOI
10.1109/IECON.2015.7392403
Filename
7392403