DocumentCode :
246359
Title :
Collaborative Multi-dimensional Dataset Processing with Distributed Cache Infrastructure in the Cloud
Author :
Youngmoon Eom ; Jonghwan Moon ; Jinwoong Kim ; Beomseok Nam
Author_Institution :
Sch. of Electr. & Comput. Eng., Ulsan Nat. Inst. of Sci. & Technol., Ulsan, South Korea
fYear :
2014
fDate :
8-12 Sept. 2014
Firstpage :
241
Lastpage :
248
Abstract :
As modern large-scale systems are built from many small, independent servers, it is becoming increasingly important to orchestrate and seamlessly leverage a large number of distributed buffer caches. Several previous studies have shown that, with large-scale distributed caching facilities, traditional resource scheduling policies often fail to achieve both a high cache hit ratio and good system load balance. A scheduling policy that considers only system load yields a low cache hit ratio, while a policy that emphasizes cache hit ratio over load balance suffers from load imbalance. To maximize overall system throughput, distributed caching facilities should balance the workload and leverage cached data at the same time. In this work, we present a distributed job processing framework that yields a high cache hit ratio while achieving good system load balance, the two factors most critical to improving overall system throughput and job response time. Our framework is a component-based distributed data analysis framework that supports multiple geographically distributed job schedulers. The job scheduler in our framework employs a distributed job scheduling policy, DEMA, that considers both cache hit ratio and system load. In this paper, we show that collaborative task scheduling can further improve performance by increasing the overall cache hit ratio while achieving load balance. Our experiments show that the proposed job scheduling policies outperform a legacy load-based job scheduling policy in terms of job response time, load balancing, and cache hit ratio.
Keywords :
cache storage; cloud computing; data analysis; groupware; resource allocation; scheduling; DEMA; cache hit ratio; cloud; collaborative multidimensional dataset processing; collaborative task scheduling; component-based distributed data analysis framework; distributed buffer cache memory; distributed cache infrastructure; distributed caching facilities; distributed job processing framework; distributed job scheduling policy; job response time; job schedulers; legacy load-based job scheduling policy; load balance; overall system throughput; system load imbalance; Distributed databases; Dynamic scheduling; Load management; Processor scheduling; Servers; Throughput;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Cloud and Autonomic Computing (ICCAC), 2014 International Conference on
Conference_Location :
London
Type :
conf
DOI :
10.1109/ICCAC.2014.18
Filename :
7024067