Title :
Improving Multi-dimensional query processing with data migration in distributed cache infrastructure
Author :
Youngmoon Eom ; Jinwoong Kim ; Deukyeon Hwang ; Jaewon Kwak ; Minho Shin ; Beomseok Nam
Author_Institution :
Dept. of Comput. Sci. Eng., Ulsan Nat. Inst. of Sci. & Technol. (UNIST), Ulsan, South Korea
Abstract :
In distributed query processing systems where the caching infrastructure is distributed and scales with the number of servers, it is becoming more important to seamlessly orchestrate and leverage a large number of cached objects, as the present trend is to build large scalable distributed systems by connecting small heterogeneous machines. With a large-scale distributed caching system, a scheduling policy must consider both cache hit ratio and system load balance to optimize multiple queries. A scheduling policy that considers system load but not cache hit ratio often fails to reuse cached data because it does not assign a query to the server that holds the data objects the query needs. Conversely, a scheduling policy that considers cache hit ratio but not system load balance may suffer from system load imbalance. To maximize overall system throughput and reduce query response time, a multiple query scheduling policy must balance system load and also leverage cached objects. In this paper, we present a distributed query processing framework that exhibits high cache hit ratio while achieving good system load balance. To seamlessly manage our distributed scalable caching system, our framework performs autonomic cached data migrations to improve cache hit ratio. Our experiments show that our proposed query scheduling policy and data migration policy significantly improve system throughput by achieving high cache hit ratio while avoiding system load imbalance.
Keywords :
cache storage; query processing; resource allocation; scheduling; autonomic cached data migrations; cache hit ratio; cached objects; data migration policy; distributed cache infrastructure; distributed query processing framework; distributed query processing systems; distributed scalable caching system; large scalable distributed systems; large scale distributed caching system; leverage cached objects; multidimensional query processing; multiple query scheduling policy; query scheduling policy; scheduling policy; system load balance; system load imbalance; Distributed databases; Dynamic scheduling; Load management; Query processing; Servers; Throughput
Conference_Title :
High Performance Computing (HiPC), 2014 21st International Conference on
Print_ISBN :
978-1-4799-5975-4
DOI :
10.1109/HiPC.2014.7116906