DocumentCode :
1773961
Title :
A caching approach to process stream data in data warehouse
Author :
Naeem, Muhammad A.
Author_Institution :
Sch. of Comput. & Math. Sci., Auckland Univ. of Technol., Auckland, New Zealand
fYear :
2014
fDate :
Sept. 29 2014-Oct. 1 2014
Firstpage :
162
Lastpage :
167
Abstract :
Stream-based join algorithms are a promising technology for modern real-time data warehouses. A particular category of stream-based joins is the semi-stream join, in which a single stream is joined with disk-based master data. The join operator typically works under limited main memory, which is generally not large enough to hold the whole disk-based master data. Recently, a seminal join algorithm called MESHJOIN (Mesh Join) has been proposed in the literature to process semi-stream data. MESHJOIN is a candidate for a resource-aware system setup. However, MESHJOIN is not very selective: it does not consider the characteristics of the stream data, and its performance is suboptimal for skewed stream data. In this paper I propose a novel Cached-based Semi-Stream Join (CSSJ) that uses a cache module. The algorithm is more appropriate for skewed distributions, and I present results for Zipfian distributions of the type that appear in many applications. I conduct a rigorous experimental study to test the algorithm. The experiments show that CSSJ outperforms MESHJOIN significantly. I also present a cost model for CSSJ and validate it with experiments.
Keywords :
cache storage; data warehouses; resource allocation; statistical distributions; CSSJ; MESHJOIN; Zipfian distributions; cache module; cached-based semistream join; caching approach; data warehouse; disk based master data; join operator; resource-aware system setup; seminal join algorithm; skewed distributions; stream data processing; stream-based join algorithm; Algorithm design and analysis; Business; Data warehouses; Loading; Probes; Real-time systems; Warehousing; Performance optimization; Real-time data warehouses; Semi-stream join;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Digital Information Management (ICDIM), 2014 Ninth International Conference on
Conference_Location :
Phitsanulok
Type :
conf
DOI :
10.1109/ICDIM.2014.6991406
Filename :
6991406