Title :
Theoretical Analysis of Wavelet Synopsis on Partitioned Data Sets
Author_Institution :
Dept. of Software Design & Manage., Gachon Univ., Seongnam, South Korea
Abstract :
Wavelet synopsis is one of the most popular dimensionality reduction methods and has been studied in areas such as query optimization, approximate query answering, and feature selection. As data sizes continue to grow, distributed data processing is becoming increasingly important. MapReduce, well known as Google's data processing environment, is the most popular distributed platform and offers good scalability and fault tolerance. Accordingly, algorithms for constructing wavelet synopses on the MapReduce platform have recently been proposed. Although wavelet synopses on partitioned data sets were proposed in recent work, only the algorithmic implementation and experimental results were given, with no theoretical analysis. In this paper, we theoretically analyze the properties of wavelet synopses on partitioned data sets and the correctness of merging them.
Keywords :
data compression; data mining; distributed processing; fault tolerant computing; wavelet transforms; Google data processing environment; MapReduce; MapReduce platform; algorithmic implementation; dimensionality reduction methods; distributed data processing; distributed platform; fault tolerance; partitioned data sets; theoretical wavelet synopsis analysis; Approximation algorithms; Approximation methods; Data processing; Distributed databases; Time complexity; Wavelet analysis
Conference_Title :
Information Science and Applications (ICISA), 2013 International Conference on
Conference_Location :
Suwon
Print_ISBN :
978-1-4799-0602-4
DOI :
10.1109/ICISA.2013.6579454