DocumentCode :
3767412
Title :
Model and Comparison of Membership Testing Approach for Massive Data
Author :
Gansen Zhao;Aiping Li;Zijing Li;Chuanghui Liu
Author_Institution :
Sch. of Comput. Sci., SCNU, Guangzhou, China
fYear :
2015
Firstpage :
31
Lastpage :
34
Abstract :
In the Big Data era, data sets can be so large that efficiently testing whether a given piece of data already exists in a system has become a great challenge for many applications. Finding a way to address this problem is crucial. One feasible solution is to construct an in-memory data structure that represents the massive data set. By inspecting and computing over this data structure, it is possible to check whether a given data item is a member of the data set. Such a solution must take into account the efficiency of memory usage, among other factors. A number of space-efficient approaches, such as the bitmap and the Bloom filter, which can be called "memory-based membership testing approaches", provide practical implementations of this solution. However, there is no recognized model for these approaches and there are few theoretical performance comparisons among them, which makes it difficult to choose a proper approach for a specific scenario. This paper investigates how to compare the performance of different memory-based membership testing approaches. First, a model with corresponding definitions is proposed that formally represents these approaches. Based on the proposed model, evaluation criteria are developed and the corresponding algorithms are articulated. A theoretical comparison of five memory-based membership testing approaches is given, which can provide effective guidance for choosing an optimal approach for a specific scenario.
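As an illustration of the kind of memory-based membership test the abstract mentions, the following is a minimal Bloom filter sketch. It is not taken from the paper: the class name `BloomFilter`, the default sizes, and the double-hashing scheme over SHA-256 are all assumptions chosen for brevity.

```python
import hashlib


class BloomFilter:
    """Illustrative Bloom filter; names and parameters are assumptions."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        # One bit per slot, packed into bytes.
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str):
        # Derive num_hashes bit positions by double hashing one SHA-256 digest.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # keep the step odd
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        # False means definitely absent; True means possibly present.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )
```

A lookup that returns `False` guarantees the item was never added, while `True` may be a false positive. This trade of accuracy for memory is the sort of property a comparison of such approaches would need to account for.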
Keywords :
"Data structures","Error analysis","Standards","Measurement","Indexes","Computer science"
Publisher :
IEEE
Conference_Title :
Cloud Computing and Big Data (CCBD), 2015 International Conference on
Type :
conf
DOI :
10.1109/CCBD.2015.47
Filename :
7450527