DocumentCode :
3601297
Title :
Multi-Granularity Locality-Sensitive Bloom Filter
Author :
Jiangbo Qian ; Qiang Zhu ; Huahui Chen
Author_Institution :
Sch. of Inf. Sci. & Eng., Ningbo Univ., Ningbo, China
Volume :
64
Issue :
12
fYear :
2015
Firstpage :
3500
Lastpage :
3514
Abstract :
In many applications, such as homeland security, image processing, social networks, and bioinformatics, it is often required to support an approximate membership query (AMQ) that answers a question like “is a query object q near at least one of the objects in the given data set Ω?” However, existing techniques for processing AMQs require a key parameter, i.e., the distance value, to be defined in advance. In this paper, we propose a novel filter, called the multi-granularity locality-sensitive Bloom filter (MLBF), which can process AMQs with multiple distance granularities. Specifically, the MLBF is composed of two Bloom filters (BFs): the basic multi-granularity locality-sensitive BF (BMLBF) and the multi-granularity verification BF (MVBF). The BMLBF stores the data objects and adopts an alignable locality-sensitive hashing (LSH) function family to support multiple granularities. The MVBF is used to reduce the false positive rate of the MLBF. The false negative rate of the MLBF is reduced by applying AND-constructions followed by an OR-construction. In addition, based on the MLBF structure, we suggest a more space-effective variant of the MLBF to further reduce space cost. Theoretical analyses for estimating the false positive/negative rates of the MLBF and its variant are given. Experiments using synthetic and real data show that the theoretical estimates are quite accurate, and that the MLBF and its variant can handle AMQs with low false positive and negative rates for multiple distance granularities.
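The ideas named in the abstract (LSH keys stored in a Bloom-style bit array, AND-constructions followed by an OR-construction, and one structure per distance granularity) can be illustrated with a minimal sketch. This is not the authors' MLBF: the class name `LSHBloomSketch`, all parameters, the zero-offset projection hash, the doubling of bucket width per level, and the CRC32 bit mapping are assumptions made for illustration only, and no verification filter (MVBF) is included.

```python
import random
import zlib


class LSHBloomSketch:
    """Illustrative only: a hypothetical LSH-backed bit array, not the paper's MLBF."""

    def __init__(self, dim, base_width=1.0, levels=3, k=2, L=4, m=1 << 16, seed=0):
        rng = random.Random(seed)
        self.base_width = base_width
        self.levels = levels
        self.m = m
        # L groups (OR-construction), each holding k Gaussian projections
        # (AND-construction: all k bucket ids must match to share a key).
        self.groups = [
            [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(k)]
            for _ in range(L)
        ]
        # One bit array per granularity level (assumed layout).
        self.bits = [bytearray(m) for _ in range(levels)]

    def _positions(self, v, level):
        # Assumption: the bucket width doubles per level and the offset is
        # zero, so coarser buckets are unions of finer ones.
        width = self.base_width * (2 ** level)
        for g, group in enumerate(self.groups):
            key = tuple(
                int(sum(a * x for a, x in zip(proj, v)) // width)
                for proj in group
            )
            yield zlib.crc32(repr((g, key)).encode()) % self.m

    def add(self, v):
        # Store the object at every granularity level.
        for level in range(self.levels):
            for pos in self._positions(v, level):
                self.bits[level][pos] = 1

    def query(self, v, level):
        # Positive if any of the L group keys hits a set bit.
        return any(self.bits[level][pos] for pos in self._positions(v, level))
```

A stored object always tests positive at every level, because `add` and `query` compute the same bit positions for it. Nearby and distant objects are answered only probabilistically, which is where the false positive and false negative rates analyzed in the paper arise.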
Keywords :
data structures; query processing; AMQ; AND-construction; BMLBF; LSH function; MVBF; OR-construction; approximate membership query processing; basic multigranularity locality-sensitive BF; locality-sensitive hashing function; multigranularity locality-sensitive Bloom filter; multigranularity verification BF; theoretical analysis; Bioinformatics; Gaussian distribution; Informatics; Query processing; Security; Social network services; US Department of Homeland Security; Approximate membership query; Bloom filter; approximate membership query; false positive/negative rates; locality-sensitive hashing; query processing;
fLanguage :
English
Journal_Title :
Computers, IEEE Transactions on
Publisher :
IEEE
ISSN :
0018-9340
Type :
jour
DOI :
10.1109/TC.2015.2401011
Filename :
7035010