Title :
Double-Phase Locality Sensitive Hashing of neighborhood development for multi-relational data
Author :
Ping Ling ; Xiangsheng Rong
Author_Institution :
Coll. of Comput. Sci. & Technol., Jiangsu Normal Univ., Xuzhou, China
Abstract :
Neighborhood development is a fundamental problem in machine learning, closely connected with neighbor search, data indexing, clustering, and classification. Multi-Relational (MR) data are objects stored in a relational database and are widely used across applications. Yet a neighborhood development algorithm for MR data has been lacking, because MR data are high-dimensional and highly structured. This paper therefore proposes a Double-Phase Locality Sensitive Hashing (DPLSH) algorithm to develop neighborhoods for MR data. DPLSH consists of offline and online hashing schemas and incorporates parameterization heuristics that make the algorithm data-adaptive and less costly. Based on the hashing projections of DPLSH, a family of neighborhood formulation methods is defined to specify diverse criteria for identifying neighbors. Extensive experiments show that, for MR data, the neighborhoods produced by DPLSH are of better quality than those of its peers; for common data, DPLSH performs competitively with the state of the art.
Keywords :
data mining; data structures; DPLSH algorithm; MR data; double-phase locality sensitive hashing; multirelational data; neighborhood development algorithm; offline hashing schemas; online hashing schemas; Accuracy; Algorithm design and analysis; Approximation methods; Data mining; Measurement; Relational databases; Vectors; Locality sensitive Hashing; Multi-relational data; double-phase approach; neighborhood development; parameterization
Conference_Title :
Computational Intelligence (UKCI), 2013 13th UK Workshop on
Conference_Location :
Guildford
Print_ISBN :
978-1-4799-1566-8
DOI :
10.1109/UKCI.2013.6651307