DocumentCode
133012
Title
Implementation of genetic network programming and knapsack problem for record clustering on distributed database
Author
Wedashwara, Wirarama ; Mabu, Shingo ; Obayashi, Masanao ; Kuremoto, Takashi
Author_Institution
Grad. Sch. of Sci. & Eng., Yamaguchi Univ., Yamaguchi, Japan
fYear
2014
fDate
9-12 Sept. 2014
Firstpage
935
Lastpage
940
Abstract
This research applies genetic network programming (GNP) and the knapsack problem (KP) to record clustering on distributed databases. The objective is to distribute big data across sites of limited capacity while taking into account the similarity of the data placed at each site. GNP is used to extract rules from the big data by considering the characteristics (value ranges) of each attribute in the dataset. KP is then used to distribute the rules to the sites, with the similarity (value) and the data amount (weight) associated with each rule determining how the rules are fitted to the site capacities.
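The knapsack step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the rule tuples, the site names, the helper names `knapsack` and `assign_rules`, and the site-by-site greedy strategy are all assumptions made for illustration. Each rule carries a weight (its data amount, here an integer record count) and a value (its similarity score), and each site is filled in turn by a standard 0/1 knapsack over the rules that remain unplaced.

```python
from typing import Dict, List, Tuple

# Hypothetical rule representation: (name, weight = record count, value = similarity)
Rule = Tuple[str, int, float]


def knapsack(rules: List[Rule], capacity: int) -> List[Rule]:
    """Classic 0/1 knapsack solved by dynamic programming over integer weights."""
    n = len(rules)
    # best[i][c] = maximum total value using the first i rules within capacity c
    best = [[0.0] * (capacity + 1) for _ in range(n + 1)]
    for i, (_, w, v) in enumerate(rules, start=1):
        for c in range(capacity + 1):
            best[i][c] = best[i - 1][c]
            if w <= c and best[i - 1][c - w] + v > best[i][c]:
                best[i][c] = best[i - 1][c - w] + v
    # Walk back through the table to recover which rules were chosen.
    chosen: List[Rule] = []
    c = capacity
    for i in range(n, 0, -1):
        if best[i][c] != best[i - 1][c]:
            chosen.append(rules[i - 1])
            c -= rules[i - 1][1]
    return chosen[::-1]


def assign_rules(rules: List[Rule], capacities: Dict[str, int]) -> Dict[str, List[str]]:
    """Fill sites one at a time (an assumed strategy); leftover rules stay unplaced."""
    remaining = list(rules)
    placement: Dict[str, List[str]] = {}
    for site, cap in capacities.items():
        picked = knapsack(remaining, cap)
        placement[site] = [name for name, _, _ in picked]
        remaining = [r for r in remaining if r not in picked]
    return placement
```

Filling sites one after another is only one possible way to extend the single-knapsack formulation to several sites; a joint multiple-knapsack formulation could place the rules differently.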
Keywords
Big Data; combinatorial mathematics; distributed databases; genetic algorithms; knapsack problems; pattern clustering; Big Data distribution; GNP; KP; Knapsack problem; attribute characteristics; attribute value ranges; combinational optimization problem; data amount weight; distributed data similarity value; distributed databases; genetic network programming; record clustering; rule extraction; site capacity matching; Data mining; Distributed databases; Economic indicators; Genetics; Optimization; Programming; Database Clustering; Genetic Network Programming; Knapsack Problem; Record Clustering;
fLanguage
English
Publisher
IEEE
Conference_Titel
SICE Annual Conference (SICE), 2014 Proceedings of the
Conference_Location
Sapporo
Type
conf
DOI
10.1109/SICE.2014.6935234
Filename
6935234