• DocumentCode
    133012
  • Title
    Implementation of genetic network programming and knapsack problem for record clustering on distributed database
  • Author
    Wedashwara, Wirarama ; Mabu, Shingo ; Obayashi, Masanao ; Kuremoto, Takashi
  • Author_Institution
    Grad. Sch. of Sci. & Eng., Yamaguchi Univ., Yamaguchi, Japan
  • fYear
    2014
  • fDate
    9-12 Sept. 2014
  • Firstpage
    935
  • Lastpage
    940
  • Abstract
    This research applies genetic network programming (GNP) and the knapsack problem (KP) to record clustering on distributed databases. The objective is to distribute big data across sites of limited capacity while taking into account the similarity of the data placed at each site. GNP is used to extract rules from big data based on the characteristics (value ranges) of each attribute in a dataset. KP is then used to assign rules to sites, treating the similarity of each rule as its value and the amount of data it covers as its weight, so that the assignment fits the site capacities.
  • Keywords
    Big Data; combinatorial mathematics; distributed databases; genetic algorithms; knapsack problems; pattern clustering; Big Data distribution; GNP; KP; Knapsack problem; attribute characteristics; attribute value ranges; combinational optimization problem; data amount weight; distributed data similarity value; genetic network programming; record clustering; rule extraction; site capacity matching; Data mining; Distributed databases; Economic indicators; Genetics; Optimization; Programming; Database Clustering; Genetic Network Programming; Knapsack Problem; Record Clustering
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    2014 Proceedings of the SICE Annual Conference (SICE)
  • Conference_Location
    Sapporo
  • Type
    conf
  • DOI
    10.1109/SICE.2014.6935234
  • Filename
    6935234