• DocumentCode
    3724123
  • Title
    Scalable Hypergraph Learning and Processing
  • Author
    Jin Huang;Rui Zhang;Jeffrey Xu Yu
  • Author_Institution
    Dept. of Comput. &
  • fYear
    2015
  • Firstpage
    775
  • Lastpage
    780
  • Abstract
    A hypergraph allows a hyperedge to connect more than two vertices. By using hyperedges to capture high-order relationships, many hypergraph learning algorithms have proven highly effective in various applications. When learning on large hypergraphs, a common approach is to convert them to graphs so that distributed graph frameworks can be employed, yet this causes major efficiency drawbacks, including an inflated problem size, excessive replicas, and unbalanced workloads. To avoid these drawbacks, we take a different approach and propose HyperX, a thin layer built upon Spark. To preserve the problem size, HyperX operates directly on a distributed hypergraph. To reduce replicas, HyperX replicates the vertices but not the hyperedges. To balance the workloads, we investigate the hypergraph partitioning problem, which aims to minimize the space and communication costs subject to two separate constraints on the hyperedge and vertex workloads. With experiments on both real and synthetic datasets, we verify that HyperX significantly improves the efficiency of the learning algorithms compared with the graph conversion approach.
  • Keywords
    "Partitioning algorithms","Algorithm design and analysis","Sparks","Machine learning algorithms","Proteins","Optimization","Approximation algorithms"
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining (ICDM), 2015 IEEE International Conference on
  • ISSN
    1550-4786
  • Type
    conf
  • DOI
    10.1109/ICDM.2015.33
  • Filename
    7373388