• DocumentCode
    3006971
  • Title
    Fast Quasi-biclique Mining with Giraph
  • Author
    Hsiao-Fei Liu ; Chung-Tsai Su ; An-Chiang Chu

  • Author_Institution
    CoreTech, Trend Micro, Inc., Taipei, Taiwan
  • fYear
    2013
  • fDate
    June 27, 2013 - July 2, 2013
  • Firstpage
    347
  • Lastpage
    354
  • Abstract
    Quasi-biclique mining on bipartite graphs has important applications in security services. However, the standard MapReduce algorithm for mining quasi-bicliques does not scale well because it must shuffle and reduce a huge number of map outputs. To cope with web-scale graphs, we propose a scalable algorithm built on Giraph, an emerging large-scale graph processing platform that follows the bulk synchronous parallel (BSP) model. Experimental results on real-world domain-IP graphs demonstrate that the proposed solution reduces CPU time by 80% and disk I/O by 95% compared with the standard MapReduce algorithm.
  • Keywords
    Internet; data mining; parallel algorithms; security of data; BSP model; Giraph; MapReduce algorithm; Web-scale graphs; bipartite graphs; bulk synchronous parallel model; domain-IP graphs; fast quasi-biclique mining; large-scale graph processing platform; security services; Algorithm design and analysis; Communities; Partitioning algorithms; Servers; Standards; Bulk Synchronous Parallel; Graph Partitioning; Quasi-Clique
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Big Data (BigData Congress), 2013 IEEE International Congress on
  • Conference_Location
    Santa Clara, CA
  • Print_ISBN
    978-0-7695-5006-0
  • Type
    conf
  • DOI
    10.1109/BigData.Congress.2013.53
  • Filename
    6597157