• DocumentCode
    671475
  • Title
    Massively parallel learning of Bayesian networks with MapReduce for factor relationship analysis
  • Author
    Wei Chen ; Tengjiao Wang ; Dongqing Yang ; Kai Lei ; Yueqin Liu
  • Author_Institution
    Sch. of EECS, Peking Univ., Beijing, China
  • fYear
    2013
  • fDate
    4-9 Aug. 2013
  • Firstpage
    1
  • Lastpage
    5
  • Abstract
    The Bayesian network (BN) is one of the most popular models in data mining. Most algorithms for BN structure learning are developed for centralized datasets, where all the data are gathered on a single computer node. They are often too costly or impractical for learning BN structures from large-scale data. Through a simple interface with two functions, map and reduce, MapReduce facilitates parallel implementation of many real-world tasks such as data processing for search engines and machine learning. In this paper, we present a parallel algorithm for BN structure learning from large-scale datasets using a MapReduce cluster. We discuss the benefits of using MapReduce for BN structure learning, and demonstrate the performance of this approach by applying it to a real-world task from the domain of financial analysis: learning the relationships among financial factors.
  • Keywords
    Bayes methods; data mining; directed graphs; financial data processing; learning (artificial intelligence); parallel algorithms; search engines; BN structure learning; MapReduce cluster; centralized datasets; data mining technologies; data processing; financial analysis; financial factor relationship learning task analysis; large-scale dataset; machine learning; massively parallel learning; parallel algorithm; parallel implementation; search engines; Algorithm design and analysis; Bayes methods; Computational modeling; Data mining; Educational institutions; Mutual information; Training;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Neural Networks (IJCNN), The 2013 International Joint Conference on
  • Conference_Location
    Dallas, TX
  • ISSN
    2161-4393
  • Print_ISBN
    978-1-4673-6128-6
  • Type
    conf
  • DOI
    10.1109/IJCNN.2013.6706814
  • Filename
    6706814