• DocumentCode
    2191126
  • Title
    Parallel EM-Clustering: Fast Convergence by Asynchronous Model Updates
  • Author
    Plant, Claudia; Böhm, Christian
  • Author_Institution
    Florida State Univ., Tallahassee, FL, USA
  • fYear
    2010
  • fDate
    13 Dec. 2010
  • Firstpage
    178
  • Lastpage
    185
  • Abstract
    The data explosion in many applications requires efficient data mining solutions. Fortunately, emerging technologies such as grid and cloud computing, high-performance multi-core processors, and graphics processing units offer the potential to keep pace with this growth and open up new opportunities for designing efficient algorithms. In this paper, we propose a parallel variant of the Expectation Maximization (EM) algorithm suitable for clustering large data sets in a distributed environment. The conventional EM algorithm alternates between two phases: in the E-step, points are assigned to the clusters, and in the M-step, the cluster models are updated. The basic idea of our approach is to allow asynchronous model updates for faster convergence and better use of the available resources. The frequency of the updates can be flexibly adjusted to the specific characteristics of the environment, including communication costs and the computing power of the individual devices. An extensive experimental evaluation demonstrates the benefits of our approach.
  • Keywords
    convergence; data mining; expectation-maximisation algorithm; parallel algorithms; pattern clustering; E-step; M-step; asynchronous model updates; communication cost; computing power; data explosion; expectation maximization; fast convergence; parallel EM clustering; parallel variant
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining Workshops (ICDMW), 2010 IEEE International Conference on
  • Conference_Location
    Sydney, NSW
  • Print_ISBN
    978-1-4244-9244-2
  • Electronic_ISBN
    978-0-7695-4257-7
  • Type
    conf
  • DOI
    10.1109/ICDMW.2010.53
  • Filename
    5693298