Title :
Accelerating Expectation-Maximization Algorithms with Frequent Updates
Author :
Yin, Jiangtao ; Zhang, Yanfeng ; Gao, Lixin
Author_Institution :
Univ. of Massachusetts Amherst, Amherst, MA, USA
Abstract :
Expectation-Maximization (EM) is a popular approach for parameter estimation in many applications, such as image understanding, document classification, and genome data analysis. Despite the popularity of EM algorithms, it is challenging to implement them efficiently in a distributed environment. In particular, many EM algorithms that frequently update the parameters have been shown to be much more efficient than their concurrent counterparts. Accordingly, we propose two approaches to parallelize such EM algorithms in a distributed environment so as to scale to massive data sets. We prove that both approaches maintain the convergence properties of the EM algorithms. Based on these approaches, we design and implement a distributed framework, FreEM, to support the implementation of frequent updates for EM algorithms. We show its efficiency through three well-known EM applications: k-means clustering, fuzzy c-means clustering, and parameter estimation for the Gaussian mixture model. We evaluate our framework on both a local cluster of machines and the Amazon EC2 cloud. Our evaluation shows that EM algorithms with frequent updates implemented on FreEM can run much faster than implementations with traditional concurrent updates.
Keywords :
Gaussian processes; convergence; distributed processing; expectation-maximisation algorithm; fuzzy set theory; parameter estimation; pattern clustering; Amazon EC2 cloud; EM algorithm; EM application; FreEM distributed framework; Gaussian mixture model; convergence property; distributed environment; expectation-maximization algorithm; frequent updates; fuzzy c-means clustering; k-means clustering; massive data set; parameter estimation; Acceleration; Algorithm design and analysis; Clustering algorithms; Convergence; Frequency control; Linear programming; Synchronization;
Conference_Title :
Cluster Computing (CLUSTER), 2012 IEEE International Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-4673-2422-9
DOI :
10.1109/CLUSTER.2012.81