DocumentCode :
1522810
Title :
Linearithmic Time Sparse and Convex Maximum Margin Clustering
Author :
Zhang, Xiao-Lei ; Wu, Ji
Author_Institution :
Dept. of Electron. Eng., Tsinghua Univ., Beijing, China
Volume :
42
Issue :
6
fYear :
2012
Firstpage :
1669
Lastpage :
1692
Abstract :
Maximum margin clustering (MMC) is a recently proposed clustering method that has shown promising performance. It was originally formulated as a difficult nonconvex integer problem. To make MMC practical, researchers have either relaxed the original problem into convex optimization problems that are inefficient to solve, or reformulated it as nonconvex optimization problems, sacrificing convexity for efficiency. No existing approach is both convex and efficient. In this paper, a new linearithmic time sparse and convex MMC algorithm, called support-vector-regression-based MMC (SVR-MMC), is proposed. It first uses support vector regression (SVR) as the core of the MMC. The resulting problem is then relaxed to a convex optimization problem, which is solved iteratively by the cutting-plane algorithm. Each cutting-plane subproblem is further decomposed into a series of supervised SVR problems by a new global extended-level method (GELM). Finally, each supervised SVR problem is solved in linear time by a new sparse-kernel SVR (SKSVR) algorithm. We further extend the SVR-MMC algorithm to the multiple-kernel clustering (MKC) problem and the multiclass MMC (M3C) problem; the extensions are denoted SVR-MKC and SVR-M3C, respectively. One key point of the algorithms is the use of the SVR, which prevents the MMC and its extensions from encountering an integer matrix programming problem. Another key point is the new SKSVR, which provides a linear time interface to nonlinear kernel scenarios, so that the SVR-MMC and its extensions keep a linearithmic time complexity in nonlinear kernel scenarios. Our experimental results on various real-world data sets demonstrate the effectiveness and efficiency of the SVR-MMC and its two extensions. Moreover, the unsupervised application of the SVR-MKC to voice activity detection (VAD) shows that the SVR-MKC achieves performance close to that of its supervised counterpart, meets the real-time demand of the VAD, and needs no labeling for model training.
Keywords :
computational complexity; concave programming; convex programming; integer programming; pattern clustering; regression analysis; sparse matrices; speech processing; support vector machines; unsupervised learning; GELM; M3C problem; MKC problem; SKSVR algorithm; SVR-MMC algorithm; VAD; convex MMC algorithm; convex maximum margin clustering; convex optimization problem; cutting-plane algorithm; global extended-level method; integer matrix programming problem; linear time complexity; linear time interface; linearithmic time sparse clustering; model training; multiclass MMC problem; multiple-kernel clustering problem; nonconvex integer problem; nonconvex optimization problems; nonlinear kernel scenarios; serial supervised SVR problem; sparse-kernel SVR; support vector regression-based MMC; voice activity detection; Clustering algorithms; Complexity theory; Convex functions; Optimization; Support vector machines; Unsupervised learning; Clustering; maximum margin; multiple-kernel learning (MKL); unsupervised learning; voice activity detection (VAD);
fLanguage :
English
Journal_Title :
Systems, Man, and Cybernetics, Part B: Cybernetics, IEEE Transactions on
Publisher :
IEEE
ISSN :
1083-4419
Type :
jour
DOI :
10.1109/TSMCB.2012.2197824
Filename :
6204103