Title of article :
Systematic tuning of parameters in support vector clustering
Author/Authors :
Yılmaz, Özlem and Achenie, Luke E.K. and Srivastava, Ranjan
Issue Information :
Serial-numbered journal issue, 2007
Abstract :
Clustering algorithms divide a set of observations into groups so that members of the same group share common features. In most algorithms, tunable parameters are set arbitrarily or by trial and error, resulting in suboptimal clustering. This paper presents a global optimization strategy for the systematic and optimal selection of parameter values associated with a clustering method. In the process, a performance criterion for the optimization model is proposed and benchmarked against popular performance criteria from the literature (namely, the Silhouette coefficient, Dunn’s index, and the Davies–Bouldin index). The tuning strategy is illustrated using the support vector clustering (SVC) algorithm and simulated annealing. To reduce the computational burden, the paper also proposes an alternative to the adjacency matrix method (used for the assignment of cluster labels), namely the contour plotting approach. Datasets tested include the iris and thyroid datasets from the UCI repository, as well as lymphoma and breast cancer data. The optimal tuning parameters are determined efficiently, and the contour plotting approach significantly reduces computational effort (CPU time), especially for large datasets. The performance criteria comparisons indicate mixed results. Specifically, the Silhouette coefficient and the Davies–Bouldin index perform better than the proposed performance index, while Dunn’s index performs worse on average.
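The tuning idea in the abstract (search a clustering parameter with simulated annealing, scoring each candidate with a validity criterion) can be illustrated with a minimal sketch. This is not the paper's method: SVC is replaced here by a simple distance-threshold clustering, the criterion is the Silhouette coefficient rather than the proposed index, and the names `silhouette`, `threshold_clusters`, `anneal_radius`, and all parameter values are assumptions chosen for illustration.

```python
import math
import random


def silhouette(points, labels):
    """Mean Silhouette coefficient; -1.0 if there are fewer than two clusters."""
    clusters = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)
    if len(clusters) < 2:
        return -1.0
    total = 0.0
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # a singleton contributes a silhouette of 0
        # a: mean distance to the other members of the same cluster
        a = sum(math.dist(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(points[i], points[j]) for j in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        denom = max(a, b)
        total += (b - a) / denom if denom > 0 else 0.0
    return total / len(points)


def threshold_clusters(points, radius):
    """Stand-in clusterer (assumption, not SVC): connected components of the
    graph that links any two points closer than `radius`."""
    labels = [-1] * len(points)
    current = 0
    for start in range(len(points)):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(len(points)):
                if labels[j] == -1 and math.dist(points[i], points[j]) < radius:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


def anneal_radius(points, lo, hi, steps=300, t0=1.0, cooling=0.98, seed=0):
    """Simulated annealing over one parameter, maximizing the silhouette."""
    rng = random.Random(seed)
    radius = rng.uniform(lo, hi)
    score = silhouette(points, threshold_clusters(points, radius))
    best_radius, best_score = radius, score
    temp = t0
    for _ in range(steps):
        # Propose a Gaussian perturbation, clipped to the search interval
        cand = min(hi, max(lo, radius + rng.gauss(0.0, 0.1 * (hi - lo))))
        cand_score = silhouette(points, threshold_clusters(points, cand))
        # Metropolis rule: accept improvements, occasionally accept worse moves
        if cand_score >= score or rng.random() < math.exp((cand_score - score) / temp):
            radius, score = cand, cand_score
        if score > best_score:
            best_radius, best_score = radius, score
        temp *= cooling  # geometric cooling schedule
    return best_radius, best_score


# Synthetic data (assumption): two well-separated Gaussian blobs in 2D
data_rng = random.Random(1)
points = [(data_rng.gauss(0, 0.3), data_rng.gauss(0, 0.3)) for _ in range(15)]
points += [(data_rng.gauss(5, 0.3), data_rng.gauss(5, 0.3)) for _ in range(15)]

best_radius, best_score = anneal_radius(points, lo=0.1, hi=8.0)
print(f"best radius: {best_radius:.3f}, silhouette: {best_score:.3f}")
```

The same loop would apply to other criteria (for example Dunn’s index or the Davies–Bouldin index) by swapping the scoring function, and to more than one parameter by perturbing a vector instead of a scalar.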
Keywords :
Clustering , Support Vector Machines , Principal components analysis , Cluster validity , Gene expression data , Parameter tuning
Journal title :
Mathematical Biosciences