DocumentCode :
152203
Title :
The use of k-means++ for approximate spectral clustering of large datasets
Author :
Yalcin, Berna ; Tasdemir, Kadim
Author_Institution :
Dept. of Electronics and Communication Engineering, Istanbul Technical University, Istanbul, Turkey
fYear :
2014
fDate :
23-25 April 2014
Firstpage :
220
Lastpage :
223
Abstract :
Spectral clustering (SC) has been widely used in recent years thanks to its nonparametric model, its ability to extract clusters lying on different manifolds, and its ease of application. However, SC is infeasible for large datasets because of its high computational cost and memory requirements. To address this challenge, approximate spectral clustering (ASC) has been proposed for large datasets. ASC involves two steps: first, a limited number of data representatives (also known as prototypes) are selected by sampling or quantization methods; then, SC is applied to these representatives using various similarity criteria. In this study, several quantization and sampling methods are compared for ASC. Among them, k-means++, a recently popular clustering algorithm, is used to select prototypes in ASC for the first time. Experiments on different datasets indicate that k-means++ is a suitable alternative to neural gas and selective sampling in terms of accuracy and computational cost.
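The prototype-selection step of ASC can be illustrated with a minimal sketch of k-means++ quantization in plain Python. This is not the authors' implementation: the function names (`kmeanspp_seeds`, `select_prototypes`), the fixed iteration count, and the toy data are assumptions made for illustration, and the second step (applying SC to the prototypes) is omitted.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeanspp_seeds(points, k, rng):
    """Choose k initial centers using k-means++ (D^2-weighted) seeding."""
    centers = [rng.choice(points)]
    # d2[i] holds the squared distance from points[i] to its nearest center.
    d2 = [sq_dist(p, centers[0]) for p in points]
    while len(centers) < k:
        if sum(d2) == 0:
            # Every point coincides with a center; fall back to uniform choice.
            nxt = rng.choice(points)
        else:
            # Sample the next center with probability proportional to D^2.
            nxt = rng.choices(points, weights=d2, k=1)[0]
        centers.append(nxt)
        d2 = [min(old, sq_dist(p, nxt)) for p, old in zip(points, d2)]
    return centers


def select_prototypes(points, k, iters=10, seed=0):
    """Quantize `points` into k prototypes: k-means++ seeds + Lloyd updates."""
    rng = random.Random(seed)
    centers = kmeanspp_seeds(points, k, rng)
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            j = min(range(len(centers)), key=lambda i: sq_dist(p, centers[i]))
            groups[j].append(p)
        # Move each center to the mean of its group; keep it if the group is empty.
        centers = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else old
            for g, old in zip(groups, centers)
        ]
    return centers


if __name__ == "__main__":
    # Toy data (assumed): two well-separated groups of 2-D points.
    data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
            (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
    print(select_prototypes(data, k=2))
```

In an ASC pipeline, the returned prototypes would then be clustered with SC, and each original data point would inherit the label of its nearest prototype.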
Keywords :
approximation theory; data mining; pattern clustering; sampling methods; ASC; approximate spectral clustering; computational cost; data representatives; datasets; k-means++; memory requirement; neural gas; nonparametric model; prototypes; quantization methods; sampling methods; Approximation algorithms; Clustering algorithms; Computational modeling; Conferences; Self-organizing feature maps; Signal processing; Writing
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Signal Processing and Communications Applications Conference (SIU), 2014 22nd
Conference_Location :
Trabzon
Type :
conf
DOI :
10.1109/SIU.2014.6830205
Filename :
6830205