Title :
A Time efficient Pattern Reduction algorithm for k-means based clustering
Author :
Tsai, Chun-Wei ; Yang, Chu-Sing ; Chiang, Ming-Chao
Author_Institution :
Nat. Sun Yat-sen Univ., Kaohsiung
Abstract :
In this paper, we present an efficient algorithm, called pattern reduction (PR) algorithm, to reduce the time required for data clustering based on iterative clustering algorithms. Conceptually similar to a lossy data compression scheme, this algorithm removes at each iteration those data patterns that are close to the centroid of a cluster or remain in the same cluster for a certain number of iterations in a row and are thus unlikely to be moved again from one cluster to another at later iterations by computing a new pattern to represent all the data patterns removed. Our simulation results - from 2 to 1,000 dimensions and 150 to 6,000,000 patterns - indicate that the proposed algorithm can reduce the computation time of k-means, genetic k-means algorithm (GKA) and k-means with genetic algorithm (KGA) from 10% up to about 80% and that for high dimensional data sets, it can even reduce the computation time for more than 70%.
Keywords :
data compression; genetic algorithms; iterative methods; pattern clustering; data clustering; genetic k-means algorithm; iterative clustering algorithms; k-means based clustering; lossy data compression; time efficient pattern reduction algorithm; Algorithm design and analysis; Clustering algorithms; Computational modeling; Data compression; Delay; Euclidean distance; Genetic algorithms; Heuristic algorithms; High performance computing; Iterative algorithms;
Conference_Titel :
Systems, Man and Cybernetics, 2007. ISIC. IEEE International Conference on
Conference_Location :
Montreal, Que.
Print_ISBN :
978-1-4244-0990-7
Electronic_ISBN :
978-1-4244-0991-4
DOI :
10.1109/ICSMC.2007.4413608