Title :
Redundancy reduction in self-organising map merging for scalable data clustering
Author :
Ganegedara, Hiran ; Alahakoon, Damminda
Author_Institution :
Clayton Sch. of IT, Monash Univ., Clayton, VIC, Australia
Abstract :
Self-organising maps are widely used for exploratory data analysis. A key problem with self-organising maps is the high processing power required for large-scale data clustering. Although a number of serial approaches have been developed to reduce the time requirement, algorithms that can utilise distributed computing outperform serial algorithms when processing large datasets. An effective distributed approach is to divide the dataset into partitions, train a self-organising map on each partition, and merge the maps to form a single map representing the whole dataset. The recently proposed Parallel GSOM algorithm has demonstrated that parallel computation can significantly reduce training time for self-organising maps. However, if the actual clusters in the dataset are distributed across several partitions, the individual trained maps can contain redundant neurons. This redundancy increases the time required by the merging process. Reducing the number of redundant neurons therefore shortens the merging process and improves the efficiency of the whole data clustering process. In this paper, we propose a redundant neuron reduction algorithm for self-organising maps that improves the efficiency of the merging process. We demonstrate that the proposed algorithm is faster than the Parallel GSOM algorithm.
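The redundancy problem described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm, which the abstract does not specify. It assumes each trained map is just a list of neuron weight vectors, and it uses a hypothetical distance `threshold` to decide when two neurons from different maps represent the same region of the data. The function name `merge_without_redundancy` is likewise an illustrative assumption.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length weight vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def merge_without_redundancy(maps, threshold):
    """Concatenate neurons from several maps, skipping near-duplicates.

    A neuron is kept only if it lies farther than `threshold` from every
    neuron already kept. This greedy rule is an assumption made for
    illustration, not the method proposed in the paper.
    """
    kept = []
    for neurons in maps:
        for w in neurons:
            if all(euclidean(w, k) > threshold for k in kept):
                kept.append(w)
    return kept


# Two hypothetical maps trained on different partitions. Both contain a
# neuron near (0, 0) because that cluster was split across partitions.
map_a = [(0.0, 0.0), (5.0, 5.0)]
map_b = [(0.1, 0.0), (9.0, 1.0)]

merged = merge_without_redundancy([map_a, map_b], threshold=0.5)
print(merged)  # [(0.0, 0.0), (5.0, 5.0), (9.0, 1.0)]
```

In this toy case the neuron at (0.1, 0.0) is dropped as redundant, so the merge step has three neurons to process instead of four.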
Keywords :
merging; parallel algorithms; pattern clustering; self-organising feature maps; dataset partition; distributed approach; distributed computing; exploratory data analysis; high processing power requirement; large dataset processing; large scale data clustering; map merging; merging process; parallel GSOM algorithm; parallel computation; redundancy reduction; redundant neuron reduction algorithm; scalable data clustering; self-organising map; time requirement; Algorithm design and analysis; Clustering algorithms; Merging; Neurons; Partitioning algorithms; Redundancy; Vectors; Growing self-organising maps
Conference_Title :
Neural Networks (IJCNN), The 2012 International Joint Conference on
Conference_Location :
Brisbane, QLD
Print_ISBN :
978-1-4673-1488-6
Electronic_ISBN :
2161-4393
DOI :
10.1109/IJCNN.2012.6252722