Title :
Comparison of self-organizing map with K-means hierarchical clustering for bioinformatics applications
Author :
Shahapurkar, Somnath S. ; Sundareshan, Malur K.
Author_Institution :
Sort Test Technol. Dev., Intel Corp., Chandler, AZ, USA
Abstract :
The self-organizing map (SOM) has emerged as one of the popular choices for clustering data; however, it leaves much to be desired in the point-density accuracy of its codebooks and in the reliability and interpretability of the map. In this paper, we compare the newly developed K-means hierarchical (KMH) clustering algorithm with the SOM. We also introduce a new initialization scheme for K-means that improves codebook placement, and propose a novel visualization scheme that combines principal component analysis (PCA) and the minimal spanning tree (MST) in an arrangement that, unlike the SOM, ensures the reliability of the visualization. A practical application of the algorithm is demonstrated on a challenging bioinformatics problem.
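The abstract does not say how PCA and the MST are arranged, so the following is only an illustrative sketch of one plausible combination and not the authors' method. It assumes the MST is computed over Euclidean distances between codebook vectors in the original space, and that PCA supplies 2-D coordinates for drawing the nodes. The function names `pca_project` and `minimal_spanning_tree` and the random codebooks are hypothetical.

```python
import numpy as np


def pca_project(points, n_components=2):
    """Project points onto their top principal components via SVD."""
    centered = points - points.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


def minimal_spanning_tree(points):
    """Return MST edges (i, j) over Euclidean distances (Prim's algorithm)."""
    n = len(points)
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    in_tree = np.zeros(n, dtype=bool)
    in_tree[0] = True
    best_dist = dist[0].copy()          # cheapest known link into the tree
    best_from = np.zeros(n, dtype=int)  # tree node providing that link
    edges = []
    for _ in range(n - 1):
        # Pick the closest node that is not yet in the tree.
        candidates = np.where(in_tree, np.inf, best_dist)
        j = int(np.argmin(candidates))
        edges.append((int(best_from[j]), j))
        in_tree[j] = True
        # Update the cheapest links using the newly added node.
        closer = dist[j] < best_dist
        best_dist = np.where(closer, dist[j], best_dist)
        best_from = np.where(closer, j, best_from)
    return edges


# Hypothetical codebook vectors (assumption: 6 codebooks in 4 dimensions).
rng = np.random.default_rng(0)
codebooks = rng.normal(size=(6, 4))

coords_2d = pca_project(codebooks)          # where to draw each codebook
edges = minimal_spanning_tree(codebooks)    # which codebooks to connect
```

Because the edges in this sketch come from distances in the original space, a connection drawn between two codebooks reflects true proximity even when the 2-D projection distorts it.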
Keywords :
biology; data visualisation; pattern clustering; principal component analysis; self-organising feature maps; trees (mathematics); K-means hierarchical clustering algorithm; PCA; bioinformatics applications; codebook placement; initialization scheme; interpretability; minimal spanning tree; reliability; self organizing map; visualization scheme; Automatic testing; Bioinformatics; Clustering algorithms; Clustering methods; Data analysis; Data visualization; Fungi; Organizing; Pattern analysis
Conference_Title :
Neural Networks, 2004. Proceedings. 2004 IEEE International Joint Conference on
Print_ISBN :
0-7803-8359-1
DOI :
10.1109/IJCNN.2004.1380117