Title :
Experimental Research on Impacts of Dimensionality on Clustering Algorithms
Author :
Meng, Hai-Dong ; Ma, Jin-Hui ; Xu, Guan-Dong
Author_Institution :
Sch. of Inf. Eng., Inner Mongolia Univ. of Sci. & Technol., Baotou, China
Abstract :
Experiments are carried out on datasets of different dimensionality selected from the UCI datasets, using two classical clustering algorithms. The results indicate that distance-based clustering algorithms are effective when the dimensionality of a real dataset is less than or equal to 30. For high-dimensional datasets (dimensionality greater than 30), the clustering algorithms perform poorly, even when dimension reduction methods such as Principal Component Analysis (PCA) are used.
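One commonly cited reason that distance-based clustering weakens as dimensionality grows is that Euclidean distances concentrate: the nearest and farthest points from a given query end up almost equally far away. The abstract does not state this explanation, and it does not name the two algorithms used, so the following is only a minimal sketch under stated assumptions. It uses uniform random synthetic points rather than UCI data, and the function name `relative_contrast` and its parameters are hypothetical, chosen for illustration.

```python
import math
import random


def relative_contrast(dim, n_points=200, seed=0):
    """Return (d_max - d_min) / d_min for Euclidean distances from one random
    query point to n_points random points in the unit hypercube [0, 1]^dim.

    Assumption: uniform synthetic data, not the UCI datasets of the paper.
    """
    rng = random.Random(seed)  # fixed seed so repeated runs give the same value
    query = [rng.random() for _ in range(dim)]
    distances = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        distances.append(math.dist(query, point))  # Euclidean distance
    d_min, d_max = min(distances), max(distances)
    return (d_max - d_min) / d_min


if __name__ == "__main__":
    # Print the contrast for a few dimensionalities on either side of 30.
    for dim in (2, 10, 30, 100, 300):
        print(f"dim={dim:>3}  relative contrast={relative_contrast(dim):.3f}")
```

On this synthetic data the relative contrast is much smaller in high dimensions than in low ones, meaning that distances carry less information for separating points. The sketch says nothing about where a practical threshold lies. The value of 30 reported in the abstract comes from the authors' experiments on real datasets.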
Keywords :
algorithm theory; data handling; pattern clustering; principal component analysis; UCI dataset; clustering algorithm; dimension reduction method; dimensionality; high-dimensional dataset; Accuracy; Algorithm design and analysis; Classification algorithms; Clustering algorithms; Data mining; Partitioning algorithms; Principal component analysis
Conference_Title :
Computational Intelligence and Software Engineering (CiSE), 2010 International Conference on
Conference_Location :
Wuhan
Print_ISBN :
978-1-4244-5391-7
Electronic_ISBN :
978-1-4244-5392-4
DOI :
10.1109/CISE.2010.5677260