Title :
Fuzzy clustering of large-scale data sets using Principal Component Analysis
Author :
Arfaoui, Olfa ; Sassi, Minyar
Author_Institution :
Nat. Eng. Sch. of Tunis, Tunis, Tunisia
Abstract :
To exploit large-scale data sets effectively within limited storage space, the data must first be reduced. Several methods exist for this purpose, clustering among them. However, clustering reaches its limits on large-scale data sets. In this paper, we propose to reduce the workspace using Principal Component Analysis (PCA). We work with fuzzy clustering of a data set for which users do not know the optimal number of clusters to generate. We demonstrate the effectiveness of applying PCA as a pre-processing step before any clustering operation.
Keywords :
data compression; data mining; pattern clustering; principal component analysis; fuzzy clustering; large-scale data sets; quote clustering method; storage space; Clustering algorithms; Correlation; Covariance matrix; Eigenvalues and eigenfunctions; Equations; Mathematical model; Clustering; Fuzzy Logic; Sampling
Conference_Title :
Fuzzy Systems (FUZZ), 2011 IEEE International Conference on
Conference_Location :
Taipei
Print_ISBN :
978-1-4244-7315-1
Electronic_ISBN :
1098-7584
DOI :
10.1109/FUZZY.2011.6007435