Title :
Strategies for Big Data Clustering
Author :
Kurasova, Olga ; Marcinkevicius, Virginijus ; Medvedev, Viktor ; Rapecka, Aurimas ; Stefanovic, Pavel
Author_Institution :
Inst. of Math. & Inf., Vilnius Univ., Vilnius, Lithuania
Abstract :
This paper presents an overview of methods and technologies used for big data clustering. Clustering is one of the important data mining problems, especially in big data analysis, where large volumes of data must be grouped. Several clustering methods are described, with particular attention paid to the k-means method and its modifications, because k-means remains one of the popular methods and is implemented in innovative technologies for big data analysis. Neural network-based self-organizing maps and their extensions for big data clustering are also reviewed. Some strategies for big data clustering are presented and discussed. The paper shows what volume of data can be clustered in the well-known data mining systems WEKA and KNIME, and when new, more sophisticated technologies are needed.
Keywords :
Big Data; data analysis; data mining; pattern clustering; self-organising feature maps; Big Data analysis; Big Data clustering; k-means method; neural network-based self-organizing maps; Algorithm design and analysis; Clustering algorithms; Clustering methods; Data visualization; Distributed databases; Hadoop
Conference_Title :
Tools with Artificial Intelligence (ICTAI), 2014 IEEE 26th International Conference on
Conference_Location :
Limassol
DOI :
10.1109/ICTAI.2014.115