DocumentCode :
1713924
Title :
Empirical study of soft clustering approaches for large data
Author :
Yangtao Wang ; Lihui Chen ; CheeKeong Chan
Author_Institution :
Sch. of Electr. & Electron. Eng., Nanyang Technol. Univ., Singapore, Singapore
fYear :
2013
Firstpage :
1
Lastpage :
5
Abstract :
Mining valuable information and knowledge from today's prevalent large data sets is crucial for many parties seeking a competitive advantage. Clustering is an important data analysis technique for finding the natural distribution of unlabelled data. Clustering algorithms that need to store the entire data set in memory for analysis become infeasible when the data set is too large to be stored. To solve this problem, several approaches based on different strategies have been developed. In this paper, we study five soft clustering algorithms for large data that use a chunk-style strategy and evaluate their performance on different kinds of data sets. Each approach handles large data by processing it chunk by chunk instead of handling the entire data set at once. Experimental results showing the effectiveness of these approaches are presented and discussed. Recommendations for the usage of these approaches are also given.
Keywords :
Big Data; data analysis; data mining; pattern clustering; chunk style strategy; data chunk processing; large data; soft clustering approaches; unlabelled data; valuable information mining; Algorithm design and analysis; Approximation methods; Clustering algorithms; Kernel; Linear programming
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Information, Communications and Signal Processing (ICICS) 2013 9th International Conference on
Conference_Location :
Tainan
Print_ISBN :
978-1-4799-0433-4
Type :
conf
DOI :
10.1109/ICICS.2013.6782909
Filename :
6782909