DocumentCode :
128708
Title :
Clustering based semantic data summarization technique: A new approach
Author :
Ahmed, Mariwan ; Mahmood, Abdun Naser
Author_Institution :
Sch. of Eng. & Inf. Technol., Univ. of New South Wales, Canberra, ACT, Australia
fYear :
2014
fDate :
9-11 June 2014
Firstpage :
1780
Lastpage :
1785
Abstract :
Advances in computing and the proliferation of data repositories call for efficient data mining techniques to extract meaningful information. Summarization is one such data analysis technique, and its methods fall broadly into two categories: semantic and syntactic. Syntactic methods treat a dataset as a sequence of bytes, whereas semantic methods convert a large dataset into a much smaller one while keeping information loss low. Clustering algorithms, such as basic k-means, are widely used for semantic summarization. Existing clustering-based summarization techniques assume that a summary is represented by the cluster centroids. However, the centroids might not correspond to actual data points, so the summary may contain points that never occur in the data. In addition, many clustering algorithms, including the popular k-means algorithm, require the number of clusters as an input, which is not available for unsupervised summarization of unlabeled data. To address these issues, we propose a clustering-based semantic summarization technique that combines the x-means and k-medoid clustering algorithms. Our experimental analysis shows that the proposed algorithm outperforms k-means-based summarization techniques.
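The abstract's point about centroids versus actual data points can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a single, already formed cluster of 2-D points and Euclidean distance, and the helper names `centroid` and `medoid` are hypothetical. The x-means step that chooses the number of clusters is not shown.

```python
from math import dist  # Euclidean distance, Python 3.8+

def centroid(points):
    """Coordinate-wise mean of the cluster (what k-means reports)."""
    n = len(points)
    dims = len(points[0])
    return tuple(sum(p[i] for p in points) / n for i in range(dims))

def medoid(points):
    """The member of the cluster with the smallest total distance to
    all other members (what a k-medoid method reports)."""
    return min(points, key=lambda p: sum(dist(p, q) for q in points))

# Illustrative cluster (made-up values, not data from the paper).
cluster = [(1.0, 1.0), (2.0, 1.0), (1.0, 3.0), (6.0, 6.0)]

c = centroid(cluster)  # a synthetic point, not a member of the cluster
m = medoid(cluster)    # always one of the original points

print("centroid:", c, "in data:", c in cluster)
print("medoid:  ", m, "in data:", m in cluster)
```

Under these assumptions the centroid is a synthetic point that does not occur in the data, while the medoid is always a real record, which is why a medoid-based summary consists only of actual data points.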
Keywords :
data analysis; data mining; information retrieval; pattern clustering; bytes sequence; cluster centroids; clustering based semantic data summarization technique; data analysis technique; data mining techniques; data repositories; information extraction; information loss; k-means algorithm; k-medoid clustering algorithms; large dataset; semantic methods; syntactic methods; x-means clustering algorithms; Conferences; Decision support systems; Industrial electronics; Clustering; Data Summarization;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Industrial Electronics and Applications (ICIEA), 2014 IEEE 9th Conference on
Conference_Location :
Hangzhou
Print_ISBN :
978-1-4799-4316-6
Type :
conf
DOI :
10.1109/ICIEA.2014.6931456
Filename :
6931456