DocumentCode
128708
Title
Clustering based semantic data summarization technique: A new approach
Author
Ahmed, Mariwan ; Mahmood, Abdun Naser
Author_Institution
Sch. of Eng. & Inf. Technol., Univ. of New South Wales, Canberra, ACT, Australia
fYear
2014
fDate
9-11 June 2014
Firstpage
1780
Lastpage
1785
Abstract
Advances in computing and the proliferation of data repositories call for efficient data mining techniques to extract meaningful information. Summarization is one such data analysis technique; its methods can be broadly classified as semantic or syntactic. Syntactic methods treat a dataset as a sequence of bytes, whereas semantic methods convert a large dataset into a much smaller one while keeping information loss low. Clustering algorithms, such as basic k-means, are widely used for semantic summarization. Existing clustering-based summarization techniques assume that a summary is represented by the cluster centroids. However, the centroids might not correspond to actual data points. In addition, many clustering algorithms, including the popular k-means algorithm, require the number of clusters as an input, which is not available for unsupervised summarization of unlabeled data. To address these issues, we propose a clustering-based semantic summarization technique that combines the x-means and k-medoid clustering algorithms. Our experimental analysis shows that the proposed algorithm outperforms k-means-based summarization techniques.
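The contrast the abstract draws between centroids and medoids can be illustrated with a minimal sketch. This is not the authors' algorithm: the function names (`dist`, `centroid`, `medoid`, `kmeans`, `summarize`) and the toy data are hypothetical, a plain k-means pass stands in for the grouping step, and the number of clusters is passed by hand rather than estimated with x-means as the abstract proposes.

```python
import math

def dist(a, b):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def centroid(points):
    """Coordinate-wise mean; usually not an actual data point."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def medoid(points):
    """The member with the smallest total distance to all members."""
    return min(points, key=lambda c: sum(dist(c, p) for p in points))

def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means, seeded with the first k points (deterministic).

    Assumption: stands in for the clustering step; k is given, not estimated.
    """
    centers = [tuple(p) for p in points[:k]]
    clusters = []
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: dist(p, centers[j]))
            clusters[nearest].append(p)
        # Keep the old center if a cluster ends up empty.
        centers = [centroid(c) if c else centers[j]
                   for j, c in enumerate(clusters)]
    return clusters

def summarize(points, k):
    """Return one medoid per non-empty cluster as the summary."""
    return [medoid(c) for c in kmeans(points, k) if c]

# Hypothetical toy data: two well-separated groups of three points.
data = [(1.0, 1.0), (9.0, 9.0), (1.0, 2.0), (2.0, 1.0), (9.0, 8.0), (8.0, 9.0)]
summary = summarize(data, k=2)
first_cluster_centroid = centroid([(1.0, 1.0), (1.0, 2.0), (2.0, 1.0)])
```

Every element of `summary` is a record taken from `data`, whereas `first_cluster_centroid` is an averaged point that does not occur in the dataset, which is the limitation of centroid-based summaries that the abstract points out.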
Keywords
data analysis; data mining; information retrieval; pattern clustering; bytes sequence; cluster centroids; clustering based semantic data summarization technique; data analysis technique; data mining techniques; data repositories; information extraction; information loss; k-means algorithm; k-medoid clustering algorithms; large dataset; semantic methods; syntactic methods; x-means clustering algorithms; Conferences; Decision support systems; Industrial electronics; Clustering; Data Summarization;
fLanguage
English
Publisher
IEEE
Conference_Titel
Industrial Electronics and Applications (ICIEA), 2014 IEEE 9th Conference on
Conference_Location
Hangzhou
Print_ISBN
978-1-4799-4316-6
Type
conf
DOI
10.1109/ICIEA.2014.6931456
Filename
6931456