DocumentCode :
1603531
Title :
Document clustering around weighted-medoids
Author :
Mei, Jian-Ping ; Chen, Lihui
Author_Institution :
Sch. of Electr. & Electron. Eng., Nanyang Technol. Univ., Singapore, Singapore
fYear :
2011
Firstpage :
1
Lastpage :
5
Abstract :
In this paper, we propose a new similarity-based k-partitions clustering approach, called CAWP. Given the pairwise similarities of the objects in a dataset, CAWP groups these objects into K non-overlapping clusters. Each cluster is represented by multiple objects with different weights, called prototype weights. The more representative an object is of a cluster, the larger the prototype weight assigned to that object in that cluster. Compared with the traditional k-medoids approach, where each cluster is represented by a single medoid or representative object, using prototype weights so that multiple objects together describe a cluster is more appropriate in our view. Experimental studies on large document datasets show that CAWP compares favorably with other existing similarity-based clustering approaches, achieving both good effectiveness and efficiency.
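The abstract does not state CAWP's actual update rules, so the following is only an illustrative sketch of the general idea it describes: clusters represented by several weighted objects rather than a single medoid. The function name, the seeding step, and both update rules (weights proportional to within-cluster similarity, assignment by weighted similarity) are assumptions made for illustration and are not taken from the paper.

```python
def weighted_prototype_clustering(sim, k, n_iter=20):
    """Illustrative stand-in, NOT the CAWP algorithm from the paper.

    sim is an n x n list of pairwise similarities; k is the number of clusters
    (assumed k <= n). Returns (labels, weights), where weights[c][j] is the
    prototype weight of object j in cluster c.
    """
    n = len(sim)

    # Assumed seeding: start from object 0, then repeatedly add the object
    # that is least similar to the seeds chosen so far.
    seeds = [0]
    while len(seeds) < k:
        candidates = [i for i in range(n) if i not in seeds]
        seeds.append(min(candidates, key=lambda i: max(sim[i][s] for s in seeds)))

    # Initial partition: each object joins its most similar seed.
    labels = [max(range(k), key=lambda c: sim[i][seeds[c]]) for i in range(n)]

    weights = []
    for _ in range(n_iter):
        # Assumed weight update: a member's weight is its total similarity to
        # the cluster's members, normalised so each cluster's weights sum to 1.
        weights = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            raw = [sum(sim[j][i] for i in members) if labels[j] == c else 0.0
                   for j in range(n)]
            total = sum(raw)
            weights.append([r / total if total > 0 else 0.0 for r in raw])

        # Assumed assignment: each object goes to the cluster whose weighted
        # prototypes it is most similar to.
        new_labels = [
            max(range(k), key=lambda c: sum(weights[c][j] * sim[i][j] for j in range(n)))
            for i in range(n)
        ]
        if new_labels == labels:
            break
        labels = new_labels

    return labels, weights


# Toy similarity matrix: objects 0-2 form one group (object 1 is the most
# central), objects 3-4 form another.
toy_sim = [
    [1.0, 0.8, 0.4, 0.1, 0.1],
    [0.8, 1.0, 0.8, 0.1, 0.1],
    [0.4, 0.8, 1.0, 0.1, 0.1],
    [0.1, 0.1, 0.1, 1.0, 0.9],
    [0.1, 0.1, 0.1, 0.9, 1.0],
]
toy_labels, toy_weights = weighted_prototype_clustering(toy_sim, k=2)
```

In this toy run the most central object of the first group receives the largest weight in its cluster, which mirrors the abstract's statement that more representative objects get larger prototype weights.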
Keywords :
document handling; pattern clustering; CAWP; document clustering; document datasets; prototype weight; representative object; similarity-based k-partitions clustering; traditional k-medoids approach; weighted medoids; Approximation algorithms; Clustering algorithms; Data mining; Kernel; Prototypes; Vectors; Wireless application protocol;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
2011 8th International Conference on Information, Communications and Signal Processing (ICICS)
Conference_Location :
Singapore
Print_ISBN :
978-1-4577-0029-3
Type :
conf
DOI :
10.1109/ICICS.2011.6173606
Filename :
6173606