Regional Information Center for Science and Technology - A hubness based sampling approach for PAM algorithm

DocumentCode :

736538

Title :

A hubness based sampling approach for PAM algorithm

Author :

Zhenfeng, He

Author_Institution :

College of Mathematics and Computer Science, Fuzhou University, Fuzhou 350108, P.R. China

fYear :

2015

fDate :

28-30 July 2015

Firstpage :

4962

Lastpage :

4967

Abstract :

A hub is a data instance that appears frequently in other instances' nearest neighbour lists. Hubness, the emergence of hubs, is an important property of high-dimensional datasets. An instance with a large hubness score is usually close to the centre of a cluster, whereas one with a small score is often an outlier or a boundary instance. In this paper, a hubness score based sampling approach is proposed for the PAM algorithm. It selects some of the high hubness score instances to reduce redundancy and, at the same time, guarantees that every instance from the original dataset has some of its K nearest neighbours selected. Experimental results on six UCI datasets and two synthetic datasets suggest that, when K is set to 10, the approach removes more than 80% of the instances and increases clustering accuracy.
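
The idea in the abstract can be illustrated with a minimal sketch. It is not the paper's procedure, which the abstract does not spell out. It assumes Euclidean distance, takes the hubness score of an instance to be the number of K-nearest-neighbour lists it appears in, and uses a simple greedy rule: visit instances in decreasing score order and keep one whenever it is a neighbour of an instance that has no selected neighbour yet. The function names (knn_lists, hubness_scores, hubness_sample) are hypothetical, and the sketch only guarantees at least one selected neighbour per instance.

```python
import math
from collections import Counter


def knn_lists(points, k):
    """Return, for each point, the indices of its k nearest neighbours.

    Assumes Euclidean distance; ties are broken by index so the result
    is deterministic. The point itself is excluded from its own list.
    """
    n = len(points)
    lists = []
    for i in range(n):
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: (math.dist(points[i], points[j]), j),
        )
        lists.append(others[:k])
    return lists


def hubness_scores(lists, n):
    """Hubness score: how many neighbour lists each instance appears in."""
    counts = Counter(j for neighbours in lists for j in neighbours)
    return [counts[j] for j in range(n)]


def hubness_sample(points, k):
    """Greedy sample of high-score instances (illustrative assumption).

    Every instance ends up with at least one of its k nearest
    neighbours in the returned sample.
    """
    n = len(points)
    lists = knn_lists(points, k)
    scores = hubness_scores(lists, n)

    # reverse[j] holds the instances that have j in their neighbour list.
    reverse = [[] for _ in range(n)]
    for i, neighbours in enumerate(lists):
        for j in neighbours:
            reverse[j].append(i)

    covered = [False] * n
    selected = []
    # Visit candidates from the highest hubness score downwards.
    for j in sorted(range(n), key=lambda j: (-scores[j], j)):
        if any(not covered[i] for i in reverse[j]):
            selected.append(j)
            for i in reverse[j]:
                covered[i] = True
    return selected, scores, lists
```

Under these assumptions, a clustering algorithm such as PAM would then be run on the points indexed by selected instead of on the full dataset. How much the dataset shrinks depends on the data and on K.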

Keywords :

Accuracy; Algorithm design and analysis; Amplitude shift keying; Big data; Clustering algorithms; Partitioning algorithms; Training; K-Medoids clustering; high-dimensional data; hubness; sampling;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

2015 34th Chinese Control Conference (CCC)

Conference_Location :

Hangzhou, China

Type :

conf

DOI :

10.1109/ChiCC.2015.7260411

Filename :

7260411

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=736538