Regional Information Center for Science and Technology (RICeST) - An improved k-means clustering algorithm based on dissimilarity

DocumentCode :

3028383

Title :

An improved k-means clustering algorithm based on dissimilarity

Author :

Wang Shunye

Author_Institution :

Dept. of Comput. Sci. & Technol., Langfang Teachers Coll., Langfang, China

fYear :

2013

fDate :

20-22 Dec. 2013

Firstpage :

2629

Lastpage :

2633

Abstract :

The k-means clustering algorithm is one of the most widely used clustering algorithms and has been applied in many fields of science and technology. A major problem of the original k-means algorithm is that the clustering results depend on the initial centroids, which are chosen at random. In addition, its distance-based similarity measure is not suitable for big, high-dimensional datasets. Both issues lead to severe degradation in performance. In this paper, an improved k-means clustering algorithm based on dissimilarity is proposed. It selects the initial centroids using a Huffman tree constructed from the dissimilarity matrix. Many experiments confirm that the proposed algorithm is efficient and achieves better clustering accuracy at the same time complexity.
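
The abstract does not spell out how the Huffman tree is built, so the following is only a minimal sketch of one possible reading: repeatedly merge the two least dissimilar nodes, as Huffman coding merges the two smallest weights, until k nodes remain, then use those nodes as initial centroids. The function name `huffman_style_centroids`, the use of Euclidean distance as the dissimilarity, and the size-weighted mean for merged nodes are all assumptions for illustration, not details taken from the paper.

```python
from math import dist  # Euclidean distance, Python 3.8+


def huffman_style_centroids(points, k):
    """Pick k initial centroids by greedily merging the two closest nodes.

    Hypothetical helper. Assumptions: dissimilarity is Euclidean distance,
    and a merged node is the size-weighted mean of the nodes it replaces.
    The paper may define both differently.
    """
    # Each node is (centroid, member_count); start with one node per point.
    nodes = [(tuple(float(x) for x in p), 1) for p in points]
    while len(nodes) > k:
        # Huffman-style step: find the pair with the smallest dissimilarity.
        i, j = min(
            ((a, b) for a in range(len(nodes)) for b in range(a + 1, len(nodes))),
            key=lambda ab: dist(nodes[ab[0]][0], nodes[ab[1]][0]),
        )
        (ci, ni), (cj, nj) = nodes[i], nodes[j]
        merged = tuple((x * ni + y * nj) / (ni + nj) for x, y in zip(ci, cj))
        # Remove j first (j > i) so that index i stays valid.
        nodes.pop(j)
        nodes.pop(i)
        nodes.append((merged, ni + nj))
    return [centroid for centroid, _ in nodes]


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (10, 10), (10, 11)]
    print(huffman_style_centroids(data, 2))
```

The returned points would then be passed to an ordinary k-means loop in place of random starting centroids. This sketch recomputes every pairwise distance on each merge, so it does not reflect the time complexity claimed in the abstract.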

Keywords :

computational complexity; matrix algebra; pattern clustering; Huffman tree; algorithm time complexity; big high-dimensional dataset; cluster results; dissimilarity matrix; initial centroids; k-means clustering algorithm; Accuracy; Algorithm design and analysis; Classification algorithms; Clustering algorithms; Data mining; Iris; Machine learning algorithms; dissimilarity; k-means

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Mechatronic Sciences, Electric Engineering and Computer (MEC), Proceedings 2013 International Conference on

Conference_Location :

Shenyang

Print_ISBN :

978-1-4799-2564-3

Type :

conf

DOI :

10.1109/MEC.2013.6885476

Filename :

6885476

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3028383