DocumentCode :
3565632
Title :
Comparison of distance measures for clustering data with mix attribute types for Indonesian potential-based regional grouping
Author :
Prasetyo, Hermawan ; Purwarianti, Ayu
Author_Institution :
Sch. of Electr. Eng. & Inf., Inst. Teknol. Bandung, Bandung, Indonesia
fYear :
2014
Firstpage :
13
Lastpage :
18
Abstract :
Every region in Indonesia has different potentials, which need to be analyzed for national development considerations. This analysis can be accomplished by clustering Indonesian regional potential data collected from the PODES enumeration. The data consist of both numeric and categorical attributes. However, most clustering algorithms can be applied only to either numeric or categorical data. The k-prototypes algorithm, a clustering algorithm that can deal with mixed data types, has limitations, such as its distance measurement. Selecting distance measures properly is thus important to increase its performance. This paper presents a comparison of distance measures for clustering data with mixed attribute types. We applied the k-prototypes algorithm with several distance measures to the PODES11-DESA dataset and used the Silhouette index for clustering evaluation. The results show that the best clustering is achieved by applying the Ratio on Mismatches distance for categorical attributes. For numeric attributes, no single distance measure performs best, since the performance of numeric distance measures varies across treatments.
Keywords :
distance measurement; pattern clustering; regional planning; Indonesian potential-based regional grouping; Indonesian regional potential data; PODES enumeration; PODES11-DESA dataset; categorical attribute; clustering algorithm; clustering evaluation; k-prototypes algorithm; mix attribute type data clustering; mix data type; national development consideration; numeric attribute; numeric distance measure; silhouette index; Algorithm design and analysis; Chebyshev approximation; Clustering algorithms; Educational institutions; Indexes; Prototypes; Sociology; clustering mix attribute types; distance measures; regional potentials
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Information Technology Systems and Innovation (ICITSI), 2014 International Conference on
Type :
conf
DOI :
10.1109/ICITSI.2014.7048230
Filename :
7048230