Title :
A binary space based on modified hamming distance for clustering
Author :
Feng-ning Ma ; Shi-qiang Jiang ; Ji-ting Yang ; Qin-yu Ren
Author_Institution :
Tianjin University, China, 300072
Abstract :
Anything can be seen as an entity represented by multiple properties. This paper defines a binary space for clustering in which, for each entity, we convert the raw data into an attribute string and transform it into a binary string in accordance with a binary tree. The order of the attributes is of no importance. At the same time, because the weight of each attribute is different in this space, we use a Modified Hamming distance (MHD) in place of Euclidean distance to calculate the proximity between two binary strings. In this way, the binary space is closer to reality, simplifies the calculation, and improves computing efficiency. Finally, we take the k-means clustering algorithm as an example and experiment on data covering 74 financial indexes of 1796 listed companies. Because it uses a ‘0’ and ‘1’ representation, this space is simple and efficient for clustering. The results show that this space performs better in processing mass data.
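The abstract does not give the formula for MHD. As a rough illustration only, the sketch below assumes one plausible reading: a weighted Hamming distance in which each bit position carries its own weight, so that a mismatch in a more important attribute costs more. The function name `weighted_hamming` and the example weights are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def weighted_hamming(a: str, b: str, weights: Sequence[float]) -> float:
    """Sum the weights of the positions where two binary strings differ.

    Assumption: this is one possible interpretation of a "modified"
    Hamming distance, not the definition used in the paper.
    """
    if not (len(a) == len(b) == len(weights)):
        raise ValueError("strings and weights must have the same length")
    return sum(w for x, y, w in zip(a, b, weights) if x != y)


# Hypothetical example: earlier bits are treated as more important.
distance = weighted_hamming("1010", "1001", [4, 3, 2, 1])
```

With all weights set to 1, this reduces to the ordinary Hamming distance, which is simply the number of differing bits.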
Keywords :
Hamming Distance; binary space; binary table; binary tree; clustering;
Conference_Title :
Automatic Control and Artificial Intelligence (ACAI 2012), International Conference on
Conference_Location :
Xiamen
Electronic_ISBN :
978-1-84919-537-9
DOI :
10.1049/cp.2012.0908