DocumentCode :
3497527
Title :
Online incremental clustering with distance metric learning for high dimensional data
Author :
Okada, Shogo ; Nishida, Toyoaki
Author_Institution :
Dept. of Comput. Intell. & Syst. Sci., Tokyo Inst. of Technol. of Japan, Tokyo, Japan
fYear :
2011
fDate :
July 31 2011-Aug. 5 2011
Firstpage :
2047
Lastpage :
2054
Abstract :
In this paper, we present a novel incremental clustering algorithm that assigns a set of observations to clusters and iteratively learns the distance metric in an incremental manner. The proposed algorithm, SOINN-AML, is built on the Self-organizing Incremental Neural Network (SOINN; Shen et al., 2006), which represents the distribution of unlabeled data and reports a reasonable number of clusters. SOINN applies a competitive Hebbian rule to each input signal and measures the distance between nodes with the Euclidean distance. Such algorithms therefore depend on the distance metric used for the input data patterns. Distance Metric Learning (DML) learns a distance metric for a high-dimensional input space that preserves the distance relations among the training data. Existing SOINN-based approaches do not perform DML on the input space. SOINN-AML learns a metric for the input space using the Adaptive Distance Metric Learning (AML) algorithm, which is one of the DML algorithms. It improves the incremental clustering performance of SOINN by optimizing the distance metric when the input data space is high dimensional. In the experiments, we evaluate performance on two artificial datasets, seven real datasets from the UCI repository, and three real image datasets. The proposed algorithm outperforms conventional algorithms, including SOINN (Shen et al., 2006) and Enhanced SOINN (Shen et al., 2007). Clustering accuracy, measured by normalized mutual information (NMI), improves by 0.03 to 0.13 over state-of-the-art SOINN-based approaches.
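The role of the distance metric in the competitive Hebbian step can be illustrated with a minimal sketch. This is not the authors' SOINN-AML implementation: it omits node insertion, edge aging, noise removal, and the metric learning itself. The names `metric_distance`, `hebbian_step`, and the matrix `learned` are hypothetical and chosen only for illustration. The sketch finds the two nodes nearest to an input signal and connects them with an edge, first under the Euclidean metric and then under an assumed learned metric.

```python
def metric_distance(x, y, M):
    """Squared distance (x - y)^T M (x - y); the identity M gives squared Euclidean."""
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    return sum(d[i] * M[i][j] * d[j] for i in range(n) for j in range(n))


def hebbian_step(nodes, edges, signal, M):
    """Competitive Hebbian rule: link the two nodes closest to the signal."""
    ranked = sorted(range(len(nodes)),
                    key=lambda i: metric_distance(nodes[i], signal, M))
    winner, second = ranked[0], ranked[1]
    edges.add(frozenset((winner, second)))
    return winner, second


nodes = [[0.0, 0.0], [3.0, 0.0], [0.0, 2.0]]
signal = [1.0, 0.9]

identity = [[1.0, 0.0], [0.0, 1.0]]   # plain Euclidean distance
learned = [[0.1, 0.0], [0.0, 1.0]]    # assumed metric that down-weights dimension 0

euclid_edges, learned_edges = set(), set()
euclid_pair = hebbian_step(nodes, euclid_edges, signal, identity)
learned_pair = hebbian_step(nodes, learned_edges, signal, learned)
print(euclid_pair, learned_pair)
```

With these toy values the winner is the same under both metrics, but the second winner differs, so the edge added to the network differs as well. That is the sense in which the clustering result depends on the metric.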
Keywords :
Hebbian learning; neural nets; pattern clustering; self-adjusting systems; Euclidean distance; SOINN-AML; UCI dataset; adaptive distance metric learning; clustering accuracy; competitive Hebbian rule; enhanced SOINN; high dimensional data; online incremental clustering; self-organizing incremental neural network; Algorithm design and analysis; Clustering algorithms; Equations; Feature extraction; Joining processes; Mathematical model; Measurement
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Neural Networks (IJCNN), The 2011 International Joint Conference on
Conference_Location :
San Jose, CA
ISSN :
2161-4393
Print_ISBN :
978-1-4244-9635-8
Type :
conf
DOI :
10.1109/IJCNN.2011.6033478
Filename :
6033478