Title :
A Modified K-means Algorithm for Sequence Clustering
Author :
Hsu, Jia-Lien ; Yang, Hong-Xiang
Author_Institution :
Dept. of Comput. Sci. & Inf. Eng., Fu Jen Catholic Univ., Taipei, Taiwan
Abstract :
In this paper, we extend our research to construct a system that provides clustering services in addition to user-active search. We use DCT mapping to extract features from sequences, and discuss two notions of sequence similarity: whole similarity and partial similarity. These two notions are applied when clustering equal-length and variable-length sequences, respectively. In the equal-length case, we map each sequence to an f-dimensional point in the feature space, and then cluster the sequences by applying hierarchical clustering and partitional clustering (i.e., K-means). In the variable-length case, we cut each sequence into subsequences with a sliding window, and map the subsequences to f-dimensional points. We propose a Modified K-means (MK) algorithm to handle partial similarity of subsequences. Finally, we implement our methods and perform experiments to show the efficiency and effectiveness of our approach.
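The feature-extraction step described in the abstract can be illustrated with a minimal sketch. It assumes an orthonormal DCT-II and keeps the first f coefficients as the feature point; the abstract does not state which DCT variant, window width, or step the authors use, so the function names and the parameters `w`, `f`, and `step` are hypothetical. The Modified K-means (MK) algorithm itself is not specified in the abstract and is not reproduced here.

```python
import math


def dct_features(seq, f):
    """Return the first f orthonormal DCT-II coefficients of seq.

    Assumption: the abstract only says "DCT mapping"; DCT-II with
    orthonormal scaling is one common choice, not necessarily the authors'.
    """
    n = len(seq)
    feats = []
    for k in range(f):
        s = sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                for i, x in enumerate(seq))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        feats.append(scale * s)
    return feats


def sliding_windows(seq, w, step=1):
    """Cut seq into overlapping subsequences of width w (hypothetical parameters)."""
    return [seq[i:i + w] for i in range(0, len(seq) - w + 1, step)]


def subsequence_points(seq, w, f, step=1):
    """Map every window of a variable-length sequence to an f-dimensional point."""
    return [dct_features(win, f) for win in sliding_windows(seq, w, step)]
```

For an equal-length collection, `dct_features(seq, f)` alone would give one point per sequence; for variable-length sequences, `subsequence_points(seq, w, f)` gives one point per window, and those points would then be passed to a clustering step.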
Keywords :
discrete cosine transforms; pattern clustering; discrete cosine transform; feature extraction; hierarchical clustering; modified k-mean algorithm; sequence clustering; user-active search; Clustering algorithms; Computer science; Data mining; Discrete Fourier transforms; Discrete cosine transforms; Feature extraction; Hybrid intelligent systems; Indexing; Multimedia databases; Partitioning algorithms; Clustering; K-means; Sequences;
Conference_Title :
Hybrid Intelligent Systems, 2009. HIS '09. Ninth International Conference on
Conference_Location :
Shenyang
Print_ISBN :
978-0-7695-3745-0
DOI :
10.1109/HIS.2009.64