Title :
Semi-supervised k-means clustering for multi-type relational data
Author :
Gao, Ying ; Qi, Hong ; Liu, Da-you ; Liu, He
Author_Institution :
Coll. of Comput. Sci. & Technol., Jilin Univ., Changchun
Abstract :
In many data mining tasks, unlabeled data is plentiful but labeled data is scarce because it is expensive to generate. A number of semi-supervised clustering algorithms have therefore been proposed, but few of them are designed specifically for multi-type relational data. This paper proposes a semi-supervised k-means clustering algorithm for multi-type relational data that combines the semi-supervised k-means method with multi-type relational data clustering. To achieve high performance, the algorithm first analyzes the relationships in the data, which include intra-relationships, inter-relationships, and explicit and implicit relationships. It then extends the k-means clustering algorithm with seeding and new similarity measures that use attribute information, labeled data, and all of these relationships. The experimental results show the effectiveness of the method.
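The seeding idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes plain numeric attribute vectors and Euclidean distance, and it covers only generic seeded k-means (centroids initialized from the labeled points); it is not the authors' algorithm and does not include their relational similarity measures. The names `seeded_kmeans`, `points`, and `seeds` are hypothetical.

```python
from math import dist


def mean(pts):
    """Coordinate-wise mean of a non-empty list of equal-length tuples."""
    n = len(pts)
    return tuple(sum(coord) / n for coord in zip(*pts))


def seeded_kmeans(points, seeds, max_iter=100):
    """Generic seeded k-means (illustrative assumption, not the paper's method).

    points: list of numeric tuples (all data, labeled and unlabeled).
    seeds:  dict mapping a cluster id to a non-empty list of labeled points.
    Returns (centroids, labels), where labels[i] is the cluster id of points[i].
    """
    # Seeding step: each initial centroid is the mean of its labeled points.
    centroids = {c: mean(pts) for c, pts in seeds.items()}
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(centroids, key=lambda c: dist(p, centroids[c])) for p in points
        ]
        # Update step: recompute each centroid; keep the old one if a cluster is empty.
        new_centroids = {}
        for c in centroids:
            members = [p for p, lab in zip(points, labels) if lab == c]
            new_centroids[c] = mean(members) if members else centroids[c]
        if new_centroids == centroids:
            break  # converged: centroids did not change
        centroids = new_centroids
    return centroids, labels
```

In this sketch the labeled points only determine the starting centroids and may later be reassigned like any other point. A relational variant would replace `dist` with a similarity measure that also accounts for links between objects.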
Keywords :
data mining; learning (artificial intelligence); pattern clustering; data clustering; multitype relational data; semisupervised k-means clustering; Algorithm design and analysis; Clustering algorithms; Cybernetics; Educational institutions; Helium; Machine learning; Partitioning algorithms; Pattern analysis; Semisupervised learning; Semi-supervised learning; clustering algorithm; multi-type relational data
Conference_Title :
Machine Learning and Cybernetics, 2008 International Conference on
Conference_Location :
Kunming
Print_ISBN :
978-1-4244-2095-7
Electronic_ISBN :
978-1-4244-2096-4
DOI :
10.1109/ICMLC.2008.4620425