DocumentCode
594922
Title
Transfer heterogeneous unlabeled data for unsupervised clustering
Author
Shu Kong ; Donghui Wang
Author_Institution
Dept. of Comput. Sci. & Technol., Zhejiang Univ., Hangzhou, China
fYear
2012
fDate
11-15 Nov. 2012
Firstpage
1193
Lastpage
1196
Abstract
In this paper, we propose a novel method called THUNTER to transfer heterogeneous unlabeled data from the source domain to the target domain for clustering. If the target data are a set of images, the so-called heterogeneous unlabeled data can be a large set of text data or acoustic data. Our method addresses how to transfer this large amount of heterogeneous data to the relatively smaller target data set for clustering. To the best of our knowledge, it is the first work in the community to transfer unlabeled data, especially unlabeled heterogeneous data, for unsupervised clustering. Furthermore, along with our method, a novel dictionary-based data transfer strategy (DicTrans) is introduced in this paper, which measures the fidelity of transferring the target data to the source domain and automatically decides how many to transfer. Through a series of experiments, the effectiveness of THUNTER and DicTrans is demonstrated with very promising performance.
Keywords
learning (artificial intelligence); pattern clustering; set theory; DicTrans; THUNTER; acoustic data; dictionary-based data transfer strategy; heterogeneous unlabeled data transfer; image set; source domain; target domain; text data; unsupervised clustering; Acoustics; Dictionaries; Face; Image reconstruction; Linear programming; Manifolds; Vectors;
fLanguage
English
Publisher
IEEE
Conference_Titel
Pattern Recognition (ICPR), 2012 21st International Conference on
Conference_Location
Tsukuba
ISSN
1051-4651
Print_ISBN
978-1-4673-2216-4
Type
conf
Filename
6460351
Link To Document