• DocumentCode
    594922
  • Title
    Transfer heterogeneous unlabeled data for unsupervised clustering
  • Author
    Shu Kong ; Donghui Wang
  • Author_Institution
    Dept. of Comput. Sci. & Technol., Zhejiang Univ., Hangzhou, China
  • fYear
    2012
  • fDate
    11-15 Nov. 2012
  • Firstpage
    1193
  • Lastpage
    1196
  • Abstract
    In this paper, we propose a novel method called THUNTER that transfers heterogeneous unlabeled data from the source domain to the target domain for clustering. For example, if the target data are a set of images, the heterogeneous unlabeled data can be a large set of text data or acoustic data. Our method addresses how to transfer this large amount of heterogeneous data to the relatively small target data set for clustering. To the best of our knowledge, this is the first work in the community to transfer unlabeled data, especially unlabeled heterogeneous data, for unsupervised clustering. Furthermore, along with our method, we introduce a novel dictionary-based data transfer strategy (DicTrans), which measures the fidelity of transferring the target data to the source domain and automatically decides how many to transfer. A series of experiments demonstrates the effectiveness of THUNTER and DicTrans, with very promising performance.
  • Keywords
    learning (artificial intelligence); pattern clustering; set theory; DicTrans; THUNTER; acoustic data; dictionary-based data transfer strategy; heterogeneous unlabeled data transfer; image set; source domain; target domain; text data; unsupervised clustering; Acoustics; Dictionaries; Face; Image reconstruction; Linear programming; Manifolds; Vectors;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Pattern Recognition (ICPR), 2012 21st International Conference on
  • Conference_Location
    Tsukuba
  • ISSN
    1051-4651
  • Print_ISBN
    978-1-4673-2216-4
  • Type
    conf
  • Filename
    6460351