• DocumentCode
    1946455
  • Title

    Combining clustering coefficient-based active learning and semi-supervised learning on networked data

  • Author

    He, Xiaoqi ; Liu, Yangguang ; Xu, Bin ; Jin, Xiaogang

  • Author_Institution
    Ningbo Inst. of Technol., Zhejiang Univ., Ningbo, China
  • fYear
    2010
  • fDate
    15-16 Nov. 2010
  • Firstpage
    305
  • Lastpage
    309
  • Abstract
    Active learning and semi-supervised learning are both important techniques for improving a learned model with unlabeled data when labeled data is difficult to obtain but unlabeled data is plentiful and easy to collect. Combining active learning with a semi-supervised learning algorithm based on Gaussian fields and harmonic functions was suggested recently. That work showed that empirical risk minimization (ERM) could effectively find the next instance to label, but at a high computational cost. When the data is graph-structured, graph topological analysis can be leveraged to rapidly select instances that are likely to be good candidates for labeling. This paper describes a novel approach that uses the clustering coefficient metric to identify the best instance to label next. In experiments on the 20 Newsgroups dataset with three binary classification tasks, the clustering coefficient strategy performs similarly to ERM while taking less time.
  • Keywords
    Gaussian processes; graph theory; harmonic analysis; learning (artificial intelligence); pattern classification; pattern clustering; risk analysis; Gaussian field; active learning; binary classification; clustering coefficient; empirical risk minimization; graph topological analysis; harmonic function; semisupervised learning; Accuracy; Harmonic analysis; Learning; Machine learning; Measurement; Risk management; Uncertainty; active learning; clustering coefficient; empirical risk minimization; semi-supervised learning; uncertainty;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Intelligent Systems and Knowledge Engineering (ISKE), 2010 International Conference on
  • Conference_Location
    Hangzhou
  • Print_ISBN
    978-1-4244-6791-4
  • Type

    conf

  • DOI
    10.1109/ISKE.2010.5680858
  • Filename
    5680858
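
The abstract's idea of ranking label candidates by clustering coefficient can be illustrated with a minimal sketch. It uses the standard local clustering coefficient of an undirected graph, 2T / (k(k-1)) for a node with k neighbours and T edges among them. The abstract does not say whether high or low values are preferred, nor how the metric is combined with the harmonic-function classifier, so the `pick_next` rule, its `prefer_high` flag, and the toy graph are assumptions made for illustration only, not the authors' method.

```python
from itertools import combinations


def local_clustering(adj, v):
    """Local clustering coefficient of node v in an undirected graph.

    adj maps each node to the set of its neighbours.
    """
    nbrs = adj[v]
    k = len(nbrs)
    if k < 2:
        # Undefined for degree < 2; treated as 0.0 by convention here.
        return 0.0
    # Count edges that connect two neighbours of v.
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return 2.0 * links / (k * (k - 1))


def pick_next(adj, labeled, prefer_high=True):
    """Hypothetical selection rule: choose the unlabeled node whose
    clustering coefficient is largest (or smallest if prefer_high=False).
    The direction is an assumption; the abstract does not specify it.
    """
    candidates = [v for v in adj if v not in labeled]

    def score(v):
        return local_clustering(adj, v)

    return max(candidates, key=score) if prefer_high else min(candidates, key=score)


# Toy graph (assumed): a triangle 0-1-2 with a tail 2-3-4.
adj = {
    0: {1, 2},
    1: {0, 2},
    2: {0, 1, 3},
    3: {2, 4},
    4: {3},
}

if __name__ == "__main__":
    for node in adj:
        print(node, round(local_clustering(adj, node), 3))
    print("next to label:", pick_next(adj, labeled={0}))
```

The appeal of such a rule is that it needs only the graph topology, so each candidate is scored from its neighbourhood alone rather than by re-estimating the risk for every possible label, which is where the abstract locates ERM's time cost.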