• DocumentCode
    534911
  • Title

    Hierarchical exploration based active learning with support vector machine

  • Author

    Yang, Yanping ; Song, Enmin ; Ma, Guangzhi

  • Author_Institution
    Sch. of Comput. Sci. & Technol., Huazhong Univ. of Sci. & Technol., Wuhan, China
  • Volume
    1
  • fYear
    2010
  • fDate
    13-14 Sept. 2010
  • Firstpage
    299
  • Lastpage
    302
  • Abstract
    The goal of active learning is to minimize the amount of labeled data required for machine learning. Some methods focus on exploiting samples with high uncertainty, but they fail to obtain a representative set of the data samples. Other methods explore representative samples by using the prior distribution of the dataset; however, they are often computationally expensive and need a large amount of labeled data for initialization. In this paper we develop a hierarchical exploration based active learning algorithm that takes into account both the distribution of the dataset and the decision boundary of the current hypothesis. Our method uses the support vector machine (SVM) as the classifier. A hierarchical clustering algorithm discovers the dataset's structure step by step in a top-down manner. In each step of hierarchical structure discovery, representative samples are queried for labels to check the purity of the corresponding cluster, and clusters with low purity are divided further. After a draft SVM model is built from those representative samples, uncertain samples near the decision boundary are additionally labeled if doing so helps reduce the entropy of the classifier. To show its effectiveness, the proposed method is compared with five state-of-the-art algorithms on six datasets from UCI. It shows the best performance in the comparison.
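    The top-down exploration described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the 2-means split stands in for the paper's hierarchical clustering, the choice of representatives (the point nearest each tentative child centroid) and the purity check (all representatives share one label) are assumptions, and the SVM training and entropy criterion are omitted. The names `explore`, `two_means`, `most_uncertain`, `oracle`, and `decision` are hypothetical; `decision` stands in for a trained SVM's decision function.

```python
def centroid(points):
    """Component-wise mean of a non-empty list of points."""
    return [sum(p[d] for p in points) / len(points) for d in range(len(points[0]))]

def dist2(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def two_means(idx, X, iters=10):
    """Split the cluster `idx` in two with a basic 2-means (assumed stand-in
    for the hierarchical clustering step). `right` may come back empty."""
    c0 = X[idx[0]]
    c1 = X[max(idx, key=lambda i: dist2(X[i], c0))]  # seed with the farthest point
    left, right = list(idx), []
    for _ in range(iters):
        left = [i for i in idx if dist2(X[i], c0) <= dist2(X[i], c1)]
        right = [i for i in idx if dist2(X[i], c0) > dist2(X[i], c1)]
        if not right:
            break
        c0 = centroid([X[i] for i in left])
        c1 = centroid([X[i] for i in right])
    return left, right

def explore(X, oracle):
    """Top-down exploration: query one representative per tentative child
    cluster and keep splitting only clusters whose representatives disagree."""
    labeled = {}
    stack = [list(range(len(X)))]
    while stack:
        idx = stack.pop()
        parts = [p for p in two_means(idx, X) if p]
        reps = []
        for part in parts:
            c = centroid([X[i] for i in part])
            reps.append(min(part, key=lambda i: dist2(X[i], c)))
        for i in reps:
            if i not in labeled:          # each sample is queried at most once
                labeled[i] = oracle(i)
        if len({labeled[i] for i in reps}) > 1:   # impure cluster: divide further
            stack.extend(parts)
    return labeled

def most_uncertain(X, labeled, decision, k=2):
    """Pick the k unlabeled samples closest to the boundary decision(x) = 0."""
    pool = [i for i in range(len(X)) if i not in labeled]
    return sorted(pool, key=lambda i: abs(decision(X[i])))[:k]

# Toy data: two well-separated groups with labels 0 and 1.
X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [5.0, 5.0], [5.1, 5.2], [5.2, 5.1]]
y = [0, 0, 0, 1, 1, 1]
labeled = explore(X, lambda i: y[i])
```

    In a fuller version, the labels returned by `explore` would train the draft SVM, and `most_uncertain` would then propose boundary samples to label next.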
  • Keywords
    data structures; learning (artificial intelligence); pattern classification; support vector machines; uncertainty handling; active learning; classifier; data representation; dataset's structure; hierarchical clustering algorithm; hierarchical exploration; labeled data; machine learning; support vector machine; uncertainty; Educational institutions; active learning; hierarchical clustering; support vector machine
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computational Intelligence and Natural Computing Proceedings (CINC), 2010 Second International Conference on
  • Conference_Location
    Wuhan
  • Print_ISBN
    978-1-4244-7705-0
  • Type

    conf

  • DOI
    10.1109/CINC.2010.5643833
  • Filename
    5643833