• DocumentCode
    1421785
  • Title
    Asking Generalized Queries to Domain Experts to Improve Learning
  • Author
    Du, Jun ; Ling, Charles X.
  • Author_Institution
    Dept. of Comput. Sci., Univ. of Western Ontario, London, ON, Canada
  • Volume
    22
  • Issue
    6
  • fYear
    2010
  • fDate
    6/1/2010 12:00:00 AM
  • Firstpage
    812
  • Lastpage
    825
  • Abstract
    With the assistance of a domain expert, active learning can often select or construct fewer examples to request their labels to build an accurate classifier. However, previous work on active learning can only generate and ask specific queries. In real-world applications, the domain experts (or oracles) are often more readily able to answer "generalized queries" with don't-care attributes. The power of such generalized queries is that one generalized query is often equivalent to many specific ones. However, overly general queries are not good, as answers from the domain experts (or oracles) can be highly uncertain, and this makes learning difficult. In this paper, we propose a novel active learning algorithm that asks good generalized queries. We then extend our algorithm to construct new, hierarchical features for both nominal and numeric attributes. We demonstrate experimentally that our new method asks significantly fewer queries than previous active learning methods, even when the initial labeled data set is very small and the oracle is inaccurate in class probability estimations. Our method can be readily deployed in real-world data mining tasks where obtaining labeled examples is costly.
  • Keywords
    data mining; learning (artificial intelligence); probability; query processing; active learning algorithm; class probability estimations; data mining tasks; domain experts; generalized queries; Active learning; domain expert; generalized query
  • fLanguage
    English
  • Journal_Title
    Knowledge and Data Engineering, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1041-4347
  • Type
    jour
  • DOI
    10.1109/TKDE.2010.33
  • Filename
    5416719