• DocumentCode
    3316763
  • Title
    Rules selection in word sense disambiguation using Adaboost
  • Author
    Qin, Ying; Wang, Xiaojie
  • Author_Institution
    Sch. of Inf. Eng., Beijing Univ. of Posts & Telecommun., China
  • fYear
    2005
  • fDate
    30 Oct.-1 Nov. 2005
  • Firstpage
    26
  • Lastpage
    29
  • Abstract
    Boosting is a promising and practical machine learning method that has been applied successfully to a number of classification problems, and word sense disambiguation systems built on boosting have achieved state-of-the-art performance. This paper examines the basic but unavoidable problem of rule selection when Adaboost is applied to a word sense disambiguation system, and presents the relations among rule selection, the number of iterations, and system performance on sparse data. The results show that increasing the number of iterations of Adaboost trained on a small, noise-free set of examples is neither helpful nor harmful. The algorithm is sensitive to the selection of weak rules in two respects. On the one hand, some rules make the training error converge more quickly and also generalize better. On the other hand, weak rules built on different features may conflict with one another, which causes trouble for the whole system.
  • Keywords
    computational linguistics; learning (artificial intelligence); natural languages; word processing; Adaboost; iteration number; machine learning; rules selection; sparse data; word sense disambiguation system; Boosting; Classification algorithms; Face detection; Face recognition; Handwriting recognition; Learning systems; Machine learning algorithms; Supervised learning; Tagging
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Natural Language Processing and Knowledge Engineering, 2005. IEEE NLP-KE '05. Proceedings of 2005 IEEE International Conference on
  • Print_ISBN
    0-7803-9361-9
  • Type
    conf
  • DOI
    10.1109/NLPKE.2005.1598701
  • Filename
    1598701