• DocumentCode
    2831841
  • Title
    Improving classification decisions by multiple knowledge
  • Author
    Bi, Yaxin ; McClean, Sally ; Anderson, Terry
  • Author_Institution
    Fac. of Eng., Ulster Univ., Antrim
  • fYear
    2005
  • fDate
    16-16 Nov. 2005
  • Lastpage
    347
  • Abstract
    An important issue in data mining is how to make use of multiple pieces of discovered knowledge to improve future decisions. In this paper, we propose a new approach to combining multiple sets of rules for text categorization using Dempster's rule of combination. We develop a boosting-like technique for generating multiple sets of rules based on rough set theory, and we model classification decisions from multiple sets of rules as pieces of evidence which can be combined by Dempster's rule of combination. We apply these methods to 10 out of the 20-newsgroups, a benchmark data collection, individually and in combination. Our experimental results show that the performance of the best combination of the multiple sets of rules on the 10 groups of the benchmark data is statistically significantly better than that of the best single set of rules. The comparative analysis between the Dempster-Shafer and the majority voting methods, along with an overfitting study, confirms the advantage and the robustness of our approach.
  • Keywords
    data mining; pattern classification; rough set theory; text analysis; Dempster combination rule; model classification decisions; multiple knowledge; text categorization; Boosting; Data engineering; Decision trees; Induction generators; Knowledge engineering; Learning systems; Voting
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Tools with Artificial Intelligence, 2005. ICTAI 05. 17th IEEE International Conference on
  • Conference_Location
    Hong Kong
  • ISSN
    1082-3409
  • Print_ISBN
    0-7695-2488-5
  • Type
    conf
  • DOI
    10.1109/ICTAI.2005.76
  • Filename
    1562958
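
The abstract describes treating the decisions of several rule sets as pieces of evidence and merging them with Dempster's rule of combination. The following is a minimal Python sketch of that rule only, not the authors' implementation: the function name `dempster_combine`, the three category labels, and all mass values are hypothetical and chosen purely for illustration. Each piece of evidence is assumed to be a mass function mapping subsets of the category set to weights that sum to 1.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule of combination.

    Each mass function is a dict mapping a frozenset of categories to a
    mass in [0, 1]; the masses of one function are assumed to sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            # Agreeing evidence: the product supports the intersection.
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            # Disjoint focal elements contribute to the conflict mass.
            conflict += mass_a * mass_b
    if conflict >= 1.0 - 1e-12:
        raise ValueError("evidence is totally conflicting; the rule is undefined")
    # Renormalise by the non-conflicting mass.
    return {subset: mass / (1.0 - conflict) for subset, mass in combined.items()}


# Hypothetical example: two rule sets vote over three made-up categories.
frame = frozenset({"sci", "rec", "talk"})
rule_set_1 = {frozenset({"sci"}): 0.6, frame: 0.4}
rule_set_2 = {frozenset({"sci"}): 0.5, frozenset({"rec"}): 0.3, frame: 0.2}

result = dempster_combine(rule_set_1, rule_set_2)
best = max(result, key=result.get)
print(sorted(best), round(result[best], 4))
```

Mass assigned to the whole category set stands for ignorance, which is one way this scheme differs from majority voting: a rule set that is unsure can withhold support instead of casting a full vote.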