• DocumentCode
    1407974
  • Title
    Using discretization and Bayesian inference network learning for automatic filtering profile generation
  • Author
    Lam, Wai ; Low, Kon Fan
  • Author_Institution
    Dept. of Syst. Eng. & Eng. Manage., Chinese Univ. of Hong Kong, Shatin, China
  • Volume
    30
  • Issue
    3
  • fYear
    2000
  • fDate
    8/1/2000 12:00:00 AM
  • Firstpage
    340
  • Lastpage
    351
  • Abstract
    We develop a new approach to text document filtering based on the automatic construction of filtering profiles using Bayesian inference network learning. Bayesian inference networks, based on probability theory, offer a suitable framework for handling the uncertainty inherent in the filtering problem. To learn the networks effectively, we explore three different discretization techniques. Features of high predictive power are automatically obtained from the training document content. Our approach requires no advance knowledge of the subject or content of the documents, or of the information needs expressed as topics. A series of experiments on a set of topics was conducted on two large-scale real-world document corpora. The empirical results demonstrate that our Bayesian inference network learning with advanced discretization achieves better performance than the simple naive Bayesian approach.
  • Keywords
    belief networks; inference mechanisms; information needs; information retrieval; learning (artificial intelligence); probability; uncertainty handling; Bayesian inference network learning; automatic filtering profile generation; discretization; probability theory; text document filtering; Bayesian methods; Databases; Feedback; Filtering theory; Information filtering; Information filters; Large-scale systems; Satellite broadcasting; Training data; Uncertainty
  • fLanguage
    English
  • Journal_Title
    Systems, Man, and Cybernetics, Part C: Applications and Reviews, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1094-6977
  • Type
    jour
  • DOI
    10.1109/5326.885115
  • Filename
    885115