• DocumentCode
    2895112
  • Title
    FAQ Extracting and Domain Filtering Based on Improved Bayes
  • Author
    Yu, Zhengtao; Zong, Huanyun; Xu, Yangbo; Guo, Jianyi; Mao, Yu; Meng, Xiangyan
  • Author_Institution
    Sch. of Inf. Eng. & Autom., Kunming Univ. of Sci. & Technol., Kunming, China
  • fYear
    2009
  • fDate
    7-8 Nov. 2009
  • Firstpage
    108
  • Lastpage
    112
  • Abstract
    FAQs (frequently asked questions) are the basis of question answering (QA) systems built on frequently-asked-questions databases. Because FAQs are difficult to collect and organize, this paper proposes a method for automatically acquiring domain FAQs based on an improved Bayes algorithm. HTML pages are parsed into DOM trees. Combined with a restricted-domain knowledge base, the node information and structural characteristics of the DOM tree are extracted as classification features. An improved Bayesian classification learning algorithm is then used to construct a classification model that automatically acquires FAQs from HTML pages and filters for the domain FAQs. Experimental results show that the method is effective.
  • Keywords
    Bayes methods; database management systems; information filtering; learning (artificial intelligence); automatic acquisition method; domain knowledge base; frequently asked questions database; improved Bayesian classified learning algorithm; node information; question answering system; structural characteristics; Classification tree analysis; Data engineering; Data mining; Databases; HTML; Information filtering; Information filters; Information systems; Internet; Space technology; FAQ Domain Filtering; FAQ Extracting; Improved Bayes; Question Answering System; Restricted domain
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Web Information Systems and Mining, 2009. WISM 2009. International Conference on
  • Conference_Location
    Shanghai
  • Print_ISBN
    978-0-7695-3817-4
  • Type
    conf
  • DOI
    10.1109/WISM.2009.30
  • Filename
    5368164