• DocumentCode
    2028344
  • Title

    A study of features on Primary Question detection in Chinese online forums

  • Author

    Sun, Lin ; Liu, Bingquan ; Wang, Baoxun ; Zhang, Deyuan ; Wang, Xiaolong

  • Author_Institution
    MOE-MS Key Lab. of Natural Language Process. & Speech, Harbin Inst. of Technol., Harbin, China
  • Volume
    5
  • fYear
    2010
  • fDate
    10-12 Aug. 2010
  • Firstpage
    2422
  • Lastpage
    2427
  • Abstract
    Primary question detection in online forums is a subtask of extracting question-answer pairs. In this paper, after surveying the forms of questions in Chinese online forums, a combination of textual and N-gram features obtained via feature selection is adopted to help detect primary questions. Treating primary question detection as a binary classification problem, a C4.5 decision tree classifier and a support vector machine are each used to distinguish questions from non-questions. Experimental results across multiple datasets demonstrate that the mixture of textual and N-gram features performs better than either feature type alone under both C4.5 and the support vector machine. Computing the weight of each feature in the two classifiers shows that the top 6 features are the same apart from minor differences in order, indicating that the combination of textual and N-gram features is universal and effective in detecting primary questions.
  • Keywords
    classification; decision trees; information retrieval; natural language processing; support vector machines; Chinese online forums; N-gram features; binary classification; decision tree classifier; feature selection; primary question detection; question-answer pairs; support vector machine; textual features; Classification tree analysis; Electronic mail; Feature extraction; Speech; Support vector machines; N-gram feature; classification; information extraction; primary question detection; textual feature;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Fuzzy Systems and Knowledge Discovery (FSKD), 2010 Seventh International Conference on
  • Conference_Location
    Yantai, Shandong
  • Print_ISBN
    978-1-4244-5931-5
  • Type

    conf

  • DOI
    10.1109/FSKD.2010.5569298
  • Filename
    5569298
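
The abstract describes combining textual features with N-gram features before classification. The following is a minimal sketch of what such a combined feature map might look like, not the authors' implementation. The abstract does not list the actual features, so the cue words in `QUESTION_CUES`, the three textual features, and the choice of character bigrams are all assumptions made for illustration. The function names are hypothetical.

```python
from collections import Counter

# Hypothetical cue words, chosen for illustration only; the paper's
# actual feature set is not given in the abstract.
QUESTION_CUES = ("吗", "呢", "怎么", "如何", "为什么", "什么")


def char_ngrams(text, n=2):
    """Return the overlapping character n-grams of text, in order."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def extract_features(sentence, n=2):
    """Combine simple textual features with character n-gram counts."""
    stripped = sentence.strip()
    features = {
        # Textual features (assumed): question mark, cue word, length.
        "ends_with_qmark": float(stripped.endswith(("?", "？"))),
        "has_cue_word": float(any(cue in stripped for cue in QUESTION_CUES)),
        "length": float(len(stripped)),
    }
    # N-gram features: one count per distinct character n-gram.
    for gram, count in Counter(char_ngrams(stripped, n)).items():
        features["ngram=" + gram] = float(count)
    return features
```

A dictionary like this could then be vectorized and passed to a decision tree or support vector machine classifier, with feature selection applied to prune the N-gram features as the abstract describes.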