Title of article :
Machine Learning Approach for Answer Detection in Discussion Forums: An Application of Big Data Analytics
Author/Authors :
Khan, Atif (Department of Computer Science - Islamia College Peshawar, Peshawar, Pakistan)
Irfan Uddin, M. (Institute of Computing - Kohat University of Science and Technology, Kohat, Pakistan)
Ibrahim, Ibrahim (Department of Computer Science - Islamia College Peshawar, Peshawar, Pakistan)
Zubair, Muhammad (Department of Computer Science - Islamia College Peshawar, Peshawar, Pakistan)
Ahmad, Shafiq (King Saud University - College of Engineering - Department of Industrial Engineering, Riyadh, Saudi Arabia)
Al Firdausi, Muhammad Dzulqarnain (King Saud University - College of Engineering - Department of Industrial Engineering, Riyadh, Saudi Arabia)
Zaindin, Mazen (King Saud University - College of Science - Department of Statistics and Operations Research, Riyadh, Saudi Arabia)
Pages :
10
From page :
1
To page :
10
Abstract :
Data are flooding into online web forums, and it is highly desirable to turn this gigantic amount of data into actionable knowledge. Online web forums have become an integral part of the web and are a main source of knowledge. People use these platforms to post questions and get answers from other forum members. Usually, an initial post (question) receives more than one reply post (answer), which makes it difficult for a user to scan all of them for the most relevant, highest-quality answer. Automatically extracting the most relevant answer to a question within a thread is therefore an important problem. In this research, we treat answer extraction as a classification problem: a reply post is classified as relevant, partially relevant, or irrelevant to the initial post. To measure the relevance/similarity of a reply to the question, both lexical and nonlexical features are used. We propose using LinearSVC, a variant of the support vector machine (SVM), for answer classification. Two feature selection techniques, chi-square and univariate, are employed to reduce the size of the feature space. The experimental results show that the LinearSVC classifier outperformed the other state-of-the-art classifiers in classification accuracy on both the Ubuntu and TripAdvisor (NYC) discussion forum datasets.
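The chi-square feature selection step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the word-presence features, the max-over-classes scoring rule, the function names (`chi_square`, `score_terms`, `select_top_k`), and the toy replies and labels are all assumptions made for illustration only.

```python
def chi_square(a, b, c, d):
    """Chi-square statistic for a 2x2 term/class contingency table.

    a: replies in the class that contain the term
    b: replies outside the class that contain the term
    c: replies in the class that lack the term
    d: replies outside the class that lack the term
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom


def score_terms(docs, labels):
    """Score each term by its largest chi-square value over all classes.

    Assumption: features are simple word-presence indicators.
    """
    token_sets = [set(doc.lower().split()) for doc in docs]
    vocab = set().union(*token_sets)
    scores = {}
    for term in vocab:
        best = 0.0
        for cls in set(labels):
            pairs = list(zip(token_sets, labels))
            a = sum(1 for t, y in pairs if term in t and y == cls)
            b = sum(1 for t, y in pairs if term in t and y != cls)
            c = sum(1 for t, y in pairs if term not in t and y == cls)
            d = sum(1 for t, y in pairs if term not in t and y != cls)
            best = max(best, chi_square(a, b, c, d))
        scores[term] = best
    return scores


def select_top_k(docs, labels, k):
    """Keep the k highest-scoring terms (ties broken alphabetically)."""
    scores = score_terms(docs, labels)
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


# Hypothetical toy replies; not taken from the Ubuntu or TripAdvisor datasets.
replies = [
    "run sudo apt update",
    "sudo apt install package",
    "thanks for asking",
    "thanks good question",
]
labels = ["relevant", "relevant", "irrelevant", "irrelevant"]

print(select_top_k(replies, labels, 3))
```

Terms that appear in only one class score highest and survive the cut, which is how this kind of filter shrinks the feature space before a classifier such as LinearSVC is trained on the remaining features.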
Keywords :
Machine Learning, Approach for Answer Detection, Approach, Answer Detection, Application, Big Data Analytics
Journal title :
Scientific Programming
Serial Year :
2020
Full Text URL :
Record number :
2611001