Title :
Improving Low Quality Stack Overflow Post Detection
Author :
Ponzanelli, Luca ; Mocci, Andrea ; Bacchelli, Alberto ; Lanza, Mario ; Fullerton, David
Author_Institution :
REVEAL @ Fac. of Inf., Univ. of Lugano, Lugano, Switzerland
fDate :
Sept. 29 2014-Oct. 3 2014
Abstract :
Stack Overflow is a popular question and answer (Q&A) website among software developers. It has more than two million users who actively contribute by asking and answering thousands of questions daily. Identifying and reviewing low quality posts preserves the quality of the site's content and is crucial to maintaining a good user experience. On Stack Overflow, poor quality posts are identified manually by selected users. The site also uses an automated identification system based on textual features. Low quality posts automatically enter a review queue maintained by experienced users. We present an approach to improve the automated system in use at Stack Overflow. It analyzes both the content of a post (e.g., simple textual features and complex readability metrics) and community-related aspects (e.g., popularity of a user in the community). Our approach effectively reduces the size of the review queue and removes misclassified good quality posts.
Keywords :
Web sites; information retrieval; Q&A Website; automated identification system; community related aspects; complex readability metrics; questions and answers; software developers; stack overflow post detection; textual features; Communities; Entropy; Genetic algorithms; Indexes; Measurement; Readability metrics; Software;
Conference_Title :
Software Maintenance and Evolution (ICSME), 2014 IEEE International Conference on
Conference_Location :
Victoria, BC
DOI :
10.1109/ICSME.2014.90