DocumentCode
176247
Title
Improving Low Quality Stack Overflow Post Detection
Author
Ponzanelli, Luca ; Mocci, Andrea ; Bacchelli, Alberto ; Lanza, Mario ; Fullerton, David
Author_Institution
REVEAL @ Fac. of Inf., Univ. of Lugano, Lugano, Switzerland
fYear
2014
fDate
Sept. 29 2014-Oct. 3 2014
Firstpage
541
Lastpage
544
Abstract
Stack Overflow is a popular question and answer (Q&A) website among software developers. It counts more than two million users, who actively contribute by asking and answering thousands of questions daily. Identifying and reviewing low quality posts preserves the quality of the site's content and is crucial to maintaining a good user experience. In Stack Overflow, poor quality posts are identified manually by selected users. The site also uses an automated identification system based on textual features. Low quality posts automatically enter a review queue maintained by experienced users. We present an approach to improve the automated system in use at Stack Overflow. It analyzes both the content of a post (e.g., simple textual features and complex readability metrics) and community-related aspects (e.g., popularity of a user in the community). Our approach reduces the size of the review queue effectively and removes misclassified good quality posts.
Keywords
Web sites; information retrieval; Q&A Website; automated identification system; community related aspects; complex readability metrics; questions and answers; software developers; stack overflow post detection; textual features; Communities; Entropy; Genetic algorithms; Indexes; Measurement; Readability metrics; Software;
fLanguage
English
Publisher
IEEE
Conference_Titel
Software Maintenance and Evolution (ICSME), 2014 IEEE International Conference on
Conference_Location
Victoria, BC
ISSN
1063-6773
Type
conf
DOI
10.1109/ICSME.2014.90
Filename
6976134