• DocumentCode
    176247
  • Title
    Improving Low Quality Stack Overflow Post Detection
  • Author
    Ponzanelli, Luca ; Mocci, Andrea ; Bacchelli, Alberto ; Lanza, Mario ; Fullerton, David
  • Author_Institution
    REVEAL @ Fac. of Inf., Univ. of Lugano, Lugano, Switzerland
  • fYear
    2014
  • fDate
    Sept. 29 2014-Oct. 3 2014
  • Firstpage
    541
  • Lastpage
    544
  • Abstract
    Stack Overflow is a popular question and answer (Q&A) website among software developers. It counts more than two million users who actively contribute by asking and answering thousands of questions daily. Identifying and reviewing low quality posts preserves the quality of the site's content and is crucial to maintaining a good user experience. In Stack Overflow, the identification of poor quality posts is performed manually by selected users. The system also uses an automated identification system based on textual features. Low quality posts automatically enter a review queue maintained by experienced users. We present an approach to improve the automated system in use at Stack Overflow. It analyzes both the content of a post (e.g., simple textual features and complex readability metrics) and community-related aspects (e.g., popularity of a user in the community). Our approach reduces the size of the review queue effectively and removes misclassified good quality posts.
  • Keywords
    Web sites; information retrieval; Q&A Website; automated identification system; community related aspects; complex readability metrics; questions and answers; software developers; stack overflow post detection; textual features; Communities; Entropy; Genetic algorithms; Indexes; Measurement; Readability metrics; Software;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Software Maintenance and Evolution (ICSME), 2014 IEEE International Conference on
  • Conference_Location
    Victoria, BC
  • ISSN
    1063-6773
  • Type
    conf
  • DOI
    10.1109/ICSME.2014.90
  • Filename
    6976134