DocumentCode
3248677
Title
Extracting problematic API features from forum discussions
Author
Yingying Zhang ; Daqing Hou
Author_Institution
Dept. of Comput. Sci., Clarkson Univ., Potsdam, NY, USA
fYear
2013
fDate
20-21 May 2013
Firstpage
142
Lastpage
151
Abstract
Software engineering activities often produce large amounts of unstructured data. Useful information can be extracted from such data to facilitate software development activities, such as bug report management and documentation provision. Online forums, in particular, contain extensive valuable information that can aid in software development. However, no work has been done to extract problematic API features from online forums. In this paper, we investigate ways to extract the problematic API features that are discussed as a source of difficulty in each thread, using natural language processing and sentiment analysis techniques. Based on a preliminary manual analysis of the content of a discussion thread and a categorization of the role of each sentence therein, we decided to focus on a negative-sentiment sentence and its close neighbors as the unit for extracting API features. We evaluate a set of candidate solutions by comparing tool-extracted problematic API design features with manually produced golden test data. Our best solution yields a precision of 89%. We have also investigated three potential applications for our feature extraction solution: (i) highlighting the negative sentence and its neighbors to help illustrate the main API feature; (ii) searching for helpful online information using the extracted API feature as a query; and (iii) summarizing the problematic features to reveal the “hot topics” in a forum.
Keywords
application program interfaces; feature extraction; natural language processing; query processing; software engineering; discussion thread; forum discussions; manually produced golden test data; negative sentiment sentence; online information; problematic API feature extraction; problematic feature summarization; sentiment analysis techniques; Data mining; Dictionaries; Feature extraction; Message systems; Pattern matching; Software; Tutorials; APIs; AWT/Swing; Design Feedback; Information Extraction; Online Forums
fLanguage
English
Publisher
IEEE
Conference_Titel
Program Comprehension (ICPC), 2013 IEEE 21st International Conference on
Conference_Location
San Francisco, CA
ISSN
1063-6897
Type
conf
DOI
10.1109/ICPC.2013.6613842
Filename
6613842