DocumentCode :
2333096
Title :
Categorizing software applications for maintenance
Author :
McMillan, Collin ; Linares-Vásquez, Mario ; Poshyvanyk, Denys ; Grechanik, Mark
Author_Institution :
Dept. of Comput. Sci., Coll. of William & Mary, Williamsburg, VA, USA
fYear :
2011
fDate :
25-30 Sept. 2011
Firstpage :
343
Lastpage :
352
Abstract :
Software repositories hold applications that are often categorized to improve the effectiveness of various maintenance tasks. Properly categorized applications allow stakeholders to identify requirements related to their applications and predict maintenance problems in software projects. Unfortunately, for different legal and organizational reasons the source code is often not available, thus making it difficult to automatically categorize binary executables of software applications. In this paper, we propose a novel approach in which we use Application Programming Interface (API) calls from third-party libraries as attributes for automatic categorization of software applications that use these API calls. API calls can be extracted from source code and more importantly, from the byte-code of applications, thus making automatic categorization approaches applicable to closed source repositories. We evaluate our approach along with other machine learning algorithms for software categorization on two large Java repositories: an open-source repository containing 3,286 projects and a closed-source one with 745 applications. Our contribution is twofold: not only do we propose a new approach that makes it possible to categorize software projects without any source code using a small number of API calls as attributes, but also we carried out the first comprehensive empirical evaluation of automatic categorization approaches.
Keywords :
Java; application program interfaces; learning (artificial intelligence); project management; public domain software; software maintenance; software management; API calls; Java repository; application programming interface; automatic categorization; binary executables; byte-code; categorizing software applications; closed source repository; closed-source repository; legal reasons; machine learning algorithms; maintenance tasks; open-source repository; organizational reasons; predict maintenance problems; software categorization; software projects; software repository; source code; third-party library; Companies; Entropy; Java; Libraries; Machine learning algorithms; Software; Support vector machines; closed-source; machine learning; open-source; software categorization;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Software Maintenance (ICSM), 2011 27th IEEE International Conference on
Conference_Location :
Williamsburg, VA
ISSN :
1063-6773
Print_ISBN :
978-1-4577-0663-9
Electronic_ISBN :
1063-6773
Type :
conf
DOI :
10.1109/ICSM.2011.6080801
Filename :
6080801