DocumentCode :
2797092
Title :
Comparing Approaches to Mining Source Code for Call-Usage Patterns
Author :
Kagdi, Huzefa ; Collard, Michael L. ; Maletic, Jonathan I.
Author_Institution :
Kent State Univ. Ashland Univ., Kent
fYear :
2007
fDate :
20-26 May 2007
Firstpage :
20
Lastpage :
20
Abstract :
Two approaches for mining function-call usage patterns from source code are compared. The first approach, itemset mining, has recently been applied to this problem. The other approach, sequential-pattern mining, has not been previously applied to this problem. Here, a call-usage pattern is a composition of function calls that occur in a function definition. Both approaches look for frequently occurring patterns that represent standard usage of functions and identify possible errors. Itemset mining produces unordered patterns, i.e., sets of function calls, whereas sequential-pattern mining produces partially ordered patterns, i.e., sequences of function calls. The trade-off between the additional ordering context given by sequential-pattern mining and the efficiency of itemset mining is investigated. The two approaches are applied to the Linux kernel v2.6.14, and the results show that mining ordered patterns is worth the additional cost.
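The difference between unordered and ordered call-usage patterns can be illustrated with a minimal sketch. It uses brute-force pair counting on hypothetical toy data, not the mining algorithms or tooling used in the paper; the names `CALLS`, `frequent_itemsets`, and `frequent_sequences` are assumptions made for illustration only.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: the ordered calls made inside each function definition.
CALLS = {
    "f1": ["lock", "read", "unlock"],
    "f2": ["lock", "write", "unlock"],
    "f3": ["unlock", "lock"],  # same calls as a set, but in a different order
}

def frequent_itemsets(transactions, min_support, size=2):
    """Count unordered sets of calls that co-occur in at least min_support functions."""
    counts = Counter()
    for calls in transactions:
        # Sorting the distinct calls gives each set one canonical tuple.
        for items in combinations(sorted(set(calls)), size):
            counts[items] += 1
    return {k: v for k, v in counts.items() if v >= min_support}

def frequent_sequences(transactions, min_support, size=2):
    """Count ordered (not necessarily contiguous) call subsequences."""
    counts = Counter()
    for calls in transactions:
        # combinations() keeps positional order; set() counts each
        # subsequence at most once per function.
        for seq in set(combinations(calls, size)):
            counts[seq] += 1
    return {k: v for k, v in counts.items() if v >= min_support}

itemsets = frequent_itemsets(CALLS.values(), min_support=2)
sequences = frequent_sequences(CALLS.values(), min_support=2)
print(itemsets)
print(sequences)
```

In this toy data the unordered pair of `lock` and `unlock` is supported by all three functions, while the ordered pattern `lock` then `unlock` is supported by only two, so the ordered view singles out `f3` as a possible deviation. Enumerating every subsequence is exponential in the pattern size, which hints at the extra cost of ordered mining that the abstract weighs against its added context.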
Keywords :
data mining; software engineering; Linux kernel v2.6.14; call-usage pattern; itemset mining; sequential-pattern mining; source code; Computational efficiency; Computer science; Costs; Data mining; Fault diagnosis; Inspection; Itemsets; Kernel; Linux; Software systems
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Mining Software Repositories, 2007. ICSE Workshops MSR '07. Fourth International Workshop on
Conference_Location :
Minneapolis, MN
Print_ISBN :
0-7695-2950-X
Type :
conf
DOI :
10.1109/MSR.2007.3
Filename :
4228657