Title :
Towards a Theoretical Model for Software Growth
Author :
Herraiz, Israel ; Gonzalez-Barahona, Jesus M. ; Robles, Gregorio
Author_Institution :
Grupo de Sist. y Comun., Univ. Rey Juan Carlos, Rey Juan Carlos
Abstract :
Software growth (and more broadly, software evolution) is usually considered in terms of size or complexity of source code. However, different studies usually use different metrics, which makes it difficult to compare approaches and results. In addition, not all metrics are equally easy to calculate for a given source code, which leads to the question of which one is the easiest to calculate without losing too much information. To address both issues, in this paper we present a comprehensive study, based on the analysis of about 700,000 C source code files, calculating several size and complexity metrics for all of them. For this sample, we have found double Pareto statistical distributions for all metrics considered, and a high correlation between any two of them. This would imply that any model addressing software growth should produce these Pareto distributions, and that analysis based on any of the considered metrics should show a similar pattern, provided the sample of files considered is large enough.
Keywords :
C language; Pareto distribution; correlation methods; program diagnostics; software metrics; software prototyping; C source code file analysis; Pareto statistical distribution; correlation method; software evolution; software growth model; source code complexity metrics; source code size metrics; Licenses; Linux; Open source software; Packaging; Pareto analysis; Pattern analysis; Software engineering; Software measurement; Statistical analysis; Statistical distributions;
Conference_Title :
Mining Software Repositories, 2007. ICSE Workshops MSR '07. Fourth International Workshop on
Conference_Location :
Minneapolis, MN
Print_ISBN :
0-7695-2950-X
DOI :
10.1109/MSR.2007.31