• DocumentCode
    3281333
  • Title
    Archeology of code duplication: recovering duplication chains from small duplication fragments

  • Author
    Wettel, Richard ; Marinescu, Radu

  • Author_Institution
    LOOSE Res. Group, Inst. e-Austria, Timisoara, Romania
  • fYear
    2005
  • fDate
    25-29 Sept. 2005
  • Abstract
    Code duplication is a common problem and a well-known sign of bad design. Consequently, the last decade has produced various solutions and tools that can automatically find duplicated blocks of code. However, duplicated fragments rarely remain identical after they are copied; they are often modified here and there. This adaptation usually "scatters" the duplicated code block into a large number of small "islands" of duplication which, when detected and analyzed separately, hide the real magnitude and impact of the duplicated block. In this paper we propose a novel, automated approach for recovering duplication blocks by composing small, isolated fragments of duplication into larger and more relevant duplication chains. We validate both the efficiency and the scalability of the approach by applying it to several well-known open-source case studies and discussing some relevant findings. By recovering such duplication chains, the maintenance engineer is provided with additional cases of duplication that can lead to relevant refactorings and that are usually missed by other detection methods.
  • Keywords
    automatic programming; program compilers; program diagnostics; software maintenance; code duplication; design flaw; duplication blocks; duplication chains; duplication fragments; quality assurance; Cloning; Computer industry; Filtering; Image analysis; Light scattering; Open source software; Power system reliability; Scalability; Software systems
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Symbolic and Numeric Algorithms for Scientific Computing, 2005. SYNASC 2005. Seventh International Symposium on
  • Print_ISBN
    0-7695-2453-2
  • Type
    conf
  • DOI
    10.1109/SYNASC.2005.20
  • Filename
    1595830