• DocumentCode
    3368627
  • Title
    Clone detection: Why, what and how?
  • Author
    Akhin, Marat; Itsykson, Vladimir
  • Author_Institution
    St.-Petersburg State Polytech. Univ., St. Petersburg, Russia
  • fYear
    2010
  • fDate
    13-15 Oct. 2010
  • Firstpage
    36
  • Lastpage
    42
  • Abstract
    Excessive code duplication is a bane of modern software development. Several experimental studies show that, on average, 15 percent of a software system can consist of source code clones - repeatedly reused fragments of similar code. While code duplication may increase the speed of initial software development, it undoubtedly leads to problems during software maintenance and support. That is why many developers agree that software clones should be detected and dealt with at every stage of the software development life cycle. This paper is a brief survey of the current state of the art in clone detection. First, we highlight the main sources of code cloning, such as copy-and-paste programming, mental code patterns and performance optimizations. We discuss the reasons behind the use of these techniques from the developer's point of view and possible alternatives to them. Second, we outline the major negative effects that clones have on software development. The most serious drawback that duplicated code has for software maintenance is the increased cost of modifications: any modification that changes cloned code must be propagated to every clone instance in the program. Software clones may also introduce new bugs when a programmer makes mistakes while copying and modifying code. The increase in source code size due to duplication makes code comprehension more difficult. Third, we review existing clone detection techniques and classify them by the source code representation model they use. We also describe and analyze some concrete examples of clone detection techniques, highlighting the main distinctive features and problems that arise in practical clone detection. Finally, we point out some open problems in the area of clone detection. Currently, questions like "What is a code clone?", "Can we predict the impact clones have on software quality?" and "How can we increase both clone detection precision and recall at the same time?" remain open to further research. We list the most important questions in modern clone detection and explain why they remain unanswered despite all the progress in clone detection research.
  • Keywords
    program debugging; software maintenance; clone detection techniques; code comprehension; code duplication; copy-and-paste programming; mental code patterns; performance optimizations; software bugs; software development life cycle; software support; source code size; Cloning; Data mining; Electronic mail; Linux; Programming; Clone detection; overview; program analysis; quality assurance
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Software Engineering Conference (CEE-SECR), 2010 6th Central and Eastern European
  • Conference_Location
    Moscow
  • Print_ISBN
    978-1-4577-0605-9
  • Type
    conf
  • DOI
    10.1109/CEE-SECR.2010.5783148
  • Filename
    5783148