• DocumentCode
    3024619
  • Title
    CMCD: Count Matrix Based Code Clone Detection
  • Author
    Yuan, Yang ; Guo, Yao
  • Author_Institution
    Nat. Eng. Res. Center for Software Eng., Peking Univ., Beijing, China
  • fYear
    2011
  • fDate
    5-8 Dec. 2011
  • Firstpage
    250
  • Lastpage
    257
  • Abstract
    This paper introduces CMCD, a Count Matrix based technique for detecting clones in program code. The key concept behind CMCD is the Count Matrix, which is built by counting how often each variable occurs in situations specified by predetermined counting conditions. Because the characteristics of the count matrix do not change when variable names are replaced or even when statements are swapped, CMCD works well on many hard-to-detect code clones, such as those produced by swapping statements or deleting a few lines, which are difficult for other state-of-the-art detection techniques. We obtained the following results using CMCD: (1) we successfully detected all 16 clone scenarios proposed by C. Roy et al.; (2) we discovered two clone clusters with three copies each among 29 student-submitted compiler lab projects; and (3) we identified 174 code clone clusters and a potential bug in the JDK 1.6 source files.
  • Keywords
    matrix algebra; program diagnostics; software maintenance; CMCD; clone cluster; code clone detection; count matrix; Bipartite graph; Cloning; Layout; Programming; Switches; Syntactics; Vectors; bipartite graph matching
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Software Engineering Conference (APSEC), 2011 18th Asia Pacific
  • Conference_Location
    Ho Chi Minh
  • ISSN
    1530-1362
  • Print_ISBN
    978-1-4577-2199-1
  • Type
    conf
  • DOI
    10.1109/APSEC.2011.13
  • Filename
    6130694
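
The count-matrix idea described in the abstract can be illustrated with a minimal sketch. The three counting conditions used here (times read, times assigned, occurrences inside a loop), the function names `count_matrix` and `signature`, and the exact-match comparison of sorted rows are all assumptions made for illustration; they are not the paper's condition list, and the exact match is a simplified stand-in for the bipartite graph matching named in the keywords.

```python
import ast
from collections import defaultdict


def count_matrix(source):
    """Return {variable: [reads, writes, occurrences_in_loops]}.

    The three columns are hypothetical counting conditions chosen for
    illustration; the paper's actual conditions are not listed here.
    """
    rows = defaultdict(lambda: [0, 0, 0])

    def visit(node, in_loop):
        if isinstance(node, ast.Name):
            row = rows[node.id]
            if isinstance(node.ctx, ast.Load):
                row[0] += 1
            elif isinstance(node.ctx, ast.Store):
                row[1] += 1
            if in_loop:
                row[2] += 1
        # Everything below a for/while node counts as "inside a loop".
        inside = in_loop or isinstance(node, (ast.For, ast.While))
        for child in ast.iter_child_nodes(node):
            visit(child, inside)

    visit(ast.parse(source), False)
    return dict(rows)


def signature(source):
    """Drop variable names and sort the rows, so that neither renaming
    variables nor reordering statements changes the result."""
    return sorted(tuple(row) for row in count_matrix(source).values())


ORIGINAL = """
total = 0
count = 0
for x in data:
    total = total + x
    count = count + 1
mean = total / count
"""

# Same logic with renamed variables and two statements swapped.
CLONE = """
n = 0
acc = 0
for item in values:
    n = n + 1
    acc = acc + item
avg = acc / n
"""

# Different logic: the loop body uses x twice and there is no counter.
OTHER = """
total = 0
for x in data:
    total = total + x * x
"""

print(signature(ORIGINAL) == signature(CLONE))
print(signature(ORIGINAL) == signature(OTHER))
```

Under these assumptions the renamed and reordered fragment yields the same set of count vectors as the original, while the fragment with different logic does not. A practical detector would need a tolerance-based comparison rather than exact equality in order to handle clones with a few deleted lines.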