DocumentCode
3024619
Title
CMCD: Count Matrix Based Code Clone Detection
Author
Yuan, Yang ; Guo, Yao
Author_Institution
Nat. Eng. Res. Center for Software Eng., Peking Univ., Beijing, China
fYear
2011
fDate
5-8 Dec. 2011
Firstpage
250
Lastpage
257
Abstract
This paper introduces CMCD, a Count Matrix based technique for detecting clones in program code. The key concept behind CMCD is the Count Matrix, which is built by counting how often each variable occurs in situations specified by pre-determined counting conditions. Because the characteristics of the count matrix do not change under variable renaming or even reordering of statements, CMCD works well on many hard-to-detect code clones, such as those produced by swapping statements or deleting a few lines, which are difficult for other state-of-the-art detection techniques. We have obtained the following results using CMCD: (1) we successfully detected all 16 clone scenarios proposed by C. Roy et al.; (2) we discovered two clone clusters with three copies each among 29 student-submitted compiler lab projects; (3) we identified 174 code clone clusters and a potential bug in the JDK 1.6 source files.
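The count-matrix idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the four counting conditions, the helper names `count_matrix` and `distance`, and the brute-force matching are assumptions made for illustration only. The sketch counts, for each variable in a small Python function, how often it appears under each condition, then compares two functions by finding the cheapest one-to-one pairing of their variable vectors (a tiny stand-in for bipartite graph matching).

```python
import ast
from itertools import permutations

# Hypothetical counting conditions; the paper's actual list is not given here.
CONDITIONS = ("used", "defined", "in_loop", "in_condition")


def count_matrix(source):
    """Map each variable name to a count vector, one entry per condition."""
    matrix = {}

    def bump(name, condition):
        vec = matrix.setdefault(name, [0] * len(CONDITIONS))
        vec[CONDITIONS.index(condition)] += 1

    def visit(node, in_loop, in_cond):
        if isinstance(node, ast.Name):
            bump(node.id, "defined" if isinstance(node.ctx, ast.Store) else "used")
            if in_loop:
                bump(node.id, "in_loop")
            if in_cond:
                bump(node.id, "in_condition")
            return
        if isinstance(node, (ast.If, ast.While)):
            loop = in_loop or isinstance(node, ast.While)
            visit(node.test, loop, True)  # only the test counts as a condition
            for child in node.body + node.orelse:
                visit(child, loop, in_cond)
            return
        if isinstance(node, ast.For):
            visit(node.target, True, in_cond)
            visit(node.iter, True, in_cond)
            for child in node.body + node.orelse:
                visit(child, True, in_cond)
            return
        for child in ast.iter_child_nodes(node):
            visit(child, in_loop, in_cond)

    visit(ast.parse(source), False, False)
    return matrix


def distance(m1, m2):
    """Cheapest one-to-one pairing of variable vectors (brute force, small inputs only)."""
    rows1, rows2 = list(m1.values()), list(m2.values())
    size = max(len(rows1), len(rows2))
    zero = [0] * len(CONDITIONS)
    rows1 += [zero] * (size - len(rows1))  # pad so both sides have equal size
    rows2 += [zero] * (size - len(rows2))

    def cost(a, b):
        return sum(abs(x - y) for x, y in zip(a, b))

    return min(
        sum(cost(a, b) for a, b in zip(rows1, perm))
        for perm in permutations(rows2)
    )


ORIGINAL = """
def f(n):
    total = 0
    i = 0
    while i < n:
        total = total + i
        i = i + 1
    return total
"""

# Same logic with renamed variables and the two initialisations swapped.
RENAMED_SWAPPED = """
def g(limit):
    k = 0
    acc = 0
    while k < limit:
        acc = acc + k
        k = k + 1
    return acc
"""

UNRELATED = """
def h(n):
    x = n
    if x > 0:
        x = x * x
    return x
"""

clone_distance = distance(count_matrix(ORIGINAL), count_matrix(RENAMED_SWAPPED))
other_distance = distance(count_matrix(ORIGINAL), count_matrix(UNRELATED))
```

Because the vectors depend only on how each variable is used and not on its name or on statement order, the renamed and reordered copy yields the same set of vectors as the original, while the unrelated function does not. A practical tool would replace the permutation search with a polynomial-time assignment algorithm and use a threshold rather than exact equality.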
Keywords
matrix algebra; program diagnostics; software maintenance; CMCD; clone cluster; code clone detection; count matrix; Bipartite graph; Cloning; Layout; Programming; Switches; Syntactics; Vectors; Code clone detection; bipartite graph matching; count matrix
fLanguage
English
Publisher
IEEE
Conference_Titel
Software Engineering Conference (APSEC), 2011 18th Asia Pacific
Conference_Location
Ho Chi Minh
ISSN
1530-1362
Print_ISBN
978-1-4577-2199-1
Type
conf
DOI
10.1109/APSEC.2011.13
Filename
6130694