Title :
Gapped code clone detection with lightweight source code analysis
Author :
Murakami, H. ; Hotta, Kazuhiro ; Higo, Y. ; Igaki, Hiroshi ; Kusumoto, Shinji
Author_Institution :
Grad. Sch. of Inf. Sci. & Technol., Osaka Univ., Suita, Japan
Abstract :
A variety of methods for detecting code clones have been proposed. To detect gapped code clones, AST-based, PDG-based, and metric-based techniques, as well as text-based techniques using the LCS algorithm, have been proposed. However, each of these techniques has limitations. For example, existing AST-based and PDG-based techniques incur the cost of transforming source files into intermediate representations such as ASTs or PDGs and of comparing them. Existing metric-based techniques and text-based techniques using the LCS algorithm cannot detect code clones if methods or blocks are only partially duplicated. This paper proposes a new method that detects gapped code clones using the Smith-Waterman algorithm to resolve those limitations. The Smith-Waterman algorithm identifies similar alignments between two sequences even if they include some gaps. The authors implemented the proposed method as a software tool named CDSW and confirmed, through a quantitative evaluation with Bellon's benchmark, that the proposed method resolves these limitations.
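To illustrate the kind of gapped local alignment the abstract refers to, here is a minimal sketch of the textbook Smith-Waterman recurrence applied to two token sequences. It is not the CDSW implementation: the function name `smith_waterman`, the scoring values (`match=2`, `mismatch=-1`, `gap=-1`), the linear gap penalty, and the choice of whitespace-split tokens as sequence elements are all assumptions made for this example.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment of sequences a and b (textbook Smith-Waterman).

    Returns (best_score, pairs). Each pair is (x, y); x or y is None
    where the alignment contains a gap. Scoring values are illustrative
    assumptions, not parameters taken from the paper.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the score table; cells never drop below 0, which makes the
    # alignment local rather than global.
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                0,
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until a zero cell is reached.
    pairs = []
    i, j = best_pos
    while i > 0 and j > 0 and score[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            pairs.append((a[i - 1], b[j - 1]))
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            pairs.append((a[i - 1], None))
            i -= 1
        else:
            pairs.append((None, b[j - 1]))
            j -= 1
    pairs.reverse()
    return best, pairs


# Hypothetical example: the second fragment is the first with a logging
# call inserted in the middle, i.e. a gapped clone.
original = "int x = 0 ; x ++ ; return x ;".split()
edited = "int x = 0 ; log ( x ) ; x ++ ; return x ;".split()
best_score, alignment = smith_waterman(original, edited)
```

With these assumed scores, the five inserted tokens are absorbed as gaps and the two fragments still align as one region, which is the behavior that distinguishes a gapped match from an exact contiguous one.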
Keywords :
program diagnostics; software tools; text analysis; trees (mathematics); AST-based technique; Bellon benchmark; CDSW; LCS algorithm; PDG-based technique; Smith-Waterman algorithm; abstract syntax tree; gapped code clone detection; metric-based technique; program dependency graph; software tool; source code analysis; text-based technique; Accuracy; Algorithm design and analysis; Benchmark testing; Cloning; Educational institutions; Software algorithms; Software systems; Code Clone; Program Analysis; Software Maintenance; Tool Comparison;
Conference_Title :
Program Comprehension (ICPC), 2013 IEEE 21st International Conference on
Conference_Location :
San Francisco, CA
DOI :
10.1109/ICPC.2013.6613837