Title :
LLVM-based code clone detection framework
Author :
Arutyun Avetisyan;Shamil Kurmangaleev;Sevak Sargsyan;Mariam Arutunian;Andrey Belevantsev
Author_Institution :
Institute for System Programming of the Russian Academy of Sciences, Moscow, Russia
Abstract :
Existing code clone detection methods have limitations. Textual and lexical approaches cannot detect heavily modified code fragments. Syntactic and metrics-based approaches detect heavy modifications only with low accuracy. In contrast, the semantic approach accurately detects cloned fragments with small changes as well as heavily modified ones, but methods based on this approach do not scale to the analysis of large projects. This paper describes an LLVM-based code clone detection framework that uses semantic program analysis. It has high accuracy and scales to the analysis of millions of lines of source code. The tool embeds a testing system that automatically generates code clones for a project; this system is used to measure the accuracy of the developed algorithms. The tool is applicable to all languages that can be compiled to LLVM bitcode. The proposed method was compared with two widely used tools, MOSS and CloneDR, and the results show that it has higher accuracy. The tool scales to the analysis of the Linux-2.6 kernel, which has about fourteen million lines of source code.
Keywords :
"Cloning","Approximation algorithms","Semantics","Measurement","Image edge detection","Clustering algorithms","Algorithm design and analysis"
Conference_Title :
Computer Science and Information Technologies (CSIT), 2015
DOI :
10.1109/CSITechnol.2015.7358259