Title :
Boreas: an accurate and scalable token-based approach to code clone detection
Author :
Yang Yuan ; Yao Guo
Author_Institution :
Peking Univ., Beijing, China
Abstract :
Detecting code clones in a program has many applications in software engineering and related fields. In this paper, we present Boreas, an accurate and scalable token-based approach to code clone detection. Boreas introduces a novel counting-based method for defining characteristic matrices, which describe program segments distinctly and effectively for the purpose of clone detection. We conducted experiments on the source code of JDK 7 and Linux kernel 2.6.38.6. Experimental results show that Boreas matches the detection accuracy of the recently proposed syntactic-based tool Deckard while reducing execution time by more than an order of magnitude.
Keywords :
Java; Linux; matrix algebra; program debugging; program testing; software engineering; Boreas; Deckard syntactic-based tool; JDK 7 source code; Linux kernel 2.6.38.6 source code; characteristic matrices; counting-based method; execution time reduction; program code clone detection; scalable token-based approach; Code clone detection; count matrix; count vector
Conference_Title :
Automated Software Engineering (ASE), 2012 Proceedings of the 27th IEEE/ACM International Conference on
Print_ISBN :
978-1-4503-1204-2
DOI :
10.1145/2351676.2351725