Title :
Fast Approximate Matching of Programs for Protecting Libre/Open Source Software by Using Spatial Indexes
Author :
Muller, A.J. ; Shinohara, Takeshi
Author_Institution :
Kyushu Inst. of Technol., Fukuoka
Date :
Sept. 30 - Oct. 1, 2007
Abstract :
To encourage open source/libre software development, it is desirable to have tools that can help to identify open source license violations. This paper describes the implementation of a tool that matches open source programs embedded inside pirate programs. The problem of binary program matching can be approximated by analyzing the similarity of program fragments generated from low-level instructions. These fragments are syntax trees that can be compared by using a tree distance function. Tree distance functions are generally very costly. Sequentially calculating the similarities of fragments with them becomes prohibitively expensive. In this paper we experimentally demonstrate how a spatial index can be used to substantially increase matching performance. These techniques allowed us to do exhaustive experiments that confirmed previous results on the subject. The paper also introduces the novel idea of using information retrieval techniques for calculating the similarity of bags of program fragments. It is possible to identify programs even when they are heavily obfuscated with the innovative approach described here.
Keywords :
information retrieval; public domain software; security of data; binary program matching; libre software protection; low-level instructions; open source license violations; open source software protection; program fast approximate matching; program fragment analysis; program identification; software tools; spatial indexes; syntax trees; tree distance function; Artificial intelligence; Data mining; Fingerprint recognition; Information retrieval; Licenses; Nearest neighbor searches; Open source software; Protection; Spatial databases; Spatial indexes
Conference_Title :
Seventh IEEE International Working Conference on Source Code Analysis and Manipulation (SCAM 2007)
Conference_Location :
Paris
Print_ISBN :
978-0-7695-2880-9
DOI :
10.1109/SCAM.2007.15