• DocumentCode
    3407293
  • Title
    Rendezvous: A search engine for binary code
  • Author
    Khoo, Wei Ming; Mycroft, Alan; Anderson, Richard
  • Author_Institution
    Univ. of Cambridge, Cambridge, UK
  • fYear
    2013
  • fDate
    18-19 May 2013
  • Firstpage
    329
  • Lastpage
    338
  • Abstract
    The problem of matching between binaries is important for software copyright enforcement as well as for identifying disclosed vulnerabilities in software. We present a search engine prototype called Rendezvous which enables indexing and searching for code in binary form. Rendezvous identifies binary code using a statistical model comprising instruction mnemonics, control flow sub-graphs and data constants, which are simple to extract from a disassembly yet normalise across different compilers and optimisations. Experiments show that Rendezvous achieves F2 measures of 86.7% and 83.0% on the GNU C library compiled with different compiler optimisations and the GNU coreutils suite compiled with gcc and clang, respectively. These two code bases together comprise more than one million lines of code. Rendezvous will bring significant changes to the way patch management and copyright enforcement are currently performed.
  • Keywords
    copyright; optimising compilers; search engines; software reusability; statistical analysis; F2 measures; GNU C library; GNU coreutils suite; binary code search engine; clang; code bases; compiler optimisations; control flow subgraphs; gcc; instruction mnemonics; rendezvous; search engine prototype; software copyright enforcement; statistical model; Accuracy; Binary codes; Indexing; Libraries; Optimization; Search engines
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Mining Software Repositories (MSR), 2013 10th IEEE Working Conference on
  • Conference_Location
    San Francisco, CA
  • ISSN
    2160-1852
  • Print_ISBN
    978-1-4799-0345-0
  • Type
    conf
  • DOI
    10.1109/MSR.2013.6624046
  • Filename
    6624046