• DocumentCode
    1304334
  • Title

    Reverse Engineering Input Syntactic Structure from Program Execution and Its Applications

  • Author

    Lin, Zhiqiang ; Zhang, Xiangyu ; Xu, Dongyan

  • Author_Institution
    Dept. of Comput. Sci., Purdue Univ., West Lafayette, IN, USA
  • Volume
    36
  • Issue
    5
  • fYear
    2010
  • Firstpage
    688
  • Lastpage
    703
  • Abstract
    Program input syntactic structure is essential for a wide range of applications such as test case generation, software debugging, and network security. However, such important information is often not available (e.g., most malware programs make use of secret protocols to communicate) or not directly usable by machines (e.g., many programs specify their inputs in plain text or other random formats). Furthermore, many programs claim they accept inputs with a published format, but their implementations actually support a subset or a variant. Based on the observations that input structure is manifested by the way input symbols are used during execution and most programs take input with top-down or bottom-up grammars, we devise two dynamic analyses, one for each grammar category. Our evaluation on a set of real-world programs shows that our technique is able to precisely reverse engineer input syntactic structure from execution. We apply our technique to hierarchical delta debugging (HDD) and network protocol reverse engineering. Our technique enables the complete automation of HDD, in which programmers were originally required to provide input grammars, and improves the runtime performance of HDD. Our client study on network protocol reverse engineering also shows that our technique supersedes existing techniques.
  • Keywords
    data structures; grammars; program debugging; protocols; reverse engineering; HDD automation; bottom-up grammars; hierarchical delta debugging; network protocol reverse engineering; network security; program input syntactic structure; software debugging; test case generation; top-down grammars; Application software; Automation; Computer science; Information security; Protocols; Reverse engineering; Runtime; Software debugging; Software testing; XML; Input syntactic structure; bottom-up grammar; control dependence; delta debugging; grammar inference; reverse engineering; top-down grammar
  • fLanguage
    English
  • Journal_Title
    Software Engineering, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0098-5589
  • Type

    jour

  • DOI
    10.1109/TSE.2009.54
  • Filename
    5210120