• DocumentCode
    700357
  • Title

    Improving pattern tracking with a language-aware tree differencing algorithm

  • Author

    Palix, Nicolas ; Falleri, Jean-Remy ; Lawall, Julia

  • Author_Institution
    LIG-Erods, UJF - Grenoble-Alps Univ., Grenoble, France
  • fYear
    2015
  • fDate
    2-6 March 2015
  • Firstpage
    43
  • Lastpage
    52
  • Abstract
    Tracking code fragments of interest is important in monitoring a software project over multiple versions. Various approaches, including our previous work on Herodotos, exploit the notion of Longest Common Subsequence, as computed by readily available tools such as GNU Diff, to map corresponding code fragments. Nevertheless, efficient code differencing algorithms are typically line-based or word-based, and thus do not report changes at the level of language constructs. Furthermore, they identify only additions and removals, not the movement of a block of code from one part of a file to another. Code fragments of interest that fall within the added and removed regions of code have to be manually correlated across versions, which is tedious and error-prone. When studying a very large code base over a long time, the number of manual correlations can become an obstacle to the success of a study. In this paper, we investigate the effect of replacing the line-based algorithm currently used by Herodotos with tree matching, as provided by the algorithm of the differencing tool GumTree. In contrast to the line-based approach, the tree-based approach does not require any manual correlations, but it incurs a high execution time. To address this problem, we propose a hybrid strategy that gives the best of both approaches.
  • Keywords
    program debugging; software maintenance; source code (software); trees (mathematics); GNU Diff; GumTree; Herodotos; code differencing algorithm; code fragment tracking; language-aware tree differencing algorithm; longest common subsequence; pattern tracking; software project monitoring; tree-matching; Correlation; Kernel; Linux; Manuals; Vegetation; XML; code metrics; code tracking
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Software Analysis, Evolution and Reengineering (SANER), 2015 IEEE 22nd International Conference on
  • Conference_Location
    Montreal, QC
  • Type

    conf

  • DOI
    10.1109/SANER.2015.7081814
  • Filename
    7081814