• DocumentCode
    45060
  • Title

    REPENT: Analyzing the Nature of Identifier Renamings

  • Author

    Arnaoudova, V. ; Eshkevari, Laleh M. ; Di Penta, Massimiliano ; Oliveto, Rocco ; Antoniol, Giuliano ; Gueheneuc, Yann-Gael

  • Author_Institution
    Polytech. Montreal, Montreal, QC, Canada
  • Volume
    40
  • Issue
    5
  • fYear
    2014
  • fDate
    May-14
  • Firstpage
    502
  • Lastpage
    532
  • Abstract
    Source code lexicon plays a paramount role in software quality: poor lexicon can lead to poor comprehensibility and even increase software fault-proneness. For this reason, renaming a program entity, i.e., altering the entity identifier, is an important activity during software evolution. Developers rename when they feel that the name of an entity is no longer consistent with its functionality, or when such a name may be misleading. A survey that we performed with 71 developers suggests that 39 percent perform renaming from a few times per week to almost every day and that 92 percent of the participants consider that renaming is not straightforward. However, despite the cost that is associated with renaming, renamings are seldom if ever documented: for example, less than 1 percent of the renamings in the five programs that we studied were documented. This explains why participants largely agree on the usefulness of automatically documenting renamings. In this paper we propose REnaming Program ENTities (REPENT), an approach to automatically document (that is, detect and classify) identifier renamings in source code. REPENT detects renamings based on a combination of source code differencing and data flow analyses. Using a set of natural language tools, REPENT classifies renamings into the different dimensions of a taxonomy that we defined. Using the documented renamings, developers will be able to, for example, look up methods that are part of the public API (as they impact client applications), or look for inconsistencies between the name and the implementation of an entity that underwent a high-risk renaming (e.g., towards the opposite meaning). We evaluate the accuracy and completeness of REPENT on the evolution history of five open-source Java programs. The study indicates a precision of 88 percent and a recall of 92 percent. In addition, we report an exploratory study investigating and discussing how identifiers are renamed in the five programs, according to our taxonomy.
  • Keywords
    data flow analysis; pattern classification; software fault tolerance; software quality; source code (software); REPENT; data flow analysis; entity identifier; identifier renaming analysis; natural language tools; open-source Java programs; program entity renaming; public API; renaming program entities; software evolution; software fault-proneness; software quality; source code lexicon; taxonomy dimensions; Documentation; Grammar; History; Java; Semantics; Software; Taxonomy; Identifier renaming; empirical study; mining software repositories; program comprehension; refactoring
  • fLanguage
    English
  • Journal_Title
    Software Engineering, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0098-5589
  • Type

    jour

  • DOI
    10.1109/TSE.2014.2312942
  • Filename
    6776542