• DocumentCode
    1350856
  • Title

    Improving Source Code Lexicon via Traceability and Information Retrieval

  • Author

    De Lucia, Andrea ; Di Penta, Massimiliano ; Oliveto, Rocco

  • Author_Institution
    Dept. of Math. & Inf., Univ. of Salerno, Fisciano, Italy
  • Volume
    37
  • Issue
    2
  • fYear
    2011
  • Firstpage
    205
  • Lastpage
    227
  • Abstract
    The paper presents an approach helping developers to maintain source code identifiers and comments consistent with high-level artifacts. Specifically, the approach computes and shows the textual similarity between source code and related high-level artifacts. Our conjecture is that developers are induced to improve the source code lexicon, i.e., terms used in identifiers or comments, if the software development environment provides information about the textual similarity between the source code under development and the related high-level artifacts. The proposed approach also recommends candidate identifiers built from high-level artifacts related to the source code under development and has been implemented as an Eclipse plug-in, called COde Comprehension Nurturant Using Traceability (COCONUT). The paper also reports on two controlled experiments performed with master's and bachelor's students. The goal of the experiments is to evaluate the quality of identifiers and comments (in terms of their consistency with high-level artifacts) in the source code produced when using or not using COCONUT. The achieved results confirm our conjecture that providing the developers with similarity between code and high-level artifacts helps to improve the quality of source code lexicon. This indicates the potential usefulness of COCONUT as a feature for software development environments.
  • Keywords
    information retrieval; program diagnostics; software quality; COCONUT; bachelor student; candidate identifier; code comprehension nurturant using traceability; high level artifact; master student; software development; source code lexicon; textual similarity; Software traceability; empirical software engineering; software development environments; source code comprehensibility; source code identifier quality
  • fLanguage
    English
  • Journal_Title
    Software Engineering, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0098-5589
  • Type

    jour

  • DOI
    10.1109/TSE.2010.89
  • Filename
    5601742