• DocumentCode
    1999674
  • Title

    Identifying Word Relations in Software: A Comparative Study of Semantic Similarity Tools

  • Author

    Sridhara, Giriprasad ; Hill, Emily ; Pollock, Lori ; Vijay-Shanker, K.

  • Author_Institution
    Dept. of Comput. & Inf. Sci., Univ. of Delaware, Newark, DE
  • fYear
    2008
  • fDate
    10-13 June 2008
  • Firstpage
    123
  • Lastpage
    132
  • Abstract
    Modern software systems are typically large and complex, making comprehension of these systems extremely difficult. Experienced programmers comprehend code by seamlessly processing synonyms and other word relations. Thus, we believe that automated comprehension and software tools can be significantly improved by leveraging word relations in software. In this paper, we perform a comparative study of six state-of-the-art, English-based semantic similarity techniques and evaluate their effectiveness on words from the comments and identifiers in software. Our results suggest that applying English-based semantic similarity techniques to software without any customization could be detrimental to the performance of the client software tools. We propose strategies to customize the existing semantic similarity techniques to software, and describe how various program comprehension tools can benefit from word relation information.
  • Keywords
    natural language processing; programming language semantics; reverse engineering; software tools; English-based semantic similarity; automated code comprehension; program comprehension; semantic similarity tool; software system; software tool; word relation; Databases; Natural languages; Performance evaluation; Programming profession; Software maintenance; Software performance; Software quality; Software systems; Software tools; Vehicles; automated natural language analysis; comparative study; semantic similarity techniques; tools; word relations
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Program Comprehension, 2008. ICPC 2008. The 16th IEEE International Conference on
  • Conference_Location
    Amsterdam
  • ISSN
    1092-8138
  • Print_ISBN
    978-0-7695-3176-2
  • Type
    conf
  • DOI
    10.1109/ICPC.2008.18
  • Filename
    4556124