• DocumentCode
    2400910
  • Title
    Inferring semantically related words from software context
  • Author
    Yang, Jinqiu; Tan, Lin
  • Author_Institution
    Univ. of Waterloo, Waterloo, ON, Canada
  • fYear
    2012
  • fDate
    2-3 June 2012
  • Firstpage
    161
  • Lastpage
    170
  • Abstract
    Code search is an integral part of software development and program comprehension. The difficulty of code search lies in the inability to guess the exact words used in the code. Therefore, it is crucial for keyword-based code search to expand queries with semantically related words, e.g., synonyms and abbreviations, to increase search effectiveness. However, relying on resources such as English dictionaries and WordNet to obtain semantically related words in software is of limited use, because many words that are semantically related in software are not semantically related in English. This paper proposes a simple and general technique to automatically infer semantically related words in software by leveraging the context of words in comments and code. We achieve reasonable accuracy on seven large and popular code bases written in C and Java. Our further evaluation against the state of the art shows that our technique can achieve higher precision and recall.
  • Keywords
    C language; Java; query processing; software engineering; word processing; English dictionaries; WordNet; abbreviations; code search; keyword-based code search; program comprehension; semantically related word inference; software context; software development; synonyms; Context; Dictionaries; Gold; Kernel; Linux; Semantically related words
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Mining Software Repositories (MSR), 2012 9th IEEE Working Conference on
  • Conference_Location
    Zurich
  • ISSN
    2160-1852
  • Print_ISBN
    978-1-4673-1760-3
  • Type
    conf
  • DOI
    10.1109/MSR.2012.6224276
  • Filename
    6224276