Author :
Freitas, Jose Luis ; da Cruz, Daniela ; Henriques, Pedro Rangel
Abstract :
Comments are interspersed by the Programmer among code lines, at software development phase, with two main purposes: to help himself during the development phase, to help other programmers later on, during the maintenance phase. The former are memos to help him remembering to do something, they are not useful for those willing to understand code. The latter are explanations about the ideas he has in mind when he wrote the code, they can be a relevant aid for others and should be taken into consideration as a first step in program comprehension. Comments are scattered all over the source code, sometimes wrapping a block of code (placed at the beginning or at its end), other times complementing a single statement. If comments are inserted to help in understanding the programmer ideas, they will contain for sure concepts associated with problem domain In this paper we discuss an approach to locate a relevant code chunk (one where the programmer should focus the attention for software maintenance), using information retrieval techniques to locate problem domain concepts within comments. In our approach, comments are isolated marking their type (inline, block or javadoc comment) and keeping their context (code lines to which they are associated). Picking up concepts from the ontology that describes the problem, it is possible to find all the comments that contain that concept (similar words) and rate them. Reading comments from the retrieved list, the programmer can select those that seem to him meaningful and dive directly into the associated chunk. In the paper, we also survey Comment Analysis techniques and describe an environment, Darius, that aims at automatizing the approach proposed. Moreover, Dariusprovides functionality to study comments frequency in the source files of a given project, to support the discussion weather it is worthwhile or not to apply this program comprehension step.