DocumentCode :
3178897
Title :
Utilizing Web Search Engines for Program Analysis
Author :
Ratiu, Daniel ; Heinemann, Lars
Author_Institution :
Tech. Univ. München, München, Germany
fYear :
2010
fDate :
June 30 2010-July 2 2010
Firstpage :
94
Lastpage :
103
Abstract :
Programming involves representing domain concepts by using programming abstractions. In object-oriented programs, concepts and relations of the business domain are represented as classes, attributes and methods. However, the concepts and relations that logically belong together are scattered across different modules, interleaved with technical concepts, and distorted due to implementation details. In this paper, we present an automatic method to identify logically related concepts and the relations among them. To achieve this, we systematically transform program identifiers into fragments of natural language sentences and check whether these sentence fragments are meaningful for humans. In order to automatically perform such checks, we use the World Wide Web as a knowledge base that contains a huge number of meaningful texts, and use the Google web search engine to validate the meaningfulness of these sentences. If the search engine returns a sufficient number of hits, we have discovered a piece of knowledge in the code. By systematically applying this method, we obtain a condensed form of the knowledge embodied in the program, which enables automatic analyses. We present our experience with several use cases: (1) assessing the meaningfulness of identifiers, (2) extracting complex concepts from compound identifiers, (3) extracting a meaningful taxonomy from the class hierarchy, and (4) extracting complex conceptual relations from the code. We report on our observations during the analysis of real-world Java code, discuss the limitations of our approach, and sketch possible extensions.
Keywords :
Internet; Java; knowledge based systems; object-oriented programming; program diagnostics; search engines; Google; Java code; Web search engine; World Wide Web; business domain; class hierarchy; knowledge base; natural language sentence; object-oriented program; program analysis; program fragment; program identifier; programming abstraction; sentence fragment; Cloning; Humans; Java; Natural languages; Scattering; Search engines; Shape; Taxonomy; Web search; Web sites; analysis of identifiers; concept location; domain knowledge; program analysis;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Program Comprehension (ICPC), 2010 IEEE 18th International Conference on
Conference_Location :
Braga, Minho
ISSN :
1092-8138
Print_ISBN :
978-1-4244-7604-6
Electronic_ISBN :
1092-8138
Type :
conf
DOI :
10.1109/ICPC.2010.26
Filename :
5521759