Title :
Graph-Based Statistical Language Model for Code
Author :
Nguyen, Anh Tuan; Nguyen, Tien N.
Author_Institution :
Iowa State Univ., Ames, IA, USA
Abstract :
The n-gram statistical language model has been successfully applied to capture programming patterns in support of code completion and suggestion. However, n-gram-based approaches struggle to capture patterns at higher levels of abstraction because of the mismatch between the sequential nature of n-grams and the structural nature of the syntax and semantics of source code. This paper presents GraLan, a graph-based statistical language model, and its application to code suggestion. GraLan learns from a source code corpus and computes the appearance probability of any graph given the observed (sub)graphs. We use GraLan to develop an API suggestion engine and an AST-based language model, ASTLan. ASTLan supports suggesting the next valid syntactic template and detecting common syntactic templates. Our empirical evaluation on a large corpus of open-source projects shows that our engine is more accurate in API code suggestion than the state-of-the-art approaches, and that in 75% of the cases it correctly suggests the API among only five candidates. ASTLan also achieves high accuracy in suggesting the next syntactic template and is able to detect many useful and common syntactic templates.
Keywords :
graph theory; object-oriented programming; probability; public domain software; source code (software); API suggestion engine; AST-based language model; ASTLan; GraLan; appearance probability; code completion; code suggestion; common syntactic template detection; graph-based statistical language model; n-gram statistical language model; open-source projects; programming patterns; source code corpus; valid syntactic template; Accuracy; Computational modeling; Context; Engines; Probability; Programming; Syntactics; API Suggestion; Graph-based Language Model; Syntactic Template Suggestion;
Conference_Title :
Software Engineering (ICSE), 2015 IEEE/ACM 37th IEEE International Conference on
Conference_Location :
Florence
DOI :
10.1109/ICSE.2015.336