DocumentCode
1607240
Title
Graph-Based Statistical Language Model for Code
Author
Nguyen, Anh Tuan ; Nguyen, Tien N.
Author_Institution
Iowa State Univ., Ames, IA, USA
Volume
1
fYear
2015
Firstpage
858
Lastpage
868
Abstract
The n-gram statistical language model has been successfully applied to capture programming patterns in support of code completion and suggestion. However, n-gram-based approaches face challenges in capturing patterns at higher levels of abstraction because of the mismatch between the sequential nature of n-grams and the structural nature of syntax and semantics in source code. This paper presents GraLan, a graph-based statistical language model, and its application to code suggestion. GraLan learns from a source code corpus and computes the appearance probability of any graph given the observed (sub)graphs. We use GraLan to develop an API suggestion engine and an AST-based language model, ASTLan. ASTLan supports suggesting the next valid syntactic template and detecting common syntactic templates. Our empirical evaluation on a large corpus of open-source projects shows that our engine is more accurate in API code suggestion than the state-of-the-art approaches, and in 75% of the cases it can correctly suggest the API with only five candidates. ASTLan also achieves high accuracy in suggesting the next syntactic template and is able to detect many useful and common syntactic templates.
Keywords
graph theory; object-oriented programming; probability; public domain software; source code (software); API suggestion engine; AST-based language model; ASTLan; GraLan; appearance probability; code completion; code suggestion; common syntactic template detection; graph-based statistical language model; n-gram statistical language model; open-source projects; programming patterns; source code corpus; valid syntactic template; Accuracy; Computational modeling; Context; Engines; Probability; Programming; Syntactics; API Suggestion; Graph-based Language Model; Syntactic Template Suggestion;
fLanguage
English
Publisher
IEEE
Conference_Titel
Software Engineering (ICSE), 2015 IEEE/ACM 37th IEEE International Conference on
Conference_Location
Florence
Type
conf
DOI
10.1109/ICSE.2015.336
Filename
7194632