Learning to Generate Pseudo-Code from Source Code Using Statistical Machine Translation (T)

Author

Yusuke Oda;Hiroyuki Fudaba;Graham Neubig;Hideaki Hata;Sakriani Sakti;Tomoki Toda;Satoshi Nakamura

Author_Institution

Grad. Sch. of Inf. Sci., Nara Inst. of Sci. &

fYear

2015

Firstpage

574

Lastpage

584

Abstract

Pseudo-code written in natural language can aid the comprehension of source code in unfamiliar programming languages. However, the great majority of source code has no corresponding pseudo-code, because pseudo-code is redundant and laborious to create. If pseudo-code could be generated automatically and instantly from given source code, we could allow for on-demand production of pseudo-code without human effort. In this paper, we propose a method to automatically generate pseudo-code from source code, specifically adopting the statistical machine translation (SMT) framework. SMT, which was originally designed to translate between two natural languages, allows us to automatically learn the relationship between source code/pseudo-code pairs, making it possible to create a pseudo-code generator with less human effort. In experiments, we generated English or Japanese pseudo-code from Python statements using SMT, and found that the generated pseudo-code is largely accurate and aids code understanding.

Keywords

"Natural languages","Computer languages","Software engineering","Programming profession","Generators","Software"

Publisher

ieee

Conference_Titel

Automated Software Engineering (ASE), 2015 30th IEEE/ACM International Conference on

Type

conf

DOI

10.1109/ASE.2015.36

Filename

7372045