Title :
Heuristic-based part-of-speech tagging of source code identifiers and comments
Author :
Reem S. Alsuhaibani;Christian D. Newman;Michael L. Collard;Jonathan I. Maletic
Author_Institution :
Computer Science, Kent State University, Kent, OH, USA
fDate :
9/1/2015 12:00:00 AM
Abstract :
An approach is presented that uses heuristics and static program analysis information to mark up part-of-speech for program identifiers. It does not use a natural language part-of-speech tagger for identifiers within the code. Instead, a set of heuristics is defined, akin to natural language usage, based on how identifiers are used in code. Additionally, automatically derived method stereotype information is used in the tagging process. The approach is built on the srcML infrastructure and adds part-of-speech information directly into the srcML markup.
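As an illustration only, the general idea of tagging identifiers by rule, rather than with a natural language tagger, might be sketched as follows. The abstract does not enumerate the paper's actual heuristics, so everything here is a hypothetical assumption: the splitting rule, the small verb and preposition lists, the tag names, and the way the `stereotype` argument is consulted. The sketch also works on plain strings and does not use srcML.

```python
import re

# Hypothetical word lists; the paper's real heuristics are not given in the abstract.
COMMON_VERBS = {"get", "set", "is", "has", "add", "remove", "compute", "find"}
PREPOSITIONS = {"to", "from", "of", "in", "on", "by", "with", "for"}
# Hypothetical stereotype labels that suggest a method name starts with a verb.
VERB_LED_STEREOTYPES = {"get", "set", "command"}


def split_identifier(name):
    """Split snake_case and camelCase identifiers into lowercase words."""
    words = []
    for part in re.split(r"[_\W]+", name):
        # Acronym runs (XML), capitalized or lowercase words, and digit runs.
        words.extend(re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", part))
    return [w.lower() for w in words]


def tag_identifier(name, kind, stereotype=None):
    """Assign a coarse part-of-speech tag to each word of an identifier.

    kind is "function" or "variable"; stereotype is an optional label that a
    separate analysis is assumed to have derived for the method.
    """
    words = split_identifier(name)
    tags = []
    for i, word in enumerate(words):
        leads_function = kind == "function" and i == 0
        if word in PREPOSITIONS:
            tag = "preposition"
        elif leads_function and (word in COMMON_VERBS
                                 or stereotype in VERB_LED_STEREOTYPES):
            tag = "verb"
        elif i == len(words) - 1:
            tag = "noun"           # the last word is treated as the head noun
        else:
            tag = "noun-modifier"  # earlier words modify the head noun
        tags.append((word, tag))
    return tags


print(tag_identifier("getUserName", "function"))
print(tag_identifier("max_count", "variable"))
print(tag_identifier("convertToString", "function", stereotype="command"))
```

The third call shows the role static information could play in such a scheme: `convert` is not in the small verb list, but the assumed stereotype label still lets the rule tag it as a verb.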
Keywords :
"Speech","Object recognition","Tagging","Natural languages","Conferences","Software","Computational linguistics"
Conference_Title :
Mining Unstructured Data (MUD), 2015 IEEE 5th Workshop on
DOI :
10.1109/MUD.2015.7327960