Title :
Automatic extraction of glossary terms from natural language requirements
Author :
Dwarakanath, Anurag ; Ramnani, Roshni R. ; Sengupta, Sabyasachi
Author_Institution :
Accenture Technol. Labs., Bangalore, India
Abstract :
We present a method for the automatic extraction of glossary terms from unconstrained natural language requirements. The glossary terms are identified in two steps: (a) compute units, which are candidates for glossary terms, and (b) disambiguate between mutually exclusive units to identify terms. We introduce novel linguistic techniques to identify process nouns, abstract nouns and auxiliary verbs. The identification of units also handles co-ordinating conjunctions and adjectival modifiers, which requires resolving co-ordination ambiguity and adjectival modifier ambiguity. The identification of terms among the units adapts an in-document statistical metric. We present an evaluation of our method over a real-life set of software requirements documents and compare our results with those of a base algorithm. The intricate linguistic classification and the tackling of ambiguity result in superior performance of our approach over the base algorithm.
Keywords :
computational linguistics; formal specification; formal verification; glossaries; natural language processing; software metrics; statistical analysis; abstract nouns; adjectival modifier ambiguity; automatic glossary term extraction; auxiliary verbs; co-ordinating conjunctions; co-ordination ambiguity; compute units; disambiguation; in-document statistical metric; intricate linguistic classification; linguistic techniques; mutually exclusive units; process nouns; software requirement documents; term identification; unconstrained natural language requirements; Abstracts; Concrete; Measurement; Natural languages; Pragmatics; Software; Terminology; Adjectival Ambiguity; Co-ordination Ambiguity; Glossary Term Extraction; Natural Language Processing
Conference_Title :
Requirements Engineering Conference (RE), 2013 21st IEEE International
Conference_Location :
Rio de Janeiro
DOI :
10.1109/RE.2013.6636736