Title :
Automated extraction of non-functional requirements in available documentation
Author :
Slankas, John ; Williams, Laurie
Author_Institution :
Dept. of Comput. Sci., North Carolina State Univ., Raleigh, NC, USA
Abstract :
While all systems have non-functional requirements (NFRs), they may not be explicitly stated in a formal requirements specification. Furthermore, NFRs may also be externally imposed via government regulations or industry standards. As some NFRs represent emergent system properties, those NFRs require appropriate analysis and design efforts to ensure they are met. When the specified NFRs are not met, projects incur costly re-work to correct the issues. The goal of our research is to aid analysts in more effectively extracting relevant non-functional requirements in available unconstrained natural language documents through automated natural language processing. Specifically, we examine which document types (data use agreements, install manuals, regulations, requests for proposals, requirements specifications, and user manuals) contain NFRs categorized into 14 NFR categories (e.g., capacity, reliability, and security). We measure how effectively we can identify and classify NFR statements within these documents. In each of the documents evaluated, we found NFRs present. Using a word vector representation of the NFRs, a support vector machine algorithm performed twice as effectively compared to the same input to a multinomial naïve Bayes classifier. Our k-nearest neighbor classifier with a unique distance metric had an F1 measure of 0.54, outperforming in our experiments the optimal naïve Bayes classifier, which had an F1 measure of 0.32. We also found that stop word lists beyond common determiners had minimal performance effect.
Keywords :
Bayes methods; formal specification; learning (artificial intelligence); natural language processing; pattern classification; support vector machines; text analysis; user manuals; F1 measure; NFR statements; automated extraction; automated natural language processing; data use agreements; distance metric; document types; formal requirements specification; government regulations; industry standards; install manuals; k-nearest neighbor classifier; multinomial naïve Bayes classifier; nonfunctional requirements; optimal naïve Bayes classifier; support vector machine algorithm; system proprieties; unconstrained natural language documents; user manuals; word vector representation; Classification algorithms; Documentation; Machine learning algorithms; Measurement; Natural languages; Security; Standards; classification; documentation; machine learning; natural language processing; non-functional requirements;
Conference_Title :
Natural Language Analysis in Software Engineering (NaturaLiSE), 2013 1st International Workshop on
Conference_Location :
San Francisco, CA
DOI :
10.1109/NAturaLiSE.2013.6611715