Title :
Automated text mining for requirements analysis of policy documents
Author :
Massey, Aaron K. ; Eisenstein, Jacob ; Anton, Annie I. ; Swire, Peter P.
Author_Institution :
Sch. of Interactive Comput., Georgia Inst. of Technol., Atlanta, GA, USA
Abstract :
Businesses and organizations in jurisdictions around the world are required by law to provide their customers and users with information about their business practices in the form of policy documents. Requirements engineers analyze these documents as sources of requirements, but this analysis is a time-consuming and mostly manual process. Moreover, policy documents contain legalese and present readability challenges to requirements engineers seeking to analyze them. In this paper, we perform a large-scale analysis of 2,061 policy documents, including policy documents from the Google Top 1000 most visited websites and the Fortune 500 companies, for three purposes: (1) to assess the readability of these policy documents for requirements engineers; (2) to determine if automated text mining can indicate whether a policy document contains requirements expressed as either privacy protections or vulnerabilities; and (3) to establish the generalizability of prior work in the identification of privacy protections and vulnerabilities from privacy policies to other policy documents. Our results suggest that this requirements analysis technique, developed on a small set of policy documents in two domains, may generalize to other domains.
Keywords :
data mining; data privacy; formal verification; software reliability; text analysis; Web sites; automated text mining; business practices; large-scale analysis; policy document readability; privacy protections; readability challenges; requirements analysis; requirements engineers; Analytical models; Companies; Google; Privacy; Regulators; Text mining
Conference_Title :
Requirements Engineering Conference (RE), 2013 21st IEEE International
Conference_Location :
Rio de Janeiro
DOI :
10.1109/RE.2013.6636700