DocumentCode
3697152
Title
Identifying Sensitive Data Items within Hadoop
Author
Ashwin Kumar TK;Hong Liu;Johnson P. Thomas;Goutam Mylavarapu
Author_Institution
Dept. of Comput. Sci., Oklahoma State Univ., Stillwater, OK, USA
fYear
2015
Firstpage
1308
Lastpage
1313
Abstract
The recent growth of big data raises security and privacy concerns. Organizations that collect data from various sources risk legal or business liabilities from security breaches and exposure of sensitive information. Only file-level access control is feasible in the current Hadoop implementation, and sensitive information can be identified only manually or from information provided by the data owner. Manual identification becomes more complicated as the types of data vary. Privacy can be compromised when sensitive information is accessed by an unauthorized user or misused by an authorized one. This paper presents the first part of our intended access control framework for Hadoop: it automates the identification of sensitive data items, a task otherwise performed manually. To identify such data items, the proposed framework harnesses data context, usage patterns, and data provenance. The framework can also keep track of data lineage.
Keywords
"Metadata","Generators","Electromyography","Context","Access control","Sensitivity","Neural networks"
Publisher
ieee
Conference_Titel
High Performance Computing and Communications (HPCC), 2015 IEEE 7th International Symposium on Cyberspace Safety and Security (CSS), 2015 IEEE 12th International Conference on Embedded Software and Systems (ICESS), 2015 IEEE 17th International Conference on
Type
conf
DOI
10.1109/HPCC-CSS-ICESS.2015.293
Filename
7336348
Link To Document