DocumentCode :
3078007
Title :
Towards Provenance-Based Anomaly Detection in MapReduce
Author :
Liao, Cong ; Squicciarini, Anna
Author_Institution :
Coll. of Inf. Sci. & Technol., Pennsylvania State Univ., University Park, PA, USA
fYear :
2015
fDate :
4-7 May 2015
Firstpage :
647
Lastpage :
656
Abstract :
MapReduce enables parallel and distributed processing of vast amounts of data on a cluster of machines. However, this computing paradigm is subject to threats posed by malicious and cheating nodes or by compromised user-submitted code that could tamper with data and computation, since users retain little control when the computation is carried out in a distributed fashion. In this paper, we focus on the analysis and detection of anomalies during MapReduce computation. Accordingly, we develop a computational provenance system that captures provenance data related to MapReduce computation within the MapReduce framework in Hadoop. In particular, we identify a set of invariants against aggregated provenance information, which are later analyzed to uncover anomalies indicating possible tampering with data and computation. We conduct a series of experiments to show the efficiency and effectiveness of our proposed provenance system.
Keywords :
data analysis; parallel processing; security of data; Hadoop; MapReduce computation; computational provenance system; data tampering; provenance-based anomaly detection; Access control; Cloud computing; Containers; Distributed databases; Monitoring; Yarn; MapReduce; computation integrity; logging; provenance;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Cluster, Cloud and Grid Computing (CCGrid), 2015 15th IEEE/ACM International Symposium on
Conference_Location :
Shenzhen
Type :
conf
DOI :
10.1109/CCGrid.2015.16
Filename :
7152530