How Hadoop Clusters Break

Author

Rabkin, A.; Katz, Randy H.

Author_Institution

Princeton Univ., Princeton, NJ, USA

Volume

Issue

fYear

2013

fDate

July-Aug. 2013

Firstpage

Lastpage

Abstract

This article examines a sample of several hundred support tickets for the Hadoop ecosystem, a widely used group of big data storage and processing systems. It presents a taxonomy of errors and how support staff address them, and identifies misconfigurations as the dominant cause of failures. Some design "antipatterns" and missing platform features contribute to these problems. Developers can use various methods to build more robust distributed systems, thereby helping users and administrators avoid some of these rough edges.

Keywords

data handling; parallel programming; Hadoop cluster; Hadoop ecosystem; data processing system; data storage system; distributed system; Analytical models; Cluster approximation; Data handling; Data storage systems; Information management; Software development; Software reliability; big data; cloud computing; distributed systems; reliability; system administration

fLanguage

English

Journal_Title

Software, IEEE

Publisher

IEEE

ISSN

0740-7459

Type

jour

DOI

10.1109/MS.2012.73

Filename

6216347

Link To Document

https://search.isc.ac/dl/search/defaultta.aspx?DTC=49&DC=17997