Poster: Performing Cloud Computation on a Parallel File System

Author

Wilson, Ellis

fYear

2012

fDate

10-16 Nov. 2012

Firstpage

1545

Lastpage

1545

Abstract

The MapReduce (MR) framework is a programming environment that facilitates rapid design of parallel applications that process big data. While born in the Cloud arena, numerous other areas are now attempting to utilize it for their big data due to the speed of development. However, for HPC researchers and many others who already utilize centralized storage, MR marks a paradigm shift toward co-located storage and computation resources. In this work I attempt to reach the best of both worlds by exploring how to utilize MR on a network-attached parallel file system. This work is nearly complete and has unearthed key issues that I have subsequently overcome to achieve the desired high throughput. In my poster I describe many of these issues, demonstrate the improvements possible with different architectural schemas, and provide reliability and fault-tolerance considerations for this novel combination of Cloud computation and HPC storage.

fLanguage

English

Publisher

IEEE

Conference_Titel

High Performance Computing, Networking, Storage and Analysis (SCC), 2012 SC Companion

Conference_Location

Salt Lake City, UT

Print_ISBN

978-1-4673-6218-4

Type

conf

DOI

10.1109/SC.Companion.2012.317

Filename

6496101

Link To Document

https://search.isc.ac/dl/search/defaultta.aspx?DTC=49&DC=1920372