DocumentCode
2420328
Title
GrayWulf: Scalable Clustered Architecture for Data Intensive Computing
Author
Szalay, Alexander S. ; Bell, Gordon ; VandenBerg, J. ; Wonders, A. ; Burns, Randal ; Fay, Dan ; Heasley, J. ; Hey, T. ; Nieto-Santisteban, M. ; Thakar, A. ; van Ingen, C. ; Wilton, R.
Author_Institution
Johns Hopkins Univ., Baltimore, MD
fYear
2009
fDate
5-8 Jan. 2009
Firstpage
1
Lastpage
10
Abstract
Data intensive computing presents a significant challenge for traditional supercomputing architectures that maximize FLOPS, since CPU speed has surpassed the IO capabilities of HPC systems and BeoWulf clusters. We present the architecture of GrayWulf, a three-tier commodity component cluster designed for a range of data intensive computations operating on petascale data sets. The design goal is a balanced system in terms of IO performance and memory size, according to Amdahl's laws. The hardware currently installed at JHU exceeds one petabyte of storage and has 0.5 bytes/sec of I/O and 1 byte of memory for each CPU cycle. The GrayWulf provides almost an order of magnitude better balance than existing systems. The paper covers its architecture and reference applications. The software design is presented in a companion paper.
Keywords
parallel machines; pattern clustering; software architecture; Amdahl laws; BeoWulf clusters; GrayWulf; IO capabilities; commodity component cluster; data intensive computations; data intensive computing; software design; supercomputing architectures; Application software; Central Processing Unit; Cloud computing; Computer architecture; Data analysis; Grid computing; Hardware; High performance computing; Supercomputers; Workstations;
fLanguage
English
Publisher
IEEE
Conference_Titel
System Sciences, 2009. HICSS '09. 42nd Hawaii International Conference on
Conference_Location
Big Island, HI
ISSN
1530-1605
Print_ISBN
978-0-7695-3450-3
Type
conf
DOI
10.1109/HICSS.2009.234
Filename
4755780