Title :
Block-Based Concurrent and Storage-Aware Data Streaming for Grid Applications with Lots of Small Files
Author :
Zhang, Wen ; Cao, Junwei ; Zhong, Yisheng ; Liu, Lianchen ; Wu, Cheng
Author_Institution :
Department of Automation, Tsinghua University, Beijing
Abstract :
Data streaming management and scheduling are required by many grid computing applications, especially when the volume of data to be processed is extremely high while available storage is relatively limited. Large volumes of data from scientific experiments are usually partitioned into lots of small files (LOSF), which poses challenges for data streaming support. This work proposes block-based data transfer, implemented using GridFTP, in which the number of blocks or the size of each block must be carefully scheduled, taking makespan and available storage into account simultaneously. To increase processing efficiency, data streaming and processing have to be performed concurrently, and data streaming scheduling must be storage-aware to avoid data overflow. Experimental results show that the proposed optimization method for block-based concurrent and storage-aware data streaming handles the LOSF problem efficiently, with relatively good performance in terms of makespan and storage usage.
Keywords :
concurrency control; grid computing; scheduling; storage management; block-based concurrent data streaming management; storage-aware data streaming management; Application software; Concurrent computing; Information science; Information technology; Laboratories; Observatories; Pipeline processing; Scheduling algorithm; Storage automation; Data Streaming; Lots of Small Files
Conference_Title :
9th IEEE/ACM International Symposium on Cluster Computing and the Grid (CCGRID '09), 2009
Conference_Location :
Shanghai
Print_ISBN :
978-1-4244-3935-5
Electronic_ISBN :
978-0-7695-3622-4
DOI :
10.1109/CCGRID.2009.26