DocumentCode
1914508
Title
How GridFTP Pipelining, Parallelism and Concurrency Work: A Guide for Optimizing Large Dataset Transfers
Author
Yildirim, E. ; JangYoung Kim ; Kosar, Tevfik
Author_Institution
Dept. of Comput. Eng., Fatih Univ., Istanbul, Turkey
fYear
2012
fDate
10-16 Nov. 2012
Firstpage
506
Lastpage
515
Abstract
Optimizing the transfer of large files over high-bandwidth networks is a challenging task that requires the consideration of many parameters (e.g., network speed, round-trip time, and current traffic). This task becomes more complex when transferring datasets composed of many small files. In this case, the performance of large dataset transfers depends not only on the characteristics of the transfer protocol and network, but also on the number and the size distribution of the files that constitute the dataset. GridFTP is the most advanced transfer tool that provides functions to overcome large dataset transfer bottlenecks. Three of the most important parameters of GridFTP are pipelining, parallelism and concurrency. In this study, we examine the effects of these three parameters, provide models for optimizing them, define guidelines, and give an algorithm for their practical use in transferring large datasets made up of files of varying sizes.
Keywords
concurrency control; data communication; grid computing; optimisation; parallel processing; pipeline processing; transport protocols; GridFTP parallelism; GridFTP pipelining; concurrency work; high-bandwidth networks; large dataset transfer optimization; large file transfer optimisation; round-trip time; transfer protocol;
fLanguage
English
Publisher
IEEE
Conference_Titel
2012 SC Companion: High Performance Computing, Networking, Storage and Analysis (SCC)
Conference_Location
Salt Lake City, UT
Print_ISBN
978-1-4673-6218-4
Type
conf
DOI
10.1109/SC.Companion.2012.73
Filename
6495855