DocumentCode :
2959059
Title :
iTransformer: Using SSD to Improve Disk Scheduling for High-performance I/O
Author :
Zhang, Xuechen ; Davis, Kei ; Jiang, Song
Author_Institution :
ECE Dept., Wayne State Univ., Detroit, MI, USA
fYear :
2012
fDate :
21-25 May 2012
Firstpage :
715
Lastpage :
726
Abstract :
The parallel data accesses inherent to large-scale data-intensive scientific computing require that data servers handle very high I/O concurrency. Concurrent requests from different processes or programs to a hard disk can cause disk head thrashing between different disk regions, resulting in unacceptably low I/O performance. Current storage systems either rely on the disk scheduler at each data server, or use SSD as storage, to minimize this negative performance effect. However, the ability of the scheduler to alleviate this problem by scheduling requests in memory is limited by concerns such as long disk access times and the potential loss of dirty data on system failure. Meanwhile, SSD is too expensive to be widely used as the major storage device in the HPC environment. We propose iTransformer, a scheme that employs a small SSD to schedule requests for the data on disk. Because SSD is less space-constrained than more expensive DRAM, iTransformer can buffer larger amounts of dirty data before writing it back to the disk, or prefetch a larger volume of data in a batch into the SSD. In both cases high disk efficiency can be maintained even for concurrent requests. Furthermore, the scheme allows the scheduling of requests in the background to hide the cost of random disk access behind serving process requests. Finally, because SSD is non-volatile memory, concerns about the quantity of dirty data are obviated. We have implemented iTransformer in the Linux kernel and tested it on a large cluster running PVFS2. Our experiments show that iTransformer can improve the I/O throughput of the cluster by 35% on average for MPI/IO benchmarks of various data access patterns.
Keywords :
DRAM chips; Linux; concurrency control; hard discs; operating system kernels; parallel processing; processor scheduling; DRAM; HPC environment; I/O concurrency; I/O performance; I/O throughput; Linux kernel; MPI/IO benchmark; PVFS2; SSD; buffer; concurrent request; data access pattern; data prefetch; data server; dirty data; disk access time; disk head thrashing; disk region; disk scheduler; disk scheduling; hard disk; high-performance I/O; iTransformer; large-scale data-intensive scientific computing; nonvolatile memory; parallel data access; process request; random disk access; request scheduling; storage device; storage system; system failure; Benchmark testing; Concurrent computing; Hard disks; Linux; Prefetching; Servers; Throughput; Disk Scheduler; Shared Storage; Solid State Drive;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Parallel & Distributed Processing Symposium (IPDPS), 2012 IEEE 26th International
Conference_Location :
Shanghai
ISSN :
1530-2075
Print_ISBN :
978-1-4673-0975-2
Type :
conf
DOI :
10.1109/IPDPS.2012.70
Filename :
6267882