• DocumentCode
    2959059
  • Title
    iTransformer: Using SSD to Improve Disk Scheduling for High-performance I/O
  • Author
    Zhang, Xuechen ; Davis, Kei ; Jiang, Song
  • Author_Institution
    ECE Dept., Wayne State Univ., Detroit, MI, USA
  • fYear
    2012
  • fDate
    21-25 May 2012
  • Firstpage
    715
  • Lastpage
    726
  • Abstract
    The parallel data accesses inherent to large-scale data-intensive scientific computing require that data servers handle very high I/O concurrency. Concurrent requests from different processes or programs to a hard disk can cause the disk head to thrash between different disk regions, resulting in unacceptably low I/O performance. Current storage systems either rely on the disk scheduler at each data server, or use SSDs as storage, to minimize this negative performance effect. However, the ability of the scheduler to alleviate this problem by scheduling requests in memory is limited by concerns such as long disk access times and the potential loss of dirty data upon system failure. Meanwhile, SSDs are too expensive to be widely used as the major storage device in the HPC environment. We propose iTransformer, a scheme that employs a small SSD to schedule requests for the data on disk. Because SSD space is less constrained than that of the more expensive DRAM, iTransformer can buffer larger amounts of dirty data before writing it back to the disk, or prefetch a larger volume of data in a batch into the SSD. In both cases, high disk efficiency can be maintained even for concurrent requests. Furthermore, the scheme allows requests to be scheduled in the background, hiding the cost of random disk access behind the serving of process requests. Finally, because the SSD is non-volatile, concerns about the quantity of dirty data are obviated. We have implemented iTransformer in the Linux kernel and tested it on a large cluster running PVFS2. Our experiments show that iTransformer can improve the I/O throughput of the cluster by 35% on average for MPI/IO benchmarks of various data access patterns.
  • Keywords
    DRAM chips; Linux; concurrency control; hard discs; operating system kernels; parallel processing; processor scheduling; DRAM; HPC environment; I/O concurrency; I/O performance; I/O throughput; Linux kernel; MPI/IO benchmark; PVFS2; SSD; buffer; concurrent request; data access pattern; data prefetch; data server; dirty data; disk access time; disk head thrashing; disk region; disk scheduler; disk scheduling; hard disk; high-performance I/O; iTransformer; large-scale data-intensive scientific computing; nonvolatile memory; parallel data access; process request; random disk access; request scheduling; storage device; storage system; system failure; Benchmark testing; Concurrent computing; Hard disks; Linux; Prefetching; Servers; Throughput; Disk Scheduler; Shared Storage; Solid State Drive;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Parallel & Distributed Processing Symposium (IPDPS), 2012 IEEE 26th International
  • Conference_Location
    Shanghai
  • ISSN
    1530-2075
  • Print_ISBN
    978-1-4673-0975-2
  • Type
    conf
  • DOI
    10.1109/IPDPS.2012.70
  • Filename
    6267882
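
The write-back idea summarized in the abstract (stage dirty data on an SSD, then write it to disk in a batch) can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not the authors' kernel implementation: the class `SsdWriteBuffer`, the helper `seek_distance`, the block numbers, and the use of summed block-number distance as a stand-in for disk head movement are all hypothetical.

```python
from typing import Dict, List


def seek_distance(blocks: List[int], start: int = 0) -> int:
    """Total head movement (in blocks) when visiting blocks in the given order."""
    total, pos = 0, start
    for blk in blocks:
        total += abs(blk - pos)
        pos = blk
    return total


class SsdWriteBuffer:
    """Hypothetical stand-in for an SSD staging area that holds dirty blocks."""

    def __init__(self, capacity_blocks: int) -> None:
        self.capacity_blocks = capacity_blocks
        self.dirty: Dict[int, bytes] = {}        # block number -> latest data
        self.disk: Dict[int, bytes] = {}         # simulated hard disk contents
        self.flush_orders: List[List[int]] = []  # block order of each batch

    def write(self, block: int, data: bytes) -> None:
        # Absorb the write on the "SSD"; flush once the buffer is full.
        self.dirty[block] = data
        if len(self.dirty) >= self.capacity_blocks:
            self.flush()

    def flush(self) -> None:
        # Write back in ascending block order so the head sweeps once.
        order = sorted(self.dirty)
        for block in order:
            self.disk[block] = self.dirty[block]
        if order:
            self.flush_orders.append(order)
        self.dirty.clear()


# Two interleaved writers touching distant disk regions (made-up workload).
arrivals = [900, 10, 880, 20, 910, 30]
buf = SsdWriteBuffer(capacity_blocks=len(arrivals))
for blk in arrivals:
    buf.write(blk, b"x")

in_order = seek_distance(arrivals)            # serve requests as they arrive
batched = seek_distance(buf.flush_orders[0])  # serve the sorted batch
print(in_order, batched)
```

In this made-up workload the interleaved arrival order forces the simulated head back and forth between two regions, while the sorted batch visits each region once. Real disks, schedulers, and the system described in the abstract involve many factors this model ignores.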