• DocumentCode
    2947206
  • Title
    Beyond block I/O: Rethinking traditional storage primitives
  • Author
    Ouyang, Xiangyong; Nellans, David; Wipfel, Robert; Flynn, David; Panda, Dhabaleswar K.
  • fYear
    2011
  • fDate
    12-16 Feb. 2011
  • Firstpage
    301
  • Lastpage
    311
  • Abstract
    Over the last twenty years, the interfaces for accessing persistent storage within a computer system have remained essentially unchanged. Simply put, seek, read, and write have defined the fundamental operations that can be performed against storage devices. These three interfaces have endured because the devices within storage subsystems have not fundamentally changed since the invention of magnetic disks. Non-volatile (flash) memory (NVM) has recently become a viable enterprise-grade storage medium. Initial implementations of NVM storage devices have chosen to export these same disk-based seek/read/write interfaces because they provide compatibility for legacy applications. We propose that there is a new class of higher-order storage primitives beyond simple block I/O that high-performance solid-state storage should support. One such primitive, atomic-write, batches multiple I/O operations into a single logical group that will be persisted as a whole or rolled back upon failure. By moving write-atomicity down the stack into the storage device, it is possible to significantly reduce the amount of work required at the application, filesystem, or operating system layers to guarantee the consistency and integrity of data. In this work we provide a proof-of-concept implementation of atomic-write on a modern solid-state device that leverages the underlying log-based flash translation layer (FTL). We present an example of how database management systems can benefit from atomic-write by modifying the MySQL InnoDB transactional storage engine. Using this new atomic-write primitive, we are able to increase system throughput by 33%, improve the 90th percentile transaction response time by 20%, and reduce the volume of data written from MySQL to the storage subsystem by as much as 43% on industry-standard benchmarks, while maintaining ACID transaction semantics.
  • Keywords
    SQL; flash memories; magnetic disc storage; performance evaluation; random-access storage; ACID transaction semantics; MySQL InnoDB transactional storage engine; block I/O; computer system; database management systems; disk-based read-write interfaces; disk-based seek interfaces; enterprise grade storage medium; flash memory; log-based flash translation layer; magnetic disks; nonvolatile memory; storage primitives; storage subsystems; Ash; Atomic layer deposition; Computer crashes; Databases; Media; Semantics; Solids;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    High Performance Computer Architecture (HPCA), 2011 IEEE 17th International Symposium on
  • Conference_Location
    San Antonio, TX
  • ISSN
    1530-0897
  • Print_ISBN
    978-1-4244-9432-3
  • Type
    conf
  • DOI
    10.1109/HPCA.2011.5749738
  • Filename
    5749738