• DocumentCode
    25285
  • Title
    Efficient Range-Based Storage Management for Scalable Datastores
  • Author
    Margaritis, Giorgos; Anastasiadis, Stergios V.
  • Author_Institution
    Dept. of Comput. Sci. & Eng., Univ. of Ioannina, Ioannina, Greece
  • Volume
    25
  • Issue
    11
  • fYear
    2014
  • fDate
    Nov. 2014
  • Firstpage
    2851
  • Lastpage
    2866
  • Abstract
    Scalable datastores are distributed storage systems capable of managing enormous amounts of structured data for online serving and analytics applications. Across different workloads, they weaken the relational and transactional assumptions of traditional databases to achieve horizontal scalability and availability, and meet demanding throughput and latency requirements. Efficiency tradeoffs at each storage server often lead to design decisions that sacrifice query responsiveness for higher insertion throughput. To address this limitation, we introduce the novel Rangetable storage structure and Rangemerge method, which efficiently manage structured data at the granularity of key ranges. We develop a general prototype framework and implement several representative methods as plugins to evaluate their performance experimentally under common operating conditions. Our experiments show that our approach incurs minimal range-query latency with low sensitivity to concurrent insertions, achieves insertion performance that approximates that of write-optimized methods under modest query load, and reduces the reserved disk space by up to half.
  • Keywords
    distributed databases; query processing; relational databases; storage management; Rangemerge method; Rangetable storage structure; distributed storage systems; query responsiveness; range-based storage management; range-query latency; reserved disk space; scalable datastores; storage server; structured data management; transactional databases; write-optimized methods; Compaction; Complexity theory; Indexes; Merging; Servers; Throughput; Distributed systems; measurements; performance
  • fLanguage
    English
  • Journal_Title
    Parallel and Distributed Systems, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1045-9219
  • Type
    jour
  • DOI
    10.1109/TPDS.2013.305
  • Filename
    6684149