• DocumentCode
    3076860
  • Title

    Deferred Lightweight Indexing for Log-Structured Key-Value Stores

  • Author

    Yuzhe Tang ; Arun Iyengar ; Wei Tan ; Liana Fong ; Ling Liu ; Balaji Palanisamy

  • Author_Institution
    Syracuse Univ., Syracuse, NY, USA
  • fYear
    2015
  • fDate
    4-7 May 2015
  • Firstpage
    11
  • Lastpage
    20
  • Abstract
    The recent shift towards write-intensive workloads on big data (e.g., financial trading, social user-generated data streams) has driven the proliferation of log-structured key-value stores, represented by Google's BigTable [1], Apache HBase [2] and Cassandra [3]. While providing key-based data access with a Put/Get interface, these key-value stores do not support value-based access methods, which significantly limits their applicability in modern web and database applications. In this paper, we present DELI, a DEferred Lightweight Indexing scheme on log-structured key-value stores. To index intensively updated big data in real time, DELI aims at making index maintenance as lightweight as possible. The key idea is to apply an append-only design for online index maintenance and to collect index garbage at carefully chosen times. DELI optimizes the performance of index garbage collection by tightly coupling its execution with a native routine process called compaction. DELI's system design is fault-tolerant and generic (to most key-value stores); we implemented a prototype of DELI based on HBase without internal code modification. Our experiments show that DELI offers a significant performance advantage for write-intensive index maintenance.
  • Keywords
    Big Data; cloud computing; indexing; software fault tolerance; Big Data; DELI; DELI system design; HBase; Put/Get interface; append-only design; cloud computing; deferred lightweight indexing scheme; fault-tolerance; index garbage collection; key-based data access method; log-structured key-value stores; value-based access methods; write-intensive online index maintenance; Compaction; Data models; Electronic mail; Fault tolerance; Fault tolerant systems; Indexes; Maintenance engineering; Log-structured; NoSQL; indexing; key-value stores; secondary index;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Cluster, Cloud and Grid Computing (CCGrid), 2015 15th IEEE/ACM International Symposium on
  • Conference_Location
    Shenzhen
  • Type

    conf

  • DOI
    10.1109/CCGrid.2015.150
  • Filename
    7152467