• DocumentCode
    117247
  • Title
    Achieving 100,000,000 database inserts per second using Accumulo and D4M
  • Author
    Kepner, Jeremy ; Arcand, William ; Bestor, David ; Bergeron, Bill ; Byun, Chansup ; Gadepally, Vijay ; Hubbell, Matthew ; Michaleas, Peter ; Mullen, Julie ; Prout, Andrew ; Reuther, Albert ; Rosa, Antonio ; Yee, Charles
  • Author_Institution
    MIT Lincoln Lab., Lexington, MA, USA
  • fYear
    2014
  • fDate
    9-11 Sept. 2014
  • Firstpage
    1
  • Lastpage
    6
  • Abstract
    The Apache Accumulo database is an open-source, relaxed-consistency database that is widely used for government applications. Accumulo is designed to deliver high performance on unstructured data such as graphs of network data. This paper tests the performance of Accumulo using data from the Graph500 benchmark. The Dynamic Distributed Dimensional Data Model (D4M) software is used to implement the benchmark on a 216-node cluster running the MIT SuperCloud software stack. A peak performance of over 100,000,000 database inserts per second was achieved, which is 100× larger than the highest previously published value for any other database. The performance scales linearly with the number of ingest clients, number of database servers, and data size. The performance was achieved by adapting several supercomputing techniques to this application: distributed arrays, domain decomposition, adaptive load balancing, and single-program-multiple-data programming.
  • Keywords
    cloud computing; distributed databases; public domain software; 216-node cluster running; Apache Accumulo database; D4M; Graph500 benchmark; MIT SuperCloud software stack; adaptive load balancing; database inserts; distributed arrays; domain decomposition; dynamic distributed dimensional data model; government applications; open source relaxed consistency database; single program multiple data programming; supercomputing techniques; Arrays; Benchmark testing; Databases; Measurement; Optimization; Parallel processing; Servers; Accumulo; Big Data; D4M; Graph500; Hadoop; MIT SuperCloud;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    High Performance Extreme Computing Conference (HPEC), 2014 IEEE
  • Conference_Location
    Waltham, MA
  • Print_ISBN
    978-1-4799-6232-7
  • Type
    conf
  • DOI
    10.1109/HPEC.2014.7040945
  • Filename
    7040945