• DocumentCode
    3235236
  • Title
    DASH: a Recipe for a Flash-based Data Intensive Supercomputer
  • Author
    He, Jiahua ; Jagatheesan, Arun ; Gupta, Sandeep ; Bennett, Jeffrey ; Snavely, Allan
  • Author_Institution
    San Diego Supercomput. Center (SDSC), Univ. of California, San Diego, CA, USA
  • fYear
    2010
  • fDate
    13-19 Nov. 2010
  • Firstpage
    1
  • Lastpage
    11
  • Abstract
    Data intensive computing can be defined as computation involving large datasets and complicated I/O patterns. Data intensive computing is challenging because there is a five-orders-of-magnitude latency gap between main memory DRAM and spinning hard disks; the result is that an inordinate amount of time in data intensive computing is spent accessing data on disk. To address this problem we designed and built a prototype data intensive supercomputer named DASH that exploits flash-based Solid State Drive (SSD) technology and also virtually aggregated DRAM to fill the latency gap. DASH uses commodity parts including Intel® X25-E flash drives and distributed shared memory (DSM) software from ScaleMP®. The system is highly competitive with several commercial offerings by several metrics including achieved IOPS (input/output operations per second), IOPS per dollar of system acquisition cost, IOPS per watt during operation, and IOPS per gigabyte (GB) of available storage. We present here an overview of the design of DASH, an analysis of its cost efficiency, and a detailed recipe for how we designed and tuned it for high data performance. Lastly, we show that, running data-intensive scientific applications from graph theory, biology, and astronomy, we achieved as much as two orders of magnitude speedup compared to the same applications run on traditional architectures.
  • Keywords
    DRAM chips; hard discs; parallel machines; DASH; DSM software; I/O patterns; SSD technology; ScaleMP; X25-E flash drives; data intensive computing; distributed shared memory software; flash-based data intensive supercomputer; flash-based solid state drive technology; graph theory; memory DRAM; spinning hard disks; Drives; Measurement; Memory management; Random access memory; Software; Spinning; Tuning;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    High Performance Computing, Networking, Storage and Analysis (SC), 2010 International Conference for
  • Conference_Location
    New Orleans, LA
  • Print_ISBN
    978-1-4244-7557-5
  • Electronic_ISBN
    978-1-4244-7558-2
  • Type
    conf
  • DOI
    10.1109/SC.2010.16
  • Filename
    5645466