• DocumentCode
    3732300
  • Title

    In-memory Query System for Scientific Datasets

  • Author

    Hsuan-Te Chiu;Jerry Chou;Venkat Vishwanath;Kesheng Wu

  • Author_Institution
    Nat. Tsing Hua Univ., Hsinchu, Taiwan
  • fYear
    2015
  • Firstpage
    362
  • Lastpage
    371
  • Abstract
    The growing gap between compute performance and I/O bandwidth, coupled with increasing data volumes, has made traditional post-simulation data processing a bottleneck. Hence, in-situ computing and query-driven data analysis are important techniques for minimizing data movement. By taking advantage of the growing memory capacity on supercomputers, we developed an in-memory query system for scientific data analysis. Our approach combines bitmap indexing, spatial data layout reorganization, distributed shared memory, and location-aware parallel execution. Our evaluations using real scientific datasets showed that we can aggregate the memory capacity of thousands of compute nodes to analyze a 750GB simulation dataset without transferring data to remote nodes or storage systems. Compared to traditional solutions based on out-of-core parallel file systems, we achieve significantly higher query performance.
  • Keywords
    "Indexing","Data analysis","Computational modeling","Arrays","Data models","Analytical models"
  • Publisher
    IEEE
  • Conference_Titel
    Parallel and Distributed Systems (ICPADS), 2015 IEEE 21st International Conference on
  • Electronic_ISBN
    1521-9097
  • Type

    conf

  • DOI
    10.1109/ICPADS.2015.53
  • Filename
    7384316