• DocumentCode
    2487151
  • Title

    Query-driven parallel exploration of large datasets

  • Author

    Atanasov, Atanas ; Srinivasan, Madhusudhanan ; Weinzierl, Tobias

  • Author_Institution
    Tech. Univ. München, Munich, Germany
  • fYear
    2012
  • fDate
    14-15 Oct. 2012
  • Firstpage
    23
  • Lastpage
    30
  • Abstract
    Recent advances in supercomputing capabilities pose a multi-faceted data retrieval challenge to the exploration and visualisation of the obtained results: the bandwidth between visualisation devices and the high-performance computing (HPC) clusters scales neither with the simulation data nor with the compute power, the total memory footprint of the data on the supercomputer often exceeds the aggregate memory of the visualisation cluster, and the data has to be distributed among several visualisation nodes working in parallel to render a visual. In the present paper, we introduce an on-demand data exploration paradigm that leverages HPC capabilities and distributed visualisation without requiring a large memory footprint on the visualisation cluster. Regions of interest within the data are specified by the user in the form of queries. These queries, augmented by node identifiers on the visualisation cluster, are automatically distributed among multiple compute nodes of the HPC cluster. The compute nodes work in parallel to assemble and merge data in response to the user query until the data distribution matches the visualisation cluster's topology. Query results are then simultaneously streamed to the right visualisation nodes. Our approach allows for interactive exploration of data residing on HPC resources, irrespective of memory footprint. The streaming of data to the visualisation nodes scales with the bandwidth of the interconnecting network and the HPC cluster's domain decomposition, while the latter is hidden from the visualisation and can change dynamically. We demonstrate the capability of our query-driven approach with a turbulent mixing dataset, and show that it supports interactive data exploration on HPC systems.
  • Keywords
    data visualisation; interactive systems; parallel processing; query processing; rendering (computer graphics); HPC clusters; data distribution; data visualisation; distributed visualisation; domain decomposition; high-performance computing; interactive data exploration; interconnecting network bandwidth; large datasets; multifaceted data retrieval; on-demand data exploration paradigm; parallel visualisation nodes; query-driven approach; query-driven parallel exploration; rendering; simulation data; supercomputer; turbulent mixing dataset; user query response; visualisation devices; Computational modeling; Data models; Data visualization; Distributed databases; Load modeling; Supercomputers; Topology; On-demand data exploration; computational steering; distributed visualisation; large-scale data;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Large Data Analysis and Visualization (LDAV), 2012 IEEE Symposium on
  • Conference_Location
    Seattle, WA
  • Print_ISBN
    978-1-4673-4732-7
  • Type

    conf

  • DOI
    10.1109/LDAV.2012.6378972
  • Filename
    6378972