• DocumentCode
    249413
  • Title
    Parallel POD Compression of Time-Varying Big Datasets Using m-Swap on the K Computer
  • Author
    Chongke Bi ; Ono, Keishi ; Lu Yang
  • Author_Institution
    RIKEN, Kobe, Japan
  • fYear
    2014
  • fDate
    June 27 2014-July 2 2014
  • Firstpage
    438
  • Lastpage
    445
  • Abstract
    Thanks to supercomputers, increasingly complicated simulations can now be carried out successfully. At the same time, analyzing and understanding the intrinsic properties of the big datasets produced by these simulations has become an urgent research task for scientists. However, the explosive size of these datasets makes such analysis difficult. Reducing their size has therefore become an important topic, for which data compression and parallel computing are the two key techniques. In this paper, we present a parallel data compression approach for reducing the size of time-varying big datasets. First, we employ the proper orthogonal decomposition (POD) method for compression. POD extracts the underlying features of a dataset, which greatly reduces its size. Moreover, the compressed datasets can be decompressed linearly, which helps scientists interactively visualize big datasets for analysis. We then introduce a novel m-swap method to parallelize the POD compression algorithm effectively. The m-swap method achieves high performance by fully using all parallel computing processors; in other words, no processor is idle during the parallel compression process. Furthermore, the m-swap method greatly reduces the cost of interprocessor communication. This is achieved by controlling the data transfer among 2m processors to obtain the best balance of computation cost among them. Finally, we demonstrate the effectiveness of our method by compressing several time-varying big datasets on the K computer with tens of thousands of processors.
  • Keywords
    Big Data; data compression; parallel processing; K computer; data transfer; interprocessor communication cost reduction; m-swap method; parallel POD compression algorithm; parallel computing processors; parallel data compression; proper orthogonal decomposition method; time-varying big datasets; Compression algorithms; Image coding; Parallel algorithms; Program processors; Three-dimensional displays; Vectors; POD; m-swap; parallel compression;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Big Data (BigData Congress), 2014 IEEE International Congress on
  • Conference_Location
    Anchorage, AK
  • Print_ISBN
    978-1-4799-5056-0
  • Type
    conf
  • DOI
    10.1109/BigData.Congress.2014.70
  • Filename
    6906813