• DocumentCode
    2524972
  • Title

    Grid Datafarm Architecture for Petascale Data Intensive Computing

  • Author

    Tatebe, Osamu ; Morita, Youhei ; Matsuoka, Satoshi ; Soda, Noriyuki ; Sekiguchi, Satoshi

  • fYear
    2002
  • fDate
    21-24 May 2002
  • Firstpage
    102
  • Lastpage
    102
  • Abstract
    The Grid Datafarm (Gfarm) architecture is designed for global petascale data-intensive computing. It provides a global parallel filesystem with online petascale storage, scalable I/O bandwidth, and scalable parallel processing, and it can exploit local I/O in a grid of clusters with tens of thousands of nodes. Gfarm parallel I/O APIs and commands provide a single filesystem image and manipulate filesystem metadata consistently. Fault tolerance and load balancing are automatically managed by file duplication or recomputation using a command history log. Preliminary performance evaluation has shown scalable disk I/O and network bandwidth on 64 nodes of the Presto III Athlon cluster. The Gfarm parallel I/O write and read operations achieved data transfer rates of 1.74 GB/s and 1.97 GB/s, respectively, using 64 cluster nodes. The Gfarm parallel file copy reached 443 MB/s with 23 parallel streams on the Myrinet 2000. The Gfarm architecture is expected to enable petascale data-intensive Grid computing with I/O bandwidth that scales to the TB/s range and with scalable computational power.
  • Keywords
    Bandwidth; Computer architecture; Computer industry; Concurrent computing; Fault tolerance; Grid computing; Large Hadron Collider; Large-scale systems; Petascale computing; Processor scheduling;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Cluster Computing and the Grid, 2002. 2nd IEEE/ACM International Symposium on
  • Print_ISBN
    0-7695-1582-7
  • Type

    conf

  • DOI
    10.1109/CCGRID.2002.1017117
  • Filename
    1540446