• DocumentCode
    2414733
  • Title
    Distributed data access in the Sequential Access Model at the D0 experiment at Fermilab
  • Author
    Terekhov, Igor; White, Victoria
  • Author_Institution
    Fermi Nat. Accel. Lab., Batavia, IL, USA
  • fYear
    2000
  • fDate
    2000
  • Firstpage
    310
  • Lastpage
    311
  • Abstract
    Presents the Sequential Access Model (SAM), which is the data-handling system for D0, one of two primary high-energy physics experiments at Fermilab. During the next several years, the D0 experiment will store a total of about 1 PByte of data, including raw detector data and data processed at various levels. The design of SAM is not specific to the D0 experiment and carries few assumptions about the underlying mass storage level; its ideas are applicable to any sequential data access. By definition, in the sequential access mode, a user application needs to process a stream of data by accessing each data unit exactly once, the order of the data units in the stream being irrelevant. The units of data are laid out sequentially in files. The adopted model allows for a significant optimization of system performance, a reduction in user file latency and an increase in the overall throughput. In particular, caching is done with the knowledge of all the files that are needed “in the near future”, which is defined as all the files being used by already-running or submitted jobs. The bulk of the data is stored in files on tape in the mass storage system Enstore. All of the data managed by SAM is cataloged in great detail in a relational database (Oracle).
  • Keywords
    cache storage; data acquisition; data handling; distributed databases; high energy physics instrumentation computing; magnetic tape storage; relational databases; 1 PByte; Enstore mass storage system; Fermi National Accelerator Laboratory; Fermilab D0 experiment; Oracle relational database; Sequential Access Model; caching; data cataloguing; data files; data handling system; data stream; data units; distributed data access; high-energy physics experiment; mass storage; processed data; raw detector data; running jobs; sequential data access; submitted jobs; system performance optimization; throughput; user file latency; Data handling; Delay; Information retrieval; Laboratories; Libraries; Relational databases; Samarium; Storage automation; System performance; Throughput
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    High-Performance Distributed Computing, 2000. Proceedings. The Ninth International Symposium on
  • Conference_Location
    Pittsburgh, PA
  • ISSN
    1082-8907
  • Print_ISBN
    0-7695-0783-2
  • Type
    conf
  • DOI
    10.1109/HPDC.2000.868672
  • Filename
    868672