• DocumentCode
    1594064
  • Title
    A Service for Data-Intensive Computations on Virtual Clusters
  • Author
    Schmidt, Rainer ; Sadilek, Christian ; King, Ross
  • Author_Institution
    ARC Digital Memory Eng., Austrian Res. Centers GmbH, Vienna
  • fYear
    2009
  • Firstpage
    28
  • Lastpage
    33
  • Abstract
    Digital preservation deals with the long-term storage, access, and maintenance of digital data objects. To prevent loss of information, digital libraries and archives are increasingly faced with the need to electronically preserve vast amounts of data while having limited computational resources in-house. Because of the potentially immense data sets and computationally intensive tasks involved, preservation systems have become a recognized challenge for e-science. We argue that grid and cloud technology can provide the crucial foundation for building scalable preservation systems. In this paper, we present recent developments on a job submission service that is based on standard grid mechanisms and capable of providing a large cluster of virtual machines. The service allows clients to specify and execute preservation tools on large data sets based on dynamically generated job descriptors. This approach allows us to utilize a cloud infrastructure that is based on platform virtualization as a scaling environment for the execution of preservation workflows. Finally, we present results of experiments conducted on the Amazon EC2 and S3 utility cloud infrastructure.
  • Keywords
    data handling; digital libraries; grid computing; information retrieval systems; virtual machines; Amazon EC2 utility cloud infrastructure; Amazon S3 utility cloud infrastructure; cloud technology; data-intensive computations; digital archives; digital preservation; grid technology; job submission service; virtual clusters; Cloud computing; Cultural differences; Data engineering; Information technology; Planets; Platform virtualization; Protocols; Software libraries; Standards development; Virtual machining; cluster computing; component model; data intensive; web services; workflow
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Intensive Applications and Services, 2009. INTENSIVE '09. First International Conference on
  • Conference_Location
    Valencia
  • Print_ISBN
    978-1-4244-3683-5
  • Electronic_ISBN
    978-0-7695-3585-2
  • Type
    conf
  • DOI
    10.1109/INTENSIVE.2009.13
  • Filename
    4976418