Author_Institution :
Div. of Inf. & Comput. Sci., Lawrence Berkeley Lab., CA, USA
Abstract :
Modern scientific computing involves organizing, moving, visualizing, and analyzing massive amounts of data from around the world, as well as employing large-scale computation. The distributed systems that solve large-scale problems will always involve aggregating and scheduling many resources. Data must be located and staged; cache and network capacity must be available at the same time as computing capacity; and so on. Every aspect of such a system is dynamic: locating and scheduling resources, adapting running application systems to availability and congestion in the middleware and infrastructure, responding to human interaction, etc. The technologies, the middleware services, and the architectures that are used to build useful high-speed, wide area distributed systems constitute the field of data intensive computing. This paper explores some of the history and future directions of that field.
Keywords :
cache storage; client-server systems; performance evaluation; scheduling; wide area networks; cache; data analysis; data intensive computing; data visualization; distributed systems; future directions; high-speed computing; history; human interaction; large-scale computation; middleware; network capacity; resource scheduling; scientific computing; wide area computing; wide area distributed systems; Availability; Computer networks; Dynamic scheduling; Humans; Large-scale systems; Organizing; Processor scheduling