• DocumentCode
    2284994
  • Title

    Data-WISE: Efficient management of data-intensive workflows in scheduled grid environments

  • Author

    Dasgupta, Gargi ; Dasgupta, Koustuv ; Viswanathan, Balaji

  • fYear
    2008
  • fDate
    7-11 April 2008
  • Firstpage
    488
  • Lastpage
    495
  • Abstract
    The execution of data-intensive workflow applications in scientific and enterprise grids has gained popularity in recent times. Such applications process large and dynamic data sets, and often present scope for optimized data handling that can be exploited for performance. Traditionally, the core grid middleware technologies of scheduling and orchestration have treated data management as a background activity, decoupled from job management and handled at the storage and/or network protocol level. We believe that an important requirement for building data-aware grid technologies lies in managing data flows at the application level, in conjunction with their computation counterparts. To this end, we present Data-WISE, an end-to-end framework for managing data-intensive workflows as first-class citizens that addresses data flow orchestration, co-scheduling, and runtime management. Its optimizations focus on exploiting application structure for data parallelism, replication, and runtime adaptation. We implement Data-WISE on a real testbed and demonstrate significant improvements in application response time, resource utilization, and adaptability to varying resource conditions. The proposed framework is an important step towards making distributed execution of data-intensive workflows a reality.
  • Keywords
    data handling; grid computing; middleware; optimisation; scheduling; Data-WISE; core grid middleware technologies; data flow orchestration; data parallelism; data-intensive workflow management; enterprise grids; job management; optimized data handling; runtime management; scheduled grid environments; Computer network management; Data flow computing; Data handling; Environmental management; Grid computing; Middleware; Parallel processing; Protocols; Runtime; Technology management;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Network Operations and Management Symposium, 2008. NOMS 2008. IEEE
  • Conference_Location
    Salvador, Bahia
  • ISSN
    1542-1201
  • Print_ISBN
    978-1-4244-2065-0
  • Electronic_ISBN
    1542-1201
  • Type

    conf

  • DOI
    10.1109/NOMS.2008.4575172
  • Filename
    4575172