• DocumentCode
    1267405
  • Title

    Dynamic cluster resource allocations for jobs with known and unknown memory demands

  • Author

    Xiao, Li ; Chen, Songqing ; Zhang, Xiaodong

  • Author_Institution
    Dept. of Comput. Sci., Coll. of William & Mary, Williamsburg, VA, USA
  • Volume
    13
  • Issue
    3
  • fYear
    2002
  • fDate
    3/1/2002 12:00:00 AM
  • Firstpage
    223
  • Lastpage
    240
  • Abstract
    The cluster system we consider for load sharing is a compute farm which is a pool of networked server nodes providing high-performance computing for CPU-intensive, memory-intensive, and I/O active jobs in a batch mode. Existing resource management systems mainly target at balancing the usage of CPU loads among server nodes. With the rapid advancement of CPU chips, memory and disk access speed improvements significantly lag behind advancement of CPU speed, increasing the penalty for data movement, such as page faults and I/O operations, relative to normal CPU operations. Aiming at reducing the memory resource contention caused by page faults and I/O activities, we have developed and examined load sharing policies by considering effective usage of global memory in addition to CPU load balancing in clusters. We study two types of application workloads: 1) Memory demands are known in advance or are predictable and 2) memory demands are unknown and dynamically changed during execution. Besides using workload traces with known memory demands, we have also made kernel instrumentation to collect different types of workload execution traces to capture dynamic memory access patterns. Conducting different groups of trace-driven simulations, we show that our proposed policies can effectively improve overall job execution performance by well utilizing both CPU and memory resources with known and unknown memory demands
  • Keywords
    distributed memory systems; operating system kernels; resource allocation; virtual machines; workstation clusters; CPU loads; compute farm; dynamic cluster resource allocations; dynamic memory access patterns; job execution performance; kernel instrumentation; load sharing; memory demands; memory- intensive workloads; networked server nodes; trace-driven simulations; workload execution traces; Resource management;
  • fLanguage
    English
  • Journal_Title
    Parallel and Distributed Systems, IEEE Transactions on
  • Publisher
    ieee
  • ISSN
    1045-9219
  • Type

    jour

  • DOI
    10.1109/71.993204
  • Filename
    993204