• DocumentCode
    1744734
  • Title

    Dynamic load sharing with unknown memory demands in clusters

  • Author

    Chen, Songqing ; Xiao, Li ; Zhang, Xiaodong

  • Author_Institution
    Dept. of Comput. Sci., Coll. of William & Mary, Williamsburg, VA, USA
  • fYear
    2001
  • fDate
    36982
  • Firstpage
    109
  • Lastpage
    118
  • Abstract
    A compute farm is a pool of clustered workstations to provide high performance computing services for CPU-intensive, memory-intensive, and I/O active jobs in a batch mode. Existing load sharing schemes with memory considerations assume jobs' memory demand sizes are known in advance or predictable based on users' hints. This assumption can greatly simplify the designs and implementations of load sharing schemes, but is not desirable in practice. In order to address this concern, we present three new results and contributions in this study. Conducting Linux kernel instrumentation, we have collected different types of workload execution traces to quantitatively characterize job interactions, and modeled page fault behavior as a function of the overloaded memory sizes and the amount of jobs' I/O activities. Based on experimental results and collected dynamic system information, we have built a simulation model which accurately emulates the memory system operations and job migrations with virtual memory considerations. We have proposed a memory-centric load sharing scheme and its variations to effectively process dynamic memory allocation demands, aiming at minimizing execution time of each individual job by dynamically migrating and remotely submitting jobs to eliminate or reduce page faults and to reduce the queuing time for CPU services. Conducting trace-driven simulations, we have examined these load sharing policies to show their effectiveness.
  • Keywords
    Unix; operating system kernels; resource allocation; storage allocation; virtual storage; workstation clusters; CPU-intensive; Linux kernel; batch mode; dynamic load sharing; execution time; experimental results; high performance computing; input output active jobs; job interactions; memory-intensive; page fault behavior; queuing time; trace driven simulation; unknown memory demands; virtual memory; workload execution traces; workstation clusters; Application software; Computational modeling; Computer science; Educational institutions; Electric breakdown; Instruments; Kernel; Linux; Monitoring; Workstations;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Distributed Computing Systems, 2001. 21st International Conference on.
  • Conference_Location
    Mesa, AZ
  • Print_ISBN
    0-7695-1077-9
  • Type

    conf

  • DOI
    10.1109/ICDSC.2001.918939
  • Filename
    918939