• DocumentCode
    598618
  • Title

    A practical method for estimating performance degradation on multicore processors, and its application to HPC workloads

  • Author

    Dwyer, Tim ; Fedorova, Alexandra ; Blagodurov, Sergey ; Roth, Michael ; Gaud, F. ; Pei, Jian

  • Author_Institution
    Sch. of Comput. Sci., Simon Fraser Univ., Burnaby, BC, Canada
  • fYear
    2012
  • fDate
    10-16 Nov. 2012
  • Firstpage
    1
  • Lastpage
    11
  • Abstract
    When multiple threads or processes run on a multi-core CPU, they compete for shared resources, such as caches and memory controllers, and can suffer performance degradation as high as 200%. We design and evaluate a new machine learning model that estimates this degradation online, on previously unseen workloads, and without perturbing the execution. Our motivation is to help data center and HPC cluster operators use workload consolidation effectively. Data center consolidation places many applications on the same server to maximize hardware utilization. In HPC clusters, processes of the same distributed application run on the same machine. Consolidation improves hardware utilization, but may sacrifice performance as processes compete for resources. Our model helps determine when consolidation is overly harmful to performance. Our work is the first to apply machine learning to this problem domain, and we report on our experience reaping the advantages of machine learning while navigating around its limitations. We demonstrate how the model can be used to improve performance fidelity and save energy for HPC workloads.
  • Keywords
    computer centres; human factors; learning (artificial intelligence); multi-threading; multiprocessing systems; parallel processing; resource allocation; workstation clusters; HPC cluster operators; HPC workloads; data center consolidation; distributed applications; energy saving; execution perturbation; machine learning model; multicore CPU; multicore processors; multiple threads; online degradation; performance degradation estimation; resource sharing; workload consolidation; Accuracy; Data models; Degradation; Hardware; Machine learning; Predictive models; Radiation detectors
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    High Performance Computing, Networking, Storage and Analysis (SC), 2012 International Conference for
  • Conference_Location
    Salt Lake City, UT
  • ISSN
    2167-4329
  • Print_ISBN
    978-1-4673-0805-2
  • Type

    conf

  • DOI
    10.1109/SC.2012.11
  • Filename
    6468532