  • DocumentCode
    150187
  • Title
    Extendable framework for monitoring heterogeneous multi-accelerator HPC cluster
  • Author
    Deepika, H.V. ; Mangala, N. ; Babu, N. Sarat Chandra
  • Author_Institution
    Hybrid Comput. Group, Centre for Dev. of Adv. Comput., Bangalore, India
  • fYear
    2014
  • fDate
    5-7 March 2014
  • Firstpage
    244
  • Lastpage
    249
  • Abstract
    The superior performance-to-power ratio of accelerators is motivating new cluster architectures with varied accelerator combinations. Monitoring ensures normal functioning of the cluster by detecting service degradations and enabling prompt rectification. This paper describes a modular and extendable monitoring framework for heterogeneous multi-accelerator clusters, which will be useful for future HPC systems. The framework supports third-party software plugins that provide different functional features. A monitoring tool has been developed on the basis of this framework to monitor CPU, GPGPU and FPGA accelerators, network, storage, user jobs and other relevant services of a heterogeneous cluster; the tool is also capable of automatic rectification to a certain extent.
  • Keywords
    field programmable gate arrays; graphics processing units; parallel processing; CPU accelerators; FPGA accelerators; GPGPU accelerators; HPC systems; accelerator combinations; accelerators power ratio; central processing unit; cluster monitoring; extendable monitoring framework; general-purpose graphics processing unit; heterogeneous multiaccelerator HPC cluster; high performance computing; service degradations; third party software plugins; Computer architecture; Databases; Field programmable gate arrays; Graphics processing units; Monitoring; Probes; Servers; FPGA; GPU; Many core; accelerator; cluster; heterogeneous; monitoring; plugin
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computing for Sustainable Global Development (INDIACom), 2014 International Conference on
  • Conference_Location
    New Delhi
  • Print_ISBN
    978-93-80544-10-6
  • Type
    conf
  • DOI
    10.1109/IndiaCom.2014.6828136
  • Filename
    6828136