• DocumentCode
    3298545
  • Title

    Experiences with a Private Enterprise Cloud: Providing Fault Tolerance and High Availability for Interactive EDA Applications

  • Author

    Kamath, Vinayak ; Giri, Ritwik ; Muralidhar, R.

  • Author_Institution
    Intel Corp., India
  • fYear
    2013
  • fDate
    June 28 2013-July 3 2013
  • Firstpage
    770
  • Lastpage
    777
  • Abstract
    Silicon Design and Electronic Design Automation (EDA) business is highly competitive, and time to market is of utmost importance in the semiconductor industry, where companies put in a lot of effort to make sure that the first silicon is as healthy as possible. Hence it is imperative that the EDA compute environment provides maximum uptime to design engineers by utilizing several different High Performance Computing (HPC) technologies. In this paper we present Intel's EDA compute infrastructure, along with a detailed software and system architecture for supporting workload checkpointing, restoration and migration specifically for EDA jobs that are interactive in nature. We also describe our experiences in providing high availability for such EDA applications using existing popular HA/FT techniques. We believe that this is one of the few detailed descriptions of the EDA compute infrastructure of a large and complex semiconductor design company and that this will be useful in addressing future HPC challenges for EDA workloads as HPC technologies mature and evolve.
  • Keywords
    checkpointing; cloud computing; electronic design automation; interactive systems; production engineering computing; semiconductor device manufacture; semiconductor industry; software architecture; EDA job migration; EDA job restoration; FT technique; HA technique; HPC technologies; Intel EDA compute infrastructure; electronic design automation business; fault tolerance; high availability; high performance computing technologies; interactive EDA applications; private enterprise cloud; semiconductor design company; semiconductor industry; silicon design business; software architecture; system architecture; workload checkpointing; Availability; Checkpointing; Companies; Computational modeling; Fault tolerance; Servers; Silicon; EDA; Fault tolerance; High availability; interactive computing
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Cloud Computing (CLOUD), 2013 IEEE Sixth International Conference on
  • Conference_Location
    Santa Clara, CA
  • Print_ISBN
    978-0-7695-5028-2
  • Type

    conf

  • DOI
    10.1109/CLOUD.2013.72
  • Filename
    6740221