• DocumentCode
    3238201
  • Title
    Cycle-approximate Retargetable Performance Estimation at the Transaction Level
  • Author
    Hwang, Yonghyun ; Abdi, Samar ; Gajski, Daniel
  • Author_Institution
    Center for Embedded Computer Systems, University of California, Irvine, Irvine, CA
  • fYear
    2008
  • fDate
    10-14 March 2008
  • Firstpage
    3
  • Lastpage
    8
  • Abstract
    This paper presents a novel cycle-approximate performance estimation technique for automatically generated transaction level models (TLMs) of heterogeneous multi-core designs. The inputs are the application's C processes and their mapping to processing units in the platform. The processing unit model consists of a pipelined datapath, a memory hierarchy, and a branch delay model. Using the processing unit model, the basic blocks in the C processes are analyzed and annotated with estimated delays. This is followed by a code generation phase in which delay-annotated C code is generated and linked with a SystemC wrapper consisting of inter-process communication channels. The generated TLM is compiled and executed natively on the host machine. Our key contribution is an estimation technique that is close to cycle-accurate, can be applied to any multi-core platform, and produces high-speed, natively compiled TLMs. In experiments, timed TLMs for industrial-scale designs such as an MP3 decoder were automatically generated in under 1 minute for 4 heterogeneous multi-processor platforms with up to 5 processing elements (PEs). Each TLM simulated in under 1 second, compared to 3-4 hours of instruction set simulation (ISS) and 15-18 hours of RTL simulation. Comparison with on-board measurement showed an average error of only 8% in the estimated number of cycles.
  • Keywords
    hardware description languages; instruction sets; logic design; logic simulation; program compilers; RTL simulation; SystemC wrapper; application C processes; automatically generated transaction level models; branch delay model; code generation phase; cycle-approximate retargetable performance estimation; delay-annotated C code; heterogeneous multicore designs; instruction set simulation; inter-process communication channels; memory hierarchy; pipelined datapath; processing unit model; time 15 hr to 18 hr; time 3 hr to 4 hr; Application software; Communication channels; Decoding; Delay estimation; Digital audio players; Embedded computing; Job shop scheduling; Multicore processing; Processor scheduling; Space exploration;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Design, Automation and Test in Europe, 2008. DATE '08
  • Conference_Location
    Munich
  • Print_ISBN
    978-3-9810801-3-1
  • Electronic_ISBN
    978-3-9810801-4-8
  • Type
    conf
  • DOI
    10.1109/DATE.2008.4484651
  • Filename
    4484651