  • DocumentCode
    2344262
  • Title
    Functional correctness for CMP interconnects
  • Author
    Abdel-Khalek, Rawan ; Parikh, Ritesh ; DeOrio, Andrew ; Bertacco, Valeria
  • Author_Institution
    Dept. of Comput. Sci. & Eng., Univ. of Michigan, Ann Arbor, MI, USA
  • fYear
    2011
  • fDate
    9-12 Oct. 2011
  • Firstpage
    352
  • Lastpage
    359
  • Abstract
    As transistor counts continue to scale, modern designs are transitioning towards large chip multi-processors (CMPs). In order to match the advancing performance of CMPs, on-chip interconnects are becoming increasingly complex, commonly deploying advanced network-on-chip (NoC) structures. Ensuring the correct operation of these system-level infrastructures has become increasingly problematic and, in order to prevent functional design errors from manifesting in the final product, there is a need for mechanisms to safeguard communication integrity at runtime. In this paper, we propose SafeNoC, an end-to-end error detection and recovery solution to ensure the functional correctness of CMP interconnects. SafeNoC augments the existing interconnect with a simple, lightweight checker network that is guaranteed to deliver messages correctly. For each data message sent over the primary NoC, a look-ahead signature is transmitted over the checker network and is used to detect errors in the corresponding data message. If a functional communication bug is detected, a novel recovery algorithm reconstructs the data that was in flight at the time of the error occurrence, ensuring that it reaches the intended destination. In our experiments, we found that SafeNoC can recover from a wide variety of errors, with almost no performance impact in the absence of errors. A lightweight solution, SafeNoC incurs a 2.41% area overhead in a 64-core CMP, 7× smaller than common retransmission-based approaches.
  • Keywords
    integrated circuit interconnections; multiprocessing systems; network-on-chip; CMP interconnects; SafeNoC; chip multiprocessors; data message; end-to-end error detection; functional communication bug; functional correctness; functional design error; lightweight checker network; look-ahead signature; on-chip interconnects; system-level infrastructure; transistor counts; Computer bugs; Hardware; Routing protocols; Runtime; Software; System recovery; Topology
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Design (ICCD), 2011 IEEE 29th International Conference on
  • Conference_Location
    Amherst, MA
  • ISSN
    1063-6404
  • Print_ISBN
    978-1-4577-1953-0
  • Type
    conf
  • DOI
    10.1109/ICCD.2011.6081423
  • Filename
    6081423