• DocumentCode
    581019
  • Title

    Functional post-silicon diagnosis and debug for networks-on-chip

  • Author

    Abdel-Khalek, Rawan ; Bertacco, Valeria

  • Author_Institution
    Comput. Sci. & Eng. Dept., Univ. of Michigan, Ann Arbor, MI, USA
  • fYear
    2012
  • fDate
    5-8 Nov. 2012
  • Firstpage
    557
  • Lastpage
    563
  • Abstract
    Networks-on-chip (NoCs) have emerged as a favorable solution to provide higher-bandwidth interconnects for large chip multiprocessors (CMPs). To enhance the interconnect's performance, the NoC is often designed to include complex components and advanced features. As complexity and size increase, ensuring the functional correctness of the NoC becomes particularly challenging. This challenge pervades the entire verification effort, and post-silicon validation in particular, because the network's complex internal operation is difficult to observe. We propose a post-silicon validation platform that enhances observability of network activity by periodically taking snapshots of the packets in flight. Each node's local cache is configured to store the snapshot logs in a temporary space that is allocated for post-silicon validation and released at deployment. Each snapshot log is periodically and locally analyzed by a software algorithm, running on the processor's core, in order to detect functional errors. If an error is detected, the snapshot logs are aggregated and additional debug data is extracted. This data includes an overview of the traffic in the network around the time the error manifested, as well as a partial reconstruction of the routes followed by the packets in flight. In our experiments, we found that this approach allows us to detect several types of functional errors, to observe over 50% of the network's traffic on average, and to reconstruct at least half of the route of each observed packet through the network.
  • Keywords
    multiprocessor interconnection networks; network-on-chip; CMP; NoC; debug data; functional post-silicon diagnosis; large chip multiprocessors; network complex internal operation observability; network traffic; networks-on-chip; snapshot logs; software algorithm; Clocks; Computer bugs; Debugging; Monitoring; Observability;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer-Aided Design (ICCAD), 2012 IEEE/ACM International Conference on
  • Conference_Location
    San Jose, CA
  • ISSN
    1092-3152
  • Type

    conf

  • Filename
    6386727