• DocumentCode
    2900141
  • Title

    DSBS: Distributed and Scalable Barrier Synchronization in Many-Core Network-on-Chips

  • Author

    Chen, Xiaowen ; Chen, Shuming

  • Author_Institution
    Nat. Univ. of Defense Technol., Changsha, China
  • fYear
    2011
  • fDate
    16-18 Nov. 2011
  • Firstpage
    1030
  • Lastpage
    1037
  • Abstract
    This paper proposes a distributed and scalable hardware solution for efficient barrier synchronization management on many-core Network-on-Chips (NoCs). It includes two hardware modules, named Root Distributed and Scalable Barrier Synchronizer (Root DSBS) and Leaf Distributed and Scalable Barrier Synchronizer (Leaf DSBS). The Root DSBS is located in the central node and connects to the processor core and the network interface. It provides a set of globally addressed barrier counters, sets the barrier, counts arriving "barrier acquire" requests, and, once the barrier condition is satisfied, releases the barrier and sends out "barrier release" acknowledgements. The Leaf DSBS is integrated into each router in the on-chip network. It is responsible for efficiently transmitting barrier synchronization packets through the on-chip network to the Root DSBS. The Root DSBS in the central node and the Leaf DSBSs in the routers cooperate to accomplish barrier synchronization. Our solution has two salient features. The first is "Unicast Merging": "barrier acquire" packets destined for the same barrier are merged into one packet when they pass through the same router simultaneously. Its purpose is to minimize the completion time of barrier acquisition by reducing the number of barrier synchronization packets. The second is "Broadcasting": a "barrier release" packet is broadcast to all synchronized nodes. Its purpose is to reduce area cost by avoiding the storage of synchronized node numbers, and to minimize the completion time of barrier release by avoiding unicast "barrier release" packets. To evaluate performance, we investigate hardware cost and run both synthetic and application experiments. Synthesis and experimental results show that our distributed and scalable barrier synchronization achieves both area and performance advantages over its conventional counterpart. The Root DSBS and Leaf DSBSs can run at over 2 GHz in TSMC® 65 nm technology with small area overhead. Our solution adds little completion time and generates well-distributed, uniform network traffic. When the network size is 16×16, the application's performance improvement reaches 24.60%.
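The counting, merging, and release behaviour described in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the paper's hardware design: the names `AcquirePacket`, `merge_at_router`, and `RootBarrier` are hypothetical, and the idea that a merged packet carries a count of the requests it represents is an assumption made here so that the root counter stays correct.

```python
from dataclasses import dataclass


@dataclass
class AcquirePacket:
    """Hypothetical "barrier acquire" packet (not the paper's packet format)."""
    barrier_id: int
    count: int = 1  # assumption: number of node requests this packet stands for


def merge_at_router(packets):
    """Model of "Unicast Merging": packets for the same barrier that meet
    at one router in the same cycle are combined into a single packet."""
    totals = {}
    for p in packets:
        totals[p.barrier_id] = totals.get(p.barrier_id, 0) + p.count
    return [AcquirePacket(bid, cnt) for bid, cnt in totals.items()]


class RootBarrier:
    """Model of the root node's barrier counters (hypothetical interface)."""

    def __init__(self):
        self.expected = {}  # barrier_id -> number of nodes to wait for
        self.arrived = {}   # barrier_id -> requests counted so far

    def set_barrier(self, barrier_id, num_nodes):
        self.expected[barrier_id] = num_nodes
        self.arrived[barrier_id] = 0

    def acquire(self, packet):
        """Count an arriving packet. Returns True when the barrier condition
        is met, i.e. when a single "barrier release" would be broadcast."""
        bid = packet.barrier_id
        self.arrived[bid] += packet.count
        if self.arrived[bid] >= self.expected[bid]:
            self.arrived[bid] = 0  # reset the counter for reuse
            return True
        return False


# Usage: four nodes synchronize on barrier 0; three requests meet at one router.
root = RootBarrier()
root.set_barrier(0, 4)
merged = merge_at_router([AcquirePacket(0), AcquirePacket(0), AcquirePacket(0)])
released_early = root.acquire(merged[0])        # 3 of 4 arrived
released_final = root.acquire(AcquirePacket(0))  # 4 of 4 arrived
```

In this model the root never records which nodes arrived, only how many, which mirrors the abstract's point that broadcasting the release avoids storing synchronized node numbers.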
  • Keywords
    distributed processing; network-on-chip; DSBS; Leaf DSBS; Network-on-Chips; NoCs; Root DSBS; distributed and scalable barrier synchronization; leaf distributed and scalable barrier synchronizer; many core network-on-chips; network interface; processor core; root distributed and scalable barrier synchronizer; unicast merging; Corporate acquisitions; Hardware; Merging; Radiation detectors; Synchronization; System-on-a-chip; Unicast; Many-core; Network-on-Chips; Synchronization;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Trust, Security and Privacy in Computing and Communications (TrustCom), 2011 IEEE 10th International Conference on
  • Conference_Location
    Changsha
  • Print_ISBN
    978-1-4577-2135-9
  • Type

    conf

  • DOI
    10.1109/TrustCom.2011.141
  • Filename
    6120934