• DocumentCode
    129024
  • Title
    Time-critical computing on a single-chip massively parallel processor
  • Author
    de Dinechin, Benoit Dupont ; van Amstel, Duco ; Poulhies, Marc ; Lager, Guillaume
  • fYear
    2014
  • fDate
    24-28 March 2014
  • Firstpage
    1
  • Lastpage
    6
  • Abstract
    The requirement of high-performance computing at low power can be met by the parallel execution of an application on a possibly large number of programmable cores. However, the lack of accurate timing properties may prevent parallel execution from being applicable to time-critical applications. We illustrate how this problem has been addressed by suitably designing the architecture, implementation, and programming model of the Kalray MPPA®-256 single-chip many-core processor. The MPPA®-256 (Multi-Purpose Processing Array) processor integrates 256 processing engine (PE) cores and 32 resource management (RM) cores on a single 28nm CMOS chip. These VLIW cores are distributed across 16 compute clusters and 4 I/O subsystems, each with a locally shared memory. On-chip communication and synchronization are supported by an explicitly addressed dual network-on-chip (NoC), with one node per compute cluster and 4 nodes per I/O subsystem. Off-chip interfaces include DDR, PCI and Ethernet, and direct access to the NoC for low-latency processing of data streams. The key architectural features that support time-critical applications are timing compositional cores, independent memory banks inside the compute clusters, and the data NoC whose guaranteed services are determined by network calculus. The programming model provides communicators that effectively support distributed computing primitives such as remote writes, barrier synchronizations, active messages, and communication by sampling. POSIX time functions expose synchronous clocks inside compute clusters and mesosynchronous clocks across the MPPA®-256 processor.
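The abstract states that the guaranteed services of the data NoC are determined by network calculus but gives no formulas. As a rough illustration only, the sketch below applies the standard textbook bounds for a token-bucket arrival curve served by a rate-latency service curve. The class names, units (flits, cycles), and numeric values are hypothetical and are not taken from the paper or from any Kalray tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenBucket:
    """Arrival curve alpha(t) = sigma + rho * t (hypothetical flow model)."""
    sigma: float  # maximum burst, in flits (assumed unit)
    rho: float    # sustained rate, in flits per cycle (assumed unit)


@dataclass(frozen=True)
class RateLatency:
    """Service curve beta(t) = rate * max(0, t - latency)."""
    rate: float     # guaranteed service rate, in flits per cycle
    latency: float  # worst-case initial latency, in cycles


def delay_bound(flow: TokenBucket, service: RateLatency) -> float:
    """Worst-case delay: latency + sigma / rate, valid when rho <= rate."""
    if flow.rho > service.rate:
        raise ValueError("flow rate exceeds guaranteed service rate; no finite bound")
    return service.latency + flow.sigma / service.rate


def backlog_bound(flow: TokenBucket, service: RateLatency) -> float:
    """Worst-case backlog: sigma + rho * latency, valid when rho <= rate."""
    if flow.rho > service.rate:
        raise ValueError("flow rate exceeds guaranteed service rate; no finite bound")
    return flow.sigma + flow.rho * service.latency


# Illustrative values only: an 8-flit burst at 0.25 flit/cycle through a
# path that guarantees 1 flit/cycle after at most 10 cycles.
flow = TokenBucket(sigma=8.0, rho=0.25)
path = RateLatency(rate=1.0, latency=10.0)
print(delay_bound(flow, path), backlog_bound(flow, path))
```

Bounds of this kind are what make a communication path analyzable in advance: once a flow's burst and rate are limited at the source, its worst-case delay and buffer occupancy follow from the service guarantee alone.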
  • Keywords
    multiprocessing systems; network-on-chip; parallel processing; CMOS chip; NoC; POSIX time functions; VLIW cores; distributed computing primitives; dual network-on-chip; high performance computing; key architectural features; low-latency processing; multipurpose processing array processor; processing engine cores; resource management cores; singlechip manycore processor; singlechip massively parallel processor; synchronous clocks; time-critical computing; Computational modeling; Computer architecture; Programming; Registers; Time factors; Timing; VLIW;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Design, Automation and Test in Europe Conference and Exhibition (DATE), 2014
  • Conference_Location
    Dresden
  • Type
    conf
  • DOI
    10.7873/DATE.2014.110
  • Filename
    6800311