• DocumentCode
    1265615
  • Title

    LP-NUCA: Networks-in-Cache for High-Performance Low-Power Embedded Processors

  • Author

    Gracia, D.S. ; Dimitrakopoulos, G. ; Arnal, T.M. ; Katevenis, M.G.H. ; Yufera, V.V.

  • Author_Institution
    Dept. de Inf. e Ing. de Sist., Univ. de Zaragoza, Zaragoza, Spain
  • Volume
    20
  • Issue
    8
  • fYear
    2012
  • Firstpage
    1510
  • Lastpage
    1523
  • Abstract
    High-end embedded processors demand complex on-chip cache hierarchies satisfying several contradicting design requirements such as high-performance operation and low energy consumption. This paper introduces light-power (LP) nonuniform cache architecture (NUCA), a tiled cache addressing both goals. LP-NUCA places a group of small and low-latency tiles between the L1 and the last-level cache (LLC) that adapt better to the application working sets and keep most recently evicted blocks close to L1. LP-NUCA is built around three specialized “networks-in-cache,” each aimed at a separate cache operation. To prove the design feasibility, we have fully implemented LP-NUCA in a 90-nm technology. From the VLSI implementation, we observe that the proposed networks-in-cache incur minimal area, latency, and power overhead. To further reduce the energy consumption, LP-NUCA employs two network-wide techniques (miss wave stopping and sectoring) that together reduce the dynamic cache energy by 35% without degrading performance. Our evaluations also show that LP-NUCA improves performance with respect to cache hierarchies similar to those found in high-end embedded processors. Similar results have been obtained after scaling to a 32-nm technology.
  • Keywords
    VLSI; cache storage; embedded systems; low-power electronics; microprocessor chips; LP-NUCA; dynamic cache; high performance low power embedded processors; light power nonuniform cache architecture; minimal area; minimal latency; minimal power overhead; networks-in-cache; size 32 nm; size 90 nm; tiled cache addressing; Delay; Network topology; Program processors; Routing; System-on-a-chip; Topology; Cache organization; interconnection networks; low-power design; network-on-chip; nonuniform cache architecture (NUCA)
  • fLanguage
    English
  • Journal_Title
    Very Large Scale Integration (VLSI) Systems, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1063-8210
  • Type

    jour

  • DOI
    10.1109/TVLSI.2011.2158249
  • Filename
    5941025