Abstract:
Big data processing, e.g., graph computation and MapReduce, is characterized by massive parallelism in computation and a large number of fine-grained random memory accesses, often with structural locality due to graph-like data dependencies. Recently, GPUs have been gaining attention for servers due to their parallel computation capability. However, the current GPU architecture is not well suited to big data workloads because of its limited capability to handle a large number of memory requests. In this paper, we present a special function unit, called the memory fast-forward (MFF) unit, to address this problem. The proposed MFF unit provides two key functions. First, it supports pointer chasing, which enables computation threads to issue as many memory requests as possible, increasing the potential for coalescing memory requests. Second, it coalesces memory requests bound for the same cache block, which often occur due to structural locality, thereby reducing memory traffic. Both pointer chasing and memory request coalescing reduce memory stall time and improve the effective utilization of memory bandwidth by removing duplicate memory traffic, thereby improving performance and energy efficiency. Our experiments with graph computation algorithms and real graphs show that the proposed MFF unit improves the energy efficiency of the GPU in graph computation by 54.6% on average at a negligible area cost.
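To make the targeted access pattern concrete, the following is a minimal CUDA sketch (not the paper's code) of the fine-grained, data-dependent memory accesses the abstract describes: each thread walks a graph's adjacency structure, so every load depends on the result of a previous load. The kernel and names such as row_ptr, col_idx, and rank are illustrative assumptions, not part of the proposed MFF design.

// One thread per vertex: sum the ranks of its neighbors (a PageRank-style gather
// over a CSR graph). Each col_idx[e] load depends on row_ptr[v], and each
// rank[u] load depends on col_idx[e] -- the dependent, irregular loads that a
// conventional GPU pipeline stalls on, and that an MFF-style unit could chase
// and coalesce when neighbors of nearby vertices share a cache block.
__global__ void gather_neighbors(const int *row_ptr, const int *col_idx,
                                 const float *rank, float *out, int n_vertices) {
    int v = blockIdx.x * blockDim.x + threadIdx.x;
    if (v >= n_vertices) return;
    float sum = 0.0f;
    for (int e = row_ptr[v]; e < row_ptr[v + 1]; ++e) {
        int u = col_idx[e];   // data-dependent index (first "hop")
        sum += rank[u];       // fine-grained random access with structural locality
    }
    out[v] = sum;
}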
Keywords:
Big data; GPU; graphics processing units; cache storage; parallel processing; power-aware computing; memory fast-forward (MFF) unit; pointer chasing; memory request coalescing; cache block; structural locality; graph computation; energy efficiency; memory stall reduction; memory traffic reduction