Abstract:
Shared memory in a parallel computer provides programmers with the valuable abstraction of a shared address space, through which any part of a computation can access any datum. Although uniform access simplifies programming, it also hides communication, which can lead to inefficient programs. The check-in, check-out (CICO) performance model for cache-coherent, shared-memory parallel computers helps a programmer identify the communication underlying memory references and account for its cost. CICO consists of annotations that a programmer can use to elucidate communication and a model that attributes costs to these annotations. The annotations can also serve as directives to a memory system to improve program performance. Inserting CICO annotations requires reasoning about the dynamic cache behavior of a program, which is not always easy. This paper describes Cachier, a tool that automatically inserts CICO annotations into shared-memory programs. A novel feature of this tool is its use of both dynamic information, obtained from a program execution trace, and static information, obtained from program analysis. We measured several benchmarks annotated by Cachier by running them on a simulation of the Dir1SW cache coherence protocol [10], which supports these directives. The results show that programs annotated by Cachier perform significantly better than both programs without CICO annotations and programs that were annotated by hand.