Author_Institution :
Res. Sch. of Comput. Sci., Australian Nat. Univ., Canberra, ACT, Australia
Abstract :
We detail the design of, and our experiences in delivering, a specialty multicore computing course whose materials are openly available. The course ambitiously covers three multicore programming paradigms: shared memory (OpenMP), device (CUDA) and message passing (RCCE), and involves significant practical work on their respective platforms: an UltraSPARC T2, a Fermi GPU and the Intel Single-Chip Cloud Computer. Specialized multicore architecture topics include chip multiprocessing, virtualization support, on-chip accelerators and networks, transactional memory and speculative execution. The mode of delivery emphasizes the relationship between program performance and the underlying computer architecture, which required suitable infrastructure in the form of instrumented test programs and performance evaluation tools. Further infrastructure had to be created so that students could use the GPU and the Single-Chip Cloud Computer safely, conveniently and efficiently. The programming assignments, based on the theme of the LINPACK benchmark, also required significant infrastructure for reliably determining correctness and assisting debugging. Although the course assumed an introductory computer systems and concurrency course as background knowledge, we found that students could learn device programming in a short time by building on their knowledge of shared memory programming. However, we found that more time is needed for learning message passing. We also found that, provided students had a suitably strong computer systems background, they could successfully meet the course's learning objectives, although the skill of correctly interpreting performance data remains difficult to learn when suitable performance analysis tools are not available.
Keywords :
computer science education; educational courses; message passing; multiprocessing systems; parallel architectures; programming; teaching; CUDA; Fermi GPU; Intel single-chip cloud computer; OpenMP; RCCE; UltraSPARC T2; chip multiprocessing; computer architecture; concurrency course; device programming; introductory computer systems; multicore programming paradigms; on-chip accelerators; programming performance; shared memory programming; specialized multicore architecture; specialty multicore computing course; speculative execution; transactional memory; virtualization support; Computers; Graphics processing unit; Kernel; Message passing; Multicore processing; Programming profession; computing education; multicore computing; parallel computing