Author :
Milenkovic, Milena ; Milenkovic, Aleksandar ; Burtscher, Martin
Abstract :
Instruction and data address traces are widely used by computer designers for quantitative evaluations of new architectures and workload characterization, as well as by software developers for program optimization, performance tuning, and debugging. Such traces are typically very large and need to be compressed to reduce the storage, processing, and communication bandwidth requirements. However, preexisting general-purpose and trace-specific compression algorithms are designed for software implementation and are not suitable for runtime compression. Compressing program execution traces at runtime in hardware can deliver insights into the behavior of the system under test without any negative interference with normal program execution. Traditional debugging tools, on the other hand, have to stop the program frequently to examine the state of the processor. Moreover, software developers often do not have access to the entire history of computation that led to an erroneous state. In addition, stepping through a program is a tedious task and may interact with other system components in such a way that the original errors disappear, thus preventing any useful insight. The need for unobtrusive tracing is further underscored by the development of computer systems that feature multiple processing cores on a single chip. In this paper, we introduce a set of algorithms for compressing instruction and data address traces that can easily be implemented in an on-chip trace compression module and describe the corresponding hardware structures. The proposed algorithms are analytically and experimentally evaluated. Our results show that very small hardware structures suffice to achieve a compression ratio similar to that of a software implementation of gzip while being orders of magnitude faster. A hardware structure with slightly over 2 KB of state achieves a compression ratio of 125.9 for instruction address traces, whereas gzip achieves a compression ratio of 87.4. For data addr- - ess traces, a hardware structure with 5 KB of state achieves a compression ratio of 6.1, compared to 6.8 achieved by gzip
Keywords :
computer debugging; data compression; data handling; program debugging; communication bandwidth; data address traces; debugging tools; gzip; hardware structure; hardware structures; on-chip trace compression module; program execution traces; trace-specific compression algorithms; unobtrusive real-time compression; Algorithm design and analysis; Bandwidth; Compression algorithms; Computer aided instruction; Computer architecture; Design optimization; Hardware; Runtime; Software debugging; Software performance;