Paper

Charlton Rose. CITCAT: Constructing Instruction Traces from Cache-filtered Address Traces. Master's thesis, Brigham Young University, 1999.

Abstract

Traces are valuable to computer architects because they allow researchers to subject hypothetical systems to real workloads using trace-driven simulation. An instruction trace is a record of a processor's instruction-level activity, including the opcode and operands of each instruction. In general, traces affected by the tracing process have limited utility. However, instruction traces are nearly impossible to collect without perturbing the system being traced.

In the past, researchers have taken several approaches to collecting traces, such as trap-based single stepping, inlining, hardware monitoring, and processor simulation. These approaches fail to produce accurate traces because they interfere with the processor's normal execution or are too difficult to implement correctly.

Because processors are deterministic machines, their behavior can be predicted if their initial states and external inputs are known. CITCAT is a procedure that exploits this fact to generate nearly perfect instruction traces using trace-driven simulation. CITCAT combines the best features of traditional tracing techniques to produce long, accurate instruction traces from cache-filtered address traces (CATs), which can easily be collected without perturbing the system being traced.

Successful implementations of CITCAT require a processor simulator, the ability to initialize it with the state of an actual machine, and the ability to simulate a realistic sequence of asynchronous events. The simulator is initialized with an initial machine state (IMS) image, while events are simulated according to an asynchronous event schedule (AES). Both of these records are extracted from a single CAT. Because the simulator replays the instruction sequence executed by the original processor that produced the CAT, it is possible to convert the CAT into any type of system trace.
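The replay idea above can be illustrated with a minimal sketch. This is not the thesis's R4400 implementation: the three-opcode toy machine, the `simulate` function, and the dictionary layouts used for `ims` and `aes` are all assumptions made for illustration. The sketch only shows that a deterministic simulator, started from a saved initial state and fed a fixed schedule of external events, regenerates the same instruction trace every time.

```python
from copy import deepcopy

def simulate(ims, aes, max_steps=1000):
    """Replay a toy machine from an initial state and an event schedule.

    ims: hypothetical initial machine state with 'pc', 'regs' (dict) and
         'mem' (list of instruction tuples).
    aes: hypothetical event schedule mapping an instruction count to
         (register, value), standing in for an asynchronous event such as
         an interrupt or a device write.
    Returns the instruction trace as a list of (pc, instruction) pairs.
    """
    state = deepcopy(ims)  # never mutate the caller's snapshot
    trace = []
    for step in range(max_steps):
        if step in aes:  # deliver the event scheduled before this instruction
            reg, value = aes[step]
            state["regs"][reg] = value
        instr = state["mem"][state["pc"]]
        trace.append((state["pc"], instr))
        op = instr[0]
        if op == "halt":
            break
        if op == "addi":  # add an immediate to a register
            _, reg, imm = instr
            state["regs"][reg] += imm
            state["pc"] += 1
        elif op == "jnz":  # jump to target if the register is nonzero
            _, reg, target = instr
            state["pc"] = target if state["regs"][reg] != 0 else state["pc"] + 1
        else:
            raise ValueError(f"unknown opcode: {op}")
    return trace

# Toy initial state: a countdown loop on r1.
ims = {
    "pc": 0,
    "regs": {"r1": 2},
    "mem": [("addi", "r1", -1), ("jnz", "r1", 0), ("halt",)],
}
# One external event: r1 is overwritten with 3 just before instruction 2.
aes = {2: ("r1", 3)}

trace = simulate(ims, aes)
```

In this toy, running `simulate(ims, aes)` twice yields identical traces, and changing `aes` changes the trace, which is why both records are needed for a faithful replay. The trace itself never has to be stored, since it is fully determined by the two inputs.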

The feasibility of CITCAT is demonstrated through an implementation on a MIPS R4400 microprocessor-based system. System-specific challenges are overcome with system-specific solutions, and efforts to generate instruction traces are ultimately successful. The thesis describes an R4400 CITCAT driver and other operating system patches, and explains how they enable the generation of IMS and AES records.

Because CITCAT instruction traces are computed, rather than stored, CITCAT has potential for development as an extremely efficient, lossless trace compression algorithm.