6–20 Privileged Architecture Library Code
21264/EV68A Hardware Reference Manual
Performance Counter Support
6.10.2.3 Aggregate Counting Mode Description
6.10.2.3.1 Cycle counting
Counts cycles.
PCTR0 is incremented by the number of cycles counted, that is, by 1 each cycle.
6.10.2.3.2 Retired instructions cycles
PCTR0 is incremented by up to 8 retired instructions per cycle when enabled via
I_CTL[PCT0_EN] and either I_CTL[SPCE] or PCTX[PPCE]. On overflow, an
interrupt is triggered as ISUM[PC0] if enabled via IER_CM[PCEN0].
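The enable and interrupt conditions above can be read as two Boolean expressions. The following is a minimal sketch of that logic only; the function and parameter names are hypothetical, and the bits are modeled as plain Booleans rather than as fields read from the IPRs.

```python
def pctr0_counting_enabled(pct0_en: bool, spce: bool, ppce: bool) -> bool:
    """PCTR0 counts when I_CTL[PCT0_EN] is set and either
    I_CTL[SPCE] or PCTX[PPCE] is set (names here are illustrative)."""
    return pct0_en and (spce or ppce)


def pc0_interrupt_raised(overflowed: bool, pcen0: bool) -> bool:
    """An overflow is reported as ISUM[PC0] only if IER_CM[PCEN0] is set."""
    return overflowed and pcen0
```

For example, with I_CTL[PCT0_EN] set but both I_CTL[SPCE] and PCTX[PPCE] clear, the first function returns False, so no retires would be counted.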
The 21264/EV68A can retire up to 11 instructions per cycle, which exceeds PCTR0's
maximum increment of 8 per cycle. However, no retires go uncounted because the
21264/EV68A cannot sustain 11 retires per cycle, and the 21264/EV68A corrects
PCTR0 in subsequent cycles.
A squashed instruction does not count as a retire.
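To illustrate why no retires are lost, the sketch below models a counter limited to 8 increments per cycle that carries any excess forward. The carry-over "backlog" is an assumption made for illustration; the text states only that PCTR0 is corrected in subsequent cycles, not how. The function name and the constants' names are hypothetical.

```python
MAX_INCREMENT = 8   # from the text: PCTR0 adds at most 8 per cycle
MAX_RETIRE = 11     # from the text: peak retires in a single cycle


def count_retires(retires_per_cycle):
    """Return (counter, backlog) after feeding per-cycle retire counts.

    Assumption: retires beyond MAX_INCREMENT are held in a backlog and
    added to the counter in later cycles.
    """
    counter = 0
    backlog = 0
    for retired in retires_per_cycle:
        if not 0 <= retired <= MAX_RETIRE:
            raise ValueError("retire count out of range")
        pending = backlog + retired
        step = min(pending, MAX_INCREMENT)  # saturate at 8 per cycle
        counter += step
        backlog = pending - step            # carried to the next cycle
    return counter, backlog
```

With the input `[11, 2, 0]`, the counter adds 8 in the first cycle and the remaining 5 in the second, so all 13 retires are counted once the burst ends. A backlog stays nonzero only while the retire rate remains above 8 per cycle, which the text says the processor cannot sustain.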
6.10.2.3.3 Bcache miss or long latency probes cycles
This input counts the number of times the Bcache result was a miss.
Essentially, a long latency probe is a data request from other processes that cause
Bcache misses in a system.
This count is phase-shifted three cycles early. It therefore covers events from three
cycles before the start of the ProfileMe window to three cycles before its end.
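The effect of the three-cycle phase shift can be sketched as a shifted cycle range. This is an illustration only: it assumes the window is expressed as a half-open range of cycle numbers, and the function name is hypothetical.

```python
PHASE_SHIFT = 3  # from the text: the count is shifted three cycles early


def counted_cycles(window_start: int, window_end: int) -> range:
    """Cycles whose events are counted for a window [window_start, window_end).

    Assumption: the window is a half-open range of cycle numbers.
    """
    return range(window_start - PHASE_SHIFT, window_end - PHASE_SHIFT)
```

Under these assumptions, a window spanning cycles 10 through 13 would count events from cycles 7 through 10.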
6.10.2.3.4 Mbox replay traps cycles
This input counts Mbox replay traps.
6.10.2.4 Counter Modes for Aggregate Mode
Table 6–11 shows the counter modes that are used with Aggregate mode.
6.10.3 ProfileMe Mode Programming Guidelines
Use the following information to program counters in ProfileMe mode.
6.10.3.1 ProfileMe Mode Precautions
Squashed NOPs count as valid fetched instructions.
Counter 1 must be explicitly cleared in the trap handler before each data collection.
Table 6–11 Aggregate Mode Performance Counter IPR Input Select Fields

SL0[4]  SL1[3:2]  PCTR0                 PCTR1
0       00        Retired instructions  Cycle counting
0       01        Cycle counting        Not defined
0       10        Retired instructions  Bcache miss or long latency probes
0       11        Cycle counting        Mbox replay traps
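The rows of Table 6–11 can be expressed as a lookup keyed by the two select fields. This is a sketch for reading the table, not a driver interface; the dictionary and function names are hypothetical, and only the four combinations listed in the table are covered.

```python
# Keys are (SL0[4], SL1[3:2]); values are (PCTR0 input, PCTR1 input).
AGGREGATE_INPUTS = {
    (0, 0b00): ("Retired instructions", "Cycle counting"),
    (0, 0b01): ("Cycle counting", "Not defined"),
    (0, 0b10): ("Retired instructions", "Bcache miss or long latency probes"),
    (0, 0b11): ("Cycle counting", "Mbox replay traps"),
}


def aggregate_inputs(sl0: int, sl1: int):
    """Return the (PCTR0, PCTR1) inputs for a select-field combination."""
    try:
        return AGGREGATE_INPUTS[(sl0, sl1)]
    except KeyError:
        raise ValueError("combination not listed in Table 6-11") from None
```

For example, `aggregate_inputs(0, 0b11)` returns the pair for cycle counting on PCTR0 and Mbox replay traps on PCTR1.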