Intel® IXP42X product line and IXC1100 control plane processors: Intel XScale® Processor
Intel® IXP42X Product Line of Network Processors and IXC1100 Control Plane Processor
DM September 2006
140 Order Number: 252480-006US
performance statistics could be gathered (like hit rates, number of write-backs per data
cache miss, and number of times the data cache buffers fill up per request).
3.7.4.1 Instruction Cache Efficiency Mode
PMN0 totals the number of instructions that were executed, which does not include
instructions fetched from the instruction cache that were never executed. This can
happen if a branch instruction changes the program flow; the instruction cache may
retrieve the next sequential instructions after the branch, before it receives the target
address of the branch.
PMN1 counts the number of instruction fetch requests to external memory. Each of
these requests loads 32 bytes at a time.
Statistics derived from these two events:
• Instruction cache miss-rate. This is derived by dividing PMN1 by PMN0.
• The average number of cycles taken to execute an instruction, commonly referred
to as cycles-per-instruction (CPI). CPI is derived by dividing CCNT by PMN0,
where CCNT is used to measure total execution time.
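The two statistics above can be sketched in C as follows. This is a minimal illustration, not code from this manual: the structure and function names are hypothetical, the counters are assumed to have been read as 32-bit values without overflowing during the measurement, and CCNT is assumed to have counted every clock cycle.

```c
#include <stdint.h>

/* Hypothetical snapshot of the counters after a run in
 * instruction cache efficiency mode. */
struct icache_sample {
    uint32_t ccnt;  /* CCNT: clock cycles elapsed (total execution time) */
    uint32_t pmn0;  /* PMN0: instructions executed */
    uint32_t pmn1;  /* PMN1: instruction fetch requests to external memory */
};

/* Instruction cache miss-rate: PMN1 divided by PMN0.
 * Returns 0.0 when no instructions were counted. */
double icache_miss_rate(const struct icache_sample *s)
{
    return s->pmn0 ? (double)s->pmn1 / (double)s->pmn0 : 0.0;
}

/* Cycles-per-instruction (CPI): CCNT divided by PMN0.
 * Returns 0.0 when no instructions were counted. */
double cycles_per_instruction(const struct icache_sample *s)
{
    return s->pmn0 ? (double)s->ccnt / (double)s->pmn0 : 0.0;
}

/* Bytes loaded from external memory: each fetch request loads 32 bytes. */
uint64_t icache_bytes_fetched(const struct icache_sample *s)
{
    return (uint64_t)s->pmn1 * 32u;
}
```

For example, a sample with CCNT = 2000, PMN0 = 1000, and PMN1 = 250 gives a miss-rate of 0.25 and a CPI of 2.0.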
3.7.4.2 Data Cache Efficiency Mode
PMN0 totals the number of data cache accesses, which include cacheable and
non-cacheable accesses, mini-data cache accesses, and accesses made to locations
configured as data RAM.
Note that STM and LDM will each count as several accesses to the data cache
depending on the number of registers specified in the register list. LDRD will register
two accesses.
PMN1 counts the number of data cache and mini-data cache misses. Cache operations
do not contribute to this count. See “Register 7: Cache Functions” on page 81 for a
description of these operations.
The statistic derived from these two events is:
• Data cache miss-rate. This is derived by dividing PMN1 by PMN0.
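The data cache miss-rate can be computed as in the following C sketch. The function names are hypothetical, and the counter values are assumed to have been read as 32-bit quantities that did not overflow. An integer variant is included as one option for code that prefers to avoid floating-point arithmetic.

```c
#include <stdint.h>

/* Data cache miss-rate: PMN1 (misses) divided by PMN0 (accesses).
 * Returns 0.0 when no accesses were counted. */
double dcache_miss_rate(uint32_t pmn0_accesses, uint32_t pmn1_misses)
{
    return pmn0_accesses ? (double)pmn1_misses / (double)pmn0_accesses : 0.0;
}

/* The same ratio expressed as misses per 1000 accesses, using only
 * integer arithmetic. The 64-bit intermediate prevents overflow of
 * misses * 1000. Returns 0 when no accesses were counted. */
uint32_t dcache_misses_per_thousand(uint32_t pmn0_accesses, uint32_t pmn1_misses)
{
    if (pmn0_accesses == 0)
        return 0;
    return (uint32_t)(((uint64_t)pmn1_misses * 1000u) / pmn0_accesses);
}
```

Because PMN0 includes non-cacheable accesses and accesses to data RAM, the ratio describes all data accesses, not only those to cacheable locations.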
3.7.4.3 Instruction Fetch Latency Mode
PMN0 accumulates the number of cycles when the instruction-cache is not able to
deliver an instruction to the IXP42X product line and IXC1100 control plane processors
due to an instruction-cache miss or instruction-TLB miss. This event means that the
processor core is stalled.
PMN1 counts the number of instruction fetch requests to external memory. Each of
these requests loads 32 bytes at a time. This is the same event as measured in
instruction cache efficiency mode.
Statistics derived from these two events:
• The average number of cycles the processor stalled waiting for an instruction fetch
from external memory to return. This is calculated by dividing PMN0 by PMN1. If
the average is high, then the IXP42X product line and IXC1100 control plane
processors may be starved of the external bus.
• The percentage of total execution cycles the processor stalled waiting on an
instruction fetch from external memory to return. This is calculated by dividing
PMN0 by CCNT, which was used to measure total execution time.
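Both instruction fetch latency statistics can be sketched in C as shown below. The function names are hypothetical, the counters are assumed to be 32-bit values that did not overflow, and CCNT is assumed to have counted every clock cycle. The second function multiplies the PMN0/CCNT ratio by 100 to express it as a percentage.

```c
#include <stdint.h>

/* Average stall cycles per instruction fetch from external memory:
 * PMN0 (stall cycles) divided by PMN1 (fetch requests).
 * Returns 0.0 when no fetch requests were counted. */
double avg_fetch_stall_cycles(uint32_t pmn0_stall_cycles, uint32_t pmn1_fetches)
{
    return pmn1_fetches ? (double)pmn0_stall_cycles / (double)pmn1_fetches : 0.0;
}

/* Percentage of total execution cycles spent stalled on instruction
 * fetches: PMN0 (stall cycles) divided by CCNT (total cycles), times 100.
 * Returns 0.0 when CCNT is zero. */
double fetch_stall_percent(uint32_t pmn0_stall_cycles, uint32_t ccnt)
{
    return ccnt ? 100.0 * (double)pmn0_stall_cycles / (double)ccnt : 0.0;
}
```

For example, PMN0 = 3000, PMN1 = 100, and CCNT = 12000 give an average of 30 stall cycles per fetch, with 25 percent of execution time spent stalled.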