Intel Processor Computer Hardware User Manual


 
Intel® 80200 Processor based on Intel® XScale Microarchitecture Developer's Manual, March 2003
Performance Considerations
14.2 Branch Prediction
The Intel® 80200 processor implements dynamic branch prediction for the ARM* instructions B
and BL and for the Thumb* instruction B. Any instruction that specifies the PC as the destination
is predicted as not taken. For example, an LDR or a MOV that loads or moves directly to the PC is
predicted not taken and incurs a branch latency penalty.
These instructions (ARM B, ARM BL, and Thumb B) enter the branch target buffer when they are
"taken" for the first time. (A branch is "taken" when its condition evaluates to true.) Once an
instruction is in the branch target buffer, the Intel® 80200 processor dynamically predicts its
outcome based on previous outcomes. Table 14-2 shows the branch latency penalty when these
instructions are correctly predicted and when they are not. A penalty of zero for correct
prediction means that the Intel® 80200 processor can execute the next instruction in the program
flow in the cycle following the branch.
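The allocation and prediction behavior described above can be sketched as a small C model. This is an illustrative sketch only, not the processor's actual predictor: the text says only that prediction is "based on previous outcomes", so the last-outcome update rule used here is an assumption, and the names `btb_entry`, `branch_execute`, and `loop_branch_penalty` are hypothetical. The 4-cycle figure is the ARM* misprediction penalty from Table 14-2.

```c
#include <stdbool.h>

/* Hypothetical model of the state kept for one branch instruction. */
typedef struct {
    bool in_btb;         /* set once the branch has been taken for the first time */
    bool predict_taken;  /* ASSUMPTION: the prediction is simply the last outcome */
} btb_entry;

enum { ARM_MISPREDICT_PENALTY = 4 };  /* ARM* column of Table 14-2 */

/* Executes the branch once, updates the model, and returns the penalty cycles. */
int branch_execute(btb_entry *e, bool taken)
{
    if (!e->in_btb) {
        /* A branch that is not in the buffer behaves as predicted not-taken. */
        if (!taken)
            return 0;
        e->in_btb = true;          /* enters the buffer on its first taken outcome */
        e->predict_taken = true;
        return ARM_MISPREDICT_PENALTY;
    }

    int penalty = (e->predict_taken != taken) ? ARM_MISPREDICT_PENALTY : 0;
    e->predict_taken = taken;      /* ASSUMPTION: last-outcome update rule */
    return penalty;
}

/* Total penalty for a loop-closing branch that is taken on every
 * iteration except the last one. */
int loop_branch_penalty(int iterations)
{
    btb_entry e = { false, false };
    int total = 0;
    for (int i = 0; i < iterations; i++)
        total += branch_execute(&e, i < iterations - 1);
    return total;
}
```

Under these assumptions, a loop of two or more iterations pays the penalty twice: once when the branch is first taken (it is not yet in the buffer) and once on loop exit (predicted taken, actually not taken). The total is therefore independent of the iteration count.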
14.3 Addressing Modes
None of the load and store addressing modes implemented in the Intel® 80200 processor adds to
the instruction latency numbers.
Table 14-2. Branch Latency Penalty

Core Clock Cycles
ARM*    Thumb*    Description
+0      +0        Predicted Correctly. The instruction is in the branch target buffer and is
                  correctly predicted.
+4      +5        Mispredicted. There are three occurrences of branch misprediction, all of
                  which incur the branch delay penalty listed for this row (4 cycles for ARM*,
                  5 cycles for Thumb*).
                  1. The instruction is in the branch target buffer and is predicted
                     not-taken, but is actually taken.
                  2. The instruction is not in the branch target buffer and is a taken
                     branch.
                  3. The instruction is in the branch target buffer and is predicted taken,
                     but is actually not-taken.
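
The rows of Table 14-2 can be expressed as a small lookup function, which may be convenient when estimating branch cost in a cycle-count model. This is a sketch under stated assumptions: the type `isa_mode` and the function `branch_latency_penalty` are hypothetical names, and the zero penalty for a not-taken branch that is absent from the branch target buffer is inferred from the table (that case is not listed as a misprediction) rather than stated explicitly.

```c
#include <stdbool.h>

typedef enum { MODE_ARM, MODE_THUMB } isa_mode;  /* hypothetical names */

/* Returns the branch latency penalty in core clock cycles per Table 14-2. */
int branch_latency_penalty(isa_mode mode, bool in_btb,
                           bool predicted_taken, bool actually_taken)
{
    /* A branch that is not in the buffer behaves as predicted not-taken;
     * predicted_taken is only meaningful when in_btb is true. */
    bool effective_prediction = in_btb && predicted_taken;

    if (effective_prediction == actually_taken)
        return 0;                        /* no penalty (inferred for the not-in-buffer case) */

    return (mode == MODE_ARM) ? 4 : 5;   /* mispredicted: +4 ARM*, +5 Thumb* */
}
```

The three numbered misprediction cases map to the arguments `(true, false, true)`, `(false, any, true)`, and `(true, true, false)` for `in_btb`, `predicted_taken`, and `actually_taken`.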