ARM R4 Computer Hardware User Manual


 
Prefetch Unit
ARM DDI 0363E Copyright © 2009 ARM Limited. All rights reserved. 5-4
ID013010 Non-Confidential, Unrestricted Access
5.2.2 Branch predictor
Branch prediction in the processor is dynamic and is based on a global history prediction
scheme. In addition, there is extra logic to handle predictions that thrash and to predict the end
of long loops.
The global history scheme is an adaptive predictor that learns the behavior of branches during
execution, based on the historical pattern of behavior of the preceding branches. For each
pattern of branch behavior, the history table holds a 2-bit hint value. The 2-bit hint indicates if
the next branch must be predicted taken or predicted not-taken based on the behavior of previous
branches. The history table contains 256 entries.
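The global history scheme can be illustrated with a minimal C sketch. The 256-entry table and the 2-bit hint come from the description above. The 8-bit history register, the saturating-counter update rule, the reset value of the hints, and all names (ghp_t, ghp_predict, ghp_update) are assumptions made for illustration, not a description of the actual PFU logic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define HISTORY_ENTRIES 256u   /* table size stated in the text */

typedef struct {
    uint8_t history;                /* assumed: last 8 outcomes, 1 = taken */
    uint8_t hint[HISTORY_ENTRIES];  /* 2-bit hints, values 0..3 */
} ghp_t;

/* Reset every hint to 1 ("weakly not-taken"); the reset state is assumed. */
static void ghp_init(ghp_t *p)
{
    p->history = 0;
    for (unsigned i = 0; i < HISTORY_ENTRIES; i++)
        p->hint[i] = 1;
}

/* Predict taken when the hint selected by the current history is 2 or 3. */
static bool ghp_predict(const ghp_t *p)
{
    return p->hint[p->history] >= 2;
}

/* Train the selected hint on the resolved outcome (saturating at 0 and 3),
 * then shift the outcome into the history register. */
static void ghp_update(ghp_t *p, bool taken)
{
    uint8_t *h = &p->hint[p->history];
    assert(*h <= 3);
    if (taken && *h < 3)
        (*h)++;
    else if (!taken && *h > 0)
        (*h)--;
    p->history = (uint8_t)((p->history << 1) | (taken ? 1u : 0u));
}
```

In this model, a strictly alternating taken/not-taken branch is predicted correctly once the history register has filled with the pattern, because the two alternating history values select two different hints.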
For loops beyond a certain number of iterations, the branch history is not large enough to learn
the history and predict the loop exit. The PFU includes logic to count the number of iterations
(up to 31) of a loop, and thereby predict the not-taken branch that exits the loop. If the number
of iterations taken exceeds 31, the loop branch is never predicted as not-taken.
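The loop-exit behavior can be sketched in the same way. Only the limit of 31 iterations comes from the text. The structure below (one learned count per loop branch, a validity flag, an overflow flag) and the names loop_pred_t, loop_predict_taken and loop_update are hypothetical, chosen to show the counting idea rather than the real hardware organization.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define LOOP_COUNT_MAX 31u   /* limit stated in the text */

typedef struct {
    uint8_t learned;   /* taken iterations seen before the last loop exit */
    uint8_t current;   /* taken iterations so far in the current run */
    bool    valid;     /* learned count is usable (last run did not overflow) */
    bool    overflow;  /* current run has exceeded LOOP_COUNT_MAX */
} loop_pred_t;

/* Predict the loop branch: not-taken only when the current count reaches
 * a valid learned count; otherwise taken. */
static bool loop_predict_taken(const loop_pred_t *l)
{
    if (!l->valid || l->overflow)
        return true;
    return l->current != l->learned;
}

/* Update with the resolved outcome of the loop branch. */
static void loop_update(loop_pred_t *l, bool taken)
{
    assert(l->current <= LOOP_COUNT_MAX);
    if (taken) {
        if (l->current < LOOP_COUNT_MAX)
            l->current++;
        else
            l->overflow = true;        /* more than 31 taken iterations */
    } else {                           /* loop exit: record the trip count */
        l->valid = !l->overflow;
        l->learned = l->current;
        l->current = 0;
        l->overflow = false;
    }
}
```

After one complete run of a loop with at most 31 taken iterations, this model predicts the exit on the next run. A loop with more taken iterations never yields a not-taken prediction, matching the stated limit.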
If multiple branch histories index into the same hint value, this can cause thrashing in the history
table and reduce accuracy of the branch predictor. Logic in the branch predictor detects these
cases and provides some hysteresis for the hint value.
For direct branches, the target address is calculated statically from the instruction encoding and
the program counter. For indirect branches, the hint value predicts if the branch is taken or
not-taken, and the return stack can sometimes be used to predict the target address. When the
destination of a branch can be neither calculated statically nor popped from the return stack, the
PFU assumes the branch to be not-taken.
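The target selection rules above can be summarized in a short C sketch. The ARM-state B/BL target calculation (instruction address plus 8 plus the sign-extended 24-bit immediate shifted left by 2) follows the architectural encoding. The return stack depth of 4, the branch classification enum, and the names arm_b_target, rs_push and predict_branch are assumptions for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RS_DEPTH 4u   /* assumed depth, not taken from the text */

typedef enum { BR_DIRECT, BR_RETURN, BR_INDIRECT_OTHER } branch_kind_t;
typedef struct { uint32_t entry[RS_DEPTH]; unsigned depth; } return_stack_t;
typedef struct { bool taken; uint32_t target; } prediction_t;

/* Static target of an ARM-state B/BL: PC (insn address + 8) + signed offset. */
static uint32_t arm_b_target(uint32_t insn, uint32_t insn_addr)
{
    uint32_t off = (insn & 0x00FFFFFFu) << 2;   /* 26-bit byte offset */
    if (off & 0x02000000u)
        off |= 0xFC000000u;                     /* sign-extend from bit 25 */
    return insn_addr + 8u + off;                /* wraps modulo 2^32 */
}

/* Push a return address; when full, the oldest entry is dropped. */
static void rs_push(return_stack_t *rs, uint32_t addr)
{
    assert(rs->depth <= RS_DEPTH);
    if (rs->depth == RS_DEPTH) {
        for (unsigned i = 1; i < RS_DEPTH; i++)
            rs->entry[i - 1] = rs->entry[i];
        rs->depth--;
    }
    rs->entry[rs->depth++] = addr;
}

/* Combine the taken/not-taken hint with whatever target is available. */
static prediction_t predict_branch(branch_kind_t kind, bool hint_taken,
                                   uint32_t insn, uint32_t insn_addr,
                                   return_stack_t *rs)
{
    prediction_t p = { false, insn_addr + 4u };  /* fall through (ARM state) */
    if (!hint_taken)
        return p;
    if (kind == BR_DIRECT) {
        p.taken = true;
        p.target = arm_b_target(insn, insn_addr);
    } else if (kind == BR_RETURN && rs->depth > 0) {
        p.taken = true;
        p.target = rs->entry[--rs->depth];
    }
    return p;   /* no target available: treated as not-taken */
}
```

The final return path covers the case described in the text: an indirect branch whose target is not on the return stack is predicted not-taken regardless of the hint.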
The PFU updates the history for each occurrence of a branch when the DPU indicates how the
branch was resolved.
Configuring the branch predictor
You can configure the branch predictor by setting bits in the Auxiliary Control Register:
Set bits [16:15] to b00 to enable prediction using the pattern history tables.
Set bits [16:15] to b01 to force branches to be always predicted taken.
Set bits [16:15] to b10 to force branches to be always predicted not-taken.
Set bit [21] to disable prediction using the dynamic branch predictor loop cache.
Set bit [20] to disable prediction using the dynamic branch predictor register extension
cache.
For more information, see c1, Auxiliary Control Register on page 4-38.
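As an illustration of the bit settings listed above, the following C helpers compute a new Auxiliary Control Register value from an existing one. The bit positions come from the list. The macro and function names are hypothetical. Reading and writing the register itself is not shown: on hardware this would normally be a privileged CP15 access (MRC/MCR p15, 0, <Rd>, c1, c0, 1), which is left out so that the sketch stays independent of any toolchain.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ACTLR_BP_SHIFT      15u
#define ACTLR_BP_MASK       (3u << ACTLR_BP_SHIFT)  /* bits [16:15] */
#define ACTLR_EXT_DISABLE   (1u << 20)              /* bit [20] */
#define ACTLR_LOOP_DISABLE  (1u << 21)              /* bit [21] */

typedef enum {
    BP_NORMAL           = 0,  /* b00: predict using the pattern history tables */
    BP_ALWAYS_TAKEN     = 1,  /* b01: always predict taken */
    BP_ALWAYS_NOT_TAKEN = 2   /* b10: always predict not-taken */
} bp_mode_t;

/* Return actlr with bits [16:15] replaced by the requested mode. */
static uint32_t actlr_set_bp_mode(uint32_t actlr, bp_mode_t mode)
{
    assert((unsigned)mode <= 2u);
    return (actlr & ~ACTLR_BP_MASK) | ((uint32_t)mode << ACTLR_BP_SHIFT);
}

/* Return actlr with a single-bit disable flag set or cleared. */
static uint32_t actlr_set_flag(uint32_t actlr, uint32_t flag, bool set)
{
    return set ? (actlr | flag) : (actlr & ~flag);
}
```

For example, actlr_set_bp_mode(value, BP_ALWAYS_NOT_TAKEN) followed by actlr_set_flag(result, ACTLR_LOOP_DISABLE, true) yields a value with bits [16:15] equal to b10 and bit [21] set, leaving all other bits unchanged.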
5.2.3 Incorrect predictions and correction
The DPU resolves dynamically predicted branches at the Wr-stage of the pipeline. See Figure 1-3
on page 1-17. A misprediction causes the PFU to flush the pipeline and fetch the correct
instruction stream.