IBM SA14-2339-04 Personal Computer User Manual


 
Code Optimization and Instruction Timings C-3
Align branch targets that are unlikely to be reached by “fall-through” code (for example, the
entry addresses of functions such as strcpy) on cache line boundaries, to minimize the number of
unused instructions in cache line fills.
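The alignment guideline above can be illustrated with a little address arithmetic. The following is a minimal sketch, not taken from this manual: the 32-byte cache line size is an assumption that should be checked against the cache description for the core in use, and the helper names align_to_cache_line and unused_slots_before are hypothetical. The 4-byte slot size reflects the fixed 32-bit instruction width.

```c
#include <stdint.h>

/* ASSUMPTION: 32-byte instruction cache lines. Verify for your core. */
#define CACHE_LINE_BYTES 32u

/* Hypothetical helper: round addr up to the next cache line boundary.
   An address that is already aligned is returned unchanged. */
static uintptr_t align_to_cache_line(uintptr_t addr)
{
    uintptr_t mask = (uintptr_t)(CACHE_LINE_BYTES - 1u);
    return (addr + mask) & ~mask;
}

/* Hypothetical helper: number of 4-byte instruction slots that sit in
   the same cache line ahead of addr. If addr is reached only by a
   branch, these slots are brought in by the line fill but not used. */
static unsigned unused_slots_before(uintptr_t addr)
{
    return (unsigned)((addr & (uintptr_t)(CACHE_LINE_BYTES - 1u)) / 4u);
}
```

Under these assumptions, a function entry placed at 0x1010 leaves four unused instruction slots ahead of it in its cache line, while moving the entry to 0x1020 leaves none.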
C.2 Instruction Timings
The following timing descriptions consider only “first order” effects of cache misses in the ICU
(instruction-side) and DCU (data-side) arrays.
The timing descriptions do not provide complete descriptions of the performance penalty associated
with cache misses; they do not consider bus contention between the instruction side and the data
side, or the time associated with performing line fills or flushes. Unless specifically stated
otherwise, the cycle counts apply to systems having zero-wait memory access.
C.2.1 General Rules
• Instructions execute in order.
• All instructions, assuming cache hits, execute in one cycle, except:
  - Divide instructions execute in 35 clock cycles.
  - Branches execute in one or three clock cycles, as described in “Branches.”
  - MAC and multiply instructions execute in one to five cycles, as described in “Multiplies.”
  - Aligned load/store instructions that hit in the cache execute in one clock cycle/word. See
    “Alignment” for information on execution timings for unaligned load/stores.
In isolation, a data cache control instruction takes two cycles in the processor pipeline. However,
subsequent DCU accesses are stalled until a cache control instruction finishes accessing the data
cache array.
Note: Subsequent DCU accesses do not remain stalled while transfers associated with
previous data cache control instructions continue on the PLB.
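The general rules above lend themselves to a rough first-order cycle estimate for a straight-line instruction sequence. The following is a minimal sketch under stated assumptions: every access hits in the cache, memory is zero-wait, and all loads and stores are aligned. The types insn_mix and cycle_bounds and the function estimate_cycles are hypothetical names introduced for illustration. Because branches take one or three cycles and multiplies take one to five, the sketch returns a lower and an upper bound rather than a single figure. Data cache control instructions are left out, since their cost depends on the DCU accesses that follow them.

```c
/* Hypothetical description of an instruction mix. */
struct insn_mix {
    unsigned simple;      /* instructions that execute in one cycle */
    unsigned divides;     /* divide instructions, 35 cycles each */
    unsigned branches;    /* branch instructions, 1 or 3 cycles each */
    unsigned multiplies;  /* MAC and multiply instructions, 1 to 5 cycles each */
    unsigned ldst_words;  /* words moved by aligned load/stores that hit */
};

/* Lower and upper bound on the cycle count. */
struct cycle_bounds {
    unsigned min;
    unsigned max;
};

/* First-order estimate only: ignores cache misses, bus contention,
   line fills and flushes, and data cache control instructions. */
static struct cycle_bounds estimate_cycles(const struct insn_mix *m)
{
    struct cycle_bounds b;
    /* Part of the total that does not vary between the two bounds. */
    unsigned fixed = m->simple + 35u * m->divides + m->ldst_words;

    b.min = fixed + 1u * m->branches + 1u * m->multiplies;
    b.max = fixed + 3u * m->branches + 5u * m->multiplies;
    return b;
}
```

For example, a mix of 10 simple instructions, 1 divide, 2 branches, 3 multiplies, and 4 load/store words gives bounds of 54 and 70 cycles under these assumptions.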
C.2.2 Branches
Branch instructions are decoded in prefetch buffer 0 (PFB0) and the decode (DCD) stage of the instruction
pipeline. Branch targets, whether the branch is known or predicted taken, can be fetched from the
PFB0 and DCD stages. Incorrectly predicted branches can be corrected from the DCD or EXE
(execute) stages of the pipeline.
Branches can be known taken or known not taken, or can have address or condition dependencies.
Branches having address dependencies are never predicted taken. The directions of conditional
branches having no address dependencies are statically predicted.
Conditional branches may depend on the results of an instruction that is changing the CR or the CTR.
Address dependencies can occur when:
• A bclr instruction that is known taken, or unresolved, follows (immediately, or separated by only
  one instruction) a link updating instruction (mtlr or a branch and link).
• A bcctr instruction that is known taken, or unresolved, follows (immediately, or separated by only
  one instruction) a counter updating instruction (mtctr or a branch that decrements the counter).
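The two address dependency conditions above can be expressed as a check over a short window of preceding instructions. The following is a minimal sketch, assuming the instructions are listed in the order in which they execute; the struct insn record and the function has_address_dependency are hypothetical and do not correspond to any tool or interface described in this manual.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-instruction summary used only for this sketch. */
struct insn {
    bool is_bclr;          /* branch conditional to link register */
    bool is_bcctr;         /* branch conditional to count register */
    bool updates_lr;       /* mtlr, or a branch and link */
    bool updates_ctr;      /* mtctr, or a branch that decrements the counter */
    bool known_not_taken;  /* branch is already resolved as not taken */
};

/* Returns true if insns[i] is a bclr or bcctr that is known taken or
   unresolved and follows, immediately or separated by only one
   instruction, an instruction that updates the register it branches to. */
static bool has_address_dependency(const struct insn *insns, size_t i)
{
    size_t back;

    /* A branch known not taken does not need its target address. */
    if (insns[i].known_not_taken)
        return false;

    /* Look at the previous instruction and the one before it. */
    for (back = 1; back <= 2 && back <= i; back++) {
        const struct insn *prev = &insns[i - back];
        if (insns[i].is_bclr && prev->updates_lr)
            return true;
        if (insns[i].is_bcctr && prev->updates_ctr)
            return true;
    }
    return false;
}
```

Under this model, placing at least two unrelated instructions between an mtlr and the bclr that uses it removes the address dependency.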