
IBM SA14-2339-04 Personal Computer User Manual



Code Optimization and Instruction Timings C-3

Align branch targets that are unlikely to be reached by “fall-through” code (such as the entry points of functions like strcpy) on cache line boundaries, to minimize the number of unused instructions in cache line fills.
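As a minimal sketch of this advice, a routine that is entered by a call rather than by fall-through can be placed on a cache line boundary with a compiler attribute. The sketch assumes a GCC-compatible compiler (the aligned attribute is an extension, not standard C) and a 32-byte cache line; CACHE_LINE_BYTES, copy_string, and is_line_aligned are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 32-byte cache lines. Adjust for the target core. */
#define CACHE_LINE_BYTES 32

/*
 * Hypothetical hot routine that is reached by a branch-and-link, not by
 * fall-through. The aligned attribute (GCC/Clang extension) asks the
 * compiler to start the function on a cache line boundary, so a line
 * fill triggered by the call brings in only instructions of this routine.
 */
__attribute__((aligned(CACHE_LINE_BYTES)))
char *copy_string(char *dst, const char *src)
{
    char *p = dst;

    while ((*p++ = *src++) != '\0') {
        /* copy bytes up to and including the terminator */
    }
    return dst;
}

/* Returns 1 if addr lies on a cache line boundary, 0 otherwise. */
int is_line_aligned(uintptr_t addr)
{
    return (addr % CACHE_LINE_BYTES) == 0;
}
```

Passing (uintptr_t)&copy_string to is_line_aligned is one way to check the placement at run time. The same effect can usually be obtained for a whole translation unit with a compiler option or in assembly with an alignment directive, at the cost of padding between functions.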

C.2 Instruction Timings

The following timing descriptions consider only “first order” effects of cache misses in the ICU (instruction-side) and DCU (data-side) arrays. The timing descriptions do not provide complete descriptions of the performance penalty associated with cache misses; they do not consider bus contention between the instruction side and the data side, or the time associated with performing line fills or flushes. Unless specifically stated otherwise, the cycle counts apply to systems having zero-wait memory access.

C.2.1 General Rules

Instructions execute in order.

All instructions, assuming cache hits, execute in one cycle, except:

• Divide instructions execute in 35 clock cycles.

• Branches execute in one or three clock cycles, as described in “Branches.”

• MAC and multiply instructions execute in one to five cycles, as described in “Multiplies.”

• Aligned load/store instructions that hit in the cache execute in one clock cycle per word. See “Alignment” for information on execution timings for unaligned load/stores.

• In isolation, a data cache control instruction takes two cycles in the processor pipeline. However, subsequent DCU accesses are stalled until a cache control instruction finishes accessing the data cache array.

Note: Subsequent DCU accesses do not remain stalled while transfers associated with previous data cache control instructions continue on the PLB.
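The general rules above can be turned into a rough first-order cycle count for a straight-line instruction mix. The following is a sketch only: struct insn_mix and estimate_cycles are hypothetical names, every access is assumed to hit in the cache, and stalls, dependencies, and PLB traffic are ignored. Because branch and multiply timings depend on details given in “Branches” and “Multiplies,” the caller supplies those counts directly.

```c
#include <assert.h>

/* Cycle costs quoted in the general rules (cache hits assumed). */
enum {
    CYCLES_SIMPLE = 1,       /* most instructions */
    CYCLES_DIVIDE = 35,      /* divide instructions */
    CYCLES_DCACHE_CTRL = 2   /* data cache control instruction, in isolation */
};

/* Hypothetical description of an instruction mix. */
struct insn_mix {
    unsigned simple;            /* single-cycle instructions */
    unsigned divides;           /* divide instructions */
    unsigned branches_1cycle;   /* branches that take one cycle */
    unsigned branches_3cycle;   /* branches that take three cycles */
    unsigned multiplies;        /* MAC and multiply instructions */
    unsigned multiply_cycles;   /* total cycles for those, 1 to 5 each */
    unsigned load_store_words;  /* words moved by aligned loads/stores */
    unsigned dcache_ctrl;       /* data cache control instructions */
};

/* First-order estimate: sums per-instruction costs with no overlap. */
unsigned estimate_cycles(const struct insn_mix *m)
{
    /* Each multiply must account for between one and five cycles. */
    assert(m->multiply_cycles >= m->multiplies);
    assert(m->multiply_cycles <= 5u * m->multiplies);

    return m->simple * CYCLES_SIMPLE
         + m->divides * CYCLES_DIVIDE
         + m->branches_1cycle * 1u
         + m->branches_3cycle * 3u
         + m->multiply_cycles
         + m->load_store_words      /* one cycle per word */
         + m->dcache_ctrl * CYCLES_DCACHE_CTRL;
}
```

Such a count is useful mainly for comparing alternative code sequences, for example to see how heavily a single 35-cycle divide weighs against a run of single-cycle instructions.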

C.2.2 Branches

Branch instructions are decoded in prefetch buffer 0 (PFB0) and the decode (DCD) stage of the instruction pipeline. Branch targets, whether the branch is known or predicted taken, can be fetched from the PFB0 and DCD stages. Incorrectly predicted branches can be corrected from the DCD or EXE (execute) stages of the pipeline.

Branches can be known taken or known not taken, or can have address or condition dependencies. Branches having address dependencies are never predicted taken. The directions of conditional branches having no address dependencies are statically predicted.

Conditional branches may depend on the results of an instruction that is changing the CR or the CTR.

Address dependencies can occur when:

• A bclr instruction that is known taken, or unresolved, follows (immediately, or separated by only one instruction) a link updating instruction (mtlr or a branch and link).

• A bcctr instruction that is known taken, or unresolved, follows (immediately, or separated by only one instruction) a counter updating instruction (mtctr or a branch that decrements the counter).
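The two conditions above amount to a look-back of at most two instructions. The sketch below checks an instruction sequence for that pattern; the enum insn_kind classification and the function may_have_address_dependency are hypothetical, and the instruction stream is assumed to be already classified by the caller. It reports only that a dependency can occur, as the text states, not that one necessarily does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical coarse classification of instructions. */
enum insn_kind {
    INSN_OTHER,
    INSN_LINK_UPDATE,  /* mtlr or a branch and link */
    INSN_CTR_UPDATE,   /* mtctr or a branch that decrements the counter */
    INSN_BCLR,
    INSN_BCCTR
};

/*
 * Returns 1 if the branch at seq[i] can have an address dependency:
 * it is a bclr (or bcctr) that is known taken or unresolved, and a
 * link (or counter) updating instruction precedes it immediately or
 * with exactly one instruction in between. Returns 0 otherwise.
 * known_not_taken is nonzero when the branch is known not taken.
 */
int may_have_address_dependency(const enum insn_kind *seq, size_t i,
                                int known_not_taken)
{
    enum insn_kind producer;
    size_t back;

    assert(seq != NULL);

    if (known_not_taken)
        return 0;

    if (seq[i] == INSN_BCLR)
        producer = INSN_LINK_UPDATE;
    else if (seq[i] == INSN_BCCTR)
        producer = INSN_CTR_UPDATE;
    else
        return 0;

    /* Look at the previous instruction and the one before it. */
    for (back = 1; back <= 2 && back <= i; back++) {
        if (seq[i - back] == producer)
            return 1;
    }
    return 0;
}
```

A scheduler or a hand-optimizing programmer could use a check like this to decide whether to move the mtlr or mtctr earlier, placing at least two unrelated instructions between it and the dependent branch.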
