Toshiba TX39 Computer Hardware User Manual


 
Architecture
4.2 Delay Slot
Some R3900 Processor Core instructions are executed with a delay of one instruction cycle. The cycle in
which an instruction is delayed is called a delay slot. A delay occurs with load instructions and branch/jump
instructions.
4.2.1 Delayed load
With load instructions, a one-cycle delay occurs while waiting for the data being loaded to become
available for use by another instruction. The R3900 Processor Core checks the instruction in the
delay slot (the instruction immediately following the load instruction) to see if that instruction needs
to use the load result; if so, it stalls the pipeline (see Figure 4-2).
With the R3000A, if the instruction following a load instruction required access to the loaded data,
a NOP had to be inserted immediately after the load instruction. The delayed load feature of the
R3900 Processor Core eliminates the need for this NOP instruction, resulting in smaller code size
than with the R3000A.
LW  r2, 20(r0)     F   D   E   M   W
ADD r3, r1, r2         F   D   ES  E   M   W
                               ^ Pipeline stall
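The timing described above can be illustrated with a small software model. The following Python sketch is not part of the R3900 documentation; the function names and the tuple encoding of instructions are assumptions made for illustration only. It assumes a five-stage F D E M W pipeline in which an instruction that uses a load result immediately after the load costs one stall cycle, and it compares the sequence of Figure 4-2 with an R3000A-style sequence that contains an explicit NOP.

```python
# Hypothetical model, not Toshiba code: counts the cycles a straight-line
# instruction sequence takes in a five-stage F D E M W pipeline with a
# one-cycle load-use interlock.

def needs_stall(prev, cur):
    """True if cur reads the register that the load prev is still fetching."""
    op, dest, _ = prev
    _, _, sources = cur
    return op == "LW" and dest in sources

def count_cycles(program):
    """Return (total_cycles, stall_cycles) for a list of (op, dest, sources)."""
    stalls = sum(
        1 for prev, cur in zip(program, program[1:]) if needs_stall(prev, cur)
    )
    # The first instruction needs 5 cycles; each later one completes one cycle
    # after its predecessor, plus one extra cycle per interlock stall.
    total = 5 + (len(program) - 1) + stalls if program else 0
    return total, stalls

# The sequence from Figure 4-2: ADD uses r2 right after LW loads it.
r3900_code = [
    ("LW", "r2", ("r0",)),
    ("ADD", "r3", ("r1", "r2")),
]

# R3000A-style code: the programmer fills the load delay slot with a NOP.
r3000a_code = [
    ("LW", "r2", ("r0",)),
    ("NOP", None, ()),
    ("ADD", "r3", ("r1", "r2")),
]

print(count_cycles(r3900_code))   # (7, 1)
print(count_cycles(r3000a_code))  # (7, 0)
```

Under these assumptions both sequences finish in the same number of cycles, but the first is one instruction shorter, which matches the code size advantage described above.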
4.2.2 Delayed branching
Figure 4-3 shows the pipeline flow for jump/branch instructions. The branch target address that must
be generated for these types of instructions does not become available until the E stage, which is
too late to be used by the instruction in the branch delay slot. The branch target instruction is
fetched immediately after the branch delay slot cycle.
The delay slot cycle need not be wasted, however. It can be used to fetch an instruction that would
normally be executed prior to the branch instruction.
Branch/Jump instruction   F   D   E   M   W
                                  ^ Target address
Branch delay slot             F   D   E   M   W
Branch target address             F   D   E   M   W
You can make effective use of the branch delay slot as follows.
Since the instruction immediately following a branch instruction is executed before the branch
takes effect, you can place an instruction that logically should be executed just before the
branch into the delay slot following the branch instruction, provided the branch itself does not
depend on the result of that instruction.
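The execution order that results from filling the branch delay slot can be illustrated with a toy interpreter. The following Python sketch is not part of the R3900 documentation; the run function, the "OP" and "J" opcodes, and the list-index jump targets are hypothetical and exist only for this illustration. It assumes an unconditional jump whose target takes effect only after the instruction in the delay slot has executed.

```python
# Hypothetical toy interpreter, not Toshiba code: a taken jump redirects
# the fetch only after the instruction in its delay slot has executed.

def run(program):
    """Execute (op, arg) tuples; return the names of executed instructions."""
    trace = []
    pc = 0
    pending_target = None  # jump target waiting for the delay slot to finish
    while pc < len(program):
        op, arg = program[pc]
        trace.append(arg if op == "OP" else op)
        next_pc = pc + 1
        if pending_target is not None:
            # This instruction was the delay slot; now the jump takes effect.
            next_pc = pending_target
            pending_target = None
        if op == "J":
            pending_target = arg
        pc = next_pc
    return trace

# Delay slot wasted on a NOP.
with_nop = [
    ("OP", "add"),      # 0: work that logically precedes the jump
    ("J", 4),           # 1: jump to index 4
    ("OP", "nop"),      # 2: delay slot
    ("OP", "skipped"),  # 3: never executed
    ("OP", "target"),   # 4: jump target
]

# The add is moved into the delay slot; it still runs before the target.
filled = [
    ("J", 3),           # 0: jump to index 3
    ("OP", "add"),      # 1: delay slot
    ("OP", "skipped"),  # 2: never executed
    ("OP", "target"),   # 3: jump target
]

print(run(with_nop))  # ['add', 'J', 'nop', 'target']
print(run(filled))    # ['J', 'add', 'target']
```

In both programs the add executes before the instruction at the jump target, but the second program needs no NOP. The sketch does not model conditional branches or a branch placed in a delay slot.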
Figure 4-2. Load delay slot and pipeline stall
Figure 4-3. Branch instruction delay slot