B-42 March, 2003 Developer’s Manual
Intel® 80200 Processor based on Intel® XScale™ Microarchitecture
Optimization Guide

B.5.5 Scheduling the MRA and MAR Instructions (MRRC/MCRR)

The MRA (MRRC) instruction has an issue latency of 1 cycle, a result latency of 2 or 3 cycles depending on the destination register value being accessed, and a resource latency of 2 cycles.

Consider the code sample:

    mra r6, r7, acc0
    mra r8, r9, acc0
    add r1, r1, #1

The code shown above would incur a 1-cycle stall due to the 2-cycle resource latency of an MRA instruction. The code can be rearranged as shown below to prevent this stall.

    mra r6, r7, acc0
    add r1, r1, #1
    mra r8, r9, acc0
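
The stall count for the two orderings above can be checked with a small cycle-counting sketch. This is only a simplified model, not a description of the actual pipeline: it assumes in-order, single-issue execution, a 1-cycle issue latency for every instruction, and that the 2-cycle MRA resource latency is the only source of stalls. The function name `resource_stalls` and the list encoding of the program are hypothetical and chosen for illustration.

```python
# Simplified model (an assumption): in-order, single-issue, 1-cycle issue
# latency for every instruction; an MRA keeps its resource busy for 2 cycles.
MRA_RESOURCE_LATENCY = 2


def resource_stalls(program):
    """Count stall cycles caused only by MRA resource conflicts."""
    cycle = 0          # earliest cycle at which the next instruction can issue
    resource_free = 0  # cycle at which the MRA resource becomes free again
    stalls = 0
    for mnemonic in program:
        issue = cycle
        if mnemonic == "mra":
            # An MRA must wait until the previous MRA has released the resource.
            issue = max(cycle, resource_free)
            resource_free = issue + MRA_RESOURCE_LATENCY
        stalls += issue - cycle
        cycle = issue + 1
    return stalls


original = ["mra", "mra", "add"]    # back-to-back MRA instructions
rearranged = ["mra", "add", "mra"]  # ADD moved between the two MRAs

print(resource_stalls(original))    # 1
print(resource_stalls(rearranged))  # 0
```

Under these assumptions the model reproduces the 1-cycle stall of the first ordering and shows that placing one independent instruction between the two MRA instructions is enough to hide it.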

Similarly, the code shown below would incur a 2-cycle penalty due to the 3-cycle result latency for the second destination register.

    mra r6, r7, acc0
    mov r1, r7
    mov r0, r6
    add r2, r2, #1

The stalls incurred by the code shown above can be prevented by rearranging the code:

    mra r6, r7, acc0
    add r2, r2, #1
    mov r0, r6
    mov r1, r7
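
The effect of the result latencies can be traced with a short sketch that records when each register becomes available. It is a simplified model under stated assumptions: in-order, single-issue execution, a 2-cycle result latency for the first MRA destination register and 3 cycles for the second (as described above), and a 1-cycle issue and result latency for MOV and ADD. The function name `result_stalls` and the tuple encoding of instructions are hypothetical.

```python
def result_stalls(program):
    """Count stalls from waiting on source registers.

    program: list of (mnemonic, destination_regs, source_regs) tuples.
    """
    ready = {}  # register -> first cycle at which its value can be consumed
    cycle = 0   # earliest cycle at which the next instruction can issue
    stalls = 0
    for mnemonic, dests, srcs in program:
        # Issue only once every source register is available.
        issue = max([cycle] + [ready.get(reg, 0) for reg in srcs])
        stalls += issue - cycle
        # Assumed latencies: MRA gives 2 cycles for the first destination and
        # 3 for the second; everything else is modeled as 1 cycle.
        latencies = (2, 3) if mnemonic == "mra" else (1,) * len(dests)
        for reg, latency in zip(dests, latencies):
            ready[reg] = issue + latency
        cycle = issue + 1
    return stalls


original = [
    ("mra", ("r6", "r7"), ()),
    ("mov", ("r1",), ("r7",)),
    ("mov", ("r0",), ("r6",)),
    ("add", ("r2",), ("r2",)),
]
rearranged = [
    ("mra", ("r6", "r7"), ()),
    ("add", ("r2",), ("r2",)),
    ("mov", ("r0",), ("r6",)),
    ("mov", ("r1",), ("r7",)),
]

print(result_stalls(original))    # 2
print(result_stalls(rearranged))  # 0
```

In the rearranged order the independent ADD fills the first cycle after the MRA, the register with the 2-cycle latency is read next, and the register with the 3-cycle latency is read last, so each value is ready exactly when it is needed in this model.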

The MAR (MCRR) instruction has an issue latency, a result latency, and a resource latency of 2 cycles. Due to the 2-cycle issue latency, the pipeline would always stall for 1 cycle following a MAR instruction. The MAR instruction should, therefore, be used only where absolutely necessary.
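
The unavoidable 1-cycle stall follows directly from the issue latency, as the sketch below illustrates. It assumes a simplified in-order, single-issue model in which only issue latency is considered and every instruction other than MAR has a 1-cycle issue latency. The function name `issue_schedule` and the mnemonic lists are hypothetical.

```python
# Issue latencies in cycles. MAR is 2 per the text; every other mnemonic
# defaults to 1 in this simplified model (an assumption).
ISSUE_LATENCY = {"mar": 2}


def issue_schedule(program):
    """Return the cycle at which each instruction issues."""
    cycle = 0
    schedule = []
    for mnemonic in program:
        schedule.append(cycle)
        # The next instruction cannot issue until this one's issue latency
        # has elapsed, regardless of what that next instruction is.
        cycle += ISSUE_LATENCY.get(mnemonic, 1)
    return schedule


with_mar = issue_schedule(["mar", "add", "sub"])
without_mar = issue_schedule(["mov", "add", "sub"])

print(with_mar)     # [0, 2, 3]
print(without_mar)  # [0, 1, 2]
```

Unlike the MRA cases, the lost cycle here does not depend on which instruction follows, so reordering cannot recover it in this model; only removing the MAR does.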