Chapter 4 Instruction-Decoding Optimizations 73

Software Optimization Guide for AMD64 Processors

25112 Rev. 3.06 September 2005

4.2 Load-Execute Instructions

A load-execute instruction loads a value from memory into a register and then performs an operation on that value. Many general-purpose instructions, such as ADD, SUB, and AND, have load-execute forms:

add rax, QWORD PTR [foo]

This instruction loads the value foo from memory and then adds it to the value in the RAX register.

The work performed by a load-execute instruction can also be accomplished by using two discrete instructions: a load instruction followed by an execute instruction. The following example employs discrete load and execute stages:

mov rbx, QWORD PTR [foo]

add rax, rbx

The first statement loads the value foo from memory into the RBX register. The second statement adds the value in RBX to the value in RAX.

The following optimizations govern the use of load-execute instructions:

• Load-Execute Integer Instructions on page 73.

• Load-Execute Floating-Point Instructions with Floating-Point Operands on page 74.

• Load-Execute Floating-Point Instructions with Integer Operands on page 74.

4.2.1 Load-Execute Integer Instructions

Optimization

❖ When performing integer computations, use load-execute instructions instead of discrete load and execute instructions. Use discrete load and execute instructions only to avoid scheduler stalls for longer-executing instructions and to explicitly schedule load and execute operations.
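As an illustration of the exception, the following sketch hoists a load ahead of an independent, longer-latency instruction. It is only an assumed example: the choice of IMUL as the long-running instruction, the registers RCX and RDX, and the premise that the multiply does not depend on foo are not taken from this guide, and whether the split form helps on a given processor should be confirmed by measurement.

```asm
; Hypothetical sketch: explicit scheduling with discrete load and execute.
; Assumes foo is a 64-bit memory operand and that the IMUL is independent
; of the loaded value.
mov  rbx, QWORD PTR [foo]   ; start the load early
imul rcx, rdx               ; independent, longer-latency work
add  rax, rbx               ; consume the loaded value afterward
```

When no such independent work is available to place between the load and its use, the load-execute form remains the recommended choice.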

Application

This optimization applies to:

• 32-bit software

• 64-bit software

Rationale

Most load-execute integer instructions are DirectPath decodable and can be decoded at the rate of three per cycle. Splitting a load-execute integer instruction into two separate instructions reduces effective decoding bandwidth and increases register pressure, which results in lower performance.
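The cost can be seen by writing the same computation both ways. The sketch below is an assumption for illustration only: RSI is taken to point to three consecutive 64-bit integers that are added to RAX, and RBX is an arbitrary scratch register.

```asm
; Hypothetical sketch: RSI is assumed to point to three 64-bit integers.

; Load-execute form: three instructions, no scratch register.
add rax, QWORD PTR [rsi]
add rax, QWORD PTR [rsi+8]
add rax, QWORD PTR [rsi+16]

; Discrete form: six instructions, and RBX is tied up as a temporary.
mov rbx, QWORD PTR [rsi]
add rax, rbx
mov rbx, QWORD PTR [rsi+8]
add rax, rbx
mov rbx, QWORD PTR [rsi+16]
add rax, rbx
```

Both sequences leave the same sum in RAX. The discrete form presents twice as many instructions to the decoders and makes RBX unavailable for other values for the duration of the sequence.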
