AMD 250 Computer Hardware User Manual

Chapter 4 Instruction-Decoding Optimizations 75

Software Optimization Guide for AMD64 Processors

25112 Rev. 3.06 September 2005

Application

This optimization applies to:

• 32-bit software

• 64-bit software

Rationale

Load-execute floating-point instructions that take integer operands are VectorPath instructions and generate two micro-ops in a cycle, while discrete load and execute instructions enable a third DirectPath instruction to be decoded in the same cycle. In some situations, this optimization can also reduce execution time, provided FILD can be scheduled several instructions ahead of the arithmetic instruction in order to cover the FILD latency.

Example

Avoid code such as the following, which uses load-execute floating-point instructions that take integer operands:

fld   QWORD PTR [foo]   ; Push foo onto FP stack [ST(0) = foo].
fimul DWORD PTR [bar]   ; Multiply ST(0) by bar [ST(0) = bar * foo].
fiadd DWORD PTR [baz]   ; Add baz to ST(0) [ST(0) = baz + (bar * foo)].

Instead, use code such as the following, which uses discrete load and execute instructions:

fild  DWORD PTR [bar]   ; Push bar onto FP stack.
fild  DWORD PTR [baz]   ; Push baz onto FP stack.
fld   QWORD PTR [foo]   ; Push foo onto FP stack.
fmulp st(2), st         ; Multiply and pop [ST(1) = foo * bar, ST(0) = baz].
faddp st(1), st         ; Add and pop [ST(0) = baz + (foo * bar)].
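The stack comments in the two sequences above can be checked with a small model. The following C sketch is not part of the guide: it is a toy model of the x87 register stack, assuming plain double arithmetic rather than the 80-bit extended precision of the real FPU, and the names X87, push, pop_top, fmulp, faddp, load_execute, and discrete are hypothetical helpers introduced only for illustration. It shows that both sequences leave baz + (foo * bar) in ST(0); it says nothing about decode bandwidth or latency, which is the actual reason for preferring the discrete form.

```c
#include <assert.h>

/* Toy model of the x87 register stack: st[0] plays the role of ST(0). */
typedef struct {
    double st[8];
    int depth;
} X87;

/* Push a value: every existing entry moves one slot deeper. */
static void push(X87 *f, double v) {
    assert(f->depth < 8);
    for (int i = f->depth; i > 0; --i) {
        f->st[i] = f->st[i - 1];
    }
    f->st[0] = v;
    f->depth++;
}

/* Pop the top of stack: every remaining entry moves one slot up. */
static void pop_top(X87 *f) {
    assert(f->depth > 0);
    for (int i = 1; i < f->depth; ++i) {
        f->st[i - 1] = f->st[i];
    }
    f->depth--;
}

/* fmulp st(i), st : ST(i) = ST(i) * ST(0), then pop. */
static void fmulp(X87 *f, int i) {
    assert(i > 0 && i < f->depth);
    f->st[i] *= f->st[0];
    pop_top(f);
}

/* faddp st(i), st : ST(i) = ST(i) + ST(0), then pop. */
static void faddp(X87 *f, int i) {
    assert(i > 0 && i < f->depth);
    f->st[i] += f->st[0];
    pop_top(f);
}

/* Load-execute form: fld / fimul / fiadd. */
static double load_execute(double foo, int bar, int baz) {
    X87 f = {{0}, 0};
    push(&f, foo);            /* fld   QWORD PTR [foo] */
    f.st[0] *= (double)bar;   /* fimul DWORD PTR [bar] */
    f.st[0] += (double)baz;   /* fiadd DWORD PTR [baz] */
    return f.st[0];
}

/* Discrete form: fild / fild / fld / fmulp / faddp. */
static double discrete(double foo, int bar, int baz) {
    X87 f = {{0}, 0};
    push(&f, (double)bar);    /* fild DWORD PTR [bar] */
    push(&f, (double)baz);    /* fild DWORD PTR [baz] */
    push(&f, foo);            /* fld  QWORD PTR [foo] */
    fmulp(&f, 2);             /* ST(1) = foo * bar, ST(0) = baz */
    faddp(&f, 1);             /* ST(0) = baz + (foo * bar) */
    return f.st[0];
}
```

For instance, with foo = 1.5, bar = 4, and baz = 3, calling either function from a test harness returns 9.0, which matches the final stack comment of each listing.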
