Appendix C Instruction Latencies 321

Software Optimization Guide for AMD64 Processors

25112 Rev. 3.06 September 2005

Table 18. SSE Instructions (Continued)

                         Encoding
                         Prefix  First  2nd   ModRM       Decode      FPU
Syntax                   byte    byte   byte  byte        type        pipe(s)      Latency  Note
-----------------------  ------  -----  ----  ----------  ----------  -----------  -------  ----
MOVSS mem32, xmmreg      F3h     0Fh    11h   mm-xxx-xxx  DirectPath               2
MOVUPS xmmreg1, xmmreg2          0Fh    10h   11-xxx-xxx  Double                   2
MOVUPS xmmreg, mem128            0Fh    10h   mm-xxx-xxx  VectorPath               7
MOVUPS xmmreg1, xmmreg2          0Fh    11h   11-xxx-xxx  Double                   2
MOVUPS mem128, xmmreg            0Fh    11h   mm-xxx-xxx  VectorPath               4
MULPS xmmreg1, xmmreg2           0Fh    59h   11-xxx-xxx  Double      FMUL         5        1
MULPS xmmreg, mem128             0Fh    59h   mm-xxx-xxx  Double      FMUL         7        1
MULSS xmmreg1, xmmreg2   F3h     0Fh    59h   11-xxx-xxx  DirectPath  FMUL         4
MULSS xmmreg, mem32      F3h     0Fh    59h   mm-xxx-xxx  DirectPath  FMUL         6
ORPS xmmreg1, xmmreg2            0Fh    56h   11-xxx-xxx  Double      FMUL         3        1
ORPS xmmreg, mem128              0Fh    56h   mm-xxx-xxx  Double      FMUL         5        1
PAVGB mmreg1, mmreg2             0Fh    E0h   11-xxx-xxx  DirectPath  FADD/FMUL    2
PAVGB mmreg, mem64               0Fh    E0h   mm-xxx-xxx  DirectPath  FADD/FMUL    4
PAVGW mmreg1, mmreg2             0Fh    E3h   11-xxx-xxx  DirectPath  FADD/FMUL    2
PAVGW mmreg, mem64               0Fh    E3h   mm-xxx-xxx  DirectPath  FADD/FMUL    4
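The encoding columns can be read as a recipe for the instruction bytes: an optional prefix byte, the opcode bytes, then a ModRM byte whose top two bits are 11 for the register-to-register forms. The following is a minimal sketch of that reading, assuming the standard x86 ModRM layout (mod, reg, r/m) and no REX prefix, so only registers 0 through 7 are handled. The helper names `modrm_reg_reg` and `encode_reg_reg` and the opcode constants are hypothetical and are not part of the guide.

```python
# Sketch: build the bytes of a register-to-register SSE instruction from the
# Prefix/First/2nd/ModRM columns. All names here are illustrative assumptions.

def modrm_reg_reg(reg: int, rm: int) -> int:
    """Return a ModRM byte of the form 11-xxx-xxx (register-direct mode).

    reg fills the middle three bits, rm fills the low three bits.
    """
    if not (0 <= reg <= 7 and 0 <= rm <= 7):
        raise ValueError("without a REX prefix, register numbers must be 0..7")
    return (0b11 << 6) | (reg << 3) | rm


def encode_reg_reg(opcode_bytes, reg: int, rm: int) -> bytes:
    """Concatenate the prefix/opcode bytes with the register-direct ModRM byte."""
    return bytes(opcode_bytes) + bytes([modrm_reg_reg(reg, rm)])


# Opcode bytes copied from the table rows.
MULPS_OPCODE = (0x0F, 0x59)        # MULPS xmmreg1, xmmreg2
MULSS_OPCODE = (0xF3, 0x0F, 0x59)  # MULSS xmmreg1, xmmreg2 (F3h prefix)
```

For example, `encode_reg_reg(MULPS_OPCODE, 1, 2)` gives the bytes 0F 59 CA, which is MULPS xmm1, xmm2. Which operand lands in the reg field and which in the r/m field depends on the instruction form (the 0Fh 11h store form of MOVUPS reverses the roles relative to the 0Fh 10h load form), so the sketch takes both fields explicitly rather than guessing.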

Notes:

1. The low half of the result is available one cycle earlier than listed.

2. The second latency value indicates when the low half of the result becomes available.

3. The high half of the result is available one cycle earlier than listed.

4. The latency listed is the absolute minimum, while average latencies may be higher and are a function of internal pipeline conditions.

5. For the PREFETCHNTA/T0/T1/T2 instructions, the mem8 value refers to an address in the 64-byte line to be prefetched.

6. The 8-clock latency is only visible to younger stores that need to do an external write. The 2-clock latency is visible to the other stores and instructions.

7. This is the execution latency for the instruction. The time to complete the external write depends on the memory speed and the hardware implementation.
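As a rough illustration of how the Latency and Note columns combine, the sketch below stores a few rows from the table and applies note 1 (the low half of the result is available one cycle earlier than listed). It also sums listed latencies along a chain of dependent instructions. That sum is a simplification assumed here, not a rule stated in the guide, and per note 4 the listed figures are minimums, so real timing may be higher. The dictionary layout and function names are hypothetical.

```python
from typing import NamedTuple, Optional


class LatencyRow(NamedTuple):
    decode_type: str
    latency: int          # listed latency in cycles
    note: Optional[int]   # footnote number from the Note column, if any


# Values copied from the table rows; the dictionary layout is an assumption.
TABLE = {
    "MULPS xmmreg1, xmmreg2": LatencyRow("Double", 5, 1),
    "MULPS xmmreg, mem128": LatencyRow("Double", 7, 1),
    "MULSS xmmreg1, xmmreg2": LatencyRow("DirectPath", 4, None),
    "ORPS xmmreg1, xmmreg2": LatencyRow("Double", 3, 1),
}


def low_half_latency(syntax: str) -> int:
    """Cycles until the low half of the result is available.

    Note 1 makes the low half ready one cycle earlier than listed; rows
    without that note are returned unchanged.
    """
    row = TABLE[syntax]
    return row.latency - 1 if row.note == 1 else row.latency


def chain_minimum(syntaxes) -> int:
    """Sum the listed latencies of a chain of dependent instructions.

    Simplified estimate only: it ignores decode type, pipe contention, and
    any early forwarding of the low half.
    """
    return sum(TABLE[s].latency for s in syntaxes)
```

With these rows, a register-form MULPS feeding a register-form ORPS gives an estimated minimum of 5 + 3 = 8 cycles for the full result, while the low half of the MULPS result alone is ready after 4 cycles.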