290 Instruction Latencies Appendix C
25112 Rev. 3.06 September 2005
Software Optimization Guide for AMD64 Processors
Table 13. Integer Instructions (Continued)

                                        Encoding
Syntax                                  First byte  Second byte  ModRM byte  Decode type  Latency  Note
OR mem16/32/64, reg16/32/64             09h                      mm-xxx-xxx  DirectPath   4
OR reg8, mreg8                          0Ah                      11-xxx-xxx  DirectPath   1
OR reg8, mem8                           0Ah                      mm-xxx-xxx  DirectPath   4
OR reg16/32/64, mreg16/32/64            0Bh                      11-xxx-xxx  DirectPath   1
OR reg16/32/64, mem16/32/64             0Bh                      mm-xxx-xxx  DirectPath   4
OR AL, imm8                             0Ch                                  DirectPath   1
OR AX, imm16                            0Dh                                  DirectPath   1
OR EAX, imm32                           0Dh                                  DirectPath   1
OR RAX, imm32 (sign extended)           0Dh                                  DirectPath   1
OR mreg8, imm8                          80h                      11-001-xxx  DirectPath   1
OR mem8, imm8                           80h                      mm-001-xxx  DirectPath   4
OR mreg16/32/64, imm16/32               81h                      11-001-xxx  DirectPath   1
OR mem16/32/64, imm16/32                81h                      mm-001-xxx  DirectPath   4
OR mreg16/32/64, imm8 (sign extended)   83h                      11-001-xxx  DirectPath   1
OR mem16/32/64, imm8 (sign extended)    83h                      mm-001-xxx  DirectPath   4
OUT imm8, AL                            E6h                                  VectorPath   ~
OUT imm8, AX                            E7h                                  VectorPath   ~
OUT imm8, EAX                           E7h                                  VectorPath   ~
OUT DX, AL                              EEh                                  VectorPath   165
OUT DX, AX                              EFh                                  VectorPath   165
OUT DX, EAX                             EFh                                  VectorPath   165
POP ES                                  07h                                  VectorPath   10
POP SS                                  17h                                  VectorPath   31
POP DS                                  1Fh                                  VectorPath   10
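The ModRM patterns in the table (for example 11-001-xxx) are the mod, reg, and r/m fields of the ModRM byte. The following is a minimal sketch, not taken from this guide, of how the register-direct OR immediate rows (81h and 83h) could be assembled by hand. The names REG32, modrm, and encode_or_reg32_imm are hypothetical helpers, and the sketch assumes a 32-bit operand size with no prefixes. It also ignores the shorter 0Dh form that exists for EAX.

```python
# Register numbers for the r/m field (xxx) with 32-bit general-purpose registers.
REG32 = {"EAX": 0, "ECX": 1, "EDX": 2, "EBX": 3,
         "ESP": 4, "EBP": 5, "ESI": 6, "EDI": 7}


def modrm(mod: int, reg: int, rm: int) -> int:
    """Pack the mod (2 bits), reg (3 bits), and r/m (3 bits) fields into one byte."""
    return (mod << 6) | (reg << 3) | rm


def encode_or_reg32_imm(reg: str, imm: int) -> bytes:
    """Encode OR reg32, imm using the 11-001-xxx rows (hypothetical helper)."""
    byte = modrm(0b11, 0b001, REG32[reg])
    if -128 <= imm <= 127:
        # 83h row: imm8, sign extended by the processor to the operand size.
        return bytes([0x83, byte]) + imm.to_bytes(1, "little", signed=True)
    # 81h row: full 32-bit immediate, stored little-endian.
    return bytes([0x81, byte]) + (imm & 0xFFFFFFFF).to_bytes(4, "little")


print(encode_or_reg32_imm("ECX", 0x0F).hex())        # 83c90f
print(encode_or_reg32_imm("EDX", 0x12345678).hex())  # 81ca78563412
```

Both forms are listed as DirectPath with a latency of 1 for register operands, so the choice between them in this sketch affects code size only.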
Notes:
1. Static timing assumes a predicted branch.
2. The store operation also updates ESP; the new register value is available one clock earlier than the
specified latency.
3. The clock count is the same regardless of the number of shifts or rotates specified by CL or imm8.
4. LEA instructions have a latency of 1 when there are two source operands (as in the case of the base + index
form LEA EAX, [EDX+EDI]). Forms with a scale or more than two source operands will have a latency of 2 (LEA
EAX, [EBX+EBX*8]).
5. These instructions have an effective latency as shown. They map to internal NOPs that can be issued at a rate of
three per cycle but do not occupy execution resources.
6. The latency of repeated string instructions can be found in “Latency of Repeated String Instructions” on
page 167.
7. The first latency value is for 32-bit mode. The second is for 64-bit mode.
8. This opcode is used as a REX prefix in 64-bit mode.
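
The LEA rule in note 4 can be sketched as a small function. This is an illustration only: lea_latency is a hypothetical helper, and it assumes that the base register, the index register, and a nonzero displacement each count as one source operand, which the note does not state explicitly. Forms with fewer than two source operands are not covered by the note and are treated here as latency 1.

```python
from typing import Optional


def lea_latency(base: Optional[str] = None, index: Optional[str] = None,
                scale: int = 1, disp: int = 0) -> int:
    """Estimate LEA latency in clocks following note 4 (hypothetical helper)."""
    # Assumption: base, index, and a nonzero displacement each count as one source.
    sources = sum(x is not None for x in (base, index)) + (1 if disp != 0 else 0)
    # A scaled index or more than two sources gives latency 2; otherwise 1.
    if scale != 1 or sources > 2:
        return 2
    return 1


print(lea_latency("EDX", "EDI"))           # LEA EAX, [EDX+EDI]   -> 1
print(lea_latency("EBX", "EBX", scale=8))  # LEA EAX, [EBX+EBX*8] -> 2
```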