IA-32 Instruction Latency and Throughput C
MOVLHPS³ xmm, xmm 4 4 2 2 MMX_SHFT
MOVMSKPS r32, xmm 6 6 2 2 FP_MISC
MOVSS xmm, xmm 4 4 2 2 MMX_SHFT
MOVUPS xmm, xmm 6 6 1 1 FP_MOVE
MULPS xmm, xmm 7 6 4+1 2 2 2 FP_MUL
MULSS xmm, xmm 7 6 2 2 FP_MUL
ORPS³ xmm, xmm 4 4 2 2 2 2 MMX_ALU
RCPPS³ xmm, xmm 6 6 2 4 4 2 MMX_MISC
RCPSS³ xmm, xmm 6 6 1 2 2 1 MMX_MISC, MMX_SHFT
RSQRTPS³ xmm, xmm 6 6 2 4 4 2 MMX_MISC
RSQRTSS³ xmm, xmm 6 6 4 4 1 MMX_MISC, MMX_SHFT
SHUFPS³ xmm, xmm, imm8 6 6 2 2 2 2 MMX_SHFT
SQRTPS xmm, xmm 40 39 29+28 40 39 58 FP_DIV
SQRTSS xmm, xmm 32 23 30 32 23 29 FP_DIV
SUBPS xmm, xmm 5 4 4 2 2 2 FP_ADD
SUBSS xmm, xmm 5 4 3 2 2 1 FP_ADD
UCOMISS xmm, xmm 7 6 1 2 2 1 FP_ADD, FP_MISC
UNPCKHPS³ xmm, xmm 6 6 3 2 2 2 MMX_SHFT
UNPCKLPS³ xmm, xmm 4 4 3 2 2 2 MMX_SHFT
XORPS³ xmm, xmm 4 4 2 2 2 2 MMX_ALU
FXRSTOR 150
FXSAVE 100
See “Table Footnotes”
Table C-4 Streaming SIMD Extension Single-precision Floating-point Instructions (continued)
Instruction Latency¹ Throughput Execution Unit²