IA-32 Intel® Architecture Optimization
The cmov and fcmov instructions are available on the Pentium II and
subsequent processors, but not on Pentium processors and earlier 32-bit
Intel architecture processors. Be sure to check whether a processor
supports these instructions with the cpuid instruction.
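
The check described above can be sketched in C. This is a minimal sketch, not a definitive implementation: it assumes a GCC- or Clang-compatible compiler that ships <cpuid.h> with __get_cpuid, and the helper names (cpuid_leaf1_edx, has_cmov, has_fcmov) are hypothetical. It relies on CPUID leaf 1 reporting CMOV in EDX bit 15 and the x87 FPU in EDX bit 0; fcmov is treated as available only when both bits are set.

```c
#include <stdbool.h>

#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>  /* GCC/Clang-specific header (an assumption of this sketch) */
#endif

/* Feature bits reported in EDX by CPUID leaf 1. */
#define CPUID_EDX_FPU   (1u << 0)   /* x87 FPU present */
#define CPUID_EDX_CMOV  (1u << 15)  /* cmovcc supported */

/* Hypothetical helper: returns EDX of CPUID leaf 1, or 0 if unavailable. */
static unsigned int cpuid_leaf1_edx(void)
{
#if defined(__i386__) || defined(__x86_64__)
    unsigned int eax, ebx, ecx, edx;
    if (__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return edx;
#endif
    return 0;  /* not an x86 target, or leaf 1 is not supported */
}

/* True if the integer conditional moves (cmovcc) are available. */
static bool has_cmov(void)
{
    return (cpuid_leaf1_edx() & CPUID_EDX_CMOV) != 0;
}

/* fcmov needs both the CMOV bit and an x87 FPU. */
static bool has_fcmov(void)
{
    unsigned int need = CPUID_EDX_CMOV | CPUID_EDX_FPU;
    return (cpuid_leaf1_edx() & need) == need;
}
```

A program would typically call has_cmov() once at startup and select either a cmovcc code path or a branch-based fallback.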
Spin-Wait and Idle Loops
The Pentium 4 processor introduces a new pause instruction; the
instruction is architecturally a nop on all IA-32 implementations. To the
Pentium 4 processor, this instruction acts as a hint that the code
sequence is a spin-wait loop. Without a pause instruction in such loops,
the Pentium 4 processor may suffer a severe penalty when exiting the
loop because the processor may detect a possible memory order
violation. Inserting the pause instruction significantly reduces the
likelihood of a memory order violation and as a result improves
performance.
In Example 2-4, the code spins until memory location A matches the
value stored in the register eax. Such code sequences are common when
protecting a critical section, in producer-consumer sequences, for
barriers, or other synchronization.
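
The spin-wait pattern described above can also be sketched in C. This is an illustrative sketch under stated assumptions: it uses C11 atomics and the _mm_pause() intrinsic from <immintrin.h>, which compilers for x86 targets typically expand to the pause instruction. The names cpu_relax and spin_until_equal are hypothetical; flag plays the role of memory location A and expected the role of the value held in eax.

```c
#include <stdatomic.h>

#if defined(__i386__) || defined(__x86_64__)
#include <immintrin.h>              /* declares _mm_pause() */
#define cpu_relax() _mm_pause()     /* spin-wait hint to the processor */
#else
#define cpu_relax() ((void)0)       /* no-op on other targets (assumption) */
#endif

/* Hypothetical helper: spin until *flag equals expected.
 * The hint is issued on every iteration that finds a mismatch. */
static void spin_until_equal(atomic_int *flag, int expected)
{
    while (atomic_load_explicit(flag, memory_order_acquire) != expected)
        cpu_relax();
}
```

In a producer-consumer setting, another thread would store the expected value to flag with release ordering; the consumer calls spin_until_equal() and proceeds once the values match.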
Example 2-3 Eliminating Branch with CMOV Instruction
    test ecx, ecx
    jne L1
    mov eax, ebx
L1:
; To optimize code, combine jne and mov into one cmovcc
; instruction that checks the zero flag (ZF)
    test ecx, ecx    ; set ZF if ecx is zero
    cmove eax, ebx   ; if ZF is set, move ebx to eax;
                     ; the L1: label is no longer needed
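
For readers working in a compiled language, the same selection can be written as a conditional expression. The sketch below is illustrative only: select_if_zero is a hypothetical name, and whether a compiler turns the expression into a cmovcc instruction or a branch depends on the target processor and optimization settings, so the generated code should be inspected rather than assumed.

```c
/* Hypothetical helper mirroring the example above:
 * cond corresponds to ecx, dst to the current eax, src to ebx.
 * Returns src when cond is zero, otherwise keeps dst. */
static int select_if_zero(int cond, int dst, int src)
{
    return (cond == 0) ? src : dst;
}
```

A caller would write eax = select_if_zero(ecx, eax, ebx); to express the same update as the assembly sequence.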