ARCHITECTURE COMPATIBILITY
19.18.10 WAIT/FWAIT Prefix Differences
On the Intel486 processor, when a WAIT/FWAIT instruction precedes a floating-point
instruction (one which itself automatically synchronizes with the previous floating-
point instruction), the WAIT/FWAIT instruction is treated as a no-op. Pending
floating-point exceptions from a previous floating-point instruction are processed not
on the WAIT/FWAIT instruction but on the floating-point instruction following the
WAIT/FWAIT instruction. In such a case, the report of a floating-point exception may
appear one instruction later on the Intel486 processor than on a P6 family or Pentium
FPU, or on an Intel 387 math coprocessor.
19.18.11 Operands Split Across Segments and/or Pages
On the P6 family, Pentium, and Intel486 processor FPUs, when the first half of an
operand to be written is inside a page or segment and the second half is outside, a
memory fault can cause the first half to be stored but not the second half. In this
situation, the Intel 387 math coprocessor stores nothing.
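As a minimal sketch of how software might identify operands exposed to this partial-store behavior, the following C helper checks whether a store of a given size crosses a page boundary. The function name operand_spans_pages and the 4-KByte page size used in the note below are assumptions made for illustration; segment-limit checks are not modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of any processor or OS interface).
 * Returns 1 if the byte range [addr, addr + size) touches more than one
 * page, and 0 if it lies entirely within a single page.
 * Assumes page_size is a power of two and addr + size does not wrap.
 */
static int operand_spans_pages(uintptr_t addr, size_t size, size_t page_size)
{
    assert(size > 0);
    assert(page_size > 0 && (page_size & (page_size - 1)) == 0);

    uintptr_t first_page = addr / page_size;              /* page of first byte */
    uintptr_t last_page  = (addr + size - 1) / page_size; /* page of last byte  */

    return first_page != last_page;
}
```

For example, a 10-byte extended-precision operand stored at offset 0FF7H of a 4-KByte page has its last byte in the following page, so operand_spans_pages(0x0FF7, 10, 4096) returns 1. Placing such operands so that the check returns 0 avoids depending on the processor-specific behavior described above.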
19.18.12 FPU Instruction Synchronization
On the 32-bit x87 FPUs, all floating-point instructions are automatically synchro-
nized; that is, the processor automatically waits until the previous floating-point
instruction has completed before completing the next floating-point instruction. No
explicit WAIT/FWAIT instructions are required to assure this synchronization. For the
8087 math coprocessors, explicit waits are required before each floating-point
instruction to ensure synchronization. Although 8087 programs that contain explicit
WAIT instructions execute correctly on the 32-bit IA-32 processors without reassembly,
these WAIT instructions are unnecessary.
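The point about redundant waits can be illustrated with a small C sketch, assuming a GCC- or Clang-compatible compiler with inline assembly. The helper names are hypothetical. The compiler, not the programmer, decides exactly where the floating-point instruction is placed, so the FWAIT only approximates the 8087 coding style. On non-x86 builds the wait is compiled out.

```c
/*
 * Hypothetical helper: adds two values after an 8087-style explicit wait.
 * On 32-bit x87 FPUs and later the wait is not needed for synchronization.
 */
static long double add_with_explicit_wait(long double a, long double b)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    /* FWAIT checks for pending unmasked x87 floating-point exceptions. */
    __asm__ volatile ("fwait");
#endif
    return a + b;
}

/* Same computation with no explicit wait. */
static long double add_without_wait(long double a, long double b)
{
    return a + b;
}
```

Both functions compute the same sum. The explicit wait in the first one would matter for synchronization only on an 8087.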
19.19 SERIALIZING INSTRUCTIONS
Certain instructions have been defined to serialize instruction execution to ensure
that modifications to flags, registers and memory are completed before the next
instruction is executed (or, in P6 family processor terminology, “committed to machine
state”). Because the P6 family processors use branch-prediction and out-of-order
execution techniques to improve performance, instruction execution is not generally
serialized until the results of an executed instruction are committed to machine state
(see Chapter 2, “Intel® 64 and IA-32 Architectures,” in the Intel® 64 and IA-32
Architectures Software Developer’s Manual, Volume 1).
As a result, at places in a program or task where it is critical to have execution
completed for all previous instructions before executing the next instruction (for
example, at a branch, at the end of a procedure, or in multiprocessor-dependent
code), it is useful to add a serializing instruction. See Section 8.3, “Serializing
Instructions,” for more information on serializing instructions.
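As a hedged sketch of adding a serializing instruction from C, the following helper executes CPUID, a non-privileged serializing instruction, through GCC- or Clang-style inline assembly. The function names serialize_execution and publish_then_serialize are hypothetical. On non-x86 builds the helper falls back to doing nothing and reports that through its return value.

```c
#include <stdint.h>

/*
 * Hypothetical helper: executes CPUID (leaf 0) as a serializing instruction.
 * Returns 1 if CPUID was executed, 0 if the build target is not x86.
 * The "memory" clobber also keeps the compiler from moving memory
 * accesses across this point.
 */
static int serialize_execution(void)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    unsigned int eax = 0, ebx, ecx = 0, edx;
    __asm__ volatile ("cpuid"
                      : "+a"(eax), "=b"(ebx), "+c"(ecx), "=d"(edx)
                      :
                      : "memory");
    (void)eax; (void)ebx; (void)ecx; (void)edx; /* results are not needed */
    return 1;
#else
    return 0; /* assumption: no serializing instruction available here */
#endif
}

/*
 * Hypothetical usage: write a value, then serialize so that the write has
 * completed before any later instruction begins executing.
 */
static uint32_t publish_then_serialize(volatile uint32_t *flag, uint32_t value)
{
    *flag = value;
    (void)serialize_execution();
    return *flag;
}
```

CPUID is used here because it can be executed at any privilege level. It overwrites EAX, EBX, ECX, and EDX, which is why the sketch declares all four registers as outputs.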