INSTRUCTION SET REFERENCE, A-M
MOVNTPD—Store Packed Double-Precision Floating-Point Values Using
Non-Temporal Hint
Description
Moves the double quadword in the source operand (second operand) to the destination
operand (first operand) using a non-temporal hint to minimize cache pollution
during the write to memory. The source operand is an XMM register, which is
assumed to contain two packed double-precision floating-point values. The destination
operand is a 128-bit memory location.
The non-temporal hint is implemented by using a write combining (WC) memory
type protocol when writing the data to memory. Using this protocol, the processor
does not write the data into the cache hierarchy, nor does it fetch the corresponding
cache line from memory into the cache hierarchy. The memory type of the region
being written to can override the non-temporal hint, if the memory address specified
for the non-temporal store is in an uncacheable (UC) or write protected (WP)
memory region. For more information on non-temporal stores, see “Caching of
Temporal vs. Non-Temporal Data” in Chapter 10 in the Intel® 64 and IA-32 Architectures
Software Developer’s Manual, Volume 1.
Because the WC protocol uses a weakly-ordered memory consistency model, a
fencing operation implemented with the SFENCE or MFENCE instruction should be
used in conjunction with MOVNTPD instructions if multiple processors might use
different memory types to read/write the destination memory locations.
In 64-bit mode, use of the REX.R prefix permits this instruction to access additional
registers (XMM8-XMM15).
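As an illustration of how REX.R selects XMM8-XMM15, the following C sketch builds the byte sequence for one specific form, MOVNTPD [RAX], xmmN, from the opcode 66 0F 2B /r. The helper name encode_movntpd_rax is hypothetical, and the ModR/M and REX bit layouts it relies on come from the general instruction-format rules rather than from this page, so treat it as a sketch under those assumptions and not as a complete encoder.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of any Intel API): encode
 * "MOVNTPD [RAX], xmmN" for 64-bit mode.
 * Writes up to 5 bytes into out and returns the number written. */
static size_t encode_movntpd_rax(unsigned xmm, uint8_t out[5])
{
    size_t n = 0;

    assert(xmm < 16);                /* XMM0-XMM15 only */

    out[n++] = 0x66;                 /* mandatory prefix of 66 0F 2B /r */
    if (xmm >= 8)
        out[n++] = 0x44;             /* REX = 0100WRXB with only R set */
    out[n++] = 0x0F;                 /* two-byte opcode 0F 2B */
    out[n++] = 0x2B;
    /* ModR/M: mod=00, reg=low three bits of the XMM number, r/m=000 ([RAX]).
     * REX.R supplies the fourth bit of the register number. */
    out[n++] = (uint8_t)((xmm & 7u) << 3);
    return n;
}
```

Under these assumptions, xmm1 encodes as 66 0F 2B 08, while xmm8 needs the extra REX byte and encodes as 66 44 0F 2B 00.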
Operation
DEST ← SRC;
Intel C/C++ Compiler Intrinsic Equivalent
MOVNTPD void _mm_stream_pd(double *p, __m128d a)
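A minimal sketch of how the intrinsic might be combined with the fence recommended in the Description: it fills a buffer with non-temporal stores and then issues SFENCE through _mm_sfence. The helper name stream_fill is hypothetical. The code assumes the destination is 16-byte aligned (the m128 operand of MOVNTPD must be) and that the element count is even. The plain-store fallback for compilers without SSE2 is an addition for portability, not something this page describes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <xmmintrin.h>   /* _mm_sfence */
#include <emmintrin.h>   /* __m128d, _mm_set1_pd, _mm_stream_pd */
#define HAVE_SSE2 1
#else
#define HAVE_SSE2 0
#endif

/* Hypothetical helper: set dst[0..n) to value using non-temporal stores.
 * Assumes dst is 16-byte aligned and n is even. */
static void stream_fill(double *dst, size_t n, double value)
{
    assert(((uintptr_t)dst & 15u) == 0);  /* m128 operand must be aligned */
    assert(n % 2 == 0);                   /* two doubles per store */

#if HAVE_SSE2
    __m128d v = _mm_set1_pd(value);       /* both lanes hold value */
    for (size_t i = 0; i < n; i += 2)
        _mm_stream_pd(dst + i, v);        /* MOVNTPD m128, xmm */
    _mm_sfence();                         /* order the weakly-ordered stores
                                             ahead of any later stores */
#else
    for (size_t i = 0; i < n; i++)        /* fallback: ordinary stores */
        dst[i] = value;
#endif
}
```

A caller could declare the buffer with C11 _Alignas(16), or obtain it from an aligned allocator, before passing it to stream_fill. Placing a single fence after the loop, instead of one after every store, keeps the stores free to combine in the write-combining buffers.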
SIMD Floating-Point Exceptions
None.
Opcode       Instruction        64-Bit Mode  Compat/Leg Mode  Description
66 0F 2B /r  MOVNTPD m128, xmm  Valid        Valid            Move packed double-precision
                                                              floating-point values from xmm to
                                                              m128 using non-temporal hint.