IA-32 Intel® Architecture Optimization
The Prefetch Instructions – Pentium 4 Processor Implementation
Streaming SIMD Extensions include four prefetch instructions, one
non-temporal and three temporal, corresponding to the two types of
operations: temporal and non-temporal.
The non-temporal instruction is:
prefetchnta  Fetch the data into the second-level cache, minimizing
             cache pollution.
The temporal instructions are:
prefetcht0   Fetch the data into all cache levels, that is, to the
             second-level cache for the Pentium 4 processor.
prefetcht1   Identical to prefetcht0.
prefetcht2   Identical to prefetcht0.
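The four instructions above are commonly reached from C through the _mm_prefetch intrinsic in <xmmintrin.h>, with the hint constants _MM_HINT_NTA, _MM_HINT_T0, _MM_HINT_T1 and _MM_HINT_T2. The following is a minimal sketch, not code from this manual: the function names, the prefetch distance and the 64-byte line size are illustrative assumptions that would need tuning on real hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <xmmintrin.h>  /* _mm_prefetch and the _MM_HINT_* constants (SSE) */

/* Assumptions for illustration only: 64-byte lines, 4-byte floats,
   and a prefetch distance of 64 elements (four lines ahead). */
#define FLOATS_PER_LINE 16
#define PREFETCH_AHEAD  64

/* Data expected to be reused soon: temporal hint (prefetcht0). */
float sum_temporal(const float *a, size_t n)
{
    assert(a != NULL || n == 0);
    float s = 0.0f;
    for (size_t i = 0; i < n; i++) {
        /* Issue one hint per assumed cache line, and only in bounds. */
        if (i % FLOATS_PER_LINE == 0 && i + PREFETCH_AHEAD < n)
            _mm_prefetch((const char *)&a[i + PREFETCH_AHEAD], _MM_HINT_T0);
        s += a[i];
    }
    return s;
}

/* Data read once and not reused: non-temporal hint (prefetchnta). */
float sum_streaming(const float *a, size_t n)
{
    assert(a != NULL || n == 0);
    float s = 0.0f;
    for (size_t i = 0; i < n; i++) {
        if (i % FLOATS_PER_LINE == 0 && i + PREFETCH_AHEAD < n)
            _mm_prefetch((const char *)&a[i + PREFETCH_AHEAD], _MM_HINT_NTA);
        s += a[i];
    }
    return s;
}
```

The hint argument must be a compile-time constant, which is why the two variants are written as separate functions. Since prefetcht1 and prefetcht2 are identical to prefetcht0 on the Pentium 4 processor, substituting _MM_HINT_T1 or _MM_HINT_T2 in sum_temporal should make no difference on that implementation.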
Prefetch and Load Instructions
The Pentium 4 processor has a decoupled execution and memory
architecture that allows instructions to be executed independently of
memory accesses if there are no data and resource dependencies.
Programs or compilers can use dummy load instructions to imitate
prefetch functionality, but preloading is not completely equivalent to
prefetch instructions. Prefetch instructions provide greater
performance than preloading.
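The difference between preloading and prefetching can be sketched in C as follows. This is an illustrative sketch under stated assumptions, not code from this manual: preload_region, prefetch_region and the 64-byte LINE_BYTES value are hypothetical names and parameters chosen for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <xmmintrin.h>  /* _mm_prefetch, _MM_HINT_T0 (SSE) */

#define LINE_BYTES 64   /* assumed cache-line size, for illustration */

/* Preloading: a dummy load touches one byte per line. Each access is
   a real load, so it needs a destination register and can fault if
   the address is invalid. volatile keeps the compiler from removing
   the otherwise unused loads. Returns the sum of the touched bytes. */
unsigned preload_region(const void *p, size_t bytes)
{
    assert(p != NULL || bytes == 0);
    const volatile unsigned char *b = p;
    unsigned sink = 0;
    for (size_t off = 0; off < bytes; off += LINE_BYTES)
        sink += b[off];
    return sink;
}

/* Prefetching: one hint per line. The hint uses no destination
   register and does not change program state. Returns the number
   of hints issued. */
size_t prefetch_region(const void *p, size_t bytes)
{
    assert(p != NULL || bytes == 0);
    const char *b = p;
    size_t hints = 0;
    for (size_t off = 0; off < bytes; off += LINE_BYTES) {
        _mm_prefetch(b + off, _MM_HINT_T0);
        hints++;
    }
    return hints;
}
```

Both functions walk the same addresses, but only preload_region produces a value the program must wait for. Which one is faster in a given loop is something to measure rather than assume.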
NOTE. At the time of prefetch, if the data is already
found in a cache level that is closer to the processor
than the cache level specified by the instruction, no
data movement occurs.