11-28 Vol. 3
MEMORY CACHE CONTROL
To avoid problems related to implicit caching, the operating system must explicitly
invalidate the cache when changes are made to cacheable data that the cache coherency
mechanism does not automatically handle. This includes writes to dual-ported or
physically aliased memory boards that are not detected by the snooping mechanisms of
the processor, and changes to page-table entries in memory.
The code in Example 11-1 shows the effect of implicit caching on page-table entries.
The linear address F000H points to physical location B000H (the page-table entry for
F000H contains the value B000H), and the page-table entry for linear address F000H
is PTE_F000.
Example 11-1. Effect of Implicit Caching on Page-Table Entries
mov EAX, CR3          ; Invalidate the TLB
mov CR3, EAX          ; by copying CR3 to itself
mov PTE_F000, A000H   ; Change F000H to point to A000H
mov EBX, [F000H]
Because of speculative execution in the P6 and more recent processor families, the
last MOV instruction may place the value at physical location B000H into EBX, rather
than the value at the new physical address A000H: the processor can reload the
translation for F000H into the TLB before the store to PTE_F000 completes. This
situation is remedied by placing the TLB invalidation between the store to PTE_F000
and the load from F000H.
11.8 EXPLICIT CACHING
The Pentium III processor introduced four new instructions, the PREFETCHh instructions
(PREFETCHT0, PREFETCHT1, PREFETCHT2, and PREFETCHNTA), that provide software with
explicit control over the caching of data. These instructions provide “hints” to the
processor that the data requested by a PREFETCHh instruction should be read into the
cache hierarchy now or as soon as possible, in anticipation of its use. The
instructions provide different variations of the hint that allow selection of the
cache level into which data will be read.
The PREFETCHh instructions can help reduce the long latency typically associated
with reading data from memory and thus help prevent processor “stalls.” However,
these instructions should be used judiciously. Overuse can lead to resource conflicts
and hence reduce the performance of an application. Also, these instructions should
only be used to prefetch data from memory; they should not be used to prefetch
instructions. For more detailed information on the proper use of the prefetch
instruction, refer to Chapter 7, “Optimizing Cache Usage,” in the Intel® 64 and IA-32
Architectures Optimization Reference Manual.