Vol. 3 8-7
MULTIPLE-PROCESSOR MANAGEMENT
The act of one processor writing data into the currently executing code segment of a
second processor with the intent of having the second processor execute that data as
code is called cross-modifying code. As with self-modifying code, IA-32 processors
exhibit model-specific behavior when executing cross-modifying code, depending
upon how far ahead of the executing processors current execution pointer the code
has been modified.
To write cross-modifying code and insure that it is compliant with current and future
versions of the IA-32 architecture, the following processor synchronization algorithm
must be implemented:
(* Action of Modifying Processor *)
Memory_Flag ← 0; (* Set Memory_Flag to value other than 1 *)
Store modified code (as data) into code segment;
Memory_Flag ← 1;
(* Action of Executing Processor *)
WHILE (Memory_Flag ≠ 1)
Wait for code to update;
ELIHW;
Execute serializing instruction; (* For example, CPUID instruction *)
Begin executing modified code;
(The use of this option is not required for programs intended to run on the Intel486
processor, but is recommended to insure compatibility with the Pentium 4, Intel
Xeon, P6 family, and Pentium processors.)
Like self-modifying code, cross-modifying code will execute at a lower level of perfor-
mance than non-cross-modifying (normal) code, depending upon the frequency of
modification and specific characteristics of the code.
The restrictions on self-modifying code and cross-modifying code also apply to the
Intel 64 architecture.
8.1.4 Effects of a LOCK Operation on Internal Processor Caches
For the Intel486 and Pentium processors, the LOCK# signal is always asserted on the
bus during a LOCK operation, even if the area of memory being locked is cached in
the processor.
For the P6 and more recent processor families, if the area of memory being locked
during a LOCK operation is cached in the processor that is performing the LOCK oper
-
ation as write-back memory and is completely contained in a cache line, the
processor may not assert the LOCK# signal on the bus. Instead, it will modify the
memory location internally and allow it’s cache coherency mechanism to insure that
the operation is carried out atomically. This operation is called “cache locking.” The
cache coherency mechanism automatically prevents two or more processors that