96 Cache and Memory Optimizations Chapter 5
25112 Rev. 3.06 September 2005
Software Optimization Guide for AMD64 Processors
5.3 Cache-Coherent Nonuniform Memory Access (ccNUMA)
Optimization
For multithreaded applications, use operating-system functions to run each thread on a particular
node, and let that thread allocate the memory it requires so that the memory it uses is local to that
node. In the Microsoft Windows environment, the function to run a thread on a particular node is
SetThreadAffinityMask().
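The recommendation above can be sketched in C as follows. This is a minimal illustration, not the guide's own code. The helper names node_affinity_mask and alloc_on_node are hypothetical, and the sketch assumes that logical processors are numbered contiguously with a fixed count per node; a real application should ask the operating system for the node-to-processor mapping instead. It also assumes that the operating system places pages on the node of the thread that first touches them, which is why the buffer is cleared from the pinned thread. The Windows call is compiled only under _WIN32; other builds skip the pinning step.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical helper: build an affinity mask covering the logical
 * processors of `node`. ASSUMPTION: processors are numbered contiguously,
 * `cpus_per_node` per node. Returns 0 if the node does not fit in 64 bits. */
static uint64_t node_affinity_mask(unsigned node, unsigned cpus_per_node)
{
    unsigned first;
    uint64_t bits;

    if (cpus_per_node == 0 || cpus_per_node > 64 || node > 63)
        return 0;
    first = node * cpus_per_node;          /* index of the node's first CPU */
    if (first + cpus_per_node > 64)
        return 0;
    bits = (cpus_per_node == 64) ? ~UINT64_C(0)
                                 : ((UINT64_C(1) << cpus_per_node) - 1);
    return bits << first;
}

/* Hypothetical helper: pin the calling thread to `node`, then allocate and
 * touch `bytes` of memory from that thread. ASSUMPTION: the OS backs pages
 * with memory local to the node of the thread that first touches them.
 * Returns NULL on failure; the caller releases the buffer with free(). */
static void *alloc_on_node(unsigned node, unsigned cpus_per_node, size_t bytes)
{
    uint64_t mask = node_affinity_mask(node, cpus_per_node);
    void *p;

    assert(bytes > 0);                     /* a zero-size request is a caller error */
    if (mask == 0)
        return NULL;
#ifdef _WIN32
    /* SetThreadAffinityMask returns 0 on failure. On 32-bit Windows,
     * DWORD_PTR is 32 bits wide, so only processors 0-31 can be named. */
    if (SetThreadAffinityMask(GetCurrentThread(), (DWORD_PTR)mask) == 0)
        return NULL;
#else
    (void)mask;                            /* pinning omitted outside Windows */
#endif
    p = malloc(bytes);
    if (p != NULL)
        memset(p, 0, bytes);               /* touch the pages from this thread */
    return p;
}
```

Each worker thread would call alloc_on_node() itself at startup, for example alloc_on_node(1, 2, size) for a thread meant to run on the second node of a system with two processors per node, rather than receiving a buffer allocated by the main thread.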
Be sure the operating system is properly configured to support ccNUMA. All versions of Microsoft
Windows XP for AMD64 and Windows Server for AMD64 support ccNUMA without any changes.
The 32-bit versions of Windows Server 2003, Enterprise Edition and Windows Server 2003,
Datacenter Edition require the /PAE boot parameter to support ccNUMA.
For 64-bit Linux, ccNUMA support may be provided by a separate kernel; if so, select that kernel.
Application
This optimization applies to:
32-bit software
64-bit software
Rationale
Most multiprocessor systems available today employ a symmetric multiprocessing (SMP)
architecture. Processors on an SMP platform generally share a common, centralized memory bus,
so memory access latency is identical regardless of processor position. Because the processors
share the same bus and memory, system performance can suffer when increased demand on the
single memory bus creates a bottleneck. Figure 1 shows a simplified block diagram of a
two-processor SMP system.