AMD 250 Computer Hardware User Manual

96 Cache and Memory Optimizations Chapter 5

25112 Rev. 3.06 September 2005

Software Optimization Guide for AMD64 Processors

5.3 Cache-Coherent Nonuniform Memory Access (ccNUMA)

Optimization

For applications with multiple threads, use OS functions to run a thread on a particular node and let that thread allocate the memory that it requires so that the memory used is local to that node. In the Microsoft Windows environment, the function to run a thread on a particular node is SetThreadAffinityMask().
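The pin-then-allocate pattern described above can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the guide's own code: the helpers `node_affinity_mask` and `alloc_on_node` are hypothetical names, and the sketch assumes each node owns a contiguous, equally sized range of logical processor numbers and that the OS places a page on the node of the thread that first touches it. Real code should query the actual topology from the operating system. Only `SetThreadAffinityMask()` and `GetCurrentThread()` are real Windows calls; on other platforms the pinning step is skipped.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical helper: build an affinity mask covering the logical
 * processors of one node. ASSUMES nodes own contiguous, equally sized
 * ranges of processor numbers (node 0 = CPUs 0..n-1, node 1 = n..2n-1).
 * Returns 0 if the requested range does not fit in a 64-bit mask. */
static uint64_t node_affinity_mask(unsigned node, unsigned cpus_per_node)
{
    if (cpus_per_node == 0 || cpus_per_node > 64 || node >= 64)
        return 0;

    unsigned first = node * cpus_per_node;
    if (first + cpus_per_node > 64)
        return 0;

    /* Avoid shifting a 64-bit value by 64, which is undefined in C. */
    uint64_t bits = (cpus_per_node == 64)
                        ? ~(uint64_t)0
                        : (((uint64_t)1 << cpus_per_node) - 1);
    return bits << first;
}

/* Hypothetical helper: pin the calling thread to `node`, then allocate
 * and touch a buffer from that thread. Returns NULL on failure.
 * The caller frees the buffer with free(). */
static void *alloc_on_node(unsigned node, unsigned cpus_per_node, size_t bytes)
{
    assert(bytes > 0); /* precondition: request at least one byte */

    uint64_t mask = node_affinity_mask(node, cpus_per_node);
    if (mask == 0)
        return NULL;

#ifdef _WIN32
    /* SetThreadAffinityMask returns the previous mask, or 0 on failure.
     * On 32-bit Windows, DWORD_PTR holds only the low 32 bits. */
    if (SetThreadAffinityMask(GetCurrentThread(), (DWORD_PTR)mask) == 0)
        return NULL;
#endif

    void *buf = malloc(bytes);
    if (buf != NULL)
        memset(buf, 0, bytes); /* touch the pages from the pinned thread */
    return buf;
}
```

A worker thread would call `alloc_on_node()` at startup, before doing any work on the buffer, so that the allocation and first touch both happen after the thread has been restricted to its node.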

Be sure operating systems are properly configured to support ccNUMA. All versions of Microsoft Windows XP for AMD64 and Windows Server for AMD64 support ccNUMA without any changes. The 32-bit versions of Windows Server 2003, Enterprise Edition and Windows Server 2003, Datacenter Edition require the /PAE boot parameter to support ccNUMA.

For 64-bit Linux, ccNUMA support may be provided by a separate kernel; if so, select that kernel.

Application

This optimization applies to:

• 32-bit software

• 64-bit software

Rationale

Most multiprocessor systems available today employ a symmetric multiprocessing (SMP) architecture. Processors on an SMP platform generally share a common or centralized memory bus and have identical memory access latencies regardless of processor position. Because the processors use the same bus and memory, system performance may suffer when increased demand on the single memory bus creates a bottleneck. Figure 1 shows a simplified block diagram of a two-processor SMP system.
