Appendix A 43
Performance Guidelines for AMD Athlon™ 64 and AMD Opteron™
ccNUMA Multiprocessor Systems
40555 Rev. 3.00 June 2006
A.5 Why Is 0 Hop-1 Hop Case Slower Than 0 Hop-0 Hop Case on a System under High Background Load (High Subscription) for Write-Only Threads?
When a 0 hop-0 hop scenario is subjected to a very high background load, the system sees the
following traffic pattern, where each node gets memory requests from the threads as described:
Node 0: 2 foreground threads.
Node 1: 1 background thread.
Node 3: 1 background thread.
Node 2: 1 background thread.
In the 0 hop-1 hop case, the system sees the following traffic pattern:
Node 0: 1 foreground thread.
Node 1: 1 foreground thread and 1 background thread.
Node 3: 1 background thread.
Node 2: 1 background thread.
The 0 hop-1 hop case suffers from a greater load imbalance than the 0 hop-0 hop case, with node 1
suffering the worst effect of this imbalance.
Each of the background threads, as before, asks for data at a rate of 4 GB/s, and each of the
foreground threads asks for data at a rate of 2.98 GB/s.
Measurements show a total memory access rate of 4.78 GB/s on node 1. Several buffer queues on
node 1 are saturated and cannot absorb the data provided by the memory controller any faster.
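The load imbalance described above can be made concrete with a little arithmetic. The following is a minimal sketch, not part of the original measurements: it assumes that the request rates quoted in the text simply add up on the node that owns the memory, and the names `SCENARIOS`, `requested_load`, and `imbalance` are hypothetical helpers introduced only for illustration.

```python
# Request rates quoted in the text (GB/s).
FOREGROUND_GBPS = 2.98
BACKGROUND_GBPS = 4.0

# Threads whose memory requests land on each node: (foreground, background).
# The thread placement is copied from the two traffic patterns listed above.
SCENARIOS = {
    "0 hop-0 hop": {0: (2, 0), 1: (0, 1), 2: (0, 1), 3: (0, 1)},
    "0 hop-1 hop": {0: (1, 0), 1: (1, 1), 2: (0, 1), 3: (0, 1)},
}


def requested_load(threads_per_node):
    """Return the requested GB/s per node, ASSUMING the rates simply add."""
    return {
        node: fg * FOREGROUND_GBPS + bg * BACKGROUND_GBPS
        for node, (fg, bg) in threads_per_node.items()
    }


def imbalance(load):
    """Gap between the most and the least loaded node, in GB/s."""
    return max(load.values()) - min(load.values())


for name, threads in SCENARIOS.items():
    load = requested_load(threads)
    rounded = {node: round(rate, 2) for node, rate in load.items()}
    print(name, rounded, "spread:", round(imbalance(load), 2))
```

Under this additive assumption, node 1 in the 0 hop-1 hop case is asked for 4 + 2.98 = 6.98 GB/s, well above the 4.78 GB/s that is actually observed there, while the busiest node in the 0 hop-0 hop case (node 0) is asked for only 2 x 2.98 = 5.96 GB/s.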
A.6 Support for a ccNUMA-Aware Scheduler for AMD64 ccNUMA Multiprocessor Systems
Developers should ensure that the OS is properly configured to support ccNUMA. All versions of
Microsoft® Windows® XP for AMD64 and Windows Server for AMD64 support ccNUMA without any
configuration changes. The 32-bit versions of Windows Server 2003, Enterprise Edition and
Windows Server 2003, Datacenter Edition require the /PAE boot parameter to support ccNUMA. For
64-bit Linux®, there may be separate kernels supporting ccNUMA, and one of these should be
selected. The 2.6.x Linux kernels feature NUMA awareness in the scheduler[11]. Most SuSE and
Red Hat Enterprise distributions of 64-bit Linux ship with a ccNUMA-aware kernel. Solaris 10 and
subsequent versions of Solaris for AMD64 support ccNUMA without any changes.
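On Linux, one quick way to check whether the running kernel exposes a NUMA topology is to look for per-node directories under `/sys/devices/system/node`. The following is a minimal sketch, assuming a Linux system with sysfs mounted at `/sys`. The function name `numa_nodes` and its `sysfs_root` parameter are hypothetical and introduced only for illustration.

```python
import os
import re


def numa_nodes(sysfs_root="/sys/devices/system/node"):
    """Return the sorted NUMA node IDs found as nodeN directories."""
    if not os.path.isdir(sysfs_root):
        # Directory missing: not Linux, sysfs not mounted, or no NUMA support.
        return []
    ids = []
    for name in os.listdir(sysfs_root):
        match = re.fullmatch(r"node(\d+)", name)
        if match and os.path.isdir(os.path.join(sysfs_root, name)):
            ids.append(int(match.group(1)))
    return sorted(ids)


if __name__ == "__main__":
    nodes = numa_nodes()
    if len(nodes) > 1:
        print("Kernel reports multiple NUMA nodes:", nodes)
    else:
        print("Kernel reports at most one NUMA node:", nodes)
```

A result with more than one node suggests that the kernel sees the ccNUMA topology. A single node, or an empty list, may mean that the kernel lacks NUMA support, that node interleaving is enabled in the BIOS, or that the system is not a multi-node system.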