Multi-Core and Hyper-Threading Technology 7
User/Source Coding Rule 24. (H impact, M generality) Beware of false
sharing within a cache line (64 bytes on Intel Pentium 4, Intel Xeon,
Pentium M, and Intel Core Duo processors), and within a sector (128 bytes on
Pentium 4 and Intel Xeon processors).
When a common block of parameters is passed from a parent thread to
several worker threads, it is desirable for each worker thread to create a
private copy of frequently accessed data in the parameter block.
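The private-copy approach can be sketched as follows. This is a minimal illustration, not code from the manual: the names param_block_t, worker_arg_t, worker, and MAX_WORKERS are hypothetical, and the entry point simply uses the POSIX-style signature void *(*)(void *) so it could be handed to a thread-creation call. Each worker reads the frequently used fields once into locals, accumulates privately, and writes to the shared block a single time.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_WORKERS 4  /* hypothetical limit for this sketch */

/* Hypothetical parameter block that a parent thread passes to all workers. */
typedef struct {
    int  iterations;            /* read frequently by every worker */
    int  stride;                /* read frequently by every worker */
    long results[MAX_WORKERS];  /* slot i is written by worker i   */
} param_block_t;

/* Per-worker argument: pointer to the shared block plus the worker's slot. */
typedef struct {
    param_block_t *shared;
    int            id;
} worker_arg_t;

/* Thread entry point (POSIX-style signature). */
void *worker(void *arg)
{
    worker_arg_t *wa = arg;
    assert(wa->id >= 0 && wa->id < MAX_WORKERS);

    /* Private copies: the shared block is read once, so the loop below
     * never touches a cache line that other workers may be writing.    */
    const int iterations = wa->shared->iterations;
    const int stride     = wa->shared->stride;

    long sum = 0;  /* accumulate in a private variable */
    for (int i = 0; i < iterations; i++)
        sum += (long)i * stride;

    /* One write to the shared block at the end of the work. */
    wa->shared->results[wa->id] = sum;
    return NULL;
}
```

In this sketch the loop works only on locals, so writes by other workers to neighboring results slots cannot repeatedly invalidate the line that holds iterations and stride.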
Placement of Shared Synchronization Variable
On processors based on Intel NetBurst microarchitecture, bus reads
typically fetch 128 bytes into a cache, so the optimal spacing to minimize
eviction of cached data is 128 bytes. To prevent false sharing,
synchronization variables and system objects (such as a critical section)
should be allocated to reside alone in a 128-byte region and aligned to a
128-byte boundary. Example 7-6 shows a way to minimize the bus
traffic required to maintain cache coherency in MP systems. This
technique is also applicable to MP systems using IA-32 processors with
or without Hyper-Threading Technology.
On Pentium M, Intel Core Solo and Intel Core Duo processors, a
synchronization variable should be placed alone and in a separate cache
line to avoid false sharing. Software must not allow a synchronization
variable to span a page boundary.
User/Source Coding Rule 25. (M impact, ML generality) Place each
synchronization variable alone, separated by 128 bytes or in a separate cache
line.
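One way to follow this rule in C11 is to wrap each synchronization variable in a padded, over-aligned structure. The sketch below is an illustration under stated assumptions and is not the manual's own example: padded_lock_t, locks, SYNC_REGION, and locks_are_isolated are hypothetical names, and SYNC_REGION is set to 128 to match the sector size quoted above (64 could be used where a single cache line is sufficient).

```c
#include <assert.h>
#include <stdalign.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 128-byte spacing, as recommended for NetBurst-based processors. */
#define SYNC_REGION 128

/* A lock word that sits alone in its own SYNC_REGION-byte region. */
typedef struct {
    alignas(SYNC_REGION) atomic_int lock;       /* starts on a 128-byte boundary */
    char pad[SYNC_REGION - sizeof(atomic_int)]; /* fills the rest of the region  */
} padded_lock_t;

/* Compile-time checks that the layout is what the rule asks for. */
static_assert(sizeof(padded_lock_t) == SYNC_REGION,
              "each lock must occupy exactly one region");
static_assert(alignof(padded_lock_t) == SYNC_REGION,
              "each lock must start on a region boundary");

/* Two locks in an array: each element lands in its own region. */
padded_lock_t locks[2];

/* Run-time check: returns 1 if both locks are aligned and in different regions. */
int locks_are_isolated(void)
{
    uintptr_t a = (uintptr_t)&locks[0];
    uintptr_t b = (uintptr_t)&locks[1];
    return (a % SYNC_REGION == 0) &&
           (b % SYNC_REGION == 0) &&
           (a / SYNC_REGION != b / SYNC_REGION);
}
```

The explicit pad member is redundant with the tail padding that alignas already forces, but it keeps the intended layout visible to readers. Objects obtained from a general-purpose allocator would need an aligned allocation routine to get the same guarantee.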
User/Source Coding Rule 26. (H impact, L generality) Do not place a
spin lock variable so that it spans a cache line boundary.
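A debug-time check for this rule might look like the sketch below. The helper spans_boundary, the buffer demo_buf, and the function demo_offset_spans are hypothetical names introduced for illustration, and the 64-byte line size is an assumption taken from the figures quoted earlier. A naturally aligned 4-byte variable cannot straddle a 64-byte line; the hazard typically arises with packed structures or variables placed at arbitrary offsets in a byte buffer.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if the object occupying [addr, addr + size) crosses a multiple
 * of `boundary` (for example 64 for a cache line), otherwise 0.           */
int spans_boundary(const void *addr, size_t size, size_t boundary)
{
    assert(size > 0 && boundary > 0);
    uintptr_t first = (uintptr_t)addr / boundary;               /* block of first byte */
    uintptr_t last  = ((uintptr_t)addr + size - 1) / boundary;  /* block of last byte  */
    return first != last;
}

/* Demo buffer aligned to an assumed 64-byte cache line. */
alignas(64) unsigned char demo_buf[128];

/* Would an int-sized lock placed at `offset` in demo_buf span a 64-byte line? */
int demo_offset_spans(size_t offset)
{
    assert(offset + sizeof(int) <= sizeof(demo_buf));
    return spans_boundary(demo_buf + offset, sizeof(int), 64);
}
```

The same helper could be called with the page size as the boundary to check that a synchronization variable does not cross a page; the page size to pass is platform-dependent.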
At the code level, false sharing is a special concern in the following
cases:
• Global data variables and static data variables that are placed in the
same cache line and are written by different threads.