Intel IA-32 Computer Accessories User Manual


 
IA-32 Intel® Architecture Optimization
Per-instance Stack Offset
Each instance of an application runs in its own linear address space, but
the address layout of data for stack segments is identical for both
instances. When the instances are running in lock step, stack accesses
are likely to cause excessive evictions of cache lines in the first-level
data cache for some implementations of Hyper-Threading Technology
in IA-32 processors.
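To make the collision concrete, the following minimal sketch models a set-associative cache and checks whether two stack addresses compete for the same set. The geometry (64-byte lines, 128 sets) and the helper names are assumptions chosen for illustration; they do not describe the first-level data cache of any particular processor.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical cache geometry, chosen only for illustration. */
#define LINE_SIZE 64u    /* bytes per cache line (assumption) */
#define NUM_SETS  128u   /* number of sets in the cache (assumption) */

/* Return the set that a linear address maps to in the illustrative cache. */
static unsigned cache_set_index(uintptr_t linear_addr)
{
    return (unsigned)((linear_addr / LINE_SIZE) % NUM_SETS);
}

/* Return 1 if two addresses compete for the same set, otherwise 0. */
static int same_cache_set(uintptr_t addr_a, uintptr_t addr_b)
{
    return cache_set_index(addr_a) == cache_set_index(addr_b);
}

/* Return the address of a stack slot after the stack has been lowered by
   `offset` bytes (stacks grow toward lower addresses on IA-32). */
static uintptr_t offset_stack_slot(uintptr_t slot_addr, uintptr_t offset)
{
    assert(offset <= slot_addr);   /* guard against wrapping below zero */
    return slot_addr - offset;
}
```

Two instances that use the same linear address for a stack slot always land in the same set under this model, while an instance whose slot has been moved by 1024 bytes lands in a different one. An offset that is a multiple of LINE_SIZE * NUM_SETS would map back to the original set, so the size of the offset matters.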
Although this situation (two copies of an application running in lock
step) is seldom an objective for multithreaded software or a
multiprocessor platform, it can happen at an end user's direction. One
solution is to allow each application instance to add a suitable linear
address offset for its stack. Once this offset is added at start-up, a buffer
of linear addresses is established even when two copies of the same
application are executing on two logical processors in the same
physical processor package. The buffer space has negligible impact on
running dissimilar applications and on executing multiple copies of the
same application.
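One way an instance might choose its offset at start-up is to derive it from an identifier that differs between instances, such as a process ID. The sketch below is an illustration under stated assumptions: the function name, the 64-byte granule, and the 16 available slots are all hypothetical and are not taken from this section.

```c
#include <assert.h>
#include <stddef.h>

#define OFFSET_GRANULE 64u   /* bytes per offset step (assumption) */
#define OFFSET_SLOTS   16u   /* number of distinct offsets (assumption) */

/* Map an instance identifier (for example, a process ID) to a stack offset
   that is a multiple of OFFSET_GRANULE and smaller than
   OFFSET_SLOTS * OFFSET_GRANULE bytes. */
static size_t instance_stack_offset(unsigned long instance_id)
{
    size_t offset = (size_t)(instance_id % OFFSET_SLOTS) * OFFSET_GRANULE;

    assert(offset % OFFSET_GRANULE == 0);
    assert(offset < (size_t)OFFSET_SLOTS * OFFSET_GRANULE);
    return offset;
}
```

The entry function of the application could pass this value to a stack allocation routine before calling the rest of the program. Identifiers that differ by a multiple of OFFSET_SLOTS receive the same offset, so this scheme makes a collision less likely rather than impossible.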
{
    DWORD Stack_offset, ID_Thread1, ID_Thread2, ID_Thread3;

    // Stack offset between parent thread and the first child thread.
    Stack_offset = 1024;
    // Call OS thread API.
    ID_Thread1 = CreateThread(Func_thread_entry, &Stack_offset);

    Stack_offset = 2048;
    ID_Thread2 = CreateThread(Func_thread_entry, &Stack_offset);

    Stack_offset = 3072;
    ID_Thread3 = CreateThread(Func_thread_entry, &Stack_offset);
}
Example 7-9 Adding an Offset to the Stack Pointer of Three Threads (Contd.)
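The fragment above uses CreateThread as shorthand for the OS thread API. As a compilable counterpart, the sketch below does the same thing with POSIX threads and alloca, both of which are stand-ins chosen for this illustration. The names thread_arg, do_work, thread_entry, and run_threads_with_offsets are hypothetical. Each thread receives its own argument structure rather than the address of one shared variable, so the parent cannot overwrite an offset that a child has yet to read.

```c
#include <alloca.h>    /* alloca(); availability varies by platform */
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NUM_THREADS 3

/* Per-thread argument; one instance per thread, never shared. */
struct thread_arg {
    size_t stack_offset;   /* bytes to skip on this thread's stack */
    size_t offset_seen;    /* written by the thread, read after join */
};

/* Placeholder for stack-sensitive work; it is called after the padding
   has been reserved. */
static void do_work(struct thread_arg *arg)
{
    arg->offset_seen = arg->stack_offset;
}

static void *thread_entry(void *p)
{
    struct thread_arg *arg = p;
    volatile char *pad;

    assert(arg->stack_offset > 0);
    /* Reserve stack_offset bytes on this thread's stack. */
    pad = alloca(arg->stack_offset);
    pad[0] = 0;   /* touch the padding so the allocation is kept */
    do_work(arg);
    return NULL;
}

/* Start the threads with offsets 1024, 2048, and 3072 bytes, wait for
   them to finish, and return 0 on success or -1 if a thread could not
   be created. */
static int run_threads_with_offsets(struct thread_arg args[NUM_THREADS])
{
    pthread_t tid[NUM_THREADS];
    int started = 0;
    int status = 0;

    for (int i = 0; i < NUM_THREADS; i++) {
        args[i].stack_offset = 1024u * (size_t)(i + 1);
        args[i].offset_seen = 0;
        if (pthread_create(&tid[i], NULL, thread_entry, &args[i]) != 0) {
            status = -1;
            break;
        }
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);
    return status;
}
```

A caller declares an array of NUM_THREADS thread_arg structures, passes it to run_threads_with_offsets, and can then inspect offset_seen to confirm that each thread ran with the offset intended for it. On some toolchains the program must be built with the -pthread option.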