Intel IA-32 Computer Accessories User Manual


 
IA-32 Intel® Architecture Optimization
• When targeting IA-32 processors supporting Hyper-Threading Technology,
  adjust the private stack of each thread in an application so that the
  spacing between these stacks is not a multiple of 64 KB or 1 MB. This
  prevents unnecessary cache line evictions.
• When two instances of the same application execute in lockstep on
  IA-32 processors supporting Hyper-Threading Technology, add a
  per-instance stack offset so that their memory accesses are not
  separated by multiples of 64 KB or 1 MB.
See “Memory Optimization” for more details.
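The stack-offset practices above can be sketched as a small helper that turns a thread or instance index into a padding size. This is a minimal illustration, not code from this manual: the function name stack_stagger_bytes, the 1 KB step, and the 63-slot limit are all assumptions chosen so that no returned offset is a multiple of 64 KB (and therefore none is a multiple of 1 MB).

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative constants; none of these names or values come from the manual. */
#define ALIAS_64KB        (64u * 1024u)
#define STACK_STEP        1024u  /* assumed per-index stagger, in bytes */
#define MAX_STAGGER_SLOTS 63u    /* keeps every offset strictly below 64 KB */

/*
 * Return a stack padding size for a given thread (or application instance)
 * index. Offsets range from 1 KB to 63 KB, so two indices that differ
 * modulo 63 get offsets whose difference is nonzero and smaller than 64 KB.
 * Indices that are equal modulo 63 receive the same offset again.
 */
size_t stack_stagger_bytes(unsigned index)
{
    size_t offset = (size_t)((index % MAX_STAGGER_SLOTS) + 1u) * STACK_STEP;
    assert(offset % ALIAS_64KB != 0);
    return offset;
}
```

One way to apply the result, assuming the platform provides a stack allocation facility such as alloca, is to consume that many bytes of stack at the top of each thread's entry function before calling the real work function. For the per-instance case, the index is assumed to be supplied by whatever launches the instances.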
Key Practices of Front-end Optimization
Key practices for front-end optimization on processors that support
Hyper-Threading Technology are:
• Avoid excessive loop unrolling so that the Trace Cache operates
  efficiently.
• Optimize code size to improve locality in the Trace Cache and to
  increase delivered trace length.
See “Front-end Optimization” for more details.
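As a rough illustration of restrained unrolling, the sketch below unrolls a summation loop by a small fixed factor and handles the leftover elements in a short remainder loop. The factor of 4 and the function name sum_unrolled_by_4 are assumptions for illustration only. A suitable factor depends on the loop body and the target processor, and should be confirmed by measurement.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Sum an array using an unroll factor of 4 (an assumed, illustrative value).
 * A small factor reduces loop overhead while keeping the loop body compact.
 */
long sum_unrolled_by_4(const int *values, size_t count)
{
    long total = 0;
    size_t i = 0;

    assert(values != NULL || count == 0);

    /* Main body: four elements per iteration. */
    for (; i + 4 <= count; i += 4) {
        total += values[i];
        total += values[i + 1];
        total += values[i + 2];
        total += values[i + 3];
    }

    /* Remainder loop: the last 0 to 3 elements. */
    for (; i < count; ++i) {
        total += values[i];
    }

    return total;
}
```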
Key Practices of Execution Resource Optimization
Each physical processor has dedicated execution resources. The logical
processors within a physical processor that supports Hyper-Threading
Technology share specific on-chip execution resources. Key practices
for execution resource optimization include:
• Optimize each thread to achieve optimal frequency scaling first.
• Optimize multithreaded applications to achieve optimal scaling with
  respect to the number of physical processors.
• Use on-chip execution resources cooperatively if two threads are
  sharing the execution resources in the same physical processor
  package.