Intel IA-32 Computer Accessories User Manual


 
IA-32 Intel® Architecture Optimization
Assembly/Compiler Coding Rule 16. (H impact, H generality) Align data
on natural operand size address boundaries. If the data will be accessed with
vector instruction loads and stores, align the data on 16-byte boundaries.
For best performance, align data as follows:
Align 8-bit data at any address.
Align 16-bit data to be contained within an aligned four-byte word.
Align 32-bit data so that its base address is a multiple of four.
Align 64-bit data so that its base address is a multiple of eight.
Align 80-bit data so that its base address is a multiple of sixteen.
Align 128-bit data so that its base address is a multiple of sixteen.
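The alignment rules above can be checked or requested from C. The following is a minimal sketch, assuming a C11 compiler; the helper is_aligned and the buffer names are hypothetical and do not come from this manual.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>

/* Hypothetical helper: returns nonzero if p is a multiple of boundary.
   boundary must be a power of two. */
static int is_aligned(const void *p, uintptr_t boundary)
{
    return ((uintptr_t)p & (boundary - 1)) == 0;
}

/* Hypothetical buffers that request the alignments listed above. */
static alignas(16) float   vec_data[4];  /* 128-bit data: multiple of 16 */
static alignas(8)  double  f64_data;     /* 64-bit data: multiple of 8   */
static alignas(4)  int32_t i32_data;     /* 32-bit data: multiple of 4   */

/* Compile-time check that the requested alignment took effect. */
static_assert(alignof(vec_data) >= 16 || sizeof(vec_data) == 16,
              "vec_data holds exactly 128 bits");
```

A caller could assert is_aligned(vec_data, 16) before issuing aligned vector loads, or fall back to unaligned loads when the check fails.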
A 64-byte or greater data structure or array should be aligned so that its
base address is a multiple of 64. Sorting data in decreasing size order is
one heuristic for assisting with natural alignment. As long as 16-byte
boundaries (and cache lines) are never crossed, natural alignment is not
strictly necessary, though it is an easy way to enforce this.
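The size-ordering heuristic can be illustrated with two layouts of the same fields. This is a sketch only: the struct and member names are hypothetical, and exact sizes and padding depend on the compiler and ABI, so the checks below avoid assuming specific sizes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fields in arbitrary order: the compiler may insert padding
   before the wider members to keep them aligned. */
struct unsorted {
    uint8_t  flag;
    uint64_t count;
    uint16_t id;
    uint32_t size;
};

/* Same fields in decreasing size order: each member starts at an
   offset that is a multiple of its own size, with no interior padding. */
struct sorted {
    uint64_t count;
    uint32_t size;
    uint16_t id;
    uint8_t  flag;
};

static_assert(offsetof(struct sorted, size) % sizeof(uint32_t) == 0,
              "32-bit member sits at a multiple of four");
static_assert(offsetof(struct sorted, id) % sizeof(uint16_t) == 0,
              "16-bit member sits at a multiple of two");
static_assert(sizeof(struct sorted) <= sizeof(struct unsorted),
              "sorting by size does not grow the struct");
```

Member offsets translate into naturally aligned addresses only when the base address of the structure is itself aligned to the size of its largest member.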
Example 2-11 shows the type of code that can cause a cache line split.
The code loads the addresses of two dword arrays. 029e70feh is not a
4-byte-aligned address, so a 4-byte access at this address will get 2 bytes
from the cache line this address is contained in, and 2 bytes from the
cache line that starts at 029e7100h. On processors with 64-byte cache
lines, a similar cache line split will occur every 8 iterations. Figure 2-1
illustrates the situation of accessing a data element that spans a
cache line boundary.
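The arithmetic behind the split at 029e70feh can be reproduced with a short sketch. It assumes 64-byte cache lines; the constant CACHE_LINE_SIZE and both function names are hypothetical and chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_SIZE 64u  /* assumption: 64-byte cache lines */

/* Returns nonzero if an access of `size` bytes starting at `addr`
   touches more than one cache line. */
static int crosses_cache_line(uintptr_t addr, size_t size)
{
    uintptr_t first_line = addr / CACHE_LINE_SIZE;
    uintptr_t last_line  = (addr + size - 1) / CACHE_LINE_SIZE;
    return first_line != last_line;
}

/* Returns how many of the `size` bytes fall in the cache line
   that contains `addr`. */
static size_t bytes_in_first_line(uintptr_t addr, size_t size)
{
    size_t room = CACHE_LINE_SIZE - (size_t)(addr % CACHE_LINE_SIZE);
    return size < room ? size : room;
}
```

For the address discussed above, crosses_cache_line(0x029e70fe, 4) is nonzero and bytes_in_first_line(0x029e70fe, 4) is 2, which leaves 2 bytes for the line starting at 029e7100h. A 4-byte access at 029e7100h itself stays within one line.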