IA-32 Intel® Architecture Optimization
The cache line size of Pentium 4 and Pentium M processors can affect
streaming applications (for example, multimedia), which reference and
use data only once before discarding it. Data accesses that use only a
small part of each cache line make less efficient use of system memory
bandwidth. To achieve better packing, arrays of structures can be
decomposed into several arrays, as shown in Example 2-19.
The efficiency of such optimizations depends on usage patterns. If the
elements of the structure are all accessed together but the access pattern
of the array is random, then array_of_struct avoids unnecessary
prefetching, even though it wastes memory (1600 bytes versus 1400 bytes
for struct_of_array).
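
The random-access case can be checked with simple address arithmetic. The
following is a minimal sketch, not part of the original example: it assumes
a 64-byte cache line, 4-byte int, 16-byte array_of_struct elements, and
arrays that start on a cache line boundary. The function names and the
LINE and N macros are hypothetical and chosen for illustration only.

```c
#include <stddef.h>

#define LINE 64   /* assumed cache line size in bytes */
#define N    100  /* element count used in Example 2-19 */

/* Cache lines touched when reading all five fields of element i in the
   array-of-structures layout (assumed 16-byte elements, aligned base). */
size_t aos_lines_for_element(size_t i)
{
    size_t first = (16 * i) / LINE;
    size_t last  = (16 * i + 15) / LINE;
    return last - first + 1;
}

/* The same read in the structure-of-arrays layout: each field lives in
   its own array, so count the distinct lines holding the five fields. */
size_t soa_lines_for_element(size_t i)
{
    size_t line[5];
    size_t distinct = 0;

    line[0] = (4 * i) / LINE;             /* a[i] */
    line[1] = (4 * N + 4 * i) / LINE;     /* c[i] */
    line[2] = (8 * N + 4 * i) / LINE;     /* e[i] */
    line[3] = (12 * N + i) / LINE;        /* b[i] */
    line[4] = (12 * N + N + i) / LINE;    /* d[i] */

    for (size_t k = 0; k < 5; k++) {
        int seen = 0;
        for (size_t j = 0; j < k; j++)
            if (line[j] == line[k])
                seen = 1;
        if (!seen)
            distinct++;
    }
    return distinct;
}
```

Under these assumptions, reading one whole element touches a single cache
line in the array-of-structures layout but five separate lines in the
structure-of-arrays layout, which is why the former can be preferable when
elements are visited in random order.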
Example 2-19 Decomposing an Array
struct {                /* 1600 bytes: 14 data bytes padded to 16 per element */
    int a, c, e;
    char b, d;
} array_of_struct[100];

struct {                /* 1400 bytes: each field packed in its own array */
    int a[100], c[100], e[100];
    char b[100], d[100];
} struct_of_array;

struct {                /* 1200 bytes: the int fields kept together */
    int a, c, e;
} hybrid_struct_of_array_ace[100];

struct {                /* 200 bytes: the char fields kept together */
    char b, d;
} hybrid_struct_of_array_bd[100];
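
The sizes in the comments, and the effect of packing on a streaming loop,
can be confirmed with sizeof. The sketch below is illustrative only: the
struct tags, macros, and function names are hypothetical, and it assumes
4-byte int with 4-byte alignment, a 64-byte cache line, and arrays that
start on a cache line boundary.

```c
#include <stddef.h>

#define LINE 64   /* assumed cache line size in bytes */
#define N    100  /* element count used in Example 2-19 */

/* The layouts of Example 2-19, given tags so sizeof can be applied. */
struct aos_elem { int a, c, e; char b, d; };
struct soa      { int a[N], c[N], e[N]; char b[N], d[N]; };
struct ace_elem { int a, c, e; };
struct bd_elem  { char b, d; };

size_t aos_bytes(void)        { return sizeof(struct aos_elem) * N; }
size_t soa_bytes(void)        { return sizeof(struct soa); }
size_t hybrid_ace_bytes(void) { return sizeof(struct ace_elem) * N; }
size_t hybrid_bd_bytes(void)  { return sizeof(struct bd_elem) * N; }

/* Cache lines needed to cover a contiguous region that starts on a
   line boundary. */
size_t lines_covering(size_t bytes)
{
    return (bytes + LINE - 1) / LINE;
}

/* Lines fetched by a loop that reads only field a of every element.
   In the array-of-structures layout the whole array is pulled in,
   because every line holds some element's a field. */
size_t aos_lines_streaming_a(void)
{
    return lines_covering(aos_bytes());
}

/* In the structure-of-arrays layout only the a[] array is fetched. */
size_t soa_lines_streaming_a(void)
{
    return lines_covering(sizeof(int) * N);
}
```

With these assumptions, a loop that streams through field a alone fetches
25 cache lines from array_of_struct but only 7 from struct_of_array, since
every byte of each fetched line is then useful data.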