Coding for SIMD Architectures 3
The __declspec(align(16)) specifier can be placed before a data
declaration to force 16-byte alignment. This is particularly useful for
local or global data declarations that are assigned to 128-bit data types.
The syntax is:
__declspec(align(integer-constant))
where integer-constant is an integral power of two no greater than 32.
For example, the following raises the alignment to 16 bytes:
__declspec(align(16)) float buffer[400];
The variable buffer could then be used as if it contained 100 objects of
type __m128 or F32vec4. In the code below, the construction of the
F32vec4 object, x, will occur with aligned data.
void foo() {
    F32vec4 x = *(__m128 *) buffer;
    ...
}
Without the __declspec(align(16)) declaration, buffer may not fall on a
16-byte boundary, and the aligned 128-bit access may fault.
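The following is a minimal sketch of the same idea that also compiles
outside Microsoft and Intel compilers. It is not taken from the original
text: the ALIGN16 macro and the helper functions is_aligned16 and
sum_first_two_groups_lane0 are hypothetical names, standard C++11 alignas
is assumed as a stand-in where __declspec is unavailable, and the raw
__m128 type with SSE intrinsics is used in place of F32vec4.

```cpp
#include <cassert>
#include <cstdint>
#include <xmmintrin.h>  // __m128, _mm_load_ps, _mm_add_ps, _mm_store_ps

// Hypothetical helper macro: use the __declspec spelling on Microsoft
// compilers, and standard C++11 alignas elsewhere (an assumption).
#if defined(_MSC_VER)
#define ALIGN16 __declspec(align(16))
#else
#define ALIGN16 alignas(16)
#endif

ALIGN16 float buffer[400];  // 400 floats = 100 groups of four

// Returns true when p sits on a 16-byte boundary.
bool is_aligned16(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % 16 == 0;
}

// Adds the first two 4-float groups of buffer and returns lane 0 of the sum.
float sum_first_two_groups_lane0() {
    assert(is_aligned16(buffer));        // guaranteed by ALIGN16 above
    __m128 a = _mm_load_ps(buffer);      // aligned load of buffer[0..3]
    __m128 b = _mm_load_ps(buffer + 4);  // aligned load of buffer[4..7]
    ALIGN16 float out[4];
    _mm_store_ps(out, _mm_add_ps(a, b)); // aligned store of the four sums
    return out[0];
}
```

If the ALIGN16 qualifier is removed from buffer, the assertion is no
longer guaranteed to hold, which is the situation in which the aligned
loads may fault.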
Alignment by Using a union Structure. When feasible, a union can be used
with 128-bit data types to allow the compiler to align the data structure
by default. This is preferred to forcing alignment with
__declspec(align(16)) because it exposes the true program intent to the
compiler, namely that __m128 data is being used. For example:
union {
    float f[400];
    __m128 m[100];
} buffer;
The 16-byte alignment is used by default due to the __m128 type in the
union; it is not necessary to use __declspec(align(16)) to force it.
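As a rough check of this behavior, the sketch below gives the union a
type name so that its alignment can be verified at compile time. The
names VecBuffer, vec_buffer, and group_sum are illustrative and do not
appear in the original text; C++11 (static_assert, alignof, alignas) and
the SSE header xmmintrin.h are assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <xmmintrin.h>  // __m128, _mm_load_ps, _mm_store_ps

// Named version of the union from the text (the type name is illustrative).
union VecBuffer {
    float  f[400];
    __m128 m[100];
};

// The __m128 member raises the alignment requirement of the whole union,
// so no explicit alignment specifier is needed.
static_assert(alignof(VecBuffer) >= 16, "union should be 16-byte aligned");
static_assert(sizeof(VecBuffer) == 400 * sizeof(float),
              "both views cover the same storage");

VecBuffer vec_buffer;

// Fills the float view with 0, 1, 2, ... and returns the sum of the four
// floats in the given group, fetched with one aligned 128-bit load.
float group_sum(std::size_t group) {
    assert(group < 100);
    for (std::size_t i = 0; i < 400; ++i) {
        vec_buffer.f[i] = static_cast<float>(i);
    }
    __m128 v = _mm_load_ps(vec_buffer.f + 4 * group);  // aligned load
    alignas(16) float lanes[4];
    _mm_store_ps(lanes, v);
    return lanes[0] + lanes[1] + lanes[2] + lanes[3];
}
```

The load goes through the float view with _mm_load_ps rather than through
the m member, so the example relies only on the alignment that the union
provides.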