Intel IA-32 Computer Accessories User Manual


 
IA-32 Intel® Architecture Optimization
For planning considerations when using the new SIMD integer instructions,
refer to “Checking for Streaming SIMD Extensions 2 Support” in
Chapter 3.
General Rules on SIMD Integer Code
The overall rules and suggestions are as follows:
• Do not intermix 64-bit SIMD integer instructions with x87
  floating-point instructions. See the “Using SIMD Integer with x87
  Floating-point” section in this chapter. Note that all of the SIMD
  integer instructions can be intermixed without penalty.
• When writing SSE2 code that works with both integer and
  floating-point data, use the subset of SIMD convert instructions or
  load/store instructions to ensure that the input operands in XMM
  registers contain a properly defined data type that matches the
  instruction. Code sequences containing cross-typed usage will produce
  the same result across different implementations, but will incur a
  significant performance penalty. Using SSE or SSE2 instructions to
  operate on type-mismatched SIMD data in the XMM register is strongly
  discouraged.
• Use the optimization rules and guidelines described in Chapter 2
  and Chapter 3 that apply to the Pentium 4, Intel Xeon, and
  Pentium M processors.
• Take advantage of the hardware prefetcher where possible. Use the
  prefetch instruction only when data access patterns are irregular and
  the prefetch distance can be predetermined (for details, refer to
  Chapter 6, “Optimizing Cache Usage”).
• Emulate conditional moves by using masked compares and logical
  instructions instead of conditional branches.
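
The last rule above, emulating a conditional move with a masked compare and logicals, can be sketched as follows. This is a minimal illustration, not code from this manual: the function name select_greater_epi16 and its parameters are hypothetical, and the sketch assumes an SSE2-capable compiler and target, and an element count that is a multiple of 8. The compare produces an all-ones or all-zeros mask per 16-bit element, and AND, ANDNOT, and OR then merge the two sources without any per-element branch.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the manual):
   out[i] = (a[i] > b[i]) ? a[i] : b[i] for signed 16-bit elements,
   computed without a per-element conditional branch. */
void select_greater_epi16(const int16_t *a, const int16_t *b,
                          int16_t *out, size_t n)
{
    assert(n % 8 == 0);  /* sketch assumption: whole 128-bit vectors only */

    for (size_t i = 0; i < n; i += 8) {
        __m128i va = _mm_loadu_si128((const __m128i *)(a + i));
        __m128i vb = _mm_loadu_si128((const __m128i *)(b + i));

        /* Masked compare: each lane is 0xFFFF where a > b, else 0x0000. */
        __m128i mask = _mm_cmpgt_epi16(va, vb);

        /* Logicals: keep a where the mask is set, b where it is clear. */
        __m128i take_a = _mm_and_si128(mask, va);
        __m128i take_b = _mm_andnot_si128(mask, vb);  /* (~mask) & vb */

        _mm_storeu_si128((__m128i *)(out + i),
                         _mm_or_si128(take_a, take_b));
    }
}
```

Selecting the larger element is used here only because it is easy to check by hand; for that specific case a packed-maximum instruction is the more direct choice. The same mask-and-merge pattern applies to any per-element condition that a packed compare can express.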