Intel IA-32 Computer Accessories User Manual

Chapter 5: Optimizing for SIMD Floating-point Applications

This chapter discusses general rules for optimizing code that uses the single-instruction, multiple-data (SIMD) floating-point instructions available in Streaming SIMD Extensions (SSE), Streaming SIMD Extensions 2 (SSE2), and Streaming SIMD Extensions 3 (SSE3). It also provides examples that illustrate optimization techniques for single-precision and double-precision SIMD floating-point applications.

General Rules for SIMD Floating-point Code

The rules and suggestions in this section help optimize code that contains SIMD floating-point instructions. In general, efficient SIMD floating-point code requires understanding and balancing port utilization. The basic rules and suggestions include the following:

• Follow all guidelines in Chapter 2 and Chapter 3.

• Mask exceptions to achieve higher performance. Software runs more slowly when exceptions are unmasked.

• Use the flush-to-zero and denormals-are-zero modes to avoid the performance penalty of handling denormals and underflows.

• Use the prefetch instruction where appropriate (for details, refer to Chapter 6, “Optimizing Cache Usage”).

• Use MMX technology instructions and registers for shuffling data if the computations can be done in SIMD integer.
