IA-32 Intel® Architecture Optimization

SIMD Extensions 2 integer SIMD and __m128d is used for double-precision floating-point SIMD. These types enable the programmer to choose the implementation of an algorithm directly, while allowing the compiler to perform register allocation and instruction scheduling where possible. These intrinsics are portable among all Intel architecture-based processors supported by a compiler. The use of intrinsics allows you to obtain performance close to the levels achievable with assembly, while the cost of writing and maintaining programs with intrinsics is considerably lower than with assembly. For a detailed description of the intrinsics and their use, refer to the Intel® C++ Compiler User’s Guide.
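To illustrate the __m128d type mentioned above, here is a minimal sketch that adds two pairs of double-precision values. It assumes a compiler and processor with SSE2 support and the emmintrin.h header. The function name add_pd2 is hypothetical, and the unaligned load and store forms are used only so that the caller does not need to align the arrays.

```c
#include <emmintrin.h>  /* SSE2 intrinsics: __m128d and the _mm_*_pd family */

/* Hypothetical helper (not from the manual):
   c[i] = a[i] + b[i] for i = 0, 1. */
void add_pd2(const double *a, const double *b, double *c)
{
    __m128d t0 = _mm_loadu_pd(a);   /* load a[0..1], no alignment required */
    __m128d t1 = _mm_loadu_pd(b);   /* load b[0..1], no alignment required */

    t0 = _mm_add_pd(t0, t1);        /* two double-precision adds at once */
    _mm_storeu_pd(c, t0);           /* store the two sums to c[0..1] */
}
```

If the arrays are known to be 16-byte aligned, _mm_load_pd and _mm_store_pd could be used instead.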

Example 3-10 shows the loop from Example 3-8 using intrinsics.

The intrinsics map one-to-one with actual Streaming SIMD Extensions assembly code. The xmmintrin.h header file, in which the prototypes for the intrinsics are defined, is part of the Intel C++ Compiler included with the VTune Performance Enhancement Environment CD.

Intrinsics are also defined for the MMX technology ISA. These are based on the __m64 data type to represent the contents of an mm register. You can specify values in bytes, short integers, 32-bit values, or as a 64-bit object.
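As a rough sketch of the __m64 type described above, the following function packs four short integers into each operand and adds them element by element. It assumes a compiler that provides mmintrin.h for the target (for example GCC or Clang on x86). The function name add_pi16x4 is hypothetical.

```c
#include <mmintrin.h>  /* MMX intrinsics: __m64 and the _mm_*_pi16 family */
#include <string.h>    /* memcpy */

/* Hypothetical helper (not from the manual):
   c[i] = a[i] + b[i] for four 16-bit integers, with wraparound on overflow. */
void add_pi16x4(const short *a, const short *b, short *c)
{
    /* _mm_set_pi16 takes its arguments from the most significant
       element down to the least significant one. */
    __m64 t0 = _mm_set_pi16(a[3], a[2], a[1], a[0]);
    __m64 t1 = _mm_set_pi16(b[3], b[2], b[1], b[0]);

    t0 = _mm_add_pi16(t0, t1);      /* four 16-bit adds at once */
    memcpy(c, &t0, sizeof t0);      /* copy the 64-bit result out to c[0..3] */

    _mm_empty();                    /* clear MMX state before any x87 code runs */
}
```

The call to _mm_empty matters because the mm registers share state with the x87 floating-point registers.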

Example 3-10 Simple Four-Iteration Loop Coded with Intrinsics

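#include <xmmintrin.h>

/* a, b, and c must each point to four floats on a 16-byte boundary */
void add(float *a, float *b, float *c)
{
    __m128 t0, t1;

    t0 = _mm_load_ps(a);      /* load a[0..3] */
    t1 = _mm_load_ps(b);      /* load b[0..3] */
    t0 = _mm_add_ps(t0, t1);  /* four single-precision adds at once */
    _mm_store_ps(c, t0);      /* store the sums to c[0..3] */
}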
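Example 3-10 handles exactly four elements and relies on _mm_load_ps and _mm_store_ps, which require 16-byte-aligned addresses. One possible way to extend the same idea to arrays of arbitrary length is sketched below. The function name add_n and the choice of unaligned loads with a scalar remainder loop are assumptions made for illustration, not part of the manual.

```c
#include <stddef.h>
#include <xmmintrin.h>  /* SSE intrinsics: __m128 and the _mm_*_ps family */

/* Hypothetical generalization of Example 3-10 (not from the manual):
   c[i] = a[i] + b[i] for i in [0, n). */
void add_n(const float *a, const float *b, float *c, size_t n)
{
    size_t i = 0;

    /* Process four floats per iteration; the "u" variants accept
       addresses that are not 16-byte aligned. */
    for (; i + 4 <= n; i += 4) {
        __m128 t0 = _mm_loadu_ps(a + i);
        __m128 t1 = _mm_loadu_ps(b + i);
        _mm_storeu_ps(c + i, _mm_add_ps(t0, t1));
    }

    /* Scalar loop for the remaining zero to three elements. */
    for (; i < n; ++i) {
        c[i] = a[i] + b[i];
    }
}
```

When the data is known to be 16-byte aligned, the aligned forms used in Example 3-10 can be substituted in the vector loop.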
