
Intel IA-32 Computer Accessories User Manual


Coding for SIMD Architectures 3


Assembly

Key loops can be coded directly in assembly language, either with an
assembler or as inlined assembly (C-asm) in C/C++ code. The Intel
compiler or assembler recognizes the new instructions and registers and
directly generates the corresponding code. This model offers the
opportunity to attain the greatest performance, but that performance is
not portable across different processor architectures.

Example 3-9 shows the Streaming SIMD Extensions inlined assembly encoding.

Intrinsics

Intrinsics provide access to the ISA functionality using C/C++ style
coding instead of assembly language. Intel has defined three sets of
intrinsic functions, implemented in the Intel® C++ Compiler, to support
the MMX technology, Streaming SIMD Extensions, and Streaming SIMD
Extensions 2. Four new C data types, representing 64-bit and 128-bit
objects, are used as the operands of these intrinsic functions. __m64 is
used for MMX integer SIMD, __m128 is used for single-precision
floating-point SIMD, __m128i is used for Streaming

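Example 3-9 Streaming SIMD Extensions Using Inlined Assembly Encoding

void add(float *a, float *b, float *c)
{
    __asm {
        mov eax, a                        // eax = pointer a
        mov edx, b                        // edx = pointer b
        mov ecx, c                        // ecx = pointer c
        movaps xmm0, XMMWORD PTR [eax]    // load four floats from a (16-byte aligned)
        addps xmm0, XMMWORD PTR [edx]     // add four floats from b, element by element
        movaps XMMWORD PTR [ecx], xmm0    // store the four sums to c
    }
}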
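
As a rough sketch of the intrinsics approach described above, the same four-element add can be written with the Streaming SIMD Extensions intrinsics declared in xmmintrin.h. The function name add_intrin is hypothetical and is not taken from the manual. The sketch assumes all three pointers are 16-byte aligned, which the movaps-based version in Example 3-9 also requires.

```c
#include <xmmintrin.h>   /* SSE intrinsics: __m128, _mm_load_ps, _mm_add_ps, _mm_store_ps */

/* Hypothetical helper: computes c[i] = a[i] + b[i] for i = 0..3.
   Assumption: a, b, and c each point to 16-byte-aligned storage. */
void add_intrin(const float *a, const float *b, float *c)
{
    __m128 va = _mm_load_ps(a);          /* aligned load of four floats from a */
    __m128 vb = _mm_load_ps(b);          /* aligned load of four floats from b */
    _mm_store_ps(c, _mm_add_ps(va, vb)); /* packed add, then aligned store to c */
}
```

Unlike the inlined assembly version, this form leaves register allocation and instruction scheduling to the compiler, and the same source can be built by any compiler that provides xmmintrin.h.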
