216 Optimizing with SIMD Instructions Chapter 9

25112 Rev. 3.06 September 2005

Software Optimization Guide for AMD64 Processors

9.13 Clearing MMX™ and XMM Registers with XOR Instructions

Optimization

Use instructions that perform XOR operations (PXOR, XORPS, and XORPD) to clear all the bits in MMX and XMM registers.

Application

This optimization applies to:

• 32-bit software

• 64-bit software

Rationale

The latency of the MMX XOR instruction (PXOR) is only 3 cycles, comparable to the 3 cycles required to load data that is already in the L1 data cache. The SSE and SSE2 XOR instructions (XORPS and XORPD, respectively) also have latencies of 3 cycles.

Examples

The following examples illustrate how to clear the bits in a register using the different exclusive-OR instructions:

; MMX
pxor mm0, mm0 ; Clear the MM0 register.

; SSE
xorps xmm0, xmm0 ; Clear the XMM0 register.

; SSE2
xorpd xmm0, xmm0 ; Clear the XMM0 register.
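
For code written in C rather than assembly, the same idea can be sketched with compiler intrinsics. The sketch below assumes an x86 target with SSE2 and a compiler that provides <emmintrin.h>; the helper names (clear_ps, clear_pd, clear_int, all_zero) are hypothetical and are not part of this guide. Compilers commonly lower the _mm_setzero_* intrinsics to a register-to-register XORPS, XORPD, or PXOR, but the exact instruction selected is up to the compiler.

```c
#include <stddef.h>
#include <emmintrin.h>  /* SSE2 intrinsics; also provides the SSE (__m128) ones */

/* Hypothetical helper: returns 1 if all n bytes at p are zero, else 0. */
static int all_zero(const void *p, size_t n)
{
    const unsigned char *b = (const unsigned char *)p;
    for (size_t i = 0; i < n; ++i) {
        if (b[i] != 0) {
            return 0;
        }
    }
    return 1;
}

/* Zero a 128-bit value as four floats (SSE) and store it to out. */
static void clear_ps(float out[4])
{
    __m128 z = _mm_setzero_ps();   /* typically emitted as XORPS reg, reg */
    _mm_storeu_ps(out, z);
}

/* Zero a 128-bit value as two doubles (SSE2) and store it to out. */
static void clear_pd(double out[2])
{
    __m128d z = _mm_setzero_pd();  /* typically emitted as XORPD reg, reg */
    _mm_storeu_pd(out, z);
}

/* Zero a 128-bit integer value (SSE2) and store it to out. */
static void clear_int(unsigned char out[16])
{
    __m128i z = _mm_setzero_si128();  /* typically emitted as PXOR reg, reg */
    _mm_storeu_si128((__m128i *)out, z);
}
```

Each helper overwrites its output buffer with zero bits without reading a zero constant from memory, which is the point of the optimization. Inspecting the compiler's assembly output is the only way to confirm which XOR form was actually generated.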
