Software Optimization Guide for AMD64 Processors

25112 Rev. 3.06 September 2005

Chapter 10 x87 Floating-Point Optimizations

AMD Athlon™ 64 and AMD Opteron™ processors support multiple methods of performing floating-point operations. They support the older x87 assembly instructions in addition to the more recent SIMD instructions (SSE, SSE2, and 3DNow!™ technologies). Many of the suggestions in this chapter are also generally applicable to the AMD Athlon 64 and AMD Opteron processors, with the exception of SSE2 optimizations and expanded register usage.

AMD Athlon 64 and AMD Opteron processors are 64-bit processors that are fully backwards compatible with 32-bit code. In general, 64-bit operating systems support the x87 and 3DNow! instructions in 32-bit threads; however, 64-bit operating systems may not support x87 and 3DNow! instructions in 64-bit threads. To make it easier to later migrate from 32-bit to 64-bit code, you may want to avoid x87 and 3DNow! instructions altogether and use only SSE and SSE2 instructions when writing new 32-bit code.
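As a rough illustration of the advice above, the following sketch computes a double-precision dot product with SSE2 intrinsics from <emmintrin.h> rather than relying on x87 code. The function name dot_product and the scalar fallback are illustrative assumptions and do not come from this guide. The __SSE2__ macro used as a guard is the one defined by GCC and Clang; other compilers may need a different check.

```c
#include <stddef.h>
#if defined(__SSE2__)
#include <emmintrin.h>  /* SSE2 intrinsics */
#endif

/* Hypothetical helper: returns the sum of a[i] * b[i] for i in [0, n). */
double dot_product(const double *a, const double *b, size_t n)
{
#if defined(__SSE2__)
    /* Two partial sums, one per 64-bit lane of the 128-bit register. */
    __m128d acc = _mm_setzero_pd();
    size_t i = 0;

    /* Process two doubles per iteration; unaligned loads keep the
       sketch safe for arbitrarily aligned input. */
    for (; i + 2 <= n; i += 2) {
        __m128d va = _mm_loadu_pd(a + i);
        __m128d vb = _mm_loadu_pd(b + i);
        acc = _mm_add_pd(acc, _mm_mul_pd(va, vb));
    }

    /* Combine the two lanes, then handle a possible odd tail element. */
    double lanes[2];
    _mm_storeu_pd(lanes, acc);
    double sum = lanes[0] + lanes[1];
    for (; i < n; ++i) {
        sum += a[i] * b[i];
    }
    return sum;
#else
    /* Assumed fallback for targets without SSE2: plain scalar loop. */
    double sum = 0.0;
    for (size_t i = 0; i < n; ++i) {
        sum += a[i] * b[i];
    }
    return sum;
#endif
}
```

Because the code uses no x87 or 3DNow! instructions explicitly, the same source can be rebuilt for a 64-bit target without change. With GCC on 32-bit x86, the -msse2 -mfpmath=sse options additionally ask the compiler to use SSE scalar instructions, instead of the x87 stack, for ordinary floating-point arithmetic.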

This chapter details methods for optimizing floating-point code for the pipelined x87 floating-point registers.

This chapter covers the following topics:

Using Multiplication Rather Than Division
Achieving Two Floating-Point Operations per Clock Cycle
Floating-Point Compare Instructions
Using the FXCH Instruction Rather Than FST/FLD Pairs
Floating-Point Subexpression Elimination
Accumulating Precision-Sensitive Quantities in x87 Registers
Avoiding Extended-Precision Data
