192 Integer Optimizations Chapter 8
25112 Rev. 3.06 September 2005
Software Optimization Guide for AMD64 Processors
8.9 Optimizing Integer Division
Optimization
When possible, use smaller data types for integer division.
Application
This optimization applies to:
• 32-bit software
• 64-bit software
Rationale
Division by a 16-bit value is significantly faster than division by a 32-bit value: the latency is about
26 clocks versus 42. Likewise, division by a 32-bit value is faster than division by a 64-bit value, about
42 clocks versus 74. Refer to IDIV in Table 15. In algorithms in which integer division accounts for a
substantial share of execution time, it may be beneficial to check whether a smaller divide type can be
used. Study the assembly language output generated by high-level language compilers to verify that the
desired code is generated. Compilers often generate code that converts 16-bit types into 32-bit values
that are then used to perform 32-bit division, thus eliminating the advantage of using 16-bit integer
types. If the compiler cannot be coerced into producing the desired code, then compiler intrinsics or
assembly language are required.
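
Example
The following is a minimal sketch of the two ideas above, not code taken from this guide. The function names div16 and div_u64_narrow are hypothetical. The first function assumes a GCC-compatible compiler targeting x86 and uses extended inline assembly to force a 16-bit IDIV; on other compilers or targets it falls back to ordinary C division. The second is plain C and uses a 32-bit divide whenever both 64-bit operands happen to fit in 32 bits. Whether either variant is actually faster depends on the compiler, the processor, and the data, so inspect the generated assembly and measure.

```c
#include <stdint.h>

/* Hypothetical helper: signed 16-bit division using a 16-bit IDIV.
 * In C, int16_t operands are promoted to int before '/', which is why
 * compilers commonly emit a 32-bit divide for 16-bit types.
 * As with the C '/' operator, a zero divisor or INT16_MIN / -1 is
 * invalid (the instruction raises a divide error). */
static inline int16_t div16(int16_t dividend, int16_t divisor)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    int16_t quotient = dividend;   /* dividend goes in AX, quotient comes back in AX */
    int16_t remainder;             /* IDIV leaves the remainder in DX */
    __asm__ ("cwtd\n\t"            /* sign-extend AX into DX:AX (CWD) */
             "idivw %2"            /* DX:AX / 16-bit operand */
             : "+a"(quotient), "=&d"(remainder)
             : "rm"(divisor)
             : "cc");
    (void)remainder;
    return quotient;
#else
    /* Portable fallback; the compiler chooses the divide width. */
    return (int16_t)(dividend / divisor);
#endif
}

/* Hypothetical helper: unsigned 64-bit division that takes a 32-bit
 * path when both operands fit in 32 bits. The divisor must be nonzero. */
static inline uint64_t div_u64_narrow(uint64_t dividend, uint64_t divisor)
{
    if (((dividend | divisor) >> 32) == 0) {
        /* Both values fit in 32 bits: a 32-bit divide is sufficient. */
        return (uint32_t)dividend / (uint32_t)divisor;
    }
    return dividend / divisor;     /* full 64-bit divide */
}
```

For instance, div16(-100, 7) truncates toward zero in the same way as the C division operator. The run-time check in div_u64_narrow costs a branch, so it is likely to pay off only when small operands are common.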