IBM PPC440X5 Computer Hardware User Manual


User’s Manual

Preliminary PPC440x5 CPU Core


September 12, 2002


Appendix B. PPC440x5 Core Compiler Optimizations

This appendix describes some potential optimizations for compilers.

1. Place target addresses (subroutine entry points) on cache line boundaries (32 bytes).
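
As a minimal sketch of the arithmetic behind item 1, the following rounds an address up to the next 32-byte boundary and reports how much padding that takes. The function names are hypothetical and chosen for illustration; only the 32-byte line size comes from this appendix.

```python
CACHE_LINE_BYTES = 32  # cache line size stated in item 1


def align_up(address, alignment=CACHE_LINE_BYTES):
    """Round address up to the next multiple of alignment (a power of two)."""
    return (address + alignment - 1) & ~(alignment - 1)


def padding_needed(address, alignment=CACHE_LINE_BYTES):
    """Number of padding bytes required to move address onto a boundary."""
    return align_up(address, alignment) - address
```

An entry point that would otherwise land at 0x1004 would be moved to 0x1020, at a cost of 28 bytes of padding.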

2. Schedule up to five instructions between a load and the use of the load result. Assuming a data cache hit, in the worst case the PPC440x5 core needs five instructions between a load and the use of its result in order to avoid any bubbles. The five instructions are:

• One that dispatches together with the load

• Two that dispatch in the cycle after

• Two that dispatch in the cycle after that

In the next cycle, the use of the load result can dispatch. Therefore, the compiler should try to schedule as many as five instructions between the load and the use of the load result. However, if some of the instruction pairs between the load and the use have pipeline dependencies (such that they cannot dispatch together), there is no benefit in including the extra instructions, and other scheduling optimizations could be made.

In the worst case of instruction pairings, the maximum performance can be achieved with only two instructions between the load and the use of the load result. This is the case when the load instruction pairs with the instruction before it (instead of after it), the next two instructions require the same pipe, so only one can dispatch during the cycle after the load, and the third instruction after the load needs the same pipe as the second, so they cannot dispatch together either. In such a case, the third instruction after the load might as well be the use of the load result. See item 3 for information about which instruction pairings can dispatch together.
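
The two cases described in item 2 can be checked with a small model. This is a sketch under stated assumptions, not a description of the actual dispatch hardware: it assumes strictly in-order, greedy dual dispatch, ignores data dependencies between the filler instructions, treats two instructions as pairable whenever they can be assigned different pipes, and takes the pipe classes from item 3. All names are hypothetical.

```python
# Pipes each instruction class may use (per item 3 of this appendix).
L_ONLY = frozenset({"L"})       # loads and stores
I_ONLY = frozenset({"I"})       # branches, CR/XER updates, multiply, divide, SPR access
I_OR_J = frozenset({"I", "J"})  # other arithmetic and logic instructions

LOAD_TO_USE = 3  # the use can dispatch three cycles after the load dispatches


def can_pair(a, b):
    """True if a and b can be given two different pipes (dependencies ignored)."""
    return any(p != q for p in a for q in b)


def dispatch_cycles(stream):
    """Greedy in-order dual dispatch; returns the dispatch cycle of each instruction."""
    cycles, i, cycle = [], 0, 0
    while i < len(stream):
        cycles.append(cycle)
        if i + 1 < len(stream) and can_pair(stream[i], stream[i + 1]):
            cycles.append(cycle)  # the next instruction dispatches in the same cycle
            i += 1
        i += 1
        cycle += 1
    return cycles


def useful_fillers(before, fillers):
    """Count the fillers that dispatch before the load result could be used."""
    stream = list(before) + [L_ONLY] + list(fillers)
    cycles = dispatch_cycles(stream)
    load_index = len(before)
    load_cycle = cycles[load_index]
    return sum(1 for c in cycles[load_index + 1:] if c < load_cycle + LOAD_TO_USE)
```

Under these assumptions, `useful_fillers([], [I_OR_J] * 5)` gives 5 (one filler pairs with the load, then two pairs follow), while `useful_fillers([I_OR_J], [I_ONLY] * 3)` gives 2, matching the case where the load pairs with the instruction before it and the following instructions compete for the same pipe.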

3. Pair instructions for dual dispatch. The rules for instruction dispatch in the PPC440x5 core are as follows:

• Loads and stores can only use the L-Pipe.

• Branches, CR-updates, XER-updates (“o” forms of arithmetic instructions), multiply, divide, system instructions (such as rfi and sc), and any SPR accesses (mtspr, mfspr) can only use the I-Pipe.

• All other instructions (primarily non-CR-updating and non-XER-updating arithmetic and logic instructions) can use either the J-Pipe or the I-Pipe.

Instructions should be paired so that they can dispatch as pairs. For example, pair loads and stores with any other instructions, pair CR-updates with non-CR-updating instructions, and so on.
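
The dispatch rules in item 3 can be written down as a lookup table plus a pairing check. The sketch below is illustrative only: the table covers a small, hand-picked subset of mnemonics, the names `PIPES_BY_MNEMONIC` and `can_dual_dispatch` are hypothetical, and the check assumes two instructions can pair whenever they can be assigned different pipes, ignoring data dependencies between them.

```python
L_PIPE, I_PIPE, J_PIPE = "L", "I", "J"

# Illustrative subset of mnemonics, classified by the rules in item 3.
PIPES_BY_MNEMONIC = {
    # Loads and stores: L-Pipe only.
    "lwz": {L_PIPE}, "stw": {L_PIPE},
    # Branches, CR/XER updates, multiply, divide, system and SPR access: I-Pipe only.
    "b": {I_PIPE}, "bc": {I_PIPE}, "cmpw": {I_PIPE}, "addo": {I_PIPE},
    "mullw": {I_PIPE}, "divw": {I_PIPE}, "rfi": {I_PIPE}, "sc": {I_PIPE},
    "mtspr": {I_PIPE}, "mfspr": {I_PIPE},
    # Other arithmetic and logic instructions: J-Pipe or I-Pipe.
    "add": {I_PIPE, J_PIPE}, "and": {I_PIPE, J_PIPE}, "or": {I_PIPE, J_PIPE},
}


def can_dual_dispatch(first, second):
    """True if the two mnemonics can be assigned to two different pipes."""
    a = PIPES_BY_MNEMONIC[first]
    b = PIPES_BY_MNEMONIC[second]
    return any(p != q for p in a for q in b)
```

With this table, a load pairs with an add, and an add pairs with a compare, but two loads, or a compare followed by a conditional branch, do not pair.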

4. Do not bother trying to schedule instructions between CR-updates and the branches that are conditional on those CR-updates (with some exceptions).

The exceptions are for CR-updates caused by multiply, divide, multiply-accumulate, mtcrf, tlbsx., and stwcx. instructions. If a branch depends on the CR result of one of these instructions, one or more instructions should be scheduled (if possible) between the CR update and the branch. Of course, it is also the general case (as pointed out in item 3) that the compiler should schedule instructions so they can issue in pairs, and a CR-update and a branch both issue to the I-Pipe, so they cannot issue together.

The compiler should try to arrange the code so that a CR-update and a following branch (regardless of any CR-dependency of the branch) can each issue as part of a pair. This can mean that the CR-update is paired with the instruction before it, and the branch with the instruction after it, so that there is dual issue in both cycles. However, if this pairing is not possible, an instruction should be inserted between the CR-update and the branch to allow the dual issue (only if a useful instruction is available; do not create no-ops for no reason).

The point of this item is to explain that there is no need to separate the CR-update and the branch simply for the sake of the CR-dependency. That is, there is no extra cycle penalty associated with the CR-update/branch CR-dependency, beyond the “standard” penalty of the inability to dual issue, unless the CR-update is one of the types mentioned above.
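
A scheduler could encode the exception list from item 4 as a simple membership test. The sketch below is an assumption-laden illustration: the set holds a few example mnemonics for each exception type named above (it is not exhaustive), and the names `SLOW_CR_PRODUCERS` and `branch_needs_gap` are hypothetical.

```python
# Example mnemonics for the exception types listed in item 4 (not exhaustive):
# multiply, divide, multiply-accumulate, mtcrf, tlbsx., and stwcx.
SLOW_CR_PRODUCERS = {"mullw.", "divw.", "macchw.", "mtcrf", "tlbsx.", "stwcx."}


def branch_needs_gap(cr_producer):
    """True if a dependent branch should be separated from this CR-update."""
    return cr_producer in SLOW_CR_PRODUCERS
```

For an ordinary compare such as cmpw the test returns False, so the only reason to place an instruction between it and the branch is to enable dual issue, as described above.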
