IBM SA14-2339-04 Personal Computer User Manual

Code Optimization and Instruction Timings C-5

Table C-2 summarizes the multiply and MAC instruction timings. In the table, the syntax “[o]” indicates that the instruction has an “o” form that updates XER[SO,OV], and a “non-o” form. The syntax “[.]” indicates that the instruction has a “record” form that updates CR[CR0], and a “non-record” form.
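The bracket notation can be unpacked mechanically. The following is a minimal sketch, not part of the manual: the helper name `expand_mnemonic` is hypothetical, and it assumes each table entry carries at most one “[o]” and one “[.]” marker, as the entries in Table C-2 do.

```python
import re


def expand_mnemonic(pattern: str) -> list:
    """Expand an entry such as 'mullw[o][.]' into its concrete mnemonics.

    '[o]' means an optional 'o' suffix (the form that updates XER[SO,OV]);
    '[.]' means an optional '.' suffix (the record form that updates CR[CR0]).
    """
    base = re.sub(r"\[[o.]\]", "", pattern)  # strip the optional markers
    o_forms = ["", "o"] if "[o]" in pattern else [""]
    dot_forms = ["", "."] if "[.]" in pattern else [""]
    return [base + o + dot for o in o_forms for dot in dot_forms]


for entry in ("mullw[o][.]", "mulhw[.]"):
    print(entry, "->", expand_mnemonic(entry))
```

Under these assumptions, “mullw[o][.]” expands to the four forms mullw, mullw., mullwo, and mullwo., while “mulhw[.]” expands to mulhw and mulhw. only.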

C.2.4 Scalar Load Instructions

Generally, the PPC405 executes cachable load instructions that hit in the data cache array or line fill buffer, or noncachable load instructions that hit in the line fill buffer (when enabled), in one cycle. However, because load instructions are pipelined, even loads that hit in the cache or line fill buffer can appear to take extra cycles under some conditions.

If a load is followed by an instruction that uses the load target as an operand, a load-use dependency exists. When the load target is returned, it is forwarded to the operand register of the “using” instruction. This forwarding results in an additional cycle of latency to a load immediately followed by a “using” instruction, causing the load to appear to execute in two cycles.
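A load-use dependency as defined above can be checked from register operands alone. The sketch below is illustrative only: the `Instr` record and the `is_load_use` helper are hypothetical names, and the instruction encoding is reduced to a target register plus a tuple of source registers.

```python
from typing import NamedTuple, Optional, Tuple


class Instr(NamedTuple):
    mnemonic: str
    target: Optional[str]      # register written by the instruction, if any
    sources: Tuple[str, ...]   # registers read by the instruction


def is_load_use(load: Instr, following: Instr) -> bool:
    """Return True if `following` reads the register that `load` writes."""
    return load.target is not None and load.target in following.sources


lwz = Instr("lwz", "r3", ("r4",))        # lwz  r3,0(r4)
add = Instr("add", "r5", ("r3", "r6"))   # add  r5,r3,r6  (reads r3)
addi = Instr("addi", "r7", ("r8",))      # addi r7,r8,1   (does not read r3)

print(is_load_use(lwz, add))    # the add is a "using" instruction
print(is_load_use(lwz, addi))   # the addi is a "non-using" instruction
```

Here the `add` reads r3, the target of the `lwz`, so the pair forms a load-use dependency; the `addi` does not.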

To improve cache-to-core timing or data-side on-chip memory (OCM)-to-core timing, the system designer can disable operand forwarding from the data cache unit (DCU) or OCM to the core. When operand forwarding is disabled, the load data needed by the “using” instruction is first placed in an intermediate latch and only then forwarded to the operand register of the “using” instruction. This introduces two additional cycles of latency to a load immediately followed by a “using” instruction, causing the load instruction to appear to execute in three cycles.

Because the PPC405 can execute instructions that follow load misses if no load-use dependency exists, the load and the “using” instruction should be separated by two “non-using” instructions when possible. If only one instruction can be placed between the load and the “using” instruction, the load appears to execute in two cycles.
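The cycle counts quoted in this section can be collected into a small model. This is a simplified sketch under stated assumptions, not a description of the actual pipeline: it covers only loads that hit, it assumes each intervening “non-using” instruction hides exactly one cycle of the forwarding latency, and the names `apparent_load_cycles` and `gap` are hypothetical.

```python
def apparent_load_cycles(gap: int, forwarding_enabled: bool = True) -> int:
    """Apparent cycles for a load that hits, as seen by its "using" instruction.

    gap: number of "non-using" instructions between the load and the
         first instruction that uses the load target.
    forwarding_enabled: False models operand forwarding from the DCU or
         OCM being disabled (data passes through an intermediate latch).
    """
    if gap < 0:
        raise ValueError("gap must be non-negative")
    # Extra latency quoted in the text: 1 cycle with forwarding enabled,
    # 2 cycles when forwarding is disabled.
    extra = 1 if forwarding_enabled else 2
    # Assumption: each intervening instruction hides one cycle of it.
    return 1 + max(0, extra - gap)


for enabled in (True, False):
    for gap in range(3):
        cycles = apparent_load_cycles(gap, enabled)
        print(f"forwarding={enabled!s:5} gap={gap} -> {cycles} cycle(s)")
```

With forwarding disabled, the model reproduces the figures in the text: three cycles when the “using” instruction follows immediately, two cycles with one intervening instruction, and one cycle with two. The one-cycle result for forwarding enabled with a gap of one or more is an inference from the text rather than a quoted figure.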

Table C-2. Multiply and MAC Instruction Timing

Operation                                                   Reissue Rate Cycles    Latency Cycles

MAC (MAC and negative MAC instructions)

Halfword × Halfword (mullhw[.], mullhwu[.], mulhhw[.], mulhhwu[.], mulchw[.], mulchwu[.], mulli[.], mullw[o][.], mulhw[.], mulhwu[.])

Halfword × Word (mulli[.], mullw[o][.], mulhw[.], mulhwu[.])

Word × Word (mullw[o][.], mulhw[.], mulhwu[.])
