data cache. If there is a miss in the L2 cache, then the request is passed on to the bus interface unit (BIU) via
three additional L2-to-BIU reload-request queues. Data returned from the bus is loaded into both the
data-cache reload buffer and one of the L2 reload buffers, and the critical word is forwarded to the
load/store unit.
A dedicated snoop copyback queue has been added, which enables a fifth transaction to pipeline on the bus.
It supports enveloped write transactions with the assertion of DBWO. All snoop copybacks are issued from
this queue.
A maximum of four reloads can be in progress through the L2 cache. The instruction cache requests only
one reload at a time, and the data cache can request up to four; thus, there can be at most one
instruction-cache reload plus three data-cache reloads, or four data-cache reloads.
An example of one-level address pipelining is shown in Figure 8-5. Note that to support address
pipelining, the memory system must be able to complete the first data tenure without requiring the first
address to remain on the bus, and possibly can also queue the second address to maximize parallelism on
the bus.
8.2.2.1 Miss-under-Miss and System Performance
The MuM feature allows loads and stores that miss in the L1 cache to continue to the L2 cache, even though
the L1 cache is busy reloading a prior miss. Hence the name, miss-under-miss (MuM). If MuM requests also
miss in the L2 cache, they will proceed to the 60x bus in a pipelined fashion. A performance benefit is realized
when pipelining on the 60x bus because the penalty for large memory latency only occurs with the first
memory access. The greatest performance advantage is achieved if MuM requests can be sustained for as
long as possible. Load/store instruction sequences affect how much benefit the MuM feature will produce.
The best sequence is a series of load instructions that each reference a different cache-line index (EA[20:26]).
Blocks of memory can be efficiently loaded into the data cache with tight loops that increment the address by
x'20', as in the following example.
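A minimal sketch of such a loop appears below. It is illustrative rather than verbatim from this manual:
the register assignments (r3 = starting address, r4 = line count, r5 = scratch) and the count of 64 lines
are assumptions. Because a cache line is 32 bytes, each x'20' increment advances the cache-line index
EA[20:26] by one, so successive loads miss to different lines.

        # Illustrative only: registers and the line count are assumed, not from the manual.
                li      r4,64         # number of 32-byte cache lines to load
                mtctr   r4            # place the loop count in the count register
        loop:   lwz     r5,0(r3)      # load a word from the line; a miss starts a reload
                addi    r3,r3,0x20    # advance the address by x'20' (one cache line)
                bdnz    loop          # decrement CTR and branch until it reaches zero

Each load that misses can be issued under the preceding miss, so up to four reloads can overlap on the
60x bus, subject to the BIU queue limits described above.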
The interface from the load/store unit (L/S) to the data cache has two request lines, one for the normal
request path and one for the MuM request path. MuM can serially request up to three additional loads
(hold, Eib0, and Eib1), but the address queues actually reside in the BIU, which can hold up to four
loads. MuM is throttled by other events, such as a full 3-entry store queue in the L/S.
Figure 8-5. First Level Address Pipelining
[Timing diagram of the 60x bus signals BG, ABB, AACK, DBG, and DBB, showing two pipelined tenures
(Addr #1/Addr #2 and Data #1/Data #2). (1) and (2) indicate masters 1 and 2, or transactions 1 and 2
from the same master.]