IBM PPC440X5 Computer Hardware User Manual


 
User’s Manual
PPC440x5 CPU Core Preliminary
September 12, 2002
support direct attachment to 32-bit and 64-bit PLB subsystems, as well as 128-bit PLB subsystems. The DCC
handles frequency synchronization between the PPC440x5 core and the PLB, and can operate at any ratio of
n:1, n:2, and n:3, where n is an integer greater than the corresponding denominator.
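As an illustration of the ratio rule just stated, the following minimal C sketch checks whether a core-to-PLB clock ratio n:d has one of the listed forms. The function name dcc_plb_ratio_supported is hypothetical and is not part of any IBM-provided interface; the check is a literal reading of the sentence above (denominator of 1, 2, or 3, and n greater than the denominator) and does not capture any further constraints the hardware may impose.

```c
#include <stdbool.h>

/* Hypothetical helper (not an IBM API): returns true if the core:PLB
 * clock ratio n:d matches the forms listed in the text, that is
 * n:1, n:2, or n:3 with integer n greater than the denominator d. */
bool dcc_plb_ratio_supported(unsigned n, unsigned d)
{
    if (d < 1u || d > 3u)
        return false;   /* only denominators 1, 2, and 3 are listed */
    return n > d;       /* n must exceed the corresponding denominator */
}
```

For example, under this reading 3:1 and 5:2 are accepted, while 1:1 and 3:3 are rejected.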
The DCC also handles the execution of the PowerPC data cache management instructions, for touching
(prefetching), flushing, invalidating, or zeroing cache lines, or for flash invalidation of the entire cache.
Resources for controlling and debugging the data cache operation are also provided.
The DCC interfaces to the Auxiliary Processor (AP) port to provide direct load/store access to the data cache
for AP load and store operations, as well as for floating-point load and store instructions. AP load and store
instructions can access up to 16 bytes (one quadword) in a single cycle.
Extensive load, store, and flush queues are also provided, such that up to three outstanding line fills, up to
four outstanding load misses, and up to two outstanding line flushes can be pending, with the DCC continuing
to service subsequent load and store hits in an out-of-order fashion.
The rest of this section describes each of these functions in more detail.
4.3.1 DCC Operations
When the DCC executes a load, store, or data cache management instruction, the DCC first translates the
effective address specified by the instruction into a real address (see Memory Management on page 133 for
more information on address translation). Next, the DCC searches the data cache array for the cache line
associated with the real address of the requested data. If the cache line is found in the array (a cache hit),
that cache line is used to satisfy the request, according to the type of operation (load, store, and so on).
If the cache line is not found in the array (a cache miss), the next action depends upon the type of instruction
being executed, as well as the storage attributes of the memory page containing the data being accessed.
For most operations, and assuming the memory page is cacheable (see Caching Inhibited (I) on page 145),
the DCC will send a request for the entire cache line (32 bytes) to the data read PLB interface. The request to
the data read PLB interface is sent using the specific byte address requested by the instruction, so that the
memory subsystem may read the cache line target word first (if it supports such operation) and supply the
specific byte[s] requested before retrieving the rest of the cache line.
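The relationship between the requested byte address and the 32-byte cache line can be sketched in C as follows. This is only an illustrative model: the function names are hypothetical, the 64-bit type is chosen simply to be wide enough for any real address, and the 4-byte word size follows the usual PowerPC convention. The order in which a memory subsystem actually returns the remaining data is not modeled.

```c
#include <stdint.h>

#define DCACHE_LINE_BYTES 32u   /* cache line size stated in the text */

/* Real address of the first byte of the line that contains 'ra'. */
uint64_t dcache_line_base(uint64_t ra)
{
    return ra & ~(uint64_t)(DCACHE_LINE_BYTES - 1u);
}

/* Byte offset of 'ra' within its cache line (0..31). */
unsigned dcache_line_offset(uint64_t ra)
{
    return (unsigned)(ra & (DCACHE_LINE_BYTES - 1u));
}

/* Index of the 4-byte word within the line that holds 'ra' (0..7).
 * A target-word-first memory subsystem would return this word first. */
unsigned dcache_word_in_line(uint64_t ra)
{
    return dcache_line_offset(ra) / 4u;
}
```

For instance, a load from real address 0x1234567B falls in the line starting at 0x12345660, at byte offset 27, in word 6 of the line.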
While the DCC is waiting for a cache line read to complete, it can continue to process subsequent
instructions, and handle those accesses that hit in the data cache. That is, the data cache is completely
non-blocking.
As the DCC receives each portion of the cache line from the data read PLB interface, it is placed into one of
three data cache line fill data (DCLFD) buffers. Data from these buffers may be bypassed to the GPR file to
satisfy load instructions, without waiting for the entire cache line to be filled. Once the entire cache line has
been filled into the buffer, it will be written into the data cache at the first opportunity (either when the data
cache is otherwise idle, or when subsequent operations require that the DCLFD buffer be written to the data
cache).
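The bypass behavior described above can be pictured with a small software model of one line fill buffer. This is a sketch under stated assumptions, not a description of the actual hardware: the type line_fill_buf, the per-byte valid mask, and the function names are all hypothetical, and the size of each portion delivered by the PLB is left as a parameter because the text does not state it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define LINE_BYTES 32u   /* cache line size stated in the text */

/* Hypothetical model of one line fill buffer. */
typedef struct {
    uint32_t valid;              /* bit i set: byte i of the line has arrived */
    uint8_t  data[LINE_BYTES];
} line_fill_buf;

/* Bit mask covering 'len' bytes starting at byte offset 'off'. */
static uint32_t span_mask(unsigned off, unsigned len)
{
    assert(len >= 1u && off + len <= LINE_BYTES);
    uint32_t m = (len == LINE_BYTES) ? 0xFFFFFFFFu : ((1u << len) - 1u);
    return m << off;
}

/* Record the arrival of one portion of the line from the read interface. */
void lfb_receive(line_fill_buf *b, unsigned off, const uint8_t *src, unsigned len)
{
    uint32_t m = span_mask(off, len);
    memcpy(b->data + off, src, len);
    b->valid |= m;
}

/* A load can be satisfied from the buffer as soon as all of the bytes
 * it needs have arrived, without waiting for the rest of the line. */
bool lfb_can_bypass(const line_fill_buf *b, unsigned off, unsigned len)
{
    uint32_t m = span_mask(off, len);
    return (b->valid & m) == m;
}

/* Only a completely filled buffer is eligible to be written to the cache. */
bool lfb_complete(const line_fill_buf *b)
{
    return b->valid == 0xFFFFFFFFu;
}
```

In this model, once the bytes at offsets 16 through 23 have been received, a 4-byte load at offset 16 can be bypassed immediately, while a load that spans offsets 12 through 19 must wait for more data.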
If a memory subsystem error (such as an address time-out, invalid address, or some other type of hardware
error external to the PPC440x5 core) occurs during the filling of the cache line, the line will still be written into
the data cache, and data from the line may still be delivered to the GPR file for load instructions. However,
the DCC will also report a Data Machine Check exception to the instruction unit of the PPC440x5 core, and a
Machine Check interrupt (if enabled) will result. See Machine Check Interrupt on page 178 for more
information on Machine Check interrupts.