A–8 Alpha Architecture Handbook
tion addresses differ, and if they do not, inserting up to 8K bytes of padding between the
arrays. This rule will avoid thrashing in direct-mapped TBs and in some large direct-mapped
data caches with total sizes of 32 pages (256 KB) or more.
Usually, this padding will mean zero extra bytes in the executable image, just a skip in virtual
address space to the next-higher page boundary.
For large caches, the rule above should be applied to the I-stream, in addition to all the
D-stream references. Some implementations will have combined I-stream/D-stream large
caches.
Both of the rules above can be satisfied simultaneously, thus often eliminating thrashing in all
anticipated direct-mapped cache/TB implementations.
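
The padding rule can be sketched in C as follows. This is a minimal illustration, not the handbook's own method: it assumes the 8 KB page and 32-page (256 KB) direct-mapped cache quoted above, and the names page_slot, pad_to_next_page, and place_second_array are hypothetical helpers introduced only for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed parameters, taken from the figures quoted in the text. */
#define PAGE_SIZE   8192u   /* 8 KB page */
#define CACHE_PAGES 32u     /* 32 pages = 256 KB direct-mapped cache */

/* Page-sized slot that an address maps to in a direct-mapped
   structure holding CACHE_PAGES pages. */
static unsigned page_slot(uint64_t addr)
{
    return (unsigned)((addr / PAGE_SIZE) % CACHE_PAGES);
}

/* Bytes to skip so that addr reaches the next-higher page boundary
   (0 if addr is already on a boundary). */
static uint64_t pad_to_next_page(uint64_t addr)
{
    return (PAGE_SIZE - addr % PAGE_SIZE) % PAGE_SIZE;
}

/* Pick a base for a second array at or after candidate so that it
   does not share a slot with first_base. At most PAGE_SIZE bytes
   of padding are inserted. */
static uint64_t place_second_array(uint64_t first_base, uint64_t candidate)
{
    assert(PAGE_SIZE != 0 && CACHE_PAGES > 1);
    if (page_slot(first_base) != page_slot(candidate))
        return candidate;               /* slots already differ */
    uint64_t pad = pad_to_next_page(candidate);
    if (pad == 0)
        pad = PAGE_SIZE;                /* on a boundary: skip one full page */
    return candidate + pad;
}
```

Because the padding only moves the second array to the next page boundary, it normally costs virtual address space rather than bytes in the executable image.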
A.3.4 Sequential Read/Write — Factor of 1
All other things being equal, sequences of consecutive reads or writes should use ascending
(rather than descending) memory addresses. Where possible, the memory address for a block
of 2**K bytes should be on a 2**K boundary, since this minimizes the number of different
cache blocks used and minimizes the number of partially written cache blocks.
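
The effect of alignment on the number of cache blocks used can be checked with a short C sketch. It assumes a 64-byte cache block (the boundary mentioned later in this section); blocks_touched and align_up_pow2 are hypothetical helper names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define CACHE_BLOCK 64u  /* assumed cache block size in bytes */

/* Number of distinct cache blocks covered by [addr, addr + size). */
static uint64_t blocks_touched(uint64_t addr, uint64_t size)
{
    assert(size > 0);
    uint64_t first = addr / CACHE_BLOCK;
    uint64_t last  = (addr + size - 1) / CACHE_BLOCK;
    return last - first + 1;
}

/* Round addr up to the next multiple of 2**k. */
static uint64_t align_up_pow2(uint64_t addr, unsigned k)
{
    assert(k < 64);
    uint64_t mask = ((uint64_t)1 << k) - 1;
    return (addr + mask) & ~mask;
}
```

Under these assumptions, a 128-byte block starting at address 8 spans three cache blocks, two of them only partially covered, while the same block moved to the next 128-byte boundary spans exactly two.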
To avoid overrunning memory bandwidth, sequences of more than eight quadword load or
store instructions should be broken up with intervening instructions (if there is any useful work
to be done).
For consecutive reads, implementors should give first priority to prefetching ascending cache
blocks and second priority to absorbing up to eight consecutive quadword load instructions
(aligned on a 64-byte boundary) without stalling.
For consecutive writes, implementors should give first priority to avoiding read overhead for
fully written aligned cache blocks and second priority to absorbing up to eight consecutive
quadword store instructions (aligned on a 64-byte boundary) without stalling.
A.3.5 Prefetching — Factor of 3
Prefetching can be directed toward a cache block (a cache line) in the primary cache.
Alpha hardware, beginning with the 21164 (EV5), supports cache block prefetching.
A cache block prefetch is performed by any of the following load operations to the R31
or F31 register:
Table A–1: Cache Block Prefetching

Type             Instructions        Operation
Normal Prefetch  LDL R31, xxx (Rn)   If the load operation hits in the Dcache, the
                                     instruction is dismissed; otherwise, the
                                     addressed cache block is allocated into the
                                     Dcache.
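
From C, a prefetch hint is usually requested through a compiler builtin rather than written as a load to R31. The sketch below uses __builtin_prefetch, a GCC/Clang extension. Whether a given compiler lowers it to one of the load forms in Table A–1 is an assumption not confirmed here, and the 64-byte block size, the PREFETCH_AHEAD distance, and the function name sum_with_prefetch are hypothetical choices made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define QUADS_PER_BLOCK 8  /* eight 8-byte quadwords per assumed 64-byte cache block */
#define PREFETCH_AHEAD  4  /* hypothetical prefetch distance, in cache blocks */

/* Sum an array in ascending address order, issuing one prefetch hint
   per cache block for data a few blocks ahead of the current position. */
static uint64_t sum_with_prefetch(const uint64_t *data, size_t n)
{
    assert(data != NULL || n == 0);
    uint64_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        if (i % QUADS_PER_BLOCK == 0) {
            size_t ahead = i + (size_t)PREFETCH_AHEAD * QUADS_PER_BLOCK;
            if (ahead < n)
                __builtin_prefetch(&data[ahead]);  /* hint only; never changes the result */
        }
        sum += data[i];
    }
    return sum;
}
```

The prefetch is purely a performance hint, so the function returns the same sum with or without it. A suitable prefetch distance depends on memory latency and the work done per element, and would need to be measured on the target machine.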