Compaq EV67 Network Card User Manual


 
Alpha 21264/EV67 Hardware Reference Manual
Internal Architecture 2–25
Special Cases of Alpha Instruction Execution
Figure 2–9 Pipeline Timing for Integer Load Instructions
There are two cycles in which the IQ may speculatively issue instructions that use load
data before Dcache hit information is known. Any instructions that are issued by the IQ
within this 2-cycle speculative window are kept in the IQ with their requests inhibited
until the load instruction’s hit condition is known, even if they are not dependent on the
load operation. If the load instruction hits, then these instructions are removed from the
queue. If the load instruction misses, then the execution of these instructions is aborted
and the instructions are allowed to request service again.
For example, in Figure 2–9, instruction 1 and instruction 2 are issued within the specu-
lative window of the load instruction. If the load instruction hits, then both instructions
will be deleted from the queue by the start of cycle 7—one cycle later than normal for
instruction 1 and at the normal time for instruction 2. If the load instruction misses, both
instructions are aborted from the execution pipelines and may request service again in
cycle 6.
IQ-issued instructions are aborted if issued within the speculative window of an integer
load instruction that missed in the Dcache, even if they are not dependent on the load
data. However, if software misses are likely, the 21264/EV67 can still benefit from
scheduling the instruction stream for Dcache miss latency. The 21264/EV67 includes a
saturating counter that is incremented when load instructions hit and is decremented
when load instructions miss. When the upper bit of the counter equals zero, the integer
load latency is increased to five cycles and the speculative window is removed. The
counter is 4 bits wide and is incremented by 1 on a hit and is decremented by two on a
miss.
Since load instructions to R31 do not produce a result, they do not create a speculative
window when they execute and, therefore, never waste IQ-issue cycles if they miss.
Floating-point load instructions that hit in the Dcache have a latency of four cycles. Fig-
ure 2–10 shows the pipeline timing for floating-point load instructions. In Figure 2–10:
Symbol Meaning
Q Issue queue
R Register file read
E Execute
D Dcache access
B Data bus active
1Cycle Number
ILD
Instruction 1
Instruction 2
2 3 4 5 6 7 8
QREDB
QR
Q
Hit
FM-05814.AI4