Fujitsu SPARC64 V Computer Accessories User Manual


 
30 SPARC JPS1 Implementation Supplement: Fujitsu SPARC64 V Release 1.0, 1 July 2002
SPARC64 V implements JMPL and CALL return prediction hardware in the form of a
special stack called the Return Address Stack (RAS). Whenever a CALL, or a JMPL that
writes to %o7 (r[15]), occurs, SPARC64 V pushes the return address (PC+8) onto
the RAS. When either of the synthetic instructions retl (JMPL [%o7+8]) or ret (JMPL
[%i7+8]) is subsequently executed, the return address is predicted to be the
address stored on the top of the RAS, and the RAS is popped. If the prediction from
the RAS is incorrect, SPARC64 V backs up and starts issuing instructions from the
correct target address. This backup takes a few extra cycles.
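The push and pop behavior described above can be sketched as a small software model. This is an illustration only, not a description of the actual hardware: the class name ReturnAddressStack, the method names, the depth of 4 entries, and the example addresses are all assumptions made for the sketch. The second return models a callee that alters %o7 before executing retl, so the actual target no longer matches the pushed address.

```python
class ReturnAddressStack:
    """Toy model of a return address predictor (hypothetical, not the real hardware)."""

    def __init__(self, depth=4):
        # depth is an assumed value; the text only says the RAS is limited in size
        self.depth = depth
        self.entries = []

    def on_call(self, pc):
        """A CALL, or a JMPL that writes %o7, pushes the return address PC+8."""
        if len(self.entries) == self.depth:
            self.entries.pop(0)  # the oldest entry is lost when the stack is full
        self.entries.append(pc + 8)

    def on_return(self, actual_target):
        """ret/retl pops the top entry; returns True if the prediction was correct."""
        predicted = self.entries.pop() if self.entries else None
        return predicted == actual_target


def demo():
    ras = ReturnAddressStack()
    ras.on_call(0x1000)                  # code at 0x1000 calls f
    ras.on_call(0x2000)                  # f, at 0x2000, calls g
    normal = ras.on_return(0x2008)       # g returns to 0x2000 + 8: predicted
    # f returns to an address other than 0x1000 + 8 (a nonstandard return)
    nonstandard = ras.on_return(0x1010)
    return normal, nonstandard


print(demo())  # (True, False)
```

In this model the nonstandard return is the only one that would incur the backup penalty.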
Programming Note: For maximum performance, software and compilers must
take into account how the RAS works. For example, tricks that do nonstandard
returns in hopes of boosting performance may require more cycles if they cause the
wrong RAS value to be used for predicting the address of the return. Heavily nested
calls can also cause earlier entries in the RAS to be overwritten by newer entries,
since the RAS has only a limited number of entries. Eventually, some return
addresses will be mispredicted because of the overflow of the RAS.
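The overflow effect in the note can be illustrated by counting mispredicted returns for a chain of nested calls. This is a rough sketch under stated assumptions: the function name count_mispredicts, the depth of 4 entries, and the call-site addresses are invented for illustration, and a prediction made from an empty stack is simply counted as wrong, which may differ from what the hardware does.

```python
from collections import deque


def count_mispredicts(nesting, depth):
    """Count wrong return predictions for `nesting` nested calls (toy model)."""
    # Appending to a full deque with maxlen silently drops the oldest entry,
    # which models newer entries overwriting earlier ones.
    ras = deque(maxlen=depth)
    call_sites = [0x1000 + 0x100 * i for i in range(nesting)]  # hypothetical PCs

    for pc in call_sites:
        ras.append(pc + 8)  # each call pushes its return address, PC+8

    mispredicts = 0
    for pc in reversed(call_sites):  # returns unwind in reverse call order
        predicted = ras.pop() if ras else None
        if predicted != pc + 8:
            mispredicts += 1
    return mispredicts


print(count_mispredicts(nesting=3, depth=4))   # 0
print(count_mispredicts(nesting=10, depth=4))  # 6
```

With nesting no deeper than the assumed stack depth, every return is predicted; beyond that, each return whose entry was overwritten is mispredicted.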
6.3.7 Floating-Point Operate (FPop) Instructions
The complete conditions for generating an fp_exception_other exception with
FSR.ftt = unfinished_FPop are described in Section B.6, Floating-Point Nonstandard
Mode on page 61.
The SPARC64 V-specific FMADD and FMSUB instructions (described below) are also
floating-point operations. They require the floating-point unit to be enabled;
otherwise, an fp_disabled trap is generated. They also affect the FSR, like FPop
instructions. However, these instructions are not included in the FPop category and,
hence, reserved encodings in these opcodes generate an illegal_instruction
exception, as defined in Section 6.3.9 of Commonality.
6.3.8 Implementation-Dependent Instructions
SPARC64 V uses the IMPDEP2 instruction to implement the Floating-Point
Multiply-Add/Subtract and Negative Multiply-Add/Subtract instructions; these have an
op3 field = 37₁₆ (IMPDEP2). See Floating-Point Multiply-Add/Subtract on page 50 for
fuller definitions of these instructions. Opcode space is reserved in IMPDEP2 for the
quad-precision forms of these instructions. However, SPARC64 V does not currently
implement the quad-precision forms, and the processor generates an
illegal_instruction exception if a quad-precision form is specified. Since these
instructions are not part of the required SPARC V9 architecture, the operating system
does not supply software emulation routines for the quad versions of these
instructions.
SPARC64 V uses the IMPDEP1 instruction to implement the graphics acceleration
instructions.
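As an illustration of how the op3 field selects these opcode spaces, the following sketch classifies a 32-bit instruction word. The value 37₁₆ for IMPDEP2 comes from the text above; the field positions (op in bits 31:30, op3 in bits 24:19), the requirement that op = 2, and the value 36₁₆ for IMPDEP1 are taken from the SPARC V9 instruction formats rather than from this section. The function name classify and the sample words are hypothetical.

```python
OP3_IMPDEP1 = 0x36  # per SPARC V9; graphics instructions on SPARC64 V
OP3_IMPDEP2 = 0x37  # per the text; multiply-add/subtract on SPARC64 V


def classify(word):
    """Return which implementation-dependent opcode space a word falls in."""
    op = (word >> 30) & 0x3     # bits 31:30
    op3 = (word >> 19) & 0x3F   # bits 24:19
    if op != 2:
        return "other"
    if op3 == OP3_IMPDEP1:
        return "IMPDEP1"
    if op3 == OP3_IMPDEP2:
        return "IMPDEP2"
    return "other"


# Sample words with only op and op3 set; all other fields are zero.
print(classify((2 << 30) | (0x37 << 19)))  # IMPDEP2
print(classify((2 << 30) | (0x36 << 19)))  # IMPDEP1
print(classify((2 << 30) | (0x34 << 19)))  # other
```

The sketch does not decode the remaining fields, so it cannot tell an implemented form from a reserved or quad-precision encoding.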