Compaq ECQD2KCTE Laptop User Manual


 
Software Considerations A–11
16-bit quotient digit plus a 48-bit new partial dividend. Three more such steps can generate the
full quotient. Having prior knowledge of the possible sizes of the divisor and dividend,
normalizing away leading bytes of zeros, and performing an early-out test can reduce the average
number of multiplies to about five (compared to a best case of one and a worst case of nine).
A.4.3 Byte Swap
When it is necessary to swap all the bytes of a datum, perhaps because the datum originated on
a machine of the opposite byte numbering convention, the simplest sequence is to use the VAX
floating-point load instruction to swap words, followed by an integer sequence to swap four
pairs of bytes. Assume, as shown below, that an aligned quadword datum is in memory at
location X and is to be left in R1 after byte-swapping; temp is an aligned quadword temporary, and
"." (period) in the comments stands for a byte of zeros. Similar sequences can be used for data
in registers, sometimes doing the byte swaps first and word swap second:
; X = ABCD EFGH
LDG F0,X ; F0 = GHEF CDAB
STT F0,temp
LDQ R1,temp ; R1 = GHEF CDAB
SLL R1,#8,R2 ; R2 = HEFC DAB.
SRL R1,#8,R1 ; R1 = .GHE FCDA
ZAP R2,#55(hex),R2 ; R2 = H.F. D.B.
ZAP R1,#AA(hex),R1 ; R1 = .G.E .C.A
OR R1,R2,R1 ; R1 = HGFE DCBA
For bulk swapping of arrays, this sequence can be usefully unrolled about four times and
scheduled, using four different aligned quadword memory temps.
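The data flow of the sequence above can be checked with a small model. The following is a minimal Python sketch, not Alpha code: the helper names swap_words, zap, and byte_swap_quadword are hypothetical, swap_words stands in for only the word-reversing effect of the LDG/STT/LDQ steps, and registers are modeled as unsigned 64-bit integers.

```python
MASK64 = (1 << 64) - 1  # registers are modeled as unsigned 64-bit integers


def swap_words(q):
    """Reverse the four 16-bit words of a quadword (the net effect of LDG/STT/LDQ above)."""
    w0 = q & 0xFFFF
    w1 = (q >> 16) & 0xFFFF
    w2 = (q >> 32) & 0xFFFF
    w3 = (q >> 48) & 0xFFFF
    return (w0 << 48) | (w1 << 32) | (w2 << 16) | w3


def zap(value, mask):
    """Model ZAP: clear byte i of value wherever bit i of the 8-bit mask is set."""
    for i in range(8):
        if mask & (1 << i):
            value &= ~(0xFF << (8 * i))
    return value & MASK64


def byte_swap_quadword(x):
    """Follow the listing step by step; comments use the same ABCD EFGH notation."""
    r1 = swap_words(x)        # R1 = GHEF CDAB
    r2 = (r1 << 8) & MASK64   # R2 = HEFC DAB.
    r1 = r1 >> 8              # R1 = .GHE FCDA
    r2 = zap(r2, 0x55)        # R2 = H.F. D.B.
    r1 = zap(r1, 0xAA)        # R1 = .G.E .C.A
    return r1 | r2            # R1 = HGFE DCBA
```

For example, byte_swap_quadword(0x0102030405060708) reverses all eight bytes of its argument. Mapping the function over a list models the bulk case, although the unrolling and scheduling advice applies only to the real instruction stream.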
A.4.4 Stylized Code Forms
Using the same stylized code form for a common operation improves the readability of
compiler output and increases the likelihood that an implementation will speed up the stylized
form.
A.4.4.1 NOP
The universal NOP form is:
UNOP == LDQ_U R31,0(Rx)
In most implementations, UNOP should encounter no operand issue delays, no destination
issue delay, and no functional unit issue delays. (In some implementations, it may encounter an
operand issue delay for Rx.) Implementations are free to optimize UNOP into no action and
zero execution cycles.
If the actual instruction is encoded as LDQ_U Rn,0(Rx), where n is other than 31, and such an
instruction generates a memory-management exception, it is UNPREDICTABLE whether
UNOP would generate the same exception. On most implementations, UNOP does not
generate memory-management exceptions.
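As an illustration of what a tool might look for when recognizing the stylized form, the following Python sketch builds and tests the 32-bit UNOP encoding. It relies on details not stated in this section and taken as assumptions here: the Alpha memory instruction format (opcode in bits <31:26>, Ra in <25:21>, Rb in <20:16>, displacement in <15:0>) and the LDQ_U opcode value 0B (hex). The function names are hypothetical.

```python
LDQ_U_OPCODE = 0x0B  # assumed opcode for LDQ_U; not given in this section


def encode_memory_format(opcode, ra, rb, disp):
    """Pack a memory-format instruction: opcode<31:26> Ra<25:21> Rb<20:16> disp<15:0>."""
    if not (0 <= opcode <= 0x3F):
        raise ValueError("opcode must fit in 6 bits")
    if not (0 <= ra <= 31 and 0 <= rb <= 31):
        raise ValueError("register numbers must be 0..31")
    if not (-0x8000 <= disp <= 0x7FFF):
        raise ValueError("displacement must fit in 16 signed bits")
    return (opcode << 26) | (ra << 21) | (rb << 16) | (disp & 0xFFFF)


def encode_unop(rx=31):
    """UNOP == LDQ_U R31,0(Rx): destination R31, zero displacement, any base Rx."""
    return encode_memory_format(LDQ_U_OPCODE, 31, rx, 0)


def is_unop(word):
    """True if the word is LDQ_U with Ra = 31 and displacement 0, whatever Rx is."""
    return (
        (word >> 26) & 0x3F == LDQ_U_OPCODE
        and (word >> 21) & 0x1F == 31
        and word & 0xFFFF == 0
    )
```

Under these assumptions, encode_unop(30) yields 2FFE0000 (hex). A load with a destination other than R31, such as LDQ_U R1,0(R30), is deliberately not matched by is_unop, in line with the exception caveat above.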