Compaq ECQD2KCTE Laptop User Manual


 
Appendix A
Software Considerations
A.1 Hardware-Software Compact
The Alpha architecture, like all RISC architectures, depends on careful attention to data
alignment and instruction scheduling to achieve high performance.
Since there will be various implementations of the Alpha architecture, it is not obvious how
compilers can generate high-performance code for all implementations. This appendix gives
some scheduling guidelines that, if followed by all compilers and respected by all
implementations, will result in good performance. As such, this section represents a good-faith
compact between hardware designers and software writers. It represents a set of common goals,
not a set of architectural requirements, which is why it appears as an appendix rather than a
chapter.
Many of the performance optimizations discussed below provide an advantage only for
frequently executed code. For rarely executed code, they may produce a bigger program that is
not any faster. Some of the branching optimizations also depend on good prediction of which
path from a conditional branch is more frequently executed. These optimizations are best
determined by using an execution profile: either an estimate generated by compiler heuristics or
a real profile of a previous run, such as that gathered by PC-sampling in PCA.
Each computer architecture has a "natural word size." For the PDP-11, it is 16 bits; for VAX,
32 bits; and for Alpha, 64 bits. Other architectures also have a natural word size that varies
between 16 and 64 bits. Except for very low-end implementations, ALU data paths, cache
access paths, chip pin buses, and main memory data paths are all usually the natural word size.
As an architecture becomes commercially successful, high-end implementations inevitably
move to double-width data paths that can transfer an aligned (at an even natural word address)
pair of natural words in one cycle. For Alpha, this means 128-bit wide data paths will
eventually be implemented. It is difficult to get much speed advantage from paired transfers
unless the code being executed has instructions and data appropriately aligned on octaword (128-bit)
boundaries. Since this is difficult to retrofit to old code, the following sections sometimes
encourage "over-aligning" to octaword boundaries in anticipation of high-speed Alpha
implementations.
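
As an illustration of what "over-aligning" data to octaword boundaries might look like in
source code, the following is a minimal C sketch, not something taken from the Alpha
documentation. It assumes a C11 compiler and library; the names OCTAWORD_BYTES,
is_octaword_aligned, round_up_octaword, alloc_quadwords_octaword_aligned, and pair_buffer
are hypothetical and chosen only for this example.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumption for this sketch: an octaword is 128 bits, i.e. 16 bytes. */
#define OCTAWORD_BYTES 16u

/* A pair of 64-bit natural words (quadwords) fills exactly one octaword. */
static_assert(2 * sizeof(uint64_t) == OCTAWORD_BYTES,
              "two quadwords should make one octaword");

/* Returns nonzero when p lies on an octaword boundary. */
static int is_octaword_aligned(const void *p) {
    return ((uintptr_t)p % OCTAWORD_BYTES) == 0;
}

/* Rounds a byte count up to the next multiple of an octaword. */
static size_t round_up_octaword(size_t n) {
    return (n + OCTAWORD_BYTES - 1) / OCTAWORD_BYTES * OCTAWORD_BYTES;
}

/* Statically allocated pair of quadwords, over-aligned to an octaword
   boundary so that a double-width data path could move it in one transfer. */
static alignas(16) uint64_t pair_buffer[2];

/* Heap allocation of `count` quadwords starting on an octaword boundary.
   C11 aligned_alloc expects the size to be a multiple of the alignment,
   so the byte count is rounded up first. The caller releases the memory
   with free(). */
static uint64_t *alloc_quadwords_octaword_aligned(size_t count) {
    size_t bytes = round_up_octaword(count * sizeof(uint64_t));
    return aligned_alloc(OCTAWORD_BYTES, bytes);
}
```

Rounding the allocation size up costs at most one extra quadword per buffer, which is the
kind of small space overhead that over-aligning trades for the possibility of paired
transfers on wider implementations.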