IA-32 Intel® Architecture Optimization
In some situations, the byte count of the data to operate on is known
from context rather than from a parameter passed by a caller. In that
case, one can take a simpler approach than a general-purpose library
routine requires. For example, if the byte count is also small, using rep
movsb/stosb with a count of less than four can ensure good address
alignment, and finishing the remaining data with loop-unrolled
movsd/stosd can reduce the overhead associated with iteration.
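The approach described above can be sketched in C. This is only an
illustration, not the manual's own code: the function name copy19, the
helper move_dword, and the fixed count of 19 bytes are assumptions
chosen for the example. Byte moves stand in for rep movsb with a count
below four, and the four explicit doubleword moves stand in for
unrolled movsd.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical example: the byte count (19) is known from context. */
enum { COPY_LEN = 19 };

/* Move one doubleword; memcpy keeps the access well defined in C. */
static void move_dword(unsigned char *dst, const unsigned char *src)
{
    uint32_t w;
    memcpy(&w, src, 4);
    memcpy(dst, &w, 4);
}

static void copy19(unsigned char *dst, const unsigned char *src)
{
    assert(dst != NULL && src != NULL);
    size_t n = COPY_LEN;

    /* Head: at most three byte moves bring dst to a 4-byte boundary
       (the role played by rep movsb with a count below four). */
    size_t head = (size_t)(-(uintptr_t)dst & 3u);
    for (size_t i = 0; i < head; i++)
        *dst++ = *src++;
    n -= head;

    /* Body: 16 to 19 bytes remain here, so exactly four doubleword
       moves fit; they are written out with no loop counter to maintain. */
    move_dword(dst,      src);
    move_dword(dst + 4,  src + 4);
    move_dword(dst + 8,  src + 8);
    move_dword(dst + 12, src + 12);
    dst += 16;
    src += 16;
    n -= 16;

    /* Tail: zero to three leftover bytes. */
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
}
```

Because the count is fixed, the number of doubleword moves is known in
advance, which is what allows the body to be fully unrolled. A
general-purpose routine would need a loop and extra checks instead.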
Using a REP prefix with string move instructions can provide high
performance in the situations described above. However, using a REP
prefix with string scan instructions (scasb, scasw, scasd, scasq) or
compare instructions (cmpsb, cmpsw, cmpsd, cmpsq) is not
recommended for high performance. Consider using SIMD instructions
instead.
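As a rough illustration of the SIMD alternative to a rep scasb byte
scan, the sketch below uses SSE2 intrinsics to compare 16 bytes at a
time. The function name find_byte is an assumption made for this
example, and the code falls back to a plain byte loop when SSE2 is not
available at compile time; it is not a tuned implementation.

```c
#include <assert.h>
#include <stddef.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper: index of the first byte equal to c in buf[0..n),
   or n if no such byte exists. */
static size_t find_byte(const unsigned char *buf, size_t n, unsigned char c)
{
    assert(buf != NULL || n == 0);
    size_t i = 0;
#if defined(__SSE2__)
    const __m128i needle = _mm_set1_epi8((char)c);
    for (; i + 16 <= n; i += 16) {
        /* Unaligned 16-byte load, then a byte-wise equality compare. */
        __m128i chunk = _mm_loadu_si128((const __m128i *)(buf + i));
        int mask = _mm_movemask_epi8(_mm_cmpeq_epi8(chunk, needle));
        if (mask != 0) {
            /* The lowest set bit marks the first matching byte. */
            size_t bit = 0;
            while (((mask >> bit) & 1) == 0)
                bit++;
            return i + bit;
        }
    }
#endif
    /* Scalar loop for the remaining bytes (or for the whole buffer
       when SSE2 is not enabled). */
    for (; i < n; i++)
        if (buf[i] == c)
            return i;
    return n;
}
```

A call such as find_byte(buf, len, 0) then plays the role of a rep
scasb search for a terminating zero within a buffer of known length.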
Address Calculations
Use the addressing modes to compute addresses rather than
general-purpose arithmetic instructions. Internally, memory reference
instructions can have four operands:
• relocatable load-time constant
• immediate constant
• base register
• scaled index register
In the segmented model, a segment register may constitute an additional
operand in the linear address calculation. In many cases, several integer
instructions can be eliminated by fully using the operands of memory
references.
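The four operands listed above can be seen in ordinary C array and
structure accesses. The sketch below is illustrative only: the Record
type, g_table, and the function names are assumptions for this
example, and whether a given compiler folds each access into a single
memory reference depends on the compiler and its options.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record layout used only for this example. */
typedef struct {
    int32_t tag;
    int32_t vals[8];
} Record;

/* Static storage: its address is a relocatable load-time constant. */
static Record g_table[4];

/* Base register (r) + scaled index (i * 4) + immediate constant
   (offset of vals). A compiler may encode this as one load such as
   mov eax, [ebx + esi*4 + 4]. */
static int32_t load_val(const Record *r, size_t i)
{
    assert(r != NULL && i < 8);
    return r->vals[i];
}

/* Relocatable constant (g_table) + immediate constant (the offset of
   g_table[1].vals) + scaled index (i * 4). */
static int32_t load_global(size_t i)
{
    assert(i < 8);
    return g_table[1].vals[i];
}

/* The same address built with explicit integer arithmetic, which is
   the work the addressing mode can absorb. */
static uintptr_t manual_address(const Record *r, size_t i)
{
    return (uintptr_t)r + offsetof(Record, vals) + i * sizeof(int32_t);
}
```

Writing the access as r->vals[i] leaves the compiler free to use the
addressing mode directly, while manual_address spells out the separate
add and multiply steps that would otherwise be issued as integer
instructions.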