IA-32 Intel® Architecture Optimization
Index-6
R
reciprocal instructions, 5-2
rounding control option, A-6
S
sampling
    event-based, A-10
Self-modifying code, 2-47
SFENCE Instruction, 6-15, 6-16
signed unpack, 4-7
SIMD integer code, 4-2
SIMD-floating-point code, 5-1
simplified 3D geometry pipeline, 6-22
simplified clipping to an arbitrary signed range, 4-28
single-pass versus multi-pass execution, 6-41
smart cache, 1-31
SoA format, 3-29
software write-combining, 6-43
spin loops, 9-9
spread prefetch, 6-33
Stack Alignment
    Example of dynamic, 2-43
Stack alignment, 2-42
stack alignment, 3-22
stack frame, D-2
stack frame optimization, D-9
state transitions, 9-2
static branch prediction algorithm, 2-20
static power, 9-1
static prediction, 2-19
static prediction algorithm, 2-19
streaming stores
    coherent requests, 6-13
    non-coherent requests, 6-13
strip mining, 3-32, 3-34
strip-mining, 6-37, 6-38
Structs
    Aligning, 2-39
swizzling data. See data swizzling.
System Bus Optimization, 7-33
T
targeting a processor option, A-3
time-based sampling, A-9
time-consuming innermost loops, 6-7
TLB. See translation lookaside buffer
translation lookaside buffer, 6-47
transcendental functions, 2-72
transfer latency, E-7, E-9
U
unpack instructions, 4-11
unsigned unpack, 4-6
using MMX code for copy or shuffling functions, 5-17
V
vector class library, 3-17
vectorization, 3-12
vectorized code, 3-18
vectorizer switch options, A-5
vertical versus horizontal computation, 5-5
VTune analyzer, 3-10, A-1
VTune Performance Analyzer, 3-10
W
write-combining buffer, 6-43
write-combining memory, 6-43