Intel IA-32 Computer Accessories User Manual


 
Hardware Prefetch ......................................................................................................... 6-19
Example of Effective Latency Reduction with H/W Prefetch .......................................... 6-20
Example of Latency Hiding with S/W Prefetch Instruction ............................................ 6-22
Software Prefetching Usage Checklist........................................................................... 6-24
Software Prefetch Scheduling Distance ......................................................................... 6-25
Software Prefetch Concatenation................................................................................... 6-26
Minimize Number of Software Prefetches...................................................................... 6-29
Mix Software Prefetch with Computation Instructions .................................................... 6-32
Software Prefetch and Cache Blocking Techniques....................................................... 6-34
Hardware Prefetching and Cache Blocking Techniques ................................................ 6-39
Single-pass versus Multi-pass Execution....................................................................... 6-41
Memory Optimization using Non-Temporal Stores................................................................ 6-43
Non-temporal Stores and Software Write-Combining..................................................... 6-43
Cache Management....................................................................................................... 6-44
Video Encoder .......................................................................................................... 6-45
Video Decoder.......................................................................................................... 6-45
Conclusions from Video Encoder and Decoder Implementation .............................. 6-46
Optimizing Memory Copy Routines.......................................................................... 6-46
TLB Priming.............................................................................................................. 6-47
Using the 8-byte Streaming Stores and Software Prefetch....................................... 6-48
Using 16-byte Streaming Stores and Hardware Prefetch ......................................... 6-50
Performance Comparisons of Memory Copy Routines ............................................ 6-52
Deterministic Cache Parameters .......................................................................................... 6-53
Cache Sharing Using Deterministic Cache Parameters................................................. 6-55
Cache Sharing in Single-core or Multi-core.................................................................... 6-55
Determine Prefetch Stride Using Deterministic Cache Parameters ............................... 6-56
Chapter 7 Multi-Core and Hyper-Threading Technology
Performance and Usage Models............................................................................................. 7-2
Multithreading................................................................................................................... 7-2
Multitasking Environment ................................................................................................. 7-4
Programming Models and Multithreading ............................................................................... 7-6
Parallel Programming Models .......................................................................................... 7-7
Domain Decomposition............................................................................................... 7-7
Functional Decomposition................................................................................................ 7-8
Specialized Programming Models.................................................................................... 7-8
Producer-Consumer Threading Models.................................................................... 7-10
Tools for Creating Multithreaded Applications................................................................ 7-14
Optimization Guidelines ........................................................................................................ 7-16
Key Practices of Thread Synchronization ...................................................................... 7-16