Intel IA-32 Computer Accessories User Manual


 
IA-32 Intel® Architecture Optimization
Use Available Performance Tools
• Use a current-generation compiler, such as the Intel C++ Compiler:
  - Set the compiler to produce code for the target processor
    implementation.
  - Use the compiler switches for optimization, profile-guided
    optimization, or both. These features are summarized in
    the “Intel® C++ Compiler” section. For more detail, see the
    Intel® C++ Compiler User’s Guide.
• Use current-generation performance monitoring tools, such as the VTune™
  Performance Analyzer:
  - Identify performance issues using event-based sampling, the code
    coach, and other analysis resources.
  - Measure workload characteristics such as instruction
    throughput, data traffic locality, and memory traffic
    characteristics.
  - Characterize the performance gain.
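
The profiler is the primary tool for the steps above. As a lightweight cross-check when characterizing a gain, the baseline and the optimized routine can also be timed by hand and compared as a ratio. The following is a minimal sketch, assuming a POSIX system that provides clock_gettime; the names now_seconds, time_best_of, speedup, and sample_workload are hypothetical and do not come from this manual or from any Intel tool.

```c
#define _POSIX_C_SOURCE 199309L  /* request clock_gettime from <time.h> */
#include <assert.h>
#include <time.h>

/* Wall-clock seconds read from a monotonic clock. */
double now_seconds(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* Run fn `runs` times and keep the shortest time, which reduces the
   influence of one-off disturbances such as cold caches. */
double time_best_of(void (*fn)(void), int runs) {
    double best = -1.0;
    for (int i = 0; i < runs; i++) {
        double t0 = now_seconds();
        fn();
        double t = now_seconds() - t0;
        if (best < 0.0 || t < best) best = t;
    }
    return best;
}

/* Performance gain expressed as baseline time / optimized time. */
double speedup(double baseline_s, double optimized_s) {
    assert(optimized_s > 0.0);
    return baseline_s / optimized_s;
}

/* Hypothetical workload, present only so the timer has something to run. */
void sample_workload(void) {
    volatile unsigned long acc = 0;
    for (unsigned long i = 0; i < 100000UL; i++) acc += i;
}
```

A caller would time both versions with the same input and report speedup(time_best_of(baseline, n), time_best_of(optimized, n)). A ratio above 1 indicates a gain; hand timing of this kind does not replace event-based sampling.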
Optimize Performance Across Processor Generations
• Use a cpuid dispatch strategy to deliver optimum performance for
  all processor generations.
• Use the deterministic cache parameter leaf of cpuid to deliver scalable
  performance that is transparent across processor families with
  different cache sizes.
• Use a compatible code strategy to deliver optimum performance for
  the current generation of the IA-32 processor family and future IA-32
  processors.
• Use a low-overhead threading strategy so that a multi-threaded
  application delivers optimal multi-processor scaling performance
  when executing on processors that have hardware multi-threading
  support, or delivers nearly identical single-processor scaling when
  executing on a processor without hardware multi-threading support.
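
The first two guidelines in this list can be illustrated together. The following is a minimal sketch, not the manual's own code. It assumes a GCC- or Clang-compatible compiler, whose <cpuid.h> header supplies __get_cpuid and __get_cpuid_count. All function names are hypothetical, the "SSE2" routine is a plain-C stand-in for a tuned code path, and both the half-of-cache blocking heuristic and the 128 KB fallback are arbitrary assumptions chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>      /* GCC/Clang helpers: __get_cpuid, __get_cpuid_count */
#define HAVE_CPUID 1
#else
#define HAVE_CPUID 0    /* non-x86 build: always take the fallback paths */
#endif

typedef long (*sum_fn)(const int *, size_t);

/* Baseline path: safe on every processor generation. */
static long sum_generic(const int *v, size_t n) {
    assert(v != NULL || n == 0);
    long s = 0;
    for (size_t i = 0; i < n; i++) s += v[i];
    return s;
}

/* Stand-in for an SSE2-tuned path (hypothetical); written as a four-way
   unrolled loop in plain C so the sketch stays portable. */
static long sum_sse2_path(const int *v, size_t n) {
    assert(v != NULL || n == 0);
    long s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += v[i]; s1 += v[i + 1]; s2 += v[i + 2]; s3 += v[i + 3];
    }
    for (; i < n; i++) s0 += v[i];
    return s0 + s1 + s2 + s3;
}

/* CPUID leaf 1 reports SSE2 support in EDX bit 26. */
static int cpu_has_sse2(void) {
#if HAVE_CPUID
    unsigned int a, b, c, d;
    if (__get_cpuid(1, &a, &b, &c, &d)) return (int)((d >> 26) & 1u);
#endif
    return 0;
}

/* cpuid dispatch: choose the code path once, then call through the pointer. */
sum_fn select_sum(void) {
    return cpu_has_sse2() ? sum_sse2_path : sum_generic;
}

/* Largest data or unified cache, in bytes, from the deterministic cache
   parameter leaf (CPUID leaf 4). Returns 0 if the leaf gives no data. */
static unsigned long largest_data_cache_bytes(void) {
    unsigned long best = 0;
#if HAVE_CPUID
    for (unsigned int idx = 0; idx < 16; idx++) {
        unsigned int a, b, c, d;
        if (!__get_cpuid_count(4, idx, &a, &b, &c, &d)) break;
        unsigned int type = a & 0x1Fu;  /* 0 none, 1 data, 2 instr, 3 unified */
        if (type == 0) break;           /* no more cache levels to enumerate */
        if (type == 2) continue;        /* skip instruction caches */
        unsigned long ways  = ((b >> 22) & 0x3FFu) + 1;
        unsigned long parts = ((b >> 12) & 0x3FFu) + 1;
        unsigned long line  = (b & 0xFFFu) + 1;
        unsigned long sets  = (unsigned long)c + 1;
        unsigned long size  = ways * parts * line * sets;
        if (size > best) best = size;
    }
#endif
    return best;
}

/* Working-set block size derived from the cache size rather than hard-coded. */
unsigned long block_bytes(void) {
    unsigned long cache = largest_data_cache_bytes();
    return cache ? cache / 2 : 128UL * 1024;  /* fallback is an assumption */
}
```

An application would typically call select_sum() and block_bytes() once at startup and reuse the results. Because the block size comes from the processor's reported cache geometry, the same binary adapts to processors with different cache sizes without being rebuilt.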