IA-32 Intel® Architecture Optimization
accesses (i.e., are also 3rd-level misses). This can decrease the average
measured BSQ latencies for workloads that frequently thrash (miss or
prefetch a lot into) the 2nd-level cache but hit in the 3rd-level cache.
This effect may be less of a factor for workloads that miss all on-chip
caches, since all BSQ entries due to such references will become bus
transactions.
Metrics Descriptions and Categories
The performance metrics for Intel Pentium 4 and Intel Xeon processors
are listed in Table B-1. These performance metrics consist of recipes to
program specific Pentium 4 and Intel Xeon processor performance
monitoring events to obtain event counts that represent one of the
following: number of instructions, cycles, or occurrences. Table B-1
also includes a few ratios that are derived from counts of other
performance metrics.
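As an illustration of how a ratio can be derived from the counts of two other metrics, the following is a minimal sketch in C. The structure and function names are hypothetical and do not appear in Table B-1; the sketch assumes the two raw counts have already been collected from the performance counters.

```c
#include <stdint.h>

/* Hypothetical container for two raw event counts that were already
 * read from performance counters. The field names are illustrative
 * only and are not taken from Table B-1. */
typedef struct {
    uint64_t numerator_count;   /* e.g., occurrences of some event   */
    uint64_t denominator_count; /* e.g., instructions or cycles      */
} metric_counts;

/* Derive a ratio metric from two counts. Returns 0.0 when the
 * denominator is zero so that the caller never divides by zero. */
double derived_ratio(metric_counts c)
{
    if (c.denominator_count == 0) {
        return 0.0;
    }
    return (double)c.numerator_count / (double)c.denominator_count;
}
```

For example, a numerator count of 250 and a denominator count of 1000 yield a ratio of 0.25. Returning 0.0 for an empty denominator is a design choice of this sketch; a tool could equally report the ratio as undefined.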
On IA-32 processors that support Hyper-Threading Technology, the
performance counters and associated model-specific registers (MSRs)
are extended to support Hyper-Threading Technology. A subset of the
performance monitoring events allows event counts to be qualified by
logical processor. The programming interface for qualification of
performance monitoring events by logical processors is documented in
IA-32 Intel® Architecture Software Developer’s Manual, Volumes
3A & 3B. Other performance monitoring events produce counts that are
independent of which logical processor is associated with the
microarchitectural events. The qualification of the performance metrics
on IA-32 processors that support Hyper-Threading Technology is listed
in Table B-5 and Table B-6.
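The distinction between counts that can be qualified by logical processor and counts that are independent of logical processors can be modeled in analysis software roughly as follows. This is a bookkeeping sketch only, not the MSR programming interface: all type and function names are hypothetical, and it assumes two logical processors per physical processor.

```c
#include <stdint.h>

/* Assumption for this sketch: two logical processors per physical
 * processor. */
#define NUM_LOGICAL_PROCESSORS 2

/* Hypothetical classification of a collected event count. */
typedef enum {
    QUAL_PER_LOGICAL_PROCESSOR, /* count is attributed per logical processor */
    QUAL_INDEPENDENT            /* count is not tied to a logical processor  */
} qualification;

typedef struct {
    qualification qual;
    uint64_t per_lp[NUM_LOGICAL_PROCESSORS]; /* valid when qualified   */
    uint64_t shared;                         /* valid when independent */
} event_count;

/* Look up the count attributed to logical processor `lp`.
 * Returns 1 and stores the count in *out when the event is qualified
 * by logical processor and `lp` is in range; returns 0 otherwise,
 * leaving *out unchanged. */
int count_for_lp(const event_count *e, int lp, uint64_t *out)
{
    if (e->qual != QUAL_PER_LOGICAL_PROCESSOR) {
        return 0; /* an independent count cannot be attributed */
    }
    if (lp < 0 || lp >= NUM_LOGICAL_PROCESSORS) {
        return 0;
    }
    *out = e->per_lp[lp];
    return 1;
}
```

Keeping the qualification explicit in the data structure prevents a tool from reporting a logical-processor-independent count as if it belonged to one logical processor.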
In Table B-1, the recipe for programming the performance metrics using
performance monitoring events is arranged as follows:
Column 1 specifies performance metrics. This may be a
single-event metric; for example, the metric Instructions Retired is
based on the counts of the performance monitoring event
instr_retired, using a specific set of event mask bits. Or it can be