IA-32 Intel® Architecture Optimization
4. Most commonly used x87 instructions (e.g., fmul, fadd, fdiv, fsqrt,
fstp) decode into a single μop. However, transcendental and some other
x87 instructions decode into several μops; in these limited cases, the
metrics count the number of μops that are actually tagged.
5. This metric may not be supported in all models of the Pentium 4 processor family.
6. Set the following CCCR bits to make counting edge-triggered:
Compare=1; Edge=1; Threshold=0
7. Must program both MSR_FSB_ESCR0 and MSR_FSB_ESCR1.
8. Must program both MSR_BSU_ESCR0 and MSR_BSU_ESCR1.
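As a rough illustration of note 6, the C sketch below builds a CCCR value with Compare=1, Edge=1, and Threshold=0 while leaving all other bits as supplied. The bit positions (Compare at bit 18, Threshold at bits 20-23, Edge at bit 24) are an assumption based on the CCCR layout described in the Software Developer's Manual and should be verified there before use. The function and macro names are hypothetical, and the sketch only computes the value; it does not write any MSR.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed CCCR bit positions; verify against the SDM before relying on them. */
#define CCCR_COMPARE          (1u << 18)  /* enable threshold comparison   */
#define CCCR_THRESHOLD_SHIFT  20          /* 4-bit threshold at bits 20-23 */
#define CCCR_THRESHOLD_MASK   (0xFu << CCCR_THRESHOLD_SHIFT)
#define CCCR_EDGE             (1u << 24)  /* count rising edges only       */

/* Replace the 4-bit threshold field; all other bits are left unchanged. */
static uint32_t cccr_set_threshold(uint32_t cccr, uint32_t threshold)
{
    assert(threshold <= 0xFu);
    cccr &= ~CCCR_THRESHOLD_MASK;
    cccr |= threshold << CCCR_THRESHOLD_SHIFT;
    return cccr;
}

/* Apply the settings from note 6: Compare=1; Edge=1; Threshold=0. */
static uint32_t cccr_make_edge_triggered(uint32_t cccr)
{
    cccr = cccr_set_threshold(cccr, 0u);
    return cccr | CCCR_COMPARE | CCCR_EDGE;
}
```

A caller would pass in the CCCR value it has already prepared (enable bit, ESCR select, and so on) and write the result to the chosen CCCR with whatever privileged MSR-write facility its environment provides.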
Performance Metrics and Tagging Mechanisms
A number of metrics require tags to be specified in addition to
programming a counting event; for example, the metric Split Loads
Retired requires specifying a split_load_retired tag in addition to
programming the replay_event to count at retirement. This section
describes three sets of tags that are used in conjunction with three
at-retirement counting events: front_end_event, replay_event, and
execution_event. Please refer to Appendix A of the IA-32 Intel®
Architecture Software Developer’s Manual, Volume 3B for the
description of the at-retirement events.
Tags for replay_event
Table B-2 provides a list of the tags that are used by various metrics in
Table B-1. These tags enable you to mark μops at an earlier stage of
execution and count the μops at retirement using the replay_event.
These tags require at least two MSRs (see Table B-2, column 2 and
column 3) to tag the μops so they can be detected at retirement. Some
tags require an additional MSR (see Table B-2, column 4) to select the
event types for these tagged μops. The event names referenced in
column 4 are those from the Pentium 4 processor performance
monitoring events.
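One way to picture a row of Table B-2 in software is a small descriptor holding the bit fields for the two tagging MSRs plus an optional event name for the additional MSR. The C sketch below is only an illustration: the struct, its field names, and the helper functions are hypothetical, and no bit values from Table B-2 are hard-coded. The caller is expected to fill them in from the table.

```c
#include <stddef.h>
#include <stdint.h>

/* One replay_event tag, filled in by the caller from Table B-2. */
struct replay_tag {
    const char *name;         /* tag name (column 1)                       */
    uint32_t    col2_bits;    /* bits to set in the MSR named in column 2  */
    uint32_t    col3_bits;    /* bits to set in the MSR named in column 3  */
    const char *extra_event;  /* event named in column 4, or NULL if none  */
};

/* Number of MSRs that must be programmed to apply the tag:
 * two for the tagging itself, plus one if an event type must be selected. */
static int replay_tag_msr_count(const struct replay_tag *tag)
{
    return tag->extra_event != NULL ? 3 : 2;
}

/* Merge the tag's bits into an existing MSR value without clearing
 * bits that are already set for other purposes. */
static uint32_t replay_tag_apply(uint32_t msr_value, uint32_t tag_bits)
{
    return msr_value | tag_bits;
}
```

Keeping the table data separate from the code that applies it means a change in tag definitions between processor models only requires updating the descriptors.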