Intel NetBurst Computer Hardware User Manual


A Detailed Look Inside the Intel NetBurst™ Micro-Architecture of the Intel Pentium 4 Processor


b) avoiding the need to access off-chip caches, which can increase the realized bandwidth compared to a normal load-miss, which returns data to all cache levels.

The situations that are less likely to benefit from software-controlled data prefetch are the following:

§ In cases that are already bandwidth bound, prefetching tends to increase bandwidth demands and is therefore not effective.

§ Prefetching too far ahead may cause cached data to be evicted before it is actually used in execution; not prefetching far enough ahead can reduce the ability to overlap memory and execution latencies.

§ When the prefetch can only be usefully placed in locations where the likelihood of that prefetch being used is low. Prefetches consume resources in the processor, and using too many prefetches can limit their effectiveness. Examples include prefetching data in a loop for a reference outside the loop, and prefetching in a basic block that is frequently executed but seldom precedes the reference that the prefetch targets.
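
The trade-offs above can be illustrated with a minimal sketch of a software-prefetched loop. It is not taken from the original paper: the names PREFETCH_DISTANCE and ELEMS_PER_LINE, their values, and the assumed 64-byte line size are hypothetical tuning choices, and __builtin_prefetch is a GCC/Clang compiler builtin assumed to be available (it is a hint that the compiler may lower to a prefetch instruction).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tuning knobs (assumptions, not values from the paper). */
#define PREFETCH_DISTANCE 64 /* elements to run ahead of the current one */
#define ELEMS_PER_LINE     8 /* doubles per cache line, assuming 64-byte lines */

/* Sum an array while prefetching data that will be needed later. */
double sum_with_prefetch(const double *a, size_t n)
{
    assert(a != NULL || n == 0);
    double total = 0.0;
    for (size_t i = 0; i < n; i++) {
        /* Issue at most one prefetch per assumed cache line, and only for
           addresses inside the array, so that no prefetch is spent on
           data the loop never references. */
        if (i % ELEMS_PER_LINE == 0 && i + PREFETCH_DISTANCE < n)
            __builtin_prefetch(&a[i + PREFETCH_DISTANCE], 0 /* read */, 0 /* low reuse */);
        total += a[i];
    }
    return total;
}
```

A distance that is too large risks evicting lines before they are used, while one that is too small leaves memory latency exposed, so the value would normally be tuned by measurement on the target machine.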

Automatic hardware prefetch is a new feature in the Pentium 4 processor. It can bring cache lines into the unified second-level cache based on prior reference patterns.

Pros and Cons of Software and Hardware Prefetching. Software prefetching has the following characteristics:

§ Handles irregular access patterns, which would not trigger the hardware prefetcher

§ Handles prefetching of short arrays and avoids the hardware prefetcher's start-up delay before it initiates fetches

§ Must be added to new code; does not benefit existing applications.

In comparison, hardware prefetching on the Pentium 4 processor has the following characteristics:

§ Works with existing applications

§ Requires regular access patterns

§ Has a start-up penalty before the hardware prefetcher triggers and begins initiating fetches. The penalty has a larger effect for short arrays, where hardware prefetching generates requests for data beyond the end of the array that is never used. Software prefetching, however, can recognize and handle these cases by using fetch bandwidth to hide the latency for the initial data in the next array. The penalty diminishes when it is amortized over longer arrays.

§ Avoids instruction and issue port bandwidth overhead.
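
As an illustration of the first software-prefetching characteristic (irregular access patterns), the following is a minimal sketch of an indexed gather. The addresses depend on the contents of an index array, so a prefetcher that relies on regular reference patterns would not be expected to predict them. The function name gather_sum and the constant GATHER_DISTANCE are hypothetical, and __builtin_prefetch is again assumed to be the GCC/Clang builtin rather than anything named in the original text.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical look-ahead distance, in loop iterations. */
#define GATHER_DISTANCE 16

/* Sum table[idx[0]] .. table[idx[n-1]]. The index array itself is read
   sequentially, but the table accesses it drives are irregular. */
long gather_sum(const long *table, const size_t *idx, size_t n)
{
    assert((table != NULL && idx != NULL) || n == 0);
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        /* The future index is already known, so software can request
           the matching table entry ahead of its use. */
        if (i + GATHER_DISTANCE < n)
            __builtin_prefetch(&table[idx[i + GATHER_DISTANCE]], 0 /* read */, 1);
        total += table[idx[i]];
    }
    return total;
}
```

The cost listed above still applies: each prefetch is an extra instruction that occupies issue bandwidth, which hardware prefetching avoids.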

Loads and Stores

The Pentium 4 processor employs the following techniques to speed up the execution of memory operations:

§ speculative execution of loads

§ reordering of loads with respect to loads and stores

§ multiple outstanding misses

§ buffering of writes

§ forwarding of data from stores to dependent loads.

Performance may be enhanced by not exceeding the memory issue bandwidth and buffer resources provided by the machine. Up to one load and one store may be issued each cycle from the memory port's reservation stations. For a memory operation to be dispatched to the reservation stations, a buffer entry must be available for it. There are 48 load buffers and 24 store buffers. These buffers hold the µop and address information until the operation is completed, retired, and deallocated.
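
The limits quoted above lend themselves to a back-of-the-envelope check. The sketch below is a deliberately simplified model, not a description of the actual scheduler: it uses only the figures stated in the text (one load and one store issued per cycle, 48 load buffers, 24 store buffers), and all function names are hypothetical.

```c
#include <assert.h>

/* Buffer counts as stated in the text. */
enum { LOAD_BUFFERS = 48, STORE_BUFFERS = 24 };

/* Lower bound on the cycles needed to issue a group of memory operations,
   assuming at most one load and one store can issue per cycle. */
unsigned min_issue_cycles(unsigned loads, unsigned stores)
{
    return loads > stores ? loads : stores;
}

/* Nonzero if the given number of in-flight loads and stores fits in the
   load and store buffers without waiting for an entry to be freed. */
int fits_in_buffers(unsigned loads_in_flight, unsigned stores_in_flight)
{
    return loads_in_flight <= LOAD_BUFFERS && stores_in_flight <= STORE_BUFFERS;
}

/* Usage example: a loop body with 6 loads and 2 stores. */
void buffer_model_example(void)
{
    assert(min_issue_cycles(6, 2) == 6); /* loads are the bottleneck */
    assert(fits_in_buffers(6, 2));       /* well within 48 and 24 */
    assert(!fits_in_buffers(10, 25));    /* 25 stores exceed 24 entries */
}
```

Under this model a loop whose body is dominated by loads is bounded by the load issue rate, and a burst of more than 24 outstanding stores would have to wait for store buffer entries to be deallocated.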

The Pentium 4 processor is designed to enable the execution of memory operations out of order with respect to other instructions and with respect to each other. Loads can be carried out speculatively, that is, before all preceding
