IA-32 Intel® Architecture Optimization
• Power-optimized bus
The system bus is optimized for power efficiency, and its speed is
increased to support 667 MHz.
• Data Prefetch
Intel Core Solo and Intel Core Duo processors implement improved
hardware prefetch mechanisms: one mechanism can look ahead and
prefetch data into L1 from L2. These processors also provide
enhanced hardware prefetchers similar to those of the Pentium M
processor (see Table 1-2).
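As a minimal sketch of the kind of regular access pattern that hardware
prefetchers are generally designed to follow, the C fragment below walks an
array once at unit stride and once at a fixed larger stride. The function
names sum_unit_stride and sum_strided are illustrative assumptions and do not
come from this manual. The code only shows the access patterns; it cannot
observe or guarantee whether, or how far ahead, a given processor prefetches.

```c
#include <stddef.h>

/* Unit-stride traversal: consecutive addresses in ascending order.
   A regular pattern of this kind is the easiest for a prefetcher to follow. */
long sum_unit_stride(const int *a, size_t n)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += a[i];
    return sum;
}

/* Fixed-stride traversal: touches every stride-th element.
   Still regular, but with a large stride each access may fall on a
   different cache line. */
long sum_strided(const int *a, size_t n, size_t stride)
{
    long sum = 0;
    if (stride == 0)
        return 0;               /* guard: a zero stride would never advance */
    for (size_t i = 0; i < n; i += stride)
        sum += a[i];
    return sum;
}
```

Timing the two functions on large arrays is one way to see how much a
particular system benefits from regular access; any measured difference is
specific to the processor and memory system tested.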
Front End
Execution of SIMD instructions on Intel Core Solo and Intel Core Duo
processors is improved over Pentium M processors by the following
enhancements:
• Micro-op fusion
Scalar SIMD operations on register and memory operands have single
micro-op flows comparable to x87 flows. Many packed instructions
are fused, reducing their micro-op flows from four micro-ops to two.
• Eliminating decoder restrictions
Intel Core Solo and Intel Core Duo processors improve decoder
throughput with micro-fusion and macro-fusion, so that many more
SSE and SSE2 instructions can be decoded without restriction. On
Pentium M processors, many single micro-op SSE and SSE2
instructions must be decoded by the main decoder.
• Improved packed SIMD instruction decoding
On Intel Core Solo and Intel Core Duo processors, most packed SSE
instructions can be decoded by any of the three decoders. As a
result, the front end can process up to three packed SSE instructions
every cycle. There are exceptions: some shuffle, unpack, and shift
operations are not fused and require the main decoder.
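The effect of the decoder restriction on throughput can be illustrated with a
toy model. The sketch below is a simplification, not a description of the
actual decode hardware. It assumes three decoders, strictly in-order decoding,
and a single per-instruction flag saying whether the instruction must use the
main decoder. The names decode_cycles and needs_main are hypothetical, and the
model ignores fetch bandwidth, instruction length, and micro-op counts.

```c
#include <stddef.h>

#define NUM_DECODERS 3   /* assumed: one main decoder (slot 0) plus two others */

/* Count the cycles this simplified in-order front end needs for a stream.
   needs_main[i] is nonzero when instruction i can only be handled by the
   main decoder. Instructions are consumed strictly in program order. */
size_t decode_cycles(const int *needs_main, size_t n)
{
    size_t cycles = 0;
    size_t i = 0;

    while (i < n) {
        /* Slot 0 (the main decoder) accepts any instruction. */
        size_t slot = 1;
        i++;

        /* The remaining slots accept only unrestricted instructions; a
           restricted instruction ends the cycle and waits for slot 0. */
        while (slot < NUM_DECODERS && i < n && !needs_main[i]) {
            slot++;
            i++;
        }
        cycles++;
    }
    return cycles;
}
```

Under these assumptions, a stream of six unrestricted instructions decodes in
two cycles, while six instructions that all require the main decoder take six
cycles. That gap is what removing the restriction is meant to close.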