IBM P5 570 Server User Manual


 
Chapter 2. Architecture and technical overview 21
2.1.1 Simultaneous multi-threading
Because applications continually demand better performance, simultaneous multi-threading
(SMT) functionality is built into the POWER5 chip technology. Developers are already familiar
with process-level parallelism (multi-tasking) and thread-level parallelism (multi-threading).
SMT is the next step in keeping the processor busy for throughput-oriented applications: it
introduces instruction group-level parallelism, which feeds the processor's multiple pipelines
with instruction groups chosen from different hardware threads belonging to a single OS image.
SMT is activated by default when an OS that supports it is loaded. On a 2-way POWER5
processor-based system, the operating system sees the available processors as a 4-way
system. To achieve a higher performance level, SMT can also be used in Micro-Partitioning
(capped or uncapped) and dedicated partition environments (see 2.9, “Virtualization” on page 38).
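As a rough illustration of the processor count described above, the following Python sketch
computes how many logical processors an SMT-aware OS would see. The function name and the
threads_per_core parameter are hypothetical, chosen for this example; the factor of two reflects
the two hardware threads per POWER5 processor.

```python
def logical_processors(physical_processors: int, smt_enabled: bool,
                       threads_per_core: int = 2) -> int:
    """Return the processor count an SMT-aware OS would report.

    threads_per_core=2 reflects the two hardware threads per POWER5
    processor; the parameter itself is an illustrative assumption.
    """
    if physical_processors < 1:
        raise ValueError("at least one physical processor is required")
    # With SMT off, each physical processor appears as one logical processor.
    return physical_processors * (threads_per_core if smt_enabled else 1)


if __name__ == "__main__":
    print(logical_processors(2, smt_enabled=True))   # 2-way system seen as 4-way
    print(logical_processors(2, smt_enabled=False))  # SMT off: stays 2-way
```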
Simultaneous multi-threading is supported on POWER5 processor-based systems running
AIX 5L V5.3 or Linux-based systems with the required 2.6 kernel. AIX provides the smtctl
command, which turns SMT on and off without a subsequent reboot. For Linux, an additional boot
option must be set to activate SMT after a reboot.
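The smtctl invocation can be sketched as follows. This Python helper only builds the argument
list and does not run anything, so it can be tried on any machine. The helper name is
hypothetical, and the -m (on or off) and -w (now or boot) flags are taken from the AIX smtctl
command reference rather than from this section, so check them against your AIX level before use.

```python
import shlex
from typing import List


def smtctl_command(enable: bool, when: str = "now") -> List[str]:
    """Build an smtctl argument list (hypothetical helper, nothing is executed).

    Assumes the AIX flags -m on|off and -w now|boot.
    """
    if when not in ("now", "boot"):
        raise ValueError("when must be 'now' or 'boot'")
    return ["smtctl", "-m", "on" if enable else "off", "-w", when]


if __name__ == "__main__":
    # Show the command line that would turn SMT off immediately, without a reboot.
    print(shlex.join(smtctl_command(enable=False, when="now")))
```

On an AIX partition, the resulting list could be handed to subprocess.run by an administrator
with the appropriate authority.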
SMT mode maximizes the usage of the execution units. The POWER5 chip introduces additional
rename registers (for floating-point operations, the number of rename registers is increased
to 120), which are essential for out-of-order execution and vital for SMT.
Enhanced SMT features
To improve SMT performance for various workload mixes and provide robust quality of
service, POWER5 provides two features:
Dynamic resource balancing
The objective of dynamic resource balancing is to ensure that the two threads
executing on the same processor flow smoothly through the system.
Depending on the situation, the POWER5 processor's resource balancing logic applies a
different thread-throttling mechanism.
Adjustable thread priority
Adjustable thread priority lets software determine when one thread should have a
greater (or lesser) share of execution resources.
The POWER5 supports eight software-controlled priority levels for each thread.
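This section does not say how the eight priority levels translate into execution resources. As a
toy model only, the Python sketch below assumes the levels are numbered 0 to 7 and that the
lower-priority thread receives one out of every 2^(|X-Y|+1) decode cycles, where X and Y are the
two thread priorities. That rule, the numbering, and the function name are assumptions made for
illustration; the model also ignores any special handling of the lowest levels.

```python
from fractions import Fraction
from typing import Tuple

PRIORITY_LEVELS = range(8)  # eight software-controlled levels, assumed to be 0..7


def decode_share(priority_a: int, priority_b: int) -> Tuple[Fraction, Fraction]:
    """Return the assumed decode-cycle share of thread A and thread B.

    Toy rule (an assumption, not from this section): the lower-priority
    thread gets 1 of every 2**(|A - B| + 1) decode cycles.
    """
    for p in (priority_a, priority_b):
        if p not in PRIORITY_LEVELS:
            raise ValueError("priority must be between 0 and 7")
    low = Fraction(1, 2 ** (abs(priority_a - priority_b) + 1))
    high = 1 - low
    # Equal priorities give low == high == 1/2, so the order does not matter then.
    return (high, low) if priority_a > priority_b else (low, high)


if __name__ == "__main__":
    print(decode_share(4, 4))  # equal priorities split the cycles evenly
    print(decode_share(6, 4))  # thread A is favored
```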
ST operation
Not all applications benefit from SMT. Having two threads executing on the same processor
does not increase the performance of applications whose performance is limited by the
execution units, or of applications that consume all of the chip’s memory bandwidth. For this
reason, the POWER5 processor supports the single-threaded (ST) execution mode. In this mode,
the POWER5 processor gives all of the physical resources to the active thread, enabling it to
achieve higher performance than a POWER4 processor-based system at equivalent frequencies.
Highly optimized scientific codes are one example where ST operation is ideal.
2.1.2 Dynamic power management
In current CMOS¹ technologies, chip power is one of the most important design parameters.
With the introduction of SMT, more instructions execute per cycle per processor core, thus
increasing the core’s and the chip’s total switching power. To reduce switching power,
¹ complementary metal oxide semiconductor