13-10 Vol. 3
SYSTEM PROGRAMMING FOR INSTRUCTION SET EXTENSIONS AND PROCESSOR
when a suspended task is resumed (using an FXRSTOR instruction). Here, the
x87 FPU/MMX/SSE/SSE2/SSE3/SSSE3/SSE4 state must be saved as part of the task
state. This approach is appropriate for preemptive multitasking operating
systems, where the application cannot know when it is going to be preempted
and cannot prepare in advance for task switching. In this case, the operating
system is responsible for saving and restoring the task state and the x87
FPU/MMX/SSE/SSE2/SSE3/SSSE3/SSE4 state when necessary.
• The operating system can take the responsibility for saving the x87 FPU, MMX,
XMM, and MXCSR registers as part of the task switch process, but delay the
saving of the x87 FPU, MMX, XMM, and MXCSR state until an x87 FPU, MMX, or
SSE/SSE2/SSE3/SSSE3/SSE4 instruction is actually executed by the new task.
Using this approach, the x87 FPU/MMX/SSE/SSE2/SSE3/SSSE3/SSE4 state is
saved only if an x87 FPU/MMX/SSE/SSE2/SSE3/SSSE3/SSE4 instruction needs
to be executed in the new task. (See Section 13.5.1, “Using the TS Flag to
Control the Saving of the x87 FPU, MMX, SSE, SSE2, SSE3, SSSE3, and SSE4
State,” for more information.)
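The difference between the two approaches above can be illustrated by counting state saves over a schedule. The following is a minimal sketch, not operating-system code: the `slot` structure and the `eager_saves` and `lazy_saves` functions are hypothetical names introduced here, and the model assumes that each slot records which task runs and whether it executes any x87 FPU/MMX/SSE-family instruction.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: one entry per scheduling slot. */
struct slot {
    int  task;     /* id of the task that runs in this slot */
    bool uses_fp;  /* true if it executes an x87 FPU/MMX/SSE-family instruction */
};

/* First approach: the state is saved on every switch between different tasks. */
static size_t eager_saves(const struct slot *s, size_t n)
{
    size_t saves = 0;
    assert(s != NULL || n == 0);
    for (size_t i = 1; i < n; i++) {
        if (s[i].task != s[i - 1].task)
            saves++;                 /* one FXSAVE per task switch */
    }
    return saves;
}

/* Second approach: the state is saved only when a task other than the
 * current owner of the register state executes such an instruction. */
static size_t lazy_saves(const struct slot *s, size_t n)
{
    int owner = -1;                  /* -1: no task owns the registers yet */
    size_t saves = 0;
    assert(s != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (!s[i].uses_fp || s[i].task == owner)
            continue;                /* nothing to save in this slot */
        if (owner != -1)
            saves++;                 /* FXSAVE on behalf of the previous owner */
        owner = s[i].task;
    }
    return saves;
}
```

For a schedule in which task 0 uses the state, tasks 1 and 2 run without touching it, and task 1 later uses it, the eager count grows with every switch while the lazy count grows only when ownership of the registers actually changes.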
13.5.1 Using the TS Flag to Control the Saving of the
x87 FPU, MMX, SSE, SSE2, SSE3, SSSE3, and SSE4 State
Saving the x87 FPU/MMX/SSE/SSE2/SSE3/SSSE3/SSE4 state using FXSAVE requires
processor overhead. If the new task does not access the x87 FPU, MMX, XMM, or
MXCSR registers, the operating system can avoid this overhead by not
automatically saving the state on a task switch.
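To give a sense of what is written on each save, the FXSAVE memory image is 512 bytes and must be aligned on a 16-byte boundary. The sketch below lays out that image in its 64-bit form (64-bit instruction and data pointers, XMM0 through XMM15). The structure and field names are illustrative assumptions, not taken from any particular operating system.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative layout of the 512-byte FXSAVE/FXRSTOR image (64-bit form). */
struct fxsave_area {
    _Alignas(16) uint16_t fcw;  /* x87 FPU control word */
    uint16_t fsw;               /* x87 FPU status word */
    uint8_t  ftw;               /* abridged x87 FPU tag word */
    uint8_t  reserved0;
    uint16_t fop;               /* last x87 FPU opcode */
    uint64_t fip;               /* last x87 FPU instruction pointer */
    uint64_t fdp;               /* last x87 FPU data pointer */
    uint32_t mxcsr;             /* SSE control and status register */
    uint32_t mxcsr_mask;
    uint8_t  st_mm[8][16];      /* ST0-ST7 / MM0-MM7, 16 bytes per slot */
    uint8_t  xmm[16][16];       /* XMM0-XMM15 */
    uint8_t  reserved1[96];     /* reserved and software-available bytes */
};

/* Compile-time checks that the sketch matches the documented offsets. */
static_assert(sizeof(struct fxsave_area) == 512, "image must be 512 bytes");
static_assert(offsetof(struct fxsave_area, mxcsr) == 24, "MXCSR at byte 24");
static_assert(offsetof(struct fxsave_area, st_mm) == 32, "ST0/MM0 at byte 32");
static_assert(offsetof(struct fxsave_area, xmm) == 160, "XMM0 at byte 160");
```

An operating system that saves eagerly writes one such image on every task switch, which is the overhead the TS flag mechanism is intended to avoid.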
The TS flag in control register CR0 is provided to allow the operating system to delay
saving the x87 FPU/MMX/SSE/SSE2/SSE3/SSSE3/SSE4 state until an instruction
that actually accesses this state is encountered in a new task. When the TS flag is
set, the processor monitors the instruction stream for an x87
FPU/MMX/SSE/SSE2/SSE3/SSSE3/SSE4 instruction. When the processor detects
one of these instructions, it raises a device-not-available exception (#NM) prior to
executing the instruction. The device-not-available exception handler can then be
used to save the x87 FPU/MMX/SSE/SSE2/SSE3/SSSE3/SSE4 state for the previous
task (using an FXSAVE instruction) and load the x87
FPU/MMX/SSE/SSE2/SSE3/SSSE3/SSE4 state for the current task (using an
FXRSTOR instruction). If the task never encounters an x87
FPU/MMX/SSE/SSE2/SSE3/SSSE3/SSE4 instruction, the device-not-available
exception will not be raised and the state will not be saved unnecessarily.
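The sequence described above can be modeled in ordinary user-mode C. The sketch below is a simulation under stated assumptions, not kernel code: `struct cpu`, `task_switch`, `nm_handler`, and `exec_fp_insn` are hypothetical names, the `cr0_ts` field stands in for the TS flag in CR0, and plain structure copies stand in for FXSAVE and FXRSTOR. It also assumes the operating system tracks which task currently owns the register state, so that the handler can skip the save and restore when the faulting task is already the owner.

```c
#include <assert.h>
#include <stdbool.h>

#define NTASKS 3

/* Stand-in for the 512-byte FXSAVE image. */
struct fp_state { unsigned char bytes[512]; };

struct cpu {
    bool cr0_ts;           /* models the TS flag in CR0 */
    struct fp_state regs;  /* models the live x87 FPU/MMX/XMM/MXCSR registers */
    int  fp_owner;         /* task whose state is live in regs, -1 if none */
    int  current;          /* task that is currently running */
    int  nm_count;         /* number of simulated #NM exceptions */
    int  save_count;       /* number of simulated FXSAVE operations */
};

static struct fp_state saved[NTASKS];  /* per-task save areas, zero-initialized */

/* Task switch: do not save anything, only set TS so the first
 * x87 FPU/MMX/SSE-family instruction of the new task raises #NM. */
static void task_switch(struct cpu *c, int next)
{
    assert(next >= 0 && next < NTASKS);
    c->current = next;
    c->cr0_ts = true;
}

/* Device-not-available handler: clear TS, then save the previous owner's
 * state and load the current task's state if ownership changes. */
static void nm_handler(struct cpu *c)
{
    c->nm_count++;
    c->cr0_ts = false;                 /* corresponds to CLTS */
    if (c->fp_owner == c->current)
        return;                        /* registers already hold this task's state */
    if (c->fp_owner != -1) {
        saved[c->fp_owner] = c->regs;  /* corresponds to FXSAVE */
        c->save_count++;
    }
    c->regs = saved[c->current];       /* corresponds to FXRSTOR */
    c->fp_owner = c->current;
}

/* One simulated instruction that writes the register state. */
static void exec_fp_insn(struct cpu *c, unsigned char value)
{
    if (c->cr0_ts)
        nm_handler(c);                 /* #NM is raised before the instruction executes */
    c->regs.bytes[0] = value;
}
```

In this model, a task that is switched in and out without calling `exec_fp_insn` never triggers `nm_handler`, so no save is performed on its behalf.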
NOTE
The CRC32 and POPCNT instructions do not operate on the x87
FPU/MMX/SSE/SSE2/SSE3/SSSE3/SSE4 state. They operate on the
general-purpose registers and are not involved in the OS’s lazy
FXSAVE/FXRSTOR technique.