Optimizing for SIMD Integer Applications 4
Highly Efficient Clipping
For clipping signed words to an arbitrary range, the pmaxsw and pminsw
instructions may be used. For clipping unsigned bytes to an arbitrary
range, the pmaxub and pminub instructions may be used. Example 4-19
shows how to clip signed words to an arbitrary range; the code for
clipping unsigned bytes is similar.
Example 4-19 Clipping to a Signed Range of Words [high, low]
; Input:
; MM0 signed source operands
; Output:
; MM0 signed words clipped to the signed
; range [high, low]
pminsw MM0, packed_high
pmaxsw MM0, packed_low
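As a rough illustration of the unsigned-byte case mentioned above, the following C sketch models pminub and pmaxub one lane at a time. It is a scalar stand-in rather than MMX code, and the helper names min_u8, max_u8, and clip_u8 are hypothetical; they do not come from the manual.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar model of one lane of pminub / pmaxub (unsigned byte min / max). */
static uint8_t min_u8(uint8_t a, uint8_t b) { return a < b ? a : b; }
static uint8_t max_u8(uint8_t a, uint8_t b) { return a > b ? a : b; }

/* Hypothetical helper: clip n unsigned bytes in place to the range
   low..high, mirroring "pminub reg, packed_high" followed by
   "pmaxub reg, packed_low". */
static void clip_u8(uint8_t *v, size_t n, uint8_t low, uint8_t high)
{
    assert(low <= high);            /* the range must not be empty */
    for (size_t i = 0; i < n; ++i) {
        v[i] = min_u8(v[i], high);  /* pminub: limit from above */
        v[i] = max_u8(v[i], low);   /* pmaxub: limit from below */
    }
}
```

Calling clip_u8 on eight bytes with low = 16 and high = 235 corresponds to one pass of the two-instruction sequence over a single 64-bit MMX register.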
Example 4-20 Clipping to an Arbitrary Signed Range [high, low]
; Input:
; MM0 signed source operands
; Output:
; MM0 signed operands clipped to the signed
; range [high, low]
paddw   MM0, packed_min    ; add 0x8000 with no saturation
                           ; to convert to unsigned
paddusw MM0, (packed_usmax - high_us)
                           ; in effect this clips to high
psubusw MM0, (packed_usmax - high_us + low_us)
                           ; in effect this clips to low
paddw   MM0, packed_low    ; undo the previous two offsets
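The following C sketch traces a single 16-bit lane through the four instructions of Example 4-20 to show why the result is the input clipped to the requested range. It assumes that packed_min holds 0x8000, that packed_usmax holds 0xFFFF, and that high_us and low_us are high and low with 0x8000 added. The text shown here does not define these constants, so treat those values, and the helper names, as assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* One lane of paddusw: unsigned 16-bit add, saturating at 0xFFFF. */
static uint16_t adds_u16(uint16_t a, uint16_t b)
{
    uint32_t s = (uint32_t)a + b;
    return (uint16_t)(s > 0xFFFFu ? 0xFFFFu : s);
}

/* One lane of psubusw: unsigned 16-bit subtract, saturating at 0. */
static uint16_t subs_u16(uint16_t a, uint16_t b)
{
    return (uint16_t)(a > b ? a - b : 0);
}

/* Hypothetical helper: scalar trace of Example 4-20 for one lane. */
static int16_t clip_s16_biased(int16_t x, int16_t low, int16_t high)
{
    assert(low <= high);  /* keeps the psubusw constant within 16 bits */

    /* Assumed constant values; not defined in the text above. */
    const uint16_t usmax   = 0xFFFFu;
    const uint16_t high_us = (uint16_t)((uint16_t)high + 0x8000u);
    const uint16_t low_us  = (uint16_t)((uint16_t)low + 0x8000u);

    /* paddw packed_min: wrapping add of 0x8000 maps signed order
       onto unsigned order. */
    uint16_t v = (uint16_t)((uint16_t)x + 0x8000u);

    /* paddusw: any value at or above high_us saturates to 0xFFFF. */
    v = adds_u16(v, (uint16_t)(usmax - high_us));

    /* psubusw: any value at or below low_us saturates to 0. */
    v = subs_u16(v, (uint16_t)(usmax - high_us + low_us));

    /* paddw packed_low: wrapping add of low restores the signed value. */
    v = (uint16_t)(v + (uint16_t)low);

    /* Reinterpret the 16-bit pattern as a signed word. */
    return (int16_t)(v >= 0x8000u ? (int32_t)v - 65536 : (int32_t)v);
}
```

For example, with low = -1000 and high = 2000, an input of 32767 yields 2000 and an input of -32768 yields -1000. The constant packed_usmax - high_us + low_us fits in 16 bits only because low does not exceed high.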