Optimizing for SIMD Integer Applications 4
Non-Interleaved Unpack
The unpack instructions perform an interleave merge of the data
elements of the destination and source operands into the destination
register. The following example merges the two operands into the
destination registers without interleaving. For example, take two
adjacent elements of a packed-word data type in source1 and place this
value in the low 32 bits of the result. Then take two adjacent elements
of a packed-word data type in source2 and place this value in the high
32 bits of the result. One of the destination registers will have the
combination illustrated in Figure 4-3.
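The merge described above can be sketched as a scalar C model. This is an illustration only, not the manual's code: it assumes each 64-bit register is held in a uint64_t with word 0 in the least significant bits, and the function name unpack_low_noninterleaved is hypothetical.

```c
#include <stdint.h>

/* Hypothetical scalar model of the non-interleaved "low" merge.
 * Assumption: a 64-bit register is a uint64_t holding four 16-bit
 * words, with word 0 in the least significant bits. */
static uint64_t unpack_low_noninterleaved(uint64_t source1, uint64_t source2)
{
    /* Two adjacent low words of source1 stay in the low 32 bits. */
    uint64_t low = source1 & UINT64_C(0x00000000FFFFFFFF);

    /* Two adjacent low words of source2 move to the high 32 bits. */
    uint64_t high = (source2 & UINT64_C(0x00000000FFFFFFFF)) << 32;

    return high | low;
}
```

With source1 holding the words 1, 2, 3, 4 and source2 holding 5, 6, 7, 8 (word 0 first), the result holds 1, 2, 5, 6: each pair of words stays adjacent instead of being interleaved word by word.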
Example 4-5 Interleaved Pack without Saturation
; Input:
;    MM0          signed source value
;    MM1          signed source value
; Output:
;    MM0          the first and third words contain the
;                 low 16 bits of the doublewords in MM0,
;                 the second and fourth words contain the
;                 low 16 bits of the doublewords in MM1
pslld   MM1, 16                 ; shift the 16 LSB of each
                                ; doubleword value to the 16 MSB
                                ; position
pand    MM0, {0,ffff,0,ffff}    ; mask to zero the 16 MSB
                                ; of each doubleword value
por     MM0, MM1                ; merge the two operands
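To check the shift, mask, and merge sequence above without MMX hardware, the three steps can be mirrored in a scalar C sketch. It assumes each register is modeled as a uint64_t with doubleword 0 in the low 32 bits. The helper names shift_left_dwords and interleaved_pack_nosat are hypothetical, and the mask constant is one reading of the {0,ffff,0,ffff} notation.

```c
#include <stdint.h>

/* Hypothetical model of a per-doubleword left shift (as pslld does):
 * each 32-bit lane is shifted independently, so no bits cross lanes.
 * Valid here for counts below 32. */
static uint64_t shift_left_dwords(uint64_t v, unsigned count)
{
    uint32_t lo = (uint32_t)v << count;
    uint32_t hi = (uint32_t)(v >> 32) << count;
    return ((uint64_t)hi << 32) | lo;
}

/* Hypothetical model of the three-instruction sequence. */
static uint64_t interleaved_pack_nosat(uint64_t mm0, uint64_t mm1)
{
    /* pslld MM1, 16: move the 16 LSB of each doubleword to the 16 MSB. */
    mm1 = shift_left_dwords(mm1, 16);

    /* pand MM0, mask: zero the 16 MSB of each doubleword. */
    mm0 &= UINT64_C(0x0000FFFF0000FFFF);

    /* por MM0, MM1: merge the two operands. */
    return mm0 | mm1;
}
```

For example, with MM0 holding the doublewords 0x11112222 and 0x33334444 and MM1 holding 0x55556666 and 0x77778888 (doubleword 0 first), the result words are 0x2222, 0x6666, 0x4444, 0x8888. The upper 16 bits of every doubleword are discarded rather than saturated.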