SSSE3
Supplemental Streaming SIMD Extensions 3 is a SIMD instruction set created by Intel and is the fourth iteration of the SSE technology.
History
SSSE3 was first introduced on June 26, 2006, with the "Woodcrest" Xeons, Intel processors based on the Core microarchitecture. SSSE3 has been referred to by the codenames Tejas New Instructions and Merom New Instructions, after the first processor designs intended to support it.
Functionality
SSSE3 contains 16 new discrete instructions. Each instruction can act on 64-bit MMX or 128-bit XMM registers. Therefore, Intel's materials refer to 32 new instructions. They include:
- Twelve instructions that perform horizontal addition or subtraction operations.
- Six instructions that evaluate absolute values.
- Two instructions that perform multiply and add operations and speed up the evaluation of dot products.
- Two instructions that accelerate packed-integer multiply operations and produce integer values with scaling.
- Two instructions that perform a byte-wise, in-place shuffle according to the second shuffle control operand.
- Six instructions that negate packed integers in the destination operand if the corresponding element in the source operand is negative.
- Two instructions that align data from the composite of two operands.
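As an illustration of the absolute-value and sign categories above, the following Python sketch models their per-element behaviour in scalar code. It is a reading aid under stated assumptions, not SIMD code and not Intel's reference pseudocode: the function names (`to_signed`, `pabs`, `psign`) are hypothetical, and registers are represented as plain lists of integers. It also models the zero case of the sign instructions (a zero source element clears the destination element), which the summary above does not spell out.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def pabs(src, bits=8):
    """Scalar model of PABSB/PABSW/PABSD: per-element absolute value.

    The result is stored as an unsigned bit pattern, so the most negative
    input (for example -128 with 8-bit elements) yields 128 (0x80).
    """
    mask = (1 << bits) - 1
    return [abs(to_signed(x, bits)) & mask for x in src]


def psign(dst, src, bits=8):
    """Scalar model of PSIGNB/PSIGNW/PSIGND.

    Each element of dst is negated, cleared, or kept depending on whether
    the matching element of src is negative, zero, or positive.
    """
    out = []
    for d, s in zip(dst, src):
        d, s = to_signed(d, bits), to_signed(s, bits)
        if s < 0:
            d = -d      # negative control element: negate
        elif s == 0:
            d = 0       # zero control element: clear
        # positive control element: keep d unchanged
        out.append(to_signed(d, bits))  # wrap to the element width
    return out
```

For example, `psign([10, 10, 10], [-1, 0, 7])` returns `[-10, 0, 10]`. Negating the most negative value wraps around, so an element of -128 stays -128 in the 8-bit form.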
CPUs with SSSE3
- AMD:
  - "Cat" low-power processors
    - Bobcat-based processors
    - Jaguar-based processors and newer
    - Puma-based processors and newer
  - "Heavy Equipment" processors
    - Bulldozer-based processors
    - Piledriver-based processors
    - Steamroller-based processors
    - Excavator-based processors and newer
  - Zen-based processors
  - Zen+-based processors
  - Zen 2-based processors
- Intel:
  - Xeon 5100 Series
  - Xeon 5300 Series
  - Xeon 5400 Series
  - Xeon 3000 Series
  - Core 2 Duo
  - Core 2 Extreme
  - Core 2 Quad
  - Core i7
  - Core i5
  - Core i3
  - Pentium Dual Core
  - Celeron 4xx Sequence Conroe-L
  - Celeron Dual Core E1200
  - Celeron M 500 series
  - Atom
- VIA:
  - Nano
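Whether a particular machine belongs to this list can also be checked in software. Processors report SSSE3 support through CPUID (leaf 1, ECX bit 9), and Linux on x86 exposes the result as the `ssse3` token in the `flags` field of `/proc/cpuinfo`. The Python sketch below parses that text; it assumes a Linux host, and the function names (`cpuinfo_has_ssse3`, `host_has_ssse3`) are hypothetical.

```python
def cpuinfo_has_ssse3(cpuinfo_text):
    """Return True if a 'flags' line in /proc/cpuinfo-style text lists ssse3."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # Compare whole tokens so that other flag names never match by accident.
        if sep and key.strip() == "flags" and "ssse3" in value.split():
            return True
    return False


def host_has_ssse3(path="/proc/cpuinfo"):
    """Best-effort check for the current machine (assumes Linux on x86).

    Returns False when the file cannot be read, which means "unknown"
    rather than "unsupported" on other operating systems.
    """
    try:
        with open(path, encoding="utf-8", errors="replace") as handle:
            return cpuinfo_has_ssse3(handle.read())
    except OSError:
        return False
```

The parsing is kept separate from the file access so that it can be exercised on sample text without depending on the host.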
New instructions
Instruction | Name | Description |
PSIGNB, PSIGNW, PSIGND | Packed Sign | Negates the elements of a register of bytes, words or dwords if the corresponding elements of another register are negative. |
PABSB, PABSW, PABSD | Packed Absolute Value | Fills the elements of a register of bytes, words or dwords with the absolute values of the elements of another register. |
PALIGNR | Packed Align Right | Takes two registers, concatenates their values, and extracts a register-length section starting at a byte offset given by an immediate value encoded in the instruction. |
PSHUFB | Packed Shuffle Bytes | Takes registers of bytes A = [a0 a1 a2 ...] and B = [b0 b1 b2 ...] and replaces A with [a_b0 a_b1 a_b2 ...], where a_bi is the byte of A selected by index bi, except that it replaces the ith entry with 0 if the top bit of bi is set. |
PMULHRSW | Packed Multiply High with Round and Scale | Treats the 16-bit words in registers A and B as signed 16-bit fixed-point numbers between −1.00000000 and +0.99996948..., and multiplies them together with correct rounding. |
PMADDUBSW | Multiply and Add Packed Signed and Unsigned Bytes | Takes the bytes in registers A and B (unsigned in one operand, signed in the other), multiplies them together, adds pairs, signed-saturates and stores. I.e. [a0 a1 a2 ...] pmaddubsw [b0 b1 b2 ...] = [satsw(a0*b0 + a1*b1) satsw(a2*b2 + a3*b3) ...] |
PHSUBW, PHSUBD | Packed Horizontal Subtract | Takes registers A = [a0 a1 a2 ...] and B = [b0 b1 b2 ...] and outputs [a0−a1 a2−a3 ... b0−b1 b2−b3 ...] |
PHSUBSW | Packed Horizontal Subtract and Saturate Words | Like PHSUBW, but outputs [satsw(a0−a1) satsw(a2−a3) ... satsw(b0−b1) satsw(b2−b3) ...] |
PHADDW, PHADDD | Packed Horizontal Add | Takes registers A = [a0 a1 a2 ...] and B = [b0 b1 b2 ...] and outputs [a0+a1 a2+a3 ... b0+b1 b2+b3 ...] |
PHADDSW | Packed Horizontal Add and Saturate Words | Like PHADDW, but outputs [satsw(a0+a1) satsw(a2+a3) ... satsw(b0+b1) satsw(b2+b3) ...] |
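The behaviour summarised in the table can be made concrete with a scalar model. The Python sketch below is an illustration under stated assumptions, not Intel's reference pseudocode: it models the 128-bit forms, represents a register as a list with index 0 as the least significant element, and uses hypothetical function names chosen to match the mnemonics.

```python
def s8(x):
    """Low 8 bits of x as a signed byte."""
    x &= 0xFF
    return x - 0x100 if x & 0x80 else x


def s16(x):
    """Low 16 bits of x as a signed word (wrap-around)."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


def satsw(x):
    """Saturate x to the signed 16-bit range."""
    return max(-32768, min(32767, x))


def pshufb(a, b):
    """out[i] = a[b[i] & 15], or 0 when the top bit of b[i] is set."""
    return [0 if sel & 0x80 else a[sel & 0x0F] for sel in b]


def palignr(dst, src, imm):
    """Bytes imm..imm+15 of the 32-byte value dst:src (src is the low half)."""
    combined = list(src) + list(dst)
    # Positions past the 32 concatenated bytes read as zero.
    return [combined[i] if i < 32 else 0 for i in range(imm, imm + 16)]


def phadd(a, b, saturate=False):
    """PHADDW (wrapping) or PHADDSW (saturating) on lists of signed words."""
    fix = satsw if saturate else s16
    lanes = list(a) + list(b)
    return [fix(lanes[i] + lanes[i + 1]) for i in range(0, len(lanes), 2)]


def phsub(a, b, saturate=False):
    """PHSUBW (wrapping) or PHSUBSW (saturating) on lists of signed words."""
    fix = satsw if saturate else s16
    lanes = list(a) + list(b)
    return [fix(lanes[i] - lanes[i + 1]) for i in range(0, len(lanes), 2)]


def pmaddubsw(a, b):
    """Unsigned bytes of a times signed bytes of b, adjacent pairs summed
    and saturated to signed 16-bit words."""
    prod = [(x & 0xFF) * s8(y) for x, y in zip(a, b)]
    return [satsw(prod[i] + prod[i + 1]) for i in range(0, len(prod), 2)]


def pmulhrsw(a, b):
    """Signed 16-bit fixed-point multiply: round the 32-bit product and
    keep bits 30..15 of it."""
    return [s16((((x * y) >> 14) + 1) >> 1) for x, y in zip(a, b)]
```

For instance, `pmulhrsw([16384], [16384])` returns `[8192]`, which corresponds to 0.5 × 0.5 = 0.25 in the fixed-point interpretation, and `pshufb(list(range(16)), list(range(15, -1, -1)))` reverses the byte order. In C, the corresponding operations are exposed as intrinsics in `<tmmintrin.h>`, such as `_mm_shuffle_epi8`, `_mm_alignr_epi8`, `_mm_hadd_epi16`, `_mm_maddubs_epi16` and `_mm_mulhrs_epi16`.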