Arch86

Opcode	Encoding	16-bit	32-bit	64-bit	`CPUID` Feature Flag(s)	Description
`F3 0F 58 /r` `ADDSS xmm1, xmm2/m32`	`rm`	Invalid	Valid	Valid	`sse`	Add the lowest single-precision floating-point value from xmm1 and xmm2/m32. Store the result in xmm1.
`VEX.LIG.F3.0F.WIG 58 /r` `VADDSS xmm1, xmm2, xmm3/m32`	`rvm`	Invalid	Valid	Valid	`avx`	Add the lowest single-precision floating-point value from xmm2 and xmm3/m32. Store the result in xmm1.
`EVEX.LLIG.F3.0F.W0 58 /r` `VADDSS xmm1 {k1}{z}, xmm2, xmm3/m32{er}`	`ervm`	Invalid	Valid	Valid	`avx512-f`	Add the lowest single-precision floating-point value from xmm2 and xmm3/m32. Store the result in xmm1.

Encoding

Encoding	Tuple Type	Operand 1	Operand 2	Operand 3
`rm`	`n/a`	`ModRM.reg[rw]`	`ModRM.r/m[r]`	`n/a`
`rvm`	`n/a`	`ModRM.reg[rw]`	`VEX.vvvv[r]`	`ModRM.r/m[r]`
`ervm`	`tuple1-scalar`	`ModRM.reg[rw]`	`EVEX.vvvvv[r]`	`ModRM.r/m[r]`

Description

The (V)ADDSS instruction adds the lowest single-precision floating-point values of the two source operands. The result is stored in the lowest element of the destination operand.

The VEX and EVEX forms copy bits 32..127 from the first source operand into the destination and zero all destination bits above bit 127. The legacy SSE form leaves every destination bit above bit 31 unchanged.
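The difference between the legacy and VEX-encoded results can be modelled in portable C. The sketch below is illustrative only: `VecModel`, `addss_legacy`, and `vaddss_vex` are hypothetical names, and the register is assumed to be 512 bits wide (16 `float` lanes). It models the architectural result only, not exceptions, masking, or rounding control.

```c
#include <assert.h>
#include <string.h>

#define LANES 16 /* assumption: a 512-bit register viewed as 16 float lanes */

/* Hypothetical model of a vector register; not part of any real API. */
typedef struct { float lane[LANES]; } VecModel;

/* Legacy SSE ADDSS: only lane 0 changes, every other lane of dest is kept. */
static void addss_legacy(VecModel *dest, const VecModel *src)
{
    assert(dest != NULL && src != NULL);
    dest->lane[0] += src->lane[0];
}

/* VEX VADDSS: lane 0 is the sum, lanes 1..3 come from src1,
 * and lanes 4 and above (bits 128 and up) are zeroed. */
static void vaddss_vex(VecModel *dest, const VecModel *src1, const VecModel *src2)
{
    assert(dest != NULL && src1 != NULL && src2 != NULL);
    VecModel r;
    memset(&r, 0, sizeof r);            /* zero every lane first */
    r.lane[0] = src1->lane[0] + src2->lane[0];
    for (int i = 1; i < 4; i++)
        r.lane[i] = src1->lane[i];      /* bits 32..127 from the first source */
    *dest = r;                          /* safe even if dest aliases a source */
}
```

Building the result in a temporary before writing `dest` keeps the model correct when the destination is also one of the sources, which is the usual case in real code.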

Operation

public void ADDSS(SimdF32 dest, SimdF32 src)
{
    dest[0] += src[0];
    // dest[1..] is unmodified
}

public void VADDSS_Vex(SimdF32 dest, SimdF32 src1, SimdF32 src2)
{
    dest[0] = src1[0] + src2[0];
    // dest[1..3] is copied from the first source
    dest[1] = src1[1];
    dest[2] = src1[2];
    dest[3] = src1[3];
    // bits 128 and above are zeroed
    dest[4..] = 0;
}

public void VADDSS_EvexMemory(SimdF32 dest, SimdF32 src1, SimdF32 src2, KMask k)
{
    if (k[0])
        dest[0] = src1[0] + src2[0];
    else if (EVEX.z)
        dest[0] = 0;
    // otherwise unchanged
    dest[1] = src1[1];
    dest[2] = src1[2];
    dest[3] = src1[3];
    dest[4..] = 0;
}

public void VADDSS_EvexRegister(SimdF32 dest, SimdF32 src1, SimdF32 src2, KMask k)
{
    if (EVEX.b)
        OverrideRoundingModeForThisInstruction(EVEX.rc);

    if (k[0])
        dest[0] = src1[0] + src2[0];
    else if (EVEX.z)
        dest[0] = 0;
    // otherwise unchanged
    dest[1] = src1[1];
    dest[2] = src1[2];
    dest[3] = src1[3];
    dest[4..] = 0;
}

Intrinsics

__m128 _mm_add_ss(__m128 a, __m128 b)
__m128 _mm_add_round_ss(__m128 a, __m128 b, const int rounding)
__m128 _mm_mask_add_ss(__m128 s, __mmask8 k, __m128 a, __m128 b)
__m128 _mm_mask_add_round_ss(__m128 s, __mmask8 k, __m128 a, __m128 b, const int rounding)
__m128 _mm_maskz_add_ss(__mmask8 k, __m128 a, __m128 b)
__m128 _mm_maskz_add_round_ss(__mmask8 k, __m128 a, __m128 b, const int rounding)
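
As a minimal usage sketch of the unmasked intrinsic, the helper below adds the lowest elements of two four-float arrays with `_mm_add_ss` and passes the upper three elements of the first array through. The helper name `add_lowest` is hypothetical, and the code assumes an x86 target with SSE available (the default on x86-64). The masked and rounding variants additionally require AVX-512F and are not shown.

```c
#include <assert.h>
#include <xmmintrin.h> /* SSE: _mm_loadu_ps, _mm_add_ss, _mm_storeu_ps */

/* Hypothetical helper: out[0] = a[0] + b[0]; out[1..3] = a[1..3]. */
static void add_lowest(const float a[4], const float b[4], float out[4])
{
    assert(a != NULL && b != NULL && out != NULL);
    __m128 va = _mm_loadu_ps(a);             /* unaligned load of four floats */
    __m128 vb = _mm_loadu_ps(b);
    _mm_storeu_ps(out, _mm_add_ss(va, vb));  /* only element 0 is summed */
}
```

Whether the compiler emits the legacy `ADDSS` or the VEX-encoded `VADDSS` for this intrinsic depends on the target options in use (for example, whether AVX code generation is enabled).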

Exceptions

SIMD Floating-Point

#XM

#D - Denormal operand.
#I - Invalid operation.
#O - Numeric overflow.
#P - Inexact result.
#U - Numeric underflow.

Other Exceptions

VEX Encoded Form: See Type 3 Exception Conditions.
EVEX Encoded Form: See Type E3 Exception Conditions.