Opcode | Encoding | 16-bit | 32-bit | 64-bit | CPUID Feature Flag(s) | Description |
---|---|---|---|---|---|---|
NP 0F C2 /r ib CMPPS xmm1, xmm2/m128, imm8 | rmi | Invalid | Valid | Valid | sse | Compare packed single-precision floating-point values from xmm1 and xmm2/m128. Use bits 0..2 of imm8 as a comparison predicate. Store the result in xmm1. |
VEX.128.NP.0F.WIG C2 /r ib VCMPPS xmm1, xmm2, xmm3/m128, imm8 | rvmi | Invalid | Valid | Valid | avx | Compare packed single-precision floating-point values from xmm2 and xmm3/m128. Use bits 0..4 of imm8 as a comparison predicate. Store the result in xmm1. |
VEX.256.NP.0F.WIG C2 /r ib VCMPPS ymm1, ymm2, ymm3/m256, imm8 | rvmi | Invalid | Valid | Valid | avx | Compare packed single-precision floating-point values from ymm2 and ymm3/m256. Use bits 0..4 of imm8 as a comparison predicate. Store the result in ymm1. |
EVEX.128.NP.0F.W0 C2 /r ib VCMPPS k1 {k2}{z}, xmm1, xmm2/m128/m32bcst, imm8 | ervmi | Invalid | Valid | Valid | avx512-f avx512-vl | Compare packed single-precision floating-point values from xmm1 and xmm2/m128/m32bcst. Use bits 0..4 of imm8 as a comparison predicate. Store the result in k1. |
EVEX.256.NP.0F.W0 C2 /r ib VCMPPS k1 {k2}{z}, ymm1, ymm2/m256/m32bcst, imm8 | ervmi | Invalid | Valid | Valid | avx512-f avx512-vl | Compare packed single-precision floating-point values from ymm1 and ymm2/m256/m32bcst. Use bits 0..4 of imm8 as a comparison predicate. Store the result in k1. |
EVEX.512.NP.0F.W0 C2 /r ib VCMPPS k1 {k2}{z}, zmm1, zmm2/m512/m32bcst{sae}, imm8 | ervmi | Invalid | Valid | Valid | avx512-f | Compare packed single-precision floating-point values from zmm1 and zmm2/m512/m32bcst. Use bits 0..4 of imm8 as a comparison predicate. Store the result in k1. |
Encoding
Encoding | Tuple Type | Operand 1 | Operand 2 | Operand 3 | Operand 4 |
---|---|---|---|---|---|
rmi | n/a | ModRM.reg[rw] | ModRM.r/m[r] | imm8 | |
rvmi | n/a | ModRM.reg[rw] | VEX.vvvv[r] | ModRM.r/m[r] | imm8 |
ervmi | full | ModRM.reg[rw] | EVEX.vvvvv[r] | ModRM.r/m[r] | imm8 |
Description
The (V)CMPPS instruction compares four, eight, or 16 single-precision floating-point values from the two source operands. The 8-bit immediate selects the comparison predicate. The result is stored in the destination operand.
All forms except the legacy SSE form zero the upper (untouched) bits of the destination.
For the legacy SSE and VEX versions, each result is a 32-bit integer (not a single-precision floating-point value) of either all zeros or all ones. For the EVEX versions, the destination is a mask register where each bit contains an individual comparison result.
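The two result formats can be sketched in C. This is a minimal illustration, not the processor's implementation: `results_to_lanes` and `results_to_kmask` are hypothetical helper names, and the per-element comparison outcomes are assumed to be supplied by the caller as an array of flags.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Legacy SSE / VEX style: one 32-bit lane per comparison,
   all ones when the predicate held, all zeros otherwise. */
void results_to_lanes(const int *held, size_t count, uint32_t *lanes)
{
    for (size_t n = 0; n < count; n++)
        lanes[n] = held[n] ? UINT32_C(0xFFFFFFFF) : 0;
}

/* EVEX style: one bit per comparison, packed into a mask value.
   Bit n corresponds to element n; bits at and above `count` stay zero. */
uint16_t results_to_kmask(const int *held, size_t count)
{
    assert(count <= 16); /* at most 16 single-precision elements in 512 bits */
    uint16_t mask = 0;
    for (size_t n = 0; n < count; n++)
        if (held[n])
            mask |= (uint16_t)(1u << n);
    return mask;
}
```

For the outcomes `{1, 0, 0, 1}`, the first helper fills elements 0 and 3 with `FFFFFFFFh` and the second returns the mask `9h`.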
For the legacy SSE version, bits 0..2 of the immediate select the operation. For the VEX and EVEX encoded versions, bits 0..4 are used. The other bits are reserved. The operation is determined from the table below (predicates 00h through 07h are available to all forms; 08h through 1Fh are available only to the VEX and EVEX encoded forms):
imm8 Value | Predicate | Description | A < B [1] | A = B [1] | A > B [1] | Unordered [2] | QNaN Signals #IA |
---|---|---|---|---|---|---|---|
00h | EQ_OQ (EQ) | Equal (ordered, non-signaling) | false | true | false | false | no |
01h | LT_OS (LT) | Less-than (ordered, signaling) | true | false | false | false | yes |
02h | LE_OS (LE) | Less-than-or-equal (ordered, signaling) | true | true | false | false | yes |
03h | UNORD_Q (UNORD) | Unordered (non-signaling) | false | false | false | true | no |
04h | NEQ_UQ (NEQ) | Not-equal (unordered, non-signaling) | true | false | true | true | no |
05h | NLT_US (NLT) | Not-less-than (unordered, signaling) | false | true | true | true | yes |
06h | NLE_US (NLE) | Not-less-than-or-equal (unordered, signaling) | false | false | true | true | yes |
07h | ORD_Q (ORD) | Ordered (non-signaling) | true | true | true | false | no |
08h | EQ_UQ | Equal (unordered, non-signaling) | false | true | false | true | no |
09h | NGE_US (NGE) | Not-greater-than-or-equal (unordered, signaling) | true | false | false | true | yes |
0Ah | NGT_US (NGT) | Not-greater-than (unordered, signaling) | true | true | false | true | yes |
0Bh | FALSE_OQ (FALSE) | False (ordered, non-signaling) | false | false | false | false | no |
0Ch | NEQ_OQ | Not-equal (ordered, non-signaling) | true | false | true | false | no |
0Dh | GE_OS (GE) | Greater-than-or-equal (ordered, signaling) | false | true | true | false | yes |
0Eh | GT_OS (GT) | Greater-than (ordered, signaling) | false | false | true | false | yes |
0Fh | TRUE_UQ (TRUE) | True (unordered, non-signaling) | true | true | true | true | no |
10h | EQ_OS | Equal (ordered, signaling) | false | true | false | false | yes |
11h | LT_OQ | Less-than (ordered, non-signaling) | true | false | false | false | no |
12h | LE_OQ | Less-than-or-equal (ordered, non-signaling) | true | true | false | false | no |
13h | UNORD_S | Unordered (signaling) | false | false | false | true | yes |
14h | NEQ_US | Not-equal (unordered, signaling) | true | false | true | true | yes |
15h | NLT_UQ | Not-less-than (unordered, non-signaling) | false | true | true | true | no |
16h | NLE_UQ | Not-less-than-or-equal (unordered, non-signaling) | false | false | true | true | no |
17h | ORD_S | Ordered (signaling) | true | true | true | false | yes |
18h | EQ_US | Equal (unordered, signaling) | false | true | false | true | yes |
19h | NGE_UQ | Not-greater-than-or-equal (unordered, non-signaling) | true | false | false | true | no |
1Ah | NGT_UQ | Not-greater-than (unordered, non-signaling) | true | true | false | true | no |
1Bh | FALSE_OS | False (ordered, signaling) | false | false | false | false | yes |
1Ch | NEQ_OS | Not-equal (ordered, signaling) | true | false | true | false | yes |
1Dh | GE_OQ | Greater-than-or-equal (ordered, non-signaling) | false | true | true | false | no |
1Eh | GT_OQ | Greater-than (ordered, non-signaling) | false | false | true | false | no |
1Fh | TRUE_US | True (unordered, signaling) | true | true | true | true | yes |
1. A is the first operand; B is the second operand.
2. Unordered: either A or B is NaN.
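The predicate table can be read as a scalar lookup, sketched below in C. The names `cmpps_predicate` and `cmpps_signals_on_qnan` are hypothetical, and the sketch only models the table's true/false results and its "QNaN Signals #IA" column; it does not raise any exception. It relies on an observation from the table: predicates 10h through 1Fh produce the same results as 00h through 0Fh and differ only in signaling behavior.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>
#include <stdint.h>

/* The four mutually exclusive relations between A and B. */
enum { REL_LT = 1, REL_EQ = 2, REL_GT = 4, REL_UNORD = 8 };

/* Relations for which each predicate is true, indexed by imm8 & 0Fh. */
static const uint8_t truth[16] = {
    REL_EQ,                      /* 00h EQ    */
    REL_LT,                      /* 01h LT    */
    REL_LT | REL_EQ,             /* 02h LE    */
    REL_UNORD,                   /* 03h UNORD */
    REL_LT | REL_GT | REL_UNORD, /* 04h NEQ   */
    REL_EQ | REL_GT | REL_UNORD, /* 05h NLT   */
    REL_GT | REL_UNORD,          /* 06h NLE   */
    REL_LT | REL_EQ | REL_GT,    /* 07h ORD   */
    REL_EQ | REL_UNORD,          /* 08h EQ_UQ */
    REL_LT | REL_UNORD,          /* 09h NGE   */
    REL_LT | REL_EQ | REL_UNORD, /* 0Ah NGT   */
    0,                           /* 0Bh FALSE */
    REL_LT | REL_GT,             /* 0Ch NEQ_OQ */
    REL_EQ | REL_GT,             /* 0Dh GE    */
    REL_GT,                      /* 0Eh GT    */
    REL_LT | REL_EQ | REL_GT | REL_UNORD /* 0Fh TRUE */
};

/* Result of comparing a (first operand) with b (second operand). */
bool cmpps_predicate(uint8_t imm8, float a, float b)
{
    assert(imm8 < 32); /* only bits 0..4 are defined */
    unsigned relation;
    if (isnan(a) || isnan(b))
        relation = REL_UNORD;
    else if (a < b)
        relation = REL_LT;
    else if (a > b)
        relation = REL_GT;
    else
        relation = REL_EQ;
    return (truth[imm8 & 0x0F] & relation) != 0;
}

/* The "QNaN Signals #IA" column packed as a bitmap: bit n is set
   when predicate n is marked "yes" in the table. */
bool cmpps_signals_on_qnan(uint8_t imm8)
{
    assert(imm8 < 32);
    return ((UINT32_C(0x99996666) >> imm8) & 1u) != 0;
}
```

For example, `cmpps_predicate(0x04, NAN, 1.0f)` is true because NEQ_UQ is an unordered predicate, while `cmpps_predicate(0x0C, NAN, 1.0f)` (NEQ_OQ) is false.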
Assemblers may implement the following pseudo-mnemonics for the various predicate values:
Pseudo-Mnemonic Form | Encoded Form |
---|---|
CMPEQPS src1, src2 | CMPPS src1, src2, 00h |
CMPLTPS src1, src2 | CMPPS src1, src2, 01h |
CMPLEPS src1, src2 | CMPPS src1, src2, 02h |
CMPUNORDPS src1, src2 | CMPPS src1, src2, 03h |
CMPNEQPS src1, src2 | CMPPS src1, src2, 04h |
CMPNLTPS src1, src2 | CMPPS src1, src2, 05h |
CMPNLEPS src1, src2 | CMPPS src1, src2, 06h |
CMPORDPS src1, src2 | CMPPS src1, src2, 07h |
VCMPEQPS dest, src1, src2 | VCMPPS dest, src1, src2, 00h |
VCMPLTPS dest, src1, src2 | VCMPPS dest, src1, src2, 01h |
VCMPLEPS dest, src1, src2 | VCMPPS dest, src1, src2, 02h |
VCMPUNORDPS dest, src1, src2 | VCMPPS dest, src1, src2, 03h |
VCMPNEQPS dest, src1, src2 | VCMPPS dest, src1, src2, 04h |
VCMPNLTPS dest, src1, src2 | VCMPPS dest, src1, src2, 05h |
VCMPNLEPS dest, src1, src2 | VCMPPS dest, src1, src2, 06h |
VCMPORDPS dest, src1, src2 | VCMPPS dest, src1, src2, 07h |
VCMPEQ_UQPS dest, src1, src2 | VCMPPS dest, src1, src2, 08h |
VCMPNGEPS dest, src1, src2 | VCMPPS dest, src1, src2, 09h |
VCMPNGTPS dest, src1, src2 | VCMPPS dest, src1, src2, 0Ah |
VCMPFALSEPS dest, src1, src2 | VCMPPS dest, src1, src2, 0Bh |
VCMPNEQ_OQPS dest, src1, src2 | VCMPPS dest, src1, src2, 0Ch |
VCMPGEPS dest, src1, src2 | VCMPPS dest, src1, src2, 0Dh |
VCMPGTPS dest, src1, src2 | VCMPPS dest, src1, src2, 0Eh |
VCMPTRUEPS dest, src1, src2 | VCMPPS dest, src1, src2, 0Fh |
VCMPEQ_OSPS dest, src1, src2 | VCMPPS dest, src1, src2, 10h |
VCMPLT_OQPS dest, src1, src2 | VCMPPS dest, src1, src2, 11h |
VCMPLE_OQPS dest, src1, src2 | VCMPPS dest, src1, src2, 12h |
VCMPUNORD_SPS dest, src1, src2 | VCMPPS dest, src1, src2, 13h |
VCMPNEQ_USPS dest, src1, src2 | VCMPPS dest, src1, src2, 14h |
VCMPNLT_UQPS dest, src1, src2 | VCMPPS dest, src1, src2, 15h |
VCMPNLE_UQPS dest, src1, src2 | VCMPPS dest, src1, src2, 16h |
VCMPORD_SPS dest, src1, src2 | VCMPPS dest, src1, src2, 17h |
VCMPEQ_USPS dest, src1, src2 | VCMPPS dest, src1, src2, 18h |
VCMPNGE_UQPS dest, src1, src2 | VCMPPS dest, src1, src2, 19h |
VCMPNGT_UQPS dest, src1, src2 | VCMPPS dest, src1, src2, 1Ah |
VCMPFALSE_OSPS dest, src1, src2 | VCMPPS dest, src1, src2, 1Bh |
VCMPNEQ_OSPS dest, src1, src2 | VCMPPS dest, src1, src2, 1Ch |
VCMPGE_OQPS dest, src1, src2 | VCMPPS dest, src1, src2, 1Dh |
VCMPGT_OQPS dest, src1, src2 | VCMPPS dest, src1, src2, 1Eh |
VCMPTRUE_USPS dest, src1, src2 | VCMPPS dest, src1, src2, 1Fh |
Operation
ComparisonFunc[] PredicateMapping8 = new[]
{
    EQ_OQ,   // 0 - equal (ordered, non-signaling)
    LT_OS,   // 1 - less-than (ordered, signaling)
    LE_OS,   // 2 - less-than-or-equal (ordered, signaling)
    UNORD_Q, // 3 - unordered (non-signaling)
    NEQ_UQ,  // 4 - not-equal (unordered, non-signaling)
    NLT_US,  // 5 - not-less-than (unordered, signaling)
    NLE_US,  // 6 - not-less-than-or-equal (unordered, signaling)
    ORD_Q,   // 7 - ordered (non-signaling)
};
ComparisonFunc[] PredicateMapping32 = new[]
{
    EQ_OQ,    // 0 - equal (ordered, non-signaling)
    LT_OS,    // 1 - less-than (ordered, signaling)
    LE_OS,    // 2 - less-than-or-equal (ordered, signaling)
    UNORD_Q,  // 3 - unordered (non-signaling)
    NEQ_UQ,   // 4 - not-equal (unordered, non-signaling)
    NLT_US,   // 5 - not-less-than (unordered, signaling)
    NLE_US,   // 6 - not-less-than-or-equal (unordered, signaling)
    ORD_Q,    // 7 - ordered (non-signaling)
    EQ_UQ,    // 8 - equal (unordered, non-signaling)
    NGE_US,   // 9 - not-greater-than-or-equal (unordered, signaling)
    NGT_US,   // 10 - not-greater-than (unordered, signaling)
    FALSE_OQ, // 11 - false (ordered, non-signaling)
    NEQ_OQ,   // 12 - not-equal (ordered, non-signaling)
    GE_OS,    // 13 - greater-than-or-equal (ordered, signaling)
    GT_OS,    // 14 - greater-than (ordered, signaling)
    TRUE_UQ,  // 15 - true (unordered, non-signaling)
    EQ_OS,    // 16 - equal (ordered, signaling)
    LT_OQ,    // 17 - less-than (ordered, non-signaling)
    LE_OQ,    // 18 - less-than-or-equal (ordered, non-signaling)
    UNORD_S,  // 19 - unordered (signaling)
    NEQ_US,   // 20 - not-equal (unordered, signaling)
    NLT_UQ,   // 21 - not-less-than (unordered, non-signaling)
    NLE_UQ,   // 22 - not-less-than-or-equal (unordered, non-signaling)
    ORD_S,    // 23 - ordered (signaling)
    EQ_US,    // 24 - equal (unordered, signaling)
    NGE_UQ,   // 25 - not-greater-than-or-equal (unordered, non-signaling)
    NGT_UQ,   // 26 - not-greater-than (unordered, non-signaling)
    FALSE_OS, // 27 - false (ordered, signaling)
    NEQ_OS,   // 28 - not-equal (ordered, signaling)
    GE_OQ,    // 29 - greater-than-or-equal (ordered, non-signaling)
    GT_OQ,    // 30 - greater-than (ordered, non-signaling)
    TRUE_US,  // 31 - true (unordered, signaling)
};
public void CMPPS(SimdU32 dest, SimdF32 src, U8 predicate)
{
    ComparisonFunc func = PredicateMapping8[predicate];
    // each result element is 32 bits wide: all ones or all zeros
    dest[0] = func(dest[0], src[0]) ? 0xFFFF_FFFFu : 0;
    dest[1] = func(dest[1], src[1]) ? 0xFFFF_FFFFu : 0;
    dest[2] = func(dest[2], src[2]) ? 0xFFFF_FFFFu : 0;
    dest[3] = func(dest[3], src[3]) ? 0xFFFF_FFFFu : 0;
    // dest[4..] is unmodified
}
void VCMPPS_Vex(SimdU32 dest, SimdF32 src1, SimdF32 src2, U8 predicate, int kl)
{
    ComparisonFunc func = PredicateMapping32[predicate];
    // each result element is 32 bits wide: all ones or all zeros
    for (int n = 0; n < kl; n++)
        dest[n] = func(src1[n], src2[n]) ? 0xFFFF_FFFFu : 0;
    dest[kl..] = 0;
}
public void VCMPPS_Vex128(SimdU32 dest, SimdF32 src1, SimdF32 src2, U8 predicate) =>
    VCMPPS_Vex(dest, src1, src2, predicate, 4);
public void VCMPPS_Vex256(SimdU32 dest, SimdF32 src1, SimdF32 src2, U8 predicate) =>
    VCMPPS_Vex(dest, src1, src2, predicate, 8);
void VCMPPS_EvexMemory(KMask dest, SimdF32 src1, SimdF32 src2, U8 predicate, KMask k, int kl)
{
    ComparisonFunc func = PredicateMapping32[predicate];
    for (int n = 0; n < kl; n++)
    {
        // with a memory operand, EVEX.b broadcasts the first 32-bit element to every lane
        if (k[n])
            dest[n] = func(src1[n], EVEX.b ? src2[0] : src2[n]) ? 1 : 0;
        else
            dest[n] = 0;
        // no merge masking - EVEX.z is implicit (zero masking)
    }
    dest[kl..] = 0;
}
public void VCMPPS_Evex128Memory(KMask dest, SimdF32 src1, SimdF32 src2, U8 predicate, KMask k) =>
    VCMPPS_EvexMemory(dest, src1, src2, predicate, k, 4);
public void VCMPPS_Evex256Memory(KMask dest, SimdF32 src1, SimdF32 src2, U8 predicate, KMask k) =>
    VCMPPS_EvexMemory(dest, src1, src2, predicate, k, 8);
public void VCMPPS_Evex512Memory(KMask dest, SimdF32 src1, SimdF32 src2, U8 predicate, KMask k) =>
    VCMPPS_EvexMemory(dest, src1, src2, predicate, k, 16);
void VCMPPS_EvexRegister(KMask dest, SimdF32 src1, SimdF32 src2, U8 predicate, KMask k, int kl)
{
    ComparisonFunc func = PredicateMapping32[predicate];
    for (int n = 0; n < kl; n++)
    {
        // with a register operand there is no broadcast; EVEX.b selects {sae} instead
        if (k[n])
            dest[n] = func(src1[n], src2[n]) ? 1 : 0;
        else
            dest[n] = 0;
        // no merge masking - EVEX.z is implicit (zero masking)
    }
    dest[kl..] = 0;
}
public void VCMPPS_Evex128Register(KMask dest, SimdF32 src1, SimdF32 src2, U8 predicate, KMask k) =>
    VCMPPS_EvexRegister(dest, src1, src2, predicate, k, 4);
public void VCMPPS_Evex256Register(KMask dest, SimdF32 src1, SimdF32 src2, U8 predicate, KMask k) =>
    VCMPPS_EvexRegister(dest, src1, src2, predicate, k, 8);
public void VCMPPS_Evex512Register(KMask dest, SimdF32 src1, SimdF32 src2, U8 predicate, KMask k) =>
    VCMPPS_EvexRegister(dest, src1, src2, predicate, k, 16);
Intrinsics
__m128 _mm_cmp_ps(__m128 a, __m128 b, const int predicate)
__mmask8 _mm_cmp_ps_mask(__m128 a, __m128 b, const int predicate)
__mmask8 _mm_mask_cmp_ps_mask(__mmask8 k1, __m128 a, __m128 b, const int predicate)
__m256 _mm256_cmp_ps(__m256 a, __m256 b, const int predicate)
__mmask8 _mm256_cmp_ps_mask(__m256 a, __m256 b, const int predicate)
__mmask8 _mm256_mask_cmp_ps_mask(__mmask8 k1, __m256 a, __m256 b, const int predicate)
__mmask16 _mm512_cmp_ps_mask(__m512 a, __m512 b, const int predicate)
__mmask16 _mm512_cmp_round_ps_mask(__m512 a, __m512 b, const int predicate, const int rounding)
__mmask16 _mm512_mask_cmp_ps_mask(__mmask16 k1, __m512 a, __m512 b, const int predicate)
__mmask16 _mm512_mask_cmp_round_ps_mask(__mmask16 k1, __m512 a, __m512 b, const int predicate, const int rounding)
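As a usage sketch, the 128-bit VEX intrinsic can be combined with `_mm_movemask_ps` to collect one bit per element. The function name `lanes_less_than` is hypothetical, and the sketch assumes the compiler defines `__AVX__` when AVX code generation is enabled; when it is not, a scalar loop with the same result for LT_OQ is used so the code still builds.

```c
#include <assert.h>
#include <stddef.h>
#if defined(__AVX__)
#include <immintrin.h>
#endif

/* Returns a 4-bit value whose bit n is set when a[n] < b[n].
   Uses predicate LT_OQ (imm8 = 11h): ordered, non-signaling,
   so an element holding NaN yields 0. */
unsigned lanes_less_than(const float a[4], const float b[4])
{
    assert(a != NULL && b != NULL);
#if defined(__AVX__)
    __m128 va = _mm_loadu_ps(a);                /* unaligned loads */
    __m128 vb = _mm_loadu_ps(b);
    __m128 lt = _mm_cmp_ps(va, vb, _CMP_LT_OQ); /* all-ones or all-zeros per element */
    return (unsigned)_mm_movemask_ps(lt);       /* gather the sign bit of each element */
#else
    /* Scalar fallback: the C '<' operator is also false when either side is NaN. */
    unsigned mask = 0;
    for (int n = 0; n < 4; n++)
        if (a[n] < b[n])
            mask |= 1u << n;
    return mask;
#endif
}
```

With `a = {1, 5, 7, 2}` and `b = {2, 5, 1, 3}`, elements 0 and 3 satisfy the predicate, so the function returns `9h`.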
Exceptions
SIMD Floating-Point
#XM
- #D: Denormal operand.
- #I: Invalid operation.
Other Exceptions
VEX Encoded Form: See Type 2 Exception Conditions.
EVEX Encoded Form: See Type E2 Exception Conditions.