Arch86

1. Overview Table
- 1. Alternate GPR8 Encoding
- 2. Interpreting VEX, EVEX, and XOP Opcodes
2. Encoding
- 1. Interpreting the Operand Value
3. Description
4. Operation
5. Example(s)
6. Flags Affected
7. Intrinsics
8. Exceptions
- 1. By Operating Mode
- 2. SIMD Floating-Point
- 3. "Other"

This page details many of the sections of instruction pages.

Overview Table

The overview table lists all the various forms (opcodes) that an instruction can take, along with how they are encoded. Each row of the table consists of the following items, in order:

Opcode and Mnemonic: A single opcode listing both the encoding and assembly form. Italics in the mnemonic (bottom) part signify operands. See below for an explanation on interpreting vector (VEX, EVEX, and XOP prefixed) opcodes.
Some vector opcodes (such as ADDPD (Add Packed Double-Precision Floating-Point Values)) require that certain "legacy" prefixes be present (or none at all in some cases, such as with ADDPS (Add Packed Single-Precision Floating-Point Values)). In these cases, using the wrong prefixes will change how the instruction is decoded, and will result in a different opcode. For example, prefixing an ADDPS instruction with 0x66 will change it into an ADDPD instruction.
EVEX forms commonly carry additional annotations, such as an indication that mask registers are allowed (with {k1}), embedded rounding control (with {er}), and more.
Encoding: A code that is a reference into the encoding table. That table indicates where in the instruction stream the operands are encoded.
## bit Mode (multiple): Whether a given instruction form (opcode) is valid, invalid, or non-encodable in the specified processor mode. "Valid" forms are allowed whereas "invalid" forms will (usually) raise a #UD exception if encountered. "N/E" (not encodable) forms are also invalid, but, because they also encode a different (valid) opcode, they will be decoded incorrectly.
For example, in Long Mode (64 bit mode), the byte range 0x40 through 0x4F was repurposed for the REX prefix. This makes encoding INC eax as 0x40 impossible. Should the processor encounter what the author thinks is that line of assembly, it will instead treat it as a REX prefix with all four bits cleared. Hence, it's not just invalid, it's not encodable.
In the "64 bit Mode" column, two other values are possible: "N/P" (not prefixable) and "N/S" (not supported). Not-prefixable opcodes are ones where the REX prefix has no effect. Not-supported opcodes are ones that would require an address size override prefix, which is not supported for them in this mode.
CPUID Feature Flag(s) (optional): If present, the specified CPUID "feature flags" must be present (i.e. set) for the opcode to be decoded. However, the existence of these flags does not imply the ability to execute an opcode; some CPU features must be enabled (through various control registers) before use. If the flags are not present, or the feature is present but not enabled, the opcode will cause a #UD exception to be raised.
Description: A short description of what the specific opcode does. For most instructions, the various cells will be almost carbon copies of each other, but with minor changes.
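The two decoding examples above (the repurposed 0x40 through 0x4F range and the 0x66 prefix turning ADDPS into ADDPD) can be sketched in Python. This is an illustration only, not a decoder: the function names `classify_byte` and `mnemonic_for_0f58` are hypothetical, and a 32 bit operand size is assumed outside 64 bit mode.

```python
def classify_byte(byte: int, long_mode: bool) -> str:
    """Describe how one byte in the range 0x40-0x4F is decoded (sketch)."""
    if not 0x40 <= byte <= 0x4F:
        raise ValueError("only bytes 0x40 through 0x4F are handled here")
    if long_mode:
        # In 64 bit mode the low nibble holds the four REX bits: W, R, X, B.
        w, r, x, b = ((byte >> shift) & 1 for shift in (3, 2, 1, 0))
        return f"REX prefix (W={w} R={r} X={x} B={b})"
    # Outside 64 bit mode, 0x40-0x47 are INC and 0x48-0x4F are DEC.
    # Assumption: 32 bit operand size, so 32 bit register names are used.
    regs = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]
    op = "INC" if byte < 0x48 else "DEC"
    return f"{op} {regs[byte & 7]}"


def mnemonic_for_0f58(legacy_prefix):
    """Map the legacy prefix in front of opcode 0F 58 to the resulting mnemonic."""
    table = {None: "ADDPS", 0x66: "ADDPD", 0xF3: "ADDSS", 0xF2: "ADDSD"}
    return table[legacy_prefix]


print(classify_byte(0x40, long_mode=False))
print(classify_byte(0x40, long_mode=True))
print(mnemonic_for_0f58(None), "->", mnemonic_for_0f58(0x66))
```

The same byte yields an instruction in one mode and a prefix in the other, which is why the form is marked "N/E" rather than merely invalid.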

Alternate `GPR8` Encoding

TODO...

Interpreting VEX, EVEX, and XOP Opcodes

Vector prefixed opcodes (through a VEX, EVEX, or XOP prefix) are written differently than normal instructions. This is because the vector prefixes are multiple (two through four) bytes long and encode quite a bit of information. Opcodes using any of the three prefixes are written in the form {prefix}.{length}.{opmap}.{legacy}.{w}, with each part representing a specific field in the encoded prefix. The other fields in the prefix (such as vvvv) are unspecified as they are dependent on the operands. The various fields are:

prefix: The actual vector prefix that this opcode uses. This can be one of: VEX, EVEX, or XOP. In the case of VEX prefixes, the choice of the 0xC4 or 0xC5 form is not specified and must be determined from the required (and prefix implied) bits.
length: The length of the vectors this opcode operates on. This is encoded in the L bit for VEX and XOP prefixes, and L'L bits for EVEX prefixes. This can be one of: 128 (for XMM), 256 (for YMM), 512 (for ZMM), or LIG ("length ignored"). For BMI opcodes, this value will be specified explicitly - either as L0 or L1.
For LIG opcodes on EVEX prefixes, the bits may be repurposed for "rounding control".
opmap: The implied ("compressed") opcode map that is stored in the mm bits of the vector prefix. For example, for VEX and EVEX prefixes, a value of 0F38 implies a two byte 0F 38 opmap prefix, which is encoded as 10b in the mm bits. In the case of XOP prefixes, this value is simply the hex encoding of the mmmmm bits; they do not correspond to any implied opmap bytes.
legacy: The implied ("compressed") legacy vector prefix that is stored in the pp bits of the vector prefix. For the cases where this opcode (in legacy SSE form) doesn't have any prefixes (i.e. NP), this field will not be present, which indicates a value of 00b in the pp bits.
w: The single W bit in the vector prefix. This is commonly used as an extra bit to specify the opcode. WIG ("W ignored") means just that - the W bit is ignored.
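As a rough sketch of how these notation fields map onto prefix bits, the Python below translates the length, opmap, legacy, and w fields for VEX and EVEX opcodes. The function name `vex_prefix_fields`, the `"W0"`/`"W1"` spellings for the w field, and the use of `None` for "ignored" values are assumptions made for this illustration. XOP opmaps and the EVEX-only 512 length are left out.

```python
# Compressed legacy prefix -> pp bits ("NP" or a missing field means 00b).
PP_BITS = {None: 0b00, "NP": 0b00, "66": 0b01, "F3": 0b10, "F2": 0b11}
# Compressed opcode map -> mm bits (VEX and EVEX only; XOP uses raw mmmmm values).
MM_BITS = {"0F": 0b01, "0F38": 0b10, "0F3A": 0b11}
# Vector length -> single L bit. LIG means the bit is ignored.
L_BIT = {"128": 0, "256": 1, "L0": 0, "L1": 1, "LIG": None}
# W field -> W bit. WIG means the bit is ignored.
W_BIT = {"W0": 0, "W1": 1, "WIG": None}


def vex_prefix_fields(length, opmap, legacy=None, w="WIG"):
    """Return the prefix bits implied by one opcode's notation fields.

    A value of None in the result means the bit is ignored by the opcode.
    """
    return {
        "L": L_BIT[length],
        "mm": MM_BITS[opmap],
        "pp": PP_BITS[legacy],
        "W": W_BIT[w],
    }


fields = vex_prefix_fields("128", "0F38", legacy="66", w="W0")
print({name: value for name, value in fields.items()})
```

The fields are passed by name rather than parsed from a dotted string so that the sketch does not depend on the order in which a given page writes them.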

Encoding

The "Encoding" table lists the encoding of the operands for the various opcodes in the overview table. Each row of the table consists of the following items, in order:

Encoding: The name of the encoding this row is for.
Tuple Type (optional): The EVEX encoding's tuple form. This column is only present if an EVEX encoding for this instruction exists. If present, any encoding that does not use an EVEX prefix will contain "N/A".
Operand(s): The actual location the operand is encoded in. Instructions that contain a different number of operands depending on the mnemonic (for example, vector instructions with a legacy encoding) will contain "N/A" for disallowed operands.

Interpreting the Operand Value

The operand value cell takes the form source[rw], which represents a data source that is both read from and written to ([rw]). Read only or write only data is signified by [r] and [w], respectively.

source only specifies where the register number is encoded. It does not specify which register file is used (general purpose, segment, vector, etc.); that is specified by the mnemonic's encoding.
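A minimal sketch of splitting an operand value cell into its source and access parts, written in Python. The helper name `parse_operand_value` and the returned tuple layout are hypothetical and chosen only for this example.

```python
import re

# source, followed by exactly one of [r], [w], or [rw] at the end of the cell.
_OPERAND_RE = re.compile(r"^(?P<source>.+?)\[(?P<access>rw|r|w)\]$")


def parse_operand_value(cell: str):
    """Split a cell such as 'ModRM.reg[rw]' into (source, is_read, is_written)."""
    match = _OPERAND_RE.match(cell.strip())
    if match is None:
        raise ValueError(f"not an operand value cell: {cell!r}")
    access = match.group("access")
    return match.group("source"), "r" in access, "w" in access


print(parse_operand_value("ModRM.reg[rw]"))
print(parse_operand_value("imm8(4..7)[r]"))
```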

source will be one of the following values:

address##: An immediate value with ## bits that represents a "direct" address in the linear address space. If multiple values of ## are allowed, they will be separated by a slash.
AL/AX/EAX/RAX: The accumulator register. Which portion is used depends on the operand size of the opcode.
DS:SI: Memory addressed by the DS:SI register pair. DS:ESI and DS:RSI may be used instead depending on the address size attribute.
ES:DI: Memory addressed by the ES:DI register pair. ES:EDI and ES:RDI may be used instead depending on the address size attribute.
EVEX.vvvvv: The vvvvv bits of an EVEX prefix encode the register. These bits are stored in inverted form. For example, ZMM26 would be stored as 00101b (11010b inverted).
FLAGS: The FLAGS register.
imm##: An immediate value with ## bits. If multiple values of ## are allowed, they will be separated by a slash.
imm8(4..7): The upper four bits (the high nibble) of an eight bit immediate encode the register.
ModRM.reg: The reg field of a ModR/M byte encodes the register. The three bits can be extended to four by a REX, VEX, or XOP prefix, or to five by an EVEX prefix.
ModRM.r/m: If the mod field of a ModR/M byte signifies register form (11b), the r/m field encodes the register. If, however, the mod field signifies memory form (not 11b), the address is calculated (possibly with an SIB byte and a displacement) and used instead. The three bits can be extended to four by a REX, VEX, or XOP prefix, or to five by an EVEX prefix.
offset##: An immediate value with ## bits that represents an offset from the following instruction. If multiple values of ## are allowed, they will be separated by a slash.
For example, an infinite loop (a: JMP a) would be encoded as EB FE where FE represents negative 2. This would jump backwards two bytes to the a label and begin again. In fact, a "nop" could be encoded as EB 00 which would be a simple jump to the following instruction (zero bytes ahead).
VEX.vvvv: The vvvv bits of a VEX prefix encode the register. These bits are stored in inverted form. For example, XMM12 would be stored as 0011b (1100b inverted).
XOP.vvvv: The vvvv bits of an XOP prefix encode the register. These bits are stored in inverted form. For example, XMM4 would be stored as 1011b (0100b inverted).
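The inverted register fields and the relative offset from the list above can be checked with a short Python sketch. The helper names `invert_register` and `rel8` are hypothetical. Addresses are plain integers, and only the 8 bit offset form is handled.

```python
def invert_register(number: int, bits: int) -> int:
    """Return the inverted (one's complement) form of a register number.

    Use bits=4 for VEX.vvvv and XOP.vvvv, and bits=5 for EVEX.vvvvv.
    """
    if not 0 <= number < (1 << bits):
        raise ValueError("register number does not fit in the field")
    return ~number & ((1 << bits) - 1)


def rel8(target: int, next_instruction: int) -> int:
    """Encode an 8 bit offset measured from the following instruction."""
    delta = target - next_instruction
    if not -128 <= delta <= 127:
        raise ValueError("target is out of range for an 8 bit offset")
    return delta & 0xFF  # two's complement byte


print(format(invert_register(26, 5), "05b"))  # ZMM26 in EVEX.vvvvv
print(format(invert_register(12, 4), "04b"))  # XMM12 in VEX.vvvv
# "a: JMP a" with the two byte jump placed at address 0:
print(bytes([0xEB, rel8(target=0, next_instruction=2)]).hex(" "))
```

Measuring from the following instruction is what makes a jump to itself encode as negative 2: the processor has already advanced past the two byte instruction when the offset is applied.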

Description

The "Description" section, as the name implies, contains a simplified description of the instruction's operation.

Operation

The "Operation" section is pseudo-code that uses C#-like syntax.

Examples

The "Examples" section (if present) contains one or more example assembly snippets that demonstrate the instruction. Any examples provided use NASM (Intel) syntax.

Flags Affected

The "Flags Affected" section (if present) contains a description of how the processor's arithmetic flags are affected by the instruction. If this section is not present, then no arithmetic flags are changed.

Intrinsics

The "Intrinsics" section(s) (if present) contain C function definitions that can be used in one's code to utilize the instruction without inline assembly.

Exceptions

The "Exceptions" section contains a list of the possible exceptions that can be raised, along with the criteria for doing so.

By Operating Mode

TODO

SIMD Floating-Point

TODO

"Other"

TODO