diff --git a/Internal-Encoding-of-x86-Encodings.md b/Internal-Encoding-of-x86-Encodings.md
new file mode 100644
index 0000000..08e8cc6
--- /dev/null
+++ b/Internal-Encoding-of-x86-Encodings.md
@@ -0,0 +1,32 @@
+For cached decodings, this emulator uses the layout:
+
+* 8 bits: operation;
+* 8 bits:
+  * b7: address size (16- or 32-bit);
+  * b6: set if this instruction has a displacement or offset attached;
+  * b5: set if this instruction has an immediate operand attached;
+  * [b4, b0]: the source operand's `Source`;
+* 16 bits:
+  * [b15, b14]: this instruction's data size;
+  * [b13, b10]: the length of this instruction in bytes (or 0 to indicate a length extension word is present);
+  * [b9, b5]: the top five bits of this instruction's SIB;
+  * [b4, b0]: the destination operand's `Source`.
+
+The low three bits of the SIB are stored in the low three bits of its operand's `Source` if necessary; for this reason the `Source` enum treats all values from `11000b` upwards as equivalent to `Indirect`.
+
+Extension words are 16 bits in length for 16-bit decodings and 32 bits in length for 32-bit decodings. Up to three may be present, in the order:
+1. an immediate operand;
+2. an offset or displacement; and
+3. a length extension.
+
+If a length extension is present then it is laid out as:
+* [b15/b31, b6]: instruction length in bytes;
+* [b5, b4]: repetition attached to this instruction (repe/repne);
+* [b3, b1]: segment override attached to this instruction;
+* b0: whether the lock prefix was found.
+
+Therefore each decoded instruction is:
+* between 4 and 10 bytes for 16-bit decodings; and
+* between 4 and 16 bytes for 32-bit decodings.
+
+`sizeof(Instruction)` is therefore either `10` or `16`, for 16- and 32-bit decodings respectively; it provides `packing_size()` to give the number of bytes actually in use.
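As a rough illustration of the base layout above, the sketch below packs and unpacks the 8-bit and 16-bit fields with plain shifts and masks. The names `BaseFields`, `pack_flags`, `pack_sizes`, `unpack` and `full_sib` are hypothetical and not taken from the emulator's source. It also assumes that b7 set means 32-bit addressing and that the data size is an opaque 2-bit code, neither of which the description states.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical mirror of the 8-bit and 16-bit fields; not the emulator's own types.
struct BaseFields {
    bool address_size_32;      // b7 (assumption: set means 32-bit addressing)
    bool has_displacement;     // b6
    bool has_immediate;        // b5
    std::uint8_t source;       // [b4, b0] of the 8-bit field
    std::uint8_t data_size;    // [b15, b14] (assumption: an opaque 2-bit code)
    std::uint8_t length;       // [b13, b10]; 0 means a length extension word is present
    std::uint8_t sib_top;      // [b9, b5]: top five bits of the SIB
    std::uint8_t destination;  // [b4, b0] of the 16-bit field
};

// Builds the 8-bit field: address size, displacement flag, immediate flag, source.
inline std::uint8_t pack_flags(const BaseFields &f) {
    return static_cast<std::uint8_t>(
        (f.address_size_32 ? 0x80 : 0) |
        (f.has_displacement ? 0x40 : 0) |
        (f.has_immediate ? 0x20 : 0) |
        (f.source & 0x1f));
}

// Builds the 16-bit field: data size, length, top of SIB, destination.
inline std::uint16_t pack_sizes(const BaseFields &f) {
    return static_cast<std::uint16_t>(
        ((f.data_size & 0x03) << 14) |
        ((f.length & 0x0f) << 10) |
        ((f.sib_top & 0x1f) << 5) |
        (f.destination & 0x1f));
}

// Splits the two fields back into their components.
inline BaseFields unpack(std::uint8_t flags, std::uint16_t sizes) {
    BaseFields f{};
    f.address_size_32 = (flags & 0x80) != 0;
    f.has_displacement = (flags & 0x40) != 0;
    f.has_immediate = (flags & 0x20) != 0;
    f.source = static_cast<std::uint8_t>(flags & 0x1f);
    f.data_size = static_cast<std::uint8_t>((sizes >> 14) & 0x03);
    f.length = static_cast<std::uint8_t>((sizes >> 10) & 0x0f);
    f.sib_top = static_cast<std::uint8_t>((sizes >> 5) & 0x1f);
    f.destination = static_cast<std::uint8_t>(sizes & 0x1f);
    return f;
}

// Rebuilds a full SIB byte for an operand whose Source value is 11000b or above,
// taking the low three bits from that Source value as the text describes.
inline std::uint8_t full_sib(std::uint8_t sib_top, std::uint8_t source) {
    assert(source >= 0x18 && source <= 0x1f);
    return static_cast<std::uint8_t>(((sib_top & 0x1f) << 3) | (source & 0x07));
}
```

The four-bit length field can only express lengths 1 to 15 directly, which is presumably why 0 is reserved to signal a length extension word.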
+`Instruction` is plain-old-data with a trivial destructor, so it is safe to pack instances into memory such that instruction n+1 is placed at the address of instruction n plus its `packing_size()`. Extension words therefore need to be paid for only when required.
\ No newline at end of file
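The packing scheme can be sketched as follows: compute how many bytes one instruction occupies from its flag bits, then step through a byte buffer by that amount. This is a minimal illustration and not the emulator's `Instruction` class. The helpers `packed_size`, `append_packed` and `instruction_offsets` are hypothetical, and the in-memory order assumed here (operation byte, then the 8-bit field, then the 16-bit field in host byte order) is a guess that the description does not confirm.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Bytes occupied by one packed instruction: a 4-byte base plus one extension word
// for each of: immediate (b5), displacement/offset (b6), length extension (length == 0).
inline std::size_t packed_size(std::uint8_t flags, std::uint16_t sizes, bool is_32bit) {
    const std::size_t word = is_32bit ? 4 : 2;
    std::size_t extensions = 0;
    if (flags & 0x20) ++extensions;
    if (flags & 0x40) ++extensions;
    if (((sizes >> 10) & 0x0f) == 0) ++extensions;
    return 4 + word * extensions;
}

// Appends one instruction to the buffer, reserving only the bytes it needs.
// Extension words are zero-filled here; a real decoder would store their values.
inline void append_packed(std::vector<std::uint8_t> &buffer, std::uint8_t operation,
                          std::uint8_t flags, std::uint16_t sizes, bool is_32bit) {
    const std::size_t size = packed_size(flags, sizes, is_32bit);
    assert(size <= (is_32bit ? 16u : 10u));  // upper bounds stated in the text
    const std::size_t start = buffer.size();
    buffer.resize(start + size, 0);
    buffer[start] = operation;
    buffer[start + 1] = flags;
    std::memcpy(&buffer[start + 2], &sizes, sizeof(sizes));  // assumed host byte order
}

// Walks back-to-back packed instructions and returns the byte offset of each one.
inline std::vector<std::size_t> instruction_offsets(const std::vector<std::uint8_t> &buffer,
                                                    bool is_32bit) {
    std::vector<std::size_t> offsets;
    std::size_t pos = 0;
    while (pos + 4 <= buffer.size()) {
        std::uint16_t sizes;
        std::memcpy(&sizes, &buffer[pos + 2], sizeof(sizes));
        const std::size_t size = packed_size(buffer[pos + 1], sizes, is_32bit);
        if (pos + size > buffer.size()) break;  // truncated final instruction
        offsets.push_back(pos);
        pos += size;  // instruction n+1 starts where instruction n's used bytes end
    }
    return offsets;
}
```

Under these assumptions a 16-bit decoding with only an immediate attached occupies 6 bytes rather than the full 10, which is the saving the packed layout is meant to provide.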