mirror of https://github.com/TomHarte/CLK.git synced 2024-11-24 10:31:15 +00:00

Created Internal Encoding of x86 Encodings (markdown)

Thomas Harte 2023-10-02 12:25:05 -04:00
parent 6dd84677ec
commit f145c280ae

@ -0,0 +1,32 @@
For cached decodings, this emulator uses the layout:
* 8 bits: operation;
* 8 bits:
  * b7: address size, 16- or 32-bit;
  * b6: set if this instruction has a displacement or offset attached;
  * b5: set if this instruction has an immediate operand attached;
  * [b4, b0]: the source operand's `Source`;
* 16 bits:
  * [b15, b14]: this instruction's data size;
  * [b13, b10]: the length of this instruction in bytes (or 0 to indicate a length extension word is present);
  * [b9, b5]: the top five bits of this instruction's SIB;
  * [b4, b0]: the destination operand's `Source`.
If an operand needs a SIB, the low three bits of the SIB are stored in the low three bits of that operand's `Source`; for this reason the `Source` enum treats every value from `11000b` upwards as equivalent to `Indirect`.
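
As an illustration of the bit positions listed above, here is a minimal sketch of packing and unpacking the 32-bit base. The names `BaseWord`, `Fields`, `pack`, `unpack`, `is_indirect` and `full_sib` are hypothetical and are not taken from the emulator's source; the sketch also assumes that a set b7 means a 32-bit address size, which the layout above does not state.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical container for the 8 + 8 + 16 bit base; only the bit
// positions follow the layout described above.
struct BaseWord {
    uint8_t operation;
    uint8_t flags_and_source;       // b7, b6, b5 flags; [b4, b0] source.
    uint16_t size_length_sib_dest;  // [b15, b14], [b13, b10], [b9, b5], [b4, b0].
};

// Hypothetical unpacked view of the same information.
struct Fields {
    uint8_t operation;
    bool address_size_32;   // Assumption: b7 set means 32-bit addressing.
    bool has_displacement;  // b6.
    bool has_immediate;     // b5.
    uint8_t source;         // 5 bits.
    uint8_t data_size;      // 2 bits.
    uint8_t length;         // 4 bits; 0 means a length extension word follows.
    uint8_t sib_top;        // 5 bits: the top five bits of the SIB.
    uint8_t destination;    // 5 bits.
};

inline BaseWord pack(const Fields &f) {
    // Every field must fit in the number of bits allotted to it.
    assert(f.source < 32 && f.destination < 32 && f.sib_top < 32);
    assert(f.data_size < 4 && f.length < 16);

    BaseWord w{};
    w.operation = f.operation;
    w.flags_and_source = static_cast<uint8_t>(
        (f.address_size_32 ? 0x80 : 0) |
        (f.has_displacement ? 0x40 : 0) |
        (f.has_immediate ? 0x20 : 0) |
        f.source);
    w.size_length_sib_dest = static_cast<uint16_t>(
        (f.data_size << 14) | (f.length << 10) | (f.sib_top << 5) | f.destination);
    return w;
}

inline Fields unpack(const BaseWord &w) {
    Fields f{};
    f.operation = w.operation;
    f.address_size_32 = (w.flags_and_source & 0x80) != 0;
    f.has_displacement = (w.flags_and_source & 0x40) != 0;
    f.has_immediate = (w.flags_and_source & 0x20) != 0;
    f.source = static_cast<uint8_t>(w.flags_and_source & 0x1f);
    f.data_size = static_cast<uint8_t>((w.size_length_sib_dest >> 14) & 0x3);
    f.length = static_cast<uint8_t>((w.size_length_sib_dest >> 10) & 0xf);
    f.sib_top = static_cast<uint8_t>((w.size_length_sib_dest >> 5) & 0x1f);
    f.destination = static_cast<uint8_t>(w.size_length_sib_dest & 0x1f);
    return f;
}

// Values from 11000b (0x18) upwards all stand for an indirect operand.
inline bool is_indirect(uint8_t source) {
    return source >= 0x18;
}

// Rebuilds the full SIB byte from its stored top five bits plus the low
// three bits that are carried in the operand's source value.
inline uint8_t full_sib(uint8_t sib_top, uint8_t source) {
    return static_cast<uint8_t>((sib_top << 3) | (source & 0x7));
}
```

Keeping the unpacked `Fields` separate from `BaseWord` is purely for readability here; real code would more likely expose accessors that mask and shift on demand.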
Extension words are 16 bits in length for 16-bit decodings and 32 bits in length for 32-bit decodings. Up to three may be present, in the order:
1. an immediate operand;
2. an offset or displacement; and
3. a length extension.
If a length extension is present then it is laid out as:
* [b15/b31, b6]: instruction length in bytes;
* [b5, b4]: repetition attached to this instruction — repe/repne;
* [b3, b1]: segment override attached to this instruction;
* b0: whether the lock prefix was found.
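
The length extension fields above could be encoded and decoded along the following lines. This is a sketch only: `LengthExtension`, `encode_length_extension` and `decode_length_extension` are hypothetical names, and the numeric values used for the repetition and segment override fields are left opaque because the text above does not define them.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <type_traits>

// Hypothetical unpacked form of a length extension word.
struct LengthExtension {
    uint32_t length;           // Instruction length in bytes; bits b6 and above.
    uint8_t repetition;        // [b5, b4]; the encoding of repe/repne is not assumed here.
    uint8_t segment_override;  // [b3, b1].
    bool lock;                 // b0.
};

// WordT is uint16_t for 16-bit decodings and uint32_t for 32-bit decodings.
template <typename WordT>
WordT encode_length_extension(const LengthExtension &e) {
    static_assert(
        std::is_same<WordT, uint16_t>::value || std::is_same<WordT, uint32_t>::value,
        "extension words are either 16 or 32 bits wide");
    assert(e.repetition < 4 && e.segment_override < 8);
    // The length has to fit into the bits left over above b5.
    assert(e.length <= static_cast<uint32_t>(std::numeric_limits<WordT>::max() >> 6));

    return static_cast<WordT>(
        (e.length << 6) |
        (static_cast<uint32_t>(e.repetition) << 4) |
        (static_cast<uint32_t>(e.segment_override) << 1) |
        (e.lock ? 1u : 0u));
}

template <typename WordT>
LengthExtension decode_length_extension(WordT word) {
    LengthExtension e{};
    e.length = static_cast<uint32_t>(word >> 6);
    e.repetition = static_cast<uint8_t>((word >> 4) & 0x3);
    e.segment_override = static_cast<uint8_t>((word >> 1) & 0x7);
    e.lock = (word & 1) != 0;
    return e;
}
```

The same template serves both word sizes because only the width of the length field differs between them.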
Each decoded instruction is a 4-byte base plus up to three extension words, so it occupies:
* between 4 and 10 bytes for 16-bit decodings; and
* between 4 and 16 bytes for 32-bit decodings.
`sizeof(Instruction)` is therefore either `10` or `16`; `Instruction` provides `packing_size()` to give the number of bytes that are actually in use. `Instruction` is plain-old-data with a trivial destructor, so it is safe to pack instances into memory such that instruction n+1 is placed at the address of instruction n plus its `packing_size()`. Extension words therefore need be paid for only when required.
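
To make the packing arithmetic concrete, the sketch below derives a packed size from the base fields and uses it to step through a buffer of tightly packed instructions. It is not the emulator's `packing_size()`: the functions `packed_size` and `instruction_offsets` are hypothetical, and the byte order within the buffer (operation, then the flags byte, then the 16-bit field in host byte order) is an assumption made only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Flag bits within the second byte of the base, per the layout above.
constexpr uint8_t HasDisplacement = 0x40;  // b6.
constexpr uint8_t HasImmediate = 0x20;     // b5.

// Returns the number of bytes in use: 4 for the base plus one extension word
// each for an immediate, a displacement and a length extension.
// word_size is 2 for 16-bit decodings and 4 for 32-bit decodings.
inline std::size_t packed_size(
    uint8_t flags_and_source,
    uint16_t size_length_sib_dest,
    std::size_t word_size
) {
    assert(word_size == 2 || word_size == 4);

    std::size_t extensions = 0;
    if(flags_and_source & HasImmediate) ++extensions;
    if(flags_and_source & HasDisplacement) ++extensions;
    // A zero in [b13, b10] signals that a length extension word is present.
    if(((size_length_sib_dest >> 10) & 0xf) == 0) ++extensions;

    return 4 + extensions * word_size;
}

// Walks a buffer of packed instructions and returns the offset at which each
// one starts. Assumed byte order per instruction: operation, flags byte, then
// the 16-bit field in host byte order, then any extension words.
inline std::vector<std::size_t> instruction_offsets(
    const std::vector<uint8_t> &buffer,
    std::size_t word_size
) {
    std::vector<std::size_t> offsets;
    std::size_t offset = 0;

    while(offset + 4 <= buffer.size()) {
        uint16_t fields;
        std::memcpy(&fields, &buffer[offset + 2], sizeof(fields));

        const std::size_t size = packed_size(buffer[offset + 1], fields, word_size);
        if(offset + size > buffer.size()) break;  // Truncated final instruction.

        offsets.push_back(offset);
        offset += size;  // Instruction n+1 begins immediately after instruction n.
    }
    return offsets;
}
```

With 16-bit decodings an instruction that carries an immediate, a displacement and a length extension comes to `4 + 3 * 2 = 10` bytes, and with 32-bit decodings to `4 + 3 * 4 = 16` bytes, matching the upper bounds quoted above.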