prog8/virtualmachine/src/prog8/vm/Instructions.kt

package prog8.vm
/*
Virtual machine:
65536 virtual registers, 16 bits wide, can also be used as 8 bits. r0-r65535
65536 bytes of memory. Thus memory pointers (addresses) are limited to 16 bits.
Value stack, max 128 entries of 1 byte each.
Status register bits: Carry, Zero, Negative.
Instruction serialization format possibility:
OPCODE: 1 byte
TYPECODE: 1 byte
REGISTER 1: 2 bytes
REGISTER 2: 2 bytes
REG3/MEMORY/VALUE: 2 bytes
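
The layout above adds up to 8 bytes per instruction. A minimal sketch of packing one instruction that way follows; the function name encodeInstruction, the numeric opcode/type codes and the big-endian byte order are assumptions made for illustration, not something this file defines.

```kotlin
// Hypothetical sketch: pack one instruction into the 8-byte layout listed above.
// Byte order (big-endian) and the numeric opcode/type codes are assumptions.
fun encodeInstruction(opcode: Int, typecode: Int, reg1: Int, reg2: Int, value: Int): ByteArray {
    require(opcode in 0..255 && typecode in 0..255) { "opcode and typecode must fit in one byte" }
    require(reg1 in 0..0xffff && reg2 in 0..0xffff && value in 0..0xffff) { "operands must fit in 16 bits" }
    fun hi(v: Int) = (v shr 8).toByte()     // most significant byte of a 16-bit operand
    fun lo(v: Int) = (v and 0xff).toByte()  // least significant byte of a 16-bit operand
    return byteArrayOf(
        opcode.toByte(), typecode.toByte(),
        hi(reg1), lo(reg1),
        hi(reg2), lo(reg2),
        hi(value), lo(value)
    )
}
```

A fixed-size record like this keeps instruction numbers and byte offsets trivially convertible (offset = number times 8), at the cost of unused bytes for instructions with few operands.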
Instructions with Type come in variants 'b' and 'w' (omitting it in the instruction means 'b' by default)
Currently NO support for 24 or 32 bits, and FLOATING POINT is not implemented yet either. FP would be
a separate set of registers and instructions/routines anyway.
Apart from INCM and DECM, *only* LOAD and STORE instructions have a possible memory operand; all other instructions use only registers or an immediate value.
LOAD/STORE
----------
All have type b or w.
load reg1, value - load immediate value into register
loadm reg1, address - load reg1 with value in memory address
loadi reg1, reg2 - load reg1 with value in memory indirect, memory pointed to by reg2
loadx reg1, reg2, address - load reg1 with value in memory address, indexed by value in reg2
loadr reg1, reg2 - load reg1 with value in register reg2
storem reg1, address - store reg1 in memory address
storei reg1, reg2 - store reg1 in memory indirect, memory pointed to by reg2
storex reg1, reg2, address - store reg1 in memory address, indexed by value in reg2
storez address - store zero in memory address
storezi reg1 - store zero in memory pointed to by reg1
storezx reg1, address - store zero in memory address, indexed by value in reg1
CONTROL FLOW
------------
Possible subroutine call convention:
Set parameters in Reg 0, 1, 2... before call. Return value set in Reg 0 before return.
But you can decide whatever you want because here we just care about jumping and returning the flow of control.
Saving/restoring registers is possible with PUSH and POP instructions.
jump location - continue running at instruction number given by location
jumpi reg1 - continue running at instruction number in reg1
call location - save current instruction location+1, continue execution at instruction nr given by location
calli reg1 - save current instruction location+1, continue execution at instruction number in reg1
syscall value - do a systemcall identified by call number
return - restore last saved instruction location and continue at that instruction
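
The call/return behaviour described above can be sketched with a stack of saved instruction numbers. The class and member names here (ControlFlow, pc, returnStack, ret) are hypothetical and only illustrate the bookkeeping; the real VM may organise this differently.

```kotlin
// Hypothetical sketch of jump/call/return: the VM keeps a stack of saved instruction numbers.
class ControlFlow {
    var pc = 0                                   // number of the current instruction
    private val returnStack = ArrayDeque<Int>()  // saved locations, consumed by 'return'

    fun jump(location: Int) {
        pc = location                            // continue running at the given instruction
    }

    fun call(location: Int) {
        returnStack.addLast(pc + 1)              // save current instruction location+1
        pc = location
    }

    fun ret() {
        pc = returnStack.removeLast()            // restore the last saved location
    }
}
```

For example, a call made at instruction 10 to location 50 resumes at instruction 11 after the matching return.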
BRANCHING
---------
All have type b or w except the branches that only check status bits.
bstcc location - branch to location if Status bit Carry is Clear
bstcs location - branch to location if Status bit Carry is Set
bsteq location - branch to location if Status bit Zero is set
bstne location - branch to location if Status bit Zero is not set
bstneg location - branch to location if Status bit Negative is set
bstpos location - branch to location if Status bit Negative is not set
bz reg1, location - branch to location if reg1 is zero
bnz reg1, location - branch to location if reg1 is not zero
beq reg1, reg2, location - branch to location if reg1 == reg2
bne reg1, reg2, location - branch to location if reg1 != reg2
blt reg1, reg2, location - branch to location if reg1 < reg2 (unsigned)
blts reg1, reg2, location - branch to location if reg1 < reg2 (signed)
ble reg1, reg2, location - branch to location if reg1 <= reg2 (unsigned)
bles reg1, reg2, location - branch to location if reg1 <= reg2 (signed)
bgt reg1, reg2, location - branch to location if reg1 > reg2 (unsigned)
bgts reg1, reg2, location - branch to location if reg1 > reg2 (signed)
bge reg1, reg2, location - branch to location if reg1 >= reg2 (unsigned)
bges reg1, reg2, location - branch to location if reg1 >= reg2 (signed)
seq reg1, reg2, reg3 - set reg1=1 if reg2 == reg3, otherwise set reg1=0
sne reg1, reg2, reg3 - set reg1=1 if reg2 != reg3, otherwise set reg1=0
slt reg1, reg2, reg3 - set reg1=1 if reg2 < reg3 (unsigned), otherwise set reg1=0
slts reg1, reg2, reg3 - set reg1=1 if reg2 < reg3 (signed), otherwise set reg1=0
sle reg1, reg2, reg3 - set reg1=1 if reg2 <= reg3 (unsigned), otherwise set reg1=0
sles reg1, reg2, reg3 - set reg1=1 if reg2 <= reg3 (signed), otherwise set reg1=0
sgt reg1, reg2, reg3 - set reg1=1 if reg2 > reg3 (unsigned), otherwise set reg1=0
sgts reg1, reg2, reg3 - set reg1=1 if reg2 > reg3 (signed), otherwise set reg1=0
sge reg1, reg2, reg3 - set reg1=1 if reg2 >= reg3 (unsigned), otherwise set reg1=0
sges reg1, reg2, reg3 - set reg1=1 if reg2 >= reg3 (signed), otherwise set reg1=0
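
The signed and unsigned variants differ only in how the same 16 register bits are interpreted. A small sketch for the word-sized "less than" case follows; the helper names ltUnsigned and ltSigned are hypothetical, and the slt/slts functions merely mirror the set-instructions listed above.

```kotlin
// Hypothetical helpers: compare two 16-bit register values as unsigned or as signed words.
fun ltUnsigned(a: Int, b: Int): Boolean = (a and 0xffff) < (b and 0xffff)
fun ltSigned(a: Int, b: Int): Boolean = a.toShort() < b.toShort()

// slt / slts style result: 1 when the comparison holds, otherwise 0.
fun slt(a: Int, b: Int): Int = if (ltUnsigned(a, b)) 1 else 0
fun slts(a: Int, b: Int): Int = if (ltSigned(a, b)) 1 else 0
```

With the bit pattern $ffff in the first operand and 1 in the second, the unsigned comparison sees 65535 < 1 (false) while the signed comparison sees -1 < 1 (true), which is why both variants exist.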
TODO: support for the other prog8 special branching instructions if_XX (bpl, bmi etc.)
but we don't have any 'processor flags' whatsoever in the vm so it's a bit weird
INTEGER ARITHMETIC
------------------
All have type b or w. Note: result types are the same as operand types! E.g. byte*byte->byte.
ext reg1 - reg1 = unsigned extension of reg1 (which in practice just means clearing the MSB / MSW) (latter not yet implemented as we don't have longs yet)
exts reg1 - reg1 = signed extension of reg1 (byte to word, or word to long) (note: the latter, exts.w, is not yet implemented as we don't have longs yet)
inc reg1 - reg1 = reg1+1
incm address - memory at address += 1
dec reg1 - reg1 = reg1-1
decm address - memory at address -= 1
neg reg1 - reg1 = sign negation of reg1
add reg1, reg2, reg3 - reg1 = reg2+reg3 (unsigned + signed)
sub reg1, reg2, reg3 - reg1 = reg2-reg3 (unsigned + signed)
mul reg1, reg2, reg3 - unsigned multiply reg1=reg2*reg3 note: byte*byte->byte, no type extension to word!
div reg1, reg2, reg3 - unsigned division reg1=reg2/reg3 note: division by zero yields max unsigned int $ff/$ffff
mod reg1, reg2, reg3 - remainder (modulo) of unsigned division reg1=reg2%reg3 note: division by zero yields max unsigned int $ff/$ffff
sqrt reg1, reg2 - reg1 is the square root of reg2 (for both .w and .b, the result is a byte)
sgn reg1, reg2 - reg1 is the sign of reg2 (0, 1 or -1)
cmp reg1, reg2 - set processor status bits C, N, Z according to comparison of reg1 with reg2. (semantics taken from 6502/68000 CMP instruction)
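NOTE: because mul results are constrained (truncated) to remain in 8 or 16 bits, there is NO NEED for separate signed/unsigned mul instructions: the result is identical. This does not hold for division in general (e.g. $fe/$02 is $7f unsigned but $ff signed), so div and mod as listed are unsigned only.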
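
The truncation argument can be checked for the byte-sized multiply with a short sketch. The function names are hypothetical, and the sketch covers multiplication only; it makes no claim about division.

```kotlin
// Sketch: 8-bit multiply truncated to a byte, computed from both interpretations of the operands.
fun mulUnsignedByte(a: Int, b: Int): Int = ((a and 0xff) * (b and 0xff)) and 0xff
fun mulSignedByte(a: Int, b: Int): Int = (a.toByte() * b.toByte()) and 0xff

// True when both interpretations give the same low 8 bits for every pair of byte values.
fun truncatedMulAgrees(): Boolean =
    (0..255).all { a -> (0..255).all { b -> mulUnsignedByte(a, b) == mulSignedByte(a, b) } }
```

The two agree because a signed byte and its unsigned reading differ by a multiple of 256, and such multiples vanish once the product is reduced modulo 256.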
LOGICAL/BITWISE
---------------
All have type b or w.
and reg1, reg2, reg3 - reg1 = reg2 bitwise and reg3
or reg1, reg2, reg3 - reg1 = reg2 bitwise or reg3
xor reg1, reg2, reg3 - reg1 = reg2 bitwise xor reg3
lsrx reg1, reg2, reg3 - reg1 = multi-shift reg2 right by reg3 bits + set Carry to shifted bit
asrx reg1, reg2, reg3 - reg1 = multi-shift reg2 right by reg3 bits (signed) + set Carry to shifted bit
lslx reg1, reg2, reg3 - reg1 = multi-shift reg2 left by reg3 bits + set Carry to shifted bit
lsr reg1 - shift reg1 right by 1 bit + set Carry to shifted bit
asr reg1 - shift reg1 right by 1 bit (signed) + set Carry to shifted bit
lsl reg1 - shift reg1 left by 1 bit + set Carry to shifted bit
ror reg1 - rotate reg1 right by 1 bit, not using carry + set Carry to shifted bit
roxr reg1 - rotate reg1 right by 1 bit, using carry + set Carry to shifted bit
rol reg1 - rotate reg1 left by 1 bit, not using carry + set Carry to shifted bit
roxl reg1 - rotate reg1 left by 1 bit, using carry + set Carry to shifted bit
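
The difference between rotating with and without the carry can be sketched for the byte-sized right rotates. The functions below are hypothetical stand-ins that return the new value together with the new Carry; how the VM actually stores the flag is not shown in this file.

```kotlin
// Hypothetical 8-bit rotate helpers; each returns the new value and the new Carry flag.
fun ror(value: Int): Pair<Int, Boolean> {
    val v = value and 0xff
    val out = v and 1                                      // bit shifted out on the right
    val result = (v shr 1) or (out shl 7)                  // the same bit re-enters at the top
    return Pair(result, out == 1)
}

fun roxr(value: Int, carryIn: Boolean): Pair<Int, Boolean> {
    val v = value and 0xff
    val out = v and 1                                      // bit shifted out on the right
    val result = (v shr 1) or (if (carryIn) 0x80 else 0)   // the old Carry enters at the top
    return Pair(result, out == 1)
}
```

Rotating $01 with ror gives $80, whereas roxr with a clear carry gives $00; in both cases Carry ends up set, because the shifted-out bit was 1.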
MISC
----
clc - clear Carry status bit
sec - set Carry status bit
nop - do nothing
breakpoint - trigger a breakpoint
copy reg1, reg2, length - copy memory from ptr in reg1 to ptr in reg2, length bytes
copyz reg1, reg2 - copy memory from ptr in reg1 to ptr in reg2, stop after first 0-byte
msig [b, w] reg1, reg2 - reg1 becomes the most significant byte (or word) of the word (or int) in reg2 (.w not yet implemented; requires 32 bits regs)
swapreg reg1, reg2 - swap values in reg1 and reg2
concat [b, w] reg1, reg2, reg3 - reg1 = concatenated lsb/lsw of reg2 and lsb/lsw of reg3 into new word or int (int not yet implemented; requires 32bits regs)
push [b, w] reg1 - push value in reg1 on the stack
pop [b, w] reg1 - pop value from stack into reg1
*/
enum class Opcode {
    NOP,
    LOAD,
    LOADM,
    LOADI,
    LOADX,
    LOADR,
    STOREM,
    STOREI,
    STOREX,
    STOREZ,
    STOREZI,
    STOREZX,

    JUMP,
    JUMPI,
    CALL,
    CALLI,
    SYSCALL,
    RETURN,

    BSTCC,
    BSTCS,
    BSTEQ,
    BSTNE,
    BSTNEG,
    BSTPOS,
    BZ,
    BNZ,
    BEQ,
    BNE,
    BLT,
    BLTS,
    BGT,
    BGTS,
    BLE,
    BLES,
    BGE,
    BGES,
    SEQ,
    SNE,
    SLT,
    SLTS,
    SGT,
    SGTS,
    SLE,
    SLES,
    SGE,
    SGES,

    INC,
    INCM,
    DEC,
    DECM,
    NEG,
    ADD,
    SUB,
    MUL,
    DIV,
    MOD,
    SQRT,
    SGN,
    CMP,
    EXT,
    EXTS,

    AND,
    OR,
    XOR,
    ASRX,
    LSRX,
    LSLX,
    ASR,
    LSR,
    LSL,
    ROR,
    ROXR,
    ROL,
    ROXL,

    CLC,
    SEC,
    PUSH,
    POP,
    MSIG,
    SWAPREG,
    CONCAT,
    BREAKPOINT
}

val OpcodesWithAddress = setOf(
    Opcode.LOADM,
    Opcode.LOADX,
    Opcode.STOREM,
    Opcode.STOREX,
    Opcode.STOREZ,
    Opcode.STOREZX
)

enum class VmDataType {
    BYTE,
    WORD
    // TODO add INT (32-bit)? INT24 (24-bit)?
}

data class Instruction(
    val opcode: Opcode,
    val type: VmDataType?=null,
    val reg1: Int?=null,        // 0-$ffff
    val reg2: Int?=null,        // 0-$ffff
    val reg3: Int?=null,        // 0-$ffff
    val value: Int?=null,       // 0-$ffff
    val symbol: List<String>?=null   // alternative to value
) {
    init {
        // check the supplied operands against the format registered for this opcode
        val format = instructionFormats.getValue(opcode)
        if(format.datatypes.isNotEmpty() && type==null)
            throw IllegalArgumentException("missing type")

        if(format.reg1 && reg1==null ||
           format.reg2 && reg2==null ||
           format.reg3 && reg3==null)
            throw IllegalArgumentException("missing a register")

        if(!format.reg1 && reg1!=null ||
           !format.reg2 && reg2!=null ||
           !format.reg3 && reg3!=null)
            throw IllegalArgumentException("too many registers")

        if(format.value && (value==null && symbol==null))
            throw IllegalArgumentException("missing a value or symbol")
    }

    override fun toString(): String {
        val result = mutableListOf(opcode.name.lowercase())

        // type suffix, or just a separating space for untyped instructions
        when(type) {
            VmDataType.BYTE -> result.add(".b ")
            VmDataType.WORD -> result.add(".w ")
            else -> result.add(" ")
        }
        reg1?.let {
            result.add("r$it")
            result.add(",")
        }
        reg2?.let {
            result.add("r$it")
            result.add(",")
        }
        reg3?.let {
            result.add("r$it")
            result.add(",")
        }
        value?.let {
            result.add(it.toString())
            result.add(",")
        }
        symbol?.let {
            result.add("_" + it.joinToString("."))
        }
        // drop the trailing separator left behind by the last operand
        if(result.last() == ",")
            result.removeLast()
        return result.joinToString("").trimEnd()
    }
}

data class InstructionFormat(val datatypes: Set<VmDataType>, val reg1: Boolean, val reg2: Boolean, val reg3: Boolean, val value: Boolean)

private val NN = emptySet<VmDataType>()
private val BW = setOf(VmDataType.BYTE, VmDataType.WORD)

@Suppress("BooleanLiteralArgument")
val instructionFormats = mutableMapOf(
    // opcode                                types reg1   reg2   reg3   value
    Opcode.NOP        to InstructionFormat(NN, false, false, false, false),
    Opcode.LOAD       to InstructionFormat(BW, true,  false, false, true ),
    Opcode.LOADM      to InstructionFormat(BW, true,  false, false, true ),
    Opcode.LOADI      to InstructionFormat(BW, true,  true,  false, false),
    Opcode.LOADX      to InstructionFormat(BW, true,  true,  false, true ),
    Opcode.LOADR      to InstructionFormat(BW, true,  true,  false, false),
    Opcode.SWAPREG    to InstructionFormat(BW, true,  true,  false, false),
    Opcode.STOREM     to InstructionFormat(BW, true,  false, false, true ),
    Opcode.STOREI     to InstructionFormat(BW, true,  true,  false, false),
    Opcode.STOREX     to InstructionFormat(BW, true,  true,  false, true ),
    Opcode.STOREZ     to InstructionFormat(BW, false, false, false, true ),
    Opcode.STOREZI    to InstructionFormat(BW, true,  false, false, false),
    Opcode.STOREZX    to InstructionFormat(BW, true,  false, false, true ),

    Opcode.JUMP       to InstructionFormat(NN, false, false, false, true ),
    Opcode.JUMPI      to InstructionFormat(NN, true,  false, false, false),
    Opcode.CALL       to InstructionFormat(NN, false, false, false, true ),
    Opcode.CALLI      to InstructionFormat(NN, true,  false, false, false),
    Opcode.SYSCALL    to InstructionFormat(NN, false, false, false, true ),
    Opcode.RETURN     to InstructionFormat(NN, false, false, false, false),

    Opcode.BSTCC      to InstructionFormat(NN, false, false, false, true ),
    Opcode.BSTCS      to InstructionFormat(NN, false, false, false, true ),
    Opcode.BSTEQ      to InstructionFormat(NN, false, false, false, true ),
    Opcode.BSTNE      to InstructionFormat(NN, false, false, false, true ),
    Opcode.BSTNEG     to InstructionFormat(NN, false, false, false, true ),
    Opcode.BSTPOS     to InstructionFormat(NN, false, false, false, true ),
    Opcode.BZ         to InstructionFormat(BW, true,  false, false, true ),
    Opcode.BNZ        to InstructionFormat(BW, true,  false, false, true ),
    Opcode.BEQ        to InstructionFormat(BW, true,  true,  false, true ),
    Opcode.BNE        to InstructionFormat(BW, true,  true,  false, true ),
    Opcode.BLT        to InstructionFormat(BW, true,  true,  false, true ),
    Opcode.BLTS       to InstructionFormat(BW, true,  true,  false, true ),
    Opcode.BGT        to InstructionFormat(BW, true,  true,  false, true ),
    Opcode.BGTS       to InstructionFormat(BW, true,  true,  false, true ),
    Opcode.BLE        to InstructionFormat(BW, true,  true,  false, true ),
    Opcode.BLES       to InstructionFormat(BW, true,  true,  false, true ),
    Opcode.BGE        to InstructionFormat(BW, true,  true,  false, true ),
    Opcode.BGES       to InstructionFormat(BW, true,  true,  false, true ),
    Opcode.SEQ        to InstructionFormat(BW, true,  true,  true,  false),
    Opcode.SNE        to InstructionFormat(BW, true,  true,  true,  false),
    Opcode.SLT        to InstructionFormat(BW, true,  true,  true,  false),
    Opcode.SLTS       to InstructionFormat(BW, true,  true,  true,  false),
    Opcode.SGT        to InstructionFormat(BW, true,  true,  true,  false),
    Opcode.SGTS       to InstructionFormat(BW, true,  true,  true,  false),
    Opcode.SLE        to InstructionFormat(BW, true,  true,  true,  false),
    Opcode.SLES       to InstructionFormat(BW, true,  true,  true,  false),
    Opcode.SGE        to InstructionFormat(BW, true,  true,  true,  false),
    Opcode.SGES       to InstructionFormat(BW, true,  true,  true,  false),

    Opcode.INC        to InstructionFormat(BW, true,  false, false, false),
    Opcode.INCM       to InstructionFormat(BW, false, false, false, true ),
    Opcode.DEC        to InstructionFormat(BW, true,  false, false, false),
    Opcode.DECM       to InstructionFormat(BW, false, false, false, true ),
    Opcode.NEG        to InstructionFormat(BW, true,  false, false, false),
    Opcode.ADD        to InstructionFormat(BW, true,  true,  true,  false),
    Opcode.SUB        to InstructionFormat(BW, true,  true,  true,  false),
    Opcode.MUL        to InstructionFormat(BW, true,  true,  true,  false),
    Opcode.DIV        to InstructionFormat(BW, true,  true,  true,  false),
    Opcode.MOD        to InstructionFormat(BW, true,  true,  true,  false),
    Opcode.SQRT       to InstructionFormat(BW, true,  true,  false, false),
    Opcode.SGN        to InstructionFormat(BW, true,  true,  false, false),
    Opcode.CMP        to InstructionFormat(BW, true,  true,  false, false),
    Opcode.EXT        to InstructionFormat(BW, true,  false, false, false),
    Opcode.EXTS       to InstructionFormat(BW, true,  false, false, false),

    Opcode.AND        to InstructionFormat(BW, true,  true,  true,  false),
    Opcode.OR         to InstructionFormat(BW, true,  true,  true,  false),
    Opcode.XOR        to InstructionFormat(BW, true,  true,  true,  false),
    Opcode.ASRX       to InstructionFormat(BW, true,  true,  true,  false),
    Opcode.LSRX       to InstructionFormat(BW, true,  true,  true,  false),
    Opcode.LSLX       to InstructionFormat(BW, true,  true,  true,  false),
    Opcode.ASR        to InstructionFormat(BW, true,  false, false, false),
    Opcode.LSR        to InstructionFormat(BW, true,  false, false, false),
    Opcode.LSL        to InstructionFormat(BW, true,  false, false, false),
    Opcode.ROR        to InstructionFormat(BW, true,  false, false, false),
    Opcode.ROXR       to InstructionFormat(BW, true,  false, false, false),
    Opcode.ROL        to InstructionFormat(BW, true,  false, false, false),
    Opcode.ROXL       to InstructionFormat(BW, true,  false, false, false),

    Opcode.MSIG       to InstructionFormat(BW, true,  true,  false, false),
    Opcode.PUSH       to InstructionFormat(BW, true,  false, false, false),
    Opcode.POP        to InstructionFormat(BW, true,  false, false, false),
    Opcode.CONCAT     to InstructionFormat(BW, true,  true,  true,  false),
    Opcode.CLC        to InstructionFormat(NN, false, false, false, false),
    Opcode.SEC        to InstructionFormat(NN, false, false, false, false),
    Opcode.BREAKPOINT to InstructionFormat(NN, false, false, false, false)
)