All Projects → ddfreyne → rcpu

ddfreyne / rcpu

Licence: other
VM emulator and assembler written in Crystal

Programming Languages

crystal
512 projects
Makefile
30231 projects
ruby
36898 projects - #4 most used programming language

RCPU

RCPU is a virtual machine emulator and assembler written in Crystal.

Caution
This project is very much a work in progress.

Building

You will need Crystal to build RCPU. Once you have Crystal, run make, which will generate the compiled assembler and emulator binaries in the current directory.

Usage

To assemble a file, use rcpu-assemble, passing the input filename (with the rcs extension). This will generate a corresponding rcb file. For example:

% ./rcpu-assemble samples/fib.rcs

To run a file, use rcpu-emulate, passing the input filename (with the rcb extension). For example:

% ./rcpu-emulate samples/fib.rcb

The rcs extension is used for RCPU source files, while rcb is for RCPU binary (i.e. compiled) files.

Tests

Run make spec to compile and run the tests.

Architecture

Registers

Name Code Size (bytes) Purpose

r0

0x00

4

general-purpose

r1

0x01

4

general-purpose

r2

0x02

4

general-purpose

r3

0x03

4

general-purpose

r4

0x04

4

general-purpose

r5

0x05

4

general-purpose

r6

0x06

4

general-purpose

r7

0x07

4

general-purpose

rpc

0x08

4

program counter (a.k.a instruction pointer)

rflags

0x09

1

contains result of cmp(i)

rsp

0x0a

4

stack pointer

rbp

0x0b

4

base pointer

rr

0x0c

4

return value

Instruction format

Instructions are of variable length.

The first byte is the opcode. Register arguments are one byte long, and label/immediate arguments are four bytes long.

Instruction set

Mnemonic Opcode Arguments Effect

Function handling

call(i)

0x01, 0x02

reg/imm

ret

0x03

(none)

Stack management

push(i)

0x04, 0x05

reg/imm

push a0 onto stack

pop

0x06

reg

pop into a0

Branching

j(i)

0x07, 0x08

reg/imm

pc ← a0

je(i)

0x09, 0x0a

reg/imm

if = then pc ← a0

jne(i)

0x0b, 0x0c

reg/imm

if ≠ then pc ← a0

jg(i)

0x0d, 0x0e

reg/imm

if > then pc ← a0

jge(i)

0x0f, 0x10

reg/imm

if ≥ then pc ← a0

jl(i)

0x11, 0x12

reg/imm

if < then pc ← a0

jle(i)

0x13, 0x14

reg/imm

if ≤ then pc ← a0

Arithmetic

cmp(i)

0x15, 0x16

reg, reg/imm

(see below)

mod(i)

0x17, 0x18

reg, reg, reg/imm

a0 ← a1 % a2

add(i)

0x19, 0x1a

reg, reg, reg/imm

a0 ← a1 + a2

sub(i)

0x1b, 0x1c

reg, reg, reg/imm

a0 ← a1 - a2

mul(i)

0x1d, 0x1e

reg, reg, reg/imm

a0 ← a1 * a2

div(i)

0x1f, 0x20

reg, reg, reg/imm

a0 ← a1 / a2

xor(i)

0x21, 0x22

reg, reg, reg/imm

a0 ← a1 ^ a2

or(i)

0x23, 0x24

reg, reg, reg/imm

a0 ← a1 | a2

and(i)

0x25, 0x26

reg, reg, reg/imm

a0 ← a1 & a2

shl(i)

0x27, 0x28

reg, reg, reg/imm

a0 ← a1 << a2

shr(i)

0x29, 0x2a

reg, reg, reg/imm

a0 ← a1 >> a2

not

0x2b

reg, reg

a0 ← ~a1

Register handling

mov

0x2c

reg, reg

a0 ← a1

li

0x2d

reg, imm

a0 ← a1

Memory handling

lw

0x2e

reg, imm

(see below)

lh

0x2f

reg, imm

(see below)

lb

0x30

reg, imm

(see below)

sw

0x31

reg, imm

(see below)

sh

0x32

reg, imm

(see below)

sb

0x33

reg, imm

(see below)

Special

sleep

0xfd

imm

sleep for a0 ms

prn

0xfe

reg

print a0

halt

0xff

(none)

stops emulation

cmp(i) updates the flags register and sets the 0x01 bit to true if the arguments are equal, and the 0x02 bit to true if the first argument is greater than the second.

lw, lh and lb load data from memory into a register. lw loads a word (4 bytes), lh loads a half word (2 bytes) and lb loads a byte. Similarly, sw, sh and sb store data from a register into memory.

Several opcodes have an (i) variant. These variants take a four-byte immediate argument (meaning the data is encoded in the instruction) rather than a register name. For opcodes that have immediate variants, the Opcode column contains the non-immediate variant followed by the immediate variant.

Label arguments are identical to immediate arguments.

Memory layout

Range Use

0x00 – …

Program code

… – 0xffff

Stack (grows downward, word-aligned)

0x00010000 - 0x00014b00

Video memory (160 by 120 pixels, 1 byte per pixel)

Assembly language syntax

A lines can be an instruction line, a label line, or a data directive line. Blank lines are ignored.

Comments start with the # character and can appear anywhere on a line, including a blank line. For example:

	# load coords
	li r2, 0                 # x (in px)
	li r3, 0                 # y (in px)

An instruction line starts with a tab character, followed by the instruction mnemonic, and arguments separated by commas. For example:

	li r3, 0                 # y (in px)
	jei @print-string-done
	addi rsp, rsp, 12

Register arguments are indicated with an r prefix (e.g. rsp or r0).

Immediate values can be given in decimal (e.g. 123), in hexadecimal (starting with 0x, e.g. 0xfe), or in binary (starting with 0b, e.g. 0b10010000).

Label arguments start with the @ character.

A label line starts with an identifier, followed by a colon. For example:

print-string-loop:

A data directive line starts with a period, followed by the directive name, followed by optional arguments. For example:

.byte 0x73 # s
.byte 0x6c # l
.byte 0x65 # e
.byte 0x65 # e
.byte 0x70 # p

.word @char-left-parenthesis  # (
.word @char-right-parenthesis # )
.word @char-question-mark     # * - TODO
.word @char-question-mark     # + - TODO
.word @char-comma             # ,
.word @char-dash              # -
.word @char-period            # .
.word @char-slash             # /

The supported data directives are .byte, .half and .word; they insert a byte, a half word (two bytes) or a word (four bytes), respectively.

See the examples in the samples directory for inspiration.

To do

  • Finish implementing all opcodes

  • Tests

  • Line/column numbers in parser error messages

  • RCPU prefix in binaries

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].