Machine architecture

Machine model

The machine model is simple:

  • The memory contains both the program and the data.
  • The memory contains 4096 bytes and is addressed from address 0 to address 4095.
  • 32-bit registers are numbered from r0 to r15.
  • Register r0 is the instruction pointer (IP): it contains the address of the next instruction to be executed.
  • Reads from memory and writes to memory are 32 bits wide and do not need to be aligned.
  • Data stored in memory uses little-endian ordering.
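The machine model above can be sketched as a small state container. This is only an illustration: the class name `Machine`, loading the program at address 0, starting with IP at 0, and zero-filling unused memory are all assumptions that the description does not specify.

```python
MEMORY_SIZE = 4096    # bytes, addresses 0..4095
NUM_REGISTERS = 16    # r0..r15, r0 is the instruction pointer (IP)


class Machine:
    """Hypothetical container for the machine state (illustration only)."""

    def __init__(self, program: bytes):
        if len(program) > MEMORY_SIZE:
            raise ValueError("program does not fit in memory")
        # Program and data share one memory.
        # Assumption: the program is loaded at address 0 and the rest is zero.
        self.memory = bytearray(MEMORY_SIZE)
        self.memory[:len(program)] = program
        # Each register holds an unsigned 32-bit value.
        # Assumption: all registers, including r0 (IP), start at 0.
        self.registers = [0] * NUM_REGISTERS


machine = Machine(bytes([7]))   # a one-byte program: exit
print(len(machine.memory), len(machine.registers))
```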

Execution model

A step of execution happens as follows:

  1. The instruction at IP is decoded. Its length depends on the instruction (in other words, instruction size is variable). Note that each element of the instruction (the opcode and each argument, e.g. rᵢ) is encoded on exactly one byte.
  2. IP is advanced to point after the decoded instruction and its arguments.
  3. The decoded instruction is executed.
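As a rough sketch of steps 1 and 2, decoding can be driven by a table of instruction lengths. The lengths below are derived from the instruction set in this document (one byte for the opcode plus one byte per argument); the names `INSTRUCTION_LENGTH` and `fetch` are hypothetical, and failure handling is deliberately left out here.

```python
# Total length in bytes (opcode + arguments) for each opcode.
INSTRUCTION_LENGTH = {1: 4, 2: 3, 3: 3, 4: 4, 5: 4, 6: 2, 7: 1, 8: 2}


def fetch(memory: bytes, ip: int):
    """Decode the instruction at ip and return (opcode, args, next_ip)."""
    opcode = memory[ip]
    length = INSTRUCTION_LENGTH[opcode]
    args = list(memory[ip + 1 : ip + length])
    # next_ip points just after the decoded instruction and its arguments.
    return opcode, args, ip + length


program = bytes([5, 10, 2, 1, 7])   # sub r10 r2 r1, then exit
print(fetch(program, 0))
print(fetch(program, 4))
```

The value returned as `next_ip` is what would be written to r0 before the instruction is executed, so an instruction that writes to r0 overrides it.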

Machine failure

Here are the reasons the machine can fail:

  1. The memory at IP does not contain a valid instruction.
  2. The instruction does not fit entirely in memory.
  3. The instruction references an invalid register.
  4. The instruction references an invalid memory address.

A failure must cause the execution of the current step to return an error: the execution is not allowed to panic. The machine must no longer be used after an error.
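The four failure cases could be checked along the following lines. This is a sketch under assumptions: `MachineError` and the `check_*` helpers are hypothetical names, an IP outside memory is treated as a failure, and a memory address is considered valid only if all 4 bytes of the 32-bit access lie inside memory (the description does not say this explicitly). In Python an error is reported by raising an exception that the caller of the step catches; in a language such as Rust the same checks would return an error value instead.

```python
MEMORY_SIZE = 4096
NUM_REGISTERS = 16
INSTRUCTION_LENGTH = {1: 4, 2: 3, 3: 3, 4: 4, 5: 4, 6: 2, 7: 1, 8: 2}


class MachineError(Exception):
    """Hypothetical error type reported by a failed execution step."""


def check_fetch(memory, ip):
    """Return the instruction length, or raise for failures 1 and 2."""
    if not 0 <= ip < MEMORY_SIZE:
        raise MachineError(f"IP {ip} is outside memory")
    opcode = memory[ip]
    if opcode not in INSTRUCTION_LENGTH:
        raise MachineError(f"invalid instruction {opcode} at {ip}")
    length = INSTRUCTION_LENGTH[opcode]
    if ip + length > MEMORY_SIZE:
        raise MachineError(f"instruction at {ip} does not fit in memory")
    return length


def check_register(index):
    """Raise for failure 3: only r0..r15 exist."""
    if not 0 <= index < NUM_REGISTERS:
        raise MachineError(f"invalid register r{index}")


def check_address(address):
    """Raise for failure 4 (assumption: the whole 4-byte access must fit)."""
    if not 0 <= address <= MEMORY_SIZE - 4:
        raise MachineError(f"invalid memory address {address}")
```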

Instruction set

Instruction   Arguments     Effect
move if       1 rᵢ rⱼ rₖ    if rₖ ≠ 0 then rᵢ ← rⱼ
store         2 rᵢ rⱼ       mem[rᵢ] ← rⱼ
load          3 rᵢ rⱼ       rᵢ ← mem[rⱼ]
loadimm       4 rᵢ L H      rᵢ ← extend(signed(H L))
sub           5 rᵢ rⱼ rₖ    rᵢ ← rⱼ - rₖ
out           6 rᵢ          output char(rᵢ)
exit          7             exit the program
out number    8 rᵢ          output decimal(rᵢ)

Detailed description

The number of instructions is very limited. We will give at least one example for every instruction. All examples assume that:

  • register r1 contains 10
  • register r2 contains 25
  • register r3 contains 0x1234ABCD
  • register r4 contains 0
  • register r5 contains 65

All other registers are unused in the examples.

If the example contains 1 1 2 3, it means that the instruction is made of bytes 1, 1, 2 and 3 (4 bytes total) in this order.

move if

1 rᵢ rⱼ rₖ: if register rₖ contains a non-zero value, copy the content of register rⱼ into register rᵢ; otherwise do nothing.

Examples:

  • 1 1 2 3: since register r3 contains a non-zero value (0x1234ABCD), register r1 is set to 25 (the value of register r2).
  • 1 1 2 4: since register r4 contains a zero value, nothing happens.

store

2 rᵢ rⱼ: store the content of register rⱼ into the memory starting at address pointed by register rᵢ using little-endian representation.

Example:

  • 2 2 3: the content of register r3 (0x1234ABCD) will be stored at addresses [25, 26, 27, 28] since register r2 contains 25. 0xCD will be stored into address 25, 0xAB into address 26, 0x34 into address 27, and 0x12 into address 28.
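The byte layout of this example can be reproduced with Python's `int.to_bytes`. The helper name `store32` is hypothetical, and address validation is omitted in this sketch.

```python
def store32(memory: bytearray, address: int, value: int) -> None:
    """Write a 32-bit value at address, least significant byte first."""
    memory[address:address + 4] = (value & 0xFFFFFFFF).to_bytes(4, "little")


memory = bytearray(4096)
store32(memory, 25, 0x1234ABCD)            # the example "2 2 3" with r2 = 25
print([hex(b) for b in memory[25:29]])     # ['0xcd', '0xab', '0x34', '0x12']
```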

load

3 rᵢ rⱼ: load the 32-bit content from memory at address pointed by register rⱼ into register rᵢ using little-endian representation.

Example:

  • 3 1 2: since register r2 contains 25, load the 32-bit value at addresses [25, 26, 27, 28] into register r1. In little-endian format, this means that if address 25 contains 0xCD, address 26 contains 0xAB, address 27 contains 0x34, and address 28 contains 0x12, the value loaded into register r1 will be 0x1234ABCD.
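A minimal sketch of the same read using `int.from_bytes`; the helper name `load32` is hypothetical and address validation is omitted.

```python
def load32(memory: bytearray, address: int) -> int:
    """Read 4 bytes starting at address as a little-endian 32-bit value."""
    return int.from_bytes(memory[address:address + 4], "little")


memory = bytearray(4096)
memory[25:29] = bytes([0xCD, 0xAB, 0x34, 0x12])
print(hex(load32(memory, 25)))   # 0x1234abcd
```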

loadimm

4 rᵢ L H: interpret H and L respectively as the high-order and the low-order bytes of a 16-bit signed value, sign-extend it to 32 bits, and store it into register rᵢ.

Examples:

  • 4 1 0x11 0x70: store 0x00007011 into register r1
  • 4 1 0x11 0xd0: store 0xffffd011 into register r1

Note how sign extension transforms a positive 16-bit value (0x7011 == 28689) into a positive 32-bit value (0x00007011 == 28689) and a negative 16-bit value (0xd011 == -12271) into a negative 32-bit value (0xffffd011 == -12271).
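Sign extension can be sketched as follows, keeping the register content as an unsigned 32-bit integer. The function name `loadimm_value` is hypothetical.

```python
def loadimm_value(low: int, high: int) -> int:
    """Combine L and H into a 16-bit value and sign-extend it to 32 bits."""
    value = (high << 8) | low
    if value & 0x8000:          # bit 15 set: the 16-bit value is negative
        value |= 0xFFFF0000     # copy the sign bit into the upper 16 bits
    return value


print(hex(loadimm_value(0x11, 0x70)))   # 0x7011
print(hex(loadimm_value(0x11, 0xD0)))   # 0xffffd011
```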

sub

5 rᵢ rⱼ rₖ: store the content of register rⱼ minus the content of register rₖ into register rᵢ.

Arithmetic wraps around in case of overflow. For example, 0 - 1 returns 0xffffffff, and 0 - 0xffffffff returns 1.

Examples:

  • 5 10 2 1: store 15 into r10 (register r2, which contains 25, minus register r1, which contains 10).
  • 5 10 4 1: store -10 (0xfffffff6) into r10 (register r4, which contains 0, minus register r1, which contains 10).
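Because Python integers are unbounded, wrap-around has to be made explicit by masking the result to 32 bits. A minimal sketch, with the hypothetical name `sub32`:

```python
def sub32(a: int, b: int) -> int:
    """Subtract two 32-bit register values, wrapping around on overflow."""
    return (a - b) & 0xFFFFFFFF


print(sub32(25, 10))               # 15
print(hex(sub32(0, 10)))           # 0xfffffff6
print(hex(sub32(0, 1)))            # 0xffffffff
print(sub32(0, 0xFFFFFFFF))        # 1
```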

out

6 rᵢ: display on the standard output the character whose Unicode value is stored in the 8 low bits of register rᵢ.

Examples:

  • 6 5: output "A" since the 8 low bits of register r5 contain 65, which is the Unicode code point for "A".
  • 6 3: output "Í" since the 8 low bits of register r3 contain 0xCD, which is the Unicode code point for "Í".

Note: the 8 low bits must be converted into a character, and that character must be displayed (not the raw byte).
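In Python the conversion is a mask followed by `chr`. The helper name `out_char` is hypothetical; it returns the character instead of printing it so that the result is easy to inspect.

```python
def out_char(value: int) -> str:
    """Return the character whose Unicode code point is the low 8 bits."""
    return chr(value & 0xFF)


print(out_char(65))            # A
print(out_char(0x1234ABCD))    # Í
```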

exit

7: exit the current program

Example:

  • 7: get out.

out number

8 rᵢ: output the signed number stored in register rᵢ in decimal.

Examples:

  • 8 5: output "65" since register r5 contains 65.
  • 8 3: output "305441741" since register r3 contains 0x1234ABCD.
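If registers are kept as unsigned 32-bit integers, the value has to be reinterpreted as signed before printing. A sketch with the hypothetical name `to_signed32`:

```python
def to_signed32(value: int) -> int:
    """Reinterpret an unsigned 32-bit value as a two's-complement number."""
    return value - 0x100000000 if value & 0x80000000 else value


print(to_signed32(65))             # 65
print(to_signed32(0x1234ABCD))     # 305441741
print(to_signed32(0xFFFFFFF6))     # -10
```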

Note

Note that some common operations are absent from this instruction set. For example, there is no add operation, but a+b can be computed as a-(0-b). There are also no jump or conditional jump operations. Those can be emulated by writing to register r0 (IP): a loadimm or move if whose destination is r0 changes the address of the next instruction to be executed.
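Both tricks can be tried with a deliberately reduced interpreter. The sketch below is not a complete implementation: it handles only move if, loadimm, sub, exit and out number, performs no failure checks, and assumes that the program is loaded at address 0 with all registers initially 0. The names `run`, `ADD_PROGRAM` and `COUNTDOWN` are hypothetical.

```python
LENGTHS = {1: 4, 4: 4, 5: 4, 7: 1, 8: 2}   # subset of the instruction set


def run(program: bytes, max_steps: int = 10_000) -> str:
    """Run a program and return everything it printed with out number."""
    mem = bytearray(4096)
    mem[:len(program)] = program
    reg = [0] * 16
    out = []
    for _ in range(max_steps):
        ip = reg[0]
        op = mem[ip]
        if op not in LENGTHS:
            raise ValueError(f"opcode {op} not handled by this sketch")
        a = mem[ip + 1 : ip + LENGTHS[op]]
        reg[0] = ip + LENGTHS[op]            # IP advances before execution
        if op == 1 and reg[a[2]] != 0:       # move if
            reg[a[0]] = reg[a[1]]
        elif op == 4:                        # loadimm (sign-extended)
            v = a[1] | (a[2] << 8)
            reg[a[0]] = v | 0xFFFF0000 if v & 0x8000 else v
        elif op == 5:                        # sub (wrapping)
            reg[a[0]] = (reg[a[1]] - reg[a[2]]) & 0xFFFFFFFF
        elif op == 8:                        # out number (signed)
            v = reg[a[0]]
            out.append(str(v - (1 << 32) if v & 0x80000000 else v))
        elif op == 7:                        # exit
            break
    return "".join(out)


ADD_PROGRAM = bytes([
    4, 1, 10, 0,     # r1 <- 10
    4, 2, 25, 0,     # r2 <- 25
    5, 10, 4, 2,     # r10 <- r4 - r2 = 0 - r2  (r4 is still 0)
    5, 10, 1, 10,    # r10 <- r1 - r10 = r1 + r2
    8, 10,           # output r10
    7,               # exit
])

COUNTDOWN = bytes([
    4, 1, 3, 0,      # address 0:  r1 <- 3 (counter)
    4, 2, 1, 0,      # address 4:  r2 <- 1
    4, 3, 12, 0,     # address 8:  r3 <- 12 (address of the loop body)
    8, 1,            # address 12: output r1
    5, 1, 1, 2,      # address 14: r1 <- r1 - r2
    1, 0, 3, 1,      # address 18: if r1 != 0 then r0 <- r3 (jump to 12)
    7,               # address 22: exit
])

print(run(ADD_PROGRAM))   # 35
print(run(COUNTDOWN))     # 321
```

The conditional jump works because IP has already been advanced when the move if executes: if r1 is zero, r0 keeps pointing at the exit instruction; otherwise it is overwritten with the loop address.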