FiveForths is a tiny Forth written in hand-coded RISC-V assembly, initially designed to run on the 32-bit Longan Nano (GD32VF103) microcontroller.
This document provides technical details about what's under the hood of FiveForths.
- FiveForths specification
- Primitives list
- Registers list
- Source files list
- Memory map
- Word header
- Hash format
- Other Forths
Below is a list of specifications for FiveForths. Most of them can be changed in the source files:
- Support for 32-bit `GD32VF103` microcontrollers on the Longan Nano board
- CPU configured to run at `8 MHz`
- UART configured for `115,200` baud rate, `8N1`
- Data Stack (DSP) size: `256 Bytes`
- Return Stack (RSP) size: `256 Bytes`
- Terminal Input Buffer (TIB) size: `256 Bytes`
- Pad Buffer (PAD) size: `256 Bytes`
- Threading mode: `Indirect Threaded Code` (ITC)
- Word header size: `12 Bytes` (3 CELLs)
- Word name storage: `32-bit hash` (djb2)
- Return character newline: `\n`
- Maximum word length: `32 characters`
- Stack effects comments support `( x -- x )`: yes
- Stack and memory overflow/underflow protection: yes
- Backslash comments support `\ comment`: yes
- Decimal string number input support `12345` and `-202`: yes
- Hex string number input support `0xCAFE4241` and `-0xCA`: yes
- Multiline code definitions support: no
- OK message: `" ok\n"`
- ERROR message: `" ?\n"`
Below is the list of Forth primitives. There are currently 19 primitives:
Word | Stack Effects | Description |
---|---|---|
`reboot` | `( -- )` | Reboot the entire system and initialize memory |
`@` | `( addr -- x )` | Fetch memory at addr |
`!` | `( x addr -- )` | Store x at addr |
`sp@` | `( -- addr )` | Get current data stack pointer |
`rp@` | `( -- addr )` | Get current return stack pointer |
`0=` | `( x -- f )` | -1 if top of stack is 0, 0 otherwise |
`+` | `( x1 x2 -- n )` | Add the two values at the top of the stack |
`nand` | `( x1 x2 -- n )` | Bitwise NAND the two values at the top of the stack |
`lit` | `( -- n )` | Get the next word from IP and push it to the stack, increment IP |
`exit` | `( r:addr -- )` | Resume execution at address at the top of the return stack |
`key` | `( -- x )` | Read 8-bit character from uart input |
`emit` | `( x -- )` | Write 8-bit character to uart output |
`tib` | `( -- addr )` | Store TIB variable value in top of data stack |
`state` | `( -- addr )` | Store STATE variable value in top of data stack |
`>in` | `( -- addr )` | Store TOIN variable value in top of data stack |
`here` | `( -- addr )` | Store HERE variable value in top of data stack |
`latest` | `( -- addr )` | Store LATEST variable value in top of data stack |
`:` | `( -- )` | Start the definition of a new word |
`;` | `( -- )` | End the definition of a new word |
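
Because `nand` is the only bitwise primitive and `+` the only arithmetic one, other operations have to be derived from them. The Python sketch below models that idea on 32-bit cells. The helper names (`invert`, `and_`, `or_`, `negate`, `minus`) are hypothetical and only illustrate the construction; they are not taken from the FiveForths sources.

```python
MASK = 0xFFFFFFFF  # one cell is 32 bits

def nand(x1, x2):
    """Model of the nand primitive: bitwise NAND on 32-bit cells."""
    return ~(x1 & x2) & MASK

def plus(x1, x2):
    """Model of the + primitive: addition that wraps at 32 bits."""
    return (x1 + x2) & MASK

def zero_equals(x):
    """Model of the 0= primitive: -1 (all bits set) if x is 0, else 0."""
    return MASK if x == 0 else 0

# Everything below is derived from the primitives above (hypothetical names).
def invert(x):
    return nand(x, x)                   # NAND of a value with itself flips every bit

def and_(x1, x2):
    return invert(nand(x1, x2))         # AND is an inverted NAND

def or_(x1, x2):
    return nand(invert(x1), invert(x2)) # De Morgan: a OR b = NOT(NOT a AND NOT b)

def negate(x):
    return plus(invert(x), 1)           # two's complement negation

def minus(x1, x2):
    return plus(x1, negate(x2))         # subtraction as addition of the negation

print(hex(and_(0b1100, 0b1010)), hex(or_(0b1100, 0b1010)), hex(minus(5, 7)))
```

The last value printed is the 32-bit two's complement pattern for -2, which is how a negative result appears in a cell.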
The Forth registers are assigned to the RISC-V registers listed below. The source files also use additional registers, such as the temporaries (`t0` to `t6`).
Forth name | RISC-V name | Description |
---|---|---|
DSP | sp | data stack pointer |
W | a0 | working register |
X | a1 | working register |
Y | a2 | working register |
Z | a3 | working register |
FP | s0 | frame pointer (unused for now) |
IP | s1 | instruction pointer |
RSP | s2 | return stack pointer |
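
To show how `IP`, `W`, `RSP` and `DSP` cooperate under Indirect Threaded Code, here is a toy Python model of the inner interpreter. It is a conceptual sketch only: memory is a Python list indexed by cell rather than by byte, codewords hold Python methods instead of machine-code addresses, and the class name `ForthVM` and the `halt` helper are inventions for this example, not part of FiveForths.

```python
class ForthVM:
    """Toy model of indirect threaded code; one list element is one cell."""

    def __init__(self):
        self.mem = []      # cells: numbers, addresses, or codeword handlers
        self.dstack = []   # data stack (DSP)
        self.rstack = []   # return stack (RSP)
        self.ip = 0        # IP: next cell of the thread being executed
        self.w = 0         # W: address of the codeword being executed
        self.running = False
        # Each primitive is a single codeword cell holding its handler.
        self.LIT = self.comma(self.lit)
        self.PLUS = self.comma(self.plus)
        self.EXIT = self.comma(self.exit_)
        self.HALT = self.comma(self.halt)

    def comma(self, value):
        """Append one cell and return its address."""
        self.mem.append(value)
        return len(self.mem) - 1

    def colon(self, *body):
        """Lay down a colon word: a docol codeword followed by its thread."""
        addr = self.comma(self.docol)
        for cell in body:
            self.comma(cell)
        return addr

    def docol(self):
        self.rstack.append(self.ip)  # remember where the caller resumes
        self.ip = self.w + 1         # continue with the cell after the codeword

    def exit_(self):
        self.ip = self.rstack.pop()  # resume the caller's thread

    def lit(self):
        self.dstack.append(self.mem[self.ip])  # push the inline value
        self.ip += 1                           # and skip over it

    def plus(self):
        x2, x1 = self.dstack.pop(), self.dstack.pop()
        self.dstack.append((x1 + x2) & 0xFFFFFFFF)

    def halt(self):
        self.running = False

    def execute(self, word):
        """Run one word through a tiny bootstrap thread: [word, HALT]."""
        start = self.comma(word)
        self.comma(self.HALT)
        self.ip, self.running = start, True
        while self.running:          # the NEXT loop
            self.w = self.mem[self.ip]
            self.ip += 1
            self.mem[self.w]()       # indirect jump through the codeword

vm = ForthVM()
five = vm.colon(vm.LIT, 2, vm.LIT, 3, vm.PLUS, vm.EXIT)  # like : five 2 3 + ;
ten = vm.colon(five, five, vm.PLUS, vm.EXIT)             # like : ten five five + ;
vm.execute(ten)
print(vm.dstack)
```

The double lookup in the loop (`mem[ip]` gives the codeword address, `mem[w]` gives the handler) is what makes the threading indirect.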
The firmware binary is built using GNU `as`, so all source files have the lowercase `.s` extension.
Filename | Description |
---|---|
fiveforths.s | Loads the actual source files from src/ |
`src/` | |
01-variables-constants.s | Some constants which are stored in Flash memory, but which may point to memory addresses to be used as variables |
02-macros.s | Macros to avoid repeating code throughout the source files |
03-interrupts.s | The interrupt initialization and handling routines |
04-io-helpers.s | Helpers to send and receive characters over the UART |
05-internal-functions.s | Functions called by the interpreter such as hashing and lookup functions |
06-initialization.s | Initialization routines when the board is booted or reset |
07-error-handling.s | Error handling routines and messages to be printed |
08-forth-primitives.s | The Forth primitive words |
09-interpreter.s | The interpreter functions to process UART characters, execute and compile words |
`src/boards/<board>/` | |
boards.s | Variables and constants specific to the `<board>` |
linker.ld | Linker script specific to the `<board>` |
`src/mcus/<mcu>/` | |
mcu.s | Variables and constants specific to the `<mcu>` |
The stack size is defined in `mcu.s` and defaults to 256 bytes each for the Data stack, the Return stack, and the Terminal buffer. The Data and Return stacks grow downward from the top of the memory. The Terminal buffer grows upward from the start of the Variables area. The User Dictionary grows upward from the bottom of the memory. Currently 5 Cells are used to store variables. An additional 64 Cells are reserved for the Pad area, which can grow upward or downward. The Pad area is not exposed in Forth and should be used exclusively by internal code or new Assembly primitives, as an in-memory scratchpad that does not affect the other stacks or the user dictionary.
                                            Top
+-----------------+-------------------------+
| Memory Map      | Size (1 Cell = 4 Bytes) |
+-------------------------------------------+
|                 |                         | |
| Data Stack      | 64 Cells (256 Bytes)    | |
|                 |                         | v
+-------------------------------------------+
|                 |                         | |
| Return Stack    | 64 Cells (256 Bytes)    | |
|                 |                         | v
+-------------------------------------------+
|                 |                         | ^
| Terminal Buffer | 64 Cells (256 Bytes)    | |
|                 |                         | |
+-------------------------------------------+
|                 |                         |
| Variables       | 5 Cells (20 Bytes)      |
|                 |                         |
+-------------------------------------------+
|                 |                         | ^
| Pad Area        | 64 Cells (256 Bytes)    | |
|                 |                         | v
+-------------------------------------------+
|                 |                         | ^
|                 |                         | |
|                 |                         | |
| User Dictionary | Variable size           | |
|                 |                         | |
|                 |                         | |
|                 |                         | |
+-----------------+-------------------------+
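
The memory map can be turned into concrete address ranges with a little arithmetic. The Python sketch below does this under stated assumptions: the RAM base `0x20000000` and the 32 KB RAM size are illustrative values for a GD32VF103 part, the regions are assumed to be packed back to back from the top of memory with no gaps, and the names `REGIONS` and `layout` are made up for this example. The real values come from the board and MCU files.

```python
CELL = 4  # bytes per cell

# Assumed values for illustration only.
RAM_BASE = 0x20000000
RAM_SIZE = 32 * 1024

# Regions in the order shown in the memory map, from the top of memory
# downward, with their sizes in cells.
REGIONS = [
    ("Data Stack", 64),
    ("Return Stack", 64),
    ("Terminal Buffer", 64),
    ("Variables", 5),
    ("Pad Area", 64),
]

def layout(ram_base=RAM_BASE, ram_size=RAM_SIZE):
    """Return {name: (low_address, high_address_exclusive)} for every region."""
    result = {}
    top = ram_base + ram_size
    for name, cells in REGIONS:
        low = top - cells * CELL
        result[name] = (low, top)
        top = low
    # Whatever is left below the fixed regions is available to the dictionary.
    result["User Dictionary"] = (ram_base, top)
    return result

for name, (low, high) in layout().items():
    print(f"{name:16} 0x{low:08X} .. 0x{high:08X}  ({high - low} bytes)")
```

With these assumed values the fixed regions take 1044 bytes, which leaves 31724 bytes for the User Dictionary.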
A dictionary word header contains 3 Cells (3 x 32 bits = 12 bytes). The Link is the value of the last defined word, which is stored in the variable `LATEST`. The Hash is generated by the `djb2_hash` function. The Codeword is the address of the `.addr` label, which jumps to the `docol` function.
+----------+----------+-------------+
|   Link   |   Hash   |  Codeword   |
+----------+----------+-------------+
  32-bits    32-bits      32-bits
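
The three-cell header makes the dictionary a linked list that is searched from the newest word backward. The Python sketch below models that search. It assumes a Link of 0 marks the end of the chain, ignores flags and word bodies, and uses made-up names (`add_word`, `find_word`), addresses and hash values, so it should be read as an illustration rather than as the FiveForths lookup routine.

```python
CELL = 4  # bytes per cell

def add_word(memory, here, latest, name_hash, codeword):
    """Write a 3-cell header at `here` and return (new_here, new_latest)."""
    memory[here] = latest                 # Link: the previously defined word
    memory[here + CELL] = name_hash       # Hash: identifies the word name
    memory[here + 2 * CELL] = codeword    # Codeword: what to run for this word
    return here + 3 * CELL, here

def find_word(memory, latest, name_hash):
    """Follow the Link cells from `latest` until the hash matches."""
    addr = latest
    while addr != 0:                      # assumption: 0 ends the chain
        if memory[addr + CELL] == name_hash:
            return addr
        addr = memory[addr]               # step to the previous word
    return None

memory = {}                               # sparse model: address -> cell value
here, latest = 0x20000000, 0              # hypothetical start of the dictionary
here, latest = add_word(memory, here, latest, 0x1111, 0xA0)
here, latest = add_word(memory, here, latest, 0x2222, 0xB0)
here, latest = add_word(memory, here, latest, 0x1111, 0xC0)  # redefinition

found = find_word(memory, latest, 0x1111)
print(hex(found), hex(memory[found + 2 * CELL]))
```

Because the search starts at the most recent header, a redefined word shadows the older definition with the same hash.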
The hash occupies one 32-bit Cell. The 8 most significant bits (the last 8 when counting from the LSB) hold the Flags (3 bits) and the Length (5 bits) of the word, and the remaining 24 bits hold the hash value.
             32-bit hash
+-------+--------+------------------+
| FLAGS | LENGTH |       HASH       |
+-------+--------+------------------+
 3-bits   5-bits        24-bits
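
The packing shown in the diagram can be sketched in Python as follows. This is a hedged model: it uses the classic additive djb2 variant (start at 5381, multiply by 33, add each byte), places the fields exactly as drawn above, and the names `pack_hash` and `unpack_hash` are hypothetical. The actual `djb2_hash` routine in the assembly sources may differ in detail.

```python
HASH_MASK = 0x00FFFFFF   # low 24 bits: hash value
LENGTH_SHIFT = 24        # next 5 bits: word length
LENGTH_MASK = 0x1F
FLAGS_SHIFT = 29         # top 3 bits: flags
FLAGS_MASK = 0x7

def djb2(name):
    """Classic djb2: h = h * 33 + byte, kept to 32 bits."""
    h = 5381
    for byte in name.encode("ascii"):
        h = (h * 33 + byte) & 0xFFFFFFFF
    return h

def pack_hash(name, flags=0):
    """Combine flags, length and the truncated hash into one 32-bit cell."""
    return (
        ((flags & FLAGS_MASK) << FLAGS_SHIFT)
        | ((len(name) & LENGTH_MASK) << LENGTH_SHIFT)
        | (djb2(name) & HASH_MASK)
    )

def unpack_hash(cell):
    """Split a packed cell back into (flags, length, hash24)."""
    return (
        (cell >> FLAGS_SHIFT) & FLAGS_MASK,
        (cell >> LENGTH_SHIFT) & LENGTH_MASK,
        cell & HASH_MASK,
    )

cell = pack_hash("dup", flags=0b100)
print(f"0x{cell:08X}", unpack_hash(cell))
```

Storing the length next to the truncated hash means two names only collide when they have both the same length and the same 24-bit hash.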
This document would be incomplete without listing other Forths which inspired me and are worth checking out:
- colorForth, by Chuck Moore (inventor)
- Mecrisp, batteries-included with FPGA support
- sectorforth, super tiny 16-bit implementation
- jonesforth, 32-bit heavily documented
- derzforth, 32-bit risc-v inspiration
- nasmjf, the devlog idea and well documented
- CamelForth, by Brad Rodriguez (Moving Forth)
- muforth, the sum of all Forth knowledge
- planckforth, how to bootstrap a Forth
Additional information can be found in the devlogs.
Now that you've grokked the reference, you're ready to read the other documents below:
- TUTORIALS: a quick guide to get started
- EXPLAIN: learn the story behind FiveForths
- HOWTO: build, usage, and code examples in Forth and RISC-V Assembly
FiveForths documentation and source code copyright © 2021~ Alexander Williams and licensed under the permissive open source MIT license.