We need some C background to fluently read the runtime source. If you are not familiar with C, here are some good resources:
- Beginner C resource list↗
- Brian Kernighan & Dennis Richie, The C Programming Language↗
- Ben Klemens, 21st Century C: Tips from the New School↗
- Lawrence Angrave, _System Programming↗
Runtime Structure and Responsibilities
If the vision of Urbit is to implement [2 [0 3] 0 2]
as a frozen lifecycle function, then it needs some scaffolding on any real system. Real computers have memories, chipset architectures, operating system conventions, and other affordances (and limitations). Conventionally, an operating system takes explicit care of such niceties, which is one reason why calling Urbit an “OS” has been controversial. The runtime interpreter is designed to take Nock as a specification and produce a practical computer of it.
Today, there are two primary Nock executable runtimes: Vere and Sword (née Ares). (Jaque, the JVM experiment, and King Haskell have fallen by the wayside.)
- Vere is written in C and is the standard runtime for Arvo.
- Sword is written in Rust and aims to solve some deep theoretical challenges to producing efficient Nock on a contemporary chipset. Sword is under development by Zorp↗ and formerly with contributions from Tlon Corporation↗, and the Urbit Foundation↗.
We will take Vere as the normative runtime for Core Academy.
As we mentioned last time in the boot sequence lesson, the runtime spawns the king (king.c
) and indirectly the serf (serf.c
) processes. These both run for the lifetime of the Urbit process.
There are two competing frames for how to structure the Urbit process: king/serf and urth/mars.
King v. serf separates the Nock and Arvo material from the I/O and event log material. It has the advantage that (per the whitepaper), “The serf only ever talks to the king, while the king talks with both the serf and Unix.”
The king process is in charge of:
- IPC
- Event log
- Unix effects including I/O
- Stateless Nock interpreter
The serf process is the Nock runtime and bears responsible for:
- Nock virtual machine (tracking current state of Arvo as a noun and
++poke
ing it with nouns) - Bytecode interpretation
- Jet dashboard
- Snapshotting
- Noun allocation for Arvo
The Mars/Urth split↗ reframes the worker process so that it includes the event log with the current serf responsibility (“Mars”), thus enabling online event log management and truncation.
The Structure of Vere’s Source
Vere is provided in the urbit/vere
repo. It is built from the pkg/
directory and contains the following top-level folders:
.├── c3├── ent├── noun├── ur├── urcrypt└── vere
/c3
contains the types and definitions to enable thec3
logical system.c3
is the set of C conventions which Vere enforces. These include well-specified integer types, tooling for loobeans (instead of booleans), and motes (#define
s for short Urbit words). “The C3 style uses Hoon style TLV variable names, with a quasi Hungarian syntax.” There are no Urbit-specific requirements for C3, which could otherwise just be a general-purpose C discipline.Like aura bitwidth markers, C documents programmer intent but does not generally enforce it. Most of the parts of
c3
are simply lapidary terms for C99 types.- Scan the files in
/c3
.
- Scan the files in
/ent
provides entropy for the runtime. Entropy is derived from/dev/urandom
, which is a special file that provides pseudorandom numbers derived from system noise./dev/urandom
produces machine randomness as close to true randomness as possible↗, including seeds like network latency and keystroke latency to seed the cryptographically secure pseudo-random number generator (CSPRNG)./noun
is the gorilla, containingu3
(the noun library) and the jets. We'll go into it in detail with the system architecture in a moment in Sectionu3
./ur
, is like/ent
a single-purpose library, in this case for bitstreams and serialization./urcrypt
is a C library to standardize cryptographic calls across a number of libraries.This library is a dependency for both Vere and Ares, and is in the process of being moved into a standalone repo.
/vere
contains the runtime architecture itself, the king and the serf and related tooling, as independent fromu3
.
file | purpose |
---|---|
auto.c | I/O drivers |
benchmarks.c | performance tests |
dawn.c | key validation for bootstrapping |
disk.c | database reads and writes for event log |
foil.c | file synching |
king.c | main runtime loop |
lord.c | manage IPC between king and serf |
main.c | setup and entrypoint for runtime execution |
mars.c | Mars event log replay (see Mars/Urth split above) |
newt.c | noun blob messages |
pier.c | manage pier (files on host OS disk) |
save.c | save events to pier |
serf.c | the serf itself |
time.c | Unix/Urbit time operations |
vere.h | shared Vere-specific struct s |
ward.c | lifecycle management for structures |
u3
Nouns
A noun is either an atom or a cell. However, we have to decide what this implementation looks like in a language like C, that prefers arrays and pointers. u3
is the noun library, which features Urbit-specific memory operations, tracing and profiling tools, and so forth.
A u3_noun
is a 32-bit c3_w
= uint32_t
. The first bits indicate what kind of value the noun is and thus how to approach it:
Bit 31 | Bit 30 | Meaning |
---|---|---|
1 | 1 | Indirect cell (pom ) |
1 | 0 | Indirect atom (pug ) |
0 | ·— | Direct atom (cat ) |
An indirect noun is a dog
. For indirect nouns, bits 29–0 are a word pointer into the loom. In addition, 0xffff.ffff
is u3_none
, which is “not a noun”.
A common pattern is to extract values from a noun into C-typed values, carry out the manipulation, and then wrap them back into the noun. Furthermore, the value from an arbitrary atom may in fact be a bignum, and so GMP↗ is used to manage these values.
- Examine
/noun/jets/a/add.c
, in particularu3qa_add
.
One of the painful parts of working with u3
is the reference counting system. Reference counting↗ is an expedient to handle tracking the number of pointers to an object in memory so that the memory can be freed at the appropriate time. Since C doesn't provide reference counting support in the language, we must manually track these and free the value only when the refcount goes to zero. The relevant functions are u3k
to gain a refcount and u3z
to lose one.
There are also two different protocols for reference counting, used by different parts of the system:
transfer
semantics relinquishes a refcount of any sent values. Most functions behave this way, which means that you don't have to think about de-allocating values if they've been sent elsewhere.retain
semantics hold onto the refcount even if the value is sent elsewhere. The functions which useretain
semantics tend to inspect or query nouns rather than make or modify nouns.
The
u3
convention is that, unless otherwise specified, all functions have transfer semantics - with the exception of the prefixes:u3r
,u3x
,u3z
,u3q
andu3w
. Also, within jet directoriesa
throughf
(but notg
), internal functions retain (for historical reasons).
- Compare
u3ka_add
andu3qa_add
.
u3
is designed to make some guarantees for the programmer. It's not Urbit itself, but it's designed to be an implementation platform for Urbit. Thus:
- Every event is logged internally before it enters
u3
. - A permanent state noun maintains a single reference.
- Any event can be aborted without damaging the permanent state (“solid state”).
- We snapshot the permanent state and can prune logs.
We will discuss the specifics of the memory model next week in ca06
when we discuss the loom and the road model.
- “Land of Nouns”; note particularly the section
u3: reference protocols
, labeledTHIS IS THE MOST CRITICAL SECTION IN THE `u3` DOCUMENTATION.
Read that if nothing else.
Library
The contents of /noun
constitute the u3
noun library. Functions are organized by file and prefix into certain namespaces by operation. Because u3
is a library, we can't cleanly separate it into serf/king components, although certain modules do have close identification with one or the other.
prefix | purpose | .h | .c |
---|---|---|---|
u3a_ | allocation | allocate.h | allocate.c |
u3e_ | persistence | events.h | events.c |
u3h_ | hashtables | hashtable.h | hashtable.c |
u3i_ | noun construction | imprison.h | imprison.c |
u3j_ | jet control | jets.h | jets.c |
u3l_ | logging | log.h | log.c |
u3m_ | system management | manage.h | manage.c |
u3n_ | nock computation | nock.h | nock.c |
u3o_ | command-line options | options.h | options.c |
u3r_ | noun access (error returns) | retrieve.h | retrieve.c |
u3s_ | noun serialization | serial.h | serial.c |
u3t_ | profiling | trace.h | trace.c |
u3u_ | urth (memory management) | urth.h | urth.c |
u3v_ | arvo | vortex.h | vortex.c |
u3x_ | noun access (error crashes) | xtract.h | xtract.c |
u3z_ | memoization | zave.h | zave.c |
u3k[a-g] | jets (transfer, C args) | jets/k.h | jets/[a-g]/*.c |
u3q[a-g] | jets (retain, C args) | jets/q.h | jets/[a-g]/*.c |
u3w[a-g] | jets (retain, nock core) | jets/w.h | jets/[a-g]/*.c |
u3a
defines memory allocation functions. These are used throughout, but we'll discuss it a bit more when we talk about the king. You will quickly run into reference counting features, likeu3k
(u3a_gain()
) to gain a refcount andu3z
(u3a_lose()
) to lose one.u3e
manages the loom.u3h
provides fast custom hashing for the runtime.u3i
puts a value (expected to be ac3
type) into a noun. (Look at this one now.)u3l
supports logging.u3m
manages the system: bootsu3
, makes a pier, handles crashes, etc.u3n
implements the Nock bytecode interpreter.u3o
parses the manifold command-line options of Urbit and writes them into globals.u3r
extracts a value from a noun, with au3_weak
on failure. (Look at this one now.)u3s
implements noun serialization (++jam
and++cue
).u3t
provides tracing for crashes.u3u
offers memory management tooling (deduplication and memory mapping).u3v
supports Arvo interaction.u3x
extracts a value from a noun., with a crash on failureu3z
supports~+
siglus rune memoization.
If you work much in Vere, you will get used to seeing these. There are basically two broad categories of functions: single-use functions (like starting a pier, u3m_pier
) and utility functions (like writing a value to a noun, u3i_word
).
Return to
/noun/jets/a/add.c
and look atu3wa_add
andu3ka_add
.
The Serf
The serf process is the Nock runtime and bears responsible for:
- Nock virtual machine (tracking current state of Arvo as a noun and
++poke
ing it with nouns) - Bytecode interpretation
- Jet dashboard
If you examine /vere/serf.c
, you can get a feel for how it is organized. See e.g. u3_serf_work
and callees.
Arvo Noun Management
/vere/vortex.c
, e.g.u3v_peek
,u3v_wish
, andu3v_poke_sure
.
Nock Bytecode Interpreter (u3n
)
/noun/nock.c
, e.g.u3n_nock_on
,u3n_slam_on
(calling convention for gates).
The end result of the Hoon compilation process is Nock code as a noun. This noun is evaluated by the runtime, but it is not actually directly run as such. Instead, the runtime builds an efficient bytecode stream and executes that instead to complete the calculation.
The Nock bytecode for any expression can be obtained using the %xray
raw hint.
> ~> %xray =+(2 [- +(-)]){[litb 2] snol head swap head bump ault halt}[2 3]> ~> %xray =+(2 [(add - -) +(-)]){[litb 2] snol [fask 4095] [kicb 1] snoc head swap [fabk 6] swap [fabk 6] auto musm [kicb 0] swap head bump ault halt}[4 3]
The Nock bytecode is defined in the OPCODES
macro in /noun/nock.c
and evaluated by _n_burn
in that same folder. The OPCODES
#define
uses the X macro↗, which is a bit of C deep lore.
As a consequence of the architecture of Vere today, we see a lot of expensive call overhead. For instance, when you wrap an %xray
hint around a core, you don't get the core itself—instead you get the formula that invokes the code.
> ~> %xray (met 3 (jam .)){[fask 1023] [kicb 3] snol head swap tail [lilb 3] swap tail [fask 1023] [kicb 2] snol head swap tail musm [kicb 1] auto musm [ticb 0] halt}984.339
Since many things are computed in virtual Nock, ++mock
, we have bail/trace/bounded computation at the price of slow virtualization.
One objective of Sword (née Ares), subject knowledge analysis, is to improve on Nock bytecode generation. This is being implemented into Vere as well.
Jet Dashboard (u3j
)
As we summarized when first introducing jets in ca00
, the runtime manages jets, including re-running them when playing back the event log history.
The jet dashboard is the system in the runtime that registers, validates, and runs jets: specific pieces of Nock code reimplemented in C for performance.
The jet dashboard maintains three jet state systems:
cold
state results from the logical execution history of the pier and consists of nouns.cold
jet state registers jets as they are found.cold
state ignore restarts.hot
state is the global jet dashboard and describes the actual set of jets loaded into the pier for the current running process. Calls tohot
state result from Nock Nine invocations of a core and an axis.hot
state is thus tied to process restart.warm
lists dependencies betweencold
andhot
state.warm
state can be cleared at any time and is cleared on restart.
The jet dashboard (u3j
, /noun/jets.c
) will not be explored in detail in Core Academy, but we do want to look at a couple of actual jets.
Jets
- Examine
/noun/jets/b/lent.c
,/noun/jets/b/turn.c
,/noun/jets/c/turn.c
,/noun/jets/e/rs.c
,/noun/jets/e/slaw.c
.
Many Urbit contributors may find jet composition to be their first serious encounter with the runtime. On the bright side, jetting is a fairly constrained and well-understood space. However, it has a complex interface for unpacking calls and nouns, including reference counting requirements.
u3w
functions are the main entry point (as identified in/noun/tree.c
). These unpack and sanity-check the sample, then call eitheru3q
oru3k
variants of the jet. The unpacking axes are hard-coded in/noun/xtract.h
.By convention,
u3q
andu3w
functions havetransfer
semantics.u3k
functions haveretain
semantics, so they are responsible tou3z
free their values after the computation completes.u3_none
(0x7fff.ffff
) is NOT the same asu3_nul
. A jet that returnsu3_none
punts the value back to the Hoon/Nock version.
Snapshotting
We'll cover snapshotting in the next lesson, ca06
.