We need some C background to fluently read the runtime source. If you are not familiar with C, here are some good resources:
- Beginner C resource list↗
- Brian Kernighan & Dennis Ritchie, The C Programming Language↗
- Ben Klemens, 21st Century C: Tips from the New School↗
- Lawrence Angrave, System Programming↗
Runtime Structure and Responsibilities
If the vision of Urbit is to implement [2 [0 3] 0 2] as a frozen lifecycle function, then it needs some scaffolding on any real system. Real computers have memories, chipset architectures, operating system conventions, and other affordances (and limitations). Conventionally, an operating system takes explicit care of such niceties, which is one reason why calling Urbit an “OS” has been controversial. The runtime interpreter is designed to take Nock as a specification and produce a practical computer of it.
Today, there are two primary Nock executable runtimes: Vere and Sword (née Ares). (Jaque, the JVM experiment, and King Haskell have fallen by the wayside.)
- Vere is written in C and is the standard runtime for Arvo.
- Sword is written in Rust and aims to solve some deep theoretical challenges to producing efficient Nock on a contemporary chipset. Sword is under development by Zorp↗, formerly with contributions from Tlon Corporation↗ and the Urbit Foundation↗.
We will take Vere as the normative runtime for Core Academy.
As we mentioned last time in the boot sequence lesson, the runtime spawns the king (king.c) and indirectly the serf (serf.c) processes. These both run for the lifetime of the Urbit process.
There are two competing frames for how to structure the Urbit process: king/serf and urth/mars.
King v. serf separates the Nock and Arvo material from the I/O and event log material. It has the advantage that (per the whitepaper), “The serf only ever talks to the king, while the king talks with both the serf and Unix.”
The king process is in charge of:
- IPC
- Event log
- Unix effects including I/O
- Stateless Nock interpreter
The serf process is the Nock runtime and is responsible for:
- Nock virtual machine (tracking current state of Arvo as a noun and ++pokeing it with nouns)
- Bytecode interpretation
- Jet dashboard
- Snapshotting
- Noun allocation for Arvo
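Since the king and the serf are separate processes, every exchange between them has to be cut into discrete messages on a byte stream. The following is a minimal sketch of length-prefixed framing, a common way to do that. The frame layout (a 4-byte little-endian length followed by the payload) and every `toy_` name are assumptions made for illustration; this is not a description of Vere's actual wire format.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical frame layout for this sketch only (not Vere's wire format):
   a 4-byte little-endian payload length, followed by the payload bytes. */
#define TOY_HEADER_LEN 4u

/* Write one frame into out. Returns the bytes written, or 0 if out is too small. */
size_t toy_frame_encode(const uint8_t *payload, uint32_t len,
                        uint8_t *out, size_t cap)
{
  assert(payload != NULL && out != NULL);
  if (cap < TOY_HEADER_LEN || cap - TOY_HEADER_LEN < len) return 0;
  out[0] = (uint8_t)(len & 0xffu);
  out[1] = (uint8_t)((len >> 8) & 0xffu);
  out[2] = (uint8_t)((len >> 16) & 0xffu);
  out[3] = (uint8_t)((len >> 24) & 0xffu);
  memcpy(out + TOY_HEADER_LEN, payload, len);
  return TOY_HEADER_LEN + (size_t)len;
}

/* Try to read one frame from in. Returns the bytes consumed,
   or 0 if the buffer does not yet hold a complete frame. */
size_t toy_frame_decode(const uint8_t *in, size_t avail,
                        const uint8_t **payload, uint32_t *len)
{
  assert(in != NULL && payload != NULL && len != NULL);
  if (avail < TOY_HEADER_LEN) return 0;
  uint32_t n = (uint32_t)in[0]
             | ((uint32_t)in[1] << 8)
             | ((uint32_t)in[2] << 16)
             | ((uint32_t)in[3] << 24);
  if (avail - TOY_HEADER_LEN < n) return 0;
  *payload = in + TOY_HEADER_LEN;
  *len = n;
  return TOY_HEADER_LEN + (size_t)n;
}
```

A reader on the receiving side would call `toy_frame_decode` each time more bytes arrive and act only when it returns a nonzero count.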
The Mars/Urth split↗ reframes the worker process so that it includes the event log with the current serf responsibility (“Mars”), thus enabling online event log management and truncation.
The Structure of Vere’s Source
Vere is provided in the urbit/vere repo. It is built from the pkg/ directory and contains the following top-level folders:
    .
    ├── c3
    ├── ent
    ├── noun
    ├── ur
    ├── urcrypt
    └── vere
- /c3 contains the types and definitions to enable the c3 logical system. c3 is the set of C conventions which Vere enforces. These include well-specified integer types, tooling for loobeans (instead of booleans), and motes (#defines for short Urbit words). “The C3 style uses Hoon style TLV variable names, with a quasi Hungarian syntax.” There are no Urbit-specific requirements for C3, which could otherwise just be a general-purpose C discipline. Like aura bitwidth markers, C documents programmer intent but does not generally enforce it. Most of the parts of c3 are simply lapidary terms for C99 types.
  - Scan the files in /c3.
- /ent provides entropy for the runtime. Entropy is derived from /dev/urandom, which is a special file that provides pseudorandom numbers derived from system noise. /dev/urandom produces machine randomness as close to true randomness as possible↗, using sources like network latency and keystroke latency to seed the cryptographically secure pseudo-random number generator (CSPRNG).
- /noun is the gorilla, containing u3 (the noun library) and the jets. We'll go into it in detail with the system architecture in a moment in Section u3.
- /ur is, like /ent, a single-purpose library, in this case for bitstreams and serialization.
- /urcrypt is a C library to standardize cryptographic calls across a number of libraries. This library is a dependency for both Vere and Ares, and is in the process of being moved into a standalone repo.
- /vere contains the runtime architecture itself, the king and the serf and related tooling, as independent from u3.
| file | purpose |
|---|---|
| auto.c | I/O drivers |
| benchmarks.c | performance tests |
| dawn.c | key validation for bootstrapping |
| disk.c | database reads and writes for event log |
| foil.c | file synching |
| king.c | main runtime loop |
| lord.c | manage IPC between king and serf |
| main.c | setup and entrypoint for runtime execution |
| mars.c | Mars event log replay (see Mars/Urth split above) |
| newt.c | noun blob messages |
| pier.c | manage pier (files on host OS disk) |
| save.c | save events to pier |
| serf.c | the serf itself |
| time.c | Unix/Urbit time operations |
| vere.h | shared Vere-specific structs |
| ward.c | lifecycle management for structures |
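Before reading further into the source, it may help to see what the c3 conventions mentioned above (fixed-width integer types, loobeans, and motes) look like in practice. The sketch below uses `toy_` stand-ins rather than the real names from /c3, and it assumes two things: that a loobean "yes" is 0 and "no" is 1, as in Hoon, and that a mote packs its characters with the first character in the lowest byte.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-ins modeled on the c3 conventions; the real definitions live in /c3. */
typedef uint8_t  toy_y;   /* a byte                     */
typedef uint32_t toy_w;   /* a 32-bit word              */
typedef uint8_t  toy_o;   /* a loobean: 0 = yes, 1 = no */

/* Loobeans invert the usual C habit: yes is 0 and no is 1 (assumed, as in Hoon). */
#define TOY_YES ((toy_o)0)
#define TOY_NO  ((toy_o)1)
#define TOY_SAY(x)    ((x) ? TOY_YES : TOY_NO)   /* C truth value -> loobean */
#define TOY_IS_YES(o) ((o) == TOY_YES)           /* loobean -> C truth value */

/* A mote packs a short word into an integer, first character in the low byte. */
#define TOY_MOTE3(a, b, c) \
  ((toy_w)(a) | ((toy_w)(b) << 8) | ((toy_w)(c) << 16))

/* Loobean negation; anything other than 0 or 1 is a programming error. */
toy_o toy_not(toy_o o)
{
  assert(o == TOY_YES || o == TOY_NO);
  return (o == TOY_YES) ? TOY_NO : TOY_YES;
}

/* A predicate that answers with a loobean rather than a C boolean. */
toy_o toy_is_even(toy_w n)
{
  return TOY_SAY((n & 1u) == 0);
}
```

The practical hazard is writing `if (toy_is_even(n))`, which tests the wrong way round; wrapping every loobean in a conversion macro such as `TOY_IS_YES` is what keeps the two conventions from mixing.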
u3
Nouns
A noun is either an atom or a cell. However, we have to decide what this implementation looks like in a language like C, which prefers arrays and pointers. u3 is the noun library, which features Urbit-specific memory operations, tracing and profiling tools, and so forth.
A u3_noun is a 32-bit c3_w = uint32_t. The first bits indicate what kind of value the noun is and thus how to approach it:
| Bit 31 | Bit 30 | Meaning |
|---|---|---|
| 1 | 1 | Indirect cell (pom) |
| 1 | 0 | Indirect atom (pug) |
| 0 | any | Direct atom (cat) |
An indirect noun is a dog. For indirect nouns, bits 29–0 are a word pointer into the loom. In addition, 0xffff.ffff is u3_none, which is “not a noun”.
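The tag-bit table can be read directly as a few shifts and masks. This is a minimal sketch of that classification; the `toy_` names are hypothetical and are not the macros Vere itself uses, and the only values taken from the text are the bit positions and 0xffff.ffff for "not a noun".

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t toy_noun;

#define TOY_NONE ((toy_noun)0xffffffffu)   /* "not a noun" */

typedef enum { TOY_CAT, TOY_PUG, TOY_POM, TOY_NOT_A_NOUN } toy_kind;

/* Classify a 32-bit noun word by its top two bits. */
toy_kind toy_classify(toy_noun n)
{
  if (n == TOY_NONE)  return TOY_NOT_A_NOUN;   /* checked first: it also has bits 31 and 30 set */
  if ((n >> 31) == 0) return TOY_CAT;          /* bit 31 clear: direct atom                     */
  return ((n >> 30) & 1u) ? TOY_POM : TOY_PUG; /* bit 30 picks cell or atom                     */
}

/* For a dog (pug or pom), bits 29-0 are a word offset into the loom. */
uint32_t toy_dog_offset(toy_noun n)
{
  assert(toy_classify(n) == TOY_PUG || toy_classify(n) == TOY_POM);
  return n & 0x3fffffffu;
}
```

Note that a direct atom needs no lookup at all: the 31-bit value is the word itself, which is why small numbers are cheap.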
A common pattern is to extract values from a noun into C-typed values, carry out the manipulation, and then wrap them back into the noun. Furthermore, the value from an arbitrary atom may in fact be a bignum, and so GMP↗ is used to manage these values.
- Examine /noun/jets/a/add.c, in particular u3qa_add.
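The "unwrap, compute in C, wrap again" pattern is easiest to see for direct atoms. The sketch below is a toy under stated assumptions, not the code in add.c: it handles only the case where both operands and the result fit in 31 bits, and it reports every other case to the caller, which is where a real implementation would switch to a bignum library such as GMP.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t toy_noun;

#define TOY_DIRECT_MAX 0x7fffffffu   /* largest value that fits in a direct atom */

int toy_is_direct(toy_noun n)
{
  return n <= TOY_DIRECT_MAX;
}

/* Returns 1 and writes *out when the sum is itself a direct atom.
   Returns 0 when a real implementation would need the bignum path. */
int toy_add_direct(toy_noun a, toy_noun b, toy_noun *out)
{
  assert(out != NULL);
  if (!toy_is_direct(a) || !toy_is_direct(b)) return 0;
  uint32_t sum = a + b;                 /* cannot wrap: both operands are below 2^31 */
  if (sum > TOY_DIRECT_MAX) return 0;   /* result would need an indirect atom        */
  *out = sum;
  return 1;
}
```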
One of the painful parts of working with u3 is the reference counting system. Reference counting↗ is an expedient to handle tracking the number of pointers to an object in memory so that the memory can be freed at the appropriate time. Since C doesn't provide reference counting support in the language, we must manually track these and free the value only when the refcount goes to zero. The relevant functions are u3k to gain a refcount and u3z to lose one.
There are also two different protocols for reference counting, used by different parts of the system:
- transfer semantics relinquish a refcount of any sent values. Most functions behave this way, which means that you don't have to think about de-allocating values if they've been sent elsewhere.
- retain semantics hold onto the refcount even if the value is sent elsewhere. The functions which use retain semantics tend to inspect or query nouns rather than make or modify nouns.

The u3 convention is that, unless otherwise specified, all functions have transfer semantics, with the exception of the prefixes u3r, u3x, u3z, u3q and u3w. Also, within jet directories a through f (but not g), internal functions retain (for historical reasons).
- Compare u3ka_add and u3qa_add.
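The difference between the two protocols is who gives up a reference when a function returns. Here is a minimal sketch with a hand-rolled refcounted box; every name is hypothetical, and the `q`/`k` suffixes only mimic the naming pattern in the exercise above.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct { int refs; long val; } toy_box;

static int toy_live = 0;   /* boxes currently allocated, used to spot leaks */

int toy_live_count(void) { return toy_live; }

toy_box *toy_new(long val)
{
  toy_box *b = malloc(sizeof *b);
  assert(b != NULL);
  b->refs = 1;
  b->val  = val;
  toy_live++;
  return b;
}

/* Gain a reference. */
toy_box *toy_gain(toy_box *b) { b->refs++; return b; }

/* Lose a reference, freeing the box when the count reaches zero. */
void toy_lose(toy_box *b)
{
  if (--b->refs == 0) { free(b); toy_live--; }
}

/* Retain semantics: a and b are only borrowed; the caller still owns them. */
toy_box *toy_q_add(const toy_box *a, const toy_box *b)
{
  return toy_new(a->val + b->val);
}

/* Transfer semantics: the caller's references to a and b are consumed. */
toy_box *toy_k_add(toy_box *a, toy_box *b)
{
  toy_box *c = toy_q_add(a, b);
  toy_lose(a);
  toy_lose(b);
  return c;
}
```

After `toy_k_add(a, b)` the caller must not touch `a` or `b` again unless it took an extra reference with `toy_gain` beforehand; after `toy_q_add(a, b)` the caller is still responsible for losing both.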
u3 is designed to make some guarantees for the programmer. It's not Urbit itself, but it's designed to be an implementation platform for Urbit. Thus:
- Every event is logged internally before it enters u3.
- A permanent state noun maintains a single reference.
- Any event can be aborted without damaging the permanent state (“solid state”).
- We snapshot the permanent state and can prune logs.
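These guarantees amount to a small transactional discipline: record the event, compute against scratch space, and commit only on success. The sketch below illustrates that shape with a toy state (a running total) and a toy failure rule (the total may not exceed a limit). None of it reflects how u3 actually implements this; the state, the event function, and the limit are all assumptions chosen to keep the example short.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_LOG_MAX 16
#define TOY_LIMIT   100L

typedef struct { long total; } toy_state;

typedef struct {
  toy_state state;              /* permanent state                         */
  long      log[TOY_LOG_MAX];   /* events, recorded before they are applied */
  size_t    len;
} toy_pier;

/* Hypothetical event function: mutates s, then fails if the limit is exceeded. */
int toy_apply(toy_state *s, long ev)
{
  s->total += ev;               /* partial work happens before the failure check */
  return s->total <= TOY_LIMIT;
}

/* Log first, compute on a scratch copy, commit only on success. */
int toy_poke(toy_pier *p, long ev)
{
  assert(p != NULL);
  if (p->len == TOY_LOG_MAX) return 0;
  p->log[p->len++] = ev;                    /* logged before it is computed      */
  toy_state scratch = p->state;
  if (!toy_apply(&scratch, ev)) return 0;   /* aborted: permanent state untouched */
  p->state = scratch;                       /* commit                             */
  return 1;
}
```

Because the failed event only ever touched `scratch`, the permanent state is exactly what it was before the event arrived.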
We will discuss the specifics of the memory model next week in ca06 when we discuss the loom and the road model.
- “Land of Nouns”; note particularly the section u3: reference protocols, labeled THIS IS THE MOST CRITICAL SECTION IN THE `u3` DOCUMENTATION. Read that if nothing else.
Library
The contents of /noun constitute the u3 noun library. Functions are organized by file and prefix into certain namespaces by operation. Because u3 is a library, we can't cleanly separate it into serf/king components, although certain modules do have close identification with one or the other.
| prefix | purpose | .h | .c |
|---|---|---|---|
| u3a_ | allocation | allocate.h | allocate.c |
| u3e_ | persistence | events.h | events.c |
| u3h_ | hashtables | hashtable.h | hashtable.c |
| u3i_ | noun construction | imprison.h | imprison.c |
| u3j_ | jet control | jets.h | jets.c |
| u3l_ | logging | log.h | log.c |
| u3m_ | system management | manage.h | manage.c |
| u3n_ | nock computation | nock.h | nock.c |
| u3o_ | command-line options | options.h | options.c |
| u3r_ | noun access (error returns) | retrieve.h | retrieve.c |
| u3s_ | noun serialization | serial.h | serial.c |
| u3t_ | profiling | trace.h | trace.c |
| u3u_ | urth (memory management) | urth.h | urth.c |
| u3v_ | arvo | vortex.h | vortex.c |
| u3x_ | noun access (error crashes) | xtract.h | xtract.c |
| u3z_ | memoization | zave.h | zave.c |
| u3k[a-g] | jets (transfer, C args) | jets/k.h | jets/[a-g]/*.c |
| u3q[a-g] | jets (retain, C args) | jets/q.h | jets/[a-g]/*.c |
| u3w[a-g] | jets (retain, nock core) | jets/w.h | jets/[a-g]/*.c |
- u3a defines memory allocation functions. These are used throughout, but we'll discuss it a bit more when we talk about the king. You will quickly run into reference counting features, like u3k (u3a_gain()) to gain a refcount and u3z (u3a_lose()) to lose one.
- u3e manages the loom.
- u3h provides fast custom hashing for the runtime.
- u3i puts a value (expected to be a c3 type) into a noun. (Look at this one now.)
- u3l supports logging.
- u3m manages the system: boots u3, makes a pier, handles crashes, etc.
- u3n implements the Nock bytecode interpreter.
- u3o parses the manifold command-line options of Urbit and writes them into globals.
- u3r extracts a value from a noun, with a u3_weak on failure. (Look at this one now.)
- u3s implements noun serialization (++jam and ++cue).
- u3t provides tracing for crashes.
- u3u offers memory management tooling (deduplication and memory mapping).
- u3v supports Arvo interaction.
- u3x extracts a value from a noun, with a crash on failure.
- u3z supports ~+ siglus rune memoization.
If you work much in Vere, you will get used to seeing these. There are basically two broad categories of functions: single-use functions (like starting a pier, u3m_pier) and utility functions (like writing a value to a noun, u3i_word).
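The two noun-access families differ only in how they report a missing value: one returns a sentinel the caller must check, the other treats absence as fatal. The sketch below shows that split on a toy tree of cells, addressed by Nock axis (1 is the whole noun, 2n is the head of n, 2n+1 is the tail of n). The `toy_` struct and functions are hypothetical; `NULL` stands in for the failure sentinel and `abort()` stands in for the crash.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct toy_noun {
  int                    is_cell;
  unsigned long          atom;   /* meaningful when is_cell == 0 */
  const struct toy_noun *head;   /* meaningful when is_cell == 1 */
  const struct toy_noun *tail;
} toy_noun;

/* Retrieve-style access: returns NULL when the axis does not exist. */
const toy_noun *toy_r_at(uint32_t axis, const toy_noun *n)
{
  assert(n != NULL);
  if (axis == 0) return NULL;
  /* Locate the highest set bit of the axis. */
  int top = 0;
  while (top < 31 && (axis >> (top + 1)) != 0) top++;
  /* Walk the bits below it, most significant first: 0 = head, 1 = tail. */
  for (int i = top - 1; i >= 0; i--) {
    if (!n->is_cell) return NULL;
    n = ((axis >> i) & 1u) ? n->tail : n->head;
  }
  return n;
}

/* Extract-style access: the same lookup, but a missing axis is fatal. */
const toy_noun *toy_x_at(uint32_t axis, const toy_noun *n)
{
  const toy_noun *r = toy_r_at(axis, n);
  if (r == NULL) abort();
  return r;
}
```

For the noun [1 [2 3]], axis 6 reaches the 2 and axis 7 the 3, while axis 4 fails because the head is an atom. Code that cannot do anything sensible with a malformed noun tends to prefer the crashing form, since it removes a check at every call site.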
Return to /noun/jets/a/add.c and look at u3wa_add and u3ka_add.
The Serf
The serf process is the Nock runtime and is responsible for:
- Nock virtual machine (tracking current state of Arvo as a noun and ++pokeing it with nouns)
- Bytecode interpretation
- Jet dashboard
If you examine /vere/serf.c, you can get a feel for how it is organized. See e.g. u3_serf_work and callees.
Arvo Noun Management
/noun/vortex.c, e.g. u3v_peek, u3v_wish, and u3v_poke_sure.
Nock Bytecode Interpreter (u3n)
/noun/nock.c, e.g. u3n_nock_on, u3n_slam_on (calling convention for gates).
The end result of the Hoon compilation process is Nock code as a noun. This noun is evaluated by the runtime, but it is not actually directly run as such. Instead, the runtime builds an efficient bytecode stream and executes that instead to complete the calculation.
The Nock bytecode for any expression can be obtained using the %xray raw hint.
    > ~>  %xray  =+(2 [- +(-)])
    {[litb 2] snol head swap head bump ault halt}
    [2 3]
    > ~>  %xray  =+(2 [(add - -) +(-)])
    {[litb 2] snol [fask 4095] [kicb 1] snoc head swap [fabk 6] swap [fabk 6] auto musm [kicb 0] swap head bump ault halt}
    [4 3]
The Nock bytecode is defined in the OPCODES macro in /noun/nock.c and evaluated by _n_burn in that same file. The OPCODES #define uses the X macro↗, which is a bit of C deep lore.
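For readers who have not met the X macro before, here is a minimal sketch of the technique: one list of opcodes is written once and expanded several times, so the enum and the name table can never drift apart. The opcode set below is a toy. It borrows the names halt, litb, and bump from the %xray output, but the add opcode and the semantics of all four are invented for this example and say nothing about what Vere's opcodes do.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* The single source of truth: a hypothetical opcode list for this sketch. */
#define TOY_OPCODES      \
  X(TOY_HALT, "halt")    \
  X(TOY_LITB, "litb")    \
  X(TOY_BUMP, "bump")    \
  X(TOY_ADD,  "add")

/* First expansion: the enum of opcode numbers. */
#define X(sym, name) sym,
typedef enum { TOY_OPCODES TOY_OP_COUNT } toy_op;
#undef X

/* Second expansion: a parallel table of printable names. */
#define X(sym, name) name,
static const char *const toy_op_names[] = { TOY_OPCODES };
#undef X

const char *toy_op_name(toy_op op)
{
  return (op < TOY_OP_COUNT) ? toy_op_names[op] : "????";
}

/* A tiny stack machine over that opcode set. LITB takes a one-byte operand. */
uint32_t toy_burn(const uint8_t *pc)
{
  uint32_t stack[16];
  int top = 0;
  for (;;) {
    switch ((toy_op)*pc++) {
      case TOY_LITB: assert(top < 16); stack[top++] = *pc++;                break;
      case TOY_BUMP: assert(top >= 1); stack[top - 1] += 1;                 break;
      case TOY_ADD:  assert(top >= 2); top--; stack[top - 1] += stack[top]; break;
      case TOY_HALT: assert(top >= 1); return stack[top - 1];
      default:       assert(!"unknown opcode"); return 0;
    }
  }
}
```

Adding an opcode means adding one `X(...)` line and one `case`; the enum value and the printable name follow automatically.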
As a consequence of the architecture of Vere today, we see a lot of expensive call overhead. For instance, when you wrap an %xray hint around a core, you don't get the core itself—instead you get the formula that invokes the code.
    > ~>  %xray  (met 3 (jam .))
    {[fask 1023] [kicb 3] snol head swap tail [lilb 3] swap tail [fask 1023] [kicb 2] snol head swap tail musm [kicb 1] auto musm [ticb 0] halt}
    984.339
Since many things are computed in virtual Nock, ++mock, we have bail/trace/bounded computation at the price of slow virtualization.
One objective of Sword (née Ares), pursued through subject knowledge analysis, is to improve on Nock bytecode generation. This is being implemented into Vere as well.
Jet Dashboard (u3j)
As we summarized when first introducing jets in ca00, the runtime manages jets, including re-running them when playing back the event log history.
The jet dashboard is the system in the runtime that registers, validates, and runs jets: specific pieces of Nock code reimplemented in C for performance.
The jet dashboard maintains three jet state systems:
- cold state results from the logical execution history of the pier and consists of nouns. cold jet state registers jets as they are found. cold state ignores restarts.
- hot state is the global jet dashboard and describes the actual set of jets loaded into the pier for the current running process. Calls to hot state result from Nock Nine invocations of a core and an axis. hot state is thus tied to process restart.
- warm lists dependencies between cold and hot state. warm state can be cleared at any time and is cleared on restart.
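As a loose analogy only, the three states can be pictured as a table compiled into the binary (hot), a list of labels discovered while running (cold), and a rebuildable index connecting the two (warm). The sketch below models just that relationship; the data structures, the string labels, and all `toy_` names are assumptions and bear no resemblance to the real u3j structures.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef long (*toy_jet_fn)(long);

long toy_jet_dec(long a) { return a - 1; }
long toy_jet_dub(long a) { return a * 2; }

/* hot: the jets compiled into this binary, fixed for the life of the process. */
typedef struct { const char *label; toy_jet_fn fn; } toy_hot_entry;
static const toy_hot_entry toy_hot[] = {
  { "dec", toy_jet_dec },
  { "dub", toy_jet_dub },
};
#define TOY_HOT_LEN (sizeof toy_hot / sizeof toy_hot[0])

#define TOY_COLD_MAX 8
typedef struct {
  const char *cold[TOY_COLD_MAX];   /* cold: labels registered as cores are found */
  int         warm[TOY_COLD_MAX];   /* warm: cold index -> hot index, or -1       */
  size_t      len;
} toy_dash;

/* Register a label in cold state. Returns its cold index, or -1 when full. */
int toy_dash_register(toy_dash *d, const char *label)
{
  if (d->len == TOY_COLD_MAX) return -1;
  d->cold[d->len] = label;
  d->warm[d->len] = -1;
  return (int)d->len++;
}

/* Rebuild warm from cold and hot; safe to call at any time. */
void toy_dash_rewarm(toy_dash *d)
{
  for (size_t i = 0; i < d->len; i++) {
    d->warm[i] = -1;
    for (size_t j = 0; j < TOY_HOT_LEN; j++) {
      if (strcmp(d->cold[i], toy_hot[j].label) == 0) d->warm[i] = (int)j;
    }
  }
}

/* Returns the C implementation for a registered core, or NULL to fall back to Nock. */
toy_jet_fn toy_dash_find(const toy_dash *d, size_t cold_ix)
{
  assert(cold_ix < d->len);
  return (d->warm[cold_ix] < 0) ? NULL : toy_hot[d->warm[cold_ix]].fn;
}
```

The point of the split is that only cold state has to persist: warm can be thrown away and rebuilt whenever the binary, and hence hot, changes.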
The jet dashboard (u3j, /noun/jets.c) will not be explored in detail in Core Academy, but we do want to look at a couple of actual jets.
Jets
- Examine /noun/jets/b/lent.c, /noun/jets/b/turn.c, /noun/jets/c/turn.c, /noun/jets/e/rs.c, /noun/jets/e/slaw.c.
Many Urbit contributors may find jet composition to be their first serious encounter with the runtime. On the bright side, jetting is a fairly constrained and well-understood space. However, it has a complex interface for unpacking calls and nouns, including reference counting requirements.
- u3w functions are the main entry point (as identified in /noun/tree.c). These unpack and sanity-check the sample, then call either u3q or u3k variants of the jet. The unpacking axes are hard-coded in /noun/xtract.h.
- By convention, u3q and u3w functions have retain semantics. u3k functions have transfer semantics, so they are responsible for freeing their arguments with u3z after the computation completes.
- u3_none (0xffff.ffff) is NOT the same as u3_nul. A jet that returns u3_none punts the value back to the Hoon/Nock version.
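The layering and the punt can be sketched together. In this toy, a `w`-style entry point unpacks a sample from a stand-in "core" and hands it to a `q`-style worker; if either declines by returning the "not a noun" value, the caller runs a slow reference implementation instead. All names are hypothetical, the core is reduced to a single field, the value 0xffffffff for "not a noun" follows the Nouns section, and the jet's refusal of large inputs is an artificial rule added purely to exercise the fallback path.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t toy_noun;

#define TOY_NONE     ((toy_noun)0xffffffffu)  /* "not a noun": the jet declines */
#define TOY_NUL      ((toy_noun)0u)           /* the atom 0, a perfectly good noun */
#define TOY_FAST_MAX 1000u                    /* artificial limit for this sketch */

/* Hypothetical core: only the sample matters here. */
typedef struct { toy_noun sample; } toy_core;

/* q-style worker: takes an already-unpacked C argument. */
toy_noun toy_q_dec(toy_noun a)
{
  if (a == 0 || a > TOY_FAST_MAX) return TOY_NONE;  /* decline; let Nock decide */
  return a - 1;
}

/* w-style entry point: unpacks and sanity-checks the sample. */
toy_noun toy_w_dec(const toy_core *cor)
{
  if (cor->sample == TOY_NONE) return TOY_NONE;
  return toy_q_dec(cor->sample);
}

/* Stand-in for the Nock formula: count up until the successor matches. */
toy_noun toy_slow_dec(const toy_core *cor)
{
  assert(cor->sample != 0 && cor->sample != TOY_NONE);
  toy_noun i = 0;
  while (i + 1 != cor->sample) i++;
  return i;
}

/* Try the jet first; fall back to the slow path when it punts. */
toy_noun toy_kick(const toy_core *cor, int *ran_slow)
{
  toy_noun pro = toy_w_dec(cor);
  if (pro != TOY_NONE) { *ran_slow = 0; return pro; }
  *ran_slow = 1;
  return toy_slow_dec(cor);
}
```

Decrementing 1 yields `TOY_NUL`, which is an ordinary result and is returned as such; only `TOY_NONE` triggers the fallback, which is why confusing the two would silently discard correct answers.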
Snapshotting
We'll cover snapshotting in the next lesson, ca06.