11. Vanes IV: Clay

Clay is Urbit's version-controlled, referentially-transparent, globally-addressable filesystem. All data in Clay are typed and most are convertible to other types.

  • Version controlled: Clay natively supports Git-like history, branching, merging, and checkpointing. A particular continuity (“desk”) is a series of numbered commits.

  • Referentially transparent: for Clay, referential transparency means “a request must always yield the same result for all time.”

  • Globally addressable: the standard resource identifier format includes the ship and time, which means that a fully-specified Clay path is similar to a URI.

  • Persistent: Clay inherits from Arvo that all events are persisted to disk.

  • Typed: Clay attaches identification tags to any data and has ready to hand a set of conversion routines appropriate to the data type. These ID tags are called “marks,” and they act like MIME types.

Clay's functionality is quite varied, so rather than walk through its types in the order they appear in /sys/lull, we will break the material up topically.

:: clay (4c), revision control
::
:: The way to understand Clay is to take it section-by-section:
::
:: - Data structures. You *must* start here; make sure you understand
:: the entire contents of +raft.
::
:: - Individual reads. +aver is the entry point, follow it through
:: +read-at-tako to understand each kind of read.
::
:: - Subscriptions. +wake is the center of this mechanism; nothing
:: else responds to subscriptions. +wake has no arguments, which means
:: every subscription response happens when something in Clay's *state*
:: has changed. No edge-triggered responses.
::
:: - Receiving foreign data. For individual requests, this is
:: +take-foreign-answer. For sync requests (%many, which is %sing %v
:: for a foreign desk), this is +foreign-update.
::
:: - Ford. +ford builds hoon files and gives files their types.
:: Read +build-file for the first, and +read-file is the second.
::
:: - Writing to a desk. Every write to a desk goes through +park, read
:: it thoroughly.
::
:: - Merges. Control flow starts at +start-merge, then +merge, but
:: everything is scaffolding for +merge-by-germ, which is the ideal of
:: a merge function: it takes two commits and a merge strategy and
:: produces a new commit.
::
:: - Tombstoning. This is in +tomb.

We will distribute the types as appropriate and organize these conceptually into our Core Academy approach:

  1. Files and desks
  2. Subscriptions and desk distribution (including OTAs)
  3. Marks, tubes, and ++ford
  4. Merges and desk writes
  5. Move handler
  6. The scry interface
  7. Solid-state subscriptions

A warning: Clay is very old, and represents some of the darkest jungle of Ye Olde Urbyt. The names and connexions are often obscure but we will bushwhack a trail through /sys/vane/clay together.

Files and Desks

Files

What is a file system? It’s a way to pretend that a hard drive isn’t really a bunch of spinning magnetic platters that can store bits at certain locations, but rather a hierarchical system of folders-within-folders containing individual files that in turn consist of one or more strings of bytes. (Joel Spolsky, “The Law of Leaky Abstractions”)

What is a file? Per the context set by the quote above, it is a string of bytes at the level we wish to consider it. For all purposes in Urbit as a single-level store, we ignore file fragmentation and we try to ignore endianness except in certain specific cases. The “hierarchical system of folders-within-folders” has little bearing on actual storage in Clay, as the identifying path is simply a tag. That tag is hashed and put into a lookup table, and when you request a resource Clay checks its map and produces the file as a noun for you.

Using a %cx scry, we can examine a particular resource on the %base desk as either a byte string or an ASCII text string:

`@ux`.^(@ %cx /===/gen/cat/hoon)
.^(cord %cx /===/gen/cat/hoon)

A small fib in the above statement is that the path includes more than just an arbitrary “file path”. In particular, every file path really includes a beak at its head (as text). (There's some waffling here between values and knots in various parts of the system, since a path is formally (list knot).)

+$ beak [p=ship q=desk r=case] :: path prefix
::
+$ beam [[p=ship q=desk r=case] s=path] :: global name
::
+$ aeon @ud :: version number

beak and ship

The top-level determiner for Clay is the beak: [p=ship q=desk r=case]. The ship is straightforward: it is the actual ship on which a resource resides. Clay is a globally-addressable filesystem, so we can refer to a hypothetical resource on any ship. (This does not mean that the value actually exists, of course.)

+$ ship @p

desk

The second element of the beak is the desk. Clay organizes the world into desks, which are also the most logical unit for app distribution. Essentially a desk is an organized collection of files with a common history.

+$ desk @tas

The structure of a desk matters for several parts of the Urbit system. Some parts are requirements: notably /mar which contains details on how to load the file resources as nouns, and its dependencies in /sur (structure files) and /lib (library files). By convention, the following are also present:

  • /app (agent files, recognized by Gall)
  • /gen (generators, recognized by Dojo)
  • /sys (recognized by Arvo generally but only really active on %base)
  • /ted (thread files, recognized by %spider and %khan)

(Keep in mind that inclusion of a file several times, like a mark file, need not lead to OS bloat since noun deduplication can store multiple references to a single identical resource.)

Some desks include /tests for unit tests. A few other agents like %docs use their own conventions as well (/doc).

case

+$ case
$% [%da p=@da] :: %da date
[%tas p=@tas] :: %tas label
[%ud p=@ud] :: %ud sequence
[%uv p=@uv] :: %uv hash
==

Most commonly, the case is now, the timestamp which would refer to the file in its current state. Internally, Clay stores everything as a sequential value aeon in the %ud format and converts to equivalent cases when necessary (see e.g. ++aeon-to-tako).

Any part of the beak can typically be replaced with = in a statement to get a default value. (Where in the parsers is this handled?)

> /===
[~.~zod ~.base ~.~2022.7.17..23.50.01..3305 ~]
> /=landscape=
[~.~zod %landscape ~.~2022.7.17..23.50.05..9d0d ~]
> /~nec==
[~.~nec ~.base ~.~2022.7.17..23.50.13..dd5d ~]

Desks and Commits

Why bother with the case? Enter the commit, which refers to a particular revision. A desk is a collection of commits as a particular continuity. (You can think of a desk as being like a Git branch.) Each commit, or +$yaki, is a node in a historical state graph. It includes its parents and its namespace, as well as links to any associated data. (While Clay can support file diffs, right now it does not have that feature turned on; it simply stores entire files. See %info below.)

+$ yaki :: commit
$: p=(list tako) :: parents
q=(map path lobe) :: namespace
r=tako :: self-reference
t=@da :: date
== ::
::
+$ tako @uvI :: yaki ref
+$ lobe @uvI :: blob ref

In a while, we will look at how commits change desk state via merging, but we can punt on that question for now.

Knowing what commits are, we are finally prepared to examine desk state, the +$dome:

+$ dome
$: let=aeon :: top id
hit=(map aeon tako) :: versions by id
lab=(map @tas aeon) :: labels
tom=(map tako norm) :: tomb policies
nor=norm :: default policy
mim=(map path mime) :: mime cache
fod=flue :: ford cache
wic=(map weft yoki) :: commit-in-waiting
liv=zest :: running agents
ren=rein :: force agents on/off
== ::
  • let is the version number at the latest point in the desk's history.
  • hit is the collection of version numbers pointing to commit hashes to arrive at the current state.
  • lab is the set of labels pointing to aeons (for case).
  • tom is the set of tombstone policies.
  • nor is the default tombstone policy.
  • mim is the MIME cache. Resources are converted to and from Unix frequently without changes so by caching we can sometimes short-circuit this.
  • fod is ++ford's build cache.
  • wic is the collection of commits pending to be applied (as in a system upgrade).
  • liv is the desk's liveness (+$zest: %dead, %live, or %held), which determines whether the agents named in desk.bill are run by Gall.
  • ren is the set of agents to force on or off.
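
To make the relationship between case and aeon concrete, here is a minimal Python sketch of resolving a case against the let, hit, and lab fields described above. It is an illustrative model under stated assumptions, not Clay's implementation: the `Dome` class, its `dates` field (standing in for reading the date out of each commit), and the `case_to_aeon` function are hypothetical names, and the %uv (hash) case is left out.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Dome:
    """Toy stand-in for the let/hit/lab fields of +$dome."""
    let: int = 0                                # latest aeon
    hit: dict = field(default_factory=dict)     # aeon -> tako (commit ref)
    lab: dict = field(default_factory=dict)     # label -> aeon
    dates: dict = field(default_factory=dict)   # aeon -> commit time (assumed helper)


def case_to_aeon(dome: Dome, case: tuple) -> Optional[int]:
    """Resolve a (tag, value) case to an aeon, or None if it is not known."""
    tag, value = case
    if tag == "ud":       # sequence number: valid only if already committed
        return value if value <= dome.let else None
    if tag == "tas":      # label: look it up in the label map
        return dome.lab.get(value)
    if tag == "da":       # date: the latest commit made at or before that time
        known = [aeon for aeon, when in dome.dates.items() if when <= value]
        return max(known) if known else None
    raise ValueError(f"unsupported case tag: {tag}")


dome = Dome(
    let=3,
    hit={1: "0v1", 2: "0v2", 3: "0v3"},
    lab={"release": 2},
    dates={1: 100, 2: 200, 3: 300},
)
print(case_to_aeon(dome, ("da", 250)))         # revision in force at time 250
print(case_to_aeon(dome, ("tas", "release")))  # labelled revision
print(case_to_aeon(dome, ("ud", 7)))           # not committed yet
```

Whatever form the case takes, the answer is an index into hit, which is why the rest of the desk state can be keyed on aeons alone.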

Notes & Gifts

There aren't many notes or gifts directly associated with file management since you often locally scry out individual files. (This is as opposed to desk management, which has ample moves associated therewith.)

:: ::::
:::: ++clay :: (1c) versioning
:: ::::
++ clay ^?
|%
+$ gift :: out result <-$
$% [%writ p=riot] :: response
[%wris p=[%da p=@da] q=(set (pair care path))] :: many changes
== ::
+$ task :: in request ->$
$% [%warp wer=ship rif=riff] :: internal file req
[%werp who=ship wer=ship rif=riff-any] :: external file req
== ::
--

Basically, a %warp request (to either a local or a foreign ship) may result in a %writ gift in response. (This does require setting permissions.)

Desk management has the following associated notes. (There are no gifts since per-desk information isn't exposed in the API this way.)

:: ::::
:::: ++clay :: (1c) versioning
:: ::::
++ clay ^?
|%
+$ task :: in request ->$
$% [%drop des=desk] :: cancel pending merge
[%info des=desk dit=nori] :: internal edit
[%into des=desk all=? fis=mode] :: external edit
$: %merg :: merge desks
des=desk :: target
her=@p dem=desk cas=case :: source
how=germ :: method
== ::
$: %fuse :: merge many
des=desk :: target desk
bas=beak :: base desk
con=(list [beak germ]) :: merges
== ::
[%park des=desk yok=yoki ran=rang] :: synchronous commit
[%pork ~] :: resume commit
[%prep lat=(map lobe page)] :: prime clay store
[%rein des=desk ren=rein] :: extra apps
[%tire p=(unit ~)] :: app state subscribe
[%tomb =clue] :: tombstone specific
[%zeal lit=(list [=desk =zest])] :: batch zest
[%zest des=desk liv=zest] :: live
== ::
--

Scries

The following scries specifically apply to files (single resources), commits, and desk-wide operations:

Most of the time you will use %x or %y from userspace, but in the kernel you may need more sophisticated information.

Subscriptions and Desk Distribution

A desk can subscribe to another remote desk as its upstream, meaning that any changes on the remote are automatically propagated to subscribers. Apps are typically distributed this way (the alternative being an installation from source).

A subscription means that the upstream sponsor maintains a list of requested downstream sponsees in its qyx.dojo state for the appropriate desk. You can access this through a %cx /cult scry:

.^((set [@p rave:clay]) %cx /=//=/cult/base)

The %next and %mult requests typically result from your own Gall agent wanting to know if the desk is updated (e.g. via a |commit). The %sing requests result from subscriptions and reads.

++wake is the center of this mechanism; nothing else responds to subscriptions. ++wake has no arguments, which means every subscription response happens when something in Clay's state has changed. No edge-triggered responses.
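
The state-triggered design can be sketched in a few lines of Python. This is a toy model, not Clay's code: the `Desk` class and its `subscribe_next`, `commit`, and `wake` methods are hypothetical names, and only a %next-style "tell me when there is a newer revision" request is modeled.

```python
class Desk:
    """Toy desk: a revision counter plus pending %next-style subscriptions."""

    def __init__(self):
        self.let = 1      # latest revision number
        self.subs = []    # pending requests as (subscriber, revision_seen)
        self.sent = []    # responses emitted, as (subscriber, revision)

    def subscribe_next(self, who):
        # Record the revision the subscriber has already seen.
        self.subs.append((who, self.let))
        self.wake()

    def commit(self):
        # A write only changes state; it never notifies anyone directly.
        self.let += 1
        self.wake()

    def wake(self):
        # No arguments: look at the current state and answer whatever
        # pending requests that state now satisfies.
        still_waiting = []
        for who, seen in self.subs:
            if self.let > seen:
                self.sent.append((who, self.let))   # one-shot response
            else:
                still_waiting.append((who, seen))
        self.subs = still_waiting


desk = Desk()
desk.subscribe_next("~nec")
desk.commit()
desk.wake()   # an extra wake is harmless: nothing is pending any more
print(desk.sent)
```

Because `wake` derives every response from state rather than from the event that caused the change, it can be called after any operation without risk of double-notifying.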

Desk distribution is largely the same for userspace app updates and system OTAs, except that userspace apps never require a system upgrade and do not have a separate %kids desk.

OTAs

The most important desk subscription is your %base desk to your sponsor's %kids desk. This is how Urbit OS updates (over-the-air updates or OTAs) are propagated. The lifecycle of an OTA:

  1. The sponsor syncs her %base desk to her %kids desk. (This takes place via a %merg task.) See MAINTAINERS.md for one procedure.
     • Why have a separate %kids desk?
  2. The sponsor's %kids desk notifies all of its subscribers/sponsees (in qyx.dojo). (This notification comes over Ames to Clay.)
  3. The sponsee receives the files via remote scry.
  4. Once these have arrived, then the remote update is applied via a merge. (See ++apply-foreign-update, in particular the definition of nut and hut.) See the discussion of merges below.
  5. If a system update (to Hoon, Arvo, or the vanes) is involved, then handle the OTA as discussed in ca04. This may involve simply recompiling the vanes and migrating the state, or it may require updating everything all the way back to /sys/hoon including a worklist. (In this latter case, see ++sys-update, ++park (including the kelvin check), and how the %pork %slip is managed.)
  6. This should unblock any desks that are blocked on the system kelvin version. See ++goad and ++wick.
  7. If this ship has any sponsees, propagate the OTA to the %kids desk and thence forward to the daughter points.
  8. If the state of the desk is requested, calculate and produce values like the %cz hash. (See /gen/vats and /sur/hood's ++report-prep and ++report-vat, and /sys/clay's ++content-hash, for instance.)

Merges and desk writes

A commit describes a set of changes to be made to a desk to result in a new desk state. Since desks are fundamentally collections of path-addressed resources, this is functionally similar to a regular Git-style version control system.

Thus we need to be able to produce commits (+$yakis) and apply them. This process is called a merge. A merge means that we have to decide how to reconcile two claims about reality into one. This results in several possible merge types in Clay. Most of the time you'll use %init implicitly (via |new-desk, for instance) or %only-that when you're trying to fix a desk mismatch.

These are ultimately concerned with reconciliation strategies involving commit types:

+$ miso :: file delta
$% [%del ~] :: delete
[%ins p=cage] :: insert
[%dif p=cage] :: mutate from diff
[%mut p=cage] :: mutate from raw
== ::
+$ soba (list [p=path q=miso]) :: delta
::
+$ misu :: computed delta
$% [%del ~] :: delete
[%ins p=cage] :: insert
[%dif p=lobe q=cage] :: mutate from diff
== ::
+$ suba (list [p=path q=misu]) :: delta
::
+$ nori :: repository action
$% [%& p=soba] :: delta
[%| p=@tas q=(unit aeon)] :: label
== ::
+$ nuri :: repository action
$% [%& p=suba] :: delta
[%| p=@tas] :: label
== ::
::
+$ mizu [p=@u q=(map @ud tako) r=rang] :: new state
+$ moar [p=@ud q=@ud] :: normal change range
+$ moat [from=case to=case =path] :: change range

Operations

Reads

++aver scaffolds read requests (%sings); see also ++read-at-tako (which is why we needed to see commit logic before we could really examine ++read-x).

Changes

A change to a desk can originate from at least three sources:

  1. Unix, via a mounted desk.
  2. Userspace editing.
  3. Update from a remote desk.
:: ::::
:::: ++clay :: (1c) versioning
:: ::::
++ clay ^?
|%
+$ gift :: out result <-$
$% [%mere p=(each (set path) (pair term tang))] :: merge result
[%writ p=riot] :: response
[%wris p=[%da p=@da] q=(set (pair care path))] :: many changes
== ::
+$ task :: in request ->$
$% [%drop des=desk] :: cancel pending merge
[%info des=desk dit=nori] :: internal edit
[%into des=desk all=? fis=mode] :: external edit
$: %merg :: merge desks
des=desk :: target
her=@p dem=desk cas=case :: source
how=germ :: method
== ::
$: %fuse :: merge many
des=desk :: target desk
bas=beak :: base desk
con=(list [beak germ]) :: merges
== ::
== ::
-- ::clay

We consider these first as single-file updates (commits), then take a look at the merge process.

To modify a file, we must produce an %info write task. This requires a desk label and a +$nori or repository action. In the case of writing a new file, the +$nori looks like this:

[%& ~[[/blade/runner/txt %ins %txt !>(~['Batty' 'Pris' 'Zhora' 'Leon'])]]]

passed into Clay like this:

|pass [%c %info %base [%& ~[[/blade/runner/txt %ins %txt !>(~['Batty' 'Pris' 'Zhora' 'Leon'])]]]]

(A text file in Urbit is a (list cord) not a cord, per the %txt mark.)

  • Trace how the %info task is dispatched into Clay: ++call %info → the worklist → ++abet.
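
As a rough mental model of what the %& branch of a +$nori asks for, the following Python sketch applies a list of (path, change) pairs to a path-to-content mapping. It is only an illustration: `apply_soba` is a hypothetical name, the %dif case is omitted, and the strictness of the checks (for example, refusing to insert over an existing path) is an assumption of this sketch rather than a statement about Clay's behavior.

```python
def apply_soba(namespace: dict, soba: list) -> dict:
    """Return a new path -> content mapping with each (path, miso) applied."""
    new = dict(namespace)   # never mutate the previous revision
    for path, (tag, payload) in soba:
        if tag == "ins":
            if path in new:
                raise KeyError(f"{path} already exists")
            new[path] = payload
        elif tag == "mut":
            if path not in new:
                raise KeyError(f"{path} does not exist")
            new[path] = payload
        elif tag == "del":
            if path not in new:
                raise KeyError(f"{path} does not exist")
            del new[path]
        else:
            raise ValueError(f"unsupported change: {tag}")
    return new


before = {"/gen/cat/hoon": "..."}
soba = [("/blade/runner/txt", ("ins", ["Batty", "Pris", "Zhora", "Leon"]))]
after = apply_soba(before, soba)
print(sorted(after))
```

The previous mapping is left untouched, which mirrors the idea that each commit carries its own namespace rather than editing an earlier one.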

Merging desks is a more sophisticated operation, since it involves reconciling both current state and the parent commits. There are many ways to reconcile two versions of a desk:

+$ germ :: merge style
$? %init :: new desk
%fine :: fast forward
%meet :: orthogonal files
%mate :: orthogonal changes
%meld :: force merge
%only-this :: ours with parents
%only-that :: hers with parents
%take-this :: ours unless absent
%take-that :: hers unless absent
%meet-this :: ours if conflict
%meet-that :: hers if conflict
== ::

Control flow starts at ++start-merge, then ++merge, but everything is scaffolding for ++merge-by-germ, which is the ideal of a merge function: it takes two commits and a merge strategy and produces a new commit.

For instance, an %only-this merge follows this rule:

If this is an %only-this merge, we check to see if ali's and bob's commits are the same, in which case we're done. Otherwise, we create a new commit with bob's data plus ali and bob as parents.

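::  Abridged excerpt: the %init case and the other germs are elided.
::
++  merge-by-germ
  |=  [=ali=yaki bob-yaki=(unit yaki)]
  ^-  (unit merge-result)
  ::  bob must already have a commit for any non-%init merge
  ::
  =/  bob-yaki  (need bob-yaki)
  ?+    germ  !!
      %only-this
    ::  same commit on both sides: nothing to do
    ::
    ?:  =(r.ali-yaki r.bob-yaki)
      ~
    ::  otherwise keep bob's data, with bob and ali as parents
    ::
    :*  ~
        conflicts=~
        new=&+[[r.bob-yaki r.ali-yaki ~] (to-yuki q.bob-yaki)]
        lat=~
    ==
  ==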
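
The same %only-this rule can be restated as a small Python function, which may make the shape of the result easier to see. This is a sketch under assumptions: `Yaki` and `merge_only_this` are hypothetical names, the result is a plain dictionary rather than Clay's merge-result, and a "no-op" is represented by `None`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Yaki:
    """Toy commit: parent refs, a path -> blob-ref namespace, and its own ref."""
    parents: list
    namespace: dict
    ref: str


def merge_only_this(ali: Yaki, bob: Yaki) -> Optional[dict]:
    """%only-this: keep bob's (our) data, but record ali as a second parent."""
    if ali.ref == bob.ref:
        return None                        # identical commits: nothing to do
    return {
        "conflicts": set(),                # this strategy can never conflict
        "parents": [bob.ref, ali.ref],     # bob first, then ali
        "namespace": dict(bob.namespace),  # bob's files, unchanged
    }


bob = Yaki(parents=[], namespace={"/a/txt": "0v1"}, ref="0vbob")
ali = Yaki(parents=[], namespace={"/a/txt": "0v2", "/b/txt": "0v3"}, ref="0vali")
print(merge_only_this(ali, bob))
print(merge_only_this(bob, bob))
```

Nothing from ali's namespace survives; what the merge records is only that ali's commit is now part of the history.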

Types

+$ cone (map [ship desk] dome) :: domes
+$ crew (set ship) :: permissions group
+$ dict [src=path rul=real] :: effective permission
+$ domo :: project state
$: let=@ud :: top id
hit=(map @ud tako) :: changes by id
lab=(map @tas @ud) :: labels
== ::
+$ germ :: merge style
$? %init :: new desk
%fine :: fast forward
%meet :: orthogonal files
%mate :: orthogonal changes
%meld :: force merge
%only-this :: ours with parents
%only-that :: hers with parents
%take-this :: ours unless absent
%take-that :: hers unless absent
%meet-this :: ours if conflict
%meet-that :: hers if conflict
== ::
+$ mode (list [path (unit mime)]) :: external files
+$ mood [=care =case =path] :: request in desk
+$ mool [=case paths=(set (pair care path))] :: requests in desk
+$ norm (axal ?) :: tombstone policy
+$ open $-(path vase) :: get prelude
+$ page ^page :: export for compat
+$ rang :: repository
$+ rang
$: hut=(map tako yaki) :: changes
lat=(map lobe page) :: data
== ::
+$ rant :: response to request
$: p=[p=care q=case r=desk] :: clade release book
q=path :: spur
r=cage :: data
== ::
+$ rave :: general request
$% [%sing =mood] :: single request
[%next =mood] :: await next version
[%mult =mool] :: next version of any
[%many track=? =moat] :: track range
== ::
+$ real :: resolved permissions
$: mod=?(%black %white) ::
who=(pair (set ship) (map @ta crew)) ::
== ::
+$ regs (map path rule) :: rules for paths
+$ rein (map dude:gall ?) :: extra apps
+$ riff [p=desk q=(unit rave)] :: request+desist
+$ riff-any ::
$% [%1 =riff] ::
== ::
+$ rite :: new permissions
$% [%r red=(unit rule)] :: for read
[%w wit=(unit rule)] :: for write
[%rw red=(unit rule) wit=(unit rule)] :: for read and write
== ::
+$ riot (unit rant) :: response+complete
+$ rule [mod=?(%black %white) who=(set whom)] :: node permission
+$ rump [p=care q=case r=@tas s=path] :: relative path
+$ saba [p=ship q=@tas r=moar s=dome] :: patch+merge
+$ toro [p=@ta q=nori] :: general change
++ unce :: change part
|* a=mold ::
$% [%& p=@ud] :: skip[copy]
[%| p=(list a) q=(list a)] :: p -> q[chunk]
== ::
++ urge |*(a=mold (list (unce a))) :: list change
+$ waft :: kelvin range
$^ [[%1 ~] p=(set weft)] ::
weft ::
+$ whom (each ship @ta) :: ship or named crew
+$ zest $~(%dead ?(%dead %live %held)) :: how live
:: ::

Building Code: ++ford & Marks

Clay is responsible for assembling and building code. Building code differs from compiling code in that Clay's ++ford arm must collect associated cores and code (referenced via / fas runes) and produce the appropriate Hoon source for ++ride and friends to process into executable Nock. (The former standalone %ford vane was merged into %clay via Ford Fusion in 2020.)

Since Clay receives updates as source from remote desks, Clay is the de facto prime mover for internal state upgrades.

  • /sys/hoon is stateless, so when it is updated it takes place first and just passes the worklist into the new world.
  • /sys/arvo does maintain state, so the current state must be extracted and passed into the newly built program.
  • /sys/zuse is stateless.
  • Vanes are stateful and like Arvo may have a larval phase if necessary. (Notably Gall has one.)
  • Userspace apps can then be updated by Gall using their ++on-save and ++on-load arms.

Ford produces several kinds of results, but these may be grouped into file-related types and mark-related types:

+$ pour :: ford build w/content
$% [%file =path]
[%nave =mark]
[%dais =mark]
[%cast =mars]
[%tube =mars]
:: leafs
::
[%vale =path =lobe]
[%arch =path =(map path lobe)]
==
+$ soak :: ford result
$% [%cage =cage]
[%vase =vase]
[%arch dir=(map @ta vase)]
[%dais =dais]
[%tube =tube]
==

File Builds

To see an example of how ++ford works, trace the %a care:

  • ++scry %a
  • ++read-a
  • ++tako-ford → ++tako-to-yaki
  • ++build-file → ++build-dependency (note the ++slap) → ++read-file → ++run-dependency → ++parse-pile
  • Also see ++build-fit for paths, noting how it resolves - versus / in names.

Files are built by ++ford in vase mode. Arvo (Gall, etc.) can then drop them back into static mode once it has the core.

Since building a file is a pure function, Clay memoizes the results of all builds, including builds of marks, mark conversions, and hoon source files. These memoization results are stored along with the desk and are used by later revisions of that desk.
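
A minimal Python sketch of why purity makes this memoization safe follows. It is a deliberately simplified model: `BuildCache` is a hypothetical name, the "build" is a placeholder string transformation, and the cache key is assumed to be just the path plus a hash of the file's content, whereas a real build cache also has to account for a file's dependencies.

```python
import hashlib


class BuildCache:
    """Memoizes a pure per-file build, keyed by path and content hash."""

    def __init__(self):
        self.results = {}
        self.builds_run = 0

    def build(self, path: str, source: str) -> str:
        key = (path, hashlib.sha256(source.encode()).hexdigest())
        if key not in self.results:
            self.builds_run += 1
            self.results[key] = source.upper()   # placeholder for compilation
        return self.results[key]

    def build_desk(self, files: dict) -> dict:
        return {path: self.build(path, src) for path, src in files.items()}


cache = BuildCache()
rev1 = {"/lib/a/hoon": "alpha", "/lib/b/hoon": "beta"}
rev2 = {"/lib/a/hoon": "alpha", "/lib/b/hoon": "gamma"}   # only b changed
cache.build_desk(rev1)
cache.build_desk(rev2)
print(cache.builds_run)   # the unchanged file is not rebuilt for rev2
```

Since the same input always yields the same output, a cached result from an earlier revision is as good as a fresh build.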

Ford supplies several / fas “runes” to build code. (Formally these are not part of Hoon and are more akin to C's #include statements.) These are processed in ++parse-pile and ++pile-rule.

  • /? faswut, pin kelvin version (currently ignored)
  • /- import /sur files
  • /+ import /lib files
  • /= arbitrary path to file
  • /~ arbitrary path to directory
  • /% build and import mark core
  • /$ import mark conversion gate
  • /* import file via specific mark

Build cares, Part I

Marks

One of the roles of Clay in managing desks is to validate desk content. That is, does every resource in a desk have a definite way to convert to a noun representation (or to another representation, but at minimum to %noun)?

If Clay has been asked to perform a commit, it needs to validate all the files in this desk and notify all subscribers to live queries of this desk's data. Gall, for example, maintains live queries on builds of its live agents. Validation uses the Ford build system.

Whereas a conventional DVCS like Git has special rules for handling text versus binary blobs, Clay encourages the use of marks to identify filesystem data types and conversion routines. A mark is “like an executable MIME type”: it specifies an intended data format (in a manner similar to a file extension) and also carries the routines to validate and convert that data. “It’s best defined as a symbolic mapping from a filesystem to a schema engine.”

(You should get used to divorcing the conceptual relationship of data, what we could call its form in the Platonic sense or the noun in the Martian sense, from its representation or instantiation. For instance, one writes a JSON file a certain way in text, but when parsing it one needs to think about it at a higher level of abstraction.)

A mark is a validated data structure, including rules for transformation between representations. In this regard, it is like a more rigorous file type. We frequently use marks in Gall agents to verify classes of operations (such as actions or updates) or to convert incoming data (such as via the JSON mark).

Consider a file at /web/foo/json. In order to validate this file, Clay must load the mark definition core and use its validation routine to ensure the untyped value of /web/foo/json is in fact valid JSON. To obtain this core, Clay must build the file at /mar/json/hoon from source and then process the resulting raw mark core using some mild metaprogramming to get a standard interface core for dealing with marks, called a $dais, whose type is defined in Zuse.

Since building a source file only makes sense if the file has been validated as a %hoon file, but mark definitions themselves must be built from source, there's a logical dependency cycle: who validates the validators? To break this cycle, Clay hard-codes the validation of %hoon files. (See ++read-x in /sys/clay.) This allows mark definitions to be built from source, and in fact any file can depend on any other file of any mark as long as there are no cycles. As of Ford Fusion, Ford performs a cycle check to ensure acyclicity.

At a high level, files are validated using ++read-file, which uses marks via ++validate-page. So let's dive into marks.

The simplest way to use a mark is to supply Dojo with the names for source and target along with a value:

&json &mime [/application/json (as-octs:mimes:html '"hey"')]

(In one sense, a mark is simply a label which nominally corresponds to a /mar file—but it is possible to have cages that don't ever touch the filesystem.)

Marks expose several arms for converting between value representations:

  • ++grab cores convert to our mark from other marks.
  • ++grow cores convert from our mark to another mark.
  • ++grad specifies functions for revision control like creating diffs, patching files and so on. Rather than writing all of those functions, a mark can delegate these tasks to another mark such as %noun.

To convert from mark %alfa to mark %bravo, Clay tries the following operations, in order:

  • direct grow from %alfa
  • direct grab from %bravo
  • indirect jump from %alfa through %charlie
  • indirect grab from %bravo through %charlie

You can see this logic instantiated in ++build-cast.
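
The lookup order above can be modeled in Python as follows. This is a sketch under stated assumptions, not a transcription of ++build-cast: marks are represented as plain dictionaries, the table names `grow`, `grab`, `jump`, and `grab_via` are this sketch's own encoding (with `grab_via` standing in for an indirect grab), and `find_cast` is a hypothetical name.

```python
def find_cast(marks: dict, src: str, dst: str, depth: int = 0):
    """Return a function converting src-marked data to dst-marked data."""
    if src == dst:
        return lambda x: x
    if depth > 4:   # crude guard against cyclic mark definitions
        raise LookupError(f"no conversion {src} -> {dst}")
    a, b = marks.get(src, {}), marks.get(dst, {})
    # 1. direct grow from the source mark
    if dst in a.get("grow", {}):
        return a["grow"][dst]
    # 2. direct grab into the target mark
    if src in b.get("grab", {}):
        return b["grab"][src]
    # 3. indirect jump from the source, then 4. indirect grab into the target;
    #    both name an intermediate mark and compose two smaller conversions
    for table, key in ((a.get("jump", {}), dst), (b.get("grab_via", {}), src)):
        if key in table:
            mid = table[key]
            first = find_cast(marks, src, mid, depth + 1)
            second = find_cast(marks, mid, dst, depth + 1)
            return lambda x: second(first(x))
    raise LookupError(f"no conversion {src} -> {dst}")


# alfa: decimal text; charlie: integer; bravo: hexadecimal text
marks = {
    "alfa": {"grow": {"charlie": int}, "jump": {"bravo": "charlie"}},
    "charlie": {},
    "bravo": {"grab": {"charlie": hex}},
}
print(find_cast(marks, "alfa", "bravo")("255"))
```

Here %alfa cannot grow or be grabbed directly into %bravo, so the conversion is composed from alfa to charlie (a grow) and charlie to bravo (a grab).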

  • Construct a multi-step conversion between two marks that cannot grab/grow into each other (likely via %noun).

Note that marks don't have to perfectly round-trip: if you converted a wain to json back to wain, you won't necessarily have the same text.

  • Examine the mark file /mar/tape/hoon.
  • Examine the mark file /mar/xml/hoon.

As practically constructed, marks are typically either simple calls to outsource to other marks and /sur type validation, or they may involve JSON reparsing or construction. Only rarely do more complicated marks need to be built.

Marks can be built (using the right cares) to be either static or dynamic.

Static mark conversion gates only convert from one type directly to another. These have type $-(from to).

> =txt-to-mime .^($-(wain mime) %cf /===/txt/mime)
> (txt-to-mime ~['foo'])
[p=/text/plain q=[p=3 q=7.303.014]]
  • See ++read-f and ++build-cast.

Static mark cores (+$naves) are more flexible than %f gates because they also supply the ++grad arm to apply diffs.

:: $nave: typed mark core
::
++ nave
|$ [typ dif]
$_
^?
|%
++ diff |~([old=typ new=typ] *dif)
++ form *mark
++ join |~([a=dif b=dif] *(unit (unit dif)))
++ mash
|~ [a=[ship desk dif] b=[ship desk dif]]
*(unit dif)
++ pact |~([typ dif] *typ)
++ vale |~(noun *typ)
--
  • See ++read-e and ++build-nave.

Dynamic mark conversion gates, or +$tubes, process on vases instead.

:: $tube: mark conversion gate
::
+$ tube $-(vase vase)
> =txt-mime-tube .^(tube:clay %cc /===/txt/mime)
> !< mime (txt-mime-tube !>(~['foo']))
[p=/text/plain q=[p=3 q=7.303.014]]
  • See ++read-c and ++build-tube.

Finally, dynamic mark cores (+$dais) are the most powerful of all: they are doors operating in vase mode on files.

:: $dais: processed mark core
::
+$ dais
$_ ^|
|_ sam=vase
++ diff |~(new=_sam *vase)
++ form *mark
++ join |~([a=vase b=vase] *(unit (unit vase)))
++ mash
|~ [a=[ship desk diff=vase] b=[ship desk diff=vase]]
*(unit vase)
++ pact |~(diff=vase sam)
++ vale |~(noun sam)
--

Build cares, Part II

Clay as a Vane

Now we're ready to have a gander at the formal vane state.

:: Formal vane state.
::
:: -- `rom` is our domestic state.
:: -- `hoy` is a collection of foreign ships where we know something about
:: their clay.
:: -- `ran` is the object store.
:: -- `mon` is a collection of mount points (mount point name to urbit
:: location).
:: -- `hez` is the unix duct that %ergo's should be sent to.
:: -- `cez` is a collection of named permission groups.
:: -- `pud` is an update that's waiting on a kernel upgrade
::
+$ raft :: filesystem
$: rom=room :: domestic
hoy=(map ship rung) :: foreign
ran=rang :: hashes
fad=flow :: ford cache
mon=(map term beam) :: mount points
hez=(unit duct) :: sync duct
cez=(map @ta crew) :: permission groups
tyr=(set duct) :: app subs
tur=rock:tire :: last tire
pud=(unit [=desk =yoki]) :: pending update
sad=(map ship @da) :: scry known broken
bug=[veb=@ mas=@] :: verbosity
== ::
  • +$room is the domestic desk state.
::
:: Domestic ship.
::
:: `hun` is the duct to dill, and `dos` is a collection of our desks.
::
+$ room :: fs per ship
$: hun=duct :: terminal duct
dos=(map desk dojo) :: native desk
== ::
::
:: Domestic desk state.
::
:: Includes subscriber list, dome (desk content), possible commit state (for
:: local changes), possible merge state (for incoming merges), and permissions.
::
+$ dojo
$: qyx=cult :: subscribers
dom=dome :: desk state
per=regs :: read perms per path
pew=regs :: write perms per path
fiz=melt :: state for mega merges
==

Move handler

There are several engine cores embedded in /sys/clay:

  • ++de desk engine to modify the desk (commits, merges, etc.) and metadata about the desk
  • ++ze utility engine to manipulate desk state itself
  • ++lu userspace agent management engine
  • ++me merge management core

The formal Arvo interface is located at section 4cA, filesystem vane. It is rather complex compared to the smaller vanes, and deserves a scan through the main arms.

The complete set of moves for Clay is:

:: ::::
:::: ++clay :: (1c) versioning
:: ::::
++ clay ^?
|%
+$ gift :: out result <-$
$% [%boon payload=*] :: ames response
[%croz rus=(map desk [r=regs w=regs])] :: rules for group
[%cruz cez=(map @ta crew)] :: permission groups
[%dirk p=@tas] :: mark mount dirty
[%ergo p=@tas q=mode] :: version update
[%hill p=(list @tas)] :: mount points
[%done error=(unit error:ames)] :: ames message (n)ack
[%mere p=(each (set path) (pair term tang))] :: merge result
[%ogre p=@tas] :: delete mount point
[%rule red=dict wit=dict] :: node r+w permissions
[%tire p=(each rock:tire wave:tire)] :: app state
[%writ p=riot] :: response
[%wris p=[%da p=@da] q=(set (pair care path))] :: many changes
== ::
+$ task :: in request ->$
$~ [%vega ~] ::
$% [%boat ~] :: pier rebooted
[%cred nom=@ta cew=crew] :: set permission group
[%crew ~] :: permission groups
[%crow nom=@ta] :: group usage
[%drop des=desk] :: cancel pending merge
[%info des=desk dit=nori] :: internal edit
$>(%init vane-task) :: report install
[%into des=desk all=? fis=mode] :: external edit
$: %merg :: merge desks
des=desk :: target
her=@p dem=desk cas=case :: source
how=germ :: method
== ::
$: %fuse :: merge many
des=desk :: target desk
bas=beak :: base desk
con=(list [beak germ]) :: merges
== ::
[%mont pot=term bem=beam] :: mount to unix
[%dirk pot=term] :: mark mount dirty
[%ogre pot=$@(term beam)] :: delete mount point
[%park des=desk yok=yoki ran=rang] :: synchronous commit
[%perm des=desk pax=path rit=rite] :: change permissions
[%pork ~] :: resume commit
[%prep lat=(map lobe page)] :: prime clay store
[%rein des=desk ren=rein] :: extra apps
[%stir arg=*] :: debug
[%tire p=(unit ~)] :: app state subscribe
[%tomb =clue] :: tombstone specific
$>(%trim vane-task) :: trim state
$>(%vega vane-task) :: report upgrade
[%warp wer=ship rif=riff] :: internal file req
[%werp who=ship wer=ship rif=riff-any] :: external file req
[%wick ~] :: try upgrade
[%zeal lit=(list [=desk =zest])] :: batch zest
[%zest des=desk liv=zest] :: live
$>(%plea vane-task) :: ames request
== ::

The scry interface

Scries

Clay has more cares than any other vane because it needs to store and build Hoon code, as well as handle resource transformation using marks. We have already incidentally run into many of these scries, but for the sake of summary:

+$ care :: clay submode
$? %a %b %c %d %e %f ::
%p %q %r %s %t %u ::
%v %w %x %y %z ::
== ::

Resource cares

Build cares

System care

At this point, %s is the only care that we haven't looked at yet. (Unlike some other vanes, Clay has a full complement of convenience scries.)

> =/ =dome:clay .^(dome:clay %cv %)
=/ =tako:clay (~(got by hit.dome) let.dome)
.^(tako:clay %cs %/hash/(scot %uv tako))
0v16.er7uq.oke4u.cru7u.nglu9.q3su7.6ub1o.bh4qk.r5uav.ut12d.5rdl5
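
For comparison with the %s scry above, here is a minimal sketch of a few resource cares as dojo scries. It assumes a ship whose current desk contains /gen/hello/hoon (an assumed path; substitute any file you know exists). %y lists a directory as an arch, %u tests whether a file exists, %x returns a file's contents (a cord for a %hoon file), and %w returns the desk's current revision number and date.

```hoon
> .^(arch %cy %/gen)
> .^(? %cu %/gen/hello/hoon)
> .^(@t %cx %/gen/hello/hoon)
> .^(cass:clay %cw %)
```

Each scry names the type it expects first, then the vane letter plus care (%c for Clay followed by the care), then the path.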

|mount & unix.c

Urbit maintains its own single-level store, including Clay, via the runtime, but it supports synchronizing Clay's view of the filesystem with the underlying host OS. To mount a drive in this sense means to make a Unix-visible copy in the pier; the more recently timestamped of two files is considered the canonical instance.

:: ::::
:::: ++clay :: (1c) versioning
:: ::::
++ clay ^?
|%
+$ gift :: out result <-$
$% [%dirk p=@tas] :: mark mount dirty
[%hill p=(list @tas)] :: mount points
[%ogre p=@tas] :: delete mount point
== ::
+$ task :: in request ->$
$% [%boat ~] :: pier rebooted
[%mont pot=term bem=beam] :: mount to unix
[%dirk pot=term] :: mark mount dirty
[%ogre pot=$@(term beam)] :: delete mount point
== ::

Mount point information is stored in the +$raft at mon=(map term beam). The actual procedure for mounting a drive is in ++mount; note particularly the call out to ++ergo, which brokers file synchronization to Unix via the associated hez duct. The resulting %ergo gift is handled on the Unix side in vere/io/unix.c.

  • Scan through vere/io/unix.c.
  • How are on-Urbit edits handled in the case of conflict with a base file? Explore this scenario.
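
The mount lifecycle described above can be sketched from the dojo as follows. |mount, |commit, and |unmount are the usual %hood generators; the |pass line is only a rough illustration of what a raw %mont task looks like given the task definition above. The mount point name %gen-only and the /gen subpath are assumptions chosen for the example.

```hoon
::  mount %base into the pier, edit files from Unix, then sync them back
> |mount %base
> |commit %base
::  assumed illustration: mount only /gen of %base at a mount point named %gen-only
> |pass [%c [%mont %gen-only [our %base da+now] /gen]]
::  remove the Unix-side copy of %base
> |unmount %base
```

Try this on a fake ship first, since commits from a mount point overwrite the desk's current files.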

Solid-state subscriptions

Agents frequently need to synchronize all or some of their state via communication. To do this, they can either communicate their entire state when it changes, or they can send deltas indicating how to update the state to a particular point. (There could be checks on this like reporting a checksum or the hash.) Chat agents, for instance, send single messages rather than the total history of the chat channel to that point.

The more efficient solution is … to only send out instructions on how to update the state, but then any subscribed Agent B has to manually interpret these, update its own state, and risk getting some detail wrong. Even if this is done correctly, reimplementing this common pattern in many agents is obviously both wasting wetware and cluttering codebases. [Solid-state subscriptions are] how we … implement the second solution in kernelspace, reducing code overhead, network load and memory usage at the same time.
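
The state-versus-delta distinction can be sketched as a small core. This is an illustration only, not the SSS library itself: the names rock (full state) and wave (delta) echo the %tire gift shown earlier, while the arm name wash and the chat-log types are assumptions made for this example.

```hoon
|%
+$  rock  (list @t)                     ::  full state: every message so far
+$  wave  @t                            ::  delta: one new message
++  wash                                ::  apply one delta to a full state
  |=  [=rock =wave]
  ^+  rock
  (snoc rock wave)
--
```

A publisher would send a rock once and then a stream of waves; a subscriber keeps its copy current by folding each incoming wave into its rock with wash, so the update logic lives in one place instead of being rewritten in every agent.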

SSS will likely continue to evolve.

Permissions

Clay supports file permissions at the level of paths (and daughter paths). See ++perm in /sys/clay for an example of setting permissions. This system does not appear to be used much at the current time. It does impact requests of remote file resources, which is not yet a common use pattern beyond simply publishing code.

:: ::::
:::: ++clay :: (1c) versioning
:: ::::
++ clay ^?
|%
+$ gift :: out result <-$
$% [%croz rus=(map desk [r=regs w=regs])] :: rules for group
[%cruz cez=(map @ta crew)] :: permission groups
[%rule red=dict wit=dict] :: node r+w permissions
== ::
+$ task :: in request ->$
$% [%cred nom=@ta cew=crew] :: set permission group
[%crew ~] :: permission groups
[%crow nom=@ta] :: group usage
[%perm des=desk pax=path rit=rite] :: change permissions
== ::
--
::
+$  crew  (set ship)                                    ::  permissions group
+$  regs  (map path rule)                               ::  rules for paths
+$  rule  [mod=?(%black %white) who=(set whom)]         ::  node permission
+$  whom  (each ship @ta)                               ::  ship or named crew
+$  rite                                                ::  new permissions
  $%  [%r red=(unit rule)]                              ::  for read
      [%w wit=(unit rule)]                              ::  for write
      [%rw red=(unit rule) wit=(unit rule)]             ::  for read and write
  ==

  • Trace how |public works.
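
As a hedged sketch of how the types above fit together, the dojo lines below build a %perm task by hand and then read the result back with the %p care. The target file /gen/hello/hoon and the ship ~nes are assumptions for illustration; the shape of the task follows the +$task and +$rite definitions shown above.

```hoon
::  whitelist a single ship (~nes) for reads of one file on %base
> |pass [%c [%perm %base /gen/hello/hoon %r ~ %white (sy [%.y ~nes]~)]]
::  read back the read and write rules in force at that path
> .^([r=dict:clay w=dict:clay] %cp %/gen/hello/hoon)
```

Here `%r ~ %white ...` is a rite of the form [%r red=(unit rule)], and each member of the set is a whom: [%.y ship] for a ship or [%.n @ta] for a named crew.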

Some related material for %treaty and the docket file system will be covered in ca12.

🏺 Kiln

%kiln is the system affordance for interacting with Clay and Gall from userspace without composing direct tasks. It's a library inside of %hood and a set of associated generators.

A %hood generator (located in /gen/hood) that wants to interact with %kiln needs to send a poke indicating which predefined %kiln action should be taken, e.g.,

;<  ~  bind:m  (poke-our:strandio %hood %kiln-mount !>([pax desk]))
(pure:m !>(~))

You can see the set of %kiln actions in the ++poke arm of /lib/kiln. Most of these have an associated generator in /gen/hood.

  • Trace the %kiln action for |revive.
  • Trace the %kiln action for |ota.
  • Trace the %kiln action for |mount.
  • See ~midden-fabler, mount-all-desks.hoon for an example of using %kiln in another generator.
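
For orientation before tracing the real generators, a %hood generator that targets %kiln can be as small as the following sketch. The file name /gen/hood/my-mount/hoon is hypothetical, and the sample shape (a path and a mount point name) is an assumption based on the %kiln-mount poke shown above; compare it against the actual generators in /gen/hood.

```hoon
::  /gen/hood/my-mount/hoon  (hypothetical name)
::  a %say generator whose product is poked into %hood as a %kiln-mount action
:-  %say
|=  [[now=@da eny=@uvJ bec=beak] [pax=path pot=term ~] ~]
[%kiln-mount pax pot]
```

Because |name in the dojo is shorthand for running a generator from /gen/hood against %hood, the mark at the head of the product (%kiln-mount here) is what selects the %kiln action.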

🪦 Tombstoning

Tombstoning is the deletion of data for old desk revisions. Clay has a single %tomb task, but its clue has a number of different possible actions:

+$ clue :: murder weapon
$% [%lobe =lobe] :: specific lobe
[%all ~] :: all safe targets
[%pick ~] :: collect garbage
[%norm =ship =desk =norm] :: set default norm
[%worn =ship =desk =tako =norm] :: set commit norm
[%seek =ship =desk =cash] :: fetch source blobs
==
::
+$ norm (axal ?)

A tombstoned value can no longer be successfully returned from a scry. In this case, [~ ~] is a response meaning that you can never know the value.

The tombstone policy (+$norm) is an axal, a recursive directory-shaped structure, so policy flags are attached to nodes in a tree of paths.
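
Given the +$clue definition above, raw %tomb tasks can be sketched from the dojo as below. This is an illustration built from the types shown rather than the recommended workflow, and it is destructive, so try it only on a fake ship.

```hoon
::  tombstone all safe targets
> |pass [%c [%tomb %all ~]]
::  collect garbage
> |pass [%c [%tomb %pick ~]]
```

The other clues (%lobe, %norm, %worn, %seek) follow the same pattern, with the extra fields from +$clue filled in after the tag.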

Story

Story is a set of generators for producing Clay commit messages. The actual messages are stored in a file in Clay, effectively using Clay as a database. The generators are instrumented through %hood/%helm so they can pass notes to Arvo.

> |new-desk %tale
> |mount %tale
> |cp /===/mar/story/hoon /=tale=/mar/story/hoon
+ /~zod/tale/2/mar/story/hoon
> |cp /===/sur/story/hoon /=tale=/sur/story/hoon
+ /~zod/tale/3/sur/story/hoon
> |cp /===/lib/story/hoon /=tale=/lib/story/hoon
+ /~zod/tale/4/lib/story/hoon
> |story-init, =desk %tale
+ /~zod/tale/5/story
> +story-read, =desk %tale
> |story-write 'Short message' 'Long descriptive message', =desk %tale
: /~zod/tale/6/story
> +story-read, =desk %tale
commit: 0vn.l7i50.emt3e.79vbv.tjuv6.ftaqk.pos61.iqa5q.j0jq4.7mn92.vjssn
Short message
Long descriptive message

Story is supported in %base, but you'll need to make the mark available on the target desk as done here.

The Future of Clay

Clay does some things very well, but at the current scale of Urbit it has not really been stress-tested to its performance limits. (There are some limits on the number of tokens that can be loaded from a single file, for instance.)

There are really two directions we can go with Clay: strip it back down towards source control and distribution, or scale it up into a full noun management system.

The first approach is rooted in an argument that Clay shouldn't do everything, but should instead push aspects of file management and data storage off to Gall. The details of this have not been laid out explicitly in any document I'm aware of, but it has been discussed in core architecture meetings.

The other possibility is that Gall and Clay merge into a hypothetical vane called Hume, which then manages agents and agent data in the same space as files and source.

Exercise

  • Produce a %hood generator that triggers %kiln to produce a file containing the line count of a supplied text file. This file should have the same name but a .wc suffix (which will require a mark). A %txt mark results in (list cord).
  • Walk through producing an OTA for a fake ~zod sponsor and a ~marzod sponsee. See MAINTAINERS.md for details of that process (under “Release Next Release Candidate”); you will obviously need to change ship identities.

There are always horrid exceptions, even in common use -- like extensionless Makefiles. The trivial solution is that if %clay finds a file mysterious, it won't track it.

  • What happens to a Makefile today (that is, a file without a suffix)?