This module will introduce you to several useful data structures built on the door, then discuss how the compiler handles types and the sample.
Key Data Structures and Molds
++maps are a versatile way to store and access data, but they are far from the only useful pattern. ++map
s were documented in the previous module.
tree
We use tree to make a binary tree data structure in Hoon, e.g., (tree @)
for a binary tree of atoms.
There are two kinds of tree
in Hoon:
The null tree
~
.A non-null tree which is a cell with three parts.
- The node value.
- The left child of the node.
- The right child of the node.
Each child is itself a tree. The node value has the face
n
, the left child has the facel
, and the right child has the facer
. The following diagram provides an illustration of a(tree @)
(without the faces):
12/ \8 14/ \ / \4 ~ ~ 16/ \ / \~ ~ ~ ~
Hoon supports trees of any type that can be constructed in Hoon, e.g.: (tree @)
, (tree ^)
, (tree [@ ?])
, etc. Let's construct the tree in the diagram above in the dojo, casting it accordingly:
> `(tree @)`[12 [8 [4 ~ ~] ~] [14 ~ [16 ~ ~]]]{4 8 12 14 16}
Notice that we don't have to insert the faces manually; by casting the noun above to a (tree @)
Hoon inserts the faces for us. Let's put this noun in the dojo subject with the face b
and pull out the tree at the left child of the 12
node:
> =b `(tree @)`[12 [8 [4 ~ ~] ~] [14 ~ [16 ~ ~]]]> b{4 8 12 14 16}> l.b-find.l.bfind-fork
This didn't work because we haven't first proved to Hoon that b
is a non-null tree. A null tree has no l
in it, after all. Let's try again, using ?~
wutsig to prove that b
isn't null. We can also look at r
and n
:
> ?~(b ~ l.b){4 8}> ?~(b ~ r.b){14 16}> ?~(b ~ n.b)12
Find and Replace
Here's a program that finds and replaces certain atoms in a (tree @)
.
|= [nedl=@ hay=(tree @) new=@]^- (tree @)?~ hay ~:+ ?: =(n.hay nedl)newn.hay$(hay l.hay)$(hay r.hay)
nedl
is the atom to be replaced, hay
is the tree, and new
is the new atom with which to replace nedl
. Save this as findreplacetree.hoon
and run in the dojo:
> b{4 8 12 14 16}> +findreplacetree [8 b 94]{4 94 12 14 16}> +findreplacetree [14 b 94]{4 8 12 94 16}
set
A set is rather like a list except that each entry can only be represented once. As with a map, a set
is typically associated with a particular type, such as (set @ud)
for a collection of decimal values. (set
s also don't have an order, so they're basically a bag of unique values.)
set
operations are provided by ++in. Most names are similar to map
/++by operations when applicable.
++silt produces a set
from a list
:
=primes (silt ~[2 3 5 7 11 13])
++put:in adds a value to a set
(and null-ops when the value is already present):
=primes (~(put in primes) 17)=primes (~(put in primes) 13)
++del:in removes a value from a set
:
=primes (~(put in primes) 18)=primes (~(del in primes) 18)
++has:in checks for existence:
> (~(has in primes) 15)%.n> (~(has in primes) 17)%.y
++tap:in yields a list
of the values:
> ~(tap in primes)~[3 2 7 5 11 13 17]> (sort ~(tap in primes) lth)~[2 3 5 7 11 13 17]
++run:in applies a function across all values:
> (~(run in primes) dec){10 6 12 1 2 16 4}
Example: Cartesian Product
Here's a program that takes two sets of atoms and returns the Cartesian product↗ of those sets. A Cartesian product of two sets a
and b
is a set of all the cells whose head is a member of a
and whose tail is a member of b
.
|= [a=(set @) b=(set @)]=/ c=(list @) ~(tap in a)=/ d=(list @) ~(tap in b)=| acc=(set [@ @])|- ^- (set [@ @])?~ c acc%= $c t.cacc |- ?~ d acc%= $d t.dacc (~(put in acc) [i.c i.d])====
Save this as cartesian.hoon
in your urbit's pier and run in the dojo:
> =c `(set @)`(sy ~[1 2 3])> c{1 2 3}> =d `(set @)`(sy ~[4 5 6])> d{5 6 4}> +cartesian [c d]{[2 6] [1 6] [3 6] [1 4] [1 5] [2 4] [3 5] [3 4] [2 5]}
unit
Redux (and vase
)
We encountered the unit briefly as a tool for distinguishing null results from actual zeroes: using a unit
allows you to specify something that may not be there. For this reason, unit
s are commonly used for operations that sometimes fail, such as search functions, database lookups, remote data requests, etc.
You can build a unit
using the tic special notation or ++some:
> `%mars[~ %mars]> (some %mars)[~ u=%mars]
While ++got:by is one way to get a value back without wrapping it in a unit
, it's better practice to use the unit
logic gates to manipulate gates to work correctly with unit
s.
For example, use ++need to unwrap a unit
, or crash if the unit
is ~
null.
> =colors (malt `(list (pair @tas @ux))`~[[%red 0xed.0a3f] [%yellow 0xfb.e870] [%green 0x1.a638] [%blue 0x66ff]])> (~(get by colors) %yellow)[~ q=0xfb.e870]> (need (~(get by colors) %yellow))0xfb.e870> (~(get by colors) %teal)~> (need (~(get by colors) %teal))dojo: hoon expression failed
Rather than unwrap a unit, one can modify gates to work with unit
s directly even if they're not natively set up that way. For instance, one cannot decrement a unit
because ++dec doesn't accept a unit
. ++bind can bind a non-unit
function—another gate-building gate!.
> (bind ((unit @ud) [~ 2]) dec)[~ 1]> (bind (~(get by colors) %orange) red)[~ 0xff]
(There are several others tools listed on that page which may be potentially useful to you.)
A vase is a pair of type and value, such as that returned by !>
zapgar. A vase
is useful when transmitting data in a way that may lose its type information.
Containers of Containers
maps and sets are frequently used in the standard library and in the extended ecosystem. There are other common patterns which recur often enough that they have their own names:
++jar is a mold for a
map
oflist
s.++jar
uses the ++ja core. (Mnemonic: jars hold solid ordered things, like a list.)++jug is a mold for a
map
ofset
s.++jug
uses the ++ju core. (Mnemonic: jugs hold liquids, evoking the unordered nature of a set.)++mip is a mold for a map of maps.
++mip
lives in the%landscape
desk in/lib/mip.hoon
. Affordances are still few but a short example follows:> =mip -build-file /=landscape=/lib/mip/hoon> =my-map-warm (malt `(list (pair @tas @ux))`~[[%red 0xed.0a3f] [%yellow 0xfb.e870]])> =my-map-cool (malt `(list (pair @tas @ux))`~[[%green 0x1.a638] [%blue 0x66ff]])> =my-mip *(mip:mip @tas (map @tas @ux))> =my-mip (~(put bi:mip my-mip) %cool %blue 0x66ff)> =my-mip (~(put bi:mip my-mip) %cool %green 0x1.a638)> =my-mip (~(put bi:mip my-mip) %warm %red 0xed.0a3f)> =my-mip (~(put bi:mip my-mip) %warm %yellow 0xfb.e870)> my-mip[ n=[p=%warm q=[n=[p=%yellow q=0xfb.e870] l=[n=[p=%red q=0xed.0a3f] l=~ r=~] r=~]]l=[n=[p=%cool q=[n=[p=%green q=0x1.a638] l=[n=[p=%blue q=0x66ff] l=~ r=~] r=~]] l=~ r=~]r=~]> (~(got bi:mip my-mip) %cool %green)0x1.a638> ~(tap bi:mip my-mip)~[[x=%warm y=%yellow v=0xfb.e870][x=%warm y=%red v=0xed.0a3f][x=%cool y=%green v=0x1.a638][x=%cool y=%blue v=0x66ff]]mip
s are unjetted and quite slow but serve as a proof of concept.++mop ordered maps are discussed in the App School guides.
Molds and Samples
Modifying Gate Behavior
Sometimes you need to modify parts of a core (like a gate) on-the-fly to get the desired behavior. For instance, if you are using ++roll to calculate the multiplicative product of the elements of a list, this “just works”:
> (roll `(list @ud)`~[10 12 14 16 18] mul)483.840
In contrast, if you do the same thing to a list of numbers with a fractional part (@rs
floating-point values), the naïve operation will fail:
> (roll `(list @rs)`~[.10 .12 .14 .16 .18] mul:rs).0
Why is this? Let's peek inside the gates and see. Since we know a core is a cell of [battery payload]
, let's take a look at the payload:
> +:mul[[a=1 b=1] <33.uof 1.pnw %138>]> +:mul:rs[[a=.0 b=.0] <21.ezj [r=?(%d %n %u %z) <51.njr 139.oyl 33.uof 1.pnw %138>]>]
The correct behavior for ++mul:rs is really to multiply starting from one, not zero, so that ++roll doesn't wipe out the entire product.
Custom Samples
In an earlier exercise we created a door with sample [a=@ud b=@ud c=@ud]
. If we investigated, we would find that the initial value of each is 0
, the bunt value of @ud
.
> +6:poly[a=0 b=0 c=0]
What if we wish to define a door with a chosen sample value directly? We can make use of the $_
buccab rune, whose irregular form is simply _
. To create the door poly
with the sample set to have certain values in the Dojo, we would write
> =poly |_ [a=_5 b=_4 c=_3]++ quad|= x=@ud(add (add (mul a (mul x x)) (mul b x)) c)--> (quad:poly 2)31
For our earlier example with ++roll, if we wanted to set the default sample to have a different value than the bunt of the type, we could use _
cab:
> =mmul |=([a=_.1 b=_.1] (mul:rs a b))> (roll `(list @rs)`~[.10 .12 .14 .16 .18] mmul).483840
Named Tuples
A named tuple is a structured collection of values with faces. The $:
buccol rune forms a named tuple. We use these implicitly in an irregular form when we specify the sample of a gate, as |=([a=@ b=@] (add a b))
expands to a $:
buccol expression for [a=@ b=@]
. Otherwise, we only need these if we are building a special type like a vector (e.g. with two components like an x and a y).
Structure Mode
Most Hoon expressions evaluate normally (that's what “normal” means), what we'll call noun mode (or normal mode). However, sample definitions and +$
lusbuc mold specification arms evaluate in what is called structure mode. (You may occasionally see this the older term “spec mode”.) Structure mode expressions use a similar syntax to regular Hoon expressions but create structure definitions instead.
For instance, in eval mode if you use the irregular form p=1
this is an irregular form of the ^=
kettis rune. This is one way to define a variable using a =+
tislus; these are equivalent statements:
> =+(hello=1 hello)1> =+(^=(hello 1) hello)1
(Normally we have preferred =/
tisfas in the Hoon School docs, but that is just for consistency.)
In a sample definition, such as in a gate, the statement is evaluated in structure mode; these are equivalent statements:
|=(hello=@ hello)|=($=(hello @) hello)
There are several other subtle cases where normal mode and structure mode diverge, but most of the time structure mode is invisible to you. The $
buc runes are typically invoked in structure mode.