Infinite Negative Utility

One of the most famous episodes of Star Trek: The Next Generation is the second episode of the fifth season: “Darmok”. In it, the starship Enterprise and its crew have come across the Tamarians, an alien civilization whose language seems to stymie the show's conceit of a perfect “universal translator.” As the crew of the Enterprise ask the straightforward questions one would expect in a diplomatic context—”Would you be prepared to consider the creation of a mutual non-aggression pact between our two peoples?“—the Tamarians respond with translated but still incomprehensible statements filled with proper nouns and colorful imagery, like, “Shaka, when the walls fell,” or, “Temba, his arms wide,” or the phrase that provides the episode's title: “Darmok and Jalad, at Tanagra.”

As the episode progresses, with our protagonist Captain Picard and the Tamarian captain Dathon trapped together on the planet and struggling to understand each other, both Picard and his crew independently come to understand the guiding principle of these phrases: Tamarians communicate by something like imagery or metaphor, using well-known cultural allusions as shorthand for the situations they're in and the actions they're taking. The phrase “Shaka, when the walls fell” alludes to a famous historical disaster, and functions almost like a Tamarian swear-word, while “Darmok and Jalad, at Tanagra” alludes to a myth or folktale about two strangers who face a common threat and emerge as friends, and serves to represent the events of the episode itself, where Picard and Dathon face off against a strange beast and in the process bring their two peoples together.

In an article in The Atlantic—appropriately also titled “Shaka, When the Walls Fell”—Ian Bogost describes the events of the episode and provides an interesting analysis of the Tamarian language. In the process of doing a deep read of the episode's use of Tamarian, he rejects both “metaphor” and “imagery”, the words offered by the episode itself to describe the Tamarians' phrases, and instead offers two alternatives: “strategy” and “logic”.

“Strategy” is perhaps the best metaphor of all for the Tamarian phenomenon the Federation misnames metaphor. A strategy is a plan of action, an approach or even, at the most abstract, a logic. Such a name reveals what’s lacking in both metaphor and allegory alike as accounts for Tamarian culture. To be truly allegorical, Tamarian speech would have to represent something other than what it says. But for the Children of Tama, there is nothing left over in each speech act. The logic of Darmok or Shaka or Uzani is not depicted as image, but invoked or instantiated as logic in specific situations.

[…]

If we pretend that “Shaka, when the walls fell” is a signifier, then its signified is not the fictional mythological character Shaka, nor the myth that contains whatever calamity caused the walls to fall, but the logic by which the situation itself came about. Tamarian language isn’t really language at all, but machinery.

In the sort of hyper-pedantic fan spaces where Star Trek is popular, you sometimes see a criticism of the conceit of the episode: how could the Tamarians, with their florid allusions and indirect language, ever have been able to develop advanced technology like space-flight? The episode depicts their technology as superior to even the Federation's, but how could the Tamarians have built any of it without being able to issue a direct technical command like, “Connect these two pipes”? Bogost touches on this by contrasting the detail-by-detail bits of technobabble shown on the Enterprise with the terseness of Tamarian strategy:

As Troi explained, the Tamarians’ possess a sophisticated aptitude for abstraction. This capacity responds to fans’ skepticism at the Tamarian’s technological prowess. The Children of Tama would not be delayed by their inability to speak directly because they seem to have no need whatsoever for explicit, low-level discourse like instructions and requests. They’d just not bother talking about the socket wrench, instead proceeding to the actual work of building or maintaining the vessel.

[…]

While the episode doesn’t provide a Tamarian mythical equivalent [of a dense technical conversation], we can speculate on how the Tamarians would handle a similar situation. While I suppose the explicit directive to adjust thermal input by a specified amount might be rendered allegorically (some Tamarian speech is narrower than others), it’s equally likely that the entire exchange would be unnecessary, subsumed into some larger operation, say, “Baby Jessica, in her well.” The rest is just details.

Does this response mean that the fan criticism is invalid, and that the episode is “actually realistic”? Well, of course not: it's science fiction, and part of the fun of science fiction is to push past the bounds of realism and play around in the hypothetical and the near-real. There are still a myriad of ways to pick holes in the “realism” of “Darmok”, whether those holes are in the technology, the language, the culture, all of it.

But it can be powerful and rewarding to engage with science fiction by taking it at face value, accepting the what-if implicit in the premise, then working through what it would or could imply. That's how Bogost is engaging with the Tamarian language, by saying, “Well, what if this is how it worked? What then?” And the notion of a short phrase encoding not simply a referent but an entire approach, logic, and even world-view: well, isn't that a fascinating concept to consider?


The notion of “design patterns” in software is typically associated with the book Design Patterns: Elements of Reusable Object-Oriented Software by Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides, a group also called the “Gang of Four”. This book provides the template for what many programmers imagine when they hear the words “design pattern”: the Visitor Pattern, the Command Pattern, the Abstract Factory pattern, and so forth.

Of course, design patterns didn't originate there: it's also relatively well-known that the originator of “design patterns” as a concept was the architect Christopher Alexander, in particular in the 1977 book A Pattern Language: Towns, Buildings, Construction which Alexander co-wrote with Sara Ishikawa and Murray Silverstein. This book is actually the middle part of a series, preceded by The Oregon Experiment (which describes the design of the University of Oregon campus) and followed by The Timeless Way of Building (which acts like a summary of the approach in the previous two books.) Alexander's patterns are techniques for designing spaces, ranging from region-level (like “Lace of Country Streets”) all the way down to tiny details (like “Half-Inch Trim”.)

However, there's still a gap in that history of patterns. Patterns were first brought to computing well before the Gang of Four. In 1987, Kent Beck and Ward Cunningham presented a paper at OOPSLA called Using Pattern Languages for Object-Oriented Programs.

A pattern language guides a designer by providing workable solutions to all of the problems known to arise in the course of design. It is a sequence of bits of knowledge written in a style and arranged in an order which leads a designer to ask (and answer) the right questions at the right time. Alexander encodes these bits of knowledge in written patterns, each sharing the same structure. Each has a statement of a problem, a summary of circumstances creating the problem and, most important, a solution that works in those circumstances. A pattern language collects the patterns for a complete structure, a residential building for example, or an interactive computer program. Within a pattern language, patterns connect to other patterns where decisions made in one influence the others. A written pattern includes these connections as prologue and epilogue. Alexander has shown that nontrivial languages can be organized without cycles in their influence and that this allows the design process to proceed without any need for reversing prior decisions.

Something notable which appears in both the original architectural treatment as well as the Beck and Cunningham treatment is the emphasis not just on patterns but on pattern languages: they're not simply approaches to writing or structuring some code, but families of approaches with interdependent motivations, rationales, and logics behind them.

This is apparent if you look at any of Christopher Alexander's patterns. Each of them (as Beck and Cunningham note above) is laid out in several rigorous sections, including a problem, a solution, and a note on usage, as well as sections of background and research. Here's a selected part of a pattern from the middle of the book called “Half-Open Wall”:

Problem:

Rooms which are too closed prevent the natural flow of social occasions, and the natural process of transition from one social moment to another. And rooms which are too open will not support the differentiation of events which social life requires.

Solution:

Adjust the walls, openings, and windows in each indoor space until you reach the right balance between open, flowing space and closed cell-like space. Do not take it for granted that each space is a room; nor, on the other hand, that all spaces must flow into each other. The right balance will always lie between these extremes: no one room entirely enclosed; and no space totally connected to another. Use combinations of columns, half-open walls, porches, indoor windows, sliding doors, low sills, french doors, sitting walls, and so on, to hit the right balance.

Usage:

Wherever a small space is in a larger space, yet slightly separate from it, make the wall between the two about half-open and half-solid—Alcoves, Workspace Enclosure. Concentrate the solids and the openings, so that there are essentially a large number of smallish openings, each framed by thick columns, waist high shelves, deep soffits, and arches or braces in the corners, with ornament where solids and openings meet—Interior Windows, Columns at the Corners, Column Place, Column Connections, Small Panes, Ornament...

The last section was written in a world before widespread digital hyperlinks, but it's remarkable how amenable to hyperlinking it is. (In fact, I confess I didn't retype the above from the book, I borrowed it from a web-based presentation of the patterns in the book which of course did the hyperlinking!) Each pattern references several other patterns, positioning it within a web of approaches with their own motivations and approaches and relationships.

Alexander's patterns are multi-level in a way that is self-reinforcing: using a pattern at the town level gives you scaffolding that supports the patterns shown at the neighborhood level, and in turn using patterns at the neighborhood level supports the patterns at the house level, and so forth. Alexander is explicit that this is a necessary part of how patterns are supposed to work:

In short, no pattern is an isolated entity. Each pattern can exist in the world, only to the extent that it is supported by other patterns: the larger patterns in which it is embedded, the patterns of the same size that surround it, and the smaller patterns which are embedded in it.

And for that matter, Alexander's patterns are also presented in order from large-scale to small-scale, deliberately describing the large-scale supporting patterns before the small patterns that reinforce them. Alexander remarks that the order is, “...essential to the way the language works.” Note too how Beck and Cunningham, in the passage I quoted, emphasize that a pattern language is, “...arranged in an order which leads a designer to ask (and answer) the right questions at the right time.”

Alexander also draws an explicit distinction between patterns that are the solution to a problem, where any possible solution to that problem must resemble the pattern given, and patterns which are a solution to the problem. Of the latter, Alexander even says

In these cases, we believe it would be wise for you to treat the pattern with a certain amount of disrespect—and that you seek out variants of the solution which we have given, since there are almost certainly possible ranges of solutions which are not covered by what we have written.

One reason this is important is because Alexander also puts emphasis on a specific word in the title: “A”. His opinion is that not only does each society have their own “pattern language” for the design of spaces, but that every individual within that society

[...] will have a unique language, shared in part, but which as a totality is unique to the mind of the person who has it. In this sense, in a healthy society there will be as many pattern languages as there are people—even though these languages are shared or similar.

Less loftily and more practically, Alexander suggests that projects can be improved by deciding on a specific pattern language to inform the project. The net goal of a pattern language, then, is that questions which might arise in the course of a design project have existing “default” answers, to the point that they barely need to be asked. After all, by drawing on an element of a pattern language, you are not just providing a solution to a single problem: you are contextualizing that solution within a web of other choices and approaches.

You might even say that, by invoking an element of a pattern language, you are invoking not simply a singular technique but an entire approach, logic, and even world-view.


If you're a programmer and you haven't read Alexander before, you may be noticing that this doesn't actually match up with your experience of patterns, well, at all. We certainly aren't using patterns to build software as advanced as the Tamarians' technology: we're often stuck with enterprise-level technology, and I don't mean the starship this time. Patterns are notoriously associated with the most tedious and unnecessarily verbose codebases out there, with “StrategyFactory” being a snarky shorthand for over-engineered code of all kinds.

The patterns put forward by the Gang of Four are, unsurprisingly, much less lofty in their goals. If you haven't read the book that introduces them, you may be surprised to discover that they are presented, in some respects, just as schematically as Alexander's patterns, starting with Intent and Motivation that work similarly to Alexander's Problem sections. However, the book contains a significantly smaller set of patterns—only 23 in total, an order of magnitude fewer than Alexander's 253—and they mostly sit within a fairly narrow abstraction band, with a few patterns vaguely gesturing at higher-level concerns (like the Chain of Responsibility Pattern) and the rest sitting firmly at the level of a few interacting objects or classes with their own state (like the Builder or Decorator or Iterator patterns.)

One of the most common criticisms I see of design patterns—and it's an old criticism, I associate it with Peter Norvig's 1998 presentation “Design Patterns in Dynamic Programming”, but I suspect it wasn't unheard-of even then—is that patterns as put forth by the Gang of Four are simply excessively-ornate workarounds for missing programming language features, and that patterns go away in well-designed programming languages. Norvig specifically describes some of the Gang-of-Four design patterns as “invisible or simpler” in dynamically typed languages, although some of his examples are less about dynamic typing and more about the existence of anonymous or higher-order functions.

A clear example here is the Command pattern used in object-oriented languages. A Command is an interface over executing pieces of code: you have an invoker, a piece of code which knows how to execute commands, as well as one or more commands, objects which implement an interface accepted by the invoker. This interface can consist of a single method: execute().

interface Command {
  public void execute();
}

class Invoker {
  static void invoke(Command cmd) {
    cmd.execute();
  }
}

class SayHello implements Command {
  public void execute() {
    System.out.println("Hello!");
  }
}

public static void main(String[] args) {
  Invoker.invoke(new SayHello());
}

It's also very easy to see why this pattern is “invisible” in other languages: it's just a callback! An invoker becomes a function which takes another function and calls it. The interface becomes the type of a function which takes no arguments, or disappears entirely in a dynamically-typed language. The ceremony is gone. Even keeping, for the sake of argument, a separate invoker and static types, we can see a fraction of the code is necessary in Python:

from typing import Callable

def invoke(cmd: Callable[[], None]):
    cmd()

def say_hello():
    print("Hello!")

invoke(say_hello)

I think this is a very valid criticism (and, even though I'm quibbling with some parts of it, I do think Norvig's presentation here is very good and more nuanced than I'm going into here) but I'm going to defend the Gang of Four in two ways here.

Firstly, the Gang of Four make it clear that commands are often not just a callback but may contain other information or scaffolding to support the command itself. You can easily imagine other data attached to a command—say, team ownership, or log information—but we don't even need to leave the book's text for an example, because the book comes with its own example: supporting “undo”. If you want to have first-class commands with an additional undo operation, then you need at least a little more structure than a simple callback. Even in Python or Lisp, an undo-capable command would need a bit of ceremony, and it's nice to be able to describe the approach as, “Using the command pattern,” instead of, “Storing a callback as a first-class value along with extra data necessary to support how we use that callback.”

But there's a larger defense of the Gang of Four patterns to be made here, which is, paradoxically, that Norvig is correct: these patterns do become invisible in the right context... and that's the point1.

The goal of learning a language, after all, is that with fluency the language itself ceases being an object of focus and instead becomes the medium by which other ideas are expressed. When you first learn a spoken language, you explicitly track the inflections and conjugations and declensions, you rack your brain for vocabulary words, and you consciously position your tongue in ways you never did before. The process of achieving fluency is making those things disappear, so that you no longer thing about the words and how to arrange them, you think about the ideas.

So if a design pattern becomes obvious or invisible, it doesn't strictly follow that it's worthless to name it or call it out as a pattern. Perhaps it means that the pattern is, so to speak, “correct”.


Despite the above vague defense of the Gang of Four, I do broadly agree that design patterns failed to lived up to their promise or potential in computing.2 They don't even live up to the original pitch that Beck and Cunningham gave: go back and read that passage and notice how many things they emphasize that are considered necessary by Alexander and yet are missing from the Gang of Four.

Perhaps the most pithy summary of my complaints is this: there are three words in the phrase “a pattern language”, and modern software development only remembers the middle one.

The first word, “a”, is important because it implies that other pattern languages can exist: that is to say a pattern is not part of “the pattern language”, but one of a selected set. There can be, should be, and are others!

Remember what Alexander said about his choice of the word “a”: each individual has their own language, overlapping but unique. That's not a concept which comes up in software patterns very much, if at all, but I think it's true. Your own approach to a problem can end up wildly different from mine, because the patterns we pull on in our own pattern languages are not the same.

For that matter, each project may have its own set of techniques which differ from other projects. It's often apparent, reading the code of a project, whether it was written with a consistent set of approaches or no. Establishing a pattern language for a project—even if people don't refer to this activity by name or do it formally—can make a project more consistent and readable and accelerate all future design work in that project, because every design question will have its default answer.

The last word, “language”, is important because language is both an interrelated system and a medium for communication. The power of pattern languages is that they give us a shared vocabulary for communicating interrelated concepts.

This speaks to the other failure of Gang of Four patterns: they're not a language, they're not multi-level and reinforcing, they're not presented in an order amenable to helping a person solve problems: they're just a grab-bag of techniques. There are large-scale, architecture-level design patterns (say, “Plugin Architecture” or “Model-View-Controller” or “Functional Core, Imperative Shell”) and there are also micro-scale design patterns (say, “Defunctionalization”3 or “Exit Early Instead Of Branching4”) and neither kind of pattern appears in the Gang of Four book. A proper software pattern language would assemble all its patterns into a cohesive, self-reinforcing whole.

Despite my bombastic framing here, I don't think design patterns will actually get us all the way—or even necessarily part of the way—to Tamarian levels of technology. But I do think design patterns, even as we have them in software today, have a little more to recommend them than the pop-culture idea of patterns might suggest, and I also think there's the potential for proper pattern languages, assembled with thought and care, to be a great way of both teaching and communicating about how to write and maintain good software5.

Also, make sure you read that Bogost article. It's very good.


  1. There's also a flip-side to this, which is that Norvig's thesis here is also compelling from the point of view of language design: what language features might make more design patterns “invisible”?
  2. It's worth noting, in fairness, that they haven't necessarily lived up to their promise in architecture, either: my understanding is that Christopher Alexander is more widely known and read by programmers with vaguely-defined multidisciplinary pretensions—yes, like me—than by architects.
  3. If you're not familiar with it, defunctionalization is, more or less, “Turn a lambda or callback into a non-lambda by separating code and data.” Underneath the hood, this is what enables lots of async code, where functions turn into state machines so they can be suspended, but it can be useful as an explicit transformation: for example, if you want take a data structure that contains a lambda, serialize it, and run it on another machine, then you can't simply serialize the lambda (at least in most languages) which means you need to defunctionalize it first.
  4. That is to say, in a language with early returns, you can return early instead of wrapping the entire body of the function in conditionals, which is in some languages considered good style because it establishes invariants for the function up-front, prevents rightward drift, and is often conceptually easier to read . I'd absolutely contend that this is a pattern on the same level as Alexander's “Half-Inch Trim”.
  5. I'll motivate this a bit with an example: Parse, Don't Validate is a design pattern. It's not necessarily described in the original post as a design pattern, and it's not typically presented as part of a pattern language, but it's a pattern nonetheless. I also think having this blog post and having a name for the approach has been wildly valuable not just in Haskell but in many other programming language communities. Not only do I wish I had that named concept when I was first learning to program, I think I would have found it valuable if I could connect it with other strategies along the same lines, and that's where a pattern language could come in.

Matzo is a small, experimental, idiosyncratic programming language for creating random text1 which I have in a reasonably usable state at this point. It's still firmly alpha software—and it's a personal project I only work on sporadically, so I'm not making any promises about future stability—but it's usable enough that I feel comfortable at least blogging about it!

If you're familiar with Tracery2, you probably understand the same sorts of things you'd create in Matzo. Matzo aims for use-cases a bit more sophisticated, but only just: it's still a pretty simple tool at its core, as it's designed for small-scale human-directed kinds of text generation.

You can experiment with Matzo in your browser: this is a webassembly-compiled version of the Rust program. It doesn't actually let you save programs, but it'll put the whole program into the URL bar, which means you can more or less “save” programs by bookmarking them. It also has a handful of examples you can look at in the top-right, and it'll track the random seed at the same time.

The rest of this blog post will be a whirlwind tour of the language!

Simple programs

The simplest program in Matzo just outputs some text, which is done with the top-level puts statement. Everything in Matzo is either a definition or a puts to produce output. Statements are separated with semicolons, although the final semicolon is optional.

puts "Hello, world!";
  → run in playground

The simplest operation in Matzo is string concatenation, which is so pervasive it's not done with an operator at all: you just put expressions next to each other to concatenate.

puts "Hello" ", " "world" "!";
  → run in playground

Basic definitions are done with the := operator. We can pull out some of the expressions above and give them intermediate names:

greeting := "Hello";
target := "world";
puts greeting ", " target "!";
  → run in playground

Let's make this program a tiny bit more interesting, though, and introduce some non-determinism. In Matzo, this is done with the disjunction operator, written as a vertical bar (|) between expressions. Let's make it choose between two possible greetings:

greeting := "Hello" | "Yo";
target := "world";
puts greeting ", " target "!";
  → run in playground

This program can now produce two possible pieces of text: either Hello, world! or Yo, world!. In this case, it produces the two with roughly equal probability, but we can change that. One way to modify this is to provide more alternatives: for example, if we want it to produce Hello three times as often as it produces Yo, we can simply repeat Hello three times:

greeting := "Hello" | "Hello" | "Hello" | "Yo";
target := "world";
puts greeting ", " target "!";
  → run in playground

Because Matzo will, by default, give each choice within a single disjunction equal weight, this will end up four possibilities with equal weights, three of which happen to be identical: while this program can also produce Hello, world! or Yo, world!, it will produce the former ¾ of the time and the latter ¼ of the time.

But wanting weighted randomness is common enough that Matzo has a shorthand for it! We can use a numeric weight and achieve the same effect, with all unweighted branches being treated as having a numeric weight of 1:

greeting := 3: "Hello" | "Yo";
target := "world";
puts greeting ", " target "!";
  → run in playground

There's a bit of a gotcha with disjunction, too: weights are computed for each disjunction without looking inside what the expressions are, whether they're other random names, or literal expressions, or even other parenthesized expressions. This also means parentheses can change the probabilities associated with a disjunction, as everything within the parentheses becomes a single “choice”. Consider the following three statements:

(* example one *)
puts "a" | "b" | "c";

(* example two *)
puts "a" | ("b" | "c");

(* example three *)
puts ("a" | "b") | "c";
  → run in playground

Each of these lines might print a, b, or c when run—but they do so with different probabilities! The first one will print them all with equal likelihood; the second one will print a half the time, and each other option a quarter of the time; the third will print c half the time, and each other option a quarter of the time. This can be a bit unintuitive: for a lot of mathematical operations, introducing parenthetical grouping like this doesn't have an effect on the outcome3.

If you have several one-word string literals to choose from and you want them to have equal probability, Matzo also has a piece of shorthand up its sleeve: the ::= operator lets you use space-separated bare words and treat them as several string literals to randomly choose between4. We can rewrite our equal-weighted greeting program like this:

greeting ::= Hello Yo;
target ::= world;
puts greeting ", " target "!";
  → run in playground

In this case, it doesn't save that much typing, but with many alternatives, it can be much more terse:

(* twelve options with literal-plus-disjunction *)
greekGod :=
  "Zeus" | "Hera" | "Poseidon" | "Demeter" |
  "Athena" | "Aphrodite" | "Artemis" | "Apollo" |
  "Hephaestus" | "Ares" | "Dionysus" | "Hermes";

(* twelve options with the shorthand *)
romanGod ::=
  Jupiter Juno Neptune Ceres Minerva Venus
  Diana Apollo Vulcan Mars Bacchus Mercury;
  → run in playground

Randomness and fixed values

The following is a Matzo program that uses the same value more than once:

x := "a" | "b";
puts x x;
  → run in playground

This program can print aa, ab, ba, or bb, with equal probability. This demonstrates something important: bound names are re-evaluated each time they're used.

Sometimes, though, this isn't what we want! Imagine that we're building a short prose scene with a randomly-chosen character name: we want to choose a random character name, but for that random name to be consistently used within the text. We can do this with the fix keyword, which is used before a definition:

fix x := "a" | "b";
puts x x;
  → run in playground

This program now can only print aa or bb: the reason is that, once x is defined, it'll eagerly choose between its possible random values, and then subsequent uses of x won't be random.

With this, we can write a program like the one described above:

fix strangerName ::= Austin Carter Logan Sage;
puts "You come across your friend " strangerName "!";
puts "You haven't seen them in " ("weeks" | "months" | "years") ".";
puts strangerName " looks " ("pleased" | "thrilled" | "frustrated") " to see you.";
  → run in playground

Functions

There are a lot of situations in text generation where you might want simple function-like operations. For example, let's look at this program:

fix color ::= red brown white;
fix animal ::= cat dog rabbit;
puts "You see a " color " " animal ".";
puts "You shout out, \"Hey! " color " " animal "!\""
  → run in playground

This will produce output like

You see a white dog.
You shout out, "Hey! white dog!"

The problem, of course, is that the output text here isn't capitalized according to English conventions. You could try to change the structure of the text so you don't need capitalization, but for situations like these, Matzo provides built-in functions to help: in this case, you might use the str/capitalize function:

fix color ::= black brown white;
fix animal ::= cat dog rabbit;
puts "You see a " color " " animal ".";
puts "You shout to it: \"Hey! " str/capitalize[color] " " animal "!\""
  → run in playground

Functions are given lower-case names and are called with square brackets. Functions can have more than one argument, separated by commas. A good example of this is the rep function.

Imagine that you're generating random non-English words according to some schema. Often, that'll involve repeating the same syllable some number of times. You could write a program like this:

consonant ::= p t k w h n;
vowel := ("a" | "e" | "i" | "o" | "u") (4: "" | "'");
syllable := 4: consonant vowel | vowel;
word := syllable syllable
     | syllable syllable syllable
     | syllable syllable syllable syllable
     | syllable syllable syllable syllable syllable
     ;
puts word
  → run in playground

This produces simple nonsense words that roughly resemble Hawaiian or another Polynesian language, like piweka'kiwe or tuhau'te. But that definition for word is pretty repetitive! Luckily, for that situation, we can use the rep function, which takes two arguments: the first is an integer number of repetitions, and the second is the expression to repeat. We can refactor the above definition to this:

word := rep[2, syllable]
     | rep[3, syllable]
     | rep[4, syllable]
     | rep[5, syllable]
     ;
  → run in playground

But we can go even further! Most of that structure is the same, so we can fold the randomness in to the first parameter instead:

word := rep[2 | 3 | 4 | 5, syllable];
  → run in playground

And there's one final bit of Matzo shorthand we can use here: if we want to generate integers from a range, we can simply write that range as x..y, and Matzo will randomly choose a number from that range (treated inclusively.) So that definition can become

word := rep[2..5, syllable];
  → run in playground

A function that deserves special mention is se, the “sentence-ifier” function. You probably noticed before that we occasionally have to do a lot of work to capitalize words and manage spaces and punctuation: in fact, I've found that hands-down the most common bug in producing text from Matzo is getting spacing wrong, like this:

vehicle ::= car bus train bicycle;
puts "I want to take a " vehicle "to London."
  → run in playground

Notice the problem? This program is missing a space inside a string literal, so it might produce a sentence like I want to take a carto London. In order to automatically handle situations like this, the se function can take any number of arguments and will automatically insert spaces as necessary between arguments, not producing multiple spaces in a row and not putting spaces next to punctuation that doesn't need them, as well as trying to auto-capitalize words at the beginning. That means we can rewrite the above program, with spacing corrected, like this, and se will handle the spaces for us:

vehicle ::= car bus train bicycle;
puts se["I want to take a", vehicle, "to London."];
  → run in playground

There are a handful of other built-in functions, including those for working with data types I haven't talked about yet like tuples. But we can also write our own! We do that with the fn keyword. For example, we can define our own naïve pluralizing function like this:

fn plural[word] => word "s";
puts se["I bought four", plural["apple" | "orange" | "pear"], "today."];
  → run in playground

In order to make our functions more useful, though, we should talk about…

Pattern-matching

Obviously we can't simply pluralize any English noun by adding an s to the end: English plurals are infamously irregular! Our naïve pluralizing function above will turn feet into feets, focus into focuss, and child into childs. Actually writing a complete pluralization function is going to be pretty hard, but we can do better with a few hard-coded special cases.

Matzo also supports defining a function by cases. To do this, we can put curly braces around the body of the function, with each case being written separately and separated by semicolons:

fn plural {
  ["foot"] => "feet";
  ["child"] => "children";
  ["focus"] => "foci";
  [word] => word "s"
};
puts plural["foot"];
  → run in playground

This still isn't a complete pluralization function, of course, but now we can call plural["foot"] and it'll give us "feet" instead of "foots".

A reasonably common way that functions get used in Matzo is for generating abstract data and then using that abstract data in different concrete ways. For example, imagine that we're coming up with a random character generator and we want to support a set of gendered pronouns in the output: one way to do this is to choose from our list of supported genders, and use pattern-matching to choose specific gendered pronouns based on that gender:

fix gender := M | F | N;
fn subj {
  [M] => "he";
  [F] => "she";
  [_] => "they"
};
fn obj {
  [M] => "him";
  [F] => "her";
  [_] => "them"
};

puts "While walking, I came across a friend of mine.";
puts se["I hadn't seen", obj[gender], "in so long!"];
puts se[subj[gender], "greeted me warmly."];
  → run in playground

There's actually a lot more I could say about pattern-matching—in particular, the way pattern-matching interacts with non-determinism is very subtle and involves some very nuanced trade-offs—but that might have to wait for a future deep-dive blog post, since this one is already quite long!

Other data types

You've already seen numbers. Matzo supports integers and has a few mathematical operations, although because complicated math is relatively rare in the kinds of programs Matzo is intended for, it doesn't support the typical mathematical syntax: you instead have to write add[2, mul[3, 4]] for that kind of thing. You've also seen the range syntax, which can be very useful. In fact, you could replicate a basic dice-rolling program with just puts 1..6;.

You've also technically seen atoms already, but I didn't draw attention to them. Identifiers with capital letters aren't variables, they're just… themselves. If it helps, you can think of them as a shorthand for capitalized strings, since they get printed as themselves (although Foo and "Foo" are not identical for the purposes of pattern-matching.) They're particularly useful for abstract data that doesn't show up in the output directly but drives the output in some way.

Another very useful kind of data is tuples, which are sequences of expressions. They're written with angle brackets and there are a lot of built-in functions for working with them. I like using tuples when creating random words, especially when I want an unusual or flavorful orthography for the words.

Consider Italian orthography, where the letter c before certain vowels is pronounced like the English “k” and before other vowels is pronounced like the English “ch”. If I wanted to produce random words with an orthography like this, I'd approach it by generating the underlying sounds first, with each syllable represented as a tuple, and then process it to build the orthography using helpers like tuple/map and tuple/rep:

(* given a syllable represented as a tuple,
 * figure out an Italianiate orthography for it *)
fn syllableToOrthography {
  (* a "ch" sound followed by a front vowel
   * is written with the bare letter `c` *)
  [<"tʃ", "e">] => "ce";
  [<"tʃ", "i">] => "ci";

  (* a "ch" sound followed by other vowels
   * is written with `ci` plus the other vowel *)
  [<"tʃ", "a">] => "cia";
  [<"tʃ", "o">] => "cio";
  [<"tʃ", "u">] => "ciu";

  (* a hard "k" sound followed by a front
   * vowel is written with `ch` *)
  [<"k", "e">] => "che";
  [<"k", "i">] => "chi";

  (* all other hard "k" sounds are written
   * with the letter `c` *)
  [<"k", v>] => "c" v;

  [<c, v>] => c v;
};
  → run in playground

Finally, there are also records, which are bundles of data with named fields. I've found these most useful as a mechanism for creating a single conceptual “entity” with multiple random attributes while keeping them together. For example, in the following program, pet is going to represent a random pet, with a random name and a random type:

pet := {
  name: "Spot" | "King" | "Skimbleshanks",
  type: "cat" | "dog" | "rabbit",
};

fix alicePet := pet;
fix bobPet := pet;

puts se[
  "Alice has a", alicePet.type, "named", alicePet.name,
  ", while", bobPet.name, "is Bob's", bobPet.type, "."
];
puts se[
  "Bob's", bobPet.type, "and Alice's", alicePet.type, ("do not" | ""), "get along well."
];
  → run in playground

Notice how we can create the record with the {field: expression, field: expression, ...} syntax, and can pull out fields using the record.field syntax. We can also pattern-match on them using a syntax similar to the one used to create them.

A bigger example

Well, let's put this together and create a simple fake language! The goal here is to come up with random words with a particular flavor, given both as pronunciations in the International Phonetic Alphabet as well as a more casual transcription in the Latin alphabet. Firstly, let's create our consonants and vowels, represented in IPA, and with some weights to make certain sounds more likely:

cons := "p" | 2: "t" | "k"
      | 2: "m" | 3: "n" | 2: "ŋ"
      | "s" | "ʃ" | 2: "ɾ";
vowel ::= a i u;

Now, we can create a syllable. Each syllable here will be a record, with the various parts of the syllable represented as fields:

syll := {
  onset: cons,
  nucleus: vowel,
  coda: 5: "" | "ʔ",
};

Now, a word will be several syllables. Notice the weighting of the number here: it can be a single-syllable word, but most of the time it'll be multiple syllables.

fix word := tuple/rep[1 | 10: (2..5), syll];

Why did we fix it? Well, remember that we want to provide two bits of output: the orthographic spelling and the IPA pronunciation, and this means both should be drawing from the same random word, which here we'll just call word.

Producing the IPA orthography is easy enough: since we're using IPA for all the actual sounds, we simply need to pull them out of the record and put them next to each other:

fn syllToIpa [{onset:, nucleus:, coda:}] => onset nucleus coda;

Now, we need to process the IPA and come up with a Latin orthography. We can leave the vowels as-is, but we want to turn the IPA consonants into simple ASCII characters, so:

fn consToOrtho {
  ["ŋ"] => "ng";
  ["ʃ"] => "sh";
  ["ɾ"] => "r";
  ["ʔ"] => "'";
  [c] => c;
};

With this, we can now turn a syllable to orthography as well:

fn syllToOrtho [{onset:, nucleus:, coda:}] =>
  consToOrtho[onset] nucleus consToOrtho[coda];

Now, we can finally produce our output. We can use tuple/map to apply these functions to the words, and tuple/join to concatenate them down to a string (with an optional second argument for a separator), and produce our output!

ortho := tuple/join[tuple/map[syllToOrtho, word]];
ipa := "/ˈ" tuple/join[tuple/map[syllToIpa, word], "."] "/";

puts ortho " (pronounced " ipa ")";
  → run in playground

This will produced structured output like the following examples:

mishu'ngi (pronounced /ˈmi.ʃuʔ.ŋi/)
ngiri'tana (pronounced /ˈŋi.ɾiʔ.ta.na/)
ku'mi (pronounced /ˈkuʔ.mi/)

What's next for Matzo?

Well, there's still some bugs, some runtime inefficiencies, and some quite frankly bad parse error messages. And in fact, there are a small number of language features I'd like to experiment with! I feel like all the big parts are there and it's eminently usable for a lot of random generation tasks, but it's definitely missing a few convenience things.

There are four bigger things I might try to pursue in Matzo's future, though:

  • Firstly, I want to establish a richer standard library. There's a lot of helper functionality I end up recreating regularly, especially stuff to make producing grammatical English easier (e.g. plurals, verb agreement, pronouns, and so forth) and it'd be cool to be able to rely on that stuff when tossing off relatively quick programs.
  • Second, I want easier interfaces to external data. It'd be great to have an interface to, say, data sets like Darius Kazemi's corpora project or APIs like Wordnik, so that your Matzo programs could pull in data conveniently and use them for other kinds of random generation.
  • On a similar note, I want Tracery compatibility. I'd like this two different ways: firstly, I think it'd be cool to be able to reference an external Tracery program and incorporate it into a Matzo program, so you could assemble individual Tracery parts into Matzo. Secondly, I think it'd be fun to experiment with converting a subset of Matzo programs to Tracery: not every program can be turned into Tracery (since Tracery is—on purpose!—much more limited in the sorts of “computation-like” things you can express) but Tracery is also much more widely implemented and it'd be nice to take a well-formed subset of Matzo programs and be able to use them with existing Tracery implementations.
  • Finally, I'd like to take Matzo and turn the internals into a well-defined and tiny virtual machine. Why? Well, the virtual machine could be small enough that it's easily reimplemented but also easily embeddable in larger programs, at which point it'd be easy to take some Matzo programs and integrate them into, say, a Unity or Unreal Engine game to power dialogue or generation.

I mentioned earlier that I have two other language designs for this task, too, and I'd like to experiment with resurrecting them as well: one of them is a typed language that is otherwise very similar to Matzo, and the other is a logical language based on Answer-Set Programming (which means it ends up resembling the popular Wave Function Collapse algorithm in some respects.) Ideally, if I pursue the virtual machine approach, I'd like the same machine to be a backend to those other languages as well, so you could write any of them and end up with runnable random-text programs.

That's pretty ambitious, and who knows if I'll get around to any of them, much less all of them. For now, though, you can absolutely play with Matzo in your browser or build it yourself. Feel free to reach out to me if you have thoughts, comments, bug reports, feature requests, or anything!


  1. I've actually created three such languages in the past, but I reimplemented Matzo in Rust a few years back and got it close to “releasable”, while the other two are badly bit-rotted Haskell programs that I have never resuscitated.
  2. If you are familiar with Tracery and want to see an illustrative comparison, here's a simplified example program from the Tracery tutorial:

    {
      "name": ["Arjun","Yuuma","Darcy"],
      "animal": ["unicorn","raven","sparrow"],
      "mood": ["vexed","indignant","impassioned"],
      "story": ["#hero# traveled with her pet #heroPet#.  #hero# was never #mood#, for the #heroPet# was always too #mood#."],
      "origin": ["#[hero:#name#][heroPet:#animal#]story#"]
    }
    

    and here's an idiomatically equivalent program in Matzo:

    fix name ::= Arjun Yuuma Darcy;
    fix animal ::= unicorn raven sparrow;
    mood ::= vexed indignant impassioned;
    puts se[
      name, "traveled with her pet", animal, "."
    ];
    puts se[
      name, "was never", mood, "for the", animal, "was always", mood, "."
    ];
      → run in playground
  3. The mathematical term for this is associativity: we would say that Matzo's disjunction operator is not associative. That said, when you hear that an operation is “non-associative”, usually, you would expect that the operator has an implicit branching and therefore one parenthesization is identical to the expression without parentheses. The choice Matzo makes here is pretty weird!
  4. You can also use quoted string literals with ::= if you want, and it'll let you intersperse them with bare words.

For almost a complete decade—starting with discovering Haskell in about 2009 and right up until switching to a job where I used primarily Ruby and C++ in about 2019—I would have called myself first and foremost a Haskell programmer.

Not necessarily a dogmatic Haskeller! I was—and still am—proudly a polyglot who bounces between languages depending on the needs of the project. However, Haskell was my default language for new projects, and in the absence of strongly compelling reasons to use other languages I would push for Haskell. I used it for command-line tools, for web services, for graphical applications, for little scripts…

At this point, though, I think of my Haskell days as effectively behind me. I haven't aggressively sworn it off—this isn't a “Haskell is dead to me!” piece—but it's no longer my default language even for personal projects, and I certainly wouldn't deliberately seek out “a job in Haskell” in the same way I once did.

So, I wanted to talk about why I fell away from Haskell. I should say up front: this is a piece about why I left Haskell, and not about why you should. I don't think people are wrong for using Haskell, or that Haskell is bad. In fact, if I've written this piece the way I hope to write it, I would hope that people read it and come away with a desire to maybe learn Haskell themselves!

What drew me to Haskell?

The absolute biggest pull for me, when first coming to Haskell, wasn't monads or DSLs or even types. The biggest pull was the ability to reason about code symbolically and algebraically.

Yeah, I know, that might sound like some pretentious nonsense. Bear with me.

The idea here is that certain transformations in Haskell are always correct, which allows you to reason about code in a mechanical but powerful way. A simple example: in Haskell, we can look at a function call and replace it by the body of that function. Say, if we have a function double n = n + n, and we call it with some expression x, we can replace double x with x + x no matter what x is.

This property isn't true in most other programming languages! For example, in many imperative languages, this transformation will fail if we call double with i++ for some variable i: the reason is that i++ modifies the value of i, so double(i++) (which increments i once) will of course produce a different value than (i++) + (i++) (which increments i twice.)

“So what?” you might ask, and that's fair. However, this starts to be very appealing in that certain ways of changing your code are always going to be mechanically correct. This is unbelievably liberating. There's a kind of fearless refactoring that it enables, where certain mechanical transformations, each easily verified as correct, can be stacked on top of each other to do wild but ultimately safe changes to your code overall. Doing large refactorings in, say, a Ruby codebase can be deeply harrowing, while doing large refactorings in a Haskell codebase can be a complete breeze.

Or, as the mathematician Hermann Weyl once said:

We now come to the decisive step of mathematical abstraction: we forget what the symbols stand for. ...[The mathematician] need not be idle; there are many operations which he may carry out with these symbols, without ever having to look at the things they stand for.”

This same approach—forgetting what the code means and yet still being able to transform it in productive and powerful ways—is possible with Haskell more than with any other language I've used1.

Of course, the other big pull for Haskell is the type system. There's a lot to be said about the totality of modern Haskell's type system, but the core Haskell language strikes a spectacular balance between having a strict type system without it being too noisy or restrictive. Type inference means that most types are implicit in the code, making the process of writing types significantly less onerous than in something like Java, and the flexibility and convenience of typeclasses means that even when you need to think about the types, they're often not too fussy (compared to, say, OCaml's requirement that you use different versions of the arithmetic operators for integers and floats.) At the same time, the fact that the type system can sometimes get in your way is part of the reason for using Haskell.

I would describe good Haskell code as “brittle”, and I mean that as a compliment. People tend to casually use “brittle” to mean “prone to breakage”, but in materials science what “brittle” means is that something breaks without bending: when a brittle material reaches the limits of its strength, it fractures instead of deforming. Haskell is a language where abstractions do not “bend” (or permit invalid programs) but rather “break” (fail to compile) in the face of problems.

And that's often what you want! For example, Haskell has a NonEmpty type which represents a list which is guaranteed to contain at least one element. Operations on NonEmpty which preserve the same length (like modifying each element with map) or which are guaranteed to add to the length (like combining two NonEmpty lists with append) will return NonEmpty. Other operations which reduce the number of elements potentially to zero, like using filter to remove items that match a predicate, will return a plain list, since there's no guarantee they will be non-empty! In other programming languages, you might informally say, “I know that this thing will always have at least one element,” but in Haskell, it is idiomatic to express this directly in the type system and have the compiler double-check your logic.

So if you have a program where you need to supply, say, a NonEmpty list of file paths to examine, then you can't just pass the command-line args to this function directly, because those might be empty: you must check that they're not empty first and handle the empty case accordingly. If I later on add a filtering step, only keeping the files with a relevant file extension, then I must check for an empty resulting list, because I literally cannot accidentally pass an empty list to the function. This program is “brittle” because it can fail to compile in the face of changes which aren't safe, which is incredibly powerful2.

Over time, writing Haskell means you start building programs in a way that maintains program invariants using types so that the compiler can double-check them. Sometimes people take that to mean stuff like type-level computation, but “express invariants using types” can be much simpler. It can mean something as simple as wrapper types for strings to represent whether they've been SQL-escaped, so that your web API gives you a RawString but your database abstraction only accepts SanitizedString and you can't accidentally introduce a code path which forgets to sanitize it. It can mean converting a surface representation filled with Maybe fields and turning it to an internal representation where information is guaranteed to exist. It can mean something being just generic enough to test in isolation.

And Haskell's other strength is that the language itself is malleable and powerful, which enables terse domain-specific languages without things like macros. Some of the code I'm the proudest of in Haskell has been simple domain specific languages: an example is my config-ini library, a library for working with INI configuration files. Instead of setting up an imperative interface to convey how to parse an INI file into your application-specific configuration, you set up a declarative interface that maps specific parts of your configuration type (via lenses) to specific parts of the structure of an INI file, which in turn lets you use that interface to do reading, writing, and diff-minimal update3. This is accomplished with a simple monadic DSL:

configParser :: IniSpec Config ()
configParser = do
  section "NETWORK" $ do
    cfHost .=  field "host" string
    cfPort .=  field "port" number
  section "LOCAL" $ do
    cfUser .=? field "user"

DSLs aren't the right choice for everything, but they can be a powerful tool when applied correctly, and Haskell also minimizes the amount of “magic” necessary to make them work. Unlike the elaborate dynamic metaprogramming which powers DSLs in something like Ruby, the features which power DSLs in Haskell are often just flexible syntax and the ability to overload things like monads. Some Ruby DSLs are “infectious”, since you need to do global monkeypatching to enable constructs like 2.days.ago, but Haskell DSLs are often easy to scope to specific files or areas of code and can be clean in their implementation.

Finally, I think a related but understated strength of Haskell is just how natural it makes working with higher-order functions. This is partly syntax, partly semantics, and partly a good synergy between the two. I don't want to overstate the importance of syntax, but I think the fact that you can write (+) in Haskell and that means “a function which takes two numeric arguments and adds them” lets you gravitate towards that rather than other constructs. What's a function to pairwise multiply two lists in Haskell? It's simple:

pairwiseProduct = zipWith (*)

What's the equivalent in Ruby, a language which does have blocks and often permits some aggressive code golfing? It's still terser than, say, Java, but still much less so than the Haskell:

# assuming at least Ruby 2.7 for the block syntax
def pairwise_product(xs, ys)
  xs.zip(ys).map {_1*_2}
end

I once heard it said that Haskell lets you work with functions the way Perl lets you work with strings. Lots of Haskell idioms, like monads, are perfectly expressible in other languages: Haskell just makes them feel natural, while writing a monad in many other languages feels like you have to do lots of busy-work.

What pushed me away from Haskell?

If I had to choose the three big factors that contributed to my gradual loss of interest in Haskell, they were these:

  • the stylistic neophilia that celebrates esoteric code but makes maintenance a chore
  • the awkward tooling that makes working with Haskell in a day-to-day sense clunkier
  • the constant changes that require sporadic but persistent attention and cause regular breakages

What do I mean by stylistic neophilia here? The Haskell community, as a whole, is constantly experimenting with and building new abstractions. Some of these are borrowed from abstract algebra or category theory, and permit abstract manipulation of various problem domains, while others result from pushing more computation to the type level in order restrict more invalid states and allow the compiler to enforce more specific invariants.

I think these are cool and I'm happy people are doing them! I'm glad that people are experimenting with things like, say, expressing web APIs at the type level or using comonads to express menu systems. These push at the very boundaries of how to express code and address problem domains in radical new ways, bringing unexpected advantages and giving programmers new levels of expressiveness.

I also… don't really want to deal with them on a day-to-day basis. My personal experience has been that very often these sort-of-experimental approaches, while solving some issues, tend to cause many more issues than is apparent at first. An experience I've had several times throughout my Haskell-writing days—both in personal and professional code—is that we'll start with a fancy envelope-pushing approach, see some early advantages, and then eventually tear it out once we discover that the disadvantages have grown large enough that the approach was a net drag on our productivity.

A good concrete example here is a compiler project I was involved in where our first implementation had AST nodes which used a type parameter to represent their expression types: in effect, this made it impossible to produce a syntax tree with a type error, because if we attempted this, our compiler itself wouldn't compile. This approach did catch a few bugs as we were first writing the compiler! It also made many optimization passes into labyrinthine messes whenever they didn't strictly adhere to the typing discipline that we wanted: masses of casts and lots of work attempting to appease the compiler for what should have been simple rewrites. In that project, we eventually removed the type parameter from the AST, because we'd probably have run out of budget if we finished the compiler and appeased GHC every time we tried to write an optimization.

This wasn't an isolated incident: I'd say that in three-quarters of the projects I worked on where we tried a “fancy types” approach, we ended up finding them not worth it. It's also not just me: the entire Simple Haskell movement4 is predicated on the idea that you get the most benefits out of the language by eschewing the fancier experimental features and sticking to minimal extensions.

And yet, fancier type features are pervasive in the broader community. New libraries are often designed around the fancier features, and there's even a cottage industry of alternative takes on the standard library that try to embed more complicated type features directly into the basic operations of the language. This also informs the direction of the language: you start getting into linear types and even a Haskell-ey take on dependent types, and then those start creeping into libraries you might use, as well.

It can also be an uphill battle to hold the line against these fancier type explorations! As my friend and fellow Haskell programmer Trevor once said, “The road to hell is paved with well-intentioned GADT usage.” Many of these fancily-typed constructs are appealing and do bring some advantages, and many of the disadvantages are delayed. Often, these features make things difficult to change or maintain, which means it can be weeks or months or even years before the drawbacks become fully apparent. And on various occasions, I've replaced complicatedly-typed abstractions with much simplified versions, only to see subsequent programmers notice the lack of fancy types5 and try to replace those simple abstractions with various kinds of cutting-edge type-level abstractions, only to see the code blow up in size, drop in performance, and lose much of its readability.

As I said before, I don't fault people for exploring these idioms, and it's not impossible to find examples that do end up pulling their weight. (Using data structures indexed by compiler phase is a good example of a “fancy type-level feature” that I've found remarkably useful in the past6.) But keeping up with all of it is alienating and exhausting, and at some point, it wasn't a stretch for me to look at the celebration of type-level exploration and the amount of work it took to keep it away from the code I was writing and consequently think, “Do I really belong here?”

The awkward tooling is something that I think is sort of obvious if you've tried writing Haskell for any length of time. We've got multiple standard-ish build systems including Cabal and Stack (and I'm led to believe that rules_haskell for Bazel is decent now although I haven't tried it), as well as linters like hlint, editor integrations like Dante or the Haskell Language Server, autoformatters like brittany or ormolu

All these tools are, well, fine, or at least fine-ish. They usually do what they need to. I am quite confident I have never loved any of them: at best, they did what they needed to do with minimal hassle (which has been my experience with, for example, ormolu) and at worst they've had constant bugs and required regular workarounds but more or less got the job done. It's quite possible that things have changed drastically since I was more involved in Haskell, but during the decade that I was, tooling certainly improved7 but never really shined.

A big problem is the perpetual Not-Invented-Here where people constantly try to build the newest, best thing. This isn't at all unique to Haskell—in fact, complaints about Haskell build systems look petty next to Python's Hieronymous Bosch-esque ecosystem of build tools and environment managers—but it's still frustrating to see people constantly trying to reinvent every little detail (down to the textual format for Haskell packages, one thing I'm reasonably convinced they got perfectly right8) and then leaving it about 95% finished for years.

And if I'm spending my working day with a language, I want the tooling to be great. I want it to be something I can celebrate, something I can brag about. I think Rust's cargo is such a tool: I regularly find things about it that are great, and add-ons that make it even better. At least as of 2019 when I was last writing Haskell, there was no Haskell tool that came even close to Cargo's ease-of-use and stability.

Again, I don't think Haskell tools are abysmal, they're just… fine. And at this point, I think my bar for tools has gotten higher than “just fine”.

Finally, I mentioned the constant changes, by which I mean the way that the Haskell language itself undergoes regular backwards-incompatible revisions. This has been going for a long time: I mean, just look at this joke I tweeted back in 2015, in response to the Foldable-Traversable Prelude changes, sometimes at the time also called the “Burning Bridges Proposal”.

getty (@aisamanra), 2015-10-15

The way that Haskell-the-language evolves—well, the way that GHC evolves, which is de facto Haskell since it's the only reasonable public9 implementation—is that it gradually moves to correct its past missteps and inconsistencies even in pretty fundamental parts of the language or standard libraries. This is in many ways good: there's an attitude in some communities that past mistakes are unfortunate but we need to live with them. People in the Haskell community often reject this idea: why not improve things? Why not fix what we see as clunky mistakes in the standard library? These changes move the foundations of Haskell towards a cleaner and more consistent set of primitives, making it a better language overall.

But oh my god they're annoying.

Very few of the changes are particularly onerous—indeed, they often are accompanied by analyses which show that only such-and-such a fraction of Hackage projects will be affected by them—but they all amount to persistent friction. If I grab an old Haskell project, even one without any external dependencies, I can often safely assume that it won't build any more with any recent version of GHC because everything has been changed just a little tiny bit. The fixes are often tiny, but they add up. Things are always just a little different than they used to be, just different enough that they require your attention: adding a superclass there, removing an import there, adding a type signature since this thing turned just polymorphic enough that the code is ambiguous…

And you know what? It doesn't have to be like this. Look at the way Rust does updates with epochs. The Rust language explicitly builds in contracts of backwards-compatibility: if you use things that are stable, then they won't intentionally break. Intentional breaking changes are hidden behind epochs. I've got old Rust projects which I can return to and continue to build and develop without similar tedium from the language itself.

I think there's a common thread between these three things I mentioned: none of them, especially in isolation, are so painful that you can't just deal with them. You can write Haskell without fancy type features, evaluating the worthwhile ones and holding the line against the costly ones. You can use Haskell's perfectly alright tooling if you don't mind the occasional bug or missing feature or clunky workflow. You can follow compiler changelogs and do sporadic codebase updates every other release cycle. They're bumps in the road, but you've still got a road: if the destination is worth it, what's a few bumps?

I still think Haskell is a great language. The only thing that changed is my tolerance for the bumps.

What do I still miss from Haskell?

All the stuff I said about Haskell above? All that still holds and I genuinely think it's true. I miss being able to write and refactor code algebraically: even when writing Rust, a language which I very much like and which has a lot of the same strengths as Haskell, I miss the ability to do small mechanical refactorings and know that I'm maintaining the meaning of the program.10

And the type-system, too! It's true that other languages have developed more sophisticated type systems—mainstream languages in 2023 have a lot more you can do with static types than they did in 2009—but Haskell's type system still has features that others typically lack, even without the whole menagerie of extensions: abstracting over higher-kinded types is an obvious example here.

The Haskell library ecosystem is a real mixed bag—from the typical half-maintained churn of open source libraries to the weird little kingdoms of individual authors trying to reinvent the universe in their own image to the entire tedious gamut of sadly-necessary-but-all-different string-like types—but there are some pretty spectacular libraries and abstractions in Haskell that are only half-possible in other languages. Lenses, for example, are really cool in a way that's hard to grasp if you haven't used them much. I look forward to seeing them poorly half-implemented in mainstream languages in the 2030s.

Haskell libraries also are often declarative by necessity, but those APIs end up being a pleasure to use: a favorite example here is the Brick terminal UI library, whose simple declarative building blocks end up producing by far the best TUI library I've ever used, but the diagram-creation library diagrams or the music-writing library music-suite are other great examples. In these cases, there's often nothing preventing similar libraries from existing in other languages, but they tend to get built in Haskell first by pure functional necessity.

A thing that seems small, but which I miss a ton, is do-notation. Or, well, I don't care as much about do notation specifically as “any notation which lets you add more bindings without adding indentation for each level of binding”, which Haskell lets you do with both do-notation and with terse lambda syntax and optional operators. There are many abstractions—monadic and not—where nesting lambdas would be a convenient way of expressing APIs, but in most languages nesting callbacks like this ends up being a big ugly hassle of rightward drift. Consider this Ruby usage of flat_map to come up with a list of possible ice cream servings, where each additional axis adds yet another layer of indentation:

def all_flavors =
  [:vanilla, :chocolate, :strawberry].flat_map do |flavor|
    [:cone, :cup].flat_map do |container|
      [:sprinkles, :nuts, :fudge].flat_map do |topping|
        ["a #{container} of #{flavor} ice cream with #{topping} on top"]
      end
    end
  end

The equivalent Haskell here?

allFlavors :: [String]
allFlavors = do
  flavor <- ["vanilla", "chocolate", "strawberry"]
  container <- ["cone", "cup"]
  topping <- ["sprinkles", "nuts", "fudge"]
  pure ("a " <> container <> " of " <> flavor <>
        " ice cream with " <> topping <> "on top")

…yes, I know this is a cartoon example. However, there are plenty of places where having Haskell's sugar can be incredibly powerful. You can, for example, build a convenient and good-looking library for async/await-style structured concurrency on top of do-notation by yourself in an afternoon, while most other mainstream languages had to deal with vicious bikeshedding for years before finally coming up with a halfway-usable async/await syntax11, to say nothing of the ability to trivially embed concurrent transactions or probabilistic programs in Haskell code using the exact same lightweight sugar.

So when I say say that I've fallen out of love with Haskell, it is definitely not because there's nothing in Haskell to love!

So should I use Haskell or not?

Regardless of what programming language you're talking about, there is always a single correct response to the question, “Should I learn or use this programming language?” And that answer is, “It ultimately depends on your goals.”

Are you trying to become a better programmer in general? Then yes, absolutely, learn Haskell! I think this article should make it clear that Haskell is a fascinating and powerful language, and I think the learning experience is more than worth it. Haskell made me a better programmer, and I will continue to think so even if I never write another line of Haskell in my life12.

Are you trying to use it for something? Well, my answer is more restrained, but I don't think the answer is a clear “no”. I think you should be honest about the advantages you get from Haskell—and those advantages are real!—and weigh them against your personal or organizational tolerance for the bumps I've described. I know for a fact that it's not impossible to either individually or as a group get enough benefit out of Haskell that the paper-cuts I've described stay just that: tiny paper-cuts. I know this both because I've worked at such organizations, because I've been such a person, and have many friends who remain Haskell people even as I've drifted away from it.

However, if you pressed me further for a commitment to a yes-or-no answer, my answer would be: no, don't use Haskell. If I were tasked with building an engineering organization, I'd personally stay away from establishing Haskell as the default language, because I know both the value of good tooling and the cost of bad tooling, and I'd rather find a language where I could offer programmers some of the best tooling with a stable language and a clear code ecosystem right away. But even then, I'm not 100% confident in that answer: after all, Haskell does offer some remarkable tools for program safety and correctness, and suffering through poor tooling and linting against unnecessary type families might—maybe, in the right contexts—be worth it for that extra safety13.

It's kind of a bittersweet conclusion! I'm incredibly happy for the time I've spent learning and writing Haskell, and I think the world would be worse if Haskell wasn't in it. And in all honesty, I think part of the value we get out of Haskell is because of—and not in spite of—some of the rough edges above. I'm glad that people experiment with type system features I don't want to use, because with effort and research and time (and no small amount of obtuse error messages) those features will become the powerful type system features of tomorrow that underlie safer, more expressive, more powerful tools and languages! This, after all, is why Haskell's old motto was “Avoid success at all costs.”

It's just not for me any more.


  1. To be totally fair, the experience of code refactors via algebraic manipulation is still possible in other languages, especially in non-pure functional languages like Scheme or SML where most code is unlikely to have side effects and side effects are more likely to be clearly marked. But Haskell's purity and laziness means that such transformations safer in general.
  2. Especially over time and across people! You might have read that paragraph and thought, “What's the big deal? I know the method takes a non-empty list: that's not hard to remember!” But the challenge comes when you need to remember these invariants weeks, months, years later, or communicate them to other engineers working on the same code. It's so much more liberating to be able to express your API in such a way that it can't be used incorrectly by anyone!
  3. What “diff-minimal update” means is that you can supply an existing INI file as well as a new updated structure, and the code will produce a new version of that INI file that retains comments, whitespace, and key ordering, but updates values as needed to correspond to the new in-Haskell structure.
  4. Okay, full disclosure: I like the idea of this “Simple Haskell Initiative” a lot more than I like its execution. It's all well and good to say, “Don't use the whole language,” but there's only a tiny sliver of advice on what parts to leave behind, and all the people quoted don't actually agree on what subset to use. This “initiative” would be a lot better if it focused on education on how to accomplish tasks with Haskell subsets and rationale for what to drop, instead of the anodyne non-message of, “Don't use every feature!”
  5. Or, worse, the fact that the simple approach uses IO somewhere.
  6. I noticed after publishing this article that some people were confused about this point, since I complained about a compiler project which included “fancy types”, so to clarify: the thing I found to be not worth it in the context of compiler development was using a GADT to represent the target language's types in the host language, like this:

    data Expr :: * -> * where
      Lit :: Int -> Expr Int
      Bool :: Bool -> Expr Bool
      IsZero :: Expr Int -> Expr Bool
      IfThenElse :: Expr Bool -> Expr a -> Expr a -> Expr a
    

    In that example, if I try to write a term like IsZero (Lit 5), it'll accept it fine, but if I try to write IfThenElse (IsZero (Lit 5)) (Lit 3) (Bool False), Haskell will reject it as ill-typed because the two branches of the if-expression do not have identical types.

    That's not the same as the “trees that grow” approach I'm referencing here: the only commonality is that they include expressions with a type parameter. Instead, “trees that grow” is about reusing the same AST but gradually adding more information to it. For example, in a compiler, you often start by representing variables as raw strings, and then move to something more like a “symbol” type which can disambiguate shadowed names and perhaps point to the binding site of the variable. The “trees that grow” approach can allow you to

    data CompilerPhase = Parsed | Resolved
    
    data Expr (phase :: CompilerPhase)
      = Lam (Name phase) (Expr phase)
      | App (Expr phase) (Expr phase)
      | Var (Name phase)
    
    data Symbol = Symbol {
      symName :: String,
      symIndex :: Int
    }
    
    type family Name (t :: CompilerPhase) :: *
    type instance Name Parsed = String
    type instance Name Resolved = Symbol
    

    In this example, I'm reusing the same Expr type for all ASTs, but for Expr Parsed, the variable names in that AST are going to just be String, and for Expr Resolved, those variable names will be replaced with a richer Symbol type. There's a lot more you can do with this, but it's a very different approach than the first thing I described.

  7. For context, when I started writing Haskell professionally, we had only had cabal and sandboxes were brand-new.
  8. I'll get on a soapbox for this one for a moment. The intention behind hpack is good: custom formats can lack tooling or integrations with other software, while moving to a common format like YAML solves that problem. A YAML file will definitely have syntax-highlighting in the way a .cabal file might not, and you can generate a valid YAML file from an arbitrary programming language where you probably don't also have a library to generate a valid .cabal file. …but a custom format also gives you the power to express constructs in a simple, clear, domain-specific way that would have to get awkwardly shoehorned into another format (or left as strings which get interpreted in a special way.) A tool like hpack means that not only do you have to deal with YAML gotchas like the fact that 1.10 parses as 1.1 unless you quote it as a string, you also lose out on simple constructs like Cabal's conditional syntax. In hpack, you have to write this:

    when:
      - condition: flag(fast)
        then:
          ghc-options: -O2
        else:
          ghc-options: -O0
    

    which is the equivalent of the following snippet of Cabal:

    if flag(fast)
      ghc-options: -O2
    else
      ghc-options: -O0
    

    And honestly? I will take the latter over the former any day. I write version numbers and conditional configuration way more often than I have ever needed to produce a .cabal file from another piece of software. HPack makes uncommon workflows easier while making common workflows annoying.

  9. Haskell is used by a number of financial institutions and I'm led to understand some of them have their own internal implementations. I don't know the current state of them: I think some of them have moved away from their internal implementations, but I don't know that for sure and was too lazy to research this during the writing of this blog post.
  10. And I'm not just talking about side-effects! In Rust, it's possible for f(x) to do something slightly different than let y = x; f(y) because of how names affect lifetime heuristics, making this kind of mechanical refactoring a lot more finicky and contextual.
  11. This is not actually a jab at Rust! I do agree that Rust's async story has ended up being kind of messy and complicated, but I think that's understandable, since Rust had a much harder task than almost any other language: implementing a usable and comprehensible async/await mechanism with manageable borrowing rules on top of a systems language that should expose those features with no heap allocation and as little runtime overhead as possible… that was a research problem done in the open, and I don't think it was ever going to produce a perfectly clean elegant core. The fact that it's succeeded in any small part is something I think is impressive, even if the end result has more sharp edges and visible seams than non-async Rust. (And I actually think Rust's postfix .await syntax is a great idea.)
  12. And I will write more Haskell, if for no other reason than that I plan to continue supporting my s-expression library and my INI-parsing library as long as they have users.
  13. …although in those cases I might gravitate towards Rust, which is at times a more finicky language but has significantly better tooling.

Something I've had puttering around in the back of my head for ages now is My Perfect Application Language. There are languages out there that I like more or less, but I still believe that another language could be a more perfect fit for the software I want to write1, and I have a large number of ideas about what features it would include, what features it would omit, how it might be written, and so forth.

One dissatisfaction I have with existing languages is that I think none of them do error-handling terribly well. Graydon Hoare's very good blog post What next? talks about places that programming languages might go in the future, and one thing he notes is that we've got various approaches but none of them feel obviously right. I very much agree and I don't want to pretend I've “solved” error-handling here.

However, I've got a hypothesis about what sort of error-handling system I might want in My Perfect Application Language, and I've never actually written it down in full. I'm not imagining a completely novel error-handling system: instead, I'm imagining taking a few ideas which have been tried and combining them in a somewhat novel way to hopefully combine their strengths and temper their weaknesses. To explain this hypothetical feature, I'm going to break it down into two language design hypotheses:

The first hypothesis: statically declaring the list of possible exceptions which a function can throw, like Java's checked exceptions or C++ exception specifications, can be a good feature so long as they can be optionally inferred and do not need to be explicitly listed by the programmer.

The second hypothesis: we can create an incredibly flexible and powerful error-handling mechanism by implementing a statically typed inferred condition system inspired by Common Lisp's condition system, coupling static inference of error conditions with static inference of restarts.

The rest of this post will explain them in fragments. I should also note that this all hypothetical: I haven't implemented this, and I don't believe any programming language feature can be really justified until it has proven itself in at least medium-scale programs in its intended domain. (In fact, one of the features I'm about to discuss seemed fine in tiny snippets and turned out to be abysmal in large programs.) That said, it's a hypothesis I'd like to test, because I suspect a feature like the one I'm about to describe would be a tremendous boon to the kinds of programs and libraries that I usually write!

Statically inferred checked exceptions

I think it's probably pretty uncontroversial to say that checked exceptions might sound appealing in theory but are downright miserable in practice. The idea is simple: you can statically verify that all error paths are checked if each method enumerates the exceptions that might be thrown. If I call a method foo() that declares no exceptions, then I know for sure that I won't need to catch any of them! If I look in the source and see foo() throws ThatException, then I know for sure that I might need to handle ThatException! The compiler will yell at me if I don't do something about ThatException: I either need to explicitly catch it, or I need to explicitly pass it on by adding throws ThatException to my method. I can sleep easy, knowing that all error paths are handled!

…except, of course, now you need to copy-and-paste throws ThatException again and again and again and again throughout your whole damn codebase, which is ridiculously noisy and nobody wants to do it. (To say nothing of the half-dozen or more exceptions I might want to actually throw!) In practice, Java programmers have pretty universally found checked exceptions to be firmly Not Worth It: either they haphazardly toss the generic and unhelpful throws Exception around all their methods, or they catch any exception without bothering to discriminate in order to silence the warning, or they use runtime errors which don't need to be included in the type. Findings with large codebases have firmly indicated that exception specifications increase toil but don't significantly impact safety.

But I suspect the problem isn't that they can't increase safety in some respects: the problem is that the amount of toil they cause is disproportionate to the safety you get, and in that they tend to push programmers to use them in ways which minimize the toil and therefore also minimizes safety. So what if we got rid of the toil?

Let's imagine a hypothetical programming language with exceptions: the syntax isn't important, so I'm going to borrow a handwavey Rust-like syntax, but I shold be clear that all the examples below are imaginary and in a fictional non-Rust language. What I imagine here is that the default way of a method might look something like this:

// an example function in a fictional language
fn find(haystack: String, needle: String) -> Nat {
  // implementation elided
}

(Wait, why Nat? I also think programming languages should use natural numbers more!)

Ah, but there's nothing in that snippet that mentions exceptions, right? In our hypothetical language, that doesn't mean it won't throw an exception: instead, that means it's allowed to throw any exception! However, the compiler can infer which exceptions it might throw, and it shouldn't be hard to compute: simply accumulate any uncaught exception thrown in the method body or any uncaught exception thrown by a method which is called in the method body. If I asked the compiler for the type of find here, it might tell me a type like

fn find(haystack: String, needle: String) -> Nat
  throws (SubstringNotFound)

So that means the compiler is well aware that find might throw an exception, and in fact knows what exception it might throw. Anything which uses find now can be aware that it might throw SubstringNotFound. That means that, in this hypothetical language, we can write something like

fn example() -> Nat throws () {
  find("foobar", "ee")
}

And now we've given our compiler license to yell at us: we've claimed that example throws no exceptions, but because we're using find, the compiler can correctly point out that we haven't handled SubstringNotFound. Just like in Java, we've got two basic options—either catching SubstringNotFound in the body of example, or adding SubstringNotFound to the list of exceptions thrown—but we've also got a third, substantially more terse option: we can wholly remove throws () and allow it to infer for us whatever it wants to. We can add the specification if we care but otherwise let the compiler handle the rest.

I think there's also more that can be done around expressivity here. For example, perhaps a programmer could choose to explicitly include an exception in the list: a type like fn example() -> Nat throws (SomeException, ...) would mean, “Whatever the compiler infers for the method body here, but it can also throw SomeException even if the compiler didn't infer that one.” One situation you might want this is when prototyping an API: perhaps I know that my API might eventually throw CacheMissException, but I haven't wired up the cache yet, so I'm going to make sure I include that in my type signatures in the appropriate places just in case, and elsewhere use throws () to make sure I handle it in places where I need to.

More usefully, though, I can imagine a syntax for ensuring that specific exceptions aren't included in the inferred type. In this case, a type like fn example() -> Nat throws (!OtherException, ...) would mean, “This throws whatever the compiler infers for the body, but if that inferred set includes OtherException then raise a compile error.” This means you don't need to regularly re-write the set of exceptions for a complicated API which might throw a dozen different specific errors, but you could still say, “I don't want this specific exception to escape, so keep me honest: if example() ever tries to throw OtherException, then yell at me about it.”

In fact, I can imagine wanting to implement this in a way where exception lists, even empty ones like (), will actually implicitly include pervasive “existential” exceptions: for example, exceptions that represent signals like SIGKILL or exceptions that get raised when the process runs out of memory. In that case, fn foo() throws () would be a convenient fiction, because it won't force the programmer to handle out-of-memory errors, but a programmer could write fn foo() throws (!OutOfMemory) to indicate that foo not only doesn't throw any user-written or typical stdlib exceptions, it also promises to handle out-of-memory exceptions that bubbled up from within it. A typical program probably would still probably define fn main() throws (), but a server that's intended to be long-lived might define fn main() throws (!OutOfMemory, !Sigkill) so that the compiler can let you know when you've failed to handle those error conditions.

I haven't implemented this in any language, so it's quite possible it'd still have problems. And I've handwaved away a number of issues that I haven't tried to solve. For example, I haven't tried fully articulating the typing rules for higher-order functions, or how this would interact with typeclasses, or how the compilation model will work: there's a lot to be done, and it might require other limitations or compromises or heuristics in practice. But my hypothesis is that a feature such as this would let people use checked exceptions in a way that includes minimal overhead, allowing programmers to opt-in to useful checks but also get out of their way when it's not useful.

What's this about conditions and restarts, then?

I said there was another part of my hypothesis, and to explain that I'll have to talk about Common Lisp's condition system. Let's start with exceptions and describe in fine detail how they work, because that'll help us understand how conditions differ.

In a language with exceptions, you signal the existence of an error by you constructing a “thing” called an exception, which is usually a value of a specific type. In languages with inheritance these are often values that inherit from a specific base class (although not always—Ruby, for example, allows any value to be thrown) and in languages without inheritance they usually have some kind of error-specific tag (e.g. in Haskell or various ML languages.) These values can be “raised” or “thrown”. When an exception is “raised”, the language runtime will begin to walk up the stack, freeing stack frames as it goes, until it finds a “handler”, a bit of code which matches the appropriate exception type or tag and is attached to a block of error-handling code. If it fails to find a handler, the runtime will usually terminate the program and print a relevant message. If it does find a handler, it resumes from that point, providing the exception to that bit of code.

This is a widely-used mechanism for errors: so much so, in fact, that it can be hard to imagine alternatives. What else could you do? What else would you want out of it?

I'm going to steal an example from Peter Seibel's Practical Common Lisp as motivation, but rewrite it in a curly-bracket syntax for people who aren't as comfortable with Lisp. Imagine that I'm writing a library which is parsing log files in a particular format. I have a function called parse_log_entry which takes a fragment of text and produces a log entry. Say I've written it like this:

fn parse_log_entry(text: String) -> LogEntry {
  if (is_well_formed(text)) {
    return LogEntry::from_text(text);
  } else {
    raise MalformedLogEntry(text);
  }
}

Now the library is about parsing whole log files, so I also expose a function to parse an entire file like this:

fn parse_log_file(f: File) -> List<LogEntry> {
  let mut list = List::new();
  for ln in f.read_lines() {
    list.append(parse_log_entry(ln));
  }
  list
}

This is nice and simple! Unfortunately, if a single entry fails to parse, we'll throw a MalformedLogEntry exception and lose access to the whole log parsed so far! In some applications, maybe that's fine, but I've said we're writing a library, and we want it to be as flexible as possible for an eventual user. Perhaps a user of the library would like us to put a special kind of LogEntry value that represents a malformed entry instead? We could write something like this:

fn parse_log_file(f: File) -> List<LogEntry> {
  let mut list = List::new();
  for ln in f.read_lines() {
    try {
      list.append(parse_log_entry(ln))
    } catch (exn: MalformedLogEntry) {
      list.append(bad_log_entry())
    }
  }
  list
}

Now we handle that error gracefully. But now we're assuming that's how the user wants us to handle that error. Maybe the user wants us to quietly skip that entry instead! We can write that, too:

fn parse_log_file(f: File) -> List<LogEntry> {
  let mut list = List::new();
  for ln in f.read_lines() {
    try {
      list.append(parse_log_entry(ln))
    } catch (_exn: MalformedLogEntry) {
      // do nothing
    }
  }
  list
}

But what if they want us to skip them but write the errors to stdout? Or to a specific logger? Or apply a correction heuristic and try parsing the line again? Or…

Okay, library design is hard, and designing libraries which handle errors in every possible way is really hard. One approach here might be to provide all these as options which can be chosen by the library user. Maybe we expose different methods to the user which each implement different versions of these strategies, like parse_log_file_with_default versus parse_log_file_skip_bad. Maybe we provide one function with lots of optional parameters, like parse_log_file(with_default: ..., skip_invalid: ...). Maybe we just throw our hands in the air and choose one we think is sensible: convention over configuration, right?

On the other hand, if we had a condition system, we would have a really powerful way of allowing the user to choose any of these and more without having to significantly change our interface. What a condition system does is separate out two different concerns that get conflated by exception-handling: what error recovery code does and what error recovery code should be invoked. To begin with, instead of an exception handler, we install what's called a restart and give it a name: this is how we define our recovery code that might run after an error is raised, but it does not guarantee that the error-handling code associated with the restart is actually run. In this case, let's start with logic around skipping entries:

fn parse_log_file(f: File) -> List<LogEntry> {
  let mut list = List::new();
  for ln in f.read_lines() {
    try {
      list.append(parse_log_entry(ln))
    } restart SkipEntry {
      // do nothing
    }
  }
  list
}

This restart block represents a possible recovery strategy in the presence of an error, and SkipError is the name we've given it. However, we're still missing something: we haven't actually told our program to use it. Our library shouldn't be the one to make that choice, so let's imagine that we're calling the parse_log_file library function from some application code. We now tell our application code to handle a condition by invoking the restart we've defined:

fn analyze_log() -> List<LogEntry> {
  try {
    parse_log_file(File::open("my_log.txt"))
  } handle {
    MalformedLogEntry(_) => restart SkipEntry,
  }
}

This is where I'm telling the program which piece of recovery code to use. What I'm saying is, “If we ever come across a situation where our code has produced a MalformedLogEntry, then recover from that by finding the recovery path labeled with SkipEntry and restart from there.”

So far we've only defined that one recovery path. Let's revisit parse_log_entry and add a few more strategies we might use to recover from an error within that function. Unlike SkipEntry above, these also take parameters, which are pieces of information that the handler can supply in the handle blocks:

fn parse_log_entry(text: String) -> LogEntry {
  if (is_well_formed(text)) {
    return LogEntry::from_text(text);
  } else {
    try {
      raise MalformedLogEntry(text);
    } restart UseValue(v: LogEntry) {
      return v;
    } restart RetryParse(new_text: String) {
      return parse_log_entry(new_text);
    }
  }
}

Now we have a total of three possible ways to recover from a MalformedLogEntry: we can invoke the SkipEntry restart which will simply skip past malformed lines, we can use the UseValue restart with a value of type LogEntry to replace the bad log with a different provided one, or we can use the RetryParse restart to supply a new corrected string and attempt the parsing again.

The important thing now is that the library allows all of these restarts to exist simultaneously, but does not specify which to take: that's up to the calling code. Let's change our application code to supply bad_log_entry() as a default value instead; this means an application will still include as many LogEntry values as we had lines, but some are specifically represented as bad ones:

fn analyze_log() -> List<LogEntry> {
  try {
    parse_log_file(File::open("my_log.txt"))
  } handle {
    MalformedLogEntry(_) => restart UseEntry(bad_log_entry()),
  }
}

What if we want to skip the bad ones but still record that we saw them by printing messages to our logger? We can use SkipEntry with some extra handler code, then:

fn analyze_log() -> List<LogEntry> {
  try {
    parse_log_file(File::open("my_log.txt"))
  } handle {
    MalformedLogEntry(text) => {
      logger.log("Found bad log entry: `{}`", text);
      restart SkipEntry;
    }
  }
}

What if we want to try applying a correction heuristic to the first several errors that we see, but exit the program if we see more than an pre-determined “allowable” amount of bad errors? We can use shared state in our handler and the RetryParse restart:

fn analyze_log() -> List<LogEntry> {
  let mut errors = 0;
  try {
    parse_log_file(File::open("my_log.txt"))
  } handle {
    MalformedLogEntry(text) => {
      if (errors < ALLOWED_LOG_ERRORS) {
        errors += 1;
        restart RetryParse(try_correction(text));
      } else {
        logger.log("Encountered too many bad log entries; exiting");
        system.exit(1);
      }
    }
  }
}

Admittedly, this system is definitely more fiddly than exception-handling: you've got more moving parts, what with the error conditions (which in this example we had only one of) plus the named error restarts plus the handler logic. (I suspect this is one reason why language designers haven't bothered porting this into new languages, preferring simpler exception-based errors or explicit result types instead.) But the separation can be incredibly powerful: we no longer need to manually thread state through our API; instead, our API is designed for the “happy path”, but errors can still be handled in a granular way, and what's more, the application has full control over how those errors get handled. Breaking up strategies for error recovering (as named restarts) from how to actually handle errors (as handler strategies) allows for some incredibly simple but powerful API designs.

Statically inferred typed restarts

Okay, so let's put these together. The last section glossed over types, and for good reason: in Common Lisp, there aren't types for restarts. In fact, it's possible for a handler to specify a restart which doesn't exist, in which case it'll produce a different kind of error (a CONTROL-ERROR in Common Lisp parlance) because it wasn't able to find the place where code should resume.

But we could build a statically typed language that implements a condition system, so that the compiler could reject programs which catch non-existent errors or try to resume from non-existent restarts (or supply those restarts with values of the wrong type.) Yet again, the way I'd propose doing this is by allowing errors—or conditions, to use the Common Lisp terminology—but also restarts to be inferred as part of a function's type even when the rest of the type is explicitly written.

I think in this situation, the code we wrote up with above (without any type information) should still be valid, in the same way that the example code at the start of the post could infer possible exceptions if you didn't explicitly try to specify them. In this case, if we asked our hypothetical language to infer the relevant types, we might end up with something like this:

fn parse_log_entry(text: String) -> LogEntry
  throws (MalformedLogEntry(String))
  restarts (UseValue(LogEntry), RetryParse(String))

fn parse_log_file(file: File) -> List<LogEntry>
  throws (MalformedLogEntry(String))
  restarts (UseValue(LogEntry), RetryParse(String), SkipEntry)

fn analyze_log() -> List<LogEntry> throws () restarts ()

After all, the language can tell which errors are possible and which restarts are available. But as with the inferred checked exceptions before, we wouldn't need to write them out: they're easily inferred by the compiler, and it simplifies the process of writing the code tremendously! The compiler would reject your program if you try to use a restart that doesn't exist, or if you claim your function is error-free but fail to handle an error that might bubble up. This would allow us to have a very powerful set of tools for error handling and also careful API design without putting a strong burden of tracking minutiae on the programmer.

As I said before, I haven't implemented this anywhere, and who knows if this idea would survive contact with an end-user2. It's not only possible but likely that there are cases I haven't considered, complicated interactions I've failed to foresee, or ergonomic concerns that this doesn't address. But it's a hypothesis, and one I'd like to test some day, because if it works, I can imagine it being a tremendous tool for safety and legibility!


  1. Why do I specify this? Well, because I believe that there's no one programming language for all applications. I like Rust and I like Haskell and I like Python, and there are a few applications where any of them might be fine, but more often there are programs that would be perfect for one and terrible for the other two. I think that problem domain matters a lot to programming language design.
  2. I've also elided both implementation concerns (for example: how does this work with higher-order functions?) and other design questions. There are two big design questions that I've considered but don't have conclusions about. The first question is: how should conditions and restarts be named and namespaced? I've treated them in the code as global-ish, but maybe they should be namespaced underneath the function name (e.g. parse_log_entry::UseValue) or at least underneath the compilation unit, but there are a lot of possible choices here. The second question is: how “far up” should restarts be accessible, and how do you control that? I handwaved in the inferred type signatures above that, because we handle the condition which can get raised, we don't “re-export” restarts from analyze_log, but we do re-export them from parse_log_file. Is that an analysis we can do? If not, how do we make sure that restarts are available without adding too much more busywork for the programmer?

Literal actual years ago, a friend of mine asked me for a blog post about how I generate images with code. I've started writing such a blog post on multiple occasions, and I keep wanting to add more stuff, more asides, more examples, and it grows well beyond what a post should be. So instead, I'm going to break it up into multiple posts.

  1. In Praise of Netpbm
  2. Cairo, SVG, and generated vector images: planned
  3. Practical PostScript—no, why are you laughing, I'm serious: planned
  4. How I approach procedural images: planned

Generative pixel art

Sometimes the generative art I want to make is just pixel-based. For example, let's say I want to create some interesting cellular automata:

Three examples of three-state cellular automata.

Or perhaps I want to make some cool-looking glyphs, the kind you'd see used as writing in the background of a pixel-art-heavy game like Hyper Light Drifter or Fez:

Twelve examples of random blocky symbols.

Or maybe something kind of maze-like:

A twisting black-and-white pattern which resembles a maze.

There are a lot of ways to go about this, but for images like these I've got a particular approach which is barebones and flexible.

Say you want to generate some pixel art

For didactic reasons that'll become apparent later in the blog post, I'm going to write code here in Wren. Wren's a little object-oriented scripting language that's designed to be embedded in larger programs. I'm mostly not gonna explain it much, but it should be easy to follow it like it's pseudocode: it's an imperative object-oriented language that feels somewhere between Ruby and JavaScript with a pretty standard curly-brace syntax. I'm not gonna embed it as a scripting language, though: I just want to use it to create some basic black-and-white pixel art, so I'm gonna use wren-cli to run scripts directly with access to its pretty minimal standard library.

So, pixels. A raster image is just a two-dimensional array, right? How hard can that be? Let's start by making an array of arrays that will be our image.1 Since this is black-and-white, each element will be one of two numbers: for this example, I'll use 0 for white and 1 for black. I'll initialize a basic 3×3 image with all white pixels, by which I mean, make a 3×3 2D array.2

var width = 3
var height = 3
var image = []

for (y in 0...height) {
  var row = []
  for (x in 0...width) {
    row.add(0)
  }
  image.add(row)
}

Okay, so we've got our blank image. Let's do something basic: let's make a very simple black-and-white checkerboard pattern by alternating black and white pixels. This is pretty easy to do with the modulus operator:

for (x in 0...width) {
  for (y in 0...height) {
    image[y][x] = (x + y) % 2
  }
}

In this case, you'll notice that I access the image using image[y][x]. This is the opposite of what the usual notation is, but think about why that is: when we print out our 2D array, it looks like this:

[[0, 1, 0],
 [1, 0, 1],
 [0, 1, 0]]

But that means that arr[0] is the row at the top, arr[1] is the row beneath, and arr[2] is the last row. That means to e.g. get the second pixel of the first row, we'd need to use arr[0][1]. A little unusual, but not too hard to remember, and if it's really tripping us up, we can always write some wrapper functions that take the arguments in the correct order.

Okay! We've got our pixel array: let's turn it into an image!

…oh, wait. I chose to write this in Wren, didn't I?3 Here's a thing about Wren: it doesn't have a package manager, or even much of a stdlib. In another language, I might now say, “Okay, let's find the PNG encoder, toss my data at that, and write out a PNG file.” But I don't have a PNG encoder on hand to throw the data at.4 How can I take my 2D array and see it as an actual image?

Netpbm to the rescue

Netpbm is an image format, or rather, a family of image formats: seven in all. For the purposes of this blog post, I only care about the first three (although I'll describe the last four a bit later.) Let's start with the PBM format itself, where PBM stands for “Portable BitMap”. This is a format for bitmap images, in the very literal sense of 'a map of bits': each pixel is 1 bit (i.e. black or white, on or off.) Netpbm is a text-oriented format, so I can write it with a text editor. Here is a hand-written Netpbm file which describes the basic 3×3 image I was generating above:

P1
3 3
0 1 0
1 0 1
0 1 0

…and that's it. The P1 at the top says, “I am a Netpbm file,” the next line has the width and height of the image in pixels, and then after that are all the values for each individual pixel, 0 for white and 1 for black.

Okay, there's a little bit more to it, but not much: for one, Netpbm files can have comments, which are commonly used to indicate which program created them (although they can't come first: the first two bytes of the file have to be P1 for it to be recognized by parsers). It's also worth noting that all the whitespace after the width and height is optional, and you don't need to line up the pixels nicely like I did above. The following defines the same image:

P1
# check it out
3 3
010101010

But that's really it.

So, that's easy! Let's just write that to stdout in our Wren program. I'm going to separate the rows by newlines and the pixels by spaces, but as I said before, it's optional here:

System.print("P1")
System.print("%(width) %(height)")
for (y in 0...height) {
  for (x in 0...width) {
    System.write("%(image[y][x]) ")
  }
  System.print()
}

Because I've parameterized it by the width and height, I can also tweak those to make my image larger. And now that I've got a PBM file, I can open it in any program which supports the format: and you might be surprised which programs do support it. I can't view it directly in the browser or anything, but there plenty of viewers and editors that do understand it. I've got Glimpse installed, why not try that?

A 3x3 pixel checkerboard, viewed in the Glimpse editor.

Works just fine! And I could use Glimpse to convert it to another format. There's also a suite of tools called netpbm which are commonly available on Unix-like systems, so I could always run that to convert them to some other format.5

What if I want grays or colors?

You'll notice that the PBM format is pretty barebones: it is only capable of describing black-and-white images. So, let's create a grayscale image instead, which means creating a PGM file. Instead of a simple 0-versus-1 distinction, we'll have a scale from black to white. Unlike some other formats, the Netpbm format allows us to decide how many levels of gray we have. Let's say we want to create a very similar image except instead of cycling between black and white, we cycle between multiple shades of gray: let's say, arbitrarily, four. We can start with the same blank image, but instead of filling in 0 and 1, we fill in a wider range of numbers, which for us will range from 0 through 3.

for (x in 0...width) {
  for (y in 0...height) {
    image[y][x] = (x + y) % 4
  }
}

In order to produce this file type, we start it with P2 instead, which is the header variant for a grayscale image. We follow it with the width and height, but then we add one more thing: the maximum depth. This is the number that will represent white, while 0 will represent black;6 any number in between will represent a shade of gray. In a PGM, unlike a PBM, you do need spaces between the individual pixel values, but luckily we were already doing that before. (Newlines between rows are still optional: I just like how they look.)

System.print("P2") // the PGM header
System.print("%(width) %(height)")
System.print(3) // the maximum value which will appear
for (y in 0...height) {
  for (x in 0...width) {
    System.write("%(image[y][x]) ")
  }
  System.print()
}

The end result of this, yet again, is a bitmap like so:

A sample bitmap image of cycling shades of gray.

Creating color images with Netpbm is more involved, but only slightly. When I create a color image, I need to supply three values per pixel: red, green, and blue. One way you can represent this is by treating each pixel as a struct with three fields, or a map from color to value. For our purposes, let's say that each pixel is a hashmap with three keys: "r", "g", and "b". You'll also notice that I decided to define a color depth at the top of the file here, so we can use that variable later:

var width = 24
var height = 24
var depth = 24
var image = []

for (y in 0...height) {
  var row = []
  for (x in 0...width) {
    row.add({"r": 0, "g": 0, "b": 0})
  }
  image.add(row)
}

I'm going to arbitrarily choose something weird to draw here: let's say that I want the image to get more red as you go to the right, more green as you go down, and the blue-ness of pixels will alternate in a checkerboard pattern.

for (x in 0...width) {
  for (y in 0...height) {
    image[y][x]["r"] = ((x / width) * depth).floor
    image[y][x]["g"] = ((y / height) * depth).floor
    image[y][x]["b"] = ((x + y) % 2) * depth
  }
}

Once we do that, printing the image is easy, and indeed is nearly identical to the PGM version. We need the header P3, and we still need a maximum value for the brightness of pixels, but only one which serves as the maximum value for all three colors. The biggest change is that each pixel is now three numbers, not one, but we can accommodate that pretty easily:

System.print("P3")
System.print("%(width) %(height)")
System.print(depth)
for (y in 0...height) {
  for (x in 0...width) {
    var p = image[y][x]
    System.write("%(p["r"]) %(p["g"]) %(p["b"]) ")
  }
  System.print()
}

And the end result:

A sample bitmap image of shifting colors.

And there we go! We can produce bitmap images easily, with no library support at all!

Why Netpbm and not a bitmap library?

There's no shortage of good libraries for manipulating images, many of which have convenient higher-level operations you might need. At the very least, you probably want some kind of blitting operation, to take a set of smaller bitmaps and redraw them onto a larger one.

But there's a good reason to know about the Netpbm format, and a good reason why I've used it in a lot of my pixel art generation projects: you can use it from literally anywhere and without thinking about it much at all! Look at the above programs: I used a language without a native image library and I didn't even use file IO, and yet I was able to create bitmap images in less than two dozen lines.

If I want something a bit more sophisticated, I can start to build my own versions of higher-level operations. For example, let's say I want to encapsulate the image in a class, say, with a pixel(x, y, color) method I can call. It'd be pretty easy for me to add a rectangle method, like so:

  rectangle(x, y, width, height, shade) {
    // the two horizontal lines
    for (dx in 0..width) {
      pixel(x + dx, y,          shade)
      pixel(x + dx, y + height, shade)
    }

    // the two vertical lines
    for (dy in 0..height) {
      pixel(x,         y + dy, shade)
      pixel(x + width, y + dy, shade)
    }
  }

Then I could create an image with a handful of randomly-placed rectangles with random shades:

var image = GrayscaleImage.new(width, height, depth)
// create up to 6 rectangles
for (i in 0..rand.int(3, 6)) {
  // choose the color from the depth
  var color = rand.int(2, 8)
  // choose top-left point randomly
  var x = rand.int(0, width-3)
  var y = rand.int(0, height-3)
  // choose width and height from remaining
  var w = rand.int(x+2, width) - x
  var h = rand.int(y+2, height) - y
  // draw the rectangle
  image.rectangle(x, y, w, h, color)
}
image.showPGM()

This program, when run, produces output that looks like this:

Several gray rectangles randomly positioned on a black field.

I chose Wren here mostly because it's vaguely comprehensible if you know just about any object-oriented scripting language but also to show how little support you need to start making these images. I've built Netpbm images using languages which have other robust image library support (like Rust and Python) but I've also used languages where libraries were scarce and I didn't feel like building a PNG encoder or mucking with an FFI: languages like Wren, but also various obscure Scheme variants, Pony, and Idris.

In fact, the examples at the beginning of this post were all written in exactly this way, and in different languages: the three-state cellular automata were written in Rust, the glyphs in Ruby, and the maze-ish pattern in Python, all of them created by writing out Netpbm files without any library support!7

When would I use a bitmap library?

One concern is that NetPBM files are big. Not only are they not compressed, they're also written out using ASCII text, which means there's a lot of wasted space. For example, near the beginning of this post, I included an example of cellular automata, which I generated as 99×99-pixel squares. When encoded as a PNG, this image is about 1.9K in size, and when encoded using the uncompressed BMP format, it goes up to a hefty 39K. The PPM file I originally generated is 58K in size, around 30× the size of the PNG. For a tiny image, that's huge. This isn't a huge problem long-term, because you can easily convert them to a compressed format for storage, but it's worth noting, especially if you're going to be generating a lot of them at once. (I generated 120 random cellular automata images to choose one for this post: those images, each 99 pixel squares, took up about 8M on disk!)

You'll also have increasing problems as your desired drawing primitives get more complicated. The stuff I've done above all involved tight control over placing pixels or groups of pixels, but if I wanted to e.g. composite several images with various transparency modes, I'm going to have to start implementing more and more things manually. At some point, learning a good library becomes a better use of your time.8

Finally, this is really only suitable for generating a static image or several in a batch-like environment: that is to say, it's probably not gonna be an ideal choice for any generative task that needs to be done in real-time (like interactive simulations or video games) or even relatively quickly (like generating images on-the-fly for a web site). Your chosen application might be better served again by finding bindings to something like pixman instead!

Hey, what about the other four Netpbm variants?

There are other variants of the Netpbm format, which of course use headers P4 through P7. P4 through P6 are simply binary variants of P1 through P3: for example, an image with the header P4 is a packed-binary bitmap. The header section that defines the width and height is effectively identical to the ASCII version—you'd still use a base-10 ASCII representation of the width and height separated by whitespace—but afterwards you use packed binary versions of the values, and usually with a predetermined color depth as well.

The final variant, P7, is only ever binary, and allows you to define more color channels for specific purposes. For example, you can add an alpha channel, something that is lacking in the other bitmap formats.

Honestly? I've never used them. The reason I reach for Netpbm variants is because they're easy to print and also to parse. In theory, a binary Netpbm file is going to be easier to produce and parse than a compressed format like PNG, but I've never had occasion to need just enough efficiency that I'd use a packed binary image format but not want to take it a step further to a compressed format or dedicated library. If I'm pulling in libraries, I might as well do it right!

What's next for this series?

I've talked about how I produce pixel-art-ish images, but that's actually a relatively small portion of the generative artwork I've done. Most of my generative artwork has been done with vector art, and there are a few approaches I've taken there. That's going to be the topic of my next post, and I'm going to cover the two most common approaches I've taken, one that's similar to the approach described here, and another that's library-supported!


  1. Okay, you don't have to use a 2D array. For example, you could also use a flat array of width * height elements: assuming you're storing this in row-major order, you can get the element at the position (x, y) by using the index x + y * height. Alternately, if you don't mind being inefficient—and if you're using the approach I outline here, you're probably more interested in your time than the computer's time!—you can use a hashmap or dictionary whose keys are pairs of numbers. I regularly choose different representations for images depending on what one-off image I'm trying to generate.
  2. A lot of languages have some way of automatically creating a list of n copies of a repeated element, and Wren is no exception, but you might notice I didn't use it here, in large part because there's a common gotcha with these functions: they often don't deep-clone their argument, which can cause problems where mutations mutate more elements than you'd expect. In Wren's case, I could create a 2D array of zeroes by writing List.filled(height, List.filled(width, 0)), but in this case every row is a reference to the same row in memory, so every row of the image would end up being identical. This is a problem common to other imperative languages, so Ruby's Array.new(height, Array.new(width, 0)) or Python [[0] * 3] * 3 would have similar problems: because all rows are aliases for the same row in memory, modifying image[0][0] would also modify image[1][0] and image[2][0]. …probably not what you want!
  3. It's almost like I chose this language to make a clunky didactic point…
  4. I could write a C program which pulls in a PNG encoder and exposes it to Wren—indeed, that's sort of how the language expects to be used—but I don't want to muck with that for this blog post.
  5. That said, the Netpbm tools are very old: at this point, it's likely that you probably haven't even heard of most of the image formats they have converters for!
  6. You might notice that the meaning of the numbers is swapped for the PGM format. In the PBM format above, 0 is white and 1 is black. However, if I create a PGM file where the maximum value for a pixel was 1, then I'd have effectively created a bitmap file, but reversed: in a PGM file, 0 is black and the maximum value is white. This is a peculiarity of the format, and something you simply need to remember!
  7. The Rust one does use the rand library, but it doesn't use a library for images, at least.
  8. This is especially true if you ever want to create images with text. At that point, I'd skip pixels and head straight to vector, even if I want the final result to have a chunky look to it!

Let's talk about objects in functional languages.

This could be a sticking point, so I'll get this out of the way up-front: I find arguments over whether such-and-such a thing is “really” functional or “really” imperative or “really” object-oriented to be for the most part frustrating and useless. A lot of these arguments boil down to value judgments in disguise (a thing I've touched on in the past when talking about the rhetorical emphasis placed on math in programming) and many of the rest turn out to be based on informal definitions where 'functional' means 'like my favorite or first functional language'.1 I prefer to think about these concept less in terms of necessary criteria and hard lines, and more in terms of influence and approach and focus.

The paper On Understanding Data Abstraction Revisited by William Cook (itself a paper that revisits On understanding data types, data abstraction, and polymorphism by Luca Cardelli and Peter Wegner) lays out simple definitions of the often-used concepts of “abstract data type” and “object”. His definition of “object” is, of course, not the only one, but I think it's a useful one. Importantly, Cook's definition is based on abstract criteria that can apply even when the language itself does not have an explicit concept of an “object”. To emphasize this, I'm going to use Haskell in this blog post to demonstrate a pure functional variation on “object-oriented programming”2.

Beyond being a useful definition, Cook's formulation is also a valuable design pattern, one which has both advantages and disadvantages when designing programs. Consequently, the goal of this blog post is twofold: first, to show how “objects” can (under some definition) exist in a language like Haskell which has no built-in notion of “object-oriented programming”, and secondly, to show the advantages and disadvantages of code written in this style.

Abstract data types

Cook's definition of an abstract data type is a type that

[…] has a public name, a hidden representation, and operations to create, combine, and observe values of the abstraction.

To use a simplified version of his example, here's a (not terribly efficient) way of defining a “set of integers” in Haskell as an abstract data type. In this case, we've got an IntSet type whose representation can be easily hidden, because the operations that a consumer of this library cares about about have been defined in terms of that abstract representation.

-- we'll export `IntSet` but not the constructors
data IntSet
  = SetEmpty
  | SetInsert Int IntSet

empty :: IntSet
empty = SetEmpty

isEmpty :: IntSet -> Bool
isEmpty SetEmpty = True
isEmpty _ = False

insert :: Int -> IntSet -> IntSet
insert x set
  | contains set x = set
  | otherwise = SetInsert x set

contains :: IntSet -> Int -> Bool
contains SetEmpty x = False
contains (SetInsert y rest) x
  | x == y = True
  | otherwise = contains rest x

This is probably not a terribly controversial definition or design: it's a pretty typical one for most functional languages! …well, it might be controversial to use since it's got awful algorithmic performance, but it's fairly unobjectionable as a teaching tool, at least.

What would a user see when looking at the documentation for this module? Since we're not exporting the constructors for IntSet, it'll look something like this:

data IntSet
empty :: IntSet
isEmpty :: IntSet -> Bool
insert :: Int -> IntSet -> IntSet
contains :: Int -> IntSet -> Bool

The ability to hide the definition of IntSet is the thing that makes this an abstract data type. A user of the library doesn't care—and ideally doesn't need to care—what constructors hide behind that IntSet.

What is an object?

Cook then goes on to describe “objects”. Here's another (equally inefficient) implementation of IntSet, which I'll define as OIntSet so I can easily refer to both:

data OIntSet = OIntSet
  { oIsEmpty :: Bool
  , oContains :: Int -> Bool
  }

oEmpty :: OIntSet
oEmpty = OIntSet
  { oIsEmpty = True
  , oContains = \_ -> False
  }

oInsert :: Int -> OIntSet -> OIntSet
oInsert x set
  | oContains set x = set
  | otherwise = OIntSet
      { oIsEmpty = False
      , oContains = \i -> i == x || oContains set i
      }

It's possible that we can choose our export list carefully so that this implementation of OIntSet reveals the exact same set of operations as the previous one. However, there's a major difference here: OIntSet is not actually hiding a specific type. Instead, it just bundles the relevant set of operations inside of a record of functions, which acts like an interface type. In both the ADT-based approach and the “object”-based approach, a user does not know about the internal representation of OIntSet, but in the ADT approach, this is because there exists a single representation which is non-public, while in the “object”-based approach, there may be multiple separate implementations that are indistinguishable.

Why use objects?

An “object”-like representation allows for a vast amount of flexibility. Because a consumer of an OIntSet can use any value as long as the value has provided implementations of the relevant “methods”, we can easily and conveniently define new instances of OIntSet that have radically different internal representations. For example, we can define infinite sets that use simple numerical computations to define their oContains method, like this set of all even numbers:

oEvenNumbers :: OIntSet
oEvenNumbers = OIntSet
  { oIsEmpty = False
  , oContains = \i -> i `mod` 2 == 0
  }

Or we could construct OIntSet values that use a different data representation, such as a list, to store the members of the set:

oFromList :: [Int] -> OIntSet
oFromList list = OIntSet
  { oIsEmpty = null list
  , oContains = \i -> i `elem` list
  }

But even though these OIntSet definitions use different underlying data representations, they expose the same interface, so we can use the same operations to manipulate them. For example, we can define an oUnion operation which computes the union of two OIntSet objects, since that operation is easily expressible in terms of oIsEmpty and oContains:

oUnion :: OIntSet -> OIntSet -> OIntSet
oUnion set set' = OIntSet
  { oIsEmpty = oIsEmpty set && oIsEmpty set'
  , oContains = \i -> oContains set i || oContains set' i
  }

Our oUnion operation—indeed, any operation we define—can work on any two OIntSets even if they have wildly different internal representations. We can even use this to combine all of our previously-defined OIntSet constructors into one expression, to create a set that uses a combination of Haskell lists, numeric predicates, and closures to represent a set:

sample :: OIntSet
sample = oInsert 1 oEvenNumbers `oUnion` oFromList [10..20]

This is a very convenient way of building certain abstractions. By building around external interfaces, you can include varying data representations that easily work together.

This example is clearly a little bit contrived, so it's probably worth giving some other “real-world” examples where this design approach is useful. Object-oriented programming is generally cited as a good fit for a particular style of user interface programming, because it allows you to define classes of “widgets” that expose a common interface but have different internal representations. You could build a Haskell GUI library in this style by defining Widgets as “objects” with a common interface, something like this:

data Widget = Widget
  { drawWidget :: Ctx -> Position -> IO ()
  , handleEvent :: Event -> IO ()
  }

This is similar to the approach taken by the Brick TUI library, which has its own Widget record.

Why not use objects?

One major concern with this style of data representation is performance and optimization. Consider our original ADT representation for IntSet: it's inefficient, yes, but we can make it more efficient in a number of ways. For example, we could modify it so that, instead of always inserting new elements “at the front”, we can instead insert them in such a way that the internal representation of the set is always sorted lowest-to-highest. This means that we may no longer have to traverse the entire list to check for element membership. Even better, we might swap out the list-like representation for a binary tree representation, maybe doing some rebalancing in certain cases.

There is no way in general to apply these optimizations to the OIntSet-style program. You could define an OIntSet that sits in front of a balanced tree and therefore has faster lookup and insertion, but once it's sitting behind the interface, you no longer have access to those internals. You cannot, for example, write an oUnion operation that rebalances the binary trees behind the two sets it's operating on: it doesn't even know if both sets are backed by trees!

In effect, the major selling point of the “object”-style design here is also a major downside: you don't have guarantees about the specific representation of data, which means your programs can easily mix-and-match different representations, but it also means that your program can't make use of representation-specific knowledge in ways that are advantageous.

There's another major concern as well, and that's that the specific choice of “object” representation can make a big difference in terms of what operations you can and cannot support. Look back at OIntSet—I was able to define oUnion, but what about oIntersection? It turns out that it's not actually possible using the specific representation I've chosen3:

oIntersection :: OIntSet -> OIntSet -> OIntSet
oIntersection set set' = OIntSet
  { oIsEmpty = {- ??? -}
  , oIntersection = \i -> oContains set i && oContains set' i
  }

How do I implement oIsEmpty? I might naïvely try to write the inverse of oUnion and define it as oIsEmpty set || oIsEmpty set', but that's not at all what I want: the intersection of the set of even numbers and the set of odd numbers is an empty set, but neither the even nor the odd numbers are empty, so this would incorrectly compute their intersection as non-empty.

This is an artifact of the specific interface chosen for the set. I could modify the interface and be able to recapture this behavior, but almost any choice I make is going to have different repercussions: for example, I could add a method to enumerate all the values contained in the set, at which point I now have a convenient way to find out whether the intersection of two sets is indeed empty… but now I have made infinite sets significantly more difficult to define!

This is another face of the performance problem: the specific interface chosen is going to have far-reaching ramifications not only on what operations are efficient or inefficient, but on what operations are possible to write at all.

Why do these count as “objects”?

A lot of definitions of “objects” in the sense of “object-oriented programming” go back to Alan Kay. Kay was a major force behind the SmallTalk programming language, and he once gave this definition of OOP:

OOP to me means only messaging, local retention, and protection and hiding of state-process, and extreme late-binding of all things. It can be done in Smalltalk and in LISP. There are possibly other systems in which this is possible, but I'm not aware of them.

Our treatment of “objects” here does not fit in this definition, but neither do most languages that are typically called “object-oriented”. In particular, “late-binding” here means that methods are not looked up until runtime: in a proper SmallTalk-like system, this would be done by name, meaning that even virtual dispatch in a language like C++ or Java does not count. You can use languages like Ruby or Python in a way that matches this definition, but they're not typically used this way. Many object-oriented languages are also somewhat lax with respect to protection of local information: Python is a big offender here, as its instance variables are typically made private by convention rather than a language mechanism! And of course, almost none of the modern OOP languages are built strictly around messaging.

However, many of these languages are considered “object-oriented” because they try to capture the advantages of these features while not adhering strictly to them. Late-binding and all-you-can-do-is-send-a-message systems require some complicated machinery in order to implement efficiently, because otherwise the pervasive lookup of methods can become a source of slowdown, so many systems use virtual dispatch instead of extreme late-binding. Similarly, many systems do not adhere to strict information-hiding, but allow some public information for various conveniences (e.g. non-private variables in C++ or Java). In many ways, these are design decisions which help sandbag against the problems with “objects” described above. If we're being sticklers about the Alan Kay definition, we might call these languages “object-oriented-ish”.

The Cook definition is something of a distillation of several of the properties which are used by these “object-oriented-ish” languages. It encodes virtual dispatch: all operations here are higher-order functions, so you cannot know which you are calling until you are passed a function to call. It encodes hiding of local information in a way that's arguably stricter than Java or C++: an external consumer of this code cannot depend on anything but the provided functions, so any “local” information is necessarily hidden. If used appropriately, it can encode protection of state-process: the above example is pure, but if our “methods” had a type like IO (), we could include methods which update internal state (perhaps contained in an IORef) in response to other operations.

Alan Kay's definition is valuable, not just because it describes a powerful approach to programming as implemented in object-oriented languages like SmallTalk and IO, but also because because it describes the behavior of a number of systems that are not programming languages while also capturing what makes those systems powerful: for example, Microsoft's Component Object Model, Plan 9's 9P file system protocol, and arguably even HTTP itself are, from a certain point of view, systems based around doing extremely late-bound message-passing. But I would argue that Cook's definition is also valuable, because it describes both languages in wide use as well as pattern of data modeling in other systems.


  1. I once read a blog post that complained about zippers—a pure functional interface to certain data structures, one that's not even embedded in a monad or some other side-effect-ey abstraction—by claiming that they were “imperative”. Nonsense! This was clearly an example of a value judgment masquerading as some kind of technical truth, and a pretty shallow one at that. I'd be interested in an analysis of the advantages and disadvantages of using zippers, but “I think this abstraction is imperative, therefore it is bad,” is not that!
  2. Cook's paper uses an ML-like pseudocode for these examples, and also includes more detail: I've pared them down a bit.
  3. Actually, if you look closely at my representation, it is possible, just not practical: in Haskell, Int is a machine integer with a finite number of values, so we could enumerate every possible Int value and check whether it's present in both sets. That means I could implement this operation as oIsEmpty = or [ oContains set i && oContains set' i | i <- [minBound..maxBound]], which just needs to check 2^64 possible values on most modern processor architectures!

Last year I decided to use the Advent of Code to practice writing code using the Pony programming language, and the first few days I struggled with error-handling—it took several days before an approach finally clicked with me. This blog post describes my thought process as I worked to that approach. (Don't worry, I'm not going to spoil any of the problems: I use two small out-of-context code snippets from my answers here, but you won't get any insight into the problem or their solutions from them!)

Pony is a language that I've been interested in for a while. It's an object-oriented language designed around the actor model of parallel programming, and it uses a sophisticated type system to prevent things like data races in parallel code. Rust is well-known for the various static guarantees that it gives you regarding data parallelism: Pony gives you even more guarantees by using static capabilities that permit or restrict various operations within or across actors.1 It's a fascinating (albeit complicated) system, and I'd encouraging reading through the Pony tutorial to get an idea of the flavor. (You'll see hints of these features in the example code here, but I'm mostly going to gloss over them: they're not what I want to focus on here.)

Pony is also not much of a language for hastily-constructed one-off programs, the kind you often want to write in coding challenged like the Advent of Code. Pony demands that you handle errors very explicitly, and makes it impossible to use values without handling errors somehow or to compile programs that have unhandled errors.

“Ah, like Rust or Haskell?” you might ask, but not exactly. While Rust's Result-based error handling does have this property, Rust also trivially lets you defer errors to runtime: you can scatter unwrap anywhere, or expect a good result and panic otherwise. And Haskell? Haskell is full of partial functions, of errors and undefineds, of thrown exceptions from ostensibly pure code! By Pony's exacting error-handling standards, Rust might as well be the wild west, and Haskell is Fury Road. No, if I call a function that purports to return an integer in Rust or Haskell, that method may instead slam its way back up the call stack (or, in Haskell's case, the lazy soup of thunks that it has instead of a call stack) with a panic or an exception.

Pony, however, has no such easy escape hatches. Because of the realities of physical computers, yes, a Pony method may crash for reasons outside its control, like failing hardware or solar rays mysteriously flipping a bit in RAM, and because Pony is a Turing-complete language, a Pony function might still loop forever. But there's no convenient unwrap or error to be found in Pony: besides infinite loops and the computational equivalent of acts of God, a Pony method that claims to return an integer will return an integer.

When Code Goes Wrong

Of course, we do need to report errors somehow, so Pony does have a way of raising errors! It's just a very explicit way of raising errors. In Pony, methods can be partial, which is a way of explicitly indicating that the method might not return a value. All partial functions are marked with a question mark, and within a partial function you can raise an error using (of course) the error keyword. Let's work with a simple example: a factorial function which accepts signed integers, and thus needs to indicate an error condition when its argument is negative. (This factorial example is contained within a class called Math, but given that it's a class with no fields, you can think of it here as though it's just a namespace that contains the method.)

class Math
  fun factorial(n: I64): I64 ? =>
    match n
      | 0 => 1
      | let x: I64 if x > 0 =>
          x * factorial(x-1)?
    else
      error
    end

Notice that both the definition and the use of factorial need a ?. All partial functions, both when defined and when used, need to have a ? afterwards: Pony will not compile our program otherwise. So clearly we need to be explicit both when we create and when we propagate errors, but what about when we want to recover from them? For that, we use a try block, in which we can call partial functions, and then an else clause afterwards that will run if anything in the try block produced an error. Here is a main method for our program that tries to print a factorial but will print an error message if its argument happens to be negative. (It won't be in this case, as we've passed the constant 5, but Pony won't let us compile our program unless we have handled every possible partial function!)

actor Main
  new create(env: Env) =>
    try
      env.out.print(Format.int[I64](Math.factorial(5)?))
    else
      env.out.print("Something went wrong!")
    end

This looks pretty uninterestingly like exception-handling in most mainstream languages today. But when I raised an error, I just did so with error: what if instead I want to throw or catch a specific exception? Say I want to call my factorial function on one of the arguments passed to the program on the command line. There's more that can go wrong with this program: maybe the user didn't pass enough arguments, or maybe they passed an argument but it was a non-numeric string, or maybe it was numeric but it represented a negative number. What do we do?

actor Main
  new create(env: Env) =>
    try
      let arg_str = env.args(1)?    // try to fetch args[1]
      let arg = arg_str.i64()?      // try to convert it to a string
      let fact = Math.factorial(5)? // try to compute its factorial
      env.out.print(Format.int[I64](fact))
    else
      env.out.print("Something went wrong!")
    end

We can handle all of them, but if we want to distinguish between them, we're out of luck. Pony does not allow you to throw different exceptions or catch specific ones: all you get is error and an else case to recover from it. All partial functions raise the same error when observed from the outside2. What's the alternative?

One alternative is to go the functional route: use a return value. For example, Pony allows us to define primitives, which are types that have only a single value. Primitive types have many uses, one of which is that we can use them like error tags. Pony already does something like this extensively: for example, it uses the None primitive as a return value sort of like Rust or Haskell's (), but also combines it with union types to act like to Rust's Option or Haskell's Maybe. Let's rewrite our factorial as a total function, one that might return a number or might return a custom error tag which I'll name ArgIsNegative.

primitive ArgIsNegative

class Math
  fun factorial(n: I64): (I64 | ArgIsNegative) =>
    match n
      | 0 => 1
      | let x: I64 if x > 0 =>
          let r = match factorial(x-1)
            | let y: I64 => y
            | ArgIsNegative => return ArgIsNegative
            end
          x * r
    else
      ArgIsNegative
    end

Well, that's... clunkier than I'd hope. Specifically, our recursive call in the middle there became much more verbose. What happened?

This approach is not terribly different from Rust's or Haskell's error-handling, as both of those languages use explicit Result or Either types. The difference is that both of those languages have syntactic sugar to make working with those types less onerous: do-notation in Haskell and the ? operator in Rust. If we didn't have ? in Rust, we would have to do similar clunky pattern-matching, because every recursive call might not return a value, and we would need to pattern-match on the result to either extract the successful argument or propagate the error:

struct ArgIsNegative;

fn factorial(n: i64) -> Result<i64, ArgIsNegative> {
  match n {
    0 => Ok(1),
    _ if n > 0 => {
      let r = match factorial(n-1) {
        Ok(y) => y,
        Err(e) => return Err(e),
      };
      Ok(n * r)
    }
    _ => Err(ArgIsNegative),
  }
}

But Rust does have syntactic sugar for that, so a practical version of this in Rust is a lot more readable:

fn factorial(n: i64) -> Result<i64, ArgIsNegative> {
  match n {
    0 => Ok(1),
    _ if n > 0 => Ok(n * factorial(n-1)?),
    _ => Err(ArgIsNegative),
  }
}

Pony lacks that kind of sugar, which is why using this pattern in Pony is more of a hassle. Not impossible by any means, but you end up with some much more difficult-to-read code. So when I started writing my advent-of-code examples, I struggled with this: I wanted informative errors so that I could understand where my code went wrong and how to improve it, but manually matching on and returning these values was a hassle. And it was worse than the toy example of factorial: consider this (slightly modified) method from my actual solution which was supposed to fetch three values, all which might not exist in the input, and wrap them in a convenient wrapper type:

  fun ref build_operands(): (Operands | Error) =>
    let lhs = match get_value(offset+1)
    | let v: Value => v
    | let e: Error => return e
    end
    let rhs = match get_value(offset+2)
    | let v: Value => v
    | let e: Error => return e
    end
    let tgt = match read_storage(offset+3)
    | let i: I64 => i
    | let e: Error => return e
    end
    Operands(lhs, rhs, tgt)

That's a lot of boilerplate for a conceptually simple operation: call three helper functions, propagating errors if any of them fails, and then wrapping the result values in an object if they all succeeded!

How I Learned To Stop Worrying And Love Side-Effects

That's when I realized that I was thinking like a functional programmer, which was making me produce worse code.

Fundamentally, what did I want out of my errors in this program? I wanted two basic things:

  1. I want code that can abort when an error condition happens.
  2. I want to know what error condition happened and transmit back some information about why.

In a functional or functional-inspired setting like Haskell or Rust, you tend to break down problems into individual expressions that reify program states as values. A successful program state is represented by value tagged as 'successful' (called Result::Ok in Rust or Right in Haskell) and a failed program state is represented by a value tagged as 'failure' (called Result::Err in Rust or Left in Haskell). But Pony is also an object-oriented actor-model language, which means it naturally allows us to think about systems with encapsulated state.

While functional programming endeavors to make programming more tractable by removing all state changes and pushing you towards functional purity, object-oriented programming endeavors to make programming more tractable by encapsulating state changes into smaller chunks that are easy to understand in isolation, ideally chunks whose encapsulated state is either trivially reasoned about or even non-observable. Both functional programming and object-oriented programming are intended to sidestep the need for side-effecting assignment statements: in a functional setting by reifying everything into values that never change, and in an object-oriented setting by allowing objects to manage and guard their own internal state in response to external messages. In fact, Alan Kay even said as much in The Early History of SmallTalk:

Though OOP came from many motivations, two were central. The large scale one was to find a better module scheme for complex systems involving hiding of details, and the small scale one was to find a more flexible version of assignment, and then to try to eliminate it altogether.

So let's stop thinking about factorial as an individual pure function, and instead let's put it into a system which might contain encapsulated state: that is to say, an object. In particular, this object can include a queue of errors which it has encountered. Let's also put the meat of factorial into a private method (indicated in Pony with a leading underscore) write a slightly different public method which calls to our internal implementation:

class Math
  // a mutable queue of error messages
  var _errors: Array[String] = Array[String]()

  // our internal implementation
  fun ref _fact(n: I64): I64 ? =>
    if n == 0 then
      1
    elseif n > 0 then
      n * _fact(n-1)?
    else
      _errors.push("Negative argument: " + Format.int[I64](n))
      error
    end

  // our external interface
  fun ref factorial(n: I64): (I64 | Array[String]) =>
    try _fact(n)? else _errors end

This isn't exactly error-handling the way we traditionally think about it. Really what we're doing here is logging: when we encounter an error, we add it to our error queue and then bail. We can use the simpler partial function handling (which litters some question marks around our code but doesn't require the verbose explicit control flow of earlier examples) and yet an external caller can still get informative errors about what went wrong. The core of this approach is that it separates out the two error-handling concerns I described above: exiting early is accomplished with Pony's implementation of partial functions, while reporting useful errors is now handled by state that's only observable internally. An external caller shouldn't even know that any kind of state change was happening: it just gets either a number or an error, and calling it multiple times with the same argument should produce identical return values: it is still an externally pure function.3

In the above example, we only ever had a single possible error message, but this approach scales easily as we add more errors. Let's say we wrap our argument-handling logic into this factorial class, too, so that it takes the argument list from the command line and tries to print the factorial of the first provided argument:

class Factorial
  var errors: Array[String] = Array[String]()

  fun ref _fact(n: I64): I64 ? =>
    if n == 0 then
      1
    elseif n > 0 then
      n * _fact(n-1)?
    else
      errors.push("Negative argument: " + Format.int[I64](n))
      error
    end

  fun ref _fact_main(args: Array[String] val): I64 ? =>
    let arg_str = try args(1)? else
      errors.push("Not enough arguments"); error
    end
    let arg = try arg_str.i64()? else
      errors.push("Non-numeric argument: " + arg_str); error
    end
    _fact(arg)?

  fun ref main(env: Env) =>
    try
      env.out.print(Format.int[I64](_fact_main(env.args)?))
    else
      for e in errors.values() do
        env.out.print(e)
      end
    end

In this case, I'm using strings for my error messages, but there's no reason I couldn't have typed errors, as well. I could, for example, define an Error interface and push structured data instead. This trivial example doesn't show it, either, but the biggest win here is not from the place where you throw the error or the place where you catch it, but rather the intermediate places where you would need to propagate errors. What does the build_operands function I showed before look like with this scheme? It's about as simple Pony lets you get:

  fun ref build_operands(): Operands ? =>
    Operands(
      get_value(offset+1)?,
      get_value(offset+2)?,
      read_storage(offset+3)?
    )

It might fail, but the errors message are squirreled away elsewhere and you are no longer required to handle those explicitly, so all you need to do is sprinkle in a ? to propagate the error! The entry-point into this whole system can handle error-handling, abstracting away the existence of the error queue entirely.

The Point Of No Return

There's actually a bigger win to handling errors in this way, which is that this is a good way of handling errors in a parallel setting.

Remember I said at the beginning that Pony is an actor-model language? If we declare an actor instead of a class, then we can define not just methods (using the fun keyword) but also “behaviors” (using the be keyword). A behavior definition looks a lot like a method definition, but they're not identical semantically: calling a behavior is an asynchronous operation, so it's not guaranteed to run immediately and your won't wait for it to finish before continuing4. Because of this, behaviors can't return anything: after all, by the time a behavior is running, the original calling code will probably have moved on!

Pony does guarantees that a given instance of an actor will never run more than one behavior at a time. Each behavior invocation is a added to a queue for the actor on which it's invoked, which means that actor can freely mutate its own local state without fear of data races or anything: behaviors are sequential with respect to the actor they're on but behavior on different actors will run in parallel.

A consequence of this is that we can't rely on return values to communicate information between different actors, which means Result-like error handling was never going to be useful in that setting anyway. Instead, actors are must invoke behaviors and pass around values in order to communicate information with each other. That means that our error-handling-as-logging system is already well-suited to working in this setting!

One thing we could do, for example, is write our own “error log” actor. We give this actor access to an OutStream (which might be a handle to stdout or stderr or maybe a file) which it can use to report errors. This actor has two behaviors: one of them is err, which adds an error to its log, and the other is flush, which writes all the errors it's seen to whatever stream it was initialized with:

actor ErrorLog
  var _errors: Array[String] = Array[String]()
  let _stream: OutStream

  new create(stream: OutStream) =>
    _stream = stream

  be err(msg: String) =>
    _errors.push(msg)

  be flush() =>
    for msg in _errors.values() do
      _stream.print(msg)
    end
    _errors = Array[String]()

With our ErrorLog in hand, we can now rewrite our factorial in to log those errors asynchronously. In this case (for simplicity) I've left factorial as a synchronous method, but we could also rewrite it as its own actor which can compute its value in parallel, reporting it to one actor if it succeeds and reporting errors to our ErrorLog if it fails.

class Math
  // our internal implementation
  fun factorial(n: I64, log: ErrorLog): I64? =>
    if n == 0 then
      1
    elseif n > 0 then
      n * factorial(n-1, log)?
    else
      log.err("Negative argument: " + Format.int[I64](n))
      error
    end

actor Main
  new create(env: Env) =>
    let logger = ErrorLog.create(env.err)
    try
      let n = Math.factorial(-5, logger)?
      env.out.print(Format.int[I64](n))
    else
      logger.flush_errors()
    end

Is This Actually How You Write Factorial In Pony?

Absolutely not! That'd be ridiculous. A factorial implementation should take an unsigned integer so that factorial will be total and you don't need to worry about getting a negative number in the first place. Or, if your argument really needs to be signed, you can just return 0 for the negative cases and then guard against that happening elsewhere5. One way or another, this whole mess is in no way necessary just to write a factorial!

But this error-handling strategy is completely reasonable, because it both scales and parallelizes to the kinds of programs you're likely to want to write in Pony. Once you start having multiple actors doing multiple things, you're going to need something less like return values and more like logging anyway. It just required a bit of a shift in focus when I first came to it, thinking less in terms of a closed world of expressions, and more in terms of the broader world of independent systems. It's no coincidence that thinking about actor systems running in parallel requires that shift anyway, and Pony simply makes it natural to think in those terms!


  1. While Pony is designed to be a high-performance language, it is not as low-level a language as Rust, nor does it aim to be. There are broad parallels between the two in that Pony has features analogous to Rust's linear types and can use them to get guarantees around parallelism, but it does not give you tight control over things like allocation, and it also requires a larger runtime and a garbage collector. A better point of comparison might be Erlang: imagine building a typed Erlang safely from the ground up and you'd get something much closer to Pony.
  2. There's a long-open RFC about adding typed errors, which may change this in the future, but as of right now we've got nothing.
  3. Even if you don't know Pony, you might notice that I'm lying here! If I call Math.factorial each time, then we'll actually be re-initializing an instance of Math each time and therefore a new _errors array each time, and it'll look fine. However, it's possible to instantiate the Math class and then call factorial multiple times using that same instance of Math: this will result in a growing number of errors, because we haven't cleared the _errors array between calls. There are several ways to fix this, but an easy way takes advantage of a Pony-specific feature: the assignment statement in Pony returns a value, but it specifically return the previous value of the variable being assigned to. This means, for example, that swapping two variables can be done with a = (b = a): the expression b = a will assign the current value of a to b, and then evaluate to the previous value of b, which we then assign to a. This is an unusual decision, but there's a good reason for it that has to do with Pony's implementation of ownership and uniqueness, where assignment ends up serving a purpose similar to Rust's std::mem::swap. In our program, though, it means we can in one terse expression set the _errors array to a new empty array while returning the previous contents of _errors:

      fun ref fact(n: I64): (I64 | Array[String]) =>
        try _fact(n)? else errors = Array[String]() end
    
  4. If you're used to Erlang or Elixir, then you can think of behaviors as a statically typed kind of message-sending: each behavior invocation is a message with a particular expected payload of data.
  5. Sure, that means there's ambiguity that your user needs to distinguish, but it's not an unreasonable way of tackling this problem! You might have heard about one of Pony's more controversial choices, which was to define 1/0 = 0. This is a completely defensible choice and hopefully you've seen in this article why Pony's very strict error-handling strategy would otherwise necessitate a lot of work to handle every possible division operator otherwise.

Douglas Hofstadter's book Gödel, Escher, Bach: An Eternal Golden Braid is a classic text that's had a strong influence on countless people. It's a venerated book among a certain stripe of computer scientist and mathematician, and it's reputed to have inspired generations of people to pursue the study of logic and artificial intelligence. It's a meandering meditation of a number of topics that oscillates around logic, language, formal systems, biology, neurology, music, art, and many more topics, with a particularly strong showing from the three titular figures—the logician Kurt Gödel, the artist Maurits Cornelis Escher, and the composer Johann Sebastian Bach—as well as a heaping dose of writer and mathematician Lewis Carroll, who does not appear in the title but, at least on my copy, is explicitly invoked by the subtitle, “A metaphorical fugue on minds and machines in the spirit of Lewis Carroll.” Many people count it as one of their very favorite books.

So, if you're among the latter group, I should warn you that I'm about to give it a very lukewarm review.

I first read Gödel, Escher, Bach when I was about 20, while I was an undergraduate, and it's important to note that I was double-majoring in computer science and linguistics, and had a particular love of formal systems and logic, but was also proudly a generalist, and had a long-standing love of literature and music. That particular configuration of interests meant that this book was laser-focused to speak to exactly the things that I loved. It was a perfect book for me!

...well, it would have been, but for some reason I kept struggling to get through it. I thought highly of it, but my secret shame was that my admiration for the book was based mostly on the first two hundred or so pages. It took slogging effort to get myself through the rest, effort sporadically applied over the course of years. I did eventually make my way through the whole thing, but even now, I can't necessarily be sure I've read and absorbed every page, even though I've certainly looked at each one. By the end, my impression of the book was much more reserved: it does have some genuine high points and I can understand how it came to have its classic reputation, but I also felt it had a number of problems I've rarely seen discussed. For me personally, it ended up falling short of its reputation as a sparkling, effervescent text that drew together art, mathematics, and culture, and given how little I've seen this discussed, I wanted to write down why I feel this way.

The book is arranged into chapters, each beginning with a dialogue explaining a concept often using imaginative metaphor and a Socratic style, which is followed by a more traditional prose exploration of the concepts introduced in the dialogues. The dialogues are a big part of why I originally struggled with the book: they are meandering and long, and they regularly outstay their welcome. The Socratic style is a difficult one to write well without seeming contrived and difficult, and the book occasionally manages it, but it often falls incredibly flat: usually, they feature one character (usually Tortoise) explaining something verbosely, with conversational asides, but otherwise more or less mechanically, while the other character (usually Achilles) simply responds, “I see! Aha!” and follows up with a question that no real learner would ask but happens to be the next thing Hofstadter wants to talk about.

The other sections revisit the same ideas in prose, giving more concrete examples and dispensing with the catechistic form, and as such are able to give much terser and more interesting examples. Many of these are much clearer, and I remember on my first attempt at reading the book I was often tempted to skip the dialogues and read those first, because their explanations were often much more satisfying and took only a fraction of the time to read through. In some cases, the dialogues attempted to use metaphors that were nonsensical or even broken, and only by reading the chapter afterwards did the dialogue make any sense!

A good example here is the book's explanation of Gödel's incompleteness theorem. The high-level, slightly handwavey description of this theorem is that, for any sufficiently expressive mathematical proof system, there are more true facts within the system than there are proofs for facts, which in turn means that not every fact can be proved: in short, that not every mathematical fact has a corresponding mathematical proof. My short explanation papers over a number of important features of this theorem, such as what I meant by a 'sufficiently expressive mathematical system', and all those features are addressed much more rigorously by the actual proof. It's a fascinating theorem, and to Hofstadter's great credit, Gödel, Escher, Bach helped bring knowledge of this theorem to a much wider population.

Unfortunately, when I first began to read the dialogue which touched on the theorem, I was frankly mystified. Hofstadter decided to explain its working by coming up with a metaphor involving record-players and records that are designed to physically break the record-players they're played on. I'm familiar with how record-players work, but I have never played a record designed to break a record-player! This isn't an intuitive metaphor, because while I have intuition for the operation of records and record-players, I don't have any intuition at all about the universal manufacture of record-player-breaking records. The metaphor raises a number of questions: can the problem of record-player-breaking-records be suitably addressed by redesigning record-players? If no, why not? What if we simply read the information from a record using a visual device with no moving parts? What if we...?

A good metaphor has depth: you can convert a situation into the metaphor, visualize or reason about the implications of the metaphorical situation in isolation of the original situation, and then apply that back and have learned something about the original situation. However, the record-players in this metaphor don't actually work like record-players in the real world, so my own lack of intuition means that reasoning about the original situation via the metaphor is effectively impossible. When I first read the dialogue, I had no idea what was being explained: once I started the following prose chapter, I realized that the “record-players” were formal systems, “records” in were theorems designed to be unprovable within those formal systems, and that the whole thing was an awkward physical metaphor for Gödel's incompleteness theorem. In fact, it was only this realization that made me fully grasp the workings of metaphor in the first place: instead of the metaphor illuminating the theorem, I had to use my knowledge of the actual theorem to grasp what Hofstadter intended for the metaphor!

This is a pretty egregiously bad example, but it was also the point in the book where I realized that I wanted to like the book much more than I actually liked it in practice. I began to read onward and reread past sections with more skepticism, and I realized that the weaknesses which were particularly evident in the dialogue about Gödel's paradox were still partially present in many of the other dialogues. The original inspiration for the dialogue chapters was Lewis Carroll's short allegory What The Tortoise Said To Achilles, which expands on Zeno's paradox of motion to make a point about the foundations of logic. Carroll's dialogue is tight and focused and uses a rather clever metaphor, but the dialogues that punctuate Gödel, Escher, Bach are broad and meandering and the metaphors range from moderately serviceable to (like the one above) actively nonsensical, and the writing style is a mostly-mediocre Carroll pastiche, which means the characters often gratingly pepper their dialogue with interjections like, “Oh, my gracious! Oh, dear me! Oh, but you misunderstand! Go-golly! Oh, but that certainly won't do!” I eventually came to the conclusion that, while the dialogues are one of the more memorable features of the book, they're also an active impediment to conveying much of the book's material in an efficient and clear way.

The non-dialogue chapters, as I've said, are better, although they also range in quality. Many of them are clear, lucid explanations of mathematical concepts intended for a layperson, which often begin by introducing mathematical systems through simple examples, showing what can be done with pure symbol manipulation of those systems, and only afterwards pulling back the curtain to explain what they “mean” in a mathematical sense. The explanations of computational systems have a similar quality, although several of the later chapters feel rather too complicated for their comparatively simple conclusions. On the other hand, the topics that aren't about math or computers (or the shorter bits on workings of DNA) are introduced in a disappointingly cursory way that mostly consists of handwaving and pictures. Those latter sections lack depth and often betray strikingly little familiarity with or respect for the topic in question.

To give an egregious but illustrative example: Hofstadter mentions the experimental composer John Cage on a number of occasions, often bringing up Cage's modernist and aleatoric work as a counterpoint to the meticulously tightly-constructed melodies of Bach. Hofstadter is unsurprisingly negative about Cage's work, and usually characterizes it as avant-garde social commentary masquerading as music, and at one point a dialogue wryly suggests that John Cage might belong in a zoo. John Cage is most famous—or most infamous—for his piece 4'33”, which requires that a performer or group of performers walk onto stage and do not play their instruments for four minutes and thirty-three seconds. (It properly consists of three individual “movements” of non-playing whose lengths have been inconsistently specified across various editions of the score.) Hofstadter brings this piece up in a dialogue1:

Tortoise: [...John Cage] has composed many celebrated pieces, such as 4'33”, a three-movement piece consisting of silences of different lengths. It's wonderfully expressive—if you like that sort of thing. Achilles: I can see where if I were in a loud and brash café I might gladly pay to hear Cage's 4'33” on a jukebox. It might afford some relief! Tortoise: Right—who wants to hear the racket of clinking dishes and jangling silverware?

Tortoise's (and Hofstadter's) explanation of Cage's piece is fairly typical of explanations given of the piece: that is, four minutes and thirty-three seconds of silence. But this is, according to Cage's intentions, a strictly incorrect interpretation of what he was trying to do! Cage's actual intention in creating 4'33” was not to depict pure silence, but rather to force listeners in an auditorium to pay attention to the quiet and subtle sounds which they usually ignore when they listen to music. To make this painfully explicit, here is a quote from Cage about the original premiere of 4'33”:

They missed the point. There's no such thing as silence. What they thought was silence, because they didn't know how to listen, was full of accidental sounds. You could hear the wind stirring outside during the first movement. During the second, raindrops began pattering the roof, and during the third the people themselves made all kinds of interesting sounds as they talked or walked out.

Consequently, when Achilles and Tortoise agree that they'd rather hear silence than the sounds of a café, they're getting the point of 4'33” exactly backwards: a performance of 4'33” in a café should ideally compel you to listen to the “racket of clinking dishes and jangling silverware” with more awareness than usual!

As I said, this isn't an uncommon misunderstanding of John Cage's intentions, and reasonable people can and do differ as to whether 4'33” is a reasonable execution of that intention, or if that intention is reasonable in the first place. However, even with a charitable reading of Gödel, Escher, Bach, it's clear that Hofstadter isn't disputing Cage's artistic intention: instead, he doesn't seem to know what sort of artistic intention Cage actually has, preferring to read his own ideas about social commentary into Cage's work. His understanding of Cage and of the musical context in which Cage works is marked by a lack of context, a lack of deep engagement with the ideas there, and most importantly, a lack of respect. In the preface to my edition, he claims2 that he had,“...unambiguously heaped scorn on Cage's music, albeit in a somewhat respectful manner,” but there's very little respect or willingness to meet Cage on Cage's own terms here, only guarded derision, and that lack of engagement ends up weakening every section that tries to discuss John Cage in particular and modernist music in general.

This sort of cursory engagement with the cultural features of the book ends up undermining one of the book's major selling points: I had originally seen it as the work of a polymath effortlessly weaving fields together into a multifaceted but uniform whole, but in reality, areas that are more than a step or two outside Hofstadter's areas of expertise (computer science, formal logic, some of the more mathematically rigorous bits of cognitive science) are at best shallow, and at worst are “...heaping scorn...” on things Hofstadter doesn't understand and doesn't appear to want to understand.

Despite the relatively surface-level interaction Hofstadter has with the world outside of mathematics and computers, he nevertheless loves to drop in thick, multilayered references to such topics in every cranny he can find. The central two characters in the dialogues are Achilles and Tortoise, borrowed directly from Lewis Carroll's story above (which in turn borrowed them from the famous paradoxes of the Greek philosopher Zeno), and a Crab and a Genie show up on occasion as well. Their names are regularly abbreviated to a single letter, which means you can't help but notice that those letters happen to map to the names of the nucleobases that appear in DNA—adenine, thymine, cytosine, and guanine. All the dialogues have sing-song names that are usually inspired by music, such as Sonata for Unaccompanied Achilles or Canon by Intervallic Augmentation or Birthday Cantatatata. Off-hand mentions of people and places are often wry and unexplained cultural allusions: a typical example is that, at one point in a conversation about popcorn, Tortoise awkwardly shoehorns in the story of a “Schönberg factory” in Vienna that outraged consumers by stopping production of a delicious tonic in favor of a boring cereal, this being a nod to the Viennese composer Arnold Schoenberg's transition from traditional melodic compositions to experimental atonal pieces, a nod that never comes up elsewhere in the text of merits any explanation.3

I will admit right away that I personally find these heaps of unnecessary nods more tedious than interesting or endearing. There's a lot of reference, but very little of it means anything. Occasionally, a dialogue will use these references to actually explain something—one dialogue is written in the form of a crab canon, and thus is identical when read forward or backward, as a memorable way of explaining that form of musical canon—but most of the musical or biological or literary allusions are there really for their own sake. These things are rarely being commented on, or discussed in interesting context, or connected to other ideas. Instead, these ideas simply appear because this is a book in which ideas appear, interrupting the text for a shoehorned cameo, like Stan Lee in a comic book movie. Why do the characters' names map to nucleobases? I suspect if you asked Hofstadter, he'd claim it's because one of the themes of the book is that all these ideas are connected (in the titular “golden braid”), but this kind of reference doesn't actually connect anything to anything: it merely presents things adjacent to each other. It's all flavor, no substance.

This might seem unfair, so I'll give a very specific but pervasive instance of this sort of meaningless flavor. Many parts of the book invoke Zen, which is a school of Buddhism that originated in China and has spread to several other Asian countries but, in the Western mind, is usually associated with Japan. (We do, after all, know this school by its Japanese name Zen and not by its Chinese name Chán, its Korean name Seon, or its Vietnamese name Thiền.) Hofstadter's idea of Zen is a substance-less cliché: it consists almost entirely of faux-Eastern “Oriental” aesthetics and some handwaving about kōans (which are stories used in Zen practice for teaching and meditation) without really delving into any particular aspect of actual Zen thought or practice. There are lots of references to it—for example, he names a theorem MUMON, after the Japanese name of Zen master Wúmén Huìkāi, as part of a vague and largely pun-based connection to one of Wúmén's collected kōans—but none of those references have any substantial connection to the history or practice of Zen. In reality, Zen is a religious sect with history and cultural context and a complicated, multifaceted conversation that has been carried on throughout centuries. In Hofstadter's telling, Zen is just some funny nonsensical stories from Japan.

The edition I have includes a preface in which Hofstadter talks, twenty years after the book's release, about the book itself, its reception, and its legacy, and he goes out of his way to complain about a negative review of the book which accused him of being a hippie trying to popularize Zen. Hofstadter objects to this review because

As I declare at the start of Chapter 9, I find Zen not only confusing and silly, but on a very deep level utterly inimical to my core beliefs. However, I also find Zen's silliness—especially when it gets really silly—quite amusing, even refreshing, and it was simply fun for me to sprinkle a bit of Eastern spice into my basically very Western casserole. However, my having sprinkled little traces of Zen here and there does not mean that I am a Zen monk in sheep's clothing.

In this passage, Hofstadter openly admits to exactly the charge I'm bringing: that his inclusion of Zen is all about clichéd aesthetics (“Eastern spice”) and not at all about any of its substance—in this case, because he apparently doesn't seem to think it has any!

What Hofstadter doesn't admit to, but what I would argue, is that the whole book does this with almost every non-mathematical topic it tackles. His explanations of mathematics-adjacent topics do have substance and are often reasonably well-explained, but every time he branches out, he doesn't seem to realize that he's regurgitating shallow, half-misunderstood cliché: his discussions of modern art and music are, as I mentioned before, deeply lacking in this regard, but he name-checks plenty of artists, musicians, and writers with a high school understanding of who they were and what they did, preferring to pepper the text with photos of wacky paintings, drawing he made of letters that are made up of other letters, and tales of half-understood kōans. They're all spice: his “casserole” is a few insubstantial layers of food underneath inch-thick layers of spices.

This also presents a problem with the entire underlying program of the book: it's supposed to present examples of a common important idea—self-reference—resurfacing throughout various disparate areas, including mathematics and computation and art and music, but while this idea is well-motivated in the parts about mathematics and computation, but because most of the other topics the book tackles end up being just shallow aesthetics, then the “deep connections” there can only be present in shallow aesthetic ways. This was, for me, the ultimate breakdown of the promise of the book, as the grand unifying theme—the titular “eternal golden braid” of self-referential structures across domains—was only capable of unifying a few problem domains, as the rest of those connections were pretty but ultimately insubstantial.

While rereading bits of the book in order to write this post, I flipped through it at random and came across the photos marked as Figure 81, which in my copy at least appears on pages 490 and 491. These pages contain photos of television screens that are in turn displaying geometric images that result from pointing a camera at the screen: infinite turning shapes, nested and sweeping frames, eventually deforming into twirling light patterns. They are described in captions, beginning with plain descriptions like, “What happens when you rotate the camera,” and gradually becoming more florid, with captions like, “The galaxy has burned itself out, and become—a black hole!” The actual caption beneath these photos says the following:

Twelve self-engulfing TV screens. I would have included one more, had 13 not been prime.

These photos are fun! They feel especially endearing in 2018 because of the late-70's television depicted, and they depict a fun experiment that I did as a child as soon as I got my hands on my parents' bulky camcorder4. The captions, however, add very little, and the final comment (“...had 13 not been prime”) includes a bit of extra unnecessary whimsy that seems to wink at the reader but adds absolutely no meaning. Like so much of the book, it seems to hint at something grander while not signifying anything in particular. The photos themselves might be a fun illustration of something, but they're not a particularly deep illustration of anything, and their inclusion here (surrounded by several pages of Achilles and Tortoise pontificating about the notion of “self-engulfing”) doesn't bring any more enlightenment than when I first pointed a camcorder at a TV when I was four.

“A fun illustration of something,” is pretty much as far as the book goes: it hints at grand unifying patterns, but the pattern it finds is just the abstract notion of self-reference, and then it keeps bringing it up, making a few unnecessary references, showing some pictures, and asking, “Isn't that cool? Isn't that weird?” It'll give a perfectly competent (if somewhat verbose) description of formal systems, but as soon as it tries to venture connections to other domains, or to explain more complicated or nuanced details, it turns out that the only connections it can draw consist of vigorous handwaving. The whole book boils down to Hofstadter giving a competent lecture on logic, intimating the existence of an eternal golden braid, and then pointing at some fun photos of televisions.


  1. This appears on page 156 of my copy.
  2. This appears on page P-18 of my copy, as part of a response to a critic who mistakenly believed that Hofstadter liked Cage.
  3. I happen to love atonal music, and I'd highly recommend watching Vi Hart's video Twelve Tones which explains twelve-tone atonal music with plenty of examples and drawings of bird-bowls.
  4. Do young people know what a camcorder is these days? Do people still use the word 'camcorder'?

I first came across Rust back in 2010 or 2011, and it was a very different language than the one it is today, both syntactically and semantically. I remember at the time that newcomers would often complain loudly about the terse keywords—like the fact that the return keyword had been shortened to ret—and the omnipresent tildes scattered throughout the language like fallen leaves in autumn. My programming background was in functional languages—specifically in Scheme and Haskell—and I found this language fascinating, sitting in an interesting and unexplored place in the spectrum of programming languages and bringing something genuinely new to the table.

As I followed its development, I fell more and more in love with it. The core developers were pragmatic and thoughtful, and while I wouldn't always have made the same decisions as them, I always felt like they were making decisions that were well-thought-out and reflected a deep appreciation for both the piece of technology they had created as well as the community that was sprouting up around it. But more than that: I felt like the decisions reflected principles that I, as an engineer, found to be important.

For example, one such decision—a wildly contentious one when it happened!—was the removal of the ~ syntax. Back before 2014, the type ~T represented a unique pointer to a value of type T on the heap, and the expression ~expr allocated a value on the heap and returned a unique pointer to it. Rust removed these entirely: the type became Box<T>, and the corresponding expression became Box::new(expr). There was some forum discussion about whether this was happening in order to introduce syntactic salt which would make heap allocation more painful, but the primary motivation, as described in the RFC was different: it was removing the special case ~T in favor of a single more general mechanism, both to accommodate a larger number of use-cases (such as parameterizing Boxed values by an allocator) and remove redundancy in the language. The text of the RFC even suggests leaving the ~ in the language, but proposes removing it “[...in] the spirit of having one way to do things”.

This wasn't the only instance of such a decision: another good example is the evolution of Rust's closure types. Rust's closure system evolved multiple times and there were many proposals for how to develop it, but my memory of Rust at the time involved (for example) “stack closures” (which meant they could be passed to functions but never returned from functions) and procs, (which took ownership of the things they closed over and could therefore only be called once.) The syntax used to create them was different, their types were different, and they were treated differently and specially by the compiler. Eventually, Rust switched to its current system of unboxed closures, which are subject to the same borrowing rules as other types and use traits like Fn, FnMut and FnOnce in order to abstract over their function-like behavior. Again, this takes something which used to be a special-case and turned into something general, built in terms of powerful existing building-blocks of Rust.

If you want to see the results of these changes, look at the current version of the Periodic Table of Rust Types, which looks now like a very rote mechanical chart, and compare it to the first version of the same chart from January of 2014, which features wild irregularities and special cases. The fact that Rust would take such pains to pare the language down to powerful-but-orthogonal abstractions was, by and large, my favorite feature of the language.

I should be clear about why I like this so much: I would argue that the process of taking special cases and turning them into expressions of general properties is not merely a nice aesthetic property, but in fact one of the most important things we do as software engineers and especially as programming language designers. This is the entire purpose of abstraction: it allows us to build tools which can broadly apply to many situations, and in a programming language, it allows us to build languages which have a smaller surface area and therefore are easier not just to learn—which must only be done once—but also to remember and reason about their semantics—which must be done perpetually as we use them. A programmer writing in a language must model, in their head, the meaning and execution of the language, and consequently, a language composed of small, straightforward parts is going to be easier to model than a language composed of large numbers of special cases and situation-specific abstractions.

All of this is why I'm pretty dismayed by the current direction of Rust.

The recent contentious discussion is about a much-bikeshedded piece of sugar usually called try fn. The motivation for try fn has to do with the often-repetitive nature of handling errors in Rust using the Result type. Rust has several pieces of built-in support for writing functions that return Result<T, E> to represent either a successful return value of type T (represented as Ok(expr)) or an error of type E (represented as Err(expr)). For example, the sugar expr? requires that expr is a Result value, and will desugar to this:

match expr {
  Ok(v) => v,
  Err(e) => return Err(e.into()),
}

That is to say: in the Ok case, it'll turn a Result<T, E> into the T it contains, but in the Err case, it will return early from the enclosing function with an E value.

The try fn feature extends this use-case to make using Results even simpler. One of the current frustrations is the preponderance of Ok() constructors. Consider that when you're returning a value of type Result<(), Err> you will often have to end your block—as well as any “successful” early return—with the slightly unwieldy expression Ok(()):

fn sample(lst: Vec<T>) -> Result<(), String> {
    for x in lst {
        if is_this() {
            return Ok(());
        } else if is_that() {
            return Err("Bad!".to_owned())?;
        }
    }
    Ok(())
}

A proposal like try fn would introduce a sugar that automatically wraps Ok around successful cases. There are dozens of variations that have been described, but the above example might look something like this:

try fn sample(lst: Vec<T>) -> () throws String {
    for x in lst {
        if is_this() {
            return;
        } else if is_that() {
            Err("Bad!".to_owned())?;
        }
    }
}

To be perfectly honest: I think this sugar is a bad idea. I think that it adds an extra hurdle to learnability1, it obscures the (incredibly simple!) way that Rust's error-handling works, it produces an unfortunate asymmetry between producing Result values (where the shape of the data is implied by the context) and matching on Result values (where you will still use the constructors verbatim), and worst of all, it's only faintly nicer to read than the other, and thus a fair bit of fuss over a comparatively minor win in readability2. When compared with my earlier examples of changes to Rust, which were about removing special cases in favor of general mechanisms, this change takes a powerful feature implemented in terms of a general mechanism (errors represented as a Result, a plain ol' Rust enum) and turns it into a special case.

A change like this increases the language's surface area, and results in aggregate in more complexity to the language. One might argue that there's a sense in which try fn is a simplifying change: the body of the above function is shorter and simpler than it was without it. But this is only a local simplification: it streamlines the process of writing code by allowing a programmer to ignore some details about the semantics of their program, but those details are still present, they're just not locally evident, and instead are expressed as part of a larger context. A programmer who is reading code like this now has to be aware of more details of the code in question, as the meaning of an expression like return 5 can only be resolved by looking at the entire function definition. And now there are extra inconsistencies and special cases in the language, which puts a greater burden on the programmer's mental model and introduces more cases to consider in an already large language.

A feature like try fn isn't, taken on it own, a major shift in the language in any way, but it's still very plainly a move towards special cases, and consequently it's a change that runs counter to the things that I loved about Rust in the first place.


  1. There's a very meaningful parallel that various Swift developers brought up: Swift has an Optional type that works very much like Rust's Option or Haskell's Maybe, but in order to make it as familiar as possible, they introduced a number of pieces of syntactic sugar to make working with it as smooth as possible. The unfortunate side-effect of this is that it actually obscures the way that Optional works, as newcomers to the language see the constructors so rarely that they assume that Optional is a built-in compiler feature unlike other algebraic data types! If try fn is implemented in Rust, then my suspicion is that we'll see a whole swath of new Rust developers thinking the same thing here, that try fn is a special error-handling construct and not merely sugar for Result.
  2. I know that Ok(()) is a bit of a weird-looking expression, but I think that could be smoothed over without modifying the language's features by, say, introducing a function ok() that always returns Ok(()). Additionally, libraries like failure already include macros that help in the error case, so I could today rewrite that function as something more or less like:

    fn sample(lst: Vec<T>) -> Result<(), String> {
        for x in lst {
            if is_this() {
                return ok();
            } else if is_that() {
                throw!("Bad!");
            }
        }
        ok()
    }
    

    which is more readable and requires no language modification at all!

Something I've felt for a long time is that structural types are underused and underappreciated in modern programming languages. They're not unheard of: plenty of programming languages feature them in prominent or less-than-prominent ways! But I'm always surprised that they don't show up more prominently.

In particular, I feel that structural types combine about 95% of what I want out of dynamic duck typing with about 95% of what I want out of static type-checking: and honestly, that's a good amount of both! This post is intended to be a quick example of what structural types looks like and why I sometimes point to them as the static answer to duck typing.

Structural Types vs. Nominal Types

If you're familiar with Java, you know what nominal types look like. Consider the following (contrived) example in Java:

class Cat {
  void speak() {
    System.out.println("meow");
  }
}

class Dog {
  void speak() {
    System.out.println("woof");
  }
}

Both Cat and Dog are classes that, from a programmer's point of view, implement the same interface: both of them have a zero-argument method speak that prints something. However, as written, I can't easily write a function that accepts either a Cat or a Dog and statically rejects other objects which don't have a speak method. From Java's point of view, these are completely different types, regardless of the set of methods they expose. Why? Because they're named different things! I could manually tell Java that they implement a shared interface, but there are no language features that can work over objects of the same “shape” without explicitly indicating that they share that shape.

The OCaml language has a different approach to objects. I can define analogous classes in OCaml like this:

class cat = object
  method speak = print_endline "meow"
end
class dog = object
  method speak = print_endline "woof"
end

However, OCaml has a crucial difference in how it treats the types of objects: OCaml cares more about the interface than the name. Consider the function below:

let hear_what_it_has_to_say obj =
  let () = obj#speak in ()

If you're used to curly-brace languages, then this syntax might be unfamiliar: the # operator works like the . operator in Java and other object-oriented languages, so the expression obj#speak is the OCaml equivalent of obj.speak() in most traditional object-oriented languages. If we load this function into an OCaml repl, we can observe the type that OCaml has inferred for this function1:

# hear_what_it_has_to_say;;
- : < speak : unit;  .. >  ->  unit = <fun>

This again might be unfamiliar syntax, so the way to read this type is as follows: hear_what_it_has_to_say is a function which takes an object whose type (represented as the stuff in the angle brackets) has a method called speak which returns unit (more or less like void in Java or C++). The object may have other methods, which is what the .. means. Finally, the function itself returns unit.

In short, this function takes an argument that must be an object that has at least a speak method which doesn't return anything. This means I can call it with both a cat and a dog: after all, they both fit that description!

let () = hear_what_it_has_to_say (new cat)
(* prints "meow" *)
let () = hear_what_it_has_to_say (new dog)
(* prints "woof" *)

Notice that at no point in the above code did I indicate that cat or dog shared an interface: in fact, I didn't define any interfaces at all! I simply created data that had a particular shape, and wrote code that assumed a particular shape, and put them together.

Row Polymorphism

Sometimes when I talk about structural typing, I talk specifically about row types or row polymorphism. This is a particular implementation of structural typing which happens to be convenient and easy to reason about, although other approaches do exist.2

You've already seen an example of row polymorphism: OCaml's object types! The .. in the above type signature is what we would call a “row variable”, a stand-in for “the rest of the object”. In the above instance, both dog and cat had the same interface, but we could define a new object that features a different, larger interface:

class cow = object
  method speak = print_endline "moo"
  method num_stomachs = 4
end

Our cow class now has a method that neither cat nor dog bothered to implement. However, we can still call our hear_what_it_has_to_say method on it without trouble, even though its type is strictly larger than the types of both cat and dog. After all, it still fits the interface!

let () = hear_what_it_has_to_say (new cow)
(* prints "moo" *)

A powerful feature of row types is that we can give intermediate names to structures or parts of structures and use them accordingly. For example, I can write a function like the one above that calls a method and then returns the object it received:

let ecce_orator obj =
  let () = obj#speak in obj

Here's the type that OCaml infers for this:

# ecce_orator;
- : (< speak : unit; .. > as 'a) -> 'a = <fun>

This types looks mostly like the one before, except that we give the object type a temporary alias (here it is 'a) which allows us to express that the return value of the function is exactly the same as what we got in: that is, if we were passed a cat, we'll get back a cat with all its methods, if we were passed a cow, we'll get back a cow with all its methods, and so forth. This is important, and is one of the things that separates systems like row typing from other approaches to structural types like structural subtyping.3

Why Does It Matter?

I said near the beginning that structural types give you 95% of what you want from duck typing and 95% of what you want from static typing. For a long time, I suspected that people who were fans of dynamic languages would start to find structurally-typed systems and use them to build new languages which could take advantage of static types while retaining the flexibility of dynamic systems that permit duck-typed interfaces. I recently found a newer language which is a perfect example of exactly this approach: the Crystal language. To demonstrate, here's what the above OCaml snippets look like when translated into Crystal:

class Cat
  def speak() puts "meow" end
end

class Cow
  def speak() puts "moo" end
  def num_stomachs() 4 end
end

def hear_what_it_has_to_say(obj)
  obj.speak
end

hear_what_it_has_to_say(Cat.new)
hear_what_it_has_to_say(Cow.new)

If you know Ruby, this program will look very familiar: it's valid Ruby source! In particular, like the OCaml above, it's a program that can call hear_what_it_has_to_say on any object with a speak method through the magic of duck typing! Amazingly, it's also valid Crystal, and produces exactly the same output. There's an important difference, though: if I were to ammend this program with a line like hear_what_it_has_to_say(5), then the Crystal compiler gives me the following compile-time error:

    Error in src/main.cr:19: instantiating 'hear_what_it_has_to_say(Int32)'

    hear_what_it_has_to_say(5)
    ^~~~~~~~~~~~~~~~~~~~~~~

    in src/main.cr:12: undefined method 'speak' for Int32

      obj.speak
          ^~~~~

    Rerun with --error-trace to show a complete error trace.

This is a bit verbose and jargon-heavy, but what it's telling us is that the literal 5 (which Crystal gives the type Int32) doesn't have a method called speak, and therefore the program doesn't type-check. Crystal is doing something very much like what OCaml does here, but it's also doing it while presenting you with an interface that looks a lot like Ruby's: it's specifically designed to enable the use of duck typing while still preventing cases that would end up failing at runtime!

But You Said 95% Up At The Top

Okay, there are a few drawbacks to a system like this. One small cost relative to a more traditional nominally-typed system is performance: it's difficult to implement this kind of type system without some kind of indirection, which is a small but nonetheless present cost. When a method is given an object which has a speak method, it needs to know where the code for that method lives, which means I'll need some kind of function pointer or vtable to point it to the appropriate code, or else I'll have to replicate the method's code with specializations for each data layout we use. In most problem domains, this won't be a problem—-but nonetheless, this sort of indirection wouldn't necessarily be required in a system with more rigid types!

A slightly bigger cost is that structural systems like this have slightly weaker static guarantees than a more nominal system. In a Java-like setting, if I accidentally try to call myCat.speka() instead of myCat.speak(), then the compiler can immediately spot a problem right in the function definition: Cat objects don't have a speka method! In the analogous OCaml function, however, I might not find the problem so easily: if I mistyped out hear_what_it_has_to_say function above with a speka method, then the function itself would have been fine: it just means that it takes an object that has a speka method instead of what we intended! Our program as a whole still wouldn't compile, but the error would only arise elsewhere, when we try to pass a cat object to the method. In this case, we're probably safe, but when you start to look at programs across module or compilation unit boundaries, you can start seeing that it's possible for this sort of error to slip in unnoticed until later compilation units are presented with a nonsensical interface.

Finally, there's the cost relative to traditional dynamic type systems: these structurally-typed systems are often less expressive than pure duck typing! OCaml's approach, for example, doesn't let you branch on whether or not an object has a given method, like Python's hasattr or Ruby's respond_to?: you either use the interface you're given, or you don't. Crystal does let you do this, but the type system becomes more complex and sporadically harder to reason about, and it will regularly (although it didn't come up in my example above) simply give up inference and require the you to fill in the types that you want. Still, it's not as straightforward as passing in a thing that implements the stuff you want and letting it quack.

Of course, in several of these cases, I'm also setting up a bit of an artificial divide: there's nothing wrong with having a system that has features of structural and nominal typing! OCaml does, and this can be very powerful: we can have some parts of the program that work over types whose structure is inferred, and others that work over types whose structures is declared up-front, and both of these can happen in concert. Alternately, we can build gradually typed systems that allow full dynamic types and gradually use more static knowledge to move towards a system like this, for a full spectrum between flexibility and safety.

But even with the drawbacks as described, I contend that systems that use structural types as a modeling tool are a powerful middle step between dynamic and static languages, and they can definitely enable new powerful tools that allow the flexible experimentation of dynamically-typed languages while retaining many of the safety properties provided by statically-typed languages.


  1. I added a tiny bit of extra whitespace to the actual output, for clarity's sake.
  2. The most notable other approach to structural typing is structural subtyping, which I'm not going to go over here, but which also exists in OCaml: the Real World OCaml book has a section on Subtyping Versus Row Polymorphism which explains it in a bit more detail. There are some situations where you might prefer row types, and some where you might prefer a proper subtyping system: it very much depends!
  3. In particular, in a system with subtyping, I might have one type that's a subtype of another: say, a cat which is a subtype of animal. I can write a function which accepts an animal as argument, and pass it a cat, but once I've done that, I've casted that reference to a cat to a broader type, which means the type has lost some information! If that function returns the value it was given, it can't return the cat, it has to return the value it was given, and the type has no way of tying the full structure of the input type to the output type. That's where row typing is a valuable tool!