Infinite Negative Utility

Something I've had puttering around in the back of my head for ages now is My Perfect Application Language. There are languages out there that I like more or less, but I still believe that another language could be a more perfect fit for the software I want to write1, and I have a large number of ideas about what features it would include, what features it would omit, how it might be written, and so forth.

One dissatisfaction I have with existing languages is that I think none of them do error-handling terribly well. Graydon Hoare's very good blog post What next? talks about places that programming languages might go in the future, and one thing he notes is that we've got various approaches but none of them feel obviously right. I very much agree and I don't want to pretend I've “solved” error-handling here.

However, I've got a hypothesis about what sort of error-handling system I might want in My Perfect Application Language, and I've never actually written it down in full. I'm not imagining a completely novel error-handling system: instead, I'm imagining taking a few ideas which have already been tried and combining them in a somewhat novel way, in hopes of uniting their strengths and tempering their weaknesses. To explain this hypothetical feature, I'm going to break it down into two language design hypotheses:

The first hypothesis: statically declaring the list of possible exceptions which a function can throw, like Java's checked exceptions or C++ exception specifications, can be a good feature so long as they can be optionally inferred and do not need to be explicitly listed by the programmer.

The second hypothesis: we can create an incredibly flexible and powerful error-handling mechanism by implementing a statically typed inferred condition system inspired by Common Lisp's condition system, coupling static inference of error conditions with static inference of restarts.

The rest of this post will explain them in fragments. I should also note that this is all hypothetical: I haven't implemented this, and I don't believe any programming language feature can be really justified until it has proven itself in at least medium-scale programs in its intended domain. (In fact, one of the features I'm about to discuss seemed fine in tiny snippets and turned out to be abysmal in large programs.) That said, it's a hypothesis I'd like to test, because I suspect a feature like the one I'm about to describe would be a tremendous boon to the kinds of programs and libraries that I usually write!

Statically inferred checked exceptions

I think it's probably pretty uncontroversial to say that checked exceptions might sound appealing in theory but are downright miserable in practice. The idea is simple: you can statically verify that all error paths are checked if each method enumerates the exceptions that might be thrown. If I call a method foo() that declares no exceptions, then I know for sure that I won't need to catch any of them! If I look in the source and see foo() throws ThatException, then I know for sure that I might need to handle ThatException! The compiler will yell at me if I don't do something about ThatException: I either need to explicitly catch it, or I need to explicitly pass it on by adding throws ThatException to my method. I can sleep easy, knowing that all error paths are handled!

…except, of course, now you need to copy-and-paste throws ThatException again and again and again and again throughout your whole damn codebase, which is ridiculously noisy and nobody wants to do it. (To say nothing of the half-dozen or more exceptions I might want to actually throw!) In practice, Java programmers have pretty universally found checked exceptions to be firmly Not Worth It: either they haphazardly toss the generic and unhelpful throws Exception around all their methods, or they catch any exception without bothering to discriminate in order to silence the warning, or they use runtime errors which don't need to be included in the type. Experience with large codebases has firmly indicated that exception specifications increase toil without significantly improving safety.

But I suspect the problem isn't that checked exceptions can't increase safety: the problem is that the amount of toil they cause is disproportionate to the safety you get, and consequently they push programmers to use them in ways which minimize the toil and therefore also minimize the safety. So what if we got rid of the toil?

Let's imagine a hypothetical programming language with exceptions: the syntax isn't important, so I'm going to borrow a handwavey Rust-like syntax, but I should be clear that all the examples below are imaginary and in a fictional non-Rust language. What I imagine here is that the default way of writing a method might look something like this:

// an example function in a fictional language
fn find(haystack: String, needle: String) -> Nat {
  // implementation elided
}

(Wait, why Nat? I also think programming languages should use natural numbers more!)

Ah, but there's nothing in that snippet that mentions exceptions, right? In our hypothetical language, that doesn't mean it won't throw an exception: instead, that means it's allowed to throw any exception! However, the compiler can infer which exceptions it might throw, and it shouldn't be hard to compute: simply accumulate any uncaught exception thrown in the method body or any uncaught exception thrown by a method which is called in the method body. If I asked the compiler for the type of find here, it might tell me a type like

fn find(haystack: String, needle: String) -> Nat
  throws (SubstringNotFound)

So that means the compiler is well aware that find might throw an exception, and in fact knows what exception it might throw. Anything which uses find now can be aware that it might throw SubstringNotFound. That means that, in this hypothetical language, we can write something like

fn example() -> Nat throws () {
  find("foobar", "ee")
}

And now we've given our compiler license to yell at us: we've claimed that example throws no exceptions, but because we're using find, the compiler can correctly point out that we haven't handled SubstringNotFound. Just like in Java, we've got two basic options—either catching SubstringNotFound in the body of example, or adding SubstringNotFound to the list of exceptions thrown—but we've also got a third, substantially more terse option: we can remove throws () entirely and let the compiler infer the set for us. We can add the specification if we care but otherwise let the compiler handle the rest.
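
Since the language is fictional, here's the inference rule described above as a toy Python sketch, operating over hand-written function summaries rather than real code. Everything here (the summary shape, the function names) is hypothetical, and a real compiler would need a proper fixpoint to handle mutual recursion; the `seen` set below just keeps the toy from looping.

```python
def infer_throws(fns, name, seen=frozenset()):
    # inferred set = exceptions raised directly, plus everything uncaught
    # from callees, minus whatever this function catches
    if name in seen:
        return set()  # recursive call: contributes nothing new on this path
    fn = fns[name]
    thrown = set(fn["raises"])
    for callee in fn["calls"]:
        thrown |= infer_throws(fns, callee, seen | {name})
    return thrown - set(fn["catches"])

# the find/example scenario from the post, as summaries
fns = {
    "find":    {"raises": {"SubstringNotFound"}, "calls": [],       "catches": set()},
    "example": {"raises": set(),                 "calls": ["find"], "catches": set()},
    "careful": {"raises": set(),                 "calls": ["find"], "catches": {"SubstringNotFound"}},
}

print(infer_throws(fns, "example"))  # {'SubstringNotFound'}
print(infer_throws(fns, "careful"))  # set()
```

The point is how little information the check needs: once every function's direct raises and catches are known, the throw-sets fall out mechanically.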

I think there's also more that can be done around expressivity here. For example, perhaps a programmer could choose to explicitly include an exception in the list: a type like fn example() -> Nat throws (SomeException, ...) would mean, “Whatever the compiler infers for the method body here, but it can also throw SomeException even if the compiler didn't infer that one.” One situation you might want this is when prototyping an API: perhaps I know that my API might eventually throw CacheMissException, but I haven't wired up the cache yet, so I'm going to make sure I include that in my type signatures in the appropriate places just in case, and elsewhere use throws () to make sure I handle it in places where I need to.

More usefully, though, I can imagine a syntax for ensuring that specific exceptions aren't included in the inferred type. In this case, a type like fn example() -> Nat throws (!OtherException, ...) would mean, “This throws whatever the compiler infers for the body, but if that inferred set includes OtherException then raise a compile error.” This means you don't need to regularly re-write the set of exceptions for a complicated API which might throw a dozen different specific errors, but you could still say, “I don't want this specific exception to escape, so keep me honest: if example() ever tries to throw OtherException, then yell at me about it.”

In fact, I can imagine wanting to implement this in a way where exception lists, even empty ones like (), will actually implicitly include pervasive “existential” exceptions: for example, exceptions that represent signals like SIGKILL or exceptions that get raised when the process runs out of memory. In that case, fn foo() throws () would be a convenient fiction, because it won't force the programmer to handle out-of-memory errors, but a programmer could write fn foo() throws (!OutOfMemory) to indicate that foo not only doesn't throw any user-written or typical stdlib exceptions, it also promises to handle out-of-memory exceptions that bubbled up from within it. A typical program would probably still define fn main() throws (), but a server that's intended to be long-lived might define fn main() throws (!OutOfMemory, !Sigkill) so that the compiler can let you know when you've failed to handle those error conditions.
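
The check a compiler would run against these annotations can be sketched in a few lines. This is a hypothetical Python model where exceptions are plain names: `extra` plays the role of `throws (SomeException, ...)`, and `banned` plays the role of `throws (!OtherException, ...)`.

```python
def check_throws(inferred, extra=(), banned=()):
    # the declared throw-set is whatever was inferred, plus explicit additions
    escaping = set(inferred) | set(extra)
    # any banned exception in that set is a compile error
    forbidden = escaping & set(banned)
    if forbidden:
        raise TypeError("forbidden exceptions may escape: %s" % sorted(forbidden))
    return escaping
```

So `check_throws({"SubstringNotFound"}, extra={"CacheMissException"})` succeeds with both names in the set, while `check_throws({"OutOfMemory"}, banned={"OutOfMemory"})` is rejected, which is exactly the "keep me honest" behavior described above.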

I haven't implemented this in any language, so it's quite possible it'd still have problems. And I've handwaved away a number of issues that I haven't tried to solve. For example, I haven't tried fully articulating the typing rules for higher-order functions, or how this would interact with typeclasses, or how the compilation model will work: there's a lot to be done, and it might require other limitations or compromises or heuristics in practice. But my hypothesis is that a feature such as this would let people use checked exceptions in a way that includes minimal overhead, allowing programmers to opt-in to useful checks but also get out of their way when it's not useful.

What's this about conditions and restarts, then?

I said there was another part of my hypothesis, and to explain that I'll have to talk about Common Lisp's condition system. Let's start with exceptions and describe in fine detail how they work, because that'll help us understand how conditions differ.

In a language with exceptions, you signal the existence of an error by constructing a “thing” called an exception, which is usually a value of a specific type. In languages with inheritance these are often values that inherit from a specific base class (although not always—Ruby, for example, allows any value to be thrown) and in languages without inheritance they usually have some kind of error-specific tag (e.g. in Haskell or various ML languages.) These values can be “raised” or “thrown”. When an exception is “raised”, the language runtime will begin to walk up the stack, freeing stack frames as it goes, until it finds a “handler”, a bit of code which matches the appropriate exception type or tag and is attached to a block of error-handling code. If it fails to find a handler, the runtime will usually terminate the program and print a relevant message. If it does find a handler, it resumes from that point, providing the exception to that bit of code.

This is a widely-used mechanism for errors: so much so, in fact, that it can be hard to imagine alternatives. What else could you do? What else would you want out of it?

I'm going to steal an example from Peter Seibel's Practical Common Lisp as motivation, but rewrite it in a curly-bracket syntax for people who aren't as comfortable with Lisp. Imagine that I'm writing a library which is parsing log files in a particular format. I have a function called parse_log_entry which takes a fragment of text and produces a log entry. Say I've written it like this:

fn parse_log_entry(text: String) -> LogEntry {
  if (is_well_formed(text)) {
    return LogEntry::from_text(text);
  } else {
    raise MalformedLogEntry(text);
  }
}

Now the library is about parsing whole log files, so I also expose a function to parse an entire file like this:

fn parse_log_file(f: File) -> List<LogEntry> {
  let mut list = List::new();
  for ln in f.read_lines() {
    list.append(parse_log_entry(ln));
  }
  list
}

This is nice and simple! Unfortunately, if a single entry fails to parse, we'll throw a MalformedLogEntry exception and lose access to the whole log parsed so far! In some applications, maybe that's fine, but I've said we're writing a library, and we want it to be as flexible as possible for an eventual user. Perhaps a user of the library would like us to put a special kind of LogEntry value that represents a malformed entry instead? We could write something like this:

fn parse_log_file(f: File) -> List<LogEntry> {
  let mut list = List::new();
  for ln in f.read_lines() {
    try {
      list.append(parse_log_entry(ln))
    } catch (exn: MalformedLogEntry) {
      list.append(bad_log_entry())
    }
  }
  list
}

Now we handle that error gracefully. But now we're assuming that's how the user wants us to handle that error. Maybe the user wants us to quietly skip that entry instead! We can write that, too:

fn parse_log_file(f: File) -> List<LogEntry> {
  let mut list = List::new();
  for ln in f.read_lines() {
    try {
      list.append(parse_log_entry(ln))
    } catch (_exn: MalformedLogEntry) {
      // do nothing
    }
  }
  list
}

But what if they want us to skip them but write the errors to stdout? Or to a specific logger? Or apply a correction heuristic and try parsing the line again? Or…

Okay, library design is hard, and designing libraries which handle errors in every possible way is really hard. One approach here might be to provide all these as options which can be chosen by the library user. Maybe we expose different methods to the user which each implement different versions of these strategies, like parse_log_file_with_default versus parse_log_file_skip_bad. Maybe we provide one function with lots of optional parameters, like parse_log_file(with_default: ..., skip_invalid: ...). Maybe we just throw our hands in the air and choose one we think is sensible: convention over configuration, right?

On the other hand, if we had a condition system, we would have a really powerful way of allowing the user to choose any of these and more without having to significantly change our interface. What a condition system does is separate out two different concerns that get conflated by exception-handling: what the error recovery code does and which error recovery code should be invoked. To begin with, instead of an exception handler, we install what's called a restart and give it a name: this is how we define our recovery code that might run after an error is raised, but it does not guarantee that the error-handling code associated with the restart is actually run. In this case, let's start with logic around skipping entries:

fn parse_log_file(f: File) -> List<LogEntry> {
  let mut list = List::new();
  for ln in f.read_lines() {
    try {
      list.append(parse_log_entry(ln))
    } restart SkipEntry {
      // do nothing
    }
  }
  list
}

This restart block represents a possible recovery strategy in the presence of an error, and SkipEntry is the name we've given it. However, we're still missing something: we haven't actually told our program to use it. Our library shouldn't be the one to make that choice, so let's imagine that we're calling the parse_log_file library function from some application code. We now tell our application code to handle a condition by invoking the restart we've defined:

fn analyze_log() -> List<LogEntry> {
  try {
    parse_log_file(File::open("my_log.txt"))
  } handle {
    MalformedLogEntry(_) => restart SkipEntry,
  }
}

This is where I'm telling the program which piece of recovery code to use. What I'm saying is, “If we ever come across a situation where our code has produced a MalformedLogEntry, then recover from that by finding the recovery path labeled with SkipEntry and restart from there.”

So far we've only defined that one recovery path. Let's revisit parse_log_entry and add a few more strategies we might use to recover from an error within that function. Unlike SkipEntry above, these also take parameters, which are pieces of information that the handler can supply in the handle blocks:

fn parse_log_entry(text: String) -> LogEntry {
  if (is_well_formed(text)) {
    return LogEntry::from_text(text);
  } else {
    try {
      raise MalformedLogEntry(text);
    } restart UseValue(v: LogEntry) {
      return v;
    } restart RetryParse(new_text: String) {
      return parse_log_entry(new_text);
    }
  }
}

Now we have a total of three possible ways to recover from a MalformedLogEntry: we can invoke the SkipEntry restart which will simply skip past malformed lines, we can use the UseValue restart with a value of type LogEntry to replace the bad log with a different provided one, or we can use the RetryParse restart to supply a new corrected string and attempt the parsing again.

The important thing now is that the library allows all of these restarts to exist simultaneously, but does not specify which to take: that's up to the calling code. Let's change our application code to supply bad_log_entry() as a default value instead; this means the result will still include as many LogEntry values as we had lines, but some are specifically represented as bad ones:

fn analyze_log() -> List<LogEntry> {
  try {
    parse_log_file(File::open("my_log.txt"))
  } handle {
    MalformedLogEntry(_) => restart UseValue(bad_log_entry()),
  }
}

What if we want to skip the bad ones but still record that we saw them by printing messages to our logger? We can use SkipEntry with some extra handler code, then:

fn analyze_log() -> List<LogEntry> {
  try {
    parse_log_file(File::open("my_log.txt"))
  } handle {
    MalformedLogEntry(text) => {
      logger.log("Found bad log entry: `{}`", text);
      restart SkipEntry;
    }
  }
}

What if we want to try applying a correction heuristic to the first several errors that we see, but exit the program if we see more than a pre-determined “allowable” number of bad entries? We can use shared state in our handler and the RetryParse restart:

fn analyze_log() -> List<LogEntry> {
  let mut errors = 0;
  try {
    parse_log_file(File::open("my_log.txt"))
  } handle {
    MalformedLogEntry(text) => {
      if (errors < ALLOWED_LOG_ERRORS) {
        errors += 1;
        restart RetryParse(try_correction(text));
      } else {
        logger.log("Encountered too many bad log entries; exiting");
        system.exit(1);
      }
    }
  }
}

Admittedly, this system is definitely more fiddly than exception-handling: you've got more moving parts, what with the error conditions (of which we had only one in this example) plus the named error restarts plus the handler logic. (I suspect this is one reason why language designers haven't bothered porting this into new languages, preferring simpler exception-based errors or explicit result types instead.) But the separation can be incredibly powerful: we no longer need to manually thread state through our API; instead, our API is designed for the “happy path”, but errors can still be handled in a granular way, and what's more, the application has full control over how those errors get handled. Breaking up strategies for error recovery (as named restarts) from how to actually handle errors (as handler strategies) allows for some incredibly simple but powerful API designs.
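
To make those moving parts concrete, here's a minimal emulation of this condition machinery in Python (since our curly-brace language is fictional). The handler stack, `signal`, `with_restarts`, and `invoke_restart` are all hypothetical names of my own, not a real library's API; the essential properties are that handlers run before the stack unwinds, and that invoking a restart unwinds only to the frame that established it.

```python
import contextlib

_handlers = []  # dynamically scoped stack of (condition_type, handler_fn)

class _InvokeRestart(Exception):
    # internal control flow: unwind only to the frame that set up the restart
    def __init__(self, name, args):
        self.name, self.args = name, args

@contextlib.contextmanager
def handler(cond_type, fn):
    # install a handler for the dynamic extent of the block
    _handlers.append((cond_type, fn))
    try:
        yield
    finally:
        _handlers.pop()

def signal(condition):
    # unlike a plain raise, this consults handlers *before* unwinding
    for cond_type, fn in reversed(_handlers):
        if isinstance(condition, cond_type):
            fn(condition)  # a handler typically calls invoke_restart
    raise condition        # no handler chose a restart

def invoke_restart(name, *args):
    raise _InvokeRestart(name, args)

def with_restarts(thunk, **restarts):
    # establish named restarts around a computation
    try:
        return thunk()
    except _InvokeRestart as r:
        if r.name in restarts:
            return restarts[r.name](*r.args)
        raise  # the chosen restart lives further up the stack

# --- the log-parsing example; "LOG:" stands in for is_well_formed ---

class MalformedLogEntry(Exception):
    def __init__(self, text):
        super().__init__(text)
        self.text = text

def parse_log_entry(text):
    if text.startswith("LOG:"):
        return text[4:]
    return with_restarts(
        lambda: signal(MalformedLogEntry(text)),
        UseValue=lambda v: v,
        RetryParse=lambda new_text: parse_log_entry(new_text),
    )

def parse_log_file(lines):
    entries = []
    for ln in lines:
        with_restarts(
            lambda: entries.append(parse_log_entry(ln)),
            SkipEntry=lambda: None,
        )
    return entries

# the application, not the library, picks the recovery strategy:
def analyze_log_skipping(lines):
    with handler(MalformedLogEntry, lambda c: invoke_restart("SkipEntry")):
        return parse_log_file(lines)

def analyze_log_with_default(lines):
    with handler(MalformedLogEntry, lambda c: invoke_restart("UseValue", "<bad>")):
        return parse_log_file(lines)
```

Given `["LOG:a", "oops", "LOG:b"]`, the skipping application yields `["a", "b"]` while the default-substituting one yields `["a", "<bad>", "b"]`, and the library code never changed: only the handler did.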

Statically inferred typed restarts

Okay, so let's put these together. The last section glossed over types, and for good reason: in Common Lisp, there aren't types for restarts. In fact, it's possible for a handler to specify a restart which doesn't exist, in which case it'll produce a different kind of error (a CONTROL-ERROR in Common Lisp parlance) because it wasn't able to find the place where code should resume.

But we could build a statically typed language that implements a condition system, so that the compiler could reject programs which catch non-existent errors or try to resume from non-existent restarts (or supply those restarts with values of the wrong type.) Yet again, the way I'd propose doing this is by allowing not just errors (or conditions, to use the Common Lisp terminology) but also restarts to be inferred as part of a function's type, even when the rest of the type is explicitly written.

I think, in this situation, the code we wrote above (without any type information) should still be valid, in the same way that the example code at the start of the post could have its possible exceptions inferred if you didn't explicitly specify them. In this case, if we asked our hypothetical language to infer the relevant types, we might end up with something like this:

fn parse_log_entry(text: String) -> LogEntry
  throws (MalformedLogEntry(String))
  restarts (UseValue(LogEntry), RetryParse(String))

fn parse_log_file(file: File) -> List<LogEntry>
  throws (MalformedLogEntry(String))
  restarts (UseValue(LogEntry), RetryParse(String), SkipEntry)

fn analyze_log() -> List<LogEntry> throws () restarts ()

After all, the language can tell which errors are possible and which restarts are available. But as with the inferred checked exceptions before, we wouldn't need to write them out: they're easily inferred by the compiler, and it simplifies the process of writing the code tremendously! The compiler would reject your program if you try to use a restart that doesn't exist, or if you claim your function is error-free but fail to handle an error that might bubble up. This would allow us to have a very powerful set of tools for error handling and also careful API design without putting a strong burden of tracking minutiae on the programmer.

As I said before, I haven't implemented this anywhere, and who knows if this idea would survive contact with an end-user2. It's not only possible but likely that there are cases I haven't considered, complicated interactions I've failed to foresee, or ergonomic concerns that this doesn't address. But it's a hypothesis, and one I'd like to test some day, because if it works, I can imagine it being a tremendous tool for safety and legibility!


  1. Why do I specify this? Well, because I believe that there's no one programming language for all applications. I like Rust and I like Haskell and I like Python, and there are a few applications where any of them might be fine, but more often there are programs that would be perfect for one and terrible for the other two. I think that problem domain matters a lot to programming language design.
  2. I've also elided both implementation concerns (for example: how does this work with higher-order functions?) and other design questions. There are two big design questions that I've considered but don't have conclusions about. The first question is: how should conditions and restarts be named and namespaced? I've treated them in the code as global-ish, but maybe they should be namespaced underneath the function name (e.g. parse_log_entry::UseValue) or at least underneath the compilation unit, but there are a lot of possible choices here. The second question is: how “far up” should restarts be accessible, and how do you control that? I handwaved in the inferred type signatures above that, because we handle the condition which can get raised, we don't “re-export” restarts from analyze_log, but we do re-export them from parse_log_file. Is that an analysis we can do? If not, how do we make sure that restarts are available without adding too much more busywork for the programmer?

Literal actual years ago, a friend of mine asked me for a blog post about how I generate images with code. I've started writing such a blog post on multiple occasions, and I keep wanting to add more stuff, more asides, more examples, and it grows well beyond what a post should be. So instead, I'm going to break it up into multiple posts.

  1. In Praise of Netpbm
  2. Cairo, SVG, and generated vector images: planned
  3. Practical PostScript—no, why are you laughing, I'm serious: planned
  4. How I approach procedural images: planned

Generative pixel art

Sometimes the generative art I want to make is just pixel-based. For example, let's say I want to create some interesting cellular automata:

Three examples of three-state cellular automata.

Or perhaps I want to make some cool-looking glyphs, the kind you'd see used as writing in the background of a pixel-art-heavy game like Hyper Light Drifter or Fez:

Twelve examples of random blocky symbols.

Or maybe something kind of maze-like:

A twisting black-and-white pattern which resembles a maze.

There are a lot of ways to go about this, but for images like these I've got a particular approach which is barebones and flexible.

Say you want to generate some pixel art

For didactic reasons that'll become apparent later in the blog post, I'm going to write code here in Wren. Wren's a little object-oriented scripting language that's designed to be embedded in larger programs. I'm mostly not gonna explain it much, but it should be easy to follow it like it's pseudocode: it's an imperative object-oriented language that feels somewhere between Ruby and JavaScript with a pretty standard curly-brace syntax. I'm not gonna embed it as a scripting language, though: I just want to use it to create some basic black-and-white pixel art, so I'm gonna use wren-cli to run scripts directly with access to its pretty minimal standard library.

So, pixels. A raster image is just a two-dimensional array, right? How hard can that be? Let's start by making an array of arrays that will be our image.1 Since this is black-and-white, each element will be one of two numbers: for this example, I'll use 0 for white and 1 for black. I'll initialize a basic 3×3 image with all white pixels, by which I mean, make a 3×3 2D array.2

var width = 3
var height = 3
var image = []

for (y in 0...height) {
  var row = []
  for (x in 0...width) {
    row.add(0)
  }
  image.add(row)
}

Okay, so we've got our blank image. Let's do something basic: let's make a very simple black-and-white checkerboard pattern by alternating black and white pixels. This is pretty easy to do with the modulus operator:

for (x in 0...width) {
  for (y in 0...height) {
    image[y][x] = (x + y) % 2
  }
}

In this case, you'll notice that I access the image using image[y][x]. This is the opposite of the usual notation, but think about why that is: when we print out our 2D array, it looks like this:

[[0, 1, 0],
 [1, 0, 1],
 [0, 1, 0]]

But that means that arr[0] is the row at the top, arr[1] is the row beneath, and arr[2] is the last row. That means to e.g. get the second pixel of the first row, we'd need to use arr[0][1]. A little unusual, but not too hard to remember, and if it's really tripping us up, we can always write some wrapper functions that take the arguments in the correct order.
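
As a sketch of what such a wrapper might look like (in Python rather than Wren, with made-up names), the storage can stay row-major while get and set take their arguments in the familiar (x, y) order:

```python
class Grid:
    def __init__(self, width, height, fill=0):
        # row-major storage: one list per row, indexed pixels[y][x]
        self.pixels = [[fill] * width for _ in range(height)]

    def get(self, x, y):
        return self.pixels[y][x]

    def set(self, x, y, value):
        self.pixels[y][x] = value
```

Callers write `g.get(1, 0)` for the second pixel of the first row, and the y-then-x flip happens in exactly one place.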

Okay! We've got our pixel array: let's turn it into an image!

…oh, wait. I chose to write this in Wren, didn't I?3 Here's a thing about Wren: it doesn't have a package manager, or even much of a stdlib. In another language, I might now say, “Okay, let's find the PNG encoder, toss my data at that, and write out a PNG file.” But I don't have a PNG encoder on hand to throw the data at.4 How can I take my 2D array and see it as an actual image?

Netpbm to the rescue

Netpbm is an image format, or rather, a family of image formats: seven in all. For the purposes of this blog post, I only care about the first three (although I'll describe the last four a bit later.) Let's start with the PBM format itself, where PBM stands for “Portable BitMap”. This is a format for bitmap images, in the very literal sense of 'a map of bits': each pixel is 1 bit (i.e. black or white, on or off.) The first three formats are text-oriented, so I can write them with a text editor. Here is a hand-written PBM file which describes the basic 3×3 image I was generating above:

P1
3 3
0 1 0
1 0 1
0 1 0

…and that's it. The P1 at the top says, “I am a Netpbm file,” the next line has the width and height of the image in pixels, and then after that are all the values for each individual pixel, 0 for white and 1 for black.

Okay, there's a little bit more to it, but not much: for one, Netpbm files can have comments, which are commonly used to indicate which program created them (although they can't come first: the first two bytes of the file have to be P1 for it to be recognized by parsers). It's also worth noting that all the whitespace after the width and height is optional, and you don't need to line up the pixels nicely like I did above. The following defines the same image:

P1
# check it out
3 3
010101010

But that's really it.

So, that's easy! Let's just write that to stdout in our Wren program. I'm going to separate the rows by newlines and the pixels by spaces, but as I said before, it's optional here:

System.print("P1")
System.print("%(width) %(height)")
for (y in 0...height) {
  for (x in 0...width) {
    System.write("%(image[y][x]) ")
  }
  System.print()
}

Because I've parameterized it by the width and height, I can also tweak those to make my image larger. And now that I've got a PBM file, I can open it in any program which supports the format: and you might be surprised which programs do support it. I can't view it directly in the browser or anything, but there are plenty of viewers and editors that do understand it. I've got Glimpse installed, why not try that?

A 3x3 pixel checkerboard, viewed in the Glimpse editor.

Works just fine! And I could use Glimpse to convert it to another format. There's also a suite of tools called netpbm which are commonly available on Unix-like systems, so I could always run that to convert them to some other format.5
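
For readers following along in another language, here's the same P1 emitter ported to Python; `to_pbm` is a made-up helper name, and it builds the file as a string rather than printing directly.

```python
def to_pbm(image):
    # image is a list of rows, each row a list of 0 (white) / 1 (black)
    height, width = len(image), len(image[0])
    lines = ["P1", "%d %d" % (width, height)]
    for row in image:
        lines.append(" ".join(str(px) for px in row))
    return "\n".join(lines) + "\n"

# the 3x3 checkerboard from above
checker = [[(x + y) % 2 for x in range(3)] for y in range(3)]
print(to_pbm(checker), end="")
```

Writing that string to a file with a .pbm extension gives you exactly the hand-written file from earlier.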

What if I want grays or colors?

You'll notice that the PBM format is pretty barebones: it is only capable of describing black-and-white images. So, let's create a grayscale image instead, which means creating a PGM file. Instead of a simple 0-versus-1 distinction, we'll have a scale from black to white. Unlike some other formats, the Netpbm format allows us to decide how many levels of gray we have. Let's say we want to create a very similar image except instead of cycling between black and white, we cycle between multiple shades of gray: let's say, arbitrarily, four. We can start with the same blank image, but instead of filling in 0 and 1, we fill in a wider range of numbers, which for us will range from 0 through 3.

for (x in 0...width) {
  for (y in 0...height) {
    image[y][x] = (x + y) % 4
  }
}

In order to produce this file type, we start it with P2 instead, which is the header variant for a grayscale image. We follow it with the width and height, but then we add one more thing: the maximum depth. This is the number that will represent white, while 0 will represent black;6 any number in between will represent a shade of gray. In a PGM, unlike a PBM, you do need spaces between the individual pixel values, but luckily we were already doing that before. (Newlines between rows are still optional: I just like how they look.)

System.print("P2") // the PGM header
System.print("%(width) %(height)")
System.print(3) // the maximum value which will appear
for (y in 0...height) {
  for (x in 0...width) {
    System.write("%(image[y][x]) ")
  }
  System.print()
}

The end result of this, yet again, is a bitmap like so:

A sample bitmap image of cycling shades of gray.

Creating color images with Netpbm is more involved, but only slightly. When I create a color image, I need to supply three values per pixel: red, green, and blue. One way you can represent this is by treating each pixel as a struct with three fields, or a map from color to value. For our purposes, let's say that each pixel is a hashmap with three keys: "r", "g", and "b". You'll also notice that I decided to define a color depth at the top of the file here, so we can use that variable later:

var width = 24
var height = 24
var depth = 24
var image = []

for (y in 0...height) {
  var row = []
  for (x in 0...width) {
    row.add({"r": 0, "g": 0, "b": 0})
  }
  image.add(row)
}

I'm going to arbitrarily choose something weird to draw here: let's say that I want the image to get more red as you go to the right, more green as you go down, and the blue-ness of pixels will alternate in a checkerboard pattern.

for (x in 0...width) {
  for (y in 0...height) {
    image[y][x]["r"] = ((x / width) * depth).floor
    image[y][x]["g"] = ((y / height) * depth).floor
    image[y][x]["b"] = ((x + y) % 2) * depth
  }
}

Once we do that, printing the image is easy, and indeed is nearly identical to the PGM version. We need the header P3, and we still need a maximum brightness value for pixels, but only one, which serves as the maximum for all three color channels. The biggest change is that each pixel is now three numbers, not one, but we can accommodate that pretty easily:

System.print("P3")
System.print("%(width) %(height)")
System.print(depth)
for (y in 0...height) {
  for (x in 0...width) {
    var p = image[y][x]
    System.write("%(p["r"]) %(p["g"]) %(p["b"]) ")
  }
  System.print()
}

And the end result:

A sample bitmap image of shifting colors.

And there we go! We can produce bitmap images easily, with no library support at all!

Why Netpbm and not a bitmap library?

There's no shortage of good libraries for manipulating images, many of which have convenient higher-level operations you might need. At the very least, you probably want some kind of blitting operation, to take a set of smaller bitmaps and redraw them onto a larger one.

But there's a good reason to know about the Netpbm format, and a good reason why I've used it in a lot of my pixel art generation projects: you can use it from literally anywhere and without thinking about it much at all! Look at the above programs: I used a language without a native image library and I didn't even use file IO, and yet I was able to create bitmap images in less than two dozen lines.

If I want something a bit more sophisticated, I can start to build my own versions of higher-level operations. For example, let's say I want to encapsulate the image in a class, say, with a pixel(x, y, color) method I can call. It'd be pretty easy for me to add a rectangle method, like so:

  rectangle(x, y, width, height, shade) {
    // the two horizontal lines
    for (dx in 0..width) {
      pixel(x + dx, y,          shade)
      pixel(x + dx, y + height, shade)
    }

    // the two vertical lines
    for (dy in 0..height) {
      pixel(x,         y + dy, shade)
      pixel(x + width, y + dy, shade)
    }
  }
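The same outline-drawing logic, sketched in Python over a bare 2D list (the function name and signature mirror the method above, but everything else here is my own):

```python
# A Python sketch of the rectangle-outline helper described above.
# `grid` is a plain 2D list of shades; the ranges are inclusive of
# the far edge, matching the Wren `0..width` ranges.
def rectangle(grid, x, y, width, height, shade):
    # the two horizontal lines
    for dx in range(width + 1):
        grid[y][x + dx] = shade
        grid[y + height][x + dx] = shade
    # the two vertical lines
    for dy in range(height + 1):
        grid[y + dy][x] = shade
        grid[y + dy][x + width] = shade

grid = [[0] * 10 for _ in range(10)]
rectangle(grid, 2, 3, 5, 4, 7)
```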

Then I could create an image with a handful of randomly-placed rectangles with random shades:

var image = GrayscaleImage.new(width, height, depth)
// create up to 6 rectangles
for (i in 0..rand.int(3, 6)) {
  // choose the color from the depth
  var color = rand.int(2, 8)
  // choose top-left point randomly
  var x = rand.int(0, width-3)
  var y = rand.int(0, height-3)
  // choose width and height from remaining
  var w = rand.int(x+2, width) - x
  var h = rand.int(y+2, height) - y
  // draw the rectangle
  image.rectangle(x, y, w, h, color)
}
image.showPGM()

This program, when run, produces output that looks like this:

Several gray rectangles randomly positioned on a black field.

I chose Wren here mostly because it's vaguely comprehensible if you know just about any object-oriented scripting language but also to show how little support you need to start making these images. I've built Netpbm images using languages which have other robust image library support (like Rust and Python) but I've also used languages where libraries were scarce and I didn't feel like building a PNG encoder or mucking with an FFI: languages like Wren, but also various obscure Scheme variants, Pony, and Idris.

In fact, the examples at the beginning of this post were all written in exactly this way, and in different languages: the three-state cellular automata were written in Rust, the glyphs in Ruby, and the maze-ish pattern in Python, all of them created by writing out Netpbm files without any library support!7

When would I use a bitmap library?

One concern is that Netpbm files are big. Not only are they uncompressed, they're also written out as ASCII text, which means a lot of wasted space. For example, near the beginning of this post, I included an example of cellular automata, which I generated as 99×99-pixel squares. Encoded as a PNG, this image is about 1.9K in size; encoded in the uncompressed BMP format, it balloons to a hefty 39K. The PPM file I originally generated is 58K, around 30× the size of the PNG. For a tiny image, that's enormous. This isn't a big problem long-term, because you can easily convert them to a compressed format for storage, but it's worth noting, especially if you're going to be generating a lot of them at once. (I generated 120 random cellular automata images to choose one for this post: those images, each 99 pixels square, took up about 8M on disk!)

You'll also have increasing problems as your desired drawing primitives get more complicated. The stuff I've done above all involved tight control over placing pixels or groups of pixels, but if I wanted to e.g. composite several images with various transparency modes, I'm going to have to start implementing more and more things manually. At some point, learning a good library becomes a better use of your time.8

Finally, this is really only suitable for generating a static image or several in a batch-like environment: that is to say, it's probably not gonna be an ideal choice for any generative task that needs to be done in real-time (like interactive simulations or video games) or even relatively quickly (like generating images on-the-fly for a web site). Your chosen application might be better served again by finding bindings to something like pixman instead!

Hey, what about the other four Netpbm variants?

There are other variants of the Netpbm format, which of course use headers P4 through P7. P4 through P6 are simply binary variants of P1 through P3: for example, an image with the header P4 is a packed-binary bitmap. The header section that defines the width and height is effectively identical to the ASCII version—you'd still use a base-10 ASCII representation of the width and height separated by whitespace—but the pixel data that follows is packed binary rather than ASCII numerals. For P5 and P6, the maximum value still appears in the ASCII header, and each sample is then a single byte (or two bytes, for maximum values above 255).
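To make the binary variants concrete, here's a sketch in Python of emitting a raw-binary grayscale image with a P5 header. The details follow my reading of the format—worth double-checking against the Netpbm documentation before relying on it:

```python
# Sketch: a raw-binary PGM (P5). The header stays ASCII;
# each pixel then becomes one byte (valid while maxval < 256).
width, height, maxval = 24, 24, 255

# the familiar diagonal pattern, scaled up to the 0..255 range
pixels = bytes(((x + y) % 4) * 85 for y in range(height) for x in range(width))

header = f"P5\n{width} {height}\n{maxval}\n".encode("ascii")
data = header + pixels
# writing `data` (in binary mode) to a .pgm file yields a valid raw PGM
```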

The final variant, P7, is only ever binary, and allows you to define more color channels for specific purposes. For example, you can add an alpha channel, something that is lacking in the other Netpbm variants.

Honestly? I've never used them. The reason I reach for Netpbm variants is because they're easy to print and also to parse. In theory, a binary Netpbm file is going to be easier to produce and parse than a compressed format like PNG, but I've never had occasion to need just enough efficiency that I'd use a packed binary image format but not want to take it a step further to a compressed format or dedicated library. If I'm pulling in libraries, I might as well do it right!

What's next for this series?

I've talked about how I produce pixel-art-ish images, but that's actually a relatively small portion of the generative artwork I've done. Most of my generative artwork has been done with vector art, and there are a few approaches I've taken there. That's going to be the topic of my next post, and I'm going to cover the two most common approaches I've taken, one that's similar to the approach described here, and another that's library-supported!


  1. Okay, you don't have to use a 2D array. For example, you could also use a flat array of width * height elements: assuming you're storing this in row-major order, you can get the element at the position (x, y) by using the index x + y * width. Alternately, if you don't mind being inefficient—and if you're using the approach I outline here, you're probably more interested in your time than the computer's time!—you can use a hashmap or dictionary whose keys are pairs of numbers. I regularly choose different representations for images depending on what one-off image I'm trying to generate.
  2. A lot of languages have some way of automatically creating a list of n copies of a repeated element, and Wren is no exception, but you might notice I didn't use it here, in large part because there's a common gotcha with these functions: they often don't deep-clone their argument, which can cause problems where a mutation affects more elements than you'd expect. In Wren's case, I could create a 2D array of zeroes by writing List.filled(height, List.filled(width, 0)), but in this case every row is a reference to the same row in memory, so every row of the image would end up being identical. This is a problem common to other imperative languages, so Ruby's Array.new(height, Array.new(width, 0)) or Python's [[0] * width] * height would have similar problems: because all rows are aliases for the same row in memory, modifying image[0][0] would also modify image[1][0] and image[2][0]. …probably not what you want!
  3. It's almost like I chose this language to make a clunky didactic point…
  4. I could write a C program which pulls in a PNG encoder and exposes it to Wren—indeed, that's sort of how the language expects to be used—but I don't want to muck with that for this blog post.
  5. That said, the Netpbm tools are very old: at this point, it's likely you haven't even heard of most of the image formats they have converters for!
  6. You might notice that the meaning of the numbers is swapped for the PGM format. In the PBM format above, 0 is white and 1 is black. However, if I create a PGM file where the maximum value for a pixel was 1, then I'd have effectively created a bitmap file, but reversed: in a PGM file, 0 is black and the maximum value is white. This is a peculiarity of the format, and something you simply need to remember!
  7. The Rust one does use the rand library, but it doesn't use a library for images, at least.
  8. This is especially true if you ever want to create images with text. At that point, I'd skip pixels and head straight to vector, even if I want the final result to have a chunky look to it!

Let's talk about objects in functional languages.

This could be a sticking point, so I'll get this out of the way up-front: I find arguments over whether such-and-such a thing is “really” functional or “really” imperative or “really” object-oriented to be for the most part frustrating and useless. A lot of these arguments boil down to value judgments in disguise (a thing I've touched on in the past when talking about the rhetorical emphasis placed on math in programming) and many of the rest turn out to be based on informal definitions where 'functional' means 'like my favorite or first functional language'.1 I prefer to think about these concepts less in terms of necessary criteria and hard lines, and more in terms of influence and approach and focus.

The paper On Understanding Data Abstraction Revisited by William Cook (itself a paper that revisits On understanding data types, data abstraction, and polymorphism by Luca Cardelli and Peter Wegner) lays out simple definitions of the often-used concepts of “abstract data type” and “object”. His definition of “object” is, of course, not the only one, but I think it's a useful one. Importantly, Cook's definition is based on abstract criteria that can apply even when the language itself does not have an explicit concept of an “object”. To emphasize this, I'm going to use Haskell in this blog post to demonstrate a pure functional variation on “object-oriented programming”2.

Beyond being a useful definition, Cook's formulation is also a valuable design pattern, one which has both advantages and disadvantages when designing programs. Consequently, the goal of this blog post is twofold: first, to show how “objects” can (under some definition) exist in a language like Haskell which has no built-in notion of “object-oriented programming”, and secondly, to show the advantages and disadvantages of code written in this style.

Abstract data types

Cook's definition of an abstract data type is a type that

[…] has a public name, a hidden representation, and operations to create, combine, and observe values of the abstraction.

To use a simplified version of his example, here's a (not terribly efficient) way of defining a “set of integers” in Haskell as an abstract data type. In this case, we've got an IntSet type whose representation can be easily hidden, because the operations that a consumer of this library cares about have been defined in terms of that abstract representation.

-- we'll export `IntSet` but not the constructors
data IntSet
  = SetEmpty
  | SetInsert Int IntSet

empty :: IntSet
empty = SetEmpty

isEmpty :: IntSet -> Bool
isEmpty SetEmpty = True
isEmpty _ = False

insert :: Int -> IntSet -> IntSet
insert x set
  | contains set x = set
  | otherwise = SetInsert x set

contains :: IntSet -> Int -> Bool
contains SetEmpty _ = False
contains (SetInsert y rest) x
  | x == y = True
  | otherwise = contains rest x

This is probably not a terribly controversial definition or design: it's a pretty typical one for most functional languages! …well, it might be controversial to use since it's got awful algorithmic performance, but it's fairly unobjectionable as a teaching tool, at least.

What would a user see when looking at the documentation for this module? Since we're not exporting the constructors for IntSet, it'll look something like this:

data IntSet
empty :: IntSet
isEmpty :: IntSet -> Bool
insert :: Int -> IntSet -> IntSet
contains :: IntSet -> Int -> Bool

The ability to hide the definition of IntSet is the thing that makes this an abstract data type. A user of the library doesn't care—and ideally doesn't need to care—what constructors hide behind that IntSet.
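For later contrast, the same ADT shape can be sketched in Python, where the hiding is by convention (a leading underscore) rather than an export list; the names here are mine, mirroring the Haskell above:

```python
# A Python sketch of the abstract-data-type version: one hidden
# representation (a list, "private" by convention), public operations.
class IntSet:
    def __init__(self):
        self._items = []          # the hidden representation

    def is_empty(self):
        return len(self._items) == 0

    def insert(self, x):
        s = IntSet()
        s._items = self._items if x in self._items else [x] + self._items
        return s

    def contains(self, x):
        return x in self._items

empty = IntSet()
s = empty.insert(3).insert(5)
```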

What is an object?

Cook then goes on to describe “objects”. Here's another (equally inefficient) implementation of IntSet, which I'll define as OIntSet so I can easily refer to both:

data OIntSet = OIntSet
  { oIsEmpty :: Bool
  , oContains :: Int -> Bool
  }

oEmpty :: OIntSet
oEmpty = OIntSet
  { oIsEmpty = True
  , oContains = \_ -> False
  }

oInsert :: Int -> OIntSet -> OIntSet
oInsert x set
  | oContains set x = set
  | otherwise = OIntSet
      { oIsEmpty = False
      , oContains = \i -> i == x || oContains set i
      }

It's possible that we can choose our export list carefully so that this implementation of OIntSet reveals the exact same set of operations as the previous one. However, there's a major difference here: OIntSet is not actually hiding a specific type. Instead, it just bundles the relevant set of operations inside of a record of functions, which acts like an interface type. In both the ADT-based approach and the “object”-based approach, a user does not know about the internal representation of OIntSet, but in the ADT approach, this is because there exists a single representation which is non-public, while in the “object”-based approach, there may be multiple separate implementations that are indistinguishable.

Why use objects?

An “object”-like representation allows for a vast amount of flexibility. Because a consumer of an OIntSet can use any value as long as the value has provided implementations of the relevant “methods”, we can easily and conveniently define new instances of OIntSet that have radically different internal representations. For example, we can define infinite sets that use simple numerical computations to define their oContains method, like this set of all even numbers:

oEvenNumbers :: OIntSet
oEvenNumbers = OIntSet
  { oIsEmpty = False
  , oContains = \i -> i `mod` 2 == 0
  }

Or we could construct OIntSet values that use a different data representation, such as a list, to store the members of the set:

oFromList :: [Int] -> OIntSet
oFromList list = OIntSet
  { oIsEmpty = null list
  , oContains = \i -> i `elem` list
  }

But even though these OIntSet definitions use different underlying data representations, they expose the same interface, so we can use the same operations to manipulate them. For example, we can define an oUnion operation which computes the union of two OIntSet objects, since that operation is easily expressible in terms of oIsEmpty and oContains:

oUnion :: OIntSet -> OIntSet -> OIntSet
oUnion set set' = OIntSet
  { oIsEmpty = oIsEmpty set && oIsEmpty set'
  , oContains = \i -> oContains set i || oContains set' i
  }

Our oUnion operation—indeed, any operation we define—can work on any two OIntSets even if they have wildly different internal representations. We can even use this to combine all of our previously-defined OIntSet constructors into one expression, to create a set that uses a combination of Haskell lists, numeric predicates, and closures to represent a set:

sample :: OIntSet
sample = oInsert 1 oEvenNumbers `oUnion` oFromList [10..20]

This is a very convenient way of building certain abstractions. By building around external interfaces, you can include varying data representations that easily work together.
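This pattern translates directly into other languages too. Here's a sketch in Python, where an “object” is just a record of closures; all the names are mine, mirroring the Haskell definitions above:

```python
# A Python sketch of the record-of-functions "object" pattern above.
# A set is anything carrying an `is_empty` flag and a `contains` closure.
from dataclasses import dataclass
from typing import Callable

@dataclass
class OIntSet:
    is_empty: bool
    contains: Callable[[int], bool]

def o_from_list(xs):
    # a set backed by a concrete collection of members
    return OIntSet(is_empty=(len(xs) == 0), contains=lambda i: i in xs)

# an infinite set backed by a numeric predicate
even_numbers = OIntSet(is_empty=False, contains=lambda i: i % 2 == 0)

def o_union(s, t):
    # the union is empty only when both sets are empty
    return OIntSet(is_empty=s.is_empty and t.is_empty,
                   contains=lambda i: s.contains(i) or t.contains(i))

# mixing representations behind the same interface
sample = o_union(even_numbers, o_from_list(range(10, 21)))
```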

This example is clearly a little bit contrived, so it's probably worth giving some other “real-world” examples where this design approach is useful. Object-oriented programming is generally cited as a good fit for a particular style of user interface programming, because it allows you to define classes of “widgets” that expose a common interface but have different internal representations. You could build a Haskell GUI library in this style by defining Widgets as “objects” with a common interface, something like this:

data Widget = Widget
  { drawWidget :: Ctx -> Position -> IO ()
  , handleEvent :: Event -> IO ()
  }

This is similar to the approach taken by the Brick TUI library, which has its own Widget record.

Why not use objects?

One major concern with this style of data representation is performance and optimization. Consider our original ADT representation for IntSet: it's inefficient, yes, but we can make it more efficient in a number of ways. For example, we could modify it so that, instead of always inserting new elements “at the front”, we can instead insert them in such a way that the internal representation of the set is always sorted lowest-to-highest. This means that we may no longer have to traverse the entire list to check for element membership. Even better, we might swap out the list-like representation for a binary tree representation, maybe doing some rebalancing in certain cases.

There is no way in general to apply these optimizations to the OIntSet-style program. You could define an OIntSet that sits in front of a balanced tree and therefore has faster lookup and insertion, but once it's sitting behind the interface, you no longer have access to those internals. You cannot, for example, write an oUnion operation that rebalances the binary trees behind the two sets it's operating on: it doesn't even know if both sets are backed by trees!

In effect, the major selling point of the “object”-style design here is also a major downside: you don't have guarantees about the specific representation of data, which means your programs can easily mix-and-match different representations, but it also means that your program can't make use of representation-specific knowledge in ways that are advantageous.

There's another major concern as well, and that's that the specific choice of “object” representation can make a big difference in terms of what operations you can and cannot support. Look back at OIntSet—I was able to define oUnion, but what about oIntersection? It turns out that it's not actually possible using the specific representation I've chosen3:

oIntersection :: OIntSet -> OIntSet -> OIntSet
oIntersection set set' = OIntSet
  { oIsEmpty = {- ??? -}
  , oContains = \i -> oContains set i && oContains set' i
  }

How do I implement oIsEmpty? I might naïvely try to write the inverse of oUnion and define it as oIsEmpty set || oIsEmpty set', but that's not at all what I want: the intersection of the set of even numbers and the set of odd numbers is an empty set, but neither the even nor the odd numbers are empty, so this would incorrectly compute their intersection as non-empty.
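That failure is easy to demonstrate concretely. A sketch in Python (with my own names): the even and odd numbers are both non-empty, yet the naive rule claims their intersection is non-empty too:

```python
# Sketch: the naive is_empty rule for intersections misfires.
# Here an "object" is just a dict holding a flag and a closure.
evens = {"is_empty": False, "contains": lambda i: i % 2 == 0}
odds  = {"is_empty": False, "contains": lambda i: i % 2 == 1}

def naive_intersection(s, t):
    return {
        # the naive rule: empty only if either input is empty — but a
        # true intersection can be empty when neither input is!
        "is_empty": s["is_empty"] or t["is_empty"],
        "contains": lambda i: s["contains"](i) and t["contains"](i),
    }

both = naive_intersection(evens, odds)
# `both` claims to be non-empty, yet no integer is in it
```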

This is an artifact of the specific interface chosen for the set. I could modify the interface and be able to recapture this behavior, but almost any choice I make is going to have different repercussions: for example, I could add a method to enumerate all the values contained in the set, at which point I now have a convenient way to find out whether the intersection of two sets is indeed empty… but now I have made infinite sets significantly more difficult to define!

This is another face of the performance problem: the specific interface chosen is going to have far-reaching ramifications not only on what operations are efficient or inefficient, but on what operations are possible to write at all.

Why do these count as “objects”?

A lot of definitions of “objects” in the sense of “object-oriented programming” go back to Alan Kay. Kay was a major force behind the Smalltalk programming language, and he once gave this definition of OOP:

OOP to me means only messaging, local retention, and protection and hiding of state-process, and extreme late-binding of all things. It can be done in Smalltalk and in LISP. There are possibly other systems in which this is possible, but I'm not aware of them.

Our treatment of “objects” here does not fit this definition, but neither do most languages that are typically called “object-oriented”. In particular, “late-binding” here means that methods are not looked up until runtime: in a proper Smalltalk-like system, this lookup would be done by name, meaning that even virtual dispatch in a language like C++ or Java does not count. You can use languages like Ruby or Python in a way that matches this definition, but they're not typically used this way. Many object-oriented languages are also somewhat lax with respect to protection of local information: Python is a big offender here, as its instance variables are typically made private by convention rather than by a language mechanism! And of course, almost none of the modern OOP languages are built strictly around messaging.

However, many of these languages are considered “object-oriented” because they try to capture the advantages of these features while not adhering strictly to them. Late-binding and all-you-can-do-is-send-a-message systems require some complicated machinery in order to implement efficiently, because otherwise the pervasive lookup of methods can become a source of slowdown, so many systems use virtual dispatch instead of extreme late-binding. Similarly, many systems do not adhere to strict information-hiding, but allow some public information for various conveniences (e.g. non-private variables in C++ or Java). In many ways, these are design decisions which help sandbag against the problems with “objects” described above. If we're being sticklers about the Alan Kay definition, we might call these languages “object-oriented-ish”.

The Cook definition is something of a distillation of several of the properties which are used by these “object-oriented-ish” languages. It encodes virtual dispatch: all operations here are higher-order functions, so you cannot know which you are calling until you are passed a function to call. It encodes hiding of local information in a way that's arguably stricter than Java or C++: an external consumer of this code cannot depend on anything but the provided functions, so any “local” information is necessarily hidden. If used appropriately, it can encode protection of state-process: the above example is pure, but if our “methods” had a type like IO (), we could include methods which update internal state (perhaps contained in an IORef) in response to other operations.

Alan Kay's definition is valuable, not just because it describes a powerful approach to programming as implemented in object-oriented languages like Smalltalk and Io, but also because it describes the behavior of a number of systems that are not programming languages while capturing what makes those systems powerful: for example, Microsoft's Component Object Model, Plan 9's 9P file system protocol, and arguably even HTTP itself are, from a certain point of view, systems built around extremely late-bound message-passing. But I would argue that Cook's definition is also valuable, because it describes both languages in wide use and a pattern of data modeling in other systems.


  1. I once read a blog post that complained about zippers—a pure functional interface to certain data structures, one that's not even embedded in a monad or some other side-effect-ey abstraction—by claiming that they were “imperative”. Nonsense! This was clearly an example of a value judgment masquerading as some kind of technical truth, and a pretty shallow one at that. I'd be interested in an analysis of the advantages and disadvantages of using zippers, but “I think this abstraction is imperative, therefore it is bad,” is not that!
  2. Cook's paper uses an ML-like pseudocode for these examples, and also includes more detail: I've pared them down a bit.
  3. Actually, if you look closely at my representation, it is possible, just not practical: in Haskell, Int is a machine integer with a finite number of values, so we could enumerate every possible Int value and check whether it's present in both sets. That means I could implement this operation as oIsEmpty = or [ oContains set i && oContains set' i | i <- [minBound..maxBound]], which just needs to check 2^64 possible values on most modern processor architectures!

Last year I decided to use the Advent of Code to practice writing code using the Pony programming language, and the first few days I struggled with error-handling—it took several days before an approach finally clicked with me. This blog post describes my thought process as I worked toward that approach. (Don't worry, I'm not going to spoil any of the problems: I use two small out-of-context code snippets from my answers here, but you won't get any insight into the problems or their solutions from them!)

Pony is a language that I've been interested in for a while. It's an object-oriented language designed around the actor model of parallel programming, and it uses a sophisticated type system to prevent things like data races in parallel code. Rust is well-known for the various static guarantees that it gives you regarding data parallelism: Pony gives you even more guarantees by using static capabilities that permit or restrict various operations within or across actors.1 It's a fascinating (albeit complicated) system, and I'd encourage reading through the Pony tutorial to get an idea of the flavor. (You'll see hints of these features in the example code here, but I'm mostly going to gloss over them: they're not what I want to focus on here.)

Pony is also not much of a language for hastily-constructed one-off programs, the kind you often want to write in coding challenges like the Advent of Code. Pony demands that you handle errors very explicitly, and makes it impossible to use values without handling errors somehow, or to compile programs that have unhandled errors.

“Ah, like Rust or Haskell?” you might ask, but not exactly. While Rust's Result-based error handling does have this property, Rust also trivially lets you defer errors to runtime: you can scatter unwrap anywhere, or expect a good result and panic otherwise. And Haskell? Haskell is full of partial functions, of errors and undefineds, of thrown exceptions from ostensibly pure code! By Pony's exacting error-handling standards, Rust might as well be the wild west, and Haskell is Fury Road. No, if I call a function that purports to return an integer in Rust or Haskell, that function may instead slam its way back up the call stack (or, in Haskell's case, the lazy soup of thunks that it has instead of a call stack) with a panic or an exception.

Pony, however, has no such easy escape hatches. Because of the realities of physical computers, yes, a Pony method may crash for reasons outside its control, like failing hardware or solar rays mysteriously flipping a bit in RAM, and because Pony is a Turing-complete language, a Pony function might still loop forever. But there's no convenient unwrap or error to be found in Pony: besides infinite loops and the computational equivalent of acts of God, a Pony method that claims to return an integer will return an integer.

When Code Goes Wrong

Of course, we do need to report errors somehow, so Pony does have a way of raising errors! It's just a very explicit way of raising errors. In Pony, methods can be partial, which is a way of explicitly indicating that the method might not return a value. All partial functions are marked with a question mark, and within a partial function you can raise an error using (of course) the error keyword. Let's work with a simple example: a factorial function which accepts signed integers, and thus needs to indicate an error condition when its argument is negative. (This factorial example is contained within a class called Math, but given that it's a class with no fields, you can think of it here as though it's just a namespace that contains the method.)

class Math
  fun factorial(n: I64): I64 ? =>
    match n
      | 0 => 1
      | let x: I64 if x > 0 =>
          x * factorial(x-1)?
    else
      error
    end

Notice that both the definition and the use of factorial need a ?. All partial functions, both when defined and when used, need to have a ? afterwards: Pony will not compile our program otherwise. So clearly we need to be explicit both when we create and when we propagate errors, but what about when we want to recover from them? For that, we use a try block, in which we can call partial functions, and then an else clause afterwards that will run if anything in the try block produced an error. Here is a main method for our program that tries to print a factorial but will print an error message if its argument happens to be negative. (It won't be in this case, as we've passed the constant 5, but Pony won't let us compile our program unless we have handled every possible partial function!)

actor Main
  new create(env: Env) =>
    try
      env.out.print(Format.int[I64](Math.factorial(5)?))
    else
      env.out.print("Something went wrong!")
    end

This looks pretty uninterestingly like exception-handling in most mainstream languages today. But when I raised an error, I just did so with error: what if instead I want to throw or catch a specific exception? Say I want to call my factorial function on one of the arguments passed to the program on the command line. There's more that can go wrong with this program: maybe the user didn't pass enough arguments, or maybe they passed an argument but it was a non-numeric string, or maybe it was numeric but it represented a negative number. What do we do?

actor Main
  new create(env: Env) =>
    try
      let arg_str = env.args(1)?    // try to fetch args[1]
      let arg = arg_str.i64()?      // try to convert it to an integer
      let fact = Math.factorial(arg)? // try to compute its factorial
      env.out.print(Format.int[I64](fact))
    else
      env.out.print("Something went wrong!")
    end

We can handle all of them, but if we want to distinguish between them, we're out of luck. Pony does not allow you to throw different exceptions or catch specific ones: all you get is error and an else case to recover from it. All partial functions raise the same error when observed from the outside2. What's the alternative?

One alternative is to go the functional route: use a return value. For example, Pony allows us to define primitives, which are types that have only a single value. Primitive types have many uses, one of which is that we can use them like error tags. Pony already does something like this extensively: for example, it uses the None primitive as a return value sort of like Rust's or Haskell's (), but also combines it with union types to act like Rust's Option or Haskell's Maybe. Let's rewrite our factorial as a total function, one that might return a number or might return a custom error tag which I'll name ArgIsNegative.

primitive ArgIsNegative

class Math
  fun factorial(n: I64): (I64 | ArgIsNegative) =>
    match n
      | 0 => 1
      | let x: I64 if x > 0 =>
          let r = match factorial(x-1)
            | let y: I64 => y
            | ArgIsNegative => return ArgIsNegative
            end
          x * r
    else
      ArgIsNegative
    end

Well, that's... clunkier than I'd hope. Specifically, our recursive call in the middle there became much more verbose. What happened?

This approach is not terribly different from Rust's or Haskell's error-handling, as both of those languages use explicit Result or Either types. The difference is that both of those languages have syntactic sugar to make working with those types less onerous: do-notation in Haskell and the ? operator in Rust. If we didn't have ? in Rust, we would have to do similar clunky pattern-matching, because every recursive call might not return a value, and we would need to pattern-match on the result to either extract the successful argument or propagate the error:

struct ArgIsNegative;

fn factorial(n: i64) -> Result<i64, ArgIsNegative> {
  match n {
    0 => Ok(1),
    _ if n > 0 => {
      let r = match factorial(n-1) {
        Ok(y) => y,
        Err(e) => return Err(e),
      };
      Ok(n * r)
    }
    _ => Err(ArgIsNegative),
  }
}

But Rust does have syntactic sugar for that, so a practical version of this in Rust is a lot more readable:

fn factorial(n: i64) -> Result<i64, ArgIsNegative> {
  match n {
    0 => Ok(1),
    _ if n > 0 => Ok(n * factorial(n-1)?),
    _ => Err(ArgIsNegative),
  }
}

Pony lacks that kind of sugar, which is why using this pattern in Pony is more of a hassle. Not impossible by any means, but you end up with some much more difficult-to-read code. So when I started writing my advent-of-code examples, I struggled with this: I wanted informative errors so that I could understand where my code went wrong and how to improve it, but manually matching on and returning these values was a chore. And it was worse than the toy example of factorial: consider this (slightly modified) method from my actual solution which was supposed to fetch three values, all of which might not exist in the input, and wrap them in a convenient wrapper type:

  fun ref build_operands(): (Operands | Error) =>
    let lhs = match get_value(offset+1)
    | let v: Value => v
    | let e: Error => return e
    end
    let rhs = match get_value(offset+2)
    | let v: Value => v
    | let e: Error => return e
    end
    let tgt = match read_storage(offset+3)
    | let i: I64 => i
    | let e: Error => return e
    end
    Operands(lhs, rhs, tgt)

That's a lot of boilerplate for a conceptually simple operation: call three helper functions, propagating errors if any of them fails, and then wrapping the result values in an object if they all succeeded!
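For contrast, here's roughly what that same operation looks like in a language that does have propagation sugar. This is a hedged sketch in Rust: `get_value`, `read_storage`, `Operands`, and `Error` are hypothetical stand-ins for the real helpers in my solution, stubbed out just enough to compile.

```rust
// Hypothetical stand-ins for the helpers and types in the original
// program; the real ones live elsewhere in the advent-of-code solution.
#[derive(Debug, PartialEq)]
struct Operands(i64, i64, i64);

#[derive(Debug, PartialEq)]
struct Error(String);

fn get_value(offset: usize) -> Result<i64, Error> {
    // stubbed: pretend we successfully read a value at `offset`
    Ok(offset as i64 * 10)
}

fn read_storage(offset: usize) -> Result<i64, Error> {
    // stubbed: pretend we successfully read a storage index
    Ok(offset as i64)
}

// With `?`, each fallible call propagates its error automatically,
// so the whole method collapses back into a single expression.
fn build_operands(offset: usize) -> Result<Operands, Error> {
    Ok(Operands(
        get_value(offset + 1)?,
        get_value(offset + 2)?,
        read_storage(offset + 3)?,
    ))
}
```

The three explicit match-and-return blocks disappear entirely; the early-exit control flow is hidden inside `?`.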

How I Learned To Stop Worrying And Love Side-Effects

That's when I realized that I was thinking like a functional programmer, which was making me produce worse code.

Fundamentally, what did I want out of my errors in this program? I wanted two basic things:

  1. I wanted code that could abort when an error condition happened.
  2. I wanted to know which error condition happened and to transmit back some information about why.

In a functional or functional-inspired setting like Haskell or Rust, you tend to break down problems into individual expressions that reify program states as values. A successful program state is represented by a value tagged as 'successful' (called Result::Ok in Rust or Right in Haskell) and a failed program state is represented by a value tagged as 'failure' (called Result::Err in Rust or Left in Haskell). But Pony is also an object-oriented actor-model language, which means it naturally allows us to think about systems with encapsulated state.

While functional programming endeavors to make programming more tractable by removing all state changes and pushing you towards functional purity, object-oriented programming endeavors to make programming more tractable by encapsulating state changes into smaller chunks that are easy to understand in isolation, ideally chunks whose encapsulated state is either trivially reasoned about or even non-observable. Both functional programming and object-oriented programming are intended to sidestep the need for side-effecting assignment statements: in a functional setting by reifying everything into values that never change, and in an object-oriented setting by allowing objects to manage and guard their own internal state in response to external messages. In fact, Alan Kay even said as much in The Early History of Smalltalk:

Though OOP came from many motivations, two were central. The large scale one was to find a better module scheme for complex systems involving hiding of details, and the small scale one was to find a more flexible version of assignment, and then to try to eliminate it altogether.

So let's stop thinking about factorial as an individual pure function, and instead let's put it into a system which might contain encapsulated state: that is to say, an object. In particular, this object can include a queue of errors which it has encountered. Let's also put the meat of factorial into a private method (indicated in Pony with a leading underscore) and write a slightly different public method which calls into our internal implementation:

class Math
  // a mutable queue of error messages
  var _errors: Array[String] = Array[String]()

  // our internal implementation
  fun ref _fact(n: I64): I64 ? =>
    if n == 0 then
      1
    elseif n > 0 then
      n * _fact(n-1)?
    else
      _errors.push("Negative argument: " + Format.int[I64](n))
      error
    end

  // our external interface
  fun ref factorial(n: I64): (I64 | Array[String]) =>
    try _fact(n)? else _errors end

This isn't exactly error-handling the way we traditionally think about it. Really what we're doing here is logging: when we encounter an error, we add it to our error queue and then bail. We can use the simpler partial function handling (which litters some question marks around our code but doesn't require the verbose explicit control flow of earlier examples) and yet an external caller can still get informative errors about what went wrong. The core of this approach is that it separates out the two error-handling concerns I described above: exiting early is accomplished with Pony's implementation of partial functions, while reporting useful errors is now handled by state that's only observable internally. An external caller shouldn't even know that any kind of state change was happening: it just gets either a number or an error, and calling it multiple times with the same argument should produce identical return values: it is still an externally pure function.3
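To show the shape of this pattern outside of Pony, here's a rough Rust sketch of the same idea. The names here are my own invention, and `Option` stands in for Pony's payload-free partial functions: `None` means "something failed," with the details logged in internal state.

```rust
// A sketch of the "log internally, abort with a bare error" pattern.
// `Option` plays the role of Pony's partial function: `None` carries
// no payload, because the reasons are recorded in `errors`.
struct Math {
    errors: Vec<String>,
}

impl Math {
    fn new() -> Self {
        Math { errors: Vec::new() }
    }

    // internal implementation: logs the reason, then bails with `None`
    fn fact_inner(&mut self, n: i64) -> Option<i64> {
        if n == 0 {
            Some(1)
        } else if n > 0 {
            Some(n * self.fact_inner(n - 1)?)
        } else {
            self.errors.push(format!("Negative argument: {}", n));
            None
        }
    }

    // external interface: either the result, or the accumulated messages
    fn factorial(&mut self, n: i64) -> Result<i64, Vec<String>> {
        match self.fact_inner(n) {
            Some(r) => Ok(r),
            // hand back the log and clear it, as in footnote 3
            None => Err(std::mem::take(&mut self.errors)),
        }
    }
}
```

From the outside, `factorial` still behaves like a pure function: the same input always produces the same output, and the caller never observes the internal mutation.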

In the above example, we only ever had a single possible error message, but this approach scales easily as we add more errors. Let's say we wrap our argument-handling logic into this factorial class, too, so that it takes the argument list from the command line and tries to print the factorial of the first provided argument:

class Factorial
  var errors: Array[String] = Array[String]()

  fun ref _fact(n: I64): I64 ? =>
    if n == 0 then
      1
    elseif n > 0 then
      n * _fact(n-1)?
    else
      errors.push("Negative argument: " + Format.int[I64](n))
      error
    end

  fun ref _fact_main(args: Array[String] val): I64 ? =>
    let arg_str = try args(1)? else
      errors.push("Not enough arguments"); error
    end
    let arg = try arg_str.i64()? else
      errors.push("Non-numeric argument: " + arg_str); error
    end
    _fact(arg)?

  fun ref main(env: Env) =>
    try
      env.out.print(Format.int[I64](_fact_main(env.args)?))
    else
      for e in errors.values() do
        env.out.print(e)
      end
    end

In this case, I'm using strings for my error messages, but there's no reason I couldn't have typed errors, as well. I could, for example, define an Error interface and push structured data instead. This trivial example doesn't show it, either, but the biggest win here is not from the place where you throw the error or the place where you catch it, but rather the intermediate places where you would need to propagate errors. What does the build_operands function I showed before look like with this scheme? It's about as simple as Pony lets you get:

  fun ref build_operands(): Operands ? =>
    Operands(
      get_value(offset+1)?,
      get_value(offset+2)?,
      read_storage(offset+3)?
    )

It might fail, but the error messages are squirreled away elsewhere and you are no longer required to handle those explicitly, so all you need to do is sprinkle in a ? to propagate the error! The entry point into this whole system can handle the error reporting, abstracting away the existence of the error queue entirely.
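As a sketch of the "structured data instead of strings" variant mentioned above, here's what a typed error log might look like. This is Rust rather than Pony, and the `AppError` variants and helper names are invented for illustration, not taken from any real program:

```rust
// Structured errors instead of strings: the log now carries data
// a caller can match on, not just text to print.
#[derive(Debug, PartialEq)]
enum AppError {
    NotEnoughArguments,
    NonNumericArgument(String),
    NegativeArgument(i64),
}

struct Log {
    errors: Vec<AppError>,
}

impl Log {
    fn new() -> Self {
        Log { errors: Vec::new() }
    }

    // record a structured error and signal failure in one step
    fn fail<T>(&mut self, e: AppError) -> Option<T> {
        self.errors.push(e);
        None
    }
}

// Mirrors the argument-handling logic from the Factorial example.
fn parse_arg(log: &mut Log, args: &[&str]) -> Option<i64> {
    let s = match args.get(1) {
        Some(s) => *s,
        None => return log.fail(AppError::NotEnoughArguments),
    };
    match s.parse::<i64>() {
        Ok(n) if n >= 0 => Some(n),
        Ok(n) => log.fail(AppError::NegativeArgument(n)),
        Err(_) => log.fail(AppError::NonNumericArgument(s.to_string())),
    }
}
```

The calling code stays just as terse as the string version; only the consumer of the log changes, since it can now match on the variants instead of parsing messages.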

The Point Of No Return

There's actually a bigger win to handling errors this way: this approach also extends naturally to a parallel setting.

Remember I said at the beginning that Pony is an actor-model language? If we declare an actor instead of a class, then we can define not just methods (using the fun keyword) but also “behaviors” (using the be keyword). A behavior definition looks a lot like a method definition, but they're not identical semantically: calling a behavior is an asynchronous operation, so it's not guaranteed to run immediately and you won't wait for it to finish before continuing4. Because of this, behaviors can't return anything: after all, by the time a behavior is running, the original calling code will probably have moved on!

Pony does guarantee that a given instance of an actor will never run more than one behavior at a time. Each behavior invocation is added to a queue for the actor on which it's invoked, which means that the actor can freely mutate its own local state without fear of data races or anything: behaviors are sequential with respect to the actor they're on, but behaviors on different actors will run in parallel.

A consequence of this is that we can't rely on return values to communicate information between different actors, which means Result-like error handling was never going to be useful in that setting anyway. Instead, actors must invoke behaviors and pass around values in order to communicate information with each other. That means that our error-handling-as-logging system is already well-suited to working in this setting!

One thing we could do, for example, is write our own “error log” actor. We give this actor access to an OutStream (which might be a handle to stdout or stderr or maybe a file) which it can use to report errors. This actor has two behaviors: one of them is err, which adds an error to its log, and the other is flush, which writes all the errors it's seen to whatever stream it was initialized with:

actor ErrorLog
  var _errors: Array[String] = Array[String]()
  let _stream: OutStream

  new create(stream: OutStream) =>
    _stream = stream

  be err(msg: String) =>
    _errors.push(msg)

  be flush() =>
    for msg in _errors.values() do
      _stream.print(msg)
    end
    _errors = Array[String]()

With our ErrorLog in hand, we can now rewrite our factorial to log those errors asynchronously. In this case (for simplicity) I've left factorial as a synchronous method, but we could also rewrite it as its own actor which can compute its value in parallel, reporting it to one actor if it succeeds and reporting errors to our ErrorLog if it fails.

class Math
  // factorial, now reporting errors to a log
  fun factorial(n: I64, log: ErrorLog): I64 ? =>
    if n == 0 then
      1
    elseif n > 0 then
      n * factorial(n-1, log)?
    else
      log.err("Negative argument: " + Format.int[I64](n))
      error
    end

actor Main
  new create(env: Env) =>
    let logger = ErrorLog.create(env.err)
    try
      let n = Math.factorial(-5, logger)?
      env.out.print(Format.int[I64](n))
    else
      logger.flush()
    end
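For readers more comfortable outside Pony, the ErrorLog actor can be loosely approximated in Rust with a thread that owns the log and a channel standing in for the actor's mailbox. This is only a rough analogy of my own (Pony's actor semantics are richer than a bare channel), but it shows why a per-actor message queue makes unsynchronized mutation safe:

```rust
use std::sync::mpsc;
use std::thread;

// Messages our "actor" understands: like Pony behaviors, callers
// fire these off asynchronously and never wait for a reply.
enum LogMsg {
    Err(String),
    Flush,
}

// Spawn a logging "actor": a thread that owns its state outright and
// receives messages through a channel, which serializes them the way
// an actor's mailbox does. `out` stands in for Pony's OutStream.
fn spawn_error_log(out: mpsc::Sender<String>) -> mpsc::Sender<LogMsg> {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // only this thread touches `errors`, so no locking is needed
        let mut errors: Vec<String> = Vec::new();
        for msg in rx {
            match msg {
                LogMsg::Err(m) => errors.push(m),
                LogMsg::Flush => {
                    for m in errors.drain(..) {
                        let _ = out.send(m);
                    }
                }
            }
        }
    });
    tx
}
```

Dropping the returned sender closes the mailbox and lets the logging thread exit, which is the closest channel analogue to an actor becoming unreachable.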

Is This Actually How You Write Factorial In Pony?

Absolutely not! That'd be ridiculous. A factorial implementation should take an unsigned integer so that factorial will be total and you don't need to worry about getting a negative number in the first place. Or, if your argument really needs to be signed, you can just return 0 for the negative cases and then guard against that happening elsewhere5. One way or another, this whole mess is in no way necessary just to write a factorial!

But this error-handling strategy is completely reasonable, because it both scales and parallelizes to the kinds of programs you're likely to want to write in Pony. Once you start having multiple actors doing multiple things, you're going to need something less like return values and more like logging anyway. It just required a bit of a shift in focus when I first came to it, thinking less in terms of a closed world of expressions, and more in terms of the broader world of independent systems. It's no coincidence that thinking about actor systems running in parallel requires that shift anyway, and Pony simply makes it natural to think in those terms!


  1. While Pony is designed to be a high-performance language, it is not as low-level a language as Rust, nor does it aim to be. There are broad parallels between the two in that Pony has features analogous to Rust's linear types and can use them to get guarantees around parallelism, but it does not give you tight control over things like allocation, and it also requires a larger runtime and a garbage collector. A better point of comparison might be Erlang: imagine building a typed Erlang safely from the ground up and you'd get something much closer to Pony.
  2. There's a long-open RFC about adding typed errors, which may change this in the future, but as of right now we've got nothing.
  3. Even if you don't know Pony, you might notice that I'm lying here! If I call Math.factorial each time, then we'll actually be re-initializing an instance of Math each time and therefore a new _errors array each time, and it'll look fine. However, it's possible to instantiate the Math class and then call factorial multiple times using that same instance of Math: this will result in a growing number of errors, because we haven't cleared the _errors array between calls. There are several ways to fix this, but an easy way takes advantage of a Pony-specific feature: the assignment statement in Pony returns a value, but it specifically returns the previous value of the variable being assigned to. This means, for example, that swapping two variables can be done with a = (b = a): the expression b = a will assign the current value of a to b, and then evaluate to the previous value of b, which we then assign to a. This is an unusual decision, but there's a good reason for it that has to do with Pony's implementation of ownership and uniqueness, where assignment ends up serving a purpose similar to Rust's std::mem::swap. In our program, though, it means we can in one terse expression set the _errors array to a new empty array while returning the previous contents of _errors:

      fun ref factorial(n: I64): (I64 | Array[String]) =>
        try _fact(n)? else _errors = Array[String]() end
    
  4. If you're used to Erlang or Elixir, then you can think of behaviors as a statically typed kind of message-sending: each behavior invocation is a message with a particular expected payload of data.
  5. Sure, that means there's an ambiguity the caller needs to distinguish, but it's not an unreasonable way of tackling this problem! You might have heard about one of Pony's more controversial choices, which was to define 1/0 = 0. This is a completely defensible choice, and hopefully you've seen in this article why Pony's very strict error-handling strategy would otherwise necessitate a lot of work to handle every possible division operation.
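As a side note on footnote 3: Pony's return-the-previous-value assignment has a close Rust analogue, which may make the idiom more familiar. A minimal sketch, with hypothetical helper names, using `std::mem::replace` and `std::mem::swap`:

```rust
use std::mem;

// In Pony, assignment evaluates to the variable's *previous* value, so
// `_errors = Array[String]()` empties the log and hands back the old
// contents in one expression. Rust spells that move-out-and-replace
// with std::mem::replace (or mem::take for types with a Default).
fn drain_errors(errors: &mut Vec<String>) -> Vec<String> {
    mem::replace(errors, Vec::new())
}

// Likewise, Pony's `a = (b = a)` swap idiom corresponds to mem::swap.
fn swap_demo(a: &mut i64, b: &mut i64) {
    mem::swap(a, b);
}
```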

Douglas Hofstadter's book Gödel, Escher, Bach: An Eternal Golden Braid is a classic text that's had a strong influence on countless people. It's a venerated book among a certain stripe of computer scientist and mathematician, and it's reputed to have inspired generations of people to pursue the study of logic and artificial intelligence. It's a meandering meditation on a number of topics that oscillates around logic, language, formal systems, biology, neurology, music, art, and many more, with a particularly strong showing from the three titular figures—the logician Kurt Gödel, the artist Maurits Cornelis Escher, and the composer Johann Sebastian Bach—as well as a heaping dose of writer and mathematician Lewis Carroll, who does not appear in the title but, at least on my copy, is explicitly invoked by the subtitle, “A metaphorical fugue on minds and machines in the spirit of Lewis Carroll.” Many people count it as one of their very favorite books.

So, if you're one of those people, I should warn you that I'm about to give it a very lukewarm review.

I first read Gödel, Escher, Bach when I was about 20, while I was an undergraduate, and it's important to note that I was double-majoring in computer science and linguistics, and had a particular love of formal systems and logic, but was also proudly a generalist, and had a long-standing love of literature and music. That particular configuration of interests meant that this book was laser-focused to speak to exactly the things that I loved. It was a perfect book for me!

...well, it would have been, but for some reason I kept struggling to get through it. I thought highly of it, but my secret shame was that my admiration for the book was based mostly on the first two hundred or so pages. It took slogging effort to get myself through the rest, effort sporadically applied over the course of years. I did eventually make my way through the whole thing, but even now, I can't necessarily be sure I've read and absorbed every page, even though I've certainly looked at each one. By the end, my impression of the book was much more reserved: it does have some genuine high points and I can understand how it came to have its classic reputation, but I also felt it had a number of problems I've rarely seen discussed. For me personally, it ended up falling short of its reputation as a sparkling, effervescent text that drew together art, mathematics, and culture, and given how little I've seen this discussed, I wanted to write down why I feel this way.

The book is arranged into chapters, each beginning with a dialogue explaining a concept, often using imaginative metaphor and a Socratic style, which is followed by a more traditional prose exploration of the concepts introduced in the dialogues. The dialogues are a big part of why I originally struggled with the book: they are meandering and long, and they regularly outstay their welcome. The Socratic style is a difficult one to write well without seeming contrived, and the book occasionally manages it, but it often falls incredibly flat: usually, the dialogues feature one character (usually Tortoise) explaining something verbosely, with conversational asides, but otherwise more or less mechanically, while the other character (usually Achilles) simply responds, “I see! Aha!” and follows up with a question that no real learner would ask but happens to be the next thing Hofstadter wants to talk about.

The other sections revisit the same ideas in prose, giving more concrete examples and dispensing with the catechistic form, and as such are able to give much terser and more interesting examples. Many of these are much clearer, and I remember on my first attempt at reading the book I was often tempted to skip the dialogues and read those first, because their explanations were often much more satisfying and took only a fraction of the time to read through. In some cases, the dialogues attempted to use metaphors that were nonsensical or even broken, and only by reading the chapter afterwards did the dialogue make any sense!

A good example here is the book's explanation of Gödel's incompleteness theorem. The high-level, slightly handwavey description of this theorem is that, for any sufficiently expressive mathematical proof system, there are more true facts within the system than there are proofs for facts, which in turn means that not every fact can be proved: in short, that not every mathematical fact has a corresponding mathematical proof. My short explanation papers over a number of important features of this theorem, such as what I meant by a 'sufficiently expressive mathematical system', and all those features are addressed much more rigorously by the actual proof. It's a fascinating theorem, and to Hofstadter's great credit, Gödel, Escher, Bach helped bring knowledge of this theorem to a much wider population.
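For reference, the skeleton of the actual argument can be compressed into a few lines; this is my own loose paraphrase of the standard construction, not Hofstadter's presentation:

```latex
% Loose sketch, for a consistent, sufficiently expressive,
% effectively axiomatized theory $T$:
%
% (1) Arithmetization: each formula $\varphi$ gets a code number
%     $\ulcorner \varphi \urcorner$, so "is provable in $T$" becomes
%     an arithmetic predicate $\mathrm{Prov}_T(x)$.
%
% (2) Diagonalization: one constructs a sentence $G$ such that
\[
  T \vdash \; G \leftrightarrow \neg \mathrm{Prov}_T(\ulcorner G \urcorner)
\]
% so $G$ in effect asserts "I am not provable in $T$."
%
% (3) If $T \vdash G$, then $T$ proves a false claim about its own
%     proofs, contradicting consistency; hence $T \nvdash G$, and so
%     $G$ is a true sentence that $T$ cannot prove.
```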

Unfortunately, when I first began to read the dialogue which touched on the theorem, I was frankly mystified. Hofstadter decided to explain its workings by coming up with a metaphor involving record-players and records that are designed to physically break the record-players they're played on. I'm familiar with how record-players work, but I have never played a record designed to break a record-player! This isn't an intuitive metaphor, because while I have intuition for the operation of records and record-players, I don't have any intuition at all about the universal manufacture of record-player-breaking records. The metaphor raises a number of questions: can the problem of record-player-breaking-records be suitably addressed by redesigning record-players? If no, why not? What if we simply read the information from a record using a visual device with no moving parts? What if we...?

A good metaphor has depth: you can convert a situation into the metaphor, visualize or reason about the implications of the metaphorical situation in isolation of the original situation, and then apply that back and have learned something about the original situation. However, the record-players in this metaphor don't actually work like record-players in the real world, so my own lack of intuition means that reasoning about the original situation via the metaphor is effectively impossible. When I first read the dialogue, I had no idea what was being explained: once I started the following prose chapter, I realized that the “record-players” were formal systems, “records” were theorems designed to be unprovable within those formal systems, and that the whole thing was an awkward physical metaphor for Gödel's incompleteness theorem. In fact, it was only this realization that made me fully grasp the workings of the metaphor in the first place: instead of the metaphor illuminating the theorem, I had to use my knowledge of the actual theorem to grasp what Hofstadter intended for the metaphor!

This is a pretty egregiously bad example, but it was also the point in the book where I realized that I wanted to like the book much more than I actually liked it in practice. I began to read onward and reread past sections with more skepticism, and I realized that the weaknesses which were particularly evident in the dialogue about Gödel's theorem were still partially present in many of the other dialogues. The original inspiration for the dialogue chapters was Lewis Carroll's short allegory What The Tortoise Said To Achilles, which expands on Zeno's paradox of motion to make a point about the foundations of logic. Carroll's dialogue is tight and focused and uses a rather clever metaphor, but the dialogues that punctuate Gödel, Escher, Bach are broad and meandering and the metaphors range from moderately serviceable to (like the one above) actively nonsensical, and the writing style is a mostly-mediocre Carroll pastiche, which means the characters often gratingly pepper their dialogue with interjections like, “Oh, my gracious! Oh, dear me! Oh, but you misunderstand! Go-golly! Oh, but that certainly won't do!” I eventually came to the conclusion that, while the dialogues are one of the more memorable features of the book, they're also an active impediment to conveying much of the book's material in an efficient and clear way.

The non-dialogue chapters, as I've said, are better, although they also range in quality. Many of them are clear, lucid explanations of mathematical concepts intended for a layperson, which often begin by introducing mathematical systems through simple examples, showing what can be done with pure symbol manipulation of those systems, and only afterwards pulling back the curtain to explain what they “mean” in a mathematical sense. The explanations of computational systems have a similar quality, although several of the later chapters feel rather too complicated for their comparatively simple conclusions. On the other hand, the topics that aren't about math or computers (or the shorter bits on the workings of DNA) are introduced in a disappointingly cursory way that mostly consists of handwaving and pictures. Those latter sections lack depth and often betray strikingly little familiarity with or respect for the topic in question.

To give an egregious but illustrative example: Hofstadter mentions the experimental composer John Cage on a number of occasions, often bringing up Cage's modernist and aleatoric work as a counterpoint to the meticulously constructed melodies of Bach. Hofstadter is unsurprisingly negative about Cage's work, and usually characterizes it as avant-garde social commentary masquerading as music, and at one point a dialogue wryly suggests that John Cage might belong in a zoo. John Cage is most famous—or most infamous—for his piece 4'33”, which requires that a performer or group of performers walk onstage and not play their instruments for four minutes and thirty-three seconds. (It properly consists of three individual “movements” of non-playing whose lengths have been inconsistently specified across various editions of the score.) Hofstadter brings this piece up in a dialogue1:

Tortoise: [...John Cage] has composed many celebrated pieces, such as 4'33”, a three-movement piece consisting of silences of different lengths. It's wonderfully expressive—if you like that sort of thing.

Achilles: I can see where if I were in a loud and brash café I might gladly pay to hear Cage's 4'33” on a jukebox. It might afford some relief!

Tortoise: Right—who wants to hear the racket of clinking dishes and jangling silverware?

Tortoise's (and Hofstadter's) explanation of Cage's piece is fairly typical of explanations given of the piece: that is, four minutes and thirty-three seconds of silence. But this is, according to Cage's intentions, a strictly incorrect interpretation of what he was trying to do! Cage's actual intention in creating 4'33” was not to depict pure silence, but rather to force listeners in an auditorium to pay attention to the quiet and subtle sounds which they usually ignore when they listen to music. To make this painfully explicit, here is a quote from Cage about the original premiere of 4'33”:

They missed the point. There's no such thing as silence. What they thought was silence, because they didn't know how to listen, was full of accidental sounds. You could hear the wind stirring outside during the first movement. During the second, raindrops began pattering the roof, and during the third the people themselves made all kinds of interesting sounds as they talked or walked out.

Consequently, when Achilles and Tortoise agree that they'd rather hear silence than the sounds of a café, they're getting the point of 4'33” exactly backwards: a performance of 4'33” in a café should ideally compel you to listen to the “racket of clinking dishes and jangling silverware” with more awareness than usual!

As I said, this isn't an uncommon misunderstanding of John Cage's intentions, and reasonable people can and do differ as to whether 4'33” is a reasonable execution of that intention, or if that intention is reasonable in the first place. However, even with a charitable reading of Gödel, Escher, Bach, it's clear that Hofstadter isn't disputing Cage's artistic intention: instead, he doesn't seem to know what sort of artistic intention Cage actually has, preferring to read his own ideas about social commentary into Cage's work. His understanding of Cage and of the musical context in which Cage works is marked by a lack of context, a lack of deep engagement with the ideas there, and most importantly, a lack of respect. In the preface to my edition, he claims2 that he had “...unambiguously heaped scorn on Cage's music, albeit in a somewhat respectful manner,” but there's very little respect or willingness to meet Cage on Cage's own terms here, only guarded derision, and that lack of engagement ends up weakening every section that tries to discuss John Cage in particular and modernist music in general.

This sort of cursory engagement with the cultural features of the book ends up undermining one of the book's major selling points: I had originally seen it as the work of a polymath effortlessly weaving fields together into a multifaceted but uniform whole, but in reality, areas that are more than a step or two outside Hofstadter's areas of expertise (computer science, formal logic, some of the more mathematically rigorous bits of cognitive science) are at best shallow, and at worst are “...heaping scorn...” on things Hofstadter doesn't understand and doesn't appear to want to understand.

Despite the relatively surface-level interaction Hofstadter has with the world outside of mathematics and computers, he nevertheless loves to drop in thick, multilayered references to such topics in every cranny he can find. The central two characters in the dialogues are Achilles and Tortoise, borrowed directly from Lewis Carroll's story above (which in turn borrowed them from the famous paradoxes of the Greek philosopher Zeno), and a Crab and a Genie show up on occasion as well. Their names are regularly abbreviated to a single letter, which means you can't help but notice that those letters happen to map to the names of the nucleobases that appear in DNA—adenine, thymine, cytosine, and guanine. All the dialogues have sing-song names that are usually inspired by music, such as Sonata for Unaccompanied Achilles or Canon by Intervallic Augmentation or Birthday Cantatatata. Off-hand mentions of people and places are often wry and unexplained cultural allusions: a typical example is that, at one point in a conversation about popcorn, Tortoise awkwardly shoehorns in the story of a “Schönberg factory” in Vienna that outraged consumers by stopping production of a delicious tonic in favor of a boring cereal, this being a nod to the Viennese composer Arnold Schoenberg's transition from traditional melodic compositions to experimental atonal pieces, a nod that never comes up elsewhere in the text or merits any explanation.3

I will admit right away that I personally find these heaps of unnecessary nods more tedious than interesting or endearing. There's a lot of reference, but very little of it means anything. Occasionally, a dialogue will use these references to actually explain something—one dialogue is written in the form of a crab canon, and thus is identical when read forward or backward, as a memorable way of explaining that form of musical canon—but most of the musical or biological or literary allusions are there really for their own sake. These things are rarely being commented on, or discussed in interesting context, or connected to other ideas. Instead, these ideas simply appear because this is a book in which ideas appear, interrupting the text for a shoehorned cameo, like Stan Lee in a comic book movie. Why do the characters' names map to nucleobases? I suspect if you asked Hofstadter, he'd claim it's because one of the themes of the book is that all these ideas are connected (in the titular “golden braid”), but this kind of reference doesn't actually connect anything to anything: it merely presents things adjacent to each other. It's all flavor, no substance.

This might seem unfair, so I'll give a very specific but pervasive instance of this sort of meaningless flavor. Many parts of the book invoke Zen, which is a school of Buddhism that originated in China and has spread to several other Asian countries but, in the Western mind, is usually associated with Japan. (We do, after all, know this school by its Japanese name Zen and not by its Chinese name Chán, its Korean name Seon, or its Vietnamese name Thiền.) Hofstadter's idea of Zen is a substance-less cliché: it consists almost entirely of faux-Eastern “Oriental” aesthetics and some handwaving about kōans (which are stories used in Zen practice for teaching and meditation) without really delving into any particular aspect of actual Zen thought or practice. There are lots of references to it—for example, he names a theorem MUMON, after the Japanese name of Zen master Wúmén Huìkāi, as part of a vague and largely pun-based connection to one of Wúmén's collected kōans—but none of those references have any substantial connection to the history or practice of Zen. In reality, Zen is a religious sect with history and cultural context and a complicated, multifaceted conversation that has been carried on throughout centuries. In Hofstadter's telling, Zen is just some funny nonsensical stories from Japan.

The edition I have includes a preface in which Hofstadter talks, twenty years after the book's release, about the book itself, its reception, and its legacy, and he goes out of his way to complain about a negative review of the book which accused him of being a hippie trying to popularize Zen. Hofstadter objects to this review because

As I declare at the start of Chapter 9, I find Zen not only confusing and silly, but on a very deep level utterly inimical to my core beliefs. However, I also find Zen's silliness—especially when it gets really silly—quite amusing, even refreshing, and it was simply fun for me to sprinkle a bit of Eastern spice into my basically very Western casserole. However, my having sprinkled little traces of Zen here and there does not mean that I am a Zen monk in sheep's clothing.

In this passage, Hofstadter openly admits to exactly the charge I'm bringing: that his inclusion of Zen is all about clichéd aesthetics (“Eastern spice”) and not at all about any of its substance—in this case, because he apparently doesn't seem to think it has any!

What Hofstadter doesn't admit to, but what I would argue, is that the whole book does this with almost every non-mathematical topic it tackles. His explanations of mathematics-adjacent topics do have substance and are often reasonably well-explained, but every time he branches out, he doesn't seem to realize that he's regurgitating shallow, half-misunderstood cliché: his discussions of modern art and music are, as I mentioned before, deeply lacking in this regard, but he name-checks plenty of artists, musicians, and writers with a high school understanding of who they were and what they did, preferring to pepper the text with photos of wacky paintings, drawings he made of letters that are made up of other letters, and tales of half-understood kōans. They're all spice: his “casserole” is a few insubstantial layers of food underneath inch-thick layers of spices.

This also presents a problem with the entire underlying program of the book: it's supposed to present examples of a common important idea—self-reference—resurfacing throughout various disparate areas, including mathematics and computation and art and music. But while this idea is well-motivated in the parts about mathematics and computation, most of the other topics the book tackles end up being just shallow aesthetics, so the “deep connections” there can only be present in shallow aesthetic ways. This was, for me, the ultimate breakdown of the promise of the book: the grand unifying theme—the titular “eternal golden braid” of self-referential structures across domains—was only capable of unifying a few problem domains, and the rest of those connections were pretty but ultimately insubstantial.

While rereading bits of the book in order to write this post, I flipped through it at random and came across the photos marked as Figure 81, which in my copy at least appears on pages 490 and 491. These pages contain photos of television screens that are in turn displaying geometric images that result from pointing a camera at the screen: infinite turning shapes, nested and sweeping frames, eventually deforming into twirling light patterns. They are described in captions, beginning with plain descriptions like, “What happens when you rotate the camera,” and gradually becoming more florid, with captions like, “The galaxy has burned itself out, and become—a black hole!” The actual caption beneath these photos says the following:

Twelve self-engulfing TV screens. I would have included one more, had 13 not been prime.

These photos are fun! They feel especially endearing in 2018 because of the late-70's television depicted, and they depict a fun experiment that I did as a child as soon as I got my hands on my parents' bulky camcorder4. The captions, however, add very little, and the final comment (“...had 13 not been prime”) includes a bit of extra unnecessary whimsy that seems to wink at the reader but adds absolutely no meaning. Like so much of the book, it seems to hint at something grander while not signifying anything in particular. The photos themselves might be a fun illustration of something, but they're not a particularly deep illustration of anything, and their inclusion here (surrounded by several pages of Achilles and Tortoise pontificating about the notion of “self-engulfing”) doesn't bring any more enlightenment than when I first pointed a camcorder at a TV when I was four.

“A fun illustration of something” is pretty much as far as the book goes: it hints at grand unifying patterns, but the pattern it finds is just the abstract notion of self-reference, and then it keeps bringing it up, making a few unnecessary references, showing some pictures, and asking, “Isn't that cool? Isn't that weird?” It'll give a perfectly competent (if somewhat verbose) description of formal systems, but as soon as it tries to venture connections to other domains, or to explain more complicated or nuanced details, it turns out that the only connections it can draw consist of vigorous handwaving. The whole book boils down to Hofstadter giving a competent lecture on logic, intimating the existence of an eternal golden braid, and then pointing at some fun photos of televisions.


  1. This appears on page 156 of my copy.
  2. This appears on page P-18 of my copy, as part of a response to a critic who mistakenly believed that Hofstadter liked Cage.
  3. I happen to love atonal music, and I'd highly recommend watching Vi Hart's video Twelve Tones which explains twelve-tone atonal music with plenty of examples and drawings of bird-bowls.
  4. Do young people know what a camcorder is these days? Do people still use the word 'camcorder'?

I first came across Rust back in 2010 or 2011, and it was a very different language than the one it is today, both syntactically and semantically. I remember at the time that newcomers would often complain loudly about the terse keywords—like the fact that the return keyword had been shortened to ret—and the omnipresent tildes scattered throughout the language like fallen leaves in autumn. My programming background was in functional languages—specifically in Scheme and Haskell—and I found this language fascinating, sitting in an interesting and unexplored place in the spectrum of programming languages and bringing something genuinely new to the table.

As I followed its development, I fell more and more in love with it. The core developers were pragmatic and thoughtful, and while I wouldn't always have made the same decisions as them, I always felt like they were making decisions that were well-thought-out and reflected a deep appreciation for both the piece of technology they had created as well as the community that was sprouting up around it. But more than that: I felt like the decisions reflected principles that I, as an engineer, found to be important.

For example, one such decision—a wildly contentious one when it happened!—was the removal of the ~ syntax. Back before 2014, the type ~T represented a unique pointer to a value of type T on the heap, and the expression ~expr allocated a value on the heap and returned a unique pointer to it. Rust removed these entirely: the type became Box<T>, and the corresponding expression became Box::new(expr). There was some forum discussion about whether this was happening in order to introduce syntactic salt which would make heap allocation more painful, but the primary motivation, as described in the RFC, was different: it was removing the special case ~T in favor of a single more general mechanism, both to accommodate a larger number of use-cases (such as parameterizing boxed values by an allocator) and to remove redundancy in the language. The text of the RFC even suggests leaving the ~ in the language, but proposes removing it “[...in] the spirit of having one way to do things”.

This wasn't the only instance of such a decision: another good example is the evolution of Rust's closure types. Rust's closure system evolved multiple times and there were many proposals for how to develop it, but my memory of Rust at the time involved (for example) “stack closures” (which meant they could be passed to functions but never returned from functions) and procs (which took ownership of the things they closed over and could therefore only be called once). The syntax used to create them was different, their types were different, and they were treated differently and specially by the compiler. Eventually, Rust switched to its current system of unboxed closures, which are subject to the same borrowing rules as other types and use traits like Fn, FnMut and FnOnce in order to abstract over their function-like behavior. Again, this takes something which used to be a special case and turns it into something general, built in terms of powerful existing building-blocks of Rust.
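As a small sketch of what that generality buys (my own example, not from any RFC or official documentation): a perfectly ordinary generic function can accept any closure at all, simply by bounding its type parameter with one of those traits.

```rust
// Closures are ordinary values whose types implement Fn/FnMut/FnOnce,
// so abstracting over them is just regular generic programming.
fn call_twice<F>(f: F, x: i32) -> i32
where
    F: Fn(i32) -> i32,
{
    f(f(x))
}

fn main() {
    let add_three = |n| n + 3;
    assert_eq!(call_twice(add_three, 1), 7);
}
```

No special closure kinds, no dedicated compiler machinery visible to the user: just a trait bound like any other.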

If you want to see the results of these changes, look at the current version of the Periodic Table of Rust Types, which looks now like a very rote mechanical chart, and compare it to the first version of the same chart from January of 2014, which features wild irregularities and special cases. The fact that Rust would take such pains to pare the language down to powerful-but-orthogonal abstractions was, by and large, my favorite feature of the language.

I should be clear about why I like this so much: I would argue that the process of taking special cases and turning them into expressions of general properties is not merely a nice aesthetic property, but in fact one of the most important things we do as software engineers and especially as programming language designers. This is the entire purpose of abstraction: it allows us to build tools which can broadly apply to many situations, and in a programming language, it allows us to build languages which have a smaller surface area and are therefore easier not just to learn—which must only be done once—but also to remember and to reason about—which must be done perpetually as we use them. A programmer writing in a language must model, in their head, the meaning and execution of the language, and consequently, a language composed of small, straightforward parts is going to be easier to model than a language composed of large numbers of special cases and situation-specific abstractions.

All of this is why I'm pretty dismayed by the current direction of Rust.

The recent contentious discussion is about a much-bikeshedded piece of sugar usually called try fn. The motivation for try fn has to do with the often-repetitive nature of handling errors in Rust using the Result type. Rust has several pieces of built-in support for writing functions that return Result<T, E> to represent either a successful return value of type T (represented as Ok(expr)) or an error of type E (represented as Err(expr)). For example, the sugar expr? requires that expr is a Result value, and will desugar to this:

match expr {
  Ok(v) => v,
  Err(e) => return Err(e.into()),
}

That is to say: in the Ok case, it'll turn a Result<T, E> into the T it contains, but in the Err case, it will return early from the enclosing function with an E value.
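As a concrete sketch of that behavior (my own contrived example, not one from the proposal): a function that parses a string and doubles the result, where ? does all the early-return plumbing.

```rust
use std::num::ParseIntError;

// `s.parse()?` yields the i32 in the Ok case; in the Err case it
// early-returns from `double` with the ParseIntError.
fn double(s: &str) -> Result<i32, ParseIntError> {
    let n: i32 = s.parse()?;
    Ok(n * 2)
}

fn main() {
    assert_eq!(double("21"), Ok(42));
    assert!(double("oops").is_err());
}
```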

The try fn feature extends this use-case to make using Results even simpler. One of the current frustrations is the preponderance of Ok() constructors. Consider that when you're returning a value of type Result<(), E> you will often have to end your block—as well as any “successful” early return—with the slightly unwieldy expression Ok(()):

fn sample<T>(lst: Vec<T>) -> Result<(), String> {
    for x in lst {
        if is_this() {
            return Ok(());
        } else if is_that() {
            return Err("Bad!".to_owned());
        }
    }
    Ok(())
}

A proposal like try fn would introduce a sugar that automatically wraps Ok around successful cases. There are dozens of variations that have been described, but the above example might look something like this:

try fn sample(lst: Vec<T>) -> () throws String {
    for x in lst {
        if is_this() {
            return;
        } else if is_that() {
            Err("Bad!".to_owned())?;
        }
    }
}

To be perfectly honest: I think this sugar is a bad idea. I think that it adds an extra hurdle to learnability1, it obscures the (incredibly simple!) way that Rust's error-handling works, it produces an unfortunate asymmetry between producing Result values (where the shape of the data is implied by the context) and matching on Result values (where you will still use the constructors verbatim), and worst of all, it's only faintly nicer to read than the status quo, making it a fair bit of fuss over a comparatively minor win in readability2. When compared with my earlier examples of changes to Rust, which were about removing special cases in favor of general mechanisms, this change takes a powerful feature implemented in terms of a general mechanism (errors represented as a Result, a plain ol' Rust enum) and turns it into a special case.

A change like this increases the language's surface area and, in aggregate, adds complexity to the language. One might argue that there's a sense in which try fn is a simplifying change: the body of the above function is shorter and simpler than it was without it. But this is only a local simplification: it streamlines the process of writing code by allowing a programmer to ignore some details about the semantics of their program, but those details are still present—they're just not locally evident, and instead are expressed as part of a larger context. A programmer reading code like this now has to be aware of more details of the code in question, as the meaning of an expression like return 5 can only be resolved by looking at the entire function definition. And now there are extra inconsistencies and special cases in the language, which puts a greater burden on the programmer's mental model and introduces more cases to consider in an already large language.

A feature like try fn isn't, taken on its own, a major shift in the language in any way, but it's still very plainly a move towards special cases, and consequently it's a change that runs counter to the things that I loved about Rust in the first place.


  1. There's a very meaningful parallel that various Swift developers brought up: Swift has an Optional type that works very much like Rust's Option or Haskell's Maybe, but in order to make it as familiar as possible, they introduced a number of pieces of syntactic sugar to make working with it as smooth as possible. The unfortunate side-effect of this is that it actually obscures the way that Optional works, as newcomers to the language see the constructors so rarely that they assume that Optional is a built-in compiler feature unlike other algebraic data types! If try fn is implemented in Rust, then my suspicion is that we'll see a whole swath of new Rust developers thinking the same thing here, that try fn is a special error-handling construct and not merely sugar for Result.
  2. I know that Ok(()) is a bit of a weird-looking expression, but I think that could be smoothed over without modifying the language's features by, say, introducing a function ok() that always returns Ok(()). Additionally, libraries like failure already include macros that help in the error case, so I could today rewrite that function as something more or less like:

    fn sample<T>(lst: Vec<T>) -> Result<(), String> {
        for x in lst {
            if is_this() {
                return ok();
            } else if is_that() {
                throw!("Bad!");
            }
        }
        ok()
    }
    

    which is more readable and requires no language modification at all!
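For what it's worth, that hypothetical ok() helper is a one-liner. A sketch (the helper and the check function here are both my own inventions, not part of any existing library):

```rust
// A hypothetical helper, as suggested above: always returns Ok(()).
fn ok<E>() -> Result<(), E> {
    Ok(())
}

fn check(flag: bool) -> Result<(), String> {
    if !flag {
        return Err("Bad!".to_owned());
    }
    ok()
}

fn main() {
    assert_eq!(check(true), Ok(()));
    assert!(check(false).is_err());
}
```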

Something I've felt for a long time is that structural types are underused and underappreciated in modern programming languages. They're not unheard of: plenty of programming languages feature them in prominent or less-than-prominent ways! But I'm always surprised that they don't show up more prominently.

In particular, I feel that structural types combine about 95% of what I want out of dynamic duck typing with about 95% of what I want out of static type-checking: and honestly, that's a good amount of both! This post is intended to be a quick example of what structural types look like and why I sometimes point to them as the static answer to duck typing.

Structural Types vs. Nominal Types

If you're familiar with Java, you know what nominal types look like. Consider the following (contrived) example in Java:

class Cat {
  void speak() {
    System.out.println("meow");
  }
}

class Dog {
  void speak() {
    System.out.println("woof");
  }
}

Both Cat and Dog are classes that, from a programmer's point of view, implement the same interface: both of them have a zero-argument method speak that prints something. However, as written, I can't easily write a function that accepts either a Cat or a Dog and statically rejects other objects which don't have a speak method. From Java's point of view, these are completely different types, regardless of the set of methods they expose. Why? Because they're named different things! I could manually tell Java that they implement a shared interface, but there are no language features that can work over objects of the same “shape” without explicitly indicating that they share that shape.
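To make the contrast concrete, here's roughly what that explicit, name-based route looks like in Java (the Speaker interface and the hear function are my own sketch, not part of the example above):

```java
// The nominal fix: declare a shared interface, and have each class
// opt in to it by name. Without the "implements Speaker" clauses,
// hear() would reject Cat and Dog despite their identical shape.
interface Speaker {
  void speak();
}

class Cat implements Speaker {
  public void speak() {
    System.out.println("meow");
  }
}

class Dog implements Speaker {
  public void speak() {
    System.out.println("woof");
  }
}

public class Main {
  static void hear(Speaker s) {
    s.speak();
  }

  public static void main(String[] args) {
    hear(new Cat()); // prints "meow"
    hear(new Dog()); // prints "woof"
  }
}
```

The interface itself is trivial; the point is that every participating class has to name it explicitly, and any class that forgets to is excluded no matter what methods it has.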

The OCaml language has a different approach to objects. I can define analogous classes in OCaml like this:

class cat = object
  method speak = print_endline "meow"
end
class dog = object
  method speak = print_endline "woof"
end

However, OCaml has a crucial difference in how it treats the types of objects: OCaml cares more about the interface than the name. Consider the function below:

let hear_what_it_has_to_say obj =
  let () = obj#speak in ()

If you're used to curly-brace languages, then this syntax might be unfamiliar: the # operator works like the . operator in Java and other object-oriented languages, so the expression obj#speak is the OCaml equivalent of obj.speak() in most traditional object-oriented languages. If we load this function into an OCaml repl, we can observe the type that OCaml has inferred for this function1:

# hear_what_it_has_to_say;;
- : < speak : unit;  .. >  ->  unit = <fun>

This again might be unfamiliar syntax, so the way to read this type is as follows: hear_what_it_has_to_say is a function which takes an object whose type (represented as the stuff in the angle brackets) has a method called speak which returns unit (more or less like void in Java or C++). The object may have other methods, which is what the .. means. Finally, the function itself returns unit.

In short, this function takes an argument that must be an object that has at least a speak method which doesn't return anything. This means I can call it with both a cat and a dog: after all, they both fit that description!

let () = hear_what_it_has_to_say (new cat)
(* prints "meow" *)
let () = hear_what_it_has_to_say (new dog)
(* prints "woof" *)

Notice that at no point in the above code did I indicate that cat or dog shared an interface: in fact, I didn't define any interfaces at all! I simply created data that had a particular shape, and wrote code that assumed a particular shape, and put them together.

Row Polymorphism

Sometimes when I talk about structural typing, I talk specifically about row types or row polymorphism. This is a particular implementation of structural typing which happens to be convenient and easy to reason about, although other approaches do exist.2

You've already seen an example of row polymorphism: OCaml's object types! The .. in the above type signature is what we would call a “row variable”, a stand-in for “the rest of the object”. In the above instance, both dog and cat had the same interface, but we could define a new object that features a different, larger interface:

class cow = object
  method speak = print_endline "moo"
  method num_stomachs = 4
end

Our cow class now has a method that neither cat nor dog bothered to implement. However, we can still call our hear_what_it_has_to_say method on it without trouble, even though its type is strictly larger than the types of both cat and dog. After all, it still fits the interface!

let () = hear_what_it_has_to_say (new cow)
(* prints "moo" *)

A powerful feature of row types is that we can give intermediate names to structures or parts of structures and use them accordingly. For example, I can write a function like the one above that calls a method and then returns the object it received:

let ecce_orator obj =
  let () = obj#speak in obj

Here's the type that OCaml infers for this:

# ecce_orator;;
- : (< speak : unit; .. > as 'a) -> 'a = <fun>

This type looks mostly like the one before, except that we give the object type a temporary alias (here it is 'a) which allows us to express that the return value of the function is exactly the same as what we got in: that is, if we were passed a cat, we'll get back a cat with all its methods, if we were passed a cow, we'll get back a cow with all its methods, and so forth. This is important, and is one of the things that separates systems like row typing from other approaches to structural types like structural subtyping.3
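To see why that alias matters, consider passing a cow through ecce_orator: because the row variable ties the output type to the input type, the result is still a full cow, extra method and all. (This snippet just recombines the classes and functions defined above into one self-contained program.)

```ocaml
class cow = object
  method speak = print_endline "moo"
  method num_stomachs = 4
end

let ecce_orator obj =
  let () = obj#speak in obj

(* The result of ecce_orator keeps the cow's whole interface,
   so we can still ask for its stomach count. *)
let () =
  let c = ecce_orator (new cow) in
  assert (c#num_stomachs = 4)
```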

Why Does It Matter?

I said near the beginning that structural types give you 95% of what you want from duck typing and 95% of what you want from static typing. For a long time, I suspected that people who were fans of dynamic languages would start to find structurally-typed systems and use them to build new languages which could take advantage of static types while retaining the flexibility of dynamic systems that permit duck-typed interfaces. I recently found a newer language which is a perfect example of exactly this approach: the Crystal language. To demonstrate, here's what the above OCaml snippets look like when translated into Crystal:

class Cat
  def speak() puts "meow" end
end

class Cow
  def speak() puts "moo" end
  def num_stomachs() 4 end
end

def hear_what_it_has_to_say(obj)
  obj.speak
end

hear_what_it_has_to_say(Cat.new)
hear_what_it_has_to_say(Cow.new)

If you know Ruby, this program will look very familiar: it's valid Ruby source! In particular, like the OCaml above, it's a program that can call hear_what_it_has_to_say on any object with a speak method through the magic of duck typing! Amazingly, it's also valid Crystal, and produces exactly the same output. There's an important difference, though: if I were to amend this program with a line like hear_what_it_has_to_say(5), then the Crystal compiler gives me the following compile-time error:

    Error in src/main.cr:19: instantiating 'hear_what_it_has_to_say(Int32)'

    hear_what_it_has_to_say(5)
    ^~~~~~~~~~~~~~~~~~~~~~~

    in src/main.cr:12: undefined method 'speak' for Int32

      obj.speak
          ^~~~~

    Rerun with --error-trace to show a complete error trace.

This is a bit verbose and jargon-heavy, but what it's telling us is that the literal 5 (which Crystal gives the type Int32) doesn't have a method called speak, and therefore the program doesn't type-check. Crystal is doing something very much like what OCaml does here, but it's also doing it while presenting you with an interface that looks a lot like Ruby's: it's specifically designed to enable the use of duck typing while still preventing cases that would end up failing at runtime!

But You Said 95% Up At The Top

Okay, there are a few drawbacks to a system like this. One small cost relative to a more traditional nominally-typed system is performance: it's difficult to implement this kind of type system without some kind of indirection, which is a small but nonetheless present cost. When a function is given an object which has a speak method, it needs to know where the code for that method lives, which means I'll need some kind of function pointer or vtable to point it to the appropriate code, or else I'll have to replicate the method's code with specializations for each data layout we use. In most problem domains, this won't be a problem—but nonetheless, this sort of indirection wouldn't necessarily be required in a system with more rigid types!

A slightly bigger cost is that structural systems like this have slightly weaker static guarantees than a more nominal system. In a Java-like setting, if I accidentally try to call myCat.speka() instead of myCat.speak(), then the compiler can immediately spot a problem right in the function definition: Cat objects don't have a speka method! In the analogous OCaml function, however, I might not find the problem so easily: if I had written the hear_what_it_has_to_say function above with a call to speka instead, the function itself would have been fine: it would just take an object that has a speka method instead of what we intended! Our program as a whole still wouldn't compile, but the error would only arise elsewhere, when we try to pass a cat object to the function. In this case, we're probably safe, but when you start to look at programs across module or compilation unit boundaries, you can see how this sort of error could slip in unnoticed until later compilation units are presented with a nonsensical interface.

Finally, there's the cost relative to traditional dynamic type systems: these structurally-typed systems are often less expressive than pure duck typing! OCaml's approach, for example, doesn't let you branch on whether or not an object has a given method, like Python's hasattr or Ruby's respond_to?: you either use the interface you're given, or you don't. Crystal does let you do this, but the type system becomes more complex and sporadically harder to reason about, and it will regularly (although it didn't come up in my example above) simply give up inference and require you to fill in the types that you want. Still, it's not as straightforward as passing in a thing that implements the stuff you want and letting it quack.
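For reference, here's the sort of runtime capability check in question, sketched in plain Ruby (the describe function is my own example); this branch on respond_to? is exactly what a row-typed system like OCaml's can't express:

```ruby
class Cat
  def speak
    "meow"
  end
end

# Duck typing with a runtime check: ask the object whether it
# responds to :speak before committing to calling it.
def describe(obj)
  obj.respond_to?(:speak) ? obj.speak : "silent"
end

raise "unexpected" unless describe(Cat.new) == "meow"
raise "unexpected" unless describe(5) == "silent"
```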

Of course, in several of these cases, I'm also setting up a bit of an artificial divide: there's nothing wrong with having a system that has features of structural and nominal typing! OCaml does, and this can be very powerful: we can have some parts of the program that work over types whose structure is inferred, and others that work over types whose structure is declared up-front, and both of these can happen in concert. Alternately, we can build gradually typed systems that allow full dynamic types and gradually use more static knowledge to move towards a system like this, for a full spectrum between flexibility and safety.

But even with the drawbacks as described, I contend that systems that use structural types as a modeling tool are a powerful middle step between dynamic and static languages, and they can definitely enable new powerful tools that allow the flexible experimentation of dynamically-typed languages while retaining many of the safety properties provided by statically-typed languages.


  1. I added a tiny bit of extra whitespace to the actual output, for clarity's sake.
  2. The most notable other approach to structural typing is structural subtyping, which I'm not going to go over here, but which also exists in OCaml: the Real World OCaml book has a section on Subtyping Versus Row Polymorphism which explains it in a bit more detail. There are some situations where you might prefer row types, and some where you might prefer a proper subtyping system: it very much depends!
  3. In particular, in a system with subtyping, I might have one type that's a subtype of another: say, a cat which is a subtype of animal. I can write a function which accepts an animal as argument, and pass it a cat, but once I've done that, I've cast that reference to a cat to a broader type, which means the type has lost some information! If that function returns the value it was given, its type can't promise to return a cat—only an animal—so the type system has no way of tying the full structure of the input type to the output type. That's where row typing is a valuable tool!

I want to say a few words on how I write Haskell as of 2017.

Before I go on, I want to make very clear: this is neither a style recommendation nor a judgment on any other style. This is how I write Haskell, and many of the principles below I've come to either because of past experiences with writing or maintaining Haskell, because of how I originally learned or was taught Haskell, or because of experiences I've had with other languages in general.

However, even though I like to adhere to these principles, I don't want to advocate that any of them are universal! You probably write Haskell differently, and that's not just okay but actively good: one of the great things about Haskell as a language is that it's a powerful raw material that permits all kinds of different powerful and expressive idioms, and someone else's Haskell style will definitely take advantage of Haskell in ways mine doesn't.

The Boring Basics

This stuff isn't actually interesting, and I make an effort to stick to the code style of the codebase I'm writing in, but here are the textual details of how I write Haskell in the absence of any other factors:

I prefer two spaces and, whenever possible, to keep code further to the left: that means, for example, that I like to start a new line after the = in data and (if they're longer than one line) after function heads and after the do keyword, in order to 'swing' the indentation back left. I also prefer left-aligning operators as much as possible, and I do like alignment of repeated operators within lines (like :: in record definitions or -> in case-expressions). For example:

data Pair a b = Pair
  { pFirst  :: a
  , pSecond :: b
  } deriving (Eq, Show)

data Bool
  = True
  | False
    deriving (Eq, Show)

printTheBoolAndItsNegation :: Bool -> IO ()
printTheBoolAndItsNegation b = do
  putStrLn ("I got " ++ show b)
  putStrLn ("And `not " ++ show b ++ "` is " ++ show (not b))

A special case of my always-to-the-left principles is that if a function applies a short expression to something which can be 'swung' to the left, like a do block or a list comprehension, then I'll often include the shorter expression on the same line as the function head, and then swing the expression it applies to. That's a bit abstract, so for a concrete example:

fizzBuzz :: IO ()
fizzBuzz = mapM_ print
  [ fizz ++ buzz ++ mbNum
  | n <- [0..100]
  , let fizz = if n `mod` 3 == 0 then "fizz" else ""
  , let buzz = if n `mod` 5 == 0 then "buzz" else ""
  , let mbNum = if null fizz && null buzz then show n else ""
  ]

(I'll also talk about this in more detail later, but I use list comprehensions pretty regularly in day-to-day programming, and I think they're the right choice for a lot of iteration.)

I always camelCase my names. This is largely for consistency—I actually find snake_case and kebab-case to be vastly more readable, and in the absence of an existing community style, I probably would gravitate to them—but it's what Haskell uses, so it's what I use.

I try to name things using names whose length is proportional to the range of the scope in which they appear—that is, fewer characters for a scope of a line or two, more characters if a name shows up for a longer block—but I tend to feel that it's never a bad idea to pick a slightly longer name. I definitely have a few conventions I hold to lightly for particular varieties of names:

  • x:xs for iterating over a generic sequence, and a similar _:_s for iterating over sequences of specific things. (e.g. if I'm iterating over a list of addresses, I might use addr:addrs or, for a single-line function, a:as.)

  • Functions for creating a thing are generally mkThing. I mostly like mk because it's evocative but also only two characters: it gets out of my way.

  • Things which might be Nothing I often give an Mb suffix, specifically when I'm going to discriminate on that same value and pull out the interior value. For example:

    logPathMb <- lookupEnv "LOG_PATH"
    case logPathMb of
      Nothing   -> die "No LOG_PATH provided"
      Just path -> runWithLogPath path
    

Naming is an art that's worth getting good at, and I don't know that I'm great at it, but I do like to think hard about exported names in APIs before I commit to a name.

Extensions

There are some extensions that are so comfortable to me that I almost always turn them on if there's even a minor chance I might need them: specifically, ScopedTypeVariables and OverloadedStrings.
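As a quick illustration of why ScopedTypeVariables earns its keep (a sketch with made-up names): without the extension, a type signature on a where-bound helper introduces fresh type variables, so there's no way to say "the same a as in the outer signature":

```haskell
{-# LANGUAGE ScopedTypeVariables #-}

-- Without ScopedTypeVariables, the `start :: a` signature below
-- would introduce a *new* type variable with no Num constraint
-- and fail to compile; with the extension (plus an explicit
-- forall), it refers to the same `a` as the outer signature.
total :: forall a. Num a => [a] -> a
total = go start
  where
    start :: a
    start = 0
    go :: a -> [a] -> a
    go acc []     = acc
    go acc (x:xs) = go (acc + x) xs
```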

There are a lot of syntactic extensions that I'm okay with if I feel like they're pulling their weight: a good example here is that some code can be made significantly cleaner by application of extensions like MultiWayIf, LambdaCase, and RecordWildCards. I'm not likely to turn those on in a source file unless I feel like not having them is painful. For a concrete example: parsing and reading JSON using the aeson library is much easier and more readable with RecordWildCards enabled, but I wouldn't use it if it made just one function nicer.

I have a similar relationship with some of the type system extensions: RankNTypes, ExistentialQuantification, GADTs, and TypeFamilies can all make code cleaner and more powerful, but again, I'm not going to turn them on unless it becomes very clear that I need them, or that a particular desirable API is straight-up inexpressible without them. Of these, I want to call out TypeFamilies in particular: sometimes, using TypeFamilies, especially closed type families, can make certain abstractions much nicer, as in using a type family to change the types of AST nodes based on the current phase of a compiler.
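To make that last point concrete, here's a hedged, minimal sketch of the phase-indexed AST idea (all the names here are invented for illustration): a closed type family picks the annotation type based on the compiler phase:

```haskell
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE DataKinds #-}

-- Hypothetical compiler phases, promoted to the kind level.
data Phase = Parsed | Typechecked

-- The annotation on an AST node changes with the phase: nothing
-- after parsing, a rendered type after typechecking.
type family Ann (p :: Phase) where
  Ann 'Parsed      = ()
  Ann 'Typechecked = String

data Expr p
  = Lit (Ann p) Int
  | Add (Ann p) (Expr p) (Expr p)

annOf :: Expr p -> Ann p
annOf (Lit a _)   = a
annOf (Add a _ _) = a
```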

A lot of other extensions, though, I've never felt like I needed, and when I've used them I often feel they obscure the problem I'm solving. Many extensions—even some of the ones I've already mentioned—can make refactoring or analyzing code harder, and there are often better tools to reach for in those situations. In particular, almost any extension that exists to help with the definition or selection of more elaborate typeclass instances or contexts—things like FlexibleInstances, FlexibleContexts, MultiParamTypeClasses, ConstraintKinds, and so forth—tends to tip me off to the fact that I'm solving a problem with typeclasses that I should be solving another way. I'll get into that a little more down below, when I talk about my relationship with typeclasses in Haskell code.

(If I'm being honest with myself, I actually kind of like MonadComprehensions—and I have semi-seriously pushed for ApplicativeComprehensions in the past as a better alternative to ApplicativeDo, which I find slightly rickety—but I don't actually use monad comprehensions in my code in practice.)

Broader Principles

Now this section contains what I think is interesting: not, say, how many spaces to indent things, but rather, what kind of higher-level trade-offs to make while writing code.

Haskell records are bad, but use them anyway

I use a lot of records in my code, and the reason is simple: I like giving things names. I know naming things is supposed to be one of the hard problems in computer science, but in this case, I like to think of names as documentation of the purpose of a piece of data: giving a field a name tells me what that field is supposed to be or do. Any time a data type has exactly one constructor, I will try to give every piece of data inside it a field name. If it's a newtype, then it might be a field name like fromFoo, but I still do that anyway.

I should also be clear here: I use record syntax when I have a type with just one constructor. Partial functions are a dangerous thing to include and can silently introduce errors into code, and record selectors defined on only some constructors end up being partial functions! When I say 'records' here, I mean 'single-constructor types with record syntax'.

When creating values of a record type, I effectively always use the record syntax to do so. This is especially useful for records that contain more than two or three fields: sure, for a five-argument constructor, I could always look up what the arguments mean and what order they appear in, but it's so much nicer to provide those by name and have that redundant characterization.

It's also a powerful way of refactoring function calls with more than two or three arguments. Haskell doesn't have named parameters, but they can be faked by bundling the information needed for a call into a single piece of data, and then that allows for e.g. providing defaults that can be overridden with record update syntax. I consider doing this for functions of four or more arguments; by the time I get to about six or seven arguments, not doing this is almost criminal.
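For illustration, here's a sketch of what that fake-named-parameters pattern might look like (the ConnectOpts type and its fields are hypothetical):

```haskell
-- Bundle a function's many parameters into a record...
data ConnectOpts = ConnectOpts
  { coHost    :: String
  , coPort    :: Int
  , coTimeout :: Int   -- seconds
  , coRetries :: Int
  }

-- ...provide a default value...
defaultOpts :: ConnectOpts
defaultOpts = ConnectOpts
  { coHost    = "localhost"
  , coPort    = 8080
  , coTimeout = 30
  , coRetries = 3
  }

connect :: ConnectOpts -> IO ()
connect opts =
  putStrLn ("connecting to " ++ coHost opts ++ ":" ++ show (coPort opts))

-- ...and callers override only the fields they care about,
-- using record update syntax:
--   connect defaultOpts { coHost = "example.com", coRetries = 5 }
```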

This brings me to another point:

Never hesitate to create a type alias or—even better—a type

It doesn't matter if it doesn't get exported, or only gets used by one function, or has a wonky name: types are always good. One of my major pet peeves about Haskell-the-language is that I can't define new types within let bindings or where clauses, like one can in other ML languages, because sometimes I want a new type for just a few lines. Doesn't matter. Make it. This can go a long way towards clarifying data-structure-heavy code, and almost always makes it less susceptible to bugs, too.
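As a small, invented example of the kind of throwaway type I mean: a type used by exactly one function, existing only to name an intermediate shape that would otherwise be an anonymous tuple:

```haskell
-- A hypothetical one-off type: naming the intermediate shape
-- makes the traversal easier to follow than a bare
-- (String, Int, Bool) tuple would be.
data LineStat = LineStat
  { lsText    :: String
  , lsLength  :: Int
  , lsIsBlank :: Bool
  }

summarize :: String -> [LineStat]
summarize input =
  [ LineStat { lsText = l, lsLength = length l, lsIsBlank = null l }
  | l <- lines input
  ]
```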

It doesn't matter if the type I'm defining is 100% isomorphic to some other type, too: a common pattern in code I've written is to include a type which is identical to Maybe, occasionally monomorphized to some particular type, like

data CommandResult
  = SuccessfulResult Text
  | ErrorResult

In my view, a type like this is often better than just using a Maybe Text in these situations: using a Maybe is more flexible, but the flip-side is that it permits a slew of operations that I might not want to be valid on a CommandResult. Additionally, Maybe Text tells a reader little about where it came from—it might or might not have a Text!—but a CommandResult would be localized to a smaller chunk of code.

As a related point:

Specialize for intent

This is one where I run against current Haskell orthodoxy, not to mention popular Haskell linters: while I like exporting polymorphic functions, I often prefer using (comparatively) monomorphic functions, even when they're a monomorphic variation on a more general polymorphic function. A good example here is concatMap, a function which I genuinely like. In many situations, concatMap acts exactly like the monadic (>>=) operator, but with the monadic type monomorphized to []:

concatMap :: Foldable t => (a -> [b]) -> t a -> [b]
(>>=)     :: Monad m    => (a -> m b) -> m a -> m b

The only difference between the two functions (at present) is that concatMap can take an arbitrary Foldable value and turn it into a list. However, code that uses concatMap with an argument whose type is known to also be [a] will raise complaints from linters like hlint, which suggests using (>>=) instead.

This runs contrary to my taste in code: I prefer using concatMap in these situations precisely because it is less generic. The typechecker can easily tell that my use of (>>=) is working on lists, and upon deeper examination of the code I can of course find that out, but that may not be obvious to me, the reader, from a perusal of the textual context of the code: on the other hand, the function concatMap always produces a list, and thus gives a human reader specific, concrete type information, even if that type information is redundant to the type-checker.1

In general, if I'm not explicitly working within an abstraction, I don't want to borrow generic features of that abstraction: I am happy to use ++ instead of <> to concatenate lists, and type-specific length functions instead of a generic Foldable one, and I do not like to rely on the Arrow typeclass unless I am explicitly writing Arrow-polymorphic code. Borrowing an abstraction to do something I can do without that abstraction feels like mixing concerns for me: if I have to keep details of the current code in mind, I'd rather be thinking in terms of what the code is actually doing, rather than in terms of a higher-level abstraction that's not actually abstracting anything.

Of course, if I am writing something that needs to be polymorphic, then of course I will use the polymorphic functions: if I'm writing a snippet of code that I want to be able to use different Monoid or Foldable or Arrow2 instances, then those polymorphic functions make sense. That means that I am using those abstractions as abstractions. Similarly, sometimes an API has been designed so that no monomorphic operations are actually available: an example of this is the excellent trifecta parsing library, which is written to expose most practical operations in terms of a Parsing typeclass defined elsewhere, rather than exporting its own monomorphic operations. (This is, after all, why these abstractions exist in the first place: so that people can provide functionality while abstracting over them!) But, all else being equal, if I have to choose between a monomorphic and a polymorphic version that are otherwise the same function, and I'm working with concrete types that I know, then I will generally prefer the monomorphic function.

Another, related phenomenon is redundant typeclass constraints. As a simple example: say I'm implementing a set using a binary tree, ordering it by the type's Ord instance. An insert operation will of course need to have an Ord constraint, but in theory, my empty set doesn't actually need any such constraint: after all, the Ord constraint is only necessary in insert because I'm comparing the inserted value against existing values in the set. Why include an Ord constraint for empty, where there are no values to even attempt to order?

data Set a = Node a (Set a) (Set a) | Leaf

insert :: Ord a => a -> Set a -> Set a
insert datum Leaf = Node datum Leaf Leaf
insert datum node@(Node c l r)
  | datum < c = Node c (insert datum l) r
  | datum > c = Node c l (insert datum r)
  | otherwise = node

-- This constraint is unnecessary!
empty :: Ord a => Set a
empty = Leaf

In many (although not all) cases, I would prefer to include such a redundant constraint, and the reason, yet again, is communicating programmer intent. While there's nothing stopping me from creating an empty Set of an unorderable type, there is also no practical reason to ever do so: I would never be able to insert or remove anything from it, and I could only ever perform entirely trivial operations on it, such as observing that its size is indeed zero. My intent in exposing this API is for all sets to contain orderable types: why not enforce that intent with these constraints, even if those constraints aren't strictly necessary in the function body?

In many of the above cases, what I am doing is deliberately opting out of abstractions in favor of concrete knowledge of what my program is actually doing. Abstractions can enable powerful behavior, but abstractions can also obscure the simple features of what a program is doing, and all else being equal, I'd rather communicate to a reader, “I am appending two lists,” instead of the more generic, “I am combining two values of a type that implements Monoid.”

Qualify imports, even when it's not strictly necessary

This is a habit that I've come around to after writing several larger pieces of software in Haskell, especially Matterhorn: qualified imports are pretty much always a good thing. At some point, my habit was to carefully curate import lists instead, but there are trade-offs there: it becomes easy to find out where a function comes from by looking at the source file header, but harder to find out where it comes from when looking at the use site. By qualifying imports, I can instead get information at the use site about where an import comes from, which I've found to be more and more useful over time: coming in and reading the center of a module, I can easily observe that this function comes from the module imported as X while this one comes from the module imported as Y.

A side-note here is that sometimes, especially when I'm using only one or two functions from an external module, or when two or more external modules are closely related in their operation, I will import multiple modules qualified under a single namespace. For example, if I need one or two functions from both System.Directory and System.FilePath, then I might import them both to the same shared namespace:

import qualified System.Directory as Sys
import qualified System.FilePath as Sys

I also do this often with Data.Text and Data.Text.IO together. This is something to be done with care, and it's good to err on the side of importing each module with a distinct qualified name, but occasionally it does help readability to keep the number of distinct module names in use in a particular file low.
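To make the use-site benefit concrete, here's a small sketch building on the Sys imports above; a reader doesn't need to know or care which of the two modules each function comes from:

```haskell
-- Sys.doesFileExist comes from System.Directory and
-- Sys.replaceExtension from System.FilePath, but at the use
-- site both just read as "system-ish" operations.
import qualified System.Directory as Sys
import qualified System.FilePath as Sys

backupName :: FilePath -> IO FilePath
backupName path = do
  exists <- Sys.doesFileExist path
  pure (if exists then Sys.replaceExtension path ".bak" else path)
```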

Treat imports and dependencies as a minor liability

External dependencies in general have a cost: it's not a huge one, but a cost nonetheless, and I think it's worth weighing that cost against the benefit pretty actively.

A minor cost—one that matters, but not all that much—is total source size. Haskell programs tend to pull in a lot of dependencies, which means compiling all of them can take a very long time and produce a lot of intermediate stuff, even if a comparatively small amount makes it into the final program. This isn't a dealbreaker—if I need something, I need it!—but it's something to be cognizant of. If my from-scratch build time triples because I imported a single helper function, then maybe that dependency isn't pulling its weight.

A bigger cost—in my mind, anyway—is breakage and control. It's possible (even likely) my code has a bug, and then I can fix it, run my tests, and push the fix. It's also possible that an external dependency has a bug: fixing this is much more difficult. I would have to find the project and either avail myself of its maintainer (who might be busy or overworked or distracted) or figure it out myself: learn the ins and outs of that project, its abstractions, its style, its contribution structure, put in a PR or discuss it on a mailing list, and so forth. And I can freeze dependency versions, sure, but the world goes on without me: APIs break, bugs get introduced, functionality changes, all of which is a minor but present liability.

It's not a dealbreaker, but it's a cost, one that I weigh against the value I'm getting from the library. Should I depend on a web server library for my dynamic site? Well, I'm not gonna write a whole web server from scratch, and if I did, it'd take forever and be worse than an existing one, so the benefits far outweigh the costs. But on the other hand, if I'm pulling in a few shallow helper functions, I might consider replicating their functionality inline instead. There's always a cost: is it worth the weight of the dependency?

A concrete example where I decided, “No, it's not worth the dependency,” is in my config-ini INI parsing library. It includes an API that uses lenses. The lens library is very heavyweight: it includes a lot of transitive dependencies and a significant amount of code in its own right, and even minimal lens-compatible libraries like lens-family-core are comparatively hefty for what I needed: the Lens type alias and the ability to get and set a lens. Redefining those inline takes less than a dozen lines of code. In this case, the cost of duplicating those functions—-easily verified as correct!—-in my module was low enough that I didn't bother importing any lens library at all!
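For the curious, a minimal inline definition along those lines might look like the following (this is the standard van Laarhoven formulation, simplified here to the type-preserving case; it is a sketch, not necessarily the exact code in config-ini):

```haskell
{-# LANGUAGE RankNTypes #-}

import Data.Functor.Const (Const (..))
import Data.Functor.Identity (Identity (..))

-- The van Laarhoven lens type: any lens library's Lens is
-- compatible with this alias, so no dependency is needed.
type Lens s a = forall f. Functor f => (a -> f a) -> s -> f s

-- Getting instantiates f to Const; setting instantiates it
-- to Identity.
view :: Lens s a -> s -> a
view l = getConst . l Const

set :: Lens s a -> a -> s -> s
set l x = runIdentity . l (const (Identity x))

-- A lens for the first element of a pair, as a usage example:
_1 :: Lens (a, b) a
_1 f (a, b) = fmap (\a' -> (a', b)) (f a)
```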

But of course, the Matterhorn chat client I work on does import a lens library. I'm less concerned about the cost of dependencies for a final executable, as it's not going to inflate anyone else's binary size or build process, and we use a significant number of other helper functions. In that case, the cost of duplicating that functionality inline is much, much greater than just using the library.

So my principle here isn't “always minimize dependencies”, but rather, “always remember to weigh the value of the dependency against its cost.”

Iterate with list comprehensions

I'll start out by saying that there are times I wouldn't prefer list comprehensions. For example, if I'm just mapping an existing function over a list, then I'd definitely prefer map f xs to its comprehension cousin [ f x | x <- xs ]: the former is clearer in its intent and doesn't introduce any new short-lived names. Similarly, a quick filter or even a map f . filter g will usually be shorter and clearer than the comprehension [ f x | x <- xs, g x ].

However, there are a lot of situations where a comprehension is much clearer than its hypothetical map- and filter-based equivalent. One of the biggest advantages to comprehensions, at least in my experience, is that they can use refutable patterns in order to simultaneously decompose and filter expressions from a list. Consider the actual definition of the catMaybes function from Data.Maybe:

catMaybes :: [Maybe a] -> [a]
catMaybes ls = [ x | Just x <- ls ]

We could write this using a combination of existing partial functions (i.e. as map fromJust . filter isJust) or other abstractions (>>= maybeToList) or we could write it with manual recursion, but this snippet is, in my opinion, much more terse and readable, and doesn't rely on any other helper functions. This extends to more complicated list processing tasks, as well:

getGoodDogs :: [Person] -> [String]
getGoodDogs people =
  [ name ++ " is a good dog."  -- all dogs are good
  | Person { personPets = pets } <- people
  , Pet { petName = Just name, petType = Dog } <- pets
  ]

For comparison, here's a point-free alternative to this snippet:

getGoodDogs' :: [Person] -> [String]
getGoodDogs'
  = map (++ " is a good dog.")
  . Maybe.catMaybes  -- assuming that Data.Maybe is imported
  . map petName
  . filter petIsDog  -- assuming this is defined
  . concat
  . map personPets

Some people may prefer this style (or might prefer a slightly different variation, e.g. by combining the concat . map personPets into (>>= personPets)), but for my personal preferences, I prefer reading the comprehension-based one: both of them transform a list in a 'step-by-step' manner, but the comprehension-based one has three 'logical' steps that correspond to decomposing each element of the people list, selecting and decomposing each pet from a person's list of pets, and then joining the final string, while the point-free one includes several steps or details related to massaging the relevant types (c.f. the concat and catMaybes lines), and also relies on a handful of other helper definitions that aren't necessary in the comprehension case.

Finally, ParallelListComp is a very good extension, because it provides a single language-level mechanism for what would otherwise require zip and zipWith variants of various arities. (I will admit I'm less sold on TransformListComp, but maybe I haven't found a good use for it yet!)
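A small example of what I mean: the parallel branches step through their lists in lockstep, so one comprehension covers what would otherwise need zipWith3:

```haskell
{-# LANGUAGE ParallelListComp #-}

-- Each `|` branch advances in lockstep, like zip3/zipWith3,
-- but without a separate combinator for each arity.
weightedSum :: [Double] -> [Double] -> [Double] -> [Double]
weightedSum ws xs ys =
  [ w * (x + y)
  | w <- ws
  | x <- xs
  | y <- ys
  ]
```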

Be 100% sure a typeclass is the right choice

I'm not gonna go hard-line like some people and assert that typeclasses are “considered harmful” or that they should be blanket avoided, but I do think they're easy to overuse, and it's easy to reach for one without weighing the upsides and downsides of typeclasses. And what's more, non-typeclass solutions can be surprisingly powerful!

The biggest advantage typeclasses provide is implicit instance selection. This is why typeclasses are such a powerful net win when working with, say, numeric types: without typeclasses, we'd have to have a separate monomorphic + for each numeric type, and that'd get unwieldy pretty fast! With a typeclass, we get that instance selection for free.

But on the other hand, typeclasses necessarily permit at most a single instance per type. In the case of +, that's pretty reasonable, but it's a little bit of a harder proposition when talking about something more like show. There are a lot of ways to turn a value into a String: we might want to control indentation, or padding, or use different representations (e.g. decimal or octal or hexadecimal for numbers), all of which would require elaborate newtype wrappers if we wanted to do them with the existing Show abstraction.
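For instance, even something as small as "print this number in hex" requires a wrapper type just to select a different instance (a contrived sketch):

```haskell
import Numeric (showHex)

-- A newtype whose only job is to pick a different Show
-- representation for the same underlying Int.
newtype Hex = Hex Int

instance Show Hex where
  show (Hex n) = "0x" ++ showHex n ""
```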

Of course, I'm picking on Show when I shouldn't be: Show isn't for tight control over printing, it's for quick programmer-focused output: if I wanted to tightly control how to print a structure, then I should be using a printf-style library or a pretty-printer, not Show. Let's instead talk about a classic example of a typeclass that I think shouldn't be a typeclass: QuickCheck's Arbitrary.

A simplified form of QuickCheck's Arbitrary typeclass looks like

class QC.Arbitrary a where
  arbitrary :: QC.Gen a

where the Gen monad encapsulates non-deterministic generation of structures along with some extra information about the 'size' and 'depth' of what's being generated. In practice, that means that an Arbitrary instance for a type is going to look pretty mechanical: most of the fields can be initialized with the result of a simple call to arbitrary:

data Pet = Pet
  { petName :: Maybe String
  , petType :: PetType
  , petAge  :: Int
  }

instance QC.Arbitrary Pet where
  arbitrary =
    Pet <$> QC.arbitrary
        <*> QC.arbitrary
        <*> QC.arbitrary

Ah, but wait, I say to myself in this contrived scenario! My application's validation code only ever creates Pet values with non-negative ages, so most of my QuickCheck tests—the ones that test internal application functionality—are going to only be interested in valid Pet values. I will modify my instance accordingly: in this case, I can use the QuickCheck-provided Positive newtype, whose Arbitrary instance will only ever create positive numbers, in order to only ever generate pets with non-negative ages:

instance QC.Arbitrary Pet where
  arbitrary =
    Pet <$> QC.arbitrary
        <*> QC.arbitrary
        <*> (QC.getPositive <$> QC.arbitrary)

Ah, but sometimes I do want to run QuickCheck tests over invalid Pet values, just in case. Well, what to do now? I guess I need a newtype wrapper over Pet, too, so I can have one instance for valid Pet values and another for possibly-invalid ones:

-- lies; all pets are valid
newtype InvalidPet = InvalidPet { fromInvalidPet :: Pet }
instance QC.Arbitrary InvalidPet where
  arbitrary =
    (InvalidPet . Pet) <$> QC.arbitrary
                       <*> QC.arbitrary
                       <*> QC.arbitrary

Ah, but also pet names aren't going to be arbitrary Unicode sequences; they're going to be validated so that people don't name their pets emoji or the empty string or something like that, so now I have two axes along which a pet might be invalid. Do I need four newtypes, one for each combination in the cross-product of configuration options I want?

I'm of course being obtuse: my obvious choice here would be not to create an Arbitrary instance for Pet, but rather to create a function that returns a Gen value that I can use explicitly:

genPet :: Bool -> Bool -> QC.Gen Pet
genPet validName validAge = Pet <$> name <*> QC.arbitrary <*> age
  where
    name | validName = genValidName
         | otherwise = QC.arbitrary
    age | validAge  = QC.getPositive <$> QC.arbitrary
        | otherwise = QC.arbitrary

While somewhat contrived, I don't think the above is an unlikely scenario. In most situations, I wouldn't want just a single, non-modifiable generator for a type: I'd almost always want a generator that has some controls, or one of several different generators depending on context. Even with simple types like the built-in numeric types, I would want to generate numbers that have various properties that aren't inherent to the type: an Int that's always non-negative, or that's a valid index into a list, or something like that. QuickCheck offers various newtypes to work around this, but even still, one often has to resort to other generation methods (e.g. by filtering by a check or generating from a list of possibilities instead of using the bare Arbitrary instance.)

All that is to say: the Arbitrary typeclass makes sense if the choice of Gen t is always a function of the type t, but in practice, that's rarely true.

I've established now that some uses of a typeclass like Arbitrary are probably ill-advised, or at least could be better served by using named functions: but of course, QuickCheck does have many named functions of that sort which coexist peacefully with Arbitrary instances. And there's another big factor in the use of the typeclass: the convenience of being able to use a single value name (arbitrary) instead of looking up a different name for every generator function. There is something very pleasant about being able to write (,) <$> QC.arbitrary <*> QC.arbitrary: no need to look up the right function name for every type, and using Gen becomes almost mechanical!

But in the long term, I would still prefer an explicit version. Consider what the above, naïve generator—the one that only generates 'valid' pets—would look like using the API provided by hedgehog, a newer QuickCheck-like library that does not include a typeclass analogous to Arbitrary:

genValidPet :: Gen.Gen Pet
genValidPet =
  Pet <$> Gen.maybe (Gen.string (Range.linear 1 50))
      <*> Gen.element [Dog, Cat]
      <*> Gen.int (Range.linear 0 30)

Notice that every Gen function is explicit about exactly what it's generating: a Maybe wrapper around a String of a length that's at least 1 and at most 50; one of the two values Dog or Cat; an integer in the range 0 to 30. This is definitely more tedious to write than the typeclass-based version: I can't just spam arbitrarys separated by <*> until it compiles, now I gotta make sure I know the right function names, and it's noisier and bigger, and there's more information per line...

But if I haven't touched this code in weeks or months or years, and I come back to it: I can clearly see what it's doing. I don't have to go consult the definition of Pet to find out what the types are and what instances are getting called, because I know exactly what functions are getting invoked at every point.

That is to say: the implicitness of the Arbitrary approach is optimized for writing, but the explicitness of manual functions is optimized for reading and remembering.

And I haven't gone into the other places where typeclasses can fall down: for example, the way that typeclass resolution can sometimes fail, the various complicated extensions that can be used to get around typeclass selection, and the voluminous, sometimes inscrutable error messages that they can generate. All those are factors, sure, but they're not the main reason that I try not to reach for typeclasses: even if all those things were solved via some kind of technical magic or a tower of LANGUAGE pragmas, I still would try to use explicit functions instead of typeclasses in my Haskell code.

I also should reiterate that I still use typeclasses, and sometimes I do want them! There are times that the explicitness is too heavy a cost, or the instance selection maintains invariants that I want to have maintained, or the code is just plain better-looking. But I also think that there are many places in Haskell where a typeclass is used and an explicit function or record of functions would be preferable, which is why my personal tendency is to avoid writing a new typeclass until I'm sure that what I need is a new typeclass.
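To sketch what "a record of functions" looks like in practice (with invented names): the dictionary becomes an ordinary value, so a single type can have as many "instances" as I care to define, and they're all chosen explicitly at the call site:

```haskell
-- An explicit dictionary: what a typeclass would make implicit
-- is here just a value that gets built, passed, and swapped.
data Printer a = Printer
  { render :: a -> String
  }

decimalPrinter :: Printer Int
decimalPrinter = Printer { render = show }

-- A second "instance" for the same type, no newtype required:
paddedPrinter :: Int -> Printer Int
paddedPrinter width = Printer
  { render = \n ->
      let s = show n
      in replicate (width - length s) '0' ++ s
  }

report :: Printer Int -> [Int] -> String
report p = unlines . map (render p)
```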

In Summary

There's a unifying theme to a lot of my Haskell style, and it is this: be explicit and use names. Why pervasively use record fields? I'm being explicit about the purpose of those pieces of data by giving those fields a name. Why qualify imports? I'm being explicit about their provenance by giving their provenance a name. Why use functions or records of functions instead of typeclasses? I'm being explicit about what functionality I'm dispatching to by giving that functionality a name. A related theme is: optimize for reading code later. If it takes longer to write or uses more lines or more variable names, but it's going to be clearer to another person reading my code—or to me, coming back to that source file in a month or a year—I will absolutely take the time to write the longer code.

But also, like a mediocre high-school essay writer, I want to reiterate what I wrote at the beginning: this is one of many styles, and it has its own tradeoffs! (Verbosity being a big one here.) Your style might be (and probably is) different, and that's a good thing: in fact, I'd be interested in reading analogous posts to this one that describe the specifics of and motivation for other people's Haskell styles! I don't think there's a “wrong way” to write Haskell: there's a multiplicity of possibilities, and we as Haskell users should embrace that!

...well, unless y'all use TypeInType, in which case you're clearly depraved in ways that medical science cannot fix.


  1. Of course, sometimes using type information isn't redundant: occasionally, there are situations where some function is so polymorphic that Haskell can't even figure out the appropriate concrete type, and then using a monomorphic function is helpful not just to a human reader but is actually helpful to the type-checker! But that's just icing on the cake: even if the typechecker has more than enough information to infer the appropriate type, I still would often prefer to use the monomorphic versions.
  2. Y'know, if there were other Arrow instances that mattered for some reason.

This post is a bit more opinion-heavy than my usual fare here, but I wanted to lay out my position on this pretty clearly so I can point to it in the future.

I recently saw a comment on Twitter which irked me a bit. I'm not going to link to the original—I certainly don't want to start the kind of argument Twitter is infamous for!—but the comment went:

From a Curry-Howard perspective, “I like programming but I don't like maths.” means “I like programming but I don't like programming.”

I objected to this on Twitter as well, but it's hard to be nuanced or insightful in 140 (or even 280) characters, so I'm writing this slightly longer, hopefully clearer explanation of what it is that bothers me about this sentiment.

My first objection is that this sentiment draws an analogy that is materially and practically incorrect: when someone says, “I like programming,” it is a safe assumption that they mean, “I like sitting down at a computer and using various tools to describe software that I can then run.” On the other hand, when someone says, “I don't like maths,” it is a safe assumption that they mean, “I don't like performing operations in formal systems in my head or on paper in order to solve problems.” These two tasks are, in material and concrete terms, not the same: the way you set about them is often different, the artifacts you produce and the goals you have are often different, and the problems that arise in the course of performing them are often different! (I rarely have to debug segfaults when I do pen-and-paper mathematics, for example.)

You might object that programming and math do have much in common: in order to write a good program, you have to simulate a formal system in your head before you run your program, just as in mathematics, and you can also perform mathematical tasks using a computer-aided system (e.g. with an interactive theorem prover like Coq or Lean). And there are some common goals, common artifacts, and common problems between the two. All that is true: programming and math have a common structure, and there are strong overlapping areas of practice, and I would argue that we should make the practices of both computing and mathematics more like each other! But for many people who do math or programming or both, the two activities are decidedly not identical.

Consequently, a reasonable interpretation of the sentence, “I like programming but I don't like maths,” might be, “I like making a computer do things but I don't like simulating a formal system on paper,” and indeed, if someone said this to me sans context, this is probably what I'd understand it to mean!

My second objection is that it's a somewhat broad (although not at all unusual) overstatement of what the Curry-Howard Isomorphism actually is. The Curry-Howard isomorphism is generally stated as, “Proofs are programs and programs are proofs.” Already, we can see a problem: the comment in question didn't say 'proofs', it said 'maths'. There are many things that a person might refer to as 'maths' that are not just proof-writing.

And even that short statement of Curry-Howard is a bit reductive: the Curry-Howard isomorphism is actually an isomorphism between proof calculi like natural deduction or sequent calculus, and static type systems for models of computation like the simply typed lambda calculus. Many programming languages—especially statically typed functional languages like Haskell and OCaml, but by no means only them—have type systems that correspond to reasonable proof systems, but many other programming languages used in practice have static type systems that correspond to unsound or nonsensical proof systems, and many other programming languages used in practice are dynamically typed and therefore have little if any meaning when viewed through the lens of Curry-Howard!
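To make the correspondence concrete, here's a small illustrative sketch of my own (in Rust, and not drawn from the original discussion): a sufficiently polymorphic type reads as a logical proposition, and a total function of that type serves as its proof.

```rust
// A polymorphic type read as a proposition: "for all A and B,
// (A and B) implies A." This function is a proof of it, precisely
// because it works for every A and B without inspecting them.
pub fn fst<A, B>(pair: (A, B)) -> A {
    pair.0
}

// Modus ponens: "from A, and A implies B, conclude B." The proof
// is just function application.
pub fn apply<A, B, F: FnOnce(A) -> B>(premise: A, implication: F) -> B {
    implication(premise)
}
```

This reading only makes sense because the type system constrains what the function can do; in a dynamically typed language there is no such proposition to point at in the first place.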

So Curry-Howard is not really about all programming, nor is it about all math. An accurate statement might be, “All programs in typed languages correspond to proofs in a formal proof system,” but that might be excessively pedantic, and it's rather a lot of characters to tweet. “All programs are proofs,” is reductive, but probably fine in context. “All programming is math,” is a clear step beyond what the Curry-Howard Isomorphism actually means.

Now, to be clear, the Curry-Howard isomorphism is an incredibly powerful idea, and I genuinely think that working programmers should be familiar with it. But assembly-language programming is also programming, after all, and it's not isomorphic to proof-writing in the same way that Haskell programming is.

My third and biggest objection to this statement is much, much simpler: it's divisive and, well, smug. While I nitpicked the phrasing above for being over-broad and reductive, I did carefully mention that there's still truth to it: someone steeped in the kinds of programming and mathematics where Curry-Howard is commonly cited likely wouldn't bat an eye at the quote! Programming (of a certain kind) and mathematics (of a certain kind) are deeply intertwined to the point of being slightly different views on the same underlying concepts and activity—and to the extent that they're not, I'm sure the originator of the quote would agree with me that we should endeavor to bring them closer together in both tooling and practice.

But it's important to note that not everyone does this kind of programming or this kind of math. For a person whose exposure to mathematics is calculus and statistics, and whose exposure to programming is Python and R—a likely scenario for many people who do scientific computing!—the sentence above comes across at best as obscurantist and pedantic, at worst as nonsensical. That was true even for several working programmers and mathematicians to whom I mentioned this quote. Its phrasing is clearly intended to wow people, to evince a deep and perhaps surprising truth about the practices of math and programming: “Programming and math are exactly the same thing, have I blown your mind yet?” But the quote is radically non-applicable to the experience that many people have had of mathematics and of programming. When such a person reads the statement in question, it comes across as a self-satisfied kind of ivory-tower sentiment.

Which—not at all coincidentally!—is exactly the impression that many working programmers have of functional programmers.

The person who wrote the original comment apparently saw my criticism—at least, he identified that criticism existed, and I can only assume that he was referring to mine—and his rebuttal was this:

What good does it do you for programming not to be a mathematical activity?

This comment is a pretty radical misreading of what I had said: I was specifically criticizing the rhetoric, not the content, and my comment had even identified this kind of statement as coming from “...people I in general basically agree with.” But even so, the content of this restatement is not actually the same as the content of his earlier statement: the first one glibly identified programming as identical to mathematics while explicitly minimizing any difference between them, while the second comment claims programming is a mathematical activity, a much weaker, less contentious, and less divisive statement.

Is programming a mathematical activity? Very often, yes! And I would contend that our work as programmers should be to make programming even more of a mathematical activity than it is already. We should build proof tools around code and simultaneously make code more amenable to proving. We should educate programmers about the kinds of mathematics that can make their day-to-day job of programming easier and more powerful. We should elucidate the existing mathematical structure they might not realize is already present in what they're doing. And we should build new mathematical abstractions that can benefit both programmers and mathematicians in new and powerful ways.

Right now, many programmers don't think of themselves as mathematicians. In order to bridge this divide, it's important to reach out and talk to these groups in the language that they understand: exposing the mathematical structure of programming and illustrating through example and practice the power of understanding programming as a mathematical activity.

But if we articulate our position as, “Math and programming are identical, have I blown your mind yet?” we don't come across as people trying to push for a better world, who are trying to share tools and knowledge to make both mathematics and programming better. We come across as preaching sectarians, and we unfortunately become yet another data point for the “functional programmers are smug and out-of-touch” pile.

This is apparently quite a contentious opinion, but after having worked with them for a while, I generally like Rust's modules. However, it's not contentious at all to say that they are complicated and hard to understand for someone looking at them for the first time. So, let's talk about Rust modules, building up some principles for understanding them as we go.

Important note: this blog post represents the state of Rust modules as of the time of writing. There are changes in store, and the module system as used in the Rust of 2019 is gearing up to be pretty different!1

Part One: Modules

Before I get to actual code snippets, let's talk about the first two basic principles:

  • Every named item in Rust—functions, types, values, modules—can be made public by prefixing it with the pub keyword.
  • Every compilation unit—a library or an executable, what Rust calls a crate—contains a 'root' module. For a library, this root module is defined in lib.rs, and for an executable, it's defined in main.rs.

Let's say I make a new fresh Cargo project that's a library. I get a file at src/lib.rs. Every item defined in that file lives in the root module, so let's write a simple function here:

// in src/lib.rs
pub fn hello() {
    println!("Hello!");
}

Inside of this module, we can refer to this function using the unqualified name hello, but we also can use a qualified name, which represents a path through our module hierarchy to get to the name. In this case, hello is exposed through the root module, so the qualified name for hello is ::hello, which leads us to another principle:

  • The root of the current compilation unit's module hierarchy is indicated in qualified paths with an initial ::.

So let's make a new module inside of the root module. Submodules are always indicated with the mod keyword, but (as we'll see) this keyword can be deployed in different ways depending on where we want to implement the module. For now, we'll define the contents of the module inside the same source file, in a curly-brace-delimited block headed by pub mod my_module_name:

// in src/lib.rs
pub mod english {
    // #1
    pub fn hello() {
        println!("Hello!");
    }
}

// #2
pub fn hello() {
    ::english::hello();
}

At this point, the root module contains two public items: ::hello, a function, and ::english, a module. In turn, the module ::english contains a public item: a function ::english::hello. In this case, I'm using fully qualified names for disambiguation: this is useful here because I have two different functions both named hello. Without explicit qualification, the name hello will refer to different things depending on the module scope: when referenced inside the english module, hello refers to the function I've indicated with the comment // #1, and outside of the english module, hello refers to the function I've indicated with the comment // #2.

While there is never 'ambiguity' from the point of view of the Rust compiler, having the same name in different scopes can be confusing for us as readers and writers of code. We can always be explicit and use fully qualified names to alleviate any confusion. For example, we can (although I don't know why you would) write the following code:

// in src/lib.rs
pub mod english {
    pub fn hello() {
        println!("Hello!");
    }

    pub fn outer_hello() {
        ::hello();
    }
}

pub fn hello() {
    ::english::hello();
}

In this example, we've added the new function ::english::outer_hello, which invokes ::hello using a fully qualified path, which in turn invokes ::english::hello using a fully qualified path.

So we've discovered a new principle:

  • Name lookup in expressions is relative to the module in which the expression appears unless the name is fully qualified.

There are also ways of traversing the module hierarchy with relative names. The above example can be rewritten using relative names like so:

// in src/lib.rs
pub mod english {
    pub fn hello() {
        println!("Hello!");
    }

    pub fn outer_hello() {
        super::hello();
    }
}

pub fn hello() {
    english::hello();
}

The expression english::hello() inside of the function ::hello is in the root module, and therefore the name is looked up relative to the root module. The expression inside ::english::outer_hello, however, is resolved relative to the module ::english, so if we want a relative reference to ::hello, we need to first move up a module level with the super keyword in order to access the item we want in the parent module. There's also a self keyword, which allows us to look up names relative to the current module. That keyword is redundant here—we're already looking up names relative to the current module by default!—but we'll find a use for it later on.

  • You can use super and self in qualified names to traverse the module hierarchy in a relative way.

(Note also that super and self only work at the beginning of a path: that means we can't write a perverse relative path like ::english::super::english::hello—not that you'd want to write that anyway!)
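As a quick sketch of that rule (hypothetical code, separate from the running example):

```rust
pub mod english {
    pub fn hello() -> &'static str {
        "Hello!"
    }
}

pub fn greet() -> &'static str {
    // self (and super) may only appear at the *start* of a path...
    self::english::hello()
    // ...so a path like ::english::super::english::hello, with super
    // in the middle, is simply rejected by the compiler.
}
```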

There are two other ways of creating a module, and all of them use the mod keyword in some way. So far, we've defined a module using a curly-brace-delimited block in a single file, but we can also create a module by placing the definitions of its items into a separate file with the same name as the module. In these examples, because our module is called english, we can put its declarations into the file src/english.rs, so our two source files will look like

// in src/lib.rs
pub mod english;

pub fn hello() {
    self::english::hello();
}

and

// in src/english.rs
pub fn hello() {
    println!("Hello!");
}

pub fn call_outer_hello() {
    super::hello();
}

Notice that we've kept the pub mod english declaration in the root module, but we've removed the body of that declaration. At the same time, the contents of src/english.rs are (modulo indentation and the comment) identical to the contents of the english module before, and similarly none of the rest of the code had to change at all. This reorganization is purely for our benefit: the code does not care whether we use an inline module or an external module.

There's a third way we could have organized our code, as well: instead of creating a file named after our module, we could have created a directory named after our module, and included a file called mod.rs inside that directory. Again, we could have equivalently written these modules:

// in src/lib.rs
pub mod english;

pub fn hello() {
    self::english::hello();
}

and

// in src/english/mod.rs
pub fn hello() {
    println!("Hello!");
}

pub fn call_outer_hello() {
    super::hello();
}

Notice that even less had to change this time: the contents of the files are identical to before, and only the on-disk organization has changed! Again, from Rust's point of view, the tree of modules and items we've created here is identical. This final on-disk organization method is more convenient if we want to create other nested submodules that are contained in their own files: if we wanted to have a module ::english::plurals, then we could define the module ::english in the file src/english/mod.rs, and then define the module ::english::plurals in src/english/plurals.rs2. If we used either of the other two on-disk organization methods, then we would either have to write nested module blocks like

pub mod english {
    pub mod plurals {
       // ...
    }
}

Or we would have to have a pub mod plurals { ... } block inside of the source file src/english.rs.

However, like I said, all of these choices are mostly important to us as programmers structuring a project on-disk and maintaining it as a repository; from Rust's point of view, they define an identical namespace. In summary:

  • Modules can be defined in three ways—using lexical pub mod my_module { ... } blocks, using a src/my_module.rs file, or using a src/my_module/mod.rs file—but the choice of approach is immaterial to the namespace structure being constructed.

This can be a powerful way of growing a project: a module might start life as one or two types and functions inside a source file, and as they grow, they can get moved first into another file, and then into a directory that itself contains several more files, and this can happen transparently, without any user of the code (inside or outside my crate) having to be aware of it.

Now we can move on to

Part Two: Extern Crates

The extern crate declaration does two things at once: it specifies a dependency on an external named crate, and it includes that crate's root module as a named module in the current module's namespace.

Let's build some randomness into our example code. We will use the rand crate to get some random numbers, and use them to choose from a list of greetings.

// in src/lib.rs
// importing an external crate in the root
extern crate rand;

// our statically-chosen list of greetings
static GREETINGS: &[&'static str] = &[
    "Hello!", "Heya!", "Greetings!",
];

pub fn hello() {
    // choose a random valid index
    let index = ::rand::random::<usize>() % GREETINGS.len();
    // and print the chosen greeting
    println!("{}", GREETINGS[index]);
}

Notice that the library functions in the rand crate are now available as though they were defined in a module called rand in the root of our crate. By default, the name of the imported module is the same as the name of the crate—but we can change that! If we want a different module structure, we can use the as keyword to rename the imported module to a different name—in this case, let's call it my_rand:

// in src/lib.rs
// importing an external crate in the root, with renaming
extern crate rand as my_rand;

static GREETINGS: &[&'static str] = &[
    "Hello!", "Heya!", "Greetings!",
];

pub fn hello() {
    let index = ::my_rand::random::<usize>() % GREETINGS.len();
    println!("{}", GREETINGS[index]);
}

Now, we've separated out the specification of the external name of the crate we want (the rand crate) from the name we're giving its root module in our internal module hierarchy (::my_rand).

Unlike in many programming languages, we can declare extern crate dependencies anywhere in the module hierarchy we want! So, for example, let's import rand underneath a new module. Notice that we need to include a pub qualifier to make sure the module it creates is accessible outside of the deps module. (Remember that items must be explicitly marked pub to be visible outside their enclosing module!)

// in src/lib.rs
// importing an external crate in a submodule
pub mod deps {
    pub extern crate rand as my_rand;
}

static GREETINGS: &[&'static str] = &[
    "Hello!", "Heya!", "Greetings!",
];

pub fn hello() {
    let index = ::deps::my_rand::random::<usize>() % ::GREETINGS.len();
    println!("{}", ::GREETINGS[index]);
}

This is a pretty stark departure from how most languages think of external dependencies: usually, they just get placed in some kind of root namespace, and you might be able to import them under a different name (like with Python's import module as name syntax). In Rust, you have full control over your own module hierarchy, and so you can insert the root module of an external crate wherever you'd like.

  • The extern crate declaration specifies a dependency on a named external crate and imports that crate's root module as a named module in the current module scope.
  • The name of a module created by extern crate defaults to the name of the crate, but can be explicitly specified with the as keyword.

As an aside: why would you want to do this? Well, one reason is that you might have a lot of external dependencies, and it would be nice to manage them. Imagine that you're writing a video game, and therefore depend on a lot of libraries for things like graphics and physics and so forth. You could therefore imagine wanting to both organize and rename your dependencies in order to manage them, something along the lines of:

pub mod graphics {
    pub extern crate opengl as gl;
    pub extern crate sdl;
}
pub mod img {
    pub extern crate png;
    pub extern crate libjpeg as jpeg;
}

Now you can organize your dependencies in a way that's convenient and structured!

Part Three: Uses

The last thing we haven't talked about is the use declaration, which is given a qualified path and creates a new local name for the item named by that path.

  • The use keyword pulls in names from another scope into the current one.

We often use this with functions or types defined by external crates, so we can pull them into scope with more convenient local names, but it works with any named item, including modules.

It's not uncommon to see declarations like use std::fs;, which allows the module ::std::fs to be accessed locally as fs without the std:: prefix. This brings a module, not a function or type, into the local scope.
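For instance, here's a small sketch of that pattern (the function and the file name are invented for the example):

```rust
// bring the module ::std::fs into scope under the local name fs
use std::fs;
use std::io::Read;

// fs::File is now reachable without the full ::std::fs:: prefix
pub fn read_greeting() -> std::io::Result<String> {
    let mut contents = String::new();
    fs::File::open("greeting.txt")?.read_to_string(&mut contents)?;
    Ok(contents)
}
```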

// in src/lib.rs
extern crate rand as my_rand;

// ...but pulling the random function into scope here
use my_rand::random;

static GREETINGS: &[&'static str] = &[
    "Hello!", "Heya!", "Greetings!",
];

pub fn hello() {
    // we can now use random unqualified
    let index = random::<usize>() % GREETINGS.len();
    println!("{}", GREETINGS[index]);
}

Confusingly, use declarations work differently from expressions, because names in use declarations are always resolved relative to the root. Consider the example below: because the extern crate rand as my_rand was declared in the root module, the fully qualified name of the random function is ::my_rand::random, but when I use that name inside the english module, I can write my_rand::random without the leading ::, because the path is looked up from the root anyway.

// in src/lib.rs
extern crate rand as my_rand;

pub mod english {
    use my_rand::random;

    static GREETINGS: &[&'static str] = &[
        "Hello!", "Heya!", "Greetings!",
    ];

    pub fn hello() {
        let index = random::<usize>() % GREETINGS.len();
        println!("{}", GREETINGS[index]);
    }
}

pub fn hello() {
    english::hello();
}

If I do want to use a name that's relative to the current module, I can use the self keyword in the path, which starts me instead from the module containing the use declaration. In the below example, we've moved the extern crate line inside of the module, so now the my_rand module lives in ::english::my_rand, and we then use an explicit relative path to pull in the function we want.

// in src/lib.rs

pub mod english {
    extern crate rand as my_rand;

    use self::my_rand::random;

    static GREETINGS: &[&'static str] = &[
        "Hello!",
        "Heya!",
        "Greetings!",
    ];

    pub fn hello() {
        let index = random::<usize>() % GREETINGS.len();
        println!("{}", GREETINGS[index]);
    }
}

pub fn hello() {
    english::hello();
}

So our last principle is:

  • Name lookup in use declarations is relative to the root module unless the name is explicitly made relative with the self keyword.

I can also use the pub keyword on uses to pull names into a module while simultaneously exposing those names outside of that module. Sometimes I like to do this to carefully namespace a subset of external names I plan on using, allowing me to group several libraries together, or to pick and choose items from separate but related compilation units into a single module for local use.

pub mod io {
    pub use std::fs::File;
    pub use std::io::*;
}
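Code elsewhere in the crate can then treat that io module as its one-stop namespace for IO. A hypothetical consumer might look like this (repeating the io module so the snippet stands alone, with a function invented for the example):

```rust
pub mod io {
    pub use std::fs::File;
    pub use std::io::*;
}

// every IO name this function needs comes from our own io module
use self::io::{File, Read};

pub fn file_length(path: &str) -> io::Result<u64> {
    let mut buf = Vec::new();
    let n = File::open(path)?.read_to_end(&mut buf)?;
    Ok(n as u64)
}
```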

All The Principles In One Place

To summarize, I'm going to pull all the principles from the above text into a single list:

  • Every named item in Rust—functions, types, values, modules—can be made public by prefixing it with the pub keyword.
  • Every compilation unit—a library or an executable, what Rust calls a crate—contains a 'root' module. For a library, this root module is defined in lib.rs, and for an executable, it's defined in main.rs.
  • The root of the module hierarchy is indicated in qualified paths with an initial ::.
  • Name lookup in expressions is relative to the module in which the expression appears unless the name is fully qualified.
  • You can use super and self in qualified names to traverse the module hierarchy in a relative way.
  • Modules can be defined in three ways—using lexical pub mod my_module { ... } blocks, using a src/my_module.rs file, or using a src/my_module/mod.rs file—but the choice of approach is immaterial to the namespace structure being constructed.
  • The extern crate declaration specifies a dependency on a named external crate and imports that crate's root module as a named module in the current module scope.
  • The name of a module created by extern crate defaults to the name of the crate, but can be explicitly specified with the as keyword.
  • The use keyword pulls in names from another scope into the current one.
  • Name lookup in use declarations is relative to the root module unless the name is explicitly made relative with the self keyword.

This is a lot to take in at once, and it can be confusingly different from other languages, but the biggest difference is that your module tree is always explicit. The drawback is that it's verbose and sometimes offers flexibility that can be unnecessary or confusing, but the advantage is that what's happening is always explicit in the code, and that you can express exactly the hierarchy of names and namespaces that you want.


  1. Is it a pun to say 'gearing up' when talking about Rust?
  2. Or, in src/english/plurals/mod.rs, of course!