Structurally-Typed Condition Handling

Something I've had puttering around in the back of my head for ages now is My Perfect Application Language. There are languages out there that I like more or less, but I still believe that another language could be a more perfect fit for the software I want to write1, and I have a large number of ideas about what features it would include, what features it would omit, how it might be written, and so forth.

One dissatisfaction I have with existing languages is that I think none of them do error-handling terribly well. Graydon Hoare's very good blog post What next? talks about places that programming languages might go in the future, and one thing he notes is that we've got various approaches but none of them feel obviously right. I very much agree and I don't want to pretend I've “solved” error-handling here.

However, I've got a hypothesis about what sort of error-handling system I might want in My Perfect Application Language, and I've never actually written it down in full. I'm not imagining a completely novel error-handling system: instead, I'm imagining taking a few ideas which have been tried and combining them in a somewhat novel way to hopefully combine their strengths and temper their weaknesses. To explain this hypothetical feature, I'm going to break it down into two language design hypotheses:

The first hypothesis: statically declaring the list of possible exceptions which a function can throw, like Java's checked exceptions or C++ exception specifications, can be a good feature so long as they can be optionally inferred and do not need to be explicitly listed by the programmer.

The second hypothesis: we can create an incredibly flexible and powerful error-handling mechanism by implementing a statically typed inferred condition system inspired by Common Lisp's condition system, coupling static inference of error conditions with static inference of restarts.

The rest of this post will explain them in fragments. I should also note that this all hypothetical: I haven't implemented this, and I don't believe any programming language feature can be really justified until it has proven itself in at least medium-scale programs in its intended domain. (In fact, one of the features I'm about to discuss seemed fine in tiny snippets and turned out to be abysmal in large programs.) That said, it's a hypothesis I'd like to test, because I suspect a feature like the one I'm about to describe would be a tremendous boon to the kinds of programs and libraries that I usually write!

Statically inferred checked exceptions

I think it's probably pretty uncontroversial to say that checked exceptions might sound appealing in theory but are downright miserable in practice. The idea is simple: you can statically verify that all error paths are checked if each method enumerates the exceptions that might be thrown. If I call a method foo() that declares no exceptions, then I know for sure that I won't need to catch any of them! If I look in the source and see foo() throws ThatException, then I know for sure that I might need to handle ThatException! The compiler will yell at me if I don't do something about ThatException: I either need to explicitly catch it, or I need to explicitly pass it on by adding throws ThatException to my method. I can sleep easy, knowing that all error paths are handled!

…except, of course, now you need to copy-and-paste throws ThatException again and again and again and again throughout your whole damn codebase, which is ridiculously noisy and nobody wants to do it. (To say nothing of the half-dozen or more exceptions I might want to actually throw!) In practice, Java programmers have pretty universally found checked exceptions to be firmly Not Worth It: either they haphazardly toss the generic and unhelpful throws Exception around all their methods, or they catch any exception without bothering to discriminate in order to silence the warning, or they use runtime errors which don't need to be included in the type. Findings with large codebases have firmly indicated that exception specifications increase toil but don't significantly impact safety.

But I suspect the problem isn't that they can't increase safety in some respects: the problem is that the amount of toil they cause is disproportionate to the safety you get, and in that they tend to push programmers to use them in ways which minimize the toil and therefore also minimizes safety. So what if we got rid of the toil?

Let's imagine a hypothetical programming language with exceptions: the syntax isn't important, so I'm going to borrow a handwavey Rust-like syntax, but I shold be clear that all the examples below are imaginary and in a fictional non-Rust language. What I imagine here is that the default way of a method might look something like this:

// an example function in a fictional language
fn find(haystack: String, needle: String) -> Nat {
  // implementation elided
}

(Wait, why Nat? I also think programming languages should use natural numbers more!)

Ah, but there's nothing in that snippet that mentions exceptions, right? In our hypothetical language, that doesn't mean it won't throw an exception: instead, that means it's allowed to throw any exception! However, the compiler can infer which exceptions it might throw, and it shouldn't be hard to compute: simply accumulate any uncaught exception thrown in the method body or any uncaught exception thrown by a method which is called in the method body. If I asked the compiler for the type of find here, it might tell me a type like

fn find(haystack: String, needle: String) -> Nat
  throws (SubstringNotFound)

So that means the compiler is well aware that find might throw an exception, and in fact knows what exception it might throw. Anything which uses find now can be aware that it might throw SubstringNotFound. That means that, in this hypothetical language, we can write something like

fn example() -> Nat throws () {
  find("foobar", "ee")
}

And now we've given our compiler license to yell at us: we've claimed that example throws no exceptions, but because we're using find, the compiler can correctly point out that we haven't handled SubstringNotFound. Just like in Java, we've got two basic options—either catching SubstringNotFound in the body of example, or adding SubstringNotFound to the list of exceptions thrown—but we've also got a third, substantially more terse option: we can wholly remove throws () and allow it to infer for us whatever it wants to. We can add the specification if we care but otherwise let the compiler handle the rest.

I think there's also more that can be done around expressivity here. For example, perhaps a programmer could choose to explicitly include an exception in the list: a type like fn example() -> Nat throws (SomeException, ...) would mean, “Whatever the compiler infers for the method body here, but it can also throw SomeException even if the compiler didn't infer that one.” One situation you might want this is when prototyping an API: perhaps I know that my API might eventually throw CacheMissException, but I haven't wired up the cache yet, so I'm going to make sure I include that in my type signatures in the appropriate places just in case, and elsewhere use throws () to make sure I handle it in places where I need to.

More usefully, though, I can imagine a syntax for ensuring that specific exceptions aren't included in the inferred type. In this case, a type like fn example() -> Nat throws (!OtherException, ...) would mean, “This throws whatever the compiler infers for the body, but if that inferred set includes OtherException then raise a compile error.” This means you don't need to regularly re-write the set of exceptions for a complicated API which might throw a dozen different specific errors, but you could still say, “I don't want this specific exception to escape, so keep me honest: if example() ever tries to throw OtherException, then yell at me about it.”

In fact, I can imagine wanting to implement this in a way where exception lists, even empty ones like (), will actually implicitly include pervasive “existential” exceptions: for example, exceptions that represent signals like SIGKILL or exceptions that get raised when the process runs out of memory. In that case, fn foo() throws () would be a convenient fiction, because it won't force the programmer to handle out-of-memory errors, but a programmer could write fn foo() throws (!OutOfMemory) to indicate that foo not only doesn't throw any user-written or typical stdlib exceptions, it also promises to handle out-of-memory exceptions that bubbled up from within it. A typical program probably would still probably define fn main() throws (), but a server that's intended to be long-lived might define fn main() throws (!OutOfMemory, !Sigkill) so that the compiler can let you know when you've failed to handle those error conditions.

I haven't implemented this in any language, so it's quite possible it'd still have problems. And I've handwaved away a number of issues that I haven't tried to solve. For example, I haven't tried fully articulating the typing rules for higher-order functions, or how this would interact with typeclasses, or how the compilation model will work: there's a lot to be done, and it might require other limitations or compromises or heuristics in practice. But my hypothesis is that a feature such as this would let people use checked exceptions in a way that includes minimal overhead, allowing programmers to opt-in to useful checks but also get out of their way when it's not useful.

What's this about conditions and restarts, then?

I said there was another part of my hypothesis, and to explain that I'll have to talk about Common Lisp's condition system. Let's start with exceptions and describe in fine detail how they work, because that'll help us understand how conditions differ.

In a language with exceptions, you signal the existence of an error by you constructing a “thing” called an exception, which is usually a value of a specific type. In languages with inheritance these are often values that inherit from a specific base class (although not always—Ruby, for example, allows any value to be thrown) and in languages without inheritance they usually have some kind of error-specific tag (e.g. in Haskell or various ML languages.) These values can be “raised” or “thrown”. When an exception is “raised”, the language runtime will begin to walk up the stack, freeing stack frames as it goes, until it finds a “handler”, a bit of code which matches the appropriate exception type or tag and is attached to a block of error-handling code. If it fails to find a handler, the runtime will usually terminate the program and print a relevant message. If it does find a handler, it resumes from that point, providing the exception to that bit of code.

This is a widely-used mechanism for errors: so much so, in fact, that it can be hard to imagine alternatives. What else could you do? What else would you want out of it?

I'm going to steal an example from Peter Seibel's Practical Common Lisp as motivation, but rewrite it in a curly-bracket syntax for people who aren't as comfortable with Lisp. Imagine that I'm writing a library which is parsing log files in a particular format. I have a function called parse_log_entry which takes a fragment of text and produces a log entry. Say I've written it like this:

fn parse_log_entry(text: String) -> LogEntry {
  if (is_well_formed(text)) {
    return LogEntry::from_text(text);
  } else {
    raise MalformedLogEntry(text);
  }
}

Now the library is about parsing whole log files, so I also expose a function to parse an entire file like this:

fn parse_log_file(f: File) -> List<LogEntry> {
  let mut list = List::new();
  for ln in f.read_lines() {
    list.append(parse_log_entry(ln));
  }
  list
}

This is nice and simple! Unfortunately, if a single entry fails to parse, we'll throw a MalformedLogEntry exception and lose access to the whole log parsed so far! In some applications, maybe that's fine, but I've said we're writing a library, and we want it to be as flexible as possible for an eventual user. Perhaps a user of the library would like us to put a special kind of LogEntry value that represents a malformed entry instead? We could write something like this:

fn parse_log_file(f: File) -> List<LogEntry> {
  let mut list = List::new();
  for ln in f.read_lines() {
    try {
      list.append(parse_log_entry(ln))
    } catch (exn: MalformedLogEntry) {
      list.append(bad_log_entry())
    }
  }
  list
}

Now we handle that error gracefully. But now we're assuming that's how the user wants us to handle that error. Maybe the user wants us to quietly skip that entry instead! We can write that, too:

fn parse_log_file(f: File) -> List<LogEntry> {
  let mut list = List::new();
  for ln in f.read_lines() {
    try {
      list.append(parse_log_entry(ln))
    } catch (_exn: MalformedLogEntry) {
      // do nothing
    }
  }
  list
}

But what if they want us to skip them but write the errors to stdout? Or to a specific logger? Or apply a correction heuristic and try parsing the line again? Or…

Okay, library design is hard, and designing libraries which handle errors in every possible way is really hard. One approach here might be to provide all these as options which can be chosen by the library user. Maybe we expose different methods to the user which each implement different versions of these strategies, like parse_log_file_with_default versus parse_log_file_skip_bad. Maybe we provide one function with lots of optional parameters, like parse_log_file(with_default: ..., skip_invalid: ...). Maybe we just throw our hands in the air and choose one we think is sensible: convention over configuration, right?

On the other hand, if we had a condition system, we would have a really powerful way of allowing the user to choose any of these and more without having to significantly change our interface. What a condition system does is separate out two different concerns that get conflated by exception-handling: what error recovery code does and what error recovery code should be invoked. To begin with, instead of an exception handler, we install what's called a restart and give it a name: this is how we define our recovery code that might run after an error is raised, but it does not guarantee that the error-handling code associated with the restart is actually run. In this case, let's start with logic around skipping entries:

fn parse_log_file(f: File) -> List<LogEntry> {
  let mut list = List::new();
  for ln in f.read_lines() {
    try {
      list.append(parse_log_entry(ln))
    } restart SkipEntry {
      // do nothing
    }
  }
  list
}

This restart block represents a possible recovery strategy in the presence of an error, and SkipError is the name we've given it. However, we're still missing something: we haven't actually told our program to use it. Our library shouldn't be the one to make that choice, so let's imagine that we're calling the parse_log_file library function from some application code. We now tell our application code to handle a condition by invoking the restart we've defined:

fn analyze_log() -> List<LogEntry> {
  try {
    parse_log_file(File::open("my_log.txt"))
  } handle {
    MalformedLogEntry(_) => restart SkipEntry,
  }
}

This is where I'm telling the program which piece of recovery code to use. What I'm saying is, “If we ever come across a situation where our code has produced a MalformedLogEntry, then recover from that by finding the recovery path labeled with SkipEntry and restart from there.”

So far we've only defined that one recovery path. Let's revisit parse_log_entry and add a few more strategies we might use to recover from an error within that function. Unlike SkipEntry above, these also take parameters, which are pieces of information that the handler can supply in the handle blocks:

fn parse_log_entry(text: String) -> LogEntry {
  if (is_well_formed(text)) {
    return LogEntry::from_text(text);
  } else {
    try {
      raise MalformedLogEntry(text);
    } restart UseValue(v: LogEntry) {
      return v;
    } restart RetryParse(new_text: String) {
      return parse_log_entry(new_text);
    }
  }
}

Now we have a total of three possible ways to recover from a MalformedLogEntry: we can invoke the SkipEntry restart which will simply skip past malformed lines, we can use the UseValue restart with a value of type LogEntry to replace the bad log with a different provided one, or we can use the RetryParse restart to supply a new corrected string and attempt the parsing again.

The important thing now is that the library allows all of these restarts to exist simultaneously, but does not specify which to take: that's up to the calling code. Let's change our application code to supply bad_log_entry() as a default value instead; this means an application will still include as many LogEntry values as we had lines, but some are specifically represented as bad ones:

fn analyze_log() -> List<LogEntry> {
  try {
    parse_log_file(File::open("my_log.txt"))
  } handle {
    MalformedLogEntry(_) => restart UseEntry(bad_log_entry()),
  }
}

What if we want to skip the bad ones but still record that we saw them by printing messages to our logger? We can use SkipEntry with some extra handler code, then:

fn analyze_log() -> List<LogEntry> {
  try {
    parse_log_file(File::open("my_log.txt"))
  } handle {
    MalformedLogEntry(text) => {
      logger.log("Found bad log entry: `{}`", text);
      restart SkipEntry;
    }
  }
}

What if we want to try applying a correction heuristic to the first several errors that we see, but exit the program if we see more than an pre-determined “allowable” amount of bad errors? We can use shared state in our handler and the RetryParse restart:

fn analyze_log() -> List<LogEntry> {
  let mut errors = 0;
  try {
    parse_log_file(File::open("my_log.txt"))
  } handle {
    MalformedLogEntry(text) => {
      if (errors < ALLOWED_LOG_ERRORS) {
        errors += 1;
        restart RetryParse(try_correction(text));
      } else {
        logger.log("Encountered too many bad log entries; exiting");
        system.exit(1);
      }
    }
  }
}

Admittedly, this system is definitely more fiddly than exception-handling: you've got more moving parts, what with the error conditions (which in this example we had only one of) plus the named error restarts plus the handler logic. (I suspect this is one reason why language designers haven't bothered porting this into new languages, preferring simpler exception-based errors or explicit result types instead.) But the separation can be incredibly powerful: we no longer need to manually thread state through our API; instead, our API is designed for the “happy path”, but errors can still be handled in a granular way, and what's more, the application has full control over how those errors get handled. Breaking up strategies for error recovering (as named restarts) from how to actually handle errors (as handler strategies) allows for some incredibly simple but powerful API designs.

Statically inferred typed restarts

Okay, so let's put these together. The last section glossed over types, and for good reason: in Common Lisp, there aren't types for restarts. In fact, it's possible for a handler to specify a restart which doesn't exist, in which case it'll produce a different kind of error (a CONTROL-ERROR in Common Lisp parlance) because it wasn't able to find the place where code should resume.

But we could build a statically typed language that implements a condition system, so that the compiler could reject programs which catch non-existent errors or try to resume from non-existent restarts (or supply those restarts with values of the wrong type.) Yet again, the way I'd propose doing this is by allowing errors—or conditions, to use the Common Lisp terminology—but also restarts to be inferred as part of a function's type even when the rest of the type is explicitly written.

I think in this situation, the code we wrote up with above (without any type information) should still be valid, in the same way that the example code at the start of the post could infer possible exceptions if you didn't explicitly try to specify them. In this case, if we asked our hypothetical language to infer the relevant types, we might end up with something like this:

fn parse_log_entry(text: String) -> LogEntry
  throws (MalformedLogEntry(String))
  restarts (UseValue(LogEntry), RetryParse(String))

fn parse_log_file(file: File) -> List<LogEntry>
  throws (MalformedLogEntry(String))
  restarts (UseValue(LogEntry), RetryParse(String), SkipEntry)

fn analyze_log() -> List<LogEntry> throws () restarts ()

After all, the language can tell which errors are possible and which restarts are available. But as with the inferred checked exceptions before, we wouldn't need to write them out: they're easily inferred by the compiler, and it simplifies the process of writing the code tremendously! The compiler would reject your program if you try to use a restart that doesn't exist, or if you claim your function is error-free but fail to handle an error that might bubble up. This would allow us to have a very powerful set of tools for error handling and also careful API design without putting a strong burden of tracking minutiae on the programmer.

As I said before, I haven't implemented this anywhere, and who knows if this idea would survive contact with an end-user2. It's not only possible but likely that there are cases I haven't considered, complicated interactions I've failed to foresee, or ergonomic concerns that this doesn't address. But it's a hypothesis, and one I'd like to test some day, because if it works, I can imagine it being a tremendous tool for safety and legibility!


  1. Why do I specify this? Well, because I believe that there's no one programming language for all applications. I like Rust and I like Haskell and I like Python, and there are a few applications where any of them might be fine, but more often there are programs that would be perfect for one and terrible for the other two. I think that problem domain matters a lot to programming language design.
  2. I've also elided both implementation concerns (for example: how does this work with higher-order functions?) and other design questions. There are two big design questions that I've considered but don't have conclusions about. The first question is: how should conditions and restarts be named and namespaced? I've treated them in the code as global-ish, but maybe they should be namespaced underneath the function name (e.g. parse_log_entry::UseValue) or at least underneath the compilation unit, but there are a lot of possible choices here. The second question is: how “far up” should restarts be accessible, and how do you control that? I handwaved in the inferred type signatures above that, because we handle the condition which can get raised, we don't “re-export” restarts from analyze_log, but we do re-export them from parse_log_file. Is that an analysis we can do? If not, how do we make sure that restarts are available without adding too much more busywork for the programmer?