Objects in Functional Languages

July 8, 2020

Let's talk about objects in functional languages.

This could be a sticking point, so I'll get this out of the way up-front: I find arguments over whether such-and-such a thing is “really” functional or “really” imperative or “really” object-oriented to be for the most part frustrating and useless. A lot of these arguments boil down to value judgments in disguise (a thing I've touched on in the past when talking about the rhetorical emphasis placed on math in programming) and many of the rest turn out to be based on informal definitions where 'functional' means 'like my favorite or first functional language'.¹ I prefer to think about these concepts less in terms of necessary criteria and hard lines, and more in terms of influence and approach and focus.

The paper On Understanding Data Abstraction, Revisited by William Cook (itself a paper that revisits On Understanding Types, Data Abstraction, and Polymorphism by Luca Cardelli and Peter Wegner) lays out simple definitions of the often-used concepts of “abstract data type” and “object”. His definition of “object” is, of course, not the only one, but I think it's a useful one. Importantly, Cook's definition is based on abstract criteria that can apply even when the language itself does not have an explicit concept of an “object”. To emphasize this, I'm going to use Haskell in this blog post to demonstrate a pure functional variation on “object-oriented programming”².

Beyond being a useful definition, Cook's formulation is also a valuable design pattern, one which has both advantages and disadvantages when designing programs. Consequently, the goal of this blog post is twofold: first, to show how “objects” can (under some definition) exist in a language like Haskell which has no built-in notion of “object-oriented programming”, and secondly, to show the advantages and disadvantages of code written in this style.

Abstract data types

Cook's definition of an abstract data type is a type that

[…] has a public name, a hidden representation, and operations to create, combine, and observe values of the abstraction.

To use a simplified version of his example, here's a (not terribly efficient) way of defining a “set of integers” in Haskell as an abstract data type. In this case, we've got an IntSet type whose representation can be easily hidden, because the operations that a consumer of this library cares about have been defined in terms of that abstract representation.

-- we'll export `IntSet` but not the constructors
data IntSet
  = SetEmpty
  | SetInsert Int IntSet

empty :: IntSet
empty = SetEmpty

isEmpty :: IntSet -> Bool
isEmpty SetEmpty = True
isEmpty _ = False

insert :: Int -> IntSet -> IntSet
insert x set
  | contains set x = set
  | otherwise = SetInsert x set

contains :: IntSet -> Int -> Bool
contains SetEmpty _ = False
contains (SetInsert y rest) x
  | x == y = True
  | otherwise = contains rest x

This is probably not a terribly controversial definition or design: it's a pretty typical one for most functional languages! …well, it might be controversial to use since it's got awful algorithmic performance, but it's fairly unobjectionable as a teaching tool, at least.

What would a user see when looking at the documentation for this module? Since we're not exporting the constructors for IntSet, it'll look something like this:

data IntSet
empty :: IntSet
isEmpty :: IntSet -> Bool
insert :: Int -> IntSet -> IntSet
contains :: IntSet -> Int -> Bool

The ability to hide the definition of IntSet is the thing that makes this an abstract data type. A user of the library doesn't care—and ideally doesn't need to care—what constructors hide behind that IntSet.
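In Haskell, that hiding comes from the module's export list. Here's a minimal sketch, assuming the code lives in a module I'm calling IntSet (the module name is my own placeholder): listing IntSet without a trailing (..) exports the type name but none of its constructors.

```haskell
-- Hypothetical module name; the export list is the point here.
module IntSet
  ( IntSet      -- the type only: writing IntSet(..) would also export the constructors
  , empty
  , isEmpty
  , insert
  , contains
  ) where

data IntSet
  = SetEmpty
  | SetInsert Int IntSet

empty :: IntSet
empty = SetEmpty

isEmpty :: IntSet -> Bool
isEmpty SetEmpty = True
isEmpty _ = False

-- Only add the element if it is not already a member.
insert :: Int -> IntSet -> IntSet
insert x set
  | contains set x = set
  | otherwise = SetInsert x set

-- Linear scan through every inserted element.
contains :: IntSet -> Int -> Bool
contains SetEmpty _ = False
contains (SetInsert y rest) x
  | x == y = True
  | otherwise = contains rest x
```

A client module that imports IntSet can call insert and contains, but any attempt to pattern match on SetInsert fails to compile, because that name is simply not in scope.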

What is an object?

Cook then goes on to describe “objects”. Here's another (equally inefficient) implementation of IntSet, which I'll define as OIntSet so I can easily refer to both:

data OIntSet = OIntSet
  { oIsEmpty :: Bool
  , oContains :: Int -> Bool
  }

oEmpty :: OIntSet
oEmpty = OIntSet
  { oIsEmpty = True
  , oContains = \_ -> False
  }

oInsert :: Int -> OIntSet -> OIntSet
oInsert x set
  | oContains set x = set
  | otherwise = OIntSet
      { oIsEmpty = False
      , oContains = \i -> i == x || oContains set i
      }

It's possible that we can choose our export list carefully so that this implementation of OIntSet reveals the exact same set of operations as the previous one. However, there's a major difference here: OIntSet is not actually hiding a specific type. Instead, it just bundles the relevant set of operations inside of a record of functions, which acts like an interface type. In both the ADT-based approach and the “object”-based approach, a user does not know about the internal representation of OIntSet, but in the ADT approach, this is because there exists a single representation which is non-public, while in the “object”-based approach, there may be multiple separate implementations that are indistinguishable.

Why use objects?

An “object”-like representation allows for a vast amount of flexibility. Because a consumer of an OIntSet can use any value as long as the value has provided implementations of the relevant “methods”, we can easily and conveniently define new instances of OIntSet that have radically different internal representations. For example, we can define infinite sets that use simple numerical computations to define their oContains method, like this set of all even numbers:

oEvenNumbers :: OIntSet
oEvenNumbers = OIntSet
  { oIsEmpty = False
  , oContains = \i -> i `mod` 2 == 0
  }

Or we could construct OIntSet values that use a different data representation, such as a list, to store the members of the set:

oFromList :: [Int] -> OIntSet
oFromList list = OIntSet
  { oIsEmpty = null list
  , oContains = \i -> i `elem` list
  }

But even though these OIntSet definitions use different underlying data representations, they expose the same interface, so we can use the same operations to manipulate them. For example, we can define an oUnion operation which computes the union of two OIntSet objects, since that operation is easily expressible in terms of oIsEmpty and oContains:

oUnion :: OIntSet -> OIntSet -> OIntSet
oUnion set set' = OIntSet
  { oIsEmpty = oIsEmpty set && oIsEmpty set'
  , oContains = \i -> oContains set i || oContains set' i
  }

Our oUnion operation—indeed, any operation we define—can work on any two OIntSets even if they have wildly different internal representations. We can even use this to combine all of our previously-defined OIntSet constructors into one expression, to create a set that uses a combination of Haskell lists, numeric predicates, and closures to represent a set:

sample :: OIntSet
sample = oInsert 1 oEvenNumbers `oUnion` oFromList [10..20]

This is a very convenient way of building certain abstractions. By building around external interfaces, you can include varying data representations that easily work together.
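To check that the mixed representation really does behave like one set, here's a self-contained sketch that repeats the definitions from this section and probes sample with a few membership queries. The sampleChecks name is my own addition for illustration.

```haskell
data OIntSet = OIntSet
  { oIsEmpty :: Bool
  , oContains :: Int -> Bool
  }

oEvenNumbers :: OIntSet
oEvenNumbers = OIntSet
  { oIsEmpty = False
  , oContains = \i -> i `mod` 2 == 0
  }

oFromList :: [Int] -> OIntSet
oFromList list = OIntSet
  { oIsEmpty = null list
  , oContains = \i -> i `elem` list
  }

oInsert :: Int -> OIntSet -> OIntSet
oInsert x set
  | oContains set x = set
  | otherwise = OIntSet
      { oIsEmpty = False
      , oContains = \i -> i == x || oContains set i
      }

oUnion :: OIntSet -> OIntSet -> OIntSet
oUnion set set' = OIntSet
  { oIsEmpty = oIsEmpty set && oIsEmpty set'
  , oContains = \i -> oContains set i || oContains set' i
  }

sample :: OIntSet
sample = oInsert 1 oEvenNumbers `oUnion` oFromList [10..20]

-- Hypothetical helper: pair a few probe values with their membership result.
sampleChecks :: [(Int, Bool)]
sampleChecks = [ (i, oContains sample i) | i <- [1, 3, 4, 15] ]
```

Here 1 is a member because of the oInsert closure, 4 because of the numeric predicate, and 15 because of the list, while 3 falls through all three and is rejected. The caller never has to know which of those paths answered.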

This example is clearly a little bit contrived, so it's probably worth giving some other “real-world” examples where this design approach is useful. Object-oriented programming is generally cited as a good fit for a particular style of user interface programming, because it allows you to define classes of “widgets” that expose a common interface but have different internal representations. You could build a Haskell GUI library in this style by defining Widgets as “objects” with a common interface, something like this:

data Widget = Widget
  { drawWidget :: Ctx -> Position -> IO ()
  , handleEvent :: Event -> IO ()
  }

This is similar to the approach taken by the Brick TUI library, which has its own Widget record.

Why not use objects?

One major concern with this style of data representation is performance and optimization. Consider our original ADT representation for IntSet: it's inefficient, yes, but we can make it more efficient in a number of ways. For example, we could modify it so that, instead of always inserting new elements “at the front”, we can instead insert them in such a way that the internal representation of the set is always sorted lowest-to-highest. This means that we may no longer have to traverse the entire list to check for element membership. Even better, we might swap out the list-like representation for a binary tree representation, maybe doing some rebalancing in certain cases.
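To make the first of those optimizations concrete, here's a sketch of the sorted variant. It assumes that every IntSet is only ever built through empty and insert, so that the ordering invariant can't be broken from outside the module.

```haskell
data IntSet
  = SetEmpty
  | SetInsert Int IntSet

empty :: IntSet
empty = SetEmpty

-- Keep elements in ascending order, with no duplicates.
insert :: Int -> IntSet -> IntSet
insert x SetEmpty = SetInsert x SetEmpty
insert x set@(SetInsert y rest)
  | x < y = SetInsert x set
  | x == y = set
  | otherwise = SetInsert y (insert x rest)

-- Because the elements are sorted, we can stop as soon as we pass
-- the place where x would have to be.
contains :: IntSet -> Int -> Bool
contains SetEmpty _ = False
contains (SetInsert y rest) x
  | x < y = False
  | x == y = True
  | otherwise = contains rest x
```

The exported types are unchanged, so a consumer of the module can't tell that anything happened other than some lookups getting faster.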

There is no way in general to apply these optimizations to the OIntSet-style program. You could define an OIntSet that sits in front of a balanced tree and therefore has faster lookup and insertion, but once it's sitting behind the interface, you no longer have access to those internals. You cannot, for example, write an oUnion operation that rebalances the binary trees behind the two sets it's operating on: it doesn't even know if both sets are backed by trees!
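For contrast, here's roughly what the ADT side gets to do. This is a sketch on a simple sorted-list representation rather than a balanced tree, and union, fromAscList and toList are names I'm introducing for illustration: because union can see both representations, it can merge them in a single pass.

```haskell
data IntSet
  = SetEmpty
  | SetInsert Int IntSet

-- Assumes both arguments are sorted ascending with no duplicates;
-- the result then has the same property.
union :: IntSet -> IntSet -> IntSet
union SetEmpty ys = ys
union xs SetEmpty = xs
union xs@(SetInsert x xs') ys@(SetInsert y ys')
  | x < y = SetInsert x (union xs' ys)
  | x > y = SetInsert y (union xs ys')
  | otherwise = SetInsert x (union xs' ys')

-- Hypothetical helpers for trying it out. The caller of fromAscList
-- is trusted to pass a sorted, duplicate-free list.
fromAscList :: [Int] -> IntSet
fromAscList = foldr SetInsert SetEmpty

toList :: IntSet -> [Int]
toList SetEmpty = []
toList (SetInsert x rest) = x : toList rest
```

Nothing like this is available to oUnion, which can only wrap the two oContains functions it was handed in a third function.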

In effect, the major selling point of the “object”-style design here is also a major downside: you don't have guarantees about the specific representation of data, which means your programs can easily mix-and-match different representations, but it also means that your program can't make use of representation-specific knowledge in ways that are advantageous.

There's another major concern as well, and that's that the specific choice of “object” representation can make a big difference in terms of what operations you can and cannot support. Look back at OIntSet—I was able to define oUnion, but what about oIntersection? It turns out that it's not actually possible using the specific representation I've chosen³:

oIntersection :: OIntSet -> OIntSet -> OIntSet
oIntersection set set' = OIntSet
  { oIsEmpty = {- ??? -}
  , oContains = \i -> oContains set i && oContains set' i
  }

How do I implement oIsEmpty? I might naïvely try to write the inverse of oUnion and define it as oIsEmpty set || oIsEmpty set', but that's not at all what I want: the intersection of the set of even numbers and the set of odd numbers is an empty set, but neither the even nor the odd numbers are empty, so this would incorrectly compute their intersection as non-empty.

This is an artifact of the specific interface chosen for the set. I could modify the interface and be able to recapture this behavior, but almost any choice I make is going to have different repercussions: for example, I could add a method to enumerate all the values contained in the set, at which point I now have a convenient way to find out whether the intersection of two sets is indeed empty… but now I have made infinite sets significantly more difficult to define!
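Here's a sketch of that trade-off, using a hypothetical variant of the interface that I'm calling EIntSet (every e-prefixed name is mine, not Cook's). Emptiness is derived from the enumeration, which makes intersection easy to write for finite sets.

```haskell
import Data.List (nub)

-- Hypothetical interface: membership plus an enumeration of the members.
data EIntSet = EIntSet
  { eContains :: Int -> Bool
  , eToList :: [Int]
  }

eFromList :: [Int] -> EIntSet
eFromList list = EIntSet
  { eContains = \i -> i `elem` list
  , eToList = nub list
  }

-- A set is empty exactly when its enumeration is.
eIsEmpty :: EIntSet -> Bool
eIsEmpty = null . eToList

eIntersection :: EIntSet -> EIntSet -> EIntSet
eIntersection set set' = EIntSet
  { eContains = \i -> eContains set i && eContains set' i
  , eToList = filter (eContains set') (eToList set)
  }

-- An "infinite" set now has to come with a lazy enumeration as well.
eEvenNumbers :: EIntSet
eEvenNumbers = EIntSet
  { eContains = \i -> i `mod` 2 == 0
  , eToList = 0 : concat [ [n, negate n] | n <- [2, 4 ..] ]
  }
```

Intersecting two finite sets works as hoped, and so does intersecting eEvenNumbers with a set that shares an element with it, since laziness lets null stop at the first hit. Asking eIsEmpty about the intersection of the evens and a similarly defined set of odds, though, would have to filter its way through the whole enumeration before it could answer, which won't finish in any practical amount of time.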

This is another face of the performance problem: the specific interface chosen is going to have far-reaching ramifications not only on what operations are efficient or inefficient, but on what operations are possible to write at all.

Why do these count as “objects”?

A lot of definitions of “objects” in the sense of “object-oriented programming” go back to Alan Kay. Kay was a major force behind the Smalltalk programming language, and he once gave this definition of OOP:

OOP to me means only messaging, local retention, and protection and hiding of state-process, and extreme late-binding of all things. It can be done in Smalltalk and in LISP. There are possibly other systems in which this is possible, but I'm not aware of them.

Our treatment of “objects” here does not fit in this definition, but neither do most languages that are typically called “object-oriented”. In particular, “late-binding” here means that methods are not looked up until runtime: in a proper Smalltalk-like system, this would be done by name, meaning that even virtual dispatch in a language like C++ or Java does not count. You can use languages like Ruby or Python in a way that matches this definition, but they're not typically used this way. Many object-oriented languages are also somewhat lax with respect to protection of local information: Python is a big offender here, as its instance variables are typically made private by convention rather than a language mechanism! And of course, almost none of the modern OOP languages are built strictly around messaging.

However, many of these languages are considered “object-oriented” because they try to capture the advantages of these features while not adhering strictly to them. Late-binding and all-you-can-do-is-send-a-message systems require some complicated machinery in order to implement efficiently, because otherwise the pervasive lookup of methods can become a source of slowdown, so many systems use virtual dispatch instead of extreme late-binding. Similarly, many systems do not adhere to strict information-hiding, but allow some public information for various conveniences (e.g. non-private variables in C++ or Java). In many ways, these are design decisions which help sandbag against the problems with “objects” described above. If we're being sticklers about the Alan Kay definition, we might call these languages “object-oriented-ish”.

The Cook definition is something of a distillation of several of the properties which are used by these “object-oriented-ish” languages. It encodes virtual dispatch: all operations here are function values carried inside a record, so you cannot know which implementation you are calling until you are handed a record to call it on. It encodes hiding of local information in a way that's arguably stricter than Java or C++: an external consumer of this code cannot depend on anything but the provided functions, so any “local” information is necessarily hidden. If used appropriately, it can encode protection of state-process: the above example is pure, but if our “methods” had a type like IO (), we could include methods which update internal state (perhaps contained in an IORef) in response to other operations.
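As a sketch of that last point, here's a tiny stateful object. Counter, increment, current and newCounter are all names I've made up for illustration; the only library calls are newIORef, readIORef and modifyIORef' from Data.IORef.

```haskell
import Data.IORef (modifyIORef', newIORef, readIORef)

-- The record is the whole interface: no field mentions the IORef.
data Counter = Counter
  { increment :: IO ()
  , current :: IO Int
  }

-- The IORef is captured by the two closures and is reachable
-- only through them.
newCounter :: IO Counter
newCounter = do
  ref <- newIORef 0
  pure Counter
    { increment = modifyIORef' ref (+ 1)
    , current = readIORef ref
    }
```

Each call to newCounter allocates a fresh IORef that only that record's closures can reach, so two counters never share state and a consumer has no way to touch the reference directly.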

Alan Kay's definition is valuable, not just because it describes a powerful approach to programming as implemented in object-oriented languages like Smalltalk and Io, but also because it describes the behavior of a number of systems that are not programming languages while also capturing what makes those systems powerful: for example, Microsoft's Component Object Model, Plan 9's 9P file system protocol, and arguably even HTTP itself are, from a certain point of view, systems based around doing extremely late-bound message-passing. But I would argue that Cook's definition is also valuable, because it describes both languages in wide use as well as a pattern of data modeling in other systems.

¹ I once read a blog post that complained about zippers (a pure functional interface to certain data structures, one that's not even embedded in a monad or some other side-effect-ey abstraction) by claiming that they were “imperative”. Nonsense! This was clearly an example of a value judgment masquerading as some kind of technical truth, and a pretty shallow one at that. I'd be interested in an analysis of the advantages and disadvantages of using zippers, but “I think this abstraction is imperative, therefore it is bad,” is not that!

² Cook's paper uses an ML-like pseudocode for these examples, and also includes more detail: I've pared them down a bit.

³ Actually, if you look closely at my representation, it is possible, just not practical: in Haskell, Int is a machine integer with a finite number of values, so we could enumerate every possible Int value and check whether it's present in both sets. That means I could implement this operation as oIsEmpty = not (or [ oContains set i && oContains set' i | i <- [minBound..maxBound] ]), which just needs to check 2^64 possible values on most modern processor architectures!