## Haskell Type Equality Constraints

Earlier today I pasted a snippet of Haskell code into our work chat that contained a multi-parameter type class instance that looked more or less like this:

```
instance a ~ b => C a b where {- ... -}
```

The constraint `a ~ b`

in this declaration is a type equality constraint: it means that `a`

and `b`

must be the same type. One of my coworkers saw this and remarked, more or less, “Why did you write it in that obtuse way? You could have just declared an instance for `C a a`

, which means the same thing.”^{1}

Those two declarations actually *don't* mean the same thing—-but I can't blame said coworker for thinking they did, because the difference between the two is subtle and probably won't come up unless you've done a fair amount of fiddling with type classes. To that end, I want to show an example where those two differ, and then explain why.

## The Example

Let's say we have this (more than a little contrived) multi-parameter type class (which will require the `MultiParamTypeclasses`

extension):

```
class PairOf a b where
thePair :: (a, b)
```

I then define an instance that looks like this (which will require the `FlexibleInstances`

extension):

```
instance Monoid a => PairOf a a where
thePair = (mempty, mempty)
```

So far, so good! Now let's try a particular use of `thePair`

in a (probably even more contrived) definition:

```
obtuseMempty :: Monoid t => t
obtuseMempty = fst thePair
```

What do I expect to happen in this code snippet? I would hope that `obtuseMempty`

ends up being the same as `mempty`

. In particular, I want `thePair`

to resolve to the instance I defined above, and consequently evaluate to `(mempty, mempty)`

, and then, `fst (mempty, mempty) == mempty`

. What *actually* happens when I compile this?

• Could not deduce (PairOf a b0) arising from a use of ‘thePair’ from the context: Monoid a bound by the type signature for: obtuseMempty :: Monoid a => a at sample.hs:11:1-29 The type variable ‘b0’ is ambiguous Relevant bindings include obtuseMempty :: a (bound at sample.hs:12:1) These potential instance exist: instance Monoid a => PairOf a a — Defined at sample.hs:8:10 • In the first argument of ‘fst’, namely ‘thePair’ In the expression: fst thePair In an equation for ‘obtuseMempty’: obtuseMempty = fst thePair

Well, *that's* not what I had wanted, but I guess it makes sense: `thePair`

has to be a pair (because of the use of `fst`

), but since we never use the second element in `thePair`

, there's no reason for GHC to believe that it's the same type as the first. However, let's try tweaking the definition a bit: instead of `PairOf a a`

, let's define it as `PairOf a b`

and then include a constraint that keeps `a`

and `b`

equal (which will require either `GADTs`

or `TypeFamilies`

enabled):

```
instance (Monoid a, a ~ b) => PairOf a b where
thePair = (mempty, mempty)
```

If we try compiling `obtuseMempty`

with this definition, it works without any errors! But *why* does one work while the other doesn't? What's the difference between the two? The answer has to do with a subtlety of how type class resolution works, a subtlety that can be hard to observe right until it stares you in the face.

## How Type Class Resolution Works

When GHC searches for an appropriate instance of a type class, it has to check against all the instances you've defined and find one that “matches” your use of the type class. This is pretty easy when you define instances for a basic type like `Int`

or `Bool`

. For example, using the following (pretty useless) type class:

```
class MyClass t where
myValue :: t
instance MyClass Int where myValue = 0
instance MyClass Bool where myValue = True
```

If I use `myValue`

in a context where it's being interpreted as `Int`

, then Haskell will search through the available instances and find our declaration of `MyClass Int`

. That's simple enough! Things get a bit more complicated when we add types that include type parameters. For example, this is a definition of `MyClass`

for pairs of values:

```
instance (MyClass a, MyClass b) => MyClass (a, b) where
myValue = (myValue, myValue)
```

As a point of vocabulary, everything before the `=>`

in this definition is called the *context*, and everything after it is called the *head*. So now, if I use `myValue`

in a context where it's being interpreted as `(Int, Bool)`

, then GHC will find the matching instance `MyClass (a, b)`

, and then in turn tries to search for two other instances: one for `MyClass Int`

and another for `MyClass Bool`

.

Notice the precise way I phrased that, because this is where the basic intuition around type class instances can diverge from the actual implementation: it looks like the instance definition is saying, “If we have an instance for `MyClass a`

and an instance for `MyClass b`

, then we can also have an instance for `MyClass (a, b)`

.” This is a reasonable intuition, but it's *precisely backwards* from what Haskell is actually doing. What happens instead is that, when it sees `myValue`

being used as a pair, GHC will first select the instance `MyClass (a, b)`

*without having even examined the constraints in the context*. Only after it's already committed to using the instance `MyClass (a, b)`

will it then examine the context and try to satisfy the `MyClass a`

and `MyClass b`

constraints. Or, to say it in a more concise but jargon-heavy way, it first matches the head, and once it has done so, attempts to satisfy the context.

Consider what happens if write the following code without having defined an instance of the form `MyClass ()`

:

```
blah :: (Int, ())
blah = myValue
```

The precise error GHC gives us is: “No instance for `(MyClass ())`

arising from a use of `myVal`

.” GHC has already settled on the `MyClass (a, b)`

instance, has found a satisfying instance for `MyClass Int`

, but then cannot find an instance for `MyClass ()`

.

In this case, making that distinction seems a little bit pedantic. In code like this, it amounts to simply an implementation detail! It shouldn't really matter *what* order GHC is using underneath the surface: either way, the problem is that we don't have an instance for `MyClass ()`

, and as long as GHC reports that error to us in some way, it doesn't matter whether it first tries to satify the context before committing to to `MyClass (a, b)`

, or if it commits first and then tries to satisfy the context: we'll run into the same problem either way! But knowing the precise way GHC tries to instantiate type classes can be very useful in other contexts.

## Repeated Types in Instance Heads

Let's look at our first problematic `PairOf`

instance:

```
instance Monoid a => PairOf a a where
thePair = (mempty, mempty)
```

Remember: GHC has to match the instance head first, and after that it solves the remaining constraints from the context! So, in the definition of `obtuseMempty`

, what is the inferred type of the use of `thePair`

?

```
obtuseMempty :: Monoid t => t
obtuseMempty = fst thePair
```

It should be obvious, but for good measure, let's replace `thePair`

with a hole and find out what type GHC is assigning based on the surrounding context:

• Found hole: _ :: (a, b0) Where: ‘a’ is a rigid type variable bound by the type signature for: obtuseMempty :: forall a. Monoid a => a at sample.hs:11:17 ‘b0’ is an ambiguous type variable • In the first argument of ‘fst’, namely ‘_’ In the expression: fst _ In an equation for ‘obtuseMempty’: obtuseMempty = fst _

So GHC needs to find an instance of the form `MyPair t b0`

. Let's look at our instances: do we have one that has a matching head? ...well, no, we only have `MyPair a a`

.

“But wait!” you say. “We don't do anything with the second part of the pair, so why *can't* GHC choose `MyPair a a`

as a valid instance?” But GHC can't read your mind, and there's nothing in your program to indicate that the type variables `a`

and `b0`

are *supposed* to be the same. As soon as you somehow indicate that, GHC will comply and find your `MyPair a a`

instance:

```
evenMoreObtuseMempty :: Monoid t => t
evenMoreObtuseMempty = let p = thePair in (fst p `mappend` snd p)
```

In this definition, it's clear that `thePair`

is supposed to have two fields of the same type, because we're clearly using `mappend`

to combine them. But is there a way of making it work even if we don't give GHC that extra nudge at the use site?

## Type Equality to the Rescue

Let's move on to the second `MyPair`

instance I described above:

```
instance (Monoid a, a ~ b) => PairOf a b where
thePair = (mempty, mempty)
```

Using typical informal reasoning, it seems like this should be identical to the other instance definition: “It's an instance where both parameters are the same type, and that type needs to be a `Monoid`

”. But remember how GHC chooses instances: *first* by matching the head, *then* by elaborating the constraints. And this definition is different in an important way: the fact that the two type parameters are the same type is no longer a feature of the *head*, but rather a feature of the *context*! So when we revisit our definition of `obtuseMempty`

with this new definition, GHC behaves differently:

```
obtuseMempty :: Monoid t => t
obtuseMempty = fst thePair
```

The value `thePair`

in this context still has the inferred type `PairOf t b0 => (t, b0)`

, but now we have an instance for `PairOf a b`

, which means GHC can unify those and select the instance we want! And *now that it has chosen that instance*, GHC can take the constraints associated with that instance and try to solve them: the expression `thePair`

now has the inferred type `(Monoid t, t ~ b0) => (t, b0)`

, which GHC can solve as: `(Monoid t) => (t, t)`

. After that, our function type-checks successfully!

## So What's The Lesson?

A rough intuition is that you can think of type equality in code like this as helping you *propagate* the fact that types are equal in some contexts. In the examples above, we as programmers intended the two fields of `thePair`

should have the same type, but the first definition—-the `MyPair a a`

one—-meant we could only use `myPair`

when GHC *already knew* those types were equal. By moving the type equality to the context, we can use `myPair`

in places where the two types aren't already known to be distinct and *introduce* that fact. This isn't the *only* use of type equality—-type equality is very important in the context of type families, for example—-but it's nevertheless an important use.

GHC type class resolution is really very straightforward, even in the face of extensions like the ones we enabled in this post. In fact, part of the problem is that it's sometimes tempting to imagine GHC is doing something more clever than it actually is! Methodically stepping through the way GHC processes your code can really help understand situations like these, where intial intuitions sometimes lead you down the wrong path.

Well, okay, they ACTUALLY assumed I wrote an excessively obtuse type signature as an elaborate way of trolling them. And to be fair, the code snippet I was pasting was a deliberately silly and ill-advised piece of code—-but the use of type equality wasn't part of the joke.

...well, if you MUST know, it was an instance definition very similar to the following:

`instance (a ~ b) => Num ((Integer -> a) -> b) where fromIntegral x f = f x {- the rest of this is not important here -}`

this allows you to use numeric literals as functions that take another function and apply that function to themselves. This gives you some cute syntactic sugar:

`newtype Pixels = Pixels { fromPixels :: Integer } width, height :: Pixels width = 320 Pixels height = 240 Pixels`

This is a disgusting and terrible, and please never do it.