## Typeclass Refactoring and Default Superclass Instances

As part of recent type-refactoring efforts in Haskell, a discussion about adding `Semigroup`

as a parent class of `Monoid`

has been bouncing around the mailing list. From a theoretical point of view, this is a great idea: it is more flexible than the current approach that would allow for greater expressibility.

From a *practical* point of view, however, I am inclined to oppose it. Not because it is *in itself* a bad change—it's a very reasonable change that has advantages for new code—but because I have, in the past, had to update large systems written in Haskell after GHC updates, and therefore I know that *this kind of change has a cost*. The Applicative-Monad changes seem to have made way for the Foldable-Traversable Prelude, but those have in turn inspired a number of suggestions for modifications to the Haskell standard library, each one of which, taken on its own, is reasonable, but taken as a mass, mean *much more work* for maintainers. This is *especially* true if we continue on this path of making minor refactorings at each release: each year a project will need changes, or it will quickly bit-rot beyond utility.

## Default Superclass Instances

There is, however, an alternative I would like to discuss—one which has also been brought up on the mailing list, but which I'd like to provide more concrete motivation for. Some of these changes—especially the `Semigroup`

/`Monoid`

split—seem to involve taking the functionality of a class and splitting it into multiple smaller classes. For example, we can think of the `Semigroup`

/`Monoid`

change as converting

```
class Monoid m where
mempty :: m
mappend :: m -> m -> m
```

into^{1}

```
class Semigroup m where
mappend :: m -> m -> m
class Semigroup m => Monoid m where
mempty :: m
```

Something that has been proposed before (in a few different forms) and which I suggest be more actively considered if changes like these are to become common is to allow *superclass instances to be declared within a subclass declaration*. This would allow you to write a single `instance`

declaration for a class, and in that body *also include implementations for methods belong to a superclass of that class*. As an example, perhaps we could be able to write^{2}:

```
newtype MyId a = MyId a
instance Monad MyId where
-- Functor method
fmap f (MyId x) = MyId (f x)
-- Applicative methods
pure = MyId
MyId f <*> MyId x = MyId (f x)
-- Monad methods
return = MyId
MyId x >>= f = f x
```

For the `Monoid`

/`Semigroup`

proposal, this would mean that any `Monoid`

instances that exist would continue to work unchanged, but new instances could (optionally) split apart their declarations. Under this proposal, either of these would be acceptable:

```
class Semigroup m where mappend :: m -> m -> m
class Semigroup m => Monoid m where mempty :: m
-- combined `instance` declarations:
instance Monoid [a] where
mempty = []
mappend = (++)
```

or, equivalently,

```
class Semigroup m where mappend :: m -> m -> m
class Semigroup m => Monoid m where mempty :: m
-- split apart `instance` declarations
instance Semigroup [a] where
mappend = (++)
instance Monoid [a] where
mempty = []
```

And because the `Monoid`

declaration for `[]`

is already written like the former, we can make the `Semigroup`

/`Monoid`

split without having to rewrite the instance declarations!

## Exciting Additional Opportunities

Because this lowers the cost of updating for new versions, various *other* useful changes might be considered that would otherwise involve far too much breakage. For example, we could consider splitting `Num`

apart into small constituent parts. With Default Superclass Instances, we could refactor `Num`

into the following *without changing any instance declarations*:^{3}

```
class Add a where (+) :: a -> a -> a
class Sub a where (-) :: a -> a -> a
class Add a => Zero a where zero :: a
class Mul a where (*) :: a -> a -> a
class Mul a => One a where one :: a
class (Zero a, One a) => FromInteger a where
fromInteger :: Integer -> a
instance Zero a where zero = fromInteger 0
instance One a where one = fromInteger 1
class Signed a where
negate :: a -> a
abs :: a -> a
signum :: a -> a
class ( Eq a, Show a, Add a, Sub a
, Mul a, FromInteger a, Signed a) => Num a where
```

which would allow certain numeric types to only implement a subset of the relevant operations:

```
data Nat = Zero | Succ Nat
instance Add Nat where
Z + y = y
S x + y = S (x + y)
{- et cetera --- but no implementations for e.g. Signed,
- which is not meaningful for `Nat`! -}
```

and also allow current `Num`

functions to have a looser set of constraints than they do at present:

```
sum :: (Zero a, Add a) => [a] -> a
sum (x:xs) = x + sum xs
sum [] = zero
prod :: (One a, Mul a) => [a] -> a
prod (x:xs) = x * prod xs
prod [] = one
```

We could also consider splitting `Arrow`

^{4} into distinct components, again without having to change any instance declarations:

```
class Category a => Pairwise a where
first :: a b c -> a (b, d) (c, d)
second :: a b c -> a (d, b) (d, c)
(***) :: a b c -> a b' c' -> a (b, b') (c, c')
(&&&) :: a b c -> a b c' -> a b (c, c')
class Pairwise a => Arrow a where
arr :: (b -> c) -> a b c
```

or even (dare I dream) splitting `Bits`

into something that is not a 22-method monstrosity!

## Potential Drawbacks

On the other hand, this proposal does have some down-sides:

### Grepping for Instance Declarations

Right now, I can often find an instance declaration for a type `T`

by grepping for `instance C T`

(modulo some line noise) whereas with this change, it's possible that there *is* no declaration for `instance C T`

, because all of `C`

's methods are declared by a subclass `C'`

instead. The compiler ought to be able to deal with this without problem, which means that tools like Haddock documentation should somewhat alleviate this problem, but a user might be confused.

### Introduces New Possible Errors

The declarations below are of course nonsensical, and would be rejected by the compiler—-but the fact that this change would *introduce new failure conditions at all* is a drawback of the proposal.

```
instance Semigroup Int where
mappend = (+)
instance Monoid Int where
-- this conflicts with an existing declaration
mappend = (*)
mempty = 1
```

### A Pragma-Less Extension

In order to be *really* useful, we'd want to use this without a `LANGUAGE`

pragma. After all, we're arguing for it on the basis of preserving backwards-compatibility, but that argument is much less compelling if we still have to change the source files to make use of it! On the other hand, that means we'd have included a GHC *extension* that takes effect despite not being enabled, which is *also* a worrying precedent!

It still might be a useful extension even if it had to be enabled by a `LANGUAGE`

pragma, as it is easier to add said pragma to a source file than to manually break apart dozens of instance declarations, but it makes this feature less compelling in general.

## In Conclusion

As I said before, my primary objection to typeclass changes like those above—-even if they are good changes for newly written code—-is that they *break existing code*. I don't want to have to modify a handful of miscellaneous instance declarations on a yearly basis as people discover new levels of abstraction or become dissatisfied with current choices, especially as those changes will grow more extensive as I build more projects in Haskell! But with an extension like this, we could grow the typeclass ecosystem gradually and fix what we see as past warts *while maintaining backwards-compatibility*, which would be a very powerful tool to have.

This is perhaps more simplistic than we want: we can also use the existing

`Semigroup`

class from the`semigroup`

package and then, in the`Monoid`

class declaration, explain how to derive the methods of the superclass if no superclass instance is present. This, according to the linked proposal, would look like:`class Semigroup m => Monoid m where mempty :: m mappend :: m -> m -> m instance Semigroup m where (.++.) = mappend`

The example above is slightly simpler, which is why I relegated this version to a footnote.

- This isn't a concrete proposal, so maybe the actual syntax or semantics of these things should be changed! I want to focus on the
*feature*and not the*instantiation*. For this example, I added

`Zero`

and`One`

classes so that a given type might implement an additive and multiplicative unit while not necessarily having a sensible`FromInteger`

implementation. For example, it might not make sense to implement`fromInteger`

for a complex number, but complex numbers clearly have both an additive and multiplicative unit:`data Complex = Complex Float Float deriving (Eq, Show) {- ... -} instance Zero Complex where zero = Complex 0.0 0.0 instance One Complex where one = Complex 1.0 0.0`

This means that the

`Sum`

and`Product`

monoids could be rewritten like:`newtype Product a = Product { getProduct :: a } deriving (Eq, Show) instance (One a, Mul a) => Monoid (Product a) where mempty = Product one x `mappend` y = Product (getProduct x * getProduct y)`

Notice that I added

`Zero`

and`One`

in such a way that an existing`Num`

instance declarations needn't be changed!I have on numerous occasions had reason to use the

`Arrow`

abstraction, but haven't had a sensible implementation of`arr`

. To use a slightly contrived example, I could define a GADT that can describe the structure of boolean circuits in a way that resembles an Arrow, but has no way of expressing`arr`

:`data Circ a b where BoolFunc :: (Bool -> Bool) -> Circ Bool Bool Ident :: Circ a a Compose :: Circ a b -> Circ b c -> Circ a c First :: Circ a b -> Circ (a, d) (b, d) Second :: Circ a b -> Circ (d, a) (d, b) Parallel :: Circ a b -> Circ a' b' -> Circ (a, a') (b, b') Fanout :: Circ a b -> Circ a b' -> Circ a (b, b') instance Category Circ where id = Ident (.) = flip Compose instance Arrow Circ where first = First second = Second (***) = Parallel (&&&) = Fanout arr = error "Nothing sensible to put here!" -- for example, using this definition, we can write xor: xor :: BoolCircuit (Bool, Bool) Bool xor = ((first Not >>> And) &&& (second Not >>> And)) >>> Or -- ...and using xor, write a half-adder: halfAdder :: BoolCircuit (Bool, Bool) (Bool, Bool) halfAdder = xor &&& And`

This is not an unreasonable definition—-it would be nice to abstract over such a definition using existing tools—-but the structure of the

`Arrow`

typeclass makes it difficult!