Typeclass Refactoring and Default Superclass Instances
As part of recent type-refactoring efforts in Haskell, a discussion about adding Semigroup
as a parent class of Monoid
has been bouncing around the mailing list. From a theoretical point of view, this is a great idea: it is more flexible than the current approach that would allow for greater expressibility.
From a practical point of view, however, I am inclined to oppose it. Not because it is in itself a bad change—it's a very reasonable change that has advantages for new code—but because I have, in the past, had to update large systems written in Haskell after GHC updates, and therefore I know that this kind of change has a cost. The Applicative-Monad changes seem to have made way for the Foldable-Traversable Prelude, but those have in turn inspired a number of suggestions for modifications to the Haskell standard library, each one of which, taken on its own, is reasonable, but taken as a mass, mean much more work for maintainers. This is especially true if we continue on this path of making minor refactorings at each release: each year a project will need changes, or it will quickly bit-rot beyond utility.
Default Superclass Instances
There is, however, an alternative I would like to discuss—one which has also been brought up on the mailing list, but which I'd like to provide more concrete motivation for. Some of these changes—especially the Semigroup
/Monoid
split—seem to involve taking the functionality of a class and splitting it into multiple smaller classes. For example, we can think of the Semigroup
/Monoid
change as converting
class Monoid m where
mempty :: m
mappend :: m -> m -> m
into1
class Semigroup m where
mappend :: m -> m -> m
class Semigroup m => Monoid m where
mempty :: m
Something that has been proposed before (in a few different forms) and which I suggest be more actively considered if changes like these are to become common is to allow superclass instances to be declared within a subclass declaration. This would allow you to write a single instance
declaration for a class, and in that body also include implementations for methods belong to a superclass of that class. As an example, perhaps we could be able to write2:
newtype MyId a = MyId a
instance Monad MyId where
-- Functor method
fmap f (MyId x) = MyId (f x)
-- Applicative methods
pure = MyId
MyId f <*> MyId x = MyId (f x)
-- Monad methods
return = MyId
MyId x >>= f = f x
For the Monoid
/Semigroup
proposal, this would mean that any Monoid
instances that exist would continue to work unchanged, but new instances could (optionally) split apart their declarations. Under this proposal, either of these would be acceptable:
class Semigroup m where mappend :: m -> m -> m
class Semigroup m => Monoid m where mempty :: m
-- combined `instance` declarations:
instance Monoid [a] where
mempty = []
mappend = (++)
or, equivalently,
class Semigroup m where mappend :: m -> m -> m
class Semigroup m => Monoid m where mempty :: m
-- split apart `instance` declarations
instance Semigroup [a] where
mappend = (++)
instance Monoid [a] where
mempty = []
And because the Monoid
declaration for []
is already written like the former, we can make the Semigroup
/Monoid
split without having to rewrite the instance declarations!
Exciting Additional Opportunities
Because this lowers the cost of updating for new versions, various other useful changes might be considered that would otherwise involve far too much breakage. For example, we could consider splitting Num
apart into small constituent parts. With Default Superclass Instances, we could refactor Num
into the following without changing any instance declarations:3
class Add a where (+) :: a -> a -> a
class Sub a where (-) :: a -> a -> a
class Add a => Zero a where zero :: a
class Mul a where (*) :: a -> a -> a
class Mul a => One a where one :: a
class (Zero a, One a) => FromInteger a where
fromInteger :: Integer -> a
instance Zero a where zero = fromInteger 0
instance One a where one = fromInteger 1
class Signed a where
negate :: a -> a
abs :: a -> a
signum :: a -> a
class ( Eq a, Show a, Add a, Sub a
, Mul a, FromInteger a, Signed a) => Num a where
which would allow certain numeric types to only implement a subset of the relevant operations:
data Nat = Zero | Succ Nat
instance Add Nat where
Z + y = y
S x + y = S (x + y)
{- et cetera --- but no implementations for e.g. Signed,
- which is not meaningful for `Nat`! -}
and also allow current Num
functions to have a looser set of constraints than they do at present:
sum :: (Zero a, Add a) => [a] -> a
sum (x:xs) = x + sum xs
sum [] = zero
prod :: (One a, Mul a) => [a] -> a
prod (x:xs) = x * prod xs
prod [] = one
We could also consider splitting Arrow
4 into distinct components, again without having to change any instance declarations:
class Category a => Pairwise a where
first :: a b c -> a (b, d) (c, d)
second :: a b c -> a (d, b) (d, c)
(***) :: a b c -> a b' c' -> a (b, b') (c, c')
(&&&) :: a b c -> a b c' -> a b (c, c')
class Pairwise a => Arrow a where
arr :: (b -> c) -> a b c
or even (dare I dream) splitting Bits
into something that is not a 22-method monstrosity!
Potential Drawbacks
On the other hand, this proposal does have some down-sides:
Grepping for Instance Declarations
Right now, I can often find an instance declaration for a type T
by grepping for instance C T
(modulo some line noise) whereas with this change, it's possible that there is no declaration for instance C T
, because all of C
's methods are declared by a subclass C'
instead. The compiler ought to be able to deal with this without problem, which means that tools like Haddock documentation should somewhat alleviate this problem, but a user might be confused.
Introduces New Possible Errors
The declarations below are of course nonsensical, and would be rejected by the compiler—-but the fact that this change would introduce new failure conditions at all is a drawback of the proposal.
instance Semigroup Int where
mappend = (+)
instance Monoid Int where
-- this conflicts with an existing declaration
mappend = (*)
mempty = 1
A Pragma-Less Extension
In order to be really useful, we'd want to use this without a LANGUAGE
pragma. After all, we're arguing for it on the basis of preserving backwards-compatibility, but that argument is much less compelling if we still have to change the source files to make use of it! On the other hand, that means we'd have included a GHC extension that takes effect despite not being enabled, which is also a worrying precedent!
It still might be a useful extension even if it had to be enabled by a LANGUAGE
pragma, as it is easier to add said pragma to a source file than to manually break apart dozens of instance declarations, but it makes this feature less compelling in general.
In Conclusion
As I said before, my primary objection to typeclass changes like those above—-even if they are good changes for newly written code—-is that they break existing code. I don't want to have to modify a handful of miscellaneous instance declarations on a yearly basis as people discover new levels of abstraction or become dissatisfied with current choices, especially as those changes will grow more extensive as I build more projects in Haskell! But with an extension like this, we could grow the typeclass ecosystem gradually and fix what we see as past warts while maintaining backwards-compatibility, which would be a very powerful tool to have.
This is perhaps more simplistic than we want: we can also use the existing
Semigroup
class from thesemigroup
package and then, in theMonoid
class declaration, explain how to derive the methods of the superclass if no superclass instance is present. This, according to the linked proposal, would look like:class Semigroup m => Monoid m where mempty :: m mappend :: m -> m -> m instance Semigroup m where (.++.) = mappend
The example above is slightly simpler, which is why I relegated this version to a footnote.
- This isn't a concrete proposal, so maybe the actual syntax or semantics of these things should be changed! I want to focus on the feature and not the instantiation.
For this example, I added
Zero
andOne
classes so that a given type might implement an additive and multiplicative unit while not necessarily having a sensibleFromInteger
implementation. For example, it might not make sense to implementfromInteger
for a complex number, but complex numbers clearly have both an additive and multiplicative unit:data Complex = Complex Float Float deriving (Eq, Show) {- ... -} instance Zero Complex where zero = Complex 0.0 0.0 instance One Complex where one = Complex 1.0 0.0
This means that the
Sum
andProduct
monoids could be rewritten like:newtype Product a = Product { getProduct :: a } deriving (Eq, Show) instance (One a, Mul a) => Monoid (Product a) where mempty = Product one x `mappend` y = Product (getProduct x * getProduct y)
Notice that I added
Zero
andOne
in such a way that an existingNum
instance declarations needn't be changed!I have on numerous occasions had reason to use the
Arrow
abstraction, but haven't had a sensible implementation ofarr
. To use a slightly contrived example, I could define a GADT that can describe the structure of boolean circuits in a way that resembles an Arrow, but has no way of expressingarr
:data Circ a b where BoolFunc :: (Bool -> Bool) -> Circ Bool Bool Ident :: Circ a a Compose :: Circ a b -> Circ b c -> Circ a c First :: Circ a b -> Circ (a, d) (b, d) Second :: Circ a b -> Circ (d, a) (d, b) Parallel :: Circ a b -> Circ a' b' -> Circ (a, a') (b, b') Fanout :: Circ a b -> Circ a b' -> Circ a (b, b') instance Category Circ where id = Ident (.) = flip Compose instance Arrow Circ where first = First second = Second (***) = Parallel (&&&) = Fanout arr = error "Nothing sensible to put here!" -- for example, using this definition, we can write xor: xor :: BoolCircuit (Bool, Bool) Bool xor = ((first Not >>> And) &&& (second Not >>> And)) >>> Or -- ...and using xor, write a half-adder: halfAdder :: BoolCircuit (Bool, Bool) (Bool, Bool) halfAdder = xor &&& And
This is not an unreasonable definition—-it would be nice to abstract over such a definition using existing tools—-but the structure of the
Arrow
typeclass makes it difficult!