# What are type families?

I just returned from yet another fantastic ICFP week. I always have a blast at the conference. It would be easy to take the conversations I had there and turn them into a full-blown research program that would easily take me to next year’s ICFP… and then rinse and repeat!

But in this post, I want to focus on one particlar thorn in my side that started bothering me more than it had before: type families. I’ve realized, in the course of many conversations, that I don’t really understand them. This post will be more questions than answers as I try to write down what ails me.

``````> {-# LANGUAGE TypeFamilies, DataKinds, UndecidableInstances, GADTs,
>              MultiParamTypeClasses, FunctionalDependencies,
>              FlexibleInstances, FlexibleContexts #-}
> module WhatAreTypeFamilies where
``````

## Total type families are OK

A total type family feels quite comfortable. By total here, I mean a type family whose equations cover all the possible cases and which is guaranteed to terminate. That is, a total type family is properly a function on types. Those other dependently typed languages have functions on types, and they seem to work nicely. I am completely unbothered by total type families.

## Non-covering type families are strange

A non-covering type family is a type family whose patterns do not cover the whole space. Let’s consider closed and open families separately, because the issues that come up are different.

### Closed type families

For example:

``````> type family F1 a where
>   F1 Int = Bool
``````

Note that `F1` is closed – we won’t ever be adding more equations. `F1` is clearly non-covering. Yet, the type `F1 Char` is perfectly well formed.

``````> sillyId :: F1 Char -> F1 Char
> sillyId x = x
``````

But `sillyId` can’t ever be called on a value. This doesn’t stop us from doing chicanery with `F1` though.

``````> data SillyGadt a where
``````

Now, even though `F1 Char` is nonsense, `SillyGadt (F1 Char)` is inhabited. And GHC distinguishes between `F1 Char` and `F1 Double`.

In GHC, a type family that can’t reduce is considered “stuck”. Such a type is effectively a new type, equal only to itself, like empty datatypes a programmer might declare. The only thing different about something like `F1 Char` from something like `Maybe Char` is that `F1 Char` cannot be used in a type pattern (that is, the left-hand side of a type family equation or data family instance). That’s the only difference. So, when we declare `F1`, we create an infinite family of unique types containing `F1` applied to all types of kind `*`, except for `Int`, because `F1 Int` is identically `Bool`.

I find this treatment of non-covering closed type families rather bizarre. Compare against term-level behavior. When a term-level function is non-covering, passing an argument for which there are no equations immediately results in a call to `error`. Yet the same thing in types behaves very differently.

I’m not the only one who finds this strange: Lennart Augustsson posted GHC bug #9636 about this very issue. Indeed, my current views on the subject are strongly influenced by Lennart’s comments.

#### But, sometimes you want stuckness

Sadly, sometimes authors want stuck type families. I know of two examples.

1. The bowels of GHC defines this:
``type family Any :: k where {}``

Thus, `Any` is a closed type family, available at any kind, with no equations. GHC makes convenient use for `Any`. Consider the expression `length []`. This expression has some pointless polymorphism in it: `length` can be applied to a list of any type, and `[]` can be a list of any type. Yet, GHC must choose precisely which type these polymorphic definitions are specialized to. Not to play favorites, GHC chooses `Any :: *`. Such pointless polymorphism can indeed happen at any kind, so `Any` is polymorphic in its return kind. (Before kind polymorphism, `Any` existed as a special form. It’s not so special any more.)

It’s quite important that `Any` is a type family, not a datatype. If it were a datatype (and we forget about the weirdness of a datatype polymorphic in its return kind), then it could be matched in type patterns, making this strangeness possible:

``````type family ThreeBools (b :: Bool) :: Nat where
ThreeBools 'True  = 1
ThreeBools 'False = 2
ThreeBools Any    = 3``````

Bizarre! And yet this was possible in GHC 7.6 and below (modulo the closed type family, which didn’t exist yet). And it caused problems.

Of course, with `Any`, we really want it to be stuck. We don’t want `Any` to reduce to some other type, and we don’t want uses of `Any` to be errors.

2. In his work implementing a units-of-measure library via type-checker plugins, Adam Gundry uses closed type families to define type functions whose behavior is specified only via a plugin, not by GHC’s built-in type family reduction rules. He wanted empty closed type families specifically, and so implemented them within GHC. The idea is that units-of-measure with their integer powers form an abelian group, which can be solved via custom machinery but not by GHC’s rather limited left-to-right rewriting used in type family reduction.

Once again, we want a closed type family that is stuck but is not an error. The existence of multiple such things got me ruminating on all of this some time ago. I attempted to figure this all out in a post on a GHC ticket, but that didn’t get me very far. And I won’t claim to have made any real progress since that post.

### Open type families

Having a stuck open type family makes moderately more sense than a stuck closed type family. Suppose we have this:

``````> type family F2 a
> type instance F2 Int = Bool
``````

Because new equations can be added to open type families at any time, talking about `F2 Char` isn’t utterly bogus. But it might be nice to have to assert that an equation for `F2 Char` is in scope before we use it in a type.

### Injective type families

Injective type families are a new feature that just got merged with GHC head. Jan Stolarek, Simon Peyton Jones, and I had a paper at Haskell Symposium about them.

The idea is that you can put an injectivity annotation on a type family declaration, which is rather like a functional dependency. (The injective type family code in this post is not runnable, as it works only with GHC HEAD. The feature will be available with GHC 8.0, due out early 2016.)

``````type family F3 a = r | r -> a where
F3 Int  = Bool
F3 Bool = Int
F3 a    = a``````

Note that if you know the value of `F3 a`, then you can discover the value of `a`: that’s what injectivity is all about.

Now consider this:

``````type family F4 a = r | r -> a where
F4 (Maybe a) = a``````

Looks injective, right? Wrong. The problem is that, if this were injective, we could take a proof that `F4 a ~ F4 b` and get a proof of `a ~ b`. Now, consider `F4 (Maybe (F4 Int))`. This reduces to `F4 Int`. Thus, we have `F4 (Maybe (F4 Int)) ~ F4 Int`. And now, by injectivity, we get `Maybe (F4 Int) ~ Int`. That is, a `Maybe <blah>` equals an `Int`. Terrible!

So, in our implementation of injectivity, we exclude type families like `F4`. But it’s really unexpected that `F4` should be rejected (at least to me). The problem, at its core, is about partiality. Note that we needed `F4 Int` – an un-covered case – to make the bad equality proof.

## Looping type families are strange

At first blush, it seems that a looping type families cause chiefly one problem: a looping GHC. But I’m very unconcerned about a looping GHC. After all, if a user has written a non-terminating type family, that amounts to a non-terminating program to be run at compile time. No one should be very surprised when a program that contains a loop does, in fact, loop.

The problem is that looping type families can actually create effectively infinite types. GHC has various protections against actually running off to infinity, but a clever programmer can defeat these protections. And infinite types can cause unsoundness. Indeed, while writing the closed type family paper, we uncovered one such unsoundness, closing bug #8162.

However, even with #8162 gone, we still have a lingering problem. Consider

``````> type family F5 a b where
>   F5 a a = 'True
>   F5 a b = 'False
``````

This is a straightforward equality test, and it works in many cases. But if we ask `F5 a (Maybe a)` (for some unknown `a`), the type family doesn’t reduce. It really should! There is no way a finite `a` can be the same as `Maybe a`. But we can have an infinite `a`, instantiated to `MaybeInf`:

``````> type family MaybeInf where MaybeInf = Maybe MaybeInf
``````

`MaybeInf` here is a perfectly well-formed nullary (0-argument) type family. If the `a` in `F5 a (Maybe a)` were instantiated with `MaybeInf`, then reducing to `'True` would be correct. And thus GHC can’t reduce `F5 a (Maybe a)` to `'False`, as one might like.

And this isn’t purely an academic concern. I don’t have links I can share, but this has bitten real programmers doing real work. Once again, the problem boils down to partiality: GHC must be conservative about reducing closed type families (such as `F5`) because of the possibility that someone, somewhere will write a non-terminating type family. If `F5 a (Maybe a)` were to reduce to `'False`, I’m confident that a clever enough person could use this behavior to implement `unsafeCoerce`. Similar concerns prevent certain type families from being considered injective – see the paper for the example. If we could somehow be sure that there were no non-terminating type families, these concerns would go away.

## Contrast with functional dependencies

Simply, functional dependencies don’t suffer from these problems. Let’s look at the issue around injectivity, as it’s representative. First, recast in terms of functional dependencies:

``````> class F4' a r | a -> r, r -> a
> instance F4' (Maybe a) a
``````

Note the bidirectional functional dependency. The left-to-right dependency is the built-in dependency inherent in type families. The right-to-left dependency arises from the injectivity annotation.

Our problem case above was `F4 (Maybe (F4 Int))`. This case maps to

``````fundep :: (F4' Int f4_int, F4' (Maybe f4_int) result) => ()
fundep = ()``````

``````Couldn't match type ‘Maybe f4_int’ with ‘Int’
arising from a functional dependency between:
constraint ‘F4' Int f4_int’
arising from the type signature for
fundep :: (F4' Int f4_int, F4' (Maybe f4_int) result) => Int
instance ‘F4' (Maybe a) a’``````

So GHC notices that there is a problem. But even if it didn’t, we’d be OK. That’s because the `F4' Int f4_int` constraint essentially asserts that there is some value for `F4' Int`. Of course, there isn’t any such value, and so `fundep` would never be callable. Indeed, we can’t write any more instances of `F4'`, as anything we could write would violate the dependencies.

The bottom line here is that there is no way to get into trouble here because the constraints necessary to use bogus functional dependencies can never be satisfied.

So, what should we do about all of this? I’m inspired by the non-problems with functional dependencies. I thus propose the following. (Actually, I don’t. This is too radical. But I’m trying to provoke discussion. So take this all with a heap of salt.)

1. Remove unassociated open type families. They’re quite awkward anyway – either they should be closed and define some sort of concrete function, or they will need a class constraint nearby to be useful.
2. Require class constraints to use associated type families. That is, if `F a` is associated with a class `C a`, writing `F Int` without a `C Int` constraint in scope is an error. This means that the use of an associated type family in the right-hand side of another associated type family will require the class constraint in the instance context.
3. Require closed type families to be total, with an improved termination checker. Naturally, closed type families would be unable to call partial type families.
4. Devise a mechanism to associate a closed type family with a class. This would allow for partial closed type families, which indeed are useful.

The bottom line to all of this is that any partial type family should be associated with a class constraint. If the constraint can be satisfied, this means that the type family is defined and well-behaved at the arguments provided. With this, I believe all the problems above go away.

Another way to see this proposal is that functional dependencies are the Right Way to do this, and type families (at least partial ones) are wonky. But type families have much better syntax, and I’d like to keep that.

So, where to go from here? I’m not sure. This proposal is pretty radical, but it could potentially be implemented with the requisite deprecation period, etc. But, mostly, I’m curious to see what conversation this all provokes.