This post is a literate Haskell file. Copy and paste it into a .lhs file, and you’re off to the races. But first, of course, a little throat-clearing:

```
> {-# LANGUAGE TemplateHaskell, PolyKinds, DataKinds, TypeFamilies,
> ScopedTypeVariables, GADTs, StandaloneDeriving, RankNTypes,
> TypeOperators #-}
>
> import Data.Singletons.TH
> import Unsafe.Coerce -- don't hate me yet! keep reading!
```

The singletons library was developed as part of the research behind this paper, published at the Haskell Symposium, 2012.

Singleton types are a technique for “faking” dependent types in non-dependent languages, such as Haskell. They have been known for some time – please see the original research paper for more history and prior work. A singleton type is a type with exactly one value. (Note that `undefined`

is *not* a value!) Because of this curious fact, learning something about the value of a singleton type tells you about the type, and vice versa.

A few lines of example is worth several paragraphs of awkward explanation, so here we go (the underscores are to differentiate from our second version, below):

```
> data Nat_ = Zero_ | Succ_ Nat_
> data SNat_ :: Nat_ -> * where
> SZero_ :: SNat_ Zero_
> SSucc_ :: SNat_ n -> SNat_ (Succ_ n)
>
> plus_ :: Nat_ -> Nat_ -> Nat_
> plus_ Zero_ n = n
> plus_ (Succ_ m) n = Succ_ (plus_ m n)
>
> type family Plus_ (m :: Nat_) (n :: Nat_) :: Nat_
> type instance Plus_ Zero_ n = n
> type instance Plus_ (Succ_ m) n = Succ_ (Plus_ m n)
>
> sPlus_ :: SNat_ m -> SNat_ n -> SNat_ (Plus_ m n)
> sPlus_ SZero_ n = n
> sPlus_ (SSucc_ m) n = SSucc_ (sPlus_ m n)
```

Here, `SNat_`

defines a singleton family of types. Note that, for any `n`

, there is exactly one value of `SNat_ n`

. This means that when we pattern-match on a `SNat_`

, we learn about the *type* variable along with the term-level variable. This, in turn, allows for more type-level reasoning to show correctness for our code. See the paper for more explanation here.

Using singletons, we can pretend Haskell is dependently typed. For example, I have written a richly-typed database client and a provably* correct sorting algorithm using singletons.

*Of course, Haskell is not a *total* language (that is, it has `undefined`

and friends), so any proof should be viewed with suspicion. More accurately, it is a proof of *partial* correctness. When the sorting algorithm compiles in finite time and when it runs in finite time, the result it produces is indeed a sorted list.

The above definitions are neat and all, but they sure are annoying. Haskell’s built-in promotion mechanism duplicates `Nat_`

at the type and kind level for us, but we have to be responsible for all *three* versions of `plus_`

. Let’s use the singletons library to help us!

```
> $(singletons [d|
> data Nat = Zero | Succ Nat
> deriving Eq
>
> plus :: Nat -> Nat -> Nat
> plus Zero n = n
> plus (Succ m) n = Succ (plus m n)
> |])
```

The code above is a Template Haskell splice, containing a call to the `singletons`

function (exported from `Data.Singletons.TH`

). That function’s one argument is a Template Haskell quote, containing the abstract syntax tree of the definitions in the quote. The singletons library chews on those definitions to produce all the definitions above, and more.

To demonstrate the usefulness of singletons, we’ll need length-indexed vectors:

```
> data Vec :: * -> Nat -> * where
> VNil :: Vec a Zero
> VCons :: a -> Vec a n -> Vec a (Succ n)
>
> instance Show a => Show (Vec a n) where
> show VNil = "VNil"
> show (VCons h t) = show h ++ " : " ++ show t
```

Now, we can write a well-typed `vReplicate`

function:

```
> vReplicate :: SNat n -> a -> Vec a n
> vReplicate SZero _ = VNil
> vReplicate (SSucc n') x = VCons x (vReplicate n' x)
```

This works as expected:

```
ghci> vReplicate (SSucc (SSucc (SSucc SZero))) "hi"
"hi" : "hi" : "hi" : VNil
```

Even better, we can make the numerical argument to `vReplicate`

*implicit*, using `SingI`

. The `SingI`

class is very simple:

```
class SingI (a :: k) where
sing :: Sing a
```

A dictionary for (that is, a class constraint of) `SingI`

just holds an implicit singleton. (See the paper for more info about `Sing`

, which I won’t go over in this post.) Now, we can define the improved `vReplicateI`

:

`> vReplicateI :: forall a n. SingI n => a -> Vec a n > vReplicateI x = > case sing :: SNat n of > SZero -> VNil > SSucc n' -> VCons x (withSingI n' $ vReplicateI x)`

`ghci> vReplicateI "hi" :: Vec String (Succ (Succ (Succ Zero))) "hi" : "hi" : "hi" : VNil`

Cool!

At this point, you may also want to check out the generated documentation for the singletons library to see a little more of what’s going on. The rest of this post will focus on some of the strange and wonderful new features of v0.9.

`SingI`

dictionariesPrevious versions of singletons had definitions like this:

```
data instance Sing (n :: Nat) where
SZero :: Sing Zero
SSucc :: SingI n => Sing n -> Sing (Succ n)
```

The key difference here is the `SingI`

constraint in the `SSucc`

constructor. This constraint meant that if you pattern-matched on an `SNat`

and got a `SSucc`

, you would get *both* an explicit singleton for (n-1) and an implicit singleton (that is, a `SingI`

dictionary) for (n-1). This was useful, and it meant that the old version of `vReplicateI`

wouldn’t need the `withSingI`

business. But, it also meant that *every* node in a singleton had duplicated information. Since this was true at *every* (interior) node, singleton trees were **exponentially** larger than necessary. Yuck. Somehow, none of my advisor, our reviewers, nor me noticed this before. My advisor (Stephanie Weirich) and I somehow convinced ourselves that the duplication would lead to trees that were double the necessary size, which we deemed acceptable. Oops!

In singletons 0.9, though, a singleton just contains the explicit version. We then needed a way to convert from explicit ones to implicit ones. To do this, I used a trick I learned from Edward Kmett at ICFP this year: take advantage of the fact that classes with exactly one method (and no superclass) are represented solely by the method, and nothing else. Thus, a dictionary for `SingI`

is actually the same, in memory, as a real singleton! To go from explicit to implicit, then, we just have to wave a magic wand and change the type of a singleton from `Sing a`

to `SingI a`

.

The magic wand is easy; it’s called `unsafeCoerce`

. What’s a little trickier is the fact that, of course, we can’t have dictionaries in the same place as normal datatypes in Haskell code. The first step is to create a way to explicitly talk about dictionaries. We make a datatype wrapper:

```
> data SingInstance (a :: k) where
> SingInstance :: SingI a => SingInstance a
```

To call the `SingInstance`

constructor, we need to have a `SingI a`

lying around. When we pattern-match on a `SingInstance`

, we get that `SingI a`

back. Perfect.

Now, we need a way to call the `SingInstance`

constructor when we have an *explicit* singleton. Naively, we could imagine saying something like

`... (unsafeCoerce SingInstance :: Sing a -> SingInstance a) ...`

because, after all, `SingI a => SingInstance a`

is the same under the hood as `Sing a -> SingInstance a`

. The problem here is that as soon as we say `SingInstance`

in Haskell code, GHC helpfully tries to solve the arising `SingI a`

constraint – something we do *not* want here. (Once the `SingInstance`

is instantiated, its type is just `SingInstance a`

, which is *not* the same as `Sing a -> SingInstance a`

!) The answer is to use a `newtype`

the prevents instantiation:

```
> newtype DI a = Don'tInstantiate (SingI a => SingInstance a)
```

Now, after a call to the `Don'tInstantiate`

constructor, GHC will refrain from instantiating. Great – now we just need to connect the dots:

```
> singInstance :: forall (a :: k). Sing a -> SingInstance a
> singInstance s = with_sing_i SingInstance
> where
> with_sing_i :: (SingI a => SingInstance a) -> SingInstance a
> with_sing_i si = unsafeCoerce (Don'tInstantiate si) s
```

It’s dirty work, but someone’s got to do it. And it saves us from exponential blow-up, so I’d say it’s worth it. The `withSingI`

function we saw used above is just a convenient wrapper:

```
> withSingI :: Sing n -> (SingI n => r) -> r
> withSingI sn r =
> case singInstance sn of
> SingInstance -> r
```

A previous post on this blog discussed the different between Boolean equality and propositional equality. Previous versions of singletons contained the `SEq`

“kind class” to use Boolean equality on singleton types. Singletons 0.9 also contains the `SDecide`

class to allow for decidable propositional equality on singleton types.

Before we dive right into `SDecide`

though, let’s review a few new definitions in the standard library (`base`

package) shipping with GHC 7.8. Under `Data.Type.Equality`

, we have these handy definitions:

```
data a :~: b where
Refl :: a :~: a
gcastWith :: (a :~: b) -> ((a ~ b) => r) -> r
gcastWith Refl x = x
class TestEquality (f :: k -> *) where
testEquality :: f a -> f b -> Maybe (a :~: b)
```

The idea behind the `TestEquality`

class is that it should classify datatypes whose definitions are such that we can (perhaps) learn about the equality of type variables by looking at terms. Singletons are the chief candidates for instances of this class. `Typeable`

almost is, but it’s at the wrong kind – `k -> Constraint`

instead of `k -> *`

. (See the new function `Data.Typeable.eqT`

for comparison.)

`SDecide`

takes the `TestEquality`

idea one step further, providing full decidable propositional equality. See the previous post on propositional equality for more background.

```
data Void
type Refuted a = a -> Void
data Decision a = Proved a
| Disproved (Refuted a)
class (kproxy ~ 'KProxy) => SDecide (kproxy :: KProxy k) where
(%~) :: forall (a :: k) (b :: k). Sing a -> Sing b -> Decision (a :~: b)
```

We can now use `(%~)`

to (perhaps) produce an equality that GHC can use to complete type inference. Instances of `SDecide`

(and of `SEq`

, for that matter) are generated for any datatype passed to the `singletons`

Template Haskell function that derive `Eq`

. Or, you can use other functions exported by `Data.Singletons.TH`

to create these instances; see the generated documentation.

While the improvements in v0.9 are substantial, there is still much distance to cover. In particular, I conjecture that *almost any* function definable at the term level can be promoted to the type level. The exceptions would be unpromotable datatypes (like `Double`

, `IO`

, or GADTs) and higher-rank functions (there are no higher-rank kinds). Short of that, I think it’s all within reach.

How?

- Closed type families allow for overlapping patterns.

- Defunctionalization allows for unsaturated type-level functions.

- The th-desugar library desugars Haskell’s fancy constructs into a manageable set.

- Case statements can be straightforwardly encoded using lambda lifting.

But, I don’t seem to have the time to put this all into action. If you’re interested in taking some or all of this on, I’d be very happy to collaborate. I believe some interesting research-y things might come out of it all, too, so there might even be something publishable in it. Drop me a line to discuss!

]]>

**Roles fix a problem in GHC’s type system that has existed for years.**

The problem is the combination of `GeneralizedNewtypeDeriving`

(henceforth called GND) and type families. An example is always best:

```
> {-# LANGUAGE GeneralizedNewtypeDeriving, TypeFamilies, StandaloneDeriving,
> MultiParamTypeClasses, GADTs #-}
>
> module Roles where
>
> class Incrementable a where
> incr :: a -> a
>
> instance Incrementable Int where
> incr x = x + 1
>
> newtype Age = MkAge Int
> deriving Incrementable
```

The idea is that, because the declaration for `Age`

says that any `Age`

is really just an `Int`

in disguise, we can just re-use the instance for `Incrementable Int`

and make it into an instance for `Incrementable Age`

. Internally, `Age`

has the exact same representation as `Int`

, so GHC can really just reuse the methods that were defined for `Int`

.

So far, so good. This makes good sense, allows for less boilerplate code, and is efficient at runtime. Yet, a problem lurks if we start using some type families:

```
> type family IsItAgeOrInt a
> type instance IsItAgeOrInt Int = Bool
> type instance IsItAgeOrInt Age = Char
>
> class BadIdea a where
> frob :: a -> IsItAgeOrInt a
>
> instance BadIdea Int where
> frob x = (x > 0)
>
> deriving instance BadIdea Age
```

Calling `frob`

on an `Int`

produces nothing strange:

```
ghci> frob 5
True
ghci> frob 0
False
ghci> frob (-3)
False
```

But, what happens if we call `frob`

on an `Age`

? According to `frob`

’s type, calling `frob`

on an `Age`

should produce a `IsItAgeOrInt Age`

– that is, a `Char`

. But, we have reused the definition of `frob`

for `Int`

in the instance for `Age`

! The result: general unhappiness:

```
ghci> frob (MkAge 5)
'\-1152921504589753323'
ghci> frob (MkAge 0)
'\4375976000'
ghci> frob (MkAge (-3))
'\4375976000'
```

We’ve broken the type system.

This problem is fairly well-known. The GHC Trac has 3 bugs from it (#1496, #4846, and #7148), there was a POPL paper about it, and at least one blog post. The bug – which rightly is a flaw of the theory, not the implementation – is one of the reasons that modules with GND enabled are not considered part of the Safe Haskell subset. (The other reason is that GND can be used to break module abstraction, a subject which is not considered further here. See ticket #5498.)

The solution comes from that POPL paper: assign so-called *roles* to type variables to constrain how those type variables are used.

Why precisely is it a bad idea to say `deriving instance BadIdea Age`

? Because a method in that class uses a type family in its type, and type families can tell the difference between `Int`

and `Age`

. The GND mechanism, though, pretends that `Int`

and `Age`

are the same. Thus, the two bits induce GHC to experience some cognitive dissonance, and we all suffer.

Roles label the type variables of datatypes, classes, and vanilla type synonyms indicating whether or not these all uses of these variables respect the equality between a newtype and its representation. If all uses of a variable do respect newtype-induced equalities, we say that the variable’s role is `R`

(for “representational”). If it might be possible for a use of the variable to spot the difference between `Age`

and `Int`

, we say that the variable’s role is `N`

(for “nominal”). There is a third role, `P`

for “phantom”, which I get to later.

In our example, the parameter to `Incrementable`

would have role `R`

, while `BadIdea`

would get role `N`

. Because of these role assignments, GHC HEAD will refuse to compile the code above. It issues this error:

```
Can't make a derived instance of ‛BadIdea Age’
(even with cunning newtype deriving):
it is not type-safe to use GeneralizedNewtypeDeriving on this class;
the last parameter of ‛BadIdea’ is at role N
In the stand-alone deriving instance for ‛BadIdea Age’
```

How do we know what role to use for what parameter? We use role inference, of course! Role inference is actually quite straightforward. GHC will look at all uses of a type variable in a datatype, class, or vanilla type synonym definition and see if any of those uses are at role `N`

. If any uses are at role `N`

, then the variable itself must be at role `N`

. Otherwise, (if the variable is used at all) it gets role `R`

. The base cases are type families, `(~)`

, and certain type applications. All arguments to type families naturally get role `N`

, as do the two arguments to the type equality operator `(~)`

. (This is because `(~)`

will *not* say that `Age`

and `Int`

are equal.) Because GADTs are internally represented using `(~)`

, any GADT-like parameter will also be at role `N`

.

Above, I said that certain type applications cause a role to be forced to `N`

. This is a little subtler than the other cases, and needs an example:

`> class Twiddle a b where > meth :: a Int -> a b > > instance Twiddle [] Int where > meth = id > > data GADT a where > GInt :: Int -> GADT Int > deriving instance Show (GADT a) > > instance Twiddle GADT Int where > meth = id > > deriving instance Twiddle GADT Age > > weird :: GADT Age > weird = meth (GInt 5)`

`ghci> weird GInt 5 ghci> :t weird weird :: GADT Age`

What’s going on here? As usual, the GND mechanism just appropriates the definition for `meth`

wholesale for the instance for `Age`

. That method is just the identity function. But, in the `Age`

instance for `Twiddle`

, the `meth`

method has type `GADT Int -> GADT Age`

– clearly *not* an identity function. Yet, it still works just fine, creating the ill-typed `weird`

. A little more pushing here can create `unsafeCoerce`

and segfaults.

But we already knew that GADTs behaved strangely with respect to GND. The difference in this case is that the derived class, `Twiddle`

, *does not mention any GADTs*. The solution is that, whenever a type variable is used as the argument to another type variable, such as `b`

in the definition of `Twiddle`

, that variable gets role `N`

. The variable `a`

has nothing unusual about it, so it gets role `R`

.

There is ongoing work (not mine) on implementing newtype wrappers, as described here. Newtype wrappers will allow you to write code that “lifts” a newtype equality (such as the one between `Age`

and `Int`

) into other types (such as equating `[Age]`

with `[Int]`

). These lifted equalities would have no runtime cost. This is quite different than the situation today: although `Age`

and `Int`

can be converted for free, converting `[Age]`

to `[Int]`

requires iterating down the list.

With that application in mind, consider this type:

```
> data Phantom a = MkPhantom Int
```

Should we be able to convert `[Phantom Bool]`

to `[Phantom Char]`

? Sure we should. Labeling `a`

’s role as `P`

allows for this. The technical details of how `P`

works internally are beyond the scope of this post (but you could see here for starters), but we are guaranteed that any variable at role `P`

is never actually used in the representation of a datatype.

Sometimes, an implementation of an idea doesn’t quite match the abstraction. For example, the definition of `Ptr`

, GHC’s type for pointers that might be used with foreign functions, is this:

`data Ptr a = Ptr Addr#`

If left to its own devices, GHC would infer role `P`

for the type parameter `a`

, because `a`

is not used in the definition of `Ptr`

. Yet, that goes against what we have in mind – we don’t really want to convert `Ptr Int`

s to `Ptr (Bool -> Bool)`

s willy-nilly. So, we use a *role annotation* (enabled with `-XRoleAnnotations`

) which allows the programmer to override GHC’s inference, thus:

`data Ptr a@R = Ptr Addr#`

This role annotation (the `@R`

) forces `a`

’s role to be `R`

, as desired. Note that you can’t force an unsafe role, such as requiring `BadIdea`

’s parameter to be at role `R`

. Role annotations only allow for stricter roling.

Hopefully, roles won’t affect your code too much. But, it is possible that some code that has previously worked now will not. Although code breakage is rightly seen as a terrible thing, it’s actually intentional here: much of the broken code probably has a type error lurking somewhere. In my experience, there are two problems that may arise:

- Uses of GND that worked previously and you know are safe now fail. First off, are you
*sure*that it’s safe? Sometimes, the answer is a resounding “yes!”, but GHC still won’t let you use GND. You will have to write an instance yourself, or provide other wrappers. In the design of this feature, we have considered adding a way to use GND unsafely, but we’re not sure what the demand will be. Do you need to use GND unsafely? Let me know. - In
`.hs-boot`

files (see the relevant section of GHC’s user manual), all declarations must match exactly with the declarations in the corresponding`.hs`

file. This includes roles. However, it is acceptable to leave out definitions in a`.hs-boot`

file. By default, roles are guessed to be`R`

in`.hs-boot`

files. If you have an`.hs-boot`

file that declares a datatype or class whose definition is omitted and whose parameters are not at role`R`

, you will have to add a role annotation. I’m hoping this doesn’t come up too often.

Separately from breakage, writers of libraries may also want to think about whether a role annotation would be helpful in their declarations. The best reason I can think of to do this is to prevent users from using GND on a class. For example, a `Set`

uses a type’s `Ord`

instance to order the data internally. But, the definition of `Set`

does not use type families or other features forcing an `N`

role. So, `Set`

’s parameter will be at role `R`

. Yet, if GND is used to lift a class mentioning `Set`

from, say, `Int`

to `Age`

, the `Ord`

instances for `Int`

and `Age`

might be different, and bad behavior will result. Note that this “bad behavior” would not be a type error, just incorrect runtime behavior. (If it were a type error, roles would hopefully stop this from happening!) The upshot of this is that `Set`

’s parameter should be labeled with role `N`

.

To learn more about roles in GHC, you can check out the up-to-date wiki pages on the subject, for users and for implementers.

If you run into problems, do let me know!

]]>

This post is a literate Haskell file (though there really isn’t that much code). Paste it into a *.lhs* file and load it into GHCi. Because the post uses branched instances, you will need the HEAD version of GHC. (Branched instances will be included in GHC 7.8, but not before.)

```
> {-# LANGUAGE TypeFamilies, DataKinds, GADTs, TypeOperators #-}
```

Our examples will be over `Bool`

s, and we need some way to get GHC to evaluate our type families. The easiest way is to use the following singleton GADT:

```
> data SBool :: Bool -> * where
> SFalse :: SBool False
> STrue :: SBool True
```

When compiling type family instances, GHC checks the instances for conflicts. To know if two instances conflict (i.e., could both match the same usage site), GHC unifies the two left-hand sides. For example, the following code is bad and is rejected:

```
type family F x
type instance F Int = Bool
type instance F a = Double
```

Compiling the above instances gives the following error message:

```
Conflicting family instance declarations:
F Int -- Defined at ...
F a -- Defined at ...
```

This check is a good thing, because otherwise it would be possible to equate two incompatible types, such as `Int`

and `Bool`

.

Here is a nice example of how coincident overlap is useful:

```
> type family (x :: Bool) && (y :: Bool) :: Bool
> type instance False && a = False -- 1
> type instance True && b = b -- 2
> type instance c && False = False -- 3
> type instance d && True = d -- 4
```

Although the first two equations fully define the `&&`

operation, the last two instances allow GHC to reduce a use of `&&`

that could not otherwise be reducible. For example:

```
> and1 :: SBool a -> SBool (True && a)
> and1 x = x
>
> and2 :: SBool a -> SBool (a && True)
> and2 x = x
```

`and1`

uses the second instance of `&&`

, but `and2`

requires the fourth instance. If we comment that instance out, `and2`

fails to compile, because GHC cannot figure out that `a && True`

must be `a`

for all values of `a`

. For various good reasons, perhaps to be explored in another post, GHC does *not* do case analysis on types during type inference.

How does GHC know that overlap is coincident? During the conflict check, GHC looks for a substitution that unifies two potentially-conflicting instances. In our case, the fourth and first instances would conflict under the substitution `{a |-> True, d |-> False}`

. However, after finding the unifying substitution, GHC checks the right-hand sides under that same substitution. If they are the same, then GHC considers the overlap to be coincident and allows the instance pair. In our case, applies the substitution `{a |-> True, d |-> False}`

to `False`

and `d`

and discovers that both are `False`

, and so the instances are allowed.

When thinking about branched instances and coincident overlap, there are two possibilities to consider: coincident overlap within a branched instance and coincident overlap among two separate branched instances. Let’s consider the first case here.

Imagine we define `||`

analogously to `&&`

, but using one branched instance:

```
> type family (x :: Bool) || (y :: Bool) :: Bool
> type instance where
> False || a = a -- 1
> True || b = True -- 2
> c || False = c -- 3
> d || True = True -- 4
```

Now, let’s consider simplifying the type `e || False`

. The first two branches don’t match, but the third does. Now, following the rule for branched instance simplification (as stated in the Haskell wiki), we check to see if any previous branches might be applicable to `e || False`

, for any possible instantiation of `e`

. The first branch certainly might apply, and so `e || False`

fails to simplify. This is surely counterintuitive, because the third branch matches `e || False`

exactly!

Just to prove this behavior, I tried running this code through GHC:

```
bar :: SBool a -> SBool (a || False)
bar x = x
```

Here is the error:

`Couldn't match type ‛a’ with ‛a || 'False’`

At first blush, it seems I’ve missed something important here in the implementation — allowing coincident overlap within a branched instance. But, there is a major problem with such overlap in this setting. Let’s think about how coincident overlap would work in this setting. After selecting the third branch to simplify `e || False`

(with the substitution `{c |-> e}`

), GHC checks to see if any previous branch could be applicable to `e || False`

. The first branch, `False || a`

, unifies with `e || False`

, so it might be applicable later on. The unifying substitution is `{a |-> False, e |-> False}`

. Now, if we wanted to check for coincident overlap, we would apply *both* substitutions (`{c |-> e}`

and `{a |-> False, e |-> False}`

) to the right-hand sides. In this case, we would see that both right-hand sides would become `False`

, and it seems we should allow the simplification of `e || False`

to `e`

.

Let’s try a harder case. What if we want to simplify `(G f) || False`

, for some type family `G`

? The third branch matches, with the substitution `{c |-> G f}`

. Now, we check earlier branches for applicability. The first branch is potentially applicable, if `G f`

simplifies to `False`

. But, we can’t produce a substitution over type variables to witness to check right-hand sides. In this case, it wouldn’t be hard to imagine a substitution like `{(G f) |-> False}`

, but that’s a slippery slope to slide down. What if `f`

appears multiple times in the type, perhaps under different type family applications? How do we deal with this? There may well be an answer, but it would be subtle and likely quite fragile from the programmer’s point of view. So, we decided to ignore the possibility of coincident overlap within a branch. We were unable to come up with a compelling example of why anyone would want this feature, it seemed hard to get right, and we can just write `||`

using separate instances, anyway.

Consider the following (contrived) example:

```
type family F (x :: Bool) (y :: Bool) :: *
type instance where
F False True = Int
F a a = Int
F b c = Bool
type instance F d True = Int
```

Is this set of instances allowable? Is that your final answer?

I believe that this set of instances wouldn’t produce any conflicts. Anything that matches `F d True`

would have to match one of the first two branches of the branched instance, meaning that the right-hand sides coincide. However, it is difficult to reason about such cases, for human and GHC alike. So, for the sake of simplicity, we also forbid coincident overlap whenever even one instance is branched. This means that

`type instance F Int = Bool`

and

`type instance where F Int = Bool`

are very subtly different.

This post was partially inspired by Andy Adams-Moran’s comment in which Andy poses this example, paraphrased slightly:

```
> data T a
> type family Equiv x y :: Bool
> type instance where
> Equiv a a = True -- 1
> Equiv (T b) (T c) = True -- 2
> Equiv (t d) (t e) = Equiv d e -- 3
> Equiv f g = False -- 4
```

Alas, this does not work as well as one would like. The problem is that we cannot simplify, say, `Equiv (T a) (T b)`

to `True`

, because this matches the second branch but might later match the first branch. Simplifying this use would require coincident overlap checking within branched instances. We could move the first branch to the third position, and that would help, but not solve the problem. With that change, `Equiv x x`

would not simplify, until the head of `x`

were known.

So, is this the “compelling example” we were looking for? Perhaps. Andy, why do you want this? Can your use case be achieved with other means? Do you (anyone out there) have a suggestion for how to deal with coincident overlap within branched instances in a simple, easy-to-explain manner?

]]>

`singletons`

library came from. This library allows you to write some dependently typed code in Haskell, using singleton types. I didn’t invent this idea, but I did write a nice library to remove some of the pain of using this encoding. SHE can be considered an ancestor of `singletons`

.
At my Haskell Symposium (2012) presentation of the singletons work, an attendee asked if singleton generation works for higher-order functions, like `map`

. I innocently answered “yes”, at which point Conor McBride, sitting in the back, stood up and said “I don’t believe you!” I wasn’t lying — `singletons`

does indeed handle higher-order functions. However, Conor’s skepticism isn’t unfounded: a “singletonized” higher-order function isn’t so useful.

This blog post explores why singletonized higher-order functions aren’t useful and suggests defunctionalization as a way to fix the problem.

Before we get too much further, this blog post is a literate Haskell file, so we have some necessary throat-clearing:

```
> {-# LANGUAGE TemplateHaskell, DataKinds, PolyKinds, TypeFamilies,
> GADTs, FlexibleContexts, RankNTypes, TypeOperators #-}
> import Prelude hiding (map)
> import Data.Singletons
```

I should also warn that some of the code rendered as Haskell in this blog post does not have bird-tracks. This code is *not* intended as part of the executable code. I should finally note that this code does not compile with the current HEAD of GHC (but it does compile with 7.6.1, at least). The new ambiguity check overzealously flags some of the code here as inherently ambiguous, which it is not. I have filed bug report #7804.

First off, what is a singleton? I will give a brief introduction here, but I refer you to the singletons paper on the subject. A singleton is a type with exactly one value. Let’s make a singleton for the natural numbers:

```
> data Nat = Zero | Succ Nat
> $(genSingletons [''Nat])
```

The second line above generates the following singleton definition for `Nat`

:

```
data SNat :: Nat -> * where
SZero :: SNat Zero
SSucc :: SNat n -> SNat (Succ n)
```

(Well, it doesn’t quite generate that, but let’s pretend it does. See the paper [or use `-ddump-splices`

!] for more details.) According to this definition, there is exactly one value for every `SNat n`

. For example, the type `SNat (Succ Zero)`

has one value: `SSucc SZero`

. This is interesting because it means that once we identify a value, say in a `case`

expression, we also know a type index. This interplay between term-level matching and type-level information is what makes singletons enable something like dependently typed programming.

The `singletons`

library provides a singleton for `[]`

, but with alphanumeric names. We can pretend this is the definition:

```
data SList :: [k] -> * where
SNil :: SList '[]
SCons :: Sing h -> SList t -> SList (h ': t)
```

The `Sing`

in there (the first parameter to `SCons`

) is defined thus:

`data family Sing (a :: k)`

Using a data family allows GHC to choose the correct singleton type, depending on the kind `k`

. An instance of this family is defined for every singleton type we create. So, actually, the list type built into the `singletons`

library is more like

```
data instance Sing (list :: [k]) where
SNil :: Sing '[]
SCons :: Sing h -> Sing t -> Sing (h ': t)
```

The `singletons`

library also provides a synonym `SList`

to refer to this instance of the `Sing`

family. Again, you may find more clarity in the singletons paper, which spends a little more time drawing all of this out.

We can singletonize more than just datatypes. We can singletonize functions. Consider the following predecessor function on `Nat`

s, defined in such a way that we get the singleton definition generated for us:

```
> $(singletons [d|
> pred :: Nat -> Nat
> pred Zero = Zero
> pred (Succ n) = n
> |])
```

A definition of `pred`

that works on singleton `Nat`

s is generated for us. It looks something like

```
sPred :: SNat n -> SNat ???
sPred SZero = SZero
sPred (SSucc n) = n
```

The problem is those pesky `???`

marks. Because the type indices of a singleton mirror the computation of singleton values, every function on singletons must be mirrored at the type level. So, to define `sPred`

, we must have the type family `Pred`

as well:

```
type family Pred (n :: Nat) :: Nat
type instance Pred Zero = Zero
type instance Pred (Succ n) = n
sPred :: SNat n -> SNat (Pred n)
...
```

The `singletons`

library generates both the type-level `Pred`

and the singletonized `sPred`

.

But what about `map`

?

```
> $(singletons [d|
> map :: (a -> b) -> [a] -> [b]
> map _ [] = []
> map f (h : t) = f h : map f t
> |])
```

The `singletons`

library generates these definitions (but with some extra class constraints that don’t concern us):

```
type family Map (f :: k1 -> k2) (list :: [k1]) :: [k2]
type instance Map f '[] = '[]
type instance Map f (h ': t) = ((f h) ': (Map f t)
sMap :: (forall a. Sing a -> Sing (f a)) -> Sing list -> Sing (Map f list)
sMap _ SNil = SNil
sMap f (SCons h t) = SCons (f h) (sMap f t)
```

Whoa! What’s the bizarre type doing in `sMap`

? The `forall`

declares that the function passed into `sMap`

must be valid for any `a`

. That’s not so strange, when we think about the fact the index `a`

must be isomorphic to the term `Sing a`

. We’re used to having functions that work for any term. Here, because of the relationship between term values and type indices, the function must also work for any type index `a`

. This is particularly important, because `Map`

will apply `f`

to all the `a`

s in the list `list`

. If we leave off the `forall`

, the function won’t type check.

This is all well and good, but this definition of `sMap`

isn’t useful. This is because the type of that first parameter is quite restrictive. We must have a function of that type, and * f must be inferrable*. Let’s look at some examples. We can write the following just fine:

```
> sOne = SSucc SZero
> sTwo = SSucc sOne
> sThree = SSucc sTwo
> sNums = SCons sOne $ SCons sTwo $ SCons sThree SNil -- [1,2,3]
>
> two_three_four = sMap sSucc sNums
```

(`sSucc`

is a so-called “smart” constructor. It is equivalent to `SSucc`

, but adds extra class constraints that don’t concern us here. See Section 3.1 of the singletons paper.) The type of `SSucc`

is `forall a. Sing a -> Sing (Succ a)`

, so the call to `sMap`

type checks just fine. `SSucc`

is perfect here. Let’s try something else:

`zero_one_two = sMap sPred sNums`

The type of `sPred`

is `forall n. Sing n -> Sing (Pred n)`

, as written above, so one would think all is good. All is not good. The problem is that `Pred`

is a *type family*, not a regular type constructor like good old `Succ`

. Thus, GHC does not (and cannot, with good reason) infer that `f`

in the type of `sMap`

should be `Pred`

:

```
Couldn't match type `Pred t1' with `t t1'
Expected type: Sing Nat t1 -> Sing Nat (t t1)
Actual type: Sing Nat t1 -> Sing Nat (Pred t1)
```

The reason this inference is bogus is that GHC will not let a type variable unify with a type family. In its internal constraint-solving machinery, GHC assumes that all type variables are both *injective* and *generative*. Injective means that from an assumption of `t a ~ t b`

, (where `~`

denotes type equality) we can derive `a ~ b`

. Generative means that from an assumption of `t a ~ s b`

, we can derive `t ~ s`

. Type families, in general, have neither property. So, GHC won’t let a type variable unify with a type family.

This problem — called the saturation requirement of type families — is what Conor was thinking about when he disbelieved that `singletons`

handled `map`

.

Over lunch while at ICFP, I had the good fortune of sitting with Tillmann Rendel, and we got to talking about this problem. He suggested that I think about defunctionalization. I have thought about this, and I think it’s the answer to the problem.

Defunctionalization is an old technique of dealing with higher-order functions. The idea is that, instead of making a closure or other pointer to code, represent a function with some symbol that can be looked up and linked to the code later. Danvy and Nielsen wrote a more recent paper explaining how the whole thing works. One drawback of the technique that they outline is that defunctionalization tends to require whole-program translation. That is, the transformation requires looking at the entire codebase to do the translation. This is generally necessary so that the table of function “symbols”, encoded as an algebraic datatype, can be matched on. However, in Haskell, we have *open* type functions, so this problem does not limit us.

Another drawback of defunctionalization is that it is generally poorly-typed. If we are just using some symbol to denote a function, how can we be sure that a function application is well-typed? Pottier and Gauthier address this issue in their paper on the topic by using generalized algebraic datatypes (GADTs). But, given the way type families work in Haskell, we don’t need the power of GADTs to do this for us.

At the heart of any defunctionalization scheme is an `apply`

function:

`type family Apply (f :: k1 -> k2) (a :: k1) :: k2`

But wait, we don’t really want `Apply`

to have that kind, because then we would have to pass `Pred`

in as the function, and `Pred`

all on its own is unsaturated. What we need is some symbol that can represent `Pred`

:

```
type family Apply (f :: *) (a :: k1) :: k2
data PredSym :: *
type instance Apply PredSym n = Pred n
```

This is progress. We can now pass `PredSym`

around all by itself, and when we apply it, we get the desired behavior. But, this is weakly kinded. We would like to be able to define many symbols akin to `PredSym`

, and we would like GHC to be able to make sure that we use these symbols appropriately — that is, we don’t say `Apply PredSym '[Int, Bool]`

.

Yet, we still want to be able to create new symbols at will. So, we want to use `data`

declarations to create the symbols. Thus, the kind of these symbols must end in `-> *`

. But, we have complete freedom as to what appears to the left of that arrow. We will use this definition to store the kinds:

```
> data TyFun :: * -> * -> *
```

Now, we can make our richly kinded `Apply`

:

```
> type family Apply (f :: (TyFun k1 k2) -> *) (x :: k1) :: k2
> data PredSym :: (TyFun Nat Nat) -> *
> type instance Apply PredSym x = Pred x
```

This one works. But, we want it to work also for real type constructors (like `Succ`

), not just type families. We have to wrap these type constructors in an appropriately kinded wrapper:

```
> data TyCon :: (k1 -> k2) -> (TyFun k1 k2) -> *
> type instance Apply (TyCon tc) x = tc x
```

Then, we define a new version of `sMap`

that works with `Apply`

:

`> type family DFMap (f :: (TyFun k1 k2) -> *) (ls :: [k1]) :: [k2] > type instance DFMap f '[] = '[] > type instance DFMap f (h ': t) = (Apply f h ': DFMap f t)`

`sDFMap :: forall (f :: TyFun k1 k2 -> *) (ls :: [k1]). (forall a. Sing a -> Sing (Apply f a)) -> Sing ls -> Sing (DFMap f ls) sDFMap _ SNil = SNil sDFMap f (SCons h t) = SCons (f h) (sDFMap f t)`

We’re close, but we’re not there yet. This `sDFMap`

function has a major problem: it is inherently ambiguous. The type variable `f`

appears only inside of type family applications, and so there’s no way for GHC to infer its value. This problem has a straightforward solution: use a proxy.

`> data Proxy a = P`

`sDFMap :: forall (f :: TyFun k1 k2 -> *) (ls :: [k1]). Proxy f -> (forall a. Sing a -> Sing (Apply f a)) -> Sing ls -> Sing (DFMap f ls) sDFMap _ _ SNil = SNil sDFMap p f (SCons h t) = SCons (f h) (sDFMap p f t)`

This one really does work, but we can still do better. The problem is that some plumbing is exposed. When calling this version of `sDFMap`

with `sPred`

, we still have to explicitly create the proxy argument and give it the correct type, even though we would like to be able to infer it from `sPred`

. The trick is that, while we do need to have `f`

exposed in the type of `sDFMap`

, the location where it is exposed doesn’t matter. It can actually be in an argument to the *callback* function. This next, final version also contains those pesky class constraints that we’ve been trying to avoid this whole time.

```
> sDFMap :: forall (f :: TyFun k1 k2 -> *) (ls :: [k1]).
> SingKind (KindParam :: OfKind k2) =>
> (forall a. Proxy f -> Sing a -> Sing (Apply f a)) ->
> Sing ls -> Sing (DFMap f ls)
> sDFMap _ SNil = sNil
> sDFMap f (SCons h t) = sCons (f P h) (sDFMap f t)
```

To call this function, we will need to create wrappers around, say, `sPred`

and `sSucc`

that explicitly relate the functions to their defunctionalization symbols:

```
> sPred' :: Proxy PredSym -> Sing n -> Sing (Pred n)
> sPred' _ = sPred
> sSucc' :: Proxy (TyCon Succ) -> Sing n -> Sing (Succ n)
> sSucc' _ = sSucc
```

Now, finally, we can use `sDFMap`

as desired:

```
> two_three_four' = sDFMap sSucc' sNums
> zero_one_two = sDFMap sPred' sNums
```

This whole thing is a bit of a hack, but it’s one that seems to follow a nice pattern that could be automated. In particular, I believe it should be relatively straightforward to incorporate this kind of encoding into a future version of `singletons`

. This would allow more code to be singletonized and support dependently typed programming better.

One clear drawback of this approach is that the arity of a defunctionalized function must be a part of the encoding. In some future world with kind families, it may be possible to generalize the arities. One idea I have for a future blog post is to grapple with higher-arity and partially applied functions, which may be a bit icky. And, what about defunctionalizing higher-order functions?

The question at the end of all of this is: could this be an approach to desaturating type families in Haskell? In other words, could an approach based on the ideas here be incorporated into GHC to make all of this Just Work? I’ve thought about it a bit, but not long enough or hard enough to really have an opinion. What do you think?

And, if I write this functionality into `singletons`

, will it be useful to you? We all have limited time, and it’s helpful to know if such an improvement will be put to use.

]]>

This blog post is a literate Haskell file. Copy and paste into a .lhs file to try out this code. This file will only compile with GHC HEAD, however.

We need some header formalities:

```
> {-# LANGUAGE TypeFamilies, DataKinds, PolyKinds, TypeOperators #-}
> import Prelude hiding (zipWith)
```

When writing term-level functions, it is natural to write a series of equations, each using a sequence of patterns to select which equation should be triggered when calling the function. Critically for this discussion, the first matching equation is used. Let’s use a particularly favorite function of mine as an example:

```
> import Prelude hiding (zipWith)
>
> zipWith :: (a -> b -> c) -> [a] -> [b] -> [c]
> zipWith f (a:as) (b:bs) = (f a b):(zipWith f as bs)
> zipWith _ _ _ = []
```

Let’s try to naively write this function at the type level on promoted lists:

```
type family ZipWith (f :: a -> b -> c) (as :: [a]) (bs :: [b]) :: [c]
type instance ZipWith f (a ': as) (b ': bs) = (f a b) ': (ZipWith f as bs)
type instance ZipWith f as bs = '[]
```

Urk. We get the following error:

```
Conflicting family instance declarations:
ZipWith k k k f ((':) k a as) ((':) k b bs)
ZipWith k k k f as bs
```

(The repetition of the variable `k`

is incorrect, and has been reported as GHC bug #7524. This is not the issue we are pursuing here, however.)

The problem is really that `type instance`

declarations are essentially *unordered*. The order in which they appear in a file is irrelevant to GHC. Relatedly, a programmer can define instances of the same type family in multiple modules. With separate compilation, the lack of ordering and the overlap check are necessary for type soundness. This is quite different from term-level function definition equations. All equations defining the same function not only have to be in the same module, but they must be one right after another.

The particular example here has an easy solution. Because we are matching over a closed kind (`[a]`

at the kind level), we could simply expand out the different cases we wish to match against. However, this solution is not possible when matching over an open kind, such as `*`

. We’ll see a useful example of overlap involving `*`

shortly.

GHC HEAD now contains an implementation for ordered overlapping type family instances. The example above can be written thus:

```
> type family ZipWith (f :: a -> b -> c) (as :: [a]) (bs :: [b]) :: [c]
> type instance where
> ZipWith f (a ': as) (b ': bs) = (f a b) ': (ZipWith f as bs)
> ZipWith f as bs = '[]
```

More interestingly, we can now define this:

```
> type family Equals (a :: k) (b :: k) :: Bool
> type instance where
> Equals a a = True
> Equals a b = False
```

Ordered overlapping type family instances allow us to define a general, write-once use-everywhere Boolean equality at the type level. Yay!

This new form of type family instance also seems to close the biggest known gap between the expressivity of functional dependencies and type families: functional dependencies have long supported overlap (through `OverlappingInstances`

or `IncoherentInstances`

) that type families could not. Although functional dependencies’ overlap requires ordering based on specificity and type families’ overlap is based on an explicit ordering, it would seem that any code that took advantage of functional dependencies and overlap can now be translated to use overlapping type families.

`type instance where`

does not work with associated types. Class instances can be sprinkled across modules, and having this form of overlap appear across modules would not be type safe in the presence of separate compilation.`type instance where`

does not work with associated types, even when the overlap is intended to exist just within one instance. There is no great reason for this restriction, but it seemed unimportant. Yell if this matters to you.- Choosing which equation in a group to use is somewhat delicate. For example, consider the
`Equals`

type family. What if we want to simplify`Equals a Int`

? Well, we can’t. That’s because`a`

might (sometimes) be instantiated to`Int`

, and if we simplified`Equals a Int`

to`False`

, we would have a type soundness issue. So, perhaps counterintuitively, we can’t simplify even`Equals a b`

to`False`

until`a`

and`b`

are known.

This GHC wiki page gives an outline of how to get GHC compiling on your machine so you can play with this feature in HEAD. I don’t imagine it will be in 7.6.2, but I believe it will be in 7.8.1, whenever that is released. Enjoy, and please post any feedback!

Many thanks to Simon Peyton Jones, Dimitrios Vytiniotis, and Stephanie Weirich for getting me started and helping me think through some of the finer points.

]]>

Do you have a stake in this work? Are you planning on reasoning about GHC’s core language? Check out the formalism here. Any feedback is welcome!

]]>

This blog post is a literate Haskell file, compatible with GHC 7.6.1. As usual, we need some initial declarations to get off the ground.

```
> {-# LANGUAGE DataKinds, PolyKinds, GADTs, TypeFamilies,
> LambdaCase #-}
> {-# OPTIONS_GHC -fwarn-incomplete-patterns #-}
>
> data Nat = Zero | Succ Nat
```

Throughout the post, I will be talking about equality on `Nat`

s, but the ideas here extend to any types that would admit a good instance of `Eq`

.

We will need length-indexed vectors to make interesting use of the ideas here:

```
> data Vec :: * -> Nat -> * where
> VNil :: Vec a Zero
> VCons :: a -> Vec a n -> Vec a (Succ n)
>
> safeHead :: Vec a (Succ n) -> a
> safeHead (VCons h _) = h
>
> safeTail :: Vec a (Succ n) -> Vec a n
> safeTail (VCons _ t) = t
```

Note that there is no need for another clause for `safeHead`

or `safeTail`

because the types forbid, for example, `safeHead VNil`

.

We are all (hopefully?) familiar with Boolean equality:

```
> boolEq :: Nat -> Nat -> Bool
> boolEq Zero Zero = True
> boolEq (Succ a) (Succ b) = boolEq a b
> boolEq _ _ = False
```

We can even lift this idea to the type level:

```
> type family BoolEq (a :: Nat) (b :: Nat) :: Bool
> type instance BoolEq Zero Zero = True
> type instance BoolEq (Succ a) (Succ b) = BoolEq a b
> type instance BoolEq Zero (Succ x) = False
> type instance BoolEq (Succ x) Zero = False
```

Let’s try to write a function to retrieve the second element from a vector:

```
cadr :: Vec a n -> a
cadr v = safeHead (safeTail v)
```

We get an error:

`Couldn't match type `n' with 'Succ ('Succ n0)`

Naturally, GHC can’t confidently apply `safeHead`

and `safeTail`

because we don’t know that `v`

has at least 2 elements.

Let’s try again:

```
cadr :: (BoolEq n (Succ (Succ n')) ~ True) => Vec a n -> a
cadr v = safeHead (safeTail v)
```

Still doesn’t work:

```
Could not deduce (n ~ 'Succ ('Succ n0))
from the context (BoolEq n ('Succ ('Succ n')) ~ 'True)
```

The problem is that GHC doesn’t know that our Boolean equality function really shows the equality of two types.

This is a contrived example, though. Instead, let’s consider a program passing around explicit evidence of whether or not a list has at least two elements. If the list doesn’t, the function should return a supplied default.

To pass around evidence of a type-level Boolean quantity, we need the singleton type for Booleans:

```
> data SBool :: Bool -> * where
> STrue :: SBool True
> SFalse :: SBool False
```

(If you’ve never seen singleton types before, my singletons paper on the subject contains helpful information.)

```
cadr :: SBool (BoolEq n (Succ (Succ n'))) -> a -> Vec a n -> a
cadr evidence deflt v = case evidence of
STrue -> safeHead (safeTail v)
SFalse -> deflt
```

Still, no go:

```
Could not deduce (n ~ 'Succ ('Succ n0))
from the context (BoolEq n ('Succ ('Succ n')) ~ 'True)
bound by a pattern with constructor
STrue :: SBool 'True,
```

In the end, this last example is the same as the previous. Pattern-matching on the `SBool`

just brings the equality `(BoolEq n (Succ (Succ n'))) ~ True`

into the context.

We need to convert Boolean equality to propositional equality, which is denoted by `~`

. Propositional equality is an equality among types that GHC can make use of in type checking code. To work with propositional equality, we need to make it first class, instead of just a constraint.

```
> data PropEq :: k -> k -> * where
> Refl :: PropEq x x
```

Let’s now try to write a conversion function from Boolean equality to propositional equality:

```
boolToProp :: (BoolEq a b ~ True) => PropEq a b
boolToProp = Refl
```

Same old problem:

```
Could not deduce (a ~ b)
from the context (BoolEq a b ~ 'True)
```

What we need to do is to build up the propositional equality from pieces that GHC can easily verify are indeed equal. We need an inductive proof that our definition of Boolean equality is correct for any natural number. To write such a proof, we will need to do case analysis on a and b. To do that, in turn, we will need a singleton over natural numbers.

```
> data SNat :: Nat -> * where
> SZero :: SNat Zero
> SSucc :: SNat n -> SNat (Succ n)
```

Now, let’s write the inductive proof:

```
boolToProp :: (BoolEq a b ~ True) => SNat a -> SNat b -> PropEq a b
boolToProp SZero SZero = Refl
boolToProp (SSucc x') (SSucc y') = boolToProp x' y'
```

Oops:

```
Could not deduce (n ~ 'Succ n)
...
Could not deduce (n1 ~ 'Succ n1)
```

The problem is that we are returning the result of `boolToProp x' y'`

directly from `boolToProp`

, even though `x'`

and `y'`

have different types than `SSucc x`

and `SSucc y`

. The solution is to use a pattern match on the result from the recursive call. Let’s call the type index associated with `x'`

to be `a'`

and that with `y'`

to be `b'`

. Then, the recursive call gives us `(Refl :: PropEq a' b')`

. If we pattern match on this, we get the propositional equality `a' ~ b'`

into the context. This can be used to show `Succ a' ~ Succ b'`

(which is really just `a ~ b`

), so we can now use `Refl`

once again, though at a different type:

```
> boolToProp :: (BoolEq a b ~ True) => SNat a -> SNat b -> PropEq a b
> boolToProp SZero SZero = Refl
> boolToProp (SSucc a') (SSucc b') =
> case boolToProp a' b' of
> Refl -> Refl
```

Great. Except now we get this warning:

```
Pattern match(es) are non-exhaustive
In an equation for `boolToProp':
Patterns not matched:
SZero (SSucc _)
(SSucc _) SZero
```

The problem is that there is no possible way out in these cases, so we’ll just have to put undefined:

```
boolToProp SZero (SSucc _) = undefined
boolToProp (SSucc _) SZero = undefined
```

Wait. Now there’s a new problem:

```
Couldn't match type 'False with 'True
Inaccessible code in ...
Couldn't match type 'False with 'True
Inaccessible code in ...
```

GHC rightly determines that these cases are impossible. Why are they impossible? Because we know that `BoolEq a b ~ True`

. In these cases, that wouldn’t be the case, so GHC can’t match `False`

with `True`

.

But now we are in a quandary. Without the extra matches, we get a warning (due to `-fwarn-incomplete-patterns`

, which you should never leave home without). With the matches, we get an error. That’s silly. And others agree that it’s silly, filing bug report #3927. According to the commentary on the bug report, Simon PJ says, “The more complaints the more likely I am to put off other things to do this one!” So, at the risk of pestering dear Simon, if you are annoyed by this, please complain! The best way to complain is simply to add yourself to the Cc list of the bug report. If enough people do this, the bug will get fixed sooner. Or, even better, try to fix it yourself!

So, where does this leave us? I can’t stand a warning in my code, so we’ll suppress it with this:

```
> boolToProp _ _ = error "bug 3927"
```

Let’s try to write `cadr`

one last time, this time armed with `boolToProp`

:

```
> cadr :: SBool (BoolEq n (Succ (Succ n'))) -> SNat n
> -> SNat n' -> a -> Vec a n -> a
> cadr evidence n n' deflt v = case evidence of
> STrue -> case boolToProp n (SSucc (SSucc n')) of
> Refl -> safeHead (safeTail v)
> SFalse -> deflt
```

It works! Hurrah!

The sad part here is that, to make it work, we needed to pass around two `SNat`

s and perform an O(n) operation (at runtime – the `boolToProp`

“proof” runs!) to prove to GHC that the operation is valid. Can we do better?

Yes, we can.

The problem lies in the fact that we branch on a witness of *Boolean* equality. There is an alternative: decidable propositional equality. The idea is that instead of just type-level Booleans, decidable equality stores either evidence that two types are equal or evidence that they are not. We know how to write evidence that two types `a`

and `b`

are equal: `PropEq a b`

. What’s the opposite of `PropEq a b`

? It’s the statement that `PropEq a b`

implies falsehood. In Haskell, we can represent falsehood with an empty type.

```
> data Falsehood
> type Not a = a -> Falsehood
```

Now, we can define decidable equality in Haskell:

```
> type DecidableEquality (a :: k) (b :: k) = Either (PropEq a b) (Not (PropEq a b))
```

We can even write a function to decide equality over `Nat`

s. Because this function produces runtime evidence, it uses singletons to work with types.

```
> decEq :: SNat a -> SNat b -> DecidableEquality a b
> decEq SZero SZero = Left Refl
> decEq (SSucc x') (SSucc y') = case decEq x' y' of
> Left Refl -> Left Refl
> Right contra -> Right (\case Refl -> contra Refl)
```

There’s a little magic going on here, so let’s pause and reflect. The first equation is straightforward. In the second, we recur on `a'`

and `b'`

. If those in fact are equal, we still unpack the `Left Refl`

result and create a new `Left Refl`

at a different type. We’ve seen this before, so it’s OK.

But what’s going on with `Right`

? Once again, let’s call the type index associated with `x'`

to be `a'`

, and likewise with `y'`

and `b'`

. The return type of `decEq x' y'`

is `DecidableEquality a' b'`

. Because we’ve matched with `Right`

, we know that `contra`

must have type `Not (PropEq a' b')`

, synonymous with `PropEq a' b' -> Falsehood`

. We must produce something of type `PropEq a b -> Falsehood`

. So, we write a lambda-case pattern match on the `PropEq a b`

to get the equality `a ~ b`

. Because `a`

is `Succ a'`

and `b`

is `Succ b'`

, GHC can use `a ~ b`

to derive `a' ~ b'`

, and thus we can call `contra`

with `Refl :: PropEq a' b'`

. Whew.

Now, we deal with the failure cases. If we know that, say, `a`

is `Zero`

and `b`

is `Succ b'`

, then GHC rightly figures out that `PropEq a b`

(which expands to `PropEq Zero (Succ b')`

) is an empty type.

```
decEq SZero (SSucc _) = Right (\case {})
decEq (SSucc _) SZero = Right (\case {})
```

No go:

`parse error on input `}'`

GHC does not support empty pattern matches. (UPDATE [Jan 4, 2013]: Simon PJ has implemented empty pattern matches in HEAD. Yay!) A feature request to support these was submitted as bug report #2431. Drat. Explicitly pattern matching on `Refl`

gives us an inaccessible code error (correctly), so we are forced to do this dreadful workaround:

```
> decEq SZero (SSucc _) = Right (\_ -> error "bug 2431")
> decEq (SSucc _) SZero = Right (\_ -> error "bug 2431")
```

(Incidentally, this bug seems much easier to fix than #3927, so if you have a little time, go for it!)

So, now that we’ve defined `decEq`

, how can we use it? Let’s write a wrapper around `safeHead`

with evidence. We’ll first need a way to eliminate `Falsehood`

. From logic, we learn that falsehood implies anything, neatly expressed in this type:

```
> exFalsoQuodlibet :: Falsehood -> a
```

Unfortunately, even with a manifestly empty type, we can’t use an empty pattern match. So, we do this:

```
> exFalsoQuodlibet = \case _ -> error "bug 2431"
```

Here is the type for our new head operation:

```
> safeHead' :: DecidableEquality n Zero -> SNat n -> a -> Vec a n -> a
```

In this example, as opposed to above, we reason about whether `n`

is `Zero`

. Because we set the example up this way, the evidence when `n`

is *not* `Zero`

will be necessary for a complete definition of the `safeHead'`

operation.

```
> safeHead' (Left Refl) _ deflt _ = deflt
> safeHead' (Right contra) n _ v = case n of
> SZero -> exFalsoQuodlibet (contra Refl)
> SSucc _ -> safeHead v
```

Note that this definition is complete: we can use the primitives that we have built up to eliminate the impossible case match. Of course, those primitives had to avoid empty pattern matches. However, it is easy to imagine a future where bugs #2431 and #3927 are gone and we can define this without any partial features of Haskell.

I should also note that we can’t use `exFalsoQuodlibet (contra Refl)`

in the `SSucc _`

case. GHC rightly complains `Couldn't match type 'Succ n1 with 'Zero`

in the use of `Refl`

.

Fun stuff, no?

]]>

`zipWith`

in Haskell using GHC 7.6.1? Can we leverage the new expressivity in promoted types and kinds to do away with `zipWith3`

, `zipWith4`

and friends? The answer turns out to be yes.
Let’s start by enabling a bunch of non-controversial language options and declaring the module:

```
{-# LANGUAGE TypeFamilies, ExplicitForAll, DataKinds, GADTs,
MultiParamTypeClasses, FlexibleInstances, FlexibleContexts #-}
module ZipWith where
import Prelude hiding (zipWith)
```

Though promotion is not strictly necessary to pull this off, it turns out to be convenient for GHC to kind-check our code. We define the natural numbers to use at the kind level:

`data Nat = Zero | Succ Nat`

Now, we need to start thinking about what the type of a variable-arity `zipWith`

must be. Clearly, it will need to take the function to apply and a bunch of lists, but the number of lists is not known when we write the type. We correspondingly don’t know how many arguments the function itself should take. We’ve narrowed our type down to `f -> <dragons>`

, for some function type `f`

. The dragons will have to be some type-level function that evaluates to the correct sequence of arrows and argument types, based on the type substituted for `f`

.

Examples may help here:

- If
`f`

is`a -> b`

, then the dragons should be`[a] -> [b]`

. - If
`f`

is`a -> b -> c`

, then the dragons should be`[a] -> [b] -> [c]`

. - and so on.

OK. That’s not too hard. We essentially want to map the type-level `[]`

operator over the components of the type of `f`

. However, a problem lurks: what if the final result type is itself an arrow? In the first example above, there is nothing stopping `b`

from being `d -> e`

. This turns out to be a fundemental ambiguity in variable-arity `zipWith`

. Let’s explore this for a moment.

We’ll need a three-argument function to make the discussion interesting. Here is such a function:

```
splotch :: Int -> Char -> Double -> String
splotch a b c = (show a) ++ (show b) ++ (show c)
```

Now, there are two conceivable ways to apply `splotch`

with `zipWith`

:

```
*ZipWith> :t zipWith2 splotch
zipWith2 splotch :: [Int] -> [Char] -> [Double -> String]
*ZipWith> :t zipWith3 splotch
zipWith3 splotch :: [Int] -> [Char] -> [Double] -> [String]
```

(Here, `zipWith2`

is really just the `zipWith`

in the Prelude.)

In general, there is no way for an automated system to know which one of these possibilities we want, so it is sensible to have to provide a number to the dragons, which we’ll now name `Listify`

. This number is the number of arguments to the function `f`

. Here is the definition for `Listify`

:

```
-- Map the type constructor [] over the types of arguments and return value of
-- a function
type family Listify (n :: Nat) (arrows :: *) :: *
type instance Listify (Succ n) (a -> b) = [a] -> Listify n b
type instance Listify Zero a = [a]
```

Now it would seem we can write the type of `zipWith`

. Except, when we think about it, we realize that the operation of `zipWith`

will have to be different depending on the choice for `n`

in `Listify`

. Because this `n`

is a type, it is not available at runtime. We will need some runtime value that the implementation of `zipWith`

can branch on.

Furthermore, we will need to convince GHC that we’re not doing something very silly, like trying `Listify (Succ (Succ (Succ Zero))) (Int -> Int)`

. So, we create a GADT that achieves both of these goals. A value from this GADT will both be a runtime witness controlling how `zipWith`

should behave and will assert at compile time that the argument to `Listify`

is appropriate:

```
-- Evidence that a function has at least a certain number of arguments
data NumArgs :: Nat -> * -> * where
NAZero :: NumArgs Zero a
NASucc :: NumArgs n b -> NumArgs (Succ n) (a -> b)
oneArg = NASucc NAZero
twoArgs = NASucc oneArg
threeArgs = NASucc twoArgs
```

Finally, we can give the type for `zipWith`

:

`zipWith :: NumArgs numArgs f -> f -> Listify numArgs f`

Note that, though this `zipWith`

is variable-arity, we still have to tell it the desired arity. More on this point later.

Once we have the type, we still need an implementation, which will need to be recursive both in the length of the lists and the number of arguments. When we think about recursion in the number of arguments to `f`

, currying comes to the rescue… almost. Consider the following:

`zipWith threeArgs splotch [1,2] ['a','b'] [3.5,4.5]`

We would like a recursive call to come out to be something like

`zipWith twoArgs ['a','b'] [3.5,4.5]`

The problem is that there is no replacement for `<splotch ??>`

that works. We want to apply `(splotch 1)`

to the first members of the lists and to apply `(splotch 2)`

to the second members. What we really need is to take a list of functions to apply. Let’s call the function that works with list of functions `listApply`

. Then, the recursive call would look like

`listApply twoArgs [splotch 1, splotch 2] ['a','b'] [3.5,4.5]`

With such a `listApply`

function, we can now implement `zipWith`

:

`zipWith numArgs f = listApply numArgs (repeat f)`

The type and implementation of `listApply`

is perhaps a little hard to come up with, but otherwise unsurprising.

```
-- Variable arity application of a list of functions to lists of arguments
-- with explicit evidence that the number of arguments is valid
listApply :: NumArgs n a -> [a] -> Listify n a
listApply NAZero fs = fs
listApply (NASucc na) fs = listApply na . apply fs
where apply :: [a -> b] -> [a] -> [b]
apply (f:fs) (x:xs) = (f x : apply fs xs)
apply _ _ = []
```

And now we’re done. Here are some examples of it all working:

```
example1 = listApply (NASucc NAZero) (repeat not) [False,True]
example2 = listApply (NASucc (NASucc NAZero)) (repeat (+)) [1,3] [4,5]
example3 = zipWith twoArgs (&&) [False, True, False] [True, True, False]
example4 = zipWith twoArgs (+) [1,2,3] [4,5,6]
example5 = zipWith threeArgs splotch [1,2,3] ['a','b','c'] [3.14, 2.1728, 1.01001]
```

But wait: can we do better? The `zipWith`

built here still needs to be told what its arity should be. Notwithstanding the ambiguity mentioned above, can we somehow infer this arity?

I have not come up with a way to do this in GHC 7.6.1. But, I happen to (independently) be working on an extension to GHC to allow ordering among type family instance equations, just like equations for term-level functions are ordered. GHC will try the first equation and then proceed to other only if the first doesn’t match. The details are beyond the scope of this post (but will hopefully appear later), but you can check out the GHC wiki page on the subject. The following example should hopefully make sense:

```
-- Count the number of arguments of a function
type family CountArgs (f :: *) :: Nat
type instance where
CountArgs (a -> b) = Succ (CountArgs b)
CountArgs result = Zero
```

This function counts the number of arrows in a function type. Note that this cannot be defined without ordered equations, because there is no way in GHC 7.6.1 to say that `result`

(the type variable in the last equation) is not an arrow.

Now, all we need to do is to be able to make the runtime witness of the argument count implicit through the use of a type class:

```
-- Use type classes to automatically infer NumArgs
class CNumArgs (numArgs :: Nat) (arrows :: *) where
getNA :: NumArgs numArgs arrows
instance CNumArgs Zero a where
getNA = NAZero
instance CNumArgs n b => CNumArgs (Succ n) (a -> b) where
getNA = NASucc getNA
```

Here is the new, implicitly specified variable-arity `zipWith`

:

```
{-# LANGUAGE ScopedTypeVariables #-}
-- Variable arity zipWith, inferring the number of arguments and using
-- implicit evidence of the argument count.
-- Calling this requires having a concrete return type of the function to
-- be applied; if it's abstract, we can't know how many arguments the function
-- has. So, zipWith (+) ... won't work unless (+) is specialized.
zipWith' :: forall f. CNumArgs (CountArgs f) f => f -> Listify (CountArgs f) f
zipWith' f = listApply (getNA :: NumArgs (CountArgs f) f) (repeat f)
```

This version does compile and work with my enhanced version of GHC. Expect to see ordered type family instances coming to a GHC near you soon!

]]>