A recent discussion on the cvs-ghc mailing list touched on propositional decidable equality in Haskell. This post will explain propositional equality and consider different encodings of this idea in Haskell.
Preface
This blog post is a literate Haskell file, compatible with GHC 7.6.1. As usual, we need some initial declarations to get off the ground.
> {-# LANGUAGE DataKinds, PolyKinds, GADTs, TypeFamilies,
> LambdaCase #-}
> {-# OPTIONS_GHC -fwarn-incomplete-patterns #-}
>
> data Nat = Zero | Succ Nat
Throughout the post, I will be talking about equality on Nat
s, but the ideas here extend to any types that would admit a good instance of Eq
.
We will need length-indexed vectors to make interesting use of the ideas here:
> data Vec :: * -> Nat -> * where
> VNil :: Vec a Zero
> VCons :: a -> Vec a n -> Vec a (Succ n)
>
> safeHead :: Vec a (Succ n) -> a
> safeHead (VCons h _) = h
>
> safeTail :: Vec a (Succ n) -> Vec a n
> safeTail (VCons _ t) = t
Note that there is no need for another clause for safeHead
or safeTail
because the types forbid, for example, safeHead VNil
.
Propositional Equality
We are all (hopefully?) familiar with Boolean equality:
> boolEq :: Nat -> Nat -> Bool
> boolEq Zero Zero = True
> boolEq (Succ a) (Succ b) = boolEq a b
> boolEq _ _ = False
We can even lift this idea to the type level:
> type family BoolEq (a :: Nat) (b :: Nat) :: Bool
> type instance BoolEq Zero Zero = True
> type instance BoolEq (Succ a) (Succ b) = BoolEq a b
> type instance BoolEq Zero (Succ x) = False
> type instance BoolEq (Succ x) Zero = False
Let’s try to write a function to retrieve the second element from a vector:
cadr :: Vec a n -> a
cadr v = safeHead (safeTail v)
We get an error:
Couldn't match type `n' with 'Succ ('Succ n0)
Naturally, GHC can’t confidently apply safeHead
and safeTail
because we don’t know that v
has at least 2 elements.
Let’s try again:
cadr :: (BoolEq n (Succ (Succ n')) ~ True) => Vec a n -> a
cadr v = safeHead (safeTail v)
Still doesn’t work:
Could not deduce (n ~ 'Succ ('Succ n0))
from the context (BoolEq n ('Succ ('Succ n')) ~ 'True)
The problem is that GHC doesn’t know that our Boolean equality function really shows the equality of two types.
This is a contrived example, though. Instead, let’s consider a program passing around explicit evidence of whether or not a list has at least two elements. If the list doesn’t, the function should return a supplied default.
To pass around evidence of a type-level Boolean quantity, we need the singleton type for Booleans:
> data SBool :: Bool -> * where
> STrue :: SBool True
> SFalse :: SBool False
(If you’ve never seen singleton types before, my singletons paper on the subject contains helpful information.)
cadr :: SBool (BoolEq n (Succ (Succ n'))) -> a -> Vec a n -> a
cadr evidence deflt v = case evidence of
STrue -> safeHead (safeTail v)
SFalse -> deflt
Still, no go:
Could not deduce (n ~ 'Succ ('Succ n0))
from the context (BoolEq n ('Succ ('Succ n')) ~ 'True)
bound by a pattern with constructor
STrue :: SBool 'True,
In the end, this last example is the same as the previous. Pattern-matching on the SBool
just brings the equality (BoolEq n (Succ (Succ n'))) ~ True
into the context.
We need to convert Boolean equality to propositional equality, which is denoted by ~
. Propositional equality is an equality among types that GHC can make use of in type checking code. To work with propositional equality, we need to make it first class, instead of just a constraint.
> data PropEq :: k -> k -> * where
> Refl :: PropEq x x
Let’s now try to write a conversion function from Boolean equality to propositional equality:
boolToProp :: (BoolEq a b ~ True) => PropEq a b
boolToProp = Refl
Same old problem:
Could not deduce (a ~ b)
from the context (BoolEq a b ~ 'True)
What we need to do is to build up the propositional equality from pieces that GHC can easily verify are indeed equal. We need an inductive proof that our definition of Boolean equality is correct for any natural number. To write such a proof, we will need to do case analysis on a and b. To do that, in turn, we will need a singleton over natural numbers.
> data SNat :: Nat -> * where
> SZero :: SNat Zero
> SSucc :: SNat n -> SNat (Succ n)
Now, let’s write the inductive proof:
boolToProp :: (BoolEq a b ~ True) => SNat a -> SNat b -> PropEq a b
boolToProp SZero SZero = Refl
boolToProp (SSucc x') (SSucc y') = boolToProp x' y'
Oops:
Could not deduce (n ~ 'Succ n)
...
Could not deduce (n1 ~ 'Succ n1)
The problem is that we are returning the result of boolToProp x' y'
directly from boolToProp
, even though x'
and y'
have different types than SSucc x
and SSucc y
. The solution is to use a pattern match on the result from the recursive call. Let’s call the type index associated with x'
to be a'
and that with y'
to be b'
. Then, the recursive call gives us (Refl :: PropEq a' b')
. If we pattern match on this, we get the propositional equality a' ~ b'
into the context. This can be used to show Succ a' ~ Succ b'
(which is really just a ~ b
), so we can now use Refl
once again, though at a different type:
> boolToProp :: (BoolEq a b ~ True) => SNat a -> SNat b -> PropEq a b
> boolToProp SZero SZero = Refl
> boolToProp (SSucc a') (SSucc b') =
> case boolToProp a' b' of
> Refl -> Refl
Great. Except now we get this warning:
Pattern match(es) are non-exhaustive
In an equation for `boolToProp':
Patterns not matched:
SZero (SSucc _)
(SSucc _) SZero
The problem is that there is no possible way out in these cases, so we’ll just have to put undefined:
boolToProp SZero (SSucc _) = undefined
boolToProp (SSucc _) SZero = undefined
Wait. Now there’s a new problem:
Couldn't match type 'False with 'True
Inaccessible code in ...
Couldn't match type 'False with 'True
Inaccessible code in ...
GHC rightly determines that these cases are impossible. Why are they impossible? Because we know that BoolEq a b ~ True
. In these cases, that wouldn’t be the case, so GHC can’t match False
with True
.
But now we are in a quandary. Without the extra matches, we get a warning (due to -fwarn-incomplete-patterns
, which you should never leave home without). With the matches, we get an error. That’s silly. And others agree that it’s silly, filing bug report #3927. According to the commentary on the bug report, Simon PJ says, “The more complaints the more likely I am to put off other things to do this one!” So, at the risk of pestering dear Simon, if you are annoyed by this, please complain! The best way to complain is simply to add yourself to the Cc list of the bug report. If enough people do this, the bug will get fixed sooner. Or, even better, try to fix it yourself!
So, where does this leave us? I can’t stand a warning in my code, so we’ll suppress it with this:
> boolToProp _ _ = error "bug 3927"
Let’s try to write cadr
one last time, this time armed with boolToProp
:
> cadr :: SBool (BoolEq n (Succ (Succ n'))) -> SNat n
> -> SNat n' -> a -> Vec a n -> a
> cadr evidence n n' deflt v = case evidence of
> STrue -> case boolToProp n (SSucc (SSucc n')) of
> Refl -> safeHead (safeTail v)
> SFalse -> deflt
It works! Hurrah!
The sad part here is that, to make it work, we needed to pass around two SNat
s and perform an O(n) operation (at runtime – the boolToProp
“proof” runs!) to prove to GHC that the operation is valid. Can we do better?
Decidable Propositional Equality
Yes, we can.
The problem lies in the fact that we branch on a witness of Boolean equality. There is an alternative: decidable propositional equality. The idea is that instead of just type-level Booleans, decidable equality stores either evidence that two types are equal or evidence that they are not. We know how to write evidence that two types a
and b
are equal: PropEq a b
. What’s the opposite of PropEq a b
? It’s the statement that PropEq a b
implies falsehood. In Haskell, we can represent falsehood with an empty type.
> data Falsehood
> type Not a = a -> Falsehood
Now, we can define decidable equality in Haskell:
> type DecidableEquality (a :: k) (b :: k) = Either (PropEq a b) (Not (PropEq a b))
We can even write a function to decide equality over Nat
s. Because this function produces runtime evidence, it uses singletons to work with types.
> decEq :: SNat a -> SNat b -> DecidableEquality a b
> decEq SZero SZero = Left Refl
> decEq (SSucc x') (SSucc y') = case decEq x' y' of
> Left Refl -> Left Refl
> Right contra -> Right (\case Refl -> contra Refl)
There’s a little magic going on here, so let’s pause and reflect. The first equation is straightforward. In the second, we recur on a'
and b'
. If those in fact are equal, we still unpack the Left Refl
result and create a new Left Refl
at a different type. We’ve seen this before, so it’s OK.
But what’s going on with Right
? Once again, let’s call the type index associated with x'
to be a'
, and likewise with y'
and b'
. The return type of decEq x' y'
is DecidableEquality a' b'
. Because we’ve matched with Right
, we know that contra
must have type Not (PropEq a' b')
, synonymous with PropEq a' b' -> Falsehood
. We must produce something of type PropEq a b -> Falsehood
. So, we write a lambda-case pattern match on the PropEq a b
to get the equality a ~ b
. Because a
is Succ a'
and b
is Succ b'
, GHC can use a ~ b
to derive a' ~ b'
, and thus we can call contra
with Refl :: PropEq a' b'
. Whew.
Now, we deal with the failure cases. If we know that, say, a
is Zero
and b
is Succ b'
, then GHC rightly figures out that PropEq a b
(which expands to PropEq Zero (Succ b')
) is an empty type.
decEq SZero (SSucc _) = Right (\case {})
decEq (SSucc _) SZero = Right (\case {})
No go:
parse error on input `}'
GHC does not support empty pattern matches. (UPDATE [Jan 4, 2013]: Simon PJ has implemented empty pattern matches in HEAD. Yay!) A feature request to support these was submitted as bug report #2431. Drat. Explicitly pattern matching on Refl
gives us an inaccessible code error (correctly), so we are forced to do this dreadful workaround:
> decEq SZero (SSucc _) = Right (\_ -> error "bug 2431")
> decEq (SSucc _) SZero = Right (\_ -> error "bug 2431")
(Incidentally, this bug seems much easier to fix than #3927, so if you have a little time, go for it!)
So, now that we’ve defined decEq
, how can we use it? Let’s write a wrapper around safeHead
with evidence. We’ll first need a way to eliminate Falsehood
. From logic, we learn that falsehood implies anything, neatly expressed in this type:
> exFalsoQuodlibet :: Falsehood -> a
Unfortunately, even with a manifestly empty type, we can’t use an empty pattern match. So, we do this:
> exFalsoQuodlibet = \case _ -> error "bug 2431"
Here is the type for our new head operation:
> safeHead' :: DecidableEquality n Zero -> SNat n -> a -> Vec a n -> a
In this example, as opposed to above, we reason about whether n
is Zero
. Because we set the example up this way, the evidence when n
is not Zero
will be necessary for a complete definition of the safeHead'
operation.
> safeHead' (Left Refl) _ deflt _ = deflt
> safeHead' (Right contra) n _ v = case n of
> SZero -> exFalsoQuodlibet (contra Refl)
> SSucc _ -> safeHead v
Note that this definition is complete: we can use the primitives that we have built up to eliminate the impossible case match. Of course, those primitives had to avoid empty pattern matches. However, it is easy to imagine a future where bugs #2431 and #3927 are gone and we can define this without any partial features of Haskell.
I should also note that we can’t use exFalsoQuodlibet (contra Refl)
in the SSucc _
case. GHC rightly complains Couldn't match type 'Succ n1 with 'Zero
in the use of Refl
.
Fun stuff, no?
So you are telling us that in a context where returning (Left Refl) is accepted, Right (contra Refl) will always be rejected? I.e. when applying this principle to (the equality of) pairs there will be only one possible definition with 4 cases (1xLeft, 3xRight) ? That would be awesome and a manifestly added value w.r.t. Maybe (PropEq a b), which still can be extracted from DecidableEquality a b. I’d love to hear your opinion on these matters.
In a perfect world, yes, at most one of (Left Refl) or (Right …) would typecheck. This world is not perfect, however, because Haskell does not allow empty pattern matches, permits incomplete pattern matches, and has undefined. We can rely on (Left Refl), because Refl will typecheck only where it is appropriate to use. On the other hand, it easy to make a term of type (Not (PropEq a b)), which expands to (PropEq a b -> Falsehood). Just say (\_ -> undefined). In fact, in ground cases (cases where the inequality of two terms does not depend on the inequality of smaller components), (\_ -> undefined) is currently our only viable option. In those ground cases, writing (\Refl -> …) would not typecheck, no matter what is on the right hand side. Barring (\case {}), we are forced to use undefined.
In a future world with empty case statements, DecidableEquality a b would be inhabited by at most one of (Left Refl) and (Right (\case {})). Any other inhabitant of DecidableEquality a b would have to be derived from undefined or have an incomplete pattern match. (In a given context, DecidableEquality a b might have no inhabitants, if we can’t be certain about the equality or inequality of a and b.)
So, I’ve added some blurb about decidable equality to my blog (http://heisenbug.blogspot.de/2012/12/decidable-equality.html). Does it sound about right?
Notably, you don’t actually need -XLambdaCase when deconstructing the PropEq. A simple (\Refl -> …) works even in GHC 6.12.1. Indeed, for those of us stuck in the past, with minor syntactic changes all of the above works in 6.12.1; the only downsides are (1) you need to manually lift data constructors to data kind constructors, (2) you don’t actually get the data *kind*, and (3) you don’t get a polykinded PropEq.
Also, rather than defining a Falsehood type (or even better, reusing Void), I prefer the following System F style definition of negation:
newtype Not a = Not { exfalso :: forall r. a -> r }
Excellent points. I guess I got caught up worrying about the lack of empty pattern matches and forgot about just using a pattern in a lambda. And, thanks for the idea for a different encoding of Not and the pointer to Void, which I had not yet seen.
The fact that Falsehood is unfortunately inhabited by undefined (and other bottoms) got me thinking about the fact that GHC does have various unlifted, primitive types, which aren’t. The drawback is that they can’t be used polymorphically. As a test I took this blog post and tried replacing data Falsehood with type Falsehood = (# #), the unboxed empty tuple type (which does have a constructor, but we’re not going to use it), to see what would happen. And surprisingly it still compiled! That got me into an investigation of how the heck that could possibly happen, given that (lacking empty cases) you employ ‘error’ where a Falsehood is expected, but (# #) is not, presumably, inhabited by bottoms. It turns out that error is different in some way from undefined, because this doesn’t compile:
foo :: () -> (# #)
foo () = undefined
but this does:
foo :: () -> (# #)
foo () = error “blabla”
What the heck? And it even behaves “as expected”:
main = case foo () of _ -> print 9
prints “blabla”, and not 9. So indeed (# #) is not inhabited by bottoms, and if you use error (or presumably an infinite recursion) in place of one, the exception is thrown before the value would be produced, and not when it would be requested (like strict evaluation, in other words). (It turns out the full type of error is forall (a :: OpenKind). [Char] -> a, where OpenKind is the superkind of both # and *, which is why the whole thing typechecks.)
All of this is a longwinded way of getting around to wondering whether, if GHC had an unlifted primitive type that was truly uninhabited, which could be used in place of Falsehood, that might make the whole construction more useful? The effect would be that while the compiler wouldn’t prevent you from writing a definition for a Falsehood, there would be no way to actually obtain a value of type Falsehood at runtime. Would that be a valuable property to have? (And again, the drawback would be that it couldn’t be unified with any *-kinded type variables. Would that bite anywhere?)
And I think GHC might actually have such a type: State# AnythingExceptRealWorld. data State# a is unlifted and has no constructors. As far as I can tell, the only way to get a value of that type is realWorld#, but realWorld# :: State# RealWorld, so if we parameterize it on a type *other* than RealWorld, theoretically there should be no way to get one…
…except for unsafeCoerce#.
My thoughts on this issue were to (eventually) have a flag in GHC which prevents the use of the partial features, somewhat along the lines of Safe Haskell. (Anyone feel like writing a termination checker? 🙂 ) But, I had never considered leveraging unlifted types to eliminate the possibility of bottom. For now, I’m just going to hope that programmers using dependently typed features like this know to avoid writing undefined, but I’ll stow this thought away for when I revisit this down the road. Thanks for suggesting the idea!
Pingback: singletons v0.9 Released! | Types and Kinds