Dependent types in Haskell: Progress Report

It was drawn to my attention that there is an active Reddit thread about the future of dependent types in Haskell. (Thanks for the heads up, @thomie!) Instead of writing a long response inline in Reddit, it seems best to address the (very knowledgeable, respectful, and all around heartening) debate here.

When can we expect dependent types in GHC?

The short answer: GHC 8.4 (2018) at the very earliest. More likely 8.6 or 8.8 (2019-20).

Here is the schedule as I see it:

  • GHC 8.2: Clean up some of the lingering issues with -XTypeInType. As I will be starting a new job (Asst. Prof. at Bryn Mawr College) this fall, I simply won’t have the opportunity to do more than some cleanup work for the next release. Also, quite frankly, I need a release cycle off from the challenge of putting in a large new feature. Polishing up -XTypeInType for release took probably an extra full month of unexpected work time! I don’t regret this in the slightest, but I could use a cycle off.
  • GHC 8.4: This depends on what other research opportunities come up in the next year and how much more attention -XTypeInType needs. If all goes well this fall and I can get a few publications out in the next year (certainly possible – I have several in the works), then I could conceivably start primary implementation of -XDependentTypes next summer. The odds that it will be in a state to merge for 8.4 are slim, however.
  • GHC 8.6: We’re now talking about a real possibility here. Assuming I start the implementation next summer, this would give me a year and a half to complete it. I desperately wish to avoid merging late in the cycle (which is what made -XTypeInType so stressful) so perhaps merging soon after GHC 8.4 comes out is a good goal. If this slips, GHC 8.8 seems like quite a likely candidate.

Regardless of the schedule, I have every intention of actually doing this work.

One major oversight in the schedule above: I have completely discounted the possibility of collaborators in coding this up. Do you wish to help make this a reality? If so, let’s talk. I’ll admit that there may be a bit of a hill for an outside collaborator to climb before I really start collaborating in earnest, so be prepared to show (or create!) evidence that you’re good at getting your hands dirty in the greasy wheels of GHC’s typechecker without very much oversight. I’ve encouraged several of you on Trac/Phab – you know who you are, I hope. For others who have not yet contributed to GHC: you’re likely best served starting smaller than implementing dependent types! But let’s talk anyway, if you’re interested.

What is the design of dependent types in Haskell?

I expect to hand in my dissertation in the next two weeks. While I am developing it in the public eye (and warmly welcome issues to be posted), there is no publicly available PDF build. Building instructions are in the README. I will post a built PDF when I hand in my draft to my thesis committee. I will be defending on September 1 and will likely make some revisions (as suggested by my committee) afterward.

Readers here will likely be most interested in Chapters 3 and 4. Chapter 3 will contain (I’m still writing it!) numerous examples of dependent types in Haskell and how they might be useful. Chapter 4 presents the design of dependent types in Haskell as a diff over current Haskell. This design is not yet up to the standard of the Haskell Reports, mostly because it is unimplemented and seems not worth the effort of formalization until we know it is implementable. For the overly curious, Chapter 5 contains a new intermediate language (to replace GHC Core) and Chapter 6 contains the type inference algorithm that will deal with dependent types. Happy summer reading!

Responses to particular comments on the Reddit thread

  • @elucidatum: How radical of a change and how much of a challenge will it be to refactor existing codebases?

    No change is necessary. All code that currently compiles with -XTypeInType will compile with -XDependentTypes. (The only reason I have to set -XTypeInType as my baseline is because of the parsing annoyance around *, which must be imported to be used with -XTypeInType.) Any refactoring will be around how much you, as the Haskell author, wish to take advantage of dependent types.

  • @jlimperg: Pervasive laziness in Haskell means that we like to use a lot of coinductive types. But this is really annoying for proof purposes because coinduction, despite the nice duality with induction, is a lot less pleasant to work with.

    Yes, this is true. My approach is, essentially, to pretend that types are inductive, not coinductive. One consequence of this decision is that proofs of type equality will have to be run. This means that using dependent types will slow down your running program. Sounds horrible, doesn’t it? It doesn’t have to, though.

    Suppose you have a function proof :: ... -> a :~: b. The function proof provides a proof that type a equals type b. If this function terminates at all, it will always produce Refl. Thus, if we knew that the function terminated, then we wouldn’t have to run it. We can’t know whether it terminates. But you, the programmer, can assert that it terminates, like this: {-# RULES "proof" proof ... = unsafeCoerce Refl #-} Now, GHC will replace any use of proof with Refl directly. Note that GHC still type-checks your proof. All this RULE does is assert termination.
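    To make this concrete, here is a minimal sketch of the pattern (the module and names are mine, not from any real library):

    {-# LANGUAGE DataKinds, GADTs, KindSignatures, TypeFamilies, TypeOperators #-}
    module ProofSketch where

    import Data.Type.Equality ((:~:)(..))
    import Unsafe.Coerce (unsafeCoerce)

    data Nat = Zero | Succ Nat

    data SNat (n :: Nat) where
      SZero :: SNat 'Zero
      SSucc :: SNat n -> SNat ('Succ n)

    type family (n :: Nat) + (m :: Nat) :: Nat where
      'Zero   + m = m
      'Succ n + m = 'Succ (n + m)

    -- If this proof terminates at all, it can only ever produce Refl.
    plusZero :: SNat n -> (n + 'Zero) :~: n
    plusZero SZero     = Refl
    plusZero (SSucc n) = case plusZero n of Refl -> Refl

    -- Assert termination: skip the O(n) recursion at runtime. GHC still
    -- type-checks plusZero as written; the RULE only avoids running it.
    {-# NOINLINE plusZero #-}
    {-# RULES "plusZero" forall n. plusZero n = unsafeCoerce Refl #-}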

    “But,” you say, “I don’t want to have to merely assert termination! If I wanted to assert correctness instead of prove it, I wouldn’t use dependent types.” My response: “Touché.” Haskell is indeed less powerful than those other dependently typed languages in this regard. Nevertheless, by using dependent types in Haskell, you still get a whole lot more compile-time guarantees than you get without dependent types. You just don’t get termination. You thus have a choice: “prove” termination at runtime by actually running your proofs, or assert it at compile time, keeping your dependently typed program efficient and still rich in guarantees. Recall that we Haskellers have never proved termination of anything, ever, so not proving termination of a function named proof shouldn’t be all that alarming.

    Note: If RULES such as the one above become pervasive, perhaps a compiler flag can make them automatic, effectively (and concisely) assuming termination for all proofs in a module.

  • @jlimperg: Proving is hard

    Yes, it is. Typechecker plugins may help here. It is easy enough, however, to implement dependent types without this advanced support. As we work through the consequences of dependent types in Haskell, I’m confident the community (including me) will come up with ways to make it easier.

  • @jlimperg: As is well-known, a dependent type system is inconsistent (i.e. everything can be proven) unless all computations terminate, which is obviously not the case for general Haskell.

    Yes, this is true. Dependent types in Haskell will be inconsistent when we view Haskell as a logic. Dependent types will still remain type-safe, however. (This is because we run the proofs!) I just wish to point out here that @jlimperg suggests a syntactic check for termination: this, sadly, will not work. Haskell has too many ways for a computation to diverge, and a syntactic check can rule out only general recursion and partial pattern matches. Haskell also has infinite data types, non-strictly-positive datatypes, TypeRep (which can be abused to cause a loop), Type :: Type, and likely more back doors. I’d love to see a termination checker for Haskell, but I’m not holding my breath.

  • @jlimperg: This point is very speculative, but I strongly dislike the current flavour of ‘almost-dependent’ type-level programming in Haskell.

    So do I. Singletons are an abomination. I hate them. These are gone with my design for dependent types, and I think the resulting language has the niceties we all want in a dependently typed language (modulo termination checking).

  • @dogodel: I think it will take a while for the change to percolate into better design. … I think the most immediate benefit will be for (no surprise) expressive DSL, which are actually quite common in any modest codebase, but maybe not the “core” of the project.

    Agreed completely. It will take time for us to figure out the best way to use dependent types.

  • @sinyesdo: The best summary I can offer is that “totality” is really really important for DT languages

    This is true for those other dependently typed languages, but not for Haskell. See my response to the first of @jlimperg’s points quoted here.

  • @ccasin: The main reasons to care that your programming language is total are to make the type system consistent as a logic and to make type checking decidable. But we’ve already given up both these things in Haskell, and it’s possible to have dependent types without them.

    Agreed in full, and with @ccasin’s full post.

  • @jmite: If you want fully Dependent-Type types, why not use Idris, Agda, Coq, or F*?

    I’m in broad agreement with the responses to this point on Reddit: basically, that none of these languages have Haskell’s ecosystem or optimizing compiler. See also Section 3.3 of my dissertation.

  • @tactics: What does “fully dependent types” even mean for something like Haskell? You would have to rework the core language entirely.

    Yes, precisely. This is Chapter 5 of my dissertation.

Conclusion

I hope this information helps. Do keep an eye out for my finished dissertation sometime in the next few months. And, as always, this is all a work in progress and no one should take anything as set in stone. If there’s a design decision you disagree with, please speak up!

Haskell as a gradually typed dynamic language

I spent a few days this week at POPL (one day too short due to a massive blizzard currently pounding Philadelphia). As always at conferences, I had a great time and left full of new ideas.

This time, I sat in on a series of talks on gradual typing. Forgive me as I’m nowhere near an expert on gradual typing, but I took away this gist: with a gradual typing system, a type-checker does the best it can to assign static types; anything it can’t do is deferred to runtime in a dynamic type check.

As I nearly always do when considering a type system concept, I thought about how this might apply to Haskell. And I realized with a start that Haskell is only a stone’s throw away from a very spiffy gradually typed dynamic language! Let’s explore how this is the case.

(Disclaimer: I have no plans to implement any idea in this post. I also apologize, specifically, if I’m misusing the term “gradual typing”, which I’m not deeply versed in. This is all meant to be a bit tongue-in-cheek, though I think there’s a proper good idea in all of this.)

First, I wish to address a common fallacy about an existing feature in GHC:

Deferred type errors do not make Haskell a dynamic language

Recent versions of GHC come with a flag -fdefer-type-errors. (Naturally, this feature comes with its own published paper from ICFP’12.) With this flag enabled, type errors become warnings. When you try to run the code that contains a type error, you get a runtime exception. To wit:

> {-# OPTIONS_GHC -fdefer-type-errors #-}
> module GradualHaskell where
> 
> import Data.Typeable
> 
> ex1 = not 'x'

Compiling produces

/Users/rae/work/blog/011-dynamic/post.lhs:32:13: warning:
    • Couldn't match expected type ‘Bool’ with actual type ‘Char’
    • In the first argument of ‘not’, namely ‘'x'’
      In the expression: not 'x'
      In an equation for ‘ex1’: ex1 = not 'x'

Running produces

*** Exception: /Users/rae/work/blog/011-dynamic/post.lhs:32:13: error:
    • Couldn't match expected type ‘Bool’ with actual type ‘Char’
    • In the first argument of ‘not’, namely ‘'x'’
      In the expression: not 'x'
      In an equation for ‘ex1’: ex1 = not 'x'
(deferred type error)

What’s nice about deferred type errors is that they let you compile a program even with errors. You can then test the type-correct parts of your program while you’re still developing the type-incorrect parts.

However, deferred type errors don’t make Haskell a dynamic language.

Consider

> silly :: a -> b
> silly True  = False
> silly False = "Howdy!"
> silly 'x'   = 42

This, of course, compiles with lots of warnings. But you can’t successfully call silly, even at one of the values matched against. Running silly True produces a runtime type error. This is not what one would expect in a dynamic language, where silly True is perfectly False. The problem in Haskell is that we have no way to know whether the type silly is called at really is Bool: all types are erased. All deferred type errors do is take the normal compile-time type errors and stuff them into thunks that get evaluated when you force the ill-typed expression. Haskell still never does a runtime type check.

Preserving type information to runtime

The way to make Haskell into a proper dynamic language must be to retain type information to runtime. But we already have a way of doing runtime type checks: Typeable and friends.

The constraint Typeable a means that we know a’s type at runtime. The Typeable interface (from Data.Typeable) includes, for example, the function

cast :: (Typeable a, Typeable b) => a -> Maybe b

which does a runtime type equality test and, if the test succeeds, casts an expression’s type.
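For example, in GHCi (a hypothetical session; the annotations pin down cast’s target type):

ghci> cast 'x' :: Maybe Char
  Just 'x'

ghci> cast 'x' :: Maybe Bool
  Nothing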

As of GHC 7.10, all statically-known types are Typeable. (Saying deriving Typeable is a no-op.) But type variables might still have a type unrecoverable at runtime.

So, all of this adds up to an interesting idea. I propose a new language extension -XDynamicTypes with two effects:

  1. All type variables automatically get a Typeable constraint.
  2. The type-checker behaves quite like it does with -fdefer-type-errors, but instead of putting an unconditional type error in an expression, it does a runtime type check. If the check succeeds, the expression evaluates normally; otherwise, a runtime error is thrown.

Change (1), coupled with the fact that all non-variable types are already Typeable, means that all types are known at runtime. Then, change (2) enables runtime typing. Under this scenario, my silly function above would elaborate to

> forceCast :: (Typeable a, Typeable b) => a -> b
> forceCast x = case cast x of
>   Just y  -> y
>   Nothing -> error "type error" -- could even print the types if we wanted
> 
> notSilly :: (Typeable a, Typeable b) => a -> b
> notSilly true  | Just True  <- cast true  = forceCast False
> notSilly false | Just False <- cast false = forceCast "Howdy!"
> notSilly x     | Just 'x'   <- cast x     = forceCast (42 :: Int)
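
If I’ve gotten the elaboration right, a GHCi session would look something like this (hypothetical output):

ghci> notSilly True :: Bool
  False

ghci> notSilly True :: String
  *** Exception: type error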

This definition works just fine. It does do a lot of casting, though. That’s because of silly’s very general type, a -> b, which is necessary due to the differing types of silly’s equations. However, if all the equations of a function have the same type, then the function is inferred with a precise type, and no casting is necessary. Indeed, in a program that is type-correct without -XDynamicTypes, no runtime checks would ever take place.

This idea seems to give you the flexibility of dynamic types with the speed of static types (if your program is statically type-correct). And this general model seems to fit exactly in the space of possibilities of gradual type systems.

I’ve heard it said that Haskell is the best imperative programming language. Could it also become the best dynamic language?

What are type families?

I just returned from yet another fantastic ICFP week. I always have a blast at the conference. It would be easy to take the conversations I had there and turn them into a full-blown research program that would easily take me to next year’s ICFP… and then rinse and repeat!

But in this post, I want to focus on one particular thorn in my side that started bothering me more than it had before: type families. I’ve realized, in the course of many conversations, that I don’t really understand them. This post will be more questions than answers as I try to write down what ails me.

This post is Literate Haskell.

> {-# LANGUAGE TypeFamilies, DataKinds, UndecidableInstances, GADTs,
>              MultiParamTypeClasses, FunctionalDependencies,
>              FlexibleInstances, FlexibleContexts #-}
> module WhatAreTypeFamilies where

Total type families are OK

A total type family feels quite comfortable. By total here, I mean a type family whose equations cover all the possible cases and which is guaranteed to terminate. That is, a total type family is properly a function on types. Those other dependently typed languages have functions on types, and they seem to work nicely. I am completely unbothered by total type families.
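
For instance, this closed family (my example, not one from the rest of this post) is total: its equations cover all of Bool, and it clearly terminates.

> type family Not (b :: Bool) :: Bool where
>   Not 'True  = 'False
>   Not 'False = 'True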

Non-covering type families are strange

A non-covering type family is a type family whose patterns do not cover the whole space. Let’s consider closed and open families separately, because the issues that come up are different.

Closed type families

For example:

> type family F1 a where
>   F1 Int = Bool

Note that F1 is closed – we won’t ever be adding more equations. F1 is clearly non-covering. Yet, the type F1 Char is perfectly well formed.

> sillyId :: F1 Char -> F1 Char
> sillyId x = x

But sillyId can’t ever be called on a value. This doesn’t stop us from doing chicanery with F1 though.

> data SillyGadt a where
>   MkSillyGadt1 :: Int   -> SillyGadt (F1 Char)
>   MkSillyGadt2 :: Float -> SillyGadt (F1 Double)

Now, even though F1 Char is nonsense, SillyGadt (F1 Char) is inhabited. And GHC distinguishes between F1 Char and F1 Double.

In GHC, a type family that can’t reduce is considered “stuck”. Such a type is effectively a new type, equal only to itself, like empty datatypes a programmer might declare. The only thing different about something like F1 Char from something like Maybe Char is that F1 Char cannot be used in a type pattern (that is, the left-hand side of a type family equation or data family instance). That’s the only difference. So, when we declare F1, we create an infinite family of unique types containing F1 applied to all types of kind *, except for Int, because F1 Int is identically Bool.

I find this treatment of non-covering closed type families rather bizarre. Compare against term-level behavior. When a term-level function is non-covering, passing an argument for which there are no equations immediately results in a call to error. Yet the same thing in types behaves very differently.
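
To see the contrast concretely (my example), here is a term-level analogue of F1:

> f1 :: Int -> Bool
> f1 0 = True

Calling f1 1 immediately raises an inexhaustive-patterns exception, whereas the analogous F1 Char just sits there, stuck, yet perfectly usable in types.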

I’m not the only one who finds this strange: Lennart Augustsson posted GHC bug #9636 about this very issue. Indeed, my current views on the subject are strongly influenced by Lennart’s comments.

But, sometimes you want stuckness

Sadly, sometimes authors want stuck type families. I know of two examples.

  1. The bowels of GHC define this:
    type family Any :: k where {}

    Thus, Any is a closed type family, available at any kind, with no equations. GHC makes convenient use of Any. Consider the expression length []. This expression has some pointless polymorphism in it: length can be applied to a list of any type, and [] can be a list of any type. Yet, GHC must choose precisely which type these polymorphic definitions are specialized to. Not to play favorites, GHC chooses Any :: *. Such pointless polymorphism can indeed happen at any kind, so Any is polymorphic in its return kind. (Before kind polymorphism, Any existed as a special form. It’s not so special any more.)

    It’s quite important that Any is a type family, not a datatype. If it were a datatype (and we forget about the weirdness of a datatype polymorphic in its return kind), then it could be matched in type patterns, making this strangeness possible:

    type family ThreeBools (b :: Bool) :: Nat where
      ThreeBools 'True  = 1
      ThreeBools 'False = 2
      ThreeBools Any    = 3

    Bizarre! And yet this was possible in GHC 7.6 and below (modulo the closed type family, which didn’t exist yet). And it caused problems.

    Of course, with Any, we really want it to be stuck. We don’t want Any to reduce to some other type, and we don’t want uses of Any to be errors.

  2. In his work implementing a units-of-measure library via type-checker plugins, Adam Gundry uses closed type families to define type functions whose behavior is specified only via a plugin, not by GHC’s built-in type family reduction rules. He wanted empty closed type families specifically, and so implemented them within GHC. The idea is that units-of-measure with their integer powers form an abelian group, which can be solved via custom machinery but not by GHC’s rather limited left-to-right rewriting used in type family reduction.

    Once again, we want a closed type family that is stuck but is not an error. The existence of multiple such things got me ruminating on all of this some time ago. I attempted to figure this all out in a post on a GHC ticket, but that didn’t get me very far. And I won’t claim to have made any real progress since that post.

Open type families

Having a stuck open type family makes moderately more sense than a stuck closed type family. Suppose we have this:

> type family F2 a
> type instance F2 Int = Bool

Because new equations can be added to open type families at any time, talking about F2 Char isn’t utterly bogus. But it might be nice to be required to assert that an equation for F2 Char is in scope before we use F2 Char in a type.
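
Associated types already give a taste of what such a requirement could look like: the class constraint announces that a matching equation exists. Here is a sketch (the names are mine; the enforcement is what GHC lacks today):

> class CF2 a where
>   type F2' a
> instance CF2 Int where
>   type F2' Int = Bool
> 
> useF2 :: CF2 a => F2' a -> F2' a
> useF2 x = x

As things stand, GHC happily accepts F2' Char in a signature even without a CF2 Char instance in scope; the proposal at the end of this post would make that an error.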

Injective type families

Injective type families are a new feature that was just merged into GHC HEAD. Jan Stolarek, Simon Peyton Jones, and I had a paper at Haskell Symposium about them.

The idea is that you can put an injectivity annotation on a type family declaration, which is rather like a functional dependency. (The injective type family code in this post is not runnable, as it works only with GHC HEAD. The feature will be available with GHC 8.0, due out early 2016.)

type family F3 a = r | r -> a where
  F3 Int  = Bool
  F3 Bool = Int
  F3 a    = a

Note that if you know the value of F3 a, then you can discover the value of a: that’s what injectivity is all about.

Now consider this:

type family F4 a = r | r -> a where
  F4 (Maybe a) = a

Looks injective, right? Wrong. The problem is that, if this were injective, we could take a proof that F4 a ~ F4 b and get a proof of a ~ b. Now, consider F4 (Maybe (F4 Int)). This reduces to F4 Int. Thus, we have F4 (Maybe (F4 Int)) ~ F4 Int. And now, by injectivity, we get Maybe (F4 Int) ~ Int. That is, a Maybe <blah> equals an Int. Terrible!

So, in our implementation of injectivity, we exclude type families like F4. But it’s really unexpected (at least to me) that F4 should be rejected. The problem, at its core, is about partiality. Note that we needed F4 Int – an uncovered case – to make the bad equality proof.

Looping type families are strange

At first blush, it seems that looping type families cause chiefly one problem: a looping GHC. But I’m very unconcerned about a looping GHC. After all, if a user has written a non-terminating type family, that amounts to a non-terminating program to be run at compile time. No one should be very surprised when a program that contains a loop does, in fact, loop.

The problem is that looping type families can actually create effectively infinite types. GHC has various protections against actually running off to infinity, but a clever programmer can defeat these protections. And infinite types can cause unsoundness. Indeed, while writing the closed type family paper, we uncovered one such unsoundness, closing bug #8162.

However, even with #8162 gone, we still have a lingering problem. Consider

> type family F5 a b where
>   F5 a a = 'True
>   F5 a b = 'False

This is a straightforward equality test, and it works in many cases. But if we ask F5 a (Maybe a) (for some unknown a), the type family doesn’t reduce. It really should! There is no way a finite a can be the same as Maybe a. But we can have an infinite a, instantiated to MaybeInf:

> type family MaybeInf where MaybeInf = Maybe MaybeInf

MaybeInf here is a perfectly well-formed nullary (0-argument) type family. If the a in F5 a (Maybe a) were instantiated with MaybeInf, then reducing to 'True would be correct. And thus GHC can’t reduce F5 a (Maybe a) to 'False, as one might like.

And this isn’t purely an academic concern. I don’t have links I can share, but this has bitten real programmers doing real work. Once again, the problem boils down to partiality: GHC must be conservative about reducing closed type families (such as F5) because of the possibility that someone, somewhere will write a non-terminating type family. If F5 a (Maybe a) were to reduce to 'False, I’m confident that a clever enough person could use this behavior to implement unsafeCoerce. Similar concerns prevent certain type families from being considered injective – see the paper for the example. If we could somehow be sure that there were no non-terminating type families, these concerns would go away.

Contrast with functional dependencies

Simply put, functional dependencies don’t suffer from these problems. Let’s look at the issue around injectivity, as it’s representative. First, recast in terms of functional dependencies:

> class F4' a r | a -> r, r -> a
> instance F4' (Maybe a) a

Note the bidirectional functional dependency. The left-to-right dependency is the built-in dependency inherent in type families. The right-to-left dependency arises from the injectivity annotation.

Our problem case above was F4 (Maybe (F4 Int)). This case maps to

fundep :: (F4' Int f4_int, F4' (Maybe f4_int) result) => ()
fundep = ()

GHC helpfully tells us

Couldn't match type ‘Maybe f4_int’ with ‘Int’
arising from a functional dependency between:
  constraint ‘F4' Int f4_int’
    arising from the type signature for
                   fundep :: (F4' Int f4_int, F4' (Maybe f4_int) result) => ()
  instance ‘F4' (Maybe a) a’

So GHC notices that there is a problem. But even if it didn’t, we’d be OK. That’s because the F4' Int f4_int constraint essentially asserts that there is some type f4_int determined by F4' Int. Of course, no instance provides one, and so fundep would never be callable. Indeed, we can’t write any more instances of F4', as anything we could write would violate the dependencies.

The bottom line is that there is no way to get into trouble, because the constraints necessary to use bogus functional dependencies can never be satisfied.

A radical proposal

So, what should we do about all of this? I’m inspired by the non-problems with functional dependencies. I thus propose the following. (Actually, I don’t. This is too radical. But I’m trying to provoke discussion. So take this all with a heap of salt.)

  1. Remove unassociated open type families. They’re quite awkward anyway – either they should be closed and define some sort of concrete function, or they will need a class constraint nearby to be useful.
  2. Require class constraints to use associated type families. That is, if F a is associated with a class C a, writing F Int without a C Int constraint in scope is an error. This means that the use of an associated type family in the right-hand side of another associated type family will require the class constraint in the instance context.
  3. Require closed type families to be total, with an improved termination checker. Naturally, closed type families would be unable to call partial type families.
  4. Devise a mechanism to associate a closed type family with a class. This would allow for partial closed type families, which indeed are useful.

The bottom line to all of this is that any partial type family should be associated with a class constraint. If the constraint can be satisfied, this means that the type family is defined and well-behaved at the arguments provided. With this, I believe all the problems above go away.

Another way to see this proposal is that functional dependencies are the Right Way to do this, and type families (at least partial ones) are wonky. But type families have much better syntax, and I’d like to keep that.

So, where to go from here? I’m not sure. This proposal is pretty radical, but it could potentially be implemented with the requisite deprecation period, etc. But, mostly, I’m curious to see what conversation this all provokes.

Planned change to GHC: merging types and kinds

I’m proud to announce that I’m nearing completion on a major patch to GHC, merging types and kinds. This patch has been in development since late 2012 (!), with many interruptions in the meantime. But I really do think it will make it for 7.12, due out early 2016. This post is meant to generate discussion in the community about the proposed changes and to get feedback about any user-facing aspects which might be of interest.

Motivation

The real motivation for writing this is that it’s a key step toward dependent types, as described in the paper laying out the theory that underlies this patch. But other motivation is close to hand as well. This patch fixes GHC bug #7961, which concerns promotion of GADTs – after this patch is merged, all types can be used as kinds, because kinds are the same as types! This patch also contributes toward the solution of the problems outlined in the wiki page for the concurrent upgrade to Typeable, itself part of the Distributed Haskell plan.

Examples

Below are some fun examples that compile with my patch. As usual, this page is a literate Haskell file, and these examples really do compile! (I haven’t yet implemented checking for the proposed extension StarInStar, which this will require in the end.)

> {-# LANGUAGE DataKinds, PolyKinds, GADTs, TypeOperators, TypeFamilies #-}
> {-# OPTIONS_GHC -fwarn-unticked-promoted-constructors #-}
> 
> -- a Proxy type with an explicit kind
> data Proxy k (a :: k) = P
> prox :: Proxy * Bool
> prox = P
> 
> prox2 :: Proxy Bool 'True
> prox2 = P
> 
> -- implicit kinds still work
> data A
> data B :: A -> *
> data C :: B a -> *
> data D :: C b -> *
> data E :: D c -> *
> -- note that E :: forall (a :: A) (b :: B a) (c :: C b). D c -> *
> 
> -- a kind-indexed GADT
> data TypeRep (a :: k) where
>   TInt   :: TypeRep Int
>   TMaybe :: TypeRep Maybe
>   TApp   :: TypeRep a -> TypeRep b -> TypeRep (a b)
> 
> zero :: TypeRep a -> a
> zero TInt            = 0
> zero (TApp TMaybe _) = Nothing
> 
> data Nat = Zero | Succ Nat
> type family a + b where
>   'Zero     + b = b
>   ('Succ a) + b = 'Succ (a + b)
> 
> data Vec :: * -> Nat -> * where
>   Nil  :: Vec a 'Zero
>   (:>) :: a -> Vec a n -> Vec a ('Succ n)
> infixr 5 :>
> 
> -- promoted GADT, and using + as a "kind family":
> type family (x :: Vec a n) ++ (y :: Vec a m) :: Vec a (n + m) where
>   'Nil      ++ y = y
>   (h ':> t) ++ y = h ':> (t ++ y)
> 
> -- datatype that mentions *
> data U = Star *
>        | Bool Bool
> 
> -- kind synonym
> type Monadish = * -> *
> class MonadTrans (t :: Monadish -> Monadish) where
>   lift :: Monad m => m a -> t m a
> data Free :: Monadish where
>   Return :: a -> Free a
>   Bind   :: Free a -> (a -> Free b) -> Free b
> 
> -- yes, * really does have type *.
> type Star = (* :: (* :: (* :: *)))

Details

More details are in the wiki page for this redesign. As stated above, I’d love your feedback on all of this!

singletons v0.9 Released!

I just uploaded singletons-0.9.2 to hackage. This is a significant upgrade from previous versions, and it gives me a good chance to explain some of its features here.

This post is a literate Haskell file. Copy and paste it into a .lhs file, and you’re off to the races. But first, of course, a little throat-clearing:

> {-# LANGUAGE TemplateHaskell, PolyKinds, DataKinds, TypeFamilies,
>              ScopedTypeVariables, GADTs, StandaloneDeriving, RankNTypes,
>              TypeOperators #-}
> 
> import Data.Singletons.TH
> import Unsafe.Coerce    -- don't hate me yet! keep reading!

The singletons library was developed as part of the research behind this paper, published at the Haskell Symposium, 2012.

What are singletons?

Singleton types are a technique for “faking” dependent types in non-dependent languages, such as Haskell. They have been known for some time – please see the original research paper for more history and prior work. A singleton type is a type with exactly one value. (Note that undefined is not a value!) Because of this curious fact, learning something about the value of a singleton type tells you about the type, and vice versa.

A few lines of example is worth several paragraphs of awkward explanation, so here we go (the underscores are to differentiate from our second version, below):

> data Nat_ = Zero_ | Succ_ Nat_
> data SNat_ :: Nat_ -> * where
>   SZero_ :: SNat_ Zero_
>   SSucc_ :: SNat_ n -> SNat_ (Succ_ n)
> 
> plus_ :: Nat_ -> Nat_ -> Nat_
> plus_ Zero_     n = n
> plus_ (Succ_ m) n = Succ_ (plus_ m n)
> 
> type family Plus_ (m :: Nat_) (n :: Nat_) :: Nat_
> type instance Plus_ Zero_     n = n
> type instance Plus_ (Succ_ m) n = Succ_ (Plus_ m n)
> 
> sPlus_ :: SNat_ m -> SNat_ n -> SNat_ (Plus_ m n)
> sPlus_ SZero_     n = n
> sPlus_ (SSucc_ m) n = SSucc_ (sPlus_ m n)

Here, SNat_ defines a singleton family of types. Note that, for any n, there is exactly one value of SNat_ n. This means that when we pattern-match on an SNat_, we learn about the type variable along with the term-level variable. This, in turn, allows for more type-level reasoning to show correctness for our code. See the paper for more explanation here.

Using singletons, we can pretend Haskell is dependently typed. For example, I have written a richly-typed database client and a provably* correct sorting algorithm using singletons.

*Of course, Haskell is not a total language (that is, it has undefined and friends), so any proof should be viewed with suspicion. More accurately, it is a proof of partial correctness. When the sorting algorithm compiles in finite time and when it runs in finite time, the result it produces is indeed a sorted list.

Don’t Repeat Yourself

The above definitions are neat and all, but they sure are annoying. Haskell’s built-in promotion mechanism duplicates Nat_ at the type and kind level for us, but we have to be responsible for all three versions of plus_. Let’s use the singletons library to help us!

> $(singletons [d|
>   data Nat = Zero | Succ Nat
>     deriving Eq
> 
>   plus :: Nat -> Nat -> Nat
>   plus Zero     n = n
>   plus (Succ m) n = Succ (plus m n)
>   |])

The code above is a Template Haskell splice, containing a call to the singletons function (exported from Data.Singletons.TH). That function’s one argument is a Template Haskell quote, containing the abstract syntax tree of the quoted definitions. The singletons library chews on those definitions to produce all the definitions above, and more.

To demonstrate the usefulness of singletons, we’ll need length-indexed vectors:

> data Vec :: * -> Nat -> * where
>   VNil  :: Vec a Zero
>   VCons :: a -> Vec a n -> Vec a (Succ n)
> 
> instance Show a => Show (Vec a n) where
>   show VNil        = "VNil"
>   show (VCons h t) = show h ++ " : " ++ show t

Now, we can write a well-typed vReplicate function:

> vReplicate :: SNat n -> a -> Vec a n
> vReplicate SZero      _ = VNil
> vReplicate (SSucc n') x = VCons x (vReplicate n' x)

This works as expected:

ghci> vReplicate (SSucc (SSucc (SSucc SZero))) "hi"
  "hi" : "hi" : "hi" : VNil

Even better, we can make the numerical argument to vReplicate implicit, using SingI. The SingI class is very simple:

class SingI (a :: k) where
  sing :: Sing a

A dictionary for (that is, a class constraint of) SingI just holds an implicit singleton. (See the paper for more info about Sing, which I won’t go over in this post.) Now, we can define the improved vReplicateI:

> vReplicateI :: forall a n. SingI n => a -> Vec a n
> vReplicateI x =
>   case sing :: SNat n of
>     SZero    -> VNil
>     SSucc n' -> VCons x (withSingI n' $ vReplicateI x)

ghci> vReplicateI "hi" :: Vec String (Succ (Succ (Succ Zero)))
  "hi" : "hi" : "hi" : VNil

Cool!

At this point, you may also want to check out the generated documentation for the singletons library to see a little more of what’s going on. The rest of this post will focus on some of the strange and wonderful new features of v0.9.

Magic SingI dictionaries

Previous versions of singletons had definitions like this:

data instance Sing (n :: Nat) where
  SZero :: Sing Zero
  SSucc :: SingI n => Sing n -> Sing (Succ n)

The key difference here is the SingI constraint in the SSucc constructor. This constraint meant that if you pattern-matched on an SNat and got a SSucc, you would get both an explicit singleton for (n-1) and an implicit singleton (that is, a SingI dictionary) for (n-1). This was useful, and it meant that the old version of vReplicateI wouldn’t need the withSingI business. But, it also meant that every node in a singleton had duplicated information; since each implicit singleton contains its own copy of everything below it, singleton trees were exponentially larger than necessary. Yuck. Somehow, neither my advisor, nor our reviewers, nor I noticed this before. My advisor (Stephanie Weirich) and I somehow convinced ourselves that the duplication would lead to trees that were double the necessary size, which we deemed acceptable. Oops!

In singletons 0.9, though, a singleton just contains the explicit version. We then needed a way to convert from explicit ones to implicit ones. To do this, I used a trick I learned from Edward Kmett at ICFP this year: take advantage of the fact that classes with exactly one method (and no superclass) are represented solely by the method, and nothing else. Thus, a dictionary for SingI is actually the same, in memory, as a real singleton! To go from explicit to implicit, then, we just have to wave a magic wand and change the type of a singleton from Sing a to SingI a.

The magic wand is easy; it’s called unsafeCoerce. What’s a little trickier is the fact that, of course, we can’t have dictionaries in the same place as normal datatypes in Haskell code. The first step is to create a way to explicitly talk about dictionaries. We make a datatype wrapper:

> data SingInstance (a :: k) where
>   SingInstance :: SingI a => SingInstance a

To call the SingInstance constructor, we need to have a SingI a lying around. When we pattern-match on a SingInstance, we get that SingI a back. Perfect.

Now, we need a way to call the SingInstance constructor when we have an explicit singleton. Naively, we could imagine saying something like

... (unsafeCoerce SingInstance :: Sing a -> SingInstance a) ...

because, after all, SingI a => SingInstance a is the same under the hood as Sing a -> SingInstance a. The problem here is that as soon as we say SingInstance in Haskell code, GHC helpfully tries to solve the arising SingI a constraint – something we do not want here. (Once the SingInstance is instantiated, its type is just SingInstance a, which is not the same as Sing a -> SingInstance a!) The answer is to use a newtype that prevents instantiation:

> newtype DI a = Don'tInstantiate (SingI a => SingInstance a)

Now, after a call to the Don'tInstantiate constructor, GHC will refrain from instantiating. Great – now we just need to connect the dots:

> singInstance :: forall (a :: k). Sing a -> SingInstance a
> singInstance s = with_sing_i SingInstance
>   where
>     with_sing_i :: (SingI a => SingInstance a) -> SingInstance a
>     with_sing_i si = unsafeCoerce (Don'tInstantiate si) s

It’s dirty work, but someone’s got to do it. And it saves us from exponential blow-up, so I’d say it’s worth it. The withSingI function we saw used above is just a convenient wrapper:

> withSingI :: Sing n -> (SingI n => r) -> r
> withSingI sn r =
>   case singInstance sn of
>     SingInstance -> r

Decidable propositional equality

A previous post on this blog discussed the difference between Boolean equality and propositional equality. Previous versions of singletons contained the SEq “kind class” to use Boolean equality on singleton types. Singletons 0.9 also contains the SDecide class to allow for decidable propositional equality on singleton types.

Before we dive right into SDecide though, let’s review a few new definitions in the standard library (base package) shipping with GHC 7.8. Under Data.Type.Equality, we have these handy definitions:

data a :~: b where
  Refl :: a :~: a

gcastWith :: (a :~: b) -> ((a ~ b) => r) -> r
gcastWith Refl x = x

class TestEquality (f :: k -> *) where
   testEquality :: f a -> f b -> Maybe (a :~: b)

The idea behind the TestEquality class is that it should classify datatypes whose definitions are such that we can (perhaps) learn about the equality of type variables by looking at terms. Singletons are the chief candidates for instances of this class. Typeable almost is, but it’s at the wrong kind – k -> Constraint instead of k -> *. (See the new function Data.Typeable.eqT for comparison.)

SDecide takes the TestEquality idea one step further, providing full decidable propositional equality. See the previous post on propositional equality for more background.

data Void
type Refuted a = a -> Void
data Decision a = Proved a
                | Disproved (Refuted a)

class (kproxy ~ 'KProxy) => SDecide (kproxy :: KProxy k) where
  (%~) :: forall (a :: k) (b :: k). Sing a -> Sing b -> Decision (a :~: b)

We can now use (%~) to (perhaps) produce an equality that GHC can use to complete type inference. Instances of SDecide (and of SEq, for that matter) are generated for any datatype passed to the singletons Template Haskell function that derives Eq. Or, you can use other functions exported by Data.Singletons.TH to create these instances; see the generated documentation.
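
As a small usage sketch (my code; it assumes the SDecide instance that the splice above generates for Nat):

sameNat :: SNat m -> SNat n -> Maybe (m :~: n)
sameNat m n = case m %~ n of
  Proved Refl -> Just Refl
  Disproved _ -> Nothing

Pattern-matching on Proved Refl brings m ~ n into scope, so GHC can then, say, accept a Vec a m where a Vec a n is expected.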

Future work

While the improvements in v0.9 are substantial, there is still much distance to cover. In particular, I conjecture that almost any function definable at the term level can be promoted to the type level. The exceptions would be unpromotable datatypes (like Double, IO, or GADTs) and higher-rank functions (there are no higher-rank kinds). Short of that, I think it’s all within reach.

How?

  • The th-desugar library desugars Haskell’s fancy constructs into a manageable set.
  • Case statements can be straightforwardly encoded using lambda lifting.

But, I don’t seem to have the time to put this all into action. If you’re interested in taking some or all of this on, I’d be very happy to collaborate. I believe some interesting research-y things might come out of it all, too, so there might even be something publishable in it. Drop me a line to discuss!

Roles: a new feature of GHC

Last week, I pushed an implementation of roles to GHC HEAD. This post explains what roles are, why they exist, and why you should care. Roles will be a part of GHC 7.8.

An old problem

Roles fix a problem in GHC’s type system that has existed for years.

The problem is the combination of GeneralizedNewtypeDeriving (henceforth called GND) and type families. An example is always best:

> {-# LANGUAGE GeneralizedNewtypeDeriving, TypeFamilies, StandaloneDeriving,
>              MultiParamTypeClasses, GADTs #-}
> 
> module Roles where
> 
> class Incrementable a where
>   incr :: a -> a
> 
> instance Incrementable Int where
>   incr x = x + 1
> 
> newtype Age = MkAge Int
>   deriving Incrementable

The idea is that, because the declaration for Age says that any Age is really just an Int in disguise, we can just re-use the instance for Incrementable Int and make it into an instance for Incrementable Age. Internally, Age has the exact same representation as Int, so GHC can really just reuse the methods that were defined for Int.
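
In GHC 7.8, where GND is implemented on top of the new Coercible machinery, the derived instance is morally this (a sketch, not the exact code GHC generates):

instance Incrementable Age where
  incr = coerce (incr :: Int -> Int)

Here coerce (from Data.Coerce) is the free, representation-preserving conversion.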

So far, so good. This makes good sense, allows for less boilerplate code, and is efficient at runtime. Yet, a problem lurks if we start using some type families:

> type family IsItAgeOrInt a
> type instance IsItAgeOrInt Int = Bool
> type instance IsItAgeOrInt Age = Char
> 
> class BadIdea a where
>   frob :: a -> IsItAgeOrInt a
> 
> instance BadIdea Int where
>   frob x = (x > 0)
> 
> deriving instance BadIdea Age

Calling frob on an Int produces nothing strange:

ghci> frob 5
  True

ghci> frob 0
  False

ghci> frob (-3)
  False

But, what happens if we call frob on an Age? According to frob’s type, calling frob on an Age should produce an IsItAgeOrInt Age – that is, a Char. But, we have reused the definition of frob for Int in the instance for Age! The result: general unhappiness:

ghci> frob (MkAge 5)
  '\-1152921504589753323'

ghci> frob (MkAge 0)
  '\4375976000'

ghci> frob (MkAge (-3))
  '\4375976000'

We’ve broken the type system.

This problem is fairly well-known. The GHC Trac has three bugs about it (#1496, #4846, and #7148), there was a POPL paper about it, and at least one blog post. The bug – which rightly is a flaw of the theory, not the implementation – is one of the reasons that modules with GND enabled are not considered part of the Safe Haskell subset. (The other reason is that GND can be used to break module abstraction, a subject which is not considered further here. See ticket #5498.)

The solution comes from that POPL paper: assign so-called roles to type variables to constrain how those type variables are used.

Roles

Why precisely is it a bad idea to say deriving instance BadIdea Age? Because a method in that class uses a type family in its type, and type families can tell the difference between Int and Age. The GND mechanism, though, pretends that Int and Age are the same. Thus, the two bits induce GHC to experience some cognitive dissonance, and we all suffer.

Roles label the type variables of datatypes, classes, and vanilla type synonyms, indicating whether or not all uses of those variables respect the equality between a newtype and its representation. If all uses of a variable do respect newtype-induced equalities, we say that the variable’s role is R (for “representational”). If it might be possible for a use of the variable to spot the difference between Age and Int, we say that the variable’s role is N (for “nominal”). There is a third role, P for “phantom”, which I will get to later.

In our example, the parameter to Incrementable would have role R, while BadIdea would get role N. Because of these role assignments, GHC HEAD will refuse to compile the code above. It issues this error:

Can't make a derived instance of ‛BadIdea Age’
  (even with cunning newtype deriving):
  it is not type-safe to use GeneralizedNewtypeDeriving on this class;
  the last parameter of ‛BadIdea’ is at role N
In the stand-alone deriving instance for ‛BadIdea Age’

Role inference

How do we know what role to use for what parameter? We use role inference, of course! Role inference is actually quite straightforward. GHC will look at all uses of a type variable in a datatype, class, or vanilla type synonym definition and see if any of those uses are at role N. If any uses are at role N, then the variable itself must be at role N. Otherwise, (if the variable is used at all) it gets role R. The base cases are type families, (~), and certain type applications. All arguments to type families naturally get role N, as do the two arguments to the type equality operator (~). (This is because (~) will not say that Age and Int are equal.) Because GADTs are internally represented using (~), any GADT-like parameter will also be at role N.
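
For instance, role inference assigns these roles (my examples):

> data MostlyFine a = MkMF [a]             -- a is at role R
> data UsesTF a = MkUTF (IsItAgeOrInt a)   -- a is at role N: under a type family
> data GadtLike a where                    -- a is at role N: internally,
>   MkGL :: GadtLike Int                   -- MkGL :: (a ~ Int) => GadtLike a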

Above, I said that certain type applications cause a role to be forced to N. This is a little subtler than the other cases, and needs an example:

> class Twiddle a b where
>   meth :: a Int -> a b
> 
> instance Twiddle [] Int where
>   meth = id
> 
> data GADT a where
>   GInt :: Int -> GADT Int
> deriving instance Show (GADT a)
> 
> instance Twiddle GADT Int where
>   meth = id
> 
> deriving instance Twiddle GADT Age
> 
> weird :: GADT Age
> weird = meth (GInt 5)

ghci> weird
  GInt 5

ghci> :t weird
  weird :: GADT Age

What’s going on here? As usual, the GND mechanism just appropriates the definition for meth wholesale for the instance for Age. That method is just the identity function. But, in the Age instance for Twiddle, the meth method has type GADT Int -> GADT Age – clearly not an identity function. Yet, it still works just fine, creating the ill-typed weird. A little more pushing here can create unsafeCoerce and segfaults.

But we already knew that GADTs behaved strangely with respect to GND. The difference in this case is that the derived class, Twiddle, does not mention any GADTs. The solution is that, whenever a type variable is used as the argument to another type variable, such as b in the definition of Twiddle, that variable gets role N. The variable a has nothing unusual about it, so it gets role R.

Phantom roles

There is ongoing work (not mine) on implementing newtype wrappers, as described here. Newtype wrappers will allow you to write code that “lifts” a newtype equality (such as the one between Age and Int) into other types (such as equating [Age] with [Int]). These lifted equalities would have no runtime cost. This is quite different than the situation today: although Age and Int can be converted for free, converting [Age] to [Int] requires iterating down the list.

With that application in mind, consider this type:

> data Phantom a = MkPhantom Int

Should we be able to convert [Phantom Bool] to [Phantom Char]? Sure we should. Labeling a’s role as P allows for this. The technical details of how P works internally are beyond the scope of this post (but you could see here for starters), but we are guaranteed that any variable at role P is never actually used in the representation of a datatype.

Role annotations

Sometimes, an implementation of an idea doesn’t quite match the abstraction. For example, the definition of Ptr, GHC’s type for pointers that might be used with foreign functions, is this:

data Ptr a = Ptr Addr#

If left to its own devices, GHC would infer role P for the type parameter a, because a is not used in the definition of Ptr. Yet, that goes against what we have in mind – we don’t really want to convert Ptr Ints to Ptr (Bool -> Bool)s willy-nilly. So, we use a role annotation (enabled with -XRoleAnnotations) which allows the programmer to override GHC’s inference, thus:

data Ptr a@R = Ptr Addr#

This role annotation (the @R) forces a’s role to be R, as desired. Note that you can’t force an unsafe role, such as requiring BadIdea’s parameter to be at role R. Role annotations only allow for stricter role assignments.

How does all this affect you?

Hopefully, roles won’t affect your code too much. But, it is possible that some code that has previously worked now will not. Although code breakage is rightly seen as a terrible thing, it’s actually intentional here: much of the broken code probably has a type error lurking somewhere. In my experience, there are two problems that may arise:

  • Uses of GND that worked previously and you know are safe now fail. First off, are you sure that it’s safe? Sometimes, the answer is a resounding “yes!”, but GHC still won’t let you use GND. You will have to write an instance yourself, or provide other wrappers. In the design of this feature, we have considered adding a way to use GND unsafely, but we’re not sure what the demand will be. Do you need to use GND unsafely? Let me know.
  • In .hs-boot files (see the relevant section of GHC’s user manual), all declarations must match exactly with the declarations in the corresponding .hs file. This includes roles. However, it is acceptable to leave out definitions in a .hs-boot file. By default, roles are guessed to be R in .hs-boot files. If you have an .hs-boot file that declares a datatype or class whose definition is omitted and whose parameters are not at role R, you will have to add a role annotation. I’m hoping this doesn’t come up too often.

Separately from breakage, writers of libraries may also want to think about whether a role annotation would be helpful in their declarations. The best reason I can think of to do this is to prevent users from using GND on a class. For example, a Set uses a type’s Ord instance to order the data internally. But, the definition of Set does not use type families or other features forcing an N role. So, Set’s parameter will be at role R. Yet, if GND is used to lift a class mentioning Set from, say, Int to Age, the Ord instances for Int and Age might be different, and bad behavior will result. Note that this “bad behavior” would not be a type error, just incorrect runtime behavior. (If it were a type error, roles would hopefully stop this from happening!) The upshot of this is that Set’s parameter should be labeled with role N.
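
To illustrate the hazard (a sketch, not part of this module; Desc and HasElem are made-up names):

import Data.Set (Set)
import qualified Data.Set as Set

newtype Desc = MkDesc Int deriving (Eq, Show)

-- Desc deliberately reverses Int's ordering
instance Ord Desc where
  compare (MkDesc a) (MkDesc b) = compare b a

class HasElem a where
  elemOf :: a -> Set a -> Bool

instance HasElem Int where
  elemOf = Set.member

-- deriving instance HasElem Desc
-- If Set's parameter were at role R, this GND instance would reuse Int's
-- Set.member to search a Desc-ordered tree using Int's ordering, wrongly
-- reporting present elements as absent -- a runtime bug, not a type error.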

Further reading

To learn more about roles in GHC, you can check out the up-to-date wiki pages on the subject, for users and for implementers.

If you run into problems, do let me know!

Coincident Overlap in Type Families

Haskell allows what I will call coincident overlap among type family instances. Coincident overlap occurs when two (or more) type family instances might be applicable to a given type family usage site, but they would evaluate to the same right-hand side. This post, inspired by Andy Adams-Moran’s comment to an earlier blog post, explores coincident overlap and how to extend it (or not!) to branched instances.

This post is a literate Haskell file (though there really isn’t that much code). Paste it into a .lhs file and load it into GHCi. Because the post uses branched instances, you will need the HEAD version of GHC. (Branched instances will be included in GHC 7.8, but not before.)

> {-# LANGUAGE TypeFamilies, DataKinds, GADTs, TypeOperators #-}

Our examples will be over Bools, and we need some way to get GHC to evaluate our type families. The easiest way is to use the following singleton GADT:

> data SBool :: Bool -> * where
>   SFalse :: SBool False
>   STrue  :: SBool True

Conflict checking with type family instances

When compiling type family instances, GHC checks the instances for conflicts. To know if two instances conflict (i.e., could both match the same usage site), GHC unifies the two left-hand sides. For example, the following code is bad and is rejected:

type family F x
type instance F Int = Bool
type instance F a   = Double

Compiling the above instances gives the following error message:

Conflicting family instance declarations:
  F Int -- Defined at ...
  F a -- Defined at ...

This check is a good thing, because otherwise it would be possible to equate two incompatible types, such as Int and Bool.

Coincident overlap among unbranched instances

Here is a nice example of how coincident overlap is useful:

> type family (x :: Bool) && (y :: Bool) :: Bool
> type instance False && a     = False   -- 1
> type instance True  && b     = b       -- 2
> type instance c     && False = False   -- 3
> type instance d     && True  = d       -- 4

Although the first two equations fully define the && operation, the last two instances allow GHC to reduce a use of && that could not otherwise be reduced. For example:

> and1 :: SBool a -> SBool (True && a)
> and1 x = x
> 
> and2 :: SBool a -> SBool (a && True)
> and2 x = x

and1 uses the second instance of &&, but and2 requires the fourth instance. If we comment that instance out, and2 fails to compile, because GHC cannot figure out that a && True must be a for all values of a. For various good reasons, perhaps to be explored in another post, GHC does not do case analysis on types during type inference.

How does GHC know that overlap is coincident? During the conflict check, GHC looks for a substitution that unifies two potentially-conflicting instances. In our case, the fourth and first instances would conflict under the substitution {a |-> True, d |-> False}. However, after finding the unifying substitution, GHC checks the right-hand sides under that same substitution. If they are the same, then GHC considers the overlap to be coincident and allows the instance pair. In our case, GHC applies the substitution {a |-> True, d |-> False} to the right-hand sides False and d and discovers that both are False, and so the instances are allowed.

Coincident overlap within branched instances

When thinking about branched instances and coincident overlap, there are two possibilities to consider: coincident overlap within a branched instance and coincident overlap among two separate branched instances. Let’s consider the first case here.

Imagine we define || analogously to &&, but using one branched instance:

> type family (x :: Bool) || (y :: Bool) :: Bool
> type instance where
>   False || a     = a     -- 1
>   True  || b     = True  -- 2
>   c     || False = c     -- 3
>   d     || True  = True  -- 4

Now, let’s consider simplifying the type e || False. The first two branches don’t match, but the third does. Now, following the rule for branched instance simplification (as stated in the Haskell wiki), we check to see if any previous branches might be applicable to e || False, for any possible instantiation of e. The first branch certainly might apply, and so e || False fails to simplify. This is surely counterintuitive, because the third branch matches e || False exactly!

Just to prove this behavior, I tried running this code through GHC:

bar :: SBool a -> SBool (a || False)
bar x = x

Here is the error:

Couldn't match type ‛a’ with ‛a || 'False’

At first blush, it seems I’ve missed something important here in the implementation — allowing coincident overlap within a branched instance. But there is a major problem with such overlap in this setting. Let’s think about how it would work. After selecting the third branch to simplify e || False (with the substitution {c |-> e}), GHC checks to see if any previous branch could be applicable to e || False. The first branch, False || a, unifies with e || False, so it might be applicable later on. The unifying substitution is {a |-> False, e |-> False}. Now, if we wanted to check for coincident overlap, we would apply both substitutions ({c |-> e} and {a |-> False, e |-> False}) to the right-hand sides. In this case, we would see that both right-hand sides become False, and it seems we should allow the simplification of e || False to e.

Let’s try a harder case. What if we want to simplify (G f) || False, for some type family G? The third branch matches, with the substitution {c |-> G f}. Now, we check earlier branches for applicability. The first branch is potentially applicable, if G f simplifies to False. But we can’t produce a substitution over type variables to witness this potential match, so we have no way to check the right-hand sides. In this case, it wouldn’t be hard to imagine a substitution like {(G f) |-> False}, but that’s a slippery slope to slide down. What if f appears multiple times in the type, perhaps under different type family applications? How do we deal with this? There may well be an answer, but it would be subtle and likely quite fragile from the programmer’s point of view. So, we decided to ignore the possibility of coincident overlap within a branched instance. We were unable to come up with a compelling example of why anyone would want this feature, it seemed hard to get right, and we can just write || using separate instances anyway.

Coincident overlap between branched instances

Consider the following (contrived) example:

type family F (x :: Bool) (y :: Bool) :: *
type instance where
  F False True = Int
  F a     a    = Int
  F b     c    = Bool
type instance F d True = Int

Is this set of instances allowable? Is that your final answer?

I believe that this set of instances wouldn’t produce any conflicts. Anything that matches F d True would have to match one of the first two branches of the branched instance, meaning that the right-hand sides coincide. However, it is difficult to reason about such cases, for humans and GHC alike. So, for the sake of simplicity, we also forbid coincident overlap whenever even one instance is branched. This means that

type instance F Int = Bool

and

type instance where F Int = Bool

are very subtly different.
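To spell this difference out with a hypothetical family H (not part of the compiled code): an unbranched instance may overlap coincidentally with other unbranched instances, but its branched twin may not overlap with anything.

type family H x
type instance H Int = Bool
type instance H a   = Bool          -- accepted: coincident overlap

type family H' x
type instance where H' Int = Bool   -- branched, despite its single equation
type instance H' a = Bool           -- rejected: overlaps a branched instance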

Andy Adams-Moran’s example

This post was partially inspired by Andy Adams-Moran’s comment in which Andy poses this example, paraphrased slightly:

> data T a
> type family Equiv x y :: Bool
> type instance where
>    Equiv a      a     = True        -- 1
>    Equiv (T b)  (T c) = True        -- 2
>    Equiv (t d)  (t e) = Equiv d e   -- 3
>    Equiv f      g     = False       -- 4

Alas, this does not work as well as one would like. The problem is that we cannot simplify, say, Equiv (T a) (T b) to True, because this matches the second branch but might later match the first branch. Simplifying this use would require coincident overlap checking within branched instances. We could move the first branch to the third position, and that would help, but not solve the problem. With that change, Equiv x x would not simplify until the head of x were known.
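Concretely, the reordering just mentioned would be the following (a sketch, not part of the compiled code):

type instance where
  Equiv (T b)  (T c) = True        -- now first
  Equiv (t d)  (t e) = Equiv d e
  Equiv a      a     = True        -- moved down from first position
  Equiv f      g     = False

Now Equiv (T a) (T b) simplifies to True via the first branch, but Equiv x x is stuck until the head of x is known, because either of the first two branches might still apply.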

So, is this the “compelling example” we were looking for? Perhaps. Andy, why do you want this? Can your use case be achieved with other means? Do you (anyone out there) have a suggestion for how to deal with coincident overlap within branched instances in a simple, easy-to-explain manner?

Defunctionalization for the win

I enjoy using a type system to help make sure my term-level code is unimpeachably correct. This is where my interest in writing the singletons library came from. This library allows you to write some dependently typed code in Haskell, using singleton types. I didn’t invent this idea, but I did write a nice library to remove some of the pain of using this encoding. SHE can be considered an ancestor of singletons.

At my Haskell Symposium (2012) presentation of the singletons work, an attendee asked if singleton generation works for higher-order functions, like map. I innocently answered “yes”, at which point Conor McBride, sitting in the back, stood up and said “I don’t believe you!” I wasn’t lying — singletons does indeed handle higher-order functions. However, Conor’s skepticism isn’t unfounded: a “singletonized” higher-order function isn’t so useful.

This blog post explores why singletonized higher-order functions aren’t useful and suggests defunctionalization as a way to fix the problem.

Before we get too much further, this blog post is a literate Haskell file, so we have some necessary throat-clearing:

> {-# LANGUAGE TemplateHaskell, DataKinds, PolyKinds, TypeFamilies,
>              GADTs, FlexibleContexts, RankNTypes, TypeOperators #-}
> import Prelude hiding (map)
> import Data.Singletons

I should also warn that some of the code rendered as Haskell in this blog post does not have bird-tracks. This code is not intended as part of the executable code. I should finally note that this code does not compile with the current HEAD of GHC (but it does compile with 7.6.1, at least). The new ambiguity check overzealously flags some of the code here as inherently ambiguous, which it is not. I have filed bug report #7804.

Introduction to singletons

First off, what is a singleton? I will give a brief introduction here, but I refer you to the singletons paper on the subject. A singleton is a type with exactly one value. Let’s make a singleton for the natural numbers:

> data Nat = Zero | Succ Nat
> $(genSingletons [''Nat])

The second line above generates the following singleton definition for Nat:

data SNat :: Nat -> * where
  SZero :: SNat Zero
  SSucc :: SNat n -> SNat (Succ n)

(Well, it doesn’t quite generate that, but let’s pretend it does. See the paper [or use -ddump-splices!] for more details.) According to this definition, there is exactly one value for every SNat n. For example, the type SNat (Succ Zero) has one value: SSucc SZero. This is interesting because it means that once we identify a value, say in a case expression, we also know a type index. This interplay between term-level matching and type-level information is what makes singletons enable something like dependently typed programming.
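For example, pattern matching on an SNat refines the index. Here is a small illustration (a hypothetical function, not part of the generated code):

isZero :: SNat n -> String
isZero SZero     = "zero"      -- in this branch, GHC learns n ~ Zero
isZero (SSucc _) = "nonzero"   -- here, n ~ Succ m for some m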

The singletons library provides a singleton for [], but with alphanumeric names. We can pretend this is the definition:

data SList :: [k] -> * where
  SNil  :: SList '[]
  SCons :: Sing h -> SList t -> SList (h ': t)

The Sing in there (the first parameter to SCons) is defined thus:

data family Sing (a :: k)

Using a data family allows GHC to choose the correct singleton type, depending on the kind k. An instance of this family is defined for every singleton type we create. So, actually, the list type built into the singletons library is more like

data instance Sing (list :: [k]) where
  SNil  :: Sing '[]
  SCons :: Sing h -> Sing t -> Sing (h ': t)

The singletons library also provides a synonym SList to refer to this instance of the Sing family. Again, you may find more clarity in the singletons paper, which spends a little more time drawing all of this out.
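Under this scheme, the SNat singleton from earlier is really an instance of Sing as well, roughly like this (again, a pretend rendering rather than the exact generated code):

data instance Sing (n :: Nat) where
  SZero :: Sing Zero
  SSucc :: Sing n -> Sing (Succ n)

type SNat (n :: Nat) = Sing n   -- kind-specific synonym, like SList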

Singleton first-order functions

We can singletonize more than just datatypes. We can singletonize functions. Consider the following predecessor function on Nats, defined in such a way that we get the singleton definition generated for us:

> $(singletons [d|
>   pred :: Nat -> Nat
>   pred Zero     = Zero
>   pred (Succ n) = n
>   |])

A definition of pred that works on singleton Nats is generated for us. It looks something like

sPred :: SNat n -> SNat ???
sPred SZero     = SZero
sPred (SSucc n) = n

The problem is those pesky ??? marks. Because the type indices of a singleton mirror the computation of singleton values, every function on singletons must be mirrored at the type level. So, to define sPred, we must have the type family Pred as well:

type family Pred (n :: Nat) :: Nat
type instance Pred Zero     = Zero
type instance Pred (Succ n) = n

sPred :: SNat n -> SNat (Pred n)
...

The singletons library generates both the type-level Pred and the singletonized sPred.

Singleton higher-order functions

But what about map?

> $(singletons [d|
>   map :: (a -> b) -> [a] -> [b]
>   map _ []      = []
>   map f (h : t) = f h : map f t
>   |])

The singletons library generates these definitions (but with some extra class constraints that don’t concern us):

type family Map (f :: k1 -> k2) (list :: [k1]) :: [k2]
type instance Map f '[]      = '[]
type instance Map f (h ': t) = (f h) ': (Map f t)

sMap :: (forall a. Sing a -> Sing (f a)) -> Sing list -> Sing (Map f list)
sMap _ SNil        = SNil
sMap f (SCons h t) = SCons (f h) (sMap f t)

Whoa! What’s the bizarre type doing in sMap? The forall declares that the function passed into sMap must be valid for any a. That’s not so strange when we think about the fact that the index a must be isomorphic to the term of type Sing a. We’re used to having functions that work for any term. Here, because of the relationship between term values and type indices, the function must also work for any type index a. This is particularly important, because Map will apply f to every a in list. If we leave off the forall, the function won’t type check.

This is all well and good, but this definition of sMap isn’t useful. This is because the type of that first parameter is quite restrictive. We must have a function of that type, and f must be inferrable. Let’s look at some examples. We can write the following just fine:

> sOne   = SSucc SZero
> sTwo   = SSucc sOne
> sThree = SSucc sTwo
> sNums  = SCons sOne $ SCons sTwo $ SCons sThree SNil -- [1,2,3]
> 
> two_three_four = sMap sSucc sNums

(sSucc is a so-called “smart” constructor. It is equivalent to SSucc, but adds extra class constraints that don’t concern us here. See Section 3.1 of the singletons paper.) The type of SSucc is forall a. Sing a -> Sing (Succ a), so the call to sMap type checks just fine. SSucc is perfect here. Let’s try something else:

zero_one_two = sMap sPred sNums

The type of sPred is forall n. Sing n -> Sing (Pred n), as written above, so one would think all is good. All is not good. The problem is that Pred is a type family, not a regular type constructor like good old Succ. Thus, GHC does not (and cannot, with good reason) infer that f in the type of sMap should be Pred:

Couldn't match type `Pred t1' with `t t1'
Expected type: Sing Nat t1 -> Sing Nat (t t1)
  Actual type: Sing Nat t1 -> Sing Nat (Pred t1)

The reason this inference is bogus is that GHC will not let a type variable unify with a type family. In its internal constraint-solving machinery, GHC assumes that all type variables are both injective and generative. Injective means that from an assumption of t a ~ t b, (where ~ denotes type equality) we can derive a ~ b. Generative means that from an assumption of t a ~ s b, we can derive t ~ s. Type families, in general, have neither property. So, GHC won’t let a type variable unify with a type family.
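To see why type families lack these properties, consider this sketch with two throwaway families (not part of the compiled code):

type family F x
type instance F Int  = Bool
type instance F Char = Bool
-- F Int ~ F Char holds (both reduce to Bool), yet Int ~ Char does not:
-- F is not injective.

type family G x
type instance G Int = Maybe Bool
-- G Int ~ Maybe Bool holds, yet G ~ Maybe does not:
-- G is not generative.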

This problem — called the saturation requirement of type families — is what Conor was thinking about when he disbelieved that singletons handled map.

Defunctionalization

Over lunch while at ICFP, I had the good fortune of sitting with Tillmann Rendel, and we got to talking about this problem. He suggested that I think about defunctionalization. I have thought about this, and I think it’s the answer to the problem.

Defunctionalization is an old technique of dealing with higher-order functions. The idea is that, instead of making a closure or other pointer to code, represent a function with some symbol that can be looked up and linked to the code later. Danvy and Nielsen wrote a more recent paper explaining how the whole thing works. One drawback of the technique that they outline is that defunctionalization tends to require whole-program translation. That is, the transformation requires looking at the entire codebase to do the translation. This is generally necessary so that the table of function “symbols”, encoded as an algebraic datatype, can be matched on. However, in Haskell, we have open type functions, so this problem does not limit us.
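For intuition, here is what classic term-level defunctionalization might look like for our Nat functions (hypothetical names SymPred and SymSucc; not part of the compiled code):

data FunSym = SymPred | SymSucc    -- one symbol per first-order function

apply :: FunSym -> Nat -> Nat      -- the central dispatcher
apply SymPred Zero     = Zero
apply SymPred (Succ n) = n
apply SymSucc n        = Succ n

mapDF :: FunSym -> [Nat] -> [Nat]  -- map over symbols; no closures needed
mapDF _ []      = []
mapDF f (h : t) = apply f h : mapDF f t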

Another drawback of defunctionalization is that it is generally poorly-typed. If we are just using some symbol to denote a function, how can we be sure that a function application is well-typed? Pottier and Gauthier address this issue in their paper on the topic by using generalized algebraic datatypes (GADTs). But, given the way type families work in Haskell, we don’t need the power of GADTs to do this for us.

Encoding defunctionalization in Haskell type families

At the heart of any defunctionalization scheme is an apply function:

type family Apply (f :: k1 -> k2) (a :: k1) :: k2

But wait, we don’t really want Apply to have that kind, because then we would have to pass Pred in as the function, and Pred all on its own is unsaturated. What we need is some symbol that can represent Pred:

type family Apply (f :: *) (a :: k1) :: k2
data PredSym :: *
type instance Apply PredSym n = Pred n

This is progress. We can now pass PredSym around all by itself, and when we apply it, we get the desired behavior. But, this is weakly kinded. We would like to be able to define many symbols akin to PredSym, and we would like GHC to make sure that we use these symbols appropriately — that is, to reject nonsense like Apply PredSym '[Int, Bool].

Yet, we still want to be able to create new symbols at will. So, we want to use data declarations to create the symbols. Thus, the kind of these symbols must end in -> *. But, we have complete freedom as to what appears to the left of that arrow. We will use this definition to store the kinds:

> data TyFun :: * -> * -> *

Now, we can make our richly kinded Apply:

> type family Apply (f :: (TyFun k1 k2) -> *) (x :: k1) :: k2
> data PredSym :: (TyFun Nat Nat) -> *
> type instance Apply PredSym x = Pred x

This one works. But, we want it to work also for real type constructors (like Succ), not just type families. We have to wrap these type constructors in an appropriately kinded wrapper:

> data TyCon :: (k1 -> k2) -> (TyFun k1 k2) -> *
> type instance Apply (TyCon tc) x = tc x
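Now Succ can be passed around as a symbol, too. For example, this hypothetical synonym (not part of the compiled code) reduces as expected:

type Two = Apply (TyCon Succ) (Succ Zero)   -- reduces to Succ (Succ Zero)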

Then, we define a new version of sMap that works with Apply:

> type family DFMap (f :: (TyFun k1 k2) -> *) (ls :: [k1]) :: [k2]
> type instance DFMap f '[]      = '[]
> type instance DFMap f (h ': t) = (Apply f h ': DFMap f t)

sDFMap :: forall (f :: TyFun k1 k2 -> *) (ls :: [k1]).
          (forall a. Sing a -> Sing (Apply f a)) -> Sing ls -> Sing (DFMap f ls)
sDFMap _ SNil        = SNil
sDFMap f (SCons h t) = SCons (f h) (sDFMap f t)

We’re close, but we’re not there yet. This sDFMap function has a major problem: it is inherently ambiguous. The type variable f appears only inside of type family applications, and so there’s no way for GHC to infer its value. This problem has a straightforward solution: use a proxy.

> data Proxy a = P

sDFMap :: forall (f :: TyFun k1 k2 -> *) (ls :: [k1]).
          Proxy f -> (forall a. Sing a -> Sing (Apply f a)) ->
          Sing ls -> Sing (DFMap f ls)
sDFMap _ _ SNil        = SNil
sDFMap p f (SCons h t) = SCons (f h) (sDFMap p f t)

This one really does work, but we can still do better. The problem is that some plumbing is exposed. When calling this version of sDFMap with sPred, we still have to explicitly create the proxy argument and give it the correct type, even though we would like to be able to infer it from sPred. The trick is that, while we do need to have f exposed in the type of sDFMap, the location where it is exposed doesn’t matter. It can actually be in an argument to the callback function. This next, final version also contains those pesky class constraints that we’ve been trying to avoid this whole time.

> sDFMap :: forall (f :: TyFun k1 k2 -> *) (ls :: [k1]).
>           SingKind (KindParam :: OfKind k2) =>
>           (forall a. Proxy f -> Sing a -> Sing (Apply f a)) ->
>           Sing ls -> Sing (DFMap f ls)
> sDFMap _ SNil        = sNil
> sDFMap f (SCons h t) = sCons (f P h) (sDFMap f t)

To call this function, we will need to create wrappers around, say, sPred and sSucc that explicitly relate the functions to their defunctionalization symbols:

> sPred' :: Proxy PredSym -> Sing n -> Sing (Pred n)
> sPred' _ = sPred
> sSucc' :: Proxy (TyCon Succ) -> Sing n -> Sing (Succ n)
> sSucc' _ = sSucc

Now, finally, we can use sDFMap as desired:

> two_three_four' = sDFMap sSucc' sNums
> zero_one_two = sDFMap sPred' sNums

Conclusions

This whole thing is a bit of a hack, but it’s one that seems to follow a nice pattern that could be automated. In particular, I believe it should be relatively straightforward to incorporate this kind of encoding into a future version of singletons. This would allow more code to be singletonized and support dependently typed programming better.

One clear drawback of this approach is that the arity of a defunctionalized function must be a part of the encoding. In some future world with kind families, it may be possible to generalize the arities. One idea I have for a future blog post is to grapple with higher-arity and partially applied functions, which may be a bit icky. And, what about defunctionalizing higher-order functions?

The question at the end of all of this is: could this be an approach to desaturating type families in Haskell? In other words, could an approach based on the ideas here be incorporated into GHC to make all of this Just Work? I’ve thought about it a bit, but not long enough or hard enough to really have an opinion. What do you think?

And, if I write this functionality into singletons, will it be useful to you? We all have limited time, and it’s helpful to know if such an improvement will be put to use.

Ordered overlapping type family instances

I am pleased to announce that I have pushed my implementation of ordered overlapping type family instances to GHC HEAD.

This blog post is a literate Haskell file. Copy and paste into a .lhs file to try out this code. This file will only compile with GHC HEAD, however.

We need some header formalities:

> {-# LANGUAGE TypeFamilies, DataKinds, PolyKinds, TypeOperators #-}
> import Prelude hiding (zipWith)

The Problem

When writing term-level functions, it is natural to write a series of equations, each using a sequence of patterns to select which equation should be triggered when calling the function. Critically for this discussion, the first matching equation is used. Let’s use a particular favorite function of mine as an example:

> zipWith :: (a -> b -> c) -> [a] -> [b] -> [c]
> zipWith f (a:as) (b:bs) = (f a b):(zipWith f as bs)
> zipWith _ _      _      = []

Let’s try to naively write this function at the type level on promoted lists:

type family ZipWith (f :: a -> b -> c) (as :: [a]) (bs :: [b]) :: [c]
type instance ZipWith f (a ': as) (b ': bs) = (f a b) ': (ZipWith f as bs)
type instance ZipWith f as        bs        = '[]

Urk. We get the following error:

Conflicting family instance declarations:
  ZipWith k k k f ((':) k a as) ((':) k b bs)
  ZipWith k k k f as bs

(The repetition of the variable k is incorrect, and has been reported as GHC bug #7524. This is not the issue we are pursuing here, however.)

The problem is really that type instance declarations are essentially unordered. The order in which they appear in a file is irrelevant to GHC. Relatedly, a programmer can define instances of the same type family in multiple modules. With separate compilation, the lack of ordering and the overlap check are necessary for type soundness. This is quite different from term-level function definition equations. All equations defining the same function not only have to be in the same module, but they must be one right after another.

The particular example here has an easy solution. Because we are matching over a closed kind ([a] at the kind level), we could simply expand out the different cases we wish to match against. However, this solution is not possible when matching over an open kind, such as *. We’ll see a useful example of overlap involving * shortly.
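For this example, the expansion would look like the following sketch (not part of the compiled code, since we define ZipWith differently below):

type instance ZipWith f '[]       '[]       = '[]
type instance ZipWith f '[]       (b ': bs) = '[]
type instance ZipWith f (a ': as) '[]       = '[]
-- together with the cons/cons instance above, these cover
-- every case without overlapping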

The Solution

GHC HEAD now contains an implementation for ordered overlapping type family instances. The example above can be written thus:

> type family ZipWith (f :: a -> b -> c) (as :: [a]) (bs :: [b]) :: [c]
> type instance where
>   ZipWith f (a ': as) (b ': bs) = (f a b) ': (ZipWith f as bs)
>   ZipWith f as        bs        = '[]

More interestingly, we can now define this:

> type family Equals (a :: k) (b :: k) :: Bool
> type instance where
>   Equals a a = True
>   Equals a b = False

Ordered overlapping type family instances allow us to define a general, write-once use-everywhere Boolean equality at the type level. Yay!

This new form of type family instance also seems to close the biggest known gap between the expressivity of functional dependencies and type families: functional dependencies have long supported overlap (through OverlappingInstances or IncoherentInstances) that type families could not. Although functional dependencies’ overlap requires ordering based on specificity and type families’ overlap is based on an explicit ordering, it would seem that any code that took advantage of functional dependencies and overlap can now be translated to use overlapping type families.

Details

  • type instance where does not work with associated types. Class instances can be sprinkled across modules, and having this form of overlap appear across modules would not be type safe in the presence of separate compilation.
  • type instance where does not work with associated types, even when the overlap is intended to exist just within one instance. There is no great reason for this restriction, but it seemed unimportant. Yell if this matters to you.
  • Choosing which equation in a group to use is somewhat delicate. For example, consider the Equals type family. What if we want to simplify Equals a Int? Well, we can’t. That’s because a might (sometimes) be instantiated to Int, and if we simplified Equals a Int to False, we would have a type soundness issue. So, perhaps counterintuitively, we can’t simplify even Equals a b to False until a and b are known; see the sketch just after this list.
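Here is a sketch of that soundness issue (not part of the compiled code), using an SBool singleton like the one from earlier:

-- Suppose GHC simplified Equals a Int to False via the second equation.
-- Then this would typecheck:
bogus :: SBool (Equals a Int) -> SBool False
bogus x = x
-- But instantiating a to Int makes the argument an SBool (Equals Int Int),
-- that is, an SBool True, which bogus would then return as an SBool False.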

This GHC wiki page gives an outline of how to get GHC compiling on your machine so you can play with this feature in HEAD. I don’t imagine it will be in 7.6.2, but I believe it will be in 7.8.1, whenever that is released. Enjoy, and please post any feedback!

Acknowledgments

Many thanks to Simon Peyton Jones, Dimitrios Vytiniotis, and Stephanie Weirich for getting me started and helping me think through some of the finer points.

A formalization of GHC’s core language

There have been a handful of papers about System FC, the internal language used within GHC ([1] [2] [3] [4], etc.). Each of these papers uses a different characterization of FC, with variations among the definitions and judgments. Unsurprisingly, each of these formalisms differs also from the actual implementation of FC within GHC. At ICFP, Simon PJ asked me to look at GHC’s implementation of System FC, and write it up using formal notation. The result of this work (developed using Ott) now lives in the GHC repo, in docs/core-spec. There are comments in various places reminding GHC contributors to update the formalization as they update the implementation.

Do you have a stake in this work? Are you planning on reasoning about GHC’s core language? Check out the formalism here. Any feedback is welcome!