Some time ago, I wrote a post titled Meaningfully Typechecking a Language in Idris. The gist of the post was as follows:
-
Programming Language Fundamentals students were surprised that, despite
having run their expression through (object language) typechecking, they still had to
have a
Maybe
type in their evaluation functions. This was due to the fact that the (meta language) type system was not certain that (object language) typechecking worked. - A potential solution was to write separate expression types such
as
ArithExpr
andBoolExpr
, which are known to produce booleans or integers. However, this required the re-implementation of most of the logic forIfElse
, for which the branches could have integers, booleans, or strings. - An alternative solution was to use dependent types, and index
the
Expr
type with the type it evaluates to. We defined a data typedata ExprType = IntType | StringType | BoolType
, and then were able to write types likeSafeExpr IntType
that we knew would evaluate to an integer, orSafeExpr BoolType
, which we also knew would evaluate to a boolean. We then made ourtypecheck
function return a pair of(type, SafeExpr of that type)
.
Unfortunately, I think that post is rather incomplete. I noted
at the end of it that I was not certain on how to implement
if-expressions, which were my primary motivation for not just
sticking with ArithExpr
and BoolExpr
. It didn’t seem too severe
then, but now I just feel like a charlatan. Today, I decided to try
again, and managed to figure it out with the excellent help from
people in the #idris
channel on Freenode. It required a more
advanced use of dependent types: in particular, I ended up using
Idris’ theorem proving facilities to get my code to pass typechecking.
In this post, I will continue from where we left off in the previous
post, adding support for if-expressions.
Let’s start with the new Expr
and SafeExpr
types. Here they are:
|
|
For Expr
, the IfElse
constructor is very straightforward. It takes
three expressions: the condition, the ’then’ branch, and the ’else’ branch.
With SafeExpr
and IfThenElse
, things are more rigid. The condition
of the expression has to be of a boolean type, so we make the first argument
SafeExpr BoolType
. Also, the two branches of the if-expression have to
be of the same type. We encode this by making both of the input expressions
be of type SafeExpr t
. Since the result of the if-expression will be
the output of one of the branches, the whole if-expression is also
of type SafeExpr t
.
What Stumped Me: Equality
Typechecking if-expressions is where things get interesting. First,
we want to require that the condition of the expression evaluates
to a boolean. For this, we can write a function requireBool
,
that takes a dependent pair produced by typecheck
. This
function does one of two things:
- If the dependent pair contains a
BoolType
, and therefore also an expression of typeSafeExpr BoolType
,requireBool
succeeds, and returns the expression. - If the dependent pair contains any type other than
BoolType
,requireBool
fails with an error message. Since we’re usingEither
for error handling, this amounts to using theLeft
constructor.
Such a function is quite easy to write:
|
|
We can then write all of the recursive calls to typecheck
as follows:
|
|
Alright, so we have the types of the t
and e
branches. All we have to
do now is use (==)
. We could implement (==)
as follows:
implementation Eq ExprType where
IntType == IntType = True
BoolType == BoolType = True
StringType == StringType = True
_ == _ = False
Now we’re golden, right? We can just write the following:
|
|
No, this is not quire right. Idris complains:
Type mismatch between et and tt
Huh? But we just saw that et == tt
! What’s the problem?
The problem is, in fact, that (==)
is meaningless as far
as the Idris typechecker is concerned. We could have just
as well written,
implementation Eq ExprType where
_ == _ = True
This would tell us that IntType == BoolType
. But of course,
SafeExpr IntType
is not the same as SafeExpr BoolType
; I
would be very worried if the typechecker allowed me to assert
otherwise. There is, however, a kind of equality that we can
use to convince the Idris typechecker that two types are the
same. This equality, too, is a type.
Curry-Howard Correspondence
Spend enough time learning about Programming Language Theory, and you will hear the term Curry Howard Correspondence. If you’re the paper kind of person, I suggest reading Philip Wadler’s Propositions as Types paper. Alternatively, you can take a look at Logical Foundations’ Proof Objects chapter. I will give a very brief explanation here, too, for the sake of completeness. The general gist is as follows: propositions (the logical kind) correspond to program types, and proofs of the propositions correspond to values of the types.
To get settled into this idea, let’s look at a few ‘well-known’ examples:
-
(A,B)
, the tuple of two typesA
andB
is equivalent to the proposition , which means and . Intuitively, to provide a proof of , we have to provide the proofs of and . -
Either A B
, which contains one ofA
orB
, is equivalent to the proposition , which means or . Intuitively, to provide a proof that either or is true, we need to provide one of them. -
A -> B
, the type of a function fromA
toB
, is equivalent to the proposition , which reads implies . We can think of a functionA -> B
as creating a proof ofB
given a proof ofA
.
Now, consider Idris’ unit type ()
:
data () = ()
This type takes no arguments, and there’s only one way to construct
it. We can create a value of type ()
at any time, by just writing ()
.
This type is equivalent to : only one proof of it exists,
and it requires no premises. It just is.
Consider also the type Void
, which too is present in Idris:
-- Note: this is probably not valid code.
data Void = -- Nothing
The type Void
has no constructors: it’s impossible
to create a value of this type, and therefore, it’s
impossible to create a proof of Void
. Thus, as you may have guessed, Void
is equivalent to .
Finally, we get to a more complicated example:
data (=) : a -> b -> Type where
Refl : x = x
This defines a = b
as a type, equivalent to the proposition
that a
is equal to b
. The only way to construct such a type
is to give it a single value x
, creating the proof that x = x
.
This makes sense: equality is reflexive.
This definition isn’t some loosey-goosey boolean-based equality! If we can construct a value of
type a = b
, we can prove to Idris’ typechecker that a
and b
are equivalent. In
fact, Idris’ standard library gives us the following function:
replace : {a:_} -> {x:_} -> {y:_} -> {P : a -> Type} -> x = y -> P x -> P y
This reads, given a type a
, and values x
and y
of type a
, if we know
that x = y
, then we can rewrite any proposition in terms of x
into
another, also valid proposition in terms of y
. Let’s make this concrete.
Suppose a
is Int
, and P
(the type of which is now Int -> Type
),
is Even
, a proposition that claims that its argument is even.
[note:
I'm only writing type signatures for replace'
to avoid overloading. There's no need to define a new function;
replace'
is just a specialization of replace
,
so we can use the former anywhere we can use the latter.
]
replace' : {x : Int} -> {y : Int} -> x = y -> Even x -> Even y
That is, if we know that x
is equal to y
, and we know that x
is even,
it follows that y
is even too. After all, they’re one and the same!
We can take this further. Recall:
|
|
We can therefore write:
replace'' : {x : ExprType} -> {y : ExprType} -> x = y -> SafeExpr x -> SafeExpr y
This is exactly what we want! Given a proof that one ExprType
, x
, is equal to
another ExprType
, y
, we can safely convert SafeExpr x
to SafeExpr y
.
We will use this to convince the Idris typechecker to accept our program.
First Attempt: Eq
implies Equality
It’s pretty trivial to see that we did define (==)
correctly (IntType
is equal
to IntType
, StringType
is equal to StringType
, and so on). Thus,
if we know that x == y
is True
, it should follow that x = y
. We can thus
define the following proposition:
eqCorrect : {a : ExprType} -> {b : ExprType} -> (a == b = True) -> a = b
We will see shortly why this is not the best solution, and thus, I won’t bother creating a proof / implementation for this proposition / function. It reads:
If we have a proof that
(==)
returned true for someExprType
sa
andb
, it must be thata
is the same asb
.
We can then define a function to cast
a SafeExpr a
to SafeExpr b
, given that (==)
returned True
for some a
and b
:
safeCast : {a : ExprType} -> {b : ExprType} -> (a == b = True) -> SafeExpr a -> SafeExpr b
safeCast h e = replace (eqCorrect h) e
Awesome! All that’s left now is to call safeCast
from our typecheck
function:
|
|
No, this doesn’t work after all. What do we put for ?uhOh
? We need to have
a value of type tt == et = True
, but we don’t have one. Idris’ own if-then-else
expressions do not provide us with such proofs about their conditions. The awesome
people at #idris
pointed out that the with
clause can provide such a proof.
We could therefore write:
createIfThenElse ce (tt ** et) (et ** ee) with (et == tt) proof p
| True = pure (tt ** IfThenElse ce te (safeCast p ee))
| False = Left "Incompatible branch types."
Here, the with
clause effectively adds another argument equal to (et == tt)
to createIfThenElse
,
and tries to pattern match on its value. When we combine this with the proof
keyword,
Idris will give us a handle to a proof, named p
, that asserts the new argument
evaluates to the value in the pattern match. In our case, this is exactly
the proof we need to give to safeCast
.
However, this is ugly. Idris’ with
clause only works at the top level of a function,
so we have to define a function just to use it. It also shows that we’re losing
information when we call (==)
, and we have to reconstruct or recapture it using
some other means.
Second Attempt: Decidable Propositions
More awesome folks over at #idris
pointed out that the whole deal with (==)
is inelegant; they suggested I use decidable propositions, using the Dec
type.
The type is defined as follows:
data Dec : Type -> Type where
Yes : (prf : prop) -> Dec prop
No : (contra : prop -> Void) -> Dec prop
There are two ways to construct a value of type Dec prop
:
- We use the
Yes
constructor, which means that the propositionprop
is true. To use this constructor, we have to give it a proof ofprop
, calledprf
in the constructor. - We use the
No
constructor, which means that the propositionprop
is false. We need a proof of typeprop -> Void
to represent this: if we have a proof ofprop
, we arrive at a contradiction.
This combines the nice True
and False
of Bool
, with the
‘real’ proofs of the truthfulness or falsity. At the moment
that we would have been creating a boolean, we also create
a proof of that boolean’s value. Thus, we don’t lose information.
Here’s how we can go about this:
|
|
We pattern match on the input expression types. If the types are the same, we return
Yes
, and couple it with Refl
(since we’ve pattern matched on the types
in the left-hand side of the function definition, the typechecker has enough
information to create that Refl
). On the other hand, if the expression types
do not match, we have to provide a proof that their equality would be absurd.
For this we use helper functions / theorems like intBoolImpossible
:
|
|
I’m not sure if there’s a better way of doing this than using impossible
.
This does the job, though: Idris understands that there’s no way we can get
an input of type IntType = BoolType
, and allows us to skip writing a right-hand side.
We can finally use this new decEq
function in our type checker:
|
|
Idris is happy with this! We should also add IfThenElse
to our eval
function.
This part is very easy:
|
|
Since the c
part of the IfThenElse
is indexed with BoolType
, we know
that evaluating it will give us a boolean. Thus, we can use that
directly in the Idris if-then-else expression. Let’s try this with a few
expressions:
BinOp Add (IfElse (BoolLit True) (IntLit 6) (IntLit 7)) (BinOp Multiply (IntLit 160) (IntLit 2))
This evaluates 326
, as it should. What if we make the condition non-boolean?
BinOp Add (IfElse (IntLit 1) (IntLit 6) (IntLit 7)) (BinOp Multiply (IntLit 160) (IntLit 2))
Our typechecker catches this, and we end up with the following output:
Type error: Not a boolean.
Alright, let’s make one of the branches of the if-expression be a boolean, while the other remains an integer.
BinOp Add (IfElse (BoolLit True) (BoolLit True) (IntLit 7)) (BinOp Multiply (IntLit 160) (IntLit 2))
Our typechecker catches this, too:
Type error: Incompatible branch types.
Conclusion
I think this is a good approach. Should we want to add more types to our language, such as tuples,
lists, and so on, we will be able to extend our decEq
approach to construct more complex equality
proofs, and keep the typecheck
method the same. Had we not used this approach,
and instead decided to pattern match on types inside of typecheck
, we would’ve quickly
found that this only works for languages with finitely many types. When we add polymorphic tuples
and lists, we start being able to construct an arbitrary number of types: [a]
. [[a]]
, and
so on. Then, we cease to be able to enumerate all possible pairs of types, and require a recursive
solution. I think that this leads us back to decEq
.
I also hope that I’ve now redeemed myself as far as logical arguments go. We used dependent types
and made our typechecking function save us from error-checking during evaluation. We did this
without having to manually create different types of expressions like ArithExpr
and BoolExpr
,
and without having to duplicate any code.
That’s all I have for today, thank you for reading! As always, you can check out the full source code for the typechecker and interpreter on my Git server.