Some time ago, I wrote a post titled Meaningfully Typechecking a Language in Idris. I then followed it up with Meaningfully Typechecking a Language in Idris, Revisited. In these posts, I described a hypothetical way of ‘typechecking’ an expression data type Expr into a typesafe form SafeExpr. A SafeExpr can be evaluated without any code to handle type errors, since it’s by definition impossible to construct ill-typed expressions using it. In the first post, we implemented the method only for simple arithmetic expressions; in my latter post, we extended this to support if-expressions. Near the end of the post, I made the following comment:

When we add polymorphic tuples and lists, we start being able to construct an arbitrary number of types: [a]. [[a]], and so on. Then, we cease to be able t enumerate all possible pairs of types, and require a recursive solution. I think that this leads us back to [our method].

Recently, I thought about this some more, and decided that it’s rather simple to add tuples into our little language. The addition of tuples mean that our language will have an infinite number of possible types. We would have Int, (Int, Int), ((Int, Int), Int), and so on. This would make it impossible to manually test every possible case in our typechecker, but our approach of returning Dec (a = b) will work just fine.

Extending The Syntax

First, let’s extend our existing language with expressions for tuples. For simplicity, let’s use pairs (a,b) instead of general n-element tuples. This would make typechecking less cumbersome while still having the interesting effect of making the number of types in our language infinite. We can always represent the 3-element tuple (a,b,c) as ((a,b), c), after all. To be able to extract values from our pairs, we’ll add the fst and snd functions into our language, which accept a tuple and return its first or second element, respectively.

Our Expr data type, which allows ill-typed expressions, ends up as follows:

From TypesafeIntrV3.idr, lines 31 through 39
 31 32 33 34 35 36 37 38 39  data Expr = IntLit Int | BoolLit Bool | StringLit String | BinOp Op Expr Expr | IfElse Expr Expr Expr | Pair Expr Expr | Fst Expr | Snd Expr

I’ve highlighted the new lines. The additions consist of the Pair constructor, which represents the tuple expression (a, b), and the Fst and Snd constructors, which represent the fst e and snd e expressions, respectively. In a similar manner, we extend our SafeExpr GADT:

From TypesafeIntrV3.idr, lines 41 through 49
 41 42 43 44 45 46 47 48 49  data SafeExpr : ExprType -> Type where IntLiteral : Int -> SafeExpr IntType BoolLiteral : Bool -> SafeExpr BoolType StringLiteral : String -> SafeExpr StringType BinOperation : (repr a -> repr b -> repr c) -> SafeExpr a -> SafeExpr b -> SafeExpr c IfThenElse : SafeExpr BoolType -> SafeExpr t -> SafeExpr t -> SafeExpr t NewPair : SafeExpr t1 -> SafeExpr t2 -> SafeExpr (PairType t1 t2) First : SafeExpr (PairType t1 t2) -> SafeExpr t1 Second : SafeExpr (PairType t1 t2) -> SafeExpr t2

Finally, to provide the PairType constructor, we extend the ExprType and repr functions:

From TypesafeIntrV3.idr, lines 1 through 11
  1 2 3 4 5 6 7 8 9 10 11  data ExprType = IntType | BoolType | StringType | PairType ExprType ExprType repr : ExprType -> Type repr IntType = Int repr BoolType = Bool repr StringType = String repr (PairType t1 t2) = Pair (repr t1) (repr t2)

Implementing Equality

An important part of this change is the extension of the decEq function, which compares two types for equality. The kind folks over at #idris previously recommended the use of the Dec data type for this purpose. A value of type Dec P [note: It's possible that a proposition $P$ is not provable, and neither is $\lnot P$. It is therefore not possible to construct a value of type Dec P for any proposition P. Having a value of type Dec P, then, provides us nontrivial information. ] Our decEq function, given two types a and b, returns Dec (a = b). Thus, it will return either a proof that a = b (which we can then use to convince the Idris type system that two SafeExpr values are, in fact, of the same type), or a proof of a = b -> Void (which tells us that a and b are definitely not equal). If you’re not sure what the deal with (=) and Void is, check out this section of the previous article.

A lot of the work in implementing decEq went into constructing proofs of falsity. That is, we needed to explicitly list every case like decEq IntType BoolType, and create a proof that IntType cannot equal BoolType. However, here’s how we use decEq in the typechecking function:

From TypesafeIntrV2.idr, lines 76 through 78
 76 77 78   case decEq tt et of Yes p => pure (_ ** IfThenElse ce (replace p te) ee) No _ => Left "Incompatible branch types."

We always throw away the proof inequality! So, rather than spending the time constructing useless proofs like this, we can just switch decEq to return a Maybe (a = b). The Just case will tell us that the two types are equal (and, as before, provide a proof); the Nothing case will tell us that the two types are not equal, and provide no further information. Let’s see the implementation of decEq now:

From TypesafeIntrV3.idr, lines 13 through 23
 13 14 15 16 17 18 19 20 21 22 23  decEq : (a : ExprType) -> (b : ExprType) -> Maybe (a = b) decEq IntType IntType = Just Refl decEq BoolType BoolType = Just Refl decEq StringType StringType = Just Refl decEq (PairType lt1 lt2) (PairType rt1 rt2) = do subEq1 <- decEq lt1 rt1 subEq2 <- decEq lt2 rt2 let firstEqual = replace {P = \t1 => PairType lt1 lt2 = PairType t1 lt2} subEq1 Refl let secondEqual = replace {P = \t2 => PairType lt1 lt2 = PairType rt1 t2} subEq2 firstEqual pure secondEqual decEq _ _ = Nothing

Lines 14 through 16 are pretty simple; in this case, we can tell at a glance that the two types are equal, and Idris can infer an equality proof in the form of Refl. We return this proof by writing it in a Just. Line 23 is the catch-all case for any combination of types we didn’t handle. Any combination of types we don’t handle is invalid, and thus, this case returns Nothing.

What about lines 17 through 22? This is the case for handling the equality of two pair types, (lt1, lt2) and (rt1, rt2). The equality of the two types depends on the equality of their constituents. That is, if we know that lt1 = rt1 and lt2 = rt2, we know that the two pair types are also equal. If one of the two equalities doesn’t hold, the two pairs obviously aren’t equal, and thus, we should return Nothing. This should remind us of Maybe’s monadic nature: we can first compute decEq lt1 rt1, and then, if it succeeds, compute decEq lt2 rt2. If both succeed, we will have in hand the two proofs, lt1 = rt1 and lt2 = rt2. We achieve this effect using do-notation, storing the sub-proofs into subEq1 and subEq2.

What now? Once again, we have to use replace. Recall its type:

replace : {a:_} -> {x:_} -> {y:_} -> {P : a -> Type} -> x = y -> P x -> P y


Given some proposition in terms of a, and knowing that a = b, replace returns the original proposition, but now in terms of b. We know for sure that:

PairType lt1 lt2 = PairType lt1 lt2


We can start from there. Let’s handle one thing at a time, and try to replace the second lt1 with rt1. Then, we can replace the second lt2 with rt2, and we’ll have our equality!

Easier said than done, though. How do we tell Idris which lt1 we want to substitute? After all, of the following are possible:

PairType rt1 lt2 = PairType lt1 lt2 -- First lt1 replaced
PairType lt1 lt2 = PairType rt1 lt2 -- Second lt1 replaced
PairType rt1 lt2 = PairType rt1 lt2 -- Both replaced


The key is in the signature, specifically the expressions P x and P y. We can think of P as a function, and of replace as creating a value of P applied to another argument. Thus, the substitution will occur exactly where the argument of P is used. Then, to achieve each of the above substitution, we can write P as follows:

 1 2 3  t1 => PairType t1 lt2 = PairType lt1 lt2 t1 => PairType lt1 lt2 = PairType t1 lt2 t1 => PairType t1 lt2 = PairType t1 lt2 

The second function (highlighted) is the one we’ll need to use to attain the desired result. Since P is an implicit argument to replace, we can explicitly provide it with {P=...}, leading to the following line:

From TypesafeIntrV3.idr, line 20
 20   let firstEqual = replace {P = \t1 => PairType lt1 lt2 = PairType t1 lt2} subEq1 Refl

We now have a proof of the following proposition:

PairType lt1 lt2 = PairType rt1 lt2


We want to replace the second lt2 with rt2, which means that we write our P as follows:

t2 => PairType lt1 lt2 = PairType rt1 t2


Finally, we perform the second replacement, and return the result:

From TypesafeIntrV3.idr, lines 21 through 22
 21 22   let secondEqual = replace {P = \t2 => PairType lt1 lt2 = PairType rt1 t2} subEq2 firstEqual pure secondEqual

Great! We have finished implement decEq.

It’s time to make our typechecker work with tuples. First, we need to fix the IfElse case to accept Maybe (a=b) instead of Dec (a=b):

From TypesafeIntrV3.idr, lines 71 through 78
 71 72 73 74 75 76 77 78  typecheck (IfElse c t e) = do ce <- typecheck c >>= requireBool (tt ** te) <- typecheck t (et ** ee) <- typecheck e case decEq tt et of Just p => pure (_ ** IfThenElse ce (replace p te) ee) Nothing => Left "Incompatible branch types."

Note that the only change is from Dec to Maybe; we didn’t need to add new cases or even to know what sort of types are available in the language.

Next, we can write the cases for the new expressions in our language. We can start with Pair, which, given expressions of types a and b, creates an expression of type (a,b). As long as the arguments to Pair are well-typed, so is the Pair expression itself; thus, there are no errors to handle.

From TypesafeIntrV3.idr, lines 79 through 83
 79 80 81 82 83  typecheck (Pair l r) = do (lt ** le) <- typecheck l (rt ** re) <- typecheck r pure (_ ** NewPair le re)

The case for Fst is more complicated. If the argument to Fst is a tuple of type (a, b), then Fst constructs from it an expression of type a. Otherwise, the expression is ill-typed, and we return an error.

From TypesafeIntrV3.idr, lines 84 through 89
 84 85 86 87 88 89  typecheck (Fst p) = do (pt ** pe) <- typecheck p case pt of PairType _ _ => pure $(_ ** First pe) _ => Left "Applying fst to non-pair." The case for Snd is very similar: From TypesafeIntrV3.idr, lines 90 through 96  90 91 92 93 94 95  typecheck (Snd p) = do (pt ** pe) <- typecheck p case pt of PairType _ _ => pure$ (_ ** Second pe) _ => Left "Applying snd to non-pair." 

Evaluation Function and Conclusion

We conclude with our final eval and resultStr functions, which now look as follows.

From TypesafeIntrV3.idr, lines 97 through 111
  97 98 99 100 101 102 103 104 105 106 107 108 109 110 111  eval : SafeExpr t -> repr t eval (IntLiteral i) = i eval (BoolLiteral b) = b eval (StringLiteral s) = s eval (BinOperation f l r) = f (eval l) (eval r) eval (IfThenElse c t e) = if (eval c) then (eval t) else (eval e) eval (NewPair l r) = (eval l, eval r) eval (First p) = fst (eval p) eval (Second p) = snd (eval p) resultStr : {t : ExprType} -> repr t -> String resultStr {t=IntType} i = show i resultStr {t=BoolType} b = show b resultStr {t=StringType} s = show s resultStr {t=PairType t1 t2} (l,r) = "(" ++ resultStr l ++ ", " ++ resultStr r ++ ")"

As you can see, we require no error handling in eval; the expressions returned by typecheck are guaranteed to evaluate to valid Idris values. We have achieved our goal, with very little changes to typecheck other than the addition of new language constructs. In my opinion, this is a win!

As always, you can see the code on my Git server. Here’s the latest Idris file, if you want to check it out (and maybe verify that it compiles). I hope you found this interesting!