Some time ago, I wrote a post titled Meaningfully Typechecking a Language in Idris. I then followed it up with Meaningfully Typechecking a Language in Idris, Revisited. In these posts, I described a hypothetical way of ‘typechecking’ an expression data type Expr into a typesafe form SafeExpr. A SafeExpr can be evaluated without any code to handle type errors, since it’s by definition impossible to construct ill-typed expressions using it. In the first post, we implemented the method only for simple arithmetic expressions; in my latter post, we extended this to support if-expressions. Near the end of the post, I made the following comment:

When we add polymorphic tuples and lists, we start being able to construct an arbitrary number of types: [a]. [[a]], and so on. Then, we cease to be able t enumerate all possible pairs of types, and require a recursive solution. I think that this leads us back to [our method].

Recently, I thought about this some more, and decided that it’s rather simple to add tuples into our little language. The addition of tuples mean that our language will have an infinite number of possible types. We would have Int, (Int, Int), ((Int, Int), Int), and so on. This would make it impossible to manually test every possible case in our typechecker, but our approach of returning Dec (a = b) will work just fine.

Extending The Syntax

First, let’s extend our existing language with expressions for tuples. For simplicity, let’s use pairs (a,b) instead of general n-element tuples. This would make typechecking less cumbersome while still having the interesting effect of making the number of types in our language infinite. We can always represent the 3-element tuple (a,b,c) as ((a,b), c), after all. To be able to extract values from our pairs, we’ll add the fst and snd functions into our language, which accept a tuple and return its first or second element, respectively.

Our Expr data type, which allows ill-typed expressions, ends up as follows:

From TypesafeIntrV3.idr, lines 31 through 39
data Expr
  = IntLit Int
  | BoolLit Bool
  | StringLit String
  | BinOp Op Expr Expr
  | IfElse Expr Expr Expr
  | Pair Expr Expr
  | Fst Expr
  | Snd Expr

I’ve highlighted the new lines. The additions consist of the Pair constructor, which represents the tuple expression (a, b), and the Fst and Snd constructors, which represent the fst e and snd e expressions, respectively. In a similar manner, we extend our SafeExpr GADT:

From TypesafeIntrV3.idr, lines 41 through 49
data SafeExpr : ExprType -> Type where
  IntLiteral : Int -> SafeExpr IntType
  BoolLiteral : Bool -> SafeExpr BoolType
  StringLiteral : String -> SafeExpr StringType
  BinOperation : (repr a -> repr b -> repr c) -> SafeExpr a -> SafeExpr b -> SafeExpr c
  IfThenElse : SafeExpr BoolType -> SafeExpr t -> SafeExpr t -> SafeExpr t
  NewPair : SafeExpr t1 -> SafeExpr t2 -> SafeExpr (PairType t1 t2)
  First : SafeExpr (PairType t1 t2) -> SafeExpr t1
  Second : SafeExpr (PairType t1 t2) -> SafeExpr t2

Finally, to provide the PairType constructor, we extend the ExprType and repr functions:

From TypesafeIntrV3.idr, lines 1 through 11
data ExprType
  = IntType
  | BoolType
  | StringType
  | PairType ExprType ExprType

repr : ExprType -> Type
repr IntType = Int
repr BoolType = Bool
repr StringType = String
repr (PairType t1 t2) = Pair (repr t1) (repr t2)

Implementing Equality

An important part of this change is the extension of the decEq function, which compares two types for equality. The kind folks over at #idris previously recommended the use of the Dec data type for this purpose. A value of type Dec P [note: It's possible that a proposition PP is not provable, and neither is ¬P\lnot P. It is therefore not possible to construct a value of type Dec P for any proposition P. Having a value of type Dec P, then, provides us nontrivial information. ] Our decEq function, given two types a and b, returns Dec (a = b). Thus, it will return either a proof that a = b (which we can then use to convince the Idris type system that two SafeExpr values are, in fact, of the same type), or a proof of a = b -> Void (which tells us that a and b are definitely not equal). If you’re not sure what the deal with (=) and Void is, check out this section of the previous article.

A lot of the work in implementing decEq went into constructing proofs of falsity. That is, we needed to explicitly list every case like decEq IntType BoolType, and create a proof that IntType cannot equal BoolType. However, here’s how we use decEq in the typechecking function:

From TypesafeIntrV2.idr, lines 76 through 78
    case decEq tt et of
      Yes p => pure (_ ** IfThenElse ce (replace p te) ee)
      No _ => Left "Incompatible branch types."

We always throw away the proof inequality! So, rather than spending the time constructing useless proofs like this, we can just switch decEq to return a Maybe (a = b). The Just case will tell us that the two types are equal (and, as before, provide a proof); the Nothing case will tell us that the two types are not equal, and provide no further information. Let’s see the implementation of decEq now:

From TypesafeIntrV3.idr, lines 13 through 23
decEq : (a : ExprType) -> (b : ExprType) -> Maybe (a = b)
decEq IntType IntType = Just Refl
decEq BoolType BoolType = Just Refl
decEq StringType StringType = Just Refl
decEq (PairType lt1 lt2) (PairType rt1 rt2) = do
  subEq1 <- decEq lt1 rt1
  subEq2 <- decEq lt2 rt2
  let firstEqual = replace {P = \t1 => PairType lt1 lt2 = PairType t1 lt2} subEq1 Refl
  let secondEqual = replace {P = \t2 => PairType lt1 lt2 = PairType rt1 t2} subEq2 firstEqual
  pure secondEqual
decEq _ _ = Nothing

Lines 14 through 16 are pretty simple; in this case, we can tell at a glance that the two types are equal, and Idris can infer an equality proof in the form of Refl. We return this proof by writing it in a Just. Line 23 is the catch-all case for any combination of types we didn’t handle. Any combination of types we don’t handle is invalid, and thus, this case returns Nothing.

What about lines 17 through 22? This is the case for handling the equality of two pair types, (lt1, lt2) and (rt1, rt2). The equality of the two types depends on the equality of their constituents. That is, if we know that lt1 = rt1 and lt2 = rt2, we know that the two pair types are also equal. If one of the two equalities doesn’t hold, the two pairs obviously aren’t equal, and thus, we should return Nothing. This should remind us of Maybe's monadic nature: we can first compute decEq lt1 rt1, and then, if it succeeds, compute decEq lt2 rt2. If both succeed, we will have in hand the two proofs, lt1 = rt1 and lt2 = rt2. We achieve this effect using do-notation, storing the sub-proofs into subEq1 and subEq2.

What now? Once again, we have to use replace. Recall its type:

replace : {a:_} -> {x:_} -> {y:_} -> {P : a -> Type} -> x = y -> P x -> P y

Given some proposition in terms of a, and knowing that a = b, replace returns the original proposition, but now in terms of b. We know for sure that:

PairType lt1 lt2 = PairType lt1 lt2

We can start from there. Let’s handle one thing at a time, and try to replace the second lt1 with rt1. Then, we can replace the second lt2 with rt2, and we’ll have our equality!

Easier said than done, though. How do we tell Idris which lt1 we want to substitute? After all, of the following are possible:

PairType rt1 lt2 = PairType lt1 lt2 -- First lt1 replaced
PairType lt1 lt2 = PairType rt1 lt2 -- Second lt1 replaced
PairType rt1 lt2 = PairType rt1 lt2 -- Both replaced

The key is in the signature, specifically the expressions P x and P y. We can think of P as a function, and of replace as creating a value of P applied to another argument. Thus, the substitution will occur exactly where the argument of P is used. Then, to achieve each of the above substitution, we can write P as follows:

t1 => PairType t1 lt2 = PairType lt1 lt2
t1 => PairType lt1 lt2 = PairType t1 lt2
t1 => PairType t1 lt2 = PairType t1 lt2

The second function (highlighted) is the one we’ll need to use to attain the desired result. Since P is an implicit argument to replace, we can explicitly provide it with {P=...}, leading to the following line:

From TypesafeIntrV3.idr, line 20
  let firstEqual = replace {P = \t1 => PairType lt1 lt2 = PairType t1 lt2} subEq1 Refl

We now have a proof of the following proposition:

PairType lt1 lt2 = PairType rt1 lt2

We want to replace the second lt2 with rt2, which means that we write our P as follows:

t2 => PairType lt1 lt2 = PairType rt1 t2

Finally, we perform the second replacement, and return the result:

From TypesafeIntrV3.idr, lines 21 through 22
  let secondEqual = replace {P = \t2 => PairType lt1 lt2 = PairType rt1 t2} subEq2 firstEqual
  pure secondEqual

Great! We have finished implement decEq.

Adjusting The Typechecker

It’s time to make our typechecker work with tuples. First, we need to fix the IfElse case to accept Maybe (a=b) instead of Dec (a=b):

From TypesafeIntrV3.idr, lines 71 through 78
typecheck (IfElse c t e) =
    ce <- typecheck c >>= requireBool
    (tt ** te) <- typecheck t
    (et ** ee) <- typecheck e
    case decEq tt et of
      Just p => pure (_ ** IfThenElse ce (replace p te) ee)
      Nothing => Left "Incompatible branch types."

Note that the only change is from Dec to Maybe; we didn’t need to add new cases or even to know what sort of types are available in the language.

Next, we can write the cases for the new expressions in our language. We can start with Pair, which, given expressions of types a and b, creates an expression of type (a,b). As long as the arguments to Pair are well-typed, so is the Pair expression itself; thus, there are no errors to handle.

From TypesafeIntrV3.idr, lines 79 through 83
typecheck (Pair l r) =
    (lt ** le) <- typecheck l
    (rt ** re) <- typecheck r
    pure (_ ** NewPair le re)

The case for Fst is more complicated. If the argument to Fst is a tuple of type (a, b), then Fst constructs from it an expression of type a. Otherwise, the expression is ill-typed, and we return an error.

From TypesafeIntrV3.idr, lines 84 through 89
typecheck (Fst p) =
    (pt ** pe) <- typecheck p
    case pt of
      PairType _ _ => pure $ (_ ** First pe)
      _ => Left "Applying fst to non-pair."

The case for Snd is very similar:

From TypesafeIntrV3.idr, lines 90 through 96
typecheck (Snd p) =
    (pt ** pe) <- typecheck p
    case pt of
      PairType _ _ => pure $ (_ ** Second pe)
      _ => Left "Applying snd to non-pair."

Evaluation Function and Conclusion

We conclude with our final eval and resultStr functions, which now look as follows.

From TypesafeIntrV3.idr, lines 97 through 111
eval : SafeExpr t -> repr t
eval (IntLiteral i) = i
eval (BoolLiteral b) = b
eval (StringLiteral s) = s
eval (BinOperation f l r) = f (eval l) (eval r)
eval (IfThenElse c t e) = if (eval c) then (eval t) else (eval e)
eval (NewPair l r) = (eval l, eval r)
eval (First p) = fst (eval p)
eval (Second p) = snd (eval p)

resultStr : {t : ExprType} -> repr t -> String
resultStr {t=IntType} i = show i
resultStr {t=BoolType} b = show b
resultStr {t=StringType} s = show s
resultStr {t=PairType t1 t2} (l,r) = "(" ++ resultStr l ++ ", " ++ resultStr r ++ ")"

As you can see, we require no error handling in eval; the expressions returned by typecheck are guaranteed to evaluate to valid Idris values. We have achieved our goal, with very little changes to typecheck other than the addition of new language constructs. In my opinion, this is a win!

As always, you can see the code on my Git server. Here’s the latest Idris file, if you want to check it out (and maybe verify that it compiles). I hope you found this interesting!