I’ve been working on a relatively large Agda project for a few months now, and I’d like to think that I’ve become quite proficient. Recently, I came up with a little trick to help simplify some of my proofs, and it seems like this trick might have broader applications.
In my head, I call this trick ‘Deeply Embedded Expressions’. Before I introduce it, let me explain the part of my work that motivated developing the trick.
Proofs about Map Operations
A part of my Agda project is the formalization of simple keyvalue maps.
I model keyvalue maps as lists of keyvalue pairs. On top of this, I implement
two operations: join
and meet
, which in my code are denoted using ⊔
and ⊓
.
When “joining” two maps, you create a new map that has the keys from both input ones.
If a key is only present in one of the input maps, then the new “joined” map
has the same value for that key as the original. On the other hand, if the key
is present in both maps, then its value in the new map is the result of “joining”
the original values. The “meet” operation is similar, except instead of taking
keys from either map, the result only has keys that were present in both maps,
“meeting” their values. In a way, “join” and “meet” are similar to set union
and intersection — but they also operate on the values in the map.
Given these operations, I need to prove certain properties of these operation. The most inconvenient to prove is probably associativity:


This property is, in turn, proven using two ‘subset’ relations on maps, defined in the usual way.




The reason this property is so inconvenient to prove is that there are a lot of cases to consider. That’s because your claim, in words, is something like:
Suppose a keyvalue pair
k , v
is present in(m₁ ⊔ m₂) ⊔ m₃
. Show thatk , v
is also inm₁ ⊔ (m₂ ⊔ m₃)
.
The only thing you can really do with k , v
is figure out how it got into
the threeway union map: did it come from m₁
, m₂
, or m₃
, or perhaps
several of them? The essence of the proof boils down to repeated uses
of the fact that for a key to be in the union, it must be in at least one
of the two maps. You end up with witnesses, repeated application of the same
lemmas, lots of let
expressions or where
clauses. It’s relatively tedious
and, what’s more frustrating, driven entirely by the structure of the
map operations. It seems like one shouldn’t have to mimic that structure
using boilerplate lemmas. So I started looking at other ways.
Case Analysis using GADTs
A “proof by cases” in a dependently typed language like Agda usually brings
to mind pattern matching. So, here’s an idea: what if for each expression
involving ⊔
and ⊓
, we had some kind of data type, and that data type
had exactly as many inhabitants as there are cases to analyze? A data type
corresponding to m₁ ⊔ m₂
might have three cases, and the one for
(m₁ ⊔ m₂) ⊔ m₃
might have seven. Each case would contain the information
necessary to perform the proof.
A data type whose “shape” depends on an expression in the way I described above is said to be indexed by that expression. In Agda, GADTs are used to create indexed types. My initial attempt was something like this:
data Provenance (k : A) : B → Map → Set (a ⊔ℓ b) where
single : ∀ {v : B} {m : Map} → (k , v) ∈ m → Provenance k v m
in₁ : ∀ {v : B} {m₁ m₂ : Expr} → Provenance k v e₁ → ¬ k ∈k m₂ → Provenance k v (e₁ ⊔ e₂)
in₂ : ∀ {v : B} {m₁ m₂ : Expr} → ¬ k ∈k m₁ → Provenance k v m₂ → Provenance k v (e₁ ⊔ e₂)
bothᵘ : ∀ {v₁ v₂ : B} {m₁ m₂ : Expr} → Provenance k v₁ m₁ → Provenance k v₂ m₂ → Provenance k (v₁ ⊔ v₂) (e₁ ⊔ e₂)
bothⁱ : ∀ {v₁ v₂ : B} {m₁ m₂ : Expr} → Provenance k v₁ m₁ → Provenance k v₂ m₂ → Provenance k (v₁ ⊓ v₂) (e₁ ⊓ e₂)
I was planning on a proof of associativity (in one direction) that looked
something like the following — pattern matching on cases from the new
Provenance
type.
⊔assoc₁ : ((m₁ ⊔ m₂) ⊔ m₃) ⊆ (m₁ ⊔ (m₂ ⊔ m₃))
⊔assoc₁ k v k,v∈m₁₂m₃
with getProvenance k,v∈m₁₂m₃
...  in₂ k∉km₁₂ (single v∈m₃) = ...
...  in₁ (in₂ k∉km₁ (single v∈m₂)) k∉km₃ = ...
...  bothᵘ (in₂ k∉km₁ (single {v₂} v₂∈m₂)) (single {v₃} v₃∈m₃) = ...
...  in₁ (in₁ (single v₁∈m₁) k∉km₂) k∉km₃ = ...
...  bothᵘ (in₁ (single {v₁} v₁∈m₁) k∉km₂) (single {v₃} v₃∈m₃) = ...
...  in₁ (bothᵘ (single {v₁} v₁∈m₁) (single {v₂} v₂∈m₂)) k∉ke₃ = ...
...  bothᵘ (bothᵘ (single {v₁} v₁∈m₁) (single {v₂} v₂∈m₂)) (single {v₃} v₃∈m₃) = ...
However, this doesn’t work. Agda has trouble figuring out which cases of
the Provenance
GADT are allowed, in which aren’t. Is m₁ ⊔ m
a single map,
fit for the single
case, or should it be broken up into more cases like
in₁
and in₂
? In general, is some expression of type Map
the “bottom”
of our recursion, or should it be analyzed further?
The above hints at what’s wrong. The mistake here is requiring Agda to infer the shape of our “join” and “meet” expressions from arbitrary terms. The set of expressions that we want to reason about is much more restricted – each expression will always be of three components: “meet”, “join”, and basecase maps being combined using these operations.
Defining an Expression Data Type
If you’re like me, and have spent years of your life around programming language theory and domain specific languages (DSLs), the last sentence of the previous section may be ringing a bell. In fact, it’s eerily similar to how we describe recursive grammars:
An expression of interest is either,
 A map
 The “join” of two expressions
 The “meet” of two expressions
Mathematically, we might write this as follows:
$\begin{array}{rcll} e & ::= & m & \text{(maps)} \\ &  & e \sqcup e & \text{(join)} \\ &  & e \sqcap e & \text{(meet)} \end{array}$And in Agda,


In the code, I used the set union and intersection operators
to avoid overloading the ⊔
and ⊓
more than they already are.
We have just defined a very small expression language. In computer science, a language is called deeply embedded if a data type (or class hierarchy, or other ’explicit’ representation) is defined for its syntax in the host language (Agda, in our case). This is in contrast to a shallow embedding, in which expressions in the (new) language are just expressions in the host language.
In this sense, our Expr
is deeply embedded — we defined new container for it,
and _∪_
is a distinct entity from _⊔_
. Our first attempt was a shallow
embedding. That fell through because the Agda language is much broader than
our expression language, which makes case analysis very difficult.
An obvious thing to do with an expression is to evaluate it. This will be
important for our proofs, because it will establish a connection between
expressions (created via Expr
) and actual Agda objects that we need to
reason about at the end of the day. The notation $\llbracket e \rrbracket$
is commonly used in PL circles for evaluation (it comes from
Denotational Semantics).
Thus, my Agda evaluation function is written as follows:


On top of this, here is my actual implementation of the Provenance
data type.
This time, it’s indexed by expressions in Expr
, which makes it much easier to
pattern match on instances:


Note that we have to use the evaluation function to be able to use
operators such as ∈
. That’s because these are still defined on maps,
and not expressions.
With this, I was able to write my proof in the way that I had hoped. It has the exact form of my previous sketchofproof.
(click here to see the full example, including each case’s implementation)


The General Trick
So far, I’ve presented a problem I faced in my Agda proof and a solution for that problem. However, it may not be clear how useful the trick is beyond this narrow case that I’ve encountered. The way I see it, the “deeply embedded expression” trick is applicable whenever you have data that is constructed from some fixed set of cases, and when proofs about that data need to follow the structure of these cases. Thus, examples include:
 Proofs about the origin of keys in a map (this one): the “data” is the keyvalue map that is being analyzed. The enumeration of cases for this map is driven by the structure of the “join” and “meet” operations used to build the map.
 Automatic derivation of function properties: suppose you’re interested in working with continuous functions. You also know that the addition, subtraction, and multiplication of two functions preserves continuity. Of course, the constant function $x \mapsto c$ and the identity function $x \mapsto x$ are continuous too. You may define an expression data type that has cases for these operations. Then, your evaluation function could transform the expression into a plain function, and a proof on the structure of the expression can be used to verify the resulting function’s continuity.
 Proof search for algebraic expressions: suppose that you wanted to automatically find solutions for certain algebraic (in)equalities. Instead of using some sort of reflection mechanism to inspect terms and determine how constraints should be solved, you might represent the set of operations in you equation system as cases in a data type. You can then use regular Agda code to manipulate terms; an evaluation function can then be used to recover the equations in Agda, together with witnesses justifying the solution.
There are some pretty clear commonalities about examples above, which are the ingredients to this trick:

The expression: you create a new expression data type that encodes all
the operations (and bases cases) on your data. In my example, this is
the
Expr
data type. 
The evaluation function: you provide a way to lower the expression
you’ve defined back into a regular Agda term. This connects your (abstract)
operations to their interpretation in Agda. In my example, this is the
⟦_⟧
function. 
The proofs: you write proofs that consider only the fixed set of cases
encoded by the data type (
Expr
), but state properties about the evaluated expression. In my example, this isProvenance
and theExprProvenance
function. Specifically, theProvenance
data type connects expressions and the terms they evaluate to, because it is indexed by expressions, but contains data in the formk ∈k ⟦ e₂ ⟧
.
Conclusion
I’ll be the first to admit that this trick is quite situational, and may not be as farreaching as the “Is Something” pattern I wrote about before, which seems to occur far more in the wild. However, there have now been two times when I personally reached for this trick, which seems to suggest that it may be useful to someone else.
I hope you’ve found this useful. Happy (dependently typed) programming!