Implementing and Verifying "Static Program Analysis" in Agda, Part 2: Combining Lattices

Table of Contents

In the previous post, I wrote about how lattices arise when tracking, comparing and combining static information about programs. I then showed two simple lattices: the natural numbers, and the (parameterized) “above-below” lattice, which modified an arbitrary set with “bottom” and “top” elements ( $\bot$ and $\top$ respectively). One instance of the “above-below” lattice was the sign lattice, which could be used to reason about the signs (positive, negative, or zero) of variables in a program.

At the end of that post, I introduced a source of complexity: the “full” lattices that we want to use for the program analysis aren’t signs or numbers, but maps of states and variables to lattice-based descriptions. The full lattice for sign analysis might something in the form:

\text{Info} \triangleq \text{ProgramStates} \to (\text{Variables} \to \text{Sign})

Thus, we have to compare and find least upper bounds (e.g.) of not just signs, but maps! Proving the various lattice laws for signs was not too challenging, but for for a two-level map like $\text{Info}$ above, we’d need to do a lot more work. We need tools to build up such complicated lattices.

The way to do this, it turns out, is by using simpler lattices as building blocks. To start with, let’s take a look at a very simple way of combining lattices into a new one: taking the Cartesian product.

The Cartesian Product Lattice

Suppose you have two lattices $L_1$ and $L_2$ . As I covered in the previous post, each lattice comes equipped with a “least upper bound” operator $(\sqcup)$ and a “greatest lower bound” operator $(\sqcap)$ . Since we now have two lattices, let’s use numerical suffixes to disambiguate between the operators of the first and second lattice: $(\sqcup_1)$ will be the LUB operator of the first lattice $L_1$ , and $(\sqcup_2)$ of the second lattice $L_2$ , and so on.

Then, let’s take the Cartesian product of the elements of $L_1$ and $L_2$ ; mathematically, we’ll write this as $L_1 \times L_2$ , and in Agda, we can just use the standard Data.Product module. Then, I’ll define the lattice as another parameterized module. Since both $L_1$ and $L_2$ are lattices, this parameterized module will require IsLattice instances for both types:

From Prod.agda, lines 1 through 7



1
2
3
4
5
6
7



open import Lattice

module Lattice.Prod {a b} {A : Set a} {B : Set b}
    (_≈₁_ : A → A → Set a) (_≈₂_ : B → B → Set b)
    (_⊔₁_ : A → A → A) (_⊔₂_ : B → B → B)
    (_⊓₁_ : A → A → A) (_⊓₂_ : B → B → B)
    (lA : IsLattice A _≈₁_ _⊔₁_ _⊓₁_) (lB : IsLattice B _≈₂_ _⊔₂_ _⊓₂_) where

Elements of $L_1 \times L_2$ are in the form $(l_1, l_2)$ , where $l_1 \in L_1$ and $l_2 \in L_2$ . Knowing that, let’s define what it means for two such elements to be equal. Recall that we opted for a custom equivalence relation instead of definitional equality to allow similar elements to be considered equal; we’ll have to define a similar relation for our new product lattice. That’s easy enough: we have an equality predicate _≈₁_ that checks if an element of $L_1$ is equal to another, and we have _≈₂_ that does the same for $L_2$ . It’s reasonable to say that pairs of elements are equal if their respective first and second elements are equal:

(l_1, l_2) \approx (j_1, j_2) \iff l_1 \approx_1 j_1 \land l_2 \approx_2 j_2

In Agda:

From Prod.agda, lines 39 through 40

39
40

_≈_ : A × B → A × B → Set (a ⊔ℓ b)
(a₁ , b₁) ≈ (a₂ , b₂) = (a₁ ≈₁ a₂) × (b₁ ≈₂ b₂)

Verifying that this relation has the properties of an equivalence relation boils down to the fact that _≈₁_ and _≈₂_ are themselves equivalence relations.

From Prod.agda, lines 42 through 48



42
43
44
45
46
47
48



≈-equiv : IsEquivalence (A × B) _≈_
≈-equiv = record
    { ≈-refl = λ {p} → (≈₁-refl , ≈₂-refl)
    ; ≈-sym = λ {p₁} {p₂} (a₁≈a₂ , b₁≈b₂) → (≈₁-sym a₁≈a₂ , ≈₂-sym b₁≈b₂)
    ; ≈-trans = λ {p₁} {p₂} {p₃} (a₁≈a₂ , b₁≈b₂) (a₂≈a₃ , b₂≈b₃) →
        ( ≈₁-trans a₁≈a₂ a₂≈a₃ , ≈₂-trans b₁≈b₂ b₂≈b₃ )
    }

Defining $(\sqcup)$ and $(\sqcap)$ by simply applying the corresponding operators from $L_1$ and $L_2$ seems quite natural as well.

(l_1, l_2) \sqcup (j_1, j_2) \triangleq (l_1 \sqcup_1 j_1, l_2 \sqcup_2 j_2) \\ (l_1, l_2) \sqcap (j_1, j_2) \triangleq (l_1 \sqcap_1 j_1, l_2 \sqcap_2 j_2)

As an example, consider the product lattice $\text{Sign}\times\text{Sign}$ , which is made up of pairs of signs that we talked about in the previous post. Two elements of this lattice are $(+, +)$ and $(+, -)$ . Here’s how the $(\sqcup)$ operation is evaluated on them:

(+, +) \sqcup (+, -) = (+ \sqcup + , + \sqcup -) = (+ , \top)

In Agda, the definition is written very similarly to its mathematical form:

From Prod.agda, lines 50 through 54

50
51
52
53
54

_⊔_ : A × B → A × B → A × B
(a₁ , b₁) ⊔ (a₂ , b₂) = (a₁ ⊔₁ a₂ , b₁ ⊔₂ b₂)

_⊓_ : A × B → A × B → A × B
(a₁ , b₁) ⊓ (a₂ , b₂) = (a₁ ⊓₁ a₂ , b₁ ⊓₂ b₂)

All that’s left is to prove the various (semi)lattice properties. Intuitively, we can see that since the “combined” operator _⊔_ just independently applies the element operators _⊔₁_ and _⊔₂_, as long as they are idempotent, commutative, and associative, so is the “combined” operator itself. Moreover, the proofs that _⊔_ and _⊓_ form semilattices are identical up to replacing $(\sqcup)$ with $(\sqcap)$ . Thus, in Agda, we can write the code once, parameterizing it by the binary operators involved (and proofs that these operators obey the semilattice laws).

From Prod.agda, lines 56 through 82



56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82



private module ProdIsSemilattice (f₁ : A → A → A) (f₂ : B → B → B) (sA : IsSemilattice A _≈₁_ f₁) (sB : IsSemilattice B _≈₂_ f₂) where
    isSemilattice : IsSemilattice (A × B) _≈_ (λ (a₁ , b₁) (a₂ , b₂) → (f₁ a₁ a₂ , f₂ b₁ b₂))
    isSemilattice = record
        { ≈-equiv = ≈-equiv
        ; ≈-⊔-cong = λ (a₁≈a₂ , b₁≈b₂) (a₃≈a₄ , b₃≈b₄) →
            ( IsSemilattice.≈-⊔-cong sA a₁≈a₂ a₃≈a₄
            , IsSemilattice.≈-⊔-cong sB b₁≈b₂ b₃≈b₄
            )
        ; ⊔-assoc = λ (a₁ , b₁) (a₂ , b₂) (a₃ , b₃) →
            ( IsSemilattice.⊔-assoc sA a₁ a₂ a₃
            , IsSemilattice.⊔-assoc sB b₁ b₂ b₃
            )
        ; ⊔-comm = λ (a₁ , b₁) (a₂ , b₂) →
            ( IsSemilattice.⊔-comm sA a₁ a₂
            , IsSemilattice.⊔-comm sB b₁ b₂
            )
        ; ⊔-idemp = λ (a , b) →
            ( IsSemilattice.⊔-idemp sA a
            , IsSemilattice.⊔-idemp sB b
            )
        }

isJoinSemilattice : IsSemilattice (A × B) _≈_ _⊔_
isJoinSemilattice = ProdIsSemilattice.isSemilattice _⊔₁_ _⊔₂_ joinSemilattice₁ joinSemilattice₂

isMeetSemilattice : IsSemilattice (A × B) _≈_ _⊓_
isMeetSemilattice = ProdIsSemilattice.isSemilattice _⊓₁_ _⊓₂_ meetSemilattice₁ meetSemilattice₂

Above, I used f₁ to stand for “either _⊔₁_ or _⊓₁_”, and similarly f₂ for “either _⊔₂_ or _⊓₂_”. Much like the semilattice properties, proving lattice properties boils down to applying the lattice properties of $L_1$ and $L_2$ to individual components.

From Prod.agda, lines 84 through 96



84
85
86
87
88
89
90
91
92
93
94
95
96



isLattice : IsLattice (A × B) _≈_ _⊔_ _⊓_
isLattice = record
    { joinSemilattice = isJoinSemilattice
    ; meetSemilattice = isMeetSemilattice
    ; absorb-⊔-⊓ = λ (a₁ , b₁) (a₂ , b₂) →
        ( IsLattice.absorb-⊔-⊓ lA a₁ a₂
        , IsLattice.absorb-⊔-⊓ lB b₁ b₂
        )
    ; absorb-⊓-⊔ = λ (a₁ , b₁) (a₂ , b₂) →
        ( IsLattice.absorb-⊓-⊔ lA a₁ a₂
        , IsLattice.absorb-⊓-⊔ lB b₁ b₂
        )
    }

This concludes the definition of the product lattice, which is made up of two other lattices. If we have a type of analysis that can be expressed as a pair of two signs, [note: Perhaps the signs are the smallest and largest possible values of a variable. ] for example, we won’t have to do all the work of proving the (semi)lattice properties of those pairs. In fact, we can build up even bigger data structures. By taking a product twice, like $L_1 \times (L_2 \times L_3)$ , we can construct a lattice of 3-tuples. Any of the lattices involved in that product can itself be a product; we can therefore create lattices out of arbitrary bundles of data, so long as the smallest pieces that make up the bundles are themselves lattices.

Products will come very handy a bit later in this series. For now though, our goal is to create another type of lattice: the map lattice. We will take the same approach we did with products: assuming the elements of the map are lattices, we’ll prove that the map itself is a lattice. Then, just like we could put products inside products when building up lattices, we’ll be able to put a map inside a map. This will allow us to represent the $\text{Info}$ lattice, which is a map of maps.

The Map Lattice

The Theory

When I say “map”, what I really means is something that associates keys with values, like dictionaries in Python. This data structure need not have a value for every possible key; a very precise author might call such a map a “partial map”. We might have a map whose value (in Python-ish notation) is { "x": +, "y": - }. Such a map states that the sign of the variable x is +, and the sign of variable y is -. Another possible map is { "y": +, "z": - }; this one states that the sign of y is +, and the sign of another variable z is -.

Let’s start thinking about what sorts of lattices our maps will be. The thing that motivated our introduction of lattices was comparing them by “specificity”, so let’s try figure out how to compare maps. For that, we can begin small, by looking at singleton maps. If we have {"x": +} and {"x": ⊤}, which one of them is smaller? Well, we have previously established that + is more specific (and thus less than) ⊤. Thus, it shouldn’t be too much of a stretch to say that for singleton maps of the same key, the one with the smaller value is smaller.

Now, what about a pair of singleton maps like {"x": +} and {"y": ⊤}? Among these two, each contains some information that the other does not. Although the value of y is larger than the value of x, it describes a different key, so it seems wrong to use that to call the y-singleton “larger”. Let’s call these maps incompatible, then. More generally, if we have two maps and each one has a key that the other doesn’t, we can’t compare them.

If only one map has a unique key, though, things are different. Take for instance {"x": +} and {"x": +, "y": +}. Are they really incomparable? The keys that the two maps do share can be compared (+ <= +, because they’re equal).

All of the above leads to the following conventional definition, which I find easier to further motivate using $(\sqcup)$ and $(\sqcap)$ (and do so below).

A map m1 is less than or equal to another map m2 (m1 <= m2) if for every key k that has a value in m1, the key also has a value in m2, and m1[k] <= m2[k].

That definitions matches our intuitions so far. The only key in {"x": +} is x; this key is also in {"x": ⊤} (check) and + < ⊤ (check). On the other hand, both {"x": +} and {"y": ⊤} have a key that the other doesn’t, so the definition above is not satisfied. Finally, for {"x": +} and {"x": +, "y": +}, the only key in the former is also present in the latter, and + <= +; the definition is satisfied.

Next, we need to define the $(\sqcup)$ and $(\sqcap)$ operators that match our definition of “less than or equal”. Let’s start with $(\sqcup)$ . For two maps $m_1$ and $m_2$ , the join of those two maps, $m_1 \sqcup m_2$ should be greater than or equal to both; in other words, both sub-maps should be less than or equal to the join.

Our newly-introduced condition for “less than or equal” requires that each key in the smaller map be present in the larger one; as a result, $m_1 \sqcup m_2$ should contain all the keys in $m_1$ and all the keys in $m_2$ . So, we could just take the union of the two maps: copy values from both into the result. Only, what happens if both $m_1$ and $m_2$ have a value mapped to a particular key $k$ ? The values in the two maps could be distinct, and they might even be incomparable. This is where the second part of the condition kicks in: the value in the combination of the maps needs to be bigger than the value in either sub-map. We already know how to get a value that’s bigger than two other values: we use a join on the values!

Thus, define $m_1 \sqcup m_2$ as a map that has all the keys from $m_1$ and $m_2$ , where the value at a particular key is given as follows:

(m_1 \sqcup m_2)[k] = \begin{cases} m_1[k] \sqcup m_2[k] & k \in m_1, k \in m_2 \\ m_1[k] & k \in m_1, k \notin m_2 \\ m_2[k] & k \notin m_1, k \in m_2 \end{cases}

If you’re familiar with set theory, this operation is like an extension of the union operator $(\cup)$ [note: There are, of course, other ways to extend the "union" operation to maps. Haskell, for instance, defines it in a "left-biased" way (preferring the elements from the left operand of the operation when duplicates are encountered).

However, with a "join" operation $(\sqcup)$ that's defined on the values stored in the map gives us an extra tool to work with. As a result, I would argue that our extension, given such an operator, is the most natural. ] to maps. In fact, this begins to motivate the choice to use $(\sqcup)$ to denote this operation. A further bit of motivation is this: we’ve already seen that the $(\sqcup)$ and $(\sqcap)$ operators correspond to “or” and “and”. The elements in the union of two sets are precisely those that are in one set or the other. Thus, using union here fits our notion of how the $(\sqcup)$ operator behaves.

Now, let’s take a look at the $(\sqcap)$ operator. For two maps $m_1$ and $m_2$ , the meet of those two maps, $m_1 \sqcap m_2$ should be less than or equal to both. Our definition above requires that each key of the smaller map is present in the larger map; for the combination of two maps to be smaller than both, we must ensure that it only has keys present in both maps. To combine the elements from the two maps, we can use the $(\sqcap)$ operator on values.

(m_1 \sqcap m_2)[k] = m_1[k] \sqcap m_2[k]

Turning once again to set theory, we can think of this operation like the extension of the intersection operator $(\cap)$ to maps. This can be motivated in the same way as the union operation above; the $(\sqcap)$ operator combines lattice elements in such away that the result represents both of them, and intersections of sets contain elements that are in both sets.

Now we have the the two binary operators and the comparison function in hand. There’s just one detail we’re missing: what it means for two maps to be equivalent. Here, once again we take our cue from set theory: two sets are said to be equal when each one is a subset of the other. Mathematically, we can write this as follows:

m_1 \approx m_2 \triangleq m_1 \subseteq m_2 \land m_1 \supseteq m_2

I might as well show you the Agda definition of this, since it’s a word-for-word transliteration:

From Map.agda, lines 530 through 531

530
531

_≈_ : Map → Map → Set (a ⊔ℓ b)
_≈_ m₁ m₂ = m₁ ⊆ m₂ × m₂ ⊆ m₁

Defining equivalence more abstractly this way helps avoid concerns about the precise implementation of our maps.

Okay, but we haven’t actually defined what it means for one map to be a subset of another. My definition is as follows: if $m_1 \subseteq m_2$ , that is, if $m_1$ is a subset of $m_2$ , then every key in $m_1$ is also present in $m_2$ , and they are mapped to the same value. My first stab at a mathematical definition of this is the following:

m_1 \subseteq m_2 \triangleq \forall k, v.\ (k, v) \in m_1 \Rightarrow (k, v) \in m_2

Only there’s a slight complication; remember that our values themselves come from a lattice, and that this lattice might use its own equivalence operator $(\approx)$ to group similar elements. One example where this is important is our now-familiar “map of maps” scenario: the values store in the “outer” map are themselves maps, and we don’t want the order of the keys or other menial details of the inner maps to influence whether the outer maps are equal. Thus, we settle for a more robust definition of $m_1 \subseteq m_2$ that allows $m_1$ to have different-but-equivalent values from those in $m_2$ .

m_1 \subseteq m_2 \triangleq \forall k, v.\ (k, v) \in m_1 \Rightarrow \exists v'.\ v \approx v' \land (k, v') \in m_2

In Agda, the core of my definition is once again very close:

From Map.agda, lines 98 through 99

98
99

    subset m₁ m₂ = ∀ (k : A) (v : B) → (k , v) ∈ m₁ →
                   Σ B (λ v' → v ≈₂ v' × ((k , v') ∈ m₂))

The Implementation

Now it’s time to show you how I implemented the Map lattice. I chose represent maps using a list of key-value pairs, along with a condition that the keys are unique (non-repeating). I chose this definition because it was simple to implement, and because it makes it possible to iterate over the keys of a map. That last property is useful if we use the maps to later represent sets (which I did). Moreover, lists of key-value pairs are easy to serialize and write to disk. This isn’t hugely important for my immediate static program analysis needs, but it might be nice in the future. The requirement that the keys are unique prevents the map from being a multi-map (which might have several values associated with a particular key).

My Map module is parameterized by the key and value types (A and B respectively), and additionally requires some additional properties to be satisfied by these types.

From Map.agda, lines 6 through 10



 6
 7
 8
 9
10



module Lattice.Map {a b : Level} {A : Set a} {B : Set b}
    {_≈₂_ : B → B → Set b}
    {_⊔₂_ : B → B → B} {_⊓₂_ : B → B → B}
    (≡-dec-A : Decidable (_≡_ {a} {A}))
    (lB : IsLattice B _≈₂_ _⊔₂_ _⊓₂_) where

For A, the key property is the decidability of equality: there should be a way to compare keys for equality. This is important for all sorts of map operations. For example, when inserting a new value into a map, we need to decide if the value is already present (so that we know to override it), but if we can’t check if two values are equal, we can’t see if it’s already there.

The values of the map (represented by B) we expected to be lattices, so we require them to provide the lattice operations $(\sqcup)$ and $(\sqcap)$ , as well as the equivalence relation $(\approx)$ and the proof of the lattice properties in isLattice. To distinguish the lattice operations on B from the ones we’ll be defining on the map itself – you might’ve noticed that there’s a bit of overleading going on in this post – I’ve suffixed them with the subscript 2. My convention is to use the subscript corresponding to the number of the type parameter. Here, A is “first” and B is “second”, so the operators on B get 2.

From there, I define the map as a pair; the first component is the list of key-value pairs, and the second is the proof that all the keys in the list occur only once.

From Map.agda, lines 480 through 481

480
481

Map : Set (a ⊔ℓ b)
Map = Σ (List (A × B)) (λ l → Unique (ImplKeys.keys l))

Now, to implement union and intersection; for the most part, the proofs deal just with the first component of the map – the key-value pairs. For union, the key operation is “insert-or-combine”. We can think of merging two maps as inserting all the keys from one map (arbitrary, the “left”) into the other. If a key is not in the “left” map, insertion won’t do anything to its prior value in the right map; similarly, if a key is not in the “right” map, then it should appear unchanged in the final result after insertion. Finally, if a key is inserted into the “right” map, but already has a value there, then the two values need to be combined using _⊔₂_. This leads to the following definition of insert on key-value pair lists:

From Map.agda, lines 114 through 118



114
115
116
117
118



    insert : A → B → List (A × B) → List (A × B)
    insert k v [] = (k , v) ∷ []
    insert k v (x@(k' , v') ∷ xs) with ≡-dec-A k k'
    ...                             | yes _ = (k' , f v v') ∷ xs
    ...                             | no _ = x ∷ insert k v xs

Above, f is just a stand-in for _⊓₂_ (making the definition a tiny bit more general). For each element in the “right” key-value list, we check if its key matches the one we’re inserting; if it does, we have to combine the values, and there’s no need to recurse into the rest of the list. If on the other hand the key doesn’t match, we move on to the next element of the list. If we run out of elements, we know that the key we’re inserting wasn’t in the “right” map, so we insert it as-is.

The union operation is just about inserting every pair from one map into another.

From Map.agda, lines 120 through 121

120
121

    union : List (A × B) → List (A × B) → List (A × B)
    union m₁ m₂ = foldr insert m₂ m₁

Here, I defined my own version of foldr which unpacks the pairs, for convenience:

(Click here to see the definition of my foldr)

From Map.agda, lines 110 through 112

110
111
112

        foldr : ∀ {c} {C : Set c} → (A → B → C → C) -> C -> List (A × B) -> C
        foldr f b [] = b
        foldr f b ((k , v) ∷ xs) = f k v (foldr f b xs)

For intersection, we do something similar; however, since only elements in both maps should be in the final output, if our “insertion” doesn’t find an existing key, it should just fall through; this can be achieved by defining a version of insert whose base case simply throws away the input. Of course, this function should also use _⊓₂_ instead of _⊔₂_; below, though, I again use a general function f to provide a more general definition. I called this version of the function update.

From Map.agda, lines 295 through 299



295
296
297
298
299



    update : A → B → List (A × B) → List (A × B)
    update k v [] = []
    update k v ((k' , v') ∷ xs) with ≡-dec-A k k'
    ...                            | yes _ = (k' , f v v') ∷ xs
    ...                            | no _ = (k' , v') ∷ update k v xs

Just changing insert to update is not enough. It’s true that calling update with all keys from m1 on m2 would forget all keys unique to m1, it would still leave behind the only-in-m2 keys. To get rid of these, I defined another function, restrict, that drops all keys in its second argument that aren’t present in its first argument.

From Map.agda, lines 304 through 308



304
305
306
307
308



    restrict : List (A × B) → List (A × B) → List (A × B)
    restrict l [] = []
    restrict l ((k' , v') ∷ xs) with ∈k-dec k' l
    ...                           | yes _ = (k' , v') ∷ restrict l xs
    ...                           | no _ = restrict l xs

Altogether, intesection is defined as follows, where updates just calls update for every key-value pair in its first argument.

From Map.agda, lines 310 through 311

310
311

    intersect : List (A × B) → List (A × B) → List (A × B)
    intersect l₁ l₂ = restrict l₁ (updates l₁ l₂)

The next hurdle is all the proofs about these implementations. I will leave the details of the proofs either as appendices or as links to other posts on this site.

The first key property is that the insertion, union, update, and intersection operations all preserve uniqueness of keys; the proofs for this are here. The set of properties are the lattice laws for union and intersection. The proofs of those proceed by cases; to prove that $(\sqcup)$ is commutative, we reason that if $(k , v) \in m_1 \sqcup m_2$ , then it must be either in $m_1$ , in $m_2$ , or in both; for each of these three possible cases, we can show that $(k , v)$ must be the same in $m_2 \sqcup m_1$ . Things get even more tedious for proofs of associativity, since there are 7 cases to consider; I describe the strategy I used for such proofs in my article about the “Expression” pattern in Agda.

Additional Properties of Lattices

The product and map lattices are the two pulling the most weight in my implementation of program analyses. However, there’s an additional property that they have: if the lattices they are made of have a finite height, then so do products and map lattices themselves. A lattice having a finite height means that we can only line up so many elements using the less-than operator <. For instance, the natural numbers are not a finite-height lattice; we can create the infinite chain:

0 < 1 < 2 < ...

On the other hand, our sign lattice is of finite height; the longest chains we can make have three elements and two < signs. Here’s one:

\bot < + < \top

As a result of this, pairs of signs also have a finite height; the longest chains we can make have five elements and four < signs. An example: [note: Notice that the elements in the example progress the same way as the ones in the single-sign chain. This is no accident; the longest chains in the pair lattice can be constructed from longest chains of its element lattices. The length of the product lattice chain, counted by the number of "less than" signs, is the sum of the lengths of the element chains. ]

(\bot, \bot) < (\bot, +) < (\bot, \top) < (+, \top) < (\top, \top)

The same is true for maps, under certain conditions.

The finite-height property is crucial to lattice-based static program analysis; we’ll talk about it in more detail in the next post of this series.

Appendix: Proof of Uniqueness of Keys

I will provide sketches of the proofs here, and omit the implementations of my lemmas. Click on the link in the code block headers to jump to their implementation on my Git server.

First, note that if we’re inserting a key that’s already in a list, then the keys of that list are unchanged.

From Map.agda, lines 123 through 124

123
124

    insert-keys-∈ : ∀ {k : A} {v : B} {l : List (A × B)} →
                    k ∈k l → keys l ≡ keys (insert k v l)

On the other hand, if we’re inserting a new key, it ends up at the end, and the rest of the keys are unchanged.

From Map.agda, lines 134 through 135

134
135

    insert-keys-∉ : ∀ {k : A} {v : B} {l : List (A × B)} →
                    ¬ (k ∈k l) → (keys l ++ (k ∷ [])) ≡ keys (insert k v l)

Then, for any given key-value pair, the key either is or isn’t in the list we’re inserting it into. If it is, then the list ends up unchanged, and remains unique if it was already unique. On the other hand, if it’s not in the list, then it ends up at the end; adding a new element to the end of a unique list produces another unique list. Thus, in either case, the final keys are unique.

From Map.agda, lines 143 through 148



143
144
145
146
147
148



    insert-preserves-Unique : ∀ {k : A} {v : B} {l : List (A × B)}
                              → Unique (keys l) → Unique (keys (insert k v l))
    insert-preserves-Unique {k} {v} {l} u
        with (∈k-dec k l)
    ...   | yes k∈kl rewrite insert-keys-∈ {v = v} k∈kl = u
    ...   | no k∉kl rewrite sym (insert-keys-∉ {v = v} k∉kl) = Unique-append k∉kl u

By induction, we can then prove that calling insert many times as we do in union preserves uniqueness too. Here, insert-preserves-Unique serves as the inductive step.

From Map.agda, lines 164 through 168



164
165
166
167
168



    union-preserves-Unique : ∀ (l₁ l₂ : List (A × B)) →
                             Unique (keys l₂) → Unique (keys (union l₁ l₂))
    union-preserves-Unique [] l₂ u₂ = u₂
    union-preserves-Unique ((k₁ , v₁) ∷ xs₁) l₂ u₂ =
        insert-preserves-Unique (union-preserves-Unique xs₁ l₂ u₂)

For update, things are simple; it doesn’t change the keys of the argument list at all, since it only modifies, and doesn’t add new pairs. This is captured by the update-keys property:

From Map.agda, lines 313 through 314

313
314

    update-keys : ∀ {k : A} {v : B} {l : List (A × B)} →
                  keys l ≡ keys (update k v l)

If the keys don’t change, they obviously remain unique.

From Map.agda, lines 328 through 330

328
329
330

    update-preserves-Unique : ∀ {k : A} {v : B} {l : List (A × B)} →
                              Unique (keys l) → Unique (keys (update k v l ))
    update-preserves-Unique {k} {v} {l} u rewrite update-keys {k} {v} {l} = u

For restrict, we note that it only ever removes keys; as a result, if a key was not in the input to restrict, then it won’t be in its output, either.

From Map.agda, lines 337 through 338

337
338

    restrict-preserves-k≢ : ∀ {k : A} {l₁ l₂ : List (A × B)} →
                            All (λ k' → ¬ k ≡ k') (keys l₂) → All (λ k' → ¬ k ≡ k') (keys (restrict l₁ l₂))

As a result, for each key of the list being restricted, we either drop it (which does not damage uniqueness) or we keep it; since we only remove keys, and since the keys were originally unique, the key we kept won’t conflict with any of the other final keys.

From Map.agda, lines 345 through 351



345
346
347
348
349
350
351



    restrict-preserves-Unique : ∀ {l₁ l₂ : List (A × B)} →
                                Unique (keys l₂) → Unique (keys (restrict l₁ l₂))
    restrict-preserves-Unique {l₁} {[]} _ = Utils.empty
    restrict-preserves-Unique {l₁} {(k , v) ∷ xs} (push k≢xs uxs)
        with ∈k-dec k l₁
    ...   | yes _ = push (restrict-preserves-k≢ k≢xs) (restrict-preserves-Unique uxs)
    ...   | no _ = restrict-preserves-Unique uxs

Since both update and restrict preserve uniqueness, then so does intersect:

From Map.agda, lines 353 through 355

353
354
355

    intersect-preserves-Unique : ∀ {l₁ l₂ : List (A × B)} →
                                 Unique (keys l₂) → Unique (keys (intersect l₁ l₂))
    intersect-preserves-Unique {l₁} u = restrict-preserves-Unique (updates-preserve-Unique {l₁} u)

Daniel's Blog