You always hear the words “turning complete” thrown around in various contexts. Powerpoint is Turing complete, CSS is Turing complete. Turing completeness effectively describes a property of a language of system to perform any computation that can be performed on a computer (or by a human with a piece of paper). However, there’s nothing special about Turing completeness (or Turing machines) that makes them the way to represent that property. In fact, two other representations of computation were devised before the introduction of Turing machines - lambda calculus (proposed by Alonso Church) and general recursive functions (proposed by Kurt Gödel). These ways to represent computation can be shown to be equivalent to Turning machines. In this post I will talk about what lambda calculus is, as well as a way to use it to represent natural numbers.

Side note: apparently, “natural numbers” can mean either positive integers, or just nonnegative integers. In this case, we will pick the latter definition.

Abstraction and Application

The book Structure and Interpretation of Computer Programs provides this list of 3 essential mechanisms present in any powerful language:

Abstraction refers to generalizing some operation. Usually, this is done by taking concrete values on which a computation is performed and replacing them with variables. For instance, looking at 3+3, 1+2, and 1+4, we can see that in each of these, two numbers are being added. Thus, we write can write a more abstract representation of this as x+y, where x and y are both numbers. In most languages, we represent such a generalization using a function. In C:

int add(int l, int r) {
    return l + r;
}

In Haskell:

add l r = l + r

Lambda calculus is all about abstraction into functions, and the application of these functions. In lambda calculus, abstraction looks like this: $$ \lambda \ x \ . \ t $$ This reads: a function that, when given a variable x, evaluates to t. Usually, t is an expression using x. For example, suppose that there exists a function double. If we wanted to create a function that multiplies a number by 4, we can write it as follows: $$ \lambda \ x \ . \ \text{double} \ (\text{double} \ x) $$ Here, we see function application: the function double, which takes a number and returns a number, is given the parameter x. The result of evaluating double x is then given as a parameter to double again, thereby multiplying x by 4. Let’s use this new fuction to multiply some number y by 4: $$ (\lambda \ x . \ \text{double} \ (\text{double} \ x)) \ y $$ Here, we see both abstraction and application: we abstract the process of multiplying a number by 4, and then apply that process to a variable y. Application works by simply replacing the variable before the dot in the abstraction with the value that’s being passed in. This is called binding. In our case, we’re binding the variable x to the value of y. This results in the following: $$ \text{double} \ (\text{double} \ y) $$ Since we have no idea what the body (the expression after the dot) of double looks like, we cannot simplify this any further.

Currying

This is nice and all, but what about functions of more than one variable? What if we want to add numbers, like in our original example of abstraction? If lambda abstraction has only one variable on the left of the dot, how can we represent such functions? Surely there has to be a way, since we claim this little language can represent any computation that can be done on a computer. The answer is fairly simple. We use abstraction twice: $$ \lambda \ x \ . \ \lambda \ y \ . \ x + y $$ The second abstraction is inside the first. It can be equivalently written: $$ \lambda \ x \ . \ (\lambda \ y \ . \ x + y) $$

I’ve done something here that you might’ve accepted without questioning, but that I can’t ignore without feeling guilty. I’ve assumed that our lambda calculus has numbers and addition. This isn’t needed for the language to be Turing complete. However, it makes for much nicer examples, so we’ll stick with it here. Let’s move on to an example of using our function to add two numbers: $$ (\lambda \ x \ . \ \lambda \ y \ . \ x + y) \ 1 \ 2 $$ Function application is left associative. This is equivalently written as: $$ ((\lambda \ x \ . \ \lambda \ y \ . \ x + y) \ 1) \ 2 $$ First, we bind x to 1: $$ (\lambda \ y \ . \ 1 + y) \ 2 $$ Then, we bind y in the resulting expression to 2: $$ 1 + 2 $$ Well, there it is. We can extend this logic to functions of 3, 4, and more variables.

Partial Function Application

Our way of defining functions of more than one variable leads to a curious effect. To highlight this effect, let’s imagine calling the add function with less arguments than it takes in a language such as C:

add(1);

Oh no! The compiler won’t like this; add takes two parameters, and we’ve only given it one. However, when we try to do the same thing in lambda calculus: $$ (\lambda \ x \ . \ \lambda \ y \ . \ x + y) 1 $$ As before, we can bind x to 1: $$ \lambda \ y \ . \ 1 + y $$ This is not an error! All we’ve done is produce a function that takes the remaining parameter, y, and adds it to the parameter we’ve already passed in. This is an example of a partially applied function. Effectively, this new function we’ve produce is waiting for the remaining parameters we haven’t yet given it. Once we do, it’ll behave just like it would’ve if we passed all the parameters in together.

First Class Functions

So far, we’ve only been applying our functions to numbers. But it doesn’t have to be this way! Functions can also serve as parameters to other functions. For example, we can imagine a function that takes another function f, which transforms a number somehow, and then applies it to a number two times. The first parameter of such a function would be f, and the other would be some number x. In the body, we will apply f to the result of applying f to x: $$ \lambda \ f \ . \ \lambda \ x \ . \ f (f \ x) $$

This might look kind of familiar! Let’s apply this function to our previously mentioned double function: $$ (\lambda \ f . \ \lambda \ x \ . \ f (f \ x)) \ double $$

Just like in our previous examples, we simply replace f with double during application: $$ \lambda \ x \ . double \ (double \ x) $$ We’ve now created our multiply-by-four function! We can create other functions the same way. For instance, if we had a function halve, we could create a function to divide a number by 4 by applying our “apply-twice” function to it.

Church Encoded Integers

We can represent numbers using just abstraction and application. This is called Church encoding. In church encoding, a number is a function that takes another function, and applies it that many times to a value. 0 would take a function and a value, and apply it no times, returning the value it was given. 1 would take a function f and a value v, and return the result of applying that function f one time: f v. We’ve already seen 2! It’s a function that applies a function to a value twice, f (f v). Let’s write these numbers in lambda calculus: 0 becomes \(\lambda \ f \ . \ \lambda \ x \ . \ x\), 1 becomes \(\lambda \ f \ . \ \lambda \ x \ . \ f \ x\), and 2 becomes the familiar \(\lambda \ f \ . \lambda \ x \ . \ f \ (f \ x)\).

Now, let’s try represent addition. Addition of two numbers a and b would be done by taking a function f and applying it the first number of times, and then applying it the second number more times. Since addition must take in numbers a and b, which are functions of two variables, and return a number, we will end up with something like this: $$ \lambda \ a \ . \ \lambda \ b \ . \ ?? $$ What goes in the body of the second lambda? We know that a number is a function which takes a function and a variable to apply the function to: $$ \lambda \ f \ . \ \lambda \ v \ . \ ?? $$ Since the addition of a and b must return a number, we then have the following: $$ \lambda \ a \ . \ \lambda \ b \ . \ \lambda \ f \ . \ \lambda \ v \ . \ ?? $$ So how do we apply f the first number of times, and then apply it the second number more times? Well, a f is a partially applied function that, when given a variable, will apply f to that variable the first number of times. Similarly, b f is a partially applied function that will apply f the second number of times. We feed v into a f, which applies f the first number of times and returns the result. We then feed that into b f. This results in: $$ \lambda \ a \ . \ \lambda \ b \ . \ \lambda \ f \ . \ \lambda \ v \ . \ b \ f \ (a \ f \ v) $$

Phew! Now let’s try multiplication. The product of two numbers would apply the function the first number of times, the second number of times. As we already know, a f will get us a function that fulfills the first part of this requirement. Then we can apply that function the second number of times, which would give us what we want: $$ \lambda \ a \ . \ \lambda \ b \ . \ \lambda \ f \ . \ \lambda \ v \ . \ b \ (a \ f) \ v $$

Neat!

Conclusion

Today, we learned about lambda calculus, a representation of computation in terms of pure functions rather than an infinite tape and a machine dutifully churning away. This little language lies at the heart of many bigger programming languages such as Haskell. I hope you enjoyed!