We don’t want a compiler that can only generate code for a single platform. Our language should work on macOS, Windows, and Linux, on x86_64, ARM, and maybe some other architectures. We also don’t want to manually implement the compiler for each platform, dealing with the specifics of each architecture and operating system.

This is where LLVM comes in. LLVM (which stands for Low Level Virtual Machine), is a project which presents us with a kind of generic assembly language, an Intermediate Representation (IR). It also provides tooling to compile the IR into platform-specific instructions, as well as to apply a host of various optimizations. We can thus translate our G-machine instructions to LLVM, and then use LLVM to generate machine code, which gets us to our ultimate goal of compiling our language.

We start with adding LLVM to our CMake project.

From CMakeLists.txt, line 7
7
find_package(LLVM REQUIRED CONFIG)

LLVM is a huge project, and has many components. We don’t need most of them. We do need the core libraries, the x86 assembly generator, and x86 assembly parser. I’m not sure why we need the last one, but I ran into linking errors without them. We find the required link targets for these components using this CMake command:

From CMakeLists.txt, lines 19 through 20
19
20
# Find all the relevant LLVM components
llvm_map_components_to_libnames(LLVM_LIBS core x86asmparser x86codegen)

Finally, we add the new include directories, link targets, and definitions to our compiler executable:

From CMakeLists.txt, lines 39 through 41
39
40
41
target_include_directories(compiler PUBLIC ${CMAKE_CURRENT_BINARY_DIR})
target_include_directories(compiler PUBLIC ${LLVM_INCLUDE_DIRS})
target_compile_definitions(compiler PUBLIC ${LLVM_DEFINITIONS})

Great, we have the infrastructure updated to work with LLVM. It’s now time to start using the LLVM API to compile our G-machine instructions into assembly. We start with LLVMContext. The LLVM documentation states:

This is an important class for using LLVM in a threaded context. It (opaquely) owns and manages the core “global” data of LLVM’s core infrastructure, including the type and constant uniquing tables.

We will have exactly one instance of such a class in our program.

Additionally, we want an IRBuilder, which will help us generate IR instructions, placing them into basic blocks (more on that in a bit). Also, we want a Module object, which represents some collection of code and declarations (perhaps like a C++ source file). Let’s keep these things in our own llvm_context class. Here’s what that looks like:

From llvm_context.hpp, entire file
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
#pragma once
#include <llvm/IR/DerivedTypes.h>
#include <llvm/IR/Function.h>
#include <llvm/IR/LLVMContext.h>
#include <llvm/IR/IRBuilder.h>
#include <llvm/IR/Module.h>
#include <map>

struct llvm_context {
    struct custom_function {
        llvm::Function* function;
        int32_t arity;
    };

    using custom_function_ptr = std::unique_ptr<custom_function>;

    llvm::LLVMContext ctx;
    llvm::IRBuilder<> builder;
    llvm::Module module;

    std::map<std::string, custom_function_ptr> custom_functions;
    std::map<std::string, llvm::Function*> functions;
    std::map<std::string, llvm::StructType*> struct_types;

    llvm::StructType* stack_type;
    llvm::PointerType* stack_ptr_type;
    llvm::PointerType* node_ptr_type;
    llvm::IntegerType* tag_type;
    llvm::FunctionType* function_type;

    llvm_context()
        : builder(ctx), module("bloglang", ctx) {
        create_types();
        create_functions();
    }

    void create_types();
    void create_functions();

    llvm::ConstantInt* create_i8(int8_t);
    llvm::ConstantInt* create_i32(int32_t);
    llvm::ConstantInt* create_size(size_t);

    llvm::Value* create_pop(llvm::Function*);
    llvm::Value* create_peek(llvm::Function*, llvm::Value*);
    void create_push(llvm::Function*, llvm::Value*);
    void create_popn(llvm::Function*, llvm::Value*);
    void create_update(llvm::Function*, llvm::Value*);
    void create_pack(llvm::Function*, llvm::Value*, llvm::Value*);
    void create_split(llvm::Function*, llvm::Value*);
    void create_slide(llvm::Function*, llvm::Value*);
    void create_alloc(llvm::Function*, llvm::Value*);

    llvm::Value* create_eval(llvm::Value*);

    llvm::Value* unwrap_num(llvm::Value*);
    llvm::Value* create_num(llvm::Value*);

    llvm::Value* unwrap_data_tag(llvm::Value*);

    llvm::Value* create_global(llvm::Value*, llvm::Value*);

    llvm::Value* create_app(llvm::Value*, llvm::Value*);

    llvm::Function* create_custom_function(std::string name, int32_t arity);
};

We include the LLVM context, builder, and module as members of the context struct. Since the builder and the module need the context, we initialize them in the constructor, where they can safely reference it.

Besides these fields, we added a few others, namely the functions and struct_types maps, and the various llvm::Type subclasses such as stack_type. We did this because we want to be able to call our runtime functions (and use our runtime structs) from LLVM. To generate a function call from LLVM, we need to have access to an llvm::Function object. We thus want to have an llvm::Function object for each runtime function we want to call. We could declare a member variable in our llvm_context for each runtime function, but it’s easier to leave this to be an implementation detail, and only have a dynamically created map between runtime function names and their corresponding llvm::Function objects.

We populate the maps and other type-related variables in the two methods, create_functions() and create_types(). To create an llvm::Function, we must provide an llvm::FunctionType, an llvm::LinkageType, the name of the function, and the module in which the function is declared. Since we only have one module (the one we initialized in the constructor) that’s the module we pass in. The name of the function is the same as its name in the runtime. The linkage type is a little more complicated - it tells LLVM the “visibility” of a function. “Private” or “Internal” would hide this function from the linker (like static functions in C). However, we want to do the opposite: our generated functions should be accessible from other code. Thus, our linkage type is “External”.

The only remaining parameter is the llvm::FunctionType, which is created using code like:

llvm::FunctionType::get(return_type, {param_type_1, param_type_2, ...}, is_variadic)

Declaring all the functions and types in our runtime is mostly just tedious. Here are a few lines from create_functions(), which give a very good idea of the rest of that method:

From llvm_context.cpp, lines 47 through 60
47
48
49
50
51
52
53
54
55
56
57
58
59
60
    auto void_type = Type::getVoidTy(ctx);
    auto sizet_type = IntegerType::get(ctx, sizeof(size_t) * 8);
    functions["stack_init"] = Function::Create(
            FunctionType::get(void_type, { stack_ptr_type }, false),
            Function::LinkageTypes::ExternalLinkage,
            "stack_init",
            &module
    );
    functions["stack_free"] = Function::Create(
            FunctionType::get(void_type, { stack_ptr_type }, false),
            Function::LinkageTypes::ExternalLinkage,
            "stack_free",
            &module
    );

Similarly, here are a few lines from create_types(), from which you can extrapolate the rest:

From llvm_context.cpp, lines 7 through 11
 7
 8
 9
10
11
    stack_type = StructType::create(ctx, "stack");
    stack_ptr_type = PointerType::getUnqual(stack_type);
    tag_type = IntegerType::getInt8Ty(ctx);
    struct_types["node_base"] = StructType::create(ctx, "node_base");
    struct_types["node_app"] = StructType::create(ctx, "node_app");

We also tell LLVM the contents of our structs, so that we may later reference specific fields. This is just like forward declaration - we can forward declare a struct in C/C++, but unless we also declare its contents, we can’t access what’s inside. Below is the code for specifying the body of node_base and node_app.

From llvm_context.cpp, lines 19 through 26
19
20
21
22
23
24
25
26
    struct_types.at("node_base")->setBody(
            IntegerType::getInt32Ty(ctx)
    );
    struct_types.at("node_app")->setBody(
            struct_types.at("node_base"),
            node_ptr_type,
            node_ptr_type
    );

There’s still more functionality packed into llvm_context. Let’s next take a look into custom_function, and the create_custom_function method. Why do we need these? To highlight the need for the custom class, let’s take a look at instruction_pushglobal which occurs at the G-machine level, and then at alloc_global, which will be a function call generated as part of the PushGlobal instruction. instruction_pushglobal’s only member variable is name, which stands for the name of the global function it’s referencing. However, alloc_global requires an arity argument! We can try to get this information from the llvm::Function corresponding to the global we’re trying to reference, but this doesn’t get us anywhere: as far as LLVM is concerned, any global function only takes one parameter, the stack. The rest of the parameters are given through that stack, and their number cannot be easily deduced from the function alone.

Instead, we decide to store global functions together with their arity. We thus create a class to combine these two things (custom_function), define a map from global function names to instances of custom_function, and add a convenience method (create_custom_function) that takes care of constructing an llvm::Function object, creating a custom_function, and storing it in the map.

The implementation for custom_function is straightforward:

From llvm_context.cpp, lines 234 through 252
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
llvm::Function* llvm_context::create_custom_function(std::string name, int32_t arity) {
    auto void_type = llvm::Type::getVoidTy(ctx);
    auto function_type =
        llvm::FunctionType::get(void_type, { stack_ptr_type }, false);
    auto new_function = llvm::Function::Create(
            function_type,
            llvm::Function::LinkageTypes::ExternalLinkage,
            "f_" + name,
            &module
    );
    auto start_block = llvm::BasicBlock::Create(ctx, "entry", new_function);

    auto new_custom_f = custom_function_ptr(new custom_function());
    new_custom_f->arity = arity;
    new_custom_f->function = new_function;
    custom_functions["f_" + name] = std::move(new_custom_f);

    return new_function;
}

We create a function type, then a function, and finally initialize a custom_function. There’s one thing we haven’t seen yet in this function, which is the BasicBlock class. We’ll get to what basic blocks are shortly, but for now it’s sufficient to know that the basic block gives us a place to insert code.

This isn’t the end of our llvm_context class: it also has a variety of other create_* methods! Let’s take a look at their signatures. Most return either void, llvm::ConstantInt*, or llvm::Value*. Since llvm::ConstantInt* is a subclass of llvm::Value*, let’s just treat it as simply an llvm::Value* while trying to understand these methods.

So, what is llvm::Value? To answer this question, let’s first understand how the LLVM IR works.

LLVM IR

An important property of LLVM IR is that it is in Single Static Assignment (SSA) form. This means that each variable can only be assigned to once. For instance, if we use <- to represent assignment, the following program is valid:

x <- 1
y <- 2
z <- x + y

However, the following program is not valid:

x <- 1
x <- x + 1

But what if we do want to modify a variable x? We can declare another “version” of x every time we modify it. For instance, if we wanted to increment x twice, we’d do this:

x <- 1
x1 <- x + 1
x2 <- x1 + 1

In practice, LLVM’s C++ API can take care of versioning variables on its own, by auto-incrementing numbers associated with each variable we use.

Assigned to each variable is llvm::Value. The LLVM documentation states:

It is the base class of all values computed by a program that may be used as operands to other values.

It’s important to understand that llvm::Value does not store the result of the computation. It rather represents how something may be computed. 1 is a value because it computed by just returning 1. x + 1 is a value because it is computed by adding the value inside of x to 1. Since we cannot modify a variable once we’ve declared it, we will keep assigning intermediate results to new variables, constructing new values out of values that we’ve already specified.

This somewhat elucidates what the create_* functions do: create_i8 creates an 8-bit integer value, and create_pop creates a value that is computed by calling our runtime stack_pop function.

Before we move on to look at the implementations of these functions, we need to understand another concept from the world of compiler design: basic blocks. A basic block is a sequence of instructions that are guaranteed to be executed one after another. This means that a basic block cannot have an if/else, jump, or any other type of control flow anywhere except at the end. If control flow could appear inside the basic block, there would be opporunity for execution of some, but not all, instructions in the block, violating the definition. Every time we add an IR instruction in LLVM, we add it to a basic block. Writing control flow involves creating several blocks, with each block serving as the destination of a potential jump. We will see this used to compile the Jump instruction.

Generating LLVM IR

Now that we understand what llvm::Value is, and have a vague understanding of how LLVM is structured, let’s take a look at the implementations of the create_* functions. The simplest is create_i8:

From llvm_context.cpp, lines 150 through 152
150
151
152
ConstantInt* llvm_context::create_i8(int8_t i) {
    return ConstantInt::get(ctx, APInt(8, i));
}

Not much to see here. We create an instance of the llvm::ConstantInt class, from the actual integer given to the method. As we said before, llvm::ConstantInt is a subclass of llvm::Value. Next up, let’s look at create_pop:

From llvm_context.cpp, lines 160 through 163
160
161
162
163
Value* llvm_context::create_pop(Function* f) {
    auto pop_f = functions.at("stack_pop");
    return builder.CreateCall(pop_f, { f->arg_begin() });
}

We first retrieve an llvm::Function associated with stack_pop from our map, and then use llvm::IRBuilder::CreateCall to insert a value that represents a function call into the currently selected basic block (the builder’s state is what dictates what the “selected basic block” is). CreateCall takes as parameters the function we want to call (stack_pop, which we store into the pop_f variable), as well as the arguments to the function (for which we pass f->arg_begin()).

Hold on. What the heck is arg_begin()? Why do we take a function as a paramter to this method? The answer is fairly simple: this method is used when we are generating a function with signature void f_(struct stack* s) (we discussed the signature in the previous post). The parameter that we give to create_pop is this function we’re generating, and arg_begin() gets the value that represents the first parameter to our function - s! Since stack_pop takes a stack, we need to give it the stack we’re working on, and so we use f->arg_begin() to access it.

Most of the other functions follow this exact pattern, with small deviations. However, another function uses a more complicated LLVM instruction:

From llvm_context.cpp, lines 202 through 209
202
203
204
205
206
207
208
209
Value* llvm_context::unwrap_num(Value* v) {
    auto num_ptr_type = PointerType::getUnqual(struct_types.at("node_num"));
    auto cast = builder.CreatePointerCast(v, num_ptr_type);
    auto offset_0 = create_i32(0);
    auto offset_1 = create_i32(1);
    auto int_ptr = builder.CreateGEP(cast, { offset_0, offset_1 });
    return builder.CreateLoad(int_ptr);
}

unwrap_num is used to cast a given node pointer to a pointer to a number node, and then return the integer value from that number node. It starts fairly innocently: we ask LLVM for the type of a pointer to a node_num struct, and then use CreatePointerCast to create a value that is the same node pointer we’re given, but now interpreted as a number node pointer. We now have to access the value field of our node. CreateGEP helps us with this: given a pointer to a node, and two offsets n and k, it effectively performs the following:

&(num_pointer[n]->kth_field)

The first offset, then, gives an index into the “array” represented by the pointer, while the second offset gives the index of the field we want to access. We want to dereference the pointer (num_pointer[0]), and we want the second field (1, when counting from 0). Thus, we call CreateGEP with these offsets and our pointers.

This still leaves us with a pointer to a number, rather than the number itself. To dereference the pointer, we use CreateLoad. This gives us the value of the number node, which we promptly return.

This concludes our implementation of the llvm_context - it’s time to move on to the G-machine instructions.

G-machine Instructions to LLVM IR

Let’s now envision a gen_llvm method on the instruction struct, which will turn the still-abstract G-machine instruction into tangible, close-to-metal LLVM IR. As we’ve seen in our implementation of llvm_context, to access the stack, we need access to the first argument of the function we’re generating. Thus, we need this method to accept the function whose instructions are being converted to LLVM. We also pass in the llvm_context, since it contains the LLVM builder, context, module, and a map of globally declared functions.

With these things in mind, here’s the signature for gen_llvm:

virtual void gen_llvm(llvm_context&, llvm::Function*) const;

Let’s get right to it! instruction_pushint gives us an easy start:

From instruction.cpp, lines 17 through 19
17
18
19
void instruction_pushint::gen_llvm(llvm_context& ctx, Function* f) const {
    ctx.create_push(f, ctx.create_num(ctx.create_i32(value)));
}

We create an LLVM integer constant with the value of our integer, and push it onto the stack.

instruction_push is equally terse:

From instruction.cpp, lines 37 through 39
37
38
39
void instruction_push::gen_llvm(llvm_context& ctx, Function* f) const {
    ctx.create_push(f, ctx.create_peek(f, ctx.create_size(offset)));
}

We simply peek at the value of the stack at the given offset (an integer of the same size as size_t, which we create using create_size). Once we have the result of the peek, we push it onto the stack.

instruction_pushglobal is more involved. Let’s take a look:

From instruction.cpp, lines 26 through 30
26
27
28
29
30
void instruction_pushglobal::gen_llvm(llvm_context& ctx, Function* f) const {
    auto& global_f = ctx.custom_functions.at("f_" + name);
    auto arity = ctx.create_i32(global_f->arity);
    ctx.create_push(f, ctx.create_global(global_f->function, arity));
}

First, we retrive the custom_function associated with the given global name. We then create an LLVM integer constant representing the arity of the function, and then push onto the stack the result of alloc_global, giving it the function and arity just like it expects.

instruction_pop is also short, and doesn’t require much further explanation:

From instruction.cpp, lines 46 through 48
46
47
48
void instruction_pop::gen_llvm(llvm_context& ctx, Function* f) const {
    ctx.create_popn(f, ctx.create_size(count));
}

Some other instructions, such as instruction_update, instruction_pack, instruction_split, instruction_slide, instruction_alloc and instruction_eval are equally as simple, and we omit them for the purpose of brevity.

What remains are two “meaty” functions, instruction_jump and instruction_binop. Let’s start with the former:

From instruction.cpp, lines 101 through 123
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
void instruction_jump::gen_llvm(llvm_context& ctx, Function* f) const {
    auto top_node = ctx.create_peek(f, ctx.create_size(0));
    auto tag = ctx.unwrap_data_tag(top_node);
    auto safety_block = BasicBlock::Create(ctx.ctx, "safety", f);
    auto switch_op = ctx.builder.CreateSwitch(tag, safety_block, tag_mappings.size());
    std::vector<BasicBlock*> blocks;

    for(auto& branch : branches) {
        auto branch_block = BasicBlock::Create(ctx.ctx, "branch", f);
        ctx.builder.SetInsertPoint(branch_block);
        for(auto& instruction : branch) {
            instruction->gen_llvm(ctx, f);
        }
        ctx.builder.CreateBr(safety_block);
        blocks.push_back(branch_block);
    }

    for(auto& mapping : tag_mappings) {
        switch_op->addCase(ctx.create_i8(mapping.first), blocks[mapping.second]);
    }

    ctx.builder.SetInsertPoint(safety_block);
}

This is the one and only function in which we have to take care of control flow. Conceptually, depending on the tag of the node_data at the top of the stack, we want to pick one of many branches and jump to it. As we discussed, a basic block has to be executed in its entirety; since the branches of a case expression are mutually exclusive (only one of them is executed in any given case), we have to create a separate basic block for each branch. Given these blocks, we then want to branch to the correct one using the tag of the node on top of the stack.

This is exactly what we do in this function. We first peek at the node on top of the stack, and use CreateGEP through unwrap_data_tag to get access to its tag. What we then need is LLVM’s switch instruction, created using CreateSwitch. We must provide the switch with a “default” case in case the tag value is something we don’t recognize. To do this, we create a “safety” BasicBlock. With this new safety block in hand, we’re able to call CreateSwitch, giving it the tag value to switch on, the safety block to default to, and the expected number of branches (to optimize memory allocation).

Next, we create a vector of blocks, and for each branch, we append to it a corresponding block branch_block, into which we insert the LLVM IR corresponding to the instructions of the branch. No matter the branch we take, we eventually want to come back to the same basic block, which will perform the usual function cleanup via Update and Slide. We re-use the safety block for this, and use CreateBr at the end of each branch_block to perform an unconditional jump.

After we create each of the blocks, we use the tag_mappings to add cases to the switch instruction, using addCase. Finally, we set the builder’s insertion point to the safety block, meaning that the next instructions will insert their LLVM IR into that block. Since we have all branches jump to the safety block at the end, this means that no matter which branch we take in the case expression, we will still execute the subsequent instructions as expected.

Let’s now look at instruction_binop:

From instruction.cpp, lines 139 through 150
139
140
141
142
143
144
145
146
147
148
149
150
void instruction_binop::gen_llvm(llvm_context& ctx, Function* f) const {
    auto left_int = ctx.unwrap_num(ctx.create_pop(f));
    auto right_int = ctx.unwrap_num(ctx.create_pop(f));
    llvm::Value* result;
    switch(op) {
        case PLUS: result = ctx.builder.CreateAdd(left_int, right_int); break;
        case MINUS: result = ctx.builder.CreateSub(left_int, right_int); break;
        case TIMES: result = ctx.builder.CreateMul(left_int, right_int); break;
        case DIVIDE: result = ctx.builder.CreateSDiv(left_int, right_int); break;
    }
    ctx.create_push(f, ctx.create_num(result));
}

In this instruction, we pop and unwrap two integers from the stack (assuming they are integers). Depending on the type of operation the instruction is set to, we then push the result of the corresponding LLVM instruction. PLUS calls LLVM’s CreateAdd to insert addition, MINUS calls CreateSub, and so on. No matter what the operation was, we push the result onto the stack.

That’s all for our instructions! We’re so very close now. Let’s move on to compiling definitions.

Definitions to LLVM IR

As with typechecking, to allow for mutually recursive functions, we need to be able each global function from any other function. We then take the same approah as before, going in two passes. This leads to two new methods for definition:

virtual void gen_llvm_first(llvm_context& ctx) = 0;
virtual void gen_llvm_second(llvm_context& ctx) = 0;

The first pass is intended to register all functions into the llvm_context, making them visible to other functions. The second pass is used to actually generate the code for each function, now having access to all the other global functions. Let’s see the implementation for gen_llvm_first for definition_defn:

From definition.cpp, lines 58 through 60
58
59
60
void definition_defn::gen_llvm_first(llvm_context& ctx) {
    generated_function = ctx.create_custom_function(name, params.size());
}

Since create_custom_function already creates a function and registers it with llvm_context, this is all we need. Note that we created a new member variable for definition_defn which stores this newly created function. In the second pass, we will populate this function with LLVM IR from the definition’s instructions.

We actually create functions for each of the constructors of data types, but they’re quite special: all they do is pack their arguments! Since they don’t need access to the other global functions, we might as well create their bodies then and there:

From definition.cpp, lines 101 through 112
101
102
103
104
105
106
107
108
109
110
111
112
void definition_data::gen_llvm_first(llvm_context& ctx) {
    for(auto& constructor : constructors) {
        auto new_function =
            ctx.create_custom_function(constructor->name, constructor->types.size());
        ctx.builder.SetInsertPoint(&new_function->getEntryBlock());
        ctx.create_pack(new_function,
                ctx.create_size(constructor->types.size()),
                ctx.create_i8(constructor->tag)
        );
        ctx.builder.CreateRetVoid();
    }
}

Like in definition_defn, we use create_custom_function. However, we then use SetInsertPoint to configure our builder to insert code into the newly created function (which already has a BasicBlock, thanks to that one previously unexplained line in create_custom_function!). Since we decided to only include the Pack instruction, we generate a call to it directly using create_pack. We follow this up with CreateRetVoid, which tells LLVM that this is the end of the function, and that it is now safe to return from it.

Great! We now implement the second pass of gen_llvm. In the case of definition_defn, we do almost exactly what we did in the first pass of definition_data:

From definition.cpp, lines 62 through 68
62
63
64
65
66
67
68
void definition_defn::gen_llvm_second(llvm_context& ctx) {
    ctx.builder.SetInsertPoint(&generated_function->getEntryBlock());
    for(auto& instruction : instructions) {
        instruction->gen_llvm(ctx, generated_function);
    }
    ctx.builder.CreateRetVoid();
}

As for definition_data, we have nothing to do in the second pass. We’re done!

Getting Results

We’re almost there. Two things remain. The first: our implementation of ast_binop, implement each binary operation as simply a function call: + calls f_plus, and so on. But so far, we have not implemented f_plus, or any other binary operator function. We do this in main.cpp, creating a function gen_llvm_internal_op:

From main.cpp, lines 70 through 83
70
71
72
73
74
75
76
77
78
79
80
81
82
83
void gen_llvm_internal_op(llvm_context& ctx, binop op) {
    auto new_function = ctx.create_custom_function(op_action(op), 2);
    std::vector<instruction_ptr> instructions;
    instructions.push_back(instruction_ptr(new instruction_push(1)));
    instructions.push_back(instruction_ptr(new instruction_eval()));
    instructions.push_back(instruction_ptr(new instruction_push(1)));
    instructions.push_back(instruction_ptr(new instruction_eval()));
    instructions.push_back(instruction_ptr(new instruction_binop(op)));
    ctx.builder.SetInsertPoint(&new_function->getEntryBlock());
    for(auto& instruction : instructions) {
        instruction->gen_llvm(ctx, new_function);
    }
    ctx.builder.CreateRetVoid();
}

We create a simple function body. We then append G-machine instructions that take each argument, evaluate it, and then perform the corresponding binary operation. With these instructions in the body, we insert them into a new function, just like we did in our code for definition_defn and definition_data.

Finally, we write our gen_llvm function that we will call from main:

From main.cpp, lines 125 through 141
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
void gen_llvm(const std::vector<definition_ptr>& prog) {
    llvm_context ctx;
    gen_llvm_internal_op(ctx, PLUS);
    gen_llvm_internal_op(ctx, MINUS);
    gen_llvm_internal_op(ctx, TIMES);
    gen_llvm_internal_op(ctx, DIVIDE);

    for(auto& definition : prog) {
        definition->gen_llvm_first(ctx);
    }

    for(auto& definition : prog) {
        definition->gen_llvm_second(ctx);
    }
    ctx.module.print(llvm::outs(), nullptr);
    output_llvm(ctx, "program.o");
}

It first creates the functions for +, -, *, and /. Then, it calls the first pass of gen_llvm on all definitions, followed by the second pass. Lastly, it uses LLVM’s built-in functionality to print out the generated IR in our module, and then uses a function output_llvm to create an object file ready for linking.

To be very honest, I took the output_llvm function almost entirely from instructional material for my university’s compilers course. The gist of it, though, is: we determine the target architecture and platform, specify a “generic” CPU, create a default set of options, and then generate an object file. Here it is:

From main.cpp, lines 85 through 123
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
void output_llvm(llvm_context& ctx, const std::string& filename) {
    std::string targetTriple = llvm::sys::getDefaultTargetTriple();

    llvm::InitializeNativeTarget();
    llvm::InitializeNativeTargetAsmParser();
    llvm::InitializeNativeTargetAsmPrinter();

    std::string error;
    const llvm::Target* target =
        llvm::TargetRegistry::lookupTarget(targetTriple, error);
    if (!target) {
        std::cerr << error << std::endl;
    } else {
        std::string cpu = "generic";
        std::string features = "";
        llvm::TargetOptions options;
        llvm::TargetMachine* targetMachine =
            target->createTargetMachine(targetTriple, cpu, features,
                    options, llvm::Optional<llvm::Reloc::Model>());

        ctx.module.setDataLayout(targetMachine->createDataLayout());
        ctx.module.setTargetTriple(targetTriple);

        std::error_code ec;
        llvm::raw_fd_ostream file(filename, ec, llvm::sys::fs::F_None);
        if (ec) {
            throw 0;
        } else {
            llvm::TargetMachine::CodeGenFileType type = llvm::TargetMachine::CGFT_ObjectFile;
            llvm::legacy::PassManager pm;
            if (targetMachine->addPassesToEmitFile(pm, file, NULL, type)) {
                throw 0;
            } else {
                pm.run(ctx.module);
                file.close();
            }
        }
    }
}

We now add a generate_llvm call to main.

Are we there?

Let’s try to compile our first example, works1.txt. The file:

From works1.txt, entire file
defn main = { sum 320 6 }
defn sum x y = { x + y }

We run the following commands in our build directory:

./compiler < ../examples/work1.txt
gcc -no-pie ../runtime.c program.o
./a.out

Nothing happens. How anticlimactic! Our runtime has no way of printing out the result of the evaluation. Let’s change that:

From runtime.c, lines 157 through 183
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
void print_node(struct node_base* n) {
    if(n->tag == NODE_APP) {
        struct node_app* app = (struct node_app*) n;
        print_node(app->left);
        putchar(' ');
        print_node(app->right);
    } else if(n->tag == NODE_DATA) {
        printf("(Packed)");
    } else if(n->tag == NODE_GLOBAL) {
        struct node_global* global = (struct node_global*) n;
        printf("(Global: %p)", global->function);
    } else if(n->tag == NODE_IND) {
        print_node(((struct node_ind*) n)->next);
    } else if(n->tag == NODE_NUM) {
        struct node_num* num = (struct node_num*) n;
        printf("%d", num->value);
    }
}

int main(int argc, char** argv) {
    struct node_global* first_node = alloc_global(f_main, 0);
    struct node_base* result = eval((struct node_base*) first_node);

    printf("Result: ");
    print_node(result);
    putchar('\n');
}

Rerunning our commands, we get:

Result: 326

The correct result! Let’s try it with works2.txt:

From works2.txt, entire file
defn add x y = { x + y }
defn double x = { add x x }
defn main = { double 163 }

And again, we get the right answer:

Result: 326

This is child’s play, though. Let’s try with something more complicated, like works3.txt:

From works3.txt, entire file
data List = { Nil, Cons Int List }
defn length l = {
    case l of {
        Nil -> { 0 }
        Cons x xs -> { 1 + length xs }
    }
}
defn main = { length (Cons 1 (Cons 2 (Cons 3 Nil))) }

Once again, our program does exactly what we intended:

Result: 3

Alright, this is neat, but we haven’t yet confirmed that lazy evaluation works. How about we try it with works5.txt:

From works5.txt, entire file
data List = { Nil, Cons Int List }

defn sumZip l m = {
    case l of {
        Nil -> { 0 }
        Cons x xs -> {
            case m of {
                Nil -> { 0 }
                Cons y ys -> { x + y + sumZip xs ys }
            }
        }
    }
}

defn ones = { Cons 1 ones }

defn main = { sumZip ones (Cons 1 (Cons 2 (Cons 3 Nil))) }

Yet again, the program works:

Result: 9

At last, we have a working compiler!

While this is a major victory, we are not yet finished with the compiler altogether. While we allocate nodes whenever we need them, we have not once uttered the phrase free in our runtime. Our language works, but we have no way of comparing numbers, no lambdas, no let/in. In the next several posts, we will improve our compiler to properly free unused memory usign a garbage collector, implement lambda functions using lambda lifting, and use our Alloc instruction to implement let/in expressions. We get started on the first of these tasks in Part 9 - Garbage Collection.