When I first joined the Chapel team, one pattern used in its C++-based compiler made a strong impression on me. Since then, I’ve used the pattern many more times, and have been very satisfied with how it turned out. However, it feels like the pattern is relatively unknown, so I thought I’d show it off, and some of its applications in the Chapel compiler. I’ve slightly tweaked a lot of the snippets I directly show in this article for the sake of simpler presentation; I’ve included links to the original code (available on GitHub) if you want to see the unabridged version.
Broadly speaking, the “X Macros” pattern is about generating code. If you have a lot of repetitive code to write (declaring many variables or classes, performing many very similar actions, etc.), this pattern can save a lot of time, lead to much more maintainable code, and reduce the effort required to add more code.
I will introduce the pattern in its simplest form with my first example: interning strings.
Application 1: String Interning
The Chapel compiler interns a lot of its strings. This way, it can reduce the
memory footprint of keeping identifiers in memory (every string "x"
is
actually the same string) and make for much faster equality comparisons
(you can just perform a pointer comparison!). Generally, a Context
class
is used to manage interning state. A new interned string can be constructed
using the context object in the following manner:
UniqueString::get(ctxPtr, "the string");
Effectively, this performs a search of the currently existing unique strings.
If one with the content ("the string"
in this case) doesn’t exist, it’s
created and registered with the Context
. Otherwise, the existing string is
returned. Some strings, however, occur a lot in the compiler, to the point that
it would be inefficient to perform the whole “find-or-create” operation every
time. One example is the "this"
string, which is an identifier with a lot of
special behavior in the language (much like this
in languages such as Java).
To support such frequent flier strings, the compiler initializes them once,
and creates a variable per-string that can be accessed to get that string’s value.
There’s that repetitive code. Defining a brand new variable for each string,
of which there are around 100 at the time of writing, is a lot of boilerplate.
There are also at least two places where code needs to be added:
[note:
A third use in the compiler is actually a variadic template defined over
character arrays. The template is defined and specialized in such a way that
you can refer to a variable by its string contents (i.e., you can write
USTR("the string")
instead of
theStringVariable
).
]
It would be very easy to accidentally modify the former but not the latter,
especially for developers not familiar with how these “common strings” are
implemented.
This is where the X Macros come in. If you look around the compiler source code, there’s a header file that looks something like the following:
X(align , "align")
X(atomic , "atomic")
X(bool_ , "bool")
X(borrow , "borrow")
X(borrowed , "borrowed")
X(by , "by")
X(bytes , "bytes")
// A lot more of these...
What’s this X
thing? That right there is the essence of the pattern: the macro
X
isn’t defined in the header! Effectively, all-global-strings.h
is just
a list, and we can “iterate” over this list to generate some code for each
one of its elements, in as many places as we want. What I mean by this is
that we can then write code like the following:
struct GlobalStrings {
#define X(field, str) UniqueString field;
#include "all-global-strings.h"
#undef X
};
In this case, we define the macro X
to ignore the value of the string (we’re
just declaring it here), and create a new UniqueString
variable declaration.
Since the declaration is inside the GlobalStrings
struct, this ends up
creating a field. Just like that, we’ve declared a class with over 100
fields. Initialization is equally simple:
GlobalStrings globalStrings;
Context rootContext;
static void initGlobalStrings() {
#define X(field, str) globalStrings.field = UniqueString::get(&rootContext, str);
#include "chpl/framework/all-global-strings.h"
#undef X
}
With this, we’ve completely automated the code for for both declaring and
initializing all 100 of our unique strings. Adding a new string doesn’t require
a developer to know all of the places where this is implemented: just by
modifying the all-global-strings.h
header with a new call to X
, they can
add both a new variable and code to initialize it. Pretty robust!
Application 2: AST Class Hierarchy
Altough the interned strings are an excellent first example, it wasn’t the
first usage of X Macros that I encountered in the Chapel compiler. Beyond
strings, the compiler uses X Macros to represent the whole class hierarchy
of abstract syntax tree (AST)
nodes that it uses. Here, the code is actually a bit more complicated; the
class hierarchy isn’t a list like the strings were; it is itself a tree.
To represent such a structure, we need more than a single X
macro; the
compiler went with AST_NODE
, AST_BEGIN_SUBCLASSES
, and AST_END_SUBCLASSES
.
Here’s what that looks like:
// Other AST nodes above...
AST_BEGIN_SUBCLASSES(Loop)
AST_NODE(DoWhile)
AST_NODE(While)
AST_BEGIN_SUBCLASSES(IndexableLoop)
AST_NODE(BracketLoop)
AST_NODE(Coforall)
AST_NODE(For)
AST_NODE(Forall)
AST_NODE(Foreach)
AST_END_SUBCLASSES(IndexableLoop)
AST_END_SUBCLASSES(Loop)
// Other AST nodes below...
The class hierarchy defined in this header, called uast-classes-list.h
, is
used for a lot of things, both in the compiler itself and in some libraries
that use the compiler. I’ll go through the use cases in turn.
Tags and Dynamic Casting
First, to deal with a general absence of
RTTI, the hierarchy header
is used to declare a “tag” enum. Each AST node has a tag matching its class;
this allows us inspect the AST and perform safe casts similar to dynamic_cast
.
Note that for parent classes (defined via BEGIN_SUBCLASSES
), we actually
end up creating two tags: one START_...
and one END_...
. The reason
for this will become clear in a moment.
enum AstTag {
#define AST_NODE(NAME) NAME ,
#define AST_BEGIN_SUBCLASSES(NAME) START_##NAME ,
#define AST_END_SUBCLASSES(NAME) END_##NAME ,
#include "chpl/uast/uast-classes-list.h"
#undef AST_NODE
#undef AST_BEGIN_SUBCLASSES
#undef AST_END_SUBCLASSES
NUM_AST_TAGS,
AST_TAG_UNKNOWN
};
The above snippet makes AstTag
contain elements such as DoWhile
,
While
, START_Loop
, and END_Loop
. For convenience, we also add a couple
of other elements: NUM_AST_TAGS
, which is
[note:
This is because C++ assigns integer values to enum elements sequentially, starting
at zero.
]
and a generic “unknown tag” value.
Having generated the enum elements in this way, we can write query functions.
This way, the API consumer can write isLoop(tag)
instead of manually performing
a comparison. Code generation here is actually split into two distinct forms
of “is bla” methods: those for concrete AST nodes (DoWhile,
While
) and
those for abstract base classes (Loop
). The reason for this is simple:
only a AstTag::DoWhile
represents a do-while loop, but both DoWhile
and While
are instances of Loop
. So, isLoop
should return true for both.
This is where the START_...
and END_...
enum elements come in. Reading
the header file top-to-bottom, we first end up generating START_Loop
,
then DoWhile
and While
, and then END_Loop
. Since C++ assigns integer
value to enums sequentially, to check if a tag “extends” a base class, it’s
sufficient to check if its value is greater than the START
token, and
smaller than the END
token – this means it was declared within the
matching pair of BEGIN_SUBCLASSES
and END_SUBCLASES
.
// define is___ for leaf and regular nodes
// (not yet for abstract parent classes)
#define AST_NODE(NAME) \
static inline bool is##NAME(AstTag tag) { \
return tag == NAME; \
}
#define AST_BEGIN_SUBCLASSES(NAME)
#define AST_END_SUBCLASSES(NAME)
// Apply the above macros to uast-classes-list.h
#include "chpl/uast/uast-classes-list.h"
// clear the macros
#undef AST_NODE
#undef AST_BEGIN_SUBCLASSES
#undef AST_END_SUBCLASSES
// define is___ for abstract parent classes
#define AST_NODE(NAME)
#define AST_BEGIN_SUBCLASSES(NAME) \
static inline bool is##NAME(AstTag tag) { \
return START_##NAME < tag && tag < END_##NAME; \
}
#define AST_END_SUBCLASSES(NAME)
// Apply the above macros to uast-classes-list.h
#include "chpl/uast/uast-classes-list.h"
// clear the macros
#undef AST_NODE
#undef AST_BEGIN_SUBCLASSES
#undef AST_END_SUBCLASSES
These helpers are quite convenient. Here are a few examples of what we end up with:
isFor(AstTag::For) // Returns true; a 'for' loop is indeed a 'for' loop.
isIndexableLoop(AstTag::For) // Returns true; a 'for' loop is "indexable" ('for i in ...')
isLoop(AstTag::For) // Returns true; a 'for' loop is a loop.
isFor(AstTag::While) // Returns false; a 'while' loop is not a 'for' loop.
isIndexableLoop(AstTag::While) // Returns false; a 'while' loop uses a boolean condition, not an index
isLoop(AstTag::While) // Returns true; a 'while' loop is a loop.
On the top-level AST node class, we generate isWhateverNode
and
toWhateverNode
for each AST subclass. Thus, user code is able to inspect the
AST and perform (checked) casts using plain methods. I omit isWhateverNode
here for brevity (its definition is very simple), and include only
toWhateverNode
.
#define AST_TO(NAME) \
const NAME * to##NAME() const { \
return this->is##NAME() ? (const NAME *)this : nullptr; \
} \
NAME * to##NAME() { \
return this->is##NAME() ? (NAME *)this : nullptr; \
}
#define AST_NODE(NAME) AST_TO(NAME)
#define AST_LEAF(NAME) AST_TO(NAME)
#define AST_BEGIN_SUBCLASSES(NAME) AST_TO(NAME)
#define AST_END_SUBCLASSES(NAME)
// Apply the above macros to uast-classes-list.h
#include "chpl/uast/uast-classes-list.h"
// clear the macros
#undef AST_NODE
#undef AST_LEAF
#undef AST_BEGIN_SUBCLASSES
#undef AST_END_SUBCLASSES
#undef AST_TO
These methods are used heavily in the compiler. For example, here’s a completely random snippet of code I pulled out:
if (auto var = decl->toVarLikeDecl()) {
// Figure out variable type based upon:
// * the type in the variable declaration
// * the initialization expression in the variable declaration
// * the initialization expression from split-init
auto typeExpr = var->typeExpression();
auto initExpr = var->initExpression();
if (auto var = decl->toVariable())
if (var->isField())
isField = true;
Thus, developers adding new AST nodes are not required to manually implement
the isWhatever
, toWhatever
, and other functions. This and a fair bit
of other AST functionality (which I will cover in the next subsection) is
automatically generated using X Macros.
The Visitor Pattern without Double Dispatch
The Visitor Pattern is very important in general, but it’s beyond ubiquitous for us compiler developers. It helps avoid bloating AST node classes with methods and state required for the various operations we perform on them. It also often saves us from writing AST traversal code.
Essentially, rather than adding each new operation (e.g. convert to string,
compute the type, assign IDs) as methods on each AST node class, we extract
this code into a per-operation visitor. This visitor is a class that has methods
implementing the custom behavior on the AST nodes. A visit(WhileLoop*)
method
might be used to perform the operation on ‘while’ loops, and visit(ForLoop*)
might
do the same for ‘for’ loops. The AST nodes themselves only have a traverse
method that accepts a visitor, whatever it may be, and calls the appropriate
visit methods. This way, the AST node implementations remain simple and relatively
stable.
As a very simple example, suppose you wanted to count the number of loops used
in a program for an unspecified reason. You could add a countLoops
method,
but then you’ve introduced a method to the AST node API for what might be a
one-time, throwaway operation. With the visitor pattern, you don’t need to do
that; you can just create a new class:
struct MyVisitor {
int count = 0;
void visit(const Loop*) { count += 1; }
void visit(const AstNode*) { /* do nothing for other nodes */ }
}
int countLoops(const AstNode* root) {
MyVisitor visitor;
root->traverse(visitor);
return visitor.count;
}
The traverse
method is a nice API, isn’t it? It’s very easy to add operations
that work on your syntax trees, without modifying them. There is still an important
open question, though: how does traverse
know to call the right visit
function?
If traverse
were only defined on AstNode*
, and it simply called visit(this)
,
we’d always end up calling the AstNode
version of the visit
function. This
is because C++ doesn’t dynamic dispatch
[note:
Obviously, C++ has the ability to pick the right method based on the runtime
type of the receiver: that's just virtual
functions
and vtable
s.
]
Statically, the call clearly accepts an AstNode
, and nothing more specific.
The compiler therefore picks that version of the visit
method.
The “traditional” way to solve this problem in a language like C++ or Java
is called double dispatch. Using our example as reference, this involves
making each AST node class have its own traverse
method. This way,
calls to visit(this)
have more specific type information, and are resolved
to the appropriate overload. But that’s more boilerplate code: each new AST
node will need to have a virtual traverse method that looks something like this:
void MyNode::traverse(Visitor& v) {
v.visit(this);
}
It would also require all visitors to extend from Visitor
. So now you have:
- Boilerplate code on every AST node that looks the same but needs to be duplicated
- A parent
Visitor
class that must have avisit
method for each AST node in the language (so that children can override it). - To make it easier to write code like our
MyVisitor
above, thevisit
methods in theVisitor
must be written such thatvisit(ChildNode*)
callsvisit(ParentNode*)
by default. Otherwise, theLoop
overload wouldn’t have been called by theDoWhile
overload (e.g.).
So there’s a fair bit of tedious boilerplate, and more code to manually modify
when adding an AST node: you have to go and adjust the Visitor
class with
new visit
stub.
The reason all of this is necessary is that everyone (myself included) generally agrees that code like the following is generally a bad idea:
struct AstNode {
void traverse(Visitor& visitor) {
if (auto forLoop = toForLoop()) {
visitor.visit(forLoop);
} else if (auto whileLoop = toWhileLoop()) {
visitor.visit(whileLoop);
} else {
// 100 more lines like this...
}
}
}
After all, what happens when you add a new AST node? You’d still have to modify
this list, and since everything still extends Visitor
, you’d still need to
add a new visit
stub there. But what if there were no base class? Instead,
what if traverse
were a template?
struct AstNode {
template <typename VisitorType>
void traverse(VisitorType& visitor) {
if (auto forLoop = toForLoop()) {
visitor.visit(forLoop);
} else if (auto whileLoop = toWhileLoop()) {
visitor.visit(whileLoop);
} else {
// 100 more lines like this...
}
}
}
Note that this wouldn’t be possible to write in C++ if visit
were a virtual
method; have you ever heard of a virtual template? With code like this, the
VisitorType
wouldn’t need to define every overload, as long as it had
a version for AstNode
. Furthermore, C++’s regular overload resolution rules
would take care of calling the Loop
overload if a more specific one for
DoWhile
didn’t exist.
The only problem that remains is that of having a 100-line if-else (which could
be a switch
to little aesthetic benefit). But this is exactly where the
X Macro pattern shines again! We already have a list of all AST node classes,
and the code for invoking them is nearly identical. Thus, the Chapel compiler
has a doDispatch
function (used by traverse
) that looks like this:
static void doDispatch(const AstNode* ast, Visitor& v) {
switch (ast->tag()) {
#define CONVERT(NAME) \
case chpl::uast::asttags::NAME: \
{ \
v.visit((const chpl::uast::NAME*) ast); \
return; \
}
#define IGNORE(NAME) \
case chpl::uast::asttags::NAME: \
{ \
CHPL_ASSERT(false && "this code should never be run"); \
}
#define AST_NODE(NAME) CONVERT(NAME)
#define AST_BEGIN_SUBCLASSES(NAME) IGNORE(START_##NAME)
#define AST_END_SUBCLASSES(NAME) IGNORE(END_##NAME)
#include "chpl/uast/uast-classes-list.h"
IGNORE(NUM_AST_TAGS)
IGNORE(AST_TAG_UNKNOWN)
#undef AST_NODE
#undef AST_BEGIN_SUBCLASSES
#undef AST_END_SUBCLASSES
#undef CONVERT
#undef IGNORE
}
CHPL_ASSERT(false && "this code should never be run");
}
And that’s it. We have automatically generated the traversal code, allowing
us to use the visitor pattern in what I think is a very elegant way. Assuming
a developer adding a new AST node updates the uast-classes-list.h
header,
the traversal logic will be auto-modified to properly handle the new node.
Generating a Python Class Hierarchy
This is a fun one. For a while, in my spare time, I was working on Python bindings for Chapel. These bindings are oriented towards developing language tooling: it feels much easier to write a language linter, auto-formatter, or maybe even a language server in Python rather than in C++. It’s definitely much easier to use Python to develop throwaway scripts that work with Chapel programs, which is something that developers on the Chapel team tend to do quite often.
I decided I wanted the Python AST node class hierarchy to match the C++ version.
This is convenient for many reasons, including being able to wrap methods on
parent AST nodes and have them be available through child AST nodes and having
isinstance
work properly. It’s also advantageous from the point of view
of conceptual simplicity. However, I very much did not want to write CPython
API code to define the many AST node classes that are available in the Chapel
language.
Once again, the uast-classes-list.h
header came into play here. With little
effort, I was able to auto-generate PyTypeObject
s for each AST node in the
class hierarchy:
#define DEFINE_PY_TYPE_FOR(NAME, TAG, FLAGS)\
PyTypeObject NAME##Type = { \
PyVarObject_HEAD_INIT(NULL, 0) \
.tp_name = #NAME, \
.tp_basicsize = sizeof(NAME##Object), \
.tp_itemsize = 0, \
.tp_flags = FLAGS, \
.tp_doc = PyDoc_STR("A Chapel " #NAME " AST node"), \
.tp_methods = (PyMethodDef*) PerNodeInfo<TAG>::methods, \
.tp_base = parentTypeFor(TAG), \
.tp_init = (initproc) NAME##Object_init, \
.tp_new = PyType_GenericNew, \
};
#define AST_NODE(NAME) DEFINE_PY_TYPE_FOR(NAME, chpl::uast::asttags::NAME, Py_TPFLAGS_DEFAULT)
#define AST_BEGIN_SUBCLASSES(NAME) DEFINE_PY_TYPE_FOR(NAME, chpl::uast::asttags::START_##NAME, Py_TPFLAGS_BASETYPE)
#define AST_END_SUBCLASSES(NAME)
#include "chpl/uast/uast-classes-list.h"
#undef AST_NODE
#undef AST_BEGIN_SUBCLASSES
#undef AST_END_SUBCLASSES
You may have noticed that I snuck templates into the code above. The motivation there is to avoid writing out the (usually empty) Python method table for every single AST node. In particular, I have a template that, by default, provides an empty method table, which can be specialized per node to add methods when necessary. This detail is useful for application 3 below, but not necessary to understand the use of X Macros here.
I used the same <
and >
trick to generate the parentTypeFor
each tag:
static PyTypeObject* parentTypeFor(chpl::uast::asttags::AstTag tag) {
#define AST_NODE(NAME)
#define AST_LEAF(NAME)
#define AST_BEGIN_SUBCLASSES(NAME)
#define AST_END_SUBCLASSES(NAME) \
if (tag > chpl::uast::asttags::START_##NAME && tag < chpl::uast::asttags::END_##NAME) { \
return &NAME##Type; \
}
#include "chpl/uast/uast-classes-list.h"
#include "chpl/uast/uast-classes-list.h"
#undef AST_NODE
#undef AST_LEAF
#undef AST_BEGIN_SUBCLASSES
#undef AST_END_SUBCLASSES
return &AstNodeType;
}
A few more invocations of the uast-classes-list.h
macro, and I had a working
class hierarchy. I didn’t explicitly mention any AST node at all; all was derived
from the Chapel compiler header. This also meant that as the language changed
and the AST class hierarchy developed, the Python bindings’ code would not need
to be updated. As long as it was compiled with an up-to-date version of the
header, the hierarchy would match that present within the language.
This allows for code like the following to be written in Python:
def print_decls(mod):
"""
Print all the things declared in this Chapel module.
"""
for child in mod:
if isinstance(child, NamedDecl):
print(child.name())
Application 3: CPython Method Tables and Getters
The Chapel Python bindings use the X Macro pattern another time, actually. Like I mentioned earlier, I use template specialization to reduce the amount of boilerplate code required for declaring Python objects. In particular, there’s a general method table declared as follows:
template <chpl::uast::asttags::AstTag tag>
struct PerNodeInfo {
static constexpr PyMethodDef methods[] = {
{NULL, NULL, 0, NULL} /* Sentinel */
};
};
Then, when I need to add methods, I use template specialization by writing something like the following:
template <>
struct PerNodeInfo<TheAstTag> {
static constexpr PyMethodDef methods[] = {
{"method_name", TheNode_method_name, METH_NOARGS, "Documentation string"},
// ... more like the above ...
{NULL, NULL, 0, NULL} /* Sentinel */
};
};
When reviewing a PR that adds more methods to the Python bindings (by
defining new TheNode_methodname
functions and then including them in the
method table), I noticed that in the PR, the developer added some methods
but forgot to put them into the respective table, leaving them unusable by
the Python client code. This came with the additional observation that there
was a moderate amount of duplication when declaring the C++ functions and then
listing them in the table. The name (method_name
in the code) occurred many
times.
The developer who opened the PR suggesting using X Macros to combine the information (declaration of function and its use in the corresponding method table) into a single list. This led to the following header file:
CLASS_BEGIN(FnCall)
METHOD_PROTOTYPE(FnCall, actuals, "Get the actuals of this FnCall node")
PLAIN_GETTER(FnCall, used_square_brackets, "Check if this FnCall was made using square brackets",
"b", return node->callUsedSquareBrackets())
CLASS_END(FnCall)
The PLAIN_GETTER
macro in this case is used to define trivial getters
(precluding the need for handling the Python-object-to-AST-node conversion,
and other CPython-specific things), whereas the METHOD_PROTOTYPE
is used
to refer to methods that needed explicit implementations. With
this, the method tables are generated as follows:
#define CLASS_BEGIN(TAG) \
template <> \
struct PerNodeInfo<chpl::uast::asttags::TAG> { \
static constexpr PyMethodDef methods[] = {
#define CLASS_END(TAG) \
{NULL, NULL, 0, NULL} /* Sentinel */ \
}; \
};
#define PLAIN_GETTER(NODE, NAME, DOCSTR, TYPESTR, BODY) \
{#NAME, NODE##Object_##NAME, METH_NOARGS, DOCSTR},
#define METHOD_PROTOTYPE(NODE, NAME, DOCSTR) \
{#NAME, NODE##Object_##NAME, METH_NOARGS, DOCSTR},
#include "method-tables.h"
The CLASS_BEGIN
generates the initial template <>
header and the code up
to the opening curly brace of the table definition. Then, for each method,
PLAIN_GETTER
and METHOD_PROTOTYPE
generate the relevant entries. Finally,
CLASS_END
inserts the sentinel and the closing curly brace.
Another invocation of the macros in method-tables.h
is used to generate the
implementations of “plain getters”, which is boilerplate that I won’t get into
it here, since it’s pretty CPython specific.
Discussion
I’ve presented to you a three applications of the pattern, in an order that happens to be from least to most “extreme”. It’s possible that some of these are over the line for using macros, especially for those who think of macros as unfortunate remnants of C++’s past. However, I think that what I’ve demonstrated demonstrates the versatility of the X Macro pattern – feel free to apply it to the degree that you find appropriate.
The thing I like the most about this pattern is that the header files read quite nicely:
you end up with a very declarative “scaffold” of what’s going on. The
uast-classes-list.h
makes for an excellent and fairly readable reference of
all the AST nodes in the Chapel compiler. The method-tables.h
header provides
a fairly concise summary of what methods are available on what (Python) AST
node.
Of course, this approach is not without its drawbacks. Drawback zero is the heavy use of macros: to the best of my knowledge, modern C++ tends to discourage the usage of macros in favor of C++-specific features. Of course, this “pure C++” preference is applicable to variable degrees in different use cases and code bases; because of this, I won’t count macros as (too much of) a drawback.
The more significant downside is that this approach introduces a lot of dependencies
between source files. Any time the header changes, anything that uses any part
of the code generated by the header must be recompiled. Thus, if you’re generating
classes, changing any one class will “taint” any code that uses any of the
generated classes. In the Chapel compiler, touching the AST class hierarchy
requires a recompilation of all the AST nodes, and any compiler code that uses
the AST nodes (a lot). This is because each AST node needs access to the
AstTag
enum, and that enum is generated from the hierarchy header.
That’s all I have for today! Thanks for reading. I hope you got something useful for your day-to-day programming out of this.