### Metaprogramming

Lazy evaluation can be a powerful tool for structuring your code. For instance, it can let you turn your code inside out, inverting the flow of control. Many a Haskell program takes advantage of laziness to express algorithms in clear, succinct terms, turning them from recipes into declarations.

The question for today’s blog post is: How can we tap the power of lazy evaluation in an inherently eager language like C++? I’ll lead you through a simple coding example and gradually introduce the building blocks of lazy programming: the suspension, the lazy stream, and a whole slew of functional algorithms that let you operate on them. In the process we’ll discover some fundamental functional patterns like functors, monads, and monoids. I have discussed them already in my post about C++ futures. It’s very edifying to see them emerge in a completely different context.

# The Problem

Let’s write a program that prints the first n Pythagorean triples. A Pythagorean triple consists of three integers, x, y, and z, that satisfy the relation x² + y² = z². Let’s not be fancy and just go with the brute force approach. Here’s the program in C:

```
void printNTriples(int n)
{
    int i = 0;
    for (int z = 1; ; ++z)
        for (int x = 1; x <= z; ++x)
            for (int y = x; y <= z; ++y)
                if (x*x + y*y == z*z) {
                    printf("%d, %d, %d\n", x, y, z);
                    if (++i == n)
                        return;
                }
}
```

Here, a single C function serves three distinct purposes: It

1. Generates Pythagorean triples,
2. Prints them,
3. Counts them; and when the count reaches n, breaks.

This is fine, as long as you don’t have to modify or reuse this code. But what if, for instance, instead of printing, you wanted to draw the triples as triangles? Or if you wanted to stop as soon as one of the numbers reached 100? The problem with this code is that it’s structured inside out: both the test and the sink for data are embedded in the innermost loop of the algorithm. A more natural and flexible approach would be to:

1. Generate the list of Pythagorean triples,
2. Take the first ten of them, and
3. Print them.

And that’s exactly how you’d write this program in Haskell:

```
main = print (take 10 triples)

triples = [(x, y, z) | z <- [1..]
                     , x <- [1..z]
                     , y <- [x..z]
                     , x^2 + y^2 == z^2]
```

This program reads: take 10 triples and print them. It declares `triples` as a list (square brackets mean a list) of triples `(x, y, z)`, where (the vertical bar reads “where”) `z` is an element of the list of integers from 1 to infinity, `x` is from 1 to `z`, `y` is from `x` to `z`, and the sum of squares of `x` and `y` is equal to the square of `z`. This notation is called “list comprehension” and is characteristic of Haskell terseness.

You see the difference? Haskell lets you abstract the notion of the list of Pythagorean triples so you can operate on it as one entity, whereas in C (or, for that matter, in C++) we were not able to disentangle the different, orthogonal, aspects of the program.

The key advantage of Haskell in this case is its ability to deal with infinite lists. And this ability comes from Haskell’s inherent laziness. Things are never evaluated in Haskell until they are absolutely needed. In the program above, it was the call to `print` that forced Haskell to actually do some work: take 10 elements from the list of triples. Since the triples weren’t there yet, it had to calculate them, but only as many as were requested and not one more.

# Suspension

We’ll start with the most basic building block of laziness: a suspended function. Here’s the first naive attempt:

```
template<class T>
class Susp {
public:
    explicit Susp(std::function<T()> f)
        : _f(f)
    {}
    T get() { return _f(); }
private:
    std::function<T()> _f;
};
```

We often create suspensions using lambda functions, as in:

```
int x = 2;
int y = 3;
Susp<int> sum([x, y]() { return x + y; });
...
int z = sum.get();
```

Notice that the suspended lambda may capture variables from its environment: here `x` and `y`. A lambda, and therefore a suspension, is a `closure`.

The trouble with this implementation is that the function is re-executed every time we call `get`. There are several problems with that: If the function is not pure, we may get different values each time; if the function has side effects, these may happen multiple times; and if the function is expensive, the performance will suffer. All these problems may be addressed by memoizing the value.

Here’s the idea: The first time the client calls `get` we should execute the function and store the returned value in a member variable. Subsequent calls should go directly to that variable. We could implement this by setting a Boolean flag on the first call and then checking it on every subsequent call, but there’s a better implementation that uses thunks.

A thunk is a pointer to a free function taking a suspension (the `this` pointer) and returning a value (by const reference). The `get` method simply calls this thunk, passing it the `this` pointer.

Initially, the thunk is set to `thunkForce`, which calls the method `setMemo`. This method evaluates the function, stores the result in `_memo`, switches the thunk to `thunkGet`, and returns the memoized value. On subsequent calls, `get` goes through the `thunkGet` thunk, which simply returns the memoized value.

```
template<class T>
class Susp
{
    // thunk used on the first call: evaluates and memoizes
    static T const & thunkForce(Susp * susp) {
        return susp->setMemo();
    }
    // thunk used on subsequent calls: returns the memoized value
    static T const & thunkGet(Susp * susp) {
        return susp->getMemo();
    }
    T const & getMemo() {
        return _memo;
    }
    T const & setMemo() {
        _memo = _f();
        _thunk = &thunkGet;
        return getMemo();
    }
public:
    explicit Susp(std::function<T()> f)
        : _thunk(&thunkForce), _memo(T()), _f(f)
    {}
    T const & get() {
        return _thunk(this);
    }
private:
    T const & (*_thunk)(Susp *);
    T _memo;
    std::function<T()> _f;
};
```

(By the way, the function pointer declaration of `_thunk` looks pretty scary in C++, doesn’t it?)

[Edit: I decided to remove the discussion of the thread safe implementation since it wasn’t ready for publication. The current implementation is not thread safe.]

You can find a lot more detail about the Haskell implementation of suspended functions in the paper by Tim Harris, Simon Marlow, and Simon Peyton Jones, Haskell on a Shared-Memory Multiprocessor.

# Lazy Stream

The loop we used to produce Pythagorean triples in C worked on the push principle — data was pushed towards the sink. If we want to deal with infinite lists, we have to use the pull principle. It should be up to the client to control the flow of data. That’s the inversion of control I was talking about in the introduction.

We’ll use a lazy list and call it a stream. In C++ a similar idea is sometimes expressed in terms of input and forward iterators, although it is understood that an iterator itself is not the source or the owner of data — just an interface to one. So we’ll stick with the idea of a stream.

We’ll implement the stream in the functional style as a persistent data structure fashioned after persistent lists (see my series of blog post on persistent data structures). It means that a stream, once constructed, is never modified. To “advance” the stream, we’ll have to create a new one by calling the `const` method `pop_front`.

Let’s start with the definition: A stream is either empty or it contains a suspended cell. This immediately suggests the implementation as a (possibly null) pointer to a cell. Since the whole stream is immutable, the cell will be immutable too, so it’s perfectly safe to share it between copies of the stream. We can therefore use a shared pointer:

```
template<class T>
class Stream
{
private:
    std::shared_ptr<Susp<Cell<T>>> _lazyCell;
};
```

Of course, because of reference counting and memoization, the stream is only conceptually immutable and, in the current implementation, not thread safe.

So what’s in the `Cell`? Remember, we want to be able to generate infinite sequences, so `Stream` must contain the DNA for not only producing the value of type `T` but also for producing the offspring — another (lazy) `Stream` of values. The `Cell` is just that: A value and a stream.

```
template<class T>
class Cell
{
public:
    Cell() {} // need default constructor for memoization
    Cell(T v, Stream<T> const & tail)
        : _v(v), _tail(tail)
    {}
    explicit Cell(T v) : _v(v) {}
    T val() const {
        return _v;
    }
    Stream<T> pop_front() const {
        return _tail;
    }
private:
    T _v;
    Stream<T> _tail;
};
```

This mutually recursive pair of data structures works together amazingly well.

```
template<class T>
class Stream
{
private:
    std::shared_ptr<Susp<Cell<T>>> _lazyCell;
public:
    Stream() {}
    Stream(std::function<Cell<T>()> f)
        : _lazyCell(std::make_shared<Susp<Cell<T>>>(f))
    {}
    Stream(Stream && stm)
        : _lazyCell(std::move(stm._lazyCell))
    {}
    Stream & operator=(Stream && stm)
    {
        _lazyCell = std::move(stm._lazyCell);
        return *this;
    }
    // Declaring the move operations suppresses the implicit copy
    // operations, but our lambdas capture Streams by value, so we
    // bring the copies back explicitly:
    Stream(Stream const &) = default;
    Stream & operator=(Stream const &) = default;
    bool isEmpty() const
    {
        return !_lazyCell;
    }
    T get() const
    {
        return _lazyCell->get().val();
    }
    Stream<T> pop_front() const
    {
        return _lazyCell->get().pop_front();
    }
};
```

There are several things worth pointing out. The two constructors follow our formal definition of the `Stream`: one constructs an empty stream, the other constructs a suspended `Cell`. A suspension is created from a function returning `Cell`.

I also added a move constructor and a move assignment operator for efficiency. We’ll see it used in the implementation of `forEach`.

The magic happens when we call `get` for the first time. That’s when the suspended `Cell` comes to life. The value and the new stream are produced and memoized for later use. Or, this may happen if the first call is to `pop_front`. Notice that `pop_front` is a `const` method — the `Stream` itself is immutable. The method returns a new stream that encapsulates the rest of the sequence.

Let’s get our feet wet by constructing a stream of integers from `n` to infinity. The constructor of a `Stream` takes a function that returns a `Cell`. We’ll use a lambda that captures the value of `n`. It creates a `Cell` with that value and a tail, which it obtains by calling `intsFrom` with `n+1`:

```
Stream<int> intsFrom(int n)
{
    return Stream<int>([n]()
    {
        return Cell<int>(n, intsFrom(n + 1));
    });
}
```

It’s a recursive definition, but without the usual recursive function calls that eat up the stack. The call to the inner `intsFrom` is not made from the outer `intsFrom`. Instead it’s made the first time `get` is called on the emerging `Stream`.

Of course, we can also create finite streams, like this one, which produces integers from `n` to `m`:

```
Stream<int> ints(int n, int m)
{
    if (n > m)
        return Stream<int>();
    return Stream<int>([n, m]()
    {
        return Cell<int>(n, ints(n + 1, m));
    });
}
```

The trick is to capture the limit `m` as well as the recursion variable `n`. When the limit is reached, we simply return an empty `Stream`.

We’ll also need the method `take`, which creates a `Stream` containing the first `n` elements of the original stream:

```
Stream take(int n) const {
    if (n == 0 || isEmpty())
        return Stream();
    auto cell = _lazyCell;
    return Stream([cell, n]()
    {
        auto v = cell->get().val();
        auto t = cell->get().pop_front();
        return Cell<T>(v, t.take(n - 1));
    });
}
```

Here we are capturing the suspended cell and using it to lazily generate the elements of the new, truncated, `Stream`. Again, the key to understanding why this works is to keep in mind that `Stream`s and `Cell`s are conceptually immutable, and therefore can be shared by the implementation. This has some interesting side effects, which don’t influence the results, but change the performance. For instance, if the caller of `take` forces the evaluation of the first n elements — e.g., by passing them through the consuming `forEach` below — these elements will appear miraculously memoized in the original `Stream`.

Finally, we’ll need some way to iterate through streams. Here’s an implementation of `forEach` that consumes the stream while enumerating it and feeding its elements to a function.

```
template<class T, class F>
void forEach(Stream<T> strm, F f)
{
    while (!strm.isEmpty())
    {
        f(strm.get());
        strm = strm.pop_front();
    }
}
```

It’s the assignment:

`strm = strm.pop_front();`

which consumes the stream by decreasing the reference count of the head of the `Stream`. In particular, if you pass an rvalue `Stream` to `forEach`, its elements will be generated and deleted in lockstep. The algorithm will use constant memory, independent of the virtual length of the `Stream`. What Haskell accomplishes with garbage collection, we approximate in C++ with reference counting and `shared_ptr`.

# Working with Streams

It’s not immediately obvious how to translate our Pythagorean triple program from nested loops to lazy streams, so we’ll have to take inspiration from the corresponding Haskell program. Let me first rewrite it using a slightly different notation:

```
triples = do
    z <- [1..]
    x <- [1..z]
    y <- [x..z]
    guard (x^2 + y^2 == z^2)
    return (x, y, z)
```

The general idea is this: Start with the stream of integers from 1 to infinity. For every such integer — call it `z` — create a stream from 1 to `z`. For each of those — call them `x` — create a stream from `x` to `z`. Filter out those which don’t satisfy the Pythagorean constraint. Finally, output a stream of tuples `(x, y, z)`.

So far we’ve learned how to create a stream of integers — we have the function `intsFrom`. But now we’ll have to do something for each of these integers. We can’t just enumerate those integers and apply a function to each, because that would take us eternity. So we need a way to act on each element of a stream lazily.

In functional programming this is called mapping a function over a list. In general, a parameterized data structure that can be mapped over is called a functor. I’m going to show you that our `Stream` is a functor.

# Stream as a Functor

The idea is simple: we want to apply a function to each element of a stream to get a new transformed stream (it’s very similar to the `std::transform` algorithm from STL). The catch is: We want to do it generically and lazily.

To make the algorithm — we’ll call it `fmap` — generic, we have to parameterize it over types. The algorithm starts with a `Stream` of elements of type `T` and a function from `T` to some other type `U`. The result should be a stream of `U`.

We don’t want to make `U` the template argument, because then the client would have to specify it explicitly. We want the compiler to deduce this type from the type of the function. We want, therefore, the function type `F` to be the parameter of our template (this will also allow us to call it uniformly with function pointers, function objects, and lambdas):

```
template<class T, class F>
auto fmap(Stream<T> stm, F f)
```

Without the use of concepts, we have no way of enforcing, or even specifying, that `F` be a type of a function from `T` to `U`. The best we can do is to statically assert it inside the function:

```
static_assert(std::is_convertible<F, std::function<U(T)>>::value,
    "fmap requires a function type U(T)");
```

But what is `U`? We can get at it using `decltype`:

`decltype(f(stm.get()));`

Notice that `decltype` takes, as an argument, an expression that can be statically typed. Here, the expression is a function call of `f`. We also need a dummy argument for this function: we use the result of `stm.get()`. The argument to `decltype` is never evaluated, but it is type-checked at compile time.

One final problem is how to specify the return type of `fmap`. It’s supposed to be `Stream<U>`, but we don’t know `U` until we apply `decltype` to the arguments of `fmap`. We have to use the new `auto` function declaration syntax of C++11. So here are all the type-related preliminaries:

```
template<class T, class F>
auto fmap(Stream<T> stm, F f) -> Stream<decltype(f(stm.get()))>
{
    using U = decltype(f(stm.get()));
    static_assert(std::is_convertible<F, std::function<U(T)>>::value,
        "fmap requires a function type U(T)");
    ...
}
```

Compared to that, the actual implementation of `fmap` seems rather straightforward:

```
if (stm.isEmpty()) return Stream<U>();
return Stream<U>([stm, f]()
{
    return Cell<U>(f(stm.get()), fmap(stm.pop_front(), f));
});
```

In words: If the stream is empty, we’re done — return an empty stream. Otherwise, create a new stream by suspending a lambda function. That function captures the original stream (by value) and the function `f`, and returns a `Cell`. That cell contains the value of `f` acting on the first element of the original stream, and a tail. The tail is created with `fmap` acting on the rest of the original stream.

Equipped with `fmap`, we can now attempt to take the first step towards generating our triples: apply the function `ints(1, z)` to each element of the stream `intsFrom(1)`:

```
fmap(intsFrom(1), [](int z)
{
    return ints(1, z);
});
```

The result is a `Stream` of `Stream`s of integers of the shape:

```1
1 2
1 2 3
1 2 3 4
1 2 3 4 5
...```

But now we are stuck. We’d like to apply `ints(x, z)` to each element of that sequence, but we don’t know how to get through two levels of `Stream`. Our `fmap` can only get through one layer. We need a way to flatten a `Stream` of `Stream`s. That functionality is part of what functional programmers call a monad. So let me show you that `Stream` is indeed a monad.

If you think of a `Stream` as a list, the flattening of a list of lists is just concatenation. Suppose for a moment that we know how to lazily concatenate two `Stream`s (we’ll get to it later) and let’s implement a function `mjoin` that concatenates a whole `Stream` of `Stream`s.

You might have noticed a pattern in the implementation of lazy functions on streams. We use some kind of recursion, which starts with “Are we done yet?” If not, we do an operation that involves one element of the stream and the result of a recursive call to the function itself.

The “Are we done yet?” question usually involves testing for an empty stream. But here we are dealing with a `Stream` of `Stream`s, so we have to test two levels deep. This way we’ll ensure that the concatenation of a `Stream` of empty `Stream`s immediately returns an empty `Stream`.

The recursive step in `mjoin` creates a `Cell` whose element is the head of the first stream, and whose tail is the concatenation of the tail of the first stream and the result of `mjoin` of the rest of the streams:

```
template<class T>
Stream<T> mjoin(Stream<Stream<T>> stm)
{
    while (!stm.isEmpty() && stm.get().isEmpty())
    {
        stm = stm.pop_front();
    }
    if (stm.isEmpty()) return Stream<T>();
    return Stream<T>([stm]()
    {
        Stream<T> hd = stm.get();
        return Cell<T>( hd.get()
                      , concat(hd.pop_front(), mjoin(stm.pop_front())));
    });
}
```

The combination of `fmap` and `mjoin` lets us compose functions like `intsFrom` or `ints` that return `Stream`s. In fact, this combination is so common that it deserves its own function, which we’ll call `mbind`:

```
template<class T, class F>
auto mbind(Stream<T> stm, F f) -> decltype(f(stm.get()))
{
    return mjoin(fmap(stm, f));
}
```

If we use `mbind` in place of `fmap`:

```
mbind(intsFrom(1), [](int z)
{
    return ints(1, z);
});
```

we can produce a flattened list:

`1 1 2 1 2 3 1 2 3 4...`

But it’s not just the list: Each element of the list comes with variables that are defined in its environment — here the variable `z`. We can keep chaining calls to `mbind` and capture more variables in the process:

```
mbind(intsFrom(1), [](int z)
{
    return mbind(ints(1, z), [z](int x)
    {
        return mbind(ints(x, z), [x, z](int y)
        {
            ...
        });
    });
});
```

At this point we have captured the triples `x`, `y`, `z`, and are ready for the Pythagorean testing. But before we do it, let’s define two additional functions that we’ll use later.

The first one is `mthen` which is a version of `mbind` that takes a function of no arguments. The idea is that such a function will be executed for each element of the stream, but it won’t use the value of that element. The important thing is that the function will not be executed when the input stream is empty. In that case, `mthen` will return an empty stream.

We implement `mthen` using a slightly modified version of `fmap` that takes a function `f` of no arguments:

```
template<class T, class F>
auto fmapv(Stream<T> stm, F f) -> Stream<decltype(f())>
{
    using U = decltype(f());
    static_assert(std::is_convertible<F, std::function<U()>>::value,
        "fmapv requires a function type U()");

    if (stm.isEmpty()) return Stream<U>();
    return Stream<U>([stm, f]()
    {
        return Cell<U>(f(), fmapv(stm.pop_front(), f));
    });
}
```

We plug it into the definition of `mthen` the same way `fmap` was used in `mbind`:

```
template<class T, class F>
auto mthen(Stream<T> stm, F f) -> decltype(f())
{
    return mjoin(fmapv(stm, f));
}
```

The second useful function is `mreturn`, which simply turns a value of any type into a one-element `Stream`:

```
template<class T>
Stream<T> mreturn(T v)
{
    return Stream<T>([v]() {
        return Cell<T>(v);
    });
}
```

We’ll need `mreturn` to turn our triples into `Stream`s.

It so happens that a parameterized type equipped with `mbind` and `mreturn` is called a monad (it must also satisfy some additional monadic laws, which I won’t talk about here). Our lazy `Stream` is indeed a monad.

# Stream as a Monoid and a Monad Plus

When implementing `mjoin` we used the function `concat` to lazily concatenate two `Stream`s. Its implementation follows the same recursive pattern we’ve seen so many times:

```
template<class T>
Stream<T> concat( Stream<T> lft
                , Stream<T> rgt)
{
    if (lft.isEmpty())
        return rgt;
    return Stream<T>([=]()
    {
        return Cell<T>(lft.get(), concat<T>(lft.pop_front(), rgt));
    });
}
```

What’s interesting is that the concatenation of streams puts them under yet another well known functional pattern: a monoid. A monoid is equipped with a binary operation, just like `concat`, which must be associative and possess a unit element. It’s easy to convince yourself that concatenation of `Stream`s is indeed associative, and that the neutral element is an empty `Stream`. Concatenating an empty `Stream`, whether in front or in the back of any other `Stream`, doesn’t change the original `Stream`.

What’s even more interesting is that being a combination of a monoid and a monad makes `Stream` into a monad plus, and every monad plus defines a `guard` function — exactly what we need for the filtering of our triples. This function takes a Boolean argument and outputs a `Stream`. If the Boolean is false, the `Stream` is empty (the unit element of monad plus!), otherwise it’s a singleton `Stream`. We really don’t care what value sits in this `Stream` — we never use the result of `guard` for anything but the flow of control. In Haskell, there is a special “unit” value `()` — here I use a `nullptr` as its closest C++ analog.

```
Stream<void*> guard(bool b)
{
    if (b) return mreturn<void*>(nullptr);
    else return Stream<void*>();
}
```

We can now pipe the result of `guard` into `mthen`, which will ignore the content of the `Stream` but won’t fire when the `Stream` is empty. When the `Stream` is not empty, we will call `mreturn` to output a singleton `Stream` with the result tuple:

```
Stream<std::tuple<int, int, int>> triples()
{
    return mbind(intsFrom(1), [](int z)
    {
        return mbind(ints(1, z), [z](int x)
        {
            return mbind(ints(x, z), [x, z](int y)
            {
                return mthen(guard(x*x + y*y == z*z), [x, y, z]()
                {
                    return mreturn(std::make_tuple(x, y, z));
                });
            });
        });
    });
}
```

These singletons will then be concatenated by the three levels of `mbind` to create one continuous lazy `Stream` of Pythagorean triples.

Compare this function with its Haskell counterpart:

```
triples = do
    z <- [1..]
    x <- [1..z]
    y <- [x..z]
    guard (x^2 + y^2 == z^2)
    return (x, y, z)
```

Now, the client can `take` 10 of those triples from the `Stream` — and the triples still won’t be evaluated! It’s the consuming `forEach` that finally forces the evaluation:

```
void test()
{
    auto strm = triples().take(10);
    forEach(std::move(strm), [](std::tuple<int, int, int> const & t)
    {
        std::cout << std::get<0>(t) << ", "
                  << std::get<1>(t) << ", "
                  << std::get<2>(t) << std::endl;
    });
}
```

# Conclusion

The generation of Pythagorean triples is a toy example, but it shows how lazy evaluation can be used to restructure code in order to make it more reusable. You can use the same function `triples` to print the values in one part of your program and draw triangles in another. You can filter the triples or impose different termination conditions. You can use the same trick to generate an infinite set of approximations to the solution of a numerical problem, and then use different strategies to truncate it. Or you can create an infinite set of animation frames, and so on.

The building blocks of laziness are also reusable. I have used them to implement the solution to the eight-queens problem and a conference scheduling program. Once they are made thread safe, the combinators that bind them are thread safe too. This is, in general, a property of persistent data structures.

You might be concerned about the performance of lazy data structures, and rightly so. They use the heap heavily, so memory allocation and deallocation is a serious performance bottleneck. There are many situations, though, where code structure, reusability, maintainability, and correctness (especially in multithreaded code) are more important than performance. And there are some problems that might be extremely hard to express without the additional flexibility gained from laziness.

I made the sources to all code in this post available on GitHub.

The video and the slides of my Boostcon talk, “Haskell — The Pseudocode Language for C++ Metaprogramming,” are now available online, courtesy of my employer, Corensic. They cover some of the material from the blog series on monads.

I’m totally exhausted, so I won’t write much. I spoke to the audience practically non-stop for three hours, and then I was rewriting Proto with Joel Falcou and Hartmut Kaiser over beers till late at night. My talk about the connection between Haskell monads and Boost Proto was very well received, beyond my expectations. It looks like there is some real need to put theoretical foundations under the more advanced areas of template metaprogramming, and monads just fit the bill. There was a monad lurking in Proto all this time; the only thing missing was “bind”.

I’m looking forward to tomorrow’s keynote by Hans Boehm, and the continued hacking of Proto with Joel and Hartmut.

Aspen, Monday, May 15: These are just quick notes, and I apologize if they are not very coherent. I have little time to jot them down because the day was filled with talks and discussions. There are two main topics people are excited about: concurrency and template metaprogramming.

In the morning I went to Christopher Kohlhoff’s talk about Boost.Asio. Asio stands for Asynchronous IO. The problem is how to structure your program to deal with a lot of asynchronous calls. There are two main difficulties: resource management and inversion of control. An async-driven program is essentially a collection of disjoint callbacks. When control jumps from one callback to another, you can easily lose track of who is the owner of which resources, how to get hold of them, and how to reason about the whole system. Chris showed how to deal with resources, essentially by using shared, reference-counted, pointers. When you’re making an asynchronous call you usually have to make sure that the completion handler shares some resources with the caller. In Chris’s approach these resources are encapsulated in a Connection object and the completion handlers are methods of that object. So when a handler is called, it has access to the “this” pointer and can share data with the caller. A handler must take part in the reference counting of the Connection object, so that the object doesn’t suddenly disappear. Considering that a lot of event-driven code I’ve seen over the years used global shared data to store state, this is progress. The inversion of control is still a problem though.

Hartmut Kaiser talked about the Phoenix library, which essentially implements C++ inside C++. It is built on top of Proto (which is a big hit at the conference; more about it later). From what I gathered, Phoenix is mostly a better lambda. You can write anonymous functions in Phoenix using a very stylized C++: for instance, instead of braces, you use brackets, instead of semicolons, commas, etc. One advantage of Phoenix functions is that they can be statically polymorphic. Unfortunately the main example didn’t require polymorphism, and in fact would be much easier on the eyes if it were written using C++ lambdas. The other, more important advantage of Phoenix is that it’s highly customizable. Hartmut showed an example of how to extend C++ syntax to support parallelism. Behind the scenes, Phoenix took advantage of OpenMP, which is a system of ugly pragmas supported by many compilers to create parallel loops and other concurrent constructs.

And then there was a Proto marathon by Joel Falcou, after which I had a long discussion with him over dinner. Proto is a metaprogramming tour de force. If I described it as a library for constructing embedded domain-specific languages in C++, I wouldn’t do it justice. It’s a system that tricks the C++ compiler into parsing expressions into full-blown compile-time abstract syntax trees, which are at the same time function objects that can be executed at runtime. If this is not impressive enough, Proto provides multiple customization mechanisms that allow you to plug in new constructs, give them specific semantics, and even rewrite the ASTs. Joel gave an example of an EDSL for expressing analytical functions, which could analytically calculate derivatives of functions at compile time. Joel is coming to my talk tomorrow and I hope he will be able to explain to me what I’m doing. My talk is essentially about how to get to grips with Proto by using Haskell monads. We’ll see how it goes.