May 2015



This is part 4 of the miniseries about solving a simple constraint-satisfaction problem:

  s e n d
+ m o r e
---------
m o n e y

using monads in C++. Previously: The Tale of Two Monads. To start from the beginning, go to Using Monads in C++ to Solve Constraints: 1.

In the previous post I showed some very general programming techniques based on functional data structures and monads. A lot of the code I wrote to solve this particular arithmetic puzzle is easily reusable. In fact the monadic approach is perfect for constructing libraries. I talked about it when discussing C++ futures and ranges. A monad is a pure expression of composability, and composability is the most important feature of any library.

You would be justified to think that rolling out a monad library just to solve a simple arithmetic puzzle is overkill. But I’m sure you can imagine a lot of other, more complex problems that could be solved using the same techniques. You might also fear that such functional methods — especially when implemented in C++, which is not optimized for this kind of programming — would lead to less performant code. This doesn’t matter too much if you are solving a one-shot problem, and the time you spend developing and debugging your code dwarfs the time it takes the program to complete execution. But even if performance were an issue and you were faced with a larger problem, functional code could be parallelized much easier than its imperative counterpart with no danger of concurrency bugs. So you might actually get better performance from functional code by running it in parallel.

Refactoring

In this installment I’d like to talk about something that a lot of functional programmers swear by: Functional programs are amazingly easy to refactor. Anybody who has tried to refactor C++ code can testify to how hard it is, and how long it takes to iron out all the bugs introduced by refactoring. With functional code, it’s a breeze. So let’s have another look at the code from the previous post and do some deep surgery on it. This is our starting point:

StateList<tuple> solve()
{
    StateList sel = &select;

    return mbind(sel, [=](int s) {
    return mbind(sel, [=](int e) {
    return mbind(sel, [=](int n) {
    return mbind(sel, [=](int d) {
    return mbind(sel, [=](int m) {
    return mbind(sel, [=](int o) {
    return mbind(sel, [=](int r) {
    return mbind(sel, [=](int y) {
        return mthen(guard(s != 0 && m != 0), [=]() {
            int send  = asNumber(vector{s, e, n, d});
            int more  = asNumber(vector{m, o, r, e});
            int money = asNumber(vector{m, o, n, e, y});
            return mthen(guard(send + more == money), [=]() {
                return mreturn(make_tuple(send, more, money));
            });
        });
    });});});});});});});});
}

What immediately stands out is the amount of repetition: eight lines of code look almost exactly the same. Conceptually, these lines correspond to eight nested loops that would be used in the imperative solution. The question is, how can we abstract over control structures, such as loops? But in our monadic implementation the loops are replaced with higher-order functions, and that opens up a path toward even further abstraction.

Reifying Substitutions

The first observation is that we are missing a data structure that should correspond to the central concept we use when describing the solution — the substitution. We are substituting numbers for letters, but we only see those letters as names of variables. A more natural implementation would use a map data structure, mapping characters to integers.

Notice, however, that this mapping has to be dynamically collected and torn down as we are exploring successive branches of the solution space. The imperative approach would demand backtracking. The functional approach makes use of persistent data structures. I have described such data structures previously, in particular a persistent red-black tree. It can be easily extended to a red-black map. You can find the implementation on github.

A red-black map is a generic data structure, which we’ll specialize for our use:

using Map = RBMap<char, int>;

A map lookup may fail, so it should either be implemented to return an optional, or use a default value in case of a failure. In our case we know that we never look up a value that’s not in the map (unless our code is buggy), so we can use a helper function that never fails:

int get(Map const & subst, char c)
{
    return subst.findWithDefault(-1, c);
}

Once we have objectified the substitution as a map, we can convert a whole string of characters to a number in one operation:

int toNumber(Map const & subst, string const & str)
{
    int acc = 0;
    for (char c : str)
    {
        int d = get(subst, c);
        acc = 10 * acc + d;
    }
    return acc;
}

There is one more thing that we may automate: finding the set of unique characters in the puzzle. Previously, we just figured out that they were s, e, n, d, m, o, r, y and hard-coded them into the solution. Now we can use a function, fashioned after a Haskell utility called nub:

string nub(string const & str)
{
    string result;
    for (char c : str)
    {
        if (result.find(c) == string::npos)
            result.push_back(c);
    }
    return result;
}

Don’t worry about the quadratic running time of nub — it’s only called once.

Recursion

With those preliminaries out of the way, let’s concentrate on analyzing the code. We have a series of nested mbind calls. The simplest thing is to turn these nested calls into recursion. This involves setting up the first recursive call, implementing the recursive function, and writing the code to terminate the recursion.

The main observation is that each level of nesting accomplishes one substitution of a character by a number. The recursion should end when we run out of characters to substitute.

Let’s first think about what data has to be passed down in a recursive call. One item is the string of characters that still need substituting. The second is the substitution map. While the first argument keeps shrinking, the second one keeps growing. The third argument is the number to be used for the next substitution.

Before we make the first recursive call, we have to prepare the initial arguments. We get the string of characters to be substituted by running nub over the text of our puzzle:

nub("sendmoremoney")

The second argument should start as an empty map. The third argument, the digit to be used for the next substitution, comes from binding the selection plan select<int> to a lambda that will call our recursive function. In Haskell, it’s common to call the auxiliary recursive function go, so that’s what we’ll do here as well:

StateList<tuple<int, int, int>> solve()
{
    StateList<int> sel = &select<int>;
    Map subst;
    return mbind(sel, [=](int s) {
        return go(nub("sendmoremoney"), subst, s);
    });
}

When implementing go, we have to think about terminating the recursion. But first, let’s talk about how to continue it.

We are called with the next digit to be used for substitution, so we should perform this substitution. This means inserting a key/value pair into our substitution map subst. The key is the first character from the string str. The value is the digit i that we were given.

subst.inserted(str[0], i)

Notice that, because the map is immutable, the method inserted returns a new map with one more entry. This is a persistent data structure, so the new map shares most of its data with its previous version. The creation of the new version takes logarithmic time in the number of entries (just as it does with a regular balanced tree that’s used in the Standard Library std::map). The advantage of using a persistent map is that we don’t have to do any backtracking — the unmodified version is still available after the return from the recursive call.

The recursive call to go expects three arguments: (1) the string of characters yet to be substituted — that will be the tail of the string that we were passed, (2) the new substitution map, and (3) the new digit n. We get this new digit by binding the digit selection sel to a lambda that makes the recursive call:

mbind(sel, [=](int n) {
    string tail(&str[1], &str[str.length()]);
    return go(tail, subst.inserted(str[0], i), n);
});

Now let’s consider the termination condition. Since go was called with the next digit to be substituted, we need at least one character in the string for this last substitution. So the termination is reached when the length of the string reaches one. We have to make the last substitution and then call the final function that prunes the bad substitutions. Here’s the complete implementation of go:

StateList<tuple<int, int, int>> go(string str, Map subst, int i)
{
    StateList<int> sel = &select<int>;

    assert(str.length() > 0);
    if (str.length() == 1)
        return prune(subst.inserted(str[0], i));
    else
    {
        return mbind(sel, [=](int n) {
            string tail(&str[1], &str[str.length()]);
            return go(tail, subst.inserted(str[0], i), n);
        });
    }
}

The function prune is lifted almost literally from the original implementation. The only difference is that we are now using a substitution map.

StateList<tuple<int, int, int>> prune(Map subst)
{
    return mthen(guard(get(subst, 's') != 0 && get(subst, 'm') != 0), 
      [=]() {
        int send  = toNumber(subst, "send");
        int more  = toNumber(subst, "more");
        int money = toNumber(subst, "money");
        return mthen(guard(send + more == money), [=]() {
            return mreturn(make_tuple(send, more, money));
        });
    });
}

The top-level code that starts the chain of events and displays the solution is left unchanged:

int main()
{
    List<int> lst{ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };
    cout << evalStateList(solve(), lst);
    return 0;
}

It’s a matter of opinion whether the refactored code is simpler or not. It’s definitely easier to generalize and it’s less error prone. But the main point is how easy it was to make the refactorization. The reason for that is that, in functional code there are no hidden dependencies, because there are no hidden side effects. What would usually be considered a side effect in imperative code is accomplished using pure functions and monadic binding. There is no way to forget that a given function works with state, because a function that deals with state has the state specified in its signature — it returns a StateList. And it doesn’t itself modify the state. In fact it doesn’t have access to the state. It just composes functions that will modify the state when executed. That’s why we were able to move these functions around so easily.

A Word About Syntax

Monadic code in C++ is artificially complex. That’s because C++ was not designed for functional programming. The same program written in Haskell is much shorter. Here it is, complete with the helper functions prune, get, and toNumber:

solve = StateL select >>= go (nub "sendmoremoney") M.empty
  where
    go [c] subst i = prune (M.insert c i subst)
    go (c:cs) subst i = StateL select >>= go cs (M.insert c i subst)
    prune subst = do
        guard (get 's' /= 0 && get 'm' /= 0)
        let send  = toNumber "send"
            more  = toNumber "more"
            money = toNumber "money"
        guard (send + more == money)
        return (send, more, money)
      where
        get c = fromJust (M.lookup c subst)
        toNumber str = asNumber (map get str)
        asNumber = foldl (\t u -> t*10 + u) 0
main = print $ evalStateL solve [0..9]

Some of the simplification comes from using the operator >>= in place of mbind, and a simpler syntax for lambdas. But there is also some built-in syntactic sugar for chains of monadic bindings in the form of the do notation. You see an example of it in the implementation of prune.

The funny thing is that C++ is on the verge of introducing something equivalent to do notation called resumable functions. There is a very strong proposal for resumable functions in connection with the future monad, dealing with asynchronous functions. There is talk of using it with generators, or lazy ranges, which are a version of the list monad. But there is still a need for a push towards generalizing resumable functions to user-defined monads, like the one I described in this series. It just doesn’t make sense to add separate language extensions for each aspect of the same general pattern. This pattern has been known in functional programming for a long time and it’s called a monad. It’s a very useful and versatile pattern whose importance has been steadily growing as the complexity of software projects increases.

Bibliography

  1. The code for this blog is available on github.
  2. Douglas M. Auclair, MonadPlus: What a Super Monad!, The Monad.Reader Issue 11, August 25, 2008. It contains the Haskell implementation of the solution to this puzzle.
  3. Justin Le, Unique sample drawing & searches with List and StateT — “Send more money”. It was the blog post that inspired this series.

This is part 3 of the miniseries about solving a simple constraint-satisfaction problem:

  s e n d
+ m o r e
---------
m o n e y

using monads in C++. Previously: The State Monad

A constraint satisfaction problem may be solved by brute force, by trying all possible substitutions. This may be done using the imperative approach with loops, or a declarative approach with the list monad. The list monad works by fanning out a list of “alternative universes,” one for each individual substitution. It lets us explore the breadth of the solution space. The universes in which the substitution passes all the constraints are then gathered together to form the solution.

We noticed also that, as we are traversing the depth of the solution space, we could avoid testing unnecessary branches by keeping track of some state. Indeed, we start with a list of 10 possibilities for the first character, but there are only 9 possibilities for the second character, 8 for the third, and so on. It makes sense to start with a list of ten digits, and keep removing digits as we make our choices. At every point, the list of remaining choices is the state we have to keep track of.

Again, there is the imperative approach and the functional approach to state. The imperative approach makes the state mutable and the composition of actions requires backtracking. The functional approach uses persistent, immutable data structures, and the composition is done at the level of functions — plans of action that can be executed once the state is available. These plans of action form the state monad.

The functional solution to our problem involves the combination of the list monad and the state monad. Mashing two monads together is not trivial — in Haskell this is done using monad transformers — but here I’ll show you how to do it manually.

State List

Let’s start with the state, which is the list of digits to choose from:

using State = List<int>;

In the state monad, our basic data structure was a function that took a state and returned a pair: a value and a new state. Now we want our functions to generate, as in quantum mechanics, a whole list of alternative universes, each described by a pair: a value and a state.

Notice that we have to vary both the value and the state within the list. For instance, the function that selects a number from a list should produce a different number and a different list in each of the alternative universes. If the choice is, say, 3, the new state will have 3 removed from the list.

ListSelections

Here’s the convenient generic definition of the list of pairs:

template<class A>
using PairList = List<pair<A, State>>;

And this is our data structure that represents a non-deterministic plan of action:

template<class A> 
using StateList = function<PairList<A>(State)>;

Once we have such a non-deterministic plan of action, we can run it. We give it a state and get back a list of results, each paired with a new state:

template<class A>
PairList<A> runStateList(StateList<A> st, State s)
{
    return st(s);
}

Composing Non-Deterministic Plans

As we’ve seen before, the essence of every monad is the composition of individual actions. The higher-order function mbind that does the composition has the same structure for every monad. It takes two arguments. One argument is the result of our previous activity. For the list monad, it was a list of values. For the state monad it was a plan of action to produce a value (paired with the leftover state). For the combination state-list monad it will be a plan of action to produce a list of values (each combined with the leftover state).

The second argument to mbind is a continuation. It takes a single value. It’s supposed to be the value that is encapsulated in the first argument. In the case of the list monad, it was one value from the list. In the case of the state monad, it was the value that would be returned by the execution of the plan. In the combined state-list monad, this will be one of the values produced by the plan.

The continuation is supposed to produce the final result, be it a list of values or a plan. In the state-list monad it will be a plan to produce a list of values, each with its own leftover state.

Finally, the result of mbind is the composite: for the list monad it was a list, for the state monad it was a plan, and for the combination, it will be a plan to produce a list of values and states.

We could write the signature of mbind simply as:

template<class A, class B>
StateList<B> mbind(StateList<A>, function<StateList<B>(A)>)

but that would prevent the C++ compiler from making automatic type deductions. So instead we parameterize it with the type A and a function type F. That requires some additional footwork to extract the return type from the type of F:

template<class A, class F>
auto mbind(StateList<A> g, F k) 
    -> decltype(k(g(State()).front().first))

The implementation of mbind is a combination of what we did for the list monad and for the state monad. To begin with, we are supposed to return a plan, so we have to create a lambda. This lambda takes a state s as an argument. Inside the lambda, given the state, we can run the first argument — the plan. We get a list of pairs. Each pair contains a value and a new state. We want to execute the continuation k for each of these values.

To do that, we run fmap over this list (think std::transform, but less messy). Inside fmap, we disassemble each pair into value and state, and execute the continuation k over the value. The result is a new plan. We run this plan, passing it the new state. The result is a list of pairs: value, state.

Since each item in the list produces a list, the result of fmap is a list of lists. Just like we did in the list monad, we concatenate all these lists into one big list using concatAll.

template<class A, class F>
auto mbind(StateList<A> g, F k) 
    -> decltype(k(g(State()).front().first))
{
    return [g, k](State s) {
        PairList<A> plst = g(s);
        // List<PairList<B>> 
        auto lst2 = fmap([k](pair<A, State> const & p) {
            A a = p.first;
            State s1 = p.second;
            auto ka = k(a);
            auto result = runStateList(ka, s1);
            return result;
        }, plst);
        return concatAll(lst2);
    };
}

To make state-list into a full-blown monad we need one more ingredient: the function mreturn that turns any value into a trivial plan to produce a singleton list with that value. The state is just piped through:

template<class A>
StateList<A> mreturn(A a)
{
    return [a](State s) {
        // singleton list
        return PairList<A>(make_pair(a, s)); 
    };
}

So that would be all for the state-list monad, except that in C++ we have to special-case mbind for the type of continuation that takes a void argument (there is no value of type void in C++). We’ll call this version mthen:

template<class A, class F>
auto mthen(StateList<A> g, F k) -> decltype(k())
{
    return [g, k](State s) {
        PairList<A> plst = g(s);
        auto lst2 = fmap([k](pair<A, State> const & p) {
            State s1 = p.second;
            auto ka = k();
            auto result = runStateList(ka, s1);
            return result;
        }, plst);
        return concatAll(lst2);
    };
}

Notice that, even though the value resulting from running the first argument to mthen is discarded, we still have to run it because we need the modified state. In a sense, we are running it only “for side effects,” although all these functions are pure and have no real side effects. Once again, we can have our purity cake and eat it too, simulating side effects with pure functions.

Guards

Every time we generate a candidate substitution, we have to check if it satisfies our constraints. That would be easy if we could access the substitution. But we are in a rather peculiar situation: We are writing code that deals with plans to produce substitutions. How do you test a plan? Obviously we need to write a plan to do the testing (haven’t we heard that before?).

Here’s the trick: Look at the implementation of mthen (or mbind for that matter). When is the continuation not executed? The continuation is not executed if the list passed to fmap is empty. Not executing the continuation is equivalent to aborting the computation. And what does mthen return in that case? An empty list!

So the way to abort a computation is to pass an argument to mthen that produces an empty list. Such a poison pill of a plan is produced by a function traditionally called mzero.

template<class A>
StateList<A> mzero()
{
    return[](State s) {
        return PairList<A>(); // empty list
    };
}

In functional programming, a monad that supports this functionality is called a “monad plus” (it also has a function called mplus). It so happens that the list monad and the state-list monad are both plus.

Now we need a function that produces a poison pill conditionally, so we can pass the result of this function to mthen. Remember, mthen ignores the individual results, but is sensitive to the size of the list. The function guard, on success, produces a discardable value — here, a nullptr — in a singleton list. This will trigger the execution of the continuation in mthen, while discarding the nullptr. On failure, guard produces mzero, which will abort the computation.

StateList<void*> guard(bool b)
{
    if (b) {
        return [](State s) {
            // singleton list with discardable value
            return List<pair<void*, State>>(make_pair(nullptr, s));
        };
    }
    else
        return mzero<void*>(); // empty list
}

The Client Side

Everything I’ve described so far is reusable code that should be put in a library. Equipped with the state-list monad library, all the client needs to do is to write a few utility functions and combine them into a solution.

Here’s the function that picks a number from a list. Actually, it picks a number from a list in all possible ways. It returns a list of all picks paired with the remainders of the list. This function is our non-deterministic selection plan.

template<class A>
PairList<A> select(List<A> lst)
{
    if (lst.isEmpty())
        return PairList<A>();

    A       x  = lst.front();
    List<A> xs = lst.popped_front();

    auto result = List<pair<A, State>>();
    forEach(select(xs), [x, &result](pair<A, List<A>> const & p)
    {
        A       y  = p.first;
        List<A> ys = p.second;
        auto y_xys = make_pair(y, ys.pushed_front(x));
        result = result.pushed_front(y_xys);
    });

    return result.pushed_front(make_pair(x, xs));
}

Here’s a little utility function that converts a list (actually, a vector) of digits to a number:

int asNumber(vector<int> const & v)
{
    int acc = 0;
    for (auto i : v)
        acc = 10 * acc + i;
    return acc;
}

And here’s the final solution to our puzzle:

StateList<tuple<int, int, int>> solve()
{
    StateList<int> sel = &select<int>;

    return mbind(sel, [=](int s) {
    return mbind(sel, [=](int e) {
    return mbind(sel, [=](int n) {
    return mbind(sel, [=](int d) {
    return mbind(sel, [=](int m) {
    return mbind(sel, [=](int o) {
    return mbind(sel, [=](int r) {
    return mbind(sel, [=](int y) {
        return mthen(guard(s != 0 && m != 0), [=]() {
            int send  = asNumber(vector<int>{s, e, n, d});
            int more  = asNumber(vector<int>{m, o, r, e});
            int money = asNumber(vector<int>{m, o, n, e, y});
            return mthen(guard(send + more == money), [=]() {
                return mreturn(make_tuple(send, more, money));
            });
        });
    });});});});});});});});
}

It starts by creating the basic plan called sel for picking numbers from a list. (The list is the state that will be provided later.) It binds this selection to a very large continuation that produces the final result — the plan to produce triples of numbers.

Here, sel is really a plan to produce a list, but the continuation expects just one number, s. If you look at the implementation of mbind, you’ll see that this continuation is called for every single pick, and the results are aggregated into one list.

The large continuation that is passed to the first mbind is itself implemented using mbind. This one, again, takes the plan sel. But we know that the state on which this sel will operate is one element shorter than the one before it. This second selection is passed to the next continuation, and so on.

Then the pruning begins: There is an mthen that takes a guard that aborts all computations where either s or m is zero. Then three numbers are formed from the selected digits. Notice that all those digits are captured by the enclosing lambdas by value (the [=] clauses). Yet another mthen takes a guard that checks the arithmetics and, if that one passes, the final plan to produce a singleton triple (send, more, money) is returned.

In theory, all these singletons would be concatenated on the way up to produce a list of solutions, but in reality this puzzle has only one solution. To get this solution, you run the plan with the list of digits:

List<int> lst{ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };
cout << evalStateList(solve(), lst);

The function evalStateList works the same way as runStateList except that it discards the leftover state. You can either implement it or use runStateList instead.

What’s Next?

One of the advantages of functional programming (other than thread safety) is that it makes refactoring really easy. Did you notice that 8 lines of code in our solution are almost identical? That’s outrageous! We have to do something about it. And while we’re at it, isn’t it obvious that a substitution should be represented as a map from characters to digits? I’ll show you how to do it in the next installment.

All this code and more is available on github.


This is the second part of the cycle “Using Monads in C++ to Solve Constraints” in which I’m illustrating functional techniques using the example of a simple puzzle:

  s e n d
+ m o r e
---------
m o n e y

Previously, I talked about using the list monad to search the breadth of the solution space.

What would you do if you won a lottery? Would you buy a sports car, drop your current job, and go on a trip across the United States? Maybe you would start your own company, multiply the money, and then buy a private jet?

We all like making plans, but they are often contingent on the state of our finances. Such plans can be described by functions. For instance, the car-buying plan is a function:

pair<Car, Cash> buyCar(Cash cashIn)

The input is some amount of Cash, and the output is a brand new Car and the leftover Cash (not necessarily a positive number!).

In general, a financial plan is a function that takes cash and returns the result paired with the new value of cash. It can be described generically using a template:

template<class A>
using Plan = function<pair<A, Cash>(Cash)>;

You can combine smaller plans to make bigger plans. For instance, you may use the leftover cash from your car purchase to finance your trip, or invest in a business, and so on.

There are some things that you already own, and you can trivially include them in your plans:

template<class A>
Plan<A> got_it(A a)
{
    return [a](Cash s) { return make_pair(a, s); };
}

What does all this daydreaming have to do with the solution to our puzzle? I mentioned previously that we needed to keep track of state, and this is how functional programmers deal with state. Instead of relying on side effects to silently modify the state, they write code that generates plans of action.

An imperative programmer, on the other hand, might implement the car-buying procedure by passing it a bank object, and the withdrawal of money would be a side effect of the purchase. Or, the horror!, the bank object could be global.

In functional programming, each individual plan is a function: The state comes in, and the new state goes out, paired with whatever value the function was supposed to produce in the first place. These little plans are aggregated into bigger plans. Finally, the master plan is executed — that’s when the actual state is passed in and the result comes out together with a new state. We can do the same in modern C++ using lambdas.

You might be familiar with a similar technique used with expression templates in C++. Expression templates form the basis of efficient implementation of matrix calculus, where expressions involving matrices and vectors are not evaluated on the spot but rather converted to parse trees. These trees may then be evaluated using more efficient techniques. You can think of an expression template as a plan for evaluating the final result.

The State Monad

SelectionsWithLists

To find the solution to our puzzle, we will generate substitutions by picking digits from a list of integers. This list of integers will be our state.

using State = List<int>;

We will be using a persistent list, so we won’t have to worry about backtracking. Persistent lists are never mutated — all their versions persist, and we can go back to them without fearing that they may have changed. We’ll need that when we combine our state calculations with the list monad to get the final solution. For now, we’ll just consider one substitution.

We’ll be making and combining all kinds of plans that rely on, and produce new state:

template<class A>
using Plan = function<pair<A, State>(State)>;

We can always run a plan, provided we have the state available:

template<class A>
pair<A, State> runPlan(Plan<A> pl, State s)
{
    return pl(s);
}

You may recall from the previous post that the essence of every monad is the ability to compose smaller items into bigger ones. In the case of the state monad we need the ability to compose plans of action.

For instance, suppose that you know how to generate a plan to travel across the United States on a budget, as long as you have a car. You don’t have a car though. Not to worry, you have a plan to get a car. It should be easy to generate a composite plan that involves buying the car and then continuing with your trip.

Notice the two ingredients: one is the plan to buy a car: Plan<Car>. The second ingredient is a function that takes a car and produces the plan for your trip, Plan<Trip>. This function is the continuation that leads to your final goal: it’s a function of the type Plan<Trip>(Car). And the continuation itself may in turn be a composite of many smaller plans.

So here’s the function mbind that binds a plan pl to a continuation k. The continuation uses the output of the plan pl to generate another plan. The function mbind is supposed to return a new composite plan, so it must return a lambda. Like any other plan, this lambda takes a state and returns a pair: value, state. We’ll implement this lambda for the most general case.

The logic is simple. Inside the lambda we will have the state available, so we can run the plan pl. We get back a pair: the value of type A and the new state. We pass this value to the continuation k and get back a new plan. Finally, we run that plan with the new state and we’re done.

template<class A, class F>
auto mbind(Plan<A> pl, F k) -> decltype(k(pl(State()).first))
{
    using B = decltype(k(pl(State()).first)(State()).first);
    // this should really be done with concepts
    static_assert(std::is_convertible<
        F, std::function<Plan<B>(A)>> ::value,
        "mbind requires a function type Plan<B>(A)");

    return [pl, k](State s) {
        pair<A, State> ps = runPlan(pl, s);
        Plan<B> plB = k(ps.first);
        return runPlan(plB, ps.second); // new state!
    };
}

Notice that all this running of plans inside mbind doesn’t happen immediately. It happens when the lambda is executed, which is when the larger plan is run (possibly as part of an even larger plan). So what mbind does is to create a new plan to be executed in the future.

And, as with every monad, there is a function that takes regular input and turns it into a trivial plan. I called it got_it before, but a more common name would be mreturn:

template<class A>
Plan<A> mreturn(A a)
{
    return [a](State s) { return make_pair(a, s); };
}

The state monad usually comes with two helper functions. The function getState gives direct access to the state by turning it into the return value:

Plan<State> getState()
{
    return [](State s) { return make_pair(s, s); };
}

Using getState you may examine the state when the plan is running, and dynamically select different branches of your plan. This makes monads very flexible, but it also makes the composition of different monads harder. We’ll see this in the next installment, when we compose the state monad with the list monad.

The second helper function is used to modify (completely replace) the state.

Plan<void*> putState(State newState)
{
    return [=](State s) { return make_pair(nullptr, newState); };
}

It doesn’t evaluate anything useful, so its return type is void* and the returned value is nullptr. Its only purpose is to encapsulate a side effect. What’s amazing is that you can do it and still preserve functional purity. There’s no better example of having your cake and eating it too.

Example

To demonstrate the state monad in action, let’s implement a tiny example. We start with a small plan that just picks the first number from a list (the list will be our state):

pair<int, List<int>> select(List<int> lst)
{
    int i = lst.front();
    return make_pair(i, lst.popped_front());
}

Persistent list method popped_front returns a list with its first element popped off. Typical of persistent data structures, this method doesn’t modify the original list. It doesn’t copy the list either — it just creates a pointer to its tail.

Here’s our first plan:

Plan<int> sel = &select;

Now we can create a more complex plan to produce a pair of integers:

Plan<pair<int, int>> pl = 
    mbind(sel, [=](int i) { return
    mbind(sel, [=](int j) { return
    mreturn(make_pair(i, j));
}); });

Let’s analyze this code. The first mbind takes a plan sel that selects the first element from a list (the list will be provided later, when we execute this plan). It binds it to the continuation (whose body I have grayed out) that takes the selected integer i and produces a plan to make a pair of integers. Here’s this continuation:

[=](int i) { return
    mbind(sel, [=](int j) { return
    mreturn(make_pair(i, j));
}); });

It binds the plan sel to another, smaller continuation that takes the selected element j and produces a plan to make a pair of integers. Here’s this smaller continuation:

[=](int j) { return
    mreturn(make_pair(i, j));
});

It combines the first integer i that was captured by the lambda, with the second integer j that is the argument to the lambda, and creates a trivial plan that returns a pair:

mreturn(make_pair(i, j));

Notice that we are using the same plan sel twice. But when this plan is executed inside of our final plan, it will return two different elements from the input list. When mbind is run, it first passes the state (the list of integers) to the first sel. It gets back the modified state — the list with the first element popped. It then uses this shorter list to execute the plan produced by the continuation. So the second sel gets the shortened list, and picks its first element, which is the second element of the original list. It, too, shortens the list, which is then passed to mreturn, which doesn’t modify it any more.

We can now run the final plan by giving it a list to pick from:

List<int> st{ 1, 2, 3 };
cout << runPlan(pl, st) << endl;

We are still not ready to solve the puzzle, but we are much closer. All we need is to combine the list monad with the state monad. I’m leaving this task for the next installment.

But here’s another look at the final solution:

StateL<tuple<int, int, int>> solve()
{
    StateL<int> sel = &select<int>;

    return mbind(sel, [=](int s) {
    return mbind(sel, [=](int e) {
    return mbind(sel, [=](int n) {
    return mbind(sel, [=](int d) {
    return mbind(sel, [=](int m) {
    return mbind(sel, [=](int o) {
    return mbind(sel, [=](int r) {
    return mbind(sel, [=](int y) {
        return mthen(guard(s != 0 && m != 0), [=]() {
            int send  = asNumber(vector<int>{s, e, n, d});
            int more  = asNumber(vector<int>{m, o, r, e});
            int money = asNumber(vector<int>{m, o, n, e, y});
            return mthen(guard(send + more == money), [=]() {
                return mreturn(make_tuple(send, more, money));
            });
        });
    });});});});});});});});
}

This time I didn’t rename mbind to for_each and mreturn to yield.

Now that you’re familiar with the state monad, you may appreciate the fact that the same sel passed as the argument to mbind will yield a different number every time, as long as the state list contains unique digits.

Before You Post a Comment

I know what you’re thinking: Why would I want to complicate my life with monads if there is a much simpler, imperative way of dealing with state? What’s the payoff of the functional approach? The immediate payoff is thread safety. In imperative programming mutable shared state is a never-ending source of concurrency bugs. The state monad and the use of persistent data structures eliminates the possibility of data races. And it does it without any need for synchronization (other than reference counting inside shared pointers).

I will freely admit that C++ is not the best language for functional programming, and that monadic code in C++ looks overly complicated. But, let’s face it, in C++ everything looks overly complicated. What I’m describing here are implementation details of something that should be encapsulated in an easy to use library. I’ll come back to it in the next installment of this mini-series.

Challenges

  1. Implement select from the example in the text using getState and putState.
  2. Implement evalPlan, a version of runPlan that only returns the final value, without the state.
  3. Implement mthen — a version of mbind where the continuation doesn’t take any arguments. It ignores the result of the Plan, which is the first argument to mthen (but it still runs it, and uses the modified state).
  4. Use the state monad to build a simple RPN calculator. The state is the stack (a List) of items:
    enum ItemType { Plus, Minus, Num };
    
    struct Item
    {
        ItemType _type;
        int _num;
        Item(int i) : _type(Num), _num(i) {}
        Item(ItemType t) : _type(t), _num(-1) {}
    };

    Implement the function calc() that creates a plan to evaluate an integer. Here’s the test code that should print -1:

    List<int> stack{ Item(Plus)
                   , Item(Minus)
                   , Item(4)
                   , Item(8)
                   , Item(3) };
    cout << evalPlan(calc(), stack) << endl;

    (The solution is available on github.)


I am sometimes asked by C++ programmers to give an example of a problem that can’t be solved without monads. This is the wrong kind of question — it’s like asking if there is a problem that can’t be solved without for loops. Obviously, if your language supports a goto, you can live without for loops. What monads (and for loops) can do for you is to help you structure your code. The use of loops and if statements lets you convert spaghetti code into structured code. Similarly, the use of monads lets you convert imperative code into declarative code. These are the kind of transformations that make code easier to write, understand, maintain, and generalize.

So here’s a problem that you may get as an interview question. It’s a small problem, so the advantages of various approaches might not be immediately obvious, especially if you’ve been trained all your life in imperative programming, and you are seeing monads for the first time.

You’re supposed write a program to solve this puzzle:

  s e n d
+ m o r e
---------
m o n e y

Each letter correspond to a different digit between 0 and 9. Before you continue reading this post, try to think about how you would approach this problem.

The Analysis

It never hurts to impress your interviewer with your general knowledge by correctly classifying the problem. This one belongs to the class of “constraint satisfaction problems.” The obvious constraint is that the numbers obtained by substituting letters with digits have to add up correctly. There are also some less obvious constraints, namely the numbers should not start with zero.

If you were to solve this problem using pencil and paper, you would probably come up with lots of heuristics. For instance, you would deduce that m must stand for 1 because that’s the largest possible carry from the addition of two digits (even if there is a carry from the previous column). Then you’d figure out that s must be either 8 or 9 to produce this carry, and so on. Given enough time, you could probably write an expert system with a large set of rules that could solve this and similar problems. (Mentioning an expert system could earn you extra points with the interviewer.)

However, the small size of the problem suggests that a simple brute force approach is probably best. The interviewer might ask you to estimate the number of possible substitutions, which is 10!/(10 – 8)! or roughly 2 million. That’s not a lot. So, really, the solution boils down to generating all those substitutions and testing the constraints for each.

The Straightforward Solution

The mind of an imperative programmer immediately sees the solution as a set of 8 nested loops (there are 8 unique letters in the problem: s, e, n, d, m, o, r, y). Something like this:

for (int s = 0; s < 10; ++s)
    for (int e = 0; e < 10; ++e)
        for (int n = 0; n < 10; ++n)
            for (int d = 0; d < 10; ++d)
                ...

and so on, until y. But then there is the condition that the digits have to be different, so you have to insert a bunch of tests like:

e != s
n != s && n != e
d != s && d != e && d != n

and so on, the last one involving 7 inequalities… Effectively you have replaced the uniqueness condition with 28 new constraints.

This would probably get you through the interview at Microsoft, Google, or Facebook, but really, can’t you do better than that?

The Smart Solution

Before I proceed, I should mention that what follows is almost a direct translation of a Haskell program from the blog post by Justin Le. I strongly encourage everybody to learn some Haskell, but in the meanwhile I’ll be happy to serve as your translator.

The problem with our naive solution is the 28 additional constraints. Well, I guess one could live with that — except that this is just a tiny example of a whole range of constraint satisfaction problems, and it makes sense to figure out a more general approach.

The problem can actually be formulated as a superposition of two separate concerns. One deals with the depth and the other with the breadth of the search for solutions.

Let me touch on the depth issue first. Let’s consider the problem of creating just one substitution of letters with numbers. This could be described as picking 8 digits from a list of 0, 1, …9, one at a time. Once a digit is picked, it’s no longer in the list. We don’t want to hard code the list, so we’ll make it a parameter to our algorithm. Notice that this approach works even if the list contains duplicates, or if the list elements are not easily comparable for equality (for instance, if they are futures). We’ll discuss the list-picking part of the problem in more detail later.

Now let’s talk about breadth: we have to repeat the above process for all possible picks. This is what the 8 nested loops were doing. Except that now we are in trouble because each individual pick is destructive. It removes items from the list — it mutates the list. This is a well known problem when searching through solution spaces, and the standard remedy is called backtracking. Once you have processed a particular candidate, you put the elements back in the list, and try the next one. Which means that you have to keep track of your state, either implicitly on your function’s stack, or in a separate explicit data structure.

Wait a moment! Weren’t we supposed to talk about functional programming? So what’s all this talk about mutation and state? Well, who said you can’t have state in functional programming? Functional programmers have been using the state monad since time immemorial. And mutation is not an issue if you’re using persistent data structures. So fasten your seat belts and make sure your folding trays are in the upright position.

The List Monad

We’ll start with a small refresher in quantum mechanics. As you may remember from school, quantum processes are non-deterministic. You may repeat the same experiment many times and every time get a different result. There is a very interesting view of quantum mechanics called the many-worlds interpretation, in which every experiment gives rise to multiple alternate histories. So if the spin of an electron may be measured as either up or down, there will be one universe in which it’s up, and one in which it’s down.

Selections

We’ll use the same approach to solving our puzzle. We’ll create an alternate universe for each digit substitution for a given letter. So we’ll start with 10 universes for the letter s; then we’ll split each of them into ten universes for the letter e, and so on. Of course, most of these universes won’t yield the desired result, so we’ll have to destroy them. I know, it seems kind of wasteful, but in functional programming it’s easy come, easy go. The creation of a new universe is relatively cheap. That’s because new universes are not that different from their parent universes, and they can share almost all of their data. That’s the idea behind persistent data structures. These are the immutable data structures that are “mutated” by cloning. A cloned data structure shares most of its implementation with the parent, except for a small delta. We’ll be using persistent lists described in my earlier post.

Once you internalize the many-worlds approach to programming, the implementation is pretty straightforward. First, we need functions that generate new worlds. Since we are cheap, we’ll only generate the parts that are different. So what’s the difference between all the worlds that we get when selecting the substitution for the letter s? Just the number that we assign to s. There are ten worlds corresponding to the ten possible digits (we’ll deal with the constraints like s being different from zero later). So all we need is a function that generates a list of ten digits. These are our ten universes in a nutshell. They share everything else.

Once you are in an alternate universe, you have to continue with your life. In functional programming, the rest of your life is just a function called a continuation. I know it sounds like a horrible simplification. All your actions, emotions, and hopes reduced to just one function. Well, maybe the continuation just describes one aspect of your life, the computational part, and you can still hold on to our emotions.

So what do our lives look like, and what do they produce? The input is the universe we’re in, in particular the one number that was picked for us. But since we live in a quantum universe, the outcome is a multitude of universes. So a continuation takes a number, and produces a list. It doesn’t have to be a list of numbers, just a list of whatever characterizes the differences between alternate universes. In particular, it could be a list of different solutions to our puzzle — triples of numbers corresponding to “send”, “more”, and “money”. (There is actually only one solution, but that’s beside the point.)

And what’s the very essence of this new approach? It’s the binding of the selection of the universes to the continuation. That’s where the action is. This binding, again, can be expressed as a function. It’s a function that takes a list of universes and a continuation that produces a list of universes. It returns an even bigger list of universes. We’ll call this function for_each, and we’ll make it as generic as possible. We won’t assume anything about the type of the universes that are passed in, or the type of the universes that the continuation k produces. We’ll also make the type of the continuation a template parameter and extract the return type from it using auto and decltype:

template<class A, class F>
auto for_each(List<A> lst, F k) -> decltype(k(lst.front()))
{
    using B = decltype(k(lst.front()).front());
    // This should really be expressed using concepts
    static_assert(std::is_convertible<
        F, std::function<List<B>(A)>>::value,
        "for_each requires a function type List<B>(A)");

    List<List<B>> lstLst = fmap(k, lst);
    return concatAll(lstLst);
}

The function fmap is similar to std::transform. It applies the continuation k to every element of the list lst. Because k itself produces a list, the result is a list of lists, lstLst. The function concatAll concatenates all those lists into one big list.

Congratulations! You have just seen a monad. This one is called the list monad and it’s used to model non-deterministic processes. The monad is actually defined by two functions. One of them is for_each, and here’s the other one:

template<class A>
List<A> yield(A a)
{
    return List<A> (a);
}

It’s a function that returns a singleton list. We use yield when we are done multiplying universes and we just want to return a single value. We use it to create a single-valued continuation. It represents the lonesome boring life, devoid of any choices.

I will later rename these functions to mbind and mreturn, because they are part of any monad, not just the list monad.

The names like for_each or yield have a very imperative ring to them. That’s because, in functional programming, monadic code plays a role similar to imperative code. But neither for_each nor yield are control structures — they are functions. In particular for_each, which sounds and works like a loop, is just a higher order function; and so is fmap, which is used in its implementation. Of course, at some level the code becomes imperative — fmap can either be implemented recursively or using an actual loop — but the top levels are just declarations of functions. Hence, declarative programming.

There is a slight difference between a loop and a function on lists like for_each: for_each takes a whole list as an argument, while a loop might generate individual items — in this case integers — on the fly. This is not a problem in a lazy functional language like Haskell, where a list is evaluated on demand. The same behavior may be implemented in C++ using streams or lazy ranges. I won’t use it here, since the lists we are dealing with are short, but you can read more about it in my earlier post Getting Lazy with C++.

We are not ready yet to implement the solution to our puzzle, but I’d like to give you a glimpse of what it looks like. For now, think of StateL as just a list. See if it starts making sense (I grayed out the usual C++ noise):

StateL<tuple<int, int, int>> solve()
{
    StateL<int> sel = &select<int>;

    return for_each(sel, [=](int s) {
    return for_each(sel, [=](int e) {
    return for_each(sel, [=](int n) {
    return for_each(sel, [=](int d) {
    return for_each(sel, [=](int m) {
    return for_each(sel, [=](int o) {
    return for_each(sel, [=](int r) {
    return for_each(sel, [=](int y) {
        return yield_if(s != 0 && m != 0, [=]() {
            int send  = asNumber(vector{s, e, n, d});
            int more  = asNumber(vector{m, o, r, e});
            int money = asNumber(vector{m, o, n, e, y});
            return yield_if(send + more == money, [=]() {
                return yield(make_tuple(send, more, money));
            });
        });
    });});});});});});});});
}

The first for_each takes a selection of integers, sel, (never mind how we deal with uniqueness); and a continuation, a lambda, that takes one integer, s, and produces a list of solutions — tuples of three integers. This continuation, in turn, calls for_each with a selection for the next letter, e, and another continuation that returns a list of solutions, and so on.

The innermost continuation is a conditional version of yield called yield_if. It checks a condition and produces a zero- or one-element list of solutions. Internally, it calls another yield_if, which calls the ultimate yield. If that final yield is called (and it might not be, if one of the previous conditions fails), it will produce a solution — a triple of numbers. If there is more than one solution, these singleton lists will get concatenated inside for_each while percolating to the top.

In the second part of this post I will come back to the problem of picking unique numbers and introduce the state monad. You can also have a peek at the code on github.

Challenges

  1. Implement for_each and yield for a vector instead of a List. Use the Standard Library transform instead of fmap.
  2. Using the list monad (or your vector monad), write a function that generates all positions on a chessboard as pairs of characters between 'a' and 'h' and numbers between 1 and 8.
  3. Implement a version of for_each (call it repeat) that takes a continuation k of the type function<List<B>()> (notice the void argument). The function repeat calls k for each element of the list lst, but it ignores the element itself.