This is part 4 of the miniseries about solving a simple constraint-satisfaction problem:
  s e n d
+ m o r e
---------
m o n e y

using monads in C++. Previously: The Tale of Two Monads. To start from the beginning, go to Using Monads in C++ to Solve Constraints: 1.
In the previous post I showed some very general programming techniques based on functional data structures and monads. A lot of the code I wrote to solve this particular arithmetic puzzle is easily reusable. In fact the monadic approach is perfect for constructing libraries. I talked about it when discussing C++ futures and ranges. A monad is a pure expression of composability, and composability is the most important feature of any library.
You would be justified to think that rolling out a monad library just to solve a simple arithmetic puzzle is overkill. But I’m sure you can imagine a lot of other, more complex problems that could be solved using the same techniques. You might also fear that such functional methods — especially when implemented in C++, which is not optimized for this kind of programming — would lead to less performant code. This doesn’t matter too much if you are solving a one-shot problem, and the time you spend developing and debugging your code dwarfs the time it takes the program to complete execution. But even if performance were an issue and you were faced with a larger problem, functional code could be parallelized much easier than its imperative counterpart with no danger of concurrency bugs. So you might actually get better performance from functional code by running it in parallel.
Refactoring
In this installment I’d like to talk about something that a lot of functional programmers swear by: Functional programs are amazingly easy to refactor. Anybody who has tried to refactor C++ code can testify to how hard it is, and how long it takes to iron out all the bugs introduced by refactoring. With functional code, it’s a breeze. So let’s have another look at the code from the previous post and do some deep surgery on it. This is our starting point:
StateList<tuple<int, int, int>> solve()
{
    StateList<int> sel = &select<int>;

    return mbind(sel, [=](int s) {
    return mbind(sel, [=](int e) {
    return mbind(sel, [=](int n) {
    return mbind(sel, [=](int d) {
    return mbind(sel, [=](int m) {
    return mbind(sel, [=](int o) {
    return mbind(sel, [=](int r) {
    return mbind(sel, [=](int y) {
        return mthen(guard(s != 0 && m != 0), [=]() {
            int send  = asNumber(vector<int>{s, e, n, d});
            int more  = asNumber(vector<int>{m, o, r, e});
            int money = asNumber(vector<int>{m, o, n, e, y});
            return mthen(guard(send + more == money), [=]() {
                return mreturn(make_tuple(send, more, money));
            });
        });
    });});});});});});});});
}
What immediately stands out is the amount of repetition: eight lines of code look almost exactly the same. Conceptually, these lines correspond to the eight nested loops that would be used in the imperative solution. The question is, how can we abstract over control structures such as loops? In our monadic implementation the loops have been replaced with higher-order functions, and that opens up a path toward even further abstraction.
Reifying Substitutions
The first observation is that we are missing a data structure that should correspond to the central concept we use when describing the solution — the substitution. We are substituting numbers for letters, but we only see those letters as names of variables. A more natural implementation would use a map data structure, mapping characters to integers.
Notice, however, that this mapping has to be dynamically collected and torn down as we are exploring successive branches of the solution space. The imperative approach would demand backtracking. The functional approach makes use of persistent data structures. I have described such data structures previously, in particular a persistent red-black tree. It can be easily extended to a red-black map. You can find the implementation on github.
A red-black map is a generic data structure, which we’ll specialize for our use:
using Map = RBMap<char, int>;
A map lookup may fail, so it should either be implemented to return an optional, or use a default value in case of a failure. In our case we know that we never look up a value that's not in the map (unless our code is buggy), so we can use a helper function that never fails:
int get(Map const & subst, char c)
{
    return subst.findWithDefault(-1, c);
}
Once we have objectified the substitution as a map, we can convert a whole string of characters to a number in one operation:
int toNumber(Map const & subst, string const & str)
{
    int acc = 0;
    for (char c : str)
    {
        int d = get(subst, c);
        acc = 10 * acc + d;
    }
    return acc;
}
There is one more thing that we may automate: finding the set of unique characters in the puzzle. Previously, we just figured out that they were s, e, n, d, m, o, r, y and hard-coded them into the solution. Now we can use a function, fashioned after a Haskell utility called nub:
string nub(string const & str)
{
    string result;
    for (char c : str)
    {
        if (result.find(c) == string::npos)
            result.push_back(c);
    }
    return result;
}
Don't worry about the quadratic running time of nub: it's only called once.
Recursion
With those preliminaries out of the way, let's concentrate on analyzing the code. We have a series of nested mbind calls. The simplest thing is to turn these nested calls into recursion. This involves setting up the first recursive call, implementing the recursive function, and writing the code to terminate the recursion.
The main observation is that each level of nesting accomplishes one substitution of a character by a number. The recursion should end when we run out of characters to substitute.
Let’s first think about what data has to be passed down in a recursive call. One item is the string of characters that still need substituting. The second is the substitution map. While the first argument keeps shrinking, the second one keeps growing. The third argument is the number to be used for the next substitution.
Before we make the first recursive call, we have to prepare the initial arguments. We get the string of characters to be substituted by running nub over the text of our puzzle:
nub("sendmoremoney")
The second argument should start as an empty map. The third argument, the digit to be used for the next substitution, comes from binding the selection plan select<int> to a lambda that will call our recursive function. In Haskell, it's common to call the auxiliary recursive function go, so that's what we'll do here as well:
StateList<tuple<int, int, int>> solve()
{
    StateList<int> sel = &select<int>;
    Map subst;
    return mbind(sel, [=](int s) {
        return go(nub("sendmoremoney"), subst, s);
    });
}
When implementing go, we have to think about terminating the recursion. But first, let's talk about how to continue it.

We are called with the next digit to be used for substitution, so we should perform this substitution. This means inserting a key/value pair into our substitution map subst. The key is the first character of the string str. The value is the digit i that we were given:
subst.inserted(str[0], i)
Notice that, because the map is immutable, the method inserted returns a new map with one more entry. This is a persistent data structure, so the new map shares most of its data with its previous version. Creating the new version takes logarithmic time in the number of entries (just as it does with the regular balanced tree used in the Standard Library's std::map). The advantage of using a persistent map is that we don't have to do any backtracking: the unmodified version is still available after the return from the recursive call.
The recursive call to go expects three arguments: (1) the string of characters yet to be substituted, which will be the tail of the string we were passed, (2) the new substitution map, and (3) the new digit n. We get this new digit by binding the digit selection sel to a lambda that makes the recursive call:
mbind(sel, [=](int n) {
    string tail(&str[1], &str[str.length()]);
    return go(tail, subst.inserted(str[0], i), n);
});
Now let's consider the termination condition. Since go was called with the next digit to be substituted, we need at least one character in the string for this last substitution. So termination is reached when the length of the string reaches one. We have to make the last substitution and then call the final function that prunes the bad substitutions. Here's the complete implementation of go:
StateList<tuple<int, int, int>> go(string str, Map subst, int i)
{
    StateList<int> sel = &select<int>;
    assert(str.length() > 0);
    if (str.length() == 1)
        return prune(subst.inserted(str[0], i));
    else
    {
        return mbind(sel, [=](int n) {
            string tail(&str[1], &str[str.length()]);
            return go(tail, subst.inserted(str[0], i), n);
        });
    }
}
The function prune is lifted almost literally from the original implementation. The only difference is that we are now using a substitution map:
StateList<tuple<int, int, int>> prune(Map subst)
{
    return mthen(guard(get(subst, 's') != 0 && get(subst, 'm') != 0), [=]() {
        int send  = toNumber(subst, "send");
        int more  = toNumber(subst, "more");
        int money = toNumber(subst, "money");
        return mthen(guard(send + more == money), [=]() {
            return mreturn(make_tuple(send, more, money));
        });
    });
}
The top-level code that starts the chain of events and displays the solution is left unchanged:
int main()
{
    List<int> lst{ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };
    cout << evalStateList(solve(), lst);
    return 0;
}
It's a matter of opinion whether the refactored code is simpler or not. It's definitely easier to generalize and it's less error prone. But the main point is how easy the refactoring was. The reason is that, in functional code, there are no hidden dependencies, because there are no hidden side effects. What would usually be considered a side effect in imperative code is accomplished using pure functions and monadic binding. There is no way to forget that a given function works with state, because a function that deals with state has the state specified in its signature: it returns a StateList. And it doesn't itself modify the state. In fact it doesn't even have access to the state. It just composes functions that will modify the state when executed. That's why we were able to move these functions around so easily.
A Word About Syntax
Monadic code in C++ is artificially complex. That's because C++ was not designed for functional programming. The same program written in Haskell is much shorter. Here it is, complete with the helper functions prune, get, and toNumber:
solve = StateL select >>= go (nub "sendmoremoney") M.empty
  where
    go [c] subst i = prune (M.insert c i subst)
    go (c:cs) subst i =
        StateL select >>= go cs (M.insert c i subst)
    prune subst = do
        guard (get 's' /= 0 && get 'm' /= 0)
        let send  = toNumber "send"
            more  = toNumber "more"
            money = toNumber "money"
        guard (send + more == money)
        return (send, more, money)
      where
        get c = fromJust (M.lookup c subst)
        toNumber str = asNumber (map get str)
        asNumber = foldl (\t u -> t*10 + u) 0
main = print $ evalStateL solve [0..9]
Some of the simplification comes from using the operator >>= in place of mbind, and from Haskell's terser lambda syntax. But there is also built-in syntactic sugar for chains of monadic bindings in the form of the do notation. You can see an example of it in the implementation of prune.
The funny thing is that C++ is on the verge of introducing something equivalent to do notation called resumable functions. There is a very strong proposal for resumable functions in connection with the future monad, dealing with asynchronous functions. There is talk of using it with generators, or lazy ranges, which are a version of the list monad. But there is still a need for a push towards generalizing resumable functions to user-defined monads, like the one I described in this series. It just doesn't make sense to add separate language extensions for each aspect of the same general pattern. This pattern has been known in functional programming for a long time and it's called a monad. It's a very useful and versatile pattern whose importance has been steadily growing as the complexity of software projects increases.
Bibliography
- The code for this blog is available on github.
- Douglas M. Auclair, MonadPlus: What a Super Monad!, The Monad.Reader Issue 11, August 25, 2008. It contains the Haskell implementation of the solution to this puzzle.
- Justin Le, Unique sample drawing & searches with List and StateT — “Send more money”. It was the blog post that inspired this series.
May 26, 2015 at 1:06 pm
It would be great if you explained how to parallelize this example program Bartosz. I know it’s a toy example and would probably be slower when running in parallel, but it would be a good learning experience nonetheless.
May 27, 2015 at 3:56 pm
This can be written in C++ more idiomatically, efficiently, and simply by using std::sort then std::unique.
May 27, 2015 at 10:37 pm
I always enjoy reading your posts!
Here is a Java port of the solution: http://sebastian-millies.blogspot.de/2015/05/recursive-parallel-search.html
The Java solution uses a persistent collections framework in order to enable parallel recursive search. Is there something similar for C++?
A drawback of the solution is that we generate substitutions for all the letters, and then check if those substitutions constitute a solution. However, some of those substitutions could be rejected out-of-hand. Perhaps it would be possible to examine the letters as they occur from right to left in the operands, and interleave the checking of arithmetic constraints with the generation of substitutions.
Do you know (and intend to show) how that kind of thing could be done with monads?
May 29, 2015 at 4:34 pm
@Tom: Have a look at Sebastian’s Java code. He did parallelize the solution.
May 29, 2015 at 4:49 pm
@Sebastian: There is no official persistent collections library in C++. That’s why I implemented my own and used it in my solution.
The interleaving of constraint checking with the substitution generation is possible. You can see some of it in Justin’s Le blog, where he tests for the starting digits being different from zero up front, rather than later.
What you’re proposing is slightly more complicated. It would require encoding the rules of long addition. If you went by columns, from least significant to most significant digit, every third substitution would be totally determined by previous substitutions and the rules of addition. You should try implementing it 😉
May 31, 2015 at 10:33 am
@Bartosz, wouldn't it be more generic if you checked for str.length() == 0 and did not duplicate the subst.inserted call? That way the prune function would just accept a subst. What do you think?
June 3, 2015 at 11:29 am
I did implement it. Here’s a parallel, recursive, shallow-backtracking solution in Java, using the stream monad (flatMap) and persistent collections: http://sebastian-millies.blogspot.de/2015/06/recursive-parallel-search-with-shallow.html
The solution is general and fast (25 times faster than the parallel exhaustive search approach). Comments welcome. In particular, I’d be curious to see a Haskell equivalent.
June 5, 2015 at 7:19 pm
@Ivan Kurnosov: As the recursion is written now, I have to do something special for str.length() == 1. I can't call another sel in that case (the else clause). It might be possible to rewrite the recursion, so that the result of sel is not passed to go and is dealt with before calling it. But I haven't explored that option, so I don't know if it can be done.
June 6, 2015 at 12:06 am
@Bartosz, it actually does not require anything special.
I implemented it in JS in that way: https://github.com/zerkms/js-solver-list-state-monads/blob/master/index.js#L42 and from my perspective it looks clearer (since there is no duplication) for subst.inserted(str[0], i).
June 6, 2015 at 12:48 am
@Sebastian: The shallow backtracking solution in Haskell is pretty straightforward. I didn’t try parallelizing it, but it’s already blindingly fast. Normally one would do it with monad transformers, but I kept the StateL monad from my post — just modified the definition of State to include the map:
This allowed me to combine selection with lookup (do the selection only when the lookup fails):
Here’s the straight lookup:
And this is the general solution:
BTW, your program might have a problem with “pink” + “brown” = “ivory” because “pink” is shorter than “brown”.
June 6, 2015 at 3:35 pm
@Sebastian: I posted a more complete Haskell program that solves additions of more than two strings at FP Complete.
July 12, 2015 at 1:36 am
Hey, couldn't find anywhere else to post this. There is a typo on p. 249 of cppia: in assign_direct, ptr should be p, I think. Thanks for a great book!
August 3, 2016 at 9:47 pm
I am late to the party here, but thank you for this. I am one of the ‘left behind’ when I tried using more academic texts. This is already very difficult for me, but workable thanks to your rigor and clarity.
What do you think about trying the exercise in Rust playing off of your C++, or should I just bite the bullet and learn Haskell? Thanks!