I am sometimes asked by C++ programmers to give an example of a problem that can’t be solved without monads. This is the wrong kind of question — it’s like asking if there is a problem that can’t be solved without for
loops. Obviously, if your language supports a goto
, you can live without for
loops. What monads (and for loops) can do for you is to help you structure your code. The use of loops and if statements lets you convert spaghetti code into structured code. Similarly, the use of monads lets you convert imperative code into declarative code. These are the kind of transformations that make code easier to write, understand, maintain, and generalize.
So here’s a problem that you may get as an interview question. It’s a small problem, so the advantages of various approaches might not be immediately obvious, especially if you’ve been trained all your life in imperative programming, and you are seeing monads for the first time.
You’re supposed write a program to solve this puzzle:
s e n d + m o r e --------- m o n e y
Each letter correspond to a different digit between 0 and 9. Before you continue reading this post, try to think about how you would approach this problem.
The Analysis
It never hurts to impress your interviewer with your general knowledge by correctly classifying the problem. This one belongs to the class of “constraint satisfaction problems.” The obvious constraint is that the numbers obtained by substituting letters with digits have to add up correctly. There are also some less obvious constraints, namely the numbers should not start with zero.
If you were to solve this problem using pencil and paper, you would probably come up with lots of heuristics. For instance, you would deduce that m
must stand for 1 because that’s the largest possible carry from the addition of two digits (even if there is a carry from the previous column). Then you’d figure out that s
must be either 8 or 9 to produce this carry, and so on. Given enough time, you could probably write an expert system with a large set of rules that could solve this and similar problems. (Mentioning an expert system could earn you extra points with the interviewer.)
However, the small size of the problem suggests that a simple brute force approach is probably best. The interviewer might ask you to estimate the number of possible substitutions, which is 10!/(10 – 8)! or roughly 2 million. That’s not a lot. So, really, the solution boils down to generating all those substitutions and testing the constraints for each.
The Straightforward Solution
The mind of an imperative programmer immediately sees the solution as a set of 8 nested loops (there are 8 unique letters in the problem: s, e, n, d, m, o, r, y). Something like this:
for (int s = 0; s < 10; ++s) for (int e = 0; e < 10; ++e) for (int n = 0; n < 10; ++n) for (int d = 0; d < 10; ++d) ...
and so on, until y
. But then there is the condition that the digits have to be different, so you have to insert a bunch of tests like:
e != s n != s && n != e d != s && d != e && d != n
and so on, the last one involving 7 inequalities… Effectively you have replaced the uniqueness condition with 28 new constraints.
This would probably get you through the interview at Microsoft, Google, or Facebook, but really, can’t you do better than that?
The Smart Solution
Before I proceed, I should mention that what follows is almost a direct translation of a Haskell program from the blog post by Justin Le. I strongly encourage everybody to learn some Haskell, but in the meanwhile I’ll be happy to serve as your translator.
The problem with our naive solution is the 28 additional constraints. Well, I guess one could live with that — except that this is just a tiny example of a whole range of constraint satisfaction problems, and it makes sense to figure out a more general approach.
The problem can actually be formulated as a superposition of two separate concerns. One deals with the depth and the other with the breadth of the search for solutions.
Let me touch on the depth issue first. Let’s consider the problem of creating just one substitution of letters with numbers. This could be described as picking 8 digits from a list of 0, 1, …9, one at a time. Once a digit is picked, it’s no longer in the list. We don’t want to hard code the list, so we’ll make it a parameter to our algorithm. Notice that this approach works even if the list contains duplicates, or if the list elements are not easily comparable for equality (for instance, if they are futures). We’ll discuss the list-picking part of the problem in more detail later.
Now let’s talk about breadth: we have to repeat the above process for all possible picks. This is what the 8 nested loops were doing. Except that now we are in trouble because each individual pick is destructive. It removes items from the list — it mutates the list. This is a well known problem when searching through solution spaces, and the standard remedy is called backtracking. Once you have processed a particular candidate, you put the elements back in the list, and try the next one. Which means that you have to keep track of your state, either implicitly on your function’s stack, or in a separate explicit data structure.
Wait a moment! Weren’t we supposed to talk about functional programming? So what’s all this talk about mutation and state? Well, who said you can’t have state in functional programming? Functional programmers have been using the state monad since time immemorial. And mutation is not an issue if you’re using persistent data structures. So fasten your seat belts and make sure your folding trays are in the upright position.
The List Monad
We’ll start with a small refresher in quantum mechanics. As you may remember from school, quantum processes are non-deterministic. You may repeat the same experiment many times and every time get a different result. There is a very interesting view of quantum mechanics called the many-worlds interpretation, in which every experiment gives rise to multiple alternate histories. So if the spin of an electron may be measured as either up or down, there will be one universe in which it’s up, and one in which it’s down.
We’ll use the same approach to solving our puzzle. We’ll create an alternate universe for each digit substitution for a given letter. So we’ll start with 10 universes for the letter s
; then we’ll split each of them into ten universes for the letter e
, and so on. Of course, most of these universes won’t yield the desired result, so we’ll have to destroy them. I know, it seems kind of wasteful, but in functional programming it’s easy come, easy go. The creation of a new universe is relatively cheap. That’s because new universes are not that different from their parent universes, and they can share almost all of their data. That’s the idea behind persistent data structures. These are the immutable data structures that are “mutated” by cloning. A cloned data structure shares most of its implementation with the parent, except for a small delta. We’ll be using persistent lists described in my earlier post.
Once you internalize the many-worlds approach to programming, the implementation is pretty straightforward. First, we need functions that generate new worlds. Since we are cheap, we’ll only generate the parts that are different. So what’s the difference between all the worlds that we get when selecting the substitution for the letter s
? Just the number that we assign to s
. There are ten worlds corresponding to the ten possible digits (we’ll deal with the constraints like s
being different from zero later). So all we need is a function that generates a list of ten digits. These are our ten universes in a nutshell. They share everything else.
Once you are in an alternate universe, you have to continue with your life. In functional programming, the rest of your life is just a function called a continuation. I know it sounds like a horrible simplification. All your actions, emotions, and hopes reduced to just one function. Well, maybe the continuation just describes one aspect of your life, the computational part, and you can still hold on to our emotions.
So what do our lives look like, and what do they produce? The input is the universe we’re in, in particular the one number that was picked for us. But since we live in a quantum universe, the outcome is a multitude of universes. So a continuation takes a number, and produces a list. It doesn’t have to be a list of numbers, just a list of whatever characterizes the differences between alternate universes. In particular, it could be a list of different solutions to our puzzle — triples of numbers corresponding to “send”, “more”, and “money”. (There is actually only one solution, but that’s beside the point.)
And what’s the very essence of this new approach? It’s the binding of the selection of the universes to the continuation. That’s where the action is. This binding, again, can be expressed as a function. It’s a function that takes a list of universes and a continuation that produces a list of universes. It returns an even bigger list of universes. We’ll call this function for_each
, and we’ll make it as generic as possible. We won’t assume anything about the type of the universes that are passed in, or the type of the universes that the continuation k
produces. We’ll also make the type of the continuation a template parameter and extract the return type from it using auto
and decltype
:
template<class A, class F> auto for_each(List<A> lst, F k) -> decltype(k(lst.front())) { using B = decltype(k(lst.front()).front()); // This should really be expressed using concepts static_assert(std::is_convertible< F, std::function<List<B>(A)>>::value, "for_each requires a function type List<B>(A)"); List<List<B>> lstLst = fmap(k, lst); return concatAll(lstLst); }
The function fmap
is similar to std::transform
. It applies the continuation k
to every element of the list lst
. Because k
itself produces a list, the result is a list of lists, lstLst
. The function concatAll
concatenates all those lists into one big list.
Congratulations! You have just seen a monad. This one is called the list monad and it’s used to model non-deterministic processes. The monad is actually defined by two functions. One of them is for_each
, and here’s the other one:
template<class A> List<A> yield(A a) { return List<A> (a); }
It’s a function that returns a singleton list. We use yield
when we are done multiplying universes and we just want to return a single value. We use it to create a single-valued continuation. It represents the lonesome boring life, devoid of any choices.
I will later rename these functions to mbind
and mreturn
, because they are part of any monad, not just the list monad.
The names like for_each
or yield
have a very imperative ring to them. That’s because, in functional programming, monadic code plays a role similar to imperative code. But neither for_each
nor yield
are control structures — they are functions. In particular for_each
, which sounds and works like a loop, is just a higher order function; and so is fmap
, which is used in its implementation. Of course, at some level the code becomes imperative — fmap
can either be implemented recursively or using an actual loop — but the top levels are just declarations of functions. Hence, declarative programming.
There is a slight difference between a loop and a function on lists like for_each
: for_each
takes a whole list as an argument, while a loop might generate individual items — in this case integers — on the fly. This is not a problem in a lazy functional language like Haskell, where a list is evaluated on demand. The same behavior may be implemented in C++ using streams or lazy ranges. I won’t use it here, since the lists we are dealing with are short, but you can read more about it in my earlier post Getting Lazy with C++.
We are not ready yet to implement the solution to our puzzle, but I’d like to give you a glimpse of what it looks like. For now, think of StateL
as just a list. See if it starts making sense (I grayed out the usual C++ noise):
StateL<tuple<int, int, int>> solve() { StateL<int> sel = &select<int>; return for_each(sel, [=](int s) { return for_each(sel, [=](int e) { return for_each(sel, [=](int n) { return for_each(sel, [=](int d) { return for_each(sel, [=](int m) { return for_each(sel, [=](int o) { return for_each(sel, [=](int r) { return for_each(sel, [=](int y) { return yield_if(s != 0 && m != 0, [=]() { int send = asNumber(vector{s, e, n, d}); int more = asNumber(vector{m, o, r, e}); int money = asNumber(vector{m, o, n, e, y}); return yield_if(send + more == money, [=]() { return yield(make_tuple(send, more, money)); }); }); });});});});});});});}); }
The first for_each
takes a selection of integers, sel
, (never mind how we deal with uniqueness); and a continuation, a lambda, that takes one integer, s
, and produces a list of solutions — tuples of three integers. This continuation, in turn, calls for_each
with a selection for the next letter, e
, and another continuation that returns a list of solutions, and so on.
The innermost continuation is a conditional version of yield
called yield_if
. It checks a condition and produces a zero- or one-element list of solutions. Internally, it calls another yield_if
, which calls the ultimate yield
. If that final yield
is called (and it might not be, if one of the previous conditions fails), it will produce a solution — a triple of numbers. If there is more than one solution, these singleton lists will get concatenated inside for_each
while percolating to the top.
In the second part of this post I will come back to the problem of picking unique numbers and introduce the state monad. You can also have a peek at the code on github.
Challenges
- Implement
for_each
andyield
for avector
instead of aList
. Use the Standard Librarytransform
instead offmap
. - Using the list monad (or your vector monad), write a function that generates all positions on a chessboard as pairs of characters between
'a'
and'h'
and numbers between 1 and 8. - Implement a version of
for_each
(call itrepeat
) that takes a continuationk
of the typefunction<List<B>()>
(notice the void argument). The functionrepeat
callsk
for each element of the listlst
, but it ignores the element itself.
May 12, 2015 at 2:43 am
“This would probably get you through the interview at Microsoft, Google, or Facebook”
maybe for internship…
Other than that code is cute, but IMO this is not much worse
:
May 12, 2015 at 1:32 pm
This code looks very hard to change in case of changing expression and looks ugly to me. Isn’t it easies to create all possible permutations, assign them to letters (maybe in alphabetical order) and check expression to be satisfied?
May 12, 2015 at 2:00 pm
Here is a readable version of the C implementation:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
gistfile1.c
hosted with ❤ by GitHub
May 12, 2015 at 2:34 pm
Here’s a corresponding solution in Java:
http://sebastian-millies.blogspot.de/2015/04/easy-exhaustive-search-with-flatmap.html
May 12, 2015 at 2:58 pm
Have you compared the runtime with the “dumb” version. All these “smart” versions are really, really slow. The following “dumb” version beats them pretty easily. What’s the point of “smart”, “cute” code if it’s slow as molasses?
May 12, 2015 at 3:57 pm
I like @jeremie__roy ‘s version. a partially c++-ified version without globals (except for constants) is interestingly enough even faster
(I left the printf in as even with sync_with_stdio using cout inside test_proposal slows everything down by 50%, I also removed the early out to make it more comparable with the other versions)
May 12, 2015 at 5:52 pm
Sunch puzzles can be solved much faster doing arithmetic by digits and rejecting partial solutions (“shallow backtracking”). Here’s my take on it: http://codepad.org/09iEupRX
More generally, this can be viewed as an instance of monadic constraint programming: http://users.ugent.be/~tschrijv/MCP/
May 13, 2015 at 2:47 am
Or, just use an SMT solver, scripted from Haskell. About 10 lines of code: https://gist.github.com/LeventErkok/49347cc0b96d72efe17c
May 13, 2015 at 4:24 am
Great article. I saw the post on my phone and stopped reading after the problem. I decided to do it two ways in c#. I will probably do it again in
Typescript
This was the pretty version, slow and unoptimized (because I wanted it pretty!)
May 13, 2015 at 4:30 am
I wrote a solution in C# just because and before reading past the problem in the the article. The first being performant the second just being pretty.
Here is the slow and pretty version:
https://gist.github.com/c97ff4f5fc98a81ee2b2.git
May 13, 2015 at 4:35 am
Sorry, wrong link posted
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
gistfile1.cs
hosted with ❤ by GitHub
May 13, 2015 at 6:37 pm
I spent 5 minutes to find a solution with a pen, 9567 + 1085 = 10652. How long to solve this problem with a computer?
May 14, 2015 at 11:45 am
LMFAO @joe’s solution….. Using iostreams in that one function cause a 50% slowdown; can you elaborate? Profile?
May 14, 2015 at 9:02 pm
A simple brute force solution aiming at simplicity and readability over performance.
* Iterates the permutations using a wrapper for std::next_permutation.
* Integer conversions via stringstream for convenience.
May 15, 2015 at 4:18 am
“(I grayed out the usual C++ noise)”
This noise can be removed with help of generic lambdas, auto, initializer_list and DO-macro (https://github.com/panaseleus/monad_do):
StateLI sel = &select;
return DO(StateLI,
(s, sel)
(e, sel)
(n, sel)
(d, sel)
(m, sel)
(o, sel)
(r, sel)
(y, sel)
(_, guard(s != 0 && m != 0))
(_, [=]{
auto send = asNumber({s, e, n, d});
auto more = asNumber({m, o, r, e});
auto money = asNumber({m, o, n, e, y});
return DO(StateLI,
(_, guard(send + more == money))
(_, unit(make_tuple(send, more, money)))
);
}())
);
May 15, 2015 at 11:20 am
@Alexander Riccio sorry, no, didn’t profile it. Just the difference in runtimes (as printed in the program itself) 6 ms vs 9 ms IIRC. No endl, and sync_with_stdio false.
May 15, 2015 at 1:38 pm
@evgeny: That’s pretty good. Almost as terse as the Haskell version:
May 15, 2015 at 7:00 pm
@Bartosz Yes, syntactical structure is almost same as in Haskell. Here is DRAFT live demo: http://coliru.stacked-crooked.com/a/6b969c31b60989ec
However the real challenge is not a syntax, but to make such kind/style of code run fast.
By the way, version which uses ready library permutation is higher level than trying to generate permutations by hands with monads.
May 20, 2015 at 7:01 am
To everyone else who is balking at the over-complicated nature of the solution, do remember that the only reason why anyone would bother “solving” such a trivial problem is to learn something.
The day may come when you have to do Prolog in C++. This may help if that day ever comes.
August 11, 2015 at 4:26 am
Rust here (the naive version without removal): https://github.com/FranklinChen/send-more-money-rust
For entertainment purposes only! I would never code for real in Rust this way. It was just as a language learning experience.
September 30, 2015 at 2:38 pm
Scala bruteforce version which takes “send”, “more”, “money” as arguments, it can also take a different number of words to sum. Of course, the performance can be far better with constraint satisfaction or any other type of search space pruning.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
sendmoremoney.scala
hosted with ❤ by GitHub
October 2, 2015 at 10:18 am
Looks like the link in my previous comment didn’t appear: https://gist.github.com/VelizarHristov/5499f6cb02adfcf11320
It actually finds 25 solutions rather than 1 because I didn’t disallow 0 for the first digit. That can be fixed in the checkSolution function, but I like it more this way.
August 22, 2017 at 2:40 pm
The WordPress pingback didn’t seem to work automatically… Here is my reaction to this: http://www.dlugosz.com/zeta/?p=452 I have a very different lesson, “know your libraries”.
August 22, 2017 at 2:44 pm
Velizar, you found 25 solutions by not disallowing leading zero? That’s odd, since I found 50. Also, what was the speed of your brute-force solution in Scala?
October 25, 2017 at 10:10 am
Unless we know why this monadic solution is preferred over, for example, the solution that uses std::next_permutation above, it remains just a novelty to me in the world of C++.
In C++, does a monad offer any inherent advantage for any class of problems over its equivalent non-monadic solution? If so, can we see an example?
May 18, 2018 at 4:00 am
[…] saw this puzzle on Bartosz Milewski’s blog, with an entry on using monads in C++. I’d like to hold it up as an example of a completely different lesson to C++ […]