In the previous installment I introduced monads using two examples: the Maybe monad and the List monad. Admittedly, those weren’t the most exciting uses of monads, although they served the purpose of explaining the underlying theory. I also mentioned that monads were introduced into functional programming as a general solution for a variety of problems. One of those problems was representing stateful computations as functions.

## Functions, State, and Side Effects

Here’s the problem: In functional languages, every time you call a function with the same arguments, it must return the same result. In fact, if the compiler detects such a situation, it is free to memoize the result of the first call and skip subsequent calls altogether. A stateful computation, on the other hand, might return different results every time it’s performed. It may, for instance, access some global or static variables. It may also modify those variables– in other words have side effects. In extreme cases a computation might be performed purely for side effects and not even bother to return any results.

This kind of behavior is often troublesome even in imperative programming. The use of global variables in particular is being discouraged. A better solution is to encapsulate the state and pass it explicitly to functions that use it. As a syntactic shortcut, in object-oriented languages, some of the state is regularly passed to functions (methods) as a hidden “this” or “self” argument. There’s even a syntax for composing such functions, as in this JavaScript snippet:

with(document) { var t = title; write(t + " and more"); }

Here, `title`

is a property and `write`

a method of the object `document`

. (If you put on your monadic glasses, it almost looks like `do`

notation.)

In functional languages we have one more limitation: we cannot mutate any data. There’s a standard way to overcome this limitation: Instead of modifying the data, you create a modified copy. This doesn’t even have to be expensive, if the language supports smart data structures that silently substitute references for copies whenever possible. Most operations on lists in Haskell are optimized this way, and there’s even a language, Clojure, at whose core are “persistent” data structures that behave as if they were immutable, but do a lot of sharing behind the scenes. Immutability is a very attractive feature when you are doing concurrent programming: access to immutable data requires no synchronization.

Taking all this into account, the way to translate stateful computations into functional language is to use functions that explicitly take state (encapsulated in some data structure) and return the, possibly modified, state together with the usual return value. For instance, a C++ “function”:

int pop() { auto v = glob.top(); glob.pop(); return v; }

that operates on a global vector, `glob`

of the type `std::vector<int>`

, would be turned into a Haskell function:

pop (ST lst) = (ST (tail lst), head lst)

operating on the state of type `Stack`

:

newtype Stack = ST [Int]

The constructor `ST`

creates a `Stack`

from a list of integers. This constructor is also used for pattern matching, as in the argument of `pop`

. The function `head`

returns the first element of a list, `tail`

returns the rest.

The signature of `pop`

is characteristic of functions that operate on state:

top:: Stack-> (Stack, Int)

Such functions are often called “actions.”

There are two problems with this scheme: (1) It’s awkward, and (2) It doesn’t fit our monadic approach. In this example, the original computation (as expressed in C++) takes no arguments and returns an integer. Its functional counterpart takes a state and returns an integer combined with the state. It’s not clear how one would `bind`

such functions together and use the convenient `do`

notation.

We are on the right track though. We just need to get even more general: We need to separate the construction of an action from its execution. Our basic blocks will be functions that return actions– we’ll call them “monadic functions.” Since action is a function, we’ll be dealing with functions returning functions; or higher order functions.

Our goal is to find a way to compose monadic functions into larger monadic functions. A composite monadic function will return a composite action. We will then execute such action on a state, and get our result.

This new description fits the monadic pattern much better. We start with a generic stateful computation that takes an argument of type `a`

and returns a value of type `b`

, and we turn it into a (monadic) function that takes type `a`

and returns an enriched type based on `b`

. This time, though, the enriched type is a function type– an action. In general, an action is a function that takes a state (of some type `S`

) and returns a tuple consisting of the (possibly modified) state and the value of type `b`

.

S -> (S, b)

Here’s the first element of the monad– a type constructor: For any type `t`

it defines a new type: an action to calculate a value of type `t`

given a state. This type constructor is part of the *state monad*. Before we get to a more formal definition, let’s do some practice exercises.

# The Monadic Calculator

There’s a perfect example of a stateful computation: a stack-based calculator. The state in this case is represented by the type `Calc`

:

newtype Calc = Calc [Int]

that hides a list of integers– our calculator’s stack.

First, lets define a monadic function (a function that returns an action) that pops an element off the calculator’s stack. It will be a function returning a function, so we need to use lambdas.

popCalc = \(Calc lst) -> (Calc (tail lst), head lst)

The body of the lambda is almost identical to the implementation of `pop`

above. Notice that `popCalc`

takes no arguments. Rather, the function that it produces takes a calculator as an argument and returns the calculator back, paired with the result–the value at the top of the stack. In other words, `popCalc`

returns a promise to calculate the top of the calculator’s stack when the stack is available.

Here’s how you can use `popCalc`

. First, you call it with no arguments and record the returned action. Next, you create a calculator (with a non-empty stack, otherwise the next line of code would bomb). You apply the action to that calculator and record the result– you pattern-match it to a tuple consisting of a changed calculator and a number. Finally you display that number. This is the actual output of a Haskell interpreter session:

> let f = popCalc > let calc = Calc [3, 2, 1] > let (calc', x) = f calc > x 3

While we’re at it, we can similarly implement a `pushCalc`

function:

pushCalc n = \(Calc lst) -> (Calc (n:lst), ())

Notice that the lambda produced by `pushCalc`

returns a modified calculator (argument `n`

is prepended to the list) paired with a special value `()`

of the type `unit`

– a Haskell equivalent of `void`

. The imperative equivalent of this function would return void and work only through side effects. Notice also that the lambda is actually a closure: it captures the outer variable `n`

for later use.

Finally, we need a function that performs some calculation; after all we are implementing a calculator:

addCalc = \(Calc lst) -> let (a:b:rest) = lst in (Calc ((a + b):rest), ())

Here I’m matching the calculator’s list with the pattern `(a:b:rest)`

to retrieve the top two elements. The modified calculator has the sum of those two elements on the top of its stack.

We can use all these functions in combination to perform more complex operations, like adding two numbers. Here’s a piece code that might rival some of the Rube Goldberg creations:

add x y = let pux = pushCalc x -- promise to push x puy = pushCalc y -- promise to push y axy = addCalc -- promise to add top numbers pp = popCalc -- promise to pop the result calc = Calc [] -- we need a calculator (calc1, _) = pux calc -- actually push x (calc2, _) = puy calc1 -- actually push y (calc3, _) = axy calc2 -- actually add top numbers (_, z) = pp calc3 -- actually pop the result in z -- return the result

But what we really want is to be able to combine smaller actions into larger actions. For that we have to define `bind`

. The signature of bind, in this case, should be:

bind :: (Calc -> (Calc, a)) -> -- action (a -> (Calc -> (Calc, b)) -> -- continuation (Calc -> (Calc, b)) -- new action

I have highlighted our enriched types–the action types. This signature looks much more complex than the signature of the `Maybe`

bind, but that’s only because the enriched type is itself a function type. Other than that, the structure is the same: `bind`

accepts an action and a continuation and returns a new action. The continuation in this case takes an argument of type `a`

(the value to be calculated by the first action) and returns the composite action.

In fact, if we define `Action`

as a type alias:

type Action a = Calc -> (Calc, a)

the signature of bind can be abbreviated to:

bind :: (Action a) -> (a -> (Action b)) -> (Action b)

Now for the implementation. Since the result of `bind`

is an action, it has to return a lambda of the appropriate signature.

bind act cont = \calc -> ... produce (Calc, b) tuple ...

Bind is supposed to compose the action, `act`

, with the continuation, `cont`

. So it should first apply `act`

to `calc`

.

let (calc', v) = act calc

The result is a tuple `(calc', v)`

with a new calculator and a value `v`

of type `a`

.

This is the `v`

that the continuation expects, so the next step is to apply the continuation:

act' = cont v

The result of the continuation is a new action. This new action can then be executed, that is applied to the new calculator:

act' calc'

to produce the desired result– a tuple of the type `(Calc, b)`

.

Here’s the final code:

bind act cont = \calc -> let (calc', v) = act calc act' = cont v in act' calc'

To complete our construction of the monad, we need to define `return`

. The signature of `return`

is:

return :: a -> Action a

and the implementation is pretty straightforward. It takes a value `v`

and returns a promise to return this value.

return v = \calc -> (calc, v)

An astute reader might notice that nothing in this construction depends on the peculiarities of the type `Calc`

. It will work for any type that is used to represent state. So we have in fact just constructed a generic state monad. The stack-based calculator is just a silly example of that monad.

It’s not difficult to implement `bind`

as an infix operator, `>>=`

, and turn the calculator into a monad that’s recognizable by the Haskell compiler (see Appendix 1). Then the relevant part of the `add`

function may be rewritten in the `do`

notation:

add x y = do pushCalc x pushCalc y addCalc r <- popCalc return r

Let me present the same code without any syntactic sugar, using the cascading lambda-within-lambda notation:

add x y = bind (pushCalc x) (\() -> bind (pushCalc y) (\() -> bind addCalc (\() -> bind popCalc (\z -> return z))))

This is not something you will see often in Haskell programs, but I will eventually want to go beyond Haskell. My goal is to connect back with C++, and this is the form that’s most convenient form making such a transition.

So let’s painstakingly analyze this code. We are binding the first action, `(pushCalc x)`

, to the rest of the code. The rest of the code is expressed as one huge lambda. To make these two parts fit together, their types have to match. The value produced by the action `pushCalc`

is void (a.k.a., “unit”)– so it’s type is `Action ()`

. Therefore the lambda to which it binds must also take void, hence the notation:

\() -> ...

The body of that lambda is another `bind`

, and so on, until we get to the interesting part, which is `popCalc`

.

`popCalc`

is an action that calculates a value: its signature is `Action Int`

. This value is passed to the lambda to which `popCalc`

is bound. Therefore this last lambda takes an `Int`

argument, `z`

. Finally, this value is enclosed in an action, and that’s done by the function `return`

.

This unsugared notation elucidates one more aspect of monadic approach that’s very relevant in the context of Haskell. Haskell is a lazy language: it doesn’t evaluate anything unless it is strictly necessary for achieving some final goal. Also, when it needs to evaluate several independent things, it will do that in some arbitrary, unpredictable order. So if it were somehow possible to implement the imperative versions of `push`

and `pop`

in Haskell, we would have two problems: `push`

would never be evaluated because it produces no result, and even if it were, its evaluation could be swapped with the subsequent `pop`

. Monadic `bind`

forces the ordered evaluation of actions by introducing explicit data dependency. If `pop`

follows `push`

in the chain of `bind`

s, `pop`

cannot be evaluated before `push`

because its argument is the calculator that is returned by `push`

. The two are linked by data dependency which, by the way, is not so obvious in the `do`

notation.

# Conclusion

The state monad is a very interesting pattern from the programming point of view. Instead of doing something, you create an action that is executed (maybe even multiple times) later. The monadic scaffolding provides the standard amenities like the ability to compose actions, and the `do`

notation makes writing functions that produce functions much more natural. There is an interesting variation of the state monad called the IO monad, which is used for input and output in Haskell. I describe it in Appendix 2.

There are many patterns in imperative languages that have elements, or sometimes just hints, of a state monad. For instance, in the OO world you might encounter a very useful Command Pattern. You can “bind” command ubjects using the Composite Pattern. In languages that support anonymous functions and closures, like JavaScript, C# and, recently C++, you can return functions from functions directly. This might help, for instance, in dealing with inversion of control, where you return a closure as an event handler (that would be material for another series of blog posts).

But I have in mind a very specific example that I’ve been working on in C++ that fits the monadic pattern perfectly, and I’m going to write about it in the next installment of this series.

I’d like to thank Eric Niebler for valuable comments on the draft of this post.

# Appendix 1

The full implementation of a stack-based calculator requires a few more Haskell tricks. First, we have to explicitly define our type constructor. I’ll call the new type `Calculation`

, with the type constructor `CL`

that encapsulates an action:

newtype Calculation a = CL (Calc -> (Calc, a))

Monadic functions have to return this new type, so they all wrap their actions into a `Calculation`

.

pushCalc n = CL (\(Calc lst) -> (Calc (n:lst), ())) topCalc = CL (\(Calc lst) -> (Calc lst, head lst)) popCalc = CL (\(Calc lst) -> (Calc (tail lst), head lst)) addCalc = CL (\(Calc lst) -> let (a:b:rest) = lst in (Calc ((a + b):rest), ()))

Haskell has a built-in *type class* for `Monad`

s (think of a type class as a C++ concept). We have to tell Haskell that our `Calculation`

is an instance of `Monad`

and provide the definition of the two associated functions: `bind`

, using infix notation, and `return`

.

instance Monad Calculation where return x = CL (\calc -> (calc, x)) CL(c) >>= cont = CL (\calc -> let (calc', v) = c calc CL c' = cont v in c' calc')

With those definitions, our `add`

function can be written using the `do`

notation.

add x y = do pushCalc x pushCalc y addCalc r <- popCalc return r

Since we are not expecting any values to be calculated by `pushCalc`

or `addCalc`

, there are no left arrows accompanying them in the `do`

notation.

# Appendix 2: The IO Monad

Strictly speaking a lazy purely functional language like Haskell cannot do any input or output. That’s because the compiler is free to memoize the result of the first call to, say, `getChar`

and elide all subsequent calls. Calls to `putChar`

, which don’t return anything, may be ignored altogether. This is why most functional languages cheat when it comes to I/O. But not Haskell. Monads to the rescue!

Let’s think for a moment why `getChar`

may return different characters every time it’s called. It’s because there is a keyboard somewhere out there that changes its state. Why is it changing its state? Because there is a human who pushes the keys. Why is the human pushing the keys? Because he or she got a phone call from China that the price of rice is about to go up, so it makes sense to buy some futures. And so on… In other words there is this external world that is full of state.

What if we encapsulate that whole world in a hidden data structure and write our program as an action that operates on it? This is very similar to the state monad pattern except that here the programmer has no access to the state and cannot execute the action. The action is produced by `main`

and passed to the system for execution. It’s the system, wink wink, that has access to “the world” and may pass it as a state argument to that action. Of course it’s all smoke and mirrors, but it successfully staves off the insanity of the external world from impinging on the purity of Haskell.

How does it work in practice? There is a monad called `IO`

. It’s almost like a state monad, except that its type can’t be expressed in Haskell, because it would have to look something like this:

type IO a = World -> (World, a)

and we don’t know what `World`

is. The `main`

function in a Haskell program is a monadic IO action, usually:

main :: IO ()

with the type parameter `a`

replaced by unit, `()`

, (although any other type will work too).

The simplest `main`

is just a single `IO`

action:

main = putStrLn "Hello World!"

but in general `main`

is a `do`

block.

You might ask: If a Haskell program is one monadic IO action, then where does the traditional functional code go? The answer is that you can call traditional functions from anywhere in the `do`

block. Even in my highly biased example there were several non-monadic function calls (`head`

, `tail`

, operator `(+)`

, etc.). Imagine a Haskell program as a tree: it’s trunk is monadic IO, and so are all the thick branches that have anything to do with I/O. But the thinner branches and leaves are your run of the mill functions that get (lazily) evaluated only when the main `IO`

action is executed by the system.

Another interesting observation is that all functions that perform I/O have this information encoded in their type signature. Not in the type of its arguments, mind you, but in the return type. This is almost the opposite of what you see in imperative languages, where you’d have to pass some kind of I/O object (file or stream) to a procedure that performs I/O (except when that object is global, like standard input/output in C++). In Haskell, if you want your function to perform I/O, two things must happen: it must return an `IO`

action that it internally constructs; and the action must find its way to the very top, to the `main`

function. On the way up, it might be bound with other such actions either using `bind`

or the `do`

notation.

You might ask: How does Haskell make sure that all IO actions get to the top, so that the system may execute them? It doesn’t! But consider what you would have to do in order *not* to pass an action to the top. You’d have to *explicitly* ignore it, as in:

let _ = putStrLn "Don't print me!"

Ignoring things in Haskell is not a default thing.