In the previous installment I introduced monads using two examples: the Maybe monad and the List monad. Admittedly, those weren’t the most exciting uses of monads, although they served the purpose of explaining the underlying theory. I also mentioned that monads were introduced into functional programming as a general solution for a variety of problems. One of those problems was representing stateful computations as functions.

Functions, State, and Side Effects

Here’s the problem: In functional languages, every time you call a function with the same arguments, it must return the same result. In fact, if the compiler detects such a situation, it is free to memoize the result of the first call and skip subsequent calls altogether. A stateful computation, on the other hand, might return different results every time it’s performed. It may, for instance, access some global or static variables. It may also modify those variables – in other words, have side effects. In extreme cases a computation might be performed purely for its side effects and not even bother to return a result.

This kind of behavior is often troublesome even in imperative programming. The use of global variables in particular is discouraged. A better solution is to encapsulate the state and pass it explicitly to functions that use it. As a syntactic shortcut, object-oriented languages regularly pass some of the state to functions (methods) as a hidden “this” or “self” argument. There’s even a syntax for composing such functions, as in this JavaScript snippet:

with(document) {
    var t = title;
    write(t + " and more");
}

Here, title is a property and write a method of the object document. (If you put on your monadic glasses, it almost looks like do notation.)

In functional languages we have one more limitation: we cannot mutate any data. There’s a standard way to overcome this limitation: Instead of modifying the data, you create a modified copy. This doesn’t even have to be expensive, if the language supports smart data structures that silently substitute references for copies whenever possible. Most operations on lists in Haskell are optimized this way, and there’s even a language, Clojure, at whose core are “persistent” data structures that behave as if they were immutable, but do a lot of sharing behind the scenes. Immutability is a very attractive feature when you are doing concurrent programming: access to immutable data requires no synchronization.
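
To get a feel for this sharing, consider Haskell lists. Prepending an element allocates just one new cell that points at the existing list; nothing is copied. A minimal illustration (the names are mine):

xs = [2, 3, 4]   -- the original list
ys = 1 : xs      -- a "modified copy": one new cell whose tail is xs itself
zs = tail ys     -- "removing" the new head again: zs is just xs, no copying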

Taking all this into account, the way to translate stateful computations into a functional language is to use functions that explicitly take the state (encapsulated in some data structure) and return the possibly modified state together with the usual return value. For instance, a C++ “function”:

int pop() {
    auto v = glob.top();
    glob.pop();
    return v;
}

that operates on a global stack, glob, of type std::stack<int>, would be turned into a Haskell function:

pop (ST lst) = (ST (tail lst), head lst)

operating on the state of type Stack:

newtype Stack = ST [Int]

The constructor ST creates a Stack from a list of integers. This constructor is also used for pattern matching, as in the argument of pop. The function head returns the first element of a list, tail returns the rest.

The signature of pop is characteristic of functions that operate on state:

pop :: Stack -> (Stack, Int)

Such functions are often called “actions.”
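
By the same token, a push would take the value to be pushed and return the unit value (), since it’s performed purely for its effect on the state. Here’s a sketch (the original example defines only pop):

push :: Int -> Stack -> (Stack, ())
push n (ST lst) = (ST (n:lst), ())

Partially applied, push n is again an action of type Stack -> (Stack, ()).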

There are two problems with this scheme: (1) It’s awkward, and (2) It doesn’t fit our monadic approach. In this example, the original computation (as expressed in C++) takes no arguments and returns an integer. Its functional counterpart takes a state and returns an integer combined with the state. It’s not clear how one would bind such functions together and use the convenient do notation.

We are on the right track though. We just need to get even more general: we need to separate the construction of an action from its execution. Our basic building blocks will be functions that return actions – we’ll call them “monadic functions.” Since an action is itself a function, we’ll be dealing with functions returning functions, that is, higher-order functions.

Our goal is to find a way to compose monadic functions into larger monadic functions. A composite monadic function will return a composite action. We will then execute such an action on a state and get our result.

This new description fits the monadic pattern much better. We start with a generic stateful computation that takes an argument of type a and returns a value of type b, and we turn it into a (monadic) function that takes type a and returns an enriched type based on b. This time, though, the enriched type is a function type– an action. In general, an action is a function that takes a state (of some type S) and returns a tuple consisting of the (possibly modified) state and the value of type b.

S -> (S, b)

Here’s the first element of the monad– a type constructor: For any type t it defines a new type: an action to calculate a value of type t given a state. This type constructor is part of the state monad. Before we get to a more formal definition, let’s do some practice exercises.
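
In Haskell, this type constructor can be written down as a parameterized type alias. A sketch, with hypothetical names (I’m not reusing the name Action, which will be given a calculator-specific meaning below):

type StateAction s b = s -> (s, b)

For every state type s, StateAction s maps any result type b to the type of actions that calculate a b while threading the state through.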

The Monadic Calculator

There’s a perfect example of a stateful computation: a stack-based calculator. The state in this case is represented by the type Calc:

newtype Calc = Calc [Int]

that hides a list of integers– our calculator’s stack.

First, let’s define a monadic function (a function that returns an action) that pops an element off the calculator’s stack. It will be a function returning a function, so we need to use lambdas.

popCalc = 
    \(Calc lst) -> (Calc (tail lst), head lst)

The body of the lambda is almost identical to the implementation of pop above. Notice that popCalc takes no arguments. Rather, the function that it produces takes a calculator as an argument and returns the calculator back, paired with the result–the value at the top of the stack. In other words, popCalc returns a promise to calculate the top of the calculator’s stack when the stack is available.

Here’s how you can use popCalc. First, you call it with no arguments and record the returned action. Next, you create a calculator (with a non-empty stack, otherwise the next line of code would bomb). You apply the action to that calculator and record the result– you pattern-match it to a tuple consisting of a changed calculator and a number. Finally you display that number. This is the actual output of a Haskell interpreter session:

> let f = popCalc
> let calc = Calc [3, 2, 1]
> let (calc', x) = f calc
> x
3

While we’re at it, we can similarly implement a pushCalc function:

pushCalc n =
    \(Calc lst) -> (Calc (n:lst), ())

Notice that the lambda produced by pushCalc returns a modified calculator (argument n is prepended to the list) paired with the special value () of the unit type – the Haskell equivalent of void. The imperative equivalent of this function would return void and work only through side effects. Notice also that the lambda is actually a closure: it captures the outer variable n for later use.
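
You can exercise pushCalc the same way we exercised popCalc. Since Calc doesn’t derive Show, we pattern-match to peek inside the modified calculator; a session would go something like this:

> let g = pushCalc 4
> let (Calc lst, _) = g (Calc [3, 2, 1])
> lst
[4,3,2,1]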

Finally, we need a function that performs some calculation; after all we are implementing a calculator:

addCalc =
    \(Calc lst) -> let (a:b:rest) = lst
                   in
                       (Calc ((a + b):rest), ())

Here I’m matching the calculator’s list with the pattern (a:b:rest) to retrieve the top two elements. The modified calculator has the sum of those two elements on the top of its stack.

We can use all these functions in combination to perform more complex operations, like adding two numbers. Here’s a piece of code that might rival some of Rube Goldberg’s creations:

add x y =
    let pux = pushCalc x -- promise to push x
        puy = pushCalc y -- promise to push y
        axy = addCalc    -- promise to add top numbers
        pp = popCalc     -- promise to pop the result
        calc = Calc []   -- we need a calculator
        (calc1, _) = pux calc  -- actually push x
        (calc2, _) = puy calc1 -- actually push y
        (calc3, _) = axy calc2 -- actually add top numbers
        (_, z) = pp calc3      -- actually pop the result
    in
        z  -- return the result
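
Convoluted as it is, this machinery does work. With the definitions above loaded, a session would go something like:

> add 2 3
5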

But what we really want is to be able to combine smaller actions into larger actions. For that we have to define bind. The signature of bind, in this case, should be:

bind :: (Calc -> (Calc, a)) ->         -- action
        (a -> (Calc -> (Calc, b))) ->  -- continuation
        (Calc -> (Calc, b))            -- new action

Note our enriched types – the action types. This signature looks much more complex than the signature of the Maybe bind, but that’s only because the enriched type is itself a function type. Other than that, the structure is the same: bind accepts an action and a continuation and returns a new action. The continuation in this case takes an argument of type a (the value to be calculated by the first action) and returns the composite action.

In fact, if we define Action as a type alias:

type Action a = Calc -> (Calc, a)

the signature of bind can be abbreviated to:

bind :: (Action a) -> (a -> (Action b)) -> (Action b)

Now for the implementation. Since the result of bind is an action, it has to return a lambda of the appropriate signature.

bind act cont =
    \calc -> ... produce (Calc, b) tuple ...

Bind is supposed to compose the action, act, with the continuation, cont. So it should first apply act to calc.

let (calc', v) = act calc

The result is a tuple (calc', v) with a new calculator and a value v of type a.

This is the v that the continuation expects, so the next step is to apply the continuation:

act' = cont v

The result of the continuation is a new action. This new action can then be executed, that is, applied to the new calculator:

act' calc'

to produce the desired result– a tuple of the type (Calc, b).

Here’s the final code:

bind act cont =
    \calc ->
        let (calc', v) = act calc
            act' = cont v
        in
            act' calc'

To complete our construction of the monad, we need to define return. The signature of return is:

return :: a -> Action a

and the implementation is pretty straightforward. It takes a value v and returns a promise to return this value.

return v = \calc -> (calc, v)
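
With bind and return in hand we can already compose actions. A small example (my own; in a source file you’d have to hide the Prelude’s return, e.g. import Prelude hiding (return), to avoid a name clash):

addTop = bind popCalc (\a ->
         bind popCalc (\b ->
         return (a + b)))

Applied to Calc [3, 4, 10], the composite action addTop produces (Calc [10], 7).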

An astute reader might notice that nothing in this construction depends on the peculiarities of the type Calc. It will work for any type that is used to represent state. So we have in fact just constructed a generic state monad. The stack-based calculator is just a silly example of that monad.
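
Spelled out, the generic construction might look like this (a sketch with made-up names; the standard library’s State monad, in Control.Monad.State, is built around the same idea, though it’s packaged differently and pairs the value with the state in the opposite order):

type State s a = s -> (s, a)

bindS :: State s a -> (a -> State s b) -> State s b
bindS act cont = \s ->
    let (s', v) = act s
    in cont v s'

returnS :: a -> State s a
returnS v = \s -> (s, v)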

It’s not difficult to implement bind as an infix operator, >>=, and turn the calculator into a monad that’s recognizable by the Haskell compiler (see Appendix 1). Then the relevant part of the add function may be rewritten in the do notation:

add x y = do
    pushCalc x
    pushCalc y
    addCalc
    r <- popCalc
    return r

Let me present the same code without any syntactic sugar, using the cascading lambda-within-lambda notation:

add x y =
  bind (pushCalc x) 
       (\() -> bind (pushCalc y)
                    (\() -> bind addCalc
                                 (\() -> bind popCalc
                                              (\z -> return z))))

This is not something you will see often in Haskell programs, but I will eventually want to go beyond Haskell. My goal is to connect back with C++, and this is the form that’s most convenient for making such a transition.

So let’s painstakingly analyze this code. We are binding the first action, (pushCalc x), to the rest of the code. The rest of the code is expressed as one huge lambda. To make these two parts fit together, their types have to match. The value produced by the action pushCalc is void (a.k.a. “unit”) – so its type is Action (). Therefore the lambda to which it binds must also take void, hence the notation:

\() -> ...

The body of that lambda is another bind, and so on, until we get to the interesting part, which is popCalc.

popCalc is an action that calculates a value: its signature is Action Int. This value is passed to the lambda to which popCalc is bound. Therefore this last lambda takes an Int argument, z. Finally, this value is enclosed in an action, and that’s done by the function return.

This unsugared notation elucidates one more aspect of the monadic approach that’s very relevant in the context of Haskell. Haskell is a lazy language: it doesn’t evaluate anything unless it is strictly necessary for achieving some final goal. Also, when it needs to evaluate several independent things, it will do that in some arbitrary, unpredictable order. So if it were somehow possible to implement the imperative versions of push and pop in Haskell, we would have two problems: push would never be evaluated because it produces no result, and even if it were, its evaluation could be swapped with the subsequent pop. Monadic bind forces the ordered evaluation of actions by introducing explicit data dependency. If pop follows push in the chain of binds, pop cannot be evaluated before push because its argument is the calculator that is returned by push. The two are linked by data dependency which, by the way, is not so obvious in the do notation.

Conclusion

The state monad is a very interesting pattern from the programming point of view. Instead of doing something, you create an action that is executed (maybe even multiple times) later. The monadic scaffolding provides the standard amenities like the ability to compose actions, and the do notation makes writing functions that produce functions much more natural. There is an interesting variation of the state monad called the IO monad, which is used for input and output in Haskell. I describe it in Appendix 2.

There are many patterns in imperative languages that have elements, or sometimes just hints, of a state monad. For instance, in the OO world you might encounter a very useful Command Pattern. You can “bind” command objects using the Composite Pattern. In languages that support anonymous functions and closures, like JavaScript, C# and, recently C++, you can return functions from functions directly. This might help, for instance, in dealing with inversion of control, where you return a closure as an event handler (that would be material for another series of blog posts).

But I have in mind a very specific example that I’ve been working on in C++ that fits the monadic pattern perfectly, and I’m going to write about it in the next installment of this series.

I’d like to thank Eric Niebler for valuable comments on the draft of this post.

Appendix 1

The full implementation of a stack-based calculator requires a few more Haskell tricks. First, we have to explicitly define our type constructor. I’ll call the new type Calculation, with the data constructor CL that encapsulates an action:

newtype Calculation a = CL (Calc -> (Calc, a))

Monadic functions have to return this new type, so they all wrap their actions into a Calculation.

pushCalc n =
    CL (\(Calc lst) -> (Calc (n:lst), ()))

topCalc = 
    CL (\(Calc lst) -> (Calc lst, head lst))

popCalc =
    CL (\(Calc lst) -> (Calc (tail lst), head lst))

addCalc =
    CL (\(Calc lst) -> let (a:b:rest) = lst
                       in
                           (Calc ((a + b):rest), ()))

Haskell has a built-in type class for Monads (think of a type class as a C++ concept). We have to tell Haskell that our Calculation is an instance of Monad and provide the definition of the two associated functions: bind, using infix notation, and return.

instance Monad Calculation where
    return x = 
        CL (\calc -> (calc, x))
    CL(c) >>= cont =
        CL (\calc ->
            let (calc', v) = c calc
                CL c' = cont v
            in
                c' calc')

With those definitions, our add function can be written using the do notation.

add x y = do
    pushCalc x
    pushCalc y
    addCalc
    r <- popCalc
    return r

Since we are not expecting any values to be calculated by pushCalc or addCalc, there are no left arrows accompanying them in the do notation.
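
Because the action is now wrapped in the CL constructor, you can no longer apply it to a calculator directly; you have to unwrap it first. A small helper (my own addition, not part of the original code) makes that explicit:

runCalc :: Calculation a -> Calc -> (Calc, a)
runCalc (CL act) calc = act calc

For instance, snd (runCalc (add 2 3) (Calc [])) evaluates to 5.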

Appendix 1a: The Applicative

Haskell keeps evolving and, on occasion, the changes in the language break old code. In particular the Monad class has a superclass now, called Applicative. That’s why the Monad instance for Calculation won’t compile any more, unless you explicitly add the instance for Applicative. Fortunately, the Applicative functionality can be fully implemented using the Monad interface, as in:

instance Applicative Calculation where
  pure = return
  mf <*> mx = do f <- mf
                 x <- mx
                 return (f x)
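
As it happens, this pattern is canned in the standard library: Control.Monad exports ap, which is defined by exactly the do block above. With import Control.Monad (ap) at the top of the file, the instance shrinks to:

instance Applicative Calculation where
    pure  = return
    (<*>) = ap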

This won’t compile either, because Applicative requires Functor as its superclass. So we have to make Calculation a Functor as well. The simplest approach is to let the compiler derive it, but you have to include this line at the very top of the file:

{-# LANGUAGE DeriveFunctor #-}

and modify the definition of Calculation:

newtype Calculation a = CL (Calc -> (Calc, a))
  deriving Functor
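
If you’d rather not rely on a language extension, the Functor instance is short enough to write by hand. It applies a function to the calculated value and leaves the state threading alone (a sketch equivalent to what the deriving clause produces):

instance Functor Calculation where
    fmap f (CL act) =
        CL (\calc ->
            let (calc', v) = act calc
            in (calc', f v))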

In fact, it’s possible to implement the calculator as an Applicative, without the need for a Monad instance. But that’s a different story.

Appendix 2: The IO Monad

Strictly speaking a lazy purely functional language like Haskell cannot do any input or output. That’s because the compiler is free to memoize the result of the first call to, say, getChar and elide all subsequent calls. Calls to putChar, which don’t return anything, may be ignored altogether. This is why most functional languages cheat when it comes to I/O. But not Haskell. Monads to the rescue!

Let’s think for a moment why getChar may return different characters every time it’s called. It’s because there is a keyboard somewhere out there that changes its state. Why is it changing its state? Because there is a human who pushes the keys. Why is the human pushing the keys? Because he or she got a phone call from China that the price of rice is about to go up, so it makes sense to buy some futures. And so on… In other words there is this external world that is full of state.

What if we encapsulate that whole world in a hidden data structure and write our program as an action that operates on it? This is very similar to the state monad pattern except that here the programmer has no access to the state and cannot execute the action. The action is produced by main and passed to the system for execution. It’s the system, wink wink, that has access to “the world” and may pass it as a state argument to that action. Of course it’s all smoke and mirrors, but it successfully staves off the insanity of the external world from impinging on the purity of Haskell.

How does it work in practice? There is a monad called IO. It’s almost like a state monad, except that its type can’t be expressed in Haskell, because it would have to look something like this:

type IO a = World -> (World, a)

and we don’t know what World is. The main function in a Haskell program is a monadic IO action, usually:

main :: IO ()

with the type parameter a replaced by unit, () (although any other type would work too).

The simplest main is just a single IO action:

main = putStrLn "Hello World!"

but in general main is a do block.
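
For instance, a main that both reads and writes would be built from smaller IO actions bound together in a do block:

main :: IO ()
main = do
    putStrLn "What's your name?"
    name <- getLine
    putStrLn ("Hello, " ++ name ++ "!")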

You might ask: If a Haskell program is one monadic IO action, then where does the traditional functional code go? The answer is that you can call traditional functions from anywhere in the do block. Even in my highly biased example there were several non-monadic function calls (head, tail, operator (+), etc.). Imagine a Haskell program as a tree: its trunk is monadic IO, and so are all the thick branches that have anything to do with I/O. But the thinner branches and leaves are your run-of-the-mill functions that get (lazily) evaluated only when the main IO action is executed by the system.

Another interesting observation is that all functions that perform I/O have this information encoded in their type signatures. Not in the types of their arguments, mind you, but in the return type. This is almost the opposite of what you see in imperative languages, where you’d have to pass some kind of I/O object (file or stream) to a procedure that performs I/O (except when that object is global, like standard input/output in C++). In Haskell, if you want your function to perform I/O, two things must happen: it must return an IO action that it internally constructs; and the action must find its way to the very top, to the main function. On the way up, it might be bound with other such actions either using bind or the do notation.

You might ask: How does Haskell make sure that all IO actions get to the top, so that the system may execute them? It doesn’t! But consider what you would have to do in order not to pass an action to the top. You’d have to explicitly ignore it, as in:

let _ = putStrLn "Don't print me!"

Ignoring things in Haskell never happens by default – it takes a deliberate act.