In the previous installment I introduced monads using two examples: the Maybe monad and the List monad. Admittedly, those weren’t the most exciting uses of monads, although they served the purpose of explaining the underlying theory. I also mentioned that monads were introduced into functional programming as a general solution for a variety of problems. One of those problems was representing stateful computations as functions.
Functions, State, and Side Effects
Here’s the problem: In functional languages, every time you call a function with the same arguments, it must return the same result. In fact, if the compiler detects such a situation, it is free to memoize the result of the first call and skip subsequent calls altogether. A stateful computation, on the other hand, might return different results every time it’s performed. It may, for instance, access some global or static variables. It may also modify those variables– in other words have side effects. In extreme cases a computation might be performed purely for side effects and not even bother to return any results.
This kind of behavior is often troublesome even in imperative programming. The use of global variables in particular is being discouraged. A better solution is to encapsulate the state and pass it explicitly to functions that use it. As a syntactic shortcut, in object-oriented languages, some of the state is regularly passed to functions (methods) as a hidden “this” or “self” argument. There’s even a syntax for composing such functions, as in this JavaScript snippet:
with(document) { var t = title; write(t + " and more"); }
Here, title
is a property and write
a method of the object document
. (If you put on your monadic glasses, it almost looks like do
notation.)
In functional languages we have one more limitation: we cannot mutate any data. There’s a standard way to overcome this limitation: Instead of modifying the data, you create a modified copy. This doesn’t even have to be expensive, if the language supports smart data structures that silently substitute references for copies whenever possible. Most operations on lists in Haskell are optimized this way, and there’s even a language, Clojure, at whose core are “persistent” data structures that behave as if they were immutable, but do a lot of sharing behind the scenes. Immutability is a very attractive feature when you are doing concurrent programming: access to immutable data requires no synchronization.
Taking all this into account, the way to translate stateful computations into functional language is to use functions that explicitly take state (encapsulated in some data structure) and return the, possibly modified, state together with the usual return value. For instance, a C++ “function”:
int pop() { auto v = glob.top(); glob.pop(); return v; }
that operates on a global vector, glob
of the type std::vector<int>
, would be turned into a Haskell function:
pop (ST lst) = (ST (tail lst), head lst)
operating on the state of type Stack
:
newtype Stack = ST [Int]
The constructor ST
creates a Stack
from a list of integers. This constructor is also used for pattern matching, as in the argument of pop
. The function head
returns the first element of a list, tail
returns the rest.
The signature of pop
is characteristic of functions that operate on state:
top:: Stack-> (Stack, Int)
Such functions are often called “actions.”
There are two problems with this scheme: (1) It’s awkward, and (2) It doesn’t fit our monadic approach. In this example, the original computation (as expressed in C++) takes no arguments and returns an integer. Its functional counterpart takes a state and returns an integer combined with the state. It’s not clear how one would bind
such functions together and use the convenient do
notation.
We are on the right track though. We just need to get even more general: We need to separate the construction of an action from its execution. Our basic blocks will be functions that return actions– we’ll call them “monadic functions.” Since action is a function, we’ll be dealing with functions returning functions; or higher order functions.
Our goal is to find a way to compose monadic functions into larger monadic functions. A composite monadic function will return a composite action. We will then execute such action on a state, and get our result.
This new description fits the monadic pattern much better. We start with a generic stateful computation that takes an argument of type a
and returns a value of type b
, and we turn it into a (monadic) function that takes type a
and returns an enriched type based on b
. This time, though, the enriched type is a function type– an action. In general, an action is a function that takes a state (of some type S
) and returns a tuple consisting of the (possibly modified) state and the value of type b
.
S -> (S, b)
Here’s the first element of the monad– a type constructor: For any type t
it defines a new type: an action to calculate a value of type t
given a state. This type constructor is part of the state monad. Before we get to a more formal definition, let’s do some practice exercises.
The Monadic Calculator
There’s a perfect example of a stateful computation: a stack-based calculator. The state in this case is represented by the type Calc
:
newtype Calc = Calc [Int]
that hides a list of integers– our calculator’s stack.
First, lets define a monadic function (a function that returns an action) that pops an element off the calculator’s stack. It will be a function returning a function, so we need to use lambdas.
popCalc = \(Calc lst) -> (Calc (tail lst), head lst)
The body of the lambda is almost identical to the implementation of pop
above. Notice that popCalc
takes no arguments. Rather, the function that it produces takes a calculator as an argument and returns the calculator back, paired with the result–the value at the top of the stack. In other words, popCalc
returns a promise to calculate the top of the calculator’s stack when the stack is available.
Here’s how you can use popCalc
. First, you call it with no arguments and record the returned action. Next, you create a calculator (with a non-empty stack, otherwise the next line of code would bomb). You apply the action to that calculator and record the result– you pattern-match it to a tuple consisting of a changed calculator and a number. Finally you display that number. This is the actual output of a Haskell interpreter session:
> let f = popCalc > let calc = Calc [3, 2, 1] > let (calc', x) = f calc > x 3
While we’re at it, we can similarly implement a pushCalc
function:
pushCalc n = \(Calc lst) -> (Calc (n:lst), ())
Notice that the lambda produced by pushCalc
returns a modified calculator (argument n
is prepended to the list) paired with a special value ()
of the type unit
— a Haskell equivalent of void
. The imperative equivalent of this function would return void and work only through side effects. Notice also that the lambda is actually a closure: it captures the outer variable n
for later use.
Finally, we need a function that performs some calculation; after all we are implementing a calculator:
addCalc = \(Calc lst) -> let (a:b:rest) = lst in (Calc ((a + b):rest), ())
Here I’m matching the calculator’s list with the pattern (a:b:rest)
to retrieve the top two elements. The modified calculator has the sum of those two elements on the top of its stack.
We can use all these functions in combination to perform more complex operations, like adding two numbers. Here’s a piece code that might rival some of the Rube Goldberg creations:
add x y = let pux = pushCalc x -- promise to push x puy = pushCalc y -- promise to push y axy = addCalc -- promise to add top numbers pp = popCalc -- promise to pop the result calc = Calc [] -- we need a calculator (calc1, _) = pux calc -- actually push x (calc2, _) = puy calc1 -- actually push y (calc3, _) = axy calc2 -- actually add top numbers (_, z) = pp calc3 -- actually pop the result in z -- return the result
But what we really want is to be able to combine smaller actions into larger actions. For that we have to define bind
. The signature of bind, in this case, should be:
bind :: (Calc -> (Calc, a)) -> -- action (a -> (Calc -> (Calc, b)) -> -- continuation (Calc -> (Calc, b)) -- new action
I have highlighted our enriched types–the action types. This signature looks much more complex than the signature of the Maybe
bind, but that’s only because the enriched type is itself a function type. Other than that, the structure is the same: bind
accepts an action and a continuation and returns a new action. The continuation in this case takes an argument of type a
(the value to be calculated by the first action) and returns the composite action.
In fact, if we define Action
as a type alias:
type Action a = Calc -> (Calc, a)
the signature of bind can be abbreviated to:
bind :: (Action a) -> (a -> (Action b)) -> (Action b)
Now for the implementation. Since the result of bind
is an action, it has to return a lambda of the appropriate signature.
bind act cont = \calc -> ... produce (Calc, b) tuple ...
Bind is supposed to compose the action, act
, with the continuation, cont
. So it should first apply act
to calc
.
let (calc', v) = act calc
The result is a tuple (calc', v)
with a new calculator and a value v
of type a
.
This is the v
that the continuation expects, so the next step is to apply the continuation:
act' = cont v
The result of the continuation is a new action. This new action can then be executed, that is applied to the new calculator:
act' calc'
to produce the desired result– a tuple of the type (Calc, b)
.
Here’s the final code:
bind act cont = \calc -> let (calc', v) = act calc act' = cont v in act' calc'
To complete our construction of the monad, we need to define return
. The signature of return
is:
return :: a -> Action a
and the implementation is pretty straightforward. It takes a value v
and returns a promise to return this value.
return v = \calc -> (calc, v)
An astute reader might notice that nothing in this construction depends on the peculiarities of the type Calc
. It will work for any type that is used to represent state. So we have in fact just constructed a generic state monad. The stack-based calculator is just a silly example of that monad.
It’s not difficult to implement bind
as an infix operator, >>=
, and turn the calculator into a monad that’s recognizable by the Haskell compiler (see Appendix 1). Then the relevant part of the add
function may be rewritten in the do
notation:
add x y = do pushCalc x pushCalc y addCalc r <- popCalc return r
Let me present the same code without any syntactic sugar, using the cascading lambda-within-lambda notation:
add x y = bind (pushCalc x) (\() -> bind (pushCalc y) (\() -> bind addCalc (\() -> bind popCalc (\z -> return z))))
This is not something you will see often in Haskell programs, but I will eventually want to go beyond Haskell. My goal is to connect back with C++, and this is the form that’s most convenient form making such a transition.
So let’s painstakingly analyze this code. We are binding the first action, (pushCalc x)
, to the rest of the code. The rest of the code is expressed as one huge lambda. To make these two parts fit together, their types have to match. The value produced by the action pushCalc
is void (a.k.a., “unit”)– so it’s type is Action ()
. Therefore the lambda to which it binds must also take void, hence the notation:
\() -> ...
The body of that lambda is another bind
, and so on, until we get to the interesting part, which is popCalc
.
popCalc
is an action that calculates a value: its signature is Action Int
. This value is passed to the lambda to which popCalc
is bound. Therefore this last lambda takes an Int
argument, z
. Finally, this value is enclosed in an action, and that’s done by the function return
.
This unsugared notation elucidates one more aspect of the monadic approach that’s very relevant in the context of Haskell. Haskell is a lazy language: it doesn’t evaluate anything unless it is strictly necessary for achieving some final goal. Also, when it needs to evaluate several independent things, it will do that in some arbitrary, unpredictable order. So if it were somehow possible to implement the imperative versions of push
and pop
in Haskell, we would have two problems: push
would never be evaluated because it produces no result, and even if it were, its evaluation could be swapped with the subsequent pop
. Monadic bind
forces the ordered evaluation of actions by introducing explicit data dependency. If pop
follows push
in the chain of bind
s, pop
cannot be evaluated before push
because its argument is the calculator that is returned by push
. The two are linked by data dependency which, by the way, is not so obvious in the do
notation.
Conclusion
The state monad is a very interesting pattern from the programming point of view. Instead of doing something, you create an action that is executed (maybe even multiple times) later. The monadic scaffolding provides the standard amenities like the ability to compose actions, and the do
notation makes writing functions that produce functions much more natural. There is an interesting variation of the state monad called the IO monad, which is used for input and output in Haskell. I describe it in Appendix 2.
There are many patterns in imperative languages that have elements, or sometimes just hints, of a state monad. For instance, in the OO world you might encounter a very useful Command Pattern. You can “bind” command objects using the Composite Pattern. In languages that support anonymous functions and closures, like JavaScript, C# and, recently C++, you can return functions from functions directly. This might help, for instance, in dealing with inversion of control, where you return a closure as an event handler (that would be material for another series of blog posts).
But I have in mind a very specific example that I’ve been working on in C++ that fits the monadic pattern perfectly, and I’m going to write about it in the next installment of this series.
I’d like to thank Eric Niebler for valuable comments on the draft of this post.
Appendix 1
The full implementation of a stack-based calculator requires a few more Haskell tricks. First, we have to explicitly define our type constructor. I’ll call the new type Calculation
, with the type constructor CL
that encapsulates an action:
newtype Calculation a = CL (Calc -> (Calc, a))
Monadic functions have to return this new type, so they all wrap their actions into a Calculation
.
pushCalc n = CL (\(Calc lst) -> (Calc (n:lst), ())) topCalc = CL (\(Calc lst) -> (Calc lst, head lst)) popCalc = CL (\(Calc lst) -> (Calc (tail lst), head lst)) addCalc = CL (\(Calc lst) -> let (a:b:rest) = lst in (Calc ((a + b):rest), ()))
Haskell has a built-in type class for Monad
s (think of a type class as a C++ concept). We have to tell Haskell that our Calculation
is an instance of Monad
and provide the definition of the two associated functions: bind
, using infix notation, and return
.
instance Monad Calculation where return x = CL (\calc -> (calc, x)) CL(c) >>= cont = CL (\calc -> let (calc', v) = c calc CL c' = cont v in c' calc')
With those definitions, our add
function can be written using the do
notation.
add x y = do pushCalc x pushCalc y addCalc r <- popCalc return r
Since we are not expecting any values to be calculated by pushCalc
or addCalc
, there are no left arrows accompanying them in the do
notation.
Appendix 1a: The Applicative
Haskell keeps evolving and, on occasion, the changes in the language break old code. In particular the Monad
class has a superclass now, called Applicative
. That’s why the Monad
instance for Calculation
won’t compile any more, unless you explicitly add the instance for Applicative
. Fortunately, the Applicative
functionality can be fully implemented using the Monad
interface, as in:
instance Applicative Calculation where pure = return mf <*> mx = do f <- mf x <- mx return (f x)
This won’t compile either, because Applicative
requires Functor
as its superclass. So we have to make Applicative
a Functor
. The simplest is to let the compiler work it out, but you have to include this line at the very top of the file:
{-# LANGUAGE DeriveFunctor #-}
and modify the definition of Calculation
:
newtype Calculation a = CL (Calc -> (Calc, a)) deriving Functor
In fact, it’s possible to implement the calculator as an Applicative
, without the need for a Monad
instance. But that’s a different story.
Appendix 2: The IO Monad
Strictly speaking a lazy purely functional language like Haskell cannot do any input or output. That’s because the compiler is free to memoize the result of the first call to, say, getChar
and elide all subsequent calls. Calls to putChar
, which don’t return anything, may be ignored altogether. This is why most functional languages cheat when it comes to I/O. But not Haskell. Monads to the rescue!
Let’s think for a moment why getChar
may return different characters every time it’s called. It’s because there is a keyboard somewhere out there that changes its state. Why is it changing its state? Because there is a human who pushes the keys. Why is the human pushing the keys? Because he or she got a phone call from China that the price of rice is about to go up, so it makes sense to buy some futures. And so on… In other words there is this external world that is full of state.
What if we encapsulate that whole world in a hidden data structure and write our program as an action that operates on it? This is very similar to the state monad pattern except that here the programmer has no access to the state and cannot execute the action. The action is produced by main
and passed to the system for execution. It’s the system, wink wink, that has access to “the world” and may pass it as a state argument to that action. Of course it’s all smoke and mirrors, but it successfully staves off the insanity of the external world from impinging on the purity of Haskell.
How does it work in practice? There is a monad called IO
. It’s almost like a state monad, except that its type can’t be expressed in Haskell, because it would have to look something like this:
type IO a = World -> (World, a)
and we don’t know what World
is. The main
function in a Haskell program is a monadic IO action, usually:
main :: IO ()
with the type parameter a
replaced by unit, ()
, (although any other type will work too).
The simplest main
is just a single IO
action:
main = putStrLn "Hello World!"
but in general main
is a do
block.
You might ask: If a Haskell program is one monadic IO action, then where does the traditional functional code go? The answer is that you can call traditional functions from anywhere in the do
block. Even in my highly biased example there were several non-monadic function calls (head
, tail
, operator (+)
, etc.). Imagine a Haskell program as a tree: it’s trunk is monadic IO, and so are all the thick branches that have anything to do with I/O. But the thinner branches and leaves are your run of the mill functions that get (lazily) evaluated only when the main IO
action is executed by the system.
Another interesting observation is that all functions that perform I/O have this information encoded in their type signature. Not in the type of its arguments, mind you, but in the return type. This is almost the opposite of what you see in imperative languages, where you’d have to pass some kind of I/O object (file or stream) to a procedure that performs I/O (except when that object is global, like standard input/output in C++). In Haskell, if you want your function to perform I/O, two things must happen: it must return an IO
action that it internally constructs; and the action must find its way to the very top, to the main
function. On the way up, it might be bound with other such actions either using bind
or the do
notation.
You might ask: How does Haskell make sure that all IO actions get to the top, so that the system may execute them? It doesn’t! But consider what you would have to do in order not to pass an action to the top. You’d have to explicitly ignore it, as in:
let _ = putStrLn "Don't print me!"
Ignoring things in Haskell is not a default thing.
April 19, 2011 at 4:26 am
This is a great series of post!
I couldn’t understand though how the world monad is then simply abstracted out? Does GHC say something like ‘assume it exists, and it can have whatever attributes it requires’?
One thing I’d like to point out is that when I read the website after zooming in, the code that overflows is completely hidden, no horizontal scrollbar either. So it would be super awesome if that could be fixed.
May 4, 2011 at 7:34 am
The article appears to be missing. I only see the first comment. I checked the HTML source and article is not there either.
July 11, 2011 at 1:12 pm
[…] Part 3 […]
December 7, 2012 at 5:20 am
This is the first state monad explanation that clicked for me. Thanks!
June 27, 2014 at 10:58 am
Probably the best monads explanation i have ever seen. But it could be just invaluable if you show relation between testing and state monad.
March 30, 2015 at 4:48 pm
Awesome series. Please, continue writing about this! Looking forward to the next installment!
June 23, 2015 at 8:19 pm
Thanks for the tutorial, it’s rather useful. I’ve got lost in one place – your first definition of “add x y” actually returns the sum of two values, while the final does not – it returns a Computation, and without knowing anything about Haskell, I could only use it like so:
Surely there must be a more concise way?
June 24, 2015 at 1:45 pm
@Vladimir: I would use a helper function
eval
:May 24, 2016 at 9:49 pm
First off, really awesome, interesting article. As a math nerd, it is up my alley.
However, as a Java/Javascript programmer (i.e., as an imperative programmer), I’m struggling to see how the state monad is useful in an imperative language like Javascript.
Please note that I do see Monads as being useful in imperative languages. For example, Promises are one of my favorite ways of handling asynchronicity (and a Promise is a monad). The Maybe monad is also useful (because it makes code less cluttered and more readable).
From what I can tell though, the State monad is a necessity in Haskell because…. well you can’t do imperative code (like assignments). But in an imperative language do you really need it?
September 3, 2021 at 8:36 pm
[…] [12] Monads for the Curious Programmer:Part 3.https://bartoszmilewski.com/2011/03/17/monads-for-the-curious-programmer-part-3/.2011. […]