Functional Programming



In the previous installment I introduced monads using two examples: the Maybe monad and the List monad. Admittedly, those weren’t the most exciting uses of monads, although they served the purpose of explaining the underlying theory. I also mentioned that monads were introduced into functional programming as a general solution for a variety of problems. One of those problems was representing stateful computations as functions.

Functions, State, and Side Effects

Here’s the problem: In functional languages, every time you call a function with the same arguments, it must return the same result. In fact, if the compiler detects such a situation, it is free to memoize the result of the first call and skip subsequent calls altogether. A stateful computation, on the other hand, might return different results every time it’s performed. It may, for instance, access some global or static variables. It may also modify those variables– in other words have side effects. In extreme cases a computation might be performed purely for side effects and not even bother to return any results.

This kind of behavior is often troublesome even in imperative programming. The use of global variables in particular is being discouraged. A better solution is to encapsulate the state and pass it explicitly to functions that use it. As a syntactic shortcut, in object-oriented languages, some of the state is regularly passed to functions (methods) as a hidden “this” or “self” argument. There’s even a syntax for composing such functions, as in this JavaScript snippet:

with(document) {
    var t = title;
    write(t + " and more");
}

Here, title is a property and write a method of the object document. (If you put on your monadic glasses, it almost looks like do notation.)

In functional languages we have one more limitation: we cannot mutate any data. There’s a standard way to overcome this limitation: Instead of modifying the data, you create a modified copy. This doesn’t even have to be expensive, if the language supports smart data structures that silently substitute references for copies whenever possible. Most operations on lists in Haskell are optimized this way, and there’s even a language, Clojure, at whose core are “persistent” data structures that behave as if they were immutable, but do a lot of sharing behind the scenes. Immutability is a very attractive feature when you are doing concurrent programming: access to immutable data requires no synchronization.

Taking all this into account, the way to translate stateful computations into a functional language is to use functions that explicitly take state (encapsulated in some data structure) and return the (possibly modified) state together with the usual return value. For instance, a C++ “function”:

int pop() {
    auto v = glob.top();
    glob.pop();
    return v;
}

that operates on a global stack, glob, of the type std::stack<int>, would be turned into a Haskell function:

pop (ST lst) = (ST (tail lst), head lst)

operating on the state of type Stack:

newtype Stack = ST [Int]

The constructor ST creates a Stack from a list of integers. This constructor is also used for pattern matching, as in the argument of pop. The function head returns the first element of a list, tail returns the rest.

The signature of pop is characteristic of functions that operate on state:

pop :: Stack -> (Stack, Int)

Such functions are often called “actions.”

There are two problems with this scheme: (1) It’s awkward, and (2) It doesn’t fit our monadic approach. In this example, the original computation (as expressed in C++) takes no arguments and returns an integer. Its functional counterpart takes a state and returns an integer combined with the state. It’s not clear how one would bind such functions together and use the convenient do notation.

We are on the right track though. We just need to get even more general: We need to separate the construction of an action from its execution. Our basic blocks will be functions that return actions– we’ll call them “monadic functions.” Since an action is a function, we’ll be dealing with functions returning functions, that is, higher-order functions.

Our goal is to find a way to compose monadic functions into larger monadic functions. A composite monadic function will return a composite action. We will then execute such an action on a state and get our result.

This new description fits the monadic pattern much better. We start with a generic stateful computation that takes an argument of type a and returns a value of type b, and we turn it into a (monadic) function that takes type a and returns an enriched type based on b. This time, though, the enriched type is a function type– an action. In general, an action is a function that takes a state (of some type S) and returns a tuple consisting of the (possibly modified) state and the value of type b.

S -> (S, b)

Here’s the first element of the monad– a type constructor: For any type t it defines a new type: an action to calculate a value of type t given a state. This type constructor is part of the state monad. Before we get to a more formal definition, let’s do some practice exercises.

The Monadic Calculator

There’s a perfect example of a stateful computation: a stack-based calculator. The state in this case is represented by the type Calc:

newtype Calc = Calc [Int]

that hides a list of integers– our calculator’s stack.

First, let’s define a monadic function (a function that returns an action) that pops an element off the calculator’s stack. It will be a function returning a function, so we need to use lambdas.

popCalc = 
    \(Calc lst) -> (Calc (tail lst), head lst)

The body of the lambda is almost identical to the implementation of pop above. Notice that popCalc takes no arguments. Rather, the function that it produces takes a calculator as an argument and returns the calculator back, paired with the result–the value at the top of the stack. In other words, popCalc returns a promise to calculate the top of the calculator’s stack when the stack is available.

Here’s how you can use popCalc. First, you call it with no arguments and record the returned action. Next, you create a calculator (with a non-empty stack, otherwise the next line of code would bomb). You apply the action to that calculator and record the result– you pattern-match it to a tuple consisting of a changed calculator and a number. Finally you display that number. This is the actual output of a Haskell interpreter session:

> let f = popCalc
> let calc = Calc [3, 2, 1]
> let (calc', x) = f calc
> x
3

While we’re at it, we can similarly implement a pushCalc function:

pushCalc n =
    \(Calc lst) -> (Calc (n:lst), ())

Notice that the lambda produced by pushCalc returns a modified calculator (argument n is prepended to the list) paired with a special value () of the type unit— a Haskell equivalent of void. The imperative equivalent of this function would return void and work only through side effects. Notice also that the lambda is actually a closure: it captures the outer variable n for later use.

Finally, we need a function that performs some calculation; after all we are implementing a calculator:

addCalc =
    \(Calc lst) -> let (a:b:rest) = lst
                   in
                       (Calc ((a + b):rest), ())

Here I’m matching the calculator’s list with the pattern (a:b:rest) to retrieve the top two elements. The modified calculator has the sum of those two elements on the top of its stack.

We can use all these functions in combination to perform more complex operations, like adding two numbers. Here’s a piece of code that might rival some of Rube Goldberg’s creations:

add x y =
    let pux = pushCalc x -- promise to push x
        puy = pushCalc y -- promise to push y
        axy = addCalc    -- promise to add top numbers
        pp = popCalc     -- promise to pop the result
        calc = Calc []   -- we need a calculator
        (calc1, _) = pux calc  -- actually push x
        (calc2, _) = puy calc1 -- actually push y
        (calc3, _) = axy calc2 -- actually add top numbers
        (_, z) = pp calc3      -- actually pop the result
    in
        z  -- return the result
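
For the record, the contraption does work. A sample interpreter session (assuming the definitions above are in scope) might look like this:

> add 2 3
5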

But what we really want is to be able to combine smaller actions into larger actions. For that we have to define bind. The signature of bind, in this case, should be:

bind :: (Calc -> (Calc, a)) ->        -- action
        (a -> (Calc -> (Calc, b))) -> -- continuation
        (Calc -> (Calc, b))           -- new action

I have highlighted our enriched types–the action types. This signature looks much more complex than the signature of the Maybe bind, but that’s only because the enriched type is itself a function type. Other than that, the structure is the same: bind accepts an action and a continuation and returns a new action. The continuation in this case takes an argument of type a (the value to be calculated by the first action) and returns the composite action.

In fact, if we define Action as a type alias:

type Action a = Calc -> (Calc, a)

the signature of bind can be abbreviated to:

bind :: Action a -> (a -> Action b) -> Action b

Now for the implementation. Since the result of bind is an action, it has to return a lambda of the appropriate signature.

bind act cont =
    \calc -> ... produce (Calc, b) tuple ...

Bind is supposed to compose the action, act, with the continuation, cont. So it should first apply act to calc.

let (calc', v) = act calc

The result is a tuple (calc', v) with a new calculator and a value v of type a.

This is the v that the continuation expects, so the next step is to apply the continuation:

act' = cont v

The result of the continuation is a new action. This new action can then be executed, that is applied to the new calculator:

act' calc'

to produce the desired result– a tuple of the type (Calc, b).

Here’s the final code:

bind act cont =
    \calc ->
        let (calc', v) = act calc
            act' = cont v
        in
            act' calc'

To complete our construction of the monad, we need to define return. The signature of return is:

return :: a -> Action a

and the implementation is pretty straightforward. It takes a value v and returns a promise to return this value.

return v = \calc -> (calc, v)

An astute reader might notice that nothing in this construction depends on the peculiarities of the type Calc. It will work for any type that is used to represent state. So we have in fact just constructed a generic state monad. The stack-based calculator is just a silly example of that monad.
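
To make that genericity explicit, here is a minimal sketch of the same construction parameterized over an arbitrary state type (the names are illustrative; Haskell’s standard library packages this idea, with more ceremony, as the State monad):

type GenAction s a = s -> (s, a)

bindGen :: GenAction s a -> (a -> GenAction s b) -> GenAction s b
bindGen act cont =
    \s -> let (s', v) = act s
          in (cont v) s'

returnGen :: a -> GenAction s a
returnGen v = \s -> (s, v)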

It’s not difficult to implement bind as an infix operator, >>=, and turn the calculator into a monad that’s recognizable by the Haskell compiler (see Appendix 1). Then the relevant part of the add function may be rewritten in the do notation:

add x y = do
    pushCalc x
    pushCalc y
    addCalc
    r <- popCalc
    return r

Let me present the same code without any syntactic sugar, using the cascading lambda-within-lambda notation:

add x y =
  bind (pushCalc x) 
       (\() -> bind (pushCalc y)
                    (\() -> bind addCalc
                                 (\() -> bind popCalc
                                              (\z -> return z))))

This is not something you will see often in Haskell programs, but I will eventually want to go beyond Haskell. My goal is to connect back with C++, and this is the form that’s most convenient for making such a transition.

So let’s painstakingly analyze this code. We are binding the first action, (pushCalc x), to the rest of the code. The rest of the code is expressed as one huge lambda. To make these two parts fit together, their types have to match. The value produced by the action pushCalc is void (a.k.a. “unit”)– so its type is Action (). Therefore the lambda to which it binds must also take void, hence the notation:

\() -> ...

The body of that lambda is another bind, and so on, until we get to the interesting part, which is popCalc.

popCalc is an action that calculates a value: its signature is Action Int. This value is passed to the lambda to which popCalc is bound. Therefore this last lambda takes an Int argument, z. Finally, this value is enclosed in an action, and that’s done by the function return.

This unsugared notation elucidates one more aspect of the monadic approach that’s very relevant in the context of Haskell. Haskell is a lazy language: it doesn’t evaluate anything unless it is strictly necessary for achieving some final goal. Also, when it needs to evaluate several independent things, it will do that in some arbitrary, unpredictable order. So if it were somehow possible to implement the imperative versions of push and pop in Haskell, we would have two problems: push would never be evaluated because it produces no result, and even if it were, its evaluation could be swapped with the subsequent pop. Monadic bind forces the ordered evaluation of actions by introducing explicit data dependency. If pop follows push in the chain of binds, pop cannot be evaluated before push because its argument is the calculator that is returned by push. The two are linked by data dependency which, by the way, is not so obvious in the do notation.

Conclusion

The state monad is a very interesting pattern from the programming point of view. Instead of doing something, you create an action that is executed (maybe even multiple times) later. The monadic scaffolding provides the standard amenities like the ability to compose actions, and the do notation makes writing functions that produce functions much more natural. There is an interesting variation of the state monad called the IO monad, which is used for input and output in Haskell. I describe it in Appendix 2.

There are many patterns in imperative languages that have elements, or sometimes just hints, of a state monad. For instance, in the OO world you might encounter a very useful Command Pattern. You can “bind” command objects using the Composite Pattern. In languages that support anonymous functions and closures, like JavaScript, C# and, recently C++, you can return functions from functions directly. This might help, for instance, in dealing with inversion of control, where you return a closure as an event handler (that would be material for another series of blog posts).

But I have in mind a very specific example that I’ve been working on in C++ that fits the monadic pattern perfectly, and I’m going to write about it in the next installment of this series.

I’d like to thank Eric Niebler for valuable comments on the draft of this post.

Appendix 1

The full implementation of a stack-based calculator requires a few more Haskell tricks. First, we have to explicitly define our type constructor. I’ll call the new type Calculation, with the type constructor CL that encapsulates an action:

newtype Calculation a = CL (Calc -> (Calc, a))

Monadic functions have to return this new type, so they all wrap their actions into a Calculation.

pushCalc n =
    CL (\(Calc lst) -> (Calc (n:lst), ()))

topCalc = 
    CL (\(Calc lst) -> (Calc lst, head lst))

popCalc =
    CL (\(Calc lst) -> (Calc (tail lst), head lst))

addCalc =
    CL (\(Calc lst) -> let (a:b:rest) = lst
                       in
                           (Calc ((a + b):rest), ()))

Haskell has a built-in type class for Monads (think of a type class as a C++ concept). We have to tell Haskell that our Calculation is an instance of Monad and provide the definition of the two associated functions: bind, using infix notation, and return.

instance Monad Calculation where
    return x = 
        CL (\calc -> (calc, x))
    CL(c) >>= cont =
        CL (\calc ->
            let (calc', v) = c calc
                CL c' = cont v
            in
                c' calc')

With those definitions, our add function can be written using the do notation.

add x y = do
    pushCalc x
    pushCalc y
    addCalc
    r <- popCalc
    return r

Since we are not expecting any values to be calculated by pushCalc or addCalc, there are no left arrows accompanying them in the do notation.
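
To actually run the composite action you have to unwrap the CL constructor. The appendix doesn’t need such a helper, but if you want to play with the code in the interpreter, a one-liner like this will do (the name runCalculation is just a suggestion):

runCalculation :: Calculation a -> Calc -> (Calc, a)
runCalculation (CL act) calc = act calc

With it, runCalculation (add 2 3) (Calc []) evaluates to the pair (Calc [], 5): the two pushes and the addition leave 5 on the stack, and the final pop takes it off and returns it as the value.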

Appendix 1a: The Applicative

Haskell keeps evolving and, on occasion, the changes in the language break old code. In particular the Monad class has a superclass now, called Applicative. That’s why the Monad instance for Calculation won’t compile any more, unless you explicitly add the instance for Applicative. Fortunately, the Applicative functionality can be fully implemented using the Monad interface, as in:

instance Applicative Calculation where
  pure = return
  mf <*> mx = do f <- mf
                 x <- mx
                 return (f x)

This won’t compile either, because Applicative requires Functor as its superclass. So we have to make Calculation a Functor as well. The simplest way is to let the compiler work it out, but you have to include this line at the very top of the file:

{-# LANGUAGE DeriveFunctor #-}

and modify the definition of Calculation:

newtype Calculation a = CL (Calc -> (Calc, a))
  deriving Functor
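
If you’d rather not rely on the extension, the Functor instance is short enough to write by hand. A sketch:

instance Functor Calculation where
    fmap f (CL act) =
        CL (\calc -> let (calc', v) = act calc
                     in (calc', f v))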

In fact, it’s possible to implement the calculator as an Applicative, without the need for a Monad instance. But that’s a different story.

Appendix 2: The IO Monad

Strictly speaking a lazy purely functional language like Haskell cannot do any input or output. That’s because the compiler is free to memoize the result of the first call to, say, getChar and elide all subsequent calls. Calls to putChar, which don’t return anything, may be ignored altogether. This is why most functional languages cheat when it comes to I/O. But not Haskell. Monads to the rescue!

Let’s think for a moment why getChar may return different characters every time it’s called. It’s because there is a keyboard somewhere out there that changes its state. Why is it changing its state? Because there is a human who pushes the keys. Why is the human pushing the keys? Because he or she got a phone call from China that the price of rice is about to go up, so it makes sense to buy some futures. And so on… In other words there is this external world that is full of state.

What if we encapsulate that whole world in a hidden data structure and write our program as an action that operates on it? This is very similar to the state monad pattern except that here the programmer has no access to the state and cannot execute the action. The action is produced by main and passed to the system for execution. It’s the system, wink wink, that has access to “the world” and may pass it as a state argument to that action. Of course it’s all smoke and mirrors, but it successfully staves off the insanity of the external world from impinging on the purity of Haskell.

How does it work in practice? There is a monad called IO. It’s almost like a state monad, except that its type can’t be expressed in Haskell, because it would have to look something like this:

type IO a = World -> (World, a)

and we don’t know what World is. The main function in a Haskell program is a monadic IO action, usually:

main :: IO ()

with the type parameter a replaced by unit, (), (although any other type will work too).

The simplest main is just a single IO action:

main = putStrLn "Hello World!"

but in general main is a do block.
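
For instance, a main that does a bit of interactive input and output might look like this (a generic example, unrelated to the calculator):

main :: IO ()
main = do
    putStrLn "What's your name?"
    name <- getLine
    putStrLn ("Hello, " ++ name ++ "!")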

You might ask: If a Haskell program is one monadic IO action, then where does the traditional functional code go? The answer is that you can call traditional functions from anywhere in the do block. Even in my highly biased example there were several non-monadic function calls (head, tail, operator (+), etc.). Imagine a Haskell program as a tree: its trunk is monadic IO, and so are all the thick branches that have anything to do with I/O. But the thinner branches and leaves are your run-of-the-mill functions that get (lazily) evaluated only when the main IO action is executed by the system.

Another interesting observation is that all functions that perform I/O have this information encoded in their type signature. Not in the types of their arguments, mind you, but in their return types. This is almost the opposite of what you see in imperative languages, where you’d have to pass some kind of I/O object (file or stream) to a procedure that performs I/O (except when that object is global, like standard input/output in C++). In Haskell, if you want your function to perform I/O, two things must happen: it must return an IO action that it internally constructs; and the action must find its way to the very top, to the main function. On the way up, it might be bound with other such actions either using bind or the do notation.

You might ask: How does Haskell make sure that all IO actions get to the top, so that the system may execute them? It doesn’t! But consider what you would have to do in order not to pass an action to the top. You’d have to explicitly ignore it, as in:

let _ = putStrLn "Don't print me!"

Ignoring things in Haskell is not a default thing.


In my previous post I talked about categories, functors, endofunctors, and a little about monads. Why didn’t I just start with a Haskell definition of a monad and skip the math? Because I believe that monads are bigger than that. I don’t want to pigeonhole monads based on some specific language or application. If you know what to look for you’ll find monads everywhere. Some languages support them better, as a single general abstraction, others require you to re-implement them from scratch for every use case. I want you to be able to look at a piece of C++ code and say, “Hey, that looks like a monad.” That’s why I started from category theory, which sits above all programming languages and provides a natural platform to talk about monads.

In this installment I will talk about some specific programming challenges that, on the surface, seem to have very little in common. I will show how they can be solved and how from these disparate solutions a pattern emerges that can be recognized as a monad. I’ll talk in some detail about the Maybe monad and the List monad, since they are simple and illustrate the basic ideas nicely. I will leave the really interesting examples, like the state monad and its variations, to the next installment.

Challenges of Functional Programming

You know what a function is: called with the same argument it always returns the same result. A lot of useful computations map directly to functions. Still, a lot don’t. We often have to deal with some notions of computations that are not functions. We can describe them in words or implement as procedures with side effects. In non-functional languages such computations are often called “functions” anyway– I’ll enclose such usage with quotation marks. Eugenio Moggi, the guy who introduced monads to computing, listed a few examples:

  • Partial “functions”: For some arguments they never terminate (e.g., go into an infinite loop). These are not really functions in the mathematical sense.
  • Nondeterministic “functions”: They don’t return a single result but a choice of results. Non-deterministic parsers are like this. When given an input, they return a set of possible parses. Which of them is right depends on the context. These are not really functions because a function must return a single value.
  • “Functions” with side effects: They access and modify some external state and, when called repeatedly, return different results depending on that state.
  • “Functions” that throw exceptions: They are not defined for certain arguments (or states) and they throw exceptions instead of returning a result.
  • Continuations.
  • Interactive input: getchar is a good example. Its result depends on what hits the keyboard.
  • Interactive output: putchar is a good example. It’s not a function because it doesn’t return anything (if it returns an error code, it doesn’t depend on the argument). Still, it can’t be just optimized away if we want any output to appear on the screen.

The amazing thing is that, using some creative thinking, all these problems can be solved using pure functional methods. I won’t discuss all of them, but I’ll show you a series of progressively more interesting examples.

Error Propagation and Exceptions

Who doesn’t deal with error propagation? Let’s dissect the problem: You are defining a computation that can return a valid result only for a subset of possible arguments. So what happens when the argument is “wrong”? Well, you can return an error.

In the simplest case you might add one extra bit, success or failure, to your result. Sometimes you can spare a bit in the result type: If the correct result is always a non-negative number, you may return negative one as an error. Or, if the result is a pointer, you might return a null pointer instead. But these are just small hacks introduced for the sake of performance. And, like most hacks, they can lead to dangerous code. Consider this (likely incorrect) code:

size_t off = fileName.find('.');
string ext = fileName.substr(off, fileName.length() - off);

If fileName contains no dot, the result is a special value npos signifying an error. The problem is that npos is of the same type as a non-error result, so it can be passed quietly to string::substr, which will then blow up at run time (here, by throwing an out_of_range exception).

A more general and safer solution is to change– extend– the type of the result to include an error bit. You can do it for any type of result. In Haskell, you just define a type constructor called Maybe:

data Maybe a = Nothing | Just a

It takes an arbitrary type a and defines a new type that adds the special value Nothing to the set of possible values of a. In C++ that would be equivalent to a template:

template<class T>
struct Maybe {
    T just; // valid if 'nothing' is false
    bool nothing;
};

(This is just an example. I’m not arguing that Maybe would be very useful in C++ which has other mechanisms for error propagation.)

So here we have a way to turn a computation that’s not defined for all values into a function that’s defined over the whole domain, but which returns a new richer type.

The next question is, do these new things– functions returning Maybe— compose? What should the caller do with the result of such a function when it’s part of a larger computation? One thing is for sure, this result cannot be passed directly to an unsuspecting function–the error cannot be easily ignored– which is good. If we replace find by a new function safe_find that returns a Maybe<size_t>, the client won’t call substr with it. The types wouldn’t match. Instead, the result of safe_find must be unpacked and (much more likely than before) tested.

Maybe<size_t> off = safe_find(fileName, '.');
string ext;
if (!off.nothing)
    ext = fileName.substr(off.just, fileName.length() - off.just);

Notice what happened here: By changing the return type of a function we broke the natural way functions compose– the result of one becoming the input of the next. On the other hand, we have come up with a new way of composing such functions. Let me chain a few of such compositions for better effect:

Maybe<Foo> foo = f(x);
if (!foo.nothing) {
    Maybe<Bar> bar = g(foo.just);
    if (!bar.nothing) {
        Maybe<Baz> baz = h(bar.just);
        if (!baz.nothing) {
            ...
        }
    }
}

I have highlighted the elements of the “glue,” which is used to compose our new, more error-conscious functions. Notice how this boilerplate glue gets in the way of code clarity. Ideally, we would like to write something like this:

DO
{
    auto y = f(x);
    auto v = g(y);
    auto z = h(v);
    return z;
}

where DO would magically provide the glue that’s implicit in the chaining of f, g, and h.

It’s not easy to do this– abstract the glue away– in C++. I’m not even going to try. But in Haskell it’s a different story. Let’s start with the almost direct translation of the C++ code into Haskell:

compose n =
    let m = f n in
    case m of
    Nothing -> Nothing
    Just n1 ->
        let m1 = g n1 in
        case m1 of
        Nothing -> Nothing
        Just n2 -> 
            let n3 = h n2 in
            n3

The if statements are replaced by Haskell’s pattern matching (case x of). The ms are the Maybes and the ns are their contents (if they aren’t Nothings).

Notice the characteristic cascading style of this code– the nested conditionals or pattern matches. Let’s analyze one level of such a cascade. We start with a Maybe value (one returned by f n). We unpack it and examine the contents. If the result is not an error (Just n1), we continue with the rest of the body of compose. I have highlighted the “rest of the body” in blue (the “glue” is still in red).

What’s also important is that, if the result is an error (the Nothing branch of the pattern match) the whole “rest of the body” is skipped. In order to abstract the glue, we have to abstract the “rest of the body” as well. Such an abstraction is called a continuation. Let’s write this continuation as a lambda (lambdas in Haskell are written using the backslash, which is nothing but the Greek letter λ missing one leg):

\ n1 -> 
    let m1 = g n1 in
    case m1 of
    Nothing -> Nothing
    Just n2 -> 
        let n3 = h n2 in
        n3

And here’s the trick: We can abstract the glue as a (higher-order) function that takes a Maybe value and a continuation. Let’s call this new function bind and rewrite compose with its help:

compose n =
    let m = f n in
    -- the first argument is m, the second is the whole continuation
    bind m (\n1 ->
                let m1 = g n1 in
                case m1 of
                Nothing -> Nothing
                Just n2 -> 
                    let n3 = h n2 in
                    n3)

Here’s how bind is implemented:

bind m f = 
    case m of 
    Nothing -> Nothing  -- bypass the continuation
    Just n -> f n       -- pass n to the continuation

or, more compactly,

bind Nothing cont  = Nothing
bind (Just n) cont = cont n

Figure 1 illustrates the complete code transformation. The result of f n, which is a Maybe, is passed to bind, represented by a blue box. Inside bind the Maybe is unpacked. If its value is Nothing, nothing happens. If its value is Just n1, the rest of the code, the continuation, is called with the argument n1. The continuation itself is rewritten using bind, and so on. The final continuation calls return, which I will explain shortly.

Fig 1. Composition of functions that return Maybe results

The position of bind, the blue box in Figure 1, between its Maybe argument and the continuation suggests that infix notation might be more appropriate. Indeed, in Haskell bind is represented by the infix operator, >>=:

Nothing  >>= cont = Nothing
(Just x) >>= cont = cont x

(The left-hand side of the equal sign is the operator between its two arguments [actually, patterns], and the right-hand side is the body of the function.) We can express the type signature of the bind function as:

(>>=) :: Maybe a -> (a -> Maybe b) -> Maybe b

It takes a Maybe a and a function from a to Maybe b (that’s our continuation). It returns a Maybe b. When dealing with higher order functions I find type signatures extremely useful; so much so that I often use them as comments in my C++ meta-code.
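
Before moving on, it may help to see the operator acting on its own. With the definitions above (which agree with the standard Maybe instance), we have:

Just 4  >>= (\x -> Just (x + 1))   -- evaluates to Just 5
Nothing >>= (\x -> Just (x + 1))   -- evaluates to Nothing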

The complete rewrite of compose using >>= looks like this:

compose1 n =
    f n >>= \n1 ->
        g n1 >>= \n2 ->
            h n2 >>= \n3 ->
                return n3

Now is the time to explain the mysterious return at the end of the function. No, it’s not the keyword for returning a result from a function. It’s a function that takes an argument of type a and turns it into a Maybe a:

return n = Just n

We need return because the result of compose is a Maybe. If any of the functions returns Nothing, compose returns Nothing (it’s part of the definition of >>=). Only when all functions succeed do we call return with the correct result, n3. It turns n3 into Just n3, thus announcing the success of the computation and encapsulating the final result.

You might say that using return instead of Just n3 is overkill, but there are good reasons to do that. One is to encapsulate and hide direct access to the implementation of Maybe. The other has to do with the ways this pattern is generalized beyond Maybe.

The Maybe Monad

What does Maybe have to do with monads? Let’s see what we have done so far from a more general perspective.

We started from a computation which didn’t behave like a function– here, it was not defined for all arguments. We found a clever way to turn it into a function by enriching its return type. Such general “enriching” of types can be expressed as a type constructor.

From a category-theoretical point of view, defining a type constructor, which is a mapping from types to types, gets us half way towards defining an endofunctor. The other part is the mapping of functions (morphisms). For any function that takes values of type a and returns values of type b we need a corresponding function that acts between the enriched types. In Haskell, the mapping of functions to functions can be expressed as a higher-order function. When this mapping is part of an endofunctor, the corresponding higher-order function is called fmap.

Let’s call the type constructor M. It constructs the enriched type M a from any type a. The function fmap has the following signature:

fmap :: (a -> b) -> (M a -> M b)

It maps a function from a to b to a function from M a to M b. In the case of Maybe we already know the type constructor part, and fmap will follow soon.

Let’s go back to our computation turned function returning an enriched type. In general, its signature is:

f :: a -> M b

Since we care about composability of computations, we need a way to glue the enriched functions together. In the process of gluing we usually have to inspect the enriched type and make decisions based on its value. One such decision might be whether to continue with the rest of the computation or not. So the general bind operation must take two arguments: the result of the previous enriched function and the rest of the computation– a continuation:

bind :: M a -> (a -> M b) -> M b

If you squint enough, you can see a similarity between bind and fmap.

Here’s how you do it: Imagine that you supply bind with its second argument, the continuation. There is still one free parameter of the type M a. So what’s left is a function that takes M a and returns M b. (This is not exactly currying, since we are binding the second argument rather than the first, but the idea is the same.) That’s exactly the type of function that occurs on the right hand side of the signature of fmap. Seen from that point of view, bind maps functions into functions:

(a -> M b) -> (M a -> M b)

The function on the left-hand side is our continuation; the one on the right-hand side is bind with its second argument fixed to the continuation.

This signature is really close to that of fmap, but not quite. Luckily, we still have a second family of functions at our disposal, the polymorphic return. The signature of return is:

return :: a -> M a

Using both bind and return we can lift any function f:

f :: a -> b

to a function g:

g :: M a -> M b

Here’s the magic formula that defines g in terms of f, bind, and return (the dot denotes regular function composition):

g ma = bind ma (return . f)

Or, using infix notation:

g ma = ma >>= (return . f)

This mapping of f into g becomes our definition of fmap. Of course, fmap must obey some axioms, in particular it must preserve function composition and interact correctly with unit (which I will discuss shortly). These axioms translate back into corresponding conditions on bind and return, which I’m not going to discuss here.
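
To make the formula concrete, here is what it gives for Maybe (the name fmapMaybe is used only to avoid clashing with the Prelude’s fmap):

fmapMaybe :: (a -> b) -> Maybe a -> Maybe b
fmapMaybe f ma = ma >>= (return . f)

-- fmapMaybe (+1) (Just 2)  evaluates to  Just 3
-- fmapMaybe (+1) Nothing   evaluates to  Nothing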

Now that we have recovered the functor, we need the two other parts of a monad: unit and join. Guess what, return is just another name for unit. For any type, it lifts an element of that type to an element of the enriched type in the most natural manner (incidentally, the word natural has a very precise meaning in category theory; both unit and join are natural transformations of the functor (M, fmap)).

To recover join, let’s look at its type signature in Haskell:

join :: M (M a) -> M a

It collapses one layer of the type constructor. In the case of Maybe it should map a double Maybe to a single Maybe. It’s easy to figure out that it should map Just (Just x) into Just x and anything else to Nothing. But there is a more general way of defining join using bind combined with the identity function:

join :: M (M a) -> M a
join mmx = mmx >>= id

We are binding a value of type M (M a) to a function id. This function, when acting on M a returns M a. You can convince yourself that these signatures match the general signature of bind (see Appendix); and that this definition, when applied to Maybe, produces the correct result.
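
Specialized to Maybe, the same one-liner indeed produces the behavior described above. A quick check:

joinMaybe :: Maybe (Maybe a) -> Maybe a
joinMaybe mmx = mmx >>= id

-- joinMaybe (Just (Just 3))  evaluates to  Just 3
-- joinMaybe (Just Nothing)   evaluates to  Nothing
-- joinMaybe Nothing          evaluates to  Nothing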

To summarize: What we have done here is to show that Maybe together with bind and return is a monad in the category-theory sense. Actually, we have shown something far more general: Any triple that consists of a type constructor and two functions bind and return that obey certain identities (axioms) define a monad. In category theory this triple is called a Kleisli triple and can be used as an alternative definition of a monad.

Syntactic Sugar

So far you’ve seen only the Maybe example, so it might seem like we are pulling out a monadic cannon to kill a mosquito. But, as you’ll see later, this same pattern pops up in a lot of places in various guises. In fact it’s so common in functional languages that it acquired its own syntactic sugar. In Haskell, this sugar is called the do notation. Let’s go back to the implementation of our function compose:

compose n =
    f n >>= \n1 ->
        g n1 >>= \n2 ->
            h n2 >>= \n3 ->
                return n3

Here’s exactly the same code in do notation:

compose n = do
    n1 <- f n
    n2 <- g n1
    n3 <- h n2
    return n3

This looks deceptively similar to an imperative program. In fact here’s a C++ version of it (for the sake of the argument let’s assume that f takes a pointer to Foo and h returns an integer) :

int compose(Foo * n)
{
    auto n1 = f(n);
    auto n2 = g(n1);
    auto n3 = h(n2);
    return n3;
}

Uh, one more thing. This is how you would call it:

try {
    compose(pFoo);
}
catch(...) {
    // error handling
}

In C++ you get virtually the same functionality not by modifying the return type, but by throwing an exception. (Now you may utter your first cry, “Hey, that looks like a monad!”, when looking at C++ code.)

Just like in our Haskell example, once any of the functions reports an error (throws an exception), the rest of the body of compose is skipped. You might say that C++ exceptions offer more power than the Maybe monad and you’ll be right. But that’s because I haven’t shown you the Error monad and the Exception monad.

Where Haskell’s Exception monad beats C++ exceptions is in type checking. Remember the unfortunate attempt at adding exception specifications to the C++ type system? (Java went the same way.) Here’s what the C++ standard says:

An implementation shall not reject an expression merely because when executed it throws or might throw an exception that the containing function does not allow.

In other words, exception specifications are bona fide comments. Not so in Haskell! If a function returns a Maybe or Exception, that becomes part of its type signature which is checked both at the call site and at the function definition site. No cheating is allowed, period.

But the major difference between Haskell’s and C++’s approach is that the do notation generalizes to all monads, whereas C++’s neat try/catch syntax applies only to exceptions.

A word of caution when reading Haskell monadic code. Despite similarities in usage, Haskell’s left arrow is not an assignment. The left hand side identifier corresponds to the argument of the continuation (which is the rest of the do block). It is the result of unpacking the outcome of the right hand side expression. This unpacking always happens inside bind.

Non-deterministic Computations

In the previous post I introduced the list monad by defining a functor and two families of functions. The type constructor part of the functor mapped any type a into the list of a, [a]. The part that acted on functions (now we know that, in general, it’s called fmap; but for lists it’s just map) worked by applying the function to each element of the list. The unit and the join were polymorphic functions. unit was defined as:

 unit x = [x]

and join as concat.

Now I can tell you that the list monad provides the solution to the problem of implementing non-deterministic computations. Imagine parsing a non-deterministic grammar. A production might offer multiple alternative parse trees for the same input.

These types of computations may be simulated with functions returning lists of possible results. Here’s the same trick again: whatever type is returned by a deterministic parser (e.g., a parse-tree type), it’s now turned into an enriched type (a parse tree list).

We’ve already seen the category-theory construction of the list monad but here we are facing a slightly different problem: how to string together functions that return lists. We know that we have to define bind. Its signature, in this case, would be:

bind :: [a] -> (a -> [b]) -> [b]

It takes a list (which is the result of a previous function) and a continuation, (a -> [b]), and must produce another list. The obvious thing is to apply the continuation to each element of the input list. But since the continuation also produces a list, we’ll end up with a list of lists. To get a single list as the result, we’ll simply concatenate the sublists. Here’s the final implementation:

xs >>= cont = concat (map cont xs)

The application of the continuation to each element of the input list is done using map.
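
A tiny worked example: binding the list [1, 2] to a continuation that produces two results for every input,

[1, 2] >>= \x -> [x, x * 10]

map yields [[1, 10], [2, 20]], and concat flattens it to [1, 10, 2, 20].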

Is it a coincidence that we were able to define bind in terms of join and fmap? Not at all. The general formula for converting the functor definition of a monad to a Kleisli triple is:

  • Take the object-mapping part of the functor (the type constructor)
  • Define bind as
    bind x f = join ((fmap f) x)

    where fmap is the part of the functor that maps morphisms.

  • Define return as unit

Now we know how to move between those two definitions back and forth.

Just as in the case of Maybe, we can apply the do notation to functions returning lists:

toss2Dice = do
    n <- tossDie
    m <- tossDie
    return (n + m)

Here, tossDie returns the list of all possible outcomes of a die toss, and toss2Dice returns the list of all possible sums of the outcomes of a two-die toss.
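
The function tossDie itself can be as simple as the list of all six outcomes; one plausible definition:

tossDie :: [Int]
tossDie = [1..6]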

An interesting observation is that the list monad is closely related to list comprehensions, down to the use of left arrows (in Haskell). For instance, the above example is equivalent to:

toss2Dice = [n + m | n <- tossDie, m <- tossDie]

Conclusions

There is a large class of computations that convert input to output in a non-functional way. Many of those can be expressed as functions that return “enriched” output. This enrichment of the output can be expressed in terms of a type constructor. This type constructor defines the first component of a monad.

Computations have to be composable, so we need a way of composing the enriched functions. This introduces the second component of the monad, the bind family of higher-order functions.

Finally, we should be able to construct functions that return enriched types, and for that we need the third component of the monad, the return family of functions. They convert regular values into their enriched counterparts.

I’ve shown you two examples of Haskell monads in action, but the real treat will come in the third installment of my mini-series. I’ll show you how to define and compose functions that, instead of returning regular values, return functions that calculate those values.

Appendix: Join From Bind

By popular demand (see comments), I decided to explain in more detail the typing of the formula:

join :: M (M a) -> M a
join mmx = mmx >>= id

At first look it seems like id doesn’t have the right signature for the second parameter to bind:

bind :: M a' -> (a' -> M b') -> M b'

(I changed the names of type arguments to a' and b' to avoid confusion.) However, notice that here we are calling bind with the first argument, mmx, of type M (M a). It means that a' in this particular instantiation of bind is (M a). The second argument to bind is id with the signature:
id :: c -> c
Here, id will be called with c equal to (M a), so its return type will also be (M a). With those substitutions, bind will also return type (M a):

M (M a) -> (M a -> M a) -> M a

Thus the whole combination (mmx >>= id) ends up with the right type signature for join.

Bibliography

  1. Eugenio Moggi, Notions of Computation and Monads. This is a hard core research paper that started the whole monad movement in functional languages.
  2. Philip Wadler, Monads for Functional Programming. The classic paper introducing monads into Haskell.
  3. Tony Morris, What Does Monad Mean?
  4. Brian Beckman, Don’t fear the Monad
  5. Graham Hutton, Erik Meijer, Monadic Parsing in Haskell.
  6. Brian McNamara, Yannis Smaragdakis, Syntax sugar for FC++: lambda, infix, monads, and more. A C++ template library for functional programming that imitates Haskell pretty closely.

The Monad is like a bellows:
it is empty yet infinitely capable.
The more you use it, the more it produces;
the more you talk about it, the less you understand.

–Monad Te Ching

I don’t know if I’m exaggerating but it seems like every programmer who gets monads posts a tutorial about them. (And each post begins with: There’s already a lot of monad tutorials on the Internet, but…) The reason is that getting monads is like a spiritual experience that you want to share with others.

When facing a monad, people often behave like the three blind men describing an elephant. You’ll see monads described as containers and monads described as actions. Some people see them as a cover-up for side effects, others as examples of endofunctors in Category Theory.

Monads are hard to describe because they don’t correspond to anything in our everyday experience. Compare this with Objects in Object-Oriented programming. Even an infant knows what an object is (something you can put in your mouth). What do you do with a monad?

But first, let me answer the pertinent question:

Why Bother?

Monads enable pure functional programmers to implement mutation, state, I/O, and a plethora of other things that are not functions. Well, you might say, they brought it on themselves. They tied their hands behind their backs and now they’re bragging that they can type with their toes. Why should we pay attention?

The thing is, all those non-functional things that we are so used to doing in imperative programming are also sources of a lot of troubles. Take side effects for instance. Smart programmers (read: the ones who burnt their fingers too many times) try to minimize the use of global and static variables for fear of side effects. That’s doable if you know what you’re doing. But the real game changer is multithreading. Controlling the sharing of state between threads is not just good programming practice– it’s a survival skill. Extreme programming models are in use that eliminate sharing altogether, like Erlang’s full isolation of processes and its restriction of message passing to values.

Monads stake the ground between the total anarchy of imperative languages and the rigid dictatorship of Erlang-like isolationism. They don’t prohibit sharing or side effects but let you control them. And, since the control is exercised through the type system, a program that uses monads can be checked for correctness by the compiler. Considering how hard it is to test for data races in imperative programs, I think it’s worth investing some time to learn monads.

There is also a completely different motivation: metaprogramming. The template language used for metaprogramming in C++ is a pure functional language (see my blog post, What does Haskell have to do with C++?). If monads are so important in functional programming, they must also pop up in C++ metaprogramming. And indeed they do. I hope to discuss this topic in a future post.

So what’s a monad?

A Categorical Answer

If you don’t know anything about category theory, don’t get intimidated. This is really simple stuff and it will clarify a lot of things, not to mention earning you some bragging rights. My main goal is to share some intuitions from mathematics that will build foundations for a deeper understanding of monads in programming. In this installment I will explain categories, functors, and endofunctors, leading up to monads. I will give examples taken both from everyday life and from programming. I will really get into monads and their practical applications in the next installment, so be patient.

Categories

A category is a natural extension of our notion of sets and functions. The generalization of a set in a category is called an object (a pretty neutral term with little semantic ballast), and the generalization of a function is called a morphism. In fact, the standard example of a category is the category of sets and functions called (capital letter) Set.

A morphism (read “function”) goes from one object (read “set”) to another. Mathematical functions like sin or exp usually go from the set of real numbers to the set of real numbers. But you may also define functions like isPrime that go from natural numbers to Booleans, or a function price that goes from a set of goods to the set of numbers.

The only thing a mathematician needs to know about morphisms is that they can be composed. If you have a morphism from A to B, A->B, and another going from B to C, B->C, then they can be composed to a morphism from A to C, A->C. And just like the standard composition of functions, morphism composition must be associative, so we don’t need parentheses when composing more than two of them.

Actually, two things. There must be, for every object, a special morphism called identity that essentially does nothing and when composed with any other morphism reproduces the same morphism.

Just to throw you off the track, a category doesn’t have to be built on sets and functions. You can easily construct simple categories from blobs and arrows. Fig 1 shows such a category that contains two objects and four morphisms: arrows between them (formally, those arrows are ordered pairs of objects so, for instance, f is a pair (A, B)). You can easily check that any two morphisms can be composed and that the two morphisms iA and iB serve as identities.

Fig 1. A simple category with two objects and four morphisms.

That’s it! Hopefully I have just convinced you that a category is not a big deal. But let’s get down to Earth. The one category that’s really important in programming languages is the category of types and functions, in particular its Haskell version called Hask. There usually is a finite set of basic types like integers or Booleans, and an infinite set of derived types, like lists of integers, functions from integers to Booleans, etc. In Hask, a type is just a set of values. For instance, the type Char is a set {‘a’, ‘b’, ‘c’, … }.

So, in the category Hask, types are objects and functions are morphisms. Indeed, a function maps one type into another (forget for a moment functions of multiple arguments– they can be modeled with currying– and polymorphic functions– they are families of functions). And these are functions in the functional-programming sense: called with the same values they return the same values–no side effects allowed.

Function composition is just passing the result of one function as an argument to another. The identity function takes x and immediately returns it back.
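
In Haskell these are the everyday (.) and id. Written out explicitly, just as a sketch:

compose :: (b -> c) -> (a -> b) -> (a -> c)
compose g f = \x -> g (f x)    -- the Prelude calls this (.)

identity :: a -> a
identity x = x                 -- the Prelude calls this id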

This is all fine, but what’s in it for me, you might ask. So here’s the first insight and a moment of Zen. If there is one thing that you can call the essence of programming, it’s composability. In any style of programming you always compose your program from smaller pieces, and those pieces from even smaller pieces, and so on. That’s why categories with their composable morphisms are so important. The essence of Lego blocks is the way they fit together, their composability, not the color or size. The essence of functional programming is how functions work together: how you can build larger functions from smaller ones.

Every category is defined by its choice of objects and morphisms. But is there something that can characterize a given category that’s independent of its choice of particular objects and morphisms? How do you expose the inner structure of a particular category? Mathematicians know exactly how to do that. You have to be able to map categories into other categories while preserving some constraints imposed by the way morphisms are attached to objects and the way they compose. Such maps let you find similarities between categories and catalog different kinds of categories. That’s when things get really interesting.

Functors

A functor, F, is a map from one category to another: it maps objects into objects and morphisms into morphisms. But it can’t do it in a haphazard way because that would destroy the very structures that we are after. So we must impose some “obvious” (mathematicians love that word) constraints.

First of all, if you have a morphism between two objects in the first category then it better be mapped into a morphism between the corresponding objects in the second category. Fig 2 explains this diagrammatically. Object A is mapped into F(A), object B into F(B). A morphism f from A to B is mapped into a morphism F(f) from F(A) to F(B). Mathematicians say that such a diagram must commute, that is the result must be the same whether you go from A to F(A) and then apply F(f), or first apply f and then go from B to F(B).


Fig 2. Diagram showing the action of a functor F on objects A and B and a morphism f. The bottom part lives in F's domain (source) category, the top part in its codomain (the target).

Moreover, such mapping should preserve the composition property of morphisms. So if morphism h is a composition of f and g, then F(h) must be a composition of F(f) and F(g). And, of course, the functor must map identity morphisms into identity morphisms.

To get a feel for how constrained functors are by these conditions, consider how you could map the category in Fig 1 into itself (such a functor just rearranges things inside one category). There are two trivial mappings that collapse both objects into one (either A or B), and turn all morphisms into identity. Then there is the identity functor that maps both objects into themselves and all morphisms into themselves. Finally, there is just one “interesting” functor that maps A into B and B into A with f and g switching roles. Now imagine a similar category but with the g arrow removed (yes, it’s still a category). Suddenly there is no functor other than the collapsing ones between Fig 1 and that new category. That’s because the two categories have completely different structure.

Let me now jump into more familiar territory. Since we are mostly interested in one category, Hask, let me define a functor that maps that category into itself (such functors are called endofunctors). An object in Hask is a type, so our functor must map types into types. The way to look at it is that a functor in Hask constructs one type from another– it’s a type constructor. Don’t get confused by the name: a type constructor creates a new type in your program, but that type has already existed in Hask.

A classical example is the list type constructor. Given any type it constructs a list of that type. Type Integer is mapped into list of integers or, in Haskell notation, [Integer]. Notice that this is not a map defined on integer values, like 1, 2, or 3. It also doesn’t add a new type to Hask— the type [Integer] is already there. It just maps one type into another. For C++ programmers: think of mapping type T into a container of T; for instance, std::vector<T>.

Mapping the types is the easy part; what about functions? We have to find a way to take a particular function and map it into a function on lists. That’s also easy: apply the function to each element of the list in turn. There is a (higher-order) function in Haskell that does it. It’s called map and it takes a function and a list and returns a new list (or, because of currying, you may say that it takes a function and returns a function acting on lists). In C++ there is a corresponding template function called std::transform (well, it takes two iterators and a function object, but the idea is the same).

Mathematicians often use diagrams to illustrate the properties of morphisms and functors (see Fig 2). The arrows for morphisms are usually horizontal, while the arrows for functors are vertical (going up). That’s why the mapping of morphisms under a functor is often called lifting. You can take a function operating on integers and “lift it” (using a functor) to a function operating on lists of integers, and so on.
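
Here is what lifting looks like in code: an ordinary function on integers and its lifted counterpart acting on lists (the names are illustrative):

double :: Int -> Int
double n = n * 2

liftedDouble :: [Int] -> [Int]
liftedDouble = map double    -- map double [1, 2, 3] evaluates to [2, 4, 6]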

The list functor obviously preserves function composition and identity (I’ll leave it as an easy but instructive exercise for the reader).

And now for another moment of Zen. What’s the second most important property of programming? Reusability! Look what we have just done: We took all the functions we’ve implemented so far and lifted them to the level of lists. We’ve got functions operating on lists essentially for free (well, we’ve got a small but important subset of those functions). And the same trick may be applied to all kinds of containers, arrays, trees, queues, unique_ptrs and more.
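As a quick illustration of the same trick applied to a different container, here’s a sketch of lifting a function to std::unique_ptr (the name fmapPtr is made up; an empty pointer simply stays empty):

#include <memory>

// Hypothetical lift for unique_ptr: apply f to the contents, if there are any.
template<class T, class F>
auto fmapPtr(F f, std::unique_ptr<T> const & p) -> std::unique_ptr<decltype(f(*p))> {
    typedef decltype(f(*p)) U;
    if (!p)
        return std::unique_ptr<U>();          // nothing inside, nothing to do
    return std::unique_ptr<U>(new U(f(*p)));  // apply f and wrap the result again
}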

It’s all beautiful, but you don’t really need category theory to apply functions to lists. Still it’s always good to see patterns in programming, and this one is definitely a keeper. The real revolution starts with monads. And, guess what, the list functor is actually a monad. You just need a few more ingredients.

What’s the intuition behind the statement that mappings expose the structure of the system? Consider the schematic of the London underground in Fig 3. It’s just a bunch of circles and lines. It’s only relevant because there is a mapping between the city of London and this schematic. The circles correspond to tube stations and the lines to train connections. Most importantly, if trains run between two stations, the corresponding circles in the diagram are connected by lines and vice versa: these are the constraints that the mapping preserves. The schematic shows a certain structure that exists in London (mostly hidden underground) which is made apparent by the mapping.

Fig 3. The schematic map of London underground system.

Interestingly, what I’m doing here is also a mapping: London and the underground map correspond to two categories. Train stations/circles are objects and train connections/lines are morphisms. How’s that for an example?

Endofunctors

Mathematicians love mappings that preserve “obvious” constraints. As I explained, such mappings abstract inner structures away from the details of implementation. But you can also learn a lot about structure by studying non-trivial mappings of a category into itself. Functors that map a category into itself are called endofunctors (like endoscopes, they let you look inside things). If functors expose similarities, endofunctors expose self-similarities. Take one look at the fractal fern, Fig 4, and you’ll understand how powerful self-similarity can be.

Fractal Fern

Fig 4. This fractal fern was generated using just four endomorphisms.

With a little bit of imagination you can see the list functor exposing fern-like structures inside Hask (Fig 5). Chars fan out into lists of Chars, which then fan out into lists of lists of Chars, and so on, ad infinitum. Horizontal structures described by functions from Char to Bool are reflected at higher and higher levels as functions on lists, lists of lists, etc.

Fig 5. The action of the list type constructor reveals fractal-like structure inside Hask. The functor lifts things up, the functions act horizontally.

A C++ template that takes a type parameter could be considered a type constructor. How likely is it that it also defines a functor (loosely speaking– C++ is not as mathematized as Haskell)? You have to ask yourself: Is the type parameter constrained in any way? It’s often hard to say, because type constraints are implicit in the body of a template and are tested only during instantiation. For instance, the type parameter for a std::vector must be copyable. That eliminates, for instance, classes that have private or deleted (in C++0x) copy constructors. This is not a problem though, because copyable types form a subcategory (I’m speaking really loosely now). The important thing is that a vector of a copyable type is itself copyable, so the “endo-” part of the endofunctor holds. In general you want to be able to feed the type created by the type constructor back to the type constructor, as in std::vector<std::vector<Foo>>. And, of course, you have to be able to lift functions in a generic way too, as in std::transform.

Monads

Ooh, Monads!
–Haskell Simpson

It’s time to finally lift the veil. I’ll start with the definition of a monad that builds on the previous sections and is mostly used by mathematicians. There is another one that’s less intuitive but easier to use in programming. I’ll leave that one for later.

A monad is an endofunctor together with two special families of morphisms, both going vertically, one up and one down (for “directions” see Fig 5). The one going up is called unit and the one going down is called join.

Now we are juggling a lot of mappings so let’s slow down to build some intuition. Remember, a functor maps objects: in our case, types, which are sets of values. The functor doesn’t see what’s inside the objects; morphisms, in general, do. In our case, a morphism is a function that maps values of one type into values of another type. Our functors, which are defined by type constructors, usually map poorer types into richer types; in the sense that type Bool is a set that contains just two elements, True and False, but type [Bool] contains infinitely many lists of True and False.

Unit takes a value from the poorer type, then picks one value from the richer type, and pronounces the two roughly equivalent. Such a rough equivalent of True from the Bool object is the one-element list [True] from the [Bool] object. Similarly, unit would map False into [False]. It would also map integer 5 into [5] and so on.

Unit can be thought of as immersing values from a lower level into the higher level in the most natural way possible. By the way, in programming we call a family of functions defined for any type a polymorphic function. In C++, we would express unit as a template, like this:

template<class T>
std::vector<T> unit(T value) {
    std::vector<T> vec;
    vec.push_back(value);
    return vec;
}

To explain join, imagine the functor acting twice. For instance, from a given type T the list functor will first construct the type [T] (list of T), and then [[T]] (list of list of T). Join removes one layer of “listiness” by joining the sub-lists. Plainly speaking, it just concatenates the inner lists. Given, for instance, [[a, b], [c], [d, e]], it produces [a, b, c, d, e]. It’s a many-to-one mapping from the richer type to the poorer type and the type-parameterized family of joins also forms a polymorphic function (a template, in C++).
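Staying with the C++ analogy, here’s a minimal sketch of join for vectors, a companion to the unit template above (my own helper, not a standard function):

#include <vector>

// Hypothetical join: remove one layer of "vectorness" by concatenating the inner vectors.
template<class T>
std::vector<T> join(std::vector<std::vector<T>> const & vv) {
    std::vector<T> result;
    for (auto it = vv.begin(); it != vv.end(); ++it)
        result.insert(result.end(), it->begin(), it->end());
    return result;
}

Notice that join(unit(v)) gives back v, which is one way in which unit and join cancel each other, one of the monadic axioms mentioned below.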

There are a few monadic axioms that define the properties of unit and join (for instance that unit and join cancel each other), but I’m not going to elaborate on them. The important part is that the existence of unit and join imposes new constraints on the endofunctor and thus exposes even more structure.

Mathematicians look at join as the grandfather of all multiplication with unit being its neutral element. It’s heaven for mathematicians because multiplication leads to algebraic structures and indeed monads are great for constructing algebras and finding their hidden properties.

Unlike mathematicians, we programmers are not that interested in algebraic structures. So there must be something else that makes monads such a hit. As I mentioned in the beginning, in programming we often face problems that don’t naturally translate into the functional paradigm. There are some types of computations that are best expressed in imperative style. It doesn’t mean they can’t be translated into functions, it’s just that the translation is somewhat awkward and tedious. Monads provide an elegant tool to do this translation. Monads made possible the absorption and assimilation of imperative programming into functional programming, so much so that some people claim (tongue in cheek?) that Haskell is the best imperative language. And like all things functional monads are bound to turn around and find their place in imperative programming. But that’s material for my next blog post.


Learning a new programming paradigm is like learning a foreign language. You learn the new vocabulary, the grammar, a few idioms, but you still formulate your thoughts in your native tongue. It takes years of immersion before you start thinking in a foreign language. In programming, the litmus test comes when you’re approaching a new problem. Will you formulate your solution in terms of the old or the new paradigm? Will you see procedures, objects, or functions?

I remember proposing an object-oriented approach to the design of a content index back in my Microsoft years. The reaction was: It’s a great paradigm, but it’s not applicable to this particular problem. There are no “objects” in the content index. Indeed, you can’t find Employees and Payrolls, Students and Courses, DisplayableObjects and LightRays in the content index. But after a short brainstorm we discovered such exotic objects as a Resource Manager or a Master Merge. We ended up with a fine piece of OO engineering that is still part of the Windows shell after all those years.

There’s a similar, if not larger, shift in thinking when you learn functional programming, especially if you come from an OO background. Initially you can’t help but see everything through the perspective of mutable data structures and loops. There are no obvious “functions” in your favorite problem. Functions are these weird stateless things–they always produce the same results when called with the same arguments. They are good in mathematics, but not in real-life programming.

And yet, there are people who write complete applications using functional (or at least hybrid) languages. One such example is the game The Path of Go created by Microsoft Research for the Xbox. It’s not a spectacular game as far as UI goes, but it plays some mean Go and it’s written in F#.

F# is a mostly functional language (based on ML) with support for object-oriented programming and access to the rich .NET libraries. In this respect it’s similar to Scala. Roughly: F# is to .NET what Scala is to JVM. Both languages were designed by excellent teams. F# was designed by Microsoft Research in Cambridge, England. The chief architect of F# is Don Syme. Scala is the brainchild of Martin Odersky.

I decided to learn F# and immerse myself in the new paradigm. So I had to pick a random problem, not one with obvious functional implementation, and start from scratch, design and all. In practice, I had to do a lot of experimenting in order to familiarize myself with the language and the library. Experimenting, by the way, was made relatively easy by the inclusion of an F# interpreter in Visual Studio 2010.

To learn F#, I used only online documentation which, as I found out, is not that good. There are also fewer online discussions about F# than, say, about Haskell.

The Problem

Without further ado, let me describe the challenge:

Write a program that finds duplicate files on disk.

In particular, I was interested in finding duplicate image files, but for testing I used text files. The program should therefore concentrate on files with particular extensions. It should also be able to skip directories that I’m not interested in. The duplicates (or triplicates, etc.) don’t have to have the same names but have to have identical extensions and contents.

The Design

Not surprisingly, functional programming requires a major adjustment. It’s one thing to read somebody else’s code and admire the tricks, but a completely different thing to be designing and writing functional code from scratch. But once you get the hang of it, it actually becomes easy and quite natural.

The most important thing you notice when using functional languages is that types are mostly unobtrusive due to type inference, but type checking is very strong. Essentially, if you manage to compile your program, it usually runs correctly. You spend much less time debugging (which I found rather difficult in F#) and much more time figuring out why the types don’t match. Of course, you have to learn a whole new language of error messages.

So it’s definitely a steep learning curve but once you’re over the hump you start reaping the benefits.

The naive approach to solving my problem would be to list all files on disk (recursively, starting with the root directory) and compare each with each. That would scale very poorly, O(N²), so we need to do some pruning.

Let’s first group files by extension and size. There will be a lot of singleton groups– containing only one file with a particular combination of extension and size. Let’s eliminate them from consideration. After that we’ll be dealing with much smaller groups of files so, within those groups, we can do full-blown byte-by-byte comparisons. Strict comparisons will potentially split those groups into even smaller groups. Again, we should eliminate the resulting singletons. Finally, we should be able to print the resulting lists of lists of identical files.

For an imperative programmer the first impulse would be to use a lot of looping; e.g., for each file retrieve its extension and size, etc. An object-oriented programmer would use vectors, hash tables, and looping over iterators.

How would a functional programmer approach the subject? Iteration is out of the question. Recursion and immutable lists are in. Quite often functions operating on lists can be expressed as list comprehensions. But there’s an even better tool called sequences in F#. They’re sort of like iterators, but with some very nice compositional properties. Sequences can be composed using pipelining. So let me express the above design as a pipeline.

The Pipeline

This is the refinement of the original design that takes into account data structures: sequences, lists, and tuples in various combinations.

  1. The source for the pipeline is a sequence of file paths coming from a recursive enumerator.
  2. The first operation is to group the files that share the same key: in our case the key will be a tuple (file extension, file size).
  3. The next operation is to filter out groups of length one, the singletons.
  4. Since the grouping injected keys into our stream, we need to strip them now and convert groups to lists.
  5. Now we group byte-wise equal files within each group.
  6. Then we remove singletons within those subgroups,
  7. Flatten lists of lists, and
  8. Print the results.

There are only two stages that deal with technical details of data structures: the stripping of the keys and the flattening of the lists. Everything else follows from high-level design.

Here’s the pipeline in its full functional glory. The |> symbol is used to forward the results of one stage to the next.

enumFilesRec 
  (filterOutPaths ["c:\\Windows";"c:\\ProgramData";"c:\\Program Files"])
  (filterExt [".jpg"; ".gif"])
  "c:\\Multimedia" 
|> Seq.groupBy (fun pth->(Path.GetExtension pth, (FileInfo pth).Length))
|> Seq.filter (fun (_, s) -> (Seq.length s) > 1)
|> Seq.map (fun (_, sq) -> [for path in sq -> path]) 
|> Seq.map groupEqualFiles
|> Seq.map filterOutSingletons
|> Seq.collect Seq.ofList
|> Seq.iter (fun lst -> printfn "%A" lst)

I will go through it line by line shortly.

I realize that this is a handful and if you have no previous experience with functional programming you are likely to feel overwhelmed at some point. The important thing is to observe how the original design translates almost one-to-one into implementation. Notice also the points of customization–they are almost universally plugs for user-defined functions. For instance, you customize Seq.map, Seq.filter, or Seq.collect by passing functions, often lambdas, as their arguments. Also, look how the function enumFilesRec is used. I decided to make its first two arguments functions even though my first impulse was to directly pass lists of directories to be skipped and extensions to be accepted. This way my design will work even if I later decide to filter files by, say, time of creation or size.

The Stages

Here’s the line by line reading of the pipeline code. My suggestion is to read as far as your patience permits and then skip to conclusions.

  1. I’m calling my function enumFilesRec with three arguments:
    enumFilesRec 
      (filterOutPaths ["c:\\Windows";"c:\\ProgramData";"c:\\Program Files"])
      (filterExt [".jpg"; ".gif"])
      "c:\\Multimedia"
    1. A directory filter: a function (predicate) that returns true for all directories except the ones listed as arguments to filterOutPaths. It’s worth mentioning that filterOutPaths is a function that returns another function — the predicate expected by enumFilesRec.
    2. A file filter: a function that returns true only for listed extensions. Again, filterExt is a function that takes a list and returns a predicate.
    3. The top directory: the root of the listing.

    enumFilesRec returns a sequence of paths. Since the sequence is only evaluated on demand, the call to this function returns almost immediately.

  2. The next stage of the pipeline:
    |> Seq.groupBy (fun p->(Path.GetExtension p, (FileInfo p).Length))

    applies Seq.groupBy to the incoming sequence of paths. Seq.groupBy takes one argument– a function that takes a path and generates a key:

    fun path -> (Path.GetExtension path, (FileInfo path).Length)

    The key is the tuple consisting of file extension and file length:

    (Path.GetExtension path, (FileInfo path).Length)

    F# notation for anonymous functions (lambdas) is of the form:

    fun x -> expr

    The function Seq.groupBy groups all elements of the sequence into subgroups that share the same key. The result is a sequence of sequences (the groups). Of course, to perform this step the whole input sequence must be scanned. That forces the actual listing of directories on disk, which takes the bulk of the run time.

  3. The next stage performs Seq.filter on the sequence:
    |> Seq.filter (fun (_, s) -> (Seq.length s) > 1)

    Seq.filter takes a predicate– here defined by a lambda– and applies it to all elements of the sequence; passing through only those that satisfy the predicate. This is the predicate:

    fun (_, s) -> (Seq.length s) > 1

    Notice that the previous step produced a sequence whose elements were tuples of (key, subsequence) with the subsequences sharing the same key. The lambda pattern-matches these tuples, (_, s), ignoring the key and testing the length of the subsequence against one. That eliminates singleton groups.

  4. We can now get rid of the keys and convert the subsequences into plain lists that will be needed for further processing.
    |> Seq.map (fun (_, sq) -> [for path in sq -> path])

    I use the workhorse of sequences, Seq.map, that applies a function to every element of the sequence. Remember that the element is still a tuple (key, subsequence). The lambda ignores the key and returns a list:

    fun (_, sq) -> [for path in sq -> path]

    The expression:

    [for path in sq -> path]

    enumerates the paths in the sequence sq and uses them to initialize a list (the brackets denote a list in F#). In functional programming such constructs are known as list comprehensions. The expression for path in sq -> path is called a generator.

  5. The next stage looks deceptively simple:
    |> Seq.map groupEqualFiles

    It applies a function, groupEqualFiles, to each list in the sequence. The interesting work happens in that function, which I will analyze shortly. Suffice it to say that it produces a list of sublists of identical files. Some of the sublists may be singletons.

    It might be a little hard to keep track of all those sequences, subsequences, and sublists. A good development environment will show you all the types while you’re developing the program. You may also sketch simple examples:

    seq[ [[a; a]; [b]]; [[c; c; c]; [d; d]; [e]] ]

    This one shows a sequence of lists of lists of identical elements analogous to the output of the last stage.

  6. Next I apply another function, filterOutSingletons to each list of sublists.
    |> Seq.map filterOutSingletons

    I end up with a sequence of lists of sublists of length greater than one containing identical files. The sequence above would be transformed to:

    seq[ [[a; a]]; [[c; c; c]; [d; d]] ]
  7. In the next step I flatten this hierarchy using Seq.collect.
    |> Seq.collect Seq.ofList

    Seq.collect takes a function that turns each element of the original sequence into a sequence and concatenates all those sequences into one. Like this:

    seq[ [a; a]; [c; c; c]; [d; d] ]

    Remember that the element of our sequence is a list of sublists. We can easily convert such a list to a sequence by applying Seq.ofList to it. It creates a sequence of sublists, and Seq.collect will concatenate all such sequences. I end up with a big sequence of lists. Those lists contain identical files. Voila!

  8. The final step is to print those lists.
    |> Seq.iter (fun lst -> printfn "%A" lst)

    I apply Seq.iter, which takes a void function (a function returning unit, in the F# parlance):

    fun lst -> printfn "%A" lst

    (which is really not a function because it has a side effect of printing its argument–a list). Seq.iter is just like Seq.map, but it consumes its input sequence producing nothing (except for side effects). Unlike Haskell, F# doesn’t track I/O side effects in the type system.

Details

For those who are really curious, I can go on filling in the details–the implementations of various functions used in the pipeline. Those functions use a variety of functional features of F# such as lists, recursion, pattern matching, etc. This is the bread and butter of functional programming.

Let me start with the function that enumerates files in a directory tree. The idea is to first list the files in the current directory and pass them through the file filter; then list the subdirectories, filter them through the directory filter, and recurse into each subdirectory. Since this function is the first stage of the pipeline, it should produce a sequence.

let rec enumFilesRec dirFilter fileFilter dir =
   seq {
      yield! 
         enumFilesSafe dir
         |> Seq.filter fileFilter
      yield!
         enumDirsSafe dir
         |> Seq.filter dirFilter
         |> Seq.map (fun sub -> enumFilesRec dirFilter fileFilter sub)
         |> Seq.collect id
   }

Monads Anyone?

I don’t want to scare anyone, but F# sequences are monads. People usually have strong feelings about monads: some love them, some hate them. But don’t get intimidated by monads. The theory behind them is hard, but the usage is pretty simple.

To create a sequence you use the seq { ... } block sprinkled with yield and yield! statements. When such a sequence is enumerated (you can do it with a loop for instance: for elem in sqnc), each yield returns an element and suspends the execution of the seq block until the next call. The next iteration resumes right after the last yield. In our case we are building a new sequence from existing ones. To dive into another sequence inside the seq block we use yield! (yield bang). This is what the above code does: It first dives into file enumeration (a sequence returned by enumFilesSafe) and then into the enumeration of files in subdirectories.

enumFilesSafe is a function that calls the system’s Directory.EnumerateFiles API (part of the .NET library). I had to encapsulate it into my own function in order to catch (and ignore) the UnauthorizedAccessExceptions. Notice the use of pipelining to filter the paths.

After the sequence of file paths is exhausted, we enter the second yield!. This one starts by enumerating subdirectories. Subdirectory paths are pipelined through the directory filter. Now we have to do something for each subdirectory– that’s the cue to use Seq.map. The mapping function:

fun sub -> enumFilesRec dirFilter fileFilter sub

simply calls enumFilesRec recursively, passing it the filters and the name of the subdirectory. Notice that enumFilesRec returns another sequence, so we end up with a sequence of sequences corresponding to individual subdirectories. To flatten this hierarchy I use Seq.collect. Notice that I pass it the identity function, id, which just returns its argument: The elements of my sequence are sequences and I don’t have to do anything to them.

The second function I’d like to discuss is groupEqualFiles. It gets a list of file paths and splits it into sublists containing byte-wise identical files. This problem can be decomposed in the following way: Let’s pick the first file and split the rest into two groups: the ones equal to that file and the ones not equal. Then do the same with the non-equal group. The “do the same” part hints at recursion. Here’s the code:

let groupEqualFiles paths =
    let rec groupEqualFilesRec soFar lst =
        match lst with 
        | [] -> soFar
        | (file::tail) ->
            let (eq, rest) = groupFilesEqualTo file tail
            if rest.IsEmpty then eq::soFar
            else groupEqualFilesRec (eq::soFar) rest
    groupEqualFilesRec [] paths

A recursive solution often involves defining a recursive function and then calling it with some initial arguments. That’s the case here as well. The recursive function groupEqualFilesRec takes the accumulator, the soFar list, and a list of files to group.

let rec groupEqualFilesRec soFar lst =
    match lst with 
    | [] -> soFar
    | (file::tail) ->
        let (eq, rest) = groupFilesEqualTo file tail
        if rest.IsEmpty then eq::soFar
        else groupEqualFilesRec (eq::soFar) rest

The new trick here is pattern matching. A list can be empty and match the pattern [], or it can be split into the head and tail using the pattern (file::tail). In the first case I return the soFar list and terminate recursion. Otherwise I call another function groupFilesEqualTo with the head and the tail of the list. This auxiliary function returns a tuple of lists: the equal group and the rest. Symbolically, when called with a and [b; a; c; b; d; a] it produces:

([a; a; a], [b; c; b; d])

The tuple is immediately pattern matched to (eq, rest) in:

let (eq, rest) = groupFilesEqualTo file tail

If the rest is empty, I prepend the eq list to the accumulator, soFar. Otherwise I recursively call groupEqualFilesRec with the augmented accumulator and the rest.

The function groupEqualFiles simply calls the recursive groupEqualFilesRec with an empty accumulator and the initial list. The result is a list of lists.

For completeness, here’s the implementation of the recursive function groupFilesEqualTo:

let rec groupFilesEqualTo file files =
   match files with 
   | [] -> ([file], [])
   | (hd::tail) -> 
       let (eqs, rest) = (groupFilesEqualTo file tail)
       if eqFiles file hd then (hd::eqs, rest) else (eqs, hd::rest)

Again, this function pattern-matches the list. If it’s empty, it returns a tuple consisting of the singleton list containing the file in question and the empty rest. Otherwise it calls itself recursively to group the tail. The result is pattern-matched into (eqs, rest). Now the comparison is made between the original file and the head of the original list (we could have done it before making the recursive call, but this way the code is more compact). If they match then the head is prepended to the list of equal files, otherwise it lands in the rest.

Did I mention there would be a test at the end? By now you should be able to analyze the implementation of filterOutSingletons:

let rec filterOutSingletons lstOfLst =
    match lstOfLst with
    | [] -> []
    | (h::t) -> 
        let t1 = filterOutSingletons t
        if (List.length h) > 1 then h::t1 else t1

Conclusions

I am not arguing that we should all switch to functional programming. For one thing, despite great progress, performance is still a problem in many areas. Immutable data structures are great, especially in concurrent programming, but can at times be awkward and inefficient. However, I strongly believe that any programmer worth his or her salt should be fluent with the functional paradigm.

The heart of programming is composition and reuse. In object-oriented programming you compose and reuse objects. In functional programming you do the same with functions. There are myriads of ways you can compose functions, the simplest being pipelining, passing functions as arguments, and returning functions from functions. There are lambdas, closures, continuations, comprehensions and, yes, monads. These are powerful tools in the hands of a skilled programmer. Not every problem fits neatly within the functional paradigm, but neither do all problems fit the OO paradigm. What’s important is having choices.

The code is available on GitHub.


I gave this presentation on Nov 17 at the Northwest C++ Users Group. Here are the links to the two-part video:

  1. Part I
  2. Part II

There was an unresolved discussion about one of the examples–the one about template specialization conflicting with modular type checking. It turns out the example was correct (that is, an error would occur).

template<class T> struct vector {
    vector(int); // has constructor
};

//-- This one typechecks
template<LessThanComparable T>
void f(T x) { vector<T> v(100); ... }

//-- Specialization of vector for int
template<> struct vector<int> {
    // missing constructor!
};

int main() { f(4); } // error

The issue was whether the instantiation of f in main would see the specialization of the vector template for int even though the specialization occurs after the definition of f. Yes, it would, because vector<T> is a dependent name inside f. Dependent names are resolved in the context of template instantiation–here in the context of main, where the specialization is visible. Only non-dependent names are resolved in the context of template definition.


I recently gave a presentation at the Northwest C++ Users Group on message passing. The video and the slides are available online.

Here’s my premise: I needed a flexible system of primitives for passing messages between threads in C++. After studying existing solutions in C++ (Boost, MPI) and in other languages (Haskell MVars and Caml channels and events) I created my own C++ mini-library. It’s general (no restrictions on message types or message storage), composable (allows waiting on multiple polymorphic channels), type-safe, and first-class (you may pass channels inside messages). Considering all this, the library turned out to be surprisingly simple. One thing I gave up on, and I think it was the right decision, is location transparency. Message passing between threads is fundamentally different from message passing between processes or network nodes. Performance and first-class-ness require substantially different interfaces, not to mention different implementations.


The video of my talk, Haskell and C++ Template Metaprogramming, is now available; and so are the slides. I pretty much covered the material from my last blog post, but many people (including me) find a video presentation easier to follow.

This is also a plug for the Northwest C++ Users Group that meets in Redmond every third Wednesday of the month. If you live in Seattle or on the east side of Lake Washington, check it out. You won’t be disappointed.


If you want to understand C++ template metaprogramming (TMP) you have to know functional programming. Seriously. I want you to think of TMP as maximally obfuscated (subset of) Haskell, and I’ll illustrate this point by point. If you don’t know Haskell, don’t worry, I’ll explain the syntax as I go.

The nice thing about single-paradigm languages like Haskell is that they have very simple syntax (think of Lisp that does everything with just a bunch of parentheses). I will start the Haskell-C++TMP mapping with basics, like functions and recursion, but I’ll try to cover a lot more, including higher-order functions, pattern matching, list comprehension (did you know it was expressible in C++?), and more.

Keep in mind that my Haskell examples are runtime functions operating on runtime data whereas their C++ TMP equivalents are compile-time templates operating mostly on types. Operations on types are essential in providing correct and efficient implementations of parameterized classes and functions.

By necessity the examples are simple, but the same mapping may be applied to much more complex templates from the C++ Standard Library and Boost. As a bonus, I’ll also explain the hot new thing, variadic templates.

Functional Approach to Functions

How do you implement useful functions if you don’t have mutable variables, if statements, or loops? To a C++ programmer that might seem like an impossible task. But that’s the reality of C++ compile-time language that forms the basis of TMP. Functional programming to the rescue!

As a warm-up, let’s see how Haskell implements a simple function, the factorial:

fact 0 = 1
fact n = n * fact (n - 1)

The first line states that the factorial of zero is one. The second line defines factorial for a non-zero argument, n (strictly speaking it only works for positive non-zero arguments). It does it using recursion: factorial of n is equal to n times the factorial of n-1. The recursion stops when n is equal to zero, in which case the first definition kicks in. Notice that the definition of the function fact is split into two sub-definitions. So when you call:

fact 4

the first definition is looked up first and, if it doesn’t match the argument (which it doesn’t), the second one comes into play. This is the simplest case of pattern matching: 4 doesn’t match the “pattern” 0, but it matches the pattern n.

Here’s almost exactly the same code expressed in C++ TMP:

template<int n> struct
fact {
    static const int value = n * fact<n - 1>::value;
};

template<> struct
fact<0> { // specialization for n = 0
    static const int value = 1;
};

You might notice how the horrible syntax of C++ TMP obscures the simplicity and elegance of this code. But once you are equipped with the C++/Haskell decoder ring, things become a lot clearer.

Let’s analyze this code. Just like in Haskell, there are two definitions of fact, except that their order is inverted. This is because C++ requires template specialization to follow the template’s general definition (or declaration, as we’ll see later). The pattern matching of arguments in C++ does not follow the order of declarations but rather is based on “best match”. If you instantiate the template with argument zero:

cout << "Factorial of 0 = " << fact<0>::value << endl;

the second pattern, fact<0>, is a better fit. Otherwise the first one, <int n>, is used.

Notice also the weird syntax for “function call”

fact<n>::value

and for the “return statement”

static const int value = n * fact<n - 1>::value;

This all makes sense if you look at templates as definitions of parameterized types, which was their initial purpose in C++. In that interpretation, we are defining a struct called fact, parameterized by an integer n, whose sole member is a static const integer called value. Moreover, this template is specialized for the case of n equal zero.

Now I want you to forget about what I just said and put on the glasses which make the C++ code look like the corresponding Haskell code.

Here’s another example–this time of a predicate (a function returning a Boolean):

is_zero 0 = True
is_zero x = False

Let’s spice it up a little for C++ and define a predicate on types rather than integers. The following compile-time function returns true only when the type T is a pointer:

template<class T> struct
isPtr {
    static const bool value = false;
};

template<class U> struct
isPtr<U*> {
    static const bool value = true;
};

This time the actual argument to isPtr is first matched to the more specialized pattern, U* and, if it fails, the general pattern is used.

We can add yet another specialization, which will pattern-match a const pointer:

template<class U> struct
isPtr<U * const> {
    static const bool value = true;
};

These types of type predicates may be, for instance, used to select more flexible and efficient implementations of parameterized containers. Think of the differences between a vector of values vs. a vector of pointers.
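For a quick compile-time check of this predicate, one could write something like this (static_assert is a C++0x feature; the particular types are arbitrary):

static_assert(isPtr<char*>::value, "char* is a pointer");
static_assert(isPtr<int* const>::value, "a const pointer is still a pointer");
static_assert(!isPtr<int>::value, "int is not a pointer");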

Lists

The basic data structure in functional languages is the list. Haskell’s lists are introduced using square brackets. For instance, a list of three numbers, 1, 2, 3, looks like:

[1, 2, 3]

List processing in functional languages follows the standard pattern: a list is split into head and tail, an operation is performed on the head, and the tail is processed using recursion. The splitting is done by pattern matching: the Haskell pattern being (head:tail) (to be precise, the colon in parentheses represents the cons operation–the creation of a list by prepending an element to an existing list).

Here’s a simple function, count, that calculates the length of a list:

count [] = 0
count (head:tail) = 1 + count tail

The first pattern, [], matches an empty list; the second a non-empty one. Notice that a function call in Haskell doesn’t use parentheses around arguments, so count tail is interpreted as a call to count with the argument tail.

Before C++0x, TMP was severely crippled by the lack of a list primitive. People used separate definitions for a list of zero, one, two, etc., elements, and even used special macros to define them. This is no longer true in C++0x, thanks to variadic templates and template parameter packs. Here’s our Haskell count translated into C++ TMP:

// Just a declaration
template<class... list> struct
count;

template<> struct
count<> {
    static const int value = 0;
};

template<class head, class... tail> struct
count<head, tail...> {
    static const int value = 1 + count<tail...>::value;
};

First we have a declaration (not a definition) of the template count that takes a variable number of type parameters (the keyword class or typename introduces a type parameter). They are packed into a template parameter pack, list.

Once this general declaration is visible, specializations may follow in any order. I arranged them to follow the Haskell example. The first one matches the empty list and returns zero. The second one uses the pattern, <head, tail…>. This pattern will match any non-empty list and split it into the head and the (possibly empty) tail.

To “call” a variadic template, you instantiate it with an arbitrary number of arguments and retrieve its member, value, e.g.,

int n = count<int, char, long>::value; // returns 3

A few words about variadic templates: A variadic template introduces a template parameter pack using the notation class… pack (or int… ipack, etc…). The only thing you may do with a pack is to expand it and pass to another variadic template. The expansion is done by following the name of the pack with three dots, as in tail…. You’ll see more examples later.

Variadic templates have many applications such as type-safe printf, tuples (objects that store an arbitrary number of differently typed arguments), variants, and many more.

Higher-Order Functions and Closures

The real power of functional programming comes from treating functions as first class citizens. It means that you may pass functions to other functions and return functions from functions. Functions operating on functions are called higher-order functions. Surprisingly, it seems like compile-time C++ has better support for higher-order functions than run-time C++.

Let’s start with a Haskell example. I want to define a function that takes two predicate functions and returns another predicate function that combines the two using logical OR. Here it is in Haskell:

or_combinator f1 f2 = 
    \x -> (f1 x) || (f2 x)

The or_combinator returns an anonymous function (the famous “lambda”) that takes one argument, x, calls both f1 and f2 with it, and returns the logical OR of the two results. The return value of or_combinator is this freshly constructed function. I can then call this function with an arbitrary argument. For instance, here I’m checking if 2 is either zero or one (guess what, it isn’t!):

(or_combinator is_zero is_one) 2

I put the parentheses around the function and its arguments for readability, although they are not strictly necessary. 2 is the argument to the function returned by or_combinator.

The lambda that’s returned from or_combinator is actually a closure. It “captures” the two arguments, f1 and f2 passed to or_combinator. They may be used long after the call to or_combinator has returned.

It might take some getting used to it before you are comfortable with functions taking functions and returning functions, but it’s much easier to learn this stuff in Haskell than in the obfuscated C++. Indeed, here’s an almost direct translation of this example:

template<template<class> class f1, template<class> class f2> struct
or_combinator {
    template<class T> struct
    lambda {
        static const bool value = f1<T>::value || f2<T>::value;
    };
};

Since in the metalanguage a function is represented by a template, the template or_combinator takes two such templates as arguments. It “calls” these templates using the standard syntax f<T>::value. Actually, the or_combinator doesn’t call these functions. Instead it defines a new template, which I call lambda, that takes the argument T and calls those functions. This template acts like a closure–it captures the two templates that are the arguments to or_combinator.

Here’s how you may use the or_combinator to combine two tests, isPtr and isConst and apply the result to the type const int:

std::cout 
   << "or_combinator<isPtr, isConst>::lambda<const int>::value = "
   << or_combinator<isPtr, isConst>::lambda<const int>::value 
   << std::endl;

Such logical combinators are essential for predicate composability.

Higher-Order Functions Operating on Lists

Once you combine higher-order functions with lists you have a powerful functional language at your disposal. Higher-order functions operating on lists look very much like algorithms. Let me show you some classic examples. Here’s the function (or algorithm), all, that returns true if and only if all elements of a list satisfy a given predicate.

all pred [] = True
all pred (head:tail) = (pred head) && (all pred tail)

By now you should be familiar with all the techniques I used here, like pattern matching or list recursion.

Here’s the same code obfuscated by the C++ syntax:

template<template<class> class predicate, class... list> struct
all;

template<template<class> class predicate> struct
all<predicate> {
    static const bool value = true;
};

template<
    template<class> class predicate, 
    class head, 
    class... tail> struct
all<predicate, head, tail...> {
    static const bool value = 
        predicate<head>::value 
        && all<predicate, tail...>::value;
};

Except for the initial declaration required by C++ there is a one-to-one paradigm match between the two implementations.
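Here’s a possible compile-time call, a sketch that reuses the isPtr predicate defined earlier:

static_assert(all<isPtr, int*, char*, void**>::value, "all three are pointers");
static_assert(!all<isPtr, int*, double>::value, "double is not a pointer");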

Another useful algorithm, a veritable workhorse of functional programming, is “fold right” (together with its dual partner, “fold left”). It folds a list while accumulating the results (that’s why in runtime C++ this algorithm is called “accumulate”). Here’s the Haskell implementation:

foldr f init [] = init
foldr f init (head:tail) =
    f head (foldr f init tail)

Function f, which is the first argument to foldr, takes two arguments, the current element of the list and the accumulated value. Its purpose is to process the element and incorporate the result in the accumulator. The new accumulated value is then returned. It is totally up to the client to decide what kind of processing to perform, how it is accumulated, and what kind of value is used. The second argument, init, is the initial value for the accumulator.

Here’s how it works: The result of foldr is generated by acting with f on the head of the list and whatever has been accumulated by processing the tail of the list. The algorithm recurses until the tail is empty, in which case it returns the initial value. At runtime this type of algorithm would make N recursive calls before starting to pop the stack and accumulate the results.

For instance, foldr may be used to sum the elements of a list (so_far is the accumulator, which is initialized to zero):

add_it elem so_far = elem + so_far
sum_it lst = foldr add_it 0 lst

The accumulator function is add_it. If, instead, I wanted to calculate the product of all elements, I’d use a function mult_it and the starting value of one. You get the idea.

Here’s the same algorithm in C++ TMP:

template<template<class, int> class, int, class...> struct
fold_right;

template<template<class, int> class f, int init> struct
fold_right<f, init> {
    static const int value = init;
};

template<template<class, int> class f, int init, class head, class...tail> struct
fold_right<f, init, head, tail...> {
    static const int value = f<head, fold_right<f, init, tail...>::value>::value;
};

Once you understand the Haskell version, this complex code suddenly becomes transparent (if it doesn’t, try squinting 😉 ).
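To see the fold in action, here’s a possible usage sketch. The add_size metafunction is my own; it folds a list of types into the total of their sizeof values:

// Accumulator function: add the size of the current type to the running total.
template<class T, int acc> struct
add_size {
    static const int value = sizeof(T) + acc;
};

// "Calling" the fold: computes sizeof(int) + sizeof(char) + sizeof(double) + 0 at compile time.
const int total = fold_right<add_size, 0, int, char, double>::value;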

Lists of Numbers

Let’s now switch to integers for a moment. Haskell defines a function sum that adds all elements of a list:

sum [] = 0
sum (head:tail) = head + (sum tail)

We can do the same in C++ TMP (in five times as many lines of code):

template<int...> struct
sum;

template<> struct
sum<> {
    static const int value = 0;
};

template<int i, int... tail> struct
sum<i, tail...> {
    static const int value = i + sum<tail...>::value;
};

List Comprehension

Haskell has one more trick up its sleeve for operating on lists without explicit recursion. It’s called list comprehension. It’s a way of defining new lists based on existing lists. The nomenclature and notation are borrowed from Set Theory, where you often encounter definitions such as: S is a set of elements, where… Let’s look at a simple Haskell example:

[x * x | x <- [3, 4, 5]]

This is a set (list) of elements x * x, where x is from the list [3, 4, 5].

Remember our recursive definition of count? Using list comprehension it’s reduced to a one-liner:

count lst = sum [1 | x <- lst]

Here, we create a list of ones, one for each element of the list. Our result is the sum of those ones. To make this definition more amenable to translation into C++, let’s define an auxiliary function one that, for any argument x, returns 1.

one x = 1

Here’s the modified definition of count:

count lst = sum [one x | x <- lst]

Now we are ready to convert this code to C++ TMP:

template<class T> struct
one {
    static const int value = 1;
};

template<class... lst> struct
count {
    static const int value = sum<one<lst>::value...>::value;
};

Here our list is stored in a template parameter pack, lst. If we wanted to expand this pack, we’d use the notation lst…, but that’s not what’s happening here. The ellipsis appears after the pattern containing the pack:

one<lst>::value...

Compare this with the equivalent Haskell:

[one x | x <- lst]

In C++, when the ellipsis follows a pattern that contains a pack, it’s not the pack that’s expanded, but the whole pattern is repeated for each element of the pack. Here, if our list were <int, char, void*>, the pattern would be expanded to:

<one<int>::value, one<char>::value, one<void*>::value>

The subsequent call to sum would be made with those arguments.

Notice that a different positioning of the ellipsis would result in a completely different expansion. This pattern:

one<lst...>::value

would result in the call to one with the list of types, which would be an error.

Here’s another example of pattern expansion: a function that counts the number of pointers in a list of types:

template<class... lst> struct
countPtrs {
    static const int value = sum<isPtr<lst>::value ...>::value;
};

In this case the pattern is:

isPtr<lst>::value ...

and it expands into a list of Booleans. (I’m taking advantage of the fact that false is zero and true is one, when converted to integers.)

You may find a more complex practical example in the Gregor, Järvi, and Powell paper (see bibliography).

Continuations

List comprehension can be used to define some very useful higher-order functions. One such function is map, which takes a list and applies a unary function to each element, resulting in a new list. You might be familiar with the runtime implementation of this algorithm in C++ STL under the name of transform. This is what map looks like in Haskell:

map f lst = [f x | x <- lst]

Here, f is the unary function and lst is the input list. You have to admire the terseness and elegance of this notation.

The first impulse would be to translate it into C++ TMP as:

template<template<class> class f, class... lst> struct
map {
    typedef f<lst>... type;
};

This is surprisingly terse too. The problem is that it doesn’t compile. As far as I know there is no way for a template to “return” a variable list of elements. In my opinion, this is a major language design flaw, but that’s just me.

There are several workarounds, none of them too exciting. One is to define a separate entity called a typelist along the lines of:

template<class hd, class... tl> struct
typelist {
    typedef hd head;
    typedef typelist<tl...> tail;
    // (a complete version would also need a specialization for the empty list)
};

(As a matter of fact I have implemented typelists and related algorithms both in C++ and D.)

Another approach is to use continuations. Template parameter packs cannot be returned, but they can be passed to variadic templates (after expansion). So how about defining an algorithm like map to take one additional function that would consume the list that is the result of mapping? Such a function is often called a continuation, since it continues the calculation where normally one would return the result. First, let’s do it in Haskell:

map_cont cont f lst = cont [f x | x <- lst]

The function map_cont is just like map except that it takes a continuation, cont, and applies it to the result of mapping. We can test it by defining yet another implementation of count:

count_cont lst = map_cont sum one lst

The continuation here is the function sum that will be applied to the list produced by acting with function one on the list lst. Since this is quite a handful, let me rewrite it in a more familiar notation of runtime C++:

int map_cont(int (*cont)(const std::list<int> &), int (*f)(int), const std::list<int> & lst) {
    std::list<int> tmp;
    for (auto it = lst.begin(); it != lst.end(); ++it)
        tmp.push_back(f(*it)); // apply f to each element, preserving the order
    return cont(tmp);
}

Now for the same thing in compile-time C++:

template<template<class...> class cont, 
         template<class> class f,
         class... lst> struct
map_cont {
    static const int value =
        cont<typename f<lst>::type ...>::value;
};

It’s a one-to-one mapping of Haskell code–and it has very little in common with the iterative runtime C++ implementation. Also notice how loose the typing is in the TMP version as compared with the runtime version. The continuation is declared as a variadic template taking types, function f is declared as taking a type, and the list is a variadic list of types. Nothing is said about return types, except for the constraints that f returns a type (because cont consumes a list of types) and cont returns an int. Actually, this last constraint can be relaxed if we turn integers into types–a standard trick (hack?) in TMP:

template<int n> struct
Int {
    static const int value = n;
};

Loose typing–or “kinding,” as it is called for types of types–is an essential part of compile-time programming. In fact the popularity of the above trick shows that C++ kinding might be already too strong.
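To show the trick at work, here’s a sketch of a TMP version of count_cont. The names one_t and sum_t are mine, chosen so as not to clash with the earlier one and sum:

// Wraps the constant 1 in a type, so that map_cont's f returns a type.
template<class T> struct
one_t {
    typedef Int<1> type;
};

// A continuation that sums a list of Int<n> types.
template<class... lst> struct
sum_t;

template<> struct
sum_t<> {
    static const int value = 0;
};

template<class head, class... tail> struct
sum_t<head, tail...> {
    static const int value = head::value + sum_t<tail...>::value;
};

// count_cont: map every type to Int<1> and let sum_t add them up.
template<class... lst> struct
count_cont {
    static const int value = map_cont<sum_t, one_t, lst...>::value;
};

For instance, count_cont<int, char, void*>::value evaluates to 3.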

The D Digression

I’m grateful to Andrei Alexandrescu for reviewing this post. Since he objected to the sentence, “I’m disappointed that the D programming language followed the same path as C++ rather than lead the way,” I feel compelled to support my view with at least one example. Consider various implementations of all.

In Haskell, beside the terse and elegant version I showed before:

all pred [] = True
all pred (head:tail) = (pred head) && (all pred tail)

there is also a slightly more verbose one:

all pred list =
    if null list then True
    else pred (head list) && all pred (tail list)

which translates better into D. The D version (taken from its standard library, Phobos) is not as short as Haskell’s, but follows the same functional paradigm:

template allSatisfy(alias F, T...) {
    static if (T.length == 1)
    {
        alias F!(T[0]) allSatisfy;
    }
    else
    {
        enum bool allSatisfy = F!(T[0]) && allSatisfy!(F, T[1 .. $]);
    }
}

It definitely beats C++, there’s no doubt about it. There are some oddities about it though. Notice that the type tuple, T… gets the standard list treatment, but the split into the head and tail follows the array/slice notation. The head is T[0] and the tail is an array slice, T[1..$]. Instead of using value for the return value, D uses the “eponymous hack,” as I call it. The template “returns” the value using its own name. It’s a hack because it breaks down if you want to define the equivalent of “local variable” inside a template. For instance, the following code doesn’t compile:

template allSatisfy(alias F, T...) {
    static if (T.length == 1)
    {
        alias F!(T[0]) allSatisfy;
    }
    else
    {
        private enum bool tailResult = allSatisfy!(F, T[1..$]);
        enum bool allSatisfy = F!(T[0]) && tailResult;
    }
}

This breaks one of the fundamental properties of any language: decomposability. You want to be able to decompose your calculation into smaller chunks and then combine them together. Of course, you may still use decomposition if you don’t use the eponymous hack and just call your return value value. But that means modifying all the calling sites.

By the way, this is how decomposition works in Haskell:

all2 pred [] = True
all2 pred (head:tail) = (pred head) && tailResult
    where tailResult = all2 pred tail

Anyway, the important point is that there is no reason to force the functional programming paradigm at compile time. The following hypothetical syntax would be much easier for programmers who are used to imperative programming:

template allSatisfy(alias Pred, T...) {
    foreach(t; T)
        if (!Pred!(t))
            return false;
    return true;
}

In fact it’s much closer to the parameterized runtime function proposed by Andrei:

bool all(alias pred, Range)(Range r) {
    foreach (e; r) 
        if (!pred(e)) 
            return false;
    return true;
}

This is why I’m disappointed that the D programming language followed the same path as C++ rather than lead the way.

Conclusions

I have argued that some familiarity with Haskell may be really helpful in understanding and designing templates in C++. You might ask why C++ chose such horrible syntax to do compile-time functional programming. Well, it didn’t. The ability to do compile-time calculations in C++ was discovered rather than built into the language. It was a very fruitful discovery, as the subsequent developments, especially the implementation of the Boost MPL, have shown. However, once the functional paradigm and its weird syntax took root in C++ TMP, it stayed there forever. I’m aware of only one effort to rationalize C++ TMP by Daveed Vandevoorde in Reflective Metaprogramming in C++.

I have tested all code in this blog using the Hugs interpreter for Haskell and the GNU C++ compiler v. 4.4.1 with the special switch -std=c++0x. The names I used in the blog might conflict with the standard definitions. For instance, Hugs defines its own map and foldr.

If you want to learn more about template metaprogramming, I recommend two books:

  1. Andrei Alexandrescu, Modern C++ Design
  2. David Abrahams and Aleksey Gurtvoy, C++ Template Metaprogramming

The reference I used for variadic templates was the paper by Douglas Gregor, Jaakko Järvi, and Gary Powell.


“I’ve been doing some template metaprogramming lately,” he said nonchalantly.

Why is it funny? Because template metaprogramming is considered really hard. I mean, über-guru-level hard. I’m lucky to be friends with two such gurus, Andrei Alexandrescu, who wrote the seminal “Modern C++ Design,” and Eric Niebler, who implemented the Xpressive library for Boost; so I know the horrors.

But why is template metaprogramming so hard? A big part of it is that C++ templates are rather ill-suited for metaprogramming, to put it mildly. They are fine for simple tasks like parameterized containers and some generic algorithms, but not for operations on types or lists of types. To make things worse, C++ doesn’t provide a lot in terms of reflection, so even simple tasks, like deciding whether a given type is a pointer, are hard (see the example later). Granted, C++0x offers some improvements, like template parameter packs; but guru-level creds are still required.

So, I’ve been doing some template metaprogramming lately… in D. The D programming language doesn’t have the baggage of compatibility with previous botched attempts, so it makes many things considerably easier on programmers. But before I get to it, I’d like to talk a little about the connection between generic programming and functional programming, give a short intro to functional programming, and then show some examples in C++ and D that involve pattern matching and type lists.

It’s not your father’s language

The key to understanding metaprogramming is to realize that it’s done in a different language than the rest of your program. Both in C++ and D you use a form of functional language for that purpose. First of all, no mutation! If you pass a list of types to a template, it won’t be able to append another type to it. It will have to create a completely new list using the old list and the new type as raw materials.

Frankly, I don’t know why mutation should be disallowed at compile time (all template calculations are done at compile time). In fact, for templates that are used in D mixins, I proposed not to invent a new language but to use a subset of D that included mutation. It worked just fine and made mixins much easier to use (for an example, see my DrDobbs article).

Once you disallow mutation, you’re pretty much stuck with the functional paradigm. For instance, you can’t have loops, which require a mutable loop counter or some other mutable state, so you have to use recursion.

You’d think functional programmers would love template metaprogramming, except that they flip over the horrendous syntax of C++ templates. The one thing going for functional programming is that it’s easy to define and implement: you can describe the typeless lambda calculus with just a few formulas of operational semantics.

One thing is important, though: the meta-language can’t be strongly typed, because a strongly typed language requires yet another language to implement generic algorithms on top of it. So, to terminate the succession of meta-meta-meta… languages, there’s a need for either a typeless, or at least dynamically typed, top-level meta-language. My suspicion is that C++0x concepts failed so miserably because they dragged the metalanguage in the direction of strong typing. The nails in the coffin for C++ concepts were concept maps, the moral equivalent of implicit conversions in strongly typed languages.

Templates are still not totally typeless. They distinguish between type arguments (introduced by typename or class in C++), template template arguments, and typed template arguments, that is, compile-time values such as int n. Here’s an example that shows all three kinds:

template<class T, template<class X> class F, int n>
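To make this concrete, here is a small sketch that supplies one argument of each kind (Identity and Example are made-up names used only for illustration):

// a trivial template used only as the template template argument F
template<class X> struct Identity {
    typedef X type;
};

template<class T, template<class X> class F, int n>
struct Example {
    typedef typename F<T>::type type;  // apply the template argument F to the type argument T
    static const int value = n;        // n is an ordinary compile-time constant
};

// Example<char, Identity, 42>::value is 42; Example<char, Identity, 42>::type is char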

Functional Programming in a Nutshell

“Functions operating on functions”–that’s the gist of functional programming. The rest is syntactic sugar. Some of this sugar is very important. For instance, you want to have built-in integers and lists for data types, and pattern matching for dispatching.

-Functions

Here’s a very simple compile-time function in the C++ template language:

template<class T>
struct IsPtr {
    static const bool apply = false;
};

If it doesn’t look much like a function to you, here it is in more normal ad-hoc notation:

IsPtr(T) {
    return false;
}

You can “execute” or “call” this meta-function by instantiating the template IsPtr with a type argument and accessing its member apply:

IsPtr<int>::apply;

There is nothing magical about “apply”; you can call it anything (“result” or “value” are other popular identifiers). This particular meta-function returns a Boolean, but any compile-time constant may be returned. More importantly, a type or even a template may be returned. But let’s not get ahead of ourselves.
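For instance, the same meta-function could expose its result under a different name (a sketch; IsPtrV is a made-up name):

template<class T>
struct IsPtrV {
    static const bool value = false; // same meta-function, result named "value" instead of "apply"
};

// the result is an ordinary compile-time Boolean constant
const bool b = IsPtrV<int>::value;   // false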

-Pattern matching

You might be wondering what the use is for a function (I’ll be dropping the “meta-” prefix in what follows) that always returns false and is called IsPtr. Enter the next weapon in the arsenal of functional programmers: pattern matching. What we need here is to be able to match function arguments to different patterns and execute different code depending on the match. In particular, we’d like to return a different value, true, for T matching the pattern T*. In the C++ metalanguage this is done by partial template specialization. It’s enough to define another template of the same name that matches a more specialized pattern, T*:

template<class T>
struct IsPtr<T*> {
    static const bool apply = true;
};

When faced with the call,

IsPtr<int*>::apply

the compiler will first look for specializations of the template IsPtr, starting with the most specialized one. In our case, the argument int* matches the pattern T* so the version returning true will be instantiated. Accessing the apply member of this instantiation will result in the Boolean value true, which is exactly what we wanted. Let me rewrite this example using less obfuscated syntax.

IsPtr(T*) {
    return true;
}
IsPtr(T) { // default case
    return false;
}
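For completeness, the C++ version can be checked at compile time too, analogous to the D static asserts shown below (a sketch; static_assert requires the compiler’s C++0x mode):

// compile-time tests for the IsPtr meta-function defined above
static_assert( IsPtr<int*>::apply, "int* matches the T* pattern");
static_assert( !IsPtr<int>::apply, "int falls through to the general case");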

D template syntax is slightly less complex than that of C++. The above example will read:

template IsPtr(T) {
    static if (is (T dummy: U*, U))
        enum IsPtr = true;
    else
        enum IsPtr = false;
}
// Compile-time tests
static assert( IsPtr!(int*) );
static assert( !IsPtr!(int) );

As you can see, D offers compile-time if statements and more general pattern matching. The syntax of pattern matching is not as clear as it could be (what’s with the dummy?), but it’s more flexible. Compile-time constants are declared as enums.

There is one little trick (a hack?) that makes the syntax of a “function call” a little cleaner. If, inside the template, you define a member of the same name as the template itself (I call it the “eponymous” member) then you don’t have to use the “apply” syntax. The “call” looks more like a call, except for the exclamation mark before the argument list (a D tradeoff for not using angle brackets). You’ll see later how the eponymous trick fails for more complex cases.

-Lists

The fundamental data structure in all functional languages is a list. Lists are very easy to operate upon using recursive algorithms and, as it turns out, they can be used to define arbitrarily complex data structures. No wonder C++0x felt obliged to introduce a compile-time type list as a primitive. It’s called a template parameter pack and the new syntax is:

template<class... T> struct Foo;

You can instantiate such a template with zero arguments,

Foo<>

one argument,

Foo<int>

or more arguments,

Foo<int, char*, void*>

How do you iterate over a type list? Well, there is no iteration in the metalanguage, so the best you can do is use recursion. To do that, you have to be able to separate the head of the list from its tail. Then you perform the action on the head and call yourself recursively with the tail. The head/tail separation is done using pattern matching.

Let me demonstrate a simple example from the paper Variadic Templates by Douglas Gregor, Jaakko Järvi, and Gary Powell. It calculates the length of a pack using recursion. First, the base case, the zero-length list:

// the primary template must be declared before it can be specialized
template<typename... T> struct count;

template<>
struct count<> {
    static const int value = 0;
};

The empty-pack version is a full specialization of the template, so it will be tried first. Here’s the general case:

template<typename Head, typename... Tail>
struct count<Head, Tail...> {
    static const int value = 1 + count<Tail...>::value;
};

Let’s see what it would look like in “normal” notation:

count() {
    return 0;
}
count(head, tail) {
    return 1 + count(tail);
}
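With the primary declaration and the two specializations above in scope, the result can be verified at compile time, mirroring the D asserts that follow (a sketch; it needs C++0x for variadic templates and static_assert):

// compile-time tests for the C++ count meta-function
static_assert( count<>::value == 0, "an empty pack has zero elements");
static_assert( count<int, char*, char[]>::value == 3, "three types in the pack");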

And here’s the D version:

template count(T...) {
    static if (T.length == 0)
        enum count = 0;
    else
        enum count = 1 + count!(T[1..$]);
}
// tests
static assert( count!() == 0);
static assert( count!(int, char*, char[]) == 3);

T... denotes a type tuple, which supports array-like access. To get to the tail of the list, D uses array slicing, where T[1..$] denotes the slice of the array starting from index 1 up to the length of the array (denoted by the dollar sign). I’ll explain the important differences between a C++ pack and a D tuple (including pack expansion) in the next installment.

Conclusion

When looked at from the functional perspective, template metaprogramming doesn’t look as intimidating as it seems at first. Knowing this interpretation makes you wonder whether there isn’t a better syntax, or even a better paradigm, for metaprogramming.

I’ll discuss more interesting parts of template metaprogramming in the next installment (this one is getting too big already). In particular, I’ll show examples of higher order meta-functions like Filter or Not and some interesting tricks with type lists.
