Haskell is a language deeply rooted in category theory. But as you don’t need to study the root system of Vitis vinifera in order to enjoy a glass of wine, you don’t need to know much about category theory in order to program in Haskell. Nevertheless, some of us just can’t help ourselves. We have to dig into the rich terroir of category theory to gain deeper insight into the art of functional programming. And today I’d like to talk about functions.
The basic category-theoretical model for Haskell is the category Hask, where objects are Haskell types, and morphisms are functions. The problem with this picture is that it puts functions on a different footing than the rest of the language. Functions from type A to type B — in other words, morphisms from object A to object B in Hask — form a set. This set is called the hom set, Hom(A, B). The fact that it’s just a set and not something bigger is a property of Hask — the property of being locally small. But in Haskell functions from type A to type B also form a type A->B. A type is an object in Hask. So what’s the connection between the set Hom(A, B) and the object A->B? The answer to this question is very interesting and involves products, exponentials, currying, and of course universal constructions.
In my previous blog post I talked about the universal construction of limits: objects that represent relationships between other objects. In particular, a product can be defined as such a limit representing the most trivial relationship between two objects, that of just being two objects. Morphisms are also involved in relationships between objects, so maybe there is a way of representing them as an object as well. And indeed, it’s possible to define an object to represent a set of morphisms between two other objects A and B. Such an object is called the exponential and denoted by B^A.
Notice that the domain A of the morphisms appears in the exponent. That might seem odd at first, but it makes perfect sense if you consider the relationship between multiplication (product) and exponentiation. In arithmetic, m^n means m multiplied by itself n times. If you replace m and n with types (for simplicity, think of types as sets of values) and multiplication with (set-theoretical) product, you can think of m^n as a set of n-tuples of values of type m: (m_1, m_2, m_3, …, m_n). Of course, if n is a type, it’s not immediately clear what an n-tuple is (it’s a categorical power), but you can gain some intuition if you consider enumerated finite types. For instance, functions from Bool to any type m, Bool->m, can be represented as all possible pairs of ms (one value for True and one for False). They correspond to the exponential m^Bool. Also, for finite types, the number of different functions from n to m is equal to m^n. But the connection between products and exponentials goes deeper than that.
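To make the Bool example concrete, here is a minimal sketch of the correspondence between functions out of Bool and pairs. The names toPair, fromPair, and allBoolFunctions are mine, chosen for illustration only.

```haskell
-- Hypothetical helper names; they are not part of any library.

-- Turn a function out of Bool into the pair of its two results.
toPair :: (Bool -> m) -> (m, m)
toPair f = (f True, f False)

-- Rebuild the function from the pair.
fromPair :: (m, m) -> (Bool -> m)
fromPair (t, _) True  = t
fromPair (_, f) False = f

-- Every function Bool -> Bool, listed as a pair of results.
-- There are 2^2 = 4 of them, matching the count m^n.
allBoolFunctions :: [(Bool, Bool)]
allBoolFunctions = [(t, f) | t <- [False, True], f <- [False, True]]
```

Going from a function to a pair and back loses nothing, which is all the isomorphism between Bool->m and pairs of ms amounts to.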
The basic relationship describing a function is that of application. Given a pair (function, argument), produce a result. In terms of types, a function of type X->Y applied to X returns Y. We want to define the exponential object Y^X to model this relationship. How do we do that?
There isn’t really that much choice. We need to map a pair of objects (Y^X, X) to Y. But what is a pair, and what does it mean to map? We can represent the pair as an object, the product Y^X × X, and then we can map it to Y using a morphism, which we’ll call app.
It immediately follows that we can’t define exponential objects if we don’t have products. Again, it kind of makes intuitive sense, with exponentiation arising from iterated multiplication.
From previous experience we know that having a relationship between objects is usually not enough to define a new object. There may be many other objects that model this relationship. We need a way to compare them and pick the one that models it best.
So suppose that we have an impostor object Z, together with a morphism g from Z × X to Y impersonating application. We know that our choice for Y^X is universal if for any Z and g there is a unique morphism, which we’ll call λg, that maps Z to Y^X, and which factors g through app:
g = app . (λg, id)
Universality diagram defining the exponential object
Such a universal object might not exist in every category, but it does in Hask. In general, a category in which there is a terminal object, a product of any two objects, and an exponential of any two objects is called Cartesian closed. Cartesian closed categories are, for obvious reasons, very important in computer science.
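The factorization g = app . (λg, id) can be spelled out in Haskell. This is only a sketch: lambda, cross, and the example g are hypothetical names of mine, and I read the pair (λg, id) as the product of the two morphisms, acting componentwise.

```haskell
-- Application: the morphism from the product Y^X × X to Y.
app :: (x -> y, x) -> y
app (f, x) = f x

-- The morphism λg from Z to Y^X induced by g from Z × X to Y.
lambda :: ((z, x) -> y) -> (z -> (x -> y))
lambda g = \z -> \x -> g (z, x)

-- The product of two morphisms, acting on each component of a pair.
cross :: (a -> c) -> (b -> d) -> (a, b) -> (c, d)
cross f h (a, b) = (f a, h b)

-- An example "impostor" morphism, with Z, X, and Y all taken to be Int.
g :: (Int, Int) -> Int
g (z, x) = z * 10 + x

-- The same morphism, factored through app.
gFactored :: (Int, Int) -> Int
gFactored = app . cross (lambda g) id
```

Evaluating g and gFactored on the same pair should always give the same answer. A finite check like that is evidence, of course, not a proof of uniqueness.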
There’s another way of looking at the diagram that defines the exponential object. You can think of the morphism g as a function of two variables:
g :: (Z, X) -> Y
For any such g there is a unique morphism λg that maps Z to Y^X, an object representing a function from X to Y. This establishes a one-to-one correspondence between functions of two variables and functions returning functions, which we know under the name of currying. So currying “falls out” of the definition of the exponential object.
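In Haskell the two directions of this correspondence are the Prelude functions curry and uncurry. Here is a small sketch; the function area and its relatives are made-up examples.

```haskell
-- A function of two variables, taking its arguments as a pair.
area :: (Int, Int) -> Int
area (w, h) = w * h

-- Its curried form: a function returning a function.
areaCurried :: Int -> (Int -> Int)
areaCurried = curry area

-- Uncurrying recovers the original, so nothing is lost either way.
areaAgain :: (Int, Int) -> Int
areaAgain = uncurry areaCurried

-- Partial application is what the curried form makes convenient.
tripled :: Int -> Int
tripled = areaCurried 3
```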
Any time there is a one-to-one correspondence between sets of morphisms you might want to look for an underlying adjunction. You might remember from my previous blog post that a functor F is said to be left adjoint to the functor G (or G right adjoint to F) if the following two hom sets are naturally isomorphic:
Hom(FZ, Y) ~ Hom(Z, GY)
In our case we have a one-to-one mapping between the morphism g from Z×X to Y and the morphism λg from Z to Y^X. In a category where all products and all exponentials exist, we can define these two functors:
F_X Z = Z × X
G_X Y = Y^X
In Haskell, these functors would be implemented as:
newtype F x z = F (z, x)
instance Functor (F x) where
    fmap f (F (z, x)) = F (f z, x)
newtype G x y = G (x -> y)
instance Functor (G x) where
    fmap f (G g) = G (f . g)
and the isomorphism of hom sets would be given by the function phi and its inverse phi':
phi :: (F x z -> y) -> z -> G x y
phi f z = G $ \x -> f (F (z, x))
phi' :: (z -> G x y) -> F x z -> y
phi' g (F (z, x)) = let G f = g z
                    in f x
Exponentiation can thus be defined as the right adjoint of taking a product.
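As a quick sanity check of the isomorphism, the sketch below restates F, G, phi, and phi' so that it stands alone, and then pushes one morphism through phi and back. The names labelled, labelled', and runG are hypothetical, introduced only for this check.

```haskell
newtype F x z = F (z, x)
newtype G x y = G (x -> y)

phi :: (F x z -> y) -> z -> G x y
phi f z = G $ \x -> f (F (z, x))

phi' :: (z -> G x y) -> F x z -> y
phi' g (F (z, x)) = let G f = g z
                    in f x

-- A sample morphism out of the product, with Z = String, X = Int, Y = String.
labelled :: F Int String -> String
labelled (F (name, n)) = name ++ "#" ++ show n

-- The round trip through the adjunction; it should behave like labelled.
labelled' :: F Int String -> String
labelled' = phi' (phi labelled)

-- Helper to apply the function wrapped inside a G value.
runG :: G x y -> x -> y
runG (G f) = f
```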
C++ is like an oil tanker: it takes a long time for it to change course. The turbulent reefs towards which C++ has been heading were spotted on the horizon more than ten years ago. I’m talking, of course, about the end of smooth sailing under Moore’s law and the arrival of the Multicore. It took six years to acknowledge the existence of concurrency in the C++11 Standard, but that’s only the beginning. It’s becoming more and more obvious that a major paradigm shift is needed if C++ is to remain relevant in the new era.
Why do we need a new paradigm to deal with concurrency? Can’t we use object oriented programming with small modifications? The answer to this question goes to the heart of programming: it’s about composability. We humans solve complex problems by splitting them into smaller subproblems. This is a recursive process: we split subproblems into still smaller pieces, and so on. Eventually we reach a problem size that can be easily translated into computer code. We then have to compose all these partial solutions into larger programs.
The key to composability is being able to hide complexity at each level. This is why object oriented programming has been so successful. When you’re implementing an object, you have to deal with its internals, with state transitions, intermediate states, etc. But once the object is implemented, all you see is the interface. The interface must be simpler than the implementation for object oriented programming to make sense. You compose larger objects from smaller objects based on their interfaces, not the details of their implementation. That’s how object oriented programming solves the problem of complexity.
Unfortunately, objects don’t compose in the presence of concurrency. They hide the wrong kind of things. They hide sharing and mutation. Let me quote the definition of data race: Two or more threads accessing the same piece of memory at the same time, at least one of them writing. In other words: Sharing + Mutation = Data Race. Nothing in the object’s interface informs you about the possibility of sharing and mutation inside the object’s implementation. Each object in isolation may be data-race-free but their composition may inadvertently introduce data races. And you won’t know about it unless you study the details of their implementation down to every single memory access.
In Java, an attempt has been made to mitigate this problem: every object is equipped with a mutex that can be invoked by declaring a method synchronized. This is not a scalable solution. Even Java’s clever thin lock implementation incurs non-negligible performance overhead, so it is used only when the programmer is well aware of potential races, which requires deep insight into the implementation of all subobjects, exactly the thing we are trying to avoid.
More importantly, locking itself doesn’t compose. There’s a classic example of a locked bank account whose deposit and withdraw methods are synchronized by a lock. The problem occurs when one tries to transfer money from one account to another. Without exposing the locks, it’s impossible to avoid a transient state in which the funds have already left one account but haven’t reached the second. With locks exposed, one may try to hold both locks during the transfer, but that creates a real potential for deadlocks. (Software Transactional Memory provides a composable solution to this problem, but there are no practical implementations of STM outside of Haskell and Clojure.)
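For comparison, here is roughly what the bank account transfer looks like with STM in Haskell. It is a sketch that assumes the stm package (which ships with GHC) is available. The Account type and the names deposit, withdraw, transfer, and demo are hypothetical; only the Control.Concurrent.STM functions are library calls.

```haskell
import Control.Concurrent.STM

-- Hypothetical account type: a balance held in a transactional variable.
type Account = TVar Int

-- Each operation is an STM action; nothing is committed yet.
deposit :: Account -> Int -> STM ()
deposit acc amount = modifyTVar' acc (+ amount)

withdraw :: Account -> Int -> STM ()
withdraw acc amount = do
  balance <- readTVar acc
  check (balance >= amount)        -- blocks and retries until funds suffice
  writeTVar acc (balance - amount)

-- Composition: both steps commit together or not at all,
-- so no other thread can observe the money "in flight".
transfer :: Account -> Account -> Int -> STM ()
transfer from to amount = do
  withdraw from amount
  deposit to amount

-- Run one transfer and read both balances back in a single transaction.
demo :: IO (Int, Int)
demo = do
  a <- newTVarIO 100
  b <- newTVarIO 0
  atomically (transfer a b 30)
  atomically ((,) <$> readTVar a <*> readTVar b)
```

The point of the design is that transfer is built from withdraw and deposit without either of them exposing a lock. The caller decides where the transaction boundary goes by choosing where to put atomically.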
Moreover, if we are interested in taking advantage of multicores to improve performance, the use of locks is a non-starter. Eking out parallel performance is hard enough without locks, given all the overheads of thread management and Amdahl’s law. Parallelism requires a drastically different approach.
Since the central problem of concurrency is the conflict between sharing and mutation, the solution is to control these two aspects of programming. We can do mutation to our heart’s content as long as there’s no sharing. For instance, we can mutate local variables; or we can ensure unique ownership by making deep copies, using move semantics, or by employing unique_ptrs. Unique ownership plays a very important role in message passing, allowing large amounts of data to be passed cheaply between threads.
However, the key to multicore programming is controlling mutation. This is why functional languages have been steadily gaining ground in concurrency and parallelism. In a nutshell, functional programmers have found a way to program using what, to all intents and purposes, looks like immutable data. An imperative programmer, when faced with immutability, is as confused as a barbecue cook in a vegetarian kitchen. And the truth is that virtually all data structures from the C++ standard library are unsuitable for this kind of programming, the standard vector being the worst offender. A contiguous slab of memory is perfect for random or sequential access, but the moment mutation is involved, you can’t share it between threads. Of course, you can use a mutex to lock the whole vector every time you access it, but as I explained already, you can forget about the performance and composability of such a solution.
The trick with functional data structures is that they appear immutable, and therefore require no synchronization when accessed from multiple threads. Mutation is replaced by construction: you construct a new object that’s a clone of the source object but with the requested modification in place. Obviously, if you tried to do this with a vector, you’d end up with a lot of copying. But functional data structures are designed for maximum sharing of representation. So a clone of a functional object will share most of its data with the original, and only record a small delta. The sharing is totally transparent since the originals are guaranteed to be immutable.
A singly-linked list is a classic, if somewhat trivial, example of such a data structure. Adding an element to the front of a list requires only the creation of a single node to store the new value and a pointer to the original (immutable) list. There are also many tree-like data structures that are logarithmically cheap to clone-mutate (red-black trees, leftist heaps). Parallel algorithms are easy to implement with functional data structures, since the programmer doesn’t have to worry about synchronization.
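The list case is easiest to see in a language where lists are persistent by default. The following is a tiny Haskell sketch with made-up names; the C++ version would need a hand-written immutable list.

```haskell
-- The original list. Nothing below modifies it.
base :: [Int]
base = [2, 3, 4]

-- Each "modified" list is one new cons cell whose tail is base itself,
-- so both new lists share the original's three nodes.
withOne :: [Int]
withOne = 1 : base

withZero :: [Int]
withZero = 0 : base
```

Any number of threads can read base, withOne, and withZero at the same time without synchronization, because none of the shared nodes can ever change.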
Functional data structures, also known as “persistent” data structures, are naturally composable. This follows from the composability of immutable data — you can build larger immutable objects from smaller immutable objects. But there’s more to it: This new way of mutating by construction also composes well. A composite persistent object can be clone-mutated by clone-mutating only the objects on the path to the mutation; everything else can be safely shared.
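Clone-mutating along a path can be sketched with a plain binary search tree. I’ve left out balancing to keep it short, so this is not a red-black tree, and the names Tree, insert, toList, original, and updated are my own.

```haskell
-- A minimal, unbalanced persistent binary search tree.
data Tree a = Leaf | Node (Tree a) a (Tree a)

-- Clone-mutate: rebuild only the nodes on the path to the insertion
-- point. Subtrees off that path are shared with the original tree.
insert :: Ord a => a -> Tree a -> Tree a
insert x Leaf = Node Leaf x Leaf
insert x t@(Node l v r)
  | x < v     = Node (insert x l) v r   -- right subtree r is shared
  | x > v     = Node l v (insert x r)   -- left subtree l is shared
  | otherwise = t                       -- already present: share everything

-- In-order traversal, used here only to inspect the contents.
toList :: Tree a -> [a]
toList Leaf         = []
toList (Node l v r) = toList l ++ [v] ++ toList r

original :: Tree Int
original = foldr insert Leaf [5, 2, 8]

-- Building the updated version leaves the original untouched.
updated :: Tree Int
updated = insert 3 original
```

After the insertion both versions remain usable: original still holds 2, 5, and 8, while updated holds 2, 3, 5, and 8 and shares the entire subtree containing 5 with the original.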
Concurrency also introduces nonstandard flows of control. In general, things don’t progress sequentially. Programmers have to deal with inversion of control, jumping from handler to handler, keeping track of shared mutable state, etc. Again, in functional programming this is nothing unusual. Functions are first class citizens and they can be composed in many ways. A handler is nothing but a continuation in the continuation passing style. Continuations do compose, albeit in ways that are not familiar to imperative programmers. Functional programmers have a powerful compositional tool called a monad that, among other things, can linearize inverted flow of control. The design of libraries for concurrent programming makes much more sense once you understand that.
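Here is a sketch of what “linearizing” inverted control flow means. I hand-roll a small continuation monad so the example is self-contained (it follows the same idea as Cont from the transformers library). The callback-style functions fetchUser and fetchScore are hypothetical stand-ins for asynchronous operations.

```haskell
-- A hand-rolled continuation monad.
newtype Cont r a = Cont { runCont :: (a -> r) -> r }

instance Functor (Cont r) where
  fmap f (Cont c) = Cont $ \k -> c (k . f)

instance Applicative (Cont r) where
  pure a = Cont $ \k -> k a
  Cont cf <*> Cont ca = Cont $ \k -> cf (\f -> ca (k . f))

instance Monad (Cont r) where
  Cont c >>= f = Cont $ \k -> c (\a -> runCont (f a) k)

-- Hypothetical callback-style operations: each takes a handler.
fetchUser :: Int -> (String -> r) -> r
fetchUser uid handler = handler ("user" ++ show uid)

fetchScore :: String -> (Int -> r) -> r
fetchScore name handler = handler (length name)

-- Inverted control flow: handlers nested inside handlers.
nested :: Int -> (String -> r) -> r
nested uid done =
  fetchUser uid $ \name ->
    fetchScore name $ \score ->
      done (name ++ ":" ++ show score)

-- The same logic, linearized with do-notation.
linear :: Int -> Cont r String
linear uid = do
  name  <- Cont (fetchUser uid)
  score <- Cont (fetchScore name)
  pure (name ++ ":" ++ show score)
```

Running runCont (linear 7) id and nested 7 id produces the same string. The do-block reads top to bottom, while the monad threads the handlers behind the scenes.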
A paradigm shift towards functional programming is unavoidable and I’m glad to report that there’s a growing awareness of that new trend among C++ programmers. I used to be the odd guy talking about Haskell and monads at C++ meetings and conferences. This is no longer so. There was a sea change at this year’s C++Now. The cool kids were all talking about functional programming, and the presentation “Functional Data Structures in C++” earned me the most inspiring session award. I take it as a sign that the C++ community is ready for a big change.