Note: A PDF version of this series is available on github.

My gateway drug to category theory was the Haskell lens library. What first piqued my attention was the van Laarhoven representation, which used functions that are functor-polymorphic. The following function type:

type Lens s t a b = 
  forall f. Functor f => (a -> f b) -> (s -> f t)

is isomorphic to the getter/setter pair that traditionally defines a lens:

get :: s -> a
set :: s -> b -> t

My intuition was that the Yoneda lemma must be somehow involved. I remember sharing this idea excitedly with Edward Kmett, who was the only expert on category theory I knew back then. The reasoning was that a polymorphic function in Haskell is equivalent to a natural transformation in category theory. The Yoneda lemma relates natural transformations to functor values. Let me explain.

In Haskell, the Yoneda lemma says that, for any functor f, this polymorphic function:

forall x. (a -> x) -> f x

is isomorphic to (f a).
In category theory, one way of writing it is:

\int_{x} \mathbf{Set}\big(\mathcal{C}(a, x), f x\big) \cong f a

If this looks a little intimidating, let me go through the notation:

  1. The functor f goes from some category \mathcal{C} to the category of sets, which is called \mathbf{Set}. Such functor is called a co-presheaf.
  2. \mathcal{C}(a, x) stands for the set of arrows from a to x in \mathcal{C}, so it corresponds to the Haskell type a->x. In category theory it’s called a hom-set. The notation for hom-sets is: the name of the category followed by names of two objects in parentheses.
  3. \mathbf{Set}\big(\mathcal{C}(a, x), f x\big) stands for a set of functions from \mathcal{C}(a, x) to f x or, in Haskell (a -> x)-> f x. It’s a hom-set in \mathbf{Set}.
  4. Think of the integral sign as the forall quantifier. In category theory it’s called an end. Natural transformations between two functors f and g can be expressed using the end notation:
    \int_x \mathbf{Set}(f x, g x)

As you can see, the translation is pretty straightforward. The van Laarhoven representation in this notation reads:

\int_f \mathbf{Set}\big( \mathcal{C}(a, f b), \mathcal{C}(s, f t) \big)

If you vary x in \mathcal{C}(b, x), it becomes a functor, which is called a representable functor—the object b “representing” the whole functor. In Haskell, we call it the reader functor:

newtype Reader b x = Reader (b -> x)

You can plug a representable functor for f in the Yoneda lemma to get the following very important corollary:

\int_x \mathbf{Set}\big(\mathcal{C}(a, x), \mathcal{C}(b, x)\big) \cong \mathcal{C}(b, a)

The set of natural transformation between two representable functors is isomorphic to a hom-set between the representing objects. (Notice that the objects are swapped on the right-hand side.)

The van Laarhoven representation

There is just one little problem: the forall quantifier in the van Laarhoven formula goes over functors, not types.

This is okay, though, because category theory works at many levels. Functors themselves form a category, and the Yoneda lemma works in that category too.

For instance, the category of functors from \mathcal{C} to \mathbf{Set} is called [\mathcal{C},\mathbf{Set}]. A hom-set in that category is a set of natural transformations between two functors which, as we’ve seen, can be expressed as an end:

[\mathcal{C},\mathbf{Set}](f, g) \cong \int_x \mathbf{Set}(f x, g x)

Remember, it’s the name of the category, here [\mathcal{C},\mathbf{Set}], followed by names of two objects (here, functors f and g) in parentheses.

So the corollary to the Yoneda lemma in the functor category, after a few renamings, reads:

\int_f \mathbf{Set}\big( [\mathcal{C},\mathbf{Set}](g, f), [\mathcal{C},\mathbf{Set}](h, f)\big) \cong [\mathcal{C},\mathbf{Set}](h, g)

This is getting closer to the van Laarhoven formula because we have the end over functors, which is equivalent to

forall f. Functor f => ...

In fact, a judicious choice of g and h is all we need to finish the proof.

But sometimes it’s easier to define a functor indirectly, as an adjoint to another functor. Adjunctions actually allow us to switch categories. A functor L defined by a mapping-out in one category can be adjoint to another functor R defined by its mapping-in in another category.

\mathcal{C}(L a, b) \cong \mathcal{D}(a, R b)

A useful example is the currying adjunction in \mathbf{Set}:

\mathbf{Set}(c \times a, y) \cong \mathbf{Set}(c, y^a) \cong \mathbf{Set}\big(c, \mathbf{Set}(a, y)\big)

where y^a corresponds to the function type a->y and, in \mathbf{Set}, is isomorphic to the hom-set \mathbf{Set}(a, y). This is just saying that a function of two arguments is equivalent to a function returning a function.

Here’s the clever trick: let’s replace g and h in the functorial Yoneda lemma with L_b a and L_t s, where L_b and L_t are some higher-order functors from \mathcal{C} to [\mathcal{C},\mathbf{Set}] (as you will see, this notation anticipates the final substitution). We get:

\int_f \mathbf{Set}\big( [\mathcal{C},\mathbf{Set}](L_b a, f), [\mathcal{C},\mathbf{Set}](L_t s, f)\big) \cong [\mathcal{C},\mathbf{Set}](L_t s, L_b a)

Now suppose that these functors are left adjoint to some other functors: R_b and R_t that go in the opposite direction from [\mathcal{C},\mathbf{Set}] to \mathcal{C} . We can then replace all mappings-out in [\mathcal{C},\mathbf{Set}] with the corresponding mappings-in in \mathcal{C}:

\int_f \mathbf{Set}\big( \mathcal{C}(a, R_b f), \mathcal{C}(s, R_t f)\big) \cong \mathcal{C}\big(s, R_t (L_b a)\big)

We are almost there! The last step is to realize that, in order to get the van Laarhoven formula, we need:

R_b f = f b

R_t f = f t

So these are just functors that apply f to some fixed objects: b and t, respectively. The left-hand side becomes:

\int_f \mathbf{Set}\big( \mathcal{C}(a, f b), \mathcal{C}(s, f t) \big)

which is exactly the van Laarhoven representation.

Now let’s look at the right-hand side:

\mathcal{C}\big(s, R_t (L_b a)\big) = \mathcal{C}\big( s, (L_b a) t \big)

We know what R_b is, but what’s its left adjoint L_b? It must satisfy the adjunction:

[\mathcal{C},\mathbf{Set}](L_b a, f) \cong \mathcal{C}(a, R_b f) = \mathcal{C}(a, f b)

or, using the end notation:

\int_x \mathbf{Set}\big((L_b a) x, f x\big) \cong \mathcal{C}(a, f b)

This identity has a simple solution when \mathcal{C} is \mathbf{Set}, so we’ll just temporarily switch to \mathbf{Set}. We have:

(L_b a) x = \mathbf{Set}(b, x) \times a

which is known as the IStore comonad in Haskell. We can check the identity by first applying the currying adjunction to eliminate the product:

\int_x \mathbf{Set}\big(\mathbf{Set}(b, x) \times a, f x\big) \cong \int_x \mathbf{Set}\big(\mathbf{Set}(b, x), \mathbf{Set}(a, f x )\big)

and then using the Yoneda lemma to “integrate” over x, which replaces x with b,

\int_x \mathbf{Set}\big(\mathbf{Set}(b, x), \mathbf{Set}(a, f x )\big) \cong \mathbf{Set}(a, f b)

So the right hand side of the original identity (after replacing \mathcal{C} with \mathbf{Set}) becomes:

\mathbf{Set}\big(s, R_t (L_b a)\big) \cong \mathbf{Set}\big( s, (L_b a) t \big) \cong \mathbf{Set}\big(s, \mathbf{Set}(b, t) \times a) \big)

which can be translated to Haskell as:

(s -> b -> t, s -> a)

or a pair of set and get.

I was very proud of myself for finding the right chain of substitutions, so I was pretty surprised when I learned from Mauro Jaskelioff and Russell O’Connor that they had a paper ready for publication with exactly the same proof. (They added a reference to my blog in their publication, which was probably a first.)

The Existentials

But there’s more: there are other optics for which this trick doesn’t work. The simplest one was the prism defined by a pair of functions:

match :: s -> Either t a
build :: b -> t

In this form it’s hard to see a commonality between a lens and a prism. There is, however, a way to unify them using existential types.

Here’s the idea: A lens can be applied to types that, at least conceptually, can be decomposed into two parts: the focus and the residue. It lets us extract the focus using get, and replace it with a new value using set, leaving the residue unchanged.

The important property of the residue is that it’s opaque: we don’t know how to retrieve it, and we don’t know how to modify it. All we know about it is that it exists and that it can be combined with the focus. This property can be expressed using existential types.

Symbolically, we would want to write something like this:

type Lens s t a b = exists c . (s -> (c, a), (c, b) -> t)

where c is the residue. We have here a pair of functions: The first decomposes the source s into the product of the residue c and the focus a . The second recombines the residue with the new focus b resulting in the target t.

Existential types can be encoded in Haskell using GADTs:

data Lens s t a b where
  Lens :: (s -> (c, a), (c, b) -> t) -> Lens s t a b

They can also be encoded in category theory using coends. So the lens can be written as:

\int^c \mathcal{C}(s, c \times a) \times \mathcal{C}(c \times b, t)

The integral sign with the argument at the top is called a coend. You can read it as “there exists a c”.

There is a version of the Yoneda lemma for coends as well:

\int^c f c \times \mathcal{C}(c, a) \cong f a

The intuition here is that, given a functorful of c‘s and a function c->a, we can fmap the latter over the former to obtain f a. We can do it even if we have no idea what the type c is.

We can use the currying adjunction and the Yoneda lemma to transform the new definition of the lens to the old one:

\int^c \mathcal{C}(s, c \times a) \times \mathcal{C}(c \times b, t) \cong \int^c \mathcal{C}(s, c \times a) \times \mathcal{C}(c, t^b) \cong \mathcal{C}(s, t^b \times a)

The exponential t^b translates to the function type b->t, so this this is really the set/get pair that defines the lens.

The beauty of this representation is that it can be immediately applied to the prism, just by replacing the product with the sum (coproduct). This is the existential representation of a prism:

\int^c \mathcal{C}(s, c + a) \times \mathcal{C}(c + b, t)

To recover the standard encoding, we use the mapping-out property of the sum:

\mathcal{C}(c + b, t) \cong \mathcal{C}(c, t) \times \mathcal{C}(b, t)

This is simply saying that a function from the sum type is equivalent to a pair of functions—what we call case analysis in programming.

We get:

\int^c \mathcal{C}(s, c + a) \times \mathcal{C}(c + b, t) \cong \int^c \mathcal{C}(s, c + a) \times \mathcal{C}(c, t) \times \mathcal{C}(b, t)

This has the form suitable for the use of the Yoneda lemma, namely:

\int^c f c \times \mathcal{C}(c, t)

with the functor

f c = \mathcal{C}(s, c + a) \times \mathcal{C}(b, t)

The result of the Yoneda is replacing c with t, so the result is:

\mathcal{C}(s, t + a) \times \mathcal{C}(b, t)

which is exactly the match/build pair (in Haskell, the sum is translated to Either).

It turns out that every optic has an existential form.

Next: Profunctors.