In my category theory blog posts, I stated many theorems, but I didn’t provide many proofs. In most cases, it’s enough to know that the proof exists. We trust mathematicians to do their job. Granted, when you’re a professional mathematician, you have to study proofs, because one day you’ll have to prove something, and it helps to know the tricks that other people used before you. But programmers are engineers, and are therefore used to taking advantage of existing solutions, trusting that somebody else made sure they were correct. So it would seem that mathematical proofs are irrelevant to programming. Or so it may seem, until you learn about the Curry-Howard isomorphism–or propositions as types, as it is sometimes called–which says that there is a one to one correspondence between logic and programs, and that every function can be seen as a proof of a theorem. And indeed, I found that a lot of proofs in category theory turn out to be recipes for implementing functions. In most cases the problem can be reduced to this: Given some morphisms, implement another morphism, usually using simple composition. This is very much like using point-free notation to define functions in Haskell. The other ingredient in categorical proofs is diagram chasing, which is very much like equational resoning in Haskell. Of course, mathematicians use different notation, and they use lots of different categories, but the principles are the same.

I want to illustrate these points with the example from Emily Riehl’s excellent book Category Theory in Context. It’s a book for mathematicians, so it’s not an easy read. I’m going to concentrate on theorem 6.2.1, which derives a formula for left Kan extensions using colimits. I picked this theorem because it has calculational content: it tells you how to calculate a particular functor.

It’s not a short proof, and I have made it even longer by unpacking every single step. These steps are not too hard, it’s mostly a matter of understanding and using definitions of functoriality, naturality, and universality.

There is a bonus at the end of this post for Haskell programmers.

## Kan Extensions

I wrote about Kan extensions before, so here I’ll only recap the definition using the notation from Emily’s book. Here’s the setup: We want to extend a functor , which goes from category to , along another functor , which goes from to . This extension is a new functor from to .

*To give you some intuition, imagine that the functor is the Rosetta Stone. It’s a functor that maps the Ancient Egyptian text of a royal decree to the same text written in Ancient Greek. The functor embeds the Rosetta Stone hieroglyphics into the know corpus of Egyptian texts from various papyri and inscriptions on the walls of temples. We want to extend the functor to the whole corpus. In other words, we want to translate new texts from Egyptian to Greek (or whatever other language that’s isomorphic to it).*

In the ideal case, we would just want to be isomorphic to the composition of the new functor after . That’s usually not possible, so we’ll settle for less. A Kan extension is a functor which, when composed with produces something that is related to through a natural transformation. In particular, the *left Kan extension*, , is equipped with a natural transformation from to .

(The right Kan extension has this natural transformation reversed.)

There are usually many such functors, so there is the standard trick of universal construction to pick the best one.

*In our analogy, we would ideally like the new functor, when applied to the hieroglyphs from the Rosetta stone, to exactly reproduce the original translation, but we’ll settle for something that has the same meaning. We’ll try to translate new hieroglyphs by looking at their relationship with known hieroglyphs. That means we should look closely at morphism in .*

## Comma Category

The key to understanding how Kan extensions work is to realize that, in general, the functor embeds in in a lossy way.

There may be objects (and morphisms) in that are not in the image of . We have to somehow define the action of on those objects. What do we know about such objects?

We know from the Yoneda lemma that all the information about an object is encoded in the totality of morphisms incoming to or outgoing from that object. We can summarize this information in the form of a functor, the hom-functor. Or we can define a category based on this information. This category is called the slice category. Its objects are morphisms from the original category. Notice that this is different from Yoneda, where we talked about *sets* of morphisms — the hom-sets. Here we treat individual morphisms as objects.

This is the definition: Given a category and a fixed object in it, the *slice category* has as objects pairs , where is an object of and is a morphism from to . In other words, all the arrows whose codomain is become objects in .

A morphism in between two objects, and is a morphism in that makes the following triangle commute:

In our case, we are interested in an object in , and the slice category describes it in terms of morphisms. Think of this category as a holographic picture of .

But what we are really interested in, is how to transfer this information about to . We do have a functor , which goes from to . We need to somehow back-propagate the information about to along , and then use to move it to .

So let’s try again. Instead of using all morphisms impinging on , let’s only pick the ones that originate in the image of , because only those can be back-propagated to .

This gives us limited information about , but it’ll have to do. We’ll just use a partial hologram of . Instead of the slice category, we’ll use the comma category .

Here’s the definition: Given a functor and an object of , the *comma category* has as objects pairs , where is an object of and is a morphism from to . So, indeed, this category describes the totality of morphisms coming to from the image of .

A morphism in the comma category from to is a morphism such that the following triangle commutes:

So how does back-propagation work? There is a projection functor that maps an object to (with obvious action on morphisms). This functor loses a lot of information about objects (completely forgets the part), but it keeps a lot of the information in morphisms — it only picks those morphisms in that preserve the structure of the comma category. And it lets us “upload” the hologram of into

Next, we transport all this information to using . We get a mapping

Here’s the main observation: We can look at this functor as a *diagram* in (remember, when constructing limits, diagrams are defined through functors). It’s just a bunch of objects and morphisms in that somehow approximate the image of . This holographic information was extracted by looking at morphisms impinging on . In our search for we should therefore look for an object in together with morphisms impinging on it from the diagram we’ve just constructed. In particular, we could look at cones under that diagram (or co-cones, as they are sometimes called). The best such cone defines a colimit. If that colimit exists, it is indeed the left Kan extension . That’s the theorem we are going to prove.

To prove it, we’ll have to go through several steps:

- Show that the mapping we have just defined on objects is indeed a functor, that is, we’ll have to define its action on morphisms.
- Construct the transformation from to and show the naturality condition.
- Prove universality: For any other functor together with a natural transformation , show that uniquely factorizes through .

All of these things can be shown using clever manipulations of cones and the universality of the colimit. Let’s get to work.

## Functoriality

We have defined the action of on objects of . Let’s pick a morphism . Just like , the object defines its own comma category , its own projection , its own diagram , and its own limiting cone. Parts of this new cone, however, can be reinterpreted as a cone for the old object . That’s because, surprisingly, the diagram contains the diagram .

Take, for instance, an object , with . There is a corresponding object in . Both get projected down to the same . That shows that every object in the diagram for the cone is automatically an object in the diagram for the cone.

Now take a morphism from to in . It is also a morphism in between and . The commuting triangle condition in

ensures the commuting condition in

All this shows that the new cone that defines the colimit of contains a cone under .

But that diagram has its own colimit . Because that colimit is universal, there must be a unique morphism from to , which makes all triangles between the two cones commute. We pick this morphism as the lifting of our , which ensures the functoriality of .

The commuting condition will come in handy later, so let’s spell it out. For every object we have a leg of the cone, a morphism from to ; and another leg of the cone from to . If we denote the lifting of as then the commuting triangle is:

In principle, we should also check that this newly defined functor preserves composition and identity, but this pretty much follows automatically whenever lifting is defined using composition of morphisms, which is indeed the case here.

## Natural Transformation

We want to show that there is a natural transformation from to . As usual, we’ll define the component of this natural transformation at some arbitrary object . It’s a morphism between and . We have lots of morphisms at our disposal, with all those cones lying around, so it shouldn’t be a problem.

First, observe that, because of the pre-composition with , we are only looking at the action of on objects which are inside the image of .

The objects of the corresponding comma category have the form , where . In particular, we can pick , and , which will give us one particular vertex of the diagram . The object at this vertex is — exactly what we need as a starting point for our natural transformation. The leg of the colimiting cone we are interested in is:

We’ll pick this leg as the component of our natural transformation .

What remains is to show the naturality condition. We pick a morphism . We know how to lift this morphism using and using . The other two sides of the naturality square are and .

The bottom left composition is . Let’s take the commuting triangle that we used to show the functoriality of :

and replace by , by , by , and by , to get:

Let’s draw this as the diagonal in the naturality square, and examine the upper right composition:

.

This is also equal to the diagonal . That’s because these are two legs of the same colimiting cone corresponding to and . These vertices are connected by in .

But how do we know that is a morphism in ? Not every morphism in is a morphism in the comma category. In this case, however, the triangle identity is automatic, because one of its sides is an identity .

We have shown that our naturality square is composed of two commuting triangles, with as its diagonal, and therefore commutes as a whole.

## Universality

Kan extension is defined using a universal construction: it’s the best of all functors that extend along . It means that any other extension will factor through ours. More precisely, if there is another functor and another natural transformation:

then there is a unique natural transformation , such that

(where we have a horizontal composition of a natural transformation with the functor )

As always, we start by clearly stating the goals. The proof proceeds in these steps:

- Implement .
- Prove that it’s natural.
- Show that it factorizes through .
- Show that it’s unique.

We are defining a natural transformation between two functors: , which we have defined as a colimit, and . Both are functors from to . Let’s implement the component of at some . It must be a morphism:

Notice that varies over all of , not just the image of . We are comparing the two extensions where it really matters.

We have at our disposal the natural transformation:

or, in components:

More importantly, though, we have the universal property of the colimit. If we can construct a cone with the nadir at then we can use its factorizing morphism to define .

This cone has to be built under the same diagram as . So let’s start with some . We want to construct a leg of our new cone going from to . We can use to get to and then hop to using . Thus we can define the new cone’s leg as:

Let’s make sure that this is really a cone, meaning, its sides must be commuting triangles.

Let’s pick a morphism in from to . A morphism in must satisfy the triangle condition, :

We can lift this triangle using :

Naturality condition for tells us that:

By combining the two, we get the pentagon:

whose outline forms a commuting triangle:

Now that we have a cone with the nadir at , universality of the colimit tells us that there is a unique morphism from the colimiting cone to it that factorizes all triangles between the two cones. We make this morphism our . The commuting triangles are between the legs of the colimiting cone and the legs of our new cone :

Now we have to show that so defined is a natural transformation. Let’s pick a morphism . We can lift it using or using in order to construct the naturality square:

Remember that the lifting of a morphism by satisfies the following commuting condition:

We can combine the two diagrams:

The two arms of the large triangle can be replaced using the diagram that defines , and we get:

which commutes due to functoriality of .

Now we have to show that factorizes through . Both and are natural transformations between functors going from to , whereas connects functors from to . To extend , we horizontally compose it with the identity natural transformation from to . This is called whiskering and is written as . This becomes clear when expressed in components. Let’s pick an object in . We want to show that:

Let’s go back all the way to the definition of :

where is the identity morphism at . Compare this with the definition of :

If we replace with and with , we get:

which, due to functoriality of is exactly what we need:

Finally, we have to prove the uniqueness of . Here’s the trick: assume that there is another natural transformation that factorizes . Let’s rewrite the naturality condition for :

Replacing with , we get:

The lifiting of by satisfies the triangle identity:

where we recognize as .

We said that factorizes through :

which let us straighten the left side of the pentagon.

This shows that is another factorization of the cone with the nadir at through the colimit cone with the nadir . But that would contradict the universality of the colimit, therefore must be the same as .

This completes the proof.

## Haskell Notes

This post would not be complete if I didn’t mention a Haskell implementation of Kan extensions by Edward Kmett, which you can find in the library `Data.Functor.Kan.Lan`

. At first look you might not recognize the definition given there:

data Lan g h a where Lan :: (g b -> a) -> h b -> Lan g h a

To make it more in line with the previous discussion, let’s rename some variables:

data Lan k f d where Lan :: (k c -> d) -> f c -> Lan k f d

This is an example of GADT syntax, which is a Haskell way of implementing existential types. It’s equivalent to the following pseudo-Haskell:

Lan k f d = exists c. (k c -> d, f c)

This is more like it: you may recognize `(k c -> d)`

as an object of the comma category , and `f c`

as the mapping of (which is the projection of this object back to ) under the functor . In fact, the Haskell representation is based on the encoding of the colimit using a coend:

The Haskell library also contains the proof that Kan extension is a functor:

instance Functor (Lan k f) where fmap g (Lan kcd fc) = Lan (g . kcd) fc

The natural transformation that is part of the definition of the Kan extension can be extracted from the Haskell definition:

eta :: f c -> Lan k f (k c) eta = Lan id

In Haskell, we don’t have to prove naturality, as it is a consequence of parametricity.

The universality of the Kan extension is witnessed by the following function:

toLan :: Functor g => (forall c. f c -> g (k c)) -> Lan k f d -> g d toLan gamma (Lan kcd fc) = fmap kcd (gamma fc)

It takes a natural transformation from to , and produces a natural transformation we called from to .

This is expressed as a composition of and :

fromLan :: (forall d. Lan k f d -> g d) -> f c -> g (k c) fromLan alpha = alpha . eta

As an example of equational reasoning, let’s prove that defined by `toLan`

indeed factorizes . In other words, let’s prove that:

fromLan (toLan gamma) = gamma

Let’s plug the definition of `toLan`

into the left hand side:

fromLan (\(Lan kcd fc) -> fmap kcd (gamma fc))

then use the definition of `fromLan`

:

(\(Lan kcd fc) -> fmap kcd (gamma fc)) . eta

Let’s apply this to some arbitrary function `g`

and expand `eta`

:

(\(Lan kcd fc) -> fmap kcd (gamma fc)) (Lan id g)

Beta-reduction gives us:

fmap id (gamma g)

which is indeed equal to the right hand side acting on `g`

:

gamma g

The proof of `toLan . fromLan = id`

is left as an exercise to the reader (hint: you’ll have to use naturality).

## Acknowledgments

I’m grateful to Emily Riehl for reviewing the draft of this post and for writing the book from which I borrowed this proof.