September 2021

Category theory extracts the essence of structure and composition. At its foundation it deals with the composition of arrows. Building on composition of arrows it then goes on describing the ways objects can be composed: we have products, coproducts and, at a higher level, tensor products. They all describe various modes of composing objects. In monoidal categories any two objects can be composed.

Unlike composition, which can be described uniformly, decomposition requires case-by-case treatment. It’s easy to decompose a cartesian product using projections. A coproduct (sum) can be decomposed using pattern matching. A generic tensor product, on the other hand, has no standard means of decompositon.

Optics is the essence of decomposition. It answers the question of what it means to decompose a composite.

We consider an object decomposable when:

  • We can split it into the focus and the complement,
  • We can replace the focus with something else, without changing the complement, to get a new composite object,
  • We can zoom in; that is, if the focus is decomposable, we can compose the two decompositions,
  • It’s possible for the whole object to be the focus.

Let’s translate these requirements into the language of category theory. We’ll start with the standard example: the lens, which is the optic for decomposing cartesian products.

The splitting means that there is a morphism from the composite object s to the product c \times a, where c is the complement and a is the focus. This morphism is a member of the hom-set \mathcal{C}(s, c \times a).

To replace the focus we need another morphism that takes the same complement c, combines it with the new focus b to produce the new composite t. This morphism is a member of the hom-set \mathcal{C}(c \times b, t)

Here’s the important observation: We don’t care what the complement is. We are “focusing” on the focus. We carry the complement over to combine it with the new focus, but we don’t use it for anything else. It’s a featureless black box.

To erase the identity of the complement, we hide it inside a coend. A coend is a generalization of a sum, so it is written using the integral sign (see the Appendix for details). Programmers know it as an existential type, logicians call it an existential quantifier. We say that there exists a complement c, but we don’t care what it is. We “integrate” over all possible complements.

Here’s the existential definition of the lens:

L(s, t; a, b) = \int^{c : \mathcal{C}} \mathcal{C}(s, c \times a) \times \mathcal{C}(c \times b, t)

Just like we construct a coproduct using one of the injections, so the coend is constructed using one of (possibly infinite number of) injections. In our case we construct a lens L(s, t; a, b) by injecting a pair of morphisms from the two hom-sets sharing the same c. But once the lens is constructed, there is no way to extract the original c from it.

It’s not immediately obvious that this representation of the lens reproduces the standard setter/getter form. However, in a cartesian closed category, we can use the currying adjunction to transform the second hom-set:

\mathcal{C}(c \times b, t) \cong \mathcal{C}(c, [b, t])

Here, [b, t] is the internal hom, or the function object representing morphisms from b to t. We can then use the co-Yoneda lemma to reduce the coend:

\int^{c : \mathcal{C}} \mathcal{C}(s, c \times a) \times \mathcal{C}(c, [b, t]) \cong \mathcal{C}(s, [b, t] \times a) \cong \mathcal{C}(s \times b, t) \times \mathcal{C}(s, a)

The first part of this product is the setter: it takes the source object s and the new focus b to produce the new target t. The second part is the getter that extracts the focus a.

Even though all optics have similar form, each of them reduces differently.

Here’s another example: the prism. We just replace the product with the coproduct (sum).

P(s, t; a, b) = \int^{c : \mathcal{C}} \mathcal{C}(s, c + a) \times \mathcal{C}(c + b, t)

This time the reduction goes through the universal property of the coproduct: a mapping out of a sum is a product of mappings:

\mathcal{C}(c + b, t) \cong\mathcal{C}(c, t) \times\mathcal{C}(b, t)

Again, we use the co-Yoneda to reduce the coend:

\int^{c : \mathcal{C}} \mathcal{C}(s, c + a) \times\mathcal{C}(c, t) \times\mathcal{C}(b, t) \cong\mathcal{C}(s, t + a) \times\mathcal{C}(b, t)

The first one extracts the focus a, if possible, otherwise it constructs a t (by secretly injecting a c). The second constructs a t by injecting a b.

We can easily generalize existential optics to an arbitrary tensor product in a monoidal category:

O(s, t; a, b) = \int^{c : \mathcal{C}} \mathcal{C}(s, c \otimes a) \times \mathcal{C}(c \otimes b, t)

In general, though, this form cannot be further reduced using the co-Yoneda trick.

But what about the third requirement: the zooming-in property of optics? In the case of the lens and the prism it works because of associativity of the product and the sum. In fact it works for any tensor product. If you can decompose s into c \otimes a, and further decompose a into c' \otimes a', then you can decompose s into (c \otimes c') \otimes a'. Zooming-in is made possible by the associativity of the tensor product.

Focusing on the whole object plays the role of the unit of zooming.

These two properties are used in the definition of the category of optics. The objects in this category are pairs of object in \mathcal{C}. A morphism from a pair \langle s, t \rangle to \langle a, b \rangle is the optic O(s, t; a, b). Zooming-in is the composition of morphisms.

But this is still not the most general setting. The useful insight is that the multiplication (product) in a lens, and addition (coproduct) in a prism, look like examples of linear transformations, with the residue c playing the role of a parameter. In fact, a composition of a lens with a prism produces a 2-parameter affine transformation, which also behaves like an optic. We can therefore generalize optics to work with an arbitrary monoidal action (first hinted in the discussion at the end of this blog post). Categories with such actions are known as actegories.

The idea is that you define a family of endofunctors A_m in \mathcal{C} that is parameterized by objects from a monoidal category \mathcal{M}. So far we’ve only discussed examples where the parameters were taken from the same category \mathcal{C} and the action was either multiplication or addition. But there are many examples in which \mathcal{M} is not the same as \mathcal{C}.

The zooming principles are satisfied if the action respects the tensor product in \mathcal{M}:

A_{m \otimes n} \cong A_m \circ A_n

A_1 \cong \mathit{Id}

(Here, 1 is the unit object with respect to the tensor product \otimes in \mathcal{M}, and \mathit{Id} is the identity endofunctor.)

The actegorical version of the optic doesn’t deal directly with the residue. It tells us that the “unimportant” part of the composite object can be parameterized by some m \colon \mathcal{M}.

This additional abstraction allows us to transport the residue between categories. It’s enough that we have one action L_m in \mathcal{C} and another R_m in \mathcal{D} to create this mixed optics (first introduced by Mitchell Riley):

O(s, t; a, b) = \int^{m : \mathcal{M}} \mathcal{C}(s, L_m a) \times \mathcal{D}(R_m b, t)

The separation of the focus from the complement using monoidal actions is reminiscent of what physicists call the distinction between “physical”  and “gauge” degrees of freedom.

An in-depth presentation of optics, including their profunctor representation, is available in this paper.

Appendix: Coends and the Co-Yoneda Lemma

A coend is defined for a profunctor, that is a functor of two variables, one contravariant and one covariant, p \colon \mathcal{C}^{op} \times \mathcal{C} \to \mathbf{Set}. It’s a cross between a coproduct and a trace, as it’s constructed using injections of diagonal elements (with some identifications):

\iota_{a} \colon p \langle a, a \rangle \to \int^{c : \mathcal{C}} p \langle c, c \rangle

Co-Yoneda lemma is the identity that works for any covariant functor (copresheaf) F \colon \mathcal{C} \to \mathbf{Set}:

\int^{c \colon \mathcal{C}} F(c) \times \mathcal{C}(c, x) \cong F(x)

A PDF version of this post is available on GitHub.

Dependent types, in programming, are families of types indexed by elements of an indexing type. For instance, counted vectors are families of tuples indexed by natural numbers—the lengths of the vectors.

In category theory we model dependent types as fibrations. We start with the total space E, the base space B, and a projection, or a display map, p \colon E \to B. The fibers of p correspond to members of the type family. For instance, the total space, or the bundle, of counted vectors is the list type \mathit{List} (A) (a free monoid generated by A) with the projection \mathit{len} \colon \mathit{List} (A) \to \mathbb{N} that returns the length of a list.

Another way of looking at dependent types is as objects in the slice category \mathcal{C}/B. Counted vectors, for instance, are represented as objects in \mathcal{C}/\mathbb{N} given by pairs \langle \mathit{List} (A), \mathit{len} \rangle. Morphisms in the slice category correspond to fibre-wise mappings between bundles.

We often require that \mathcal{C} be a locally cartesian closed category, that is a category whose slice categories are cartesian closed. In such categories, the base-change functor f^* has both the left adjoint, the dependent sum \Sigma_f; and the right adjoint, the dependent product \Pi_f. The base-change functor is defined as a pullback:


This pullback defines a cartesian product in the slice category \mathcal{C}/B between two objects: \langle B', f \rangle and \langle E, p \rangle. In a locally cartesian closed category, this product has the right adjoint, the internal hom in \mathcal{C}/B.

Dependent optics

The most general optic is given by two monoidal actions L_m and R_m in two categories \mathcal{C} and \mathcal{D}. It can be written as the following coend of the product of two hom-sets:

O(A, A'; S, S') = \int^{m \colon \mathcal{M}} \mathcal{C}( S, L_m A) \times \mathcal{D}(R_m A', S')

Monoidal actions are parameterized by objects in a monoidal category (\mathcal{M}, \otimes, 1).

Dependent optics are a special case of general optics, where one or both categories in question are slice categories. When the monoidal action is defined in the slice category, the transformations must respect fibrations. For instance, the action in the bundle \langle E, p \rangle over B must commute with the projection:

p \circ L_m = p

This is reminiscent of gauge transformations in physics, which act on fibers in bundles over spacetime. The action must respect the monoidal structure of \mathcal{M} so, for instance,

L_{m \otimes n} \cong L_m \circ L_n

L_1 \cong \mathit{Id}

We can define a dependent (mixed) optic as:

\int^{m : \mathcal{M}} (\mathcal{C}/B)( S, L_m A) \times (\mathcal{D}/B')(R_m A', S')

Just like regular optics, dependent optics can be represented using Tambara modules, which are profunctors with the additional structure given by transformations:

\alpha_{m, \langle A, A' \rangle} \colon P \langle A, A' \rangle \to P\langle L_m A, R_m A' \rangle

where A and A' are objects in the appropriate slice categories.
The optic is then given by the following end in the Tambara category:

O(A, A'; S, S') = \int_{p : \mathbf{Tam}} \mathbf{Set}(P \langle A, A' \rangle, P \langle S, S' \rangle)

Dependent lens

The primordial optic, the lens, is defined by the monoidal action of a product. By analogy, we define a dependent lens by the action of the product in a slice category. The action parameterized by an object \langle C, q \rangle on another object \langle A, p \rangle is given by the pullback:

M_C A = C \times_B A

Since a pullback is the product in the slice category \mathcal{C}/B, it is automatically associative and unital, so it can be used to define a dependent lens:

\mathit{DLens}(A, A'; S, S') = \int^{\langle C, p \rangle : \mathcal{C}/B} (\mathcal{C}/B)( S, C \times_B A) \times (\mathcal{C}/B)(C \times_B A', S')

Since \mathcal{C} is locally cartesian closed, there is an adjunction between the product and the exponential. We can use it to get:

\cong \int^{\langle C, p \rangle : \mathcal{C}/B} (\mathcal{C}/B)( S, C \times_B A) \times (\mathcal{C}/B)(C , [A', S']_B)

We can then apply the Yoneda lemma to get the setter/getter form:

(\mathcal{C}/B)( S, [A', S']_B \times_B A)

The internal hom [A', S']_B in a locally cartesian closed category can be expressed using a dependent product:

\left [\left \langle A' \atop p \right \rangle, \left \langle S' \atop q \right \rangle \right ] \cong \Pi_p \left(p^* \left \langle S' \atop q \right \rangle \right)

where p \colon A' \to B is the fibration of A', \Pi_p is the right adjoint to the base change functor, and p^* is the base-change functor along p.

The dependent lens can be written as:

(\mathcal{C} / B) \left( \left \langle S \atop r \right \rangle, \Pi_p \left(p^* \left \langle S' \atop q \right \rangle \right) \times \left \langle A \atop r' \right \rangle \right)

In particular, if B is \mathbb{N}, this is equal to an infinite tuple of functions:

O(A, B; S, T) \cong \prod_n \left( s_n \to \left((b_n \to t_n) \times a_n \right) \right)

or fiber-wise pairs of setter/getter \langle s_n \to b_n \to t_n, s_n \to a_n \rangle indexed by n.


Traversals are optics whose monoidal action is generated by polynomial functors of the form:

M_{c} a = \sum_{n \colon \mathbb{N}} c_n \times a^n

The coefficients c_n can be expressed as a fibration \langle C, p \colon C \to \mathbb{N} \rangle, with C = \sum_n c_n, the sum of the fibers. The set of powers of a can be similarly written as \langle L(a), \mathit{len} \rangle, with L(a) the type of list of a (a free monoid generated by a), and \mathit{len} the function that assigns the length to a list. The monoidal action can then be written using a product (pullback) in the slice category \mathbf{Set}/\mathbb{N}:

\left \langle {C \atop p} \right \rangle \times \left \langle {L(a) \atop \mathit{len}} \right \rangle

There is an obvious forgetful functor U \colon \mathbf{Set}/\mathbb{N} \to \mathbf{Set}, which can be used to express the polynomial action:

M_c a = U\left( \left \langle {C \atop p} \right \rangle \times \left \langle {L(a) \atop \mathit{len}} \right \rangle \right)

The traversal is the optic:

\int^{\langle C, p \rangle : \mathbf{Set}/\mathbb{N}} \mathbf{Set} \left(s, M_c a \right) \times \mathbf{Set}(M_c b, t)

Eqivalently, the second factor can be rewritten as:

\mathbf{Set}\left( \sum_{n \colon \mathbb{N}} c_n \times b^n, t\right) \cong \prod_{n \colon \mathbb{N}} \mathbf{Set}(c_n \times b^n, t)

This, in turn, is equivalent to a single hom-set in the slice category:

\cong (\mathbf{Set}/\mathbb{N})\left(\left \langle {C \atop p} \right \rangle \times \left \langle {L(b) \atop \mathit{len}} \right \rangle, \left \langle {\mathbb{N} \times t \atop \pi_1} \right \rangle \right)

where \pi_1 is the projection from the cartesian product.

The traversal is therefore a mixed optic:

\int^{\langle C, p \rangle : \mathbf{Set}/\mathbb{N}} \mathbf{Set} \left(s, M_c a \right) \times (\mathbf{Set}/\mathbb{N})\left( \left \langle {C \atop p} \right \rangle \times \left \langle {L(b) \atop \mathit{len}} \right \rangle, \left \langle {\mathbb{N} \times t \atop \pi_1} \right \rangle \right)

The second factor can be transformed using the internal hom adjunction:

(\mathbf{Set}/\mathbb{N})\left(\left \langle {C \atop p} \right \rangle, \left[ \left \langle {L(b) \atop \mathit{len}} \right \rangle, \left \langle {\mathbb{N} \times t \atop \pi_1} \right \rangle \right] \right)

We can then use the ninja Yoneda lemma on the optic to “integrate” over \langle C, p \rangle and get:

O(a, b; s, t) \cong \mathbf{Set} \left( s, U\left( \left[ \left \langle {L(b) \atop \mathit{len}} \right \rangle, \left \langle {\mathbb{N} \times t \atop \pi_1} \right \rangle \right] \times \left \langle {L(a) \atop \mathit{len}} \right \rangle \right) \right)

which, in components, reads:

s \to \sum_n \left( (b^n \to t) \times a^n \right)