Category theory extracts the essence of structure and composition. At its foundation it deals with the composition of arrows. Building on that, it goes on to describe the ways objects can be composed: we have products, coproducts and, at a higher level, tensor products. They all describe various modes of composing objects. In a monoidal category any two objects can be composed.

Unlike composition, which can be described uniformly, decomposition requires case-by-case treatment. It’s easy to decompose a cartesian product using projections. A coproduct (sum) can be decomposed using pattern matching. A generic tensor product, on the other hand, has no standard means of decomposition.

Optics is the essence of decomposition. It answers the question of what it means to decompose a composite.

We consider an object decomposable when:

• We can split it into the focus and the complement,
• We can replace the focus with something else, without changing the complement, to get a new composite object,
• We can zoom in; that is, if the focus is decomposable, we can compose the two decompositions,
• It’s possible for the whole object to be the focus.

Let’s translate these requirements into the language of category theory. We’ll start with the standard example: the lens, which is the optic for decomposing cartesian products.

The splitting means that there is a morphism from the composite object $s$ to the product $c \times a$, where $c$ is the complement and $a$ is the focus. This morphism is a member of the hom-set $\mathcal{C}(s, c \times a)$.

To replace the focus we need another morphism that takes the same complement $c$ and combines it with the new focus $b$ to produce the new composite $t$. This morphism is a member of the hom-set $\mathcal{C}(c \times b, t)$.

Here’s the important observation: We don’t care what the complement is. We are “focusing” on the focus. We carry the complement over to combine it with the new focus, but we don’t use it for anything else. It’s a featureless black box.

To erase the identity of the complement, we hide it inside a coend. A coend is a generalization of a sum, so it is written using the integral sign (see the Appendix for details). Programmers know it as an existential type; logicians call it an existential quantifier. We say that there exists a complement $c$, but we don’t care what it is. We “integrate” over all possible complements.

Here’s the existential definition of the lens: $L(s, t; a, b) = \int^{c : \mathcal{C}} \mathcal{C}(s, c \times a) \times \mathcal{C}(c \times b, t)$

Just as we construct a coproduct using one of the injections, we construct a coend using one of a (possibly infinite) number of injections. In our case we construct a lens $L(s, t; a, b)$ by injecting a pair of morphisms from the two hom-sets sharing the same $c$. But once the lens is constructed, there is no way to extract the original $c$ from it.

It’s not immediately obvious that this representation of the lens reproduces the standard setter/getter form. However, in a cartesian closed category, we can use the currying adjunction to transform the second hom-set: $\mathcal{C}(c \times b, t) \cong \mathcal{C}(c, [b, t])$

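Here, $[b, t]$ is the internal hom, or the function object representing morphisms from $b$ to $t$. We can then use the co-Yoneda lemma to reduce the coend: $\int^{c : \mathcal{C}} \mathcal{C}(s, c \times a) \times \mathcal{C}(c, [b, t]) \cong \mathcal{C}(s, [b, t] \times a) \cong \mathcal{C}(s, [b, t]) \times \mathcal{C}(s, a) \cong \mathcal{C}(s \times b, t) \times \mathcal{C}(s, a)$. The last two steps use the universal property of the product (a mapping into a product is a pair of mappings) and the currying adjunction, this time applied in the opposite direction.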

The first part of this product is the setter: it takes the source object $s$ and the new focus $b$ to produce the new target $t$. The second part is the getter that extracts the focus $a$.
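Here’s a minimal Haskell sketch of this correspondence, reading the coend as an existential type. The names `Lens`, `getL`, `setL`, `fromGetSet`, and `firstL` are chosen for illustration only and are not taken from any particular library.

```haskell
{-# LANGUAGE ExistentialQuantification #-}

-- Existential lens: the complement c is hidden behind the quantifier.
data Lens s t a b = forall c. Lens (s -> (c, a)) ((c, b) -> t)

-- Getter: split the composite, then discard the complement.
getL :: Lens s t a b -> s -> a
getL (Lens split _) = snd . split

-- Setter: split, swap in the new focus, and recombine.
setL :: Lens s t a b -> (s, b) -> t
setL (Lens split combine) (s, b) = combine (fst (split s), b)

-- Going back: one valid choice of complement is the source s itself.
fromGetSet :: (s -> a) -> ((s, b) -> t) -> Lens s t a b
fromGetSet get set = Lens (\s -> (s, get s)) set

-- Hypothetical example: focus on the first component of a pair;
-- the second component plays the role of the complement.
firstL :: Lens (a, x) (b, x) a b
firstL = Lens (\(a, x) -> (x, a)) (\(x, b) -> (b, x))
```

Note that `getL` and `setL` never learn what `c` is: they can only carry it from `split` to `combine`, which is exactly the “black box” treatment of the complement.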

Even though all optics have similar form, each of them reduces differently.

Here’s another example: the prism. We just replace the product with the coproduct (sum). $P(s, t; a, b) = \int^{c : \mathcal{C}} \mathcal{C}(s, c + a) \times \mathcal{C}(c + b, t)$

This time the reduction goes through the universal property of the coproduct: a mapping out of a sum is a product of mappings: $\mathcal{C}(c + b, t) \cong\mathcal{C}(c, t) \times\mathcal{C}(b, t)$

Again, we use the co-Yoneda to reduce the coend: $\int^{c : \mathcal{C}} \mathcal{C}(s, c + a) \times\mathcal{C}(c, t) \times\mathcal{C}(b, t) \cong\mathcal{C}(s, t + a) \times\mathcal{C}(b, t)$

The first component of this product extracts the focus $a$ if possible; otherwise it constructs a $t$ (by secretly injecting a $c$). The second component constructs a $t$ by injecting a $b$.
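The same reduction can be sketched in Haskell, with `Either` standing in for the coproduct. As before, the names `Prism`, `matchP`, `buildP`, and `justP` are illustrative assumptions rather than an existing API.

```haskell
{-# LANGUAGE ExistentialQuantification #-}

-- Existential prism: the complement c is hidden behind the quantifier.
data Prism s t a b = forall c. Prism (s -> Either c a) (Either c b -> t)

-- C(s, t + a): return the focus if it is there; otherwise finish the
-- job by injecting the hidden complement and producing a t.
matchP :: Prism s t a b -> s -> Either t a
matchP (Prism split combine) s = case split s of
  Left c  -> Left (combine (Left c))
  Right a -> Right a

-- C(b, t): construct a t by injecting a b.
buildP :: Prism s t a b -> b -> t
buildP (Prism _ combine) = combine . Right

-- Hypothetical example: focus on the contents of Just;
-- the complement is the unit type, standing for Nothing.
justP :: Prism (Maybe a) (Maybe b) a b
justP = Prism (maybe (Left ()) Right) (either (const Nothing) Just)
```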

We can easily generalize existential optics to an arbitrary tensor product in a monoidal category: $O(s, t; a, b) = \int^{c : \mathcal{C}} \mathcal{C}(s, c \otimes a) \times \mathcal{C}(c \otimes b, t)$
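In Haskell, this generalization can be sketched by abstracting over the type constructor that plays the role of the tensor product. The parameter `p` and the names `Optic` and `over` below are assumptions made for this sketch; `Bifunctor` and `second` come from `Data.Bifunctor` in `base`.

```haskell
{-# LANGUAGE ExistentialQuantification #-}

import Data.Bifunctor (Bifunctor, second)

-- p stands in for the tensor product; c is the hidden complement.
data Optic p s t a b = forall c. Optic (s -> p c a) (p c b -> t)

-- The lens and the prism are recovered by picking the tensor.
type Lens  = Optic (,)
type Prism = Optic Either

-- One operation that works for any functorial tensor:
-- split, transform the focus, recombine.
over :: Bifunctor p => Optic p s t a b -> (a -> b) -> s -> t
over (Optic split combine) f = combine . second f . split

-- Hypothetical examples.
firstL :: Lens (a, x) (b, x) a b
firstL = Optic (\(a, x) -> (x, a)) (\(x, b) -> (b, x))

justP :: Prism (Maybe a) (Maybe b) a b
justP = Optic (maybe (Left ()) Right) (either (const Nothing) Just)
```

Only `over` is defined generically here; anything like a getter or a matcher requires knowing how to take the particular tensor apart.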

In general, though, this form cannot be further reduced using the co-Yoneda trick.

But what about the third requirement: the zooming-in property of optics? In the case of the lens and the prism it works because of associativity of the product and the sum. In fact it works for any tensor product. If you can decompose $s$ into $c \otimes a$, and further decompose $a$ into $c' \otimes a'$, then you can decompose $s$ into $(c \otimes c') \otimes a'$. Zooming-in is made possible by the associativity of the tensor product.

Focusing on the whole object plays the role of the unit of zooming.
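For the lens, these two properties can be sketched in Haskell as follows. The composite complement is the pair of the two complements, and the identity lens uses the unit type `()` as its complement. All names (`compose`, `identity`, `viewL`, `updateL`, `firstL`) are illustrative assumptions.

```haskell
{-# LANGUAGE ExistentialQuantification #-}

data Lens s t a b = forall c. Lens (s -> (c, a)) ((c, b) -> t)

-- Zooming in: the complement of the composite is the pair (c, c').
compose :: Lens s t a b -> Lens a b a' b' -> Lens s t a' b'
compose (Lens split1 combine1) (Lens split2 combine2) = Lens split combine
  where
    split s = let (c, a)   = split1 s
                  (c', a') = split2 a
              in ((c, c'), a')
    combine ((c, c'), b') = combine1 (c, combine2 (c', b'))

-- The whole object as the focus: the complement is the unit ().
identity :: Lens s t s t
identity = Lens (\s -> ((), s)) (\((), t) -> t)

-- Helpers for trying the lenses out.
viewL :: Lens s t a b -> s -> a
viewL (Lens split _) = snd . split

updateL :: Lens s t a b -> b -> s -> t
updateL (Lens split combine) b s = combine (fst (split s), b)

-- Hypothetical example: focus on the first component of a pair.
firstL :: Lens (a, x) (b, x) a b
firstL = Lens (\(a, x) -> (x, a)) (\(x, b) -> (b, x))
```

For instance, `compose firstL firstL` focuses on the innermost first component of a nested pair such as `((1, 'a'), "x")`.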

These two properties are used in the definition of the category of optics. The objects in this category are pairs of objects in $\mathcal{C}$. A morphism from a pair $\langle s, t \rangle$ to $\langle a, b \rangle$ is the optic $O(s, t; a, b)$. Zooming-in is the composition of morphisms.

But this is still not the most general setting. The useful insight is that the multiplication (product) in a lens and the addition (coproduct) in a prism look like examples of linear transformations, with the residue $c$ playing the role of a parameter. In fact, a composition of a lens with a prism produces a 2-parameter affine transformation, which also behaves like an optic. We can therefore generalize optics to work with an arbitrary monoidal action (first hinted at in the discussion at the end of this blog post). Categories with such actions are known as actegories.

The idea is that you define a family of endofunctors $A_m$ in $\mathcal{C}$ that is parameterized by objects from a monoidal category $\mathcal{M}$. So far we’ve only discussed examples where the parameters were taken from the same category $\mathcal{C}$ and the action was either multiplication or addition. But there are many examples in which $\mathcal{M}$ is not the same as $\mathcal{C}$.

The zooming principles are satisfied if the action respects the tensor product in $\mathcal{M}$: $A_{m \otimes n} \cong A_m \circ A_n$ and $A_1 \cong \mathit{Id}$

(Here, $1$ is the unit object with respect to the tensor product $\otimes$ in $\mathcal{M}$, and $\mathit{Id}$ is the identity endofunctor.)
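
As a quick check, take the lens, where $\mathcal{M}$ is $\mathcal{C}$ itself with the cartesian product as the tensor, and the action is $A_c a = c \times a$. The first condition is just associativity of the product, $A_{c \times c'} a = (c \times c') \times a \cong c \times (c' \times a) = A_c (A_{c'} a)$, and the second is the unit law for the terminal object, $A_1 a = 1 \times a \cong a$.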

The actegorical version of the optic doesn’t deal directly with the residue. It tells us that the “unimportant” part of the composite object can be parameterized by some $m \colon \mathcal{M}$.

This additional abstraction allows us to transport the residue between categories. It’s enough that we have one action $L_m$ in $\mathcal{C}$ and another $R_m$ in $\mathcal{D}$ to create a mixed optic (first introduced by Mitchell Riley), in which $s$ and $a$ are objects of $\mathcal{C}$, and $t$ and $b$ are objects of $\mathcal{D}$: $O(s, t; a, b) = \int^{m : \mathcal{M}} \mathcal{C}(s, L_m a) \times \mathcal{D}(R_m b, t)$

The separation of the focus from the complement using monoidal actions is reminiscent of what physicists call the distinction between “physical” and “gauge” degrees of freedom.

An in-depth presentation of optics, including their profunctor representation, is available in this paper.

### Appendix: Coends and the Co-Yoneda Lemma

A coend is defined for a profunctor, that is a functor of two variables, one contravariant and one covariant, $p \colon \mathcal{C}^{op} \times \mathcal{C} \to \mathbf{Set}$. It’s a cross between a coproduct and a trace, as it’s constructed using injections of diagonal elements (with some identifications): $\iota_{a} \colon p \langle a, a \rangle \to \int^{c : \mathcal{C}} p \langle c, c \rangle$

The co-Yoneda lemma is an identity that holds for any covariant functor (copresheaf) $F \colon \mathcal{C} \to \mathbf{Set}$: $\int^{c : \mathcal{C}} F(c) \times \mathcal{C}(c, x) \cong F(x)$
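
In Haskell, reading the coend as an existential type, the two directions of this isomorphism can be sketched as below. The names `CoYoneda`, `toF`, and `fromF` are chosen for this sketch.

```haskell
{-# LANGUAGE ExistentialQuantification #-}

-- The coend over c of F(c) x C(c, x), with c hidden existentially.
data CoYoneda f x = forall c. CoYoneda (f c) (c -> x)

-- Left to right: apply the function under the functor.
toF :: Functor f => CoYoneda f x -> f x
toF (CoYoneda fc g) = fmap g fc

-- Right to left: inject at c = x using the identity morphism.
fromF :: f x -> CoYoneda f x
fromF fx = CoYoneda fx id
```

The functor laws give `toF . fromF = id`; the other round trip holds only up to the identifications built into the coend, which in Haskell show up as the fact that the hidden `c` cannot be observed.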