Lenses and, more general, optics are an example of hard-core category theory that has immediate application in programming. While working on polynomial lenses, I had a vague idea how they could be implemented in a programming language. I thought up an example of a polynomial lens that would focus on all the leaves of a tree at once. It could retrieve or modify them in a single operation. There already is a Haskell optic called traversal that could do it. It can safely retrieve a list of leaves from a tree. But there is a slight problem when it comes to replacing them: the size of the input list has to match the number of leaves in the tree. If it doesn’t, the traversal doesn’t work.

A polynomial lens adds an additional layer of safety by keeping track of the sizes of both the trees and the lists. The problem is that its implementation requires dependent types. Haskell has some support for dependent types, so I tried to work with it, but I quickly got bogged down. So I decided to bite the bullet and quickly learn Idris. This was actually easier than I expected and this post is the result.

Counted Vectors and Trees

I started with the “Hello World!” of dependent types: counted vectors. Notice that, in Idris, type signatures use a single colon rather than the Haskell’s double colon. You can quickly get used to it after the compiler slaps you a few times.

data Vect : Type -> Nat -> Type where
VNil : Vect a Z
VCons : (x: a) -> (xs : Vect a n) -> Vect a (S n)

If you know Haskell GADTs, you can easily read this definition. In Haskell, we usually think of Nat as a “kind”, but in Idris types and values live in the same space. Nat is just an implementation of Peano artithmetics, with Z standing for zero, and (S n) for the successor of n. Here, VNil is the constructor of an empty vector of size Z, and VCons prepends a value of type a to the tail of size n resulting in a new vector of size (S n). Notice that Idris is much more explicit about types than Haskell.

The power of dependent types is in very strict type checking of both the implementation and of usage of functions. For instance, when mapping a function over a vector, we can make sure that the result is the same size as the argument:

mapV : (a -> b) -> Vect a n -> Vect b n
mapV f VNil = VNil
mapV f (VCons a v) = VCons (f a) (mapV f v)

When concatenating two vectors, the length of the result must be the sum of the two lengths, (plus m n):

concatV : Vect a m -> Vect a n -> Vect a (plus m n)
concatV VNil v = v
concatV (VCons a w) v = VCons a (concatV w v)

Similarly, when splitting a vector in two, the lengths must match, too:

splitV : (n : Nat) -> Vect a (plus n m) -> (Vect a n, Vect a m)
splitV Z v = (VNil, v)
splitV (S k) (VCons a v') = let (v1, v2) = splitV k v'
in (VCons a v1, v2)

Here’s a more complex piece of code that implements insertion sort:

sortV : Ord a => Vect a n -> Vect a n
sortV VNil = VNil
sortV (VCons x xs) = let xsrt = sortV xs
in (ins x xsrt)
where
ins : Ord a => (x : a) -> (xsrt : Vect a n) -> Vect a (S n)
ins x VNil = VCons x VNil
ins x (VCons y xs) = if x < y then VCons x (VCons y xs)
else VCons y (ins x xs)

In preparation for the polynomial lens example, let’s implement a node-counted binary tree. Notice that we are counting nodes, not leaves. That’s why the node count for Node is the sum of the node counts of the children plus one:

data Tree : Type -> Nat -> Type where
Empty : Tree a Z
Leaf  : a -> Tree a (S Z)
Node  : Tree a n -> Tree a m -> Tree a (S (plus m  n))

All this is not much different from what you’d see in a Haskell library.

Existential Types

So far we’ve been dealing with function that return vectors whose lengths can be easily calculated from the inputs and verified at compile time. This is not always possible, though. In particular, we are interested in retrieving a vector of leaves from a tree that’s parameterized by the number of nodes. We don’t know up front how many leaves a given tree might have. Enter existential types.

An existential type hides part of its implementation. An existential vector, for instance, hides its size. The receiver of an existential vector knows that the size “exists”, but its value is inaccessible. You might wonder then: What can be done with such a mystery vector? The only way for the client to deal with it is to provide a function that is insensitive to the size of the hidden vector. A function that is polymorphic in the size of its argument. Our sortV is an example of such a function.

Here’s the definition of an existential vector:

data SomeVect : Type -> Type where
HideV : {n : Nat} -> Vect a n -> SomeVect a

SomeVect is a type constructor that depends on the type a—the payload of the vector. The data constructor HideV takes two arguments, but the first one is surrounded by a pair of braces. This is called an implicit argument. The compiler will figure out its value from the type of the second argument, which is Vect a n. Here’s how you construct an existential:

secretV : SomeVect Int
secretV = HideV (VCons 42 VNil)

In this case, the compiler will deduce n to be equal to one, but the recipient of secretV will have no way of figuring this out.

Since we’ll be using types parameterized by Nat a lot, let’s define a type synonym:

Nt : Type
Nt = Nat -> Type

Both Vect a and Tree a are examples of this type.

We can also define a generic existential for stashing such types:

data Some : Nt -> Type where
Hide : {n : Nat} -> nt n -> Some nt

and some handy type synonyms:

SomeVect : Type -> Type
SomeVect a = Some (Vect a)
SomeTree : Type -> Type
SomeTree a = Some (Tree a)

Polynomial Lens

We want to translate the following categorical definition of a polynomial lens:

$\mathbf{PolyLens}\langle s, t\rangle \langle a, b\rangle = \prod_{k} \mathbf{Set}\left(s_k, \sum_{n} a_n \times [b_n, t_k] \right)$

We’ll do it step by step. First of all, we’ll assume, for simplicity, that the indices $k$ and $n$ are natural numbers. Therefore the four arguments to PolyLens are types parameterized by Nat, for which we have a type alias:

PolyLens : Nt -> Nt -> Nt -> Nt -> Type

The definition starts with a big product over all $k$‘s. Such a product corresponds, in programming, to a polymorphic function. In Haskell we would write it as forall k. In Idris, we’ll accomplish the same using an implicit argument {k : Nat}.

The hom-set notation $\mathbf{Set}(a, b)$ stands for a set of functions from $a$ to $b$, or the type a -> b. So does the notation $[a, b]$ (the internal hom is the same as the external hom in $\mathbf{Set}$). The product $a \times b$ is the type of pairs (a, b).

The only tricky part is the sum over $n$. A sum corresponds exactly to an existential type. Our SomeVect, for instance, can be considered a sum over n of all vector types Vect a n.

Here’s the intuition: Consider that to construct a sum type like Either a b it’s enough to provide a value of either type a or type b. Once the Either is constructed, the information about which one was used is lost. If you want to use an Either, you have to provide two functions, one for each of the two branches of the case statement. Similarly, to construct SomeVect its enough to provide a vector of some particular lenght n. Instead of having two possibilities of Either, we have infinitely many possibilities corresponding to different n‘s. The information about what n was used is then promptly lost.

The sum in the definition of the polynomial lens:

$\sum_{n} a_n \times [b_n, t_k]$

can be encoded in this existential type:

data SomePair : Nt -> Nt -> Nt -> Type where
HidePair : {n : Nat} ->
(k : Nat) -> a n -> (b n -> t k) -> SomePair a b t

Notice that we are hiding n, but not k.

Taking it all together, we end up with the following type definition:

PolyLens : Nt -> Nt -> Nt -> Nt -> Type
PolyLens s t a b = {k : Nat} -> s k -> SomePair a b t

The way we read this definition is that PolyLens is a function polymorphic in k. Given a value of the type s k it produces and existential pair SomePair a b t. This pair contains a value of the type a n and a function b n -> t k. The important part is that the value of n is hidden from the caller inside the existential type.

Using the Lens

Because of the existential type, it’s not immediately obvious how one can use the polynomial lens. For instance, we would like to be able to extract the foci a n, but we don’t know what the value of n is. The trick is to hide n inside an existential Some. Here is the “getter” for this lens:

getLens :  PolyLens sn tn an bn -> sn n -> Some an
getLens lens t =
let  HidePair k v _ = lens t
in Hide v

We call lens with the argument t, pattern match on the constructor HidePair and immediately hide the contents back using the constructor Hide. The compiler is smart enough to know that the existential value of n hasn’t been leaked.

The second component of SomePair, the “setter”, is trickier to use because, without knowing the value of n, we don’t know what argument to pass to it. The trick is to take advantage of the match between the producer and the consumer that are the two components of the existential pair. Without disclosing the value of n we can take the a‘s and use a polymorphic function to transform them into b‘s.

transLens : PolyLens sn tn an bn -> ({n : Nat} -> an n -> bn n)
-> sn n -> Some tn
transLens lens f t =
let  HidePair k v vt = lens t
in  Hide (vt (f v))

The polymorphic function here is encoded as ({n : Nat} -> an n -> bn n). (An example of such a function is sortV.) Again, the value of n that’s hidden inside SomePair is never leaked.

Example

Let’s get back to our example: a polynomial lens that focuses on the leaves of a tree. The type signature of such a lens is:

treeLens : PolyLens (Tree a) (Tree b) (Vect a) (Vect b)

Using this lens we should be able to retrieve a vector of leaves Vect a n from a node-counted tree Tree a k and replace it with a new vector Vect b n to get a tree Tree b k. We should be able to do it without ever disclosing the number of leaves n.

To implement this lens, we have to write a function that takes a tree of a and produces a pair consisting of a vector of a‘s and a function that takes a vector of b‘s and produces a tree of b‘s. The type b is fixed in the signature of the lens. In fact we can pass this type to the function we are implementing. This is how it’s done:

treeLens : PolyLens (Tree a) (Tree b) (Vect a) (Vect b)
treeLens {b} t = replace b t

First, we bring b into the scope of the implementation as an implicit parameter {b}. Then we pass it as a regular type argument to replace. This is the signature of replace:

replace : (b : Type) -> Tree a n -> SomePair (Vect a) (Vect b) (Tree b)

We’ll implement it by pattern-matching on the tree.

The first case is easy:

replace b Empty = HidePair 0 VNil (\v => Empty)

For an empty tree, we return an empty vector and a function that takes and empty vector and recreates and empty tree.

The leaf case is also pretty straightforward, because we know that a leaf contains just one value:

replace b (Leaf x) = HidePair 1 (VCons x VNil)
(\(VCons y VNil) => Leaf y)

The node case is more tricky, because we have to recurse into the subtrees and then combine the results.

replace b (Node t1 t2) =
let (HidePair k1 v1 f1) = replace b t1
(HidePair k2 v2 f2) = replace b t2
v3 = concatV v1 v2
f3 = compose f1 f2
in HidePair (S (plus k2 k1)) v3 f3

Combining the two vectors is easy: we just concatenate them. Combining the two functions requires some thinking. First, let’s write the type signature of compose:

compose : (Vect b n -> Tree b k) -> (Vect b m -> Tree b j) ->
(Vect b (plus n m)) -> Tree b (S (plus j k))

The input is a pair of functions that turn vectors into trees. The result is a function that takes a larger vector whose size is the sume of the two sizes, and produces a tree that combines the two subtrees. Since it adds a new node, its node count is the sum of the node counts plus one.

Once we know the signature, the implementation is straightforward: we have to split the larger vector and pass the two subvectors to the two functions:

compose {n} f1 f2 v =
let (v1, v2) = splitV n v
in Node (f1 v1) (f2 v2)

The split is done by looking at the type of the first argument (Vect b n -> Tree b k). We know that we have to split at n, so we bring {n} into the scope of the implementation as an implicit parameter.

Besides the type-changing lens (that changes a to b), we can also implement a simple lens:

treeSimpleLens : PolyLens (Tree a) (Tree a) (Vect a) (Vect a)
treeSimpleLens {a} t = replace a t

We’ll need it later for testing.

Testing

To give it a try, let’s create a small tree with five nodes and three leaves:

t3 : Tree Char 5
t3 = (Node (Leaf 'z') (Node (Leaf 'a') (Leaf 'b')))

We can extract the leaves using our lens:

getLeaves : Tree a n -> SomeVect a
getLeaves t = getLens treeSimpleLens t

As expected, we get a vector containing 'z', 'a', and 'b'.

We can also transform the leaves using our lens and the polymorphic sort function:

trLeaves : ({n : Nat} -> Vect a n -> Vect b n) -> Tree a n -> SomeTree b
trLeaves f t = transLens treeLens f t
trLeaves sortV

The result is a new tree: ('a',('b','z'))

Complete code is available on github.