I have recently watched a talk by Gabriel Gonzalez about folds, which caught my attention because of my interest in both recursion schemes and optics. A `Fold`

is an interesting abstraction. It encapsulates the idea of focusing on a *monoidal contents* of some data structure. Let me explain.

Suppose you have a data structure that contains, among other things, a bunch of values from some monoid. You might want to summarize the data by traversing the structure and accumulating the monoidal values in an accumulator. You may, for instance, concatenate strings, or add integers. Because we are dealing with a monoid, which is associative, we could even parallelize the accumulation.

In practice, however, data structures are rarely filled with monoidal values or, if they are, it’s not clear which monoid to use (e.g., in case of numbers, additive or multiplicative?). Usually monoidal values have to be extracted from the container. We need a way to convert the contents of the container to monoidal values, perform the accumulation, and then convert the result to some output type. This could be done, for instance by fist applying `fmap`

, and then traversing the result to accumulate monoidal values. For performance reasons, we might prefer the two actions to be done in a single pass.

Here’s a data structure that combines two functions, one converting `a`

to some monoidal value `m`

and the other converting the final result to `b`

. The traversal itself should not depend on what monoid is being used so, in Haskell, we use an existential type.

data Fold a b = forall m. Monoid m => Fold (a -> m) (m -> b)

The data constructor of `Fold`

is polymorphic in `m`

, so it can be instantiated for any monoid, but the client of `Fold`

will have no idea what that monoid was. (In actual implementation, the client is secretly passed a table of functions: one to retrieve the unit of the monoid, and another to perform the `mappend`

.)

The simplest container to traverse is a list and, indeed, we can use a `Fold`

to fold a list. Here’s the less efficient, but easy to understand implementation

fold :: Fold a b -> [a] -> b fold (Fold s g) = g . mconcat . fmap s

See Gabriel’s blog post for a more efficient implementation.

A `Fold`

is a functor

instance Functor (Fold a) where fmap f (Fold scatter gather) = Fold scatter (f . gather)

In fact it’s a `Monoidal`

functor (in category theory, it’s called a lax monoidal functor)

class Monoidal f where init :: f () combine :: f a -> f b -> f (a, b)

You can visualize a monoidal functor as a container with two additional properties: you can initialize it with a unit, and you can coalesce a pair of containers into a container of pairs.

instance Monoidal (Fold a) where -- Fold a () init = Fold bang id -- Fold a b -> Fold a c -> Fold a (b, c) combine (Fold s g) (Fold s' g') = Fold (tuple s s') (bimap g g')

where we used the following helper functions

bang :: a -> () bang _ = () tuple :: (c -> a) -> (c -> b) -> (c -> (a, b)) tuple f g = \c -> (f c, g c)

This property can be used to easily aggregate `Fold`

s.

In Haskell, a monoidal functor is equivalent to the more common applicative functor.

A list is the simplest example of a recursive data structure. The immediate question is, can we use `Fold`

with other recursive data structures? The generalization of folding for recursively-defined data structures is called a catamorphism. What we need is a *monoidal* catamorphism.

# Algebras and catamorphisms

Here’s a very short recap of simple recursion schemes (for more, see my blog). An algebra for a functor `f`

with the carrier `a`

is defined as

type Algebra f a = f a -> a

Think of the functor `f`

as defining a node in a recursive data structure (often, this functor is defined as a sum type, so we have more than one type of node). An algebra extracts the contents of this node and summarizes it. The type `a`

is called the carrier of the algebra.

A fixed point of a functor is the carrier of its initial algebra

newtype Fix f = Fix { unFix :: f (Fix f) }

Think of it as a node that contains other nodes, which contain nodes, and so on, recursively.

A catamorphism generalizes a fold

cata :: Functor f => Algebra f a -> Fix f -> a cata alg = alg . fmap (cata alg) . unFix

It’s a recursively defined function. It’s first applied using `fmap`

to all the children of the node. Then the node is evaluated using the algebra.

# Monoidal algebras

We would like to use a `Fold`

to fold an arbitrary recursive data structure. We are interested in data structures that store values of type `a`

which can be converted to monoidal values. Such structures are generated by functors of two arguments (bifunctors).

class Bifunctor f where bimap :: (a -> a') -> (b -> b') -> f a b -> f a' b'

In our case, the first argument will be the payload and the second, the placeholder for recursion and the carrier for the algebra.

We start by defining a monoidal algebra for such a functor by assuming that it has a monoidal payload, and that the child nodes have already been evaluated to a monoidal value

type MAlgebra f = forall m. Monoid m => f m m -> m

A monoidal algebra is polymorphic in the monoid `m`

reflecting the requirement that the evaluation should only be allowed to use monoidal unit and monoidal multiplication.

A bifunctor is automatically a functor in its second argument

instance Bifunctor f => Functor (f a) where fmap g = bimap id g

We can apply the fixed point to this functor to define a recursive data structure `Fix (f a)`

.

We can then use `Fold`

to convert the payload of this data structure to monoidal values, and then apply a catamorphism to fold it

cat :: Bifunctor f => MAlgebra f -> Fold a b -> Fix (f a) -> b cat malg (Fold s g) = g . cata alg where alg = malg . bimap s id

Here’s this process in more detail. This is the monoidal catamorphism that we are defining:

We first apply `cat`

, recursively, to all the children. This replaces the children with monoidal values. We also convert the payload of the node to the same monoid using the first component of `Fold`

. We can then use the monoidal algebra to combine the payload with the results of folding the children.

Finally, we convert the result to the target type.

We have factorized the original problem in three orthogonal directions: the monoidal algebra, the `Fold`

, and the traversal of the particular recursive data structure.

# Example

Here’s a simple example. We define a bifunctor that generates a binary tree with arbitrary payload `a`

stored at the leaves

data TreeF a r = Leaf a | Node r r

It is indeed a bifunctor

instance Bifunctor TreeF where bimap f g (Leaf a) = Leaf (f a) bimap f g (Node r r') = Node (g r) (g r')

The recursive tree is generated as its fixed point

type Tree a = Fix (TreeF a)

We define two smart constructors to simplify the construction of trees

leaf :: a -> Tree a leaf a = Fix (Leaf a) node :: Tree a -> Tree a -> Tree a node t t' = Fix (Node t t')

We can define a monoidal algebra for this functor. Notice that it only uses monoidal operations (we don’t even need the monoidal unit here, since values are stored in the leaves). It will therefore work for any monoid

myAlg :: MAlgebra TreeF myAlg (Leaf m) = m myAlg (Node m m') = m <> m'

Separately, we define a `Fold`

whose internal monoid is `Sum Int`

. It converts `Double`

values to this monoid using `floor`

, and converts the result to a `String`

using `show`

myFold :: Fold Double String myFold = Fold floor' show' where floor' :: Double -> Sum Int floor' = Sum . floor show' :: Sum Int -> String show' = show . getSum

This `Fold`

has no knowledge of the data structure we’ll be traversing. It’s only interested in its payload.

Here’s a small tree containing three `Double`

s

myTree :: Tree Double myTree = node (node (leaf 2.3) (leaf 10.3)) (leaf 1.1)

We can monoidally fold this tree and display the resulting `String`

Notice that we can use the same monoidal catamorphism with any monoidal algebra and any `Fold`

.

The following pragmas were used in this program

{-# language ExistentialQuantification #-} {-# language RankNTypes #-} {-# language FlexibleInstances #-} {-# language IncoherentInstances #-}

# Relation to Optics

A `Fold`

can be seen as a form of optic. It takes a source type, extracts a monoidal value from it, and maps a monoidal value to the target type; all the while keeping the monoid existential. Existential types are represented in category theory as coends—here we are dealing with a coend over the category of monoids in some monoidal category . There is an obvious forgetful functor that forgets the monoidal structure and produces an object of . Here’s the categorical formula that corresponds to `Fold`

This coend is taken over a profunctor in the category of monoids

The coend is defined as a disjoint union of sets in which we identify some of the elements. Given a monoid homomorphism , and a pair of morphisms

we identify the pairs

This is exactly what we need to make our monoidal catamorphism work. This condition ensures that the following two scenarios are equivalent:

- Use the function to extract monoidal values, transform these values to another monoid using , do the folding in the second monoid, and translate the result using
- Use the function to extract monoidal values, do the folding in the first monoid, use to transform the result to the second monoid, and translate the result using

Since the monoidal catamorphism only uses monoidal operations and is a monoid homomorphism, this condition is automatically satisfied.