June 2020



I have recently watched a talk by Gabriel Gonzalez about folds, which caught my attention because of my interest in both recursion schemes and optics. A Fold is an interesting abstraction. It encapsulates the idea of focusing on a monoidal contents of some data structure. Let me explain.

Suppose you have a data structure that contains, among other things, a bunch of values from some monoid. You might want to summarize the data by traversing the structure and accumulating the monoidal values in an accumulator. You may, for instance, concatenate strings, or add integers. Because we are dealing with a monoid, which is associative, we could even parallelize the accumulation.

In practice, however, data structures are rarely filled with monoidal values or, if they are, it’s not clear which monoid to use (e.g., in case of numbers, additive or multiplicative?). Usually monoidal values have to be extracted from the container. We need a way to convert the contents of the container to monoidal values, perform the accumulation, and then convert the result to some output type. This could be done, for instance by fist applying fmap, and then traversing the result to accumulate monoidal values. For performance reasons, we might prefer the two actions to be done in a single pass.

Here’s a data structure that combines two functions, one converting a to some monoidal value m and the other converting the final result to b. The traversal itself should not depend on what monoid is being used so, in Haskell, we use an existential type.

data Fold a b = forall m. Monoid m => Fold (a -> m) (m -> b)

The data constructor of Fold is polymorphic in m, so it can be instantiated for any monoid, but the client of Fold will have no idea what that monoid was. (In actual implementation, the client is secretly passed a table of functions: one to retrieve the unit of the monoid, and another to perform the mappend.)

The simplest container to traverse is a list and, indeed, we can use a Fold to fold a list. Here’s the less efficient, but easy to understand implementation

fold :: Fold a b -> [a] -> b
fold (Fold s g) = g . mconcat . fmap s

See Gabriel’s blog post for a more efficient implementation.

A Fold is a functor

instance Functor (Fold a) where
  fmap f (Fold scatter gather) = Fold scatter (f . gather)

In fact it’s a Monoidal functor (in category theory, it’s called a lax monoidal functor)

class Monoidal f where
  init :: f ()
  combine :: f a -> f b -> f (a, b)

You can visualize a monoidal functor as a container with two additional properties: you can initialize it with a unit, and you can coalesce a pair of containers into a container of pairs.

instance Monoidal (Fold a) where
  -- Fold a ()
  init = Fold bang id
  -- Fold a b -> Fold a c -> Fold a (b, c)
  combine (Fold s g) (Fold s' g') = Fold (tuple s s') (bimap g g')

where we used the following helper functions

bang :: a -> ()
bang _ = ()

tuple :: (c -> a) -> (c -> b) -> (c -> (a, b))
tuple f g = \c -> (f c, g c)

This property can be used to easily aggregate Folds.

In Haskell, a monoidal functor is equivalent to the more common applicative functor.

A list is the simplest example of a recursive data structure. The immediate question is, can we use Fold with other recursive data structures? The generalization of folding for recursively-defined data structures is called a catamorphism. What we need is a monoidal catamorphism.

Algebras and catamorphisms

Here’s a very short recap of simple recursion schemes (for more, see my blog). An algebra for a functor f with the carrier a is defined as

type Algebra f a = f a -> a


Think of the functor f as defining a node in a recursive data structure (often, this functor is defined as a sum type, so we have more than one type of node). An algebra extracts the contents of this node and summarizes it. The type a is called the carrier of the algebra.

A fixed point of a functor is the carrier of its initial algebra

newtype Fix f = Fix { unFix :: f (Fix f) }


Think of it as a node that contains other nodes, which contain nodes, and so on, recursively.

A catamorphism generalizes a fold

cata :: Functor f => Algebra f a -> Fix f -> a
cata alg = alg . fmap (cata alg) . unFix

It’s a recursively defined function. It’s first applied using fmap to all the children of the node. Then the node is evaluated using the algebra.

Monoidal algebras

We would like to use a Fold to fold an arbitrary recursive data structure. We are interested in data structures that store values of type a which can be converted to monoidal values. Such structures are generated by functors of two arguments (bifunctors).

class Bifunctor f where
  bimap :: (a -> a') -> (b -> b') -> f a b -> f a' b'


In our case, the first argument will be the payload and the second, the placeholder for recursion and the carrier for the algebra.

We start by defining a monoidal algebra for such a functor by assuming that it has a monoidal payload, and that the child nodes have already been evaluated to a monoidal value

type MAlgebra f = forall m. Monoid m => f m m -> m

A monoidal algebra is polymorphic in the monoid m reflecting the requirement that the evaluation should only be allowed to use monoidal unit and monoidal multiplication.

A bifunctor is automatically a functor in its second argument

instance Bifunctor f => Functor (f a) where
  fmap g = bimap id g

We can apply the fixed point to this functor to define a recursive data structure Fix (f a).

We can then use Fold to convert the payload of this data structure to monoidal values, and then apply a catamorphism to fold it

cat :: Bifunctor f => MAlgebra f -> Fold a b -> Fix (f a) -> b
cat malg (Fold s g) = g . cata alg
  where
    alg = malg . bimap s id

Here’s this process in more detail. This is the monoidal catamorphism that we are defining:

We first apply cat, recursively, to all the children. This replaces the children with monoidal values. We also convert the payload of the node to the same monoid using the first component of Fold. We can then use the monoidal algebra to combine the payload with the results of folding the children.

Finally, we convert the result to the target type.

We have factorized the original problem in three orthogonal directions: the monoidal algebra, the Fold, and the traversal of the particular recursive data structure.

Example

Here’s a simple example. We define a bifunctor that generates a binary tree with arbitrary payload a stored at the leaves

data TreeF a r = Leaf a | Node r r

It is indeed a bifunctor

instance Bifunctor TreeF where
  bimap f g (Leaf a) = Leaf (f a)
  bimap f g (Node r r') = Node (g r) (g r')

The recursive tree is generated as its fixed point

type Tree a = Fix (TreeF a)

Here’s an example of a tree

We define two smart constructors to simplify the construction of trees

leaf :: a -> Tree a
leaf a = Fix (Leaf a)

node :: Tree a -> Tree a -> Tree a
node t t' = Fix (Node t t')

We can define a monoidal algebra for this functor. Notice that it only uses monoidal operations (we don’t even need the monoidal unit here, since values are stored in the leaves). It will therefore work for any monoid

myAlg :: MAlgebra TreeF
myAlg (Leaf m) = m
myAlg (Node m m') = m <> m'

Separately, we define a Fold whose internal monoid is Sum Int. It converts Double values to this monoid using floor, and converts the result to a String using show

myFold :: Fold Double String
myFold = Fold floor' show'
  where
    floor' :: Double -> Sum Int
    floor' = Sum . floor
    show' :: Sum Int -> String
    show' = show . getSum

This Fold has no knowledge of the data structure we’ll be traversing. It’s only interested in its payload.

Here’s a small tree containing three Doubles

myTree :: Tree Double
myTree = node (node (leaf 2.3) (leaf 10.3)) (leaf 1.1)

We can monoidally fold this tree and display the resulting String

Notice that we can use the same monoidal catamorphism with any monoidal algebra and any Fold.

The following pragmas were used in this program

{-# language ExistentialQuantification #-}
{-# language RankNTypes #-}
{-# language FlexibleInstances #-}
{-# language IncoherentInstances #-}

Relation to Optics

A Fold can be seen as a form of optic. It takes a source type, extracts a monoidal value from it, and maps a monoidal value to the target type; all the while keeping the monoid existential. Existential types are represented in category theory as coends—here we are dealing with a coend over the category of monoids \mathbf{Mon}(\mathbf{C}) in some monoidal category \mathbf C. There is an obvious forgetful functor U that forgets the monoidal structure and produces an object of \mathbf C. Here’s the categorical formula that corresponds to Fold

\int^{m \in Mon(C)} C(s, U m)\times C(U m, t)

This coend is taken over a profunctor in the category of monoids

P n m = C(s, U m) \times C(U n, t)

The coend is defined as a disjoint union of sets P m m in which we identify some of the elements. Given a monoid homomorphism f \colon m \to n, and a pair of morphisms

u \colon s \to U m

v \colon U n \to t

we identify the pairs

((U f) \circ u, v) \sim (u, v \circ (U f))

This is exactly what we need to make our monoidal catamorphism work. This condition ensures that the following two scenarios are equivalent:

  • Use the function u to extract monoidal values, transform these values to another monoid using f, do the folding in the second monoid, and translate the result using v
  • Use the function u to extract monoidal values, do the folding in the first monoid, use f to transform the result to the second monoid, and translate the result using v

Since the monoidal catamorphism only uses monoidal operations and f is a monoid homomorphism, this condition is automatically satisfied.


Previously we discussed ninth chords, which are the first in a series of extension chords. Extensions are the notes that go beyond the first octave. Since we build chords by stacking thirds on top of each other, the next logical step, after the ninth chord, is the eleventh and the thirteenth chords. And that’s it: there is no fifteenth chord, because the fifteenth would be the same as the root (albeit two octaves higher).

This strange musical arithmetic is best understood if we translate all intervals into their semitone equivalents in equal temperament. Since we started by constructing the E major chord, let’s work with the E major scale, which consists of the following notes:

|E |  |F#|  |G#|A  |  |B |  |C#|  |D#|E |

Let’s chart the chord tones taking E as the root.

We see the clash of several naming conventions. Letter names have their origin is the major diatonic scale, as implemented by the white keys on the piano starting from C.

|C |  |D |  |E |F |  |G |  |A |  |B |C |

They go in alphabetical order, wrapping around after G. On the guitar we don’t have white and black keys, so this convention seems rather arbitrary.

The names of intervals (here, marked by digits, with occasional accidental symbols) are also based on the diatonic scale. They essentially count the number of letters from the root (including the root). So the distance from E to B is 5, because you count E, F, G, A, B — five letters. For a mathematician this convention makes little sense, but it is what it is.

After 12 semitones, we wrap around, as far as note names are concerned. With intervals the situation is a bit more nuanced. The ninth can be, conceptually, identified with the second; the eleventh with the fourth; and the thirteenth with the sixth. But how we name the intervals depends on their harmonic function. For instance, the same note, C#, is called the sixth in the E6 chord, and the thirteenth in E13. The difference is that E13 also contains the (dominant) seventh and the ninth.

A full thirteenth chord contains seven notes (root, third, fifth, seventh, ninth, eleventh, and thirteenth), so it cannot possibly be voiced on a six-string guitar. We usually drop the eleventh (as you can see above). The ninth and the fifth can be omitted as well. The root is very important, since it defines the chord, but when you’re playing in a band, it can be taken over by the bass instrument. The third is important because it distinguishes between major and minor modes (but then again, you have power chords that skip the third). The seventh is somewhat important in defining the dominant role of the chord.

Notice that a thirteenth chord can be seen as two separate chords on top of each other. E13 can be decomposed into E7 with F#m on top (try to spot these two shapes in this grip). Seen this way, the major/minor clash is another argument to either drop the eleventh (which serves as the minor third of F#m) or sharp it.

Alternatively, one could decompose E13 into E with DΔ7 on top. The latter shape is also easily recognized in this grip.

I decided against listing eleventh chords because they are awkward to voice on the guitar and because they are rarely used. Thirteenth chords are more frequent, especially in jazz. You’ve seen E13, here’s G13:

It skips the 11th and the 5th; and the 9th at the top is optional.

The Role of Harmonics

It might be worth explaining why omitting the fifth in G13 doesn’t change the character of the chord. The reason is that, when you play the root note, you are also producing harmonics. One of the strongest harmonics is the fifth, more precisely, the fifth over the octave. So, even if you don’t voice it, you can hear it. In fact, a lot of the quality of a given chord voicing depends on the way the harmonics interact with each other, especially in the bass. When you strum the E chord on the guitar, you get a strong root sound E, and the B on the next thickest string amplifies its harmonic fifth. Compare this with the G shape, which also starts with the root, but the next string voices the third, B, which sounds okay, but not great, so some people mute it.

Inverted chords, even though they contain the same notes (up to octave equivalence) may sound dissonant, depending on the context (in particular, voice leading in the bass). This is why we don’t usually play the lowest string in C and A shapes, or the two lowest strings in the D shape.

In the C shape, the third in the bass clashes with the root and is usually muted. That’s because the strongest harmonic of E is B, which makes C/E sound like CΔ7.

On the other hand, when you play the CΔ7 chord, the E in the bass sounds great, for exactly the same reason.

You can also play C with the fifth in the bass, as C/G, and it sounds good, probably because the harmonic D of G gives it the ninth flavor. This harmonic is an octave and a fifth above G, so it corresponds to the D that would be voiced on the third fret of the B string.

The same reasoning doesn’t quite work for the A shape. Firstly, because all four lower strings in A/E voice the very strong power chord (two of them open strings) drowning out the following third. Also the fifth above E is the B that’s just two semitones below the third C# voiced on the B string. (Theoretically, C/G has a third doubled on the thinest string but that doesn’t seem to clash as badly with the D harmonic of G. Again, the ear beats theory!)

Next: Altered chords.


We have already discussed several kinds of seventh chords. But if you can extend the chord by adding a third above it, why not top it with yet another third? This way we arrive at the ninth chord. But a ninth is one whole step above the octave. So far we’ve been identifying notes that cross the octave with their counterparts that are 12 semitones lower. A mathematician would say that we are doing arithmetic modulo 12. But this is just a useful fiction. A lot of things in music theory can be motivated using modular arithmetic, but ultimately, we have to admit that if something doesn’t sound right, it’s not right.

A ninth is 14 semitones above the root (if you don’t flat or sharp it), so it should be identified with the second, which is 2 semitones up from the root. That puts it smack in the middle between the root and the third: a pretty jarring dissonance. We’ve seen a second used in a chord before, but it was playing the role of a suspended third. In a ninth chord, you keep the third, and move the second to the next octave, where it becomes a ninth and cannot do as much damage. Instead it provides color and tension, making things more interesting.

To construct E9, we start with E7. It has the root duplicated on the thinnest string, so it’s easy to raise it by two semitones to produce the ninth.

There are many variations of the ninth chord. There is a minor version, with the third lowered; the seventh can be raised to a major seventh; and the ninth itself can be flatted or sharped. We won’t cover all these.

Following the same pattern, C9 can be constructed from C7 by raising the root by two semitones.

We get a highly movable shape, especially if we put the fifth on the thinnest string. In particular, it can be moved one fret towards the nut to produce B9–a slight modification of the B7 grip we’ve seen before.

If you look carefully at this shape, you might recognize parts of Gm in it (the three thinnest strings). This is no coincidence. The fifth, the seventh, and the ninth of any ninth chord form a minor triad.

Here is the E9 grip obtained by transposing C9 down the fretboard. It’s used a lot in funk:

The same chord with a sharped ninth is called the Hendrix chord, after Jimi Hendrix who popularized it:

The E9 shape is not only movable, but it’s also easy to mutate. This is the minor version:


and this is the major seventh version:

Such chords are quite common in Bossa Nova.

A9 is obtained by raising the root of A7 by two semitones:


Can you spot the Dm shape raised by two frets?

Similarly, G9 is constructed from G7, and it conceals a Dm as part of it.

Next: Extension chords.