Trying to improve my Haskell coding skills, I decided to test myself at solving the 2017 Advent of Code problems. It’s been a lot of fun and a great learning experience. One problem in particular stood out for me because, for the first time, it let me apply, in anger, the ideas I learned from category theory. But I’m not going to talk about category theory this time, just about programming.

The problem is really about dominoes. You get a set of dominoes, or pairs of numbers (the twist is that the numbers are not capped at 6), and you are supposed to build a chain, in which the numbers are matched between consecutive pieces. For instance, the chain `[(0, 5), (5, 12), (12, 12), (12, 1)]`

is admissible. Like in the real game, the pieces can be turned around, so `(1, 3)`

can be also used as `(3, 1)`

. The goal is to build a chain that starts from zero and maximizes the score, which is the sum of numbers on the pieces used.

The algorithm is pretty straightforward. You put all dominoes in a data structure that lets you quickly pull the pieces you need, you recursively build all possible chains, evaluate their sums, and pick the winner.

Let’s start with some data structures.

type Piece = (Int, Int) type Chain = [Piece]

At each step in the procedure, we will be looking for a domino with a number that matches the end of the current chain. If the chain is `[(0, 5), (5, 12)]`

, we will be looking for pieces with 12 on one end. It’s best to organize pieces in a map indexed by these numbers. To allow for turning the dominoes, we’ll add each piece twice. For instance, the piece `(12, 1)`

will be added as `(12, 1)`

and `(1, 12)`

. We can make a small optimization for symmetric dominoes, like `(12, 12)`

, by adding them only once.

We’ll use the `Map`

from the Prelude:

import qualified Data.Map as Map

The `Map`

we’ll be using is:

type Pool = Map.Map Int [Int]

The key is an integer, the value is a list of integers corresponding to the other ends of pieces. This is made clear by the way we insert each piece in the map:

addPiece :: Piece -> Pool -> Pool addPiece (m, n) = if m /= n then add m n . add n m else add m n where add m n pool = case Map.lookup m pool of Nothing -> Map.insert m [n] pool Just lst -> Map.insert m (n : lst) pool

I used point-free notation. If that’s confusing, here’s the translation:

addPiece :: Piece -> Pool -> Pool addPiece (m, n) pool = if m /= n then add m n (add n m pool) else add m n pool

As I said, each piece is added twice, except for the symmetric ones.

After using a piece in a chain, we’ll have to remove it from the pool:

removePiece :: Piece -> Pool -> Pool removePiece (m, n) = if m /= n then rem m n . rem n m else rem m n where rem :: Int -> Int -> Pool -> Pool rem m n pool = case fromJust $ Map.lookup m pool of [] -> Map.delete m pool lst -> Map.insert m (delete n lst) pool

You might be wondering why I’m using a partial function `fromJust`

. In industrial-strength code I would pattern match on the `Maybe`

and issue a diagnostic if the piece were not found. Here I’m fine with a fatal exception if there’s a bug in my reasoning.

It’s worth mentioning that, like all data structures in Haskell, `Map`

is a persistent data structure. It means that it’s never modified in place, and its previous versions *persist*. This is invaluable in this kind of recursive algorithms, where we use backtracking to explore multiple paths.

The input of the puzzle is a list of pieces. We’ll start by inserting them into our map. In functional programming we never think in terms of loops: we transform data. A list of pieces is a (recursive) data structure. We want to traverse it and accumulate the information stored in it into a `Map`

. This kind of transformation is, in general, called a *catamorphism*. A list catamorphism is called a *fold*. It is specified by two things: (1) its action on the empty list (here, it turns it into `Map.empty`

), and (2) its action on the head of the current list and the accumulated result of processing the tail. The head of the current list is a piece, and the accumulator is the `Map`

. The function `addPiece`

has just the right signature:

presort :: [Piece] -> Pool presort = foldr addPiece Map.empty

I’m using a right fold, but a left fold would work fine, too. Again, this is point free notation.

Now that the preliminaries are over, let’s think about the algorithm. My first approach was to define a bunch of mutually recursive functions that would build all possible chains, score them, and then pick the best one. After a few tries, I got hopelessly bogged down in details. I took a break and started thinking.

Functional programming is all about functions, right? Using a recursive function is the correct approach. Or is it? The more you program in Haskell, the more you realize that you get the most power by considering wholesale transformations of data structures. When creating a `Map`

of pieces, I didn’t write a recursive function over a list — I used a fold instead. Of course, behind the scenes, fold is implemented using recursion (which, thanks to tail recursion, is usually transformed into a loop). But the idea of applying transformations to data structures is what lets us soar above the sea of details and into the higher levels of abstraction.

So here’s the new idea: let’s create one gigantic data structure that contains all admissible chains built from the domino pieces at our disposal. The obvious choice is a tree. At the root we’ll have the starting number: zero, as specified in the description of the problem. All pool pieces that have a zero at one end will start a new branch. Instead of storing the whole piece at the node, we can just store the second number — the first being determined by the parent. So a piece (0, 5) starts a branch with a 5 node right below the 0 node. Next we’d look for pieces with a 5. Suppose that one of them is (5, 12), so we create a node with a 12, and so on. A tree with a variable list of branches is called a rose tree:

data Rose = NodeR Int [Rose] deriving Show

It’s always instructive to keep in mind at least one special boundary case. Consider what would happen if `(0, 5)`

were the only piece in the pool. We’d end up with the following tree:

NodeR 0 [NodeR 5 []]

We’ll come back to this example later.

The next question is, how do we build such a tree? We start with a set of dominoes gathered in a `Map`

. At every step in the algorithm we pick a matching domino, remove it from the pool, and start a new subtree. To start a subtree we need a number and a pool of remaining pieces. Let’s call this combination a *seed*.

The process of building a recursive data structure from a seed is called *anamorphism*. It’s a well studied and well understood process, so let’s try to apply it in our case. The key is to separate the big picture from the small picture. The big picture is the recursive data structure — the rose tree, in our case. The small picture is what happens at a single node.

Let’s start with the small picture. We are given a seed of the type `(Int, Pool)`

. We use the number as a key to retrieve a list of matching pieces from the `Pool`

(strictly speaking, just a list of numbers corresponding to the other ends of the pieces). Each piece will start a new subtree. The seed for such a subtree consists of the number at the other end of the piece and a new `Pool`

with the piece removed. A function that produces seeds from a given seed looks like this:

grow (n, pool) = case Map.lookup n pool of Nothing -> [] Just ms -> [(m, removePiece (m, n) pool) | m <- ms]

Now we have to translate this to a procedure that recreates a complete tree. The trick is to split the definition of the tree into local and global pictures. The local picture is captured by this data structure:

data TreeF a = NodeF Int [a] deriving Functor

Here, the recursion of the original rose tree is replaced by the type parameter `a`

. This data structure, which describes a single node, or a very shallow tree, is a functor with respect to `a`

(the compiler is able to automatically figure out the implementation of `fmap`

, but you can also do it by hand).

It’s important to realize that the recursive definition of a rose tree can be recovered as a fixed point of this functor. We define the fixed point as the data structure `X`

that results from replacing `a`

in the definition of `TreeF`

with `X`

. Symbolically:

X = TreeF X

In fact, this procedure of finding the fixed point can be written in all generality for any functor `f`

. If we call the fixed point `Fix f`

, we can define it by replacing the type argument to `f`

with `Fix f`

, as in:

newtype Fix f = Fix { unFix :: f (Fix f) }

Our rose tree is the fixed point of the functor `TreeF`

:

type Tree = Fix TreeF

This splitting of the recursive part from the functor part is very convenient because it lets us use non-recursive functions to generate or traverse recursive data structures.

In particular, the procedure of unfolding a data structure from a seed is captured by a non-recursive function of the following signature:

type Coalgebra f a = a -> f a

Here, `a`

serves as the seed that generates a single node populated with new seeds. We have already seen a function that generates seeds, we only have to cast it in the form of a coalgebra:

coalg :: Coalgebra TreeF (Int, Pool) coalg (n, pool) = case Map.lookup n pool of Nothing -> NodeF n [] Just ms -> NodeF n [(m, removePiece (m, n) pool) | m <- ms]

The pièce de résistance is the formula that uses a given coalgebra to unfold a recursive date structure. It’s called the anamorphism:

ana :: Functor f => Coalgebra f a -> a -> Fix f ana coalg = Fix . fmap (ana coalg) . coalg

Here’s the play-by-play: The anamorphism takes a seed and applies the coalgebra to it. That generates a single node with new seeds in place of children. Then it `fmap`

s the whole anamorphism over this node, thus unfolding the seeds into full-blown trees. Finally, it applies the constructor `Fix`

to produce the final tree. Notice that this is a recursive definition.

We are now in a position to build a tree that contains all admissible chains of dominoes. We do it by applying the anamorphism to our coalgebra:

tree = ana coalg

Once we have this tree, we could traverse it, or fold it, to retrieve all the chains and find the best one.

But once we have our tree in the form of a fixed point, we can be smart about folds as well. The procedure is essentially the same, except that now we are collecting information from the nodes of a tree. To do this, we define a non-recursive function called the *algebra*:

type Algebra f a = f a -> a

The type `a`

is called the carrier of the algebra. It plays the role of the accumulator of data.

We are interested in the algebra that would help us collect chains of dominoes from our rose tree. Suppose that we have already applied this algebra to all children of a particular node. Each child tree would produce its own list of chains. Our goal is to extend those chains by adding one more piece that is determined by the current node. Let’s start with our earlier trivial case of a tree that contains a single piece `(0, 5)`

:

NodeR 0 [Node 5 []]

We replace the leaf node with some value `x`

of the still unspecified carrier type. We get:

NodeR 0 x

Obviously, `x`

must contain the number 5, to let us recover the original piece `(0, 5)`

. The result of applying the algebra to the top node must produce the chain `[(0, 5)]`

. These two pieces of information suggest the carrier type to be a combination of a number and a list of chains. The leaf node is turned to `(5, [])`

, and the top node produces `(0, [[(0, 5)]])`

.

With this choice of the carrier type, the algebra is easy to implement:

chainAlg :: Algebra TreeF (Int, [Chain]) chainAlg (NodeF n []) = (n, []) chainAlg (NodeF n lst) = (n, concat [push (n, m) bs | (m, bs) <- lst]) where push :: (Int, Int) -> [Chain] -> [Chain] push (n, m) [] = [[(n, m)]] push (n, m) bs = [(n, m) : br | br <- bs]]

For the leaf (a node with no children), we return the number stored in it together with an empty list. Otherwise, we gather the chains from children. If a child returns an empty list of chains, meaning it was a leaf, we create a single-piece chain. If the list is not empty, we prepend a new piece to all the chains. We then concatenate all lists of chains into one list.

All that remains is to apply our algebra recursively to the whole tree. Again, this can be done in full generality using a catamorphism:

cata :: Functor f => Algebra f a -> Fix f -> a cata alg = alg . fmap (cata alg) . unFix

We start by stripping the fixed point constructor using `unFix`

to expose a node, apply the catamorphism to all its children, and apply the algebra to the node.

To summarize: we use an anamorphism to create a tree, then use a catamorphism to convert the tree to a list of chains. Notice that we don’t need the tree itself — we only use it to drive the algorithm. Because Haskell is lazy, the tree is evaluated on demand, node by node, as it is walked by the catamorphism.

This combination of an anamorphism followed immediately by a catamorphism comes up often enough to merit its own name. It’s called a *hylomorphism*, and can be written concisely as:

hylo :: Functor f => Algebra f a -> Coalgebra f b -> b -> a hylo f g = f . fmap (hylo f g) . g

In our example, we produce a list of chains using a hylomorphism:

let (_, chains) = hylo chainAlg coalg (0, pool)

The solution of the puzzle is the chain with the maximum score:

maximum $ fmap score chains score :: Chain -> Int score = sum . fmap score1 where score1 (m, n) = m + n

## Conclusion

The solution that I described in this post was not the first one that came to my mind. I could have persevered with the more obvious approach of implementing a big recursive function or a series of smaller mutually recursive ones. I’m glad I didn’t. I have found out that I’m much more productive when I can reason in terms of applying transformations to data structures.

You might think that a data structure that contains all admissible chains of dominoes would be too large to fit comfortably in memory, and you would probably be right in a *strict* language. But Haskell is a *lazy* language, and data structures work more often as control structures than as storage for data.

The use of recursion schemes further simplifies programming. You can design algebras and coalgebras as non-recursive functions, which are much easier to reason about, and then apply them to recursive data structures using catamorphisms and anamorphisms. You can even combine them into hylomorphisms.

It’s worth mentioning that we routinely apply these techniques to lists. I already mentioned that a fold is nothing but a list catamorphism. The functor in question can be written as:

data ListF e a = Nil | Cons e a deriving Functor

A list is a fixed point of this functor:

type List e = Fix (ListF e)

An algebra for the list functor is implemented by pattern matching on its two constructors:

alg :: ListF e a -> a alg Nil = z alg (Cons e a) = f e a

Notice that a list algebra is parameterized by two items: the value `z`

and the function `f :: e -> a -> a`

. These are exactly the parameters to `foldr`

. So, when you are calling `foldr`

, you are defining an algebra and performing a list catamorphism.

Likewise, a list anamorphism takes a coalgebra and a seed and produces a list. Finite lists are produced by the anamorphism called `unfoldr`

:

unfoldr :: (b -> Maybe (a, b)) -> b -> [a]

You can learn more about algebras and coalgebras from the point of view of category theory, in another blog post.

The source code for this post is available on GitHub.

December 29, 2017 at 11:35 pm

Great article! One correction – it should be “so we create a node with a 12” not 5.

December 30, 2017 at 12:47 am

In the text, “so we create a node with a 5…” should read “so we create a node with a 12…”

December 30, 2017 at 6:31 am

You mention in the first paragraph that you applied the concepts you learned in category theory “in anger”. Why is that? Could you elaborate?

December 30, 2017 at 8:20 am

It’s an expression. It means I used it not just to illustrate the concept but to solve an actual problem.

December 30, 2017 at 9:05 am

Ah, okay. Didn’t know. Thanks for explaining.

January 4, 2018 at 4:32 am

Related thoughts about generalized hylomorphism http://comonad.com/reader/2008/generalized-hylomorphisms/ and Control.Recursion https://www.eyrie.org/~zednenem/2004/hsce/Control.Recursion.html fold/unfold/refold

April 16, 2018 at 12:22 pm

I almost understand this, but I think I’m missing something. The first step is to build the tree using a Coalgebra with type b, then fold it using an Algebra with type a. It would seem there’s a conversion (b -> a) missing somewhere, yet the code compiles and works. Can you help with my error in thinking? Thanks!

April 16, 2018 at 5:57 pm

It’s a good question. Notice that the two types a and b play different roles: b is the type of the seed; and a is the type of the final result, or the accumulator. In principle they have no relation to the type of the tree in the middle. The type of the tree is Fix f, where f is TreeF. It doesn’t matter what type you apply TreeF to — it doesn’t appear in Fix f. What matters here is the load-carrying Ints that are hidden in each node.

April 17, 2018 at 1:01 pm

Thanks for posting this entry, it was invaluable. I just used this technique to solve a Rosalind challenge, to reconstitute a DNA strand from a smaller collection of overlapping snippets (given in random order).

Here’s the code if you’re interested in checking it out:

https://github.com/jasonincanada/rosalind/blob/master/long.hs

April 22, 2018 at 7:41 am

Hello Bartosz, Great post! I spend a full morning with it. A good use of my time to help consolidate the powerful

`hylomorphism`

pattern. Taking us through how you were thinking about the design was illuminating in that it describes a motivation that is critical to taking these skills elsewhere in our programming work.In a related effort to translate the Category to practice, I wrote something that I would love to get your thoughts on; after all, my time spent on your blog contributed to the idea.

I hope the ideas might also be relevant for the fantastic book you are putting together. Here is the link (https://stackoverflow.com/questions/3870088/a-monad-is-just-a-monoid-in-the-category-of-endofunctors-whats-the-proble%E2%85%BF/49950668#49950668).

One of my conclusions states that the

similaritiesbetween our choices for the`hylomorphism`

, i.e., choices between our two functorial, recursive computing structures, are not sufficiently described. Category that describes Monad, Lax, Day and the like, somehow does not resonate with a code-building intuition. The good news is that recently Rivas and Jaskelioff (Science of Computer Programming, Sep 2017), describes a way to unify these Categories in way that augments the importance of this understanding.My hypothesis: a great deal of misunderstanding has been caused by our using a monad to describe many of the important Categorical presentations of core computing concepts, without mentioning how often the concept is equally true for Applicatives and Arrows (among others).

Case in point, “monads are just monoids in the category of endofunctors”. Reputation aside, once I started to internalized what that means, what I found most compelling, is how much this infamous quote describes not only the power of a monad, but is equally relevant for describing the power of Applicatives, Arrows and others.

To the topic at hand, our choices for hylomorphic constructs in Haskell.

E

April 23, 2018 at 2:27 am

I noticed that in the original problem you only need to output the sum of the dominoes, and not the actual chain. In that case there is a much simpler algorithm: sum up all the numbers that belong to an even number of dominoes – you’ll never end your walk in one of them. Then of what’s left, pick the highest number, this will end the chain. As a preliminary step you’ll need to exclude all dominoes that aren’t reachable from 0.

Of course if you’re actually interested in the result chain then a recursive traversal is needed.

July 24, 2019 at 3:16 pm

[…] Aquinas. Hylomorphism is also a word prominently used by computer scientists to characterize recursive properties of Hadoop and other Big Data processing and analytics software platforms used for machine learning […]