December | 2017 | Bartosz Milewski's Programming Cafe

Trying to improve my Haskell coding skills, I decided to test myself at solving the 2017 Advent of Code problems. It’s been a lot of fun and a great learning experience. One problem in particular stood out for me because, for the first time, it let me apply, in anger, the ideas I learned from category theory. But I’m not going to talk about category theory this time, just about programming.

The problem is really about dominoes. You get a set of dominoes, or pairs of numbers (the twist is that the numbers are not capped at 6), and you are supposed to build a chain, in which the numbers are matched between consecutive pieces. For instance, the chain [(0, 5), (5, 12), (12, 12), (12, 1)] is admissible. Like in the real game, the pieces can be turned around, so (1, 3) can be also used as (3, 1). The goal is to build a chain that starts from zero and maximizes the score, which is the sum of numbers on the pieces used.

The algorithm is pretty straightforward. You put all dominoes in a data structure that lets you quickly pull the pieces you need, you recursively build all possible chains, evaluate their sums, and pick the winner.

Let’s start with some data structures.

type Piece = (Int, Int)
type Chain = [Piece]

At each step in the procedure, we will be looking for a domino with a number that matches the end of the current chain. If the chain is [(0, 5), (5, 12)], we will be looking for pieces with 12 on one end. It’s best to organize pieces in a map indexed by these numbers. To allow for turning the dominoes, we’ll add each piece twice. For instance, the piece (12, 1) will be added as (12, 1) and (1, 12). We can make a small optimization for symmetric dominoes, like (12, 12), by adding them only once.

We’ll use the Map from the Prelude:

import qualified Data.Map as Map

The Map we’ll be using is:

type Pool = Map.Map Int [Int]

The key is an integer, the value is a list of integers corresponding to the other ends of pieces. This is made clear by the way we insert each piece in the map:

addPiece :: Piece -> Pool -> Pool
addPiece (m, n) = if m /= n 
                  then add m n . add n m
                  else add m n
  where 
    add m n pool = 
      case Map.lookup m pool of
        Nothing  -> Map.insert m [n] pool
        Just lst -> Map.insert m (n : lst) pool

I used point-free notation. If that’s confusing, here’s the translation:

addPiece :: Piece -> Pool -> Pool
addPiece (m, n) pool = if m /= n 
                       then add m n (add n m pool)
                       else add m n pool

As I said, each piece is added twice, except for the symmetric ones.

After using a piece in a chain, we’ll have to remove it from the pool:

removePiece :: Piece -> Pool -> Pool
removePiece (m, n) = if m /= n
                     then rem m n . rem n m
                     else rem m n
  where
    rem :: Int -> Int -> Pool -> Pool
    rem m n pool = 
      case fromJust $ Map.lookup m pool of
        []  -> Map.delete m pool
        lst -> Map.insert m (delete n lst) pool

You might be wondering why I’m using a partial function fromJust. In industrial-strength code I would pattern match on the Maybe and issue a diagnostic if the piece were not found. Here I’m fine with a fatal exception if there’s a bug in my reasoning.

It’s worth mentioning that, like all data structures in Haskell, Map is a persistent data structure. It means that it’s never modified in place, and its previous versions persist. This is invaluable in this kind of recursive algorithms, where we use backtracking to explore multiple paths.

The input of the puzzle is a list of pieces. We’ll start by inserting them into our map. In functional programming we never think in terms of loops: we transform data. A list of pieces is a (recursive) data structure. We want to traverse it and accumulate the information stored in it into a Map. This kind of transformation is, in general, called a catamorphism. A list catamorphism is called a fold. It is specified by two things: (1) its action on the empty list (here, it turns it into Map.empty), and (2) its action on the head of the current list and the accumulated result of processing the tail. The head of the current list is a piece, and the accumulator is the Map. The function addPiece has just the right signature:

presort :: [Piece] -> Pool
presort = foldr addPiece Map.empty

I’m using a right fold, but a left fold would work fine, too. Again, this is point free notation.

Now that the preliminaries are over, let’s think about the algorithm. My first approach was to define a bunch of mutually recursive functions that would build all possible chains, score them, and then pick the best one. After a few tries, I got hopelessly bogged down in details. I took a break and started thinking.

Functional programming is all about functions, right? Using a recursive function is the correct approach. Or is it? The more you program in Haskell, the more you realize that you get the most power by considering wholesale transformations of data structures. When creating a Map of pieces, I didn’t write a recursive function over a list — I used a fold instead. Of course, behind the scenes, fold is implemented using recursion (which, thanks to tail recursion, is usually transformed into a loop). But the idea of applying transformations to data structures is what lets us soar above the sea of details and into the higher levels of abstraction.

So here’s the new idea: let’s create one gigantic data structure that contains all admissible chains built from the domino pieces at our disposal. The obvious choice is a tree. At the root we’ll have the starting number: zero, as specified in the description of the problem. All pool pieces that have a zero at one end will start a new branch. Instead of storing the whole piece at the node, we can just store the second number — the first being determined by the parent. So a piece (0, 5) starts a branch with a 5 node right below the 0 node. Next we’d look for pieces with a 5. Suppose that one of them is (5, 12), so we create a node with a 12, and so on. A tree with a variable list of branches is called a rose tree:

data Rose = NodeR Int [Rose]
  deriving Show

It’s always instructive to keep in mind at least one special boundary case. Consider what would happen if (0, 5) were the only piece in the pool. We’d end up with the following tree:

NodeR 0 [NodeR 5 []]

We’ll come back to this example later.

The next question is, how do we build such a tree? We start with a set of dominoes gathered in a Map. At every step in the algorithm we pick a matching domino, remove it from the pool, and start a new subtree. To start a subtree we need a number and a pool of remaining pieces. Let’s call this combination a seed.

The process of building a recursive data structure from a seed is called anamorphism. It’s a well studied and well understood process, so let’s try to apply it in our case. The key is to separate the big picture from the small picture. The big picture is the recursive data structure — the rose tree, in our case. The small picture is what happens at a single node.

Let’s start with the small picture. We are given a seed of the type (Int, Pool). We use the number as a key to retrieve a list of matching pieces from the Pool (strictly speaking, just a list of numbers corresponding to the other ends of the pieces). Each piece will start a new subtree. The seed for such a subtree consists of the number at the other end of the piece and a new Pool with the piece removed. A function that produces seeds from a given seed looks like this:

grow (n, pool) = 
  case Map.lookup n pool of
    Nothing -> []
    Just ms -> [(m, removePiece (m, n) pool) | m <- ms]

Now we have to translate this to a procedure that recreates a complete tree. The trick is to split the definition of the tree into local and global pictures. The local picture is captured by this data structure:

data TreeF a = NodeF Int [a]
  deriving Functor

Here, the recursion of the original rose tree is replaced by the type parameter a. This data structure, which describes a single node, or a very shallow tree, is a functor with respect to a (the compiler is able to automatically figure out the implementation of fmap, but you can also do it by hand).

It’s important to realize that the recursive definition of a rose tree can be recovered as a fixed point of this functor. We define the fixed point as the data structure X that results from replacing a in the definition of TreeF with X. Symbolically:

X = TreeF X

In fact, this procedure of finding the fixed point can be written in all generality for any functor f. If we call the fixed point Fix f, we can define it by replacing the type argument to f with Fix f, as in:

newtype Fix f = Fix { unFix :: f (Fix f) }

Our rose tree is the fixed point of the functor TreeF:

type Tree = Fix TreeF

This splitting of the recursive part from the functor part is very convenient because it lets us use non-recursive functions to generate or traverse recursive data structures.

In particular, the procedure of unfolding a data structure from a seed is captured by a non-recursive function of the following signature:

type Coalgebra f a = a -> f a

Here, a serves as the seed that generates a single node populated with new seeds. We have already seen a function that generates seeds, we only have to cast it in the form of a coalgebra:

coalg :: Coalgebra TreeF (Int, Pool)
coalg (n, pool) = 
  case Map.lookup n pool of
    Nothing -> NodeF n []
    Just ms -> NodeF n [(m, removePiece (m, n) pool) | m <- ms]

The pièce de résistance is the formula that uses a given coalgebra to unfold a recursive date structure. It’s called the anamorphism:

ana :: Functor f => Coalgebra f a -> a -> Fix f
ana coalg = Fix . fmap (ana coalg) . coalg

Here’s the play-by-play: The anamorphism takes a seed and applies the coalgebra to it. That generates a single node with new seeds in place of children. Then it fmaps the whole anamorphism over this node, thus unfolding the seeds into full-blown trees. Finally, it applies the constructor Fix to produce the final tree. Notice that this is a recursive definition.

We are now in a position to build a tree that contains all admissible chains of dominoes. We do it by applying the anamorphism to our coalgebra:

tree = ana coalg

Once we have this tree, we could traverse it, or fold it, to retrieve all the chains and find the best one.

But once we have our tree in the form of a fixed point, we can be smart about folds as well. The procedure is essentially the same, except that now we are collecting information from the nodes of a tree. To do this, we define a non-recursive function called the algebra:

type Algebra f a = f a -> a

The type a is called the carrier of the algebra. It plays the role of the accumulator of data.

We are interested in the algebra that would help us collect chains of dominoes from our rose tree. Suppose that we have already applied this algebra to all children of a particular node. Each child tree would produce its own list of chains. Our goal is to extend those chains by adding one more piece that is determined by the current node. Let’s start with our earlier trivial case of a tree that contains a single piece (0, 5):

NodeR 0 [Node 5 []]

We replace the leaf node with some value x of the still unspecified carrier type. We get:

NodeR 0 x

Obviously, x must contain the number 5, to let us recover the original piece (0, 5). The result of applying the algebra to the top node must produce the chain [(0, 5)]. These two pieces of information suggest the carrier type to be a combination of a number and a list of chains. The leaf node is turned to (5, []), and the top node produces (0, [[(0, 5)]]).

With this choice of the carrier type, the algebra is easy to implement:

chainAlg :: Algebra TreeF (Int, [Chain])
chainAlg (NodeF n []) = (n, [])
chainAlg (NodeF n lst) = (n, concat [push (n, m) bs | (m, bs) <- lst])
  where
    push :: (Int, Int) -> [Chain] -> [Chain]
    push (n, m) [] = [[(n, m)]]
    push (n, m) bs = [(n, m) : br | br <- bs]]

For the leaf (a node with no children), we return the number stored in it together with an empty list. Otherwise, we gather the chains from children. If a child returns an empty list of chains, meaning it was a leaf, we create a single-piece chain. If the list is not empty, we prepend a new piece to all the chains. We then concatenate all lists of chains into one list.

All that remains is to apply our algebra recursively to the whole tree. Again, this can be done in full generality using a catamorphism:

cata :: Functor f => Algebra f a -> Fix f -> a
cata alg = alg . fmap (cata alg) . unFix

We start by stripping the fixed point constructor using unFix to expose a node, apply the catamorphism to all its children, and apply the algebra to the node.

To summarize: we use an anamorphism to create a tree, then use a catamorphism to convert the tree to a list of chains. Notice that we don’t need the tree itself — we only use it to drive the algorithm. Because Haskell is lazy, the tree is evaluated on demand, node by node, as it is walked by the catamorphism.

This combination of an anamorphism followed immediately by a catamorphism comes up often enough to merit its own name. It’s called a hylomorphism, and can be written concisely as:

hylo :: Functor f => Algebra f a -> Coalgebra f b -> b -> a
hylo f g = f . fmap (hylo f g) . g

In our example, we produce a list of chains using a hylomorphism:

let (_, chains) = hylo chainAlg coalg (0, pool)

The solution of the puzzle is the chain with the maximum score:

maximum $ fmap score chains

score :: Chain -> Int
score = sum . fmap score1
  where score1 (m, n) = m + n

Conclusion

The solution that I described in this post was not the first one that came to my mind. I could have persevered with the more obvious approach of implementing a big recursive function or a series of smaller mutually recursive ones. I’m glad I didn’t. I have found out that I’m much more productive when I can reason in terms of applying transformations to data structures.

You might think that a data structure that contains all admissible chains of dominoes would be too large to fit comfortably in memory, and you would probably be right in a strict language. But Haskell is a lazy language, and data structures work more often as control structures than as storage for data.

The use of recursion schemes further simplifies programming. You can design algebras and coalgebras as non-recursive functions, which are much easier to reason about, and then apply them to recursive data structures using catamorphisms and anamorphisms. You can even combine them into hylomorphisms.

It’s worth mentioning that we routinely apply these techniques to lists. I already mentioned that a fold is nothing but a list catamorphism. The functor in question can be written as:

data ListF e a = Nil | Cons e a
  deriving Functor

A list is a fixed point of this functor:

type List e = Fix (ListF e)

An algebra for the list functor is implemented by pattern matching on its two constructors:

alg :: ListF e a -> a
alg Nil = z
alg (Cons e a) = f e a

Notice that a list algebra is parameterized by two items: the value z and the function f :: e -> a -> a. These are exactly the parameters to foldr. So, when you are calling foldr, you are defining an algebra and performing a list catamorphism.

Likewise, a list anamorphism takes a coalgebra and a seed and produces a list. Finite lists are produced by the anamorphism called unfoldr:

unfoldr :: (b -> Maybe (a, b)) -> b -> [a]

You can learn more about algebras and coalgebras from the point of view of category theory, in another blog post.

The source code for this post is available on GitHub.

Bartosz Milewski's Programming Cafe

December 2017

Stalking a Hylomorphism in the Wild

Conclusion

Top Posts

License

Blogroll

Follow Me

Archives