Persistent trees are more interesting than persistent lists, which were the topic of my previous post. In this installment I will concentrate on binary search trees. Such trees store values that can be compared to each other (they support total ordering). They may be used to implement sets, multisets, or associative arrays. Here I will focus on the simplest of those, the set — the others are easy extensions of the same scheme.

A set must support insertion and membership test (I’ll leave deletion as an exercise). These operations should be doable, on average, in logarithmic time, O(log(N)). Only balanced trees, however, can guarantee logarithmic time even in the worst case. A simple tree may degenerate to a singly-linked list, with performance dropping to O(N). I will start with a simple persistent tree and then proceed to a balanced red-black tree.

## Persistent Binary Search Tree

A tree is either empty or contains a left tree, a value, and a right tree.

This definition translates into a data structure with two constructors:

```cpp
template<class T>
class Tree
{
public:
    Tree(); // empty tree
    Tree(Tree const & lft, T val, Tree const & rgt);
};
```

Just as we did with persistent lists, we’ll encode the empty/non-empty tree distinction as a null/non-null (shared) pointer to a node. A `Node` represents a non-empty tree:

```cpp
struct Node
{
    Node(std::shared_ptr<const Node> const & lft
        , T val
        , std::shared_ptr<const Node> const & rgt)
    : _lft(lft), _val(val), _rgt(rgt)
    {}

    std::shared_ptr<const Node> _lft;
    T _val;
    std::shared_ptr<const Node> _rgt;
};
```

Here’s the complete construction/deconstruction part of the tree. Notice how similar it is to the list from my previous post. All these methods are const O(1) time, as expected. As before, the trick is to construct a new object (`Tree`) from big immutable chunks (`lft` and `rgt`), which can be safely put inside shared pointers without the need for deep copying.

```cpp
template<class T>
class Tree
{
    struct Node;
    explicit Tree(std::shared_ptr<const Node> const & node)
    : _root(node) {}
public:
    Tree() {}
    Tree(Tree const & lft, T val, Tree const & rgt)
    : _root(std::make_shared<const Node>(lft._root, val, rgt._root))
    {
        assert(lft.isEmpty() || lft.root() < val);
        assert(rgt.isEmpty() || val < rgt.root());
    }
    bool isEmpty() const { return !_root; }
    T root() const {
        assert(!isEmpty());
        return _root->_val;
    }
    Tree left() const {
        assert(!isEmpty());
        return Tree(_root->_lft);
    }
    Tree right() const {
        assert(!isEmpty());
        return Tree(_root->_rgt);
    }
private:
    std::shared_ptr<const Node> _root;
};
```

### Insert

The persistent nature of the tree manifests itself in the implementation of `insert`. Instead of modifying the existing tree, `insert` creates a new tree with the new element inserted in the right place. The implementation is recursive, so imagine that you are at a subtree of a larger tree. This subtree might be empty. Inserting an element into an empty tree means creating a single-node tree with the value being inserted, `x`, and two empty children.

On the other hand, if you’re not in an empty tree, you can retrieve the root value `y` and compare it with `x`. If `x` is less than `y`, it has to be inserted into the left child. If it’s greater, it must go into the right child. In both cases we make recursive calls to `insert`. If `x` is neither less nor greater than `y`, we assume it’s equal (that’s why we need total order) and ignore it. Remember, we are implementing a set, which does not store duplicates.

```cpp
Tree insert(T x) const {
    if (isEmpty())
        return Tree(Tree(), x, Tree());
    T y = root();
    if (x < y)
        return Tree(left().insert(x), y, right());
    else if (y < x)
        return Tree(left(), y, right().insert(x));
    else
        return *this; // no duplicates
}
```

Now consider how many new nodes are created during an insertion. A new node is only created in the constructor of a tree (in the code: `std::make_shared<const Node>(lft._root, val, rgt._root)`). The left and right children are not copied; they are stored by reference. At every level of recursion, the tree constructor is called at most once. So in the worst case, when we recurse all the way down to a leaf, we only create h new nodes, where h is the height of the tree. If the tree is not too far out of balance, its height scales like the logarithm of the number of nodes. To give you some perspective: if you store a billion values in a tree, an insertion will cost you about 30 node copies on average. If you need a logarithmic bound on the worst case, you have to use balanced trees (see below).

If you study the algorithm more closely, you’ll notice that only the nodes that are on the path from the root to the point of insertion are modified.

Testing for membership in a persistent tree is no different than in a non-persistent one. Here’s the recursive algorithm:

```cpp
bool member(T x) const {
    if (isEmpty())
        return false;
    T y = root();
    if (x < y)
        return left().member(x);
    else if (y < x)
        return right().member(x);
    else
        return true;
}
```

When using C++11, you might take advantage of the initializer list constructor to initialize a tree in one big swoop like this:

`Tree t{ 50, 40, 30, 10, 20, 30, 100, 0, 45, 55, 25, 15 };`


Here’s the implementation of such a constructor, which works in O(N*log(N)) average time (notice that it effectively sorts the elements, and O(N*log(N)) is the expected asymptotic behavior for sorting):

```cpp
Tree(std::initializer_list<T> init) {
    Tree t;
    for (T v : init) {
        t = t.insert(v);
    }
    _root = t._root;
}
```

## Persistent Red-Black Tree

If you want to keep your tree reasonably balanced — that is, to guarantee that its height is on the order of log(N) — you must do some rebalancing after inserts (or deletes). Care has to be taken to make sure that rebalancing doesn’t spoil the logarithmic behavior of those operations. The balance is often expressed as a set of invariants. You can’t just require that every path from root to leaf be of equal length, because that would constrain the number of elements to always be one less than a power of two. So you must give it some slack.

In the case of a red-black tree, the invariants are formulated in terms of colors. Every node in the tree is marked as either red or black. These are the two invariants that have to be preserved by every operation:

1. Red invariant: No red node can have a red child.
2. Black invariant: Every path from the root to an empty leaf must contain the same number of black nodes — the black height of the tree.

This way, if the shortest path in a tree is all black, the longest path can be at most twice as long, containing one red node between each pair of black nodes. The height of such a tree varies between log(N) (all black) and 2*log(N) (maximally red).

With these constraints in mind, the re-balancing can be done in log(N) time by localizing the modifications to the nearest vicinity of the path from the root to the point of insertion or deletion.

Let’s start with basic definitions. The node of the tree will now store its color:

`enum Color { R, B };`

Otherwise, it’s the same as before:

```cpp
struct Node
{
    Node(Color c
        , std::shared_ptr<const Node> const & lft
        , T val
        , std::shared_ptr<const Node> const & rgt)
    : _c(c), _lft(lft), _val(val), _rgt(rgt)
    {}

    Color _c;
    std::shared_ptr<const Node> _lft;
    T _val;
    std::shared_ptr<const Node> _rgt;
};
```

An empty tree will be considered black by convention.

The membership test ignores colors so we don’t have to re-implement it. In fact the search performance of a persistent RB Tree is exactly the same as that of an imperative RB Tree. You pay no penalty for persistence in search.

With insertion, you pay the penalty of having to copy the path from root to the insertion point, which doesn’t change its O(log(N)) asymptotic behavior. As I explained before, what you get in exchange is immutability of every copy of your data structure.

### The Balancing

Let’s have a look at the previous version of `insert` and figure out how to modify it so the result preserves the RB Tree invariants.

```cpp
Tree insert(T x) const {
    if (isEmpty())
        return Tree(Tree(), x, Tree());
    T y = root();
    if (x < y)
        return Tree(left().insert(x), y, right());
    else if (y < x)
        return Tree(left(), y, right().insert(x));
    else
        return *this; // no duplicates
}
```

Let’s first consider the most difficult scenario: the insertion into a maximum-capacity tree for a given black height. Such a tree has alternating levels of all-black and all-red nodes. The only way to increase its capacity is to increase its black height. The cheapest way to add one more black level to all paths (thus preserving the black invariant) is to do it at the root (for instance, lengthening all the paths at the leaves would require O(N) red-to-black repaintings).

So here’s the plan: We’ll insert a new node at the leaf level and make it red. This won’t break the black invariant, but may break the red invariant (if the parent node was red). We’ll then retrace our steps back to the root, percolating any red violation up. Then, at the top level, we’ll paint the resulting root black, thus killing two birds with one stone: If we ended up with a red violation at the top, this will fix it and, at the same time, increase the black height of the whole tree.

It’s important that during percolation we never break the black invariant.

So here’s how we execute this plan: `insert` will call the recursive insertion/re-balancing method `ins`, which might return a red-topped tree. We’ll paint that root black (if it’s already black, it won’t change anything) and return it to the caller:

```cpp
RBTree insert(T x) const {
    RBTree t = ins(x);
    return RBTree(B, t.left(), t.root(), t.right());
}
```

In the implementation of `ins`, the first case deals with an empty tree. This situation happens when it’s the first insertion into an empty tree or when, during the recursive process, we’ve reached the insertion point at the bottom of the tree. We create a red node and return it to the caller:

```cpp
if (isEmpty())
    return RBTree(R, RBTree(), x, RBTree());
```

Notice that, if this new node was inserted below another red node, we have created a red violation. If that red parent was the root of the whole tree, `insert` will repaint it black immediately. If it wasn’t, then as we pop one level up from the recursion, we’ll see the violation. We can’t fix it at that level — for that we have to pop one more level, up to the black parent, where we have more nodes to work with.

Here are the details of `ins`: We’ll follow the same logic as in the non-balanced tree, thus preserving the ordering of values; but instead of reconstructing the result tree on the spot we’ll call a function `balance`, which will do that for us in a semi-balanced way (that is, with a possibility of a red violation, but only at the very top).

```cpp
RBTree ins(T x) const
{
    if (isEmpty())
        return RBTree(R, RBTree(), x, RBTree());
    T y = root();
    Color c = rootColor();
    if (x < y)
        return balance(c, left().ins(x), y, right());
    else if (y < x)
        return balance(c, left(), y, right().ins(x));
    else
        return *this; // no duplicates
}
```

Just like the constructor of the red-black tree, `balance` takes the following arguments: color, left subtree, value, and right subtree. Depending on the result of the comparison, the new element is inserted either into the left or the right subtree.

As I explained, `balance`, and consequently `ins`, cannot fix the red violation when they are sitting on it. All they can do is to make sure that the violation is at the very top of the tree they return. So when we call `balance` with the result of `ins`, as in:

`balance(c, left().ins(x), y, right())`

or:

`balance(c, left(), y, right().ins(x))`

the left or the right subtree, respectively, may be semi-balanced. This is fine because `balance` can then rotate this violation away.

So the interesting cases for `balance` are the ones that rebuild a black node with either the left or the right subtree having a red violation at the top.

There are four possible cases, depending on the position of the violation. In each case we can rearrange the nodes in such a way that the violation disappears and the ordering is preserved. The rearrangement follows from the ordering of values: all values in the left subtree are less than the value stored in the node, which in turn is less than all the values in the right subtree.

Each rotation creates a tree that preserves both invariants. Notice, however, that the result of the rotation is always red-tipped, even though we were rebuilding a node that was originally black. So if the parent of that node was red, our caller will produce a red violation (it will call `balance` with red color as its argument, which will fall through to the default case). This violation will be then dealt with at the parent’s parent level.

```cpp
static RBTree balance(Color c
                     , RBTree const & lft
                     , T x
                     , RBTree const & rgt)
{
    if (c == B && lft.doubledLeft())
        return RBTree(R
                     , lft.left().paint(B)
                     , lft.root()
                     , RBTree(B, lft.right(), x, rgt));
    else if (c == B && lft.doubledRight())
        return RBTree(R
                     , RBTree(B, lft.left(), lft.root(), lft.right().left())
                     , lft.right().root()
                     , RBTree(B, lft.right().right(), x, rgt));
    else if (c == B && rgt.doubledLeft())
        return RBTree(R
                     , RBTree(B, lft, x, rgt.left().left())
                     , rgt.left().root()
                     , RBTree(B, rgt.left().right(), rgt.root(), rgt.right()));
    else if (c == B && rgt.doubledRight())
        return RBTree(R
                     , RBTree(B, lft, x, rgt.left())
                     , rgt.root()
                     , rgt.right().paint(B));
    else
        return RBTree(c, lft, x, rgt);
}
```

For completeness, here are the auxiliary methods used in the implementation of `balance`:

```cpp
bool doubledLeft() const {
    return !isEmpty()
        && rootColor() == R
        && !left().isEmpty()
        && left().rootColor() == R;
}
bool doubledRight() const {
    return !isEmpty()
        && rootColor() == R
        && !right().isEmpty()
        && right().rootColor() == R;
}
RBTree paint(Color c) const {
    assert(!isEmpty());
    return RBTree(c, left(), root(), right());
}
```

## Conclusion

Our implementation of the persistent red-black tree follows Chris Okasaki’s book. As Chris asserts, this is one of the fastest implementations there is, and he offers hints to make it even faster. Of course there are many imperative implementations of red-black trees, including STL’s `std::set` and `std::map`. Persistent RB trees match their performance perfectly when it comes to searching. Insertion and deletion, which are O(log(N)) for either implementation, are slower by a constant factor because of the need to copy the path from root to leaf. On the other hand, the persistent implementation is thread-safe and synchronization-free (except for reference counting in `shared_ptr` — see the discussion in my previous post).

Complete code is available at GitHub.

## Acknowledgment

I’d like to thank Eric Niebler for reading the draft and telling me which of my explanations were more abstruse than usual.

For comparison, here’s the original Haskell code. You can see that the C++ implementation preserves its structure pretty well. With proper optimization tricks (unboxing and eager evaluation) the Haskell code should perform as well as its C++ translation.

Regular (unbalanced) binary search tree:

```haskell
data Tree a = Empty | Node (Tree a) a (Tree a)

member x Empty = False
member x (Node lft y rgt) =
    if x < y then member x lft
    else if y < x then member x rgt
    else True

insert x Empty = Node Empty x Empty
insert x t@(Node lft y rgt) =
    if x < y then Node (insert x lft) y rgt
    else if y < x then Node lft y (insert x rgt)
    else t
```

Balanced Red-Black tree:

```haskell
data Color = R | B

data Tree a = Empty | Node Color (Tree a) a (Tree a)

member x Empty = False
member x (Node _ lft y rgt) =
    if x < y then member x lft
    else if y < x then member x rgt
    else True

insert x tree = Node B left val right
  where
    ins Empty = Node R Empty x Empty
    ins t@(Node c lft y rgt) =
        if x < y then balance c (ins lft) y rgt
        else if y < x then balance c lft y (ins rgt)
        else t
    Node _ left val right = ins tree -- pattern match the result of ins

balance B (Node R (Node R a x b) y c) z d =
    Node R (Node B a x b) y (Node B c z d)
balance B (Node R a x (Node R b y c)) z d =
    Node R (Node B a x b) y (Node B c z d)
balance B a x (Node R (Node R b y c) z d) =
    Node R (Node B a x b) y (Node B c z d)
balance B a x (Node R b y (Node R c z d)) =
    Node R (Node B a x b) y (Node B c z d)
balance color a x b = Node color a x b
```