Binary Search Tree in C++

Jukka Jylänki

2024-09-19

Table of Contents

1. Introduction
2. Search
3. Insert
4. Delete
4.1. Zero, one or two children?
4.2. The "wrong" way to delete a node
4.3. The right way to delete a node
4.4. The imbalance with deletion
5. Source Code
6. Conclusion

Introduction

A Binary Search Tree (BST) is a binary tree (a tree-like data structure in which each node has up to two children, "Left" and "Right") where all elements are stored in search order. An example tree might look like this:

The search order mandates that for each Node, its Left and Right children are ordered Left < Node < Right. For example, 4 < 10 < 11. And 3 < 4 < 7. And so on.

Wikipedia dates the invention of Binary Search Tree back to as early as 1960.

In the C++ language, a binary search tree might be represented like shown in Listing 1.

template<typename K, typename V>
struct bstree
{
  struct node
  {
    node *left, *right;
    K key;
    V value;
  } *root;
};
Listing 1. Binary Search Tree representation.
The root pointer denotes the topmost node of the tree (the root node is not the smallest or the largest element, just an arbitrary element where all iterations into the tree start from), whereas left and right point to the two children of each node (either of which may be null). The field key (which could be an integer, a string, a float or anything else that is less-than comparable) represents the key record of an element, according to which the tree ordering is upheld. The field value on the other hand consists of any "satellite" data that is associated with each key. A Binary Search Tree allows for implementing associative containers, called maps or dictionaries in different languages. If satellite data is omitted, a Binary Search Tree can model a mathematical set data structure.

Why use a binary search tree?

"I can see that there is some order to the nodes in a BST, but why use such a multi-level structure to hold my data? Why not just put it in an array where it is all neat and linearly located?"

The main benefit is that a BST allows one to search for keys in the tree without needing to exhaustively look through every data element in the tree. The next section looks at the search algorithm to see why this is the case. As a concrete example, to find if the key 7 exists in the above BST, one will search the path "root → 10 → 4 → 7", and is able to skip all the other nodes: keys 11, 3, 20 and 16 never needed to be looked at.

If data were stored in an unsorted array, one would have to look through every element to conclude whether a key exists; and if keys were stored in a sorted array, one would have to spend linear time making room when inserting each key.

In a BST, if (and this is a big if) the tree is balanced (meaning that all possible paths from the root to any leaf are roughly the same length), performance of the commonly used data structure operations is much more appealing compared to all the other "simple" ways to store data:

Operation   Unsorted Linked List   Sorted Linked List   Unsorted Array   Sorted Array   BST
Find Min    ϴ(N)                   ϴ(1)                 ϴ(N)             ϴ(1)           ϴ(logN)
Search      ϴ(N)                   ϴ(N)                 ϴ(N)             ϴ(logN)        ϴ(logN)
Insert      ϴ(1)                   ϴ(N)                 Amortized O(1)   ϴ(N)           ϴ(logN)
Delete      ϴ(N)                   ϴ(N)                 ϴ(1)             ϴ(N)           ϴ(logN)

This article is the first in a three part series on implementing efficient Red-Black Trees.

In this first part, we will look at how to implement the common Insert, Search and Delete data structure operations in a Binary Search Tree. To differentiate from the hundreds of existing (and excellent) presentations on BSTs on the internet, this article will give special attention to discussing an efficient C++ implementation.

Search

The Binary Search Tree provides a fast logarithmic time algorithm (operating under the assumption that the tree is balanced) to search for any given key in the tree. Instead of needing to examine all nodes of the tree, data is ordered so that only exactly one root→leaf path needs to be traversed down the tree to find if a given key exists.

A textbook implementation of searching a BST looks as follows.

V *search(const K &key)
{
  node *n = root;
  while(n)
  {
    if (key == n->key) return &n->value;
    if (key < n->key) n = n->left;
    else n = n->right;
  }
  return 0;
}
Listing 2. Canonical example of searching a BST.

While this presentation is great for education purposes and at least avoids the academic use of recursion (i.e. using recursion wherever possible just because it looks elegant; as software engineers know: recursion = slow, iteration = fast... and iteration is elegant too!), the code in Listing 2 is a bit disappointing to use in production.

The first challenge is that the key is compared twice in the inner loop, once for equality, and a second time for less-than. When the key is a built-in data type, it doesn't matter much. However, when more complex key types are needed (think for example of a key that is the state of a solitaire game, the rotation configuration of a Rubik's cube, or a path from the east coast of the US to the west coast), a better practice is to use three-way comparisons. A single comparison then determines both equality and ordering at the same time, reducing the number of comparisons in the inner loop from two to one, which speeds up the code if comparing the custom key type K can be costly.

An improved implementation might look like in Listing 3.

int trinary_cmp(int a, int b) { return a - b; }

V *search(const K &key)
{
  node *n = root;
  while(n)
  {
    int c = trinary_cmp(key, n->key);
    if (c == 0) return &n->value;
    if (c < 0) n = n->left;
    else n = n->right;
  }
  return 0;
}
Listing 3. Using three-way comparisons.

In this implementation, there is only one three-way comparison of the custom key type K in the inner loop. An example implementation of three-way comparisons for integers is shown as function trinary_cmp. When an exotic key type K is used, the developer would provide a matching trinary_cmp() implementation to go along with it. Sidenote: the C++20 standard brings a built-in "spaceship operator" for three-way comparison support.

Pay careful attention to the value range of keys when migrating to three-way comparisons: e.g. when using subtraction to perform a three-way comparison of 32-bit integers a and b, the computation a - b will overflow if the distance between a and b is >= 0x80000000. In such a scenario, the key range would need to be reduced from the full 32 bits down to 31 bits, i.e. [0, 0x7FFFFFFF], for the subtraction to be safe.

Migrating to three-way comparisons enables another, more interesting optimization. By replacing the left and right pointers with an indexed child[2] array, the three-way comparison result can be used to index into the children, choosing the child node in a branch-free manner from the sign bit of the comparison. The end result is a tight iteration loop, as shown in Listing 4.

template<typename K, typename V>
struct bstree
{
  struct node
  {
    node *child[2];
    K key;
    V value;
  } *root;

  V *search(const K &key)
  {
    uint32_t c;
    for(node *n = root; n; n = n->child[c>>31])
      if (!(c = (uint32_t)trinary_cmp(key, n->key)))
        return &n->value;
    return 0;
  }
};
Listing 4. Indexed child array.

Disassembling this function shows the intended optimization taking effect.

bstree<int, int>::search(int const&):
        mov     rax, qword ptr [rdi]          ; Load root pointer
        test    rax, rax                      ; Test if root is zero
        je      .LBB2_1                       ; Jump to return 0 if root was zero
        mov     ecx, dword ptr [rsi]          ; Load searched key value to ecx
.LBB2_4:
        cmp     ecx, dword ptr [rax + 16]     ; Three-way compare key against current node
        je      .LBB2_5                       ; If comparison is equal, jump to return value
        setl    dl                            ; Get the sign bit of comparison (optimized >> 31)
        movzx   edx, dl                       ; Zero-extend sign bit (a x86-64 technicality)
        mov     rax, qword ptr [rax + 8*rdx]  ; Descend from node to its child using sign bit indexing
        test    rax, rax                      ; Test if child node is null
        jne     .LBB2_4                       ; If not null, jump back to process the descended node
.LBB2_1:
        xor     eax, eax                      ; Return zero
        ret
.LBB2_5:
        add     rax, 20                       ; Return value of found node
        ret
Listing 5. x86-64 disassembly of optimized search function from Clang 18.1.0.

It should be noted that some compilers are powerful enough to make this type of optimization without developer assistance. Comparing against the disassembly of the earlier naive version in Listing 3, Clang managed to pull off this same indexing optimization without developer aid, whereas GCC instead resorted to a slightly bulkier construct with a CMOV instruction (with nine instead of seven instructions total in the inner loop). However, Visual Studio missed optimizing the code at all (boo, VS. Microsoft, are you falling behind in the compiler game?).

Even if the reader might not appreciate "being smart" on the compiler's behalf, this indexing feature is conveniently supercharged later in conjunction with the Delete operation, where its benefits are hard to dispute. Moreover, in the second part of the series, I will try to argue how knowing this indexing trick could have saved the trouble of writing a certain academic publication (I'm not kidding here).

Insert

Adding new keys to a BST must be done in a manner that preserves the Left < Node < Right search invariant. This might seem like a non-recipe at first, but it actually tells us everything we need to know: since the Search algorithm has a specific location where it expects to find each key, each key must be inserted in exactly the spot where Search would look for it. Performing insertions in this manner upholds the search invariant.

In other words, the Search operation for a BST *is also* the Insert operation, the only difference being that when inserting, a new node is added to the spot where a null child pointer is met during the search.

Any implementation will need to decide how to handle the situation when a key with the same value already exists in the tree. Different conventions exist. In Listing 6 a practice of replacing the previous value is taken.

void insert(const K &key, const V &value)
{
  if (!root)
  {
    root = new node(key, value);
    return;
  }
  node *n = root;
  while(n)
  {
    uint32_t c = (uint32_t)trinary_cmp(key, n->key);
    if (c == 0) // The key already existed?
    {
      n->value = value; // Replace previous value.
      return;
    }
    if (n->child[c>>31]) n = n->child[c>>31]; // Descend to child node
    else
    {
      n->child[c>>31] = new node(key, value); // Create new node as child of null
      return;
    }
  }
}
Listing 6. Inserting a key.

The special case handling of the null root in the beginning of Listing 6 does not feel particularly elegant. However, by using pointers to pointers of nodes, i.e. node **, handling the root pointer can be unified for a small code size and insertion performance win. A more compact implementation is shown in Listing 7.

void insert(const K &key, const V &value)
{
  uint32_t c;
  node **n = &root;
  for(; *n; n = (*n)->child + (c>>31))
    if (!(c = (uint32_t)trinary_cmp(key, (*n)->key)))
    {
      (*n)->value = value; // Replace previous value if key already existed.
      return;
    }
  *n = new node(key, value);
}
Listing 7. Using pointers to pointers to unify inserting to the tree root. Now this reads elegant!

This type of optimization will be inconsequential to the majority of modern developers, although a small group (embedded developers, WebAssembly programmers, or authors targeting page limits in a printed publication) may find it appealing.

Examining the disassembled WebAssembly code, the shorter version in Listing 7 comes to 52 instructions while the original in Listing 6 disassembles to 69 instructions, a 24.6% decrease.

Fatal Flaw of BSTs

Let's try inserting some integers in a BST.

int trinary_cmp(int a, int b) { return a - b; }

#include "bstree.h"

int main()
{
  bstree<int, int> tree;
  for(int i = 1; i <= 5; ++i)
    tree.insert(i, i);
  tree.print();
}
Listing 8. Inserting ordered data in a BST.

The code in Listing 8 inserts numbers 1,2,3,4,5 into a BST, and outputs:

Well, that doesn't look at all like a binary tree, what happened?

The problem is that the Binary Search Tree insertion process can do nothing to ensure that the generated tree is balanced. Balancing is critically important, as it is the property that enables skipping most of the keys during a search. The insertion algorithm above provides no degrees of freedom: each newly inserted node has a unique spot in the tree where it must appear. If a tree is built from an already sorted sequence of numbers, each number is inserted at the rightmost end of the BST, and the "tree" degenerates instead into a linked list (an especially space and iteration inefficient one at that...).

Isn't it kind of ironic that the Achilles heel of the BST data structure, which is supposed to keep our data in quickly accessible sorted order, is receiving sorted input in the first place?

Well, how does one fix this? There are two general ways:

  1. Ensure that the BST is built from a sequence of data in randomized order, maybe by shuffling all the data first, or
  2. Don't use a BST at all, but instead utilize a fancier self-balancing BST data structure (by far the most famous one being the Red-Black tree).

To motivate that option 1 above does kind of work, here is another example that showcases that the BST insertion process does produce a decent looking binary search tree if the inserted input is randomized.

int trinary_cmp(int a, int b) { return a - b; }

#include <stdlib.h>
#include <time.h>
#include "bstree.h"

int main()
{
  srand(time(NULL));
  bstree<int, int> tree;
  for(int i = 1; i <= 50; ++i)
    tree.insert(rand()%100, i);
  tree.print();
}
Listing 9. Inserting random ordered data in a BST.

This program prints out something like

which looks more tree-like, rather than a linked list. However, even in this case there still exists considerable imbalance. Compare for example the key 94, which can be found quickly in just two hops from the root, versus the bottommost key 58, which requires 10 hops to find. Nevertheless, even with some amount of lopsidedness, the data structure does provide considerable speedups by skipping most of the keys during a search.

Delete

Removing a key from a Binary Search Tree is the most complex operation on this data structure. Explaining the Delete operation is a canonical exam question in university algorithms courses, and there are actually at least two different methods in circulation. One of these methods is considerably simpler and maybe a bit faster, but should unfortunately be deemed the "wrong" way, as it exhibits much worse tree balancing behavior in practice. Fortunately Wikipedia describes the better method, so it rightfully gains more attention, even though it is a bit more complex to implement.

In this article, both methods are discussed, if nothing else than to highlight how deletion should not accidentally be done.

Zero, one or two children?

When removing a node from a BST, finding the node to delete is easy (just run the Search algorithm with the key to delete), but then the question is: "what should be done to the children of the to-be-deleted node?" The child nodes cannot just be left floating, but should somehow be reattached to the tree, since only one node is intended to be removed.

If the node to be removed has no children, there is no problem. Just cut the to-be-deleted node from the tree, and the operation is done.

If the node has only one child, the simplest attempt is to reattach that child node back to the tree in the place where the node-to-be-deleted was rooted at. With a little bit of careful examination, one should be able to convince themselves that this scheme does preserve the Left < Node < Right ordering invariant, so is a correct way to delete a node.

The complexity with deleting comes from the case when the node-to-be-deleted has two children, since both cannot be naively reattached to the parent, as the parent would then have to somehow carry three children.

The "wrong" way to delete a node

A simple way to Delete a node X, that has two children L and R, from a tree is to pick one of the children of X, say, L, and conceptually "merge" the deleted node with this child node L.

This merging of the node X with its left child L is shown as a big purple circle in the following picture.

The problem with this merged purple node is that it has three children: a, m and r, so after this type of merge, the tree would not even be a binary tree anymore.

However, with careful examination we observe that two of these child nodes, a and r, do follow the search invariant, i.e. a < Purple < r. So the two child nodes a and r would work out well as the left and right nodes for the merged purple node. The third child node, the "middle" node m, is a problem that needs to be fixed.

It is conceptually easy to fix this conflict with m by relocating the whole subtree rooted at m to a new place in the tree where it follows the search invariant: m becomes the left child of the leftmost descendant of r.

This seems like a mouthful, but is rather straightforward in code. The leftmost descendant of r is marked as s in the above picture. By definition it does not have a left child. So m can become its left child.

The reason this relocation works is that the leftmost descendant is the next node in so-called successor order when traversing from m (after x has been removed).

node *left_descendant(node *n) const
{
  while(n->left) n = n->left;
  return n;
}

void remove(const K &key)
{
  uint32_t c;
  for(node **p = &root, *n = root; n; p = (c>>31) ? &n->left : &n->right, n = *p)
    if (!(c = (uint32_t)trinary_cmp(key, n->key)))
    {
      if (n->right && n->left)
      {
        node *hoisted = *p = n->left;
        left_descendant(n->right)->left = hoisted->right;
        hoisted->right = n->right;
      }
      else *p = n->right ? n->right : n->left;
      delete n;
      return;
    }
}
Listing 10. The suboptimal way to delete a node in BST.

The implementation is quite simple and pleasing. And it is correct in the sense of preserving the search invariant. But why exactly is this method suboptimal?

Well, that is because the whole subtree rooted at m is relocated as the child of the left descendant s. This change can relocate an already deep subtree even deeper still, so there is no bound on how much more lopsided the resulting tree can get. So even if this kind of a simple Delete implementation might be appealing due to its simplicity, it is best to stay away from it.

The right way to delete a node

The correct method of deleting a node is to perform a mutation of the tree that will only ever cause a height imbalance of one level in the worst case.

This can be achieved by replacing the node-to-be-deleted by its in-order successor, i.e. the leftmost descendant of the node's right child. Because the successor node is the next element in sorted order, replacing a node by its successor retains the search invariant.

In other words, the Delete algorithm for a node with two children works by first unlinking its in-order successor from the tree. The in-order successor of a node with two children cannot by definition have a left child, so it is straightforward to remove using the one-child case above. Then the node-to-be-deleted is replaced by that successor.

To the practicing programmer, there is a subtle gotcha that complicates the code. Since replacing a node with its in-order successor amounts to linking the successor to the original node's children, what happens if the node's immediate right child *is* the successor? If this special case is not taken into account, program code would attempt to link the successor node to be its own child.

With that gotcha in mind, Listing 11 shows an implementation.

void remove(const K &key)
{
  uint32_t c;
  for(node **p = &root, *n = root; n; p = n->child + (c>>31), n = *p)
    if (!(c = (uint32_t)trinary_cmp(key, n->key)))
    {
      if (n->child[0] && n->child[1]) // Node to be deleted has two children?
      {
        node *right = n->child[0], *y = right, *parent; // child[0] holds the larger keys
        if (y->child[1]) // Are "successor of n" and "right child of n" two different nodes?
        {
          do { parent = y; y = y->child[1]; } while(y->child[1]); // Walk left spine to the successor
          parent->child[1] = y->child[0]; // Remove successor from the tree
          y->child[0] = right; // Link successor's right child
        }
        y->child[1] = n->child[1]; // Link successor's left child
        *p = y; // Link parent to successor
      }
      }
      else *p = n->child[0] ? n->child[0] : n->child[1];
      delete n;
      return;
    }
}
Listing 11. The proper way to delete a node in BST.

The implementation ends up a bit more complex than the suboptimal version, but it is more than worth it for the better balanced trees that it produces.

However, the code is arguably a bit difficult to read with all the child[0] and child[1] indexing... is that proving to have been a bad idea after all? Why not have much more clearly named 'left' and 'right' pointers? The next section addresses that.

The imbalance with deletion

As was seen in the previous section about insertion, for BSTs to be performant, the trees need to be as balanced as possible. However, neither inserting nor deleting nodes does anything to keep the tree in balance. The key values for insertions and deletions must appear in random order for the tree to stay healthy and balanced; otherwise the tree becomes lopsided as more and more tree operations are performed.

What about Delete then? Listing 11 showed how to delete a node by replacing it with its in-order successor. The net result is that each such delete takes its replacement from the right-hand subtree, leaving the left subtree untouched. After multiple deletes, this causes the right side of the tree to become substantially shorter compared to the left side, a systematic imbalance.

However, there is no special reason why the in-order successor should always be used as the replacement in the two-children case. The node-to-be-deleted could just as well be replaced by its in-order predecessor. So we get an idea: how about balancing the deletion by randomly choosing which direction to lean?

If an application performs a lot of deletes, it is absolutely crucial to perform this kind of randomization, otherwise the tree will become very unbalanced after many deletes.

One way to do such randomization might be to call rand()%2 in the code, and then have two special cases for predecessor- vs successor-leaning key removal. But this would be disappointingly slow. Additionally, it would introduce an unwanted dependency on a random number generator.

Another idea might be to maintain a single-bit counter somewhere that flip-flops after each delete, so that the node is replaced with the predecessor every second time.

However, that is not so elegant either, since it adds state to each instance of the data structure. And if the flip-flop bit was global, it would amount to undesirable cache line sharing across multiple threads in a program.

Rather, a cheap and stateless way to obtain a source of pseudo-random decisions is to examine the bit parity of the pointer values of the nodes to be deleted. A population count instruction (POPCNT on x86, with equivalents on ARM and WebAssembly) allows making a "good enough" random decision from each pointer at practically zero performance cost. This is a great general technique to have in one's data structure randomization toolbox.

When introducing this kind of random decision, it would be annoying to have to duplicate the deletion code, once for lean-to-predecessor and a second time for lean-to-successor handling. This is where the earlier decision to use indexed child arrays comes to shine. With child indexing, both cases can effectively be implemented for free, with no increase in code size. Example code is shown in Listing 12. Do not be alarmed by how accessing the popcnt compiler intrinsics looks in a portable Visual Studio/GCC/Clang x86/x64 cross-compiler fashion.

#if defined(_MSC_VER) && defined(_M_AMD64)
#define popcnt_ptr(x) __popcnt64((uintptr_t)(x))
#elif defined(_MSC_VER)
#define popcnt_ptr(x) __popcnt((uintptr_t)(x))
#elif defined(__GNUC__) && UINTPTR_MAX > UINT_MAX
#define popcnt_ptr(x) __builtin_popcountll((uintptr_t)(x))
#elif defined(__GNUC__)
#define popcnt_ptr(x) __builtin_popcount((uintptr_t)(x))
#endif

void remove(const K &key)
{
  uint32_t c;
  for(node **p = &root, *n = root; n; p = n->child + (c>>31), n = *p)
    if (!(c = (uint32_t)trinary_cmp(key, n->key)))
    {
      if (n->child[0] && n->child[1])
      {
        int s = popcnt_ptr(n) & 1; // Pseudo-random shuffle which direction to unbalance
        node *right = n->child[1-s], *y = right, *parent;
        if (y->child[s])
        {
          do { parent = y; y = y->child[s]; } while(y->child[s]);
          parent->child[s] = y->child[1-s];
          y->child[1-s] = right;
        }
        y->child[s] = n->child[s];
        *p = y;
      }
      else *p = n->child[0] ? n->child[0] : n->child[1];
      delete n;
      return;
    }
}
Listing 12. Random leaning deletion.

The newly added variable s decides which direction the deletion of a two-child node should lean to.

Peeking at the disassembly, it looks like Clang and GCC opt to use the SETNP instruction to acquire the bit parity of a pointer, rather than actually emitting the POPCNT instruction. Visual Studio does invoke POPCNT instead. Consulting Agner Fog's instruction tables, the SETNP instruction does seem to have better throughput and latency on Intel CPUs compared to POPCNT. Neat!

Source Code

The final program source code is shown in Listing 13. This listing is not perfect in the sense of exception safety, low memory scenarios or seamless integration with STL iterators or Boost; rather, it is provided with a compact presentation in mind, bringing together the code shown above. Some additional operations and boilerplate for C++ object safety are added.

#pragma once

#include <stdint.h>
#include <stddef.h>

#if defined(_MSC_VER) && defined(_M_AMD64)
#define popcnt_ptr(x) __popcnt64((uintptr_t)(x))
#elif defined(_MSC_VER)
#define popcnt_ptr(x) __popcnt((uintptr_t)(x))
#elif defined(__GNUC__) && UINTPTR_MAX > UINT_MAX
#define popcnt_ptr(x) __builtin_popcountll((uintptr_t)(x))
#elif defined(__GNUC__)
#define popcnt_ptr(x) __builtin_popcount((uintptr_t)(x))
#endif

template<typename K, typename V>
class bstree
{
public:
  bstree():root(0){}
  bstree(const bstree &rhs):root(rhs._clone(rhs.root)){}
  ~bstree() { _clear(root); }

  // Searches for the value associated with the given key, or 0 if not found.
  // The value may be changed via the returned pointer.
  V *search(const K &key)
  {
    uint32_t c;
    for(node *n = root; n; n = n->child[c>>31])
      if (!(c = (uint32_t)trinary_cmp(key, n->key)))
        return &n->value;
    return 0;
  }

  // Inserts a new key-value pair in the BST.
  void insert(const K &key, const V &value)
  {
    uint32_t c;
    node **n = &root;
    for(; *n; n = (*n)->child + (c>>31))
      if (!(c = (uint32_t)trinary_cmp(key, (*n)->key)))
      {
        (*n)->value = value; // Replace previous value if key already existed.
        return;
      }
    *n = new node(key, value, 0, 0);
  }

  // Deletes a single key-value pair from the tree. Returns true if an element with
  // the given key did exist in the tree. Does nothing and returns false otherwise.
  bool remove(const K &key)
  {
    uint32_t c;
    for(node **p = &root, *n = root; n; p = n->child + (c>>31), n = *p)
      if (!(c = (uint32_t)trinary_cmp(key, n->key)))
      {
        if (n->child[0] && n->child[1])
        {
          int s = popcnt_ptr(n) & 1; // Pseudo-random shuffle which direction to unbalance
          node *right = n->child[1-s], *y = right, *parent;
          if (y->child[s])
          {
            do { parent = y; y = y->child[s]; } while(y->child[s]);
            parent->child[s] = y->child[1-s];
            y->child[1-s] = right;
          }
          y->child[s] = n->child[s];
          *p = y;
        }
        else *p = n->child[0] ? n->child[0] : n->child[1];
        delete n;
        return true;
      }
    return false;
  }

  bool empty() const { return !root; }
  size_t size() const { return _size(root); }
  void clear() { _clear(root); root = 0; }

  // Iterates over all elements in sorted order.
  template<typename T> void for_each(T &&callable) { if (root) _for_each(callable, root); }

  bstree &operator=(const bstree &rhs)
  {
    if (this != &rhs)
    {
      _clear(root);
      root = _clone(rhs.root);
    }
    return *this;
  }
  // Note: node and root are kept public so that external helper code (e.g.
  // the graphviz printer in the accompanying test program) can walk the tree.
  struct node
  {
    node(const K &k, const V &v, node *c0, node *c1):key(k),value(v) { child[0] = c0; child[1] = c1; }
    node *child[2];
    K key;
    V value;
  } *root;

private:

  size_t _size(node *n) const { return n ? 1 + _size(n->child[0]) + _size(n->child[1]) : 0; }
  void _clear(node *n) { if (n) { _clear(n->child[0]); _clear(n->child[1]); delete n; }}
  template<typename T> void _for_each(T &&callable, node *n)
  {
    if (n->child[1]) _for_each(callable, n->child[1]);
    callable(n->key, n->value);
    if (n->child[0]) _for_each(callable, n->child[0]);
  }

  node *_clone(node *n) const
  {
    return n ? new node(n->key, n->value, _clone(n->child[0]), _clone(n->child[1])) : 0;
  }
};
Listing 13. bstree.h: Full code listing.

An example program is presented in Listing 14.

int trinary_cmp(int a, int b) { return a - b; }

#include "bstree.h"
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <string>

bool is_prime(int x)
{
  if (x <= 1) return false;
  for(int i = 2; i < x; ++i)
    if (x % i == 0) return false;
  return true;
}

template<typename T>
void print_node(T &n)
{
  if (n->child[1]) { printf("%d -> %d[label=\"L\"]\n", n->key, n->child[1]->key); print_node(n->child[1]); }
  if (n->child[0]) { printf("%d -> %d[label=\"R\"]\n", n->key, n->child[0]->key); print_node(n->child[0]); }
}

template<typename T>
void print(T &tree)
{
  printf("digraph {\n");
  if (tree.root)
  {
    print_node(tree.root);
  }
  printf("}\n");
}

int main()
{
  srand((unsigned int)time(0));

  bstree<int, int> tree;
  for(int i = 2; i < 100; ++i) // Add random numbers
  {
    int x = rand() % 100;
    tree.insert(x, is_prime(x));
  }

  // Update which numbers are Mersenne primes
  for(int i = 2; i <= 128; i *= 2)
  {
    if (is_prime(i-1))
    {
      int *val = tree.search(i-1);
      if (val) *val = 2;
    }
  }

  // Remove all numbers that have an even digit in them
  for(int i = 2; i < 128; ++i)
  {
    std::string str = std::to_string(i);
    if (str.find_first_of("02468") != std::string::npos)
      tree.remove(i);
  }

  // Print what we have left in tree in-order
  tree.for_each([](int key, int value) {
    printf("%d: %s\n", key, value == 2 ? "is Mersenne prime" : (value ? "is prime" : "is composite"));
  });

  // Print out the tree in graphviz format.
  print(tree);
}
Listing 14. bstree.test.cpp: Example operations on a Binary Search Tree.
The output of the program in Listing 14 can be visualized as follows.
0: is composite
3: is Mersenne prime
7: is Mersenne prime
9: is composite
13: is prime
15: is composite
17: is prime
31: is Mersenne prime
35: is composite
39: is composite
51: is composite
55: is composite
57: is composite
59: is prime
73: is prime
77: is composite
79: is prime
91: is composite
93: is composite
95: is composite
99: is composite

digraph {
79 -> 39[label="L"]
39 -> 9[label="L"]
9 -> 7[label="L"]
7 -> 0[label="L"]
0 -> 3[label="R"]
9 -> 31[label="R"]
31 -> 17[label="L"]
17 -> 15[label="L"]
15 -> 13[label="L"]
31 -> 35[label="R"]
39 -> 59[label="R"]
59 -> 55[label="L"]
55 -> 51[label="L"]
55 -> 57[label="R"]
59 -> 77[label="R"]
77 -> 73[label="L"]
79 -> 93[label="R"]
93 -> 91[label="L"]
93 -> 99[label="R"]
99 -> 95[label="L"]
}

Conclusion

The Binary Search Tree is a classic data structure that provides logarithmic time search, insertion and deletion when the tree remains balanced. A concise implementation can be realized in about a hundred lines of C++ code.

When implementing a robust Delete operation, developers are highly encouraged to implement randomization in deletion in order to keep the tree more balanced. There is little reason to omit it, given that it can be incorporated effectively for free.

In most applications, plain Binary Search Trees are best avoided unless the problem domain allows managing the order of the data, e.g. by randomizing the input. If shuffling the input is not viable, developers are advised to look at balanced Binary Search Tree structures instead, which pragmatically solve the problem of the tree becoming lopsided.

In the second part of this series, the presented source code will be further evolved into the Red-Black Tree, which is arguably the most popular balanced BST variant in use.

Part Two.