Splay tree
A splay tree is a self-balancing binary search tree with the additional property that recently accessed elements are quick to access again. It performs basic operations such as insertion, look-up and removal in O amortized time. For many sequences of non-random operations, splay trees perform better than other search trees, even when the specific pattern of the sequence is unknown. The splay tree was invented by Daniel Sleator and Robert Tarjan in 1985.
All normal operations on a binary search tree are combined with one basic operation, called splaying. Splaying the tree for a certain element rearranges the tree so that the element is placed at the root of the tree. One way to do this with the basic search operation is to first perform a standard binary tree search for the element in question, and then use tree rotations in a specific fashion to bring the element to the top. Alternatively, a top-down algorithm can combine the search and the tree reorganization into a single phase.
Advantages
Good performance for a splay tree depends on the fact that it is self-optimizing, in that frequently accessed nodes will move nearer to the root where they can be accessed more quickly. The worst-case height—though unlikely—is O, with the average being O.Having frequently-used nodes near the root is an advantage for many practical applications, and is particularly useful for implementing caches and garbage collection algorithms.
Advantages include:
- Comparable performance: Average-case performance is as efficient as other trees.
- Small memory footprint: Splay trees do not need to store any bookkeeping data.
Disadvantages
The representation of splay trees can change even when they are accessed in a 'read-only' manner. This complicates the use of such splay trees in a multi-threaded environment. Specifically, extra management is needed if multiple threads are allowed to perform find operations concurrently. This also makes them unsuitable for general use in purely functional programming, although even there they can be used in limited ways to implement priority queues.
Operations
Splaying
When a node x is accessed, a splay operation is performed on x to move it to the root. To perform a splay operation we carry out a sequence of splay steps, each of which moves x closer to the root. By performing a splay operation on the node of interest after every access, the recently accessed nodes are kept near the root and the tree remains roughly balanced, so that we achieve the desired amortized time bounds.Each particular step depends on three factors:
- Whether x is the left or right child of its parent node, p,
- whether p is the root or not, and if not
- whether p is the left or right child of its parent, g.
There are three types of splay steps, each of which has two symmetric variants: left- and right-handed. For the sake of brevity, only one of these two is shown for each type. The three types of splay steps are:
Zig step: this step is done when p is the root. The tree is rotated on the edge between x and p. Zig steps exist to deal with the parity issue and will be done only as the last step in a splay operation and only when x has odd depth at the beginning of the operation.
Zig-zig step: this step is done when p is not the root and x and p are either both right children or are both left children. The picture below shows the case where x and p are both left children. The tree is rotated on the edge joining p with its parent g, then rotated on the edge joining x with p. Note that zig-zig steps are the only thing that differentiate splay trees from the rotate to root method introduced by Allen and Munro prior to the introduction of splay trees.
Zig-zag step: this step is done when p is not the root and x is a right child and p is a left child or vice versa. The tree is rotated on the edge between p and x, and then rotated on the resulting edge between x and g.
Join
Given two trees S and T such that all elements of S are smaller than the elements of T, the following steps can be used to join them to a single tree:- Splay the largest item in S. Now this item is in the root of S and has a null right child.
- Set the right child of the new root to T.
Split
- Splay x. Now it is in the root so the tree to its left contains all elements smaller than x and the tree to its right contains all element larger than x.
- Split the right subtree from the rest of the tree.
Insertion
- Insert x as with a normal binary search tree.
- when an item is inserted, a splay is performed.
- As a result, the newly inserted node x becomes the root of the tree.
- Use the split operation to split the tree at the value of x to two sub-trees: S and T.
- Create a new tree in which x is the root, S is its left sub-tree and T its right sub-tree.
Deletion
- If x has two children:
- * Swap its value with that of either the rightmost node of its left sub tree or the leftmost node of its right subtree.
- * Remove that node instead.
Alternatively:
- The node to be deleted is first splayed, i.e. brought to the root of the tree and then deleted. leaves the tree with two sub trees.
- The two sub-trees are then joined using a "join" operation.
Implementation and variants
Another method which can be used is based on the argument that we can restructure the tree on our way down the access path instead of making a second pass. This top-down splaying routine uses three sets of nodes - left tree, right tree and middle tree. The first two contain all items of original tree known to be less than or greater than current item respectively. The middle tree consists of the sub-tree rooted at the current node. These three sets are updated down the access path while keeping the splay operations in check. Another method, semisplaying, modifies the zig-zig case to reduce the amount of restructuring done in all operations.
Below there is an implementation of splay trees in C++, which uses pointers to represent each node on the tree. This implementation is based on bottom-up splaying version and uses the second method of deletion on a splay tree. Also, unlike the above definition, this C++ version does not splay the tree on finds - it only splays on insertions and deletions, and the find operation, therefore, has linear time complexity.
- include
- ifndef SPLAY_TREE
- define SPLAY_TREE
class splay_tree ;
- endif // SPLAY_TREE
Analysis
A simple amortized analysis of static splay trees can be carried out using the potential method. Define:- size = the number of nodes in the sub-tree rooted at node r.
- rank = log2.
- Φ = the sum of the ranks of all the nodes in the tree.
To apply the potential method, we first calculate ΔΦ: the change in the potential caused by a splay operation. We check each case separately. Denote by rank′ the rank function after the operation. x, p and g are the nodes affected by the rotation operation.
Zig step
Zig-zig step
Zig-zag step
The amortized cost of any operation is ΔΦ plus the actual cost. The actual cost of any zig-zig or zig-zag operation is 2 since there are two rotations to make. Hence:When summed over the entire splay operation, this telescopes to 3−rank) which is O. The Zig operation adds an amortized cost of 1, but there's at most one such operation.
So now we know that the total amortized time for a sequence of m operations is:
To go from the amortized time to the actual time, we must add the decrease in potential from the initial state before any operation is done to the final state after all operations are completed.
where the last inequality comes from the fact that for every node x, the minimum rank is 0 and the maximum rank is log.
Now we can finally bound the actual time:
Weighted analysis
The above analysis can be generalized in the following way.- Assign to each node r a weight w.
- Define size = the sum of weights of nodes in the sub-tree rooted at node r.
- Define rank and Φ exactly as above.
where W is the sum of all weights.
The decrease from the initial to the final potential is bounded by:
since the maximum size of any single node is W and the minimum is w.
Hence the actual time is bounded by:
Performance theorems
There are several theorems and conjectures regarding the worst-case runtime for performing a sequence S of m accesses in a splay tree containing n elements.Dynamic optimality conjecture
In addition to the proven performance guarantees for splay trees there is an unproven conjecture of great interest from the original Sleator and Tarjan paper. This conjecture is known as the dynamic optimality conjecture and it basically claims that splay trees perform as well as any other binary search tree algorithm up to a constant factor.There are several corollaries of the dynamic optimality conjecture that remain unproven:
Variants
In order to reduce the number of restructuring operations, it is possible to replace the splaying with semi-splaying, in which an element is splayed only halfway towards the root.Another way to reduce restructuring is to do full splaying, but only in some of the access operations - only when the access path is longer than a threshold, or only in the first m access operations.