AVL Trees
In this section, we examine Adelson-Velsky & Landis trees, commonly known as AVL-trees.1 AVL-trees are height-balanced binary search trees. Accordingly, they are a subset of balanced binary trees:
Given this fact, we might guess that there are various ways of defining what it means to be "balanced." And in fact, we would guess correctly. The term balanced tree, without more, could mean many things. Do we mean balanced in terms of weight? Height? Depth? Order? Node categories?
For AVL-trees, we define "balanced" in terms of height. And to construct that definition, we need a way of quantifying balance. We do so by defining a height balance factor:
Definition: Height Balance Factor. Let be a node in some tree. The height balance factor of is defined as:
where:
- is the height balance factor of the node
- is the height of the left-subtree of the node and
- is the height of the right-subtree of the node
To understand this definition, it's important to recall the definitions for height:
Definition: Height of a Node. Let be a node in some rooted tree. The height of denoted is the number of nodes, including the starting and end nodes, along the longest strictly downward path from to a leaf.
Note that there are two different definitions for the height of a node. An alternative definition is:
Definition: Height of a Node. Let be a node in some tree The height of denoted is the number of edges in the longest strictly-downward path from to a leaf.
The key difference between this definition and the one we will use is whether a leaf has a height of or The existence of two differing definitions for node height is similar to the situation of natural numbers in mathematics—does the set of natural numbers include Some say yes, others no.
In response to this debate, we take the pragmatic approach. In some situations, it's more convenient to define the height of a node as the number of edges along the path, since this implies that a balanced binary tree of height has leaves. If we defined it as the number of nodes along the path (number of edges plus ), we would have or leaves. For the purposes of AVL-trees, we use the definition presented in the main text.
Definition: Height of a Tree. Let be some rooted tree. The height of denoted is the height of its root node.
From the height balance factor we have the definition of an AVL tree:
Definition: avl-tree. Let be a binary tree. is an AVL-tree if, and only if, for every node
Or, alternatively:
Following the definition above, if the absolute value of the balance factor of some node is greater than or equal to we say that the node is height-imbalanced. Moreover, to determine if a particular tree is an AVL-tree, we must calculate the height balance factor for each node in If a single node in is imbalanced, we conclude that is height-imbalanced. If the tree is imbalanced, we can attempt to balance the tree by performing rotations, as we'll see later.
That said, let's consider a few examples. Below are several binary trees, with their heights indicated.
Examining each tree, we have the following tables of balance factors:
Tree | Node | Height of Left-subtree | Height of Right-subtree | Balance Factor |
---|---|---|---|---|
A | a | 2 | 2 | 0 |
b | 1 | 1 | 0 | |
c | 1 | 0 | 1 | |
d | 0 | 0 | 0 | |
e | 0 | 0 | 0 | |
f | 0 | 0 | 0 |
Tree | Node | Height of Left-subtree | Height of Right-subtree | Balance Factor |
---|---|---|---|---|
B | a | 2 | 2 | 0 |
b | 1 | 0 | 1 | |
c | 0 | 1 | -1 | |
d | 0 | 0 | 0 | |
g | 0 | 0 | 0 |
Tree | Node | Height of Left-subtree | Height of Right-subtree | Balance Factor |
---|---|---|---|---|
C | a | 3 | 1 | 2 |
b | 2 | 0 | 2 | |
c | 0 | 0 | 0 | |
d | 1 | 0 | 1 | |
h | 0 | 0 | 0 |
Tree | Node | Height of Left-subtree | Height of Right-subtree | Balance Factor |
---|---|---|---|---|
D | a | 4 | 1 | 3 |
b | 3 | 0 | 3 | |
c | 0 | 0 | 0 | |
d | 0 | 2 | -2 | |
i | 1 | 0 | 1 | |
j | 0 | 0 | 0 |
From the tables above, we can see that trees and are imbalanced. The same trees are presented below with their balance factors indicated (red for imbalanced, green for balanced).
Now that we know what it means for a tree to be imbalanced, let's examine rotating—the act of correct height imbalances.
Rotations
If we think carefully about height-imbalanced trees, we find that there are really only four situations where a binary tree is height-imbalanced. Why? Because to have a height-imbalanced binary tree, we need, at a minimum, three nodes. This follows from the definition of a height-balanced binary tree—to determine if a binary tree is height-balanced, we must compare the left- and right-subtrees of the root. And to reach the conclusion that a binary tree is height-imbalanced, the root must have a right- and left-child to begin with; three nodes total.
As we know from the Catalan number formula, given three nodes, there are five possible binary trees:
Of the possibilities above, only is balanced. Given that all subtrees of a binary tree are binary trees, it follows that there are only four possible cases for height-imbalanced trees:
Because there are only four possible cases, there are only four methods for height-balancing some binary tree. These methods are called rotations: right rotation, left rotation, right-left rotation, and left-right rotation.
Right rotation
Suppose some user is entering a binary tree traversal sequence to construct a tree After the first two entries, the binary tree appears as:
The user then enters another element:
At this point, the tree is imbalanced. Why? Because has a left-subtree of height and a right-subtree of height Thus, the tree has an overall balance factor of We demarcate this situation as to convey the proposition that the tree is left-left-imbalanced, which in turn conveys: An imbalance occurred at the left-subtree of the left-subtree of
Now, because we have we perform an right-rotation:
The RR-rotation consists of the following steps:
-
the node just before the imbalance-causing node—in this case, —becomes the new root. We call the pivot.
-
formerly the root, becomes the right-child of with right-child as its left-child. In the case above, 's right-child is
-
remains the left-child of
One way to think about the right-rotation operation is to imagine holding down the pivot, and pulling the node to its right down. (Hence the term pivot). Alternatively, we can think of the right-rotation as it sounds like—the nodes move towards the right, transitioning from the shape to the shape
The right-rotation operation works regardless of how many nodes there are. For example, suppose we have the following tree:
In the diagram above, the nodes without circles indicate that there are further subtrees. Importantly, the fact that the pivot, has a balance factor of indicates that this tree is left-heavy. Accordingly, we have a right-rotation:
Left rotation
Now suppose the user enters the tree:
Here, we have an RR-imbalanced tree. To balance this tree, we must perform a left-rotation:
The procedure:
-
The pivot becomes the new root. .
-
formerly the root, becomes the
-
left-child of
remains the right-child of
Right-left Rotation
Now we examine the third case. Suppose the user enters:
We say that this tree is right-left imbalanced. To balance this tree, we perform a right-left rotation:
Left-right Rotation
Now we examine the final case. Suppose the user enters:
We say that this tree is left-right imbalanced. To balance this tree, we perform a left-right rotation:
General Form of AVL Rotations
AVL tree rotations have a general form. Consider the following tree:
The nodes without circles indicate that the child node is the root to some
larger subtree. In this case, we see that Node(A)
is imbalanced — it has
a balance factor of 2. Given that Node(B)
has a balance factor of 1, we
see that the rotation should be an LL-rotation. If Node(B)
had a balance
factor of -1, the rotation would be an LR-rotation.
More generally:
Tree | Imbalance | Procedure |
---|---|---|
Left-left | Left-left rotation | |
Left-right | Left-right rotation | |
Right-right | Right-right rotation | |
Right-left | Right-left rotation |
Returning to our imbalanced tree, how do we balance this tree knowing it potentially has thousands of other nodes in those subtrees? The idea is to just momentarily ignore the fact that those subtrees exist. Performing the LL-rotation, we get:
Now we see that it's balanced. The subtrees are now:
Ok hang on. How did those subtrees move? A helpful way to think about how the subtrees got to their positions is to imagine the links between each node as a thread:
Notice that by pulling the thread, the subtree slowly goes up, until it arrives at its final location. Knowing this general form, consider a more complicated tree:
This tree is balanced. Suppose we insert a new node with the value 4. Because 4 is smaller than 30, it goes towards the left, landing at the node containg 20. Then, because it's smaller than 20, it goes to the left again, landing on the node containing 10. And because it's smaller than 10, it lands at node 5. And because it's smaller than 5, it's inserted as the left child of 5:
Now that we've inserted node(4), the tree is imbalanced. To balance this tree, we focus on just three nodes:
After node(10), wherever node(4) is inserted, it's left-left imbalanced.
Footnotes
-
AVL-trees were invented in 1962, and named after their inventors, Soviet mathematicians Georgy Adelson-Velsky (8 January 1922–26 April 2014) and Evgenii Mikhailovich Landis (6 October 1921–12 December 1997).
Born in Samara, Russia, Adelson-Velsky is known for introducing bitboards, a bit array data structure now widely used to represent game positions in computer chess. Adelson-Velsky's chess program participated and won the first chess match between computer programs in 1966. The AVL-tree, in particular, was borne out of Adelson-Velsky's work in artificial intelligence.
His colleague, Evgenii Landis, was equally accomplished. Born in Kharkiv, Ukraine, Landis fought in Finland as a lieutenant in the Red Army, where he was wounded, frost-bitten, and severely shell-shocked. The traumas of war impaired his ability to hear high-pitched tones, to which Landis lamented his inability to hear violin flageolets. Despite these difficulties, Landis made numerous contributions to the field of partial differential equations, alongside his collaboration with Adelson-Velsky in inventing the AVL-tree. ↩