DSA2 Chapter 4 Trees
Chapter 4
Trees
From Linear ADTs to …
• For input of large size, the linear access time of
linked lists is prohibitive.
• In this chapter, we look at a simple data structure
for which the average running time of most
operations is O(log n), and some simple
modification to get O(log n) in the worst case.
• Binary Search Trees
• Trees are very useful abstractions in computer
science
• We will discuss their use in other, more general
applications.
Aims of this chapter
• We will also see how trees are used to:
– implement the file system of several popular
OSs.
– evaluate arithmetic expressions.
– support search operations in O(log n) average
time
– refine these ideas to obtain O(log n) worst-
case bounds.
– implement these operations when the data are
stored on a disk.
Formal Definition
A tree is a collection of nodes.
A nonempty tree has a distinguished starting node known as the root.
Every node other than the root has exactly one parent node.
A node may have any number of children,
each of which is itself the root of a subtree.
A node that has no children is a leaf.
Illustration of a tree + terminology
int FileSystem::size( ) const
{   // Postorder traversal: total size of directory files (pseudocode)
    int totalSize = sizeOfThisFile( );
    if( isDirectory( ) )
        for each file c in this directory
            totalSize += c.size( );
    return totalSize;
}
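The pseudocode above can be made concrete. What follows is a minimal sketch only: the FileNode struct and its members ownSize and children are hypothetical names chosen for illustration, not part of the textbook's FileSystem class.

```cpp
#include <vector>

// Hypothetical node type: a plain file, or a directory with its entries.
struct FileNode
{
    int ownSize;                      // size of this entry itself
    std::vector<FileNode> children;   // empty for a plain file
};

// Postorder traversal: the sizes of all children are accumulated
// before the total for the current node is returned.
int totalSize( const FileNode & f )
{
    int total = f.ownSize;
    for( const FileNode & c : f.children )
        total += totalSize( c );
    return total;
}
```

Each node is visited once and a constant amount of work is done per node, so the running time is linear in the number of files.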
Binary Trees
• A Binary Tree is a tree in which each node can have at
most two children.
– The depth of an average binary tree is generally
considerably smaller than N.
– An analysis shows that
• the average depth of a binary tree is O(√N)
(Exercise)
• for a special type of binary tree, namely the binary
search tree, average value of the depth is O(log N).
– Unfortunately, the depth can be as large as N − 1 (Worst
case: each node has exactly one child except leaf)
Different Types of BTs
• A full binary tree (sometimes
proper binary tree or 2-tree) is
a tree in which every node
other than the leaves has two
children
• A complete binary tree is a
binary tree in which every
level, except possibly the
last, is completely filled,
and all nodes are as far left
as possible.
struct BinaryTreeNode
{
Object element; // data in the node
BinaryTreeNode *left; // Left child
BinaryTreeNode *right; // Right child
};
Expression Trees
• The leaves of an expression tree are operands,
such as constants or variable names;
• The other nodes contain operators
• An ET is not necessarily binary:
– e.g. case of unary operators (- and +)
– Nodes may have more than 2 children: e.g. ternary
operators.
Inorder/Postorder/Preorder Traversal
• Inorder traversal strategy: (left, node, right)
– Produce an overly parenthesised expression by
o recursively processing a parenthesized left expression
o then printing out the operator at the root, and finally
o recursively processing a parenthesized right expression.
(a + (b * c)) + (((d * e) + f) * g)
• Postorder traversal strategy: (left subtree,
right subtree, operator)
abc*+de*f+g*+ (postfix notation of Chapter 3)
• Preorder traversal strategy: (operator, left subtree,
right subtree)
++a*bc*+*defg (prefix notation)
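The three strategies differ only in where the operator is emitted relative to the two recursive calls. The sketch below assumes a hypothetical ExprNode type with a single-character symbol; it parenthesises every operator node, so the outermost expression is wrapped as well.

```cpp
#include <string>

// Hypothetical expression-tree node; operands are leaves (null children).
struct ExprNode
{
    char symbol;
    ExprNode *left;
    ExprNode *right;
};

// Inorder: parenthesised left expression, operator, parenthesised right expression.
std::string inorder( const ExprNode *t )
{
    if( t == nullptr )
        return "";
    if( t->left == nullptr && t->right == nullptr )
        return std::string( 1, t->symbol );   // operand
    return "(" + inorder( t->left ) + " " + t->symbol + " " + inorder( t->right ) + ")";
}

// Postorder: left subtree, right subtree, operator (postfix notation).
std::string postorder( const ExprNode *t )
{
    if( t == nullptr )
        return "";
    return postorder( t->left ) + postorder( t->right ) + t->symbol;
}

// Preorder: operator, left subtree, right subtree (prefix notation).
std::string preorder( const ExprNode *t )
{
    if( t == nullptr )
        return "";
    return t->symbol + preorder( t->left ) + preorder( t->right );
}
```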
Constructing an ET
Algorithm to convert a postfix expression into an
expression tree:
• Read the expression one symbol at a time.
• If the symbol is an operand, create a one-node tree and
push a pointer to it onto a stack.
• If the symbol is an operator, pop (pointers) to two
trees T1 and T2 from the stack (T1 is popped first) and
form a new tree whose root is the operator and whose
left and right children point to T2 and T1, respectively.
• A pointer to this new tree is then pushed onto the
stack.
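The algorithm above can be sketched as follows. The sketch makes several assumptions: the input is well formed, operands are single letters or digits, every operator is binary, and symbols may be separated by spaces. TreeNode and buildFromPostfix are illustrative names.

```cpp
#include <cctype>
#include <memory>
#include <stack>
#include <string>
#include <utility>

// Hypothetical node type owning its two subtrees.
struct TreeNode
{
    char symbol;
    std::unique_ptr<TreeNode> left;
    std::unique_ptr<TreeNode> right;
};

// Builds an expression tree from a postfix string.
// Assumes well-formed, nonempty input (no error handling).
std::unique_ptr<TreeNode> buildFromPostfix( const std::string & postfix )
{
    std::stack<std::unique_ptr<TreeNode>> trees;
    for( char ch : postfix )
    {
        if( ch == ' ' )
            continue;
        auto node = std::make_unique<TreeNode>( );
        node->symbol = ch;
        if( !std::isalnum( static_cast<unsigned char>( ch ) ) )
        {
            // Operator: T1 is popped first and becomes the right child,
            // T2 is popped second and becomes the left child.
            node->right = std::move( trees.top( ) );
            trees.pop( );
            node->left = std::move( trees.top( ) );
            trees.pop( );
        }
        trees.push( std::move( node ) );
    }
    return std::move( trees.top( ) );   // pointer to the final tree
}
```

Using std::unique_ptr instead of raw pointers is a design choice of this sketch, so that the tree is released automatically.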
Example: input a b + c d e + * *
First two symbols are operands, so we create a one-node tree for
each of them and push pointers to them onto a stack
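Next, + is read, so the two tree pointers are popped, a new tree with
+ as root is formed, and a pointer to it is pushed onto the stack.
Then c, d, and e are read, and for each, a one-node tree is
created and a pointer to the corresponding tree is pushed onto
the stack.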
Now + is read, so the trees for d and e are merged.
Continuing, * is read, so we pop two tree pointers and form a new
tree with a * as root.
Finally, the last symbol * is read, the two trees are
merged, and a pointer to the final tree is left on the
stack.
Binary Trees
• An important application of binary trees is their use
in searching
• We will assume a tree of integers, though
arbitrarily complex elements are possible
• We will also assume that all the items are
distinct (duplicates dealt with later)
Section 1
Binary Search Tree
• Binary search tree (BST): a BT in which, for every node X,
every key in the left subtree of X is smaller than the key in X,
and every key in the right subtree of X is larger
than the key in X.
• Properties of a BST are recursive
• Examples: Are the following BSTs?
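Because the BST property is recursive, it can be checked recursively by passing down the range of keys that a subtree is allowed to contain. The sketch below assumes integer keys and distinct items; IntNode, isBSTInRange and isBST are illustrative names.

```cpp
#include <climits>

// Hypothetical node type for a tree of integers.
struct IntNode
{
    int element;
    IntNode *left;
    IntNode *right;
};

// True if every key in the subtree lies strictly between lo and hi
// and the same holds recursively with tightened bounds.
bool isBSTInRange( const IntNode *t, long long lo, long long hi )
{
    if( t == nullptr )
        return true;
    if( t->element <= lo || t->element >= hi )
        return false;
    return isBSTInRange( t->left, lo, t->element )
        && isBSTInRange( t->right, t->element, hi );
}

bool isBST( const IntNode *root )
{
    return isBSTInRange( root, LLONG_MIN, LLONG_MAX );
}
```

Comparing each node only with its two children is not enough: a key deep in the left subtree may still be larger than the root, which is why the bounds are carried along.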
Refresher: lvalue vs rvalue
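#include <vector>

std::vector<int> createVector() {
    std::vector<int> v{ 1, 2, 3, 4, 5 };
    return v;   // v is treated as an rvalue here: moved out or elided
}
int main() {
    std::vector<int> v1 = createVector();
    // createVector() is an rvalue: v1 is move-constructed, or the
    // move is elided altogether; the copy constructor is not called
    std::vector<int>&& v2 = createVector();
    // rvalue reference bound to the returned temporary: no constructor
    // is called, and the temporary lives as long as v2
    return 0;
}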
Refresher: lvalue vs rvalue
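• Function createVector() returns a vector of integers by value,
so the expression createVector() is an rvalue.
• In main(), we call createVector() twice: once to
initialise v1 and once to initialise v2.
• When v1 is initialised, no copy is made:
– the returned vector is moved into v1, or
– the compiler elides the move and builds the vector directly in v1.
• v2 is an rvalue reference bound to the returned temporary:
– no constructor is called; the temporary lives as long as v2.
• The copy constructor is called only when a vector is initialised
from an lvalue, e.g. std::vector<int> v3 = v1;
– Moving hands the contents over instead of duplicating them. Since
the original vector is not needed anymore, this is more
efficient than copying it.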
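To see which constructor is selected for an lvalue and for an rvalue, a small counting class can help. Tracer and demo are hypothetical names introduced only for this sketch.

```cpp
#include <utility>

// Hypothetical class that counts how its objects were constructed.
struct Tracer
{
    static int copies;
    static int moves;
    Tracer( ) = default;
    Tracer( const Tracer & ) { ++copies; }
    Tracer( Tracer && ) noexcept { ++moves; }
};
int Tracer::copies = 0;
int Tracer::moves = 0;

void demo( )
{
    Tracer a;
    Tracer b = a;                // a is an lvalue: copy constructor
    Tracer c = std::move( a );   // std::move( a ) is an rvalue: move constructor
    (void) b;
    (void) c;
}
```

After one call to demo(), exactly one copy and one move have been counted. Initialising an object directly from a function's return value is deliberately left out, since the compiler may elide that construction entirely.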
Searching an element in a BST
Start from the root.
At each node, compare the element with the key
in the node. If they are equal, stop.
If the element is less, go to the left subtree.
If it is more, go to the right subtree.
Conclude that the element is not in the tree if
the subtree we need to descend into is empty.
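// Returns true if elt is in the subtree rooted at node
bool search( const BinaryNode *node, const Comparable & elt )
{
    if( node == nullptr )
        return false;                        // NOT FOUND
    else if( elt < node->element )
        return search( node->left, elt );
    else if( node->element < elt )
        return search( node->right, elt );
    else
        return true;                         // FOUND
}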
Example: BST with root 5, children 3 and 8, and leaves 1, 4 and 10.
Search for 10. Sequence traveled: 5, 8, 10. Found!
Search for 3.5. Sequence traveled: 5, 3, 4. Not found!
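The two searches in the example can be reproduced with a non-recursive variant that also records the sequence traveled. This is a sketch under assumptions: keys are doubles so that 3.5 can be searched for, and the visited parameter is an addition made for illustration.

```cpp
#include <vector>

struct BinaryNode
{
    double element;
    BinaryNode *left;
    BinaryNode *right;
};

// Iterative search that also records the keys visited on the way down.
bool search( const BinaryNode *t, double x, std::vector<double> & visited )
{
    while( t != nullptr )
    {
        visited.push_back( t->element );
        if( x < t->element )
            t = t->left;
        else if( t->element < x )
            t = t->right;
        else
            return true;    // found
    }
    return false;           // fell off the tree: not found
}
```

The number of nodes visited is at most one more than the depth of the tree, which is where the O(d) bound comes from.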
Find Min
• Returns a pointer to the node containing the
smallest element in the tree
• Start at the root and
– go left as long as there is a left child.
– The stopping point is the smallest element
BinaryNode * findMin( BinaryNode *t ) const
{   // recursive
    if( t == nullptr )
        return nullptr;
    if( t->left == nullptr )
        return t;
    return findMin( t->left );
}
Complexity: O(d), where d is the depth of the tree
Find Min
// non-recursive version
BinaryNode * findMin( BinaryNode *t ) const
{
if( t != nullptr )
while( t->left != nullptr )
t = t->left;
return t;
}
Example:
        5
      /   \
     3     8
    / \     \
   1   4     10
Travel 5, 3, 1
Return 1;
Insert an element
Insertion function
void insert(const Comparable & x, BinaryNode * & t){
if( t == nullptr )
t = new BinaryNode{ x, nullptr, nullptr };
else if( x < t->element )
insert( x, t->left );
else if( t->element < x )
insert( x, t->right );
else
; // Duplicate; do nothing
}
Complexity: O(d)
Example: Insert 3.5 into the BST with root 5, children 3 and 8, and leaves 1, 4 and 10.
Sequence traveled: 5, 3, 4
Insert 3.5 as left child of 4
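The insertion example can be replayed with a self-contained version of the routine. In this sketch the element type is fixed to double (instead of the template parameter Comparable) so that 3.5 can be stored, and makeEmpty is added only so that the nodes can be released again.

```cpp
struct BinaryNode
{
    double element;
    BinaryNode *left;
    BinaryNode *right;
};

// Same logic as the insertion function, with double keys.
void insert( double x, BinaryNode * & t )
{
    if( t == nullptr )
        t = new BinaryNode{ x, nullptr, nullptr };
    else if( x < t->element )
        insert( x, t->left );
    else if( t->element < x )
        insert( x, t->right );
    // else: duplicate, do nothing
}

// Postorder deletion of every node in the subtree.
void makeEmpty( BinaryNode * & t )
{
    if( t == nullptr )
        return;
    makeEmpty( t->left );
    makeEmpty( t->right );
    delete t;
    t = nullptr;
}
```

Inserting 5, 3, 8, 1, 4, 10 in that order builds the tree of the example; a further insert of 3.5 follows the path 5, 3, 4 and attaches the new node as the left child of 4. Passing the pointer by reference is what lets the recursive call overwrite the null link in the parent.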
Insert an element by moving
void insert( Comparable && x, BinaryNode * & t ) {
if( t == nullptr )
t = new BinaryNode{ std::move( x ), nullptr, nullptr };
// std::move is exactly equivalent to a static_cast to
// an rvalue reference type
else if( x < t->element )
insert( std::move( x ), t->left );
else if( t->element < x )
insert( std::move( x ), t->right );
else
; // Duplicate; do nothing
}
Complexity: O(d)
DELETION
Deleting a node has to be done in such a way that the
Binary Search Tree property is maintained.
void remove( const Comparable & x, BinaryNode * & t )
{
    if( t == nullptr )
        return;   // Item not found; do nothing
    if( x < t->element )
        remove( x, t->left );
    else if( t->element < x )
        remove( x, t->right );
    else if( t->left != nullptr && t->right != nullptr ) // Two children
    {
        t->element = findMin( t->right )->element;
        remove( t->element, t->right );
    }
    else
    {
        BinaryNode *oldNode = t;
        t = ( t->left != nullptr ) ? t->left : t->right;
        delete oldNode;
    }
}
If the node has two children:
Look at the right subtree of the node (subtree rooted at the
right child of the node).
Find the minimum there.
Replace the key of the node to be deleted by the minimum
element.
Delete the minimum element.
Any problem deleting it?
We need to take care of the children of this minimum element.
(The minimum element has no left child, so it has at most one child.)
For deletion convenience, a pointer from each node to its
parent can be kept.
Example: Delete 3 from the BST with root 5, children 3 and 8, leaves 1, 4 and 10, where 4 has left child 3.5.
3 has 2 children; findMin in the right subtree of 3 returns 3.5.
So 3 is replaced by 3.5, and 3.5 is deleted.
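The Delete 3 example can be replayed in code. As before, this is a sketch with assumptions: keys are doubles rather than a template type, and insert is included only to build the tree.

```cpp
struct BinaryNode
{
    double element;
    BinaryNode *left;
    BinaryNode *right;
};

void insert( double x, BinaryNode * & t )
{
    if( t == nullptr )
        t = new BinaryNode{ x, nullptr, nullptr };
    else if( x < t->element )
        insert( x, t->left );
    else if( t->element < x )
        insert( x, t->right );
}

BinaryNode * findMin( BinaryNode *t )
{
    if( t != nullptr )
        while( t->left != nullptr )
            t = t->left;
    return t;
}

void remove( double x, BinaryNode * & t )
{
    if( t == nullptr )
        return;                                   // not found
    if( x < t->element )
        remove( x, t->left );
    else if( t->element < x )
        remove( x, t->right );
    else if( t->left != nullptr && t->right != nullptr )
    {
        // Two children: copy the minimum of the right subtree here,
        // then delete that minimum from the right subtree.
        t->element = findMin( t->right )->element;
        remove( t->element, t->right );
    }
    else
    {
        // Zero or one child: splice the node out.
        BinaryNode *oldNode = t;
        t = ( t->left != nullptr ) ? t->left : t->right;
        delete oldNode;
    }
}
```

After inserting 5, 3, 8, 1, 4, 10, 3.5 and then removing 3, the left child of the root holds 3.5 and node 4 no longer has a left child. The second, recursive remove is always the easy case, because the minimum of a subtree has no left child.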
[Figure: before and after Delete 4 (node with 1 child)]
[Figure: before and after Delete 2 (node with 2 children)]
Complexity? O(d)
Operations on BSTs: Code
AVL Trees
• We have seen that all operations depend on the
depth of the tree.
• We don’t want trees with nodes of large depth.
• This can be attained if both subtrees of each node
have roughly the same height.
• An AVL (Adelson-Velskii and Landis) tree is a BST
with a balance condition: for every node, the heights of
the left and right subtrees differ by at most 1.
• The balance condition must be easy to maintain,
and it ensures that the depth of the tree is O(log N).
• Simplest idea: require that the left and right subtrees
of the root have the same height.
AVL Trees
The idea that the left and right subtrees of the root
have the same height does not force the tree to be
shallow.
AVL Tree
Some AVL Tree Properties
• Height information is kept for each node (in the
node structure).
• It can be shown that the height of an AVL tree is
at most roughly 1.44 log(N + 2) − 1.328, but, in
practice, only slightly more than log N.
• The minimum number of nodes, S(h), in an AVL
tree of height h: S(h) = S(h−1)+S(h−2)+1.
For h = 0, S(h) = 1. For h = 1, S(h) = 2
• All the tree operations can be performed in
O(log N) time, except possibly insertion and deletion, which
must also update the balancing information on the path back
to the root and may have to rebalance the tree.
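The recurrence for S(h) is easy to evaluate iteratively. The function name minAvlNodes is an illustrative choice, and the sketch assumes h is nonnegative and small enough for the result to fit in a long long.

```cpp
// Minimum number of nodes in an AVL tree of height h:
// S(0) = 1, S(1) = 2, S(h) = S(h-1) + S(h-2) + 1.
long long minAvlNodes( int h )
{
    long long prev = 1;   // S(0)
    long long curr = 2;   // S(1)
    if( h == 0 )
        return prev;
    for( int i = 2; i <= h; ++i )
    {
        long long next = curr + prev + 1;
        prev = curr;
        curr = next;
    }
    return curr;
}
```

The first values are S(2) = 4, S(3) = 7, S(4) = 12 and S(5) = 20. The recurrence is closely related to the Fibonacci numbers, so S(h) grows exponentially in h, which is why the height of an AVL tree with N nodes is logarithmic in N.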
Operations in AVL Tree
Deletion? Insertion?
Insertion into an AVL Tree
Insert 6
Double Rotation
• Single Rotation does not work for cases 2 and 3
(in which the insertion has occurred on the “inside”, i.e.,
left–right or right–left of a node).
[Figure: the tree after insertion and after a single rotation]
Result of Double Rotation
Pseudocode
// SingleRotate routines: Fig 4.41 (Weiss), separate for left and right nodes
// DoubleRotate routines: Fig 4.43 (Weiss), separate for left and right nodes
// Height(NULL) is taken to be -1
Insert(X, T)
{
    If (T = NULL)
    {
        create a new node T holding X; T->height = 0;
        Return(T);
    }
    Else If (X < T.element)
    {
        T->left = Insert(X, T->left)
        If Height(T->left) - Height(T->right) = 2
        {
            If (X < T.leftchild.element) T = singleRotatewithleft(T);
            else T = doubleRotatewithleft(T);
        }
    }
    Else If (X > T.element)
    {
        T->right = Insert(X, T->right)
        If Height(T->right) - Height(T->left) = 2
        {
            If (X > T.rightchild.element) T = singleRotatewithright(T);
            else T = doubleRotatewithright(T);
        }
    }
    T->height = max(Height(T->left), Height(T->right)) + 1;
    Return(T);
}
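The pseudocode can be turned into compilable C++ along the following lines. This is a sketch, not the textbook's code: it assumes integer keys, an empty subtree of height -1, and helper names (height, updateHeight, rotateWithLeftChild and so on) chosen for this example. It never frees nodes, which a real class would do in its destructor.

```cpp
#include <algorithm>

struct AvlNode
{
    int element;
    AvlNode *left;
    AvlNode *right;
    int height;
};

int height( const AvlNode *t ) { return t == nullptr ? -1 : t->height; }

void updateHeight( AvlNode *t )
{
    t->height = std::max( height( t->left ), height( t->right ) ) + 1;
}

// Single rotation for an insertion into the left subtree of the left child.
AvlNode * rotateWithLeftChild( AvlNode *k2 )
{
    AvlNode *k1 = k2->left;
    k2->left = k1->right;
    k1->right = k2;
    updateHeight( k2 );
    updateHeight( k1 );
    return k1;
}

// Mirror image: insertion into the right subtree of the right child.
AvlNode * rotateWithRightChild( AvlNode *k1 )
{
    AvlNode *k2 = k1->right;
    k1->right = k2->left;
    k2->left = k1;
    updateHeight( k1 );
    updateHeight( k2 );
    return k2;
}

// Double rotations for the two "inside" cases.
AvlNode * doubleWithLeftChild( AvlNode *k3 )
{
    k3->left = rotateWithRightChild( k3->left );
    return rotateWithLeftChild( k3 );
}

AvlNode * doubleWithRightChild( AvlNode *k1 )
{
    k1->right = rotateWithLeftChild( k1->right );
    return rotateWithRightChild( k1 );
}

// Returns the (possibly new) root of the subtree after inserting x.
AvlNode * insert( int x, AvlNode *t )
{
    if( t == nullptr )
        return new AvlNode{ x, nullptr, nullptr, 0 };
    if( x < t->element )
    {
        t->left = insert( x, t->left );
        if( height( t->left ) - height( t->right ) == 2 )
            t = ( x < t->left->element ) ? rotateWithLeftChild( t )
                                         : doubleWithLeftChild( t );
    }
    else if( t->element < x )
    {
        t->right = insert( x, t->right );
        if( height( t->right ) - height( t->left ) == 2 )
            t = ( t->right->element < x ) ? rotateWithRightChild( t )
                                          : doubleWithRightChild( t );
    }
    updateHeight( t );
    return t;
}
```

A caller writes root = insert( x, root ); because a rotation may change the root of the tree.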
Extended Example
Insert 3, 2, 1, 4, 5, 6, 7, 16, 15, 14
[Figs 1 to 16: the tree after each insertion and each rebalancing rotation]
Continued in Book
Tree Traversal Revisited
• Inorder traversal: process left subtree, process current
node, process right subtree. E.g. to list the elements of
a BST in sorted order (inorderTraversalBST).
– total running time: O(N): constant work is performed at every
node in the tree (testing against nullptr, setting up two
function calls, and doing an output statement) and each node is
visited once.
• Postorder traversal (LTree, RTree, Node): when we need to
process both subtrees first before we can process a node. E.g. to
compute the height of a node (postOrderTraversalBST).
– total running time: O(N): constant work is performed at each
node.
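Both traversals can be sketched in a few lines. The sketch assumes integer keys and takes the height of an empty tree to be -1; collecting the keys in a vector instead of printing them is a choice made here so that the result can be inspected.

```cpp
#include <algorithm>
#include <vector>

struct BinaryNode
{
    int element;
    BinaryNode *left;
    BinaryNode *right;
};

// Inorder: left subtree, current node, right subtree.
// On a BST this yields the keys in sorted order.
void inorder( const BinaryNode *t, std::vector<int> & out )
{
    if( t == nullptr )
        return;
    inorder( t->left, out );
    out.push_back( t->element );
    inorder( t->right, out );
}

// Postorder: both subtree heights are needed before the
// height of the current node can be computed.
int height( const BinaryNode *t )
{
    if( t == nullptr )
        return -1;
    return 1 + std::max( height( t->left ), height( t->right ) );
}
```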
Tree Traversal Revisited
• PreOrder traversal: Node is processed before the
children. E.g. to label each node with its depth.
(See file system example in this chapter)
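Labelling each node with its depth is a natural preorder task, because a node's depth must be known before its children can be labelled. DepthNode and labelDepths are hypothetical names used only in this sketch.

```cpp
// Hypothetical node type with room for a depth label.
struct DepthNode
{
    int element;
    int depth;
    DepthNode *left;
    DepthNode *right;
};

// Preorder: the node is processed first, then its children.
void labelDepths( DepthNode *t, int depth = 0 )
{
    if( t == nullptr )
        return;
    t->depth = depth;
    labelDepths( t->left, depth + 1 );
    labelDepths( t->right, depth + 1 );
}
```

Calling labelDepths on the root assigns 0 to the root, 1 to its children, and so on.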
• An M-ary search tree allows M-way branching. As
branching increases, the depth decreases.
• Whereas a complete binary tree has height roughly log_2 N, a
complete M-ary tree has height roughly log_M N.
• M-ary search tree can be created in the same way as a BST.
• In a BST, we need one key to decide which of two branches
to take. In an M-ary search tree, we need M − 1 keys to
decide.
• To make this scheme efficient in the worst case, we need to
ensure that the M-ary search tree is balanced in some way.
Otherwise, like a BST, it could degenerate into a linked list.
• In fact, we want an even more restrictive balancing condition
so that an M-ary search tree does not degenerate to even a
BST.
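The effect of the branching factor on the height can be made tangible with a small calculation. The function below is an illustrative helper, not from the textbook: it returns the smallest h with M^h ≥ N, that is, log_M N rounded up, and assumes N ≥ 1 and M ≥ 2.

```cpp
// Smallest h such that m^h >= n, i.e. the ceiling of log_m n.
// Assumes n >= 1 and m >= 2.
int approxHeight( long long n, int m )
{
    int h = 0;
    long long capacity = 1;   // m^h
    while( capacity < n )
    {
        capacity *= m;
        ++h;
    }
    return h;
}
```

For one million items this gives 20 with binary branching and 3 with 100-way branching, which is the motivation for wide branching when every level costs a disk access.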
B-Trees (B+ Trees)
A B-tree of order M is an M-ary tree such that:
1. The data items are stored at leaves.
2. The nonleaf nodes store up to M − 1 keys to guide the
searching; key i represents the smallest key in subtree i +1
3. The root is either a leaf or has between two and M
children.
4. All nonleaf nodes (except the root) have between ⌈M/2⌉
and M children. (avoids degeneration into binary tree)
5. All leaves are at the same depth and have between ⌈L/2⌉
and L data items, for some L (the determination of L is
described shortly).
N.B.: Rules 3 and 5 must be relaxed for the first L insertions.
Example
Slides based on the textbook:
Mark Allen Weiss (2014), Data Structures and Algorithm Analysis in C++, 4th edition, Pearson.