DS UNIT -5-1
TREES
1. TREES
• A tree is defined as a set of one or more nodes where one node is designated as the
root of the tree and all the remaining nodes can be partitioned into non-empty sets,
each of which is a sub-tree of the root.
• The following Figure shows a tree where node A is the root node; nodes B, C, and D
are children of the root node and form sub-trees of the tree rooted at node A.
Root node: The root node R is the topmost node in the tree. If R = NULL, then it means the
tree is empty.
Sub-trees: If the root node R is not NULL, then the trees T1, T2, and T3 are called the sub-
trees of R.
Leaf node: A node that has no children is called the leaf node or the terminal node.
Path: A sequence of consecutive edges is called a path. For example, in Fig. 1.1, the path
from the root node A to node I is given as: A, D, and I.
Ancestor node: An ancestor of a node is any predecessor node on the path from root to that
node. The root node does not have any ancestors. In the tree given in Fig. 1.1, nodes A, C,
and G are the ancestors of node K.
Descendant node: A descendant node is any successor node on any path from the node to a
leaf node. Leaf nodes do not have any descendants. In the tree given in Fig. 1.1, nodes C, G,
J, and K are the descendants of node A.
Level number: Every node in the tree is assigned a level number in such a way that the root
node is at level 0, children of the root node are at level number 1. Thus, every node is at one
level higher than its parent. So, all child nodes have a level number given by parent’s level
number + 1.
Degree: Degree of a node is equal to the number of children that a node has. The degree of a
leaf node is zero.
In-degree: In-degree of a node is the number of edges arriving at that node.
Out-degree: Out-degree of a node is the number of edges leaving that node.
3. BINARY SEARCH TREES:
3.1 Basic Concepts:
• A binary search tree, also known as an ordered binary tree, is a variant of binary trees
in which the nodes are arranged in an order.
• In a binary search tree, all the nodes in the left sub-tree have a value less than that of
the root node. Correspondingly, all the nodes in the right sub-tree have a value either
equal to or greater than the root node.
• The same rule is applicable to every sub-tree in the tree. (Note that a binary search
tree may or may not contain duplicate values, depending on its implementation.)
• In the above Figure, the root node is 39. The left sub-tree of the root node consists of
nodes 9, 10, 18, 19, 21, 27, 28, 29, and 36. All these nodes have smaller values than
the root node.
• The right sub-tree of the root node consists of nodes 40, 45, 54, 59, 60, and 65.
Recursively, each of the sub-trees also obeys the binary search tree constraint.
• For example, in the left sub-tree of the root node, 27 is the root and all elements in
its left sub-tree (9, 10, 18, 19, 21) are smaller than 27, while all nodes in its right sub-
tree (28, 29, and 36) are greater than the root node’s value.
• Since the nodes in a binary search tree are ordered, the time needed to search an
element in the tree is greatly reduced.
• Whenever we search for an element, we do not need to traverse the entire tree.
• For example, in the given tree, if we have to search for 29, then we know that we have
to scan only the left sub-tree.
• If the value is present in the tree, it can only be in the left sub-tree, as 29 is smaller
than 39 (the root node’s value).
• The left sub-tree has a root node with the value 27. Since 29 is greater than 27, we
will move to the right sub-tree, where we will find the element.
• Thus, the average running time of a search operation is O(log2 n), as at every step we
eliminate roughly half of the remaining nodes from the search process (assuming the
tree is reasonably balanced).
• Due to its efficiency in searching elements, binary search trees are widely used in
dictionary problems where the code always inserts and searches the elements that are
indexed by some key value.
• Binary search trees also speed up the insertion and deletion operations.
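The search procedure described above can be sketched in C as follows. This is only a minimal illustration and is not taken from these notes: the names struct bst_node and bst_search are hypothetical, and the tree is assumed to keep smaller values in the left sub-tree and equal or greater values in the right sub-tree.

```c
#include <stddef.h>

/* Hypothetical node layout for a binary search tree of ints. */
struct bst_node {
    int data;
    struct bst_node *left;
    struct bst_node *right;
};

/* Iterative search: returns the node holding val, or NULL if val is absent.
   At each step one whole sub-tree is discarded, so the loop runs at most
   (height of the tree + 1) times. */
struct bst_node *bst_search(struct bst_node *root, int val)
{
    while (root != NULL && root->data != val) {
        if (val < root->data)
            root = root->left;    /* val can only be in the left sub-tree */
        else
            root = root->right;   /* val can only be in the right sub-tree */
    }
    return root;
}
```

Searching for 29 in a tree rooted at 39 would visit 39, then 27, then 29, which mirrors the walk-through given in the bullets above.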
4. BST OPERATIONS
4.1 Inserting a New Node in a Binary Search Tree:
• The insert function is used to add a new node with a given value at the correct
position in the binary search tree.
• Adding the node at the correct position means that the new node should not violate
the properties of the binary search tree.
Algorithm to insert a given value in a binary search tree
• The initial code for the insert function is similar to the search function. This is
because we first find the correct position where the insertion has to be done then add
the node at that position.
• The insertion function changes the structure of the tree.
• Therefore, when the insert function is called recursively, the function should return
the new tree pointer.
Step 1: The insert function checks if the current node of TREE is NULL. If it is NULL, the
function simply adds the new node at that position. Otherwise, it compares the new value with
the current node’s value and recurs down the left or right sub-tree: if the new value is less
than the current node’s value, the left sub-tree is traversed; otherwise, the right sub-tree is
traversed. The insert function continues moving down the levels of the tree until it reaches
an empty (NULL) position.
Step 2: Exit
• The insert function requires time proportional to the height of the tree in the worst case.
• It takes O(log n) time to execute in the average case and O(n) time in the worst case.
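The insertion steps can be sketched in C as shown here. This is a hedged illustration under stated assumptions, not the original listing from these notes: the names struct bst_node and bst_insert are hypothetical, and values equal to an existing node are assumed to go into the right sub-tree.

```c
#include <stdlib.h>

/* Hypothetical node layout for a binary search tree of ints. */
struct bst_node {
    int data;
    struct bst_node *left;
    struct bst_node *right;
};

/* Recursive insert. Returns the root of the (possibly changed) sub-tree,
   so the caller must store the result: root = bst_insert(root, val). */
struct bst_node *bst_insert(struct bst_node *tree, int val)
{
    if (tree == NULL) {
        /* Empty position found: create the new node here. */
        struct bst_node *n = malloc(sizeof *n);
        if (n == NULL)
            return NULL;          /* allocation failed; tree is unchanged */
        n->data = val;
        n->left = NULL;
        n->right = NULL;
        return n;
    }
    if (val < tree->data)
        tree->left = bst_insert(tree->left, val);
    else
        tree->right = bst_insert(tree->right, val);   /* equal values go right */
    return tree;
}
```

Returning the sub-tree root is what allows an insertion into an empty tree to change the caller's root pointer without needing a pointer-to-pointer parameter.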
• This deletion could also be handled by replacing node 56 with its in-order successor, as
shown below:
• The steps below show the algorithm to delete a node from a binary search tree.
Step 1: We first check if TREE=NULL, because if it is true, then the node to be deleted is not
present in the tree.
Step 2: If that is not the case, then we check if the value to be deleted is less than the current
node’s data.
Step 3: If the value is less, we call the algorithm recursively on the node’s left sub-tree; if it is
greater, the algorithm is called recursively on the node’s right sub-tree.
Step 4: If we have found the node whose value is equal to VAL, then we check which case of
deletion it is.
Step 5: If the node to be deleted has both left and right children, then we find the in-order
predecessor of the node by calling findLargestNode(TREE -> LEFT) and replace the current node’s
value with that of its in-order predecessor. Then, we call Delete(TREE -> LEFT, TEMP -> DATA) to
delete the node that originally held the in-order predecessor (TEMP).
Step 6: In this way, case 3 of deletion (two children) is reduced to either case 1 (no children) or
case 2 (one child).
Step 7: If the node to be deleted does not have any child, then we simply set the node to NULL.
Step 8: If the node to be deleted has either a left or a right child but not both, then the current
node is replaced by its child node and the original node is freed.
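The deletion steps can be sketched in C as follows. This is a minimal sketch under stated assumptions rather than the original listing: the names struct bst_node, bst_new_node, find_largest and bst_delete are hypothetical, keys are assumed to be distinct, and every node is assumed to have been allocated with malloc.

```c
#include <stdlib.h>

/* Hypothetical node layout for a binary search tree of ints. */
struct bst_node {
    int data;
    struct bst_node *left;
    struct bst_node *right;
};

/* Helper used to build trees for the example; returns NULL on failure. */
struct bst_node *bst_new_node(int val)
{
    struct bst_node *n = malloc(sizeof *n);
    if (n != NULL) {
        n->data = val;
        n->left = NULL;
        n->right = NULL;
    }
    return n;
}

/* Largest node of a non-empty sub-tree: keep following right links. */
static struct bst_node *find_largest(struct bst_node *tree)
{
    while (tree->right != NULL)
        tree = tree->right;
    return tree;
}

/* Deletes val from the tree and returns the new root of the sub-tree. */
struct bst_node *bst_delete(struct bst_node *tree, int val)
{
    if (tree == NULL)
        return NULL;                                   /* value not present */
    if (val < tree->data) {
        tree->left = bst_delete(tree->left, val);
    } else if (val > tree->data) {
        tree->right = bst_delete(tree->right, val);
    } else if (tree->left != NULL && tree->right != NULL) {
        /* Two children: copy the in-order predecessor's value here, then
           delete the predecessor from the left sub-tree. */
        struct bst_node *pred = find_largest(tree->left);
        tree->data = pred->data;
        tree->left = bst_delete(tree->left, pred->data);
    } else {
        /* Zero or one child: splice the node out and free it. */
        struct bst_node *child = (tree->left != NULL) ? tree->left : tree->right;
        free(tree);
        return child;
    }
    return tree;
}
```

As with insertion, the caller is expected to write root = bst_delete(root, val), because deleting the root node changes which node is the root.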
4.3 Tree Traversals:
• Traversing a binary tree is the process of visiting each node in the tree exactly once in
a systematic way.
• Unlike linear data structures in which the elements are traversed sequentially, tree is a
nonlinear data structure in which the elements can be traversed in many different
ways.
• There are different algorithms for tree traversals. These algorithms differ in the order
in which the nodes are visited. The following are these algorithms.
a) Pre-order Traversal:
• To traverse a non-empty binary tree in pre-order, the following operations are
performed recursively at each node. The algorithm works by:
1. Visiting the root node,
2. Traversing the left sub-tree, and finally
3. Traversing the right sub-tree.
Binary tree
• Consider the above tree. The pre-order traversal of the tree is given as A, B, C.
• Root node first, the left sub-tree next, and then the right sub-tree.
• Pre-order traversal is also called depth-first traversal.
• In this algorithm, the left sub-tree is always traversed before the right sub-tree. The
word ‘pre’ in the pre-order specifies that the root node is accessed prior to any other
nodes in the left and right sub-trees.
• Pre-order algorithm is also known as the NLR traversal algorithm (Node-Left-Right).
• The algorithm for pre-order traversal is shown below.
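A possible C rendering of pre-order (NLR) traversal is sketched here. It is an illustration only: struct tree_node and preorder are hypothetical names, and the visited labels are collected into a caller-supplied buffer instead of being printed so that the visiting order is easy to inspect.

```c
#include <stddef.h>

/* Hypothetical binary tree node labelled with a single character. */
struct tree_node {
    char data;
    struct tree_node *left;
    struct tree_node *right;
};

/* Stores the labels in pre-order into out[], starting at index pos.
   Returns the next free index. The caller must make out[] large enough. */
size_t preorder(const struct tree_node *tree, char *out, size_t pos)
{
    if (tree == NULL)
        return pos;
    out[pos++] = tree->data;                  /* 1. visit the root node   */
    pos = preorder(tree->left, out, pos);     /* 2. traverse left sub-tree */
    return preorder(tree->right, out, pos);   /* 3. traverse right sub-tree */
}
```

For a root A with left child B and right child C, calling preorder(root, out, 0) fills out with A, B, C, which is the order stated above.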
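b) In-order Traversal:
• To traverse a non-empty binary tree in in-order, the following operations are
performed recursively at each node. The algorithm works by:
1. Traversing the left sub-tree,
2. Visiting the root node, and finally
3. Traversing the right sub-tree.
Binary tree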
• Consider the above tree. The in-order traversal of the tree is given as B, A, and C.
• Left sub-tree first, the root node next, and then the right sub-tree.
• In-order traversal is also called symmetric traversal.
• In this algorithm, the left sub-tree is always traversed before the root node and the
right sub-tree.
• The word ‘in’ in the in-order specifies that the root node is accessed in between the
left and the right sub-trees.
• In-order algorithm is also known as the LNR traversal algorithm (Left-Node-Right).
The algorithm for in-order traversal is shown below.
Algorithm for in-order traversal
• In-order traversal algorithm is usually used to display the elements of a binary search
tree.
• Here, all the elements with a value lower than a given value are accessed before the
elements with a higher value.
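The following C sketch illustrates why in-order (LNR) traversal lists the elements of a binary search tree in ascending order. The names struct bst_node and inorder are hypothetical and not part of these notes; the values are written into a caller-supplied array instead of being printed.

```c
#include <stddef.h>

/* Hypothetical node layout for a binary search tree of ints. */
struct bst_node {
    int data;
    struct bst_node *left;
    struct bst_node *right;
};

/* Stores the values in in-order into out[], starting at index pos.
   Returns the next free index. The caller must make out[] large enough. */
size_t inorder(const struct bst_node *tree, int *out, size_t pos)
{
    if (tree == NULL)
        return pos;
    pos = inorder(tree->left, out, pos);     /* 1. traverse left sub-tree  */
    out[pos++] = tree->data;                 /* 2. visit the root node     */
    return inorder(tree->right, out, pos);   /* 3. traverse right sub-tree */
}
```

Because every value in a left sub-tree is smaller than its root, the left-root-right order writes the smaller values first, so out[] ends up sorted.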
Example: In Figs (a) and (b), find the sequence of nodes that will be visited using In-order
traversal algorithm.
Solution:
Binary trees
GRAPHS
1. BASIC CONCEPTS
1.1 Introduction
• A graph is an abstract data structure that is used to implement the mathematical concept
of graphs.
• It is basically a collection of vertices (also called nodes) and edges that connect these
vertices.
• A graph is often viewed as a generalization of the tree structure, where instead of having
a purely parent-to-child relationship between tree nodes, any kind of complex relationship
between nodes can exist.
1.2 Definition:
• A graph G is defined as an ordered set (V, E), where V(G) represents the set of vertices and
E(G) represents the edges that connect these vertices. There are basically two types of graphs:
1. Undirected Graph
2. Directed Graph
Undirected Graph:
• The figure shows a graph with V(G) = {A, B, C, D, E} and E(G) = {(A, B), (B, C), (A, D),
(B, D), (D, E), (C, E)}.
• There are five vertices or nodes and six edges in the graph.
• In an undirected graph, edges do not have any direction associated with them.
• That is, if an edge is drawn between nodes A and B, then the nodes can be traversed from A to
B as well as from B to A.
• The above figure shows an undirected graph because it does not give any information about the
direction of the edges.
Directed Graph:
• A directed graph G, also known as a digraph, is a graph in which every edge has a
direction assigned to it.
• An edge of a directed graph is given as an ordered pair (u, v) of nodes in G. For an edge
e = (u, v),
1. The edge begins at u and terminates at v.
2. u is known as the origin or initial point of e. Correspondingly, v is known as the
destination or terminal point of e.
3. u is the predecessor of v. Correspondingly, v is the successor of u.
4. Nodes u and v are adjacent to each other.
• The above figure shows a directed graph. In a directed graph, edges form an ordered pair.
• If there is an edge from A to B, then that edge can be traversed from A to B but not from B to A.
• The edge (A, B) is said to initiate from node A (also known as initial node) and terminate
at node B (terminal node).
2. REPRESENTATION OF GRAPHS
• There are two common ways of storing graphs in the computer’s memory. They are:
1. Sequential representation by using an adjacency matrix.
2. Linked representation by using an adjacency list that stores the neighbours of a node
using a linked list.
2.1 Adjacency Matrix Representation
• An adjacency matrix is used to represent which nodes are adjacent to one another.
• Two nodes are said to be adjacent if there is an edge connecting them.
• In a directed graph G, if node v is adjacent to node u, then there is definitely an edge from u to
v. That is, if v is adjacent to u, we can get from u to v by traversing one edge.
• For any graph G having n nodes, the adjacency matrix will have the dimension of n × n.
• In an adjacency matrix, the rows and columns are labeled by graph vertices.
• An entry aij in the adjacency matrix will contain 1, if vertices vi and vj are adjacent to each
other.
• If the nodes are not adjacent, aij will be set to zero.
Adjacency Matrix Entry
• Since an adjacency matrix contains only 0s and 1s, it is called a bit matrix or a Boolean matrix.
• The above figure shows some graphs and their corresponding adjacency matrices.
From the above examples, we can draw the following conclusions:
• For a simple graph (that has no loops), the adjacency matrix has 0s on the diagonal.
• The adjacency matrix of an undirected graph is symmetric.
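• The memory use of an adjacency matrix is O(n^2), where n is the number of nodes in the graph.
• The number of 1s (or non-zero entries) in the adjacency matrix of a directed graph is equal to
the number of edges in the graph. In an undirected graph without loops, every edge contributes
two entries, so the number of 1s is twice the number of edges.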
• The adjacency matrix for a weighted graph contains the weights of the edges connecting the
nodes.
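The conclusions above can be checked with a small C sketch. It is only an illustration under stated assumptions: MAX_NODES and the function names matrix_init, add_undirected_edge and count_ones are hypothetical, and the vertices A to E are assumed to be numbered 0 to 4.

```c
#include <string.h>

#define MAX_NODES 5   /* assumed number of vertices (A..E mapped to 0..4) */

/* Clears the matrix so that no two nodes are adjacent. */
void matrix_init(int adj[MAX_NODES][MAX_NODES])
{
    memset(adj, 0, sizeof(int) * MAX_NODES * MAX_NODES);
}

/* Records an undirected edge (u, v): both a[u][v] and a[v][u] become 1,
   which is why the matrix of an undirected graph is symmetric. */
void add_undirected_edge(int adj[MAX_NODES][MAX_NODES], int u, int v)
{
    adj[u][v] = 1;
    adj[v][u] = 1;
}

/* Counts the non-zero entries of the matrix. */
int count_ones(int adj[MAX_NODES][MAX_NODES])
{
    int i, j, ones = 0;
    for (i = 0; i < MAX_NODES; i++)
        for (j = 0; j < MAX_NODES; j++)
            if (adj[i][j] != 0)
                ones++;
    return ones;
}
```

Adding the six edges (A, B), (B, C), (A, D), (B, D), (D, E) and (C, E) of the undirected graph defined earlier gives a symmetric matrix with a zero diagonal and twelve non-zero entries, two per edge.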
2.2 Adjacency Linked List Representation
• An adjacency list is another way in which graphs can be represented in the computer’s
memory.
• This structure consists of a list of all nodes in G.
• Every node is in turn linked to its own list that contains the names of all other nodes that are
adjacent to it.
The key advantages of using an adjacency list are:
• It is easy to follow and clearly shows the adjacent nodes of a particular node.
• It is often used for storing graphs that have a small-to-moderate number of edges.
• Adding new nodes in G is easy and straightforward when G is represented using an
adjacency list. Adding new nodes in an adjacency matrix is a difficult task, as the size of
the matrix needs to be changed and existing nodes may have to be reordered.
• For a directed graph, the sum of the lengths of all adjacency lists is equal to the number
of edges in G.
• For an undirected graph, the sum of the lengths of all adjacency lists is equal to twice the
number of edges in G because an edge (u, v) means an edge from node u to v as well as
an edge from v to u.
• Adjacency lists can also be modified to store weighted graphs.
Let us now see an adjacency list for an undirected graph as well as a weighted graph.
PROGRAMMING EXAMPLE
1. Write a program to create a graph of n vertices using an adjacency list. Also write the
code to read and print its information and finally to delete the graph.
#include <stdio.h>
#include <stdlib.h>   /* malloc, free */
#define MAX_NODES 10
/* One entry in the adjacency list of a vertex */
struct node
{
    int vertex;
    struct node *next;
};
void displayGraph(struct node *adj[], int no_of_nodes);
void deleteGraph(struct node *adj[], int no_of_nodes);
void createGraph(struct node *adj[], int no_of_nodes);
int main()
{
    struct node *Adj[MAX_NODES];
    int i, no_of_nodes;
    printf("\n Enter the number of nodes in G: ");
    /* Reject input that would overflow Adj[] */
    if(scanf("%d", &no_of_nodes) != 1 || no_of_nodes < 1 || no_of_nodes > MAX_NODES)
        return 1;
    for(i = 0; i < no_of_nodes; i++)
        Adj[i] = NULL;
    createGraph(Adj, no_of_nodes);
    printf("\n The graph is: ");
    displayGraph(Adj, no_of_nodes);
    deleteGraph(Adj, no_of_nodes);
    return 0;
}
void createGraph(struct node *Adj[], int no_of_nodes)
{
    struct node *new_node, *last;
    int i, j, n, val;
    for(i = 0; i < no_of_nodes; i++)
    {
        last = NULL;
        printf("\n Enter the number of neighbours of %d: ", i);
        scanf("%d", &n);
        for(j = 1; j <= n; j++)
        {
            printf("\n Enter the neighbour %d of %d: ", j, i);
            scanf("%d", &val);
            new_node = (struct node *) malloc(sizeof(struct node));
            if(new_node == NULL)
                return;   /* out of memory */
            new_node->vertex = val;
            new_node->next = NULL;
            /* Append the new node to the end of the list of vertex i */
            if(Adj[i] == NULL)
                Adj[i] = new_node;
            else
                last->next = new_node;
            last = new_node;
        }
    }
}
void displayGraph(struct node *Adj[], int no_of_nodes)
{
    struct node *ptr;
    int i;
    for(i = 0; i < no_of_nodes; i++)
    {
        ptr = Adj[i];
        printf("\n The neighbours of node %d are:", i);
        while(ptr != NULL)
        {
            printf("\t%d", ptr->vertex);
            ptr = ptr->next;
        }
    }
}
void deleteGraph(struct node *Adj[], int no_of_nodes)
{
    int i;
    struct node *temp, *ptr;
    /* Valid indices are 0 .. no_of_nodes - 1 */
    for(i = 0; i < no_of_nodes; i++)
    {
        ptr = Adj[i];
        while(ptr != NULL)
        {
            temp = ptr;
            ptr = ptr->next;
            free(temp);
        }
        Adj[i] = NULL;
    }
}
Output
Enter the number of nodes in G: 3
Enter the number of neighbours of 0: 1
Enter the neighbour 1 of 0: 2
Enter the number of neighbours of 1: 2
Enter the neighbour 1 of 1: 0
Enter the neighbour 2 of 1: 2
Enter the number of neighbours of 2: 1
Enter the neighbour 1 of 2: 1
The neighbours of node 0 are: 2
The neighbours of node 1 are: 0 2
The neighbours of node 2 are: 1
Note: If the graph in the above program had been a weighted graph, then the structure of the node
would have been:
struct node
{
    int vertex;
    int weight;
    struct node *next;
};
3. GRAPH TRAVERSALS
• There are two standard methods of graph traversal. These two methods are:
1. Breadth-first search
2. Depth-first search
1. Breadth-First Search Algorithm
• Breadth-first search (BFS) is a graph search algorithm that begins at the root node and
explores all the neighboring nodes.
• Then for each of those nearest nodes, the algorithm explores their unexplored neighbor
nodes, and so on, until it finds the goal.
ALGORITHM
Step 1: SET STATUS = 1 (ready state) for each node in G
Step 2: Enqueue the starting node A and set its STATUS = 2 (waiting state)
Step 3: Repeat Steps 4 and 5 until QUEUE is empty
Step 4: Dequeue a node N. Process it and set its STATUS = 3 (processed state).
Step 5: Enqueue all the neighbours of N that are in the ready state (whose STATUS = 1)
and set their STATUS = 2 (waiting state)
[END OF LOOP]
Step 6: EXIT
Programming Example
2. Write a program to implement the breadth-first search algorithm.
#include <stdio.h>
#define MAX 5   /* number of vertices; the sample input below is a 5 x 5 matrix */
void breadth_first_search(int adj[][MAX], int visited[], int start)
{
    int queue[MAX], rear = -1, front = -1, i;
    queue[++rear] = start;
    visited[start] = 1;
    while(rear != front)
    {
        start = queue[++front];
        printf("%c \t", start + 65);   /* 0 is printed as A, 1 as B, ... */
        /* Enqueue every unvisited neighbour of the dequeued node */
        for(i = 0; i < MAX; i++)
        {
            if(adj[start][i] == 1 && visited[i] == 0)
            {
                queue[++rear] = i;
                visited[i] = 1;
            }
        }
    }
}
int main()
{
    int visited[MAX] = {0};
    int adj[MAX][MAX], i, j;
    printf("\n Enter the adjacency matrix: ");
    for(i = 0; i < MAX; i++)
        for(j = 0; j < MAX; j++)
            scanf("%d", &adj[i][j]);
    breadth_first_search(adj, visited, 0);
    return 0;
}
Output
Enter the adjacency matrix:
0 1 0 1 0
1 0 1 1 0
0 1 0 0 1
1 1 0 0 1
0 0 1 1 0
A B D C E
Solution:
(a) Push H onto the stack.
STACK: H
(b) Pop and print the top element of the STACK, that is, H. Push all the neighbours of H onto the
stack that are in the ready state. The STACK now becomes
PRINT: H STACK: E, I
(c) Pop and print the top element of the STACK, that is, I. Push all the neighbours of I onto the
stack that are in the ready state. The STACK now becomes
PRINT: I STACK: E, F
(d) Pop and print the top element of the STACK, that is, F. Push all the neighbours of F onto the
stack that are in the ready state. (Note F has two neighbours, C and H. But only C will be added,
as H is not in the ready state.) The STACK now becomes
PRINT: F STACK: E, C
(e) Pop and print the top element of the STACK, that is, C. Push all the neighbours of C onto the
stack that are in the ready state. The STACK now becomes
PRINT: C STACK: E, B, G
(f) Pop and print the top element of the STACK, that is, G. Push all the neighbours of G onto the
stack that are in the ready state. Since there are no neighbours of G that are in the ready state,
no push operation is performed. The STACK now becomes
PRINT: G STACK: E, B
(g) Pop and print the top element of the STACK, that is, B. Push all the neighbours of B onto the
stack that are in the ready state. Since there are no neighbours of B that are in the ready state,
no push operation is performed. The STACK now becomes
PRINT: B STACK: E
(h) Pop and print the top element of the STACK, that is, E. Push all the neighbours of E onto the
stack that are in the ready state. Since there are no neighbours of E that are in the ready state,
no push operation is performed. The STACK now becomes empty.
PRINT: E STACK:
Since the STACK is now empty, the depth-first search of G starting at node H is complete and
the nodes which were printed are:
H, I, F, C, G, B, E
These are the nodes which are reachable from the node H.
Features of Depth-First Search Algorithm
Space complexity:
• The space complexity of a depth-first search is lower than that of a breadth-first search.
Time complexity:
• The time complexity of a depth-first search is proportional to the number of vertices plus
the number of edges in the graphs that are traversed.
• The time complexity can be given as O(|V| + |E|).
Completeness:
• Depth-first search is said to be a complete algorithm on finite graphs.
• If there is a solution, depth-first search will find it regardless of the kind of graph. But in
the case of an infinite graph, for example one in which there is no possible solution, it may
diverge and never terminate.
Applications of Depth-First Search Algorithm
Depth-first search is useful for:
• Finding a path between two specified nodes, u and v, of an unweighted graph.
• Finding a path between two specified nodes, u and v, of a weighted graph.
• Finding whether a graph is connected or not.
• Computing the spanning tree of a connected graph.
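One of the applications listed above, checking whether a graph is connected, can be sketched in C as follows. This is a hedged illustration and not part of the original notes: MAX_NODES, dfs_mark and is_connected are hypothetical names, the graph is assumed to be undirected and stored as an adjacency matrix, and a recursive form of depth-first search is used for brevity.

```c
#define MAX_NODES 5   /* assumed number of vertices */

/* Recursive depth-first search: marks every node reachable from u. */
static void dfs_mark(int adj[MAX_NODES][MAX_NODES], int visited[MAX_NODES], int u)
{
    int v;
    visited[u] = 1;
    for (v = 0; v < MAX_NODES; v++)
        if (adj[u][v] && !visited[v])
            dfs_mark(adj, visited, v);
}

/* Returns 1 if every node of the undirected graph is reachable from
   node 0 (that is, the graph is connected), and 0 otherwise. */
int is_connected(int adj[MAX_NODES][MAX_NODES])
{
    int visited[MAX_NODES] = {0};
    int v;
    dfs_mark(adj, visited, 0);
    for (v = 0; v < MAX_NODES; v++)
        if (!visited[v])
            return 0;
    return 1;
}
```

The idea is that a single depth-first search started at any node of a connected undirected graph reaches all the other nodes, so one unvisited node is enough to show that the graph is not connected.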
Programming Example:
3. Write a program to implement the depth-first search algorithm.
#include <stdio.h>
#define MAX 5
void depth_first_search(int adj[][MAX], int visited[], int start)
{
    int stack[MAX];
    int top = -1, i;
    printf("%c-", start + 65);   /* 0 is printed as A, 1 as B, ... */
    visited[start] = 1;
    stack[++top] = start;
    while(top != -1)
    {
        start = stack[top];
        /* Look for an unvisited neighbour of the node on top of the stack */
        for(i = 0; i < MAX; i++)
        {
            if(adj[start][i] && visited[i] == 0)
            {
                stack[++top] = i;
                printf("%c-", i + 65);
                visited[i] = 1;
                break;
            }
        }
        /* No unvisited neighbour left: backtrack */
        if(i == MAX)
            top--;
    }
}
int main()
{
    int adj[MAX][MAX];
    int visited[MAX] = {0}, i, j;
    printf("\n Enter the adjacency matrix: ");
    for(i = 0; i < MAX; i++)
        for(j = 0; j < MAX; j++)
            scanf("%d", &adj[i][j]);
    printf("DFS Traversal: ");
    depth_first_search(adj, visited, 0);
    printf("\n");
    return 0;
}
Output
Enter the adjacency matrix:
0 1 0 1 0
1 0 1 1 0
0 1 0 0 1
1 1 0 0 1
0 0 1 1 0
DFS Traversal: A-B-C-E-D-