D.S BCA III 2024-25 Notes Assignments
UNIT-I
Introduction to Data Structures and their Characteristics; Arrays:
Representation of single and multidimensional arrays; Sparse arrays; lower and upper
triangular matrices and tridiagonal matrices, with vector representation.
UNIT-II
Stacks and Queues: Introduction and primitive operations on stack; Stack application; Infix,
postfix, prefix expressions; Evaluation of postfix expression; Conversion between prefix,
infix and postfix; Introduction and primitive operations on queues, D-queues and priority queues.
UNIT-III
Lists: Introduction to linked lists; Sequential and linked lists; operations such as traversal,
insertion, deletion, searching; Two way lists and Use of headers
UNIT-IV
Trees: Introduction and terminology; Traversal of binary trees; Recursive algorithms for tree
operations such as traversal, insertion, deletion; Binary Search Tree
UNIT-V
B-Trees: Introduction; The invention of B-Tree; Statement of the problem; Indexing with
binary search trees; A better approach to tree indexes; B-Trees; Working up from the bottom;
Example for creating a B-Tree
UNIT-VI
Sorting Techniques: Insertion sort, selection sort, merge sort, heap sort; Searching
Techniques: linear search, binary search and hashing
Representation in Memory:
Elements are stored in consecutive memory locations. If the base address of the
array (address of the first element) is base_address, and each element occupies size
bytes, then the address of the element at index i can be calculated as:
Address(array[i]) = base_address + i * size
C Language Program (1D Array):
C
#include <stdio.h>
int main() {
int numbers[5]; // Declare an integer array of size 5
return 0;
}
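The address formula can be checked directly in C. The following is a minimal sketch; the helper name element_address is an illustrative assumption and does not come from the notes.

```c
#include <stddef.h>

/* Computes the address of element i from the base address and the element
   size, mirroring Address(array[i]) = base_address + i * size.
   element_address is a hypothetical helper used only for illustration. */
const void *element_address(const void *base_address, size_t i, size_t size) {
    /* Pointer arithmetic on char* advances one byte at a time. */
    return (const char *)base_address + i * size;
}
```

For int numbers[5], the call element_address(numbers, 3, sizeof numbers[0]) yields the same address as &numbers[3].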
3.2 Multidimensional Arrays
Multidimensional arrays are used to represent data that has more than one
dimension. The most common types are two-dimensional (2D) arrays (matrices) and
three-dimensional (3D) arrays.
Diagram:
Column 0 Column 1 Column 2
Row 0 [ 1 ] [ 2 ] [ 3 ]
Row 1 [ 4 ] [ 5 ] [ 6 ]
Row 2 [ 7 ] [ 8 ] [ 9 ]
Representation in Memory:
2D arrays are typically stored in memory using one of two main methods:
• Row-Major Order: Elements of each row are stored contiguously, one row
after another. For a 2D array array[R][C] (R rows, C columns), the address of
the element at array[i][j] is:
• Address(array[i][j]) = base_address + (i * C + j) * size
• Column-Major Order: Elements of each column are stored contiguously, one
column after another. For a 2D array array[R][C], the address of the element at
array[i][j] is:
• Address(array[i][j]) = base_address + (j * R + i) * size
(Note: Row-major order is more common in languages like C and C++, while
Fortran uses column-major order).
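The two layouts can be compared with a pair of small offset functions. This is a sketch only; the function names are assumptions, and the results are element offsets that still have to be multiplied by size and added to base_address.

```c
#include <stddef.h>

/* Element offset of array[i][j] in row-major order for an array with
   C columns. Hypothetical helper for illustration. */
size_t row_major_offset(size_t i, size_t j, size_t C) {
    return i * C + j;
}

/* Element offset of array[i][j] in column-major order for an array with
   R rows. Hypothetical helper for illustration. */
size_t col_major_offset(size_t i, size_t j, size_t R) {
    return j * R + i;
}
```

For the 3x3 matrix in the diagram, the element at [1][2] (value 6) sits at offset 5 in row-major order and at offset 7 in column-major order.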
C Language Program (2D Array):
C
#include <stdio.h>
int main() {
int matrix[3][3] = {
{1, 2, 3},
{4, 5, 6},
{7, 8, 9}
};
printf("Matrix elements:\n");
for (int i = 0; i < 3; i++) {
for (int j = 0; j < 3; j++) {
printf("%d ", matrix[i][j]);
}
printf("\n");
}
return 0;
}
3.2.2 Three-Dimensional Arrays (3D Arrays)
A 3D array can be visualized as a collection of 2D arrays stacked on top of each
other.
Diagram (Conceptual):
Imagine multiple matrices arranged in layers. If a 3D array is declared as array[L][R][C]
(L layers, R rows, C columns), you can think of it as L tables of R rows and C
columns.
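Representation in Memory (Row-Major Order):
For a 3D array array[L][R][C], the address of the element at array[i][j][k] is:
Address(array[i][j][k]) = base_address + ((i * R + j) * C + k) * size
C Language Program (3D Array):
C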
int main() {
int cube[2][2][2] = {
{ {1, 2}, {3, 4} },
{ {5, 6}, {7, 8} }
};
return 0;
}
4. Sparse Arrays
A sparse array is an array where most of the elements have a value of zero (or
some other default value). Storing all these zero elements can be inefficient in terms
of memory usage.
Example:
0 0 0 5 0
0 2 0 0 0
0 0 0 0 8
0 0 1 0 0
0 0 0 0 0
In this 5x5 matrix, most elements are zero.
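One common way to avoid storing the zeros is to keep only (row, column, value) triplets for the non-zero entries. The sketch below assumes that representation; the struct and function names are illustrative, not taken from the notes.

```c
#include <stddef.h>

/* One non-zero entry of a sparse matrix: its row, column and value. */
struct Triplet {
    int row;
    int col;
    int value;
};

/* Scans a rows x cols matrix stored row-major in a flat array and copies
   its non-zero entries into out[], writing at most max_out of them.
   Returns the number of non-zero entries found in the matrix.
   to_triplets is a hypothetical helper used only for illustration. */
size_t to_triplets(const int *matrix, int rows, int cols,
                   struct Triplet *out, size_t max_out) {
    size_t count = 0;
    for (int i = 0; i < rows; i++) {
        for (int j = 0; j < cols; j++) {
            int v = matrix[i * cols + j];
            if (v != 0) {
                if (count < max_out) {
                    out[count].row = i;
                    out[count].col = j;
                    out[count].value = v;
                }
                count++;
            }
        }
    }
    return count;
}
```

For the 5x5 example this produces four triplets, (0, 3, 5), (1, 1, 2), (2, 4, 8) and (3, 2, 1), so 12 integers are stored instead of 25.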
Column-Major Order: Elements are stored column by column. The index k in the
vector for an element aij (where i≥j) can be calculated as:
k = j * (2n - j + 1) / 2 + i - j
(Assuming 0-based indexing for both matrix and vector; columns 0 to j−1 together hold
n + (n−1) + ... + (n−j+1) = j * (2n - j + 1) / 2 elements)
Column-Major Order: The index k in the vector for an element aij (where i≤j) can be
calculated as:
Course Name: DATA STRUCTURE USING C, C++
Course Code: 0327002
Faculty Name- Davesh Bhardwaj
k = j * (j + 1) / 2 + i
(Assuming 0-based indexing; columns 0 to j−1 together hold 1 + 2 + ... + j = j * (j + 1) / 2
elements)
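The column-major index mappings can be derived by counting how many elements the earlier columns contribute. The sketch below does that for both triangular forms with 0-based indices; the function names are assumptions made for illustration.

```c
/* Index in the packed vector of element a[i][j] of an n x n lower
   triangular matrix (i >= j), stored column by column.
   Columns 0..j-1 hold n, n-1, ..., n-j+1 elements, which is
   j * (2n - j + 1) / 2 in total; i - j is the position inside column j. */
int lower_col_major_index(int i, int j, int n) {
    return j * (2 * n - j + 1) / 2 + (i - j);
}

/* Index in the packed vector of element a[i][j] of an upper triangular
   matrix (i <= j), stored column by column.
   Columns 0..j-1 hold 1, 2, ..., j elements, which is j * (j + 1) / 2
   in total; i is the position inside column j. */
int upper_col_major_index(int i, int j) {
    return j * (j + 1) / 2 + i;
}
```

Walking either triangle column by column should produce the indices 0, 1, 2, ... with no gaps or repeats, which is a quick way to check a mapping of this kind.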
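// Packs the sub-, main and super-diagonal of an n x n tridiagonal matrix into
// a vector of 3n - 2 elements. The function name and signature are assumed
// (the listing starts at the first loop); requires <stdlib.h>.
int *tridiagonalToVector(int n, int matrix[n][n]) {
    int *vector = malloc((3 * n - 2) * sizeof(int));
    if (vector == NULL) {
        return NULL;
    }
    int k = 0;
    // Sub-diagonal
    for (int i = 1; i < n; i++) {
        vector[k++] = matrix[i][i - 1];
    }
    // Main diagonal
    for (int i = 0; i < n; i++) {
        vector[k++] = matrix[i][i];
    }
    // Super-diagonal
    for (int i = 0; i < n - 1; i++) {
        vector[k++] = matrix[i][i + 1];
    }
    return vector; // The caller frees the vector
}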
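#include <stdio.h>
#define MAX_SIZE 100 // Assumed capacity; the value is not shown in the listing
int stack[MAX_SIZE];
int top = -1;
void push(int value) { // Assumed counterpart to pop(); not shown in the listing
if (top == MAX_SIZE - 1) {
printf("Stack Overflow\n");
return;
}
stack[++top] = value;
}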
int pop() {
if (top == -1) {
printf("Stack Underflow\n");
return -1; // Or some error value
}
return stack[top--];
}
int peek() {
if (top == -1) {
printf("Stack is empty\n");
return -1; // Or some error value
}
return stack[top];
}
int isEmpty() {
return top == -1;
}
int isFull() {
return top == MAX_SIZE - 1;
}
int main() {
push(10);
push(20);
push(30);
return 0;
}
1.3 Stack Applications
Stacks are used in various applications in computer science, including:
• Function Call Stack: When a function is called, its information (return
address, local variables, parameters) is pushed onto the stack. When the
function returns, this information is popped. This mechanism manages the
flow of execution in programs.
• Expression Evaluation: Stacks are crucial for evaluating arithmetic
expressions, especially postfix (RPN) expressions.
• Expression Conversion: Stacks are used to convert expressions between
infix, postfix, and prefix notations.
• Syntax Parsing: Compilers use stacks to parse the syntax of programming
languages.
• Backtracking Algorithms: Algorithms that explore all possible solutions
often use stacks to keep track of the path taken (e.g., solving mazes, finding
all permutations).
• Undo/Redo Functionality: Many applications use stacks to store the history
of operations, allowing users to undo and redo actions.
• Reversing a Sequence: Stacks can be used to easily reverse a sequence of
elements.
1.4 Infix, Postfix, Prefix Expressions
Arithmetic expressions can be written in three different notations:
• Infix Notation: The operator is placed between the operands. This is the
most common notation used by humans (e.g., a + b, (c * d) - e).
• Postfix Notation (Reverse Polish Notation - RPN): The operator is placed
after the operands (e.g., a b +, c d * e -).
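• Prefix Notation (Polish Notation): The operator is placed before the
operands (e.g., + a b, - * c d e).
Infix          Postfix    Prefix
(A + B) * C    AB+C*      *+ABC
A * (B + C)    ABC+*      *A+BC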
Algorithm:
1. Scan the postfix expression from left to right.
2. If an operand is encountered, push it onto the stack.
3. If an operator is encountered:
o Pop the top two operands from the stack (let the second popped
operand be the first operand for the operation, and the first popped
operand be the second).
o Perform the operation on the two operands.
o Push the result back onto the stack.
4. After scanning the entire expression, the final result will be the only element
remaining on the stack.
Example (Evaluate 2 3 + 4 *):
1. Scan 2: Push 2 onto the stack. Stack: [2]
2. Scan 3: Push 3 onto the stack. Stack: [2, 3]
3. Scan +: Pop 3, then pop 2. Calculate 2 + 3 = 5. Push 5 onto the stack. Stack:
[5]
4. Scan 4: Push 4 onto the stack. Stack: [5, 4]
5. Scan *: Pop 4, then pop 5. Calculate 5 * 4 = 20. Push 20 onto the stack. Stack:
[20]
6. End of expression. The result is 20.
C Language Program (Evaluation of Postfix Expression):
C
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
int stack[MAX_STACK_SIZE];
int top = -1;
int pop() {
if (top == -1) {
printf("Stack Underflow\n");
exit(EXIT_FAILURE);
}
return stack[top--];
}
int main() {
char postfix[] = "2 3 + 4 *";
printf("Result of %s is %d\n", postfix, evaluatePostfix(postfix));
return 0;
}
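The listing above calls evaluatePostfix without showing its body. One possible version is sketched below, under the assumptions that tokens are separated by spaces, operands are non-negative integers, and the operators are + - * /. It keeps its own local stack so that it is self-contained; EVAL_STACK_SIZE and fail are names introduced here for illustration.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

#define EVAL_STACK_SIZE 64   /* assumed capacity for this sketch */

/* Reports a malformed expression and stops the program, mirroring the
   exit-on-underflow behaviour of pop() in the listing above. */
static void fail(const char *message) {
    fprintf(stderr, "%s\n", message);
    exit(EXIT_FAILURE);
}

/* Evaluates a space-separated postfix expression such as "2 3 + 4 *". */
int evaluatePostfix(const char *expr) {
    int stack[EVAL_STACK_SIZE];
    int top = -1;
    const char *p = expr;
    while (*p != '\0') {
        if (isspace((unsigned char)*p)) {
            p++;                                  /* skip separators */
        } else if (isdigit((unsigned char)*p)) {
            char *end;
            long value = strtol(p, &end, 10);     /* multi-digit operand */
            if (top == EVAL_STACK_SIZE - 1) fail("Stack Overflow");
            stack[++top] = (int)value;
            p = end;
        } else {
            if (top < 1) fail("Stack Underflow");
            int right = stack[top--];             /* first pop: second operand */
            int left = stack[top--];              /* second pop: first operand */
            switch (*p) {
                case '+': stack[++top] = left + right; break;
                case '-': stack[++top] = left - right; break;
                case '*': stack[++top] = left * right; break;
                case '/':
                    if (right == 0) fail("Division by zero");
                    stack[++top] = left / right;
                    break;
                default: fail("Unknown operator");
            }
            p++;
        }
    }
    if (top != 0) fail("Malformed expression");   /* exactly one value must remain */
    return stack[top];
}
```

The order of the two pops matters for the non-commutative operators: for "7 2 -" the first pop yields 2 and the second yields 7, so the result is 7 - 2.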
1.6 Conversion Between Prefix, Infix, and Postfix
Stacks are commonly used for these conversions.
Algorithm (Infix to Postfix):
1. Initialize an empty output queue (for postfix expression) and an empty
operator stack.
2. Scan the infix expression from left to right.
3. If an operand is encountered, add it to the output queue.
4. If an opening parenthesis ( is encountered, push it onto the operator stack.
5. If a closing parenthesis ) is encountered, pop operators from the stack and
add them to the output queue until an opening parenthesis is encountered.
Pop and discard the opening parenthesis.
6. If an operator is encountered:
o While the operator stack is not empty, the top of the stack is not an
opening parenthesis, and the top operator has higher precedence than the
current operator (or equal precedence when the current operator is
left-associative), pop the operator from the stack and add it to the
output queue.
o Push the current operator onto the stack.
7. After scanning the entire infix expression, pop any remaining operators from
the stack and add them to the output queue.
8. The resulting postfix expression is in the output queue.
Operator Precedence (Example):
Operator   Precedence   Associativity
^          Highest      Right-to-left
*, /       Medium       Left-to-right
+, -       Lowest       Left-to-right
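The conversion steps and the precedence table can be sketched in C as follows. This is an illustrative version that assumes single-character operands (letters or digits), a well-formed expression shorter than 128 characters, and the operators listed in the table; the function names are hypothetical.

```c
#include <ctype.h>

/* Precedence of an operator; a higher number binds tighter.
   Returns 0 for characters that are not operators. */
static int precedence(char op) {
    switch (op) {
        case '^': return 3;
        case '*': case '/': return 2;
        case '+': case '-': return 1;
        default: return 0;
    }
}

/* Converts infix to postfix. out must have room for as many characters as
   infix plus the terminating '\0'. The output string plays the role of the
   output queue and ops[] is the operator stack. */
void infix_to_postfix(const char *infix, char *out) {
    char ops[128];
    int top = -1;
    int n = 0;
    for (const char *p = infix; *p != '\0'; p++) {
        char c = *p;
        if (isalnum((unsigned char)c)) {
            out[n++] = c;                         /* operand goes to the output */
        } else if (c == '(') {
            ops[++top] = c;
        } else if (c == ')') {
            while (top >= 0 && ops[top] != '(') out[n++] = ops[top--];
            if (top >= 0) top--;                  /* discard the '(' */
        } else if (precedence(c) > 0) {
            /* Pop while the top binds tighter, or equally tight and the
               current operator is left-associative (everything except ^). */
            while (top >= 0 && ops[top] != '(' &&
                   (precedence(ops[top]) > precedence(c) ||
                    (precedence(ops[top]) == precedence(c) && c != '^'))) {
                out[n++] = ops[top--];
            }
            ops[++top] = c;
        }
        /* any other character, such as a space, is skipped */
    }
    while (top >= 0) out[n++] = ops[top--];       /* flush remaining operators */
    out[n] = '\0';
}
```

With this sketch, (A+B)*C becomes AB+C*, and A^B^C becomes ABC^^ because ^ is right-associative and is therefore not popped by another ^.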
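#include <stdio.h>
#define MAX_QUEUE_SIZE 100 // Assumed capacity; the value is not shown in the listing
int queue[MAX_QUEUE_SIZE];
int front = 0;
int rear = 0;
void enqueue(int value) { // Assumed counterpart to dequeue(); not shown in the listing
if (rear == MAX_QUEUE_SIZE) {
printf("Queue Overflow\n");
return;
}
queue[rear++] = value;
}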
int dequeue() {
if (front == rear) {
printf("Queue Underflow\n");
return -1; // Or some error value
}
return queue[front++];
}
int peekFront() {
if (front == rear) {
printf("Queue is empty\n");
return -1; // Or some error value
}
return queue[front];
}
int isEmptyQueue() {
return front == rear;
}
int isFullQueue() {
return rear == MAX_QUEUE_SIZE;
}
int main() {
enqueue(10);
enqueue(20);
enqueue(30);
return 0;
}
Circular Queue: A variation of the queue that overcomes the limitation of using
array-based queues where the front and rear pointers might reach the end of the
array even if there are empty slots at the beginning. In a circular queue, the pointers
wrap around.
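A minimal sketch of the wrap-around idea is shown below. It tracks the front index and the element count, which is one of several possible designs; the capacity and all names are assumptions chosen for illustration.

```c
#include <stdbool.h>

#define CQ_CAPACITY 5   /* assumed capacity for illustration */

/* Circular queue: front is the index of the oldest element and count is
   the number of elements currently stored. */
struct CircularQueue {
    int items[CQ_CAPACITY];
    int front;
    int count;
};

void cq_init(struct CircularQueue *q) {
    q->front = 0;
    q->count = 0;
}

/* Adds value at the rear; returns false if the queue is full. */
bool cq_enqueue(struct CircularQueue *q, int value) {
    if (q->count == CQ_CAPACITY) return false;
    int rear = (q->front + q->count) % CQ_CAPACITY;   /* wraps around */
    q->items[rear] = value;
    q->count++;
    return true;
}

/* Removes the front element into *value; returns false if the queue is empty. */
bool cq_dequeue(struct CircularQueue *q, int *value) {
    if (q->count == 0) return false;
    *value = q->items[q->front];
    q->front = (q->front + 1) % CQ_CAPACITY;          /* wraps around */
    q->count--;
    return true;
}
```

Keeping a count avoids the ambiguity between a full and an empty queue that arises when only front and rear indices are stored. After two dequeues on a full queue, two further enqueues succeed and reuse the slots at the start of the array.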
3. D-Queues (Double-Ended Queues)
A D-queue (Double-Ended Queue) is a linear data structure that allows insertion and
deletion of elements from both ends (front and rear). It's a generalization of both
stacks and queues.
Types of D-Queues:
• Input-Restricted D-Queue: Allows insertion at only one end (usually rear) but
deletion from both ends.
• Output-Restricted D-Queue: Allows deletion from only one end (usually
front) but insertion at both ends.
Primitive Operations on D-Queues:
• Insert at Front: Adds an element to the beginning of the D-queue.
• Insert at Rear: Adds an element to the end of the D-queue.
• Delete from Front: Removes an element from the beginning of the D-queue.
• Delete from Rear: Removes an element from the end of the D-queue.
• Peek Front: Gets the element at
3.1 Traversal
Visiting each node in the list exactly once, typically to process or display the data.
Algorithm:
Algorithm Traverse(head)
currentNode = head
while currentNode is not NULL:
process currentNode.data
currentNode = currentNode.next
end while
end Algorithm
C Language Program (Traversal):
C
#include <stdio.h>
#include <stdlib.h>
int main() {
struct Node* head = createNode(10);
head->next = createNode(20);
head->next->next = createNode(30);
traverseList(head);
return 0;
}
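The program above relies on a Node structure and on createNode and traverseList, none of which appear in the listing. The following sketch shows one way they might look; the exit-on-failure policy and the returned node count are assumptions added here.

```c
#include <stdio.h>
#include <stdlib.h>

struct Node {
    int data;
    struct Node *next;
};

/* Allocates a node holding data; stops the program if memory runs out. */
struct Node *createNode(int data) {
    struct Node *node = malloc(sizeof *node);
    if (node == NULL) {
        fprintf(stderr, "Out of memory\n");
        exit(EXIT_FAILURE);
    }
    node->data = data;
    node->next = NULL;
    return node;
}

/* Visits every node exactly once, printing its data, and returns the
   number of nodes visited (the return value is an addition for checking). */
int traverseList(const struct Node *head) {
    int visited = 0;
    for (const struct Node *current = head; current != NULL; current = current->next) {
        printf("%d -> ", current->data);
        visited++;
    }
    printf("NULL\n");
    return visited;
}
```

For the three-node list built in main, this version prints the values 10, 20 and 30 in order, followed by NULL.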
3.2 Insertion
Adding a new node to the linked list at different positions.
• Insertion at the Beginning: The new node becomes the new head.
o Algorithm:
o Algorithm InsertAtBeginning(head, newData)
o newNode = createNode(newData)
o newNode.next = head
o head = newNode
o return head
o end Algorithm
o C Language Program:
C
struct Node* insertAtBeginning(struct Node* head, int newData) {
struct Node* newNode = createNode(newData);
newNode->next = head;
return newNode;
}
// In main function:
// head = insertAtBeginning(head, 5);
• Insertion at the End: Traverse to the last node and make its next pointer
point to the new node.
o Algorithm:
o Algorithm InsertAtEnd(head, newData)
o newNode = createNode(newData)
o if head is NULL:
o head = newNode
o return head
o end if
o currentNode = head
o while currentNode.next is not NULL:
o currentNode = currentNode.next
o end while
o currentNode.next = newNode
o return head
o end Algorithm
o C Language Program:
C
struct Node* insertAtEnd(struct Node* head, int newData) {
struct Node* newNode = createNode(newData);
if (head == NULL) {
return newNode;
}
struct Node* current = head;
while (current->next != NULL) {
current = current->next;
}
current->next = newNode;
return head;
}
// In main function:
// head = insertAtEnd(head, 40);
• Insertion at a Specific Position (after a given node): Traverse to the node
after which the new node needs to be inserted.
o Algorithm:
o Algorithm InsertAfter(previousNode, newData)
o if previousNode is NULL:
o print "Previous node cannot be NULL"
o return
o end if
o newNode = createNode(newData)
o newNode.next = previousNode.next
o previousNode.next = newNode
o end Algorithm
o C Language Program:
C
void insertAfter(struct Node* prevNode, int newData) {
if (prevNode == NULL) {
printf("Previous node cannot be NULL\n");
return;
}
struct Node* newNode = createNode(newData);
newNode->next = prevNode->next;
prevNode->next = newNode;
}
// In main function (head->next points to the second node):
// insertAfter(head->next, 25);
3.3 Deletion
Removing a node from the linked list.
• Deletion at the Beginning: The head pointer is moved to the second node.
o Algorithm:
o Algorithm DeleteAtBeginning(head)
o if head is NULL:
o print "List is empty"
o return NULL
o end if
o temp = head
o head = head.next
o free(temp) // Release the memory of the deleted node
o return head
o end Algorithm
o C Language Program:
C
struct Node* deleteAtBeginning(struct Node* head) {
if (head == NULL) {
printf("List is empty\n");
return NULL;
}
struct Node* temp = head;
head = head->next;
free(temp);
return head;
}
// In main function:
// head = deleteAtBeginning(head);
• Deletion at the End: Traverse to the second-to-last node and set its next
pointer to NULL.
o Algorithm:
o Algorithm DeleteAtEnd(head)
o if head is NULL:
o print "List is empty"
o return NULL
o end if
o if head.next is NULL: // Only one node
o free(head)
o return NULL
// In main function:
// head = deleteAtEnd(head);
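The remaining steps of deletion at the end follow from the description: walk to the second-to-last node, free its successor, and set its next pointer to NULL. A possible C version is sketched below; the Node structure is repeated so that the block stands on its own.

```c
#include <stdio.h>
#include <stdlib.h>

struct Node {
    int data;
    struct Node *next;
};

/* Removes the last node and returns the head of the resulting list
   (NULL if the list becomes empty). */
struct Node *deleteAtEnd(struct Node *head) {
    if (head == NULL) {
        printf("List is empty\n");
        return NULL;
    }
    if (head->next == NULL) {                 /* only one node */
        free(head);
        return NULL;
    }
    struct Node *current = head;
    while (current->next->next != NULL) {     /* stop at the second-to-last node */
        current = current->next;
    }
    free(current->next);                      /* release the last node */
    current->next = NULL;
    return head;
}
```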
• Deletion at a Specific Position (of a given node): Traverse to the node
before the one to be deleted. Adjust the next pointer to skip the node to be
deleted.
o Algorithm:
o Algorithm DeleteNode(head, key) // Delete the first node with data equal to key
o if head is NULL:
o print "List is empty"
o return NULL
o end if
o if head.data == key:
o temp = head
o head = head.next
o free(temp)
o return head
o end if
o currentNode = head
o while currentNode.next is not NULL and currentNode.next.data != key:
o currentNode = currentNode.next
o end while
o if currentNode.next is NULL:
o print "Key not found"
o return head
o end if
o temp = currentNode.next
o currentNode.next = currentNode.next.next
o free(temp)
o return head
o end Algorithm
o C Language Program:
C
struct Node* deleteNode(struct Node* head, int key) {
if (head == NULL) {
printf("List is empty\n");
return NULL;
}
if (head->data == key) {
struct Node* temp = head;
head = head->next;
free(temp);
return head;
}
struct Node* current = head;
while (current->next != NULL && current->next->data != key) {
current = current->next;
}
if (current->next == NULL) {
printf("Key %d not found in the list\n", key);
return head;
}
struct Node* temp = current->next;
current->next = current->next->next;
free(temp);
return head;
}
// In main function:
// head = deleteNode(head, 20);
3.4 Searching
Finding a node with a specific data value in the list.
Algorithm:
Algorithm Search(head, key)
currentNode = head
while currentNode is not NULL:
if currentNode.data == key:
return currentNode // Return the node if found
end if
currentNode = currentNode.next
end while
return NULL // Key not found
end Algorithm
C Language Program (Searching):
C
struct Node* searchList(struct Node* head, int key) {
    struct Node* current = head;
    while (current != NULL) {
        if (current->data == key) {
            return current; // Found the node
        }
        current = current->next;
    }
    return NULL; // Key not found
}
// In main function:
// struct Node* foundNode = searchList(head, 20);
// if (foundNode != NULL) {
// printf("Node with data %d found\n", foundNode->data);
// } else {
// printf("Node with data not found\n");
// }
4. Two Way Lists (Doubly Linked Lists)
A doubly linked list is a variation where each node has three parts:
• Data: The value stored.
• Next Pointer: Points to the next node in the list.
• Previous Pointer: Points to the previous node in the list.
Diagram:
Head --> [ NULL | Data | Next ] <--> [ Prev | Data | Next ] <--> [ Prev | Data | NULL ]
The first node's Previous pointer is usually NULL, and the last node's Next pointer is
NULL.
Advantages over Singly Linked Lists:
• Backward Traversal: Allows traversal in both forward and backward
directions.
• Efficient Deletion: Deletion of a node is more efficient if the pointer to the
node to be deleted is known (no need to traverse to the previous node).
• Efficient Insertion Before a Node: Similar to deletion, insertion before a
given node is more efficient.
Operations on Doubly Linked Lists:
The insertion, deletion, and traversal operations are similar to singly linked lists but
need to handle the previous pointers as well. For example, when inserting a node,
you need to update the previous pointer of the new node and the previous pointer of
the node that comes after it. Similarly, during deletion, you need to update the
previous pointer of the node after the deleted node.
C Language Program (Doubly Linked List Node Structure):
C
struct DoublyNode {
int data;
struct DoublyNode *next;
struct DoublyNode *prev;
};
Implementing the operations (insertion, deletion, traversal, searching) for a doubly
linked list involves careful manipulation of both next and prev pointers.
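As an illustration of the pointer bookkeeping, the sketch below inserts at the front and deletes a node given only a pointer to it. The function names and the NULL-on-allocation-failure convention are assumptions, not part of the notes.

```c
#include <stdlib.h>

struct DoublyNode {
    int data;
    struct DoublyNode *next;
    struct DoublyNode *prev;
};

/* Inserts a new node before the current head and returns the new head,
   or NULL if allocation fails (the list itself is left unchanged). */
struct DoublyNode *insertFront(struct DoublyNode *head, int data) {
    struct DoublyNode *node = malloc(sizeof *node);
    if (node == NULL) return NULL;
    node->data = data;
    node->prev = NULL;
    node->next = head;
    if (head != NULL) head->prev = node;      /* old head now has a predecessor */
    return node;
}

/* Unlinks and frees target, which must belong to the list, and returns
   the head. No traversal is needed because target->prev is available. */
struct DoublyNode *deleteGiven(struct DoublyNode *head, struct DoublyNode *target) {
    if (head == NULL || target == NULL) return head;
    if (target->prev != NULL) target->prev->next = target->next;
    else head = target->next;                 /* target was the head */
    if (target->next != NULL) target->next->prev = target->prev;
    free(target);
    return head;
}
```

Deleting a known node takes constant time here, whereas a singly linked list would first have to search for the predecessor.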
5. Use of Headers (Header Nodes)
A header node (or dummy node) is an extra node added at the beginning of a linked
list. This node typically does not store any meaningful data.
Diagram (Singly Linked List with Header):
Head --> [ Dummy | Next ] --> [ Data | Next ] --> [ Data | NULL ]
Advantages of Using Header Nodes:
Algorithm Preorder(node)
if node is not NULL:
process node.data
Preorder(node.left)
Preorder(node.right)
end if
end Algorithm
Algorithm Postorder(node)
if node is not NULL:
Postorder(node.left)
Postorder(node.right)
process node.data
end if
end Algorithm
C Language Program (Recursive Traversal):
C
#include <stdio.h>
#include <stdlib.h>
struct TreeNode {
int data;
struct TreeNode *left;
struct TreeNode *right;
};
int main() {
struct TreeNode* root = createTreeNode(1);
root->left = createTreeNode(2);
root->right = createTreeNode(3);
root->left->left = createTreeNode(4);
root->left->right = createTreeNode(5);
return 0;
}
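The program above builds a tree but does not show createTreeNode or the traversal functions. The sketch below is one possible version in which "processing" a node means appending its data to an output array, an assumption that makes the visiting order easy to inspect. Inorder traversal (left subtree, node, right subtree) is included alongside preorder and postorder.

```c
#include <stdlib.h>

struct TreeNode {
    int data;
    struct TreeNode *left;
    struct TreeNode *right;
};

/* Allocates a leaf node; returns NULL if allocation fails. */
struct TreeNode *createTreeNode(int data) {
    struct TreeNode *node = malloc(sizeof *node);
    if (node != NULL) {
        node->data = data;
        node->left = NULL;
        node->right = NULL;
    }
    return node;
}

/* Each traversal appends node data to out[] and advances *count;
   out must be large enough to hold every node. */
void preorder(const struct TreeNode *node, int *out, int *count) {
    if (node == NULL) return;
    out[(*count)++] = node->data;             /* node, left, right */
    preorder(node->left, out, count);
    preorder(node->right, out, count);
}

void inorder(const struct TreeNode *node, int *out, int *count) {
    if (node == NULL) return;
    inorder(node->left, out, count);          /* left, node, right */
    out[(*count)++] = node->data;
    inorder(node->right, out, count);
}

void postorder(const struct TreeNode *node, int *out, int *count) {
    if (node == NULL) return;
    postorder(node->left, out, count);        /* left, right, node */
    postorder(node->right, out, count);
    out[(*count)++] = node->data;
}
```

For the five-node tree built in main, the visiting orders are 1 2 4 5 3 (preorder), 4 2 5 1 3 (inorder) and 4 5 2 3 1 (postorder).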
3.2 Recursive Algorithms for Other Tree Operations
• Insertion in a Binary Tree (Level Order): For a general binary tree (not
necessarily a Binary Search Tree), insertion is often done in a level-order
fashion to maintain completeness (if needed). This typically uses a queue and
is iterative.
• Insertion in a Binary Search Tree (BST): (Covered in Section 4)
• Deletion in a Binary Tree: Deletion in a general binary tree can be complex
to maintain structure. A common approach is to find the rightmost leaf node
and replace the node to be deleted with it, then delete the leaf. This is usually
implemented iteratively.
• Deletion in a Binary Search Tree (BST): (Covered in Section 4)
• Searching in a Binary Tree:
• Algorithm Search(node, key)
• if node is NULL:
• return false
• if node.data == key:
• return true
• return Search(node.left, key) or Search(node.right, key)
• end Algorithm
• Finding Height of a Binary Tree:
• Algorithm FindHeight(node)
• if node is NULL:
• return -1 // Height of an empty tree is -1
• leftHeight = FindHeight(node.left)
• rightHeight = FindHeight(node.right)
• return 1 + max(leftHeight, rightHeight)
• end Algorithm
C Language Program (Recursive Search and Height):
C
// (TreeNode structure from previous example)
int main() {
struct TreeNode* root = createTreeNode(1);
root->left = createTreeNode(2);
root->right = createTreeNode(3);
root->left->left = createTreeNode(4);
root->left->right = createTreeNode(5);
return 0;
}
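The Search and FindHeight algorithms translate almost line for line into C. The following sketch repeats the TreeNode structure so that it stands on its own; the lowercase function names are chosen here for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

struct TreeNode {
    int data;
    struct TreeNode *left;
    struct TreeNode *right;
};

/* Returns true if key occurs anywhere in the tree. A general binary tree
   has no ordering, so both subtrees may have to be searched. */
bool search(const struct TreeNode *node, int key) {
    if (node == NULL) return false;
    if (node->data == key) return true;
    return search(node->left, key) || search(node->right, key);
}

/* Returns the height in edges; an empty tree has height -1. */
int findHeight(const struct TreeNode *node) {
    if (node == NULL) return -1;
    int leftHeight = findHeight(node->left);
    int rightHeight = findHeight(node->right);
    return 1 + (leftHeight > rightHeight ? leftHeight : rightHeight);
}
```

For the tree built in main (root 1 with children 2 and 3, and 4 and 5 below 2), the height is 2.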
4. Binary Search Tree (BST)
A Binary Search Tree (BST) is a special type of binary tree that follows these
properties:
1. The left subtree of a node contains only nodes with keys less than the node's
key.
2. The right subtree of a node contains only nodes with keys greater than the
node's key.
3. Both the left and right subtrees must also be binary search trees.
4. There are no duplicate keys (typically enforced).
Diagram (Example BST):
        8
       / \
      3   10
     / \    \
    1   6    14
       / \   /
      4   7 13
4.1 Operations on Binary Search Trees
• Insertion: To insert a new key into a BST:
1. Start at the root.
2. If the key is less than the current node's key, move to the left child.
3. If the key is greater than the current node's key, move to the right child.
4. If the left or right child is NULL, insert the new node at that position.
5. If the key is equal to the current node's key (duplicates not allowed),
you can choose to do nothing or handle it based on your requirements.
Recursive Algorithm for Insertion in BST:
Algorithm InsertBST(node, key)
if node is NULL:
return createNewNode(key)
if key < node.data:
node.left = InsertBST(node.left, key)
else if key > node.data:
node.right = InsertBST(node.right, key)
return node
end Algorithm
• Searching: To search for a key in a BST:
1. Start at the root.
2. If the key is equal to the current node's key, the key is found.
3. If the key is less than the current node's key, search in the left subtree.
4. If the key is greater than the current node's key, search in the right
subtree.
5. If you reach a NULL node, the key is not in the BST.
Recursive Algorithm for Search in BST:
Algorithm SearchBST(node, key)
if node is NULL:
return false
if key == node.data:
return true
else if key < node.data:
return SearchBST(node.left, key)
else: // key > node.data
return SearchBST(node.right, key)
end Algorithm
• Deletion: Deleting a node from a BST involves several cases:
1. Node to be deleted is a leaf node: Simply remove it.
2. Node to be deleted has only one child: Replace the node with its
child.
3. Node to be deleted has two children:
§ Find either the inorder successor (smallest node in the right
subtree) or the inorder predecessor (largest node in the left
subtree).
§ Replace the data of the node to be deleted with the data of the
successor/predecessor.
§ Delete the successor/predecessor (which will now have at most
one child).
Recursive Algorithm for Deletion in BST:
Algorithm DeleteBST(node, key)
if node is NULL:
return NULL // Key not found
Algorithm findMin(node)
while node.left is not NULL:
node = node.left
return node
end Algorithm
C Language Program (BST Operations):
C
#include <stdio.h>
#include <stdlib.h>
int main() {
struct TreeNode* root = NULL;
root = insertBST(root, 8);
root = insertBST(root, 3);
root = insertBST(root, 10);
root = insertBST(root, 1);
root = insertBST(root, 6);
root = insertBST(root, 14);
root = insertBST(root, 4);
root = insertBST(root, 7);
root = insertBST(root, 13);
return 0;
}
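The program above calls insertBST without showing it, and the deletion algorithm is only partly visible. The sketch below gives one possible implementation of insertion, search and deletion that follows the cases described in this section, using the inorder successor for the two-children case. Allocation-failure handling is an assumption added here.

```c
#include <stdlib.h>

struct TreeNode {
    int data;
    struct TreeNode *left;
    struct TreeNode *right;
};

/* Inserts key and returns the subtree root; duplicates are ignored.
   If allocation fails, the tree is returned unchanged. */
struct TreeNode *insertBST(struct TreeNode *node, int key) {
    if (node == NULL) {
        struct TreeNode *fresh = malloc(sizeof *fresh);
        if (fresh == NULL) return NULL;
        fresh->data = key;
        fresh->left = NULL;
        fresh->right = NULL;
        return fresh;
    }
    if (key < node->data) node->left = insertBST(node->left, key);
    else if (key > node->data) node->right = insertBST(node->right, key);
    return node;
}

/* Returns 1 if key is in the tree, 0 otherwise. */
int searchBST(const struct TreeNode *node, int key) {
    if (node == NULL) return 0;
    if (key == node->data) return 1;
    return key < node->data ? searchBST(node->left, key)
                            : searchBST(node->right, key);
}

/* Returns the node with the smallest key in a non-empty subtree. */
struct TreeNode *findMin(struct TreeNode *node) {
    while (node->left != NULL) node = node->left;
    return node;
}

/* Deletes key (if present) and returns the subtree root. */
struct TreeNode *deleteBST(struct TreeNode *node, int key) {
    if (node == NULL) return NULL;                    /* key not found */
    if (key < node->data) {
        node->left = deleteBST(node->left, key);
    } else if (key > node->data) {
        node->right = deleteBST(node->right, key);
    } else if (node->left == NULL || node->right == NULL) {
        /* Leaf or one child: replace the node with its child (maybe NULL). */
        struct TreeNode *child = node->left != NULL ? node->left : node->right;
        free(node);
        return child;
    } else {
        /* Two children: copy the inorder successor, then delete it. */
        struct TreeNode *successor = findMin(node->right);
        node->data = successor->data;
        node->right = deleteBST(node->right, successor->data);
    }
    return node;
}
```

In the example tree, deleting 3 (which has two children) replaces it with its inorder successor 4, and deleting 14 (which has one child) links 13 directly under 10.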
Binary Search Trees
Example: A balanced binary search tree with N keys has a height of approximately
log2(N). If N is very large (e.g., billions of records), the number of disk accesses for a
single search could be in the order of 30 or more, which is unacceptable for efficient
database operations.
5. A Better Approach to Tree Indexes: Multi-Way Trees
To reduce the height of the tree, we need a tree where each node can have multiple
children. This is the fundamental idea behind multi-way trees. If each node can hold
multiple keys and have multiple children, then the branching factor of the tree
increases significantly, leading to a much smaller height for the same number of
keys.
Example: Consider a balanced m-way tree (each internal node has between m/2
and m children). The height of such a tree with N keys is at most about log_{m/2}(N),
the logarithm of N to the base m/2. If m is large (e.g., the number of keys that can
fit into a disk block), the height becomes much smaller than that of a binary search tree.
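The effect of the branching factor can be made concrete with a small calculation. The sketch below counts how many levels are needed before branching^levels reaches N, which approximates the logarithm used above; the function name is hypothetical and branching is assumed to be at least 2.

```c
/* Returns the smallest number of levels h such that branching^h >= n,
   an integer approximation of the logarithm of n to the base `branching`.
   Assumes branching >= 2 and that the products fit in 64 bits. */
int levels_needed(unsigned long long n, unsigned long long branching) {
    int levels = 0;
    unsigned long long reach = 1;   /* branching^levels */
    while (reach < n) {
        reach *= branching;
        levels++;
    }
    return levels;
}
```

For one billion keys, a branching factor of 2 needs 30 levels, while a branching factor of 100 (for example m = 200, so m/2 = 100) needs only 5.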
6. B-Trees
A B-tree is a self-balancing multi-way search tree. It is specifically designed for disk-
oriented data structures. A B-tree of order m (where m>1) has the following
properties:
1. Root Property: The root has at least 2 children unless it is a leaf node.
2. Node Size Property: Every internal node (except the root) has between ⌈m/2⌉ and m
children.
3. Number of Keys: Every internal node with c children contains c−1 keys.
4. Key Ordering Property: Within each node, the keys are stored in sorted
order.
5. Subtree Ordering Property: The keys in the subtree rooted at the i-th child of a
node are all greater than the (i−1)-th key of that node and less than the i-th
key (where those keys exist).
6. Leaf Property: All leaf nodes are at the same level.
7. Leaf Structure: Leaf nodes do not have children and contain between
⌈m/2⌉−1 and m−1 keys (a root that is itself a leaf may hold fewer). They might
also contain pointers to the actual data records if the B-tree is used as an index.
Order of the B-Tree (m): The order m determines the maximum number of children
a node can have. The choice of m is typically based on the block size of the disk. We
want each node to fit within one disk block to minimize the number of disk reads per
tree level.
Structure of a B-Tree Node:
A typical internal node in a B-tree of order m looks like this:
P1 K1 P2 K2 P3 ... Km-1 Pm
Where:
• Ki are the keys, sorted in increasing order (K1<K2<...<Km−1).
• Pi are pointers to the children of the node.
• The subtree pointed to by P1 contains keys less than K1.
• The subtree pointed to by Pi (for 1<i<m) contains keys greater than Ki−1
and less than Ki.
• The subtree pointed to by Pm contains keys greater than Km−1.
A leaf node looks like this:
K1 K2 K3 ... Kl
Where l is the number of keys in the leaf node (⌈m/2⌉−1≤l≤m−1). These keys might
be associated with pointers to the actual data records.
7. Working Up from the Bottom: Insertion in B-Trees
B-trees are often visualized as growing upwards from the leaves. When a new key is
inserted, it is initially placed in a leaf node. If the leaf node then holds m keys
(one more than the maximum of m−1), it is split into two nodes. The middle key is
promoted to the parent node, along with pointers to the two new leaf nodes. This
process can propagate up the tree. If the root node also overflows and splits, the
height of the tree increases by one.
Insertion Algorithm (Conceptual):
1. Find the appropriate leaf node: Traverse the B-tree starting from the root to
find the leaf node where the new key should be inserted (based on the sorted
order of keys).
2. Insert the key: Insert the new key into the leaf node, maintaining the sorted
order.
3. Check for overflow: If the leaf node now contains m keys (one more than the
maximum), it has overflowed and needs to be split.
4. Split the overflowing node:
o The median key of the m keys is promoted to the parent node.
o The keys smaller than the median are kept in the left new leaf node.
o The keys larger than the median are kept in the right new leaf node.
o Pointers to the two new leaf nodes are inserted into the parent node at
the position of the promoted key.
5. Handle parent overflow: If the parent node also overflows due to the
insertion of the new key and pointers, the split process is repeated up the
tree.
6. Splitting the root: If the root node splits, a new root node is created with the
median key and pointers to the two new child nodes, increasing the height of
the tree by one.
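The key-handling part of the split step can be sketched in isolation. The function below is a hypothetical helper that only divides the keys of an overflowing node; child pointers and the update of the parent are omitted for brevity. For an even number of keys it picks the lower of the two middle keys, which is one possible convention.

```c
/* Splits an overflowing node whose `count` sorted keys are in keys[].
   The median key is stored in *median (to be promoted to the parent),
   the smaller keys are copied to left[] and the larger keys to right[].
   *left_count and *right_count receive the sizes of the two halves.
   left[] and right[] must each have room for count - 1 keys. */
void split_keys(const int *keys, int count,
                int *left, int *left_count,
                int *median,
                int *right, int *right_count) {
    int mid = (count - 1) / 2;                /* index of the median key */
    *median = keys[mid];
    *left_count = mid;
    *right_count = count - mid - 1;
    for (int i = 0; i < mid; i++) {
        left[i] = keys[i];
    }
    for (int i = mid + 1; i < count; i++) {
        right[i - mid - 1] = keys[i];
    }
}
```

For a B-tree of order 3, an overflowing node [5, 10, 15] splits into a left node [5], a right node [15], and the promoted key 10.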
8. Example for Creating a B-Tree
Let's create a B-tree of order m=3. This means each internal node (except the root)
can have between ⌈3/2⌉=2 and 3 children, and each node (except the root) holds
between ⌈3/2⌉−1=1 and 3−1=2 keys. A node that temporarily holds 3 keys has
overflowed and must be split.
Insert the following keys in order: 10, 20, 30, 5, 15, 25, 35, 40, 45.
1. Insert 10:
[10] (Leaf/Root)
2. Insert 20:
[10, 20] (Leaf/Root)
3. Insert 30: The root leaf temporarily holds [10, 20, 30], one key more than the
maximum of 2. Split it. The middle key 20 becomes the new root.
      [20] (Root)
     /    \
  [10]    [30] (Leaves)
4. Insert 5: Insert into the left leaf.
      [20]
     /    \
[5, 10]   [30]
5. Insert 15: The left leaf temporarily holds [5, 10, 15]. Split it. The middle key
is 10. Promote 10 to the parent.
    [10, 20] (Root)
   /    |    \
[5]   [15]   [30] (Leaves)
6. Insert 25: Insert into the right leaf.
    [10, 20]
   /    |    \
[5]   [15]   [25, 30]
7. Insert 35: The right leaf temporarily holds [25, 30, 35]. Split it and promote
the middle key 30. The root then temporarily holds [10, 20, 30], which is also an
overflow, so the root splits as well: 20 moves up into a new root and the height of
the tree increases by one.
         [20] (New Root)
        /    \
    [10]      [30]
    /  \      /  \
 [5]  [15] [25]  [35] (Leaves)
8. Insert 40: Insert into the rightmost leaf.
         [20]
        /    \
    [10]      [30]
    /  \      /  \
 [5]  [15] [25]  [35, 40]
9. Insert 45: The rightmost leaf temporarily holds [35, 40, 45]. Split it and
promote the middle key 40 to its parent, which becomes [30, 40] and does not
overflow.
         [20]
        /    \
    [10]      [30, 40]
    /  \      /   |   \
 [5]  [15] [25] [35]  [45]
In the final tree every leaf is at the same level and every node holds 1 or 2 keys,
so all B-tree properties for m=3 are satisfied.
[ , 5, 4, 2, 8]
[1, 5, 4, 2, 8]
[1, 4, 5, 2, 8]
[1, 2, 4, 5, 8]
[1, 2, 4, 5, 8]
int main() {
int arr[] = {5, 1, 4, 2, 8};
int n = sizeof(arr) / sizeof(arr[0]);
insertionSort(arr, n);
return 0;
}
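The insertionSort function called above is not shown in the listing. A standard version is sketched here: each element is taken in turn and shifted left past every larger element in the already sorted prefix.

```c
/* Sorts arr[0..n-1] in ascending order using insertion sort. */
void insertionSort(int arr[], int n) {
    for (int i = 1; i < n; i++) {
        int key = arr[i];            /* element to place into the sorted prefix */
        int j = i - 1;
        while (j >= 0 && arr[j] > key) {
            arr[j + 1] = arr[j];     /* shift larger elements one step right */
            j--;
        }
        arr[j + 1] = key;
    }
}
```

Applied to {5, 1, 4, 2, 8}, the passes produce [1, 5, 4, 2, 8], [1, 4, 5, 2, 8], [1, 2, 4, 5, 8] and finally [1, 2, 4, 5, 8], matching the trace above.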
Iteration 1 (i = 0):
[1, 4, 6, 5, 3]
Iteration 2 (i = 1):
[1, 3, 6, 5, 4]
Iteration 3 (i = 2):
[1, 3, 4, 5, 6]
Iteration 4 (i = 3):
[1, 3, 4, 5, 6]
int main() {
int arr[] = {6, 4, 1, 5, 3};
int n = sizeof(arr) / sizeof(arr[0]);
selectionSort(arr, n);
return 0;
}
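The selectionSort function called above is not shown in the listing. A standard version is sketched here: in iteration i, the smallest element of the unsorted part arr[i..n-1] is swapped into position i.

```c
/* Sorts arr[0..n-1] in ascending order using selection sort. */
void selectionSort(int arr[], int n) {
    for (int i = 0; i < n - 1; i++) {
        int minIndex = i;
        for (int j = i + 1; j < n; j++) {
            if (arr[j] < arr[minIndex]) {
                minIndex = j;        /* remember the smallest element so far */
            }
        }
        if (minIndex != i) {
            int temp = arr[i];       /* swap it into position i */
            arr[i] = arr[minIndex];
            arr[minIndex] = temp;
        }
    }
}
```

For {6, 4, 1, 5, 3}, the first iteration swaps 1 with 6 to give [1, 4, 6, 5, 3], in line with the trace above.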
Merge (pairs):
Merge (final):
[0, 1, 2, 3, 7, 8, 10]
C Language Program:
C
#include <stdio.h>
#include <stdlib.h>
merge(arr, l, m, r);
}
}
int main() {
int arr[] = {8, 3, 1, 7, 0, 10, 2};
int n = sizeof(arr) / sizeof(arr[0]);
mergeSort(arr, 0, n - 1);
return 0;
}
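Only the tail of mergeSort is visible in the listing above, and merge is not shown. The sketch below is one possible complete version using the same signatures, merge(arr, l, m, r) and mergeSort(arr, l, r); the temporary buffer and the exit on allocation failure are assumptions.

```c
#include <stdlib.h>
#include <string.h>

/* Merges the sorted halves arr[l..m] and arr[m+1..r] into sorted order. */
void merge(int arr[], int l, int m, int r) {
    int n = r - l + 1;
    int *tmp = malloc(n * sizeof *tmp);       /* auxiliary space, O(n) */
    if (tmp == NULL) exit(EXIT_FAILURE);
    int i = l, j = m + 1, k = 0;
    while (i <= m && j <= r) {
        /* <= keeps equal elements in their original order (stable). */
        tmp[k++] = (arr[i] <= arr[j]) ? arr[i++] : arr[j++];
    }
    while (i <= m) tmp[k++] = arr[i++];       /* leftovers from the left half */
    while (j <= r) tmp[k++] = arr[j++];       /* leftovers from the right half */
    memcpy(&arr[l], tmp, n * sizeof *tmp);
    free(tmp);
}

/* Sorts arr[l..r] in ascending order using merge sort. */
void mergeSort(int arr[], int l, int r) {
    if (l < r) {
        int m = l + (r - l) / 2;              /* avoids overflow of l + r */
        mergeSort(arr, l, m);
        mergeSort(arr, m + 1, r);
        merge(arr, l, m, r);
    }
}
```

Sorting {8, 3, 1, 7, 0, 10, 2} with mergeSort(arr, 0, 6) gives [0, 1, 2, 3, 7, 8, 10].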
Time Complexity:
• Best Case: O(n log n)
• Average Case: O(n log n)
• Worst Case: O(n log n)
Space Complexity: O(n) (due to the auxiliary space used in the merge operation)
1.4 Heap Sort
Heap sort is a comparison-based sorting algorithm that uses a binary heap data
structure. It is similar to selection sort: we first find the maximum element and
place it at the end, then repeat the same process for the remaining elements.
Algorithm:
1. Build a max heap from the input data.
2. Sort: While the size of the heap is greater than 1:
o Swap the root (maximum element) with the last element of the heap.
o Decrease the heap size by 1.
o Heapify the root of the remaining heap.
Diagram:
Let's sort the array [4, 10, 3, 5, 1]
1. Build Max Heap: [4, 10, 3, 5, 1] becomes [10, 5, 3, 4, 1].
2. Sorting:
o Swap root (10) with last (1): [1, 5, 3, 4, 10], heap size = 4.
o Heapify root: [5, 4, 3, 1, 10]
o Swap root (5) with last (1): [1, 4, 3, 5, 10], heap size = 3.
o Heapify root: [4, 1, 3, 5, 10]
o Swap root (4) with last (3): [3, 1, 4, 5, 10], heap size = 2.
o Heapify root: [3, 1, 4, 5, 10] (no change needed)
o Swap root (3) with last (1): [1, 3, 4, 5, 10], heap size = 1.
o Heap size is 1, sorting complete.
int main() {
int arr[] = {4, 10, 3, 5, 1};
int n = sizeof(arr) / sizeof(arr[0]);
heapSort(arr, n);
return 0;
}
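The heapSort function called above is not shown in the listing. The sketch below follows the two phases of the algorithm, building a max heap and then repeatedly moving the root to the end; the helper names swap_ints and heapify are assumptions.

```c
/* Exchanges two integers. */
static void swap_ints(int *a, int *b) {
    int temp = *a;
    *a = *b;
    *b = temp;
}

/* Restores the max-heap property for the subtree rooted at index i,
   considering only the first n elements of arr. */
void heapify(int arr[], int n, int i) {
    for (;;) {
        int largest = i;
        int left = 2 * i + 1;
        int right = 2 * i + 2;
        if (left < n && arr[left] > arr[largest]) largest = left;
        if (right < n && arr[right] > arr[largest]) largest = right;
        if (largest == i) return;             /* heap property holds here */
        swap_ints(&arr[i], &arr[largest]);
        i = largest;                          /* continue sifting down */
    }
}

/* Sorts arr[0..n-1] in ascending order using heap sort. */
void heapSort(int arr[], int n) {
    /* Phase 1: build a max heap, starting from the last internal node. */
    for (int i = n / 2 - 1; i >= 0; i--) {
        heapify(arr, n, i);
    }
    /* Phase 2: move the current maximum to the end and shrink the heap. */
    for (int size = n - 1; size > 0; size--) {
        swap_ints(&arr[0], &arr[size]);
        heapify(arr, size, 0);
    }
}
```

For [4, 10, 3, 5, 1], the build phase yields the max heap [10, 5, 3, 4, 1] and the sorted result is [1, 3, 4, 5, 10].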