
DATA STRUCTURES

LECTURE NOTES
UNIT-I
INTRODUCTION TO ALGORITHMS AND DATA STRUCTURES

Definition: - An algorithm is a step-by-step process to solve a problem, where each step indicates an intermediate task. An algorithm contains a finite number of steps that lead to the solution of the problem.
Properties/Characteristics of an Algorithm:- An algorithm has the following basic properties:
● Input-Output:- An algorithm takes zero or more inputs and produces the required output. This is the basic characteristic of an algorithm.
● Finiteness:- An algorithm must terminate in a finite (countable) number of steps.
● Definiteness: Each step of an algorithm must be stated clearly and unambiguously.
● Effectiveness: Each and every step in an algorithm can be converted into a programming language statement.
● Generality: An algorithm is generalized; it works on all sets of inputs and provides the required output. In other words, it is not restricted to a single input value.

Performance Analysis of an Algorithm:

The efficiency of an algorithm can be measured by the following metrics:
i. Time Complexity and
ii. Space Complexity.
i. Time Complexity:
The amount of time required for an algorithm to complete its execution is its time complexity. An algorithm is said to be efficient if it takes the minimum (reasonable) amount of time to complete its execution.
ii. Space Complexity:
The amount of space occupied by an algorithm is known as its space complexity. An algorithm is said to be efficient if it occupies less space and requires the minimum amount of time to complete its execution.
ASYMPTOTIC NOTATIONS
Asymptotic analysis of an algorithm refers to defining the mathematical bounds of its run-time performance. Using asymptotic analysis, we can conclude the best case, average case, and worst case scenario of an algorithm.
Asymptotic analysis is input bound, i.e., if there is no input to the algorithm, it is concluded to work in constant time. Other than the "input", all other factors are considered constant.
Asymptotic analysis refers to computing the running time of any operation in mathematical units of computation. For example, the running time of one operation may be computed as f(n) and for another operation it may be computed as g(n²). This means the running time of the first operation will increase linearly with the increase in n, while the running time of the second operation will increase quadratically as n increases. Similarly, the running times of both operations will be nearly the same if n is significantly small.

The time required by an algorithm falls under three types −


● Best Case − Minimum time required for program execution.
● Average Case − Average time required for program execution.
● Worst Case − Maximum time required for program execution.
Asymptotic Notations
Following are the commonly used asymptotic notations to calculate the running time
complexity of an algorithm.
● Ο Notation
● Ω Notation
● θ Notation
Big Oh Notation, Ο
The notation Ο(n) is the formal way to express the upper bound of an algorithm's running
time. It measures the worst case time complexity or the longest amount of time an algorithm
can possibly take to complete.

For example, a function f(n) is said to be O(g(n)) if there exist constants c > 0 and n0 such that f(n) ≤ c·g(n) for all n ≥ n0.
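As a small worked example (not part of the original notes), consider f(n) = 3n + 2. Choosing g(n) = n, c = 4 and n0 = 2, we have 3n + 2 ≤ 4n for all n ≥ 2, so f(n) is O(n). Similarly, 5n² + 3n + 1 ≤ 9n² for all n ≥ 1, so 5n² + 3n + 1 is O(n²).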
DATA STRUCTURES
Data may be organized in many different ways. The logical or mathematical model of a particular organization of data is called a "Data Structure".
Or
The organized collection of data is called a 'Data Structure'.

Data structures involve two complementary goals. The first goal is to identify and develop useful mathematical entities and operations and to determine what classes of problems can be solved by using these entities and operations. The second goal is to determine representations for those abstract entities and to implement the abstract operations on these concrete representations.

Primitive data structures are directly supported by the language, i.e., operations are performed directly on these data items.
Ex: integers, characters, real numbers, etc.
Non-primitive data types are not defined by the programming language, but are instead created by the programmer.
Linear data structures organize their data elements in a linear fashion, where data
elements are attached one after the other. Linear data structures are very easy to implement,
since the memory of the computer is also organized in a linear fashion. Some commonly used
linear data structures are arrays, linked lists, stacks and queues.
In nonlinear data structures, data elements are not organized in a sequential fashion.
Data structures like multidimensional arrays, trees, graphs, tables and sets are some examples
of widely used nonlinear data structures.
Operations on the Data Structures:
Following operations can be performed on the data structures:
1. Traversing
2. Searching
3. Inserting
4. Deleting
5. Sorting
6. Merging
1. Traversing- It is used to access each data item exactly once so that it can be processed.
2. Searching- It is used to find out the location of the data item if it exists in the given
collection of data items.
3. Inserting- It is used to add a new data item in the given collection of data items.
4. Deleting- It is used to delete an existing data item from the given collection of data items.
5. Sorting- It is used to arrange the data items in some order, i.e., in ascending or descending order in the case of numerical data and in dictionary order in the case of alphanumeric data.
6. Merging- It is used to combine the data items of two sorted files into a single file in sorted form.
STACKS AND QUEUES
STACKS
A stack is a linear data structure. A stack is a list of elements in which an element may be inserted or deleted only at one end, called the top of the stack. The stack principle is LIFO (last in, first out): the element inserted last onto the stack is the element deleted first from the stack.

As items can be added or removed only from the top, the last item to be added to a stack is the first item to be removed.

Real-life examples of stacks include a stack of plates and a pile of books.

Operations on stack:

The two basic operations associated with stacks are:


1. Push
2. Pop

While performing push and pop operations, the following tests must be conducted on the stack:
a) whether the stack is empty b) whether the stack is full

1. Push: The push operation is used to add new elements to the stack. At the time of addition, first check whether the stack is full. If the stack is full, it generates an error message "stack overflow".

2. Pop: The pop operation is used to delete elements from the stack. At the time of deletion, first check whether the stack is empty. If the stack is empty, it generates an error message "stack underflow".

All insertions and deletions take place at the same end, so the last element added to the
stack will be the first element removed from the stack. When a stack is created, the stack base
remains fixed while the stack top changes as elements are added and removed. The most
accessible element is the top and the least accessible element is the bottom of the stack.
Representation of Stack (or) Implementation of stack:
The stack should be represented in two ways:
1. Stack using array
2. Stack using linked list

1. Stack using array:


Let us consider a stack with 6 elements capacity. This is called as the size of the stack. The
number of elements to be added should not exceed the maximum size of the stack. If we
attempt to add new element beyond the maximum size, we will encounter a stack overflow
condition. Similarly, you cannot remove elements beyond the base of the stack. If such is the
case, we will reach a stack underflow condition.

1. push():When an element is added to a stack, the operation is performed by push().


Below Figure shows the creation of a stack and addition of elements using push().

Initially top = -1. To insert an element into the stack, we first check whether the stack is full, i.e., top >= size-1. If it is not full, we increment the top value (top = top + 1) and then add the element to the stack.
Algorithm: Procedure for push():
Step 1: START
Step 2: if top >= size-1 then
            write "Stack is Overflow"
Step 3: otherwise
        3.1: read data value 'x'
        3.2: top = top + 1
        3.3: stack[top] = x
Step 4: END

void push()
{
    int x;
    if(top >= n-1)
    {
        printf("\n\nStack Overflow..");
        return;
    }
    else
    {
        printf("\n\nEnter data: ");
        scanf("%d", &x);
        top = top + 1;      /* increment top first, since top starts at -1 */
        stack[top] = x;
        printf("\n\nData Pushed into the stack");
    }
}
2. Pop(): When an element is taken off from the stack, the operation is performed by
pop(). Below figure shows a stack initially with three elements and shows the deletion of
elements using pop().

To delete an element from the stack, we first check whether the stack is empty, i.e., top == -1. If it is not empty, we remove the top element and decrement the top value, i.e., top = top - 1.

Algorithm: Procedure for pop():
Step 1: START
Step 2: if top == -1 then
            write "Stack is Underflow"
Step 3: otherwise
        3.1: print the deleted element stack[top]
        3.2: top = top - 1
Step 4: END

void pop()
{
    if(top == -1)
    {
        printf("Stack is Underflow");
    }
    else
    {
        printf("Deleted data %d", stack[top]);
        top = top - 1;
    }
}

3. display(): This operation displays the elements in the stack. To display the elements, we first check whether the stack is empty, i.e., top == -1. If it is not empty, we display the list of elements from the top down to the bottom of the stack.

Algorithm: Procedure for display():
Step 1: START
Step 2: if top == -1 then
            write "Stack is Underflow"
Step 3: otherwise
        3.1: print "Display elements are:"
        3.2: for i = top down to 0
                 print stack[i]
Step 4: END

void display()
{
    int i;
    if(top == -1)
    {
        printf("Stack is Underflow");
    }
    else
    {
        printf("Display elements are:");
        for(i = top; i >= 0; i--)
            printf("%d ", stack[i]);
    }
}

2. Stack using Linked List:


We can represent a stack as a linked list. In a stack, push and pop operations are performed at one end called the top. We can perform similar operations at one end of a list using a top pointer. The linked stack looks as shown in the figure.
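A minimal sketch of such a linked stack in C is given below, assuming an integer data field; the node and function names are illustrative and not taken from the notes.

#include <stdio.h>
#include <stdlib.h>

struct node
{
    int data;
    struct node *next;
};
struct node *top = NULL;              /* top of the linked stack */

/* push: insert a new node at the front of the list (the top of the stack) */
void push(int x)
{
    struct node *n = (struct node *)malloc(sizeof(struct node));
    n->data = x;
    n->next = top;                    /* new node points to the old top */
    top = n;                          /* new node becomes the top */
}

/* pop: delete the node at the front of the list and return its data */
int pop(void)
{
    struct node *t;
    int x;
    if(top == NULL)
    {
        printf("Stack is Underflow\n");
        return -1;
    }
    t = top;
    x = t->data;
    top = top->next;                  /* move top to the next node */
    free(t);
    return x;
}

Unlike the array implementation, a linked stack never reports overflow as long as memory is available.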

Applications of stack:
1. Stack is used by compilers to check for balancing of parentheses, brackets and braces.
2. Stack is used to evaluate a postfix expression.
3. Stack is used to convert an infix expression into postfix/prefix form.
4. In recursion, all intermediate arguments and return values are stored on the processor’s
stack.
5. During a function call the return address and arguments are pushed onto a stack and on
return they are popped off.
Converting and evaluating Algebraic expressions:

An algebraic expression is a legal combination of operators and operands. Operand is the


quantity on which a mathematical operation is performed. Operand may be a variable like x, y, z
or a constant like 5, 4, 6 etc. Operator is a symbol which signifies a mathematical or logical
operation between the operands. Examples of familiar operators include +, -, *, /, ^ etc.

An algebraic expression can be represented using three different notations. They are infix,
postfix and prefix notations:
Infix: It is the form of an arithmetic expression in which we fix (place) the arithmetic operator in
between the two operands.
Example: A + B
Prefix: It is the form of an arithmetic notation in which we fix (place) the arithmetic operator before (pre) its two operands. Prefix notation is also called Polish notation.
Example: + A B

Postfix: It is the form of an arithmetic expression in which we fix (place) the arithmetic operator after (post) its two operands. Postfix notation is also called suffix notation, and is also referred to as reverse Polish notation.
Example: A B +

Conversion from infix to postfix:


Procedure to convert from infix expression to postfix expression is as follows:
1. Scan the infix expression from left to right.
2. a) If the scanned symbol is left parenthesis, push it onto the stack.
b) If the scanned symbol is an operand, then place directly in the postfix expression
(output).
c) If the symbol scanned is a right parenthesis, then go on popping all the items from the
stack and place them in the postfix expression till we get the matching left parenthesis.
d) If the scanned symbol is an operator, then pop operators from the stack and place them in the postfix expression as long as the precedence of the operator on top of the stack is greater than (or greater than or equal to) the precedence of the scanned operator, and then push the scanned operator onto the stack.
The three important features of postfix expression are:
1. The operands maintain the same order as in the equivalent infix expression.
2. The parentheses are not needed to designate the expression unambiguously.
3. While evaluating the postfix expression the priority of the operators is no longer relevant.

We consider five binary operations: +, -, *, / and $ or ↑ (exponentiation). For these binary operations, the order of precedence (highest to lowest) is: exponentiation, then * and /, then + and -.
Evaluation of postfix expression:
The postfix expression is evaluated easily by the use of a stack.
1. When a number is seen, it is pushed onto the stack;
2. When an operator is seen, the operator is applied to the two numbers that are popped
from the stack and the result is pushed onto the stack.
3. When an expression is given in postfix notation, there is no need to know any
precedence rules; this is our obvious advantage.
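As a brief sketch (not from the original notes) of these evaluation steps, the following C function evaluates a postfix expression of single-digit operands using an array stack; the names evalPostfix, s and sp are illustrative.

#include <stdio.h>

/* evaluate a postfix expression containing single-digit operands
   and the binary operators +, -, * and / */
int evalPostfix(const char *exp)
{
    int s[50], sp = -1;               /* operand stack and its top index */
    int a, b, i;
    for(i = 0; exp[i] != '\0'; i++)
    {
        char ch = exp[i];
        if(ch >= '0' && ch <= '9')
            s[++sp] = ch - '0';       /* operand: push its numeric value */
        else
        {
            b = s[sp--];              /* pop the right operand first */
            a = s[sp--];              /* then the left operand */
            switch(ch)
            {
                case '+': s[++sp] = a + b; break;
                case '-': s[++sp] = a - b; break;
                case '*': s[++sp] = a * b; break;
                case '/': s[++sp] = a / b; break;
            }
        }
    }
    return s[sp];                     /* the final result is on top of the stack */
}

/* example: "53+82-*" evaluates to (5+3)*(8-2) = 48 */
int main(void)
{
    printf("%d\n", evalPostfix("53+82-*"));
    return 0;
}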
QUEUE
A queue is a linear data structure and a collection of elements. A queue is another special kind of list, where items are inserted at one end, called the rear, and deleted at the other end, called the front. The principle of a queue is "FIFO" or "First-in, first-out".
Queue is an abstract data structure. A queue is a useful data structure in programming. It is
similar to the ticket queue outside a cinema hall, where the first person entering the queue is
the first person who gets the ticket.
A real-world example of a queue is a single-lane one-way road, where the vehicle that enters first exits first.

More real-world examples can be seen as queues at the ticket windows and bus-stops and our
college library.

The operations for a queue are analogous to those for a stack; the difference is that insertions go at the end of the list, rather than the beginning.
Operations on QUEUE:
A queue is an object or more specifically an abstract data structure (ADT) that allows the
following operations:
● Enqueue or insertion: which inserts an element at the end of the queue.
● Dequeue or deletion: which deletes an element at the start of the queue.
Queue operations work as follows:
1. Two pointers called FRONT and REAR are used to keep track of the first and last
elements in the queue.
2. When initializing the queue, we set the value of FRONT and REAR to 0.
3. On enqueueing an element, we increase the value of the REAR index and place the new element in the position pointed to by REAR.
4. On dequeueing an element, we return the value pointed to by FRONT and increase the FRONT index.
5. Before enqueueing, we check if the queue is already full.
6. Before dequeueing, we check if the queue is already empty.
7. When enqueueing the first element, we set the value of FRONT to 1.
8. When dequeueing the last element, we reset the values of FRONT and REAR to 0.

Representation of Queue (or) Implementation of Queue:


The queue can be represented in two ways:
1. Queue using Array
2. Queue using Linked List
1. Queue using Array:
Let us consider a queue, which can hold maximum of five elements. Initially the queue is empty.
Now, insert 11 to the queue. Then queue status will be:

Next, insert 22 to the queue. Then the queue status is:

Again insert another element 33 to the queue. The status of the queue is:

Now, delete an element. The element deleted is the element at the front of the queue. So the status of the queue is:

Again, delete an element. The element to be deleted is always pointed to by the FRONT
pointer. So, 22 is deleted. The queue status is as follows:

Now, insert new elements 44 and 55 into the queue. The queue status is:
Next insert another element, say 66 to the queue. We cannot insert 66 to the queue as the rear
crossed the maximum size of the queue (i.e., 5). There will be queue full signal. The queue
status is as follows:

Now it is not possible to insert an element 66 even though there are two vacant positions in the
linear queue. To overcome this problem the elements of the queue are to be shifted towards
the beginning of the queue so that it creates vacant position at the rear end. Then the FRONT
and REAR are to be adjusted properly. The element 66 can be inserted at the rear end. After this
operation, the queue status is as follows:

This difficulty can be overcome if we treat the queue position with index 0 as a position that comes after the position with index 4, i.e., we treat the queue as a circular queue.

Queue operations using array:


a. enqueue() or insertion(): which inserts an element at the end of the queue.

Algorithm: Procedure for insertion():
Step-1: START
Step-2: if rear == max then
            write "Queue is full"
Step-3: otherwise
        3.1: read an element into queue[rear] and increment rear
Step-4: STOP

void insertion()
{
    if(rear == max)
        printf("\n Queue is Full");
    else
    {
        printf("\n Enter no: ");
        scanf("%d", &queue[rear++]);
    }
}

b. dequeue() or deletion(): which deletes an element at the start of the queue.

Algorithm: Procedure for deletion():
Step-1: START
Step-2: if front == rear then
            write "Queue is empty"
Step-3: otherwise
        3.1: print the deleted element queue[front] and increment front
Step-4: STOP

void deletion()
{
    if(front == rear)
    {
        printf("\n Queue is empty");
    }
    else
    {
        printf("\n Deleted Element is %d", queue[front++]);
    }
}
c. display(): which displays the elements in the queue.

Algorithm: Procedure for display():
Step-1: START
Step-2: if front == rear then
            write "Queue is empty"
Step-3: otherwise
        3.1: for i = front to rear-1
        3.2: print queue[i]
Step-4: STOP

void display()
{
    int i;
    if(front == rear)
    {
        printf("\n Queue is empty");
    }
    else
    {
        for(i = front; i < rear; i++)
        {
            printf("%d", queue[i]);
            printf("\n");
        }
    }
}

2. Queue using Linked list:


We can represent a queue as a linked list. In a queue, data is deleted from the front end and inserted at the rear end. We can perform similar operations on the two ends of a list. We use two pointers, front and rear, for our linked queue implementation.
The linked queue looks as shown in the figure:
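A minimal sketch of a linked queue in C is given below, assuming an integer data field; the node and function names are illustrative and not taken from the notes.

#include <stdio.h>
#include <stdlib.h>

struct qnode
{
    int data;
    struct qnode *next;
};
struct qnode *front = NULL, *rear = NULL;

/* insertion (enqueue): add a new node at the rear end */
void insertion(int x)
{
    struct qnode *n = (struct qnode *)malloc(sizeof(struct qnode));
    n->data = x;
    n->next = NULL;
    if(rear == NULL)                  /* empty queue: new node is both front and rear */
        front = rear = n;
    else
    {
        rear->next = n;
        rear = n;
    }
}

/* deletion (dequeue): remove the node at the front end and return its data */
int deletion(void)
{
    struct qnode *t;
    int x;
    if(front == NULL)
    {
        printf("Queue is empty\n");
        return -1;
    }
    t = front;
    x = t->data;
    front = front->next;
    if(front == NULL)                 /* the queue became empty */
        rear = NULL;
    free(t);
    return x;
}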

Applications of Queue:
1. It is used to schedule the jobs to be processed by the CPU.
2. When multiple users send print jobs to a printer, each printing job is kept in the printing
queue. Then the printer prints those jobs according to first in first out (FIFO) basis.
3. Breadth first search uses a queue data structure to find an element from a graph.
CIRCULAR QUEUE
A more efficient queue representation is obtained by regarding the array Q[MAX] as circular.
Any number of items could be placed on the queue. This implementation of a queue is called a
circular queue because it uses its storage array as if it were a circle instead of a linear list.
There are two problems associated with linear queue. They are:

● Time consuming: linear time has to be spent in shifting the elements to the beginning of the queue.
● Signaling queue full: the queue-full signal may be raised even though the queue has vacant positions.
For example, let us consider a linear queue status as follows:

Next insert another element, say 66 to the queue. We cannot insert 66 to the queue as the rear
crossed the maximum size of the queue (i.e., 5). There will be queue full signal. The queue
status is as follows:

This difficulty can be overcome if we treat the queue position with index zero as a position that comes after the position with index four, i.e., we treat the queue as a circular queue.
In circular queue if we reach the end for inserting elements to it, it is possible to insert new
elements if the slots at the beginning of the circular queue are empty.
Representation of Circular Queue:
Let us consider a circular queue, which can hold maximum (MAX) of six elements. Initially the
queue is empty.

Now, insert 11 to the circular queue. Then circular queue status will be:
Insert new elements 22, 33, 44 and 55 into the circular queue. The circular queue status is:

Now, delete an element. The element deleted is the element at the front of the circular queue.
So, 11 is deleted. The circular queue status is as follows:

Again, delete an element. The element to be deleted is always pointed to by the FRONT pointer.
So, 22 is deleted. The circular queue status is as follows:

Again, insert another element 66 to the circular queue. The status of the circular queue is:
Now, insert new elements 77 and 88 into the circular queue. The circular queue status is:

Now, if we insert an element to the circular queue, as COUNT = MAX we cannot add the
element to circular queue. So, the circular queue is full.

Operations on Circular queue:

a. enqueue() or insertion(): This function is used to insert an element into the circular queue. In a circular queue, the new element is always inserted at the rear position.

Algorithm: Procedure for insertCQ():
Step-1: START
Step-2: if count == MAX then
            write "Circular queue is full"
Step-3: otherwise
        3.1: read the data element
        3.2: CQ[rear] = data
        3.3: rear = (rear + 1) % MAX
        3.4: count = count + 1
Step-4: STOP

void insertCQ()
{
    int data;
    if(count == MAX)
    {
        printf("\n Circular Queue is Full");
    }
    else
    {
        printf("\n Enter data: ");
        scanf("%d", &data);
        CQ[rear] = data;
        rear = (rear + 1) % MAX;
        count++;
        printf("\n Data Inserted in the Circular Queue ");
    }
}
b. dequeue() or deletion(): This function is used to delete an element from the circular queue. In a circular queue, the element is always deleted from the front position.

Algorithm: Procedure for deleteCQ():
Step-1: START
Step-2: if count == 0 then
            write "Circular queue is empty"
Step-3: otherwise
        3.1: print the deleted element CQ[front]
        3.2: front = (front + 1) % MAX
        3.3: count = count - 1
Step-4: STOP

void deleteCQ()
{
    if(count == 0)
    {
        printf("\n\nCircular Queue is Empty..");
    }
    else
    {
        printf("\n Deleted element from Circular Queue is %d ", CQ[front]);
        front = (front + 1) % MAX;
        count--;
    }
}
c. display(): This function is used to display the list of elements in the circular queue.

Algorithm: Procedure for displayCQ():
Step-1: START
Step-2: if count == 0 then
            write "Circular queue is empty"
Step-3: otherwise
        3.1: print the list of elements
        3.2: for i = front, repeated count times
        3.3:     print CQ[i]
        3.4:     i = (i + 1) % MAX
Step-4: STOP

void displayCQ()
{
    int i, j;
    if(count == 0)
    {
        printf("\n\n\t Circular Queue is Empty ");
    }
    else
    {
        printf("\n Elements in Circular Queue are: ");
        j = count;
        for(i = front; j != 0; j--)
        {
            printf("%d\t", CQ[i]);
            i = (i + 1) % MAX;
        }
    }
}

Deque:
In the preceding section we saw a queue in which we insert items at one end and remove items from the other end. In this section we examine an extension of the queue, which provides a means to insert and remove items at both ends. This data structure is a deque. The word deque is an acronym derived from double-ended queue. The below figure shows the representation of a deque.
deque provides four operations. Below Figure shows the basic operations on a deque.
• enqueue_front: insert an element at front.
• dequeue_front: delete an element at front.
• enqueue_rear: insert element at rear.
• dequeue_rear: delete element at rear.

There are two variations of deque. They are:


• Input restricted deque (IRD)
• Output restricted deque (ORD)
An Input restricted deque is a deque, which allows insertions at one end but allows deletions at
both ends of the list.
An output restricted deque is a deque, which allows deletions at one end but allows insertions
at both ends of the list.
Priority Queue:
A priority queue is a collection of elements such that each element has been assigned a priority. We can insert an element into a priority queue at the rear position. We can delete an element from the priority queue based on the element's priority, such that the order in which elements are deleted and processed follows these rules:

1. An element of higher priority is processed before any element of lower priority.


2. Two elements with same priority are processed according to the order in which they
were added to the queue. It follows FIFO or FCFS(First Comes First serve) rules.
We always remove an element with the highest priority, which is given by the minimal integer
priority assigned.

A prototype of a priority queue is a time-sharing system: programs of high priority are processed first, and programs with the same priority form a standard queue. An efficient implementation of a priority queue uses a heap, which in turn can be used for sorting; this is called heap sort.

Priority queues are two types:


1. Ascending order priority queue
2. Descending order priority queue
1. Ascending order priority queue: elements are processed from the lowest priority number to the highest. Example order: 1,2,3,4,5,6,7,8,9,10
2. Descending order priority queue: elements are processed from the highest priority number to the lowest. Example order: 10,9,8,7,6,5,4,3,2,1
Implementation of Priority Queue:
Implementation of priority queues are two types:
1. Through Queue(Using Array)
2. Through Sorted List(Using Linked List)
1. Through Queue (Using Array): In this case the element is simply added at the rear end as usual. For deletion, the element with the highest priority is searched for and then deleted.
2. Through Sorted List (Using Linked List): In this case insertion is costly because the element must be inserted at the proper place in the list based on its priority. Deletion is easy, since the element with the highest priority is always at the beginning of the list.
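As an illustrative sketch (not from the notes) of the first approach, elements are enqueued at the rear of an unordered array and deletion searches for the highest-priority element, taken here to be the one with the smallest integer priority; scanning from the front also preserves FIFO order among equal priorities.

#include <stdio.h>
#define MAX 20

int pq[MAX];                  /* unordered array of priorities */
int n = 0;                    /* current number of elements */

/* insertion: simply add the element at the rear end */
void insertPQ(int x)
{
    if(n == MAX) { printf("Priority queue is full\n"); return; }
    pq[n++] = x;
}

/* deletion: search for the highest-priority (smallest) element,
   remove it by shifting the remaining elements left, and return it */
int deletePQ(void)
{
    int i, min = 0, x;
    if(n == 0) { printf("Priority queue is empty\n"); return -1; }
    for(i = 1; i < n; i++)
        if(pq[i] < pq[min])
            min = i;
    x = pq[min];
    for(i = min; i < n - 1; i++)
        pq[i] = pq[i + 1];
    n--;
    return x;
}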
1. Difference between stacks and Queues?
Stacks:
1. A stack is a linear list of elements in which an element may be inserted or deleted at one end only.
2. In stacks, the element which is inserted last is the first element to be deleted.
3. Stacks are called LIFO (Last In First Out) lists.
4. In a stack, elements are removed in the reverse order in which they are inserted.
5. Suppose the elements a, b, c, d, e are inserted into a stack; the deletion order will be e, d, c, b, a.
6. In a stack there is only one pointer for insertion and deletion, called "Top".
7. Initially top = -1 indicates that the stack is empty.
8. A stack is full when TOP = MAX-1 (if the array index starts from 0).
9. To push an element onto a stack, Top is incremented by one.
10. To pop an element from a stack, Top is decremented by one.
11. The conceptual view of a stack is as follows:

Queues:
1. A queue is a linear list of elements in which elements are added at one end and deleted at the other end.
2. In a queue, the element which is inserted first is the element deleted first.
3. Queues are called FIFO (First In First Out) lists.
4. In a queue, elements are removed in the same order in which they are inserted.
5. Suppose the elements a, b, c, d, e are inserted into a queue; the deletion of elements will be in the same order in which they were inserted.
6. In a queue there are two pointers: one for insertion, called "Rear", and another for deletion, called "Front".
7. Initially Rear = Front = -1 indicates that the queue is empty.
8. A queue is full when Rear = MAX-1.
9. To insert an element into a queue, Rear is incremented by one.
10. To delete an element from a queue, Front is incremented by one.
11. The conceptual view of a queue is as follows:
INTRODUCTION
Linear Data Structures:

Linear data structures are those data structures in which data elements are accessed (read and
written) in sequential fashion (one by one). Ex: Stacks, Queues, Lists, Arrays
Non Linear Data Structures:
Non Linear Data Structures are those in which data elements are not accessed in sequential
fashion.
Ex: trees, graphs
Difference between Linear and Nonlinear Data Structures
The main difference between linear and nonlinear data structures lies in the way they organize data elements. In linear data structures, data elements are organized sequentially and therefore they are easy to implement in the computer's memory. In nonlinear data structures, a data element can be attached to several other data elements to represent specific relationships among them. Due to this nonlinear structure, they can be more difficult to implement in the computer's linear memory than linear data structures. Selecting one data structure type over the other should be done carefully by considering the relationships among the data elements that need to be stored.
LINEAR LIST
A data structure is said to be linear if its elements form a sequence. A linear list is a list that
displays the relationship of adjacency between elements.
A linear list can be defined as a data object whose instances are of the form (e1, e2, e3, …, en), where n is a finite natural number. The ei terms are the elements of the list and n is its length. The elements may be viewed as atomic, as their individual structure is not relevant to the structure of the list. When n = 0, the list is empty. When n > 0, e1 is the first element and en the last, i.e., e1 comes before e2, e2 comes before e3, and so on.
Some examples of the Linear List are
● An alphabetized list of students in a class
● A list of exam scores in non decreasing order
● A list of gold medal winners in the Olympics
● An alphabetized list of members of Congress
The following are the operations that performed on the Linear List
✔ Create a Linear List

✔ Destroy a Linear List

✔ Determine whether the list is empty

✔ Determine the size of the List

✔ Find the element with a given index

✔ Find the index of a given number

✔ Delete, erase or remove an element given its index


✔ Insert a new element so that it has a given index
A linear list may be specified as an abstract data type (ADT) in which we provide a specification of the instances as well as of the operations that are to be performed. The abstract data type below omits operations to create and destroy instances of the data type. All ADT specifications implicitly include an operation to create an empty instance and, optionally, an operation to destroy an instance.
Array Representation: (Formula Based Representation)
A formula based representation uses an array to represent the instance of an object. Each
position of the Array is called a Cell or Node and is large enough to hold one of the elements
that make up an instance, while in other cases one array can represent several instances.
Individual elements of an instance are located in the array using a mathematical formula.
Suppose one array is used for each list to be represented. We need to map the elements
of a list to positions in the array used to represent it. In a formula based representation, a
mathematical formula determines the location of each element. A simple mapping formula is

location(i) = i - 1
This equation states that the ith element of the list is in position i-1 of the array. The below figure
shows a five element list represented in the array element using the mapping of equation.
To completely specify the list we need to know its current length or size. For this purpose we use the variable length. Length is zero when the list is empty. The program gives the resulting C++ class definition. Since the data type of the list elements may vary from application to application, we have defined a template class in which the user specifies the element data type T. The data members length, MaxSize and element are private members, while the remaining members are public. Insert and delete have been defined to return a reference to a linear list.
Insertion and Deletion of a Linear List:
Suppose we want to remove an element ei from the list by moving the elements to its right down by 1. For example, to remove the element e1 = 2 from the list, we have to move the elements e2 = 4, e3 = 8 and e4 = 1, which are to the right of e1, to positions 1, 2 and 3 of the array element. The below figure shows this result. The shaded elements were moved.
To insert an element so that it becomes element i of a list, we must move the existing element ei and all elements to its right one position to the right, and then put the new element into position i of the array. For example, to insert 7 as the second element of the list, we first move elements e2 and e3 to the right by 1 and then put 7 into position 2 of the array. The below figure shows this result. The shaded elements were moved.
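A brief sketch of this shifting in C is given below, assuming a fixed-capacity int array named element and a variable length, as in the description above; the function names are illustrative.

/* erase: delete the element at index k (0-based) by moving the elements
   to its right one position to the left */
void erase(int element[], int *length, int k)
{
    int i;
    for(i = k; i < *length - 1; i++)
        element[i] = element[i + 1];
    (*length)--;
}

/* insert: make x the element at index k by moving the existing elements
   at positions k .. length-1 one position to the right
   (assumes the array has spare capacity) */
void insert(int element[], int *length, int k, int x)
{
    int i;
    for(i = *length - 1; i >= k; i--)
        element[i + 1] = element[i];
    element[k] = x;
    (*length)++;
}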

Linked Representation And Chains


In a linked representation, each element of an instance of a data object is represented in a cell or node. The nodes, however, need not be components of an array, and no formula is used to locate individual elements. Instead, each node keeps explicit information about the location of other relevant nodes. This explicit information about the location of another node is called a Link or Pointer.
Let L = (e1, e2, e3, …, en) be a linear list. In one possible linked representation for this list, each element ei is represented in a separate node. Each node has exactly one link field that is used to locate the next element in the linear list. So the node for ei links to the node for ei+1, 1 <= i < n. The node for en has no node to link to, and so its link field is NULL. The pointer variable first locates the first node in the representation. The below figure shows the linked representation of a list (e1, e2, e3, …, en).

Since each node in the linked representation of the above figure has exactly one link, the structure of this figure is called a 'Single Linked List'. The nodes are ordered from left to right, with each node (other than the last one) linking to the next; the last node has a NULL link, so the structure is also called a chain.
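Such a chain can be sketched in C as follows; the names node and first are illustrative.

#include <stdio.h>

struct node
{
    int data;                 /* the element stored in this node */
    struct node *link;        /* pointer to the next node, NULL for the last node */
};
struct node *first = NULL;    /* locates the first node of the chain */

/* traverse the chain and print every element */
void traverse(struct node *first)
{
    struct node *p;
    for(p = first; p != NULL; p = p->link)
        printf("%d ", p->data);
}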
Insertion and Deletion of a Single Linked List:
Insertion: Let the list be a linked list with successive nodes A and B as shown in the figure below. Suppose a node N is to be inserted into the list between nodes A and B.
In the new list, node A points to the new node N, and the new node N points to node B, to which node A previously pointed.
Deletion:
Let the list be a linked list with node N between nodes A and B, as shown in the following figure. In the new list, node N is to be deleted from the linked list. The deletion occurs when the link field in node A is made to point to node B, thus excluding node N from the path.
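These pointer changes can be sketched in C as follows, using the node structure shown earlier; A is assumed to be a pointer to the node after which the insertion or deletion takes place.

#include <stdlib.h>

/* insert a new node N with the given data between node A and the node
   that currently follows it */
void insertAfter(struct node *A, int data)
{
    struct node *N = (struct node *)malloc(sizeof(struct node));
    N->data = data;
    N->link = A->link;        /* N points to B, the old successor of A */
    A->link = N;              /* A now points to N */
}

/* delete the node N that follows node A */
void deleteAfter(struct node *A)
{
    struct node *N = A->link;
    if(N == NULL) return;     /* nothing to delete */
    A->link = N->link;        /* A now points to B, excluding N from the path */
    free(N);
}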

DOUBLE LINKED LIST (Or) TWO WAY LINKED LIST


In certain applications it is very desirable that a list be traversed in either the forward direction or the backward direction. This property of a double linked list implies that each node must contain two link fields instead of one. The links are used to denote the predecessor and successor of the node. The link denoting the predecessor of a node is called the Left Link. The link denoting the successor of a node is called the Right Link. A list containing this type of node is called a "Double Linked List" or "Two Way List". The node structure in the double linked list is as follows:

Lptr contains the address of the previous node, Rptr contains the address of the next node, and Data contains the data of the node. The double linked list is as follows.

In the above diagram Last and Start are pointer variables which contains the address of last
node and starting node respectively.
Insertion into the Double Linked List: Let list be a double linked list with successive nodes A and B as shown in the following diagram. Suppose a node N is to be inserted into the list between nodes A and B; this is shown in the following diagram.
In the new list, the right pointer of node A points to the new node N, the Lptr of node N points to node A, the Rptr of node N points to node B, and the Lptr of node B points to the new node N.
Deletion from the Double Linked List:- Let list be a double linked list containing node N between nodes A and B, as shown in the following diagram.

Suppose node N is to be deleted from the list, as shown in the above double linked list diagram. The deletion occurs as soon as the right pointer field of node A is changed so that it points to node B, and the left pointer field of node B is changed so that it points to node A.
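A brief sketch in C of the double linked list node and of inserting a node N between nodes A and B, using the same field names (Lptr, Data, Rptr); the struct and function names are illustrative.

#include <stdlib.h>

struct dnode
{
    struct dnode *Lptr;       /* address of the preceding node */
    int Data;
    struct dnode *Rptr;       /* address of the next node */
};

/* insert a new node N between node A and its right neighbour B */
void insertBetween(struct dnode *A, int data)
{
    struct dnode *B = A->Rptr;
    struct dnode *N = (struct dnode *)malloc(sizeof(struct dnode));
    N->Data = data;
    N->Lptr = A;              /* Lptr of N points to A */
    N->Rptr = B;              /* Rptr of N points to B */
    A->Rptr = N;              /* right pointer of A points to N */
    if(B != NULL)
        B->Lptr = N;          /* Lptr of B points to the new node N */
}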
Circular Linked List:- A circular linked list is a special type of linked list in which all the nodes are linked in a continuous circle. A circular list can be a singly or doubly linked list. Note that there are no NULLs in circular linked lists. In these types of lists, elements can be added to the back of the list and removed from the front in constant time.
Both types of circularly-linked lists benefit from the ability to traverse the full list beginning at
any given node. This avoids the necessity of storing first Node and last node, but we need a
special representation for the empty list, such as a last node variable which points to some node
in the list or is null if it's empty. This representation significantly simplifies adding and removing
nodes with a non-empty list, but empty lists are then a special case. Circular linked lists are
most useful for describing naturally circular structures, and have the advantage of being able to
traverse the list starting at any point. They also allow quick access to the first and last records
through a single pointer (the address of the last element)

Circular single linked list:

A circular singly linked list is one type of linear linked list in which the link field of the last node of the list contains the address of the first node of the list instead of containing a null pointer.
Advantages:- Circular lists are frequently used instead of ordinary linked lists because in a circular list all nodes contain a valid address. The important features of a circular list are as follows:
(1) In a circular list every node is accessible from a given node.
(2) Certain operations like concatenation and splitting become more efficient on circular lists.
Disadvantages: Without some conditions in processing, it is possible to get into an infinite loop.
Circular Double Linked List:- This is one type of double linked list in which the Rptr field of the last node of the list contains the address of the first node and the Lptr of the first node contains the address of the last node of the list, instead of containing null pointers.

Advantages:- Circular lists are frequently used instead of ordinary linked lists because in a circular list all nodes contain a valid address. The important features of a circular list are as follows:
(1) In a circular list every node is accessible from a given node.
(2) Certain operations like concatenation and splitting become more efficient on circular lists.
Disadvantage:- Without some conditions in processing, it is possible to get into an infinite loop.
Difference between single linked list and double linked list?

Single linked list (SLL):
1. In a single linked list the list can be traversed in only one way, i.e., forward.
2. In a single linked list the node contains only one link field.
3. Every node contains the address of the next node.
4. The node structure in a single linked list is as follows:
5. The conceptual view of an SLL is as follows:
6. An SLL is maintained in memory by using two arrays.

Double linked list (DLL):
1. In a double linked list the list can be traversed in two ways, i.e., both forward and backward.
2. In a double linked list the node contains two link fields.
3. Every node contains the address of the next node as well as of the preceding node.
4. The node structure in a double linked list is as follows:
5. The conceptual view of a DLL is as follows:
6. A DLL is maintained in memory by using three arrays.

2. Difference between sequential allocation and linked allocation?


OR
Difference between Linear List and Linked List?
OR
Difference between Arrays and Linked List?
Arrays:
1. Arrays are used when the storage requirement is predictable, i.e., the exact amount of data storage required by the program can be determined.
2. In arrays the operations such as insertion and deletion are done in an inefficient manner.
3. Insertion and deletion are done by moving the elements either up or down.
4. Successive elements occupy adjacent space in memory.
5. In arrays each location contains DATA only.
6. The linear relationship between the data elements of an array is reflected by the physical relationship of the data in memory.
7. In an array declaration a block of memory space is required.
8. There is no need to store pointers or links.
9. The conceptual view of an array is as follows:
10. In an array there is no need for an element to specify where the next element is stored.

Linked List:
1. Linked lists are used when the storage requirement is unpredictable, i.e., the exact amount of data storage required by the program cannot be determined.
2. In linked lists the operations such as insertion and deletion are done in a more efficient manner, i.e., only by changing pointers.
3. Insertion and deletion are done by only changing the pointers.
4. Successive elements need not occupy adjacent space.
5. In a linked list each location contains data and a pointer denoting where the next element is present in memory.
6. The linear relationship between the data elements of a linked list is reflected by the link field of the node.
7. In a linked list there is no need of such a block declaration.
8. In a linked list a pointer is stored along with the element.
9. The conceptual view of a linked list is as follows:
10. There is a need for an element (node) to specify where the next node is located.
SORTING-INTRODUCTION

Sorting is a technique of organizing data. It is a process of arranging the records either in ascending or descending order, i.e., bringing some ordering to the data. Sort methods are very important in data structures.
Sorting can be performed on any one or a combination of one or more attributes present in each record. It is very easy and efficient to perform searching if data is stored in sorted order. The sorting is performed according to the key value of each record. Depending upon the makeup of the key, records can be sorted either numerically or alphanumerically. In numerical sorting, the records are arranged in ascending or descending order according to the numeric value of the key.
Let A be a list of n elements A1, A2, A3, …, An in memory. Sorting A refers to the operation of rearranging the contents of A so that they are in increasing order, that is, so that A1 <= A2 <= A3 <= … <= An. Since A has n elements, there are n! ways that the contents can appear in A. These ways correspond precisely to the n! permutations of 1, 2, 3, …, n. Accordingly, each sorting algorithm must take care of these n! possibilities.

Ex: suppose an array DATA contains 8 elements as follows:


DATA: 70, 30,40,10,80,20,60,50.
After sorting DATA must appear in memory as follows:
DATA: 10 20 30 40 50 60 70 80
Since DATA consists of 8 elements, there are 8!=40320 ways that the numbers 10,20,30,40,50,60,70,80
can appear in DATA.
The factors to be considered while choosing sorting techniques are:

● Programming Time
● Execution Time
● Number of Comparisons
● Memory Utilization
● Computational Complexity

Types of Sorting Techniques:


Sorting techniques are categorized into 2 types. They are Internal Sorting and External Sorting.
Internal Sorting: Internal sorting methods are used when a small amount of data has to be sorted. In this method, the data to be sorted is stored in the main memory (RAM). Internal sorting methods can access records randomly. Ex: Bubble sort, insertion sort, selection sort, shell sort, quick sort, radix sort, heap sort, etc.
External Sorting: External sorting methods are used when a large amount of data has to be sorted. In this method, the data to be sorted is stored in the main memory as well as in secondary memory such as disk. External sorting methods can access records only in sequential order. Ex: Merge sort, multi-way merge sort.
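As an illustrative sketch of one of the internal sorting methods listed above (bubble sort, which is not covered in detail in this unit), the following C function sorts an integer array in ascending order:

/* bubble sort: repeatedly compare adjacent elements and swap them if they
   are out of order, so the largest remaining element moves to the end
   on each pass */
void bubbleSort(int a[], int n)
{
    int i, j, temp;
    for(i = 0; i < n - 1; i++)
        for(j = 0; j < n - i - 1; j++)
            if(a[j] > a[j + 1])
            {
                temp = a[j];
                a[j] = a[j + 1];
                a[j + 1] = temp;
            }
}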

Complexity of Sorting Algorithms: The complexity of a sorting algorithm measures the running time as a function of the number n of items to be sorted. Each sorting algorithm S will be made up of the following operations, where A1, A2, …, An contain the items to be sorted and B is an auxiliary location.

Refer Quick sort Notes in Unit-5

Define sorting. What is the difference between internal and external sorting methods?
Ans:- Sorting is a technique of organizing data. It is a process of arranging the elements in either ascending or descending order, i.e., bringing some ordering to the data.

Internal sorting:
1. Internal sorting takes place in the main memory of a computer.
2. Internal sorting methods are applied to small collections of data.
3. Internal sorting takes small input.
4. It means that the entire collection of data to be sorted is small enough that the sorting can take place within main memory.
5. For sorting larger datasets, it may be necessary to hold only a chunk of data in memory at a time, since it won't all fit.
6. Examples of internal sorting algorithms are bubble sort, insertion sort, quick sort, heap sort, radix sort and selection sort.
7. Internal sorting does not make use of extra resources.

External sorting:
1. External sorting is done with additional external memory like magnetic tape or hard disk.
2. External sorting methods are applied only when the number of data elements to be sorted is too large.
3. External sorting can take very large input.
4. External sorting typically uses a sort-merge strategy and requires auxiliary storage.
5. In the sorting phase, chunks of data small enough to fit in main memory are read, sorted, and written out to a temporary file.
6. Examples of external sorting algorithms are merge sort and two-way merge sort.
7. External sorting makes use of extra resources.
TREES AND BINARY TREES

TREES

INTRODUCTION
In linear data structure data is organized in sequential order and in non-linear data structure
data is organized in random order. A tree is a very popular non-linear data structure used in a
wide range of applications. Tree is a non-linear data structure which organizes data in
hierarchical structure and this is a recursive definition.
DEFINITION OF TREE:
A tree is a collection of nodes (or vertices) and their edges (or links). In a tree data structure, every individual element is called a Node. A node in a tree data structure stores the actual data of that particular element and links to the next elements in the hierarchical structure.

Note: 1. In a Tree, if we have N nodes then we can have a maximum of N-1 links or edges.
2. A tree has no cycles.
TREE TERMINOLOGIES:
1. Root Node: In a Tree data structure, the first node is called as Root Node. Every tree
must have a root node. We can say that the root node is the origin of the tree data structure. In
any tree, there must be only one root node. We never have multiple root nodes in a tree.

2. Edge: In a Tree, the connecting link between any two nodes is called as EDGE. In a tree
with 'N' number of nodes there will be a maximum of 'N-1' number of edges.
3. Parent Node: In a Tree, the node which is a predecessor of any node is called as PARENT
NODE. In simple words, the node which has a branch from it to any other node is called a
parent node. Parent node can also be defined as "The node which has child / children".

Here, A is parent of B&C. B is the parent of D,E&F and so on…


4. Child Node: In a Tree data structure, the node which is descendant of any node is called
as CHILD Node. In simple words, the node which has a link from its parent node is called as child
node. In a tree, any parent node can have any number of child nodes. In a tree, all the nodes
except root are child nodes.

5. Siblings: In a Tree data structure, nodes which belong to same Parent are called as
SIBLINGS. In simple words, the nodes with the same parent are called Sibling nodes.
6. Leaf Node: In a Tree data structure, the node which does not have a child is called as
LEAF Node. In simple words, a leaf is a node with no child. In a tree data structure, the leaf
nodes are also called as External Nodes. External node is also a node with no child. In a tree,
leaf node is also called as 'Terminal' node.

7. Internal Nodes: In a Tree data structure, a node which has at least one child is called an INTERNAL node. In simple words, an internal node is a node with at least one child.

In a Tree data structure, nodes other than leaf nodes are called Internal Nodes. The root node is also said to be an Internal Node if the tree has more than one node. Internal nodes are also called 'Non-Terminal' nodes.

8. Degree: In a Tree data structure, the total number of children of a node is called the DEGREE of that node. In simple words, the degree of a node is the total number of children it has. The highest degree of a node among all the nodes in a tree is called the 'Degree of the Tree'.

Degree of Tree is: 3

9. Level: In a Tree data structure, the root node is said to be at Level 0, the children of the root node are at Level 1, the children of the nodes which are at Level 1 are at Level 2, and so on... In simple words, in a tree each step from top to bottom is called a Level, and the Level count starts with '0' and is incremented by one at each level (step).

10. Height: In a Tree data structure, the total number of edges from a leaf node to a particular node in the longest path is called the HEIGHT of that node. In a tree, the height of the root node is said to be the height of the tree. In a tree, the height of all leaf nodes is '0'.

11. Depth: In a Tree data structure, the total number of edges from the root node to a particular node is called the DEPTH of that node. In a tree, the total number of edges from the root node to a leaf node in the longest path is said to be the depth of the tree. In simple words, the highest depth of any leaf node in a tree is said to be the depth of that tree. In a tree, the depth of the root node is '0'.

12. Path: In a Tree data structure, the sequence of nodes and edges from one node to another node is called the PATH between those two nodes. The length of a path is the total number of nodes in that path. In the below example the path A - B - E - J has length 4.
13. Sub Tree: In a Tree data structure, each child of a node forms a subtree recursively. Every child node will form a subtree on its parent node.

TREE REPRESENTATIONS:
A tree data structure can be represented in two methods. Those methods are as follows...

1. List Representation
2. Left Child - Right Sibling Representation

Consider the following tree...

1. List Representation
In this representation, we use two types of nodes: one for representing the node with data, called a 'data node', and another for representing only references, called a 'reference node'. We start with a 'data node' for the root node of the tree. It is then linked to an internal node through a 'reference node', which is further linked to any other node directly. This process repeats for all the nodes in the tree.
The above example tree can be represented using List representation as follows...

2. Left Child - Right Sibling Representation


In this representation, we use a list with one type of node which consists of three fields namely
Data field, Left child reference field and Right sibling reference field. Data field stores the actual
value of a node, left reference field stores the address of the left child and right reference field
stores the address of the right sibling node. Graphical representation of that node is as follows...

In this representation, every node's data field stores the actual value of that node. If the node has a left child, the left reference field stores the address of that left child node; otherwise it stores NULL. If the node has a right sibling, the right reference field stores the address of the right sibling node; otherwise it stores NULL.

The above example tree can be represented using Left Child - Right Sibling representation as follows...
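The node described above can be sketched in C as follows; the struct and field names are illustrative.

struct treenode
{
    int data;                         /* actual value of the node */
    struct treenode *leftChild;       /* address of the left (first) child, or NULL */
    struct treenode *rightSibling;    /* address of the right sibling, or NULL */
};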

BINARY TREE:

In a normal tree, every node can have any number of children. A binary tree is a special type of
tree data structure in which every node can have a maximum of 2 children. One is known as a
left child and the other is known as right child.

A tree in which every node can have a maximum of two children is called Binary Tree.
In a binary tree, every node can have either 0 children or 1 child or 2 children but not more than
2 children.

In general, tree nodes can have any number of children. In a binary tree, each node can have at
most two children. A binary tree is either empty or consists of a node called the root together
with two binary trees called the left subtree and the right subtree. A tree with no nodes is
called as a null tree

Example:

TYPES OF BINARY TREE:


1. Strictly Binary Tree:
In a binary tree, every node can have a maximum of two children. But in strictly binary tree,
every node should have exactly two children or none. That means every internal node must
have exactly two children. A strictly Binary Tree can be defined as follows...

A strictly binary tree is also called a Full Binary Tree, a Proper Binary Tree or a 2-Tree.
A strictly binary tree data structure is used to represent mathematical expressions. Example:

2. Complete Binary Tree:


In a binary tree, every node can have a maximum of two children, and in a strictly binary tree every node should have exactly two children or none. In a complete binary tree, in addition, every level must be completely filled, i.e., at every level of a complete binary tree there must be 2^level nodes. For example, at level 2 there must be 2^2 = 4 nodes and at level 3 there must be 2^3 = 8 nodes.

A complete binary tree is also called a Perfect Binary Tree.

3. Extended Binary Tree:


A binary tree can be converted into Full Binary tree by adding dummy nodes to existing nodes
wherever required.
In above figure, a normal binary tree is converted into full binary tree by adding dummy nodes.
4. Skewed Binary Tree:
If a tree which is dominated by left child node or right child node, is said to be a Skewed Binary
Tree.
In a skewed binary tree, all nodes except one have only one child node. The remaining node
has no child.

In a left skewed tree, most of the nodes have the left child without corresponding right child.
In a right skewed tree, most of the nodes have the right child without corresponding left child.
Properties of binary trees:
Some of the important properties of a binary tree are as follows:
1. If h = height of a binary tree, then
   a. Maximum number of leaves = 2^h
   b. Maximum number of nodes = 2^(h+1) - 1
2. If a binary tree contains m nodes at level l, it contains at most 2m nodes at level l + 1.
3. Since a binary tree can contain at most one node at level 0 (the root), it can contain at most 2^l nodes at level l.
4. The total number of edges in a full binary tree with n nodes is n - 1.
BINARY TREE REPRESENTATIONS:
A binary tree data structure is represented using two methods. Those methods are as
follows...
1. Array Representation
2. Linked List Representation
Consider the following binary tree...
1. Array Representation of Binary Tree
In array representation of a binary tree, we use one-dimensional array (1-D Array) to
represent a binary tree.
Consider the above example of a binary tree and it is represented as follows...

To represent a binary tree of depth 'n' using array representation, we need a one-dimensional array with a maximum size of 2^(n+1) - 1.
2. Linked List Representation of Binary Tree
We use a double linked list to represent a binary tree. In a double linked list, every node
consists of three fields. First field for storing left child address, second for storing actual data
and third for the right child address.
In this linked list representation, a node has the following structure...

The above example of the binary tree represented using Linked list representation is shown as
follows...
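A minimal sketch in C of the binary tree node used in this linked representation; the struct and function names are illustrative.

#include <stdlib.h>

struct btnode
{
    struct btnode *left;      /* address of the left child */
    int data;                 /* actual data of the node */
    struct btnode *right;     /* address of the right child */
};

/* create a new node with empty left and right subtrees */
struct btnode *createNode(int data)
{
    struct btnode *n = (struct btnode *)malloc(sizeof(struct btnode));
    n->data = data;
    n->left = NULL;
    n->right = NULL;
    return n;
}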

BINARY TREE TRAVERSALS:


Unlike linear data structures (Array, Linked List, Queues, Stacks, etc) which have only one logical
way to traverse them, binary trees can be traversed in different ways. Following are the
generally used ways for traversing binary trees.
When we wanted to display a binary tree, we need to follow some order in which all the nodes
of that binary tree must be displayed. In any binary tree, displaying order of nodes depends on
the traversal method.
Displaying (or) visiting order of nodes in a binary tree is called as Binary Tree Traversal.
There are three types of binary tree traversals.
1. In - Order Traversal
2. Pre - Order Traversal
3. Post - Order Traversal
1. In - Order Traversal (left Child - root - right Child):In In-Order traversal, the root node is visited
between the left child and right child. In this traversal, the left child node is visited first, then the root
node is visited and later we go for visiting the right child node. This in-order traversal is applicable for
every root node of all sub trees in the tree. This is performed recursively for all nodes in the tree.
Algorithm:
Step-1: Visit the left subtree, using inorder.
Step-2: Visit the root.
Step-3: Visit the right subtree, using inorder.

In the above example of a binary tree, first we try to visit left child of root node 'A', but A's left
child 'B' is a root node for the left subtree. So we try to visit its (B's) left child 'D', and again D is a
root for subtree with nodes D, I and J. So we try to visit its left child 'I' and it is the leftmost
child. So first we visit 'I' then go for its root node 'D' and later we visit D's right child 'J'. With
this we have completed the left part of node B. Then visit 'B' and next B's right child 'F' is
visited. With this we have completed left part of node A. Then visit root node 'A'. With this we
have completed left and root parts of node A. Then we go for the right part of the node A. In
right of A again there is a subtree with root C. So go for left child of C and again it is a subtree
with root G. But G does not have left part so we visit 'G' and then visit G's right child K. With this
we have completed the left part of node C. Then visit root node 'C' and next visit C's right child
'H' which is the rightmost child in the tree. So we stop the process.
That means here we have visited in the order of I - D - J - B - F - A - G - K - C - H using In- Order
Traversal.
2. Pre - Order Traversal ( root - leftChild - rightChild ):
In Pre-Order traversal, the root node is visited before the left child and right child nodes. In this
traversal, the root node is visited first, then its left child and later its right child. This pre-order
traversal is applicable for every root node of all subtrees in the tree. Pre-order traversal is also
sometimes referred to as backtracking.
Algorithm:
Step-1: Visit the root.
Step-2: Visit the left subtree, using preorder.
Step-3: Visit the right subtree, using preorder.
In the above example of binary tree, first we visit root node 'A' then visit its left child 'B'
which is a root for D and F. So we visit B's left child 'D' and again D is a root for I and
J. So we visit D's left child 'I' which is the leftmost child. So next we go for visiting D's right child
'J'. With this we have completed root, left and right parts of node D and root, left parts of node
B. Next visit B's right child 'F'. With this we have completed root and left parts of node A. So we
go for A's right child 'C' which is a root node for G and H. After visiting C, we go for its left child
'G' which is a root for node K. So next we visit left of G, but it does not have left child so we go
for G's right child 'K'. With this, we have completed node C's root and left parts. Next visit C's
right child 'H' which is the rightmost child in the tree. So we stop the process.
That means here we have visited in the order of A-B-D-I-J-F-C-G-K-H using Pre-Order Traversal.
3. Post - Order Traversal ( leftChild - rightChild - root ):
In Post-Order traversal, the root node is visited after left child and right child. In this
traversal, left child node is visited first, then its right child and then its root node. This is
recursively performed until the right most nodes are visited.
Algorithm:
Step-1: Visit the left subtree, using postorder.
Step-2: Visit the right subtree, using postorder
Step-3: Visit the root.

Here we have visited in the order of I - J - D - F - B - K - G - H - C - A using Post-Order Traversal.
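The following is a runnable C sketch of the three recursive traversals, using the example tree (nodes A, B, C, D, F, G, H, I, J, K) reconstructed from the traversal examples above; the printed orders match the ones listed in the text.

#include <stdio.h>
#include <stdlib.h>

struct Node {
    char         data;
    struct Node *left;
    struct Node *right;
};

struct Node *newNode(char data) {
    struct Node *n = malloc(sizeof(struct Node));
    n->data = data;
    n->left = n->right = NULL;
    return n;
}

void inorder(struct Node *root) {             /* left - root - right */
    if (root == NULL) return;
    inorder(root->left);
    printf("%c ", root->data);
    inorder(root->right);
}

void preorder(struct Node *root) {            /* root - left - right */
    if (root == NULL) return;
    printf("%c ", root->data);
    preorder(root->left);
    preorder(root->right);
}

void postorder(struct Node *root) {           /* left - right - root */
    if (root == NULL) return;
    postorder(root->left);
    postorder(root->right);
    printf("%c ", root->data);
}

int main(void) {
    /* Build the example tree used in the traversal examples above. */
    struct Node *A = newNode('A');
    A->left = newNode('B');            A->right = newNode('C');
    A->left->left = newNode('D');      A->left->right = newNode('F');
    A->left->left->left = newNode('I');
    A->left->left->right = newNode('J');
    A->right->left = newNode('G');     A->right->right = newNode('H');
    A->right->left->right = newNode('K');

    printf("In-order  : "); inorder(A);   printf("\n");  /* I D J B F A G K C H */
    printf("Pre-order : "); preorder(A);  printf("\n");  /* A B D I J F C G K H */
    printf("Post-order: "); postorder(A); printf("\n");  /* I J D F B K G H C A */
    return 0;
}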
Graph
A graph can be defined as a group of vertices and edges that are used to connect these
vertices. A graph can be seen as a generalization of a tree in which cycles are allowed, and the
vertices (nodes) may maintain any complex relationship among them instead of a parent-child
relationship.

Definition
A graph G can be defined as an ordered set G(V, E) where V(G) represents the set of vertices
and E(G) represents the set of edges which are used to connect these vertices.

A Graph G(V, E) with 5 vertices (A, B, C, D, E) and six edges ((A,B), (B,C), (C,E), (E,D), (D,B), (D,A))
is shown in the following figure.

Directed and Undirected Graph


A graph can be directed or undirected. In an undirected graph, edges are not associated with
any direction. The graph shown in the above figure is undirected since its edges are not
attached to any direction: if an edge exists between vertices A and B, then it can be traversed
from B to A as well as from A to B.

In a directed graph, edges form an ordered pair. Edges represent a specific path from
some vertex A to another vertex B. Node A is called initial node while node B is called
terminal node.

A directed graph is shown in the following figure.

Graph Terminology
Path
A path can be defined as the sequence of nodes that are followed in order to reach
some terminal node V from the initial node U.

Closed Path
A path is called a closed path if the initial node is the same as the terminal node, i.e., if
V0 = VN.

Simple Path
A path P is called a simple path if all of its nodes are distinct. If all nodes are distinct with
the exception that V0 = VN, the path is called a closed simple path.

Cycle
A cycle can be defined as the path which has no repeated edges or vertices except the
first and last vertices.

Connected Graph
A connected graph is the one in which some path exists between every two vertices (u,
v) in V. There are no isolated nodes in connected graph.

Complete Graph
A complete graph is one in which every node is connected with all other nodes. A
complete graph contains n(n-1)/2 edges, where n is the number of nodes in the graph.

Weighted Graph
In a weighted graph, each edge is assigned with some data such as length or weight.
The weight of an edge e can be given as w(e) which must be a positive (+) value
indicating the cost of traversing the edge.

Digraph
A digraph is a directed graph in which each edge of the graph is associated with some
direction and the traversing can be done only in the specified direction.

Loop
An edge whose two end points are the same vertex is called a loop.

Adjacent Nodes
If two nodes u and v are connected via an edge e, then the nodes u and v are called
neighbours or adjacent nodes.

Degree of the Node


The degree of a node is the number of edges that are connected to that node. A node
with degree 0 is called an isolated node.

Graph representation
By Graph representation, we simply mean the technique to be used to store some graph
into the computer's memory.

A graph is a data structure that consists of a set of vertices (called nodes) and a set of edges.
There are two ways to store Graphs into the computer's memory:

○ Adjacency matrix (sequential representation)


○ Adjacency list (Linked list representation)

In sequential representation, an adjacency matrix is used to store the graph, whereas in
linked list representation an adjacency list is used. Let us look at both ways of representing
a graph in the data structure.

Adjacency matrix

In sequential representation, there is a use of an adjacency matrix to represent the mapping


between vertices and edges of the graph. We can use an adjacency matrix to represent the
undirected graph, directed graph, weighted directed graph, and weighted undirected graph.

If adj[i][j] = w, it means that an edge exists from vertex i to vertex j with weight w.

An entry Aij in the adjacency matrix representation of an undirected graph G will be 1 if an
edge exists between Vi and Vj. If an undirected graph G consists of n vertices, then the
adjacency matrix for that graph is n x n, and the matrix A = [aij] can be defined as -

aij = 1 {if there is an edge from Vi to Vj}

aij = 0 {otherwise}

It means that, in an adjacency matrix, 0 represents that no edge exists between two nodes,
whereas 1 represents the existence of an edge between two vertices.

If there is no self-loop present in the graph, it means that the diagonal entries of the
adjacency matrix will be 0.

Now, let's see the adjacency matrix representation of an undirected graph.


In the above figure, an image shows the mapping among the vertices (A, B, C, D, E), and this
mapping is represented by using the adjacency matrix.

There exist different adjacency matrices for the directed and undirected graph. In a directed
graph, an entry Aij will be 1 only when there is an edge directed from Vi to Vj.
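As a small sketch, the following C program builds the adjacency matrix of the undirected example graph given earlier (vertices A to E with edges (A,B), (B,C), (C,E), (E,D), (D,B), (D,A)), with A..E mapped to indices 0..4.

#include <stdio.h>

#define V 5   /* vertices 0..4 stand for A..E */

void addEdge(int adj[V][V], int u, int v) {
    adj[u][v] = 1;      /* for an undirected graph the matrix */
    adj[v][u] = 1;      /* is symmetric: set both entries     */
}

int main(void) {
    int adj[V][V] = {0};
    int edges[6][2] = { {0,1}, {1,2}, {2,4}, {4,3}, {3,1}, {3,0} };

    for (int i = 0; i < 6; i++)
        addEdge(adj, edges[i][0], edges[i][1]);

    for (int i = 0; i < V; i++) {       /* print the matrix */
        for (int j = 0; j < V; j++)
            printf("%d ", adj[i][j]);
        printf("\n");
    }
    return 0;
}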

Adjacency list /Linked list representation


An adjacency list is used in the linked representation to store the Graph in the computer's
memory. It is efficient in terms of storage as we only have to store the values for edges.

Let's see the adjacency list representation of an undirected graph.

In the above figure, we can see that there is a linked list or adjacency list for every node of
the graph. From vertex A, there are edges to vertex B and vertex D, so these nodes are linked to
node A in the given adjacency list.

An adjacency list is maintained for each node present in the graph, which stores the node
value and a pointer to the next adjacent node to the respective node. If all the adjacent
nodes are traversed, then store the NULL in the pointer field of the last node of the list.

The sum of the lengths of adjacency lists is equal to twice the number of edges present in an
undirected graph.
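The following is a minimal C sketch of the adjacency list representation for the same example graph; the node structure and the insertion-at-front strategy are illustrative choices.

#include <stdio.h>
#include <stdlib.h>

#define V 5   /* vertices 0..4 stand for A..E */

struct AdjNode {
    int             vertex;
    struct AdjNode *next;
};

struct AdjNode *adjList[V];   /* head pointer of every list, NULL-terminated */

void addEdge(int u, int v) {              /* undirected: add both ways */
    struct AdjNode *a = malloc(sizeof *a);
    a->vertex = v; a->next = adjList[u]; adjList[u] = a;

    struct AdjNode *b = malloc(sizeof *b);
    b->vertex = u; b->next = adjList[v]; adjList[v] = b;
}

int main(void) {
    addEdge(0, 1); addEdge(1, 2); addEdge(2, 4);   /* (A,B) (B,C) (C,E) */
    addEdge(4, 3); addEdge(3, 1); addEdge(3, 0);   /* (E,D) (D,B) (D,A) */

    for (int u = 0; u < V; u++) {         /* print every adjacency list */
        printf("%c:", 'A' + u);
        for (struct AdjNode *p = adjList[u]; p != NULL; p = p->next)
            printf(" %c", 'A' + p->vertex);
        printf("\n");
    }
    return 0;
}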
Graph Traversal - BFS
Graph traversal is a technique used for searching a vertex in a graph. The graph traversal is also
used to decide the order in which vertices are visited in the search process. A graph traversal finds the
edges to be used in the search process without creating loops. That means, using graph traversal, we
visit all the vertices of the graph without getting into a looping path.

There are two graph traversal techniques and they are as follows...

1. DFS (Depth First Search)


2. BFS (Breadth First Search)

BFS (Breadth First Search)


BFS traversal of a graph produces a spanning tree as its final result. A spanning tree is a subgraph
that contains all the vertices and no cycles. We use a queue data structure, with a maximum size equal
to the total number of vertices in the graph, to implement BFS traversal.

We use the following steps to implement BFS traversal...


​ Step 1 - Define a Queue of size total number of vertices in the graph.
​ Step 2 - Select any vertex as starting point for traversal. Visit that vertex and insert it into the
Queue.
​ Step 3 - Visit all the non-visited adjacent vertices of the vertex which is at front of the Queue
and insert them into the Queue.
​ Step 4 - When there is no new vertex to be visited from the vertex which is at front of the
Queue then delete that vertex.
​ Step 5 - Repeat steps 3 and 4 until queue becomes empty.
​ Step 6 - When queue becomes empty, then produce final spanning tree by removing unused
edges from the graph
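The following is a minimal C sketch of the BFS steps listed above, assuming an adjacency-matrix representation, a boolean visited array, and an array-based queue whose maximum size equals the number of vertices; the example graph in main is the undirected graph used earlier.

#include <stdio.h>

#define V 5

void bfs(int adj[V][V], int start) {
    int visited[V] = {0};
    int queue[V], front = 0, rear = 0;   /* Step 1: queue of size V    */

    visited[start] = 1;                  /* Step 2: visit the starting */
    queue[rear++] = start;               /* vertex and insert it       */

    while (front < rear) {               /* Step 5: repeat until empty */
        int u = queue[front++];          /* Step 4: remove front vertex*/
        printf("%c ", 'A' + u);

        for (int v = 0; v < V; v++)      /* Step 3: insert all non-    */
            if (adj[u][v] && !visited[v]) {   /* visited neighbours    */
                visited[v] = 1;
                queue[rear++] = v;
            }
    }
    printf("\n");
}

int main(void) {
    /* Undirected example graph with edges (A,B) (B,C) (C,E) (E,D) (D,B) (D,A). */
    int adj[V][V] = {0};
    int e[6][2] = { {0,1}, {1,2}, {2,4}, {4,3}, {3,1}, {3,0} };
    for (int i = 0; i < 6; i++) {
        adj[e[i][0]][e[i][1]] = 1;
        adj[e[i][1]][e[i][0]] = 1;
    }
    bfs(adj, 0);   /* start the traversal from vertex A */
    return 0;
}

Because every vertex is enqueued at most once (it is marked visited before insertion), a queue of size equal to the number of vertices never overflows.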
DFS (Depth First Search)

Depth First Traversal (or DFS) for a graph is similar to Depth First Traversal of a tree.
Like Trees, we traverse all adjacent one by one, when we traverse an adjacent, we
finish traversal of all vertices reachable through the adjacent completely. After we
finish one adjacent and its reachable, we go to the next adjacent and finish all
reachable through the next one, and continue this way. This is similar to a tree, where we first
completely traverse the left subtree and then go to the right subtree. The only catch
here is that, unlike trees, graphs may contain cycles (a node may be reached more than once).
To avoid processing a node more than once, we use a boolean visited array.

Example:

Input: V = 5, E = 5, edges = {{1, 2}, {1, 0}, {0, 2}, {2, 3}, {2, 4}}, s = 1

Output: 1 2 0 3 4
Explanation: The source vertex s is 1. We visit it first, then we visit one of its adjacent vertices.
Vertex 1 has two adjacent vertices, 0 and 2; we may pick either of them (here 2 is picked first):

● Start at 1: Mark as visited. Output: 1

● Move to 2: Mark as visited. Output: 2


● Move to 0: Mark as visited. Output: 0 (backtrack to 2)

● Move to 3: Mark as visited. Output: 3 (backtrack to 2)

● Move to 4: Mark as visited. Output: 4 (backtrack to 2, then to 1; traversal complete)

Input: V = 5, E = 4, edges = {{0, 2}, {0, 3}, {0, 1}, {2, 4}}, s = 0

Output: 0 2 4 3 1
Explanation: DFS Steps:

● Start at 0: Mark as visited. Output: 0

● Move to 2: Mark as visited. Output: 2

● Move to 4: Mark as visited. Output: 4 (backtrack to 2, then backtrack to 0)

● Move to 3: Mark as visited. Output: 3 (backtrack to 0)

● Move to 1: Mark as visited. Output: 1
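The following is a recursive C sketch of DFS for the first example above (V = 5, edges {1,2}, {1,0}, {0,2}, {2,3}, {2,4}, source s = 1). Because this sketch scans neighbours in increasing index order over an adjacency matrix, it prints 1 0 2 3 4, which is another valid DFS order; the order 1 2 0 3 4 shown above arises when the adjacent vertex 2 is picked before 0.

#include <stdio.h>

#define V 5

int adj[V][V];        /* adjacency matrix      */
int visited[V];       /* boolean visited array */

void dfs(int u) {
    visited[u] = 1;
    printf("%d ", u);
    for (int v = 0; v < V; v++)         /* finish each unvisited       */
        if (adj[u][v] && !visited[v])   /* neighbour completely before */
            dfs(v);                     /* moving to the next one      */
}

int main(void) {
    int e[5][2] = { {1,2}, {1,0}, {0,2}, {2,3}, {2,4} };
    for (int i = 0; i < 5; i++) {
        adj[e[i][0]][e[i][1]] = 1;      /* undirected edges */
        adj[e[i][1]][e[i][0]] = 1;
    }
    dfs(1);   /* prints 1 0 2 3 4 here; the text picks 2 before 0 and gets 1 2 0 3 4 */
    printf("\n");
    return 0;
}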


Spanning Tree
A spanning tree is a subgraph of a graph G such that all the vertices are connected using the minimum
possible number of edges. Hence, a spanning tree does not have cycles, and a graph may have
more than one spanning tree.

Properties of a Spanning Tree:


● A Spanning tree does not exist for a disconnected graph.

● For a connected graph having N vertices, the number of edges in the spanning tree
for that graph will be N-1.

● A Spanning tree does not have any cycle.

● We can construct a spanning tree for a complete graph by removing E-N+1

edges, where E is the number of Edges and N is the number of vertices.

● Cayley’s Formula: It states that the number of spanning trees in a complete
graph with N vertices is N^(N-2).

○ For example: for N=4, the maximum number of spanning trees possible is 4^2 = 16.


Minimum Spanning Tree:
A minimum spanning tree (MST) is defined as a spanning tree that has the minimum
weight among all the possible spanning trees.

The minimum spanning tree has all the properties of a spanning tree with an added
constraint of having the minimum possible weights among all possible spanning trees.
Like a spanning tree, there can also be many possible MSTs for a graph.

Kruskal’s Minimum Spanning Tree Algorithm:

This is one of the popular algorithms for finding the minimum spanning tree from a
connected, undirected graph. This is a greedy algorithm. The algorithm workflow is as
below:

● First, it sorts all the edges of the graph by their weights,

● Then starts the iterations of finding the spanning tree.

● At each iteration, the algorithm adds the next lowest-weight edge, such that the edges
picked so far do not form a cycle.

This algorithm can be implemented efficiently using a DSU (Disjoint-Set Union) data
structure to keep track of the connected components of the graph. Kruskal's algorithm is used
in a variety of practical applications such as network design, clustering, and data analysis.
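As a compact sketch, the following C program implements the workflow above with a simple union-find (DSU) without path compression; the small 4-vertex example graph in main is an assumption for illustration only, not the 9-vertex graph of the illustration below.

#include <stdio.h>
#include <stdlib.h>

struct Edge { int u, v, w; };

int parent[100];                        /* DSU parent array            */

int find(int x) {                       /* find the set representative */
    while (parent[x] != x) x = parent[x];
    return x;
}

int cmp(const void *a, const void *b) { /* compare edges by weight     */
    return ((const struct Edge *)a)->w - ((const struct Edge *)b)->w;
}

void kruskal(struct Edge edges[], int E, int V) {
    qsort(edges, E, sizeof(struct Edge), cmp);      /* sort by weight  */
    for (int i = 0; i < V; i++) parent[i] = i;

    int taken = 0, cost = 0;
    for (int i = 0; i < E && taken < V - 1; i++) {  /* scan edges      */
        int ru = find(edges[i].u), rv = find(edges[i].v);
        if (ru != rv) {                 /* different components: no    */
            parent[ru] = rv;            /* cycle, so take the edge     */
            printf("%d - %d (weight %d)\n", edges[i].u, edges[i].v, edges[i].w);
            taken++; cost += edges[i].w;
        }                               /* otherwise the edge would    */
    }                                   /* close a cycle: discard it   */
    printf("Total weight of MST = %d\n", cost);
}

int main(void) {
    /* Small illustrative graph on vertices 0..3. */
    struct Edge edges[] = { {0,1,10}, {0,2,6}, {0,3,5}, {1,3,15}, {2,3,4} };
    kruskal(edges, 5, 4);   /* MST: 2-3, 0-3, 0-1 with total weight 19 */
    return 0;
}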

Illustration:

Below is the illustration of the above approach:


Input Graph:

The graph contains 9 vertices and 14 edges. So, the minimum spanning tree formed will
have (9 – 1) = 8 edges.

Step 1: Pick edge 7-6. No cycle is formed, include it.

Add edge 7-6 in the MST


Step 2: Pick edge 8-2. No cycle is formed, include it.

Add edge 8-2 in the MST

Step 3: Pick edge 6-5. No cycle is formed, include it.

Add edge 6-5 in the MST

Step 4: Pick edge 0-1. No cycle is formed, include it.

Add edge 0-1 in the MST


Step 5: Pick edge 2-5. No cycle is formed, include it.

Add edge 2-5 in the MST

Step 6: Pick edge 8-6. Since including this edge results in the cycle, discard it. Pick edge
2-3: No cycle is formed, include it.

Add edge 2-3 in the MST


Step 7: Pick edge 7-8. Since including this edge results in the cycle, discard it. Pick edge
0-7. No cycle is formed, include it.

Add edge 0-7 in MST

Step 8: Pick edge 1-2. Since including this edge results in the cycle, discard it. Pick edge
3-4. No cycle is formed, include it.

Add edge 3-4 in the MST

Note: Since the number of edges included in the MST equals (V – 1), the algorithm
stops here.
Hashing in Data Structure

Hashing is a technique used in data structures that efficiently stores and retrieves data in a way that
allows for quick access. It involves mapping data to a specific index in a hash table using a hash
function that enables fast retrieval of information based on its key. This method is commonly used
in databases, caching systems, and various programming applications to optimize search and
retrieval operations. The great thing about hashing is, we can achieve all three operations (search,
insert and delete) in O(1) time on average.

Hash Table
A hash table is one of the most important data structures. It uses a special function, known as a hash
function, that maps a given key to an index so that elements can be accessed faster.

A Hash table is a data structure that stores some information, and the information has basically two
main components, i.e., key and value. The hash table can be implemented with the help of an
associative array. The efficiency of mapping depends upon the efficiency of the hash function used
for mapping.

For example, suppose the key value is John and the value is the phone number, so when we pass
the key value in the hash function shown as below:

Hash(key) = index;

When we pass the key to the hash function, it gives the index.

Hash(John) = 3;

The above example stores John's record at index 3.

Drawback of Hash function

Ideally, a hash function assigns each key a unique index. In practice, hash functions are often
imperfect, and a collision occurs when the hash function generates the same index for two
different keys.

In Hashing technique, the hash table and hash function are used. Using the hash function, we can
calculate the address at which the value can be stored.

The main idea behind the hashing is to create the (key/value) pairs. If the key is given, then the
algorithm computes the index at which the value would be stored. It can be written as:

Index = hash(key)
There are three ways of calculating the hash function:

○ Division method
○ Folding method
○ Mid square method

In the division method, the hash function can be defined as:

h(k) = k % m;

where m is the size of the hash table.

For example, if the key value is 6 and the size of the hash table is 10. When we apply the hash
function to key 6 then the index would be:

h(6) = 6%10 = 6

The index is 6 at which the value is stored.

Mid-Square Method
In the mid-square method, the key is squared, and the middle digits of the result are taken as
the hash value.
Steps:
1. Square the key.
2. Extract the middle digits of the squared value.

Advantages:
● Produces a good distribution of hash values.
Disadvantages:
● May require more computational effort.

Folding Method
The process involves two steps:

● The key value k is divided into a predetermined number of parts k1, k2, k3, ..., kn,
each having the same number of digits, except possibly the last part, which may have
fewer digits.
● The parts are added together. The hash value is the sum, ignoring the final carry, if any.

Formula:

k = k1, k2, k3, k4, ….., kn

s = k1+ k2 + k3 + k4 +….+ kn

h(K)= s

(Where, s = addition of the parts of key k)

Advantages:

● It creates a simple hash value by splitting the key value into equal-sized segments and
adding them.
● It works regardless of how the keys are distributed.

Disadvantages:

● When there are too many collisions, efficiency can occasionally suffer.

Example of Folding Method

k = 678912

k1 = 67; k2 = 89; k3 = 12

Therefore, s = k1 + k2 + k3

s = 67 + 89 + 12

s = 168, so h(678912) = 168
Mid square method

The following steps are required to calculate this hash method:


● k*k, or square the value of k
● Using the middle r digits, calculate the hash value.

Formula:

h(k) = middle r digits of (k x k)

(where k = key value)

Advantages:

● This technique works well because most or all of the digits in the key value affect the result.
All of the necessary digits participate in a process that results in the middle digits of the
squared result.
● The result is not dominated by the top or bottom digits of the initial key value.

Disadvantages:

● The size of the key is one of the limitations of this system; if the key is large, its square will
contain twice as many digits.
● Probability of collisions occurring repeatedly.

Example of Mid Square Method

k = 60

k x k = 60 x 60 = 3600

The middle two digits of 3600 are 60, thus

h(60) = 60
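The following C sketch implements the three hash-function methods described above; the table size m = 10 for the division method, the two-digit parts for the folding method, and the choice of the middle two digits for the mid-square method are assumptions used only for illustration.

#include <stdio.h>

int division_hash(int k, int m) {          /* h(k) = k % m             */
    return k % m;
}

int folding_hash(int k) {                  /* split into 2-digit parts  */
    int s = 0;                             /* and add them together     */
    while (k > 0) {
        s += k % 100;
        k /= 100;
    }
    return s;                              /* any final carry is simply */
}                                          /* not generated here        */

int midsquare_hash(int k) {                /* square, then take the     */
    long sq = (long)k * k;                 /* middle two digits         */
    return (int)((sq / 10) % 100);
}

int main(void) {
    printf("division  : h(6)      = %d\n", division_hash(6, 10));   /* 6   */
    printf("folding   : h(678912) = %d\n", folding_hash(678912));   /* 168 */
    printf("mid-square: h(60)     = %d\n", midsquare_hash(60));     /* 60  */
    return 0;
}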

Collision
When two different keys hash to the same index, the resulting problem is known as a collision. In
the earlier division-method example, the key 6 is stored at index 6. If the key value is
26, then the index would be:

h(26) = 26%10 = 6

Therefore, two values are stored at the same index, i.e., 6, and this leads to the collision problem. To
resolve these collisions, we have some techniques known as collision techniques.

The following are the collision techniques:

○ Open Hashing: It is also known as closed addressing.


○ Closed Hashing: It is also known as open addressing.

Open Hashing

In Open Hashing, one of the methods used to resolve the collision is known as a chaining method.

Let's first understand the chaining to resolve the collision.

Suppose we have a list of key values

A = 3, 2, 9, 6, 11, 13, 7, 12 where m = 10, and h(k) = 2k+3

In this case, we do not use the plain division method h(k) = k % m; instead the given hash function h(k) = (2k+3) % 10 is applied.

○ The index of key value 3 is:

index = h(3) = (2(3)+3)%10 = 9


The value 3 would be stored at the index 9.

○ The index of key value 6 is:

index = h(6) = (2(6)+3)%10 = 5

The value 6 would be stored at the index 5.

○ The index of key value 11 is:

index = h(11) = (2(11)+3)%10 = 5

The value 11 would be stored at the index 5. Now, we have two values (6, 11) stored at the same
index, i.e., 5. This leads to the collision problem, so we will use the chaining method to avoid the
collision. We will create one more list and add the value 11 to this list. After the creation of the new
list, the newly created list will be linked to the list having value 6.
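The following is a C sketch of this chaining approach for the example above, with m = 10 and h(k) = (2k + 3) % 10; colliding keys are simply linked together at the same index.

#include <stdio.h>
#include <stdlib.h>

#define M 10

struct ChainNode {
    int               key;
    struct ChainNode *next;
};

struct ChainNode *table[M];               /* one chain per index        */

int hashfn(int k) { return (2 * k + 3) % M; }

void insert(int k) {
    int idx = hashfn(k);
    struct ChainNode *n = malloc(sizeof *n);
    n->key = k;
    n->next = table[idx];                 /* new node is linked in front */
    table[idx] = n;                       /* of the chain at this index  */
}

int main(void) {
    int A[] = { 3, 2, 9, 6, 11, 13, 7, 12 };
    for (int i = 0; i < 8; i++) insert(A[i]);

    for (int i = 0; i < M; i++) {         /* print every chain; 6 and 11 */
        printf("%d:", i);                 /* end up chained at index 5   */
        for (struct ChainNode *p = table[i]; p; p = p->next)
            printf(" %d", p->key);
        printf("\n");
    }
    return 0;
}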

Closed Hashing

In Closed Hashing, three techniques are used to resolve the collision:

1. Linear probing
2. Quadratic probing
3. Double Hashing technique

Linear Probing

Linear probing is one of the forms of open addressing. Each cell in the hash table contains a
key-value pair, so when a collision occurs because a new key maps to a cell already occupied by
another key, the linear probing technique searches for the closest free location and places the new
key in that empty cell. The search is performed sequentially, starting from the position where the
collision occurred, until an empty cell is found.

Let's understand the linear probing through an example.

Consider the above example for the linear probing:

A = 3, 2, 9, 6, 11, 13, 7, 12 where m = 10, and h(k) = 2k+3

The key values 3, 2, 9, 6 are stored at the indexes 9, 7, 1, 5 respectively. The calculated index value
of 11 is 5 which is already occupied by another key value, i.e., 6. When linear probing is applied, the
nearest empty cell to the index 5 is 6; therefore, the value 11 will be added at the index 6.
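A minimal C sketch of linear probing for the same example (m = 10, h(k) = (2k + 3) % 10) is shown below; on a collision, the probe moves sequentially (wrapping around the table) until an empty cell is found, so 11 ends up at index 6 as described.

#include <stdio.h>

#define M 10
#define EMPTY -1

int table[M];

int hashfn(int k) { return (2 * k + 3) % M; }

void insert(int k) {
    int idx = hashfn(k);
    for (int i = 0; i < M; i++) {              /* probe at most M cells   */
        int probe = (idx + i) % M;             /* next cell, with wrap    */
        if (table[probe] == EMPTY) {
            table[probe] = k;
            return;
        }
    }
    printf("table full, cannot insert %d\n", k);
}

int main(void) {
    for (int i = 0; i < M; i++) table[i] = EMPTY;

    int A[] = { 3, 2, 9, 6, 11, 13, 7, 12 };
    for (int i = 0; i < 8; i++) insert(A[i]);

    for (int i = 0; i < M; i++)
        printf("%d: %d\n", i, table[i]);       /* 11 is placed at index 6 */
    return 0;
}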

Quadratic Probing

In linear probing, searching is performed linearly. In contrast, quadratic probing is an open
addressing technique that uses a quadratic polynomial for searching until an empty slot is found.
It can also be defined as a technique that inserts key ki at the first free location among
(u + i^2) % m, where u = h(ki) and i = 0, 1, ..., m-1.

Let's understand the quadratic probing through an example.

Consider the same example which we discussed in the linear probing.

A = 3, 2, 9, 6, 11, 13, 7, 12 where m = 10, and h(k) = 2k+3

The key values 3, 2, 9, 6 are stored at the indexes 9, 7, 1, 5, respectively. We do not need to apply
the quadratic probing technique on these key values as there is no occurrence of the collision.

The index value of 11 is 5, but this location is already occupied by the 6. So, we apply the quadratic
probing technique.

When i = 0:

Index = (5 + 0^2) % 10 = 5

When i = 1:

Index = (5 + 1^2) % 10 = 6

Since location 6 is empty, the value 11 will be added at index 6.

The next element is 13. When the hash function is applied to 13, the index comes out to be 9
(h(13) = (2(13)+3) % 10 = 9), which we already saw in the chaining method. At index 9, the cell is
occupied by another value, i.e., 3, so we apply quadratic probing again: for i = 0 the index is 9
(occupied), and for i = 1 the index is (9 + 1^2) % 10 = 0, which is empty, so the value 13 is stored
at index 0.
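A minimal C sketch of quadratic probing for the same example is shown below; it probes the cells (u + i^2) % m for i = 0, 1, 2, ... and places 11 at index 6 and 13 at index 0, as computed above.

#include <stdio.h>

#define M 10
#define EMPTY -1

int table[M];

int hashfn(int k) { return (2 * k + 3) % M; }

void insert(int k) {
    int u = hashfn(k);
    for (int i = 0; i < M; i++) {
        int probe = (u + i * i) % M;           /* quadratic step          */
        if (table[probe] == EMPTY) {
            table[probe] = k;
            return;
        }
    }
    printf("could not place %d\n", k);         /* probing exhausted       */
}

int main(void) {
    for (int i = 0; i < M; i++) table[i] = EMPTY;

    int A[] = { 3, 2, 9, 6, 11, 13, 7, 12 };
    for (int i = 0; i < 8; i++) insert(A[i]);

    for (int i = 0; i < M; i++)
        printf("%d: %d\n", i, table[i]);       /* 11 -> index 6, 13 -> index 0 */
    return 0;
}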

