DSA Unit-1
UNIT-1
Analysis of Algorithms: Efficiency of algorithms, A priori Analysis, Asymptotic notations.
Searching: Introduction, linear search, binary search, Fibonacci search.
Sorting: Introduction, bubble sort, insertion sort, selection sort, quick sort, merge sort.
Analysis of algorithms:
An algorithm is a step-by-step procedure to solve a particular problem. Algorithms are generally
created independent of underlying languages, i.e. an algorithm can be implemented in more than one
programming language.
Characteristics of algorithm:
1. Unambiguous: An algorithm should be clear and unambiguous. Each of its steps (or phases), and
their inputs/outputs, should be clear and must lead to only one meaning.
2. Input: An algorithm should have 0 or more well-defined inputs.
3. Output: An algorithm should have 1 or more well-defined outputs, and should match the desired
output.
4. Finiteness: An algorithm must terminate after a finite number of steps.
5. Feasibility: An algorithm should be feasible with the available resources.
Example:
Problem − Design an algorithm to add two numbers and display the result.
Step 1 − START
Step 2 − declare three integers a, b & c
Step 3 − define values of a & b
Step 4 − add values of a & b
Step 5 − store output of step 4 to c
Step 6 − print c
Step 7 − STOP
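As a minimal illustration, the steps above map directly to a small C program (the sample values of a and b are assumptions chosen for demonstration):

#include <stdio.h>

int main(void)
{
    int a, b, c;        /* Step 2 - declare three integers a, b & c */
    a = 10;             /* Step 3 - define values of a & b (sample values) */
    b = 20;
    c = a + b;          /* Steps 4 & 5 - add a & b and store the result in c */
    printf("%d\n", c);  /* Step 6 - print c */
    return 0;           /* Step 7 - STOP */
}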
We design an algorithm to get a solution to a given problem. A problem can be solved in more than
one way. Hence, many solution algorithms can be derived for a given problem. The next step is to analyze the
proposed solution algorithms and implement the best suitable one.
Efficiency of algorithm:
Efficiency of an algorithm is a measure of the average execution time necessary for an
algorithm to complete work on a set of data; that is, how efficiently the problem has been solved.
An algorithm can be analysed in two ways:

A priori analysis:
● It uses asymptotic notations to represent how much time the algorithm will take in order to
complete its execution.
● The time complexity of an algorithm using a priori analysis is the same for every system.
● If the program runs faster, the credit goes to the programmer.

A posteriori analysis:
● It doesn't use asymptotic notations to represent the time complexity of an algorithm.
● The time complexity of an algorithm using a posteriori analysis differs from system to system.
● If the time taken by the algorithm is less, then the credit goes to the compiler and hardware.
Asymptotic Analysis:
As we know, a data structure is a way of organizing data efficiently, and that efficiency is
measured either in terms of time or space. So, the ideal data structure is one that occupies the
least possible memory space and takes the least possible time to perform all of its operations. Our focus will be on finding
the time complexity rather than the space complexity, and by finding the time complexity, we can decide
which data structure is best for an algorithm. The main question that arises is: on what
basis should we compare the time complexity of data structures? The time complexity can be
compared based on the operations performed on them.
How to find the Time Complexity or running time for performing the operations?
Measuring the actual running time is not practical at all. The running time to perform
any operation depends on the size of the input. Let's understand this statement through a simple
example.
Suppose we have an array of five elements, and we want to add a new element at the beginning
of the array. To achieve this, we need to shift each element towards the right, and suppose each shift
takes one unit of time. There are five elements, so five units of time would be taken. If there were n
elements, n units of time would be needed; the running time therefore grows with the size of the input.
Big Oh Notation, Ο:
The notation Ο is the formal way to express the upper bound of an algorithm's running time. It
measures the worst-case time complexity, that is, the longest amount of time an algorithm can possibly take
to complete.
Example:
If f(n) and g(n) are two functions defined for positive integers, then f(n) = O(g(n)) (read as "f(n) is big oh
of g(n)" or "f(n) is of the order of g(n)") if there exist constants c and n0 such that:
f(n) ≤ c.g(n) for all n ≥ n0 and c > 0
This implies that f(n) does not grow faster than g(n), or g(n) is an upper bound on the function f(n). In
this case, we are calculating the growth rate of the function which eventually calculates the worst time
complexity of a function, i.e., how worst an algorithm can perform.
Let us consider
f(n) = 3n+2, g(n) = n
Can we say that f(n) = O(g(n))?
Initially we need to find c, n0 such that f(n) <= c.g(n) with c > 0 and n0 >= 1:
3n+2 <= c.n
c=1: 3n+2 <= 1n is false for every n >= 1
c=2: 3n+2 <= 2n is false for every n >= 1
c=3: 3n+2 <= 3n is false for every n >= 1
c=4: 3n+2 <= 4n => 2 <= n
So for c = 4 and n0 = 2, f(n) <= c.g(n) holds for all n >= n0; hence f(n) = O(g(n)) = O(n).
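As a quick sanity check, a few lines of C confirm the bound 3n+2 <= 4n for n >= 2 (a sketch; the upper limit of 1000 is an arbitrary assumption):

#include <stdio.h>

int main(void)
{
    /* Empirically verify f(n) = 3n+2 <= c.g(n) = 4n
       for all n >= n0 = 2, checked up to n = 1000. */
    int holds = 1;
    for (int n = 2; n <= 1000; n++)
        if (3 * n + 2 > 4 * n)
            holds = 0;
    printf(holds ? "3n+2 <= 4n holds for 2 <= n <= 1000\n"
                 : "bound violated\n");
    return 0;
}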
Omega Notation, Ω:
o It basically describes the best-case scenario, which is the opposite of big O notation.
o It is the formal way to represent the lower bound of an algorithm's running time. It measures
the best amount of time an algorithm can possibly take to complete, i.e. the best-case time
complexity.
o It determines the fastest time in which an algorithm can run.
f(n) = Ω(g(n)) if f(n) >= c.g(n) for all n >= n0, where c and n0 are constants with c > 0 and n0 >= 1.
If we require that an algorithm take at least a certain amount of time, without giving an upper bound, we
use big-Ω notation, i.e. the Greek letter "omega". It is used to bound the growth of the running time from
below for large input sizes.
If f(n) and g(n) are two functions defined for positive integers, then f(n) = Ω(g(n)) (read as "f(n) is
omega of g(n)") if there exist constants c and n0 such that:
f(n) >= c.g(n) for all n ≥ n0 and c > 0
Let us consider
f(n) = 3n+2, g(n) = n
Can we say that f(n) = Ω (g(n)) ?
Initially we need to find c, n0
f(n) >= c.g(n) c>0 and n0 >=1
3n+2 >= c.n
c=1: 3n+2 >= 1n => 2n+2 >= 0, which holds for every n >= 1
So for c = 1 and n0 = 1, f(n) >= c.g(n) holds for all n >= n0; hence f(n) = Ω(g(n)) = Ω(n).
Theta Notation, θ:
o The theta notation mainly describes the average case scenarios.
o It represents the realistic time complexity of an algorithm. An algorithm does not perform
worst or best every time; in real-world problems, algorithms mainly fluctuate between the worst
case and the best case, and this gives us the average case of the algorithm.
o Big theta is mainly used when the values of the worst case and the best case are the same.
o It is the formal way to express both the upper bound and the lower bound of an algorithm's running
time.
f(n) = θ(g(n)) if c1.g(n) <= f(n) <= c2.g(n) for all n >= n0, where c1, c2 and n0 are constants with c1, c2 > 0 and n0 >= 1.
Example:
f(n) = 3n+2, g(n) = n
Initially we need to find c1, c2, n0 such that c1.g(n) <= f(n) <= c2.g(n) for all n >= n0.
Lower bound: with c1 = 3, 3n <= 3n+2 holds for every n >= 1.
Upper bound: 3n+2 <= c2.n
c2=1: 3n+2 <= 1n is false for every n >= 1
c2=2: 3n+2 <= 2n is false for every n >= 1
c2=3: 3n+2 <= 3n is false for every n >= 1
c2=4: 3n+2 <= 4n => 2 <= 4n-3n => n >= 2
So for c1 = 3, c2 = 4 and n0 = 2, the inequality holds for all n >= n0; hence f(n) = θ(g(n)) = θ(n).
Searching
Searching techniques are used to retrieve a particular record from a list of records in an efficient
manner so that least possible time is consumed. The list of records can have a key field, which can be
used to identify a record. Therefore, the given value must be matched with the key value in order to
find a particular record. If the searching technique identifies a particular record, then the search
operation is said to be successful; otherwise, it is said to be unsuccessful.
The techniques which are used to search data in the memory of the computer are called internal
searching techniques. The techniques that are used to search data on a storage device, such as a hard
disk or a floppy disk, are called external searching techniques.
Searching techniques:
● Linear search
● Binary search
● Fibonacci search
Linear Search:
Linear search, also known as sequential search, involves searching for a data item in a given set
of data items. These data items are stored in structures, such as arrays. Each data item is checked one
by one, in the order in which it exists in the structure, to determine whether it matches the search criteria.
Advantages:
1. It is simple to implement.
2. It does not require any specific ordering before applying the method.
Disadvantages:
1. It is less efficient.
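A minimal C sketch of linear search (the function name and the convention of returning -1 on failure are assumptions):

/* Linear search: examine each element in order until the key is found. */
int linearSearch(int a[], int n, int key)
{
    for (int i = 0; i < n; i++)
        if (a[i] == key)
            return i;   /* successful search: key found at index i */
    return -1;          /* unsuccessful search: key is not in the list */
}

In the worst case every element is examined, so the running time is O(n).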
Binary search:
Binary search is a fast searching algorithm. It works on a list ordered in either ascending order
or descending order. The principle used in this search is to divide the list into two halves and compare
the key with the middle element. If the comparison results in a match, print its position. If not,
check whether the key element is greater than or less than the middle element. If the key
element is greater than the middle element, search for the key element in the second half of the list. If the
key is less than the middle element, search for the key in the first half of the list. The same process is
repeated on whichever half of the list may contain the key.
Binary search is implemented using following steps...
● Step 1 - Read the search element from the user.
● Step 2 - Find the middle element in the sorted list.
● Step 3 - Compare the search element with the middle element in the sorted list.
● Step 4 - If both are matched, then display "Given element is found!!!" and terminate the
function.
● Step 5 - If both are not matched, then check whether the search element is smaller or larger
than the middle element.
● Step 6 - If the search element is smaller than the middle element, repeat steps 2, 3, 4 and 5 for the
left sublist of the middle element.
● Step 7 - If the search element is larger than the middle element, repeat steps 2, 3, 4 and 5 for the
right sublist of the middle element.
Advantages:
1. It is an efficient technique.
binarySearch(a, n, key)
{
    low = 0, high = n - 1;
    while (low <= high)
    {
        mid = (low + high) / 2;
        if (a[mid] > key)
            high = mid - 1;     // key can only be in the left half
        else if (a[mid] < key)
            low = mid + 1;      // key can only be in the right half
        else
        {
            print "The element is found"
            return;
        }
    }
    print "Element is not found"   // reached only when the list is exhausted
}
Time complexity:
⚫ Best case - O(1)
⚫ Average case - O(log n)
⚫ Worst case – O(log n)
Fibonacci Search:
Step 1: Find the smallest Fibonacci number F(k) that is greater than or equal to n, the size of the list.
Step 2: If F(k) = 0, stop and print "element not found".
Step 3: Set offset = -1.
Step 4: Find i = min(offset+F(k-2), n-1).
Step 5: If (key == a[i]), return i and print "element found".
Step 6: If (key > a[i]), set k = k-1, offset = i and repeat steps 4 and 5.
Step 7: If (key < a[i]), set k = k-2 and repeat steps 4 and 5.
In Fibonacci search, rather than considering the mid element, we consider indices taken from the
Fibonacci series. As we know, the Fibonacci series is...
0 1 1 2 3 5 8 13 21 ...
To understand how Fibonacci search works, we will consider one example. Suppose the following is the
list of elements, and the key to search for is 40:

Index: 0   1   2   3   4   5   6
Arr[]: 10  20  30  40  50  60  70

k:     0   1   2   3   4   5   6
F(k):  0   1   1   2   3   5   8

Here n = 7. The smallest Fibonacci number greater than or equal to 7 is F(6) = 8, so k = 6 and offset = -1.
i = min(offset+F(k-2), n-1)
 = min(-1+F(4), 6)
 = min(-1+3, 6)
i = 2 => a[2] = 30
if(40==30) ->false
if(40>30) -> true
k=k-1 , offset=i
k=5, offset=2
i=min(offset+F(k-2), n-1)
= min(2+F(3),6)
= min(2+2,6)
i=min(4,6)
i=4 => a[4]=50
if(40==50) ->false
if(40>50) -> false
if(40<50) -> true
k=k-2 , offset=2
k=3
i=min(offset+F(k-2), n-1)
= min(2+F(1),6)
= min(2+1,6)
i=min(3,6)
i=3 => a[3]=40
if(40==40) ->true
so print “element is found”
Time Complexity:
⚫ Best case - O(1)
⚫ Average case - O(log n)
⚫ Worst case - O(log n)
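The steps above can be put together into a C sketch (the function name and the index bookkeeping follow the common array-based formulation; returning -1 on failure is an assumption):

/* Fibonacci search on a sorted array; returns the index of key, or -1. */
int fibonacciSearch(int a[], int n, int key)
{
    int fk2 = 0, fk1 = 1;            /* F(k-2) and F(k-1) */
    int fk = fk2 + fk1;              /* F(k) */
    while (fk < n) {                 /* Step 1: smallest F(k) >= n */
        fk2 = fk1;
        fk1 = fk;
        fk = fk2 + fk1;
    }
    int offset = -1;                 /* Step 3 */
    while (fk > 1) {
        int i = (offset + fk2 < n - 1) ? offset + fk2 : n - 1;  /* Step 4 */
        if (key > a[i]) {            /* Step 6: search to the right of i */
            fk = fk1; fk1 = fk2; fk2 = fk - fk1;
            offset = i;
        } else if (key < a[i]) {     /* Step 7: search to the left of i */
            fk = fk2; fk1 = fk1 - fk2; fk2 = fk - fk1;
        } else {
            return i;                /* Step 5: element found */
        }
    }
    if (fk1 == 1 && offset + 1 < n && a[offset + 1] == key)
        return offset + 1;           /* check the last remaining candidate */
    return -1;                       /* element is not found */
}

Tracing this code on the example above (key = 40) probes indices 2, 4 and 3, exactly as in the walkthrough.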
Sorting
Sorting is an important concept that is extensively used in the field of computer science. Sorting is
nothing but arranging the elements in some logical order. For example, suppose we want to obtain the
telephone number of a person. If the telephone directory is not arranged in alphabetical order, one has
to search from the very first page to the last page. If the directory is sorted, we can easily search for
the telephone number.
Bubble sort:
This is the simplest sorting technique compared with all the other sorting techniques.
The bubble sort technique starts by comparing the first two data items of an array and swapping
these two data items if they are not in the proper order, i.e. in descending or ascending order. This
process continues until all the data items in the array are compared, that is, the end of the array is reached.
How Bubble Sort will work?
1. Starting from the first index, compare the first and the second elements.
2. If the first element is greater than the second element, they are swapped.
3. The same process goes on for the remaining iterations. After each iteration, the largest element
among the unsorted elements is placed at the end.
4. The array is sorted when all the unsorted elements are placed at their correct positions.
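The process described above can be written as a short C sketch (the function name is an assumption; it sorts in ascending order):

/* Bubble sort: after each pass, the largest of the unsorted
   elements has moved to the end of the unsorted part. */
void bubbleSort(int a[], int n)
{
    for (int pass = 0; pass < n - 1; pass++)
        for (int i = 0; i < n - 1 - pass; i++)
            if (a[i] > a[i + 1]) {    /* not in proper order: swap in place */
                int tmp = a[i];
                a[i] = a[i + 1];
                a[i + 1] = tmp;
            }
}

Note that the swap uses only a single temporary variable, which is why the space requirement stays at a minimum, as noted under the advantages below.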
Advantages
⚫ The primary advantage of the bubble sort is that it is popular and easy to implement.
⚫ In bubble sort, elements are swapped in place without using additional temporary storage, so
the space requirement is at a minimum.
Disadvantage
⚫ The main disadvantage of the bubble sort is that it does not deal well with a list containing a
huge number of items.
Selection sort:
Selection sort is an algorithm that selects the smallest element from an unsorted list in each iteration
and places that element at the beginning of the unsorted list.
How Selection Sort Works?
1. Set the first element as ‘minimum’.
2. Compare ‘minimum’ with the second element; if the second element is smaller, assign it as the
new ‘minimum’. Continue in this way until the last element, so that ‘minimum’ finally holds the
smallest element of the unsorted list.
3. After each iteration, ‘minimum’ is placed at the front of the unsorted list.
4. For each iteration, indexing starts from the first unsorted element. Steps 1 to 3 are repeated until
all the elements are placed at their correct positions, as in the sketch below.
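A minimal C sketch of these steps (the function name is an assumption):

/* Selection sort: in each iteration, select the smallest element of
   the unsorted part a[i..n-1] and place it at the front. */
void selectionSort(int a[], int n)
{
    for (int i = 0; i < n - 1; i++) {
        int min = i;                 /* step 1: assume the first unsorted element is the minimum */
        for (int j = i + 1; j < n; j++)
            if (a[j] < a[min])
                min = j;             /* step 2: found a smaller element */
        int tmp = a[i];              /* step 3: place the minimum at the front */
        a[i] = a[min];
        a[min] = tmp;
    }
}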
Time complexity
⚫ Best case - O(n²)
⚫ Average case - O(n²)
⚫ Worst case - O(n²)
Insertion sort:
The insertion sort works very well when the number of elements is small. This technique is
similar to the way a librarian keeps books on a shelf. Initially all the books are placed on the shelf
according to their access numbers. When a student returns a book to the librarian, he compares the
access number of this book with the access numbers of all the other books and inserts it into the correct
position, so that all books remain arranged in order with respect to their access numbers.
1. The first element in the array is assumed to be sorted. Take the second element and store it
separately in ‘key’.
2. Compare ‘key’ with the first element. If the first element is greater than ‘key’, then ‘key’ is placed
in front of the first element.
3. Take the next element and compare it with each element of the sorted part to its left, shifting
every greater element one position to the right, until ‘key’ reaches its correct position.
4. Repeat steps 2 and 3 until all the elements are placed at their correct positions, as in the sketch below.
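These steps can be sketched in C as follows (the function name is an assumption):

/* Insertion sort: a[0..i-1] is already sorted; insert key = a[i]
   at its correct position by shifting greater elements to the right. */
void insertionSort(int a[], int n)
{
    for (int i = 1; i < n; i++) {
        int key = a[i];              /* store the element separately in key */
        int j = i - 1;
        while (j >= 0 && a[j] > key) {
            a[j + 1] = a[j];         /* shift the greater element one position right */
            j--;
        }
        a[j + 1] = key;              /* place key at its correct position */
    }
}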
Heap Sort:
⚫ Heap sort is one of the sorting algorithms used to arrange a list of elements in order. The heap
sort algorithm uses one of the tree concepts called a Heap Tree.
⚫ In this sorting algorithm, we use a Max Heap to arrange the list of elements in ascending order
and a Min Heap to arrange the list of elements in descending order.
What is a heap?
A heap is a special tree-based data structure that satisfies the following heap properties:
1. Shape Property: A heap is always a Complete Binary Tree, which means all levels of
the tree are fully filled, except possibly the last level, which is filled from left to right.
2. Heap Property: Every node is either greater than or equal to each of its children
(Max-Heap) or less than or equal to each of its children (Min-Heap).
⚫ Step 2: Delete root (90) from the Max Heap. To delete the root node, it is swapped with the
last node (12). After the deletion, the tree needs to be heapified to make it a Max Heap again.
⚫ Step 3: Delete root (82) from the Max Heap. To delete the root node, it is swapped with the
last node (55). After the deletion, the tree needs to be heapified to make it a Max Heap again.
⚫ Step 4: Delete root (77) from the Max Heap. To delete the root node, it is swapped with the
last node (10). After the deletion, the tree needs to be heapified to make it a Max Heap again.
⚫ Step 6: Delete root (23) from the Max Heap. To delete the root node, it is swapped with the
last node (12). After the deletion, the tree needs to be heapified to make it a Max Heap again.
⚫ Step 7: Delete root (15) from the Max Heap. To delete the root node, it is swapped with the
last node (10). After the deletion, the tree needs to be heapified to make it a Max Heap again.
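The delete-and-heapify procedure above can be sketched in C using an array-based Max Heap (the function names are assumptions):

/* Sift the element at index i down until the subtree rooted
   at i satisfies the Max Heap property (heap size is n). */
void heapify(int a[], int n, int i)
{
    int largest = i;
    int left = 2 * i + 1, right = 2 * i + 2;
    if (left < n && a[left] > a[largest])
        largest = left;
    if (right < n && a[right] > a[largest])
        largest = right;
    if (largest != i) {
        int tmp = a[i]; a[i] = a[largest]; a[largest] = tmp;
        heapify(a, n, largest);      /* continue sifting down */
    }
}

/* Heap sort: build a Max Heap, then repeatedly swap the root
   (the maximum) with the last node, shrink the heap and heapify. */
void heapSort(int a[], int n)
{
    for (int i = n / 2 - 1; i >= 0; i--)   /* build the initial Max Heap */
        heapify(a, n, i);
    for (int i = n - 1; i > 0; i--) {
        int tmp = a[0]; a[0] = a[i]; a[i] = tmp;  /* delete root: swap with last node */
        heapify(a, i, 0);                          /* re-heapify the reduced heap */
    }
}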
Advantages:
⚫ The Heap sort algorithm is widely used because of its efficiency.
Merge sort:
⚫ Merge sort is one of the most efficient sorting algorithms. It works on the principle of Divide
and Conquer.
⚫ Merge sort repeatedly breaks down a list into several sub-lists until each sub-list consists of a
single element, and then merges those sub-lists in a manner that results in a sorted list.
mergeSort(A,lb,ub)
if (lb<ub)
1. Find the middle point to divide the array into two halves:
mid = (lb+ub)/2
2. Call mergeSort for first half:
Call mergeSort(A,lb, mid)
3. Call mergeSort for second half:
Call mergeSort(A, mid+1, ub)
4. Merge the two halves sorted in step 2 and 3:
Call merge(A, lb, mid, ub)
merge(A, lb, mid, ub):
{
    i = lb;       // index into the left half A[lb..mid]
    j = mid + 1;  // index into the right half A[mid+1..ub]
    k = lb;       // index into the auxiliary array B
    while (i <= mid && j <= ub)
    {
        if (A[i] <= A[j])
        {
            B[k] = A[i];
            i++; k++;
        }
        else
        {
            B[k] = A[j];
            j++; k++;
        }
    }
    while (j <= ub)    // left half exhausted: copy the remaining right half
    {
        B[k] = A[j];
        j++; k++;
    }
    while (i <= mid)   // right half exhausted: copy the remaining left half
    {
        B[k] = A[i];
        i++; k++;
    }
    for (k = lb; k <= ub; k++)   // copy the merged result back into A
        A[k] = B[k];
}
Advantages
• It is quicker for larger lists because, unlike insertion and bubble sort, it does not go through the
whole list several times.
• It has a consistent O(n log n) running time, regardless of the initial order of the input.
Disadvantages
⚫ Slower compared to the other sorting algorithms for smaller tasks.
⚫ Uses more memory space, since it stores the sub-lists produced by splitting the initial list.
Quick sort:
⚫ Quick sort is a fast sorting algorithm used to sort a list of elements.
⚫ The quick sort algorithm attempts to separate the list of elements into two parts and then sort
each part recursively. That means it uses a divide and conquer strategy.
⚫ In quick sort, the partition of the list is performed based on the element called pivot. Here pivot
element is one of the elements in the list.
⚫ The list is divided into two partitions such that "all elements to the left of pivot are smaller than
the pivot and all elements to the right of pivot are greater than or equal to the pivot".
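A hedged C sketch of this strategy, taking the first element as the pivot (the function names and the exact partition scheme are assumptions; several variants exist):

/* Partition a[lb..ub] around the pivot (the first element): on return,
   the pivot is at its final position j, elements to its left are
   smaller, and elements to its right are greater than or equal to it. */
int partition(int a[], int lb, int ub)
{
    int pivot = a[lb];
    int i = lb, j = ub;
    while (i < j) {
        while (i < ub && a[i] <= pivot) i++;  /* find an element > pivot */
        while (a[j] > pivot) j--;             /* find an element <= pivot */
        if (i < j) { int t = a[i]; a[i] = a[j]; a[j] = t; }
    }
    a[lb] = a[j];   /* place the pivot at its final position */
    a[j] = pivot;
    return j;
}

void quickSort(int a[], int lb, int ub)
{
    if (lb < ub) {
        int p = partition(a, lb, ub);  /* pivot position */
        quickSort(a, lb, p - 1);       /* sort the left part */
        quickSort(a, p + 1, ub);       /* sort the right part */
    }
}

Usage: to sort an array of n elements, call quickSort(a, 0, n-1).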
Time complexity:
⚫ To sort an unsorted list with 'n' elements, we need to make ((n-1)+(n-2)+(n-3)+......
+1) = (n(n-1))/2 comparisons in the worst case.
⚫ The worst case occurs when each partition places the pivot at one end of the list, which is
exactly what happens for a list that is already sorted.
⚫ Worst case: O(n²)
⚫ Best case: O(n log n)
⚫ Average case: O(n log n)
Advantages
⚫ Its average-case time complexity to sort an array of n elements is O(n log n).
⚫ On the average it runs very fast, even faster than Merge Sort.
⚫ It requires no additional array storage; the sorting is done in place.
Disadvantages
⚫ Its running time can differ depending on the contents of the array.
⚫ Quick sort's running time degrades if given an array that is almost sorted (or almost reverse
sorted). Its worst-case running time, O(n²) to sort an array of n elements, happens when given
a sorted array.
⚫ It is not stable.