3 Searching and Sorting_notes
Linear search is a very simple search algorithm. In this type of search, a sequential
search is made over all items one by one. Every item is checked and if a match is found
then that particular item is returned, otherwise the search continues till the end of the
data collection.
Algorithm
Step 1: Set i to 1
Step 2: If i > n, go to Step 7
Step 3: If A[i] = x, go to Step 6
Step 4: Set i to i + 1
Step 5: Go to Step 2
Step 6: Print "Element x found at index i"; go to Step 8
Step 7: Print "Element not found"
Step 8: Exit
Pseudocode
function linear_search(list, value)
   for each item in list
      if item = value
         return the item's location
      end if
   end for
   return not found
end function
2. Binary search
Binary search is a fast search algorithm with run-time complexity of O(log n). This
search algorithm works on the principle of divide and conquer. For this algorithm to
work properly, the data collection should be in sorted form.
Binary search looks for a particular item by comparing it with the middle-most item of
the collection. If a match occurs, then the index of the item is returned. If the middle
item is smaller than the item, then the item is searched for in the sub-array to the right
of the middle item. Otherwise, the item is searched for in the sub-array to the left of the
middle item. This process continues on the sub-array as well until the size of the sub-
array reduces to zero.
Pseudocode
Procedure binary_search
   A ← sorted array
   n ← size of array
   x ← value to be searched

   Set lowerBound = 0
   Set upperBound = n - 1

   while lowerBound <= upperBound
      set midPoint = (lowerBound + upperBound) / 2
      if A[midPoint] = x
         return midPoint
      if A[midPoint] < x
         set lowerBound = midPoint + 1
      if A[midPoint] > x
         set upperBound = midPoint - 1
   end while

   return not found
end procedure
3. Interpolation search
In binary search, if the desired data is not found, then the rest of the list is divided into
two parts, lower and higher, and the search is carried out in one of them. Even when
the data is sorted, binary search does not take advantage of the values themselves to
probe the likely position of the desired data.
Interpolation search finds a particular item by computing the probe position. Initially,
the probe position is the position of the middle most item of the collection.
If a match occurs, then the index of the item is returned. To split the list into two parts,
we use the following probing formula:

pos = lo + ((x - arr[lo]) * (hi - lo)) / (arr[hi] - arr[lo])

N.B: The idea of the formula is to return a higher value of pos when the element to be
searched is closer to arr[hi], and a smaller value when it is closer to arr[lo].
If the item at the probe position is smaller than the item, then the probe position is again
calculated in the sub-array to the right of the probed item. Otherwise, the item is
searched for in the sub-array to the left of the probed item. This process continues on
the sub-array as well until the size of the sub-array reduces to zero.
Algorithm:
Step 1 − Start searching data from the probe position of the list.
Step 2 − If it is a match, return the index of the item, and exit.
Step 3 − If it is not a match, probe the position.
Step 4 − Divide the list using the probing formula and find the new middle.
Step 5 − If the data is greater than the probed item, search in the higher sub-list.
Step 6 − If the data is smaller than the probed item, search in the lower sub-list.
Step 7 − Repeat until a match is found or the sub-list reduces to zero.
Pseudocode:
A → Array list
N → Size of A
X → Target Value

Procedure Interpolation_Search()
   Set Lo → 0
   Set Mid → -1
   Set Hi → N-1

   While X does not match
      if Lo equals Hi OR A[Lo] equals A[Hi]
         EXIT: Failure, Target not found
      end if

      Set Mid = Lo + ((X - A[Lo]) * (Hi - Lo)) / (A[Hi] - A[Lo])

      if A[Mid] = X
         EXIT: Success, Target found at Mid
      else
         if A[Mid] < X
            Set Lo to Mid+1
         else if A[Mid] > X
            Set Hi to Mid-1
         end if
      end if
   End While
End Procedure
4. Jump search
Jump search is a searching algorithm for sorted arrays. The basic idea is to check
fewer elements by jumping ahead by fixed steps or skipping some elements in place of
searching all elements.
Step 1: Start from the first index.
Step 2: Jump ahead by the block size B (from the current position to current position +
B). If the current position is past the last position, go to Step 4.
Step 3: If the element at the current position is greater than or equal to the target
element, do a linear search on the elements from position (current position - B) to the
current position; otherwise go to Step 2.
Step 4: If the target was not found, report failure.
// Jump search: returns the index of x in sorted arr[0..n-1], or -1.
int jumpSearch(int arr[], int n, int x) {
    int step = sqrt(n);       // optimal block size is √n
    int prev = 0;

    // Jump ahead in blocks until we pass the block that may contain x.
    while (arr[min(step, n) - 1] < x) {
        prev = step;
        step += sqrt(n);
        if (prev >= n)
            return -1; }

    // Linear search within the identified block.
    while (arr[prev] < x) {
        prev++;
        if (prev == min(step, n))
            return -1; }

    // If element is found
    if (arr[prev] == x)
        return prev;
    return -1;
}
For example, suppose we have an array arr[] of size n and block (to be jumped) size m.
Then we search at the indexes arr[0], arr[m], arr[2m]…..arr[km] and so on.
Once we find the interval (arr[km] < x < arr[(k+1)m]), we perform a linear search
operation from the index km to find the element x.
In the worst case, we have to do N/B jumps and if the element is not present, we perform
B-1 comparisons.
Therefore, the total number of comparisons in the worst case will be ((N/B) + B-1). The
value of the function ((N/B) + B-1) will be minimum when B = √N.
✓ The optimal size of a block to be jumped is √n. This makes the time complexity
of Jump Search O(√n).
✓ The time complexity of Jump Search is between Linear Search (O(n)) and
Binary Search (O(log n)).
✓ Binary Search is better than Jump Search, but Jump Search has the advantage that
we traverse back only once (Binary Search may require up to O(log n) backward jumps).
Find the element 22 from the below array using jump search algorithm:
4 6 8 10 13 14 20 22 25 30
5. Exponential search
The idea is to start with a sub-list of size 1, compare its last element with the target
element, then try size 2, then 4, and so on, until the last element of the sub-list is greater
than or equal to the target. Once we find such a location i (after repeated doubling of
the sub-list size), we know that the element must be present between i/2 and i, and we
run a binary search on that range.
// Standard binary search on arr[l..r]; returns the index of k, or -1.
int binary_search(int arr[], int l, int r, int k) {
    while (l <= r) {
        int mid = (l + r) / 2;
        if (arr[mid] == k)
            return mid;
        else if (arr[mid] > k)
            r = mid - 1;
        else
            l = mid + 1; }
    return -1; }

// Exponential search: returns the index of k in sorted arr[0..n-1], or -1.
int exponential_search(int arr[], int n, int k) {
    if (arr[0] == k)
        return 0;
    // Double i until arr[i] (or the last element) is >= k.
    int i = 1;
    while (i <= n && arr[min(i, (n - 1))] < k) {
        i = i * 2; }
    // Binary search in the range [i/2, min(i, n-1)].
    return binary_search(arr, i / 2, min(i, (n - 1)), k); }
Applications
Exponential Binary Search is useful for unbounded searches, where the size of the array
is infinite (or unknown in advance).
It works better than Binary Search for bounded arrays when the element to be searched
is closer to the beginning of the array.
Find the element 80 from the below array using exponential search algorithm:
8 10 13 20 36 37 40 45 60 80
SORTING ALGORITHMS
Increasing Order
A sequence of values is said to be in increasing order if every next element is greater
than the previous one. For example, 1, 3, 4, 6, 8, 9 are in increasing order, as every next
element is greater than the previous one.
Decreasing Order
A sequence of values is said to be in decreasing order if every next element is less
than the current one. For example, 9, 8, 6, 4, 3, 1 are in decreasing order, as every next
element is less than the previous one.
Non-Increasing Order
A sequence of values is said to be in non-increasing order if every next element is
less than or equal to its previous element in the sequence. This order occurs when the
sequence contains duplicate values. For example, 9, 8, 6, 3, 3, 1 are in non-increasing
order, as every next element is less than or equal to (in case of 3) but not greater than
any previous element.
Non-Decreasing Order
A sequence of values is said to be in non-decreasing order if every next element is
greater than or equal to its previous element in the sequence. This order occurs when
the sequence contains duplicate values. For example, 1, 3, 3, 6, 8, 9 are in non-
decreasing order, as every next element is greater than or equal to (in case of 3) but not
less than the previous one.
The following are simple sorting algorithms used to sort small-sized lists.
✓ Insertion Sort
✓ Selection Sort
✓ Bubble Sort
1. Insertion sort
Insertion sort maintains a sorted sub-list. An element which is to be 'inserted' into this
sorted sub-list has to find its appropriate place and then be inserted there. Hence the
name, insertion sort.
The array is searched sequentially and unsorted items are moved and inserted into the
sorted sub-list (in the same array). This algorithm is not suitable for large data sets as
its average and worst case complexity are O(n²), where n is the number of items.
In Insertion sort, the list is divided into two parts: the sorted part followed by the
unsorted part.
In each step, pick the first element from the unsorted part and insert in the correct
position in the sorted part.
Initially, the sorted part is empty and finally, the unsorted part is empty.
void insertionSort(int arr[], int n) {
    for (int i = 1; i < n; i++) {
        int key, j;
        key = arr[i];
        j = i - 1;
        // Move elements of the sorted part that are greater than key
        // one position ahead of their current position.
        while (j >= 0 && arr[j] > key) {
            arr[j + 1] = arr[j];
            j = j - 1; }
        arr[j + 1] = key; } }
Example: Sort arr[] = [12, 11, 13, 5, 6]. Let us loop for i = 1 (second element of the
array) to 4 (last element of the array).
i = 1. Since 11 is smaller than 12, move 12 and insert 11 before it.
11, 12, 13, 5, 6
i = 2. 13 will remain at its position as all elements in A[0..i-1] are smaller than 13.
11, 12, 13, 5, 6
i = 3. 5 will move to the beginning and all other elements from 11 to 13 will move one
position ahead of their current position.
5, 11, 12, 13, 6
i = 4. 6 will move to position after 5, and elements from 11 to 13 will move one
position ahead of their current position.
5, 6, 11, 12, 13
✓ If we have n values in our array, Insertion Sort has a time complexity of O(n²) in
the worst case.
✓ In the best case, we already have a sorted array but we need to go through the
array at least once to be sure! Therefore, in the best case, Insertion Sort
takes O(n) time complexity.
2. Selection sort
The smallest element is selected from the unsorted array and swapped with the leftmost
element, and that element becomes a part of the sorted array. This process continues
moving unsorted array boundary by one element to the right.
This algorithm is not suitable for large data sets as its average and worst case
complexities are O(n²), where n is the number of items.
Steps:
1. Find the minimum element in the unsorted part of the list.
2. Swap it with the leftmost element of the unsorted part; the sorted part grows by
one element.
3. Repeat the steps above for the remainder of the list (starting at the second position).
Working principle (implementation) of selection sort
void selectionSort(int arr[], int n) {
    for (int i = 0; i < n - 1; i++) {
        int min_indx = i;
        for (int j = i + 1; j < n; j++) {
            if (arr[j] < arr[min_indx]) {
                min_indx = j;
            }}
        swap(arr, i, min_indx); } }

// Swap the elements at the two given indexes.
void swap(int arr[], int firstIndex, int min_indx) {
    int temp;
    temp = arr[firstIndex];
    arr[firstIndex] = arr[min_indx];
    arr[min_indx] = temp;
}
Example: Sort arr[] = [64, 25, 12, 22, 11] using selection sort
Step 1: Find the minimum element in arr[0...4] and place it at beginning of arr[0...4]
11 25 12 22 64
Step 2: Find the minimum element in arr[1...4] and place it at beginning of arr[1...4]
11 12 25 22 64
Step 3: Find the minimum element in arr[2...4] and place it at beginning of arr[2...4]
11 12 22 25 64
Step 4: Find the minimum element in arr[3...4] and place it at beginning of arr[3...4]
11 12 22 25 64
✓ To sort an array with Selection Sort, you must iterate through the array once for
every value you have in the array.
✓ If we have n values in our array, Selection Sort has a time complexity of O(n²) in
the worst case.
✓ In the best case, we already have a sorted array, but Selection Sort still scans the
whole unsorted part for every position, so it performs O(n²) comparisons regardless.
✓ Therefore, Selection Sort's best and worst case time complexity are the same.
3. Bubble sort
It works by repeatedly stepping through the list to be sorted, comparing two items at a
time and swapping them if they are in the wrong order.
The pass through the list is repeated until no swaps are needed, which means the list is
sorted.
Step 1: Compare two adjacent elements and swap them if they are not in the correct
order.
Step 2: Repeat Step 1 for every pair of adjacent elements, then repeat the whole pass
until a complete pass makes no swaps.
void bubbleSort(int arr[], int n) {
    int i, j;
    bool swapped;
    for (i = 0; i < n - 1; i++) {
        swapped = false;
        for (j = 0; j < n - i - 1; j++) {
            if (arr[j] > arr[j + 1]) {
                swap(arr, j, (j + 1));
                swapped = true; } }
        // If no two elements were swapped in this pass, the array is sorted.
        if (swapped == false)
            break; }}
Example: Sort arr[] = ( 5 1 4 2 8 )
First Pass:
( 5 1 4 2 8 ) –> ( 1 5 4 2 8 ), Here, algorithm compares the first two elements, and
swaps since 5 > 1.
( 1 5 4 2 8 ) –> ( 1 4 5 2 8 ), Swap since 5 > 4
( 1 4 5 2 8 ) –> ( 1 4 2 5 8 ), Swap since 5 > 2
( 1 4 2 5 8 ) –> ( 1 4 2 5 8 ), Now, since these elements are already in order (8 > 5),
algorithm does not swap them.
Second Pass:
( 1 4 2 5 8 ) –> ( 1 2 4 5 8 ), Swap since 4 > 2
( 1 2 4 5 8 ) –> ( 1 2 4 5 8 )
( 1 2 4 5 8 ) –> ( 1 2 4 5 8 )
Now, the array is already sorted, but our algorithm does not know if it is completed.
The algorithm needs another pass without any swap to know it is sorted.
Third Pass:
( 1 2 4 5 8 ) –> ( 1 2 4 5 8 )
( 1 2 4 5 8 ) –> ( 1 2 4 5 8 )
( 1 2 4 5 8 ) –> ( 1 2 4 5 8 )
✓ In Bubble Sort, n-1 comparisons will be done in the 1st pass, n-2 in the 2nd pass,
n-3 in the 3rd pass and so on. So the total number of comparisons will be:
Sum = (n-1) + (n-2) + ... + 1 = n(n-1)/2
T(n) = O(n²).
✓ Worst and Average Case Time Complexity: O(n²). Worst case occurs when the
array is reverse sorted.
✓ Best Case Time Complexity: O(n). Best case occurs when the array is already sorted.
ADVANCED SORTING ALGORITHMS
1. Quick Sort Algorithm
Quick Sort is also based on the concept of Divide and Conquer, just like merge
sort. But in quick sort all the heavy lifting (major work) is done
while dividing the array into subarrays, while in case of merge sort, all the real
work happens during merging the subarrays. In case of quick sort, the combine
step does absolutely nothing.
It is also called partition-exchange sort. This algorithm divides the list into three
main parts: elements less than the pivot, the pivot element itself, and elements greater
than the pivot.
Pivot element can be any element from the array, it can be the first element, the last
element or any random element. In this tutorial, we will take the rightmost element or
the last element as pivot.
begin
   Declare array A[N] to be sorted
   quick_sort(A, low, high):
   begin
      if low < high
         p = partition(A, low, high)   // pivot placed at its final index p
         quick_sort(A, low, p-1)
         quick_sort(A, p+1, high)
      end if
   end
end
For example: In the array {52, 37, 63, 14, 17, 8, 6, 25}, we take 25 as pivot. So after the first
pass, the list will be changed like this.
{6 8 17 14 25 63 37 52}
Hence after the first pass, the pivot will be set at its position, with all the
elements smaller than it on its left and all the elements larger than it on its right. Now 6 8
17 14 and 63 37 52 are considered as two separate sub-arrays, the same recursive logic
will be applied on them, and we will keep doing this until the complete array is sorted.
1. After selecting an element as pivot, which is the last index of the array in our
case, we divide the array for the first time.
2. In quick sort, we call this partitioning. It is not simple breaking down of array
into 2 subarrays, but in case of partitioning, the array elements are positioned
so that all the elements smaller than the pivot will be on the left side of the
pivot and all the elements greater than the pivot will be on the right side of it.
3. And the pivot element will be at its final sorted position.
4. The elements to the left and right, may not be sorted.
5. Then we pick subarrays, elements on the left of pivot and elements on the right
of pivot, and we perform partitioning on them by choosing a pivot in the
subarrays.
And if we keep on getting unbalanced subarrays, then the running time is the worst
case, which is O(n²).
Whereas if partitioning leads to almost equal subarrays, then the running time is the
best, with time complexity of O(n log n).
2. Merge Sort Algorithm
Merge Sort follows the rule of Divide and Conquer to sort a given set of
numbers/elements, recursively, hence consuming less time. Merge sort runs in
O(n log n) time in all the cases. Before jumping on to how merge sort works and its
implementation, let us first understand the rule of Divide and Conquer.
If we can break a single big problem into smaller sub-problems, solve the smaller sub-
problems and combine their solutions to find the solution for the original big problem,
it becomes easier to solve the whole problem.
When Britishers came to India, they saw a country with different religions living in
harmony, hardworking but naive citizens, unity in diversity, and found it difficult to
establish their empire. So, they adopted the policy of Divide and Rule. The
population of India was collectively one big problem for them; they divided the
problem into smaller problems by instigating rivalries between local kings, making
them stand against each other, and this worked very well for them.
Well, that was history, and a socio-political policy (Divide and Rule), but the idea
here is, if we can somehow divide a problem into smaller sub-problems, it becomes
easier to eventually solve the whole problem.
In Merge Sort, the given unsorted array with n elements, is divided into n subarrays,
each having one element, because a single element is always sorted in itself. Then, it
repeatedly merges these subarrays, to produce new sorted subarrays, and in the end,
one complete sorted array is produced.
How Merge Sort Works?
As we have already discussed, merge sort uses the divide-and-conquer rule to break
the problem into sub-problems, the problem in this case being sorting a given array.
In merge sort, we break the given array midway, for example if the original array had 6
elements, then merge sort will break it down into two subarrays with 3 elements each.
But breaking the original array into 2 smaller subarrays is not helping us in sorting the
array. So, we will break these subarrays into even smaller subarrays, until we have
multiple subarrays with single element in them. Now, the idea here is that an array with
a single element is already sorted, so once we break the original array into subarrays
which have only a single element, we have successfully broken down our problem into
base problems. And then we have to merge all these sorted subarrays, step by step, to
form one single sorted array.
General Algorithm
The general pseudo-code for the merge sort technique is given below.
Declare an array Arr of length N
If N=1, Arr is already sorted
If N>1,
Left = 0, right = N-1
Find middle = (left + right)/2
Call merge_sort(Arr, left, middle)    => sort first half recursively
Call merge_sort(Arr, middle+1, right) => sort second half recursively
Call merge(Arr, left, middle, right) to merge the sorted halves from the above steps
Exit
Below, we have a pictorial representation of how merge sort will sort the given array.
In merge sort we follow the following steps:
1. We take a variable p and store the starting index of our array in this. And we
take another variable r and store the last index of array in it.
2. Then we find the middle of the array using the formula (p + r)/2 and mark the
middle index as q, and break the array into two subarrays, from p to q and
from q + 1 to r index.
3. Then we divide these 2 subarrays again, just like we divided our main array
and this continues.
4. Once we have divided the main array into subarrays with single elements, then
we start merging the subarrays.
Merge Sort is quite fast, and has a time complexity of O(n log n). It is also a stable sort,
which means that "equal" elements keep their relative order in the sorted list. In
this section we will understand why the running time for merge sort is O(n log n).
3. Heap Sort Algorithm
Heap Sort is one of the best sorting methods, being in-place and with no quadratic
worst-case running time. Heap sort involves building a Heap data structure from the
given array and then utilizing the Heap to sort the array.
You must be wondering, how converting an array of numbers into a heap data
structure will help in sorting the array. To understand this, let's start by understanding
what is a Heap.
What is a Heap?
Heap is a special tree-based data structure, that satisfies the following special heap
properties:
1. Shape Property: Heap data structure is always a Complete Binary Tree, which
means all levels of the tree are fully filled, except possibly the last level, which is
filled from left to right.
2. Heap Property: Every node is either greater than or equal to, or less than or
equal to, each of its children. If the parent nodes are greater than their child
nodes, the heap is called a Max-Heap, and if the parent nodes are smaller than their
child nodes, the heap is called a Min-Heap.
Creating a Heap of the unsorted list/array.
Then a sorted array is created by repeatedly removing the largest/smallest element from
the heap, and inserting it into the array. The heap is reconstructed after each removal.
Initially on receiving an unsorted list, the first step in heap sort is to create a Heap data
structure (Max-Heap or Min-Heap). Once heap is built, the first element of the Heap is
either largest or smallest (depending upon Max-Heap or Min-Heap), so we put the first
element of the heap in our array. Then we again make a heap using the remaining
elements, to again pick the first element of the heap and put it into the array. We keep
on doing the same repeatedly until we have the complete sorted list in our array.
In the below algorithm, initially heapsort() function is called, which calls heapify() to
build the heap.
4. Shell Sort Algorithm
Shell sort is often termed as an improvement over insertion sort. In insertion sort, we
take increments of 1 to compare elements and put them in their proper position.
In shell sort, the list is sorted by breaking it down into a number of smaller sub lists. It’s
not necessary that the lists need to be with contiguous elements. Instead, shell sort
technique uses increment i, which is also called “gap” and uses it to create a list of
elements that are “i” elements apart.
Time Complexity: The worst-case time complexity of shell sort with this gap sequence
is O(n²). The gap is reduced by half in every iteration.
shell_sort (A, N)
begin
   set gap = N
   set flag = 1
   while gap > 1 or flag = 1
   begin
      set gap = (gap + 1) / 2
      set flag = 0
      for i = 0 to N - gap - 1
      begin
         if A[i + gap] < A[i]
            swap A[i + gap] and A[i]
            set flag = 1
         end if
      end for
   end
end