What is a Sorting Algorithm in Data Structures?

In simple words, sorting refers to arranging data in a particular format. Sorting algorithms in data structures are the methods used to arrange data in a particular order, most commonly numerical or lexicographical order.
When we have a large amount of data, it can be difficult to deal with, especially when it is arranged randomly. In such cases, sorting becomes crucial. Sorting data makes searching easier, helps organize data so it is easier to work with, and can even speed up processing for large amounts of data.
Types of Sorting Algorithms in Data Structures
1. In-place Sorting and Not-in-place Sorting
Sorting algorithms may require some extra space for comparison and temporary storage of a few data elements. There are two main types of sorting algorithms based on how they rearrange elements in memory:
 In-place Sorting
These algorithms rearrange the elements within the array being sorted, without
using any additional space or memory. In other words, the algorithm operates
directly on the input array, modifying its content in place.
 Not-in-place Sorting
These algorithms require additional space, at least proportional to the number of elements being sorted, to store intermediate results.

2. Stable and Unstable Sorting Algorithm


 Stable Sorting Algorithm
It maintains the relative order of elements with equal values in the original data set, i.e., if two elements have the same value, the algorithm keeps them in the same order in which they appear in the input. Some examples of stable sorting algorithms are Merge Sort, Insertion Sort, and Tim Sort.
 Unstable Sorting Algorithm
These algorithms do not guarantee that the relative order of elements with equal values is preserved, i.e., elements with the same value may appear in the sorted output in a different order than in the original data set. Some examples of unstable sorting algorithms are Quick Sort, Heap Sort, and Selection Sort.
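To make stability concrete, here is a small C sketch (the records and their tags are illustrative assumptions, not from the text above): insertion sort, a stable algorithm, keeps the two records with key 5 in their original relative order, which an unstable algorithm such as selection sort need not do.

#include <stdio.h>

/* A record with a sort key and a tag marking its original position. */
struct rec { int key; char tag; };

/* Insertion sort is stable: an element is only moved past strictly
   greater keys, so equal keys keep their original relative order. */
void insertion_sort(struct rec a[], int n)
{
    for (int i = 1; i < n; i++) {
        struct rec x = a[i];
        int j = i - 1;
        while (j >= 0 && a[j].key > x.key) {  /* strict '>' preserves ties */
            a[j + 1] = a[j];
            j--;
        }
        a[j + 1] = x;
    }
}

int main(void)
{
    /* Two records share key 5; the tags show their original order. */
    struct rec a[] = {{5, 'a'}, {3, 'b'}, {5, 'c'}, {1, 'd'}};
    insertion_sort(a, 4);
    for (int i = 0; i < 4; i++)
        printf("(%d,%c) ", a[i].key, a[i].tag);  /* (1,d) (3,b) (5,a) (5,c) */
    printf("\n");
    return 0;
}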

3. Adaptive and Non-Adaptive Sorting Algorithm


 Adaptive Sorting Algorithm
These algorithms are adaptive because they can take advantage of the partially
sorted nature of some input data to reduce the amount of work required to sort
the data. For example, suppose a portion of the input data is already sorted. In
that case, an adaptive sorting algorithm can skip over that portion of the data
and focus on sorting the remaining elements. Some examples are insertion sort,
bubble sort, and quicksort.
 Non-Adaptive Sorting Algorithm
These algorithms do not take advantage of any partial ordering in the input data; they process every element as if the data were completely unordered. Some examples are selection sort, merge sort, and heap sort.
4. Comparison and non-comparison-based Sorting Algorithm
 Comparison-based Sorting Algorithms
These compare elements of the data set and determine their order based on the
result of the comparison. Examples include bubble sort, insertion sort,
quicksort, merge sort, and heap sort.
 Non-Comparison based Sorting Algorithms
These algorithms sort the data without comparing elements with each other. Examples include radix sort, bucket sort, and counting sort (sketched below).
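To make the distinction concrete, here is a minimal C sketch of counting sort (the sample array and the key range k = 10 are illustrative assumptions): it never compares two elements with each other; it counts keys and computes final positions from the counts.

#include <stdio.h>
#include <string.h>

/* Counting sort for values in the range 0..k-1: count occurrences,
   take prefix sums, then place each element at its final position.
   No element is ever compared with another element. */
void counting_sort(int a[], int n, int k)
{
    int count[k];        /* C99 variable-length arrays */
    int output[n];
    memset(count, 0, sizeof count);

    for (int i = 0; i < n; i++)         /* count each key */
        count[a[i]]++;
    for (int i = 1; i < k; i++)         /* prefix sums -> final positions */
        count[i] += count[i - 1];
    for (int i = n - 1; i >= 0; i--)    /* right-to-left scan keeps it stable */
        output[--count[a[i]]] = a[i];
    memcpy(a, output, n * sizeof a[0]);
}

int main(void)
{
    int a[] = {4, 2, 2, 8, 3, 3, 1};    /* sample data (assumed) */
    int n = sizeof a / sizeof a[0];
    counting_sort(a, n, 10);
    for (int i = 0; i < n; i++)
        printf("%d ", a[i]);            /* prints: 1 2 2 3 3 4 8 */
    printf("\n");
    return 0;
}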

5. Internal and External Sorting Algorithms


 Internal Sorting Algorithms
If the data sorting process takes place entirely within the Random-Access
Memory (RAM) of a computer, it’s called internal sorting. This is
possible whenever the size of the dataset to be sorted is small enough to
be held in RAM.
 External Sorting Algorithms
For sorting larger datasets, it may be necessary to hold only a smaller
chunk of data in memory at a time, since it won’t all fit in the RAM. The
rest of the data is normally held on some larger, but slower medium, like
a hard disk. The sorting of these large datasets will require different sets
of algorithms which are called external sorting.

Radix Sort

Radix sort is a linear-time sorting algorithm for integers. It sorts the elements digit by digit, starting from the least significant digit and moving to the most significant digit.

The process of radix sort is similar to sorting students' names alphabetically. In that case, there are 26 buckets (radixes), one for each letter of the English alphabet. In the first pass, the names are grouped in ascending order of the first letter of each name. In the second pass, the names are grouped in ascending order of the second letter, and the process continues until the list is sorted.
Algorithm

radixSort(arr)
    max = largest element in the given array
    d = number of digits in max
    create 10 buckets, one for each digit 0 - 9
    for i -> 1 to d
        sort the array elements using counting sort (or any stable sort)
        according to the digits at the i-th place
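A minimal C sketch of this pseudocode for base-10, non-negative integers (the sample array in main is an illustrative assumption, not taken from the figures): each pass reuses the counting-sort idea shown earlier, keyed on the digit at one place value, starting from the units place.

#include <stdio.h>

/* One stable counting-sort pass keyed on the digit at place value 'exp'
   (1 = units, 10 = tens, 100 = hundreds, ...). */
static void counting_pass(int a[], int n, int exp)
{
    int output[n];
    int count[10] = {0};

    for (int i = 0; i < n; i++)
        count[(a[i] / exp) % 10]++;
    for (int d = 1; d < 10; d++)
        count[d] += count[d - 1];
    for (int i = n - 1; i >= 0; i--)        /* right-to-left keeps it stable */
        output[--count[(a[i] / exp) % 10]] = a[i];
    for (int i = 0; i < n; i++)
        a[i] = output[i];
}

/* Radix sort: one stable pass per digit, least significant digit first. */
void radix_sort(int a[], int n)
{
    int max = a[0];
    for (int i = 1; i < n; i++)
        if (a[i] > max)
            max = a[i];
    for (int exp = 1; max / exp > 0; exp *= 10)
        counting_pass(a, n, exp);
}

int main(void)
{
    int a[] = {329, 457, 657, 839, 436, 720, 355};  /* sample data (assumed) */
    int n = sizeof a / sizeof a[0];
    radix_sort(a, n);
    for (int i = 0; i < n; i++)
        printf("%d ", a[i]);
    printf("\n");
    return 0;
}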

Working of Radix sort Algorithm

o First, find the largest element (say max) in the given array, and let 'x' be the number of digits in max. We compute 'x' because we have to go through the significant places of all elements.
o After that, go through each significant place one by one, using a stable sorting algorithm to sort the digits at each significant place.

Now let's see the working of radix sort in detail by using an example.

In the given array, the largest element is 736, which has 3 digits. So the loop will run three times (i.e., up to the hundreds place), meaning three passes are required to sort the array.

Now, first sort the elements on the basis of the digit at the units place (i.e., x = 0). Here, we use the counting sort algorithm to sort the elements.

Pass 1:

In the first pass, the list is sorted on the basis of the digits at the units place.
After the first pass, the array elements are -

Pass 2:

In this pass, the list is sorted on the basis of the next significant digits (i.e., the digits at the tens place).
After the second pass, the array elements are -

Pass 3:

In this pass, the list is sorted on the basis of the next significant digits (i.e., the digits at the hundreds place).
After the third pass, the array elements are -

Now, the array is sorted in ascending order.

Radix sort complexity

Now, let's see the time complexity of Radix sort in best case, average case, and
worst case. We will also see the space complexity of Radix sort.

1. Time Complexity

o Best Case Complexity - It occurs when there is no sorting required, i.e., the array is already sorted. The best-case time complexity of radix sort is Ω(n + k).
o Average Case Complexity - It occurs when the array elements are in a jumbled order, neither properly ascending nor properly descending. The average-case time complexity of radix sort is θ(nk).
o Worst Case Complexity - It occurs when the array elements have to be sorted in reverse order, e.g., you have to sort the elements in ascending order, but they are given in descending order. The worst-case time complexity of radix sort is O(nk).
 Radix sort is a non-comparative sorting algorithm. Its running time is linear in nk, which is better than the O(n log n) bound of comparison-based algorithms.

2. Space Complexity

o The space complexity of Radix sort is O(n + k).

Quick Sort

Quick sort is a way of arranging items in a systematic manner. It is a fast and highly efficient sorting algorithm that follows the divide and conquer approach.

 Divide and conquer is a technique of breaking an algorithm down into subproblems, solving the subproblems, and combining the results to solve the original problem.

Divide: In Divide, first pick a pivot element. After that, partition or rearrange
the array into two sub-arrays such that each element in the left sub-array is less
than or equal to the pivot element and each element in the right sub-array is
larger than the pivot element.

Conquer: Recursively, sort two subarrays with Quicksort.

Combine: Combine the already sorted sub-arrays.

 Quicksort picks an element as pivot, and then it partitions the given array
around the picked pivot element. In quick sort, a large array is divided into
two arrays in which one holds values that are smaller than the specified
value (Pivot), and another array holds the values that are greater than the
pivot.

After that, the left and right sub-arrays are partitioned using the same approach. This continues until only a single element remains in each sub-array.
Choosing the pivot

Picking a good pivot is necessary for a fast implementation of quicksort. However, choosing a good pivot is not trivial. Some of the ways of choosing a pivot are as follows -

o The pivot can be random, i.e., select a random element of the given array as the pivot.
o The pivot can be either the rightmost or the leftmost element of the given array.
o Select the median as the pivot element.

Algorithm

QUICKSORT(array A, start, end)
{
    if (start < end)
    {
        p = partition(A, start, end)
        QUICKSORT(A, start, p - 1)
        QUICKSORT(A, p + 1, end)
    }
}
Partition Algorithm:

The partition algorithm rearranges the sub-array in place.

PARTITION(array A, start, end)
{
    pivot ← A[end]
    i ← start - 1
    for j ← start to end - 1 {
        if (A[j] < pivot) {
            i ← i + 1
            swap A[i] with A[j]
        }
    }
    swap A[i + 1] with A[end]
    return i + 1
}
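A minimal C sketch of the two routines above, assuming the last element is chosen as the pivot. Note that the step-by-step walkthrough below uses a different two-pointer partition scheme with the leftmost element as pivot; the sample array in main is an assumption.

#include <stdio.h>

static void swap(int *x, int *y) { int t = *x; *x = *y; *y = t; }

/* Lomuto partition, as in the pseudocode: pivot = a[end]; elements
   smaller than the pivot are moved to the left side; returns the
   pivot's final index. */
static int partition(int a[], int start, int end)
{
    int pivot = a[end];
    int i = start - 1;
    for (int j = start; j < end; j++) {
        if (a[j] < pivot) {
            i++;
            swap(&a[i], &a[j]);
        }
    }
    swap(&a[i + 1], &a[end]);
    return i + 1;
}

void quicksort(int a[], int start, int end)
{
    if (start < end) {
        int p = partition(a, start, end);
        quicksort(a, start, p - 1);   /* sort the left sub-array  */
        quicksort(a, p + 1, end);     /* sort the right sub-array */
    }
}

int main(void)
{
    int a[] = {24, 9, 29, 14, 19, 27};   /* sample data (assumed) */
    int n = sizeof a / sizeof a[0];
    quicksort(a, 0, n - 1);
    for (int i = 0; i < n; i++)
        printf("%d ", a[i]);
    printf("\n");
    return 0;
}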

Working of Quick Sort Algorithm

Let the elements of the array be as shown in the figure.

In the given array, we consider the leftmost element as the pivot. So, in this case, a[left] = 24, a[right] = 27 and a[pivot] = 24.

Since the pivot is at the left, the algorithm starts from the right and moves towards the left. Now, a[pivot] < a[right], so the algorithm moves one position towards the left, i.e. -

Now, a[left] = 24, a[right] = 19, and a[pivot] = 24.

Because a[pivot] > a[right], the algorithm swaps a[pivot] with a[right], and the pivot moves to the right, as -

Now, a[left] = 19, a[right] = 24, and a[pivot] = 24. Since the pivot is at the right, the algorithm starts from the left and moves to the right.

As a[pivot] > a[left], the algorithm moves one position to the right, as -

Now, a[left] = 9, a[right] = 24, and a[pivot] = 24. As a[pivot] > a[left], the algorithm moves one position to the right, as -

Now, a[left] = 29, a[right] = 24, and a[pivot] = 24. As a[pivot] < a[left], swap a[pivot] and a[left]; now the pivot is at the left, i.e. -

Since the pivot is at the left, the algorithm starts from the right and moves to the left. Now, a[left] = 24, a[right] = 29, and a[pivot] = 24. As a[pivot] < a[right], the algorithm moves one position to the left, as -

Now, a[pivot] = 24, a[left] = 24, and a[right] = 14. As a[pivot] > a[right], swap a[pivot] and a[right]; now the pivot is at the right, i.e. -

Now, a[pivot] = 24, a[left] = 14, and a[right] = 24. The pivot is at the right, so the algorithm starts from the left and moves to the right.

Now, a[pivot] = 24, a[left] = 24, and a[right] = 24. So pivot, left and right all point to the same element, which marks the termination of the procedure.

Element 24, the pivot element, is now placed at its exact position: the elements to the right of 24 are greater than it, and the elements to the left of 24 are smaller than it.

Now, in a similar manner, the quick sort algorithm is applied separately to the left and right sub-arrays. After sorting is done, the array will be -

Quicksort complexity

Now, let's see the time complexity of quicksort in best case, average case, and
in worst case. We will also see the space complexity of quicksort.

1. Time Complexity

o Best Case Complexity - In quicksort, the best case occurs when the pivot element is the middle element or close to the middle element. The best-case time complexity of quicksort is O(n log n).
o Average Case Complexity - It occurs when the array elements are in a jumbled order, neither properly ascending nor properly descending. The average-case time complexity of quicksort is O(n log n).
o Worst Case Complexity - In quicksort, the worst case occurs when the pivot element is always the greatest or the smallest element. For example, if the pivot element is always the last element of the array, the worst case occurs when the given array is already sorted in ascending or descending order. The worst-case time complexity of quicksort is O(n²).

Though the worst-case complexity of quicksort is higher than that of other sorting algorithms such as merge sort and heap sort, it is faster in practice. The worst case rarely occurs, because the choice of pivot can be varied; it can be avoided by choosing the right pivot element.
2. Space Complexity

o The space complexity of quicksort is O(log n) on average for the recursion stack, and O(n) in the worst case.

Heap Sort

Heap sort processes the elements by creating a min-heap or max-heap from the elements of the given array. A min-heap or max-heap is an ordering of the array in which the root element holds the minimum or maximum element of the array, respectively.

Heap sort repeatedly performs two main operations -

o Build a heap H from the elements of the array.
o Repeatedly delete the root element of the heap formed in the first phase.

What is a heap?

A heap is a complete binary tree. A binary tree is a tree in which each node can have at most two children, and a complete binary tree is a binary tree in which all levels except the last are completely filled and all nodes are left-justified.

What is heap sort?

Heapsort is a popular and efficient sorting algorithm. The idea of heap sort is to remove the elements one by one from the heap part of the list and insert them into the sorted part of the list.

 Heapsort is an in-place sorting algorithm.

Algorithm

HeapSort(arr)
    BuildMaxHeap(arr)
    for i = length(arr) down to 2
        swap arr[1] with arr[i]
        heap_size[arr] = heap_size[arr] - 1
        MaxHeapify(arr, 1)
End

BuildMaxHeap(arr)

BuildMaxHeap(arr)
    heap_size(arr) = length(arr)
    for i = length(arr)/2 down to 1
        MaxHeapify(arr, i)
End

MaxHeapify(arr, i)

MaxHeapify(arr, i)
    L = left(i)
    R = right(i)
    if L ≤ heap_size[arr] and arr[L] > arr[i]
        largest = L
    else
        largest = i
    if R ≤ heap_size[arr] and arr[R] > arr[largest]
        largest = R
    if largest != i
        swap arr[i] with arr[largest]
        MaxHeapify(arr, largest)
End
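A minimal C sketch of the pseudocode above, translated to 0-indexed C arrays (the pseudocode is 1-indexed); the sample values are taken from the walkthrough that follows.

#include <stdio.h>

static void swap(int *x, int *y) { int t = *x; *x = *y; *y = t; }

/* Sift the element at index i down until the subtree rooted at i
   satisfies the max-heap property (heap occupies a[0..size-1]). */
static void max_heapify(int a[], int size, int i)
{
    int largest = i;
    int l = 2 * i + 1, r = 2 * i + 2;
    if (l < size && a[l] > a[largest]) largest = l;
    if (r < size && a[r] > a[largest]) largest = r;
    if (largest != i) {
        swap(&a[i], &a[largest]);
        max_heapify(a, size, largest);
    }
}

void heap_sort(int a[], int n)
{
    /* Phase 1: build a max heap from the unsorted array. */
    for (int i = n / 2 - 1; i >= 0; i--)
        max_heapify(a, n, i);
    /* Phase 2: repeatedly move the root (the maximum) to the end
       and re-heapify the shrunken heap. */
    for (int i = n - 1; i >= 1; i--) {
        swap(&a[0], &a[i]);
        max_heapify(a, i, 0);
    }
}

int main(void)
{
    int a[] = {89, 81, 76, 54, 22, 14, 11, 9};  /* values from the example */
    int n = sizeof a / sizeof a[0];
    heap_sort(a, n);
    for (int i = 0; i < n; i++)
        printf("%d ", a[i]);
    printf("\n");
    return 0;
}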

Working of Heap sort Algorithm

In heap sort, there are basically two phases involved in sorting the elements. Using the heap sort algorithm, they are as follows -

o The first step is the creation of a heap by adjusting the elements of the array.
o After the creation of the heap, repeatedly remove the root element by swapping it to the end of the array, and then restore the heap structure with the remaining elements.

Now let's see the working of heap sort in detail using an example.
First, we have to construct a heap from the given array and convert it into a max heap.

After converting the given heap into a max heap, the array elements are -

Next, we have to delete the root element (89) from the max heap. To delete this node, we swap it with the last node, i.e. (11). After deleting the root element, we again heapify to restore the max heap.

After swapping the array element 89 with 11 and converting the heap into a max heap, the elements of the array are -

In the next step, we again delete the root element (81) from the max heap. To delete this node, we swap it with the last node, i.e. (54), and then heapify to restore the max heap.

After swapping the array element 81 with 54 and converting the heap into a max heap, the elements of the array are -

In the next step, we delete the root element (76) from the max heap. To delete this node, we swap it with the last node, i.e. (9), and then heapify to restore the max heap.

After swapping the array element 76 with 9 and converting the heap into a max heap, the elements of the array are -

In the next step, we again delete the root element (54) from the max heap. To delete this node, we swap it with the last node, i.e. (14), and then heapify to restore the max heap.

After swapping the array element 54 with 14 and converting the heap into a max heap, the elements of the array are -

In the next step, we again delete the root element (22) from the max heap. To delete this node, we swap it with the last node, i.e. (11), and then heapify to restore the max heap.

After swapping the array element 22 with 11 and converting the heap into a max heap, the elements of the array are -

In the next step, we again delete the root element (14) from the max heap. To delete this node, we swap it with the last node, i.e. (9), and then heapify to restore the max heap.

After swapping the array element 14 with 9 and converting the heap into a max heap, the elements of the array are -

In the next step, we again delete the root element (11) from the max heap. To delete this node, we swap it with the last node, i.e. (9).

After swapping the array element 11 with 9, the elements of the array are -

Now, the heap has only one element left. After deleting it, the heap will be empty.
After completion of sorting, the array elements are -

Now, the array is completely sorted.

Heap sort complexity

1. Time Complexity

o Best Case Complexity - It occurs when there is no sorting required, i.e., the array is already sorted. The best-case time complexity of heap sort is O(n log n).
o Average Case Complexity - It occurs when the array elements are in a jumbled order, neither properly ascending nor properly descending. The average-case time complexity of heap sort is O(n log n).
o Worst Case Complexity - It occurs when the array elements have to be sorted in reverse order, e.g., the elements must be sorted in ascending order but are given in descending order. The worst-case time complexity of heap sort is O(n log n).

The time complexity of heap sort is O(n log n) in all three cases (best, average, and worst). This is because the height of a complete binary tree with n elements is log n, and each of the n root deletions costs at most that height.

2. Space Complexity

o The space complexity of Heap sort is O(1).

// The construction and deletion of max and min heaps and the heapify method were explained in detail in the classroom; refer to those notes.

Merge Sort

Merge sort is a sorting technique that follows the divide and conquer approach. Like quicksort, it uses divide and conquer to sort the elements, and it is one of the most popular and efficient sorting algorithms. It divides the given list into two equal halves, calls itself recursively on the two halves, and then merges the two sorted halves. We have to define a merge() function to perform the merging.
The sub-lists are divided again and again into halves until lists of single elements are reached. Then the pairs of one-element lists are combined into two-element lists, sorting them in the process. The sorted two-element lists are merged into four-element lists, and so on, until the fully sorted list is obtained.

Algorithm
In the following algorithm, arr is the given array, beg is the index of the first element, and end is the index of the last element of the array.

MERGE_SORT(arr, beg, end)
    if beg < end
        set mid = (beg + end)/2
        MERGE_SORT(arr, beg, mid)
        MERGE_SORT(arr, mid + 1, end)
        MERGE(arr, beg, mid, end)
    end of if
END MERGE_SORT
The important part of merge sort is the MERGE function. This function merges the two sorted sub-arrays A[beg…mid] and A[mid+1…end] to build one sorted array A[beg…end]. So the inputs of the MERGE function are A[], beg, mid, and end.
The implementation of the MERGE function is given as follows -
/* Function to merge the sorted subarrays of a[] */
void merge(int a[], int beg, int mid, int end)
{
    int i, j, k;
    int n1 = mid - beg + 1;
    int n2 = end - mid;

    int LeftArray[n1], RightArray[n2];  /* temporary arrays (C99 VLAs) */

    /* copy data to the temp arrays */
    for (i = 0; i < n1; i++)
        LeftArray[i] = a[beg + i];
    for (j = 0; j < n2; j++)
        RightArray[j] = a[mid + 1 + j];

    i = 0;      /* initial index of the first sub-array  */
    j = 0;      /* initial index of the second sub-array */
    k = beg;    /* initial index of the merged sub-array */

    /* merge the temp arrays back into a[beg..end] */
    while (i < n1 && j < n2)
    {
        if (LeftArray[i] <= RightArray[j])
        {
            a[k] = LeftArray[i];
            i++;
        }
        else
        {
            a[k] = RightArray[j];
            j++;
        }
        k++;
    }

    /* copy any remaining elements of LeftArray */
    while (i < n1)
    {
        a[k] = LeftArray[i];
        i++;
        k++;
    }

    /* copy any remaining elements of RightArray */
    while (j < n2)
    {
        a[k] = RightArray[j];
        j++;
        k++;
    }
}
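The recursive driver that calls merge() is only given as pseudocode above; here is a minimal C sketch of it together with a usage example (the sample array is an assumption, chosen to match the eight-element walkthrough below).

#include <stdio.h>

/* Recursive driver corresponding to the MERGE_SORT pseudocode above. */
void mergeSort(int a[], int beg, int end)
{
    if (beg < end) {
        int mid = (beg + end) / 2;
        mergeSort(a, beg, mid);       /* sort the left half  */
        mergeSort(a, mid + 1, end);   /* sort the right half */
        merge(a, beg, mid, end);      /* merge the two sorted halves */
    }
}

int main(void)
{
    int a[] = {12, 31, 25, 8, 32, 17, 40, 42};  /* matches the walkthrough */
    int n = sizeof a / sizeof a[0];
    mergeSort(a, 0, n - 1);
    for (int i = 0; i < n; i++)
        printf("%d ", a[i]);                    /* 8 12 17 25 31 32 40 42 */
    printf("\n");
    return 0;
}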
Working of Merge sort Algorithm

Let the elements of the array be as shown in the figure -

According to the merge sort, first divide the given array into two equal halves.
Merge sort keeps dividing the list into equal parts until it cannot be further
divided.
As there are eight elements in the given array, so it is divided into two arrays of
size 4.
Now, again divide these two arrays into halves. As they are of size 4, so divide
them into new arrays of size 2.

Now, again divide these arrays to get the atomic value that cannot be further
divided.

Now, combine them in the same manner they were broken.


When combining, first compare the elements of each array and then combine them into another array in sorted order.
So, first compare 12 and 31; both are already in sorted positions. Then compare 25 and 8, and in the two-value list, put 8 first, followed by 25. Then compare 32 and 17, sort them, and put 17 first, followed by 32. After that, compare 40 and 42 and place them sequentially.

In the next iteration of combining, compare the arrays of two data values and merge them into arrays of four values in sorted order.

Now, there is a final merging of the arrays. After the final merging of above
arrays, the array will look like -

Now, the array is completely sorted.


Merge sort complexity
1. Time Complexity
o Best Case Complexity - It occurs when there is no sorting required, i.e., the array is already sorted. The best-case time complexity of merge sort is O(n log n).
o Average Case Complexity - It occurs when the array elements are in a jumbled order, neither properly ascending nor properly descending. The average-case time complexity of merge sort is O(n log n).
o Worst Case Complexity - It occurs when the array elements have to be sorted in reverse order, e.g., the elements must be sorted in ascending order but are given in descending order. The worst-case time complexity of merge sort is O(n log n).
2. Space Complexity
o The space complexity of merge sort is O(n), because merge sort needs temporary arrays to hold the two halves while merging.

Tournament Tree (Winner Tree)

What is a tournament tree?


A tournament tree is a form of complete binary tree in which each node denotes a player. The n external nodes (leaves) represent the players, and the n-1 internal nodes represent the winners (or losers) of the matches between them. It is also referred to as a selection tree.

Some of the important properties of the tournament tree are listed below:
a. The value of every internal node is always equal to the value of one of its children.
b. A tournament tree can have holes: a tournament tree with fewer than 2^(n+1) - 1 nodes contains holes. A hole represents the absence of a player or team and can be anywhere in the tree.
c. Every node in a tournament tree is linked to its predecessor and successor, and unique paths exist between all nodes.
d. It is a type of binary heap (min or max heap).
e. The root node represents the winner of the tournament.
f. To find the winner (or loser, i.e., the best player) of the tournament, we need N-1 comparisons.

Types of tournament trees:


There exist a loser and a winner in every match. So, there are two methods to
represent both ideas:
1. Winner Tree, and
2. Loser Tree
What is a winner tree?
In a tournament tree, when the internal nodes represent the winners of the matches, the tree obtained is referred to as a winner tree. Each internal node stores either the smaller or the greater of its children, depending on the winning criterion.
When the winner is the smaller value, the winner tree is referred to as a minimum winner tree, and when the winner is the larger value, it is referred to as a maximum winner tree.
The tournament's winner is always the smallest or the greatest of all the players or values and can be found in O(1). Building the winner tree takes N-1 comparisons, i.e., O(N), and after a leaf is replaced the new winner can be recomputed along one root-to-leaf path in O(log N) time, where N is the number of players.

Q. Eight players participate in a tournament where a number represents


each player, from 1 to 8. The pairings of these players are given below:
Group A: 1, 3, 5, 7 (Matches: 1 - 7 and 3 - 5)
Group B: 2, 4, 6, 8 (Matches: 2 - 6, and 4 - 8)
Winning Criteria - The player having the largest value wins the match.
Represent the winner tree (maximum winner tree) for this tournament
tree.
The winner tree (maximum winner tree) is represented below in the figure.
Here, the tree formed is a max heap: the value of every parent is equal to or larger than its children, and the winner is 8, the largest of all the players.

From this we can observe that every tournament tree is a binary heap, but the reverse is not true: in a tournament tree the winner is always equal to one of its children, whereas in a general binary heap the parent need not be equal to either of its children.
What is a loser tree?
In a tournament tree, when the internal nodes represent the losers of the matches between two players, the tree obtained is referred to as a loser tree.
When the loser is the smaller value, the loser tree is referred to as a minimum loser tree, and when the loser is the larger value, it is referred to as a maximum loser tree.
The same idea applies here: the loser (the parent) is always equal to one of its children, and the overall loser is always the greatest or smallest of all the players and can be found in O(1). As with the winner tree, building the loser tree takes O(N), and updating it after a replacement takes O(log N), where N is the number of players.
Q. Eight players participate in a tournament where a number represents
each player, from 1 to 8. The pairings of these players are given below:
Group A: 1, 3, 5, 7 (Matches: 1 - 7 and 3 - 5)
Group B: 2, 4, 6, 8 (Matches: 2 - 6, and 4 - 8)
Winning Criteria - The player having the largest value wins the match.
Represent the loser tree (minimum loser tree) for this tournament tree.
The loser tree (minimum loser tree) obtained is shown in the given figure:

Here, the tree obtained is a min-heap, the value of the parent is either smaller or
equal to its children, and 1 is the overall loser of the tournament.
Finding the second-best player:
In the winner tree, the winner is found at the root of the tree. Now, the problem is to find the second-best player in the tournament. Note that the second-best player can only have been beaten by the best player, so to find the second-best player we only need to re-examine the matches (comparisons) played by the best player. Let us see this idea with an example.
Here, 8 is the tournament winner, and when we look among the players beaten by 8, we find that 7 is the second-best player.
Applications of a tournament tree:
1. It can find the maximum or minimum value in an array (see the C sketch below).
2. It is generally used in sorting.
3. It can be used in M-way merges.
4. It is used to find the median of sorted arrays.
5. It is also used in the truck loading problem.
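As an aside on application (1), here is a minimal C sketch of a minimum winner tree stored in an array (the player values and the power-of-two bracket size N are illustrative assumptions): leaves hold the players, each internal node holds the winner of the match between its children, and the root holds the overall winner. Storing the tree in an array with node i's children at 2i and 2i+1 mirrors the binary-heap layout mentioned in property (d).

#include <stdio.h>

#define N 8  /* number of players; a power of two for simplicity (assumed) */

/* Build a minimum winner tree in an array of size 2*N: the leaves
   (players) occupy indices N..2N-1, the internal nodes 1..N-1, and
   tree[1] ends up holding the overall winner (the smallest value). */
void build_winner_tree(int tree[], const int players[])
{
    for (int i = 0; i < N; i++)
        tree[N + i] = players[i];
    for (int i = N - 1; i >= 1; i--)    /* play the matches bottom-up */
        tree[i] = tree[2*i] < tree[2*i + 1] ? tree[2*i] : tree[2*i + 1];
}

int main(void)
{
    int players[N] = {3, 7, 1, 8, 5, 2, 6, 4};   /* sample values (assumed) */
    int tree[2 * N];
    build_winner_tree(tree, players);
    printf("winner (minimum): %d\n", tree[1]);   /* prints 1 */
    return 0;
}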
How to find the median of N-sorted arrays using the Tournament tree?
One of the applications of a tournament tree is finding the median of N-sorted
arrays. The minimum winner tree is used to find the median efficiently.
We place the array in the leaf of the tournament tree and the winners into the
internal nodes.
Let us understand this with an example:
Assume that there are three sorted arrays as follows:
A = {12, 14, 15, 16, 17, 19}
B = {4, 5, 8, 10}
C = {1, 3, 6, 7, 9}
of sizes 6, 4, and 5 respectively.
Sorted merged array = {1, 3, 4, 5, 6, 7, 8, 9, 10, 12, 14, 15, 16, 17, 19}
Total no. of elements = 15
Median = (N + 1) / 2 = (15 + 1) / 2 = 8.
The eighth element is the median = 10. The height of the tree is calculated as log2(N), where N is the number of arrays, rounded up to the next integer. Here, log2(3) = 1.585 ≈ 2. So the tree we need to construct has a height of 2, with 4 leaf nodes, as shown in the figure below.
Now, the minimum winner tree of the first tournament will look as below:
Here, note that the unused leaf is filled with infinity, so that any match against it is always won by the other element; likewise, when the elements of an array are exhausted, we fill its leaf with infinity.
Here, 1 from array C is the winner of the tournament, so the next element from array C, namely 3, replaces it.
The minimum winner tree of the second tournament will be like as shown in the
figure:
Here, 3 is the overall winner of the tournament. In the next tournament, 6 from
array C will replace it.
The minimum winner tree of the third tournament will be like as shown in the
figure:
Here, 4 is the overall winner of the tournament. So, in the next tournament 5
from array B will replace it.
To find the median element, which is the eighth element in our case, we need a total of eight tournaments. We follow the same steps for the rest of the tournaments; the respective trees are shown below in the figure.
After the 8th tournament, we get the median of the array.
The time complexity of merging N sorted arrays of sizes n1, n2, n3, ..., nN into a single sorted array is O((n1 + n2 + ... + nN) * log2 N), and the time complexity of finding the median is O(m * log2 N), where m is the position of the median. When the N arrays have equal sizes, the time complexity of merging them becomes O(M * log2 N), where M is the total number of elements.

What is a linear search?


A linear search, also known as a sequential search, simply scans one element at a time. Suppose we want to search for an element in an array or list; we simply scan from the beginning and do not jump over any item.

Let's consider a simple example.

Suppose we have an array of 10 elements as shown in the below figure:


The above figure shows an array of character type having 10 values. If we want to search for 'E', the search begins from the 0th element and scans each element until the element 'E' is found. We cannot jump directly from the 0th element to the 4th element; each element is scanned one by one until the element is found.

Complexity of Linear search


Linear search scans each element one by one until the element is found. If the number of elements increases, the number of elements to be scanned also increases. We can say that the time taken to search is proportional to the number of elements. Therefore, the worst-case complexity is O(n).
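A minimal C sketch of the linear search just described (the character array mirrors the 10-element example above; the exact values are assumed):

#include <stdio.h>

/* Scan the array one element at a time; return the index of the first
   match, or -1 if the key is not present. Worst case: n comparisons. */
int linear_search(const char a[], int n, char key)
{
    for (int i = 0; i < n; i++)
        if (a[i] == key)
            return i;
    return -1;
}

int main(void)
{
    char a[] = {'A','B','C','D','E','F','G','H','I','J'}; /* assumed values */
    printf("%d\n", linear_search(a, 10, 'E'));            /* prints 4 */
    return 0;
}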

What is a Binary search?


A binary search is a search in which the middle element is examined to check whether it is smaller or larger than the element being searched for. The main advantage of binary search is that it does not scan every element in the list: at each step it discards half of the remaining list. So binary search takes less time to find an element than a linear search.

The one prerequisite of binary search is that the array must be sorted, whereas linear search works on both sorted and unsorted arrays. The binary search algorithm is based on the divide and conquer technique, meaning it divides the array recursively.

There are three cases used in the binary search:

Case 1: data < a[mid], then right = mid - 1.

Case 2: data > a[mid], then left = mid + 1.

Case 3: data = a[mid] // element is found

In the above cases, 'a' is the name of the array, mid is the index of the middle element computed at each step, data is the element being searched for, and left and right denote the lower and upper bounds of the portion of the array still being searched.

Let's understand the working of binary search through an example.

Suppose we have an array of size 10, indexed from 0 to 9, as shown in the below figure:

We want to search for the element 70 in the above array.

Step 1: First, we calculate the middle element of the array. We consider two variables, left and right. Initially, left = 0 and right = 9, as shown in the below figure:

The middle element's index is calculated as mid = (left + right) / 2.

Therefore, mid = 4 and a[mid] = 50. The element to be searched is 70, so a[mid] is not equal to data. Case 2 is satisfied, i.e., data > a[mid].
not equal to data. The case 2 is satisfied, i.e., data>a[mid].

Step 2: As data > a[mid], left is set to mid + 1. The value of mid is 4, so left becomes 5. Now we have got a subarray, as shown in the below figure:

Now again, the mid value is calculated using the above formula, and mid becomes 7. Now, mid can be represented as:

In the above figure, we can observe that a[mid] > data, so the search continues in the next step.

Step 3: As a[mid] > data, right is set to mid - 1. The value of mid is 7, so right becomes 6. The array can be represented as:

The value of mid is calculated again. The values of left and right are 5 and 6, respectively, so mid is 5. Now, mid can be represented in the array as shown below:

In the above figure, we can observe that a[mid] < data.

Step 4: As a[mid] < data, left is set to mid + 1. The value of mid is 5, so left becomes 6.

Now mid is calculated again using the formula discussed above. The values of left and right are 6 and 6, respectively, so mid becomes 6, as shown in the below figure:

We can observe in the above figure that a[mid] = data. Therefore, the search is complete, and the element has been found successfully.
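Putting the three cases together, here is a minimal C sketch of the iterative binary search traced above (the array contents are an assumption consistent with the steps shown: a[4] = 50, a[7] = 80, a[5] = 60, a[6] = 70):

#include <stdio.h>

/* Iterative binary search over a sorted array, implementing the three
   cases above; returns the index of data, or -1 if it is absent. */
int binary_search(const int a[], int n, int data)
{
    int left = 0, right = n - 1;
    while (left <= right) {
        int mid = (left + right) / 2;
        if (data == a[mid])          /* Case 3: element found     */
            return mid;
        else if (data > a[mid])      /* Case 2: search right half */
            left = mid + 1;
        else                         /* Case 1: search left half  */
            right = mid - 1;
    }
    return -1;
}

int main(void)
{
    int a[] = {10, 20, 30, 40, 50, 60, 70, 80, 90, 100}; /* assumed, sorted */
    printf("%d\n", binary_search(a, 10, 70));            /* prints 6 */
    return 0;
}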

Differences between Linear Search and Binary Search

The following are the differences between linear search and binary search:

Description

Linear search finds an element in the list by scanning the elements sequentially until the element is found. On the other hand, binary search repeatedly examines the middle element of the (sorted) list until the middle element matches the searched element.

Working of both the searches


The linear search starts searching from the first element and scans one element at a
time without jumping to the next element. On the other hand, binary search divides
the array into half by calculating an array's middle element.

Implementation

The linear search can be implemented on any linear data structure, such as a vector, singly linked list, or doubly linked list. In contrast, binary search can be implemented only on data structures that support two-way traversal, i.e., forward and backward traversal.

Complexity

The linear search is easy to use, or we can say that it is less complex as the elements
for a linear search can be arranged in any order, whereas in a binary search, the
elements must be arranged in a particular order.

Sorted elements

The elements for a linear search can be arranged in random order; it is not mandatory that they be sorted. On the other hand, in a binary search, the elements must be arranged in sorted order, either increasing or decreasing, and the algorithm changes accordingly. Because binary search requires a sorted array, new elements must be inserted at their proper place. In contrast, linear search does not need a sorted array, so a new element can simply be inserted at the end of the array.

Approach

The linear search uses an iterative approach to find the element, so it is also known
as a sequential approach. In contrast, the binary search calculates the middle element
of the array, so it uses the divide and conquer approach.

Data set

Linear search is not suitable for large data sets. If the element we want is the last element of the array, a linear search starts from the first element and goes on till the last element, so the time taken to find it is large. On the other hand, binary search is suitable for large data sets, as it takes less time.

Speed

For a large data set, linear search has a high computational cost and becomes slow. For the same data set, binary search has a much lower computational cost and is fast.

Dimensions

Linear search can be used on both single and multidimensional arrays, whereas binary search can be implemented only on a one-dimensional array.

Efficiency

Linear search is less efficient when we consider the large data sets. Binary search is
more efficient than the linear search in the case of large data sets.

Let's look at the differences in a tabular form.

Basis of comparison | Linear search | Binary search
Definition | Starts from the first element and compares each element with the searched element until it is found. | Finds the position of the searched element by repeatedly examining the middle element of the array.
Sorted data | The elements do not need to be arranged in sorted order. | The precondition is that the elements must be arranged in sorted order.
Implementation | Can be implemented on any linear data structure, such as an array or a linked list. | Limited to data structures that allow two-way traversal.
Approach | Based on the sequential approach. | Based on the divide and conquer approach.
Size | Preferable for small data sets. | Preferable for large data sets.
Efficiency | Less efficient for large data sets. | More efficient for large data sets.
Worst-case scenario | O(n). | O(log2 n).
Best-case scenario | Finding the element at the first position: O(1). | Finding the element at the middle position: O(1).
Dimensional array | Can be implemented on both single and multidimensional arrays. | Can be implemented only on a one-dimensional array.
