AST20105 Data Structure and Algorithms: Chapter 10 - Comparison-Based Sorting
2
Sorting
● Categories
○ Comparison-based
■ Insertion sort
■ Selection sort
■ Bubble sort
■ Merge sort
■ Quicksort
■ Bucket sort
○ Non comparison-based
■ Counting sort
■ Radix sort
3
Sorting
● The first step is to choose the criteria that will be used to order data.
● Very often, the sorting criteria are natural, as in the case of
numbers.
○ A set of numbers can be sorted in ascending or descending order.
■ Ascending: (1, 2, 5, 8, 20)
■ Descending: (20, 8, 5, 2, 1)
● Names in the phone book are ordered alphabetically by last name,
which is the natural order.
○ For alphabetic and non-alphabetic characters, the American Standard Code
for Information Interchange (ASCII) code is commonly used.
4
Common Terminology for sorting
● In-place sorting
○ The amount of extra space required to sort the data is
constant with respect to the input size.
○ This matters when the data set is huge: memory may be
exhausted if a non-in-place sorting algorithm is used to
sort a huge volume of data.
● Stable sorting
○ A stable sort preserves the relative order of equal values.
5
Insertion Sort
6
Insertion sort
Algorithm
To sort an array of size n in ascending order:
1. Iterate from arr[1] to arr[n-1] over the array.
2. Compare the current element (key) to its
predecessor.
3. If the key is smaller than its predecessor,
compare it to the elements before it. Move each
greater element one position up to make space
for the key.
7
Insertion sort
void insertionSort(int arr[], int n)
{
    int i, key, j;
    for (i = 1; i < n; i++){
        key = arr[i];
        j = i - 1;
        /* Move elements of arr[0..i-1] that are greater than key
           one position ahead of their current position */
        while (j >= 0 && arr[j] > key){
            arr[j + 1] = arr[j];
            j = j - 1;
        }
        arr[j + 1] = key;
    }
}
8
Insertion sort
● Time Complexity:
○ Insertion sort takes maximum time O(n²) when the elements are sorted in reverse
order.
○ It takes minimum time O(n) when the elements are already sorted.
9
Insertion sort
Pros
● Efficient for sorting small data sets
● Efficient for data that are almost sorted
● In-place sorting, as only a constant
amount of additional memory space
is required
● Stable sorting algorithm, since it
does not change the relative order of
elements with equal keys

Cons
● Less efficient for sorting large amounts
of data
10
Selection Sort
11
Selection sort
The selection sort algorithm sorts an array by repeatedly
finding the minimum element (for ascending order) in the
unsorted part and putting it at the beginning of that part.
The algorithm maintains two subarrays within the given
array: a sorted prefix and an unsorted suffix.
12
Selection sort
void selectionSort(int arr[], int n) {
    int i, j, min_idx;
    for (i = 0; i < n - 1; i++) {
        // Find the minimum element in arr[i..n-1]
        min_idx = i;
        for (j = i + 1; j < n; j++)
            if (arr[j] < arr[min_idx])
                min_idx = j;
        // Swap it to the front of the unsorted part
        int tmp = arr[min_idx];
        arr[min_idx] = arr[i];
        arr[i] = tmp;
    }
}
● Stable sorting: No
● Pros:
○ Easy to implement
● Cons:
○ O(n²) comparisons in every case, even if the input is already sorted
14
Bubble Sort
15
Bubble sort
Algorithm
void bubbleSort(int arr[], int n)
{
    int i, j;
    for (i = 0; i < n-1; i++)
        // Last i elements are already in place
        for (j = 0; j < n-i-1; j++)
            if (arr[j] > arr[j+1])
                swap(&arr[j], &arr[j+1]);
}
16
Bubble sort - improved version
// An optimized version of Bubble Sort
void bubbleSort(int arr[], int n){
    int i, j;
    bool swapped;
    for (i = 0; i < n-1; i++){
        swapped = false;
        for (j = 0; j < n-i-1; j++){
            if (arr[j] > arr[j+1])
            {
                swap(&arr[j], &arr[j+1]);
                swapped = true;
            }
        }
        // If no two elements were swapped by the inner loop, then break
        if (swapped == false)
            break;
    }
}
17
Bubble sort
● Time Complexity:
○ Bubble sort takes maximum time O(n²) when the elements are sorted in reverse
order.
● Due to its simplicity, bubble sort is often used to introduce the concept of a sorting
algorithm.
18
Quicksort
19
Quicksort
QuickSort is a Divide and Conquer algorithm: it picks an element as a pivot,
partitions the array around the pivot, and then recursively sorts the two partitions.
20
Quicksort - Partition in details
Partition algorithm in detail
● There are two indices i and j.
● At the very beginning of the partition algorithm
○ i points to the first element in the array and
○ j points to the last one.
● The algorithm then moves i forward until an element with value greater than or
equal to the pivot is found.
● Index j is moved backward until an element with value less than or equal to the
pivot is found.
21
Quicksort - Partition in details
Partition algorithm in detail (cont'd)
● If i ≤ j, then arr[i] and arr[j] are swapped, i steps to the next position (i + 1), and j
steps to the previous one (j - 1).
● The algorithm stops when i becomes greater than j.
● After partition,
○ all values before the i-th element are less than or equal to the pivot, and
○ all values after the j-th element are greater than or equal to the pivot.
22
Quicksort - Example
27
Quicksort - Overview
28
Quicksort
void quickSort(int arr[], int left, int right) {
    int i = left, j = right;
    int tmp;
    int pivot = arr[(left + right) / 2];

    /* partition */
    while (i <= j) {
        while (arr[i] < pivot)
            i++;
        while (arr[j] > pivot)
            j--;
        if (i <= j) {
            tmp = arr[i];
            arr[i] = arr[j];
            arr[j] = tmp;
            i++;
            j--;
        }
    }

    /* recursion */
    if (left < j)
        quickSort(arr, left, j);
    if (i < right)
        quickSort(arr, i, right);
}
29
Quicksort - How to pick a pivot?
● Use the last element as pivot
○ Fine if the input is random
○ If the input is already sorted in non-decreasing order or the reverse
■ All elements end up in one sub-array while the other is empty
■ This happens for every recursive call
■ Resulting in a very bad running time, O(n²)
● Randomly chosen pivot
○ Generally is a good option, but random number generation can be
expensive
30
Quicksort - How to pick a pivot?
● Use the median as the pivot
○ Partitioning then always splits the input array into two halves of the same size
○ However, it is expensive to find the true median
○ Solution: use the median of three
● Median of three
○ Compare three elements, the leftmost, the rightmost and the center one, and use
their median as the pivot
31
Quicksort - Analysis
● Assumptions:
○ A random pivot
● Let T(n) be the running time of quick sort to sort n numbers
○ Assume n is a power of 2
● Analysis:
○ Pivot selection: O(1) time
○ Partitioning: O(n) time
○ Running time of the two recursive calls
○ T(1) = 1
○ T(n) = T(i) + T(n-i-1) + n, where i is the number of elements in the left partition
32
Quicksort - Worst Case Analysis
● Worst case occurs when the chosen pivot is always the smallest element
● Partition is always unbalanced
● T(1) = 1
● T(n) = T(n-1) + n

T(n) = T(n-1) + n
T(n) = T(n-2) + (n-1) + n
T(n) = T(n-3) + (n-2) + (n-1) + n
…
T(n) = T(n-k) + (n-(k-1)) + … + n
Let n-k = 1, i.e. k = n-1:
T(n) = T(1) + 2 + … + n
T(n) = 1 + 2 + … + n
T(n) = (1+n)(n) / 2
T(n) = O(n²)
33
Quicksort - Best Case Analysis
● Best case occurs when the chosen pivot is always the median of the (sub)array
● Partition is always balanced
● T(1) = 1
● T(n) = 2T(n/2) + n

T(n) = 2T(n/2) + n
     = 2(2T(n/2^2) + n/2) + n
     = 2^2 T(n/2^2) + 2n
     = 2^2 (2T(n/2^3) + n/2^2) + 2n
     = 2^3 T(n/2^3) + 3n
     …
     = 2^k T(n/2^k) + kn
Let n = 2^k, so k = log2n:
T(n) = nT(1) + nlog2n
     = n(1) + nlog2n
     = nlog2n + n
     = O(nlogn)
34
Quicksort
● Time Complexity:
○ Worst case: O(n²)
○ Average case: O(nlogn)
● In-place sorting: Broadly, yes. Memory space is still consumed by the recursion
call stack, O(logn) on average
● Stable sorting: No
● Cons
○ Worst-case running time is O(n²)
36
Quicksort - another approach

/* low --> Starting index, high --> Ending index */
quickSort(arr[], low, high){
    if (low < high){
        /* pi is partitioning index, arr[pi] is now at right place */
        pi = partition(arr, low, high);
        quickSort(arr, low, pi - 1);  // Before pi
        quickSort(arr, pi + 1, high); // After pi
    }
}

/* This function takes the last element as pivot, places the pivot
   element at its correct position in the sorted array, and places all
   smaller elements (smaller than pivot) to the left of the pivot and
   all greater elements to the right of the pivot */
partition(arr[], low, high){
    pivot = arr[high];
    i = (low - 1); // Index of smaller element
    for (j = low; j <= high - 1; j++){
        // If current element is smaller than the pivot
        if (arr[j] < pivot){
            i++; // increment index of smaller element
            swap arr[i] and arr[j]
        }
    }
    swap arr[i + 1] and arr[high]
    return (i + 1)
}
37
Mergesort
38
Mergesort
MergeSort(arr[], l, r)
If r > l
1. Find the middle point to divide the array into two halves:
   middle m = (l+r)/2
2. Call mergeSort for the first half:
   call mergeSort(arr, l, m)
3. Call mergeSort for the second half:
   call mergeSort(arr, m+1, r)
4. Merge the two halves sorted in steps 2 and 3:
   call merge(arr, l, m, r)
40
Mergesort - How to merge?
● Input: Two sorted arrays A and B
● Output: A sorted array C
● Three counters: aCurr, bCurr, cCurr
● Initially set them to the beginning of their respective arrays
● The smaller of A[aCurr] and B[bCurr] is copied to the next entry in C, and the
corresponding counter and cCurr are increased by 1
● When either counter reaches the end, the remaining elements of the other array are
copied to C.
41
44
Mergesort - Analysis of Merge Operation
● The running time of merge is O(n1 + n2), where n1 and n2 are the
sizes of the two sub-arrays, which is O(n)
● Space requirements of the merge operation:
○ Merging two sorted lists requires O(n) extra memory
○ Additional work is needed to copy the temporary array back to the original array
45
Mergesort
void mergeSort(int arr[], int left, int right, int size){
    if(left < right){
        int center = (left + right)/2;
        mergeSort(arr, left, center, size);
        mergeSort(arr, center + 1, right, size);
        merge(arr, left, center, right, size);
    }
}
46
Mergesort
void merge(int arr[], int low, int mid, int high, int size){
    int* c = new int[size];
    int l = low, i = low, j = mid + 1;

    while((l <= mid) && (j <= high)) {
        if(arr[l] <= arr[j]) {
            c[i] = arr[l]; l++;
        } else {
            c[i] = arr[j]; j++;
        }
        i++;
    }
    if(l > mid) {
        for(int k = j; k <= high; k++) {
            c[i] = arr[k]; i++;
        }
    } else {
        for(int k = l; k <= mid; k++){
            c[i] = arr[k]; i++;
        }
    }
    for(int k = low; k <= high; k++)
        arr[k] = c[k];
    delete [] c;
}
47
Mergesort - Analysis
● Let T(n) be the worst-case running time of merge sort to sort n numbers
● Assume n is a power of 2
● Analysis:
○ Divide: O(1) time
○ Conquer: 2T(n/2) time
○ Combine step: O(n) time
○ Recurrence equation:
■ T(1) = 1
■ T(n) = 2T(n/2) + n

T(n) = 2T(n/2) + n
     = 2(2T(n/2^2) + n/2) + n
     = 2^2 T(n/2^2) + 2n
     = 2^2 (2T(n/2^3) + n/2^2) + 2n
     = 2^3 T(n/2^3) + 3n
     …
     = 2^k T(n/2^k) + kn
Let n = 2^k → k = log2n:
T(n) = nT(1) + nlog2n
     = n(1) + nlog2n
     = nlog2n + n
     = O(nlogn)
48
Mergesort
● Time Complexity:
○ Time complexity of Merge Sort is Θ(nlogn) in all 3 cases (worst, average and best), as merge
sort always divides the array into two halves and takes linear time to merge them.
● Algorithmic Paradigm: Divide and Conquer
● Sorting In Place: No, it requires additional storage proportional to the size of the input array
for the merge operations, O(n)
● Stable: Yes
49
Q&A
50