Sorting Algorithms
Performance Analysis
Authors:
Quan, Nguyen Hoang (23127106)
Phuc, Thai Hoang (23127458)
My, To Thao (23127231)
Cuong, Tran Tien (23127332)
Instructors:
M.S. Thong, Bui Huy
M.S. Nhi, Tran Thi Thao
June 2024
University of Science - VNU-HCM
Sorting Algorithms Data Structures and Algorithms
Contents
1 Information 4
1.1 Authors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.2 Declaration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.3 Environment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
2 Introduction 5
2.1 Content . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.2 Target . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
3 Algorithm presentation 5
3.1 Selection Sort . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
3.1.1 Idea . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
3.1.2 Pseudo Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
3.1.3 Step-by-Step Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
3.1.4 Visualization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
3.1.5 Complexity Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
3.2 Insertion Sort . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
3.2.1 Idea . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
3.2.2 Pseudo Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
3.2.3 Step-by-Step Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
3.2.4 Visualization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
3.2.5 Variations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
3.2.6 Complexity Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
3.3 Bubble Sort . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
3.3.1 Idea . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
3.3.2 Pseudo Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
3.3.3 Step-by-Step Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
3.3.4 Visualization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
3.3.5 Complexity Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
3.4 Shell Sort . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
3.4.1 Idea . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
3.4.2 Pseudo Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
3.4.3 Step-by-Step Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
3.4.4 Visualization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
This is original work. Please do not copy without providing references. Page 1
References 53
A Appendix 53
1 Information
1.1 Authors
This report was written by the following four authors:
Note: The authorship of this report will not change, except in unavoidable circumstances as
required by the instructors.
1.2 Declaration
• We take full responsibility for any issues related to this report.
1.3 Environment
We measure the running time of sorting algorithms on the following computer specifications:
Note: Running time varies depending on different computer specifications. Results may differ
even when using the exact same computer specifications mentioned in this report.
2 Introduction
2.1 Content
Sorting is a fundamental and crucial problem, not only in daily life but especially in the field of Computer Science. To date, various sorting algorithms have been studied, each with its own advantages and disadvantages. This report presents 11 typical sorting algorithms (Selection Sort, Insertion Sort, Bubble Sort, Shaker Sort, Shell Sort, Heap Sort, Merge Sort, Quick Sort, Counting Sort, Radix Sort, and Flash Sort) through two main metrics: comparison count and running time.
The authors of this report have compiled the content and analyzed the sorting algorithms in the most concise and understandable way possible. We hope that this report will help learners gain a comprehensive view of sorting algorithms and apply them in practice.
Special thanks to M.S. Thong, Bui Huy, and M.S. Nhi, Tran Thi Thao for assigning us useful exercises to learn sorting algorithms. We hope to receive feedback from our instructors to make this report better.
2.2 Target
The main goals of this report are as follows:
2. Upon completing this report, the authors of this project have gained valuable knowledge about sorting algorithms.
3 Algorithm presentation
3.1 Selection Sort
3.1.1 Idea
Selection sort [1] is a sorting algorithm that follows the brute-force approach. We start the algorithm by looking for the smallest element of the array and exchanging it with the first element. Then, we select the smallest element among the remaining n − 1 elements and exchange it with the second element. This process continues until the array is sorted.
Selection sort is noted for its simplicity. However, its main disadvantage is its time complexity.
1. For each index i ∈ [0, n − 2], we find mnIndex such that A[mnIndex] ≤ A[j] for all j ∈ [i, n − 1].
2. Swap A[mnIndex] (the minimum element found among the last n − i elements) with A[i]. After the i-th pass, the array should look like this:
A[0] ≤ A[1] ≤ ... ≤ A[i − 1] | A[i], ..., A[mnIndex], ..., A[n − 1] (1)
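The two steps above can be sketched in Python (a minimal illustration of the idea, not the report's measured implementation):

```python
def selection_sort(A):
    """Sort A in place by repeatedly selecting the minimum of the unsorted tail."""
    n = len(A)
    for i in range(n - 1):
        mn_index = i
        # Find the index of the smallest element among A[i..n-1].
        for j in range(i + 1, n):
            if A[j] < A[mn_index]:
                mn_index = j
        # Swap the minimum into position i.
        A[i], A[mn_index] = A[mn_index], A[i]
    return A
```

With the visualization's input, selection_sort([89, 45, 68, 90, 29, 34, 17]) yields [17, 29, 34, 45, 68, 89, 90].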
3.1.4 Visualization
Consider the input array A = {89, 45, 68, 90, 29, 34, 17}. The following table shows the array A in each iteration of selection sort. Each pass scans the list's tail to the right of the vertical bar; an element in bold indicates the smallest element found. Elements to the left of the vertical bar are in their final positions and are not considered in this and subsequent iterations.
Iteration Array A
1 | 89 45 68 90 29 34 17
2 17 | 45 68 90 29 34 89
3 17 29 | 68 90 45 34 89
4 17 29 34 | 90 45 68 89
5 17 29 34 45 | 90 68 89
6 17 29 34 45 68 | 90 89
7 17 29 34 45 68 89 | 90
– The input size is n (the number of elements); the basic operation is the key comparison A[j] < A[mnIndex]. The number of times the basic operation is executed depends only on n and is given by the following sum:
C(n) = Σ_{i=0}^{n−2} Σ_{j=i+1}^{n−1} 1 = Σ_{i=0}^{n−2} (n − 1 − i) = n × (n − 1)/2 (2)
– Thus, selection sort is O(n²) on all inputs (we say it is not adaptive).
– Number of key swaps: M(n) = n − 1 ∈ O(n). This distinguishes selection sort from other O(n²) sorting algorithms.
3.2 Insertion Sort
3.2.1 Idea
Insertion Sort is a Decrease-and-Conquer algorithm. It follows the basic idea of dividing the array into two parts: a sorted part and an unsorted part. In the beginning, the first element of the array is treated as the sorted part, while the remaining elements constitute the unsorted part. The algorithm works by repeatedly inserting an element from the unsorted part into its correct position in the sorted part.
1. Start with the second element A[1] of the array and consider it as a key.
2. Compare the key with the elements before it in the sorted part.
3. Move the elements that are greater than the key one position ahead to make space for the
key.
4. Insert the key into its correct position in the sorted part.
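The four steps map directly onto a short Python routine (a sketch for illustration):

```python
def insertion_sort(A):
    """Sort A in place by growing a sorted prefix one key at a time."""
    for i in range(1, len(A)):
        key = A[i]
        j = i - 1
        # Shift elements of the sorted part that are greater than the key.
        while j >= 0 and A[j] > key:
            A[j + 1] = A[j]
            j -= 1
        # Insert the key into its correct position.
        A[j + 1] = key
    return A
```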
3.2.4 Visualization
Consider the input array A = {3, 2, 4, 1}. We will use ' | ' to separate the sorted part (before ' | ') from the unsorted part (after ' | '). The number highlighted in red represents the key. We will use '*' to mark the exact position of the key in the sorted part.
The table below illustrates the step-by-step process of sorting array A using insertion sort.
3.2.5 Variations
• Binary Insertion Sort: Instead of searching linearly for the correct position to insert an el-
ement, a binary search can be used to find the correct position in the sorted section of the
array. This can reduce the number of comparisons needed, especially for larger arrays.
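A possible Python sketch of this variation, using the standard-library bisect module for the binary search (note that shifting the elements still costs O(n) per insertion, so only the comparisons are reduced):

```python
import bisect

def binary_insertion_sort(A):
    """Insertion sort that locates each key's position with binary search."""
    for i in range(1, len(A)):
        key = A[i]
        # Binary search for the insertion point in the sorted prefix A[0..i-1].
        # bisect_right keeps equal keys in their original order (stable).
        pos = bisect.bisect_right(A, key, 0, i)
        # Shift the block A[pos..i-1] right by one and place the key.
        A[pos + 1:i + 1] = A[pos:i]
        A[pos] = key
    return A
```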
• Time Complexity:
– Worst-case: O(n²)
If the array is reverse sorted, we can calculate the number of basic operations as follows:
In the first step, the sorted array has 1 element, we need to make 1 comparison. In the
next step, the sorted array has 2 elements, so we need to make 2 comparisons. We will
continue this process until the array becomes completely sorted. Therefore, the total
number of basic operations is:
C(n) = 1 + 2 + 3 + ... + (n − 1) = n × (n − 1)/2 (3)
Order of growth: O(n²)
– Average-case: O(n²)
Suppose the correct position for the key is, on average, in the middle of the sorted part. The number of basic operations is then approximately:
C(n) = n × (n − 1)/4 (4)
Order of growth: O(n²), but about twice as fast as the worst case.
– Best-case: O(n)
When the array is already sorted, the algorithm only needs to go through the array once,
making one comparison for each element.
Order of growth: O(n)
3.3 Bubble Sort
3.3.1 Idea
Bubble Sort is a simple comparison-based sorting algorithm. The idea of the algorithm is to repeatedly traverse the list, compare adjacent elements, and swap them if they are in the wrong order. This process is repeated until the list is sorted.
Given an array with n items:
4. Move to the next pair and repeat steps 2 and 3 until the end of the array is reached.
5. Reduce the range of comparison by one and repeat steps 1-4 until no more swaps are needed.
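The traversal described above, including the early-exit check from step 5, can be sketched in Python:

```python
def bubble_sort(A):
    """Repeatedly swap out-of-order adjacent pairs; the scan range shrinks each pass."""
    n = len(A)
    for i in range(n - 1):
        swapped = False
        # After pass i, the last i elements are already in their final places.
        for j in range(n - 1 - i):
            if A[j] > A[j + 1]:
                A[j], A[j + 1] = A[j + 1], A[j]
                swapped = True
        if not swapped:  # No swaps in a full pass means the array is sorted.
            break
    return A
```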
3.3.4 Visualization
Consider the input array A = {4, 1, 7, 3, 5}. Following the step-by-step description above, each pair of adjacent elements is compared, and they are swapped if they are in the wrong order. This process is repeated until the array is sorted.
The table below illustrates the step-by-step process of sorting array A using bubble sort.
• Time Complexity:
The outer loop runs from i = 0 to i = n − 2, resulting in n − 1 iterations. For each iteration of
the outer loop, the inner loop runs from j = 0 to j = n − 2 − i. Therefore, the total number
of comparisons can be expressed as:
C(n) = Σ_{i=0}^{n−2} (n − 1 − i) (5)
C(n) = (n − 1) + (n − 2) + . . . + 1 = (n − 1) × n / 2 (6)
C(n) = (n² − n)/2 (7)
C(n) ∈ O(n²) (8)
3.4 Shell Sort
3.4.1 Idea
Shell Sort [2] is a sorting algorithm that aims to improve the efficiency of Insertion Sort. It allows comparing and swapping elements that are far apart, instead of only considering adjacent elements as Insertion Sort does. Initially, we have a sequence of numbers called "gaps". We iterate through these gaps in decreasing order, and during each pass, we compare elements that are separated by the corresponding gap.
1. Start with a gap value, initially set to half the size of the array (in case the size is an odd
number, we round the gap down).
2. Compare all pairs of elements that are separated by the corresponding gap. If the number on
the left side is greater than the number on the right side, swap them.
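A sketch of the procedure with the halving gap sequence from step 1; each pass here performs a gapped insertion sort, which is one common way to realize step 2 (fully ordering every gap-separated subsequence):

```python
def shell_sort(A):
    """Shell sort with the gap sequence n//2, n//4, ..., 1."""
    n = len(A)
    gap = n // 2  # Initial gap: half the array size, rounded down.
    while gap > 0:
        # Gapped insertion sort: sort every subsequence of elements gap apart.
        for i in range(gap, n):
            key = A[i]
            j = i
            while j >= gap and A[j - gap] > key:
                A[j] = A[j - gap]
                j -= gap
            A[j] = key
        gap //= 2  # Move to the next, smaller gap.
    return A
```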
3.4.4 Visualization
• Time Complexity: The time complexity of Shell Sort can vary depending on the choice of the gap sequence. Here are the typical time complexities for Shell Sort:
– Worst-case: O(n²)
This occurs when the array is initially in reverse order or has a pattern that interacts poorly with the chosen gap sequence.
Order of growth: O(n²)
– Average-case: O(n log n)
The average-case time complexity for Shell Sort is around O(n log n).
Order of growth: O(n log n)
– Best-case: O(n log n)
When the given array is already sorted, the total count of comparisons for each gap is roughly the size of the array; with about log n gaps, the total is O(n log n).
Order of growth: O(n log n)
3.5 Heap Sort
3.5.1 Idea
Before learning Heap Sort, readers should be comfortable with the concepts related to heaps. A heap [3] is a binary tree that satisfies the following conditions:
• The binary tree is complete, i.e., all its levels are full except possibly the last level, where only some rightmost leaves may be missing.
• The key in each node is greater than or equal to the keys in its children. (Leaves automatically satisfy this condition because a leaf has no children.) This is called the max-heap property. There are other variations as well, such as the min-heap.
Figure 1: Illustration of the definition of heap: only the leftmost tree is a heap.
• Height of a node is the number of edges on the longest simple downward path from the node
to a leaf.
• Heap and its array representation: A heap can be implemented as an array by recording its elements in top-down, left-to-right fashion. Let A[0..n − 1] be the array that stores our heap.
– Root node is placed at the first position of A, which is A[0]. The left child of root node
is placed at A[1], the right child of root node is placed at A[2]. Continue filling the array
with values from our heap from top to bottom, left to right.
– In general, a node with index i has its left child at index 2i + 1, its right child at index 2i + 2, and its parent at index ⌊(i − 1)/2⌋ in A (assuming the calculated indexes are in the range [0..n − 1]).
– The elements in the subarray A[⌊n/2⌋..n − 1] are all leaves (this can be proven).
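The index arithmetic can be written down directly (the helper names left, right, and parent are ours, chosen for illustration):

```python
def left(i):
    """Index of the left child of node i in the array representation."""
    return 2 * i + 1

def right(i):
    """Index of the right child of node i."""
    return 2 * i + 2

def parent(i):
    """Index of the parent of node i (integer division rounds down)."""
    return (i - 1) // 2
```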
Figure 2: Heap and its array representation. An example for the array {10, 8, 7, 5, 2, 1, 6, 3, 5, 1}
Heap Sort is a sorting algorithm based on the transform-and-conquer approach. It works in two main stages:
• Stage 1: Transform the given array into a max-heap (heap construction).
• Stage 2: Apply the root-deletion operation n − 1 times to the remaining heap (i.e., maximum key deletion).
deletion).
Note: For further details, you may need to refer to the step-by-step description and additional
reading.
function buildMaxHeap(A, n)
for i ← n/2 − 1 downto 0 do
maxHeapify(A, i, n)
end for
return
end function
function heapSort(A, n)
buildMaxHeap(A, n)
sz ← n
while sz ̸= 0 do
swap(A[0], A[sz − 1])
sz ← sz − 1
maxHeapify(A, 0, sz)
end while
return
end function
Let us first define the maxHeapify operation. In short, maxHeapify ensures that the
subtree rooted at index i satisfies the max-heap property by making sure the node at i is larger
than its children. If it’s not, the function swaps the node with its largest child and recursively
max-heapifies the affected subtree. This operation is fundamental in building a max heap from
an arbitrary array and for maintaining the heap property after extracting the maximum element
during the heap sort process. Note: When calling this function, we must ensure that all subtrees
rooted at the children nodes of node i satisfy the max-heap property.
The heap sort process:
• Stage 1: buildMaxHeap transforms the given array into a max-heap. The function iterates over the non-leaf nodes of the array, starting from the last non-leaf node all the way up to the root, and applies the maxHeapify operation to each.
• Stage 2: Maximum key deletion. Swap root’s key with the last key K of the heap.
Decrease the heap size by 1 (in other words, discard the current largest node). maxHeapify
the smaller tree by sifting K down the tree.
• Repeat Stage 2 process until the heap is empty (heap size is 0).
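The pseudocode above translates directly to Python (a sketch for illustration; the loop condition sz > 1 is equivalent to the pseudocode's sz ≠ 0, since the last swap is a no-op):

```python
def max_heapify(A, i, n):
    """Sift A[i] down so the subtree rooted at i satisfies the max-heap property.

    Assumes the subtrees rooted at i's children are already max-heaps.
    """
    largest = i
    left, right = 2 * i + 1, 2 * i + 2
    if left < n and A[left] > A[largest]:
        largest = left
    if right < n and A[right] > A[largest]:
        largest = right
    if largest != i:
        A[i], A[largest] = A[largest], A[i]
        max_heapify(A, largest, n)  # Continue sifting down the affected subtree.

def build_max_heap(A, n):
    """Apply max_heapify from the last non-leaf node up to the root."""
    for i in range(n // 2 - 1, -1, -1):
        max_heapify(A, i, n)

def heap_sort(A):
    n = len(A)
    build_max_heap(A, n)
    sz = n
    while sz > 1:
        A[0], A[sz - 1] = A[sz - 1], A[0]  # Move the current maximum to the end.
        sz -= 1
        max_heapify(A, 0, sz)  # Restore the heap on the remaining prefix.
    return A
```

For the visualization's input {2, 9, 7, 6, 5, 8}, build_max_heap produces {9, 6, 8, 2, 5, 7}, matching the visualization that follows.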
3.5.4 Visualization
Consider the input array A = {2, 9, 7, 6, 5, 8}. We first need to transform the array into a max-heap using buildMaxHeap. After this stage, A = {9, 6, 8, 2, 5, 7}; please have a look at the figure below.
Figure 3: buildMaxHeap operation visualization. The double-headed arrows show key comparisons verifying the parental dominance.
After that, we need to do maximum key deletions. Here is an example of the first pass, deleting
the node 9 from the heap:
Figure 4: An example of maximum key deletion. Step 1: Swap 9 with 1; Step 2: Discard node 9;
Step 3: Max-Heapify Root node 1.
Repeat the maximum key deletion process until the heap is empty. Here is the process on array
A:
3.6 Merge Sort
3.6.1 Idea
Merge Sort is a Divide-and-Conquer algorithm that works by recursively dividing the array into two halves, sorting each half, and then merging the sorted halves to produce a sorted array. The algorithm is known for its efficiency and stability.
Given an array with n items:
function Merge(B, C, A)
i ← 0, j ← 0, k ← 0
while i < length(B) and j < length(C) do
if B[i] ≤ C[j] then
A[k] ← B[i]
i←i+1
else
A[k] ← C[j]
j ←j+1
end if
k ←k+1
end while
if i = length(B) then
copy C[j..length(C) − 1] to A[k..length(B) + length(C) − 1]
else
copy B[i..length(B) − 1] to A[k..length(B) + length(C) − 1]
end if
end function
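A Python transcription of the Merge routine above, together with a recursive driver (the driver itself is not shown in this extract, so this sketch assumes the standard halving scheme):

```python
def merge(B, C, A):
    """Merge sorted lists B and C into A, as in the pseudocode above."""
    i = j = k = 0
    while i < len(B) and j < len(C):
        if B[i] <= C[j]:
            A[k] = B[i]
            i += 1
        else:
            A[k] = C[j]
            j += 1
        k += 1
    # Copy whichever input still has elements left.
    if i == len(B):
        A[k:] = C[j:]
    else:
        A[k:] = B[i:]

def merge_sort(A):
    """Recursively split A in half, sort each half, and merge back into A."""
    if len(A) <= 1:
        return A
    mid = len(A) // 2
    B, C = A[:mid], A[mid:]
    merge_sort(B)
    merge_sort(C)
    merge(B, C, A)
    return A
```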
3.6.4 Visualization
Consider the input array A = {5, 3, 8, 6, 2}. Following the step-by-step description above, the array is divided into two halves recursively until each sub-array contains a single element. These are then merged back together in sorted order.
The table below illustrates the step-by-step process of sorting array A using merge sort.
3.7 Quick Sort
3.7.1 Idea
Quick Sort is a Divide-and-Conquer algorithm. It works by selecting a pivot element from the array and partitioning the other elements into two sub-arrays according to whether they are less than or greater than the pivot. The partitioning step is then repeated recursively on the left and right sub-arrays until the entire array is sorted. There are various versions of the Quick Sort algorithm that use different strategies for choosing the pivot element.
function QuickSort(A, l, r)
if l ≥ r then
return
end if
p ← Partition(A, l, r)
QuickSort(A, l, p − 1)
QuickSort(A, p + 1, r)
end function
1. Choose a pivot element from the array. (This can be done in various ways; in the pseudocode above, Partition uses the rightmost element of the current partition as the pivot.)
2. Partition the array into two sub-arrays, with elements less than the pivot on the left side and
elements greater than the pivot on the right side. The pivot is now in its final sorted position.
3. Recursively repeat the partitioning step on the left and right sub-arrays until the entire array is sorted.
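The Partition routine is not reproduced in this extract; the sketch below uses a Lomuto-style partition with the rightmost element as pivot, one common realization consistent with the description above:

```python
def partition(A, l, r):
    """Lomuto partition: A[r] is the pivot; returns its final index."""
    pivot = A[r]
    i = l - 1  # Boundary of the "less than pivot" region.
    for j in range(l, r):
        if A[j] < pivot:
            i += 1
            A[i], A[j] = A[j], A[i]
    A[i + 1], A[r] = A[r], A[i + 1]  # Place the pivot in its final position.
    return i + 1

def quick_sort(A, l, r):
    """Sort A[l..r] in place, mirroring the QuickSort pseudocode above."""
    if l >= r:
        return
    p = partition(A, l, r)
    quick_sort(A, l, p - 1)
    quick_sort(A, p + 1, r)
```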
3.7.4 Visualization
Consider the input array A = {4, 1, 0, 3, 2}. Following the step-by-step description above, the pivot is chosen as the rightmost element of the current partition. Quick sort separates elements less than the pivot to the left side and elements greater than the pivot to the right side by finding two elements, itemFromLeft (which is larger than the pivot) and itemFromRight (which is smaller than the pivot), and swapping them [5].
The table below illustrates the step-by-step process of sorting array A using quick sort.
itemFromLeft is highlighted orange.
itemFromRight is highlighted blue.
3.7.5 Variations
• Three-Way Quicksort: this algorithm partitions the array into three subarrays: elements
smaller than the pivot, elements equal to the pivot, and elements greater than the pivot. This
variation is particularly useful when there are duplicate elements in the array.
• Dual-Pivot Quicksort: this algorithm divides the array into three parts: elements smaller
than the first pivot, elements between the two pivots, and elements greater than the second
pivot. Dual-Pivot Quicksort is known for its improved performance compared to the standard
version.
• Time Complexity: The complexity of the Quick Sort algorithm depends on how the pivot element is chosen.
– Worst-case: O(n²)
This occurs when the pivot element is consistently chosen poorly (the largest/smallest element of the array is chosen), leaving one of the two subarrays empty.
C(n) = 1 + 2 + 3 + ... + (n − 1) = n × (n − 1)/2 (9)
Order of growth: O(n²)
– Average-case: O(n log n)
This occurs when the pivot's final position p can be any position with the same probability 1/n:
C(n) = (1/n) × Σ_{p=0}^{n−1} [(n − 1) + C(p) + C(n − 1 − p)] ≈ 1.39n log n (10)
• Space Complexity:
– Worst-case: O(n)
When the pivot selection consistently leads to unbalanced partitioning, the recursion tree
can become skewed, resulting in a depth of n levels.
– Best/Average-case: O(log n)
When Quick Sort achieves balanced partitioning, the array is split relatively evenly, so the recursion tree is approximately log n levels deep.
3.8 Flash Sort
3.8.1 Idea
Flash Sort is a distribution-based sorting algorithm that is particularly efficient for uniformly distributed data. The idea of the algorithm is to partition the array into different classes and then sort each class individually.
Given an array with n items:
Algorithm 9 Part 2
for k ← 1 to m − 1 do
for i ← bucket[k] − 2 to bucket[k − 1] by −1 do
if A[i] > A[i + 1] then
t ← A[i]
j←i
while t > A[j + 1] do
A[j] ← A[j + 1]
j ←j+1
end while
A[j] ← t
end if
end for
end for
delete[] bucket
end function
2. Determine the number of classes (m) based on the size of the array (n).
5. Classify each element into one of the m classes and count the number of elements in each
class.
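Since the first part of the pseudocode (the counterpart of Algorithm 9, Part 2) is not reproduced in this extract, the following Python sketch illustrates only the classification idea from steps 2 and 5: it collects the elements of each class and sorts each class, whereas the full algorithm permutes elements in place using the class counts. The class-count heuristic m ≈ 0.45n is an assumption commonly quoted for Flash Sort, not taken from this report:

```python
def flash_sort_simplified(A, m=None):
    """Classify elements into m value-based classes, then sort each class.

    Simplified illustration of Flash Sort's classification step.
    """
    n = len(A)
    if n <= 1:
        return A
    if m is None:
        m = max(2, int(0.45 * n))  # Common heuristic for the class count.
    lo, hi = min(A), max(A)
    if lo == hi:
        return A  # All elements equal; nothing to do.
    classes = [[] for _ in range(m)]
    for x in A:
        # Class index k is proportional to x's position in the value range.
        k = (m - 1) * (x - lo) // (hi - lo)
        classes[k].append(x)
    out = []
    for bucket in classes:
        out.extend(sorted(bucket))  # Stand-in for the per-class insertion sort.
    A[:] = out
    return A
```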
3.8.4 Visualization
Consider the input array A = {6, 1, 8, 3, 4}. Flash sort works by classifying the elements into buckets and then sorting within each bucket. Here is a step-by-step description of the process:
The table below illustrates the step-by-step process of sorting array A using flash sort.
• Time Complexity:
– Worst-case: O(n²)
The worst-case time complexity can occur when the distribution of elements is highly
skewed, causing inefficient class partitioning and sorting.
– Average-case: O(n)
On average, Flash Sort performs in linear time, O(n), due to efficient classification and
distribution of elements.
– Best-case: O(n)
In the best case, when the elements are uniformly distributed, Flash Sort also runs in
O(n) time complexity.
3.9 Counting Sort
3.9.1 Idea
Counting Sort is a non-comparison-based sorting algorithm that works well when the range of input values is limited. It is particularly efficient when the range of input values is small compared to the number of elements to be sorted. The basic idea of Counting Sort is to count the number of objects that possess each distinct key value, and then apply a prefix sum on those counts to determine the position of each key value in the output sequence.
3. Initialization:
4. Counting Occurrences:
(a) For each element A[i] in the input array, increment C[A[i]] by 1.
5. Cumulative Count:
(a) For each index i from 1 to k, update C[i] to be the sum of C[i] and C[i − 1]. This
cumulative count will indicate the position of each element in the output array.
(a) For each element A[i] in the input array, place it in the output array B at the position
indicated by C[A[i]].
(b) Decrement C[A[i]] by 1 after placing the element in B to account for the placement.
(a) Copy the elements from the output array B back into the original array A.
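The steps above can be assembled into a runnable sketch. Here the count is decremented before placing each element, the 0-indexed equivalent of the place-then-decrement description above:

```python
def counting_sort(A):
    """Stable counting sort for non-negative integers."""
    if not A:
        return A
    k = max(A)
    C = [0] * (k + 1)          # C[v] will hold the count of value v.
    for x in A:
        C[x] += 1              # Counting occurrences.
    for i in range(1, k + 1):  # Cumulative count: C[v] = # of elements <= v.
        C[i] += C[i - 1]
    B = [0] * len(A)
    for x in reversed(A):      # Reverse scan keeps the sort stable.
        C[x] -= 1
        B[C[x]] = x
    A[:] = B                   # Copy the output back into A.
    return A
```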
3.9.4 Visualization
For example, if we have an input array A = [4, 2, 2, 8, 3, 3, 1], the operation of Counting Sort will look like this:
After creating array C with 9 elements (the biggest value of array A is 8), we iterate through
array A and increase the value of array C at the index corresponding to each element of A by 1.
After a cumulative count on array C, we place each element in its correct position in array B based on the values in array C, decrementing the corresponding count in array C by 1 after each placement.
3.9.5 Variations
This original version cannot be used to sort negative values. We can fix this using the array-shifting technique.
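The shifting idea can be sketched as follows: offset every key by the minimum value so that all counting indices become non-negative, then shift back when emitting (the helper name is ours):

```python
def counting_sort_shifted(A):
    """Counting sort extended to negative integers via index shifting."""
    if not A:
        return A
    lo, hi = min(A), max(A)
    C = [0] * (hi - lo + 1)  # Index 0 corresponds to the minimum value lo.
    for x in A:
        C[x - lo] += 1       # Shift each key by -lo before counting.
    out = []
    for v, cnt in enumerate(C):
        out.extend([v + lo] * cnt)  # Shift back when emitting.
    A[:] = out
    return A
```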
– Worst-case: O(n + m)
The worst-case time complexity can occur when the biggest value m is n!, so the time
complexity is O(n + n!)
– Average-case: O(n + m)
– Best-case: O(n + m)
In the best case, when m is small, the algorithm can run in O(n + m) time complexity.
3.10 Radix Sort
3.10.1 Idea
Radix Sort [7] is a linear, non-comparative sorting algorithm that sorts elements by processing them digit by digit. The key idea of Radix Sort is to exploit the concept of place value: sorting the numbers digit by digit with a stable sort eventually results in a fully sorted list.
1. Find the maximum number in the array to determine the number of digits.
2. Start from the least significant digit. While there are still digits to process in the largest number:
(a) Use counting sort to sort the array based on the current digit, represented by the exponent value.
(b) Increase the exponent value by a factor of 10 to move to the next significant digit.
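A sketch of LSD radix sort for non-negative integers; the per-digit pass is the stable counting sort described in Section 3.9, applied to the digit selected by the exponent:

```python
def radix_sort(A):
    """LSD radix sort for non-negative integers, one decimal digit at a time."""
    if not A:
        return A
    exp = 1
    while max(A) // exp > 0:   # While the largest number still has digits left.
        # Stable counting sort on the digit selected by exp.
        C = [0] * 10
        for x in A:
            C[(x // exp) % 10] += 1
        for d in range(1, 10):
            C[d] += C[d - 1]
        B = [0] * len(A)
        for x in reversed(A):  # Reverse scan preserves stability.
            d = (x // exp) % 10
            C[d] -= 1
            B[C[d]] = x
        A[:] = B
        exp *= 10              # Move to the next more significant digit.
    return A
```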
3.10.4 Visualization
For example, if we have an input array A = [954, 354, 009, 411], the operation of Radix Sort will look like this:
First, we consider the rightmost digit of each number and sort accordingly. Next, we move to the middle digit and sort again, maintaining the relative order of the previously sorted digits. Finally, we sort by the most significant digit, preserving the ordering established in the previous steps. This process culminates in a fully sorted array.
3.10.5 Variations
Radix Sort can be performed using different variations, such as Least Significant Digit (LSD) Radix Sort or Most Significant Digit (MSD) Radix Sort. For string sorting, the MSD variant corresponds to sorting in lexicographic order.
For sorting with negative numbers, we can adapt the algorithm using the array-shifting method demonstrated in Section 3.9.5.
3.11 Shaker Sort
3.11.1 Idea
Shaker Sort [8], or Cocktail Sort, is a variation of Bubble Sort. The Bubble Sort algorithm always traverses elements from the left, moving the largest element to its correct position in the first iteration, the second-largest in the second iteration, and so on. Shaker Sort instead traverses the given array in both directions alternately.
1. Initialization:
(c) If no elements were swapped in the forward pass (i.e., swapped = false), the array is sorted, and the algorithm can terminate.
(d) Set swapped ← false.
(e) Decrease the value of end by 1 (end ← end − 1).
(f) Backward Pass:
• Traverse the array from index end − 1 to start.
• For each index i, if A[i] > A[i + 1], then swap A[i] and A[i + 1] and set swapped ←
true.
(g) Increase the value of start by 1 (start ← start + 1).
3. Repeat the main loop until no elements are swapped in either the forward or backward pass.
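The alternating passes described above can be sketched in Python:

```python
def shaker_sort(A):
    """Cocktail (shaker) sort: alternate forward and backward bubble passes."""
    start, end = 0, len(A) - 1
    swapped = True
    while swapped:
        swapped = False
        for i in range(start, end):              # Forward pass: push max right.
            if A[i] > A[i + 1]:
                A[i], A[i + 1] = A[i + 1], A[i]
                swapped = True
        if not swapped:  # No swaps in the forward pass: already sorted.
            break
        swapped = False
        end -= 1
        for i in range(end - 1, start - 1, -1):  # Backward pass: push min left.
            if A[i] > A[i + 1]:
                A[i], A[i + 1] = A[i + 1], A[i]
                swapped = True
        start += 1
    return A
```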
3.11.4 Visualization
Consider the input array A = {9, 5, 4, 3, 5}. Following the step-by-step description above, shaker sort alternates between forward and backward passes through the list. The forward pass moves larger elements to the end, and the backward pass moves smaller elements to the beginning.
The table below illustrates the step-by-step process of sorting array A using shaker sort.
Elements being compared are highlighted red.
Table 10: Experimental results for data order: Sorted Ascending (part 1)
Table 11: Experimental results for data order: Sorted Ascending (part 2)
Figure 8: A line graph for visualizing the algorithms’ running times on ascending input data
Figure 9: A bar chart for visualizing the algorithms’ comparisons on sorted ascending input data
4.1.2 Comments
– With sorted input, Selection Sort and Bubble Sort are the two worst-performing algo-
rithms, as their execution times are significantly larger than those of the rest. The slow
execution of these two algorithms is likely due to the large number of comparisons they
must perform.
– Additionally, Quick Sort’s execution time is also quite slow. This is because when the
array is already sorted, selecting the first or last element as the pivot results in very
unbalanced partitions (one of the two sub-arrays is empty). This leads to poor load
balancing and inefficient recursion.
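To make the unbalanced-partition effect concrete, the following sketch (a hypothetical counter, not the project's instrumented Quick Sort) uses Lomuto partitioning with the last element as pivot; on an already sorted array of n elements every partition leaves one side empty, so the total comparison count degenerates to n(n − 1)/2:

```cpp
#include <utility>
#include <vector>

// Quick Sort (Lomuto partition, last element as pivot) with a comparison
// counter. On sorted input the pivot is always the maximum, so every
// partition is maximally unbalanced and the recursion degenerates:
// (n-1) + (n-2) + ... + 1 = n(n-1)/2 comparisons in total.
long long quickSortCount(std::vector<int>& a, int lo, int hi) {
    if (lo >= hi) return 0;
    int pivot = a[hi], i = lo;
    long long cmp = 0;
    for (int j = lo; j < hi; ++j) {
        ++cmp;
        if (a[j] < pivot) std::swap(a[i++], a[j]);
    }
    std::swap(a[i], a[hi]);   // pivot lands at index i
    return cmp + quickSortCount(a, lo, i - 1) + quickSortCount(a, i + 1, hi);
}
```

For a sorted array of 100 elements this counts exactly 100 · 99 / 2 = 4950 comparisons.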
On the other hand, the other sorting algorithms’ execution times on sorted input are
relatively fast regardless of the input data size:
– Among them, Heap Sort is the slowest with an execution time of nearly 90 milliseconds
for a large data set of 500,000 numbers.
– In contrast, Shaker Sort and Insertion Sort are the optimal choices in cases when
the data is already sorted because the execution time of these algorithms is nearly 0
milliseconds. This is because the number of operations required is directly proportional
to the input size, and for a sorted array, the number of operations is minimized.
– Though Merge Sort, Counting Sort, Radix Sort, and Flash Sort are fast, they
require additional memory, which grows as the input data size increases.
– Shell Sort demonstrates excellent time performance regardless of the data size and does
not require any additional memory apart from the input array.
– Quick Sort consistently shows a high number of comparisons, particularly for larger
input sizes. This reflects Quick Sort’s worst-case complexity of O(n²), which is reached
because the data is already sorted.
– Bubble Sort and Selection Sort show a high number of comparisons, especially as the
input size increases. Their performance reflects their O(n²) time complexity, making them
less efficient for large datasets.
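The comparison-count claims above can be checked with two small counting sketches (illustrative helpers, not the project's instrumented versions): on sorted input, Insertion Sort's inner loop never runs, so it performs only n − 1 comparisons, while Selection Sort scans the full unsorted suffix on every pass and performs n(n − 1)/2 comparisons regardless of input order:

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Insertion Sort with a comparison counter. On already sorted input the
// while-condition fails on its first test for every element, so the
// total is exactly n - 1 comparisons.
long long insertionSortCount(std::vector<int>& a) {
    long long cmp = 0;
    for (std::size_t i = 1; i < a.size(); ++i) {
        int key = a[i];
        std::size_t j = i;
        while (j > 0 && (++cmp, a[j - 1] > key)) {
            a[j] = a[j - 1];   // shift right; no swap needed
            --j;
        }
        a[j] = key;
    }
    return cmp;
}

// Selection Sort with a comparison counter. The inner scan always covers
// the whole unsorted suffix, so the total is n(n-1)/2 comparisons no
// matter how the input is ordered -- even when it is already sorted.
long long selectionSortCount(std::vector<int>& a) {
    long long cmp = 0;
    int n = (int)a.size();
    for (int i = 0; i < n - 1; ++i) {
        int minIdx = i;
        for (int j = i + 1; j < n; ++j) {
            ++cmp;
            if (a[j] < a[minIdx]) minIdx = j;
        }
        std::swap(a[i], a[minIdx]);
    }
    return cmp;
}
```

On a sorted array of 1,000 elements the first counter returns 999 while the second returns 499,500, which mirrors the gap observed in the experiments.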
Table 12: Experimental results for data order: Nearly sorted (part 1)
Table 13: Experimental results for data order: Nearly sorted (part 2)
Figure 10: A line graph for visualizing the algorithms’ running times on nearly sorted input data
Figure 11: A bar chart for visualizing the algorithms’ comparisons on nearly sorted input data
4.2.2 Comments
• According to Table 12, Table 13, and the line graph above, it is evident that:
– With nearly sorted input, Bubble Sort is the worst-performing algorithm, as its execution
times are significantly larger than those of the rest. Its slow execution is likely due to
the large number of comparisons it performs.
– Additionally, Selection Sort’s execution time is also quite slow compared to the other
algorithms.
– The execution times of the remaining algorithms are roughly comparable.
• According to Table 12, Table 13, and the bar chart above, it is evident that:
– Quick Sort consistently shows a high number of comparisons, particularly for larger
input sizes. This reflects Quick Sort’s worst-case complexity of O(n²), which is approached
because the data is already nearly sorted.
– Bubble Sort and Selection Sort show a high number of comparisons, especially as the
input size increases. Their performance reflects their O(n²) time complexity, making them
less efficient for large datasets.
– Insertion Sort and Shaker Sort are the fastest algorithms: Insertion Sort takes
advantage of the nearly sorted data, and Shaker Sort uses a flag that detects when the data
is already sorted, so both are well suited to the nearly sorted data order.
– The comparison counts of the remaining algorithms are roughly equal to each other.
Table 14: Experimental results for data order: Reverse sorted (part 1)
Table 15: Experimental results for data order: Reverse sorted (part 2)
Figure 12: A line graph for visualizing the algorithms’ running times on reverse sorted input data
Figure 13: A bar chart for visualizing the algorithms’ comparisons on reverse sorted input data
4.3.2 Comments
• From the graph, it is evident that algorithms with O(n²) complexity have the longest runtimes.
Specifically, Bubble Sort and Shaker Sort are among the slowest because they require numerous
comparisons and swaps. Bubble Sort, which continuously passes through the list and swaps
adjacent elements if they are in the wrong order, ends up performing more comparisons than
Insertion Sort. As a result, Selection Sort and Insertion Sort are faster than Bubble Sort and
Shaker Sort, with Insertion Sort being particularly effective for nearly sorted data or small
datasets due to its efficient use of comparisons and swaps.
• Moving on to algorithms with O(n log n) complexity, these are generally more efficient. Shell
Sort, an enhanced version of Insertion Sort, improves performance by comparing elements far
apart and gradually reducing the gap between them. This often leads to faster sorting times
compared to other O(n log n) algorithms, especially with a well-chosen gap sequence. Quick
Sort, however, exhibits its worst-case performance here, primarily because the pivot selection
strategy was suboptimal. In contrast, Merge Sort and Heap Sort maintain stable performance
across various data distributions due to their structured approach to sorting.
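As one concrete illustration of the gap idea, here is a Shell Sort sketch using Knuth's 3h + 1 gap sequence (one common choice; the project's implementation may use a different sequence):

```cpp
#include <vector>

// Shell Sort: gapped insertion sort with a shrinking gap. Far-apart
// elements are compared first, so out-of-place items travel long
// distances in few steps. The gap sequence here is Knuth's
// h = 3h + 1 (1, 4, 13, 40, ...); the choice of sequence strongly
// affects performance, and the final gap of 1 guarantees correctness.
void shellSort(std::vector<int>& a) {
    int n = (int)a.size();
    int gap = 1;
    while (gap < n / 3) gap = 3 * gap + 1;  // largest Knuth gap below n
    for (; gap >= 1; gap /= 3) {
        for (int i = gap; i < n; ++i) {     // gapped insertion sort
            int key = a[i], j = i;
            while (j >= gap && a[j - gap] > key) {
                a[j] = a[j - gap];
                j -= gap;
            }
            a[j] = key;
        }
    }
}
```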
• In the realm of linearithmic algorithms, Shell Sort, Merge Sort, and Heap Sort generally
outperform Quick Sort in less favorable conditions. Shell Sort’s efficiency is highly dependent
on the gap sequence used; an optimal sequence can greatly enhance its performance. Merge
Sort, which systematically divides the array and merges sorted subarrays, provides consistent
performance with a stable O(n log n) complexity. Heap Sort, which constructs a heap from
the array elements, also offers reliable performance, although its implementation can be more
complex.
• Analyzing the distribution-based algorithms, Flash Sort performs better than Radix Sort,
while Counting Sort remains the fastest, particularly for datasets with a limited range
of values, due to its non-comparative nature. Radix Sort processes numbers digit by digit,
sorting quickly while using less space than Counting Sort, which requires additional space
proportional to the range of the input data.
• In summary, for general-purpose sorting, algorithms such as Merge Sort, Quick Sort, and Heap
Sort tend to be the most efficient due to their balanced time complexity and practical per-
formance. However, the optimal sorting algorithm depends on the characteristics of the data
and the desired performance on varying input sizes and distributions. It’s crucial to consider
trade-offs between time complexity, data distribution, and implementation complexity when
selecting a sorting algorithm. Algorithms like Counting Sort, Radix Sort, and Flash Sort,
while efficient in terms of comparisons, may not always be ideal for general-purpose sorting
due to their specific requirements and constraints. In practice, hybrid algorithms like Timsort,
which combines Merge Sort and Insertion Sort, are often favored in standard libraries for their
robust performance across diverse datasets.
• According to the chart, algorithms with O(n²) complexity lead in the number of comparisons,
particularly Bubble Sort, Shaker Sort, and Selection Sort, which all perform n(n − 1)/2 com-
parisons. Insertion Sort, being optimized, runs more efficiently than these three and involves
fewer comparisons, making it preferable in many scenarios.
• For O(n log n) algorithms, Quick Sort stands out with the most comparisons in its worst-case
scenario. Merge Sort and Heap Sort require fewer comparisons, approximately n log₂(n) in
most cases. Shell Sort proves to be the fastest among these algorithms, likely due to an
effective choice of gap sequence, which significantly influences its performance. Merge Sort’s
divide-and-conquer strategy ensures reliable performance, while Heap Sort benefits from its
binary heap structure, maintaining a balanced approach to comparisons.
• Algorithms such as Counting Sort, Radix Sort, and Flash Sort utilize the fewest comparisons.
Counting Sort is particularly efficient for large input sizes due to its counting mechanism
that bypasses direct comparisons. Radix Sort, sorting by digit or character, also achieves
low comparison counts and performs well with uniformly distributed numeric data. Flash
Sort, though less common, combines distribution-based sorting and linear sorting techniques,
making it effective under specific conditions.
• In summary, algorithms like Counting Sort, Radix Sort, and Flash Sort achieve lower compar-
ison counts through unique mechanisms that avoid direct element comparisons. However, for
general-purpose sorting, Merge Sort and Quick Sort generally offer better efficiency and lower
comparison counts than quadratic algorithms like Selection Sort, Insertion Sort, Bubble Sort,
and Shaker Sort. The performance of these algorithms can be influenced by data distribution
and implementation details, so it’s essential to choose the sorting algorithm based on the
specific requirements of the task, considering factors such as stability, memory usage, and the
nature of the input data.
Figure 14: A line graph for visualizing the algorithms’ running times on random input data
Figure 15: A bar chart for visualizing the algorithms’ comparisons on random input data
4.4.2 Comments
• According to Table 16, Table 17, and Figure 14, it is evident that:
– With randomized input, Bubble Sort and Shaker Sort exhibit the worst performance
regardless of the input data size due to their O(n²) time complexity.
– Following them are Selection Sort and Insertion Sort. Despite having the same
O(n²) time complexity, their figures are slightly better. This can be attributed to how
they move elements: Bubble Sort and Shaker Sort require many adjacent swaps, which may
move elements across the entire array, so they can be slower, especially when dealing with
larger arrays. Conversely, Selection Sort performs only one swap per outer iteration, and
Insertion Sort shifts elements into place rather than repeatedly swapping them.
On the other hand, the other sorting algorithms’ execution times are relatively fast
regardless of the input data size:
– Among them, Shell Sort and Heap Sort are the slowest, with execution times around 100
milliseconds for a large data set of 500,000 data points.
– Counting Sort exhibits exceptional speed, making it the fastest sorting algorithm. As
can be seen from Table 17, it takes only 4 milliseconds to sort a data set of 500,000
elements.
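Counting Sort's speed comes from avoiding element-to-element comparisons entirely. A stable sketch for non-negative integers (an illustration only, not the project's sorting.cpp):

```cpp
#include <algorithm>
#include <vector>

// Counting Sort for non-negative integers: tally occurrences, turn the
// tallies into prefix sums (final positions), then place each element.
// No element-to-element comparisons are performed; the cost is
// O(n + k), where k is the maximum value in the input.
std::vector<int> countingSort(const std::vector<int>& a) {
    if (a.empty()) return {};
    int maxVal = *std::max_element(a.begin(), a.end());
    std::vector<int> count(maxVal + 1, 0), out(a.size());
    for (int x : a) ++count[x];                                  // tally
    for (int v = 1; v <= maxVal; ++v) count[v] += count[v - 1];  // prefix sums
    for (int i = (int)a.size() - 1; i >= 0; --i)                 // stable placement
        out[--count[a[i]]] = a[i];
    return out;
}
```

The trade-off discussed above is visible here: the count array is proportional to the value range k, not to n, which is why the algorithm needs additional memory.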
– Following Counting Sort are Flash Sort, Radix Sort, Quick Sort, and Merge Sort,
in that order. However, though fast, these algorithms require additional memory, which
grows with the input data size.
– Quick Sort demonstrates excellent time performance regardless of the data size and does
not require any additional memory apart from the input array.
• Looking at Table 16, Table 17, and Figure 15, we can observe that:
– The execution of Bubble Sort and Selection Sort demands the greatest number of
comparisons. With just an input data size of 10,000, they need more than 100,000,000
comparisons. Besides, Insertion Sort and Shaker Sort also require a significant num-
ber of comparisons (approximately 50,000,000 for a data set of 10,000 elements). This is
one of the reasons why these algorithms are unsuitable for large data sets.
– Shell Sort is an improvement on Insertion Sort: the use of gap sequences and the ability
to move elements across multiple positions make Shell Sort more efficient, resulting in
fewer comparisons than Insertion Sort.
– Heap Sort, Merge Sort, and Quick Sort all have the same O(n log n) time complex-
ity. However, Quick Sort requires the fewest comparisons: about half the comparisons
needed by Merge Sort and one-third of those needed by Heap Sort.
– Counting Sort and Flash Sort require the fewest comparisons among the 11 algorithms
(only around 2,000,000 and 4,500,000 comparisons, respectively, for a set of 500,000 data
points). Moreover, Counting Sort needs about 1.5 to nearly 2 times fewer comparisons
than Flash Sort.
• Unstable: Selection Sort, Shell Sort, Heap Sort, Quick Sort, Flash Sort.
• Source files
– main.cpp: Contains the main function, which is the entry point of the program. It calls
processArg to process command-line arguments.
– process.cpp: Handles argument processing (processArg), data generation
(GenerateRandomData, GenerateSortedData, GenerateReverseData,
GenerateNearlySortedData), and sorting-process orchestration (processSort), which
receives data from processArg. It also includes utility functions like HoanVi for swapping
elements and genAndWrite for generating and writing data to files.
– sorting.cpp: Implementation of various sorting algorithms, including a version without
comparisons count (for measuring running time) and a version with comparisons count.
• Header file
– sorting.h: Includes necessary libraries; prototypes for the sorting algorithms, data
generators, and command-line argument processing; struct declarations.
• Input/Output files: These are *.txt files generated when running commands. For further
information, please review Lab 3 requirements.
• Executable
– main.exe: The compiled executable of the project, generated from main.cpp and all
other source files.
For simplicity, we have defined a struct and corresponding functions to store running time and
comparisons count of each sorting algorithm:
• The Record struct, defined in sorting.h, is a simple data structure used to store the perfor-
mance metrics of sorting algorithms. It contains two fields:
– comparison: A long long type variable that holds the number of comparisons made
during the sorting process.
– time: A long long type variable that measures the time taken by the sorting algorithm
in milliseconds.
– Code snippet:
struct Record {
    long long comparison;
    long long time;
};
– The purpose of the Record struct is to encapsulate the performance data of sorting
algorithms, making it easier to collect, store, and compare the efficiency of different
sorting methods based on the number of comparisons and the actual execution time.
– Its prototype (for more details, please look at the source code of this project):
Record getRecord(int a[], int n, void (*sortFunctionCmp)(int[], int, long long&),
                 void (*sortFunction)(int[], int))
– A brief explanation of how this function works: the function sorts an array a of size
n and records the number of comparisons and the running time of a sorting algorithm.
We pass two function pointers for that algorithm: one with a comparison counter
(sortFunctionCmp) and one without (sortFunction).
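One possible shape of getRecord, assuming std::chrono for timing (this is a hypothetical sketch based on the description above, repeating the Record struct so it is self-contained; the actual implementation is in the project's source code):

```cpp
#include <chrono>
#include <vector>

struct Record {
    long long comparison;
    long long time;   // milliseconds
};

// Hypothetical sketch of getRecord: run the counting version on one copy
// of the input to record comparisons, then time the plain version on a
// fresh copy so the counter overhead does not distort the measurement.
Record getRecord(int a[], int n,
                 void (*sortFunctionCmp)(int[], int, long long&),
                 void (*sortFunction)(int[], int)) {
    Record rec{0, 0};

    std::vector<int> copy1(a, a + n);          // preserve the caller's data
    sortFunctionCmp(copy1.data(), n, rec.comparison);

    std::vector<int> copy2(a, a + n);
    auto start = std::chrono::steady_clock::now();
    sortFunction(copy2.data(), n);
    auto end = std::chrono::steady_clock::now();
    rec.time = std::chrono::duration_cast<std::chrono::milliseconds>(end - start).count();
    return rec;
}
```

Using a monotonic clock such as steady_clock avoids distortions from system clock adjustments during long runs.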
To avoid stack overflow (for top-down sorting algorithms), we build the executable file with the
following command:
cd $dir && g++ -Wl,--stack,36777216 *.cpp -o main
References
[1] Anany Levitin. Selection sort. In Introduction to The Design and Analysis of Algorithms, pages
98–100. Addison-Wesley, 2012.
[3] Anany Levitin. Heaps and heap sort. In Introduction to The Design and Analysis of Algorithms,
pages 226–228. Addison-Wesley, 2012.
[4] Dr. Nguyen Hai Minh. Sorting (part 2) - DSA Lecture notes, 2024.
A Appendix
• This LaTeX template is provided free of charge by Quan, Tran Hoang at https://github.
com/khongsomeo/hcmus-unofficial-report-template, with modifications by the authors
of this project under the GNU GPL v3.0.