Sorting Algorithms
Module 5
Selection sort

Selection sort is a sorting algorithm, specifically an in-place comparison sort. It has O(n²) time complexity, making it inefficient on large lists, and it generally performs worse than the similar insertion sort. Selection sort is noted for its simplicity, and it also has performance advantages over more complicated algorithms in certain situations, particularly where auxiliary memory is limited.

The algorithm works as follows:
1. Find the minimum value in the list.
2. Swap it with the value in the first position.
3. Repeat the steps above for the remainder of the list (starting at the second position and advancing each time).

Effectively, the list is divided into two parts: the sublist of items already sorted, which is built up from left to right and is found at the beginning of the array, and the sublist of items remaining to be sorted, occupying the remainder.

Algorithm

void selection_sort(int a[], int n)
{
    int i, j, temp;
    for (i = 0; i < n - 1; i++)
        for (j = i + 1; j < n; j++)
            if (a[i] > a[j]) {
                temp = a[i];
                a[i] = a[j];
                a[j] = temp;
            }
}

Here is an example of this sort algorithm sorting five elements:
64 25 12 22 11
11 25 12 22 64
11 12 25 22 64
11 12 22 25 64
11 12 22 25 64
Time Complexity of selection sort

The outer for loop executes (n-1) times. The inner for loop executes (n-1) + (n-2) + ... + 1 = n(n-1)/2 times in total. The three statements of the swap execute between 0 and n(n-1)/2 times, depending on how often a[i] > a[j].

T(n) = (n-1) + (n-2) + ... + 1 = n(n-1)/2 = O(n²)
DS
Bubble sort

Bubble sort, also known as sinking sort, is a simple sorting algorithm that works by repeatedly stepping through the list to be sorted, comparing each pair of adjacent items and swapping them if they are in the wrong order. The pass through the list is repeated until no swaps are needed, which indicates that the list is sorted. The algorithm gets its name from the way smaller elements "bubble" to the top of the list. Because it only uses comparisons to operate on elements, it is a comparison sort. Although the algorithm is simple, it is not efficient for sorting large lists; other algorithms are better.

Algorithm

void bubble_sort(int a[], int n)
{
    int i, j, temp;
    for (i = 1; i < n; i++)
        for (j = 0; j < n - i; j++)
            if (a[j] > a[j+1]) {
                temp = a[j];
                a[j] = a[j+1];
                a[j+1] = temp;
            }
}

Step-by-step example

Let us take the array of numbers "5 1 4 2 8" and sort it from lowest number to greatest number using the bubble sort algorithm. In each step, the elements being compared are written in bold. Three passes will be required.

First Pass:
( 5 1 4 2 8 ) --> ( 1 5 4 2 8 ), Here, the algorithm compares the first two elements and swaps them.
( 1 5 4 2 8 ) --> ( 1 4 5 2 8 ), Swap since 5 > 4
( 1 4 5 2 8 ) --> ( 1 4 2 5 8 ), Swap since 5 > 2
( 1 4 2 5 8 ) --> ( 1 4 2 5 8 ), Now, since these elements are already in order (8 > 5), the algorithm does not swap them.

Second Pass:
( 1 4 2 5 8 ) --> ( 1 4 2 5 8 )
( 1 4 2 5 8 ) --> ( 1 2 4 5 8 ), Swap since 4 > 2
( 1 2 4 5 8 ) --> ( 1 2 4 5 8 )
( 1 2 4 5 8 ) --> ( 1 2 4 5 8 )
Now the array is already sorted, but our algorithm does not know whether it is complete. The algorithm needs one whole pass without any swap to know it is sorted.

Third Pass:
( 1 2 4 5 8 ) --> ( 1 2 4 5 8 )
( 1 2 4 5 8 ) --> ( 1 2 4 5 8 )
( 1 2 4 5 8 ) --> ( 1 2 4 5 8 )
( 1 2 4 5 8 ) --> ( 1 2 4 5 8 )

Bubble sort has worst-case and average complexity both O(n²), where n is the number of items being sorted.
Insertion sort

Insertion sort is a simple sorting algorithm: a comparison sort in which the sorted array (or list) is built one entry at a time. It is much less efficient on large lists than more advanced algorithms such as quicksort, heapsort, or merge sort. However, insertion sort provides several advantages:
- Simple implementation
- Efficient for (quite) small data sets
- Adaptive (i.e., efficient) for data sets that are already substantially sorted: the time complexity is O(n + d), where d is the number of inversions
- More efficient in practice than most other simple quadratic (i.e., O(n²)) algorithms such as selection sort or bubble sort; the best case (nearly sorted input) is O(n)
- Stable; i.e., does not change the relative order of elements with equal keys
- In-place; i.e., only requires a constant amount O(1) of additional memory space
- Online; i.e., can sort a list as it receives it
Every repetition of insertion sort removes an element from the input data, inserting it into the correct position in the already-sorted list, until no input elements remain.

Algorithm
PRE: array of N elements (indexed 0 to N-1)
POST: array sorted
1. An array of one element only is sorted.
2. Assume that the first p elements are sorted. For j = p to N-1, take the j-th element and find a place for it among the first j sorted elements.
int j, p, tmp;
for (p = 1; p < N; p++) {
    tmp = a[p];
    for (j = p; j > 0 && tmp < a[j-1]; j--)
        a[j] = a[j-1];
    a[j] = tmp;
}
Time Complexity
Case I (best case: the list is already sorted): Only one comparison is needed on each pass. Hence the complexity is O(n).
Case II (worst case: the list is sorted in reverse order):
No. of comparisons = n + (n-1) + (n-2) + ... + 2 = n(n+1)/2 - 1 = O(n²).
Example: The following table shows the steps for sorting the sequence {3, 7, 4, 9, 5, 2, 6, 1}. In each step the item under consideration is underlined, and the item moved (or held in place if it was the biggest yet considered) in the previous step is shown in bold.

3 7 4 9 5 2 6 1
3 7 4 9 5 2 6 1
3 7 4 9 5 2 6 1
3 4 7 9 5 2 6 1
3 4 7 9 5 2 6 1
3 4 5 7 9 2 6 1
2 3 4 5 7 9 6 1
2 3 4 5 6 7 9 1
1 2 3 4 5 6 7 9

Quick Sort

Quicksort (also known as "partition-exchange sort") is a divide and conquer algorithm. Quicksort first divides a large list into two smaller sub-lists: the low elements and the high elements. Quicksort can then recursively sort the sub-lists. The recursion step is described below:
1. Choose a pivot value. We take the value of the middle element as the pivot value, but it can be any value that is in the range of the sorted values, even if it is not present in the array.
2. Partition. Rearrange the elements in such a way that all elements less than the pivot go to the left part of the array and all elements greater than the pivot go to the right part. Values equal to the pivot can stay in either part. Notice that the array may be divided into non-equal parts.
3. Sort both parts. Apply the quicksort algorithm recursively to the left and right parts.
There are two indices, i and j. At the very beginning of the partition algorithm, i points to the first element in the array and j points to the last one. The algorithm then moves i forward until an element with value greater than or equal to the pivot is found. Index j is moved backward until an element with value less than or equal to the pivot is found. If i <= j, the elements are swapped, i steps to the next position (i + 1), and j steps to the previous one (j - 1). The algorithm stops when i becomes greater than j. After partition, all values before the i-th element are less than or equal to the pivot, and all values after the j-th element are greater than or equal to the pivot.
void quickSort(int a[], int low, int high)
{
    int i = low, j = high;
    int tmp;
    int pivot = a[(low + high) / 2];

    /* partition */
    while (i <= j) {
        while (a[i] < pivot)
            i++;
        while (a[j] > pivot)
            j--;
        if (i <= j) {
            tmp = a[i];
            a[i] = a[j];
            a[j] = tmp;
            i++;
            j--;
        }
    }

    /* recursion */
    if (low < j)
        quickSort(a, low, j);
    if (i < high)
        quickSort(a, i, high);
}
Quicksort is a sorting algorithm which, on average, makes O(n log n) (big O notation) comparisons to sort n items. In the worst case it makes O(n²) comparisons, though this behavior is rare. Quicksort is often faster in practice than other O(n log n) algorithms. Additionally, quicksort's sequential and localized memory references work well with a cache. Quicksort can be implemented with an in-place partitioning algorithm, so the entire sort can be done with only O(log n) additional space.

Time Complexity
Case I (average case): Assume that the file is of size n, and that each time, after fixing an element, the file is divided into two equal partitions.
T(n) = 2T(n/2) + O(n)        (O(n) to place the pivot element)
     = 4T(n/4) + 2·O(n)
     ...
     = 2^k T(n/2^k) + k·O(n)
This iteration continues until 2^k = n, i.e., k = log₂n. Then
T(n) = n·T(1) + log₂n·O(n) = O(n log₂n).

Case II (worst case): Assume that each time, after fixing an element, the file is divided into a smaller partition and a bigger partition; say, during the i-th partition it is divided into parts of size 1 and (i-1).
No. of comparisons = n + (n-1) + (n-2) + ... + 2 = n(n+1)/2 - 1 = O(n²)
Merge sort

Merge sort is an O(n log n) comparison-based sorting algorithm. Most implementations produce a stable sort, meaning that the implementation preserves the input order of equal elements in the sorted output. It is a divide and conquer algorithm.

Conceptually, a merge sort works as follows:
1. If the list is of length 0 or 1, then it is already sorted. Otherwise:
2. Divide the unsorted list into two sublists of about half the size.
3. Sort each sublist recursively by re-applying the merge sort.
4. Merge the two sublists back into one sorted list.

Recursive Algorithm

void merge(int a[], int low, int high, int mid);

void mergesort(int a[], int low, int high)
{
    int mid;
    if (low < high) {
        mid = (low + high) / 2;
        mergesort(a, low, mid);
        mergesort(a, mid+1, high);
        merge(a, low, high, mid);
    }
}

void merge(int a[], int low, int high, int mid)
{
    int i, j, k, c[50];    /* auxiliary array; limits the input to 50 elements */
    i = low;
    j = mid + 1;
    k = low;
    while ((i <= mid) && (j <= high)) {
        if (a[i] < a[j]) {
            c[k] = a[i];
            k++;
            i++;
        } else {
            c[k] = a[j];
            k++;
            j++;
        }
    }
    while (i <= mid) {
        c[k] = a[i];
        k++;
        i++;
    }
    while (j <= high) {
        c[k] = a[j];
        k++;
        j++;
    }
    for (i = low; i < k; i++)
        a[i] = c[i];
}

Bottom-up merge sort is a non-recursive variant of merge sort, in which the array is sorted by a sequence of passes. During each pass, the array is divided into blocks of size m (initially, m = 1). Every two adjacent blocks are merged (as in normal merge sort), and the next pass is made with a twice larger value of m.
Input: array a[] indexed from 0 to n-1.
m = 1
while m < n do
    i = 0
    while i < n-m do
        merge subarrays a[i..i+m-1] and a[i+m..min(i+2*m-1, n-1)] in place
        i = i + 2*m
    m = m * 2
Time Complexity
There are log₂n passes, each with n comparisons. Hence the complexity is O(n log₂n) for all cases. But quicksort is often considered better than merge sort, even though quicksort's worst-case complexity is O(n²), because merge sort performs twice as many assignments as quicksort. Also, merge sort needs additional space to store the auxiliary array.
Radix Sort

A least significant digit (LSD) radix sort is a fast stable sorting algorithm which can be used to sort keys in integer representation order. An LSD radix sort operates in O(nk) time, where n is the number of keys and k is the average key length. Each key is first figuratively dropped into one level of buckets corresponding to the value of its rightmost digit. Each bucket preserves the original order of the keys as the keys are dropped in. There is a one-to-one correspondence between the number of buckets and the number of values that can be represented by a digit. Then the process repeats with the next more significant digit until there are no more digits to process. In other words:
1. Take the least significant digit (or group of bits, both being examples of radices) of each key.
2. Group the keys based on that digit, but otherwise keep the original order of keys. (This is what makes the LSD radix sort a stable sort.)
3. Repeat the grouping process with each more significant digit.
4. The sort in step 2 is usually done using bucket sort or counting sort, which are efficient in this case since there are usually only a small number of digits.

Example: Consider the array of elements
624 852 426 987 269 146 415 301 730 78 593

Queue contents after each pass (queues 0-9):
pass 1:  0: 730 | 1: 301 | 2: 852 | 3: 593 | 4: 624 | 5: 415 | 6: 426, 146 | 7: 987 | 8: 78 | 9: 269
pass 2:  0: 301 | 1: 415 | 2: 624, 426 | 3: 730 | 4: 146 | 5: 852 | 6: 269 | 7: 78 | 8: 987 | 9: 593
pass 3:  0: 78 | 1: 146 | 2: 269 | 3: 301 | 4: 415, 426 | 5: 593 | 6: 624 | 7: 730 | 8: 852 | 9: 987
Time Complexity
The running time depends on the number of digits in the numbers (m) and the number of elements in the array (n). The outer loop executes m times and the inner loop executes n times, once for each element. Hence the complexity is O(mn); if m is approximately equal to log₂n, the time complexity of radix sort is O(n log₂n).
Heap Sort

The binary heap data structure is an array that can be viewed as a complete binary tree. Each node of the binary tree corresponds to an element of the array. The array is completely filled on all levels except possibly the lowest.
Four basic procedures on a heap are:
1. Heapify, which runs in O(lg n) time.
2. Build-Heap, which runs in linear time.
3. Heap Sort, which runs in O(n lg n) time.
4. Extract-Max, which runs in O(lg n) time.
Heapify picks the largest child key and compares it to the parent key. If the parent key is larger, Heapify quits; otherwise it swaps the parent key with the largest child key, so that the parent now becomes larger than its children. It is important to note that the swap may destroy the heap property of the subtree rooted at the largest child node. If this is the case, Heapify calls itself again using the largest child node as the new root.

Heapify (A, i)
    l <- left[i]
    r <- right[i]
    if l <= heap-size[A] and A[l] > A[i]
        then largest <- l
        else largest <- i
    if r <= heap-size[A] and A[r] > A[largest]
        then largest <- r
    if largest != i
        then exchange A[i] <-> A[largest]
             Heapify (A, largest)
Building a Heap
We can use the procedure Heapify in a bottom-up fashion to convert an array A[1..n] into a heap. Since the elements in the subarray A[⌊n/2⌋+1..n] are all leaves, the procedure BUILD_HEAP goes through the remaining nodes of the tree and runs Heapify on each one. The bottom-up order of processing nodes guarantees that the subtrees rooted at the children are heaps before Heapify is run at their parent.

BUILD_HEAP (A)
    heap-size[A] <- length[A]
    for i <- floor(length[A]/2) down to 1
        do Heapify (A, i)
We can build a heap from an unordered array in linear time.

Heap Sort Algorithm
The heap sort combines the best of both merge sort and insertion sort: like merge sort, the worst-case time of heap sort is O(n log n), and like insertion sort, heap sort sorts in place. The heap sort algorithm starts by using procedure BUILD_HEAP to build a heap on the input array A[1..n]. Since the maximum element of the array is stored at the root A[1], it can be put into its correct final position by exchanging it with A[n] (the last element in A). If we now discard node n from the heap, the remaining elements can be made into a heap. Note that the new element at the root may violate the heap property; all that is needed to restore it is one call to Heapify.

HEAPSORT (A)
    BUILD_HEAP (A)
    for i <- length(A) down to 2
        do exchange A[1] <-> A[i]
           heap-size[A] <- heap-size[A] - 1
           Heapify (A, 1)
The HEAPSORT procedure takes time O(n lg n), since the call to BUILD_HEAP takes time O(n) and each of the n-1 calls to Heapify takes time O(lg n).

Implementation

void siftDown(int numbers[], int root, int bottom);

void heapSort(int numbers[], int array_size)
{
    int i, temp;
    /* phase 1: build the heap (0-based array, last index is array_size-1) */
    for (i = (array_size / 2) - 1; i >= 0; i--)
        siftDown(numbers, i, array_size - 1);
    /* phase 2: repeatedly move the maximum to the end and re-adjust */
    for (i = array_size - 1; i >= 1; i--) {
        temp = numbers[0];
        numbers[0] = numbers[i];
        numbers[i] = temp;
        siftDown(numbers, 0, i - 1);
    }
}

void siftDown(int numbers[], int root, int bottom)
{
    int done, maxChild, temp;
    done = 0;
    /* in a 0-based array the children of root are 2*root+1 and 2*root+2 */
    while ((root * 2 + 1 <= bottom) && (!done)) {
        if (root * 2 + 1 == bottom)
            maxChild = root * 2 + 1;
        else if (numbers[root * 2 + 1] > numbers[root * 2 + 2])
            maxChild = root * 2 + 1;
        else
            maxChild = root * 2 + 2;
        if (numbers[root] < numbers[maxChild]) {
            temp = numbers[root];
            numbers[root] = numbers[maxChild];
            numbers[maxChild] = temp;
            root = maxChild;
        } else
            done = 1;
    }
}

Time Complexity
At each step of the "adjust" (sift-down) algorithm, a node is compared to its children and one of them is chosen as the next root; we drop one level of the tree at each step. Since this is a complete binary tree, there are at most log(n) levels. Thus, the worst-case time complexity of "adjust" is O(log n). In the heap sort algorithm, we simply use "adjust" n/2 times in Phase 1 and (n-1) times in Phase 2. Thus, heap sort has a worst-case time complexity of O(n log n).