Summaries of popular sorting algorithms
Insertion sort - is a simple sorting algorithm that is relatively efficient for small
lists and mostly-sorted lists, and often is used as part of more sophisticated
algorithms. It works by taking elements from the list one by one and inserting
them in their correct position into a new sorted list. In arrays, the new list and
the remaining elements can share the array's space, but insertion is expensive,
requiring shifting all following elements over by one. Shell sort (see below) is a
variant of insertion sort that is more efficient for larger lists.
Shell sort - was invented by Donald Shell in 1959. It improves upon bubble sort
and insertion sort by moving out of order elements more than one position at a
time. One implementation can be described as arranging the data sequence in a
two-dimensional array and then sorting the columns of the array using insertion
sort.
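As a rough illustration of the column idea, the following Python sketch performs gapped insertion-sort passes with a shrinking gap; the particular gap sequence and the function name are illustrative choices, not part of the description above:

def shell_sort(a, gaps=(701, 301, 132, 57, 23, 10, 4, 1)):
    # Each pass is an insertion sort over elements that are `gap` apart,
    # which is equivalent to sorting the columns of a gap-wide table.
    n = len(a)
    for gap in gaps:
        if gap >= n:
            continue
        for i in range(gap, n):
            value = a[i]
            j = i
            while j >= gap and a[j - gap] > value:
                a[j] = a[j - gap]   # move larger elements one gap to the right
                j -= gap
            a[j] = value
    return a

print(shell_sort([23, 4, 42, 8, 15, 16]))   # [4, 8, 15, 16, 23, 42]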
Merge sort - takes advantage of the ease of merging already sorted lists into a
new sorted list. It starts by comparing every two elements (i.e., 1 with 2, then 3
with 4...) and swapping them if the first should come after the second. It then
merges each of the resulting lists of two into lists of four, then merges those lists
of four, and so on; until at last two lists are merged into the final sorted list. Of
the algorithms described here, this is the first that scales well to very large lists,
because its worst-case running time is O(n log n). Merge sort has seen a
relatively recent surge in popularity for practical implementations, being used for
the standard sort routine in several programming languages (for example Perl,
Python, and Java, as noted below).
Bucket sort - works by distributing the elements into a number of buckets, sorting
each bucket individually, and then concatenating the buckets in order. Because
bucket sort must use a limited number of buckets, it is best suited to data sets of
a limited range. Bucket sort would be unsuitable for data such as social security
numbers, which have a lot of variation.
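A minimal sketch of the idea for numeric keys in a known, limited range (the bucket count and function name are illustrative assumptions):

def bucket_sort(values, num_buckets=10):
    # Distribute the values into buckets by range, sort each bucket, concatenate.
    if not values:
        return []
    lo, hi = min(values), max(values)
    width = (hi - lo) / num_buckets or 1
    buckets = [[] for _ in range(num_buckets)]
    for v in values:
        buckets[min(int((v - lo) / width), num_buckets - 1)].append(v)
    result = []
    for bucket in buckets:
        result.extend(sorted(bucket))   # each bucket can be sorted with any algorithm
    return result

print(bucket_sort([29, 3, 11, 47, 8, 33]))   # [3, 8, 11, 29, 33, 47]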
Timsort - finds runs in the data, creates runs with insertion sort if necessary, and
then uses merge sort to create the final sorted list. It has the same complexity
(O(n log n)) in the average and worst cases, but with pre-sorted data it goes down
to O(n).
Sorting an array or a list of items is one of the most basic and most useful data-
processing operations. Here, we assume we have to sort an array (meaning we can
access any item of the collection in constant time). Some of the algorithms we
present can be used with other data structures, such as linked lists, but we don't
focus on this.
The most classical approach is called in-place, comparison based sorting. The
sort visualizer presents only examples in this category. In place sorting means
that, to save memory, one does not allow use of additional storage besides the
items being compared or moved. Comparison-based sort means that we have a
function (compare(a,b)) that can only tell whether a<b, a>b or a=b. No other
information on the keys is available, such as the ordinality of the items, nor is it
possible to logically group them by any means prior to sorting. Under such
restrictions, any algorithm will require at least log2(n!) (which is Θ(n log n))
comparisons of one item against another to sort some array in the worst case. Any
type of sort requires at least n moves in the worst case.
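To see why the log2(n!) bound is the same as Θ(n log n), note that a comparison sort must be able to distinguish all n! possible orderings of the input, so its decision tree needs at least log2(n!) comparisons; a short sketch of the bound in LaTeX:

% A decision tree for a comparison sort has at least n! leaves,
% so its height (worst-case number of comparisons) is at least:
\log_2 (n!) = \sum_{k=1}^{n} \log_2 k
            \ge \sum_{k=\lceil n/2 \rceil}^{n} \log_2 k
            \ge \frac{n}{2} \log_2 \frac{n}{2}
            = \Theta(n \log n)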
Merge sort
Class: Sorting algorithm
Data structure: Array
Worst-case performance: Θ(n log n)
Average performance: Θ(n log n)
Worst-case space complexity: Θ(n)
Optimal: Sometimes
Algorithm
Divide the unsorted list into two sublists of about half the size.
Sort each sublist recursively by re-applying merge sort.
Merge the two sublists back into one sorted list.
Merge sort incorporates two main ideas to improve its runtime:
A small list will take fewer steps to sort than a large list.
Fewer steps are required to construct a sorted list from two sorted lists than two
unsorted lists. For example, you only have to traverse each list once if they're
already sorted (see the merge function below for an example implementation).
function merge_sort(m)
    if length(m) ≤ 1
        return m
    var list left, right, result
    var integer middle = length(m) / 2
    for each x in m up to middle
        add x to left
    for each x in m after middle
        add x to right
    left = merge_sort(left)
    right = merge_sort(right)
    if last(left) ≤ first(right)
        append right to left
        return left
    else
        result = merge(left, right)
        return result
There are several variants for the merge() function, the simplest variant could
look like this:
function merge(left, right)
    var list result
    while length(left) > 0 and length(right) > 0
        if first(left) ≤ first(right)
            append first(left) to result
            left = rest(left)
        else
            append first(right) to result
            right = rest(right)
    end while
    if length(left) > 0
        append left to result
    else
        append right to result
    return result
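For readers who prefer running code, the two functions above translate almost line for line into the following Python sketch (written for clarity rather than speed):

def merge_sort(m):
    if len(m) <= 1:
        return m
    middle = len(m) // 2
    left = merge_sort(m[:middle])
    right = merge_sort(m[middle:])
    return merge(left, right)

def merge(left, right):
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:              # "<=" keeps the sort stable
            result.append(left[i]); i += 1
        else:
            result.append(right[j]); j += 1
    result.extend(left[i:])                  # one of these two tails is empty
    result.extend(right[j:])
    return result

print(merge_sort([6, 2, 9, 1, 5]))           # [1, 2, 5, 6, 9]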
Analysis
A recursive merge sort algorithm used to sort an array of 7 integer values. These
are the steps a human would take to emulate merge sort (top-down).
In the worst case, merge sort does an amount of comparisons equal to or slightly
smaller than (n⌈lg n⌉ − 2^⌈lg n⌉ + 1), which is between (n lg n − n + 1) and (n lg n +
n + O(lg n)). [1]
For large n and a randomly ordered input list, merge sort's expected (average)
number of comparisons approaches α·n fewer than the worst case, where
α = −1 + Σ 1/(2^k + 1), summing over k ≥ 0, which is approximately 0.2645.
In the worst case, merge sort does about 39% fewer comparisons than quicksort
does in the average case; merge sort always makes fewer comparisons than
quicksort, except in extremely rare cases, when they tie, where merge sort's
worst case is found simultaneously with quicksort's best case. In terms of moves,
merge sort's worst case complexity is O(n log n)—the same complexity as
quicksort's best case, and merge sort's best case takes about half as many
iterations as the worst case.[citation needed]
Merge sort as described here also has an often overlooked, but practically
important, best-case property. If the input is already sorted, its complexity falls
to O(n). Specifically, n-1 comparisons and zero moves are performed, which is
the same as for simply running through the input, checking if it is pre-sorted.
Sorting in-place is possible (e.g., using lists rather than arrays) but is very
complicated, and will offer little performance gains in practice, even if the
algorithm runs in O(n log n) time.[2] In these cases, algorithms like heapsort
usually offer comparable speed, and are far less complex. Additionally, unlike the
standard merge sort, in-place merge sort is not a stable sort. Some would argue
that sorting a linked list is not in place because even though you are sorting in
the given data structure, the data structure inherently has O(n) extra data you
are manipulating (e.g., the links in the list).
Merge sort is more efficient than quick sort for some types of lists if the data to
be sorted can only be efficiently accessed sequentially, and is thus popular in
languages such as Lisp, where sequentially accessed data structures are very
common. Unlike some (efficient) implementations of quicksort, merge sort is a
stable sort as long as the merge operation is implemented properly.
As can be seen from the procedure merge sort, there are some demerits. One
complaint we might raise is its use of 2n locations; the additional n locations
were needed because one couldn't reasonably merge two sorted sets in place.
But despite the use of this space the algorithm must still work hard: The contents
of m is first copied into left and right and later into the list result on each
invocation of merge_sort (variable names according to the pseudocode above).
An alternative to this copying is to associate a new field of information with each
key (the elements in m are called keys). This field will be used to link the keys
and any associated information together in a sorted list (a key and its related
information is called a record). Then the merging of the sorted lists proceeds by
changing the link values; no records need to be moved at all. A field which
contains only a link will generally be smaller than an entire record so less space
will also be used.
Another alternative for reducing the space overhead to n/2 is to maintain left and
right as a combined structure, copy only the left part of m into temporary space,
and to direct the merge routine to place the merged output into m. With this
version it is better to allocate the temporary space outside the merge routine, so
that only one allocation is needed. The excessive copying mentioned in the
previous paragraph is also mitigated, since the last pair of lines before the return
result statement (function merge in the pseudo code above) become
superfluous.
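A sketch of that n/2-extra-space variant in Python: the temporary buffer is allocated once, only the left run is copied into it, and the merge writes straight back into m (the function name and parameters are illustrative):

def merge_sort_halfspace(m, lo=0, hi=None, temp=None):
    # Temporary buffer of about len(m)/2 elements, allocated once and reused.
    if hi is None:
        hi = len(m)
    if temp is None:
        temp = [None] * (len(m) // 2 + 1)
    if hi - lo <= 1:
        return m
    mid = (lo + hi) // 2
    merge_sort_halfspace(m, lo, mid, temp)
    merge_sort_halfspace(m, mid, hi, temp)
    left_len = mid - lo
    temp[:left_len] = m[lo:mid]            # copy only the left run
    i, j, k = 0, mid, lo
    while i < left_len and j < hi:         # merge temp and the right run back into m
        if temp[i] <= m[j]:
            m[k] = temp[i]; i += 1
        else:
            m[k] = m[j]; j += 1
        k += 1
    while i < left_len:                    # leftover left elements; right ones are already in place
        m[k] = temp[i]; i += 1; k += 1
    return m

print(merge_sort_halfspace([6, 2, 9, 1, 5]))   # [1, 2, 5, 6, 9]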
Merge sort is so inherently sequential that it is practical to run it using slow tape
drives as input and output devices. It requires very little memory, and the
memory required does not change with the number of data elements.
For the same reason it is also useful for sorting data on disk that is too large to fit
entirely into primary memory. On tape drives that can run both backwards and
forwards, merge passes can be run in both directions, avoiding rewind time.
Divide the data to be sorted in half and put half on each of two tapes
Merge individual pairs of records from the two tapes; write two-record chunks
alternately to each of the two output tapes
Merge the two-record chunks from the two output tapes into four-record chunks;
write these alternately to the original two input tapes
Merge the four-record chunks into eight-record chunks; write these alternately to
the original two output tapes
Repeat until you have one chunk containing all the data, sorted --- that is, for log
n passes, where n is the number of records.
For almost-sorted data on tape, a bottom-up "natural merge sort" variant of this
algorithm is popular.
In a simple pseudocode form, the "natural merge sort" algorithm could look
something like this:
# Original data is on the input tape; the other tapes are blank.
# merge(): take the next sorted chunk from each of two input tapes, and merge
# them onto the single given output_tape.
# tape[current] gives the record currently under the read head of that tape;
# reading advances the head to the next record.
function merge(left, right, output_tape)
    do
        if left[current] ≤ right[current]
            append left[current] to output_tape; advance left
        else
            append right[current] to output_tape; advance right
    while both heads are still inside the current sorted chunk
    append the remainder of the unfinished chunk to output_tape
    return
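In memory rather than on tape, the same idea can be sketched by collecting the naturally occurring ascending runs and then merging adjacent runs until one remains; this Python sketch reuses the merge() function from the merge sort sketch above:

def natural_merge_sort(data):
    if not data:
        return []
    # 1. Split the input into its existing ascending runs.
    runs, run = [], [data[0]]
    for x in data[1:]:
        if x >= run[-1]:
            run.append(x)
        else:
            runs.append(run)
            run = [x]
    runs.append(run)
    # 2. Merge adjacent runs until a single sorted run remains.
    while len(runs) > 1:
        runs = [merge(runs[i], runs[i + 1]) if i + 1 < len(runs) else runs[i]
                for i in range(0, len(runs), 2)]
    return runs[0]

print(natural_merge_sort([3, 4, 9, 1, 2, 8, 6]))   # [1, 2, 3, 4, 6, 8, 9]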
Although heapsort has the same time bounds as merge sort, it requires only Θ(1)
auxiliary space instead of merge sort's Θ(n), and is often faster in practical
implementations. Quicksort, however, is considered by many to be the fastest
general-purpose sort algorithm. On the plus side, merge sort is a stable sort,
parallelizes better, and is more efficient at handling slow-to-access sequential
media. Merge sort is often the best choice for sorting a linked list: in this
situation it is relatively easy to implement a merge sort in such a way that it
requires only Θ(1) extra space, and the slow random-access performance of a
linked list makes some other algorithms (such as quicksort) perform poorly, and
others (such as heapsort) completely impossible.
As of Perl 5.8, merge sort is its default sorting algorithm (it was quicksort in
previous versions of Perl). In Java, the Arrays.sort() methods use merge sort or a
tuned quicksort depending on the datatypes and for implementation efficiency
switch to insertion sort when fewer than seven array elements are being sorted.
[6]
Python uses timsort, another tuned hybrid of merge sort and insertion sort.
Merge sort's merge operation is useful in online sorting, where the list to be
sorted is received a piece at a time, instead of all at the beginning. In this
application, we sort each new piece that is received using any sorting algorithm,
and then merge it into our sorted list so far using the merge operation. However,
this approach can be expensive in time and space if the received pieces are
small compared to the sorted list — a better approach in this case is to store the
list in a self-balancing binary search tree and add elements to it as they are
received.
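As a small illustration of that online use, the loop below keeps a running sorted list and merges each incoming piece into it; it reuses merge_sort() and merge() from the Python sketch earlier, and the sample pieces are made up for the example:

incoming_pieces = [[5, 1], [9, 3, 7], [2]]        # pieces received one at a time
sorted_so_far = []
for piece in incoming_pieces:
    piece = merge_sort(piece)                      # sort each new piece as it arrives
    sorted_so_far = merge(sorted_so_far, piece)    # merge it into the result so far
print(sorted_so_far)                               # [1, 2, 3, 5, 7, 9]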
The terms are used in other contexts; for example the worst- and best-case
outcome of a planned-for epidemic, worst-case temperature to which an
electronic circuit element is exposed, etc. Where components of specified
tolerance are used, devices must be designed to work properly with the worst-
case combination of tolerances and external conditions.
Best-case performance
The term best-case performance is used in computer science to describe the way
an algorithm behaves under optimal conditions. For example, the best case for a
simple linear search on a list occurs when the desired element is the first
element of the list.
Determining what average input means is difficult, and often that average input
has properties which make it difficult to characterise mathematically (consider,
for instance, algorithms that are designed to operate on strings of text).
Similarly, even when a sensible description of a particular "average case" (which
will probably only be applicable for some uses of the algorithm) is possible, it
tends to result in equations that are more difficult to analyse.
When analyzing algorithms which often take a small time to complete, but
periodically require a much larger time, amortized analysis can be used to
determine the worst-case running time over a (possibly infinite) series of
operations. This amortized worst-case cost can be much closer to the average
case cost, while still providing a guaranteed upper limit on the running time.
Insertion sort
simple implementation
adaptive, i.e. efficient for data sets that are already substantially sorted: the time
complexity is O(n + d), where d is the number of inversions
more efficient in practice than most other simple quadratic (i.e. O(n²)) algorithms
such as selection sort or bubble sort: the average running time is n²/4, and the
running time is linear in the best case
stable, i.e. does not change the relative order of elements with equal keys
in-place, i.e. only requires a constant amount O(1) of additional memory space
Algorithm
Every iteration of insertion sort removes an element from the input data,
inserting it into the correct position in the already-sorted list, until no input
elements remain. The choice of which element to remove from the input is
arbitrary, and can be made using almost any choice algorithm.
Sorting is typically done in-place. The resulting array after k iterations has the
property that the first k + 1 entries are sorted. In each iteration the first
remaining entry of the input is removed and inserted into the result at the correct
position, thus extending the result. For example,
3 7 11 | 5 2 9   becomes   3 5 7 11 | 2 9
with each element greater than x (here x = 5) copied to the right as it is compared
against x.
The most common variant of insertion sort, which operates on arrays, can be
described as follows:
Suppose there exists a function called Insert designed to insert a value into a
sorted sequence at the beginning of an array. It operates by beginning at the
end of the sequence and shifting each element one place to the right until a
suitable position is found for the new element. The function has the side effect of
overwriting the value stored immediately after the sorted sequence in the array.
To perform an insertion sort, begin at the left-most element of the array and
invoke Insert to insert each element encountered into its correct position. The
ordered sequence into which the element is inserted is stored at the beginning of
the array in the set of indices already examined. Each insertion overwrites a
single value: the value being inserted.
Pseudocode of the complete algorithm follows, where the arrays are zero-based
and the for-loop includes both the top and bottom limits (as in Pascal):
InsertionSort (array A)
begin
    for i := 1 to length[A] - 1 do
    begin
        value := A[i];
        j := i - 1;
        while j ≥ 0 and A[j] > value do
        begin
            A[j + 1] := A[j];
            j := j - 1;
        end;
        A[j + 1] := value;
    end;
end;
Best, worst, and average cases
The best case input is an array that is already sorted. In this case insertion sort
has a linear running time (i.e., O(n)). During each iteration, the first remaining
element of the input is only compared with the right-most element of the sorted
subsection of the array.
The worst case input is an array sorted in reverse order. In this case every
iteration of the inner loop will scan and shift the entire sorted subsection of the
array before inserting the next element. For this case insertion sort has a
quadratic running time (i.e., O(n²)).
The average case is also quadratic, which makes insertion sort impractical for
sorting large arrays. However, insertion sort is one of the fastest algorithms for
sorting arrays containing fewer than ten elements.
Insertion sort is very similar to selection sort. As in selection sort, after k passes
through the array, the first k elements are in sorted order. For selection sort
these are the k smallest elements, while in insertion sort they are whatever the
first k elements were in the unsorted array. Insertion sort's advantage is that it
only scans as many elements as needed to determine the correct location of the
k+1th element, while selection sort must scan all remaining elements to find the
absolute smallest element.
Calculations show that insertion sort will usually perform about half as many
comparisons as selection sort. Assuming the k+1th element's rank is random,
insertion sort will on average require shifting half of the previous k elements,
while selection sort always requires scanning all unplaced elements. If the input
array is reverse-sorted, insertion sort performs as many comparisons as
selection sort. If the input array is already sorted, insertion sort performs as few
as n-1 comparisons, thus making insertion sort more efficient when given sorted
or "nearly-sorted" arrays.
While insertion sort typically makes fewer comparisons than selection sort, it
requires more writes because the inner loop can require shifting large sections of
the sorted portion of the array. In general, insertion sort will write to the array
O(n²) times, whereas selection sort will write only O(n) times. For this reason
selection sort may be preferable in cases where writing to memory is
significantly more expensive than reading, such as with EEPROM or flash
memory.
D.L. Shell made substantial improvements to the algorithm; the modified version
is called Shell sort. The sorting algorithm compares elements separated by a
distance that decreases on each pass. Shell sort has distinctly improved running
times in practical work, with two simple variants requiring O(n^(3/2)) and O(n^(4/3))
running time.
If the cost of comparisons exceeds the cost of swaps, as is the case for example
with string keys stored by reference or with human interaction (such as choosing
one of a pair displayed side-by-side), then using binary insertion sort may yield
better performance. Binary insertion sort employs a binary search to determine
the correct location to insert new elements, and therefore performs ⌈log2(n)⌉
comparisons per insertion in the worst case, for Θ(n log n) comparisons overall.
The algorithm as a whole still has a running time of Θ(n²) on average because of
the series of swaps required for each insertion. The best-case running time is no
longer Ω(n), but Ω(n log n).
To avoid having to make a series of swaps for each insertion, the input could be
stored in a linked list, which allows elements to be inserted and deleted in
constant time. However, performing a binary search on a linked list is impossible
because a linked list does not support random access to its elements; therefore,
the search step takes O(n) per insertion and the overall running time remains
O(n²). If a more sophisticated data
structure (e.g., heap or binary tree) is used, the time required for searching and
insertion can be reduced significantly; this is the essence of heap sort and binary
tree sort.
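A compact Python sketch of binary insertion sort using the standard-library bisect module; the binary search keeps comparisons at O(n log n), while list.insert still performs the O(n) shifts described above:

import bisect

def binary_insertion_sort(a):
    for i in range(1, len(a)):
        value = a.pop(i)                              # take the next unsorted element
        pos = bisect.bisect_right(a, value, 0, i)     # binary search in the sorted prefix a[0:i]
        a.insert(pos, value)                          # insert, shifting larger elements right
    return a

print(binary_insertion_sort([31, 41, 59, 26, 53]))    # [26, 31, 41, 53, 59]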
Quicksort
Optimal: Sometimes
Algorithm
Quicksort sorts by employing a divide and conquer strategy to divide a list into
two sub-lists.
Full example of quicksort on a random set of numbers. The boxed element is the
pivot. It is always chosen as the last element of the partition.
Pick an element, called a pivot, from the list.
Reorder the list so that all elements which are less than the pivot come before
the pivot and so that all elements greater than the pivot come after it (equal
values can go either way). After this partitioning, the pivot is in its final position.
This is called the partition operation.
Recursively sort the sub-list of lesser elements and the sub-list of greater
elements.
The base case of the recursion are lists of size zero or one, which are always
sorted.
function quicksort(array)
    if length(array) ≤ 1
        return array
    select and remove a pivot value pivot from array
    partition the remaining elements into less (elements ≤ pivot) and greater (elements > pivot)
    return concatenate(quicksort(less), pivot, quicksort(greater))
At each iteration, all the elements processed so far are in the desired position:
before the pivot if less than or equal to the pivot's value, after the pivot
otherwise (loop invariant).
The correctness of the overall algorithm follows from inductive reasoning: for
zero or one element, the algorithm leaves the data unchanged; for a larger data
set it produces the concatenation of two parts, elements less than or equal to the
pivot and elements greater than it, themselves sorted by the recursive
hypothesis.
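A direct Python transcription of this simple, out-of-place version; following the figure's convention, the last element is used as the pivot:

def quicksort(array):
    if len(array) <= 1:
        return array
    pivot, rest = array[-1], array[:-1]        # last element as pivot
    less = [x for x in rest if x <= pivot]
    greater = [x for x in rest if x > pivot]
    return quicksort(less) + [pivot] + quicksort(greater)

print(quicksort([3, 9, 1, 4, 7]))              # [1, 3, 4, 7, 9]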
The disadvantage of the simple version above is that it requires Ω(n) extra
storage space, which is as bad as merge sort. The additional memory allocations
required can also drastically impact speed and cache performance in practical
implementations. There is a more complex version which uses an in-place
partition algorithm and can achieve the complete sort using only O(log n) space
(not counting the input) on average, for the call stack:
function partition(array, left, right, pivotIndex)
    pivotValue := array[pivotIndex]
    swap array[pivotIndex] and array[right]  (move the pivot to the end)
    storeIndex := left
    for i from left to right - 1
        if array[i] ≤ pivotValue
            swap array[i] and array[storeIndex]
            storeIndex := storeIndex + 1
    swap array[storeIndex] and array[right]  (move the pivot to its final place)
    return storeIndex
In-place partition in action on a small list. The boxed element is the pivot
element, blue elements are less or equal, and red elements are larger.
This is the in-place partition algorithm. It partitions the portion of the array
between indexes left and right, inclusively, by moving all elements less than or
equal to a[pivotIndex] to the beginning of the subarray, leaving all the greater
elements following them. In the process it also finds the final position for the
pivot element, which it returns. It temporarily moves the pivot element to the
end of the subarray, so that it doesn't get in the way. Because it only uses
exchanges, the final list has the same elements as the original list. Notice that an
element may be exchanged multiple times before reaching its final place.
This form of the partition algorithm is not the original form; multiple variations
can be found in various textbooks, such as versions not having the storeIndex.
However, this form is probably the easiest to understand.
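In Python, the partition routine described above and a driver that recurses on the two sides might look like the following sketch; choosing the middle element as pivotIndex is just an illustrative choice:

def partition(a, left, right, pivot_index):
    pivot_value = a[pivot_index]
    a[pivot_index], a[right] = a[right], a[pivot_index]    # move pivot to the end
    store = left
    for i in range(left, right):
        if a[i] <= pivot_value:
            a[i], a[store] = a[store], a[i]
            store += 1
    a[store], a[right] = a[right], a[store]                # move pivot to its final place
    return store

def quicksort_in_place(a, left=0, right=None):
    if right is None:
        right = len(a) - 1
    if left < right:
        p = partition(a, left, right, (left + right) // 2)
        quicksort_in_place(a, left, p - 1)
        quicksort_in_place(a, p + 1, right)
    return a

print(quicksort_in_place([3, 9, 1, 4, 7]))                 # [1, 3, 4, 7, 9]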
Parallelizations
Like merge sort, quicksort can also be easily parallelized due to its divide-and-
conquer nature. Individual in-place partition operations are difficult to parallelize,
but once divided, different sections of the list can be sorted in parallel. If we have
p processors, we can divide a list of n elements into p sublists in Θ(n)
average time, then sort each of these in Θ((n/p) log(n/p)) average time. Ignoring
the Θ(n) preprocessing, this is linear speedup. Given n processors, only Θ(n)
time is required overall.
One advantage of parallel quicksort over other parallel sort algorithms is that no
synchronization is required. A new thread is started as soon as a sublist is
available for it to work on and it does not communicate with other threads. When
all threads complete, the sort is done.
Other more sophisticated parallel sorting algorithms can achieve even better
time bounds.[2] For example, in 1991 David Powers described a parallelized
quicksort that can operate in O(log n) time given enough processors by
performing partitioning implicitly.[3]
Formal analysis
From the initial description it's not obvious that quicksort takes Θ(n log n) time
on average. It's not hard to see that the partition operation, which simply loops
over the elements of the array once, uses Θ(n) time. In versions that perform
concatenation, this operation is also Θ(n).
In the best case, each time we perform a partition we divide the list into two
nearly equal pieces. This means each recursive call processes a list of half the
size. Consequently, we can make only log n nested calls before we reach a list
of size 1. This means that the depth of the call tree is Θ(log n). But no two calls
at the same level of the call tree process the same part of the original list; thus,
each level of calls needs only Θ(n) time all together (each call has some
constant overhead, but since there are only O(n) calls at each level, this is
subsumed in the Θ(n) factor). The result is that the algorithm uses only
Θ(n log n) time.
An alternate approach is to set up a recurrence relation for T(n), the time needed
to sort a list of size n. Because a single quicksort call involves Θ(n) work for the
partition plus two recursive calls on lists of size n/2 in the best case, the
relation would be
T(n) = Θ(n) + 2 T(n/2),
which the master theorem solves to T(n) = Θ(n log n).
In the worst case, however, the two sublists have size 1 and n − 1 (for
example, if the array consists of the same element by value), and the call tree
becomes a linear chain of n nested calls. The ith call does Θ(n − i) work, and
the sum Σ (n − i) over all calls is Θ(n²), so in that case quicksort takes Θ(n²) time.
Randomized quicksort has the desirable property that, for any input, it requires
only Θ(n log n) expected time (averaged over all choices of pivots). But what
makes random pivots a good choice?
Suppose we sort the list and then divide it into four parts. The two parts in the
middle will contain the best pivots; each of them is larger than at least 25% of
the elements and smaller than at least 25% of the elements. If we could
consistently choose an element from these two middle parts, we would only have
to split the list at most 2 log2 n times before reaching lists of size 1, yielding an
Θ(n log n) algorithm.
A random choice will only choose from these middle parts half the time.
However, this is good enough. Imagine that you are flipping a coin over and over
until you get k heads. Although this could take a long time, on average only 2k
flips are required, and the chance that you won't get k heads after 100k flips is
highly improbable. By the same argument, quicksort's recursion will terminate on
average at a call depth of only 2(2 log2 n). But if its average call depth is
Θ(log n), and each level of the call tree processes at most n elements, the
total amount of work done on average is the product, Θ(n log n). Note that the
algorithm does not have to verify that the pivot is in the middle half - if we hit it
any constant fraction of the times, that is enough for the desired complexity.
The outline of a formal proof of the O(n log n) expected time complexity
follows. Assume that there are no duplicates as duplicates could be handled with
linear time pre- and post-processing, or considered cases easier than the
analyzed. Choosing a pivot, uniformly at random from 0 to n − 1, is then
equivalent to choosing the size of one particular partition, uniformly at random
from 0 to n − 1. With this observation, the continuation of the proof is
analogous to the one given in the average complexity section.
Average complexity
Even if pivots aren't chosen randomly, quicksort still requires only Θ(n log n)
time over all possible permutations of its input. Because this average is simply
the sum of the times over all permutations of the input divided by n factorial, it's
equivalent to choosing a random permutation of the input. When we do this, the
pivot choices are essentially random, leading to an algorithm with the same
running time as randomized quicksort.
More precisely, the average number of comparisons over all permutations of the
input sequence can be estimated accurately by solving the recurrence relation
C(n) = n − 1 + (1/n) Σ [C(i) + C(n − 1 − i)], summing i from 0 to n − 1,
which solves to C(n) ≈ 2n ln n ≈ 1.39 n log2 n.
Here, n − 1 is the number of comparisons the partition uses. Since the pivot is
equally likely to fall anywhere in the sorted list order, the sum is averaging over
all possible splits.
This means that, on average, quicksort performs only about 39% worse than the
ideal number of comparisons, which is its best case. In this sense it is closer to
the best case than the worst case. This fast average runtime is another reason
for quicksort's practical dominance over other sorting algorithms.
Selection sort
Algorithm
Find the minimum value in the list.
Swap it with the value in the first position.
Repeat the steps above for the remainder of the list (starting at the second
position and advancing each time).
Effectively, we divide the list into two parts: the sublist of items already sorted,
which we build up from left to right and is found at the beginning, and the sublist
of items remaining to be sorted, occupying the remainder of the array.
64 25 12 22 11
11 25 12 22 64
11 12 25 22 64
11 12 22 25 64
11 12 22 25 64
(nothing appears changed on this last line because the last 2 numbers were
already in order)
Selection sort can also be used on list structures that make add and remove
efficient, such as a linked list. In this case it's more common to remove the
minimum element from the remainder of the list, and then insert it at the end of
the values sorted so far. For example:
64 25 12 22 11
11 64 25 12 22
11 12 64 25 22
11 12 22 64 25
11 12 22 25 64
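The remove-the-minimum variant shown in this example can be sketched in Python as follows (on a real linked list the removal would be a pointer update rather than a shift):

def selection_sort_by_removal(items):
    remainder = list(items)
    result = []
    while remainder:
        smallest = min(remainder)
        remainder.remove(smallest)    # removes the first occurrence, so equal keys keep their order
        result.append(smallest)
    return result

print(selection_sort_by_removal([64, 25, 12, 22, 11]))   # [11, 12, 22, 25, 64]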
void selectionSort(int[] a) {
    for (int i = 0; i < a.length - 1; i++) {
        int min = i;                                         // index of smallest remaining element
        for (int j = i + 1; j < a.length; j++)
            if (a[j] < a[min]) min = j;
        if (i != min) {
            int swap = a[i]; a[i] = a[min]; a[min] = swap;   // move it into position i
        }
    }
}
Analysis
Selecting the lowest element requires scanning all n elements (taking n − 1
comparisons) and then swapping it into the first position. Finding the next lowest
element requires scanning the remaining n − 1 elements, and so on, for a total of
(n − 1) + (n − 2) + ... + 2 + 1 = n(n − 1)/2 ∈ Θ(n²) comparisons. Each of
these scans requires one swap for n − 1 elements (the final element is already in
place).
Simple calculation shows that insertion sort will therefore usually perform about
half as many comparisons as selection sort, although it can perform just as many
or far fewer depending on the order the array was in prior to sorting. It can be
seen as an advantage for some real-time applications that selection sort will
perform identically regardless of the order of the array, while insertion sort's
running time can vary considerably. However, this is more often an advantage
for insertion sort in that it runs much more efficiently if the array is already
sorted or "close to sorted."
Another key difference is that selection sort always performs Θ(n) swaps, while
insertion sort performs Θ(n²) swaps in the average and worst cases. Because
swaps require writing to the array, selection sort is preferable if writing to
memory is significantly more expensive than reading. This is generally the case
if the items are huge but the keys are small. Another example where writing
times are crucial is an array stored in EEPROM or Flash. There is no other
algorithm with less data movement.
Variants
Heapsort greatly improves the basic algorithm by using an implicit heap data
structure to speed up finding and removing the lowest datum. If implemented
correctly, the heap will allow finding the next lowest element in Θ(log n) time
instead of Θ(n) for the inner loop in normal selection sort, reducing the total
running time to Θ(n log n).
Selection sort can be implemented as a stable sort. If, rather than swapping in
step 2, the minimum value is inserted into the first position (that is, all
intervening items moved down), the algorithm is stable. However, this
modification either requires a data structure that supports efficient insertions or
deletions, such as a linked list, or it leads to performing Θ(n²) writes. Either way,
it eliminates the main advantage of insertion sort (which is always stable) over
selection sort.
In the bingo sort variant, items are ordered by repeatedly looking through the
remaining items to find the greatest value and moving all items with that value
to their final location. Like counting sort, this is an efficient variant if there are
many duplicate values. Indeed, selection sort does one pass through the
remaining items for each item moved. Bingo sort does two passes for each value
(not item): one pass to find the next biggest value, and one pass to move every
item with that value to its final location. Thus if on average there are more than
two items with each value, bingo sort may be faster. [1]
Overview
The heap sort works as its name suggests. It begins by building a heap out of the
data set, and then removing the largest item and placing it at the end of the
sorted array. After removing the largest item, it reconstructs the heap and
removes the largest remaining item and places it in the next open position from
the end of the sorted array. This is repeated until there are no items left in the
heap and the sorted array is full. Elementary implementations require two arrays
- one to hold the heap and the other to hold the sorted elements. [2]
Heapsort inserts the input list elements into a heap data structure. The largest
value (in a max-heap) or the smallest value (in a min-heap) are extracted until
none remain, the values having been extracted in sorted order. The heap's
invariant is preserved after each extraction, so the only cost is that of extraction.
During extraction, the only space required is that needed to store the heap. In
order to achieve constant space overhead, the heap is stored in the part of the
input array that has not yet been sorted. (The structure of this heap is described
at Binary heap: Heap implementation.)
Heapsort uses two heap operations: insertion and root deletion. Each extraction
places an element in the last empty location of the array. The remaining prefix of
the array stores the unsorted elements.
Variations
Ternary heapsort [3] uses a ternary heap instead of a binary heap; that is, each
element in the heap has three children. It is more complicated to program, but
does a constant number of times fewer swap and comparison operations. This is
because each step in the shift operation of a ternary heap requires three
comparisons and one swap, whereas in a binary heap two comparisons and one
swap are required. The ternary heap does two steps in less time than the binary
heap requires for three steps, which multiplies the index by a factor of 9 instead
of the factor 8 of three binary steps. Ternary heapsort is about 12% faster than
the simple variant of binary heapsort.[citation needed]
Quicksort is typically somewhat faster, due to better cache behavior and other
factors, but the worst-case running time for quicksort is O(n²), which is
unacceptable for large data sets and can be deliberately triggered given enough
knowledge of the implementation, creating a security risk. See quicksort for a
detailed discussion of this problem, and possible solutions.
Thus, because of the O(n log n) upper bound on heapsort's running time and
constant upper bound on its auxiliary storage, embedded systems with real-time
constraints or systems concerned with security often use heapsort.
Heapsort also competes with merge sort, which has the same time bounds, but
requires Ω(n) auxiliary space, whereas heapsort requires only a constant
amount. Heapsort also typically runs more quickly in practice on machines with
small or slow data caches. On the other hand, merge sort has several
advantages over heapsort:
Like quicksort, merge sort on arrays has considerably better data cache
performance, often outperforming heapsort on a modern desktop PC, because it
accesses the elements in order.
Merge sort parallelizes better; the most trivial way of parallelizing merge sort
achieves close to linear speedup, while there is no obvious way to parallelize
heapsort at all.
Merge sort can be easily adapted to operate on linked lists and very large lists
stored on slow-to-access media such as disk storage or network attached
storage. Heapsort relies strongly on random access, and its poor locality of
reference makes it very slow on media with long access times.
Pseudocode
function heapsort(a, count) is
    input: an unordered array a of length count

    (first place a in max-heap order)
    heapify(a, count)

    end := count - 1
    while end > 0 do
        (swap the root (maximum value) of the heap with the last element of the heap)
        swap(a[end], a[0])
        (put the heap back in max-heap order, excluding the element just placed)
        siftDown(a, 0, end - 1)
        (decrease the size of the heap by one so that the previous max value will
         stay in its proper place)
        end := end - 1
function heapify(a, count) is
    (start is assigned the index of the last parent node)
    start := (count - 2) / 2
    while start ≥ 0 do
        (sift down the node at index start to the proper place such that all nodes
         below the start index are in heap order)
        siftDown(a, start, count - 1)
        start := start - 1
    (after sifting down the root all nodes/elements are in heap order)
function siftDown(a, start, end) is
    input: end represents the limit of how far down the heap to sift.
    root := start
    while root * 2 + 1 ≤ end do            (While the root has at least one child)
        child := root * 2 + 1              (root*2+1 points to the left child)
        (If the child has a sibling and the child's value is less than its sibling's...)
        if child + 1 ≤ end and a[child] < a[child + 1]
            child := child + 1             (... then point to the right child instead)
        if a[root] < a[child]              (out of max-heap order)
            swap(a[root], a[child])
            root := child                  (repeat to continue sifting down the child)
        else
            return
The heapify function can be thought of as building a heap from the bottom up,
successively sifting downward to establish the heap property. An alternative
version (shown below) that builds the heap top-down and sifts upward is
conceptually simpler to grasp. This "siftUp" version can be visualized as starting
with an empty heap and successively inserting elements. However, it is
asymptotically slower: the "siftDown" version is O(n), and the "siftUp" version is
O(n log n) in the worst case. The heapsort algorithm is O(n log n) overall using
either version of heapify.
function heapify(a, count) is
    (end is assigned the index of the first (left) child of the root)
    end := 1

    while end < count do
        (sift up the node at index end to the proper place such that all nodes
         above the end index are in heap order)
        siftUp(a, 0, end)
        end := end + 1
    (after sifting up the last node all nodes are in heap order)
function siftUp(a, start, end) is
    input: start represents the limit of how far up the heap to sift.
           end is the node to sift up.
    child := end
    while child > start
        parent := floor((child - 1) ÷ 2)
        if a[parent] < a[child]            (out of max-heap order)
            swap(a[parent], a[child])
            child := parent                (repeat to continue sifting up the parent)
        else
            return
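The siftDown version of the pseudocode above corresponds to the following Python sketch:

def heapsort(a):
    count = len(a)
    # heapify: sift down every parent node, starting from the last parent
    for start in range((count - 2) // 2, -1, -1):
        sift_down(a, start, count - 1)
    # repeatedly move the maximum to the end and restore the heap
    for end in range(count - 1, 0, -1):
        a[end], a[0] = a[0], a[end]
        sift_down(a, 0, end - 1)
    return a

def sift_down(a, start, end):
    root = start
    while 2 * root + 1 <= end:
        child = 2 * root + 1
        if child + 1 <= end and a[child] < a[child + 1]:
            child += 1                         # pick the larger of the two children
        if a[root] < a[child]:
            a[root], a[child] = a[child], a[root]
            root = child
        else:
            return

print(heapsort([6, 5, 3, 1, 8, 7, 2, 4]))      # [1, 2, 3, 4, 5, 6, 7, 8]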
Binary search
Viewing the comparison as a subtraction of the sought element from the middle
element,[citation needed] only the sign of the difference is inspected: there is no
attempt at an interpolation search based on the size of the difference.
Overview
Finding the index of a specific value in a sorted list is useful because, given the
index, other data structures will contain associated information. Suppose a data
structure containing the classic collection of name, address, telephone number
and so forth has been accumulated, and an array is prepared containing the
names, numbered from one to N. A query might be: what is the telephone
number for a given name X. To answer this the array would be searched and the
index (if any) corresponding to that name determined, whereupon the associated
telephone number array would have X's telephone number at that index, and
likewise the address array and so forth. Appropriate provision must be made for
the name not being in the list (typically by returning an index value of zero),
indeed the question of interest might be only whether X is in the list or not.
If the list of names is in sorted order, a binary search will find a given name with
far fewer probes than the simple procedure of probing each name in the list, one
after the other in a linear search, and the procedure is much simpler than
organizing a hash table. However, once created, searching with a hash table may
well be faster, typically averaging just over one probe per lookup. With a non-
uniform distribution of values, if it is known that some few items are much more
likely to be sought for than the majority, then a linear search with the list
ordered so that the most popular items are first may do better than binary
search. The choice of the best method may not be immediately obvious. If,
between searches, items in the list are modified or items are added or removed,
maintaining the required organisation may consume more time than the
searches.
Examples
This rather simple game begins something like "I'm thinking of an integer
between forty and sixty inclusive, and to your guesses I'll respond 'High', 'Low',
or 'Yes!' as might be the case." Supposing that N is the number of possible
values (here, twenty-one as "inclusive" was stated), then at most ⌈log2(N)⌉
questions (five, in this case) are required to determine the number, since each question halves the
search space. Note that one less question (iteration) is required than for the
general algorithm, since the number is already constrained to be within a
particular range.
Even if the number we're guessing can be arbitrarily large, in which case there is
no upper bound N, we can still find the number in at most 2⌈log2(k)⌉ steps (where
k is the (unknown) selected number) by first finding an upper bound by repeated
doubling.[citation needed] For example, if the number were 11, we could use the
following sequence of guesses to find it: 1, 2, 4, 8, 16, 12, 10, 11
One could also extend the technique to include negative numbers; for example
the following guesses could be used to find −13: 0, −1, −2, −4, −8, −16, −12,
−14, −13
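The guess sequences above (repeated doubling to find an upper bound, then bisecting) can be sketched as follows; ask() is a hypothetical oracle, not something defined in the text, that says whether a guess is high, low, or correct:

def guess_unbounded(ask):
    # Assumes the secret is a positive integer.
    # Phase 1: double until the guess is no longer too low.
    hi = 1
    while ask(hi) == 'low':
        hi *= 2
    if ask(hi) == 'yes':
        return hi
    lo = hi // 2                      # the number is now strictly between lo and hi
    # Phase 2: ordinary binary search inside the bracket.
    while True:
        mid = (lo + hi) // 2
        answer = ask(mid)
        if answer == 'yes':
            return mid
        elif answer == 'low':
            lo = mid
        else:
            hi = mid

secret = 11
oracle = lambda x: 'yes' if x == secret else ('low' if x < secret else 'high')
print(guess_unbounded(oracle))        # 11, after guessing 1, 2, 4, 8, 16, 12, 10, 11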
People typically use a mixture of the binary search and interpolation search
algorithms when searching a telephone book: after the initial guess, we exploit
the fact that the entries are sorted and can rapidly find the required entry. For
example when searching for Smith, if Rogers and Thomas have been found, one
can flip to a page about halfway between the previous guesses. If this shows
Samson, we know that Smith is somewhere between the Samson and Thomas
pages so we can bisect these.
Even if we do not know a fixed range the number k falls in, we can still determine
its value by asking simple yes/no questions of the form "Is k greater
than x?" for some number x. As a simple consequence of this, if you can answer
the question "Is this integer property k greater than a given value?" in some
amount of time then you can find the value of that property in the same amount
of time with an added factor of log2k. This is called a reduction, and it is
because of this kind of reduction that most complexity theorists concentrate on
decision problems, algorithms that produce a simple yes/no answer.
For example, suppose we could answer "Does this n × n matrix have determinant
larger than k?" in O(n²) time. Then, by using binary search, we could find the
(ceiling of the) determinant itself in O(n² log d) time, where d is the determinant;
notice that d is not the size of the input, but the size of the output.
The method
To begin with, the span to be searched is the full supplied list of elements, as
marked by variables L and R, and their values are changed with each iteration of
the search process, as depicted by the flowchart. Note that the division by two is
integer division, with any remainder lost, so that 3/2 comes out as 1, not 1½. The
search finishes either because the value has been found, or else, the specified
value is not in the list.
Should there have been just one value remaining in the search span (so that L +
1 = p = R − 1), and x did not match, then depending on the sign of the
comparison either L or R will receive the value of p and at the start of the next
iteration the span will be found to be empty.
Accordingly, with each iteration, if the search span is empty the result is "Not
found", otherwise either x is found at the probe point p or the search span is
reduced for the next iteration. Thus the method works, and so can be called an
Algorithm.
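The L/R bookkeeping described above corresponds to the usual iterative implementation; a minimal Python version, returning -1 (rather than the index-zero convention mentioned earlier) when x is absent:

def binary_search(sorted_list, x):
    left, right = 0, len(sorted_list) - 1
    while left <= right:                      # search span is non-empty
        probe = (left + right) // 2           # integer division, remainder lost
        if sorted_list[probe] == x:
            return probe
        elif sorted_list[probe] < x:
            left = probe + 1                  # continue in the upper sub-interval
        else:
            right = probe - 1                 # continue in the lower sub-interval
    return -1                                 # span became empty: not found

print(binary_search([2, 4, 6, 8, 10], 8))     # 3
print(binary_search([2, 4, 6, 8, 10], 5))     # -1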
With each iteration that fails to find a match at the probed position, the search is
continued with one or other of the two sub-intervals, each at most half the size.
More precisely, if the number of items, N, is odd then both sub-intervals will
contain (N - 1)/2 elements. If N is even then the two sub-intervals contain N/2 - 1
and N/2 elements.
If the original number of items is N then after the first iteration there will be at
most N/2 items remaining, then at most N/4 items, at most N/8 items, and so on.
In the worst case, when the value is not in the list, the algorithm must continue
iterating until the span has been made empty; this will have taken at most
⌊log2(N) + 1⌋ iterations, where the ⌊ ⌋ notation denotes the floor function that
rounds its argument down to an integer. This worst case analysis is tight: for any
N there exists a query that takes exactly ⌊log2(N) + 1⌋ iterations. When compared
to linear search, whose worst-case behaviour is N iterations, we see that binary
search is substantially faster as N grows large. For example, to search a list of 1
million items takes as much as 1 million iterations with linear search, but never
more than 20 iterations with binary search. However, binary search is only valid
if the list is in sorted order.
There are two cases: for searches that will fail because the value is not in the list,
the search span must be successively halved until no more elements remain and
this process will require at most the p probes just defined, or one less. This latter
occurs because the search span is not in fact exactly halved, and depending on
the value of N and which elements of the list the absent value x is between, the
span may be closed early.
For searches that will succeed because the value is in the list, the search may
finish early because a probed value happens to match. Loosely speaking, half the
time the search will finish one iteration short of the maximum and a quarter of
the time, two early. Consider then a test in which a list of N elements is searched
once for each of the N values in the list, and determine the number of probes n
for all N searches.
N    =  1    2     3     4     5      6      7      8      9      10     11     12     13
n/N  =  1    3/2   5/3   8/4   11/5   14/6   17/7   21/8   25/9   29/10  33/11  37/12  41/13
     ≈  1    1.5   1.67  2     2.2    2.33   2.43   2.63   2.78   2.9    3      3.08   3.15
Suppose the list to be searched contains N even numbers (say, 2,4,6,8 for N = 4)
and a search is done for values 1, 2, 3, 4, 5, 6, 7, 8, and 9. The even numbers will
be found, and the average number of iterations can be calculated as described.
In the case of the odd numbers, they will not be found, and the collection of test
values probes every possible position (with regard to the numbers that are in the
list) that they might be not found in, and an average is calculated. The maximum
value is for each N the greatest number of iterations that were required amongst
the various trial searches of those N elements. The first plot shows the iteration
counts for N = 1 to 63 (with N = 1, all results are 1), and the second plot is for N
= 1 to 32767.