Radix Sort

The document discusses counting sort and radix sort, highlighting their efficiency and stability in sorting algorithms. Counting sort is stable and operates in linear time under certain conditions, while radix sort utilizes counting sort as a subroutine to sort multi-digit numbers efficiently. The text also includes exercises to illustrate and prove the concepts presented in the chapter.
Copyright
© All Rights Reserved

170 Chapter 8 Sorting in Linear Time

anywhere in the code. Instead, counting sort uses the actual values of the elements
to index into an array. The Ω(n lg n) lower bound for sorting does not apply when
we depart from the comparison-sort model.
An important property of counting sort is that it is stable: numbers with the same
value appear in the output array in the same order as they do in the input array. That
is, ties between two numbers are broken by the rule that whichever number appears
first in the input array appears first in the output array. Normally, the property of
stability is important only when satellite data are carried around with the element
being sorted. Counting sort’s stability is important for another reason: counting
sort is often used as a subroutine in radix sort. As we shall see in the next section,
counting sort’s stability is crucial to radix sort’s correctness.
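
To make the stability property concrete, here is a minimal Python sketch of a stable counting sort on records that carry satellite data. The function name `counting_sort`, its `key` parameter, and the sample records are illustrative assumptions, not the book's COUNTING-SORT procedure verbatim.

```python
def counting_sort(records, k, key=lambda rec: rec[0]):
    """Stably sort records whose keys are integers in the range 0..k."""
    counts = [0] * (k + 1)
    for rec in records:
        counts[key(rec)] += 1
    # Prefix sums: counts[v] becomes the number of records with key <= v.
    for v in range(1, k + 1):
        counts[v] += counts[v - 1]
    output = [None] * len(records)
    # Scanning right to left places records with equal keys in input order.
    for rec in reversed(records):
        counts[key(rec)] -= 1
        output[counts[key(rec)]] = rec
    return output


# Hypothetical sample data: (key, satellite) pairs with repeated keys.
records = [(2, "a"), (0, "b"), (2, "c"), (1, "d"), (0, "e")]
result = counting_sort(records, k=2)
print(result)
```

Records with equal keys keep their relative input order: "b" stays ahead of "e", and "a" stays ahead of "c".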

Exercises

8.2-1
Using Figure 8.2 as a model, illustrate the operation of COUNTING-SORT on the
array A = ⟨6, 0, 2, 0, 1, 3, 4, 6, 1, 3, 2⟩.

8.2-2
Prove that COUNTING-SORT is stable.

8.2-3
Suppose that the for loop header in line 9 of the COUNTING-SORT procedure is
rewritten as

9 for j ← 1 to length[A]

Show that the algorithm still works properly. Is the modified algorithm stable?

8.2-4
Describe an algorithm that, given n integers in the range 0 to k, preprocesses its
input and then answers any query about how many of the n integers fall into a
range [a .. b] in O(1) time. Your algorithm should use Θ(n + k) preprocessing
time.

8.3 Radix sort

Radix sort is the algorithm used by the card-sorting machines you now find only
in computer museums. The cards are organized into 80 columns, and in each column
a hole can be punched in one of 12 places. The sorter can be mechanically
“programmed” to examine a given column of each card in a deck and distribute the
card into one of 12 bins depending on which place has been punched. An operator
can then gather the cards bin by bin, so that cards with the first place punched are
on top of cards with the second place punched, and so on.

input   digit 1   digit 2   digit 3
329     720       720       329
457     355       329       355
657     436       436       436
839     457       839       457
436     657       355       657
720     329       457       720
355     839       657       839

Figure 8.3 The operation of radix sort on a list of seven 3-digit numbers. The leftmost column is
the input. The remaining columns show the list after successive sorts on increasingly significant digit
positions. Each column heading names the digit position sorted on to produce that list from the previous one.

For decimal digits, only 10 places are used in each column. (The other two
places are used for encoding nonnumeric characters.) A d-digit number would then
occupy a field of d columns. Since the card sorter can look at only one column
at a time, the problem of sorting n cards on a d-digit number requires a sorting
algorithm.
Intuitively, one might want to sort numbers on their most significant digit, sort
each of the resulting bins recursively, and then combine the decks in order.
Unfortunately, since the cards in 9 of the 10 bins must be put aside to sort each of the
bins, this procedure generates many intermediate piles of cards that must be kept
track of. (See Exercise 8.3-5.)
Radix sort solves the problem of card sorting counterintuitively by sorting on the
least significant digit first. The cards are then combined into a single deck, with
the cards in the 0 bin preceding the cards in the 1 bin preceding the cards in the 2
bin, and so on. Then the entire deck is sorted again on the second-least significant
digit and recombined in a like manner. The process continues until the cards have
been sorted on all d digits. Remarkably, at that point the cards are fully sorted
on the d-digit number. Thus, only d passes through the deck are required to sort.
Figure 8.3 shows how radix sort operates on a “deck” of seven 3-digit numbers.
It is essential that the digit sorts in this algorithm be stable. The sort performed
by a card sorter is stable, but the operator has to be wary about not changing the
order of the cards as they come out of a bin, even though all the cards in a bin have
the same digit in the chosen column.
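
The card-sorting procedure above can be simulated in a few lines of Python. This is a sketch under stated assumptions: the helper names `sort_on_digit` and `radix_sort_passes` are invented for illustration, keys are nonnegative decimal integers, and each bin is modeled as a list whose append order stands in for the physical order of cards in a bin.

```python
def sort_on_digit(deck, position, bins=10):
    """Distribute the deck into bins on one digit, then gather in bin order.

    Position 0 is the least significant digit. Appending to each bin keeps
    cards with equal digits in their incoming order, so the pass is stable.
    """
    piles = [[] for _ in range(bins)]
    for card in deck:
        piles[(card // bins**position) % bins].append(card)
    return [card for pile in piles for card in pile]


def radix_sort_passes(deck, d):
    """Return the deck after each of the d passes, least significant first."""
    passes = []
    for position in range(d):
        deck = sort_on_digit(deck, position)
        passes.append(deck)
    return passes


deck = [329, 457, 657, 839, 436, 720, 355]  # the input column of Figure 8.3
passes = radix_sort_passes(deck, d=3)
for after_pass in passes:
    print(after_pass)
```

The three printed lists correspond to the three non-input columns of Figure 8.3. Gathering the bins in reverse order of arrival within a bin would break stability and, with it, the final result.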
In a typical computer, which is a sequential random-access machine, radix sort
is sometimes used to sort records of information that are keyed by multiple fields.
For example, we might wish to sort dates by three keys: year, month, and day. We
could run a sorting algorithm with a comparison function that, given two dates,
compares years, and if there is a tie, compares months, and if another tie occurs,
compares days. Alternatively, we could sort the information three times with a
stable sort: first on day, next on month, and finally on year.
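
As a small illustration of the two approaches to sorting dates, the following Python sketch compares a single sort using a compound comparison with three stable passes. The sample dates are made up for the example; the sketch relies on Python's built-in `sorted`, which is documented as stable.

```python
from operator import itemgetter

# Hypothetical sample data: each date is a (year, month, day) tuple.
dates = [(2001, 7, 14), (1999, 12, 31), (2001, 3, 2), (1999, 12, 1), (2001, 3, 1)]

# Approach 1: one sort with a compound comparison.
# Tuples compare field by field: year, then month, then day.
by_comparison = sorted(dates)

# Approach 2: three stable sorts, least significant key first.
by_passes = sorted(dates, key=itemgetter(2))       # day
by_passes = sorted(by_passes, key=itemgetter(1))   # month
by_passes = sorted(by_passes, key=itemgetter(0))   # year

print(by_passes == by_comparison)
```

Both approaches produce the same order. Running the three passes in the opposite order (year first, day last) would leave the list sorted by day only.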
The code for radix sort is straightforward. The following procedure assumes that
each element in the n-element array A has d digits, where digit 1 is the lowest-order
digit and digit d is the highest-order digit.

RADIX-SORT(A, d)
for i ← 1 to d
    do use a stable sort to sort array A on digit i

Lemma 8.3
Given n d-digit numbers in which each digit can take on up to k possible values,
RADIX-SORT correctly sorts these numbers in Θ(d(n + k)) time.

Proof The correctness of radix sort follows by induction on the column being
sorted (see Exercise 8.3-3). The analysis of the running time depends on the stable
sort used as the intermediate sorting algorithm. When each digit is in the range 0
to k−1 (so that it can take on k possible values), and k is not too large, counting sort
is the obvious choice. Each pass over n d-digit numbers then takes time Θ(n + k).
There are d passes, so the total time for radix sort is Θ(d(n + k)).

When d is constant and k = O(n), radix sort runs in linear time. More generally,
we have some flexibility in how to break each key into digits.

Lemma 8.4
Given n b-bit numbers and any positive integer r ≤ b, RADIX-SORT correctly
sorts these numbers in Θ((b/r)(n + 2^r)) time.

Proof For a value r ≤ b, we view each key as having d = ⌈b/r⌉ digits of r bits
each. Each digit is an integer in the range 0 to 2^r − 1, so that we can use counting
sort with k = 2^r − 1. (For example, we can view a 32-bit word as having four 8-bit
digits, so that b = 32, r = 8, k = 2^r − 1 = 255, and d = b/r = 4.) Each pass of
counting sort takes time Θ(n + k) = Θ(n + 2^r) and there are d passes, for a total
running time of Θ(d(n + 2^r)) = Θ((b/r)(n + 2^r)).
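
The construction in this proof can be sketched in Python as follows. The function name `radix_sort_bits` and the sample words are assumptions made for illustration; the sketch assumes nonnegative integer keys of at most b bits and extracts each r-bit digit with a shift and a mask, using a counting sort as the stable pass.

```python
def radix_sort_bits(keys, b, r):
    """Sort nonnegative integers of at most b bits using r-bit digits."""
    mask = (1 << r) - 1    # k = 2^r - 1, the largest digit value
    d = -(-b // r)         # ceil(b / r) digits per key
    for i in range(d):
        shift = i * r
        # Counting sort on digit i, i.e. bits shift .. shift + r - 1.
        counts = [0] * (mask + 1)
        for key in keys:
            counts[(key >> shift) & mask] += 1
        for v in range(1, mask + 1):
            counts[v] += counts[v - 1]
        output = [0] * len(keys)
        for key in reversed(keys):   # right-to-left scan keeps the pass stable
            digit = (key >> shift) & mask
            counts[digit] -= 1
            output[counts[digit]] = key
        keys = output
    return keys


# Hypothetical 32-bit sample keys, sorted as four 8-bit digits.
words = [0xDEADBEEF, 0x00000001, 0xCAFEBABE, 0x7FFFFFFF, 0x00FF00FF, 0x00000000]
result = radix_sort_bits(words, b=32, r=8)
print([hex(w) for w in result])
```

Each pass allocates a count array of 2^r entries and an output array of n entries, which is where the n + 2^r term in the running time comes from.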

For given values of n and b, we wish to choose the value of r, with r ≤ b,
that minimizes the expression (b/r)(n + 2^r). If b < ⌊lg n⌋, then for any value
of r ≤ b, we have that (n + 2^r) = Θ(n). Thus, choosing r = b yields a running
time of (b/b)(n + 2^b) = Θ(n), which is asymptotically optimal. If b ≥ ⌊lg n⌋,
then choosing r = ⌊lg n⌋ gives the best time to within a constant factor, which
we can see as follows. Choosing r = ⌊lg n⌋ yields a running time of Θ(bn/ lg n).
As we increase r above ⌊lg n⌋, the 2^r term in the numerator increases faster than
the r term in the denominator, and so increasing r above ⌊lg n⌋ yields a running
time of Ω(bn/ lg n). If instead we were to decrease r below ⌊lg n⌋, then the b/r
term increases and the n + 2^r term remains at Θ(n).
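
The trade-off in choosing r can be explored numerically. The sketch below evaluates the expression being minimized for every r from 1 to b and compares the best value with the r = lg n rule. The helper names `radix_cost` and `best_r`, the sample values of n and b, and the rounding choices (⌈b/r⌉ for the number of passes, ⌊lg n⌋ for the rule) are assumptions for illustration. The model ignores constant factors, so it says nothing about actual running times.

```python
from math import ceil, floor, log2


def radix_cost(n, b, r):
    """Model cost ceil(b/r) * (n + 2^r): number of passes times work per pass."""
    return ceil(b / r) * (n + 2**r)


def best_r(n, b):
    """Return the r in 1..b with the smallest model cost."""
    return min(range(1, b + 1), key=lambda r: radix_cost(n, b, r))


n, b = 1_000_000, 64                 # hypothetical: one million 64-bit keys
r_rule = min(b, floor(log2(n)))      # the rule from the text, rounded down
r_best = best_r(n, b)

print("rule:", r_rule, radix_cost(n, b, r_rule))
print("best:", r_best, radix_cost(n, b, r_best))
```

For these sample values the exhaustive search picks a slightly smaller r than the rule does, but the two model costs differ by less than a factor of two, consistent with the claim that the rule is optimal to within a constant factor.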
Is radix sort preferable to a comparison-based sorting algorithm, such as
quicksort? If b = O(lg n), as is often the case, and we choose r ≈ lg n, then radix sort’s
running time is Θ(n), which appears to be better than quicksort’s average-case time
of Θ(n lg n). The constant factors hidden in the Θ-notation differ, however.
Although radix sort may make fewer passes than quicksort over the n keys, each pass
of radix sort may take significantly longer. Which sorting algorithm is preferable
depends on the characteristics of the implementations, of the underlying machine
(e.g., quicksort often uses hardware caches more effectively than radix sort), and
of the input data. Moreover, the version of radix sort that uses counting sort as the
intermediate stable sort does not sort in place, which many of the Θ(n lg n)-time
comparison sorts do. Thus, when primary memory storage is at a premium, an
in-place algorithm such as quicksort may be preferable.

Exercises

8.3-1
Using Figure 8.3 as a model, illustrate the operation of RADIX-SORT on the
following list of English words: COW, DOG, SEA, RUG, ROW, MOB, BOX, TAB,
BAR, EAR, TAR, DIG, BIG, TEA, NOW, FOX.

8.3-2
Which of the following sorting algorithms are stable: insertion sort, merge sort,
heapsort, and quicksort? Give a simple scheme that makes any sorting algorithm
stable. How much additional time and space does your scheme entail?

8.3-3
Use induction to prove that radix sort works. Where does your proof need the
assumption that the intermediate sort is stable?

8.3-4
Show how to sort n integers in the range 0 to n^2 − 1 in O(n) time.

8.3-5 
In the first card-sorting algorithm in this section, exactly how many sorting passes
are needed to sort d-digit decimal numbers in the worst case? How many piles of
cards would an operator need to keep track of in the worst case?
