IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 39, NO. 3, MAY 1993 913

Optimal Nonuniform Signaling for Gaussian Channels
Frank R. Kschischang, Member, IEEE, and Subbarayan Pasupathy, Fellow, IEEE

Abstract-Variable-rate data transmission schemes in which constellation points are selected according to a nonuniform probability distribution are studied. When the criterion is one of minimizing the average transmitted energy for a given average bit rate, the best possible distribution with which to select constellation points is a Maxwell-Boltzmann distribution. In principle, when constellation points are selected according to a Maxwell-Boltzmann distribution, the ultimate shaping gain ($\pi e/6$ or 1.53 dB) can be achieved in any dimension. Nonuniform signaling schemes can be designed by mapping simple variable-length prefix codes onto the constellation. Using the Huffman procedure, prefix codes can be designed that approach the optimal performance. These schemes provide a fixed-rate primary channel and a variable-rate secondary channel, and are easily incorporated into standard lattice-type coded modulation schemes.

Index Terms-Signal constellations, maximum entropy principle, shaping gain, coded modulation, Huffman codes.

Manuscript received December 28, 1990; revised August 20, 1992. This work was supported in part by the Natural Sciences and Engineering Research Council of Canada, and by the Government of Ontario through an Ontario Graduate Scholarship. This paper was presented in part at the 1991 Tirrenia International Workshop on Digital Communications, Tirrenia, Italy, September 8-12, 1991, and at the 16th Biennial Symposium on Communications, Kingston, ON, Canada, May 27-29, 1992.
The authors are with the Department of Electrical Engineering, University of Toronto, Toronto, ON, Canada M5S 1A4.
IEEE Log Number 9205742.
0018-9448/93$03.00 © 1993 IEEE

I. INTRODUCTION

In the conventional approach to data transmission, each point in a given constellation is equally likely to be transmitted. While this approach yields the maximum bit rate for a given constellation size, it does not take into account the energy cost of the various constellation points. In this paper, the idea of choosing constellation points with a nonuniform probability distribution is explored. Such nonuniform signaling will reduce the entropy of the transmitter output, and hence the average bit rate. However, if points with small energy are chosen more often than points with large energy, energy savings may (more than) compensate for this loss in bit rate.

It follows immediately from the maximum entropy principle (see, e.g., [1, ch. 11]), or by variational calculus as in [2, Section IV-B], that the probability distribution that maximizes entropy for a fixed average energy is one in which a constellation point $r$, with energy $\|r\|^2$, is chosen with probability $p(r) \propto \exp(-\lambda \|r\|^2)$, where the nonnegative parameter $\lambda$ governs the trade-off between bit rate and average energy. Nonuniform signaling with this family of distributions, well known in statistical mechanics and thermodynamics as Maxwell-Boltzmann distributions [3], [4] or as Gibbs ensembles [5], is the focus of this paper.

Nonuniform signaling is closely related to the notion of constellation shaping in coded modulation (as described in, e.g., [2], [6]-[9]). Constellation shaping can provide an energy savings called shaping gain in addition to the usual coding gain provided by lattice- or trellis-coding. Indeed, the gain $G$ provided by a coded modulation system (relative to a simple pulse amplitude modulation baseline) operating at a bit rate $\beta$ (bits per two-dimensional channel-use) is well-approximated by writing

$$G \approx \gamma_c \gamma_s \left(1 - 2^{-\beta}\right)$$

where $\gamma_c$ and $\gamma_s$ denote, respectively, the coding gain [10] and shaping gain [2] of the scheme in question. As will be shown, the discretization factor $1 - 2^{-\beta}$ properly adjusts the gain for finite bit rates. The connection between shaping and nonuniform signaling arises from the fact that, in schemes that employ shaping, a nonuniform distribution is induced on the points of the low-dimensional constituent constellation. By applying nonuniform signaling directly (rather than indirectly via constellation shaping), nonuniform signaling can, in any dimension, achieve the ultimate shaping gain ($\pi e/6$ or 1.53 dB) attainable with uniform signaling only in the limit of infinite dimension [2]. Indeed, one of the principal results of this paper is a method of designing simple nonuniform signaling schemes that approach this ultimate level.

It is important to note at the outset, however, that practical implementation of direct nonuniform signaling will be hampered by the variable transmission rate of such schemes. Transmitting data obtained from a fixed-rate source requires data buffering at the transmitter and the receiver, which leads to the problem of coping with buffer over- or underflow. Furthermore, since the transmitted signals represent variable numbers of bits, channel errors may cause the insertion and deletion of bits in the decoded data, causing potential losses of synchronization. Although these system problems will tend to limit the broad applicability of nonuniform signaling, we do not attempt to provide solutions to these problems in this paper.

Instead, our aim in this paper is 1) to provide insight into nonuniform signaling schemes and how they relate to conventional signaling schemes, 2) to assess the potential gains that nonuniform signaling may provide, and 3) to provide a method (via the Huffman algorithm) by which simple, near-optimal, nonuniform signaling schemes may be designed. Such nonuniform signaling schemes provide a standard, fixed-rate, primary channel unaffected by the system problems mentioned in the previous paragraph, together with a variable-rate secondary channel, in which these system problems may be acceptable. At various places in this paper, the close analogy between these nonuniform signaling schemes and mathematically equivalent statistical mechanical systems will be pointed out.

To motivate our interpretation of nonuniform signaling as a shaping method, we summarize the complementary notions of coding and shaping in Section II. In Section III, we briefly define the coded modulation parameters that will be needed throughout the paper. In Section IV, we discuss various properties of the Maxwell-Boltzmann distribution that are important in this setting. The results of applying this distribution to signal point selection from spherical constellations based on various dense lattices are given in Section V. In Section VI, we define and apply "continuous approximations" to show that the ultimate shaping gain of $\pi e/6$, or 1.53 dB, is attainable in any dimension when constellation points are selected with a Maxwell-Boltzmann distribution. In Section VII, we show how the Huffman procedure may be used to obtain dyadic approximations to the Maxwell-Boltzmann distribution that provide near-ultimate shaping gains. In Section VIII, we discuss the integration of nonuniform signaling with coset coding. Finally, we make some general comments and concluding remarks in Section IX.

II. CODING AND SHAPING

Coding and shaping are two separate and complementary operations that contribute to the gain of lattice-type coded modulation schemes such as lattice codes and lattice-type trellis codes. In implementation as well as in analysis, the two operations are dual and separable and provide two additive gain components: coding gain and shaping gain. We say that coding gain is a distance property of the coded modulation scheme because it depends, in general, on the set of distances between the various transmitted signal sequences. Coding is generally performed to achieve a large minimum distance between signal sequences, i.e., coding attempts to be distance-maximizing. Shaping gain, on the other hand, is an energy property of the coded modulation scheme because it depends, in general, on the energy of the various transmitted signal sequences. Shaping is generally performed to achieve a small average transmitted energy while maintaining the desired bit rate, i.e., shaping attempts to be energy-minimizing.

In more general terms, coding and shaping attempt to solve two related, but different, problems. The coding problem is to find a large set of symbol sequences that can be distinguished with high reliability in the presence of noise. The shaping problem is to use these symbol sequences to deliver maximum information to the receiver at minimum cost where, in this paper, cost is measured by the average energy per transmitted symbol.

Most lattice-type coded modulation schemes based on a lattice partition $\Lambda/\Lambda'$ have the encoder structure shown in Fig. 1 (see [10]-[12]). The encoder consists of a coset code $C$ and a signal point selector $S$. The coset code $C$ produces a sequence of sets of signals, drawn from the cosets of a sublattice $\Lambda'$ in a lattice $\Lambda$. In general, each coset contains an infinite number of points. The role of the signal point selector $S$ is to choose an actual constellation point to be transmitted from each infinite coset. The coset code $C$ is usually chosen to maximize the minimum distance between different possible signal sequences. The signal point selector $S$ usually attempts to minimize the average transmitted energy while supporting the desired bit rate. In light of our previous discussion, it should be clear that $C$ performs the coding (or distance-maximizing) operation, while $S$ performs the shaping (or energy-minimizing) operation. Both the coset code $C$ and the signal point selector $S$ contribute to the transmission of data in these coded modulation schemes. In this paper, the signal point selection or shaping component of these coded modulation schemes is studied.

[Fig. 1. Encoder structure for lattice-type coded modulation schemes. Input data drives the coset code $C$, which produces a sequence of cosets of $\Lambda'$ in $\Lambda$, and the signal point selector $S$, whose output goes to the channel.]

Shaping schemes (or signal point selectors) may be classified as being either fixed-rate or variable-rate. Fixed-rate schemes achieve the transmission of a fixed number of bits over some well-defined signaling interval. Generalized cross constellations [2], Voronoi constellations [6], block shaping codes [7], trellis shaping codes [8], and truncated polydisc constellations [9] are all examples of fixed-rate shaping schemes. From the point of view of maximizing shaping gain, the best possible $N$-dimensional constellation shape is the $N$-sphere, since it achieves a specified volume with least average energy.

With variable-rate schemes, the number of bits transmitted during a signaling interval is a random variable. A simple example of such a scheme is given in [13], where a binary data stream is parsed into the codewords of a variable-length prefix code, which are then mapped onto the points of a constellation. (A more detailed account of this type of scheme is given in [14].) Other examples of variable-rate schemes that may be interpreted as mapping the words of a prefix code onto a constellation include the shaping schemes described by Livingston [15], the block-encoded modulation schemes of Chouly and Sari [16], and the signaling schemes with "opportunistic secondary channels" described by Forney and Wei [2] (see also [17]-[21]). It should be noted, however, that these latter schemes were not constructed with shaping gain in mind.

III. DEFINITIONS

In this section, definitions for various parameters used throughout this paper are provided. Most of these parameters are carefully defined in [2] for the case of uniform signaling; here these definitions are extended to the case of nonuniform signaling.

Throughout this paper, we deal with constellations $\Omega$ embedded in an $N$-dimensional ($N$-D) vector space with a well-defined (Euclidean) distance and norm. The size of $\Omega$, $|\Omega|$,
is usually finite, but not necessarily so (as in the case of an infinite lattice). Often, $\Omega$ will be obtained as the intersection of a lattice $\Lambda$ (or a translate $a + \Lambda$) with a finite region $\mathbb{R}$, in which case we denote the constellation by $\Omega(\Lambda, \mathbb{R})$. The energy (squared norm) of a point $r \in \Omega$ is denoted by $\|r\|^2$. We shall usually assume that the transmitter produces a sequence of symbols drawn independently from the constellation and that the symbols are selected at some regular symbol rate. The probability with which the transmitter selects a point $r \in \Omega$ is denoted by $p(r)$. As usual in the study of data transmission schemes, we are concerned with trade-offs among three parameters: reliability, bit rate, and transmitter power.

A. Reliability, Bit Rate, Transmitter Power

Although the most natural reliability measure for symbol transmission in a noisy channel is, perhaps, $P_e$ (the average symbol error rate), this measure is often difficult to compute and to work with, especially in the case of complicated multidimensional constellations. A simple (and well-established) reliability measure for a signaling scheme on the Gaussian channel is the parameter $d^2_{\min}$, the minimum squared Euclidean distance between different constellation points. Formally,

$$d^2_{\min} \triangleq \min \{ d^2(r, r') : r, r' \in \Omega,\ r \neq r' \}$$

where $d^2(r, r')$ denotes the squared Euclidean distance between constellation points $r$ and $r'$. Schemes with greater $d^2_{\min}$ will tend to have smaller symbol error rate, at least for large SNR (signal-to-noise ratio), and hence greater reliability; thus, we will take $d^2_{\min}$ as the principal reliability measure in this paper.

In fact, $d^2_{\min}$ can be used to estimate the symbol error rate at moderate to high SNRs. Let $N_{\min}(r)$ denote the number of constellation points at squared distance $d^2_{\min}$ from the point $r$. The error coefficient $N$ is the average of $N_{\min}$ over the constellation; that is,

$$N \triangleq \sum_{r \in \Omega} p(r) N_{\min}(r). \tag{1}$$

Assuming a white Gaussian noise channel with a one-sided noise power spectral density of $N_0$ W/Hz, the dominant term in a simple union bound on $P_e$, assuming maximum-likelihood decoding, gives us the estimate

$$P_e \approx N\, Q\!\left(\sqrt{\frac{d^2_{\min}}{2 N_0}}\right) \tag{2}$$

where $Q(x) \triangleq (1/\sqrt{2\pi}) \int_x^{\infty} \exp(-u^2/2)\, du$. It is important to note that $N$, unlike $d^2_{\min}$, is affected by the probability with which constellation points are selected.

In information-theoretic terms, the transmitter is a discrete memoryless source whose output alphabet is the set of points in the constellation. The (average) bit rate is equal to the entropy of the transmitter in bits per transmitted symbol. In formal terms, a scheme with a constellation $\Omega$, in which the symbol $r$ is selected independently with probability $p(r)$, has bit rate

$$\beta \triangleq -\frac{2}{N} \sum_{r \in \Omega} p(r) \log_2 [p(r)] \quad \text{bits/2-D channel-use} \tag{3}$$

where, as in [2], we have normalized to 2-D. For the special case of a uniform probability distribution over a finite $N$-D constellation $\Omega$, the bit rate is $(2/N) \log_2 |\Omega|$ bits/2-D channel-use, and this is the maximum bit rate for the given constellation. A signaling scheme with a normalized bit rate of $\beta$ can send roughly $\beta$ bits/s/Hz when implemented with a QAM (quadrature amplitude modulation) modem (which sends sequences of 2-D signals).¹

¹ Actually, a spectral throughput of $\beta$ bits per second per Hertz is achieved with QAM only by using ideal Nyquist pulse shaping. In practice, some excess bandwidth is needed, so $\beta$ represents an upper bound on spectral throughput.

Transmitter power is proportional to the average energy per transmitted symbol. The normalized average energy per symbol per two dimensions is given by

$$E \triangleq \frac{2}{N} \sum_{r \in \Omega} p(r) \|r\|^2. \tag{4}$$

B. Constellation Figure of Merit and Gain

In any data transmission scheme, we would like to transmit at a large bit rate, with as high a reliability and as low a transmitter power as possible. A commonly used figure of merit for a signaling scheme, sometimes termed the "constellation figure of merit" or CFM [2], is the dimensionless, scale-invariant ratio $\mathrm{CFM} \triangleq d^2_{\min}/E$.

To compare the relative energy efficiency of two schemes, we use the estimate (2). Let

$$P_{e,1}(E/N_0) = N_1\, Q\!\left(\sqrt{\mathrm{CFM}_1 \cdot \frac{E}{2 N_0}}\right)$$

and

$$P_{e,2}(E/N_0) = N_2\, Q\!\left(\sqrt{\mathrm{CFM}_2 \cdot \frac{E}{2 N_0}}\right)$$

denote symbol error rate estimates for two schemes operating at the same bit rate and having, respectively, constellation figures of merit $\mathrm{CFM}_1$ and $\mathrm{CFM}_2$ and error coefficients $N_1$ and $N_2$. Then, $P_{e,1}^{-1}(p)$ and $P_{e,2}^{-1}(p)$ denote, respectively, the approximate $E/N_0$ value needed by each of the two schemes to achieve the symbol error rate $p$. At a fixed symbol error rate $p$, the gain, $G(p)$, of the first scheme relative to the second is given by the ratio of $E/N_0$ values, i.e.,

$$G(p) = \frac{P_{e,2}^{-1}(p)}{P_{e,1}^{-1}(p)} = \frac{\mathrm{CFM}_1}{\mathrm{CFM}_2} \times \gamma_N(p, N_1, N_2) \tag{5}$$

where

$$\gamma_N(p, N_1, N_2) = \left[ \frac{Q^{-1}(p/N_2)}{Q^{-1}(p/N_1)} \right]^2.$$

(This latter factor is easily evaluated via a convenient approximation for $Q^{-1}(x)$ due to Hastings [22] given in Abramowitz and Stegun [23, section 26.2.23].) For small values of $x$,

$$Q^{-1}(x) \approx \sqrt{-2 \ln x} \tag{6}$$

and hence the asymptotic gain

$$G \triangleq \lim_{p \to 0} G(p) = \mathrm{CFM}_1 / \mathrm{CFM}_2$$
is the ratio of constellation figures of merit, and is not affected by error coefficient values. At nonzero values of $p$, the asymptotic gain will be affected by the $\gamma_N$ factor in (5). In particular, when $N_1 > N_2$, the asymptotic gain will be reduced. When $p$ is small, we can again use (6) and write

$$\gamma_N(p, N_1, N_2) \approx \frac{\ln (p/N_2)}{\ln (p/N_1)} \tag{7}$$

to estimate this effect.

For bit rates $\beta \geq 2$ bits/2-D channel-use, we take as a baseline for comparison the CFM obtainable with a simple 1-D PAM (pulse amplitude modulation) constellation. Assuming that constellation points are selected with equal probability, this baseline has figure of merit given by $\mathrm{CFM}_\oplus = 6/(2^\beta - 1)$. The (asymptotic) gain of a scheme with a given CFM, relative to the baseline, is then given by

$$G(d^2_{\min}, \beta, E) \triangleq \frac{\mathrm{CFM}}{\mathrm{CFM}_\oplus} = \frac{(2^\beta - 1)\, d^2_{\min}}{6E}, \quad \beta \geq 2. \tag{8}$$

This gain measure effectively combines our reliability, bit rate, and transmitter power measures into a single figure, thus allowing for direct comparisons of disparate schemes. Note also that the gain formula (8) is valid only for $\beta \geq 2$, i.e., for bit rates of at least one bit per symbol per dimension. Since we wish to study bandwidth-efficient schemes, i.e., those with large $\beta$, this restriction will pose no problem, and we assume it to hold throughout this paper.

While we will be interested primarily in asymptotic gain in this paper, we can use (5) to estimate the effect of the error coefficient at nonzero error probability values. In Appendix A, we show that a cubic $N$-D constellation based on $\mathbb{Z}^N$ under uniform signaling and supporting a bit rate $\beta$ (i.e., an $N$-D baseline constellation) has error coefficient $N_\oplus = 2N(1 - 2^{-\beta/2})$. Thus, in two dimensions and for large values of $\beta$, $N_\oplus \approx 4$. Using (7), we estimate that every factor of two increase in the error coefficient (over the baseline of 4) reduces the asymptotic gain by about 0.2 dB for $p$ on the order of $10^{-6}$, in agreement with the rule of thumb given by Forney [10, p. 1142]. (A plot of $\gamma_N$ in dB versus $\log_2 N$ suggests that a loss of 0.22 dB per error coefficient doubling is a fairly accurate rule, up to about $N = 64$, at $p = 10^{-6}$. Similarly, at $p = 10^{-8}$, the loss is about 0.17 dB per doubling in error coefficient.)

It is important to note that when the inner points of a constellation are selected more often than the outer points, the error coefficient will increase, because the inner points tend to have greater $N_{\min}$ than the outer points. For example, consider a constellation drawn from the $\mathbb{Z}^2$ lattice, for which the error coefficient satisfies $N \leq 4$. In light of the previous paragraph, we can estimate the maximum loss in gain by

$$0.22 \log_2 (4/N_\oplus)\ \text{dB}$$

for error rates on the order of $10^{-6}$. For $\beta = 2$, when the baseline has $N_\oplus = 2$, this maximum possible degradation is quite large (0.22 dB) relative to the maximum achievable shaping gain (1.53 dB); however, for $\beta \approx 6$, the maximum degradation is only about 0.04 dB. At smaller error rates, the degradation is even less. As mentioned, we prefer to focus on the asymptotic gain, which is unaffected by variations in the error coefficient, but we caution the reader to note that the error coefficient must be accounted for in estimating the gain at nonzero error probabilities.

Suppose now that the constellation $\Omega$ is obtained from an infinite $N$-D lattice $\Lambda$, and that a point $r \in \Omega$ is selected with probability $p(r)$. If $\Omega \neq \Lambda$, it is convenient to extend the distribution $p(r)$ to all the points of $\Lambda$, simply by assigning zero probability to any points not in $\Omega$. Let $V(\Lambda)$ be the volume of a fundamental region [24] of $\Lambda$, i.e., the volume of $N$-space associated with each lattice point, and let $2^{N\beta(p)/2}\, V(\Lambda)$ be the "entropy volume" of $\Lambda$ with distribution $p(r)$. We can then write $G(d^2_{\min}, \beta, E)$ (8) as

$$G(d^2_{\min}, \beta, E) = \gamma_c(\Lambda)\, \gamma_s(p)\, (1 - 2^{-\beta(p)}), \tag{9}$$

where

$$\gamma_c(\Lambda) = d^2_{\min}(\Lambda) / V(\Lambda)^{2/N} \tag{10}$$

is the coding gain [10] of the lattice $\Lambda$, and

$$\gamma_s(p) = 2^{\beta(p)}\, V(\Lambda)^{2/N} / [6E(p)]$$

is the shaping gain of $\Lambda$ with distribution $p(r)$. For large bit rates $\beta$, when the discretization factor $\gamma_d(p) = 1 - 2^{-\beta(p)}$ is close to unity (i.e., small in dB), the total gain is approximately separable into the product of a coding gain and a shaping gain. The coding gain $\gamma_c(\Lambda)$, a geometric property of the lattice $\Lambda$ studied in the coded modulation literature and elsewhere [10], [11], [24], is independent of the probability distribution $p(r)$ used to select constellation points and therefore not of central interest in this paper. The shaping gain $\gamma_s(p)$ is largely independent of the underlying lattice $\Lambda$, except insofar as the lattice restricts the distribution $p(r)$; this is why we have suppressed an explicit dependence on $\Lambda$ in our notation. In general [2], $\gamma_s(p) \leq \pi e/6$. As we shall see, by choosing the distribution $p(r)$ to be the Maxwell-Boltzmann distribution, $\gamma_s(p)$ can be made to approach the ultimate shaping gain of $\pi e/6$ in any dimension.

C. Other Constellation Parameters

Often, higher-dimensional constellations $\Omega$ are obtained as subsets drawn from Cartesian products of lower-dimensional "constituent" constellations. When the dimension $N$ of $\Omega$ is even, we may define the constituent 2-D constellation of $\Omega$ as the smallest 2-D constellation $\Omega_2$ such that $\Omega \subset \Omega_2^{N/2}$, where $\Omega_2^{N/2}$ is the $N/2$-fold Cartesian product of $\Omega_2$ with itself. The constituent 2-D constellation $\Omega_2$ plays an important role when the signaling scheme is to be implemented with a QAM modem, since all transmitted signals are obtained as sequences of QAM signals drawn from $\Omega_2$.

If $\Omega$ is odd-dimensional, then $\Omega^2$ is even-dimensional and can be implemented with a QAM modem. This suggests that we may define the constituent 2-D constellation of $\Omega$ as the constituent 2-D constellation of $\Omega^2$. In particular, if $N = 1$, this implies $\Omega_2 = \Omega^2$. For later use, we note that the constituent 2-D constellation of $\mathbb{B}_N(R)$, an $N$-ball of radius $R$ centered at the origin, is a 2-D disk $\mathbb{B}_2(R)$ when $N$ is even, and is a square $\mathbb{B}_1^2(R)$ of side $2R$ when $N$ is odd.
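The quantities defined in Section III-A and III-B can be checked numerically. The following Python sketch is illustrative only: the function names (`pam_points`, `maxwell_boltzmann`, `signaling_parameters`, `error_coefficient_factor`) are hypothetical and not from the paper, the constellation is restricted to 1-D PAM ($N = 1$), and the probabilities $p(r) \propto \exp(-\lambda \|r\|^2)$ are simply normalized by their sum. The error-coefficient factor assumes the estimate $P_e \approx N\,Q(\sqrt{d^2_{\min}/(2N_0)})$, which gives $\gamma_N = [Q^{-1}(p/N_2)/Q^{-1}(p/N_1)]^2$, and it uses the inverse normal CDF from Python's standard library instead of the Hastings approximation cited in the text.

```python
import math
from statistics import NormalDist


def pam_points(m):
    """Return the m-point 1-D PAM constellation {-(m-1), ..., -1, +1, ..., m-1}."""
    return [float(2 * k - (m - 1)) for k in range(m)]


def maxwell_boltzmann(points, lam):
    """Selection probabilities proportional to exp(-lam * r^2), normalized to sum to 1."""
    weights = [math.exp(-lam * r * r) for r in points]
    total = sum(weights)
    return [w / total for w in weights]


def signaling_parameters(points, probs):
    """Return (beta, energy, cfm, gain) for a 1-D constellation (N = 1).

    beta   : bit rate per two dimensions, as in (3)
    energy : normalized average energy per two dimensions, as in (4)
    cfm    : constellation figure of merit d2_min / energy
    gain   : gain over the uniform PAM baseline, as in (8); meaningful for beta >= 2
    """
    n_dim = 1  # assumption: points are scalars, so the constellation is 1-D
    beta = -(2.0 / n_dim) * sum(p * math.log2(p) for p in probs if p > 0.0)
    energy = (2.0 / n_dim) * sum(p * r * r for p, r in zip(probs, points))
    # Minimum squared Euclidean distance over all distinct pairs of points.
    d2_min = min((a - b) ** 2
                 for i, a in enumerate(points)
                 for b in points[i + 1:])
    cfm = d2_min / energy
    gain = (2.0 ** beta - 1.0) * d2_min / (6.0 * energy)
    return beta, energy, cfm, gain


def error_coefficient_factor(p, n1, n2):
    """gamma_N for error coefficients n1 (scheme 1) and n2 (scheme 2) at error rate p."""
    def q_inv(x):
        # Q^{-1}(x) is the negated standard normal quantile at x.
        return -NormalDist().inv_cdf(x)
    return (q_inv(p / n2) / q_inv(p / n1)) ** 2


def to_db(x):
    return 10.0 * math.log10(x)


if __name__ == "__main__":
    pts = pam_points(8)
    uniform = [1.0 / len(pts)] * len(pts)
    shaped = maxwell_boltzmann(pts, 0.03)  # lam = 0.03 is an arbitrary example value
    for label, probs in (("uniform", uniform), ("shaped", shaped)):
        beta, energy, cfm, gain = signaling_parameters(pts, probs)
        print(f"{label}: beta={beta:.3f} E={energy:.3f} CFM={cfm:.4f} G={to_db(gain):.3f} dB")
    loss = -to_db(error_coefficient_factor(1e-6, 8, 4))
    print(f"loss per error-coefficient doubling at p=1e-6: {loss:.3f} dB")
```

Under uniform selection the 8-point PAM constellation reproduces the baseline (a gain of 1, i.e., 0 dB), since its CFM equals $6/(2^\beta - 1)$ with $\beta = 6$. With $\lambda > 0$ both the bit rate and the average energy drop, and the gain rises above the baseline while staying below $\pi e/6$.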
As discussed in [2] and [25], an important parameter in the design of a signaling scheme is the 2-D "constellation expansion ratio"

$$\mathrm{CER}_2(\Omega) \triangleq |\Omega_2| / 2^\beta$$

where $|\Omega_2|$ is the number of points in the constituent 2-D constellation of $\Omega$, and $2^\beta$ represents the number of points in a comparable baseline constellation supporting the same bit rate with uniform signaling. Since $|\Omega_2| \geq |\Omega|^{2/N} \geq 2^\beta$, we have $\mathrm{CER}_2(\Omega) \geq 1$. Large 2-D constellations are sensitive to nonlinearities and other signal-dependent perturbations, so it is desirable that $\mathrm{CER}_2(\Omega)$ be as close to its lower bound of unity as possible.

Another important parameter discussed in [2] is the 2-D "peak-to-average energy ratio"

$$\mathrm{PAR}_2 \triangleq r^2_{\max} / E \tag{11}$$

where $r^2_{\max}$ is the energy maximum of the points in $\Omega_2$, the constituent 2-D constellation of $\Omega$, and $E$ is the normalized average energy. $\mathrm{PAR}_2$ is a measure of the dynamic range of the signals transmitted by a QAM modem. To minimize the effects of signal-dependent distortion, it is desirable that $\mathrm{PAR}_2$, like $\mathrm{CER}_2$, be as small as possible. (Note that the "peak" energy in this definition is found by averaging a signal over a 2-D interval, thus making $\mathrm{PAR}_2$ independent of the pulse shape used in implementation. The actual "instantaneous" peak energy depends on the actual pulses used, and on how these pulses superpose in time when transmitted in sequence.)

Since constellations are often obtained by taking the intersection of an $N$-D lattice $\Lambda$ with an $N$-D region $\mathbb{R}$, in analogy with constituent 2-D constellations, it is useful to define constituent 2-D lattices and regions. We define the constituent 2-D lattice $\Lambda_2$ of $\Lambda$ as the smallest 2-D set $\Lambda_2$ such that $\Lambda \subset \Lambda_2^{N/2}$. Similarly, the constituent 2-D region of $\mathbb{R}$ is the smallest 2-D region $\mathbb{R}_2$ such that $\mathbb{R} \subset \mathbb{R}_2^{N/2}$. If $N = 1$, then $\Lambda_2 = \Lambda^2$ and $\mathbb{R}_2 = \mathbb{R}^2$. It is clear that $\Omega_2$ is obtained as the intersection of $\Lambda_2$ with $\mathbb{R}_2$.

IV. THE MAXWELL-BOLTZMANN DISTRIBUTION

As pointed out in the Introduction, it follows immediately from the maximum entropy principle that the Maxwell-Boltzmann distribution maximizes bit rate for a fixed average energy. (For a good introduction to the maximum entropy principle, see [1, ch. 11].) Equivalently, the Maxwell-Boltzmann distribution minimizes average energy for a fixed bit rate.² Signal point selection with a Maxwell-Boltzmann distribution causes a constellation point $r$, with energy $\|r\|^2$, to be selected with probability $p(r) \propto \exp(-\lambda \|r\|^2)$, where the parameter $\lambda \geq 0$ governs the trade-off between bit rate and average energy. More precisely, the optimal distribution is one in which

$$p(r) \triangleq \exp(-\lambda \|r\|^2) / Z(\lambda),$$

where the partition function $Z(\lambda)$ is chosen to normalize the distribution, i.e.,

$$Z(\lambda) \triangleq \sum_{r \in \Omega} \exp(-\lambda \|r\|^2), \quad \lambda \geq 0. \tag{12}$$

(For infinite constellations, $\lambda$ must be strictly positive.) The Maxwell-Boltzmann distribution arises in many contexts; e.g., in the optimization of permutation modulation for quantization [26] and for transmission [27], in neural networks [28], and in simulated annealing [29], among others.

² In both optimization problems, the given constellation must be able to support the given bit rate or the given average energy, so these values are themselves constrained; this is pointed out at the end of Section IV.

For finite constellations, setting $\lambda = 0$ yields a signaling scheme in which constellation points are selected with uniform probability; thus, "classical" fixed-rate signaling schemes appear here as a special case. Note too that, with a Maxwell-Boltzmann distribution, outer points (points with large energy) are never selected more often than inner points (points with small energy). An equivalence class of the points of $\Omega$ all having the same energy is called a shell of the constellation. With a Maxwell-Boltzmann distribution, the points of a shell are selected equally often. Indeed, if all constellation points lie in the same shell, "classical" uniform signaling is obtained for all values of $\lambda$.

In statistical mechanics, much attention is paid to the computation of the partition function $Z(\lambda)$ (12) in various physical systems. This is due to the fact that the average energy and entropy are easily obtained in terms of $Z(\lambda)$. Indeed, the normalized average energy (4) is obtained as

$$E(\lambda) = -\frac{2}{N} \frac{d}{d\lambda} \ln Z(\lambda) \tag{13}$$

and the normalized bit rate (3) is obtained as

$$\beta(\lambda) = \frac{2}{N} \left[ -\lambda^2 \frac{d}{d\lambda} \left( \frac{\log_2 Z(\lambda)}{\lambda} \right) \right] = \frac{2}{N} \log_2 Z(\lambda) + \frac{\lambda E(\lambda)}{\ln 2}. \tag{14}$$

The partition function is easily obtained in terms of the theta series [30] or Euclidean weight distribution [10] of a constellation. The theta series for a constellation $\Omega$ is simply a generating function for the set of energy values (squared norms) taken on by the points of $\Omega$, and is defined as $\Theta(x) \triangleq \sum_{r \in \Omega} x^{\|r\|^2}$, where we interpret $\Theta(x)$ as a real function of $x$, and note that $Z(\lambda) = \Theta[\exp(-\lambda)]$.

The Maxwell-Boltzmann parameter $\lambda$ governs the trade-off between bit rate and average energy. In analogy with statistical mechanics, we might call $\lambda$ the "inverse temperature" of the Maxwell-Boltzmann distribution, i.e., $\lambda = 1/(kT)$, where, in statistical mechanics, $k$ is the Boltzmann constant and $T$ is the temperature. When $\lambda = 0$ (infinite "temperature"), the uniform distribution is obtained, corresponding to the maximum possible entropy for the given constellation. (In statistical mechanics, all states of a system are equally occupied at infinite temperature.) As $\lambda \to \infty$ (or the "temperature" cools toward absolute zero), the bit rate as well as the average energy are reduced as the points with large energy are selected less frequently. The "limiting constellation" (obtained at absolute zero "temperature") consists of only the innermost points
of the original constellation (the ground states in statistical mechanics), and these points are selected equally often.

An important property of the Maxwell-Boltzmann distribution is its "separability" property. Suppose the $N$-D constellation $\Omega$ is the Cartesian product of two or more "factor constellations," i.e., $\Omega = \Omega_1 \times \Omega_2 \times \cdots \times \Omega_J$, $J \geq 2$, where $\Omega_i$ is $N_i$-dimensional and $\sum_{i=1}^{J} N_i = N$. Then it follows that

$$Z(\lambda) = \prod_{i=1}^{J} Z_i(\lambda), \tag{15}$$

where $Z_i(\lambda)$ is the partition function over the $i$th factor constellation, i.e., $Z_i(\lambda) = \sum_{r_i \in \Omega_i} \exp(-\lambda \|r_i\|^2)$. When (15) holds, the Maxwell-Boltzmann distribution with parameter $\lambda$ is separable into the product of Maxwell-Boltzmann distributions over the factor constellations, each with parameter $\lambda$. In practice, this means that optimal nonuniform signaling can be implemented on separable constellations by independently implementing nonuniform signaling on each of the factor constellations. From (13) and (14) it follows that $E$ and $\beta$ can be obtained by a weighted average of the corresponding factor constellation quantities, i.e., $E = \sum_{i=1}^{J} E_i N_i / N$ and $\beta = \sum_{i=1}^{J} \beta_i N_i / N$.

If the constellation $\Omega$ has $|\Omega|$ points in total, and $N_{\mathrm{in}}$ "innermost" points of minimum energy, then as $\lambda$ ranges from 0 to $+\infty$, every value of $\beta$ in the range $(2 \log_2 (N_{\mathrm{in}})/N,\ 2 \log_2 (|\Omega|)/N]$ is obtained, and we say that the constellation supports every bit rate in this range with a Maxwell-Boltzmann distribution. Similarly, over the same range of $\lambda$ values, the constellation supports average energy values in the range $(E_{\min}, E_{\mathrm{uniform}}]$, where $E_{\min}$ is the normalized energy of a minimum-energy point in $\Omega$, and $E_{\mathrm{uniform}}$ is the normalized average energy under uniform signaling. An infinite lattice can support any positive bit rate and any positive average energy with a Maxwell-Boltzmann distribution.

V. SPHERICAL CONSTELLATIONS

In this section, we apply the Maxwell-Boltzmann distribution to spherical constellations based on the densest known lattices in various numbers $N$ of dimensions. In one dimension, the constellations are based on the integer lattice $\mathbb{Z}$, while in two dimensions, the constellations are based on $A_2$, the hexagonal lattice. In higher dimensions, the constellations are based on the lattices denoted $D_4$, $E_6$, $E_8$, and $K_{12}$; the subscript in this notation displays the number of dimensions $N$. Basis vectors and extensive theta series tables for these lattices are given in [24, ch. 4]. Each constellation is obtained by taking some number $M$ of the points of smallest energy in the lattice, where $M$ is chosen so as to include some integral number of lattice shells. Equivalently, we may think of the constellations as being obtained by forming the intersection of the infinite lattice with an $N$-sphere centered at the origin;

[...]

the separability property of the Maxwell-Boltzmann distribution, the results we obtain for these spherical constellations are also applicable to those nonspherical constellations that can be expressed as Cartesian products of spherical constellations. For example, $N$-cube shaped constellations based on the integer lattice $\mathbb{Z}^N$ are Cartesian products of simple 1-D "spherical" constellations based on $\mathbb{Z}$.

[Fig. 2. Normalized gain of spherical constellations drawn from various dense lattices with signal point selection performed according to the Maxwell-Boltzmann distribution. Vertical axis: $G/\gamma_c$ (dB); horizontal axis: $\beta$ (bits/2-D channel-use).]

Plotted in Fig. 2 is the "normalized" gain that selected constellations provide for various bit rates when constellation points are selected with a Maxwell-Boltzmann distribution. The normalized gain is obtained from the gain $G$, defined in (8), by dividing by the coding gain $\gamma_c(\Lambda)$ (10) of the lattice from which the constellation is drawn. Each curve in Fig. 2 is obtained by varying the parameter $\lambda$ from zero (corresponding to the maximum bit rate, i.e., rightmost, point in each curve) through positive values. (Recall that $\lambda = 0$ corresponds to the "classical" case of a uniform distribution.) Curves corresponding to constellations drawn from the same lattice but extending further to the right, i.e., to larger bit rates, correspond to larger constellations. Also plotted in Fig. 2 is the function

$$U(\beta) \triangleq (\pi e/6)(1 - 2^{-\beta}). \tag{16}$$

As we shall explain in Section VI, for bit rates $\beta$ greater than about 2.5, $U(\beta)$ forms the "upper envelope" of the gain curves, to good approximation.

Fig. 2 has several noteworthy features. First, notice that the gain obtained by nonuniform signaling with a constellation can significantly exceed that provided under uniform signaling, at the expense of a reduction in bit rate. The additional gain provided under nonuniform signaling is called the "biasing gain" [7].

The curves corresponding to large constellations tend to merge with curves corresponding to smaller constellations as $\lambda$ is increased. This happens because the smaller constellations are subconstellations of the large constellations. As $\lambda$ is
hence the term spherical constellations? We note that, due to increased, the outer points of the large constellations are
selected very infrequently, so that, in effect, these outer points
3 Caution: some authors reserve this term La refer to constellations in which
all points are on the surface of a sphcre; in this paper, spherical constellations can be neglected and the large constellation "shrinks" into a
do, in general, include interior points as well. smaller constellation.
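As a rough illustration of how one point on a Fig. 2 curve could be computed, the following Python sketch evaluates the Maxwell-Boltzmann distribution on a small spherical constellation drawn from $\mathbb{Z}^2$ and compares the resulting gain with $U(\beta)$ from (16). The function names are hypothetical, the gain is taken in the form $(2^\beta - 1)d_{\min}^2/(6E)$ implied by (24), and the choice of the nine lowest-energy points is only an example; since $\gamma_c = 1$ for $\mathbb{Z}^2$, the gain and the normalized gain coincide here.

```python
import math
from itertools import product


def spherical_z2(max_norm):
    """Points of the integer lattice Z^2 with squared norm <= max_norm."""
    r = math.isqrt(max_norm)
    return [(x, y) for x, y in product(range(-r, r + 1), repeat=2)
            if x * x + y * y <= max_norm]


def maxwell_boltzmann(points, lam):
    """p(r) = exp(-lam * ||r||^2) / Z(lam) for every point r."""
    weights = [math.exp(-lam * sum(c * c for c in p)) for p in points]
    z = sum(weights)  # partition function Z(lam)
    return [w / z for w in weights]


def rate_energy_gain(points, probs, d2min=1.0):
    """Normalized bit rate, normalized energy, and gain (assumed form of (8))."""
    n = len(points[0])
    entropy = -sum(p * math.log2(p) for p in probs if p > 0)
    beta = (2 / n) * entropy
    energy = (2 / n) * sum(p * sum(c * c for c in pt)
                           for p, pt in zip(probs, points))
    gain = (2 ** beta - 1) * d2min / (6 * energy)
    return beta, energy, gain


def upper_envelope(beta):
    """U(beta) from (16)."""
    return (math.pi * math.e / 6) * (1 - 2 ** (-beta))


if __name__ == "__main__":
    points = spherical_z2(2)  # origin plus the first two shells: 9 points
    for lam in (0.0, 0.35, math.log(2), 1.0):
        beta, energy, gain = rate_energy_gain(points, maxwell_boltzmann(points, lam))
        print(f"lam={lam:.3f}  beta={beta:.3f}  E={energy:.3f}  "
              f"G={10 * math.log10(gain):.3f} dB  "
              f"U={10 * math.log10(upper_envelope(beta)):.3f} dB")
```

Sweeping `lam` from zero upward traces a curve from its rightmost (uniform) point toward lower bit rates, which is how the curves of Fig. 2 are described as being generated.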

Note also that each curve tends to merge with the $U(\beta)$ curve. Comparing (16) to (9), we see that this merging implies that the shaping gain under nonuniform signaling approaches the ultimate limit of $\pi e/6$ as $\lambda$ becomes large, and that this limit is obtained independently of dimension.

Fig. 2 also illustrates the "law of diminishing returns" governing the biasing gain. Recall that the rightmost point of each curve ($\lambda = 0$) in the graph corresponds to uniform signaling. We see that, for constellations having dimension greater than unity, some "initial" gain is available under uniform signaling. Furthermore, the initial gain increases with increasing dimension. This initial gain is, of course, the shaping gain of the $N$-sphere under uniform signaling, which increases with $N$ and ultimately approaches the value $\pi e/6$. This forces the ultimate biasing gain (the difference between the ultimate shaping gain and the shaping gain of the $N$-sphere) to decrease with dimension.

Qualitatively, this law of diminishing returns arises due to a phenomenon known as the "sphere hardening effect" (see, e.g., [31]). In a many-dimensional sphere, almost all of the volume is located near the surface of the sphere; consequently, almost all constellation points lie near or on the surface as well. Since these points all have the same energy, i.e., the same cost, signaling with a Maxwell-Boltzmann distribution will cause these points to be selected equally often. Thus, uniform signaling with spherical constellations becomes increasingly effective as the dimension increases, ultimately approaching the performance of nonuniform signaling.

VI. CONTINUOUS APPROXIMATIONS

Let $\Omega(\Lambda, \mathbb{R})$ be a constellation obtained from the intersection of an $N$-D lattice $\Lambda$ (or a translate $a + \Lambda$ of $\Lambda$) with a finite $N$-D region $\mathbb{R}$. In [2], Forney and Wei were able to obtain much insight into the performance of such constellations via the so-called "continuous approximation," obtained by replacing discrete sums over the points of $\Omega$ with properly normalized integrals over the region $\mathbb{R}$. In this section we use the same approach to obtain similar insight in the case of nonuniform signaling, essentially by replacing discrete Maxwell-Boltzmann distributions with continuous Gaussian distributions, truncated to the region $\mathbb{R}$. The continuous approximation allows us to obtain a continuous approximation for the partition function $Z(\lambda)$, from which estimates of all other relevant system parameters are obtained. The result of Forney and Wei for uniform signaling [2] appears as a special case, obtained when the Maxwell-Boltzmann parameter $\lambda$ is set to zero.

A. Energy and Entropy Approximations

Let $f: \mathbb{R}^N \to \mathbb{R}$, a function of $N$ variables, be Riemann-integrable over an $N$-D region $\mathbb{R}$. Given an $N$-D lattice $\Lambda'$, we have

$$ \int_{\mathbb{R}} f(r)\,dV(r) = \lim_{\alpha \to 0} \sum_{r \in \Omega(\alpha\Lambda', \mathbb{R})} f(r)\,V(\alpha\Lambda'), \tag{17} $$

where $\Omega(\alpha\Lambda', \mathbb{R}) \triangleq \alpha\Lambda' \cap \mathbb{R}$, and $V(\alpha\Lambda')$ denotes the volume of a fundamental region of the scaled lattice $\alpha\Lambda'$. For small $\alpha$, we expect the summation on the right-hand side of (17) to be a good approximation to the integral on the left-hand side of (17). Setting $\Lambda = \alpha\Lambda'$ ($\alpha$ small), we obtain from (17) the general continuous approximation

$$ \sum_{r \in \Omega(\Lambda, \mathbb{R})} f(r) \approx V(\Lambda)^{-1} \int_{\mathbb{R}} f(r)\,dV(r). \tag{18} $$

For example, setting $f(r) = 1$ yields the approximation

$$ |\Omega(\Lambda, \mathbb{R})| = \sum_{r \in \Omega(\Lambda, \mathbb{R})} 1 \approx V(\mathbb{R})/V(\Lambda), \tag{19} $$

where $V(\mathbb{R})$ denotes the volume of the region $\mathbb{R}$. This is "Proposition 1" of Forney and Wei [2].

For nonuniform signaling with a Maxwell-Boltzmann distribution, the average energy and bit rate are determined by the partition function $Z(\lambda)$ (12). The continuous approximation (18) yields

$$ Z(\lambda) \approx V(\Lambda)^{-1} \int_{\mathbb{R}} \exp(-\lambda\|r\|^2)\,dV(r). \tag{20} $$

Combining approximation (20) with (13) we approximate $E$, the normalized average energy (4), by

$$ E(\mathbb{R}, \lambda) \triangleq \frac{2}{N} \int_{\mathbb{R}} \|r\|^2 f(r, \lambda)\,dV(r), \tag{21} $$

where

$$ f(r, \lambda) = \frac{\exp(-\lambda\|r\|^2)}{\int_{\mathbb{R}} \exp(-\lambda\|r\|^2)\,dV(r)}. \tag{22} $$

Note that $f(r, \lambda)$ represents a continuous Gaussian probability density function, truncated to the region $\mathbb{R}$. The continuous approximation (21) estimates the normalized average constellation energy by the normalized average energy of this continuous random variable.

In the same way, combining approximation (20) with (14), we approximate $\beta$, the normalized bit rate (3), by

$$ \beta(\mathbb{R}, \lambda) \triangleq \frac{2}{N}\left[H(\mathbb{R}, \lambda) - \log_2 V(\Lambda)\right], \tag{23} $$

where

$$ H(\mathbb{R}, \lambda) = -\int_{\mathbb{R}} f(r, \lambda) \log_2 f(r, \lambda)\,dV(r) $$

is the differential entropy of a continuous Gaussian random variable, truncated to the region $\mathbb{R}$. The continuous approximations (21) and (23) were also used by Forney and Wei [2, section IV-B], and by Forney in [6, section V].

B. Shaping Gain Approximation

We now use (21) and (23) to estimate the gain provided by nonuniform signaling. Writing $\beta(\mathbb{R}, \lambda)$ for $\beta$, and $E(\mathbb{R}, \lambda)$ for $E$, in the gain expression (8) yields the approximation $G \approx G(\mathbb{R}, \lambda)$ where

$$ G(\mathbb{R}, \lambda) \triangleq \frac{2^{\beta(\mathbb{R},\lambda)}\, d_{\min}^2}{6E(\mathbb{R}, \lambda)} \left(1 - 2^{-\beta(\mathbb{R},\lambda)}\right) = \left[\frac{d_{\min}^2}{V(\Lambda)^{2/N}}\right] \cdot \left[\frac{2^{(2/N)H(\mathbb{R},\lambda)}}{6E(\mathbb{R}, \lambda)}\right] \times \left(1 - 2^{-\beta(\mathbb{R},\lambda)}\right). \tag{24} $$
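To make the approximations (21), (23), and (24) concrete, the sketch below evaluates them numerically for the simplest case: a 1-D interval region $[-R, R]$ with the lattice $\mathbb{Z}$ (so $N = 1$, $V(\Lambda) = 1$, $d_{\min} = 1$), and compares them with the exact values for a PAM constellation on a translate of $\mathbb{Z}$. The midpoint-rule integration, the function names, and the choice $R = M/2$ are assumptions made for illustration only.

```python
import math


def continuous_estimates(radius, lam, steps=20000):
    """Midpoint-rule evaluation of (21), (23), (24) for the region
    [-radius, radius] with N = 1 and the lattice Z (V = 1, d_min = 1)."""
    h = 2 * radius / steps
    xs = [-radius + (k + 0.5) * h for k in range(steps)]
    w = [math.exp(-lam * x * x) for x in xs]
    norm = sum(w) * h                                        # integral in (20)
    f = [v / norm for v in w]                                # density (22)
    energy = 2 * sum(x * x * p for x, p in zip(xs, f)) * h   # (21), 2/N = 2
    entropy = -sum(p * math.log2(p) for p in f) * h          # H(R, lam)
    beta = 2 * entropy                                       # (23), log2 V = 0
    gain = (2 ** beta - 1) / (6 * energy)                    # (24)
    return energy, beta, gain


def discrete_pam(m, lam):
    """Exact normalized energy and bit rate of m-point PAM with unit spacing
    under a Maxwell-Boltzmann distribution."""
    pts = [k + 0.5 - m / 2 for k in range(m)]
    w = [math.exp(-lam * x * x) for x in pts]
    z = sum(w)
    p = [v / z for v in w]
    energy = 2 * sum(x * x * q for x, q in zip(pts, p))
    beta = -2 * sum(q * math.log2(q) for q in p)
    return energy, beta


if __name__ == "__main__":
    m = 8
    for lam in (0.0, 0.05, 0.1, 0.2):
        ec, bc, gc = continuous_estimates(m / 2, lam)
        ed, bd = discrete_pam(m, lam)
        print(f"lam={lam:.2f}  E: {ec:.4f} vs {ed:.4f}   beta: {bc:.4f} vs {bd:.4f}")
```

At `lam = 0` the continuous estimate of the bit rate is exact for this choice of radius, while the energy estimate is slightly larger than the true value.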

As in (9), we have grouped the gain $G(\mathbb{R}, \lambda)$ (24) into three factors: the first being the coding gain $\gamma_c(\Lambda) \triangleq d_{\min}^2/V(\Lambda)^{2/N}$ of the lattice $\Lambda$ [10], the second being a continuous approximation for the shaping gain, namely,

$$ \gamma_s(\mathbb{R}, \lambda) \triangleq \frac{2^{(2/N)H(\mathbb{R},\lambda)}}{6E(\mathbb{R}, \lambda)}, \tag{25} $$

and the third being the discretization factor

$$ \gamma_d \triangleq 1 - 2^{-\beta(\mathbb{R},\lambda)}. $$

The discretization factor was omitted in the analysis of Forney and Wei [2], who were interested in asymptotic ($\beta \to \infty$) limits for the shaping gain. However, for most practical values of $\beta$, this factor is not insignificant and should not be omitted. Indeed, as will become evident, including this factor provides accurate estimates for gain, even for relatively small values of $\beta$ (see Fig. 3 and Section VI-D).

We now estimate the biasing gain [7], i.e., the additional gain that nonuniform signaling can provide over uniform signaling with the same constellation. Setting $\lambda = 0$ reduces (25) to the special case of uniform signaling discussed in [2]. The shaping gain $\gamma_s(\mathbb{R}, \lambda)$ can be written in terms of $\gamma_s(\mathbb{R}, 0)$ as

$$ \gamma_s(\mathbb{R}, \lambda) = \gamma_s(\mathbb{R}, 0) \cdot \left[\frac{2^{(2/N)H(\mathbb{R},\lambda)}}{2^{(2/N)H(\mathbb{R},0)}}\right] \cdot \left[\frac{E(\mathbb{R}, 0)}{E(\mathbb{R}, \lambda)}\right] = \gamma_s(\mathbb{R}, 0) \cdot \left[\frac{2^{\beta(\mathbb{R},\lambda)}}{2^{\beta(\mathbb{R},0)}}\right] \cdot \left[\frac{E(\mathbb{R}, 0)}{E(\mathbb{R}, \lambda)}\right] = \gamma_s(\mathbb{R}, 0) \cdot 2^{-\rho(\mathbb{R},\lambda)} \cdot g_E(\mathbb{R}, \lambda), \tag{26} $$

where

$$ \rho(\mathbb{R}, \lambda) \triangleq \beta(\mathbb{R}, 0) - \beta(\mathbb{R}, \lambda) \tag{27} $$

is the normalized redundancy (or loss in bit rate) caused by selecting constellation points with a nonuniform distribution. Clearly, $\gamma_b(\mathbb{R}, \lambda)$, the total biasing gain, is given by the product of the second and third factors in (26), i.e.,

$$ \gamma_b(\mathbb{R}, \lambda) = 2^{-\rho(\mathbb{R},\lambda)} \cdot g_E(\mathbb{R}, \lambda). \tag{28} $$

In (28), we have identified two separate factors that characterize the biasing gain. The "energy savings factor"

$$ g_E(\mathbb{R}, \lambda) \triangleq E(\mathbb{R}, 0)/E(\mathbb{R}, \lambda) \ge 1, \tag{29} $$

accounts for the energy savings that result when constellation points of low energy are selected more often than points of large energy. Of course, selecting points with a nonuniform distribution results in a loss of entropy and hence a drop in the baseline average energy. The "energy loss factor" $2^{-\rho(\mathbb{R},\lambda)}$ accounts for this drop.

C. CER2 and PAR2 Approximations

We now provide continuous approximations for CER2 and PAR2, two constellation parameters defined in Section III. From (19), we estimate $|\Omega_2| \approx V(\mathbb{R}_2)/V(\Lambda_2)$ for the size of the constituent 2-D constellation, where $V(\mathbb{R}_2)$ is the volume (actually area) of $\mathbb{R}_2$, the constituent 2-D region of $\mathbb{R}$, and $V(\Lambda_2)$ is the fundamental volume (actually area) of $\Lambda_2$, the constituent 2-D lattice of $\Lambda$. Combining this with our approximation (23) for the bit rate, we obtain

$$ \mathrm{CER2}(\Omega) \approx \left[\frac{V(\Lambda)^{2/N}}{V(\Lambda_2)}\right] \cdot \left[\frac{V(\mathbb{R}_2)}{2^{(2/N)H(\mathbb{R},\lambda)}}\right]. $$

Here, as in [2], we identify two independent factors: the coding constellation expansion ratio of $\Lambda$, $\mathrm{CER2}_c(\Lambda) \triangleq V(\Lambda)^{2/N}/V(\Lambda_2)$, and the shaping constellation expansion ratio, $\mathrm{CER2}_s(\mathbb{R}, \lambda) \triangleq V(\mathbb{R}_2)/2^{(2/N)H(\mathbb{R},\lambda)}$. Again, as in the case of gain, one component, the coding constellation expansion factor, is a geometric property of the lattice $\Lambda$ and is unaffected by the probability distribution with which the constellation points are selected. The other component, the shaping constellation expansion ratio, depends both on the region $\mathbb{R}$ and the parameter $\lambda$.

When $\lambda = 0$, we obtain the special case of uniform signaling, where

$$ \mathrm{CER2}_s(\mathbb{R}, 0) = V(\mathbb{R}_2)/V(\mathbb{R})^{2/N}. $$

We may write $\mathrm{CER2}_s(\mathbb{R}, \lambda)$ in terms of $\mathrm{CER2}_s(\mathbb{R}, 0)$ as

$$ \mathrm{CER2}_s(\mathbb{R}, \lambda) = \mathrm{CER2}_s(\mathbb{R}, 0) \cdot 2^{\rho(\mathbb{R},\lambda)}, $$

where $\rho(\mathbb{R}, \lambda)$ (27) is the normalized redundancy under nonuniform signaling. We see that in addition to the constellation expansion due to uniform signaling, we have incurred an additional constellation expansion factor due to the loss in rate caused by nonuniform signaling. Thus, for a fixed constellation, $\mathrm{CER2}_s(\mathbb{R}, 0)$, the shaping constellation expansion ratio induced by uniform signaling, is a lower bound to $\mathrm{CER2}_s(\mathbb{R}, \lambda)$, the shaping constellation expansion ratio under nonuniform signaling.

To estimate $\mathrm{PAR2}(\Omega)$ (11), we note that the peak energy of the constituent 2-D constellation is a geometric property unaffected by the probability with which constellation points are selected. The average energy, on the other hand, is reduced from its value under uniform signaling by the energy savings factor $g_E(\mathbb{R}, \lambda)$; thus PAR2 is increased by the same factor, i.e.,

$$ \mathrm{PAR2}(\mathbb{R}, \lambda) = \mathrm{PAR2}(\mathbb{R}, 0)\, g_E(\mathbb{R}, \lambda), $$

where $\mathrm{PAR2}(\mathbb{R}, 0)$ denotes the PAR2 under uniform signaling.

D. Applying the Continuous Approximations

In applying these continuous approximations to actual constellations, one is confronted with a certain flexibility in the choice of approximating region $\mathbb{R}$. For example, to approximate the behavior of an $M$-point, symmetric PAM constellation based on $\mathbb{Z}$, one would choose $\mathbb{R} = [-R, R]$, a 1-D "sphere" of radius $R$; however, it is not clear which choice for radius $R$ is best. Indeed, the "best" choice for $R$ depends both on the constellation parameter (be it average energy, entropy, or whatever) that one is trying to estimate, and on the value of the Maxwell-Boltzmann parameter $\lambda$. The same flexibility in choice of sphere radius $R$ occurs when one attempts to approximate the behavior of an $N$-D spherical constellation with an $N$-sphere $B_N(R)$.

[Figure 3 plot: normalized gain $G/\gamma_c$ (dB) versus $\beta$ (bits/2-D channel use).]
Fig. 3. Comparison of the approximate and actual normalized gain of spherical constellations based on various N-dimensional lattices. The dotted curves are obtained by applying continuous approximations for gain.

Since the size, $|\Omega|$, of the constellation is assumed known, one approach is to choose the radius $R$ so as to match the bit rate estimate (23) at $\lambda = 0$ with the actual bit rate $(2/N)\log_2|\Omega|$ at $\lambda = 0$. In effect, this forces the volume of the region $\mathbb{R}$ to satisfy $V(\mathbb{R}) = |\Omega|V(\Lambda)$, so that (19) is satisfied with equality. From extensive numerical calculations, we have found that this approach gives very satisfactory estimates for bit rate and average energy, over a fairly wide range for $\lambda$.

Applying this approach to the $M$-PAM example, we find that $R = M/2$ and $E \approx M^2/6$. However, the actual energy $E = (M^2 - 1)/6$; thus, using the continuous approximation, we have incurred an error that is a factor of $(1 - M^{-2}) = 1 - 2^{-\beta}$. The discretization factor, $\gamma_d(\beta) = 1 - 2^{-\beta}$, can thus be interpreted as a correction factor used to adjust the average energy when applying the continuous approximation to the baseline constellations under uniform signaling.
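The $M$-PAM arithmetic above can be verified exactly with rational numbers. The sketch below is a minimal check, assuming unit point spacing and a constellation centered at the origin; the function names are illustrative and not taken from the paper.

```python
from fractions import Fraction


def pam_actual_energy(m):
    """Normalized energy E = (2/N) * mean(r^2), N = 1, of uniform M-PAM
    with unit spacing, centered at the origin."""
    pts = [Fraction(2 * k + 1 - m, 2) for k in range(m)]
    return 2 * sum(x * x for x in pts) / m


def pam_continuous_energy(m):
    """Same quantity for a uniform density on [-R, R] with R = M/2:
    E = 2 * R^2 / 3."""
    radius = Fraction(m, 2)
    return 2 * radius * radius / 3


if __name__ == "__main__":
    for m in (2, 4, 8, 16):
        actual = pam_actual_energy(m)
        approx = pam_continuous_energy(m)
        # The ratio is the discretization factor 1 - M^(-2) = 1 - 2^(-beta).
        print(m, actual, approx, actual / approx)
```

The ratio of actual to approximate energy tends to one quickly as $M$ grows, which is consistent with the factor mattering mainly at small bit rates.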
E. Spherical and Cubic Constellations

Continuous approximations for the various constellation parameters are derived in Appendix B for the important case in which the region $\mathbb{R} = B_N(R)$, an $N$-ball of radius $R$ centered at the origin. These include cubic constellations, which are Cartesian products of 1-D "spheres," as a special case.

In order to compare the shaping gain predicted by these continuous approximations to actual shaping gain, we have plotted in Fig. 3 normalized gain curves for spherical constellations $\Omega$ in various dimensions. As in Fig. 2, the gain values are normalized by dividing the total gain by the coding gain of the lattice from which the constellation is drawn. Solid curves represent the actual normalized gain $G/\gamma_c$, computed from (8). The dotted curves give $(|\Omega|^{2/N}\, 2^{-\rho(B_N(R),\lambda)} - 1)/[6E(B_N(R), \lambda)]$, as defined in Appendix B. The radius $R$ in each case is chosen so that $V(B_N(R))/V(\Lambda) = |\Omega|$. Also plotted in Fig. 3 is the function $U(\beta)$ (16), which represents the "upper envelope" shown in the figure.

We see that the curves corresponding to the approximate normalized gain closely match the actual normalized gain curves for all values of $\beta \ge 2$, although some difference is seen for small $\beta$. For large $\beta$, however, the curves corresponding to the approximation coincide with the actual normalized gain curves, confirming the asymptotic accuracy of the continuous approximation.

As pointed out in Appendix B, for large bit rates, the shaping gain under nonuniform signaling approaches $\pi e/6$, independently of dimension. The ultimate biasing gain approaches $\pi e/[6\gamma_{\circledast}(N)]$, where $\gamma_{\circledast}(N)$ (39) denotes the shaping gain of the $N$-sphere under uniform signaling. Since $\gamma_{\circledast}(N) \to \pi e/6$ monotonically from below as $N \to \infty$, $\gamma_b$ approaches unity as the dimension increases, thus confirming the "law of diminishing returns" discussed at the end of Section V.

Curves showing the trade-offs between $\mathrm{CER2}_s$ and shaping gain or between PAR2 and shaping gain for $N$-D spherical constellations are easily obtained from the continuous approximations derived in Appendix B. However, as asserted by Forney and Wei [2], the best possible trade-offs are achieved by 2-D spherical constellations, i.e., by regions shaped as discs in two dimensions. Recall that Cartesian products of basic regions achieve the same performance as the basic region itself. Thus, the best region $\mathbb{R}$ for use with nonuniform signaling in $2n$ dimensions is the $n$-fold Cartesian product of a 2-D disc (a so-called polydisc [32]), because this region will achieve a given value of shaping gain with least $\mathrm{CER2}_s$ and PAR2. Thus, while nonuniform signaling will always cause a constellation expansion relative to uniform signaling with the same constellation, this constellation expansion is never greater and usually less than would be required under uniform signaling to achieve the same shaping gain.

VII. SHAPING WITH BINARY PREFIX CODES

In this section, we study methods of achieving nonuniform signaling schemes for the transmission of binary data. Assuming, as usual, that we wish to transmit the output of a memoryless binary equiprobable source, the most obvious means of generating events with nonuniform probabilities is to parse the output of the source into codewords of variable length. Since the probability of occurrence of a codeword of length $l_i$ is $2^{-l_i}$, shorter (more frequently occurring) codewords may be mapped to constellation points with low energy and longer (less frequently occurring) codewords may be mapped to points with high energy and, in this way, shaping gain may be achieved. This approach was suggested by a brief example in [13] and is discussed in greater detail in [14].

To ensure unique and complete parsing, it can be shown (e.g., [33, p. 297]) that the variable length binary codewords must form a complete binary prefix code, in which the $M$ codeword lengths $l_i$, $i = 1, \ldots, M$, satisfy

$$ \sum_{i=1}^{M} 2^{-l_i} = 1. \tag{30} $$

Although we will not always explicitly refer to them as such, all prefix codes considered in this paper are complete.

A. Matched Codes

The simplest example of the idea of mapping a prefix code to a constellation is probably the following. The output of a binary equiprobable memoryless source (the data source) is parsed into a sequence of blocks drawn from the set

$\{0, 10, 11\}$ and these output blocks are mapped onto a PAM constellation as shown in Fig. 4. Since $P[0] = \tfrac{1}{2}$ and $P[10] = P[11] = \tfrac{1}{4}$, this scheme has an (average) bit rate of $1\tfrac{1}{2}$ bits/T. This three-level scheme is quite similar to a partial-response scheme, but since any level is available for use during any signaling interval (i.e., different transmitted symbols are independent), the data rate is greater. If we place two such 1-D schemes in quadrature, we obtain the 2-D scheme shown in Fig. 5, which achieves an (average) bit rate of 3 bits/T. This 2-D scheme provides a shaping gain of $4/3 = 1.25$ dB, with a CER2 of $9/8 = 1.125$ and a PAR2 of 2. The overall gain (including the discretization factor $\gamma_d$) is $7/6 = 0.67$ dB.

[Fig. 4 point labels, left to right: 10, 0, 11.]
Fig. 4. A simple nonuniform PAM scheme.

[Fig. 5 point labels by row: 1010 010 1110 / 100 00 110 / 1011 011 1111.]
Fig. 5. A nonuniform QAM scheme. $\beta = 3$.

The use of a binary prefix code will not, in general, produce an optimal nonuniform scheme unless the constellation is "matched" to the code. A constellation $\Omega$ is said to be matched to some binary prefix code if, for some $\lambda \ge 0$, a Maxwell-Boltzmann distribution with parameter $\lambda$ induces probabilities on the constellation points that are all integral powers of two. This means that for some $\lambda \ge 0$,

$$ \frac{\exp(-\lambda\|r\|^2)}{Z(\lambda)} = 2^{-l(r)} \tag{31} $$

for all $r \in \Omega$, where $l(r)$ is a positive integer. The matching condition (31) is trivially satisfied when $\lambda = 0$ by any constellation of size $2^l$. However, for positive $\lambda$, the matching condition is strong, and we expect relatively few constellations to satisfy it.

Note that, in Fig. 5, each constellation point conveys either two, three or four bits (with the outer points conveying more bits than the inner points). In particular, each point conveys at least two bits. This implies that we may consider this scheme to consist of a fixed-rate "primary" channel, conveying two bits per symbol, and a variable-rate "secondary" channel, conveying an average of one bit per symbol. In general, a binary prefix code with codeword lengths $\{l_1 \le l_2 \le \cdots \le l_M\}$ assigned to an $N$-D constellation of size $M$ will produce a signaling scheme with an overall normalized average bit rate $\beta = (2/N)\sum_{i=1}^{M} l_i 2^{-l_i}$. The fixed primary channel rate is $\beta_p = 2l_1/N$, while the variable secondary channel rate is $\beta_s = \beta - \beta_p$. It is quite possible to have $\beta_s > \beta_p$, so the names primary and secondary do not necessarily refer to relative bit rates. In most practical circumstances, we will select $\beta_s < \beta_p$. Note also that since the primary channel operates at a fixed rate, it can operate as a "standard" channel, and is not affected by the system problems associated with variable rate transmission. The "opportunistic secondary channels" of [2] and the "in-band coding method" of [21] (see also [17]-[20]) are examples of nonuniform signaling schemes that use prefix codes to separate data into primary and secondary channels, although these schemes are not described in terms of prefix codes.

Other examples of 2-D constellations matched to prefix codes are shown in Figs. 6 and 7. The scheme of Fig. 6 was used by Forney and Wei [2, Fig. 7(b)] to illustrate the notion of an opportunistic secondary channel, while the scheme of Fig. 7 is based on 7 points of lowest energy in the 2-D hexagonal lattice $A_2$. A limited search turned up several additional examples in higher dimensions. For example, the constellation containing the origin and the first shell of the lattice $D_4$ in four dimensions has theta series $1 + 24x$ and is matched to a binary prefix code having one codeword with two bits and 24 codewords with five bits.

[Fig. 6 point labels by row: 1000 1100 / 1001 000 010 1101 / 1011 001 011 1111 / 1010 1110.]
Fig. 6. A nonuniform QAM scheme. $\beta = 3.5$.

[Fig. 7 point labels by row: 100 011 / 101 00 010 / 111 110.]
Fig. 7. A nonuniform QAM scheme. $\beta = 2.75$.

B. Huffman Codes

Although, in general, a complete binary prefix code will not match a constellation in the sense of (31), we nevertheless expect prefix codes to provide shaping gain. Given an $N$-D constellation $\Omega$ of size $|\Omega|$, in which the $i$th point $r_i$ has energy $\|r_i\|^2$, $1 \le i \le |\Omega|$, the optimal (gain-maximizing) complete binary prefix code with codeword lengths $l_i$, $1 \le i \le |\Omega|$, would maximize the quantity $f = (2^\beta - 1)/E$, where $\beta = (2/N)\sum_i l_i 2^{-l_i}$ and $E = (2/N)\sum_i \|r_i\|^2 2^{-l_i}$, subject to the constraint (30). Unfortunately, short of searching all complete binary prefix codes with $|\Omega|$ codewords, we know of no general method for finding the optimal prefix code.

Rather than attempting to find the optimal code, we have taken the approach of finding approximations to the optimal

Maxwell-Boltzmann distribution with distributions in which all probabilities are positive integer powers of 1/2. (Stubley and Blake consider a more general matching problem in [34].) We refer to such approximations as "dyadic approximations" to the Maxwell-Boltzmann distribution. To find the "best" dyadic approximation, we need a measure of distance between the Maxwell-Boltzmann distribution and its dyadic approximation. A commonly used measure of distance between two probability distributions $P$ (with probability masses $\{p_1, \ldots, p_M\}$), and $Q$ (with probability masses $\{q_1, \ldots, q_M\}$) is the relative entropy of $P$ with respect to $Q$:

$$ D(P, Q) \triangleq \sum_{i=1}^{M} p_i \log_2 \frac{p_i}{q_i}. $$

[Figure 8 plot: gain $G$ (dB) versus $\beta$ (bits/2-D channel use).]
Fig. 8. Illustrating the performance of dyadic approximations to the Maxwell-Boltzmann distribution obtained from the Huffman procedure.
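As a small illustration of this distance measure, the sketch below computes the relative entropy between a Maxwell-Boltzmann distribution and one fixed dyadic distribution, using the nine-point scheme of Fig. 5 as the example. The function names and the particular values of $\lambda$ are assumptions chosen for illustration; the point energies are those of the nine lowest-energy points of $\mathbb{Z}^2$.

```python
import math


def relative_entropy(p, q):
    """D(P, Q) = sum_i p_i * log2(p_i / q_i), in bits."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def maxwell_boltzmann(energies, lam):
    """Maxwell-Boltzmann probabilities for points with the given energies."""
    w = [math.exp(-lam * e) for e in energies]
    z = sum(w)
    return [v / z for v in w]


# Squared norms of the nine lowest-energy points of Z^2, and the codeword
# lengths of the Fig. 5 labeling: 2 bits at the origin, 3 bits on the first
# shell, 4 bits on the second shell.
energies = [0] + [1] * 4 + [2] * 4
lengths = [2] + [3] * 4 + [4] * 4
dyadic = [2.0 ** -l for l in lengths]

if __name__ == "__main__":
    for lam in (0.0, 0.35, math.log(2), 1.0):
        d = relative_entropy(maxwell_boltzmann(energies, lam), dyadic)
        print(f"lam = {lam:.3f}: D(P, Q) = {d:.4f} bits")
```

The distance vanishes at $\lambda = \ln 2$, where the Maxwell-Boltzmann probabilities are exactly $1/4$, $1/8$, and $1/16$, which is the matched case of (31); for other values of $\lambda$ it is strictly positive.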
When $Q$ is dyadic, so that $q_i = 2^{-l_i}$,

$$ D(P, Q) = \sum_{i=1}^{M} l_i p_i - H(P) \tag{32} $$

where $H(P) \triangleq -\sum_i p_i \log_2 p_i$ is the entropy of $P$.

From the point of view of source coding, $D(P, Q)$ represents the redundancy of a source code used to represent the output of a discrete memoryless source with alphabet of size $M$ and distribution $P$. As is well known, the redundancy (32) is minimized by the Huffman procedure [35]. Furthermore, the Huffman procedure always results in a complete prefix code. The existence of an algorithm for minimizing $D(P, Q)$ is our primary motivation for choosing this particular measure; indeed, other measures may be more naturally suited to the problem. Nevertheless, as will become evident, this approach of minimizing $D(P, Q)$ leads to excellent gain values that can be made to approach the ultimate shaping gain.

To illustrate their performance, we have computed dyadic approximations to the Maxwell-Boltzmann distribution for two constellations based on $\mathbb{Z}^2$. The two constellations were chosen quite arbitrarily: one consists of the 21 points of least energy in $\mathbb{Z}^2$; the other consists of the 121 points of least energy. The results are illustrated in Fig. 8, and were obtained by varying the Maxwell-Boltzmann parameter $\lambda$ from zero through positive values. As expected, the dyadic approximations (marked with a triangle for the 121-point constellation and a square for the 21-point constellation) have lower gain values than those obtained from the optimal Maxwell-Boltzmann distribution (the solid curves); however, the gain values do follow the general trends obtained for the Maxwell-Boltzmann distribution. Due to the flexibility afforded by having a larger number of points, the larger constellation has a greater number of different dyadic approximations to the Maxwell-Boltzmann distribution.

The fact that these dyadic approximations follow the same general trends obtained for the Maxwell-Boltzmann distribution suggests the following algorithm for designing prefix codes to achieve shaping gain. Given an $N$-D constellation, we proceed as follows.

1) From the constellation theta series, we numerically determine the value of the Maxwell-Boltzmann parameter $\lambda$ that maximizes gain $G$ (8). Call this value $\lambda_{\mathrm{opt}}$.
2) Using $\lambda_{\mathrm{opt}}$, we generate a list of Maxwell-Boltzmann probabilities $p_i = p(r_i) = \exp(-\lambda_{\mathrm{opt}}\|r_i\|^2)/Z(\lambda_{\mathrm{opt}})$ for all $r_i \in \Omega$.
3) We apply the Huffman procedure to the list $p_i$ to obtain a complete binary prefix code. The performance of this code is then evaluated.

Note that, in general, the Huffman procedure does not result in a unique code. We have chosen the version of the Huffman procedure, described in [36, p. 68], that results in least variation among the codeword lengths. Note also that we have chosen $\lambda_{\mathrm{opt}}$ to maximize total gain. Since the coding gain is fixed for a given lattice, this is equivalent to maximizing the product $\gamma_s\gamma_d$. It is important to note that this is not equivalent to maximizing the shaping gain $\gamma_s$ since this, in principle, can be accomplished by making $\lambda$ arbitrarily large.

We have applied this procedure to spherical constellations based on the 2-D $\mathbb{Z}^2$ lattice and its translate $(\mathbb{Z} + 1/2)^2$, as well as the 2-D hexagonal lattice $A_2$. The results are given in Tables I-III. Each constellation $\Omega$ consists of the $|\Omega|$ points of least energy drawn from the corresponding infinite lattice (or translate). Shown in each table are the primary and secondary bit rates ($\beta_p$ and $\beta_s$) obtained from the Huffman code. Each table lists the parameter $\gamma_s\gamma_d$, obtained by dividing the total gain of the signaling scheme by the coding gain of the lattice upon which it is based. The 2-D peak-to-average energy ratio PAR2 and the 2-D constellation expansion ratio CER2 are listed. In addition, the parameter $N_{\mathrm{eff}}$, the "effective dimension," is listed. We define the effective dimension of a shaping scheme to be the smallest dimension $N$ for which the shaping gain of an $N$-sphere $\gamma_{\circledast}(N)$ (39) (properly multiplied by $\gamma_d$) meets or exceeds the shaping gain provided by the scheme in question.

As can be seen from the tables, very satisfactory shaping gain values, with effective dimensions numbering in the hundreds of dimensions, are obtained from these Huffman prefix codes. The shaping gains obtained from these Huffman codes seem to be the highest ever reported, exceeding those reported in [8, Table IV]. As previously noted, however, maximization

TABLE I
PERFORMANCE OF HUFFMAN-CODED SIGNAL CONSTELLATIONS BASED ON $\mathbb{Z}^2$

$|\Omega|$  $\beta_p$  $\beta_s$  $\gamma_s\gamma_d$ (dB)  $N_{\mathrm{eff}}$  PAR2  CER2
  9   2   1.000   0.669    47   2.000   1.125
 13   2   1.125   0.834    95   3.765   1.490
 21   3   0.688   1.030   111   3.200   1.630
 25   3   0.906   1.039    80   4.357   1.667
 29   3   1.078   1.069    78   4.347   1.717
 37   3   1.359   1.172   120   4.025   1.803
 45   3   1.641   1.39    159   4.333   1.804
 49   3   1.656   1.249   173   5.285   1.943
 57   3   1.719   1.263   189   5.386   2.165
 61   4   0.922   1.259   139   4.923   2.012
 69   4   1.020   1.275   149   5.120   2.127
 81   4   1.281   1.289   135   5.327   2.083
 89   4   1.291   1.301   152   5.517   2.273
 97   4   1.389   1.288   124   5.724   2.315
101   4   1.667   1.285   103   5.182   1.988
109   4   1.708   1.301   115   5.368   2.085
113   4   1.710   1.303   117   5.679   2.159
121   4   1.780   1.318   129   5.573   2.202
129   4   1.784   1.323   136   6.015   2.341
137   5   1.007   1.340   145   5.291   2.131
145   5   1.077   1.359   171   5.550   2.148
149   5   1.078   1.360   174   6.042   2.205
161   5   1.119   1.367   184   5.998   2.316
169   5   1.168   1.369   184   6.031   2.350
177   5   1.256   1.373   186   5.784   2.316
185   5   1.304   1.377   190   6.124   2.341
193   5   1.319   1.379   194   6.376   2.417
197   5   1.336   1.379   192   6.611   2.438

TABLE II
PERFORMANCE OF HUFFMAN-CODED SIGNAL CONSTELLATIONS BASED ON $(\mathbb{Z} + 1/2)^2$

$|\Omega|$  $\beta_p$  $\beta_s$  $\gamma_s\gamma_d$ (dB)  $N_{\mathrm{eff}}$  PAR2  CER2
 12   2   1.000   0.670    47   2.500   1.500
 16   2   1.375   0.757    37   3.429   1.542
 24   3   0.844   1.045    93   3.714   1.672
 32   3   1.094   1.136   130   4.121   1.874
 44   4   0.734   1.123    58   3.791   1.653
 52   4   0.820   1.190    85   4.199   1.841
 60   4   0.969   1.232   104   4.862   1.916
 68   4   1.086   1.253   113   4.979   2.002
 76   4   1.266   1.289   138   4.848   1.976
 80   4   1.395   1.320   172   4.851   1.902
 88   4   1.412   1.334   202   5.198   2.067
 96   4   1.432   1.341   219   5.911   2.224
112   4   1.691   1.340   175   5.239   2.167
120   4   1.828   1.353   185   5.358   2.112
124   5   0.945   1.376   232   5.502   2.012
140   5   0.955   1.385   268   5.747   2.257
148   5   1.083   1.397   293   5.514   2.183
156   5   1.157   1.406   328   5.716   2.186
164   5   1.166   1.410   351   5.921   2.284
172   5   1.223   1.414   364   6.146   2.303
180   5   1.224   1.417   384   6.369   2.408
188   5   1.232   1.418   390   6.558   2.500
192   5   1.245   1.416   373   6.719   2.531
208   5   1.378   1.398   247   6.297   2.501
216   5   1.449   1.389   209   6.554   2.473
232   5   1.511   1.385   193   6.633   2.543
240   5   1.654   1.389   192   6.172   2.383
248   5   1.726   1.398   209   6.041   2.343
256   5   1.733   1.400   214   6.171   2.407
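The three-step Huffman design procedure that produces entries such as those in Table I can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: $\lambda_{\mathrm{opt}}$ is found by a coarse grid search over direct point enumeration rather than from the theta series, the Huffman routine breaks ties by insertion order (which need not coincide with the minimum-variance variant of [36]), the lattice is assumed to have $d_{\min} = 1$ and $V(\Lambda) = 1$, and all function names are hypothetical.

```python
import heapq
import math
from itertools import count, product


def mb_probs(points, lam):
    """Maxwell-Boltzmann probabilities exp(-lam * ||r||^2) / Z(lam)."""
    w = [math.exp(-lam * sum(c * c for c in p)) for p in points]
    z = sum(w)
    return [v / z for v in w]


def rate_energy_gain(points, probs):
    """Normalized bit rate, energy, and gamma_s * gamma_d = (2^beta - 1) / (6 E),
    assuming d_min = 1 and V(Lambda) = 1."""
    n = len(points[0])
    beta = -(2 / n) * sum(p * math.log2(p) for p in probs if p > 0)
    energy = (2 / n) * sum(p * sum(c * c for c in pt)
                           for p, pt in zip(probs, points))
    return beta, energy, (2 ** beta - 1) / (6 * energy)


def huffman_lengths(probs):
    """Codeword lengths of a binary Huffman code (ties broken by insertion order)."""
    tick = count()
    heap = [(p, next(tick), [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    while len(heap) > 1:
        p1, _, a = heapq.heappop(heap)
        p2, _, b = heapq.heappop(heap)
        for i in a + b:  # every leaf under the merged node moves one level deeper
            lengths[i] += 1
        heapq.heappush(heap, (p1 + p2, next(tick), a + b))
    return lengths


def design(points, lam_grid):
    # Step 1: grid search for the lambda that maximizes gain.
    lam_opt = max(lam_grid,
                  key=lambda lam: rate_energy_gain(points, mb_probs(points, lam))[2])
    # Step 2: Maxwell-Boltzmann probabilities at lam_opt.
    probs = mb_probs(points, lam_opt)
    # Step 3: Huffman procedure, then evaluate the dyadic distribution it induces.
    lengths = huffman_lengths(probs)
    dyadic = [2.0 ** -l for l in lengths]
    return lam_opt, lengths, rate_energy_gain(points, dyadic)


if __name__ == "__main__":
    nine = list(product((-1, 0, 1), repeat=2))  # 9 lowest-energy points of Z^2
    lam_opt, lengths, (beta, energy, gain) = design(nine, [k / 100 for k in range(301)])
    print(f"lam_opt ~ {lam_opt:.2f}, lengths = {sorted(lengths)}")
    print(f"beta = {beta:.3f}, gain = {10 * math.log10(gain):.3f} dB")
```

For the nine-point constellation this sketch recovers codeword lengths of two, three, and four bits, which agrees with the first row of Table I ($\beta_p = 2$, $\beta_s = 1$, $\gamma_s\gamma_d = 0.669$ dB).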

of shaping gain $\gamma_s$ itself is not our aim; rather, we have attempted to maximize the combination $\gamma_s\gamma_d$. Further numerical calculations obtained from dyadic approximations to the Maxwell-Boltzmann distribution with parameter $\lambda > \lambda_{\mathrm{opt}}$ show that, in some cases, the effective dimension $N_{\mathrm{eff}}$ can be made to increase significantly above the values shown in Tables I-III. Note also that PAR2 and CER2 values shown in Tables I-III are all quite reasonable, especially when compared to the PAR2 and CER2 of large-dimensional Voronoi constellations [6]. The PAR2 and CER2 can, in principle, be improved by sacrificing some shaping gain. Indeed, numerical calculations show that dyadic approximations to the Maxwell-Boltzmann distribution with parameter $\lambda < \lambda_{\mathrm{opt}}$ will, in general, result in improved PAR2 and CER2, with some corresponding sacrifice in overall gain. We have also applied this procedure to multidimensional constellations, with similar results. However, since many multidimensional constellations are best implemented as coset codes (see [10] and [11]), it may be preferable to use a nonuniform signal point selection scheme that is suited for a coset code (as described in the next section) rather than a direct mapping of the words of a prefix code onto the constellation points.

It follows from standard arguments in information theory (e.g., [1, Section 5.4]) that the redundancy of the optimal code for a discrete memoryless source can be made to approach zero by considering Cartesian products of the source. This implies that the relative entropy between the Maxwell-Boltzmann distribution and its dyadic approximation can be made to approach zero by considering Cartesian products of the basic constellation. Convergence in relative entropy implies $L_1$ convergence of the probabilities [1, Section 12.6]; hence, the performance obtained from our dyadic approximations can be made to approach arbitrarily closely the performance obtained by using the optimal Maxwell-Boltzmann distribution. Numerical calculations confirm the performance improvement obtained by working with Cartesian products of the basic constellations.

VIII. CODED NONUNIFORM SIGNALING

In this section we study how optimal nonuniform signaling fits into the general framework of coset codes, first introduced by Calderbank and Sloane [12] and extensively studied by Forney [10], [11].

A. Memoryless Signal Point Selectors

A signaling scheme based on a coset code has two components as shown in Fig. 1. A coset code C, based on the

partitioning of a lattice $\Lambda$ (possibly translated by some constant vector) into the cosets of a sublattice $\Lambda'$, produces a sequence of sets of channel symbols, drawn from the alphabet of the cosets of $\Lambda'$ in $\Lambda$. The actual transmitted constellation point is determined by the signal point selector $\mathbb{S}$. As discussed in Section II, both the coset code $\mathbb{C}$ and the signal point selector $\mathbb{S}$ contribute to the transmission of data. It is important to note that, as nonuniform signaling is a shaping technique, for the schemes we propose only the signal point selector $\mathbb{S}$ is affected. The coset code $\mathbb{C}$ is unchanged relative to well-known schemes such as those of Ungerboeck [37].

TABLE III
PERFORMANCE OF HUFFMAN-CODED SIGNAL CONSTELLATIONS BASED ON $A_2$

$|\Omega|$   $\beta_p$   $\beta_s$   $\gamma_s\gamma_d$ (dB)   $N_{\mathrm{eff}}$   PAR2    CER2
  7     2    0.750    0.423     27    1.333   1.041
 13     2    1.031    0.732     62    3.429   1.590
 19     2    1.453    0.884     62    3.413   1.735
 31     3    1.187    1.047     59    3.584   1.701
 37     3    1.477    1.142     81    3.815   1.662
 43     3    1.668    1.242    157    4.531   1.691
 55     3    1.754    1.292    268    4.668   2.038
 61     4    1.006    1.304    214    4.808   1.899
 73     4    1.049    1.324    273    5.562   2.205
 85     4    1.449    1.286    117    4.583   1.946
 91     4    1.473    1.296    126    5.376   2.049
 97     4    1.557    1.314    143    5.496   2.061
109     4    1.695    1.330    155    5.186   2.104
121     4    1.766    1.337    161    5.472   2.224
127     5    0.851    1.344    163    5.992   2.201
139     5    0.912    1.351    171    5.907   2.308
151     5    1.078    1.368    191    5.564   2.236
163     5    1.159    1.379    211    5.809   1.282
169     5    1.167    1.380    214    6.448   2.352
187     5    1.280    1.379    199    6.082   2.407
199     5    1.441    1.386    200    5.772   2.291
211     5    1.487    1.393    215    6.134   2.352
223     5    1.526    1.396    222    6.395   2.420

The simplest type of signal point selector is memoryless, or time invariant. When $\mathbb{S}$ is memoryless, each time a coset of $\Lambda'$ is made available to $\mathbb{S}$, the subset from which the constellation point is selected is the same, and the choice is made independently. For example, the signal point selector could always choose from the $K$ points of least norm in each coset. For a block coset code $\mathbb{C}$ based on a 2-D lattice $\Lambda$, this would result in a polydisc-shaped constellation. Cubic constellations are achieved if $\mathbb{S}$ always selects from a square-shaped region in each coset.

More complicated signal point selectors have memory, i.e., they are time-varying. To achieve generalized cross constellations [2], Voronoi constellations [6], or indeed constellations based on any region that is not a Cartesian product of lower-dimensional regions, the signal point selector must be time-varying. The block shaping codes of Calderbank and Ozarow [7] and the trellis shaping codes of Forney [8] are examples of time-varying signal point selectors. For coset codes based on 2-D lattices, and assuming uniform signaling, time-varying signal point selectors are necessary to achieve shaping gains that exceed the shaping gain of a 2-D disc. Indeed, the best possible shaping gain in $N$-space is achieved by an $N$-sphere, a region not decomposable as a Cartesian product of lower-dimensional regions.

As discussed in Section VI, under nonuniform signaling, the best regions with which to shape a constellation are polydiscs, as these achieve a given shaping gain with least shaping constellation expansion ratio $\mathrm{CER2}_s$ and least peak-to-average energy ratio PAR2. Since polydiscs are by definition a product of 2-D discs, polydisc-shaped constellations can be achieved by a memoryless signal point selector combined with a coset code based on a 2-D lattice. We focus our attention, therefore, on memoryless nonuniform signal point selectors.

B. Coset Codes

The coset codes considered in this paper are based on any $L$-way partition $\Lambda/\Lambda'$ of an $N$-D lattice $\Lambda$ into the $L$ cosets of a sublattice $\Lambda'$. We focus our attention on binary coset codes, where $L = 2^{k+r}$, although generalization to nonbinary coset codes is straightforward.

Let $C$ be a binary rate-$k/(k+r)$ encoder that takes in $k$ bits per $N$ dimensions and puts out $k + r$ coded bits. These coded bits can be used to select one of the $2^{k+r}$ cosets of $\Lambda'$ in $\Lambda$. The resulting coset code is denoted $\mathbb{C}(\Lambda/\Lambda'; C)$. When the binary encoder $C$ is a block code, the coset code $\mathbb{C}(\Lambda/\Lambda'; C)$ defines a finite-dimensional sphere packing; often this sphere packing is actually a lattice. When $C$ is a convolutional code, the resulting coset code is a trellis code.

Let us denote a set of $L = |\Lambda/\Lambda'| = 2^{k+r}$ coset leaders of the cosets of $\Lambda'$ in $\Lambda$ by $\{c_1, c_2, \ldots, c_L\}$. Each time the memoryless signal point selector $\mathbb{S}$ is presented with the $i$th coset $c_i + \Lambda'$, it selects some point for transmission. The set of all possible points drawn from the $i$th coset forms the $i$th constellation $\Omega_i$.

Given that $\mathbb{S}$ is presented with the $i$th coset, if a point $r_i \in c_i + \Lambda'$ is selected with probability $p(r_i)$, then we can determine a normalized average bit rate $\beta_i = -2\sum_{r_i \in \Omega_i} p(r_i)\log_2[p(r_i)]/N$ and a normalized average energy $E_i = 2\sum_{r_i \in \Omega_i} p(r_i)\|r_i\|^2/N$ for the $i$th constellation. If the coset code $\mathbb{C}$ selects the $i$th coset with probability $P[i]$ (and usually $P[i] = 1/L$), then the normalized average number of bits taken in by the signal point selector $\mathbb{S}$ is $\kappa(\mathbb{S}) = \sum_{i=1}^{L} P[i]\beta_i$, and the normalized average energy is $E = \sum_{i=1}^{L} P[i]E_i$. Since the code $\mathbb{C}$ takes in $k$ bits per $N$ dimensions, or $\kappa(\mathbb{C}) = 2k/N$ bits per two dimensions, the overall normalized bit rate is $\beta = \kappa(\mathbb{C}) + \kappa(\mathbb{S})$. If the coset code $\mathbb{C}$ has minimum squared Euclidean distance $d^2_{\min}$, then, from (8), the gain $G$ of the coded modulation scheme may be written as

$$G = \gamma_c(\mathbb{C})\,\gamma_s(\mathbb{S})\,\gamma_d(\beta).$$

Here, as in (9), we have separated the total gain into the product of the coding gain $\gamma_c(\mathbb{C})$ of the coset code $\mathbb{C}$ [10], the overall shaping gain $\gamma_s(\mathbb{S})$ provided by the signal point selector $\mathbb{S}$, and a discretization factor $\gamma_d(\beta)$.
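As a numerical illustration of the rate, energy, and gain definitions in this subsection, the following Python sketch evaluates them for the 12-point $\mathbb{Z}^2/2\mathbb{Z}^2$ scheme described in Section VIII-D (Fig. 9). Several ingredients are assumptions made for illustration and are not stated in this section: the helper names, the PAM-baseline form $G = d^2_{\min}(2^{\beta} - 1)/(6E)$, the value $d^2_{\min} = 4$ (chosen to agree with the 3.01-dB coding gain quoted for the four-state code), and the discretization factor $\gamma_d(\beta) = 1 - 2^{-\beta}$.

```python
import math

def selector_stats(points, probs, n_dims=2):
    """Return (beta_i, E_i): bit rate and energy per two dimensions for one coset."""
    beta_i = -2.0 * sum(p * math.log2(p) for p in probs) / n_dims
    energy_i = 2.0 * sum(p * sum(c * c for c in r)
                         for r, p in zip(points, probs)) / n_dims
    return beta_i, energy_i

def to_db(x):
    return 10.0 * math.log10(x)

# Dyadic probabilities induced by the prefix code {0, 10, 11}.
probs = [0.5, 0.25, 0.25]

# Four cosets of 2Z^2 in Z^2 + (1/2, 1/2): one inner and two outer points each.
cosets = [[(sx * 0.5, sy * 0.5), (-sx * 1.5, sy * 0.5), (sx * 0.5, -sy * 1.5)]
          for sx in (1, -1) for sy in (1, -1)]

stats = [selector_stats(pts, probs) for pts in cosets]
coset_prob = 1.0 / len(cosets)                    # P[i] = 1/L
kappa_S = sum(coset_prob * b for b, _ in stats)   # bits taken in by the selector
energy = sum(coset_prob * e for _, e in stats)    # normalized average energy E

k, n_dims = 1, 2                                  # rate-1/2 encoder, 2-D lattice
kappa_C = 2.0 * k / n_dims                        # bits per two dimensions
beta = kappa_C + kappa_S                          # overall normalized bit rate

d2_min = 4.0                                      # assumption, see lead-in
gain = d2_min * (2.0 ** beta - 1.0) / (6.0 * energy)  # assumed PAM-baseline form
gamma_c = 2.0                                     # coding gain quoted in the text
gamma_d = 1.0 - 2.0 ** (-beta)                    # assumed discretization factor
gamma_s = gain / (gamma_c * gamma_d)              # shaping gain implied by the split

gain_db = to_db(gain)
gamma_s_db = to_db(gamma_s)
print(f"beta = {beta:.3f}, E = {energy:.3f}")
print(f"G = {gain_db:.3f} dB, gamma_s = {gamma_s_db:.3f} dB")
```

Under these assumptions the sketch gives $\beta = 2.5$ bits and $E = 1.5$ per two dimensions, and the resulting gains agree to within rounding with the 0.994-dB shaping gain and 3.16-dB overall gain quoted for this scheme in Section VIII-D.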

C. Continuous Approximations

Suppose now that each constellation $\Omega_i$ is obtained from the intersection of the $i$th coset of $\Lambda'$ with the same finite region $\mathbb{R}$. Further, suppose that the memoryless signal point selector $\mathbb{S}$ selects each point in $\Omega_i$ with a Maxwell-Boltzmann distribution, i.e., given that $\mathbb{S}$ is presented with the $i$th constellation, a point $r_i \in \Omega_i$ is selected with probability $p(r_i) = \exp(-\lambda\|r_i\|^2)/Z_i(\lambda)$, where $\lambda$ is fixed for all constellations. Using the same continuous approximation principles as in Section VI, the average energy $E_i$ and bit rate $\beta_i$ for each subconstellation can be estimated via (21) and (23) from a continuous Gaussian distribution, truncated to the region $\mathbb{R}$. It follows that each subconstellation supports approximately the same bit rate, at the same cost in average energy. Thus, independently of the coset code $\mathbb{C}$ and the probability with which each coset is selected, $\kappa(\mathbb{S}) \approx \beta_i(\mathbb{R}, \lambda)$ and $E \approx E_i(\mathbb{R}, \lambda)$.

Using these estimates for bit rate and average energy, the shaping gain $\gamma_s(\mathbb{S})$ is estimated by

$$\gamma_s(\mathbb{R}, \lambda) = \frac{2^{2H(\mathbb{R},\lambda)/N}}{6E(\mathbb{R}, \lambda)},$$

which is the same expression as (25). Similarly, we find that CER2, the 2-D constellation expansion ratio, is approximately the product of a coding constellation expansion ratio $\mathrm{CER2}_c(\mathbb{C})$ and a shaping constellation expansion ratio $\mathrm{CER2}_s(\mathbb{S})$. In terms of the normalized redundancy $\rho(C) \triangleq 2r/N$ of the binary encoder $C$, we have

$$\mathrm{CER2}_c(\mathbb{C}) = 2^{\rho(C)}\,\frac{V(\Lambda)^{2/N}}{V(\Lambda_2)},$$

while

$$\mathrm{CER2}_s(\mathbb{S}) \approx 2^{\rho(\mathbb{R},\lambda)}\,\frac{V(\mathbb{R}_2)}{V(\mathbb{R})^{2/N}},$$

exactly as in Section VI. Approximations for the 2-D peak-to-average energy ratio PAR2 also lead to expressions identical to those given in Section VI.

In general, to the accuracy of the continuous approximation, the shaping gain $\gamma_s(\mathbb{R}, \lambda)$ achieved by our memoryless nonuniform signal point selector $\mathbb{S}$ is completely independent of the choice of coset code $\mathbb{C}$. Trade-offs involving shaping constellation expansion ratio $\mathrm{CER2}_s(\mathbb{S})$ and PAR2 are also completely independent of the choice of coset code $\mathbb{C}$. As noted, it is desirable to have constellations shaped as polydiscs, since these achieve a given shaping gain with minimum $\mathrm{CER2}_s$ and minimum PAR2. Therefore, a nonuniform signaling scheme based on a multidimensional lattice $\Lambda$ is perhaps best implemented as a coset code involving the constituent 2-D lattice $\Lambda_2$, in which the $i$th constellation $\Omega_i$ is circular and the signal point selector $\mathbb{S}$ chooses from $\Omega_i$ according to a Maxwell-Boltzmann distribution.

D. Memoryless Signal Point Selection with Huffman Codes

As in the case of uncoded transmission, probably the simplest method of selecting points from the coset sequences generated by a coset code is through a complete binary prefix code. Because we would like our constellations to be polydiscs, and we have restricted our attention to binary coset codes, we focus on coset codes based on the 2-D lattice $\mathbb{Z}^2$.

A simple example of combining a nonuniform memoryless signal point selector with a trellis code is shown in Fig. 9. The trellis code is a simple four-state Ungerboeck code [37] based on a translate of the four-way partition $\mathbb{Z}^2/2\mathbb{Z}^2$; this code provides a coding gain $\gamma_c = 2$ (3.01 dB). Each subconstellation (labeled A, B, C, or D) consists of a single inner point at squared Euclidean distance 1/2 from the origin and two outer points at squared Euclidean distance 5/2 from the origin. By using the prefix code $\{0, 10, 11\}$, the signal point selector chooses the inner point with probability 1/2 and each outer point with probability 1/4. The transmitted rate $\beta$ for this scheme is 2.5 bits with a primary channel rate of 2 bits and a secondary channel rate of 0.5 bits. The shaping gain is $\gamma_s = 0.994$ dB. The overall gain is $G = \gamma_c\gamma_s\gamma_d = 3.16$ dB.

    [Figure: 12-point constellation; each point is shown as subconstellation label:prefix codeword]

              C:11   D:11
       B:11   A:0    B:0    A:11
       D:10   C:0    D:0    C:10
              A:10   B:10

Fig. 9. The sequence of subconstellations {A, B, C, D} is determined by an Ungerboeck code. Signal point selection within each subconstellation is performed using a simple prefix code.

The four-way partition $\mathbb{Z}^2/2\mathbb{Z}^2$, translated as in Fig. 9, yields four cosets with identical weight distributions. The weight enumerator for these cosets begins

$$\Theta_0(x) = x^{1/2} + 2x^{5/2} + x^{9/2} + 2x^{13/2} + 2x^{17/2} + \cdots.$$

As in Section VII, we have computed dyadic approximations to the optimal Maxwell-Boltzmann distribution by applying the Huffman procedure. The results are given in Table IV. We have assumed a coset code $\mathbb{C}(\mathbb{Z}^2/2\mathbb{Z}^2; C)$ with a normalized bit rate $\kappa(\mathbb{C}) = 1$. This bit rate is included in the primary rate $\beta_p$ of Table IV. The overall constellation, of size $|\Omega|$, is the union of four subconstellations, each of size $|\Omega|/4$. As in Tables I-III, the gain $\gamma_s\gamma_d$ and the effective dimension $N_{\mathrm{eff}}$ are listed, along with PAR2 and $\mathrm{CER2}_s$. Note that $\mathrm{CER2}_c(\mathbb{C}) = 2$. As before, very satisfactory shaping gains are achievable via these prefix codes, with $N_{\mathrm{eff}}$ numbering in the hundreds of dimensions for the larger constellations.

Although the prefix codes for the smaller constellations have smaller $N_{\mathrm{eff}}$, as discussed at the end of Section VII, applying the Huffman procedure to Cartesian products of these subconstellations can improve the gain. Note, however, that the

signal point selector $\mathbb{S}$ is no longer memoryless in this case.

TABLE IV
PERFORMANCE OF HUFFMAN-CODED RATE 1/2 COSET CODES BASED ON $\mathbb{Z}^2/2\mathbb{Z}^2$

$|\Omega|$   $\beta_p$   $\beta_s$   $\gamma_s\gamma_d$ (dB)   $N_{\mathrm{eff}}$   PAR2    $\mathrm{CER2}_s$
 12     2    0.500    0.149     17    1.667   1.061
 16     2    0.750    0.378     23    2.571   1.189
 24     3    0.375    0.555     18    2.364   1.157
 32     3    0.562    0.798     34    2.833   1.354
 44     3    0.906    0.948     46    3.333   1.467
 52     3    1.281    1.064     60    3.013   1.337
 60     3    1.328    1.128     87    3.771   1.494
 68     3    1.375    1.168    112    4.075   1.639
 76     3    1.562    1.155     82    3.892   1.608
 80     3    1.570    1.174     94    4.232   1.684
 88     4    0.883    1.221    102    3.695   1.491
 96     4    0.906    1.247    126    4.207   1.601
112     4    0.938    1.289    192    4.426   1.827
120     4    1.180    1.295    158    4.186   1.655
124     4    1.184    1.303    173    4.641   1.706
140     4    1.207    1.329    239    4.818   1.895
148     4    1.213    1.340    280    5.036   1.995
156     4    1.432    1.292    125    4.647   1.807
164     4    1.455    1.303    137    4.770   1.869
172     4    1.494    1.306    137    5.011   1.908
180     4    1.656    1.308    127    4.634   1.785
188     4    1.705    1.321    140    4.649   1.802
192     4    1.707    1.326    147    4.807   1.838
208     4    1.717    1.342    174    4.950   1.977
216     4    1.728    1.345    179    5.387   2.038
232     5    0.924    1.342    153    4.961   1.911
240     5    0.941    1.348    162    5.042   1.953
248     5    0.986    1.353    169    5.023   1.956
256     5    1.020    1.357    174    5.040   1.973

Other partitions of $\mathbb{Z}^2$, e.g., the 8-way partition $\mathbb{Z}^2/2R\mathbb{Z}^2$ or the 16-way partition $\mathbb{Z}^2/4\mathbb{Z}^2$, allow the use of more powerful coset codes, many of which are listed in [10, Tables IV, V, IX, X, XI]. Unlike the 4-way partition $\mathbb{Z}^2/2\mathbb{Z}^2$, the cosets in these partitions do not all have the same theta series. For example, the 8-way partition of $\mathbb{Z}^2 + (1/2, 1/2)$ has two classes of cosets [typified by $A = 2R\mathbb{Z}^2 + (1/2, 1/2)$ and $B = 2R\mathbb{Z}^2 + (3/2, 1/2)$] with weight enumerators

$$\Theta_A(x) = x^{1/2} + x^{9/2} + 2x^{17/2} + 3x^{25/2} + \cdots$$
$$\Theta_B(x) = 2x^{5/2} + 2x^{13/2} + 2x^{29/2} + 2x^{37/2} + \cdots,$$

respectively. Each class consists of four different cosets. Similarly, the 16-way partition of $\mathbb{Z}^2 + (1/2, 1/2)$ has three classes of cosets [typified by $A = 4\mathbb{Z}^2 + (1/2, 1/2)$, $B = 4\mathbb{Z}^2 + (3/2, 1/2)$, and $C = 4\mathbb{Z}^2 + (3/2, 3/2)$] with weight enumerators

$$\Theta_A(x) = x^{1/2} + 2x^{25/2} + 2x^{41/2} + x^{49/2} + \cdots$$
$$\Theta_B(x) = x^{5/2} + x^{13/2} + x^{29/2} + x^{37/2} + \cdots$$
$$\Theta_C(x) = x^{9/2} + 2x^{17/2} + x^{25/2} + 2x^{65/2} + \cdots,$$

respectively. Class A consists of four cosets, class B consists of eight cosets, and class C consists of four cosets. We have found good dyadic approximations to the Maxwell-Boltzmann distribution for these partitions, with results similar to those given in Table IV. Many different schemes may be designed by combining the various signal point selectors obtained. For example, it may be advantageous for different cosets to convey different numbers of secondary channel bits.

In general, by applying the Huffman algorithm to obtain dyadic approximations to the Maxwell-Boltzmann distribution, we have been able to obtain a variety of different schemes. By varying $\lambda$, schemes that trade off gain for improved PAR2 and $\mathrm{CER2}_s$ are easily obtained.

IX. DISCUSSION AND CONCLUSION

From the point of view of coded modulation, we have seen that nonuniform signal point selection is an energy-minimizing or shaping operation. When constellation points are selected with Maxwell-Boltzmann probabilities, the ultimate in shaping gain performance can be achieved in any dimension. Dyadic approximations to the optimal Maxwell-Boltzmann distribution are easily obtained by applying the Huffman procedure. The performance of the resulting shaping schemes is often close to optimum, with effective dimensions numbering in the hundreds, and can be made to approach the optimum by considering Cartesian products of the basic constellations. By varying the Maxwell-Boltzmann parameter, trade-offs between shaping gain, 2-D constellation expansion ratio, and 2-D peak-to-average energy ratio are easily accomplished. In a sense, the implementation complexity for these schemes is trivial, since data to constellation point mappings (and vice versa) are easily performed by table lookup. Furthermore, these schemes are easily incorporated into well-known lattice-type coded modulation schemes. All of these properties make nonuniform signaling very attractive.

The principal drawback, as pointed out at the outset, is the variable bit rate. While only the secondary channel data are subject to the problems associated with buffer under- and overflow and the insertion and deletion of bits in the decoded bit stream, these problems may be acceptable only in certain applications, e.g., for the transmission of internal control signals. Left unsolved, these problems will tend to limit the broad applicability of nonuniform signaling. Solving the system problems associated with nonuniform signaling will certainly increase the complexity of implementation. Yet, in order to achieve the large shaping gains achieved with nonuniform signaling, uniform signaling schemes will themselves tend to become quite complex (see [8, Table IV]). It remains an open problem to evaluate and compare these complexities.

APPENDIX A
ERROR COEFFICIENTS FOR BASELINE CONSTELLATIONS

In this appendix, we compute the average nearest neighbor multiplicity (error coefficient) for a cubic constellation of side $M$ drawn from the lattice $\mathbb{Z}^N$. This problem is easily solved using the notion of a nearest neighbor enumerator.

Given a finite constellation $\Omega$, recall that $N_{\min}(r)$ denotes the number of points of $\Omega$ at distance $d_{\min}$ from the point $r$.
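The count $N_{\min}(r)$ can also be tallied by brute force, which gives an independent check on the closed form $2N(1 - 1/M)$ that this appendix derives for cubic constellations. The Python sketch below is only an illustration: the function name is hypothetical, the constellation is taken to be the points $\{0, \ldots, M-1\}^N$ of $\mathbb{Z}^N$ with $d_{\min} = 1$, and $M \ge 2$ is assumed.

```python
import itertools

def average_nn_multiplicity(m, n):
    """Average number of nearest neighbors in an n-D cubic constellation of side m."""
    points = list(itertools.product(range(m), repeat=n))
    point_set = set(points)
    total = 0
    for r in points:
        # In Z^n the nearest neighbors differ by +/-1 in exactly one coordinate.
        for axis in range(n):
            for step in (-1, 1):
                neighbor = r[:axis] + (r[axis] + step,) + r[axis + 1:]
                if neighbor in point_set:
                    total += 1
    return total / len(points)

def closed_form(m, n):
    """Closed-form value 2N(1 - 1/M) for comparison."""
    return 2.0 * n * (1.0 - 1.0 / m)

for m, n in [(4, 1), (4, 2), (3, 3)]:
    print(m, n, average_nn_multiplicity(m, n), closed_form(m, n))
```

Exhaustive enumeration grows as $M^N$, so the sketch is practical only for small cases; its purpose is to confirm the closed form, not to replace it.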

Define $A(x) \triangleq \sum_{r \in \Omega} x^{N_{\min}(r)}$; then $A(x)$ is a polynomial with integer coefficients that we call the nearest neighbor enumerator for $\Omega$. For example, a simple 1-D PAM constellation with $M$ points has $A(x) = 2x + (M - 2)x^2$, indicating that two points of the constellation have a single nearest neighbor, while $M - 2$ points have two nearest neighbors.

Assuming uniform signal point selection, it is easy to see that the average nearest neighbor multiplicity $\bar{N}$ is given by $A'(1)/A(1)$, where $A'(x) = dA(x)/dx$. Thus for the $M$-PAM example of the previous paragraph, $\bar{N} = 2(1 - 1/M)$.

It is easily seen that the nearest neighbor enumerator for $\Omega^n$, the $n$-fold Cartesian product of $\Omega$ with itself, is $A(x)^n$. The average nearest neighbor multiplicity of the Cartesian product is then $nA(1)^{n-1}A'(1)/A(1)^n = nA'(1)/A(1)$; in other words, taking the $n$-fold Cartesian product of a constellation with itself multiplies the average nearest neighbor multiplicity by $n$. Since an $N$-D cubic constellation of side $M$ is the $N$-fold Cartesian product of simple 1-D PAM constellations, we have $\bar{N} = 2N(1 - 1/M)$ for such constellations. Furthermore, since $M = 2^{\beta/2}$, we obtain $\bar{N}_{\oplus} = 2N(1 - 2^{-\beta/2})$ for our $N$-D baseline constellations.

APPENDIX B
CONTINUOUS APPROXIMATIONS FOR SPHERICAL CONSTELLATIONS

In this appendix, we specialize the continuous approximations derived in Section VI to the case where $\mathbb{R} = \mathbb{B}_N(R)$, an $N$-ball of radius $R$ centered at the origin. Cubic constellations can be considered to be Cartesian products of 1-D "spherical constellations" and so are (by the separability of Gaussian densities) a special case. By letting $R \to \infty$, we obtain continuous approximations for the case of an infinite constellation.

Many of the expressions derived in this Appendix may be written in terms of the (normalized) incomplete Gamma function $P(a, x)$, defined in [23] as

$$P(a, x) \triangleq \frac{1}{\Gamma(a)} \int_0^x t^{a-1} e^{-t}\,dt,$$

where we note that $\lim_{x \to \infty} P(a, x) = 1$.

For a finite $N$-D spherical constellation $\Omega$, in our estimates we choose the spherical radius $R$ so that $V(\mathbb{B}_N(R)) = |\Omega|V(\Lambda)$, i.e., so that approximation (19) holds with equality.

Energy: From (21) we obtain

$$E(\mathbb{B}_N(R), \lambda) = \frac{1}{\lambda} \cdot \frac{P(N/2 + 1, \lambda R^2)}{P(N/2, \lambda R^2)}. \tag{33}$$

Since $E(\mathbb{B}_N(R), 0) = 2R^2/(N + 2)$, we find that the energy savings factor (29) is

$$g_E(\mathbb{B}_N(R), \lambda) = \frac{2\lambda R^2\, P(N/2, \lambda R^2)}{(N + 2)\, P(N/2 + 1, \lambda R^2)}. \tag{34}$$

Bit Rate: The entropy of a continuous Gaussian random variable (with parameter $\lambda$) truncated to $\mathbb{B}_N(R)$ is given by

$$H(\mathbb{B}_N(R), \lambda) = \log_2\!\left[(\pi/\lambda)^{N/2} P(N/2, \lambda R^2)\right] + N\lambda E(\mathbb{B}_N(R), \lambda)/(2 \ln 2).$$

Setting $\lambda = 0$ yields

$$H(\mathbb{B}_N(R), 0) = \log_2[V(\mathbb{B}_N(R))] = \log_2\!\left[(\pi R^2)^{N/2}/\Gamma(N/2 + 1)\right].$$

Combining these expressions gives an estimate for the normalized redundancy (27), namely,

$$\rho(\mathbb{B}_N(R), \lambda) = \log_2\!\left(\frac{\lambda R^2}{[\Gamma(N/2 + 1)\, P(N/2, \lambda R^2)]^{2/N}}\right) - \frac{P(N/2 + 1, \lambda R^2)}{P(N/2, \lambda R^2)} \log_2 e. \tag{35}$$

The bit rate can be estimated via (23) or via (27). For a spherical constellation of size $|\Omega|$, we use the approximation $\beta(\mathbb{B}_N(R), \lambda) = (2/N)\log_2|\Omega| - \rho(\mathbb{B}_N(R), \lambda)$.

Shaping and Biasing Gains: Substituting (34) and (35) into (28) gives an estimate for the biasing gain. Multiplying the biasing gain by $\gamma_{\odot}(N) \triangleq \pi(N/2 + 1)/[6\Gamma(N/2 + 1)^{2/N}]$, the shaping gain of an $N$-sphere under uniform signaling [2], yields an estimate for the shaping gain under nonuniform signaling. Explicitly, the shaping gain $\gamma_s(\mathbb{B}_N(R), \lambda)$ is

$$\gamma_s(\mathbb{B}_N(R), \lambda) = \frac{\pi}{6} \exp\!\left(\frac{P(N/2 + 1, \lambda R^2)}{P(N/2, \lambda R^2)}\right) \frac{P(N/2, \lambda R^2)^{(2/N)+1}}{P(N/2 + 1, \lambda R^2)}. \tag{36}$$

PAR2: The constituent 2-D constellation of $\mathbb{B}_N(R)$ (as defined in Section III) is a 2-D disc $\mathbb{B}_2(R)$ (with peak energy $R^2$) when $N$ is even, and a square $\mathbb{B}_1^2(R)$ of side $2R$ (with peak energy $2R^2$) when $N$ is odd; this may be expressed compactly by writing $r^2_{\max}(\mathbb{B}_N(R)) = [3 - (-1)^N]R^2/2$. Since the normalized average energy under uniform signaling is $2R^2/(N + 2)$, we have

$$\mathrm{PAR2}(\mathbb{B}_N(R), 0) = (3 - (-1)^N)(N + 2)/4.$$

Under nonuniform signaling this PAR2 is increased by a factor of $g_E$ so that

$$\mathrm{PAR2}(\mathbb{B}_N(R), \lambda) = \frac{(3 - (-1)^N)\,\lambda R^2\, P(N/2, \lambda R^2)}{2\, P(N/2 + 1, \lambda R^2)}. \tag{37}$$

$\mathrm{CER2}_s$: Under uniform signaling, a large even-dimensional spherical constellation induces a shaping constellation expansion ratio of $\mathrm{CER2}_s = \Gamma(N/2 + 1)^{2/N}$ (see [2]), while a large odd-dimensional spherical constellation has $\mathrm{CER2}_s = (4/\pi)\Gamma(N/2 + 1)^{2/N}$. Under nonuniform signaling, this $\mathrm{CER2}_s$ is increased by a factor of $2^{\rho(\mathbb{B}_N(R), \lambda)}$, so that

$$\mathrm{CER2}_s(\mathbb{B}_N(R), \lambda) = \frac{\pi + 4 + (-1)^N(\pi - 4)}{2\pi} \cdot \frac{\lambda R^2}{P(N/2, \lambda R^2)^{2/N}} \cdot \exp\!\left(-\frac{P(N/2 + 1, \lambda R^2)}{P(N/2, \lambda R^2)}\right). \tag{38}$$
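The closed forms (33), (34), and (36) are straightforward to evaluate numerically. The Python sketch below is one possible way to do so and is not taken from the paper: the function names are hypothetical, and $P(a, x)$ is computed with the standard power series $P(a, x) = e^{-x} x^{a} \sum_{n \ge 0} x^{n}/\Gamma(a + n + 1)$, which is adequate for the moderate arguments used here.

```python
import math

def reg_lower_gamma(a, x, max_terms=10000):
    """Normalized incomplete Gamma function P(a, x), by power series."""
    if x <= 0.0:
        return 0.0
    # First term: e^{-x} x^a / Gamma(a + 1), formed in the log domain.
    term = math.exp(-x + a * math.log(x) - math.lgamma(a + 1.0))
    total = term
    for n in range(1, max_terms):
        term *= x / (a + n)          # ratio of consecutive series terms
        total += term
        if term < 1e-16 * total:
            break
    return min(total, 1.0)

def energy(n_dims, lam, radius):
    """Normalized average energy, as in (33)."""
    x = lam * radius ** 2
    return reg_lower_gamma(n_dims / 2 + 1, x) / (lam * reg_lower_gamma(n_dims / 2, x))

def energy_savings(n_dims, lam, radius):
    """Energy savings factor g_E, as in (34)."""
    return (2.0 * radius ** 2 / (n_dims + 2)) / energy(n_dims, lam, radius)

def shaping_gain(n_dims, lam, radius):
    """Shaping gain under nonuniform signaling, as in (36)."""
    x = lam * radius ** 2
    p0 = reg_lower_gamma(n_dims / 2, x)
    p1 = reg_lower_gamma(n_dims / 2 + 1, x)
    return (math.pi / 6.0) * math.exp(p1 / p0) * p0 ** (2.0 / n_dims + 1.0) / p1

# Sweep the product lambda * R^2 for a 2-D disc (R = 1).
for lam in (1e-6, 0.5, 2.0, 100.0):
    g = shaping_gain(2, lam, 1.0)
    print(f"lambda R^2 = {lam:g}: gamma_s = {10 * math.log10(g):.3f} dB")
```

For $N = 2$ the sweep moves from the uniform-signaling value $\pi/3$ of the disc at small $\lambda R^2$ toward the ultimate shaping gain $\pi e/6$ (about 1.53 dB) at large $\lambda R^2$, in line with the limiting behavior discussed next.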



Limiting Case: When $R \to \infty$, the truncated Gaussian random variable approaches a standard (untruncated) Gaussian random variable. For large $R$, the normalized average energy $E(\mathbb{B}_N(R), \lambda) \to 1/\lambda$, independently of $N$. Similarly, the energy savings factor $g_E(\mathbb{B}_N(R), \lambda) \to 2\lambda R^2/(N + 2)$, and the normalized redundancy $\rho(\mathbb{B}_N(R), \lambda) \to \log_2\!\left(\lambda R^2/[e\,\Gamma(N/2 + 1)^{2/N}]\right)$. Thus the biasing gain

$$\gamma_b(\mathbb{B}_N(R), \lambda) \to 2e\,\Gamma(N/2 + 1)^{2/N}/(N + 2) = \pi e/[6\gamma_{\odot}(N)],$$

where

$$\gamma_{\odot}(N) \triangleq \pi(N/2 + 1)/[6\Gamma(N/2 + 1)^{2/N}], \tag{39}$$

is the shaping gain of the $N$-sphere under uniform signaling. The shaping gain $\gamma_s(\mathbb{B}_N(R), \lambda) \to \pi e/6$, independently of the dimension $N$. Of course, the biasing gain depends on $N$, and approaches unity as $N \to \infty$. For large values of $R$, $\mathrm{CER2}(\mathbb{B}_N(R), \lambda) \to \lambda R^2/e$ for even-dimensional spherical constellations, and $\mathrm{CER2}(\mathbb{B}_N(R), \lambda) \to 4\lambda R^2/(\pi e)$ for odd-dimensional spherical constellations. Similarly, for even-dimensional spherical constellations, $\mathrm{PAR2}(\mathbb{B}_N(R), \lambda) \to \lambda R^2$ independently of $N$. Of course, this is to be expected because the peak energy value is approximately $R^2$, and the average energy is approximately $1/\lambda$. Similarly, for odd-dimensional spherical constellations, $\mathrm{PAR2}(\mathbb{B}_N(R), \lambda) \to 2\lambda R^2$ because the peak energy value (in two dimensions) is approximately $2R^2$.

ACKNOWLEDGMENT

We are grateful to G. D. Forney, Jr., for his detailed comments on earlier versions of this paper, for his insightful suggestions, and for providing us with some related reference material. Comments made by the anonymous reviewers proved most helpful as well. We would also like to thank P. R. Stubley for pointing out that the Huffman algorithm minimizes the relative entropy between a distribution and its dyadic approximation.

REFERENCES

[1] T. M. Cover and J. A. Thomas, Elements of Information Theory. New York: Wiley, 1991.
[2] G. D. Forney, Jr., and L.-F. Wei, "Multidimensional constellations--Part I: Introduction, figures of merit, and generalized cross constellations," IEEE J. Select. Areas Commun., vol. 7, pp. 877-892, Aug. 1989.
[3] E. Schrödinger, Statistical Thermodynamics. Cambridge: Cambridge University Press, 1962.
[4] R. K. Pathria, Statistical Mechanics. Elmsford, NY: Pergamon, 1972.
[5] D. Ruelle, "Thermodynamic formalism," in Encyclopedia of Mathematics and Its Applications, vol. 5. Reading, MA: Addison-Wesley, 1978.
[6] G. D. Forney, Jr., "Multidimensional constellations--Part II: Voronoi constellations," IEEE J. Select. Areas Commun., vol. 7, pp. 941-958, Aug. 1989.
[7] A. R. Calderbank and L. H. Ozarow, "Non-equiprobable signaling on the Gaussian channel," IEEE Trans. Inform. Theory, vol. 36, pp. 726-740, July 1990.
[8] G. D. Forney, Jr., "Trellis shaping," IEEE Trans. Inform. Theory, vol. 38, pp. 281-300, Mar. 1992.
[9] F. R. Kschischang and S. Pasupathy, "Optimal shaping properties of the truncated polydisc," IEEE Trans. Inform. Theory, Mar. 1992, submitted for publication.
[10] G. D. Forney, Jr., "Coset codes I: Introduction and geometrical classification," IEEE Trans. Inform. Theory, vol. 34, pp. 1123-1151, Sept. 1988.
[11] G. D. Forney, Jr., "Coset codes II: Binary lattices and related codes," IEEE Trans. Inform. Theory, vol. 34, pp. 1152-1187, Sept. 1988.
[12] A. R. Calderbank and N. J. A. Sloane, "New trellis codes based on lattices and cosets," IEEE Trans. Inform. Theory, vol. IT-33, pp. 177-195, Mar. 1987.
[13] G. D. Forney, Jr., R. G. Gallager, G. R. Lang, F. M. Longstaff, and S. U. Qureshi, "Efficient modulation for band-limited channels," IEEE J. Select. Areas Commun., vol. SAC-2, pp. 632-647, Sept. 1984.
[14] R. G. Gallager, "Source coded modulation system," U.S. Patent 4 586 182, Apr. 29, 1986.
[15] J. N. Livingston, "Shaping using variable-size regions," IEEE Trans. Inform. Theory, pp. 1347-1353, July 1992.
[16] A. Chouly and H. Sari, "Block-coded modulation: Novel design techniques and rotational invariance," Philips J. Res., vol. 45, no. 2, pp. 127-155, 1990.
[17] G. R. Lang, G. D. Forney, S. Qureshi, F. M. Longstaff, and C. H. Lee, "Signal structures with data encoding/decoding for QCM modulations," U.S. Patent 4 538 284, Aug. 27, 1985.
[18] T. Armstrong, "Secondary channel signaling in a QAM data point constellation," U.S. Patent 4 630 287, Dec. 16, 1986.
[19] R. D. Gitlin and J.-J. Werner, "Inband coding of secondary data," U.S. Patent 4 644 537, Feb. 17, 1987.
[20] H. K. Thapar, "Inband coding of secondary data," U.S. Patent 4 651 320, Mar. 17, 1987.
[21] R. D. Gitlin, H. K. Thapar, and J. J. Werner, "An inband coding method for the transmission of secondary data," in Conf. Rec. IEEE Int. Conf. Commun., June 1988, pp. 3.1.1-3.1.5.
[22] C. Hastings, Jr., Approximations for Digital Computers. Princeton, NJ: Princeton University Press, 1955.
[23] M. Abramowitz and I. A. Stegun, Handbook of Mathematical Functions. New York: Dover, 1965.
[24] J. H. Conway and N. J. A. Sloane, Sphere Packings, Lattices and Groups. New York: Springer-Verlag, 1988.
[25] L.-F. Wei, "Trellis-coded modulation with multidimensional constellations," IEEE Trans. Inform. Theory, vol. IT-33, pp. 483-501, July 1987.
[26] T. Berger, "Minimum entropy quantizers and permutation codes," IEEE Trans. Inform. Theory, vol. IT-28, pp. 149-157, Mar. 1982.
[27] I. Ingemarsson, "Optimized permutation modulation," IEEE Trans. Inform. Theory, pp. 1098-1100, Sept. 1990.
[28] J. A. Hertz, R. G. Palmer, and A. S. Krogh, Introduction to the Theory of Neural Computation. Reading, MA: Addison-Wesley, 1991.
[29] S. Kirkpatrick, C. D. Gelatt, Jr., and M. P. Vecchi, "Optimization by simulated annealing," Science, vol. 220, pp. 671-680, May 13, 1983.
[30] N. J. A. Sloane, "Tables of sphere packings and spherical codes," IEEE Trans. Inform. Theory, vol. IT-27, pp. 327-338, May 1981.
[31] J. M. Wozencraft and I. M. Jacobs, Principles of Communication Engineering. New York: Wiley, 1965.
[32] W. Rudin, Function Theory in Polydiscs. New York: W. A. Benjamin, 1969.
[33] R. E. Blahut, Digital Transmission of Information. Reading, MA: Addison-Wesley, 1990.
[34] P. R. Stubley and I. F. Blake, "On a discrete probability distribution matching problem," J. Algorithms, June 1991, submitted for publication.
[35] D. A. Huffman, "A method for the construction of minimum redundancy codes," Proc. IRE, vol. 40, pp. 1098-1101, 1952.
[36] R. W. Hamming, Coding and Information Theory. Englewood Cliffs, NJ: Prentice-Hall, 1986.
[37] G. Ungerboeck, "Channel coding with multilevel/phase signals," IEEE Trans. Inform. Theory, vol. IT-28, pp. 55-67, Jan. 1982.
