
Video Codec Design

Iain E. G. Richardson
Copyright © 2002 John Wiley & Sons, Ltd
ISBNs: 0-471-48553-5 (Hardback); 0-470-84783-2 (Electronic)

Entropy Coding

8.1 INTRODUCTION
A video encoder contains two main functions: a source model that attempts to represent a
video scene in a compact form that is easy to compress (usually an approximation of the
original video information) and an entropy encoder that compresses the output of the model
prior to storage and transmission. The source model is matched to the characteristics of the
input data (images or video frames), whereas the entropy coder may use ‘general-purpose’
statistical compression techniques that are not necessarily unique in their application to
image and video coding.
As with the functions described earlier (motion estimation and compensation, transform
coding, quantisation), the design of an entropy CODEC is affected by a number of
constraints including:

1. Compression efficiency: the aim is to represent the source model output using as few bits
as possible.
2. Computational efficiency: the design should be suitable for implementation on the
chosen hardware or software platform.
3. Error robustness: if transmission errors are likely, the entropy CODEC should support
recovery from errors and should (if possible) limit error propagation at the decoder (this
constraint may conflict with (1) above).

In a typical transform-based video CODEC, the data to be encoded by the entropy CODEC
falls into three main categories: transform coefficients (e.g. quantised DCT coefficients),
motion vectors and ‘side’ information (headers, synchronisation markers, etc.). The method
of coding side information depends on the standard. Motion vectors can often be represented
compactly in a differential form due to the high correlation between vectors for neighbouring
blocks or macroblocks. Transform coefficients can be represented efficiently with ‘run-level’
coding, exploiting the sparse nature of the DCT coefficient array.
An entropy encoder maps input symbols (for example, run-level coded coefficients) to a
compressed data stream. It achieves compression by exploiting redundancy in the set of
input symbols, representing frequently occurring symbols with a small number of bits and
infrequently occurring symbols with a larger number of bits. The two most popular entropy
encoding methods used in video coding standards are Huffman coding and arithmetic
coding. Huffman coding (or ‘modified’ Huffman coding) represents each input symbol
by a variable-length codeword containing an integral number of bits. It is relatively

straightforward to implement, but cannot achieve optimal compression because of the
restriction that each codeword must contain an integral number of bits. Arithmetic coding
maps an input symbol into a fractional number of bits, enabling greater compression
efficiency at the expense of higher complexity (depending on the implementation).

8.2 DATA SYMBOLS

8.2.1 Run-Level Coding

The output of the quantiser stage in a DCT-based video encoder is a block of quantised
transform coefficients. The array of coefficients is likely to be sparse: if the image block has
been efficiently decorrelated by the DCT, most of the quantised coefficients in a typical
block are zero. Figure 8.1 shows a typical block of quantised coefficients from an MPEG-4
‘intra’ block. The structure of the quantised block is fairly typical. A few non-zero
coefficients remain after quantisation, mainly clustered around DCT coefficient (0, 0): this
is the ‘DC’ coefficient and is usually the most important coefficient to the appearance of the
reconstructed image block.
The block of coefficients shown in Figure 8.1 may be efficiently compressed as follows:

1. Reordering. The non-zero values are clustered around the top left of the 2-D array and
this stage groups these non-zero values together.
2. Run-level coding. This stage attempts to find a more efficient representation for the large
number of zeros (48 in this case).
3. Entropy coding. The entropy encoder attempts to reduce the redundancy
of the data symbols.

Reordering

The optimum method of reordering the quantised data depends on the distribution of the
non-zero coefficients. If the original image (or motion-compensated residual) data is evenly

Figure 8.1 Block of quantised coefficients (intra-coding); the top-left value is the DC coefficient

distributed in the horizontal and vertical directions (i.e. there is not a predominance of
‘strong’ image features in either direction), then the significant coefficients will also tend to
be evenly distributed about the top left of the array (Figure 8.2(a)). In this case, a zigzag
reordering pattern such as Figure 8.2(c) should group together the non-zero coefficients
Figure 8.2 Typical data distributions and reordering patterns: (a) even distribution; (b) field
distribution; (c) zigzag; (d) modified zigzag

efficiently. However, in some cases an alternative pattern performs better. For example, a
field of interlaced video tends to vary more rapidly in the vertical than in the horizontal
direction (because it has been vertically subsampled). In this case the non-zero coefficients
are likely to be ‘skewed’ as shown in Figure 8.2(b): they are clustered more to the left of the
array (corresponding to basis functions with a strong vertical variation, see for example
Figure 7.4). A modified reordering pattern such as Figure 8.2(d) should perform better at
grouping the coefficients together.

Run-level coding

The output of the reordering process is a linear array of quantised coefficients. Non-zero
coefficients are mainly grouped together near the start of the array and the remaining values
in the array are zero. Long sequences of identical values (zeros in this case) can be
represented as a (run, level) code, where ‘run’ indicates the number of zeros preceding a
non-zero value and ‘level’ indicates the sign and magnitude of the non-zero coefficient.
The following example illustrates the reordering and run-level coding process.

Example

The block of coefficients in Figure 8.1 is reordered with the zigzag scan shown in
Figure 8.2 and the reordered array is run-level coded.

Reordered array:
[102, -33, 21, -3, -2, -3, -4, -3, 0, 2, 1, 0, 1, 0, -2, -1, -1, 0, 0, 0, 0, -2, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 1, 0 ...]
Run-level coded:
(0, 102) (0, -33) (0, 21) (0, -3) (0, -2) (0, -3) (0, -4) (0, -3) (1, 2) (0, 1) (1, 1)
(1, -2) (0, -1) (0, -1) (4, -2) (11, 1)
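The run-level stage translates directly into code. A minimal Python sketch (not the book’s code) of two-dimensional (run, level) coding applied to the reordered array above, with the trailing zeros left for an EOB code or ‘last’ flag to represent:

def run_level_encode(coeffs):
    # 'run' counts the zeros preceding each non-zero 'level'; the
    # final run of zeros is not coded as a (run, level) pair
    pairs, run = [], 0
    for c in coeffs:
        if c == 0:
            run += 1
        else:
            pairs.append((run, c))
            run = 0
    return pairs

reordered = [102, -33, 21, -3, -2, -3, -4, -3, 0, 2, 1, 0, 1, 0,
             -2, -1, -1] + [0] * 4 + [-2] + [0] * 11 + [1]
print(run_level_encode(reordered))
# [(0, 102), (0, -33), (0, 21), (0, -3), (0, -2), (0, -3), (0, -4),
#  (0, -3), (1, 2), (0, 1), (1, 1), (1, -2), (0, -1), (0, -1),
#  (4, -2), (11, 1)]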

Two special cases need to be considered. Coefficient (0, 0) (the ‘DC’ coefficient) is
important to the appearance of the reconstructed image block and has no preceding zeros. In
an intra-coded block (i.e. coded without motion compensation), the DC coefficient is rarely
zero and so is treated differently from other coefficients. In an H.263 CODEC, intra-DC
coefficients are encoded with a fixed, relatively low quantiser setting (to preserve image
quality) and without (run, level) coding. Baseline JPEG takes advantage of the property that
neighbouring image blocks tend to have similar mean values (and hence similar DC
coefficient values) and each DC coefficient is encoded differentially from the previous
DC coefficient.
The second special case is the final run of zeros in a block. Coefficient (7, 7) is usually
zero and so we need a special case to deal with the final run of zeros that has no terminating
non-zero value. In H.261 and baseline JPEG, a special code symbol, ‘end of block’ or EOB,
is inserted after the last (run, level) pair. This approach is known as ‘two-dimensional’ run-
level coding since each code represents just two values (run and level). The method does not
perform well under high compression: in this case, many blocks contain only a DC
coefficient and so the EOB codes make up a significant proportion of the coded bit stream.
H.263 and MPEG-4 avoid this problem by encoding a flag along with each (run, level) pair.
This ‘last’ flag signifies the final (run, level) pair in the block and indicates to the decoder
that the rest of the block should be filled with zeros. Each code now represents three
values (run, level, last) and so this method is known as ‘three-dimensional’ run-level-last
coding.
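The ‘last’ flag version is a small change to the same loop. A Python sketch (an illustration, not standard-conformant code), applied to the coefficient array used later in the worked example of Section 8.3.4:

def run_level_last(coeffs):
    # (run, level, last) triplets: 'last' = 1 marks the final
    # non-zero coefficient, replacing the EOB code
    triplets, run = [], 0
    for c in coeffs:
        if c == 0:
            run += 1
        else:
            triplets.append([run, c, 0])
            run = 0
    if triplets:
        triplets[-1][2] = 1
    return [tuple(t) for t in triplets]

print(run_level_last([4, -1, 0, 2, -3, 0, 0, 0, 0, 0, -1, 0, 0, 0, 1]))
# [(0, 4, 0), (0, -1, 0), (1, 2, 0), (0, -3, 0), (5, -1, 0), (3, 1, 1)]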

8.2.2 Other Symbols

In addition to run-level coded coefficient data, a number of other values need to be coded
and transmitted by the video encoder. These include the following.

Motion vectors

The vector displacement between the current and reference areas (e.g. macroblocks) is
encoded along with each data unit. Motion vectors for neighbouring data units are often very
similar, and this property may be used to reduce the amount of information required to be
encoded. In an H.261 CODEC, for example, the motion vector for each macroblock is
predicted from the preceding macroblock. The difference between the current and previous
vector is encoded and transmitted (instead of transmitting the vector itself). A more
sophisticated prediction is formed during MPEG-4/H.263 coding: the vector for each
macroblock (or block if the optional advanced prediction mode is enabled) is predicted
from up to three previously transmitted motion vectors. This helps to further reduce the
transmitted information. These two methods of predicting the current motion vector are
shown in Figure 8.3.

Example

Motion vector of current macroblock: x = +3.5, y = +2.0
Predicted motion vector from previous macroblocks: x = +3.0, y = 0.0
Differential motion vector: dx = +0.5, dy = +2.0
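In code, the differential encoding itself is a subtraction; H.263/MPEG-4 form the prediction as the component-wise median of the three neighbouring vectors. A Python sketch (the neighbour choice follows Figure 8.3; the function names are illustrative):

def mvd(current, predicted):
    # transmit the difference between the current and predicted vectors
    return (current[0] - predicted[0], current[1] - predicted[1])

def median_predictor(mv1, mv2, mv3):
    # H.263/MPEG-4 style: component-wise median of three previously
    # transmitted vectors
    med = lambda a, b, c: sorted((a, b, c))[1]
    return (med(mv1[0], mv2[0], mv3[0]), med(mv1[1], mv2[1], mv3[1]))

# the worked example above (H.261-style single predictor):
print(mvd((3.5, 2.0), (3.0, 0.0)))   # (0.5, 2.0)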

Figure 8.3 Motion vector prediction (H.261: predict MV from the previous macroblock vector MV1;
H.263/MPEG-4: predict MV from three previous macroblock vectors MV1, MV2 and MV3)

Quantisation parameter

In order to maintain a target bit rate, it is common for a video encoder to modify the
quantisation parameter (scale factor or step size) during encoding. The change must be
signalled to the decoder. It is not usually desirable to suddenly change the quantisation
parameter by a large amount during encoding of a video frame and so the parameter may be
encoded differentially from the previous quantisation parameter.

Flags to indicate presence of coded units

It is common for certain components of a macroblock not to be present. For example,
efficient motion compensation and/or high compression leads to many blocks containing
only zero coefficients after quantisation. Similarly, macroblocks in an area that contains no
motion or homogeneous motion will tend to have zero motion vectors (after differential
prediction as described above). In some cases, a macroblock may contain no coefficient data
and a zero motion vector, i.e. nothing needs to be transmitted. Rather than encoding and
sending zero values, it can be more efficient to encode flag(s) that indicate the presence or
absence of these data units.

Example

Coded block pattern (CBP) indicates the blocks containing non-zero coefficients in an
inter-coded macroblock.

Number of non-zero coefficients in each block, and the corresponding CBP:

Y0    Y1    Y2    Y3    Cr    Cb    CBP
2     1     0     7     0     0     110100
0     6     9     1     1     3     011111

Synchronisation markers

A video decoder may need to resynchronise in the event of an error or interruption to the
stream of coded data. Synchronisation markers in the bit stream provide a means of doing
this. Typically, the differential predictions mentioned above (DC coefficient, motion vectors
and quantisation parameter) are reset after a synchronisation marker, so that the data after the
marker may be decoded independently of previous (perhaps errored) data. Synchronisation is
supported by restart markers in JPEG, group of block (GOB) headers in baseline H.263 and
MPEG-4 (at fixed intervals within the coded picture) and slice start codes in MPEG-1,
MPEG-2 and annexes to H.263 and MPEG-4 (at user-definable intervals).

Higher-level headers

Information that applies to a complete frame or picture is encoded in a header (picture
header). Higher-level information about a sequence of frames may also be encoded (for
example, sequence and group of pictures headers in MPEG-1 and MPEG-2).

8.3 HUFFMAN CODING


A Huffman entropy encoder maps each input symbol into a variable length codeword and
this type of coder was first proposed in 1952 [1]. The constraints on the variable length
codeword are that it must (a) contain an integral number of bits and (b) be uniquely
decodeable (i.e. the decoder must be able to identify each codeword without ambiguity).

8.3.1 ‘True’ Huffman Coding

In order to achieve the maximum compression of a set of data symbols using Huffman
encoding, it is necessary to calculate the probability of occurrence of each symbol. A set of
variable length codewords is then constructed for this data set. This process will be
illustrated by the following example.

Example: Huffman coding, ‘Carphone’ motion vectors

A video sequence, ‘Carphone’, was encoded with MPEG-4 (short header mode). Table 8.1
lists the probabilities of the most commonly occurring motion vectors in the encoded
sequence and their information content, log2(1/P).

Table 8.1 Probability of occurrence of motion vectors in ‘Carphone’ sequence

Vector    Probability P    log2(1/P)
-1.5      0.014            6.16
-1        0.024            5.38
-0.5      0.117            3.10
0         0.646            0.63
0.5       0.101            3.31
1         0.027            5.21
1.5       0.016            5.97

Figure 8.4 Distribution of motion vector values (probability against MVX or MVY; ‘Carphone’
solid line, ‘Claire’ dotted line)

To achieve optimum compression, each value should be represented with exactly log2(1/P)
bits. The vector probabilities are shown graphically in Figure 8.4 (the solid line). ‘0’ is the
most common value and the probability drops sharply for larger motion vectors. (Note that
there are a small number of vectors larger than +/-1.5 and so the probabilities in the table do
not sum to 1.)

1. Generating the Huffman code tree

To generate a Huffman code table for this set of data, the following iterative procedure is
carried out (we will ignore any vector values that do not appear in Table 8.1):

1. Order the list of data in increasing order of probability.
2. Combine the two lowest-probability data items into a ‘node’ and assign the joint
probability of the data items to this node.
3. Reorder the remaining data items and node(s) in increasing order of probability and
repeat step 2.

Figure 8.5 Generating the Huffman code tree: ‘Carphone’ motion vectors

The procedure is repeated until there is a single ‘root’ node that contains all other nodes and
data items listed ‘beneath’ it. This procedure is illustrated in Figure 8.5.

- Original list: The data items are shown as square boxes. Vectors (-1.5) and (1.5) have
the lowest probability and these are the first candidates for merging to form node ‘A’.
- Stage 1: The newly created node ‘A’ (shown as a circle) has a probability of 0.03 (from
the combined probabilities of (-1.5) and (1.5)) and the two lowest-probability items are
vectors (-1) and (1). These will be merged to form node ‘B’.
- Stage 2: A and B are the next candidates for merging (to form ‘C’).
- Stage 3: Node C and vector (0.5) are merged to form ‘D’.
- Stage 4: (-0.5) and D are merged to form ‘E’.
- Stage 5: There are two ‘top-level’ items remaining: node E and the highest-probability
vector (0). These are merged to form ‘F’.
- Final tree: The data items have all been incorporated into a binary ‘tree’ containing seven
data values and six nodes. Each data item is a ‘leaf’ of the tree.

2. Encoding

Each ‘leaf’ of the binary tree is mapped to a VLC. To find this code, the tree is ‘traversed’
from the root node (F in this case) to the leaf (data item). For every branch, a 0 or 1 is
appended to the code: 0 for an upper branch, 1 for a lower branch (shown in the final tree of
Figure 8.5). This gives the following set of codes (Table 8.2). Encoding is achieved by
transmitting the appropriate code for each data item. Note that once the tree has been
generated, the codes may be stored in a look-up table.
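The iterative merging procedure maps naturally onto a priority queue. Below is a Python sketch (not the book’s code; the exact bit patterns depend on tie-breaking and on which branch is labelled 0, but the code lengths agree with Table 8.2):

import heapq

def build_huffman_codes(probs):
    # repeatedly merge the two lowest-probability items; prepend a
    # bit to the code of every symbol in each merged group
    heap = [(p, [s]) for s, p in probs.items()]
    heapq.heapify(heap)
    codes = {s: '' for s in probs}
    while len(heap) > 1:
        p0, group0 = heapq.heappop(heap)    # lowest probability
        p1, group1 = heapq.heappop(heap)    # next lowest
        for s in group0:
            codes[s] = '0' + codes[s]
        for s in group1:
            codes[s] = '1' + codes[s]
        heapq.heappush(heap, (p0 + p1, group0 + group1))
    return codes

carphone = {-1.5: 0.014, -1: 0.024, -0.5: 0.117, 0: 0.646,
            0.5: 0.101, 1: 0.027, 1.5: 0.016}
codes = build_huffman_codes(carphone)
for v in sorted(carphone, key=lambda s: len(codes[s])):
    print(v, codes[v])   # 0: 1 bit; -0.5: 2 bits; 0.5: 3 bits; rest: 5 bits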

Table 8.2 Huffman codes: ‘Carphone’ motion vectors

Vector    Code      Bits (actual)    Bits (ideal)
0         1         1                0.63
-0.5      00        2                3.10
0.5       011       3                3.31
-1.5      01000     5                6.16
1.5       01001     5                5.97
-1        01010     5                5.38
1         01011     5                5.21

Note the following points:

1. High-probability data items are assigned short codes (e.g. 1 bit for the most common
vector ‘0’). However, the vectors (-1.5, 1.5, -1, 1) are each assigned 5-bit codes
(despite the fact that -1 and 1 have higher probabilities than -1.5 and 1.5). The lengths
of the Huffman codes (each an integral number of bits) do not match the ‘ideal’ lengths
given by log2(1/P).
2. No code contains any other code as a prefix, i.e. reading from the left-hand bit, each code
is uniquely decodable.

For example, the series of vectors (1, 0, 0.5) would be transmitted as the concatenated bit
sequence 01011 1 011.

3. Decoding

In order to decode the data, the decoder must have a local copy of the Huffman code tree (or
look-up table). This may be achieved by transmitting the look-up table itself, or sending the
list of data and probabilities, prior to sending the coded data. Each uniquely decodable code
may then be read and converted back to the original data. Following the example above:

01011 is decoded as (1)
1 is decoded as (0)
011 is decoded as (0.5)
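Because no code is the prefix of another, a software decoder can simply accumulate bits until they match a codeword. A minimal Python sketch using the codes of Table 8.2 (no error handling):

def vlc_decode(bits, code_table):
    # the prefix property guarantees that the first match is correct
    inverse = {c: s for s, c in code_table.items()}
    symbols, buf = [], ''
    for b in bits:
        buf += b
        if buf in inverse:
            symbols.append(inverse[buf])
            buf = ''
    return symbols

table_8_2 = {0: '1', -0.5: '00', 0.5: '011', -1.5: '01000',
             1.5: '01001', -1: '01010', 1: '01011'}
print(vlc_decode('010111011', table_8_2))   # [1, 0, 0.5]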

Example: Huffman coding, ‘Claire’ motion vectors

Repeating the process described above for the video sequence ‘Claire’ gives a different
result. This sequence contains less motion than ‘Carphone’ and so the vectors have a
different distribution (shown in Figure 8.4, dotted line). A much higher proportion of vectors
are zero (Table 8.3).
The corresponding Huffman tree is given in Figure 8.6. Note that the ‘shape’ of the tree
has changed (because of the distribution of probabilities) and this gives a different set of

Table 8.3 Probability of occurrence of motion vectors in ‘Claire’ sequence

Vector    Probability P    log2(1/P)
-1.5      0.001            9.66
-1        0.003            8.38
-0.5      0.018            5.80
0         0.953            0.07
0.5       0.021            5.57
1         0.003            8.38
1.5       0.001            9.66

Figure 8.6 Huffman tree for ‘Claire’ motion vectors

Huffman codes (shown in Table 8.4). There are still six nodes in the tree, one less than the
number of data items (seven): this is always the case with Huffman coding.
If the probability distributions are accurate, Huffman coding provides a relatively compact
representation of the original data. In these examples, the frequently occurring (0) vector is
represented very efficiently as a single bit. However, to achieve optimum compression, a

Table 8.4 Huffman codes: ‘Claire’ motion vectors

Vector    Code      Bits (actual)    Bits (ideal)
0         1         1                0.07
0.5       00        2                5.57
-0.5      011       3                5.80
1         0100      4                8.38
-1        01011     5                8.38
-1.5      010100    6                9.66
1.5       010101    6                9.66

separate code table is required for each of the two sequences ‘Carphone’ and ‘Claire’. The
loss of potential compression efficiency due to the requirement for integral length codes is
very obvious for vector ‘0’ in the ‘Claire’ sequence: the optimum number of bits (information
content) is 0.07 but the best that can be achieved with Huffman coding is 1 bit.

8.3.2 Modified Huffman Coding

The Huffman coding process described above has two disadvantages for a practical video
CODEC. First, the decoder must use the same codeword set as the encoder. This means that
the encoder needs to transmit the information contained in the probability table before the
decoder can decode the bit stream, an extra overhead that reduces compression efficiency.
Second, calculating the probability table for a large video sequence (prior to generating the
Huffman tree) is a significant computational overhead and cannot be done until after the
video data is encoded. For these reasons, the image and video coding standards define sets of
codewords based on the probability distributions of a large range of video material. Because
the tables are ‘generic’, compression efficiency is lower than that obtained by pre-analysing
the data to be encoded, especially if the sequence statistics differ significantly from the
‘generic’ probability distributions. The advantage of not having to calculate and transmit
individual probability tables usually outweighs this disadvantage. (Note: Annex C of the
original JPEG standard supports individually calculated Huffman tables, but most practical
implementations use the ‘typical’ Huffman tables provided in Annex K of the standard.)

8.3.3 Table Design

The following two examples of VLC table design are taken from the H.263 and MPEG-4
standards. These tables are required for H.263 ‘baseline’ coding and MPEG-4 ‘short video
header’ coding.

H.263/MPEG-4 transform coefficients (TCOEF)

H.263 and MPEG-4 use ‘three-dimensional’ coding of quantised coefficients, where each
codeword represents a combination of (run, level, last) as described in Section 8.2.1. A
total of 102 specific combinations of (run, level, last) have VLCs assigned to them. Table 8.5
shows 26 of these codes.
A further 76 VLCs are defined, each up to 13 bits long. Note that the last bit of each
codeword is the sign bit ‘s’, indicating the sign of the decoded coefficient (0 = positive,
1 = negative). Any (run, level, last) combination that is not listed in the table is coded using
an escape sequence: a special ESCAPE code (0000011) followed by a 13-bit fixed length
code describing the values of run, level and last.
The codes shown in Table 8.5 are represented in ‘tree’ form in Figure 8.7. A codeword
containing a run of more than eight zeros is not valid, so any codeword starting with
000000000... indicates an error in the bit stream (or possibly a start code, which begins
with a long sequence of zeros, occurring at an unexpected position in the sequence). All
other sequences of bits can be decoded as valid codes. Note that the smallest codes are

Table 8.5 H.263/MPEG-4 transform coefficient (TCOEF) VLCs (partial: all codes under 9 bits)

Last    Run    Level    Code
0       0      1        10s
0       1      1        110s
0       2      1        1110s
0       0      2        1111s
1       0      1        0111s
0       3      1        01101s
0       4      1        01100s
0       5      1        01011s
0       0      3        010101s
0       1      2        010100s
0       6      1        010011s
0       7      1        010010s
0       8      1        010001s
0       9      1        010000s
1       1      1        001111s
1       2      1        001110s
1       3      1        001101s
1       4      1        001100s
0       0      4        0010111s
0       10     1        0010110s
0       11     1        0010101s
0       12     1        0010100s
1       5      1        0010011s
1       6      1        0010010s
1       7      1        0010001s
1       8      1        0010000s
ESCAPE                  0000011s
...                     ...

allocated to short runs and small levels (e.g. code ‘10’ represents a run of 0 and a level of
+/-1), since these occur most frequently.
H.263/MPEG-4 motion vector difference (MVD)

The H.263/MPEG-4 differentially coded motion vectors (MVD) described in Section 8.2.2
are each encoded as a pair of VLCs, one for the x-component and one for the y-component.
Part of the table of VLCs is shown in Table 8.6 and in ‘tree’ form in Figure 8.8. A further
49 codes (8-13 bits long) are not shown here. Note that the shortest codes represent small
motion vector differences (e.g. MVD = 0 is represented by the single-bit code ‘1’).

H.26L universal VLC (UVLC)

The emerging H.26L standard takes a step away from individually calculated Huffman tables
by using a ‘universal’ set of VLCs for any coded element. Each codeword is generated from

Figure 8.7 H.263/MPEG-4 TCOEF VLCs (code tree: each leaf is a (last, run, level) value from
Table 8.5, together with the ESCAPE code 0000011 and an error branch for codewords beginning
000000000)



Table 8.6 H.263/MPEG-4 motion vector difference (MVD) VLCs

MVD     Code
0       1
+0.5    010
-0.5    011
+1      0010
-1      0011
+1.5    00010
-1.5    00011
+2      0000110
-2      0000111
+2.5    00001010
-2.5    00001011
+3      00001000
-3      00001001
+3.5    00000110
-3.5    00000111
...     ...

the following systematic list:

1
0 x0 1
0 x1 0 x0 1
0 x2 0 x1 0 x0 1
...

where xk is a single bit. Hence there is one 1-bit codeword, two 3-bit codewords, four 5-bit
codewords, eight 7-bit codewords, and so on. Table 8.7 shows the first 12 codes and these are
represented in tree form in Figure 8.9. The highly regular structure of the set of codewords
can be seen in this figure.
Any data element to be coded (transform coefficients, motion vectors, block patterns, etc.)
is assigned a code from the list of UVLCs. The codes are not optimised for a specific data
element (since the same set of codes is used for all elements): however, the uniform, regular
structure considerably simplifies encoder and decoder design since the same methods can be
used to encode or decode any data element.
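Because of this regularity, a UVLC can be generated on the fly rather than stored. The following Python sketch reproduces the codewords of Table 8.7; the general rule (k info bits interleaved with leading zeros and terminated by a ‘1’) is inferred from the table, not quoted from the standard:

def uvlc(index):
    # find the group containing 'index': groups hold 1, 2, 4, 8...
    # codewords of lengths 1, 3, 5, 7... bits
    k, base = 0, 0
    while index >= base + (1 << k):
        base += 1 << k
        k += 1
    info = index - base          # the k info bits x(k-1)..x0
    bits = ''
    for i in range(k - 1, -1, -1):
        bits += '0' + str((info >> i) & 1)
    return bits + '1'

print([uvlc(i) for i in range(8)])
# ['1', '001', '011', '00001', '00011', '01001', '01011', '0000001']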

8.3.4 Entropy Coding Example

This example follows the process of encoding and decoding a block of quantised coefficients
in an MPEG-4 inter-coded picture. Only six non-zero coefficients remain in the block: this

Table 8.7 H.26L universal VLCs

Index    x2    x1    x0    Codeword
0        -     -     -     1
1        -     -     0     001
2        -     -     1     011
3        -     0     0     00001
4        -     0     1     00011
5        -     1     0     01001
6        -     1     1     01011
7        0     0     0     0000001
8        0     0     1     0000011
9        0     1     0     0001001
10       0     1     1     0001011
11       1     0     0     0100001
...      ...   ...   ...   ...

Figure 8.8 H.263/MPEG-4 MVD VLCs (code tree for Table 8.6; the remaining 49 codes continue
the longest branches)



Figure 8.9 H.26L universal VLCs (code tree for Table 8.7, showing the regular structure of the
codeword set)



would be characteristic of either a highly compressed block or a block that has been
efficiently predicted by motion estimation.

Quantised DCT coefficients (empty cells are ‘0’):

Zigzag reordered coefficients:

4, -1, 0, 2, -3, 0, 0, 0, 0, 0, -1, 0, 0, 0, 1, 0, 0 ...

TCOEF variable length codes (from Table 8.5; note that the last bit of each code is the sign):

00101110; 101; 0101000; 0101011; 010111; 0011010

Transmitted bit sequence:

00101110101010100001010110101110011010

Decoding of this sequence proceeds as follows. The decoder ‘steps’ through the TCOEF tree
(shown in Figure 8.7) until it reaches the ‘leaf’ 0010111. The next bit (0) is decoded as the
sign and the (last, run, level) group (0, 0, 4) is obtained. The steps taken by the decoder for
this first coefficient are highlighted in Figure 8.10. The process is repeated with the ‘leaf’ 10
followed by sign (1) and so on until a ‘last’ coefficient is decoded. The decoder can now fill
the coefficient array and reverse the zigzag scan to restore the array of 8 x 8 quantised
coefficients.

8.3.5 Variable Length Encoder Design

Software design

A general approach to variable-length encoding in software is as follows:



Figure 8.10 Decoding of codeword 0010111s (path through the TCOEF tree to the leaf (0, 0, 4))

for each data symbol
    find the corresponding VLC value and length (in bits)
    pack this VLC into an output register R
    if the contents of R exceed L bytes
        write L (least significant) bytes to the output stream
        shift R by L bytes

Example

Using the entropy encoding example above, L = 1 byte and R is empty at the start of encoding.

The following packed bytes are written to the output stream: 00101110, 01000101,
10101101, 00101110. At the end of the above sequence, the output register R still contains
6 bits (001101). If encoding stops here, it will be necessary to ‘flush’ the contents of R to
the output stream.
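A Python sketch of this packing scheme (assuming, as the byte values above imply, that each VLC is placed above the bits already held in R and that bytes are emitted from the low end of the register):

def pack_vlcs(codes, L=1):
    R, nbits, out = 0, 0, []
    for vlc in codes:
        R |= int(vlc, 2) << nbits      # pack this VLC into register R
        nbits += len(vlc)
        while nbits >= 8 * L:          # contents exceed L bytes?
            out.append(R & 0xFF)       # write least significant byte
            R >>= 8
            nbits -= 8
    return out, R, nbits               # R may still need flushing

codes = ['00101110', '101', '0101000', '0101011', '010111', '0011010']
out, R, nbits = pack_vlcs(codes)
print([format(b, '08b') for b in out])
# ['00101110', '01000101', '10101101', '00101110']
print(format(R, '06b'), nbits)         # 001101 6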
The MVD codes listed in Table 8.6 can be stored in a simple look-up table. Only 64 valid
MVD values exist and the contents of the look-up table are as follows:

[index] [vlc] [length]

where [index] is a number in the range 0...63 that is derived directly from MVD, [vlc] is the
variable length code ‘padded’ with zeros and represented with a fixed number of bits (e.g. 16
or 32 bits) and [length] indicates the number of bits present in the variable length code.
Converting (last, run, level) into the TCOEF VLCs listed in Table 8.5 is slightly more
problematic. The 102 predetermined combinations of (last, run, level) have individual VLCs
assigned to them (these are the most commonly occurring combinations) and any other
combination must be converted to an Escape sequence. The problem is that there are many
more possible combinations of (last, run, level) than there are individual VLCs. ‘Run’ may
take any value between 0 and 62; ‘Level’ any value between 1 and 127; and ‘Last’ is 0 or 1.
This gives 16 002 possible combinations of (last, run, level). Three possible approaches to
finding the VLC are as follows:

1. Large look-up table indexed by (last, run, level). The size of this table may be reduced
somewhat because only levels in the range 1-12 and runs in the range 0-39 have
individual VLCs. The look-up procedure is as follows:

if (|level| < 13 and run < 40)
    look up table based on (last, run, level)
    return individual VLC or calculate Escape sequence
else
    calculate Escape sequence

The look-up table has (2 x 40 x 12) = 960 entries; 102 of these contain individual VLCs
and the remaining 858 contain a flag indicating that an Escape sequence is required.
2. Partitioned look-up tables indexed by (last, run, level). Based on the values of last, run and
level, choose a smaller look-up table (e.g. a table that only applies when last = 0). This
requires one or more comparisons before choosing the table but allows the large table to be
split into a number of smaller tables with fewer entries overall. The procedure is as follows:

if (last, run, level) ∈ {set A}
    look up table A
    return VLC or calculate Escape sequence
else if (last, run, level) ∈ {set B}
    look up table B
    return VLC or calculate Escape sequence
...
else
    calculate Escape sequence

For example, earlier versions of the H.263 ‘test model’ software used this approach to
reduce the number of entries in the partitioned look-up tables to 200 (i.e. 102 valid VLCs
and 98 ‘empty’ entries).
3. Conditional expression for every valid combination of (last, run, level). For example:

switch (last, run, level)
    case {A}: vlc = vA, length = lA
    case {B}: vlc = vB, length = lB
    ... (100 more cases) ...
    default: calculate Escape sequence

Comparing the three methods, method 1 lends itself to compact code, is easy to modify (by
changing the look-up table contents) and is likely to be computationally efficient; however, it
requires a large look-up table, most of which is redundant. Method 3, at the other extreme,
requires the most code and is the most difficult to change (since each valid combination is
‘hand-coded’) but requires the least data storage. On some platforms it may be the slowest
method. Method 2 offers a compromise between the other two methods.

Hardware design

A hardware architecture for variable length encoding performs similar tasks to those
described above and an example is shown in Figure 8.11 (based on a design proposed by
Lei and Sun [2]). A ‘look-up’ unit finds the length and value of the appropriate VLC and passes
these to a ‘pack’ unit. The pack unit collects together a fixed number of bits (e.g. 8, 16 or
32 bits) and shifts these out to a stream buffer. Within the ‘pack’ unit, a counter records the
number of bits in the output register. When this counter overflows, a data word is output (as
in the example above) and the remaining upper bits in the output register are shifted down.
The design of the look-up unit is critical to the size, efficiency and adaptability of the
design. Options range from a ROM or RAM-based look-up table containing all valid codes
plus ‘dummy’ entries indicating that an Escape sequence is required, to a ‘hard-wired’
approach (similar to the ‘switch’ statement described above) in which each valid combina-
tion is mapped to the appropriate VLC and length fields. This approach is sometimes
described as a ‘programmable logic array’ (PLA) look-up table. Another example of a
hardware VLE is presented elsewhere [3].

Figure 8.11 Hardware VLE (VLC look-up unit and pack unit producing a byte or word stream)



8.3.6 Variable Length Decoder Design

Software design

The operation of a decoder for VLCs can be summarised as follows:

scan through bits in an input buffer
if a valid VLC (length L) is detected
    remove L bits from buffer
    return corresponding data unit
if an invalid VLC is detected
    return an error flag

Perhaps the most straightforward way of finding a valid VLC is to step through the relevant
Huffman code tree. For example, an H.263/MPEG-4 TCOEF code may be decoded by
stepping through the tree shown in Figure 8.7, starting from the left:

if (first bit = 1)
    if (second bit = 1)
        if (third bit = 1)
            if (fourth bit = 1)
                return (0, 0, 2)
            else
                return (0, 2, 1)
        else
            return (0, 1, 1)
    else
        return (0, 0, 1)
else
    ... decode all VLCs starting with 0

This approach requires a large nested if...else statement (or equivalent) that can deal with
104 cases (102 unique TCOEF VLCs, one escape code, plus an error condition). This
method leads to a large code size, may be slow to execute and is difficult to modify (because
the Huffman tree is ‘hand-coded’ into the software); however, no extra look-up tables are
required.
An alternative is to use one or more look-up tables. The maximum length of TCOEF VLC
(excluding the sign bit and escape sequences) is 13 bits. We can construct a look-up table
whose index is a 13-bit number (the 13 lsbs of the input stream). Each entry of the table
contains either a (last, run, level) triplet or a flag indicating Escape or Error; 2^13 = 8192
entries are required, most of which will be duplicates of other entries. For example, every
code beginning with ‘10...’ (starting with the lsb) decodes to the triplet (0, 0, 1).
An initial test of the range of the 13-bit number may be used to select one of a number of
smaller look-up tables. For example, the H.263 reference model decoder described earlier
breaks the table into three smaller tables containing around 300 entries (about 200 of which
are duplicate entries).
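A Python sketch of building such a table (the tcoef fragment is a hypothetical subset of Table 8.5 without the sign bit, and the table is indexed MSB-first for clarity, whereas the scheme described above indexes from the lsb):

def build_decode_table(vlc_table, K=13):
    # every K-bit word starting with a given code maps to the same
    # symbol, so entries sharing a prefix are duplicates
    table = [None] * (1 << K)          # None marks an invalid prefix
    for code, symbol in vlc_table.items():
        pad = K - len(code)
        base = int(code, 2) << pad
        for i in range(1 << pad):
            table[base + i] = (symbol, len(code))
    return table

tcoef = {'10': (0, 0, 1), '110': (0, 1, 1),
         '1110': (0, 2, 1), '1111': (0, 0, 2)}
table = build_decode_table(tcoef)
word = int('1110' + '0' * 9, 2)        # next 13 bits of the stream
print(table[word])                     # ((0, 2, 1), 4): shift by 4 bits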

Figure 8.12 Hardware VLD (input bitstream, shift register and ‘Find VL code’ unit; the decoded
code length is fed back to shift the input by one or more bits)

The choice of algorithm may depend on the capabilities of the software platform. If
memory is plentiful and array access relatively fast, a large look-up table may be the best
approach for speed and flexibility. If memory is limited and/or array access is slow, better
performance may be achieved with an ‘if...else’ approach or a partitioned look-up table.
Whichever approach is chosen, VL decoding requires a significant amount of bit-level
processing and for many processors this makes it a computationally expensive function. An
interesting development in recent years has been the emergence of dedicated hardware
assistance for software VL decoding. The Philips TriMedia and Equator/Hitachi MAP
platforms, for example, contain dedicated variable length decoder (VLD) co-processors that
automatically decode VL data in an input buffer, relieving the main processor of the burden
of variable length decoding.

Hardware design

Hardware designs for variable length decoding fall into two categories: (a) those that decode
n bits from the input stream every m cycles (e.g. decoding 1 or 2 bits per cycle) and (b) those
that decode n complete VL codewords every m cycles (e.g. decoding 1 codeword in one or
two cycles). The basic architecture of a decoder is shown in Figure 8.12 (the dotted line
‘code length L’ is only required for category (b) decoders).

Category (a), n bits per m cycles. This type of decoder follows through the Huffman
decoding tree. The simplest design processes one level of the tree every cycle: this is
analogous to the large ‘if...else’ statement described above. The shift register shown in
Figure 8.12 shifts 1 bit per cycle to the ‘Find VL code’ unit. This unit steps through the tree
(based on the value of each input bit) until a valid code (a ‘leaf’) is found and can be
implemented with a finite state machine (FSM) architecture. For example, Table 8.8 lists part
of the FSM for the TCOEF tree shown in Figure 8.7. Each state corresponds to a node of the
Huffman tree and the nodes in the table are labelled (with circles) in Figure 8.13 for
convenience.
There are 102 nodes (and hence 102 states in the FSM) and 103 output values. To decode
l 1 10, for example, the decoder traces the following sequence:

State 0 + State 2 + State 5 + State 6 + output (0, 2, 1)

Hence the decoder processes 1 bit per cycle (assuming that a state transition occursper clock
cycle).

Table 8.8 Part of state table for TCOEF decoding

State    Input    Next state or output
0        0        State 1 (start of the ‘0’ subtree)
0        1        State 2
2        0        Output (0, 0, 1)
2        1        State 5
5        0        Output (0, 1, 1)
5        1        State 6
6        0        Output (0, 2, 1)
6        1        Output (0, 0, 2)
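This part of the state table can be written directly as a transition map. A Python sketch (only the ‘1’ branch of the tree is included, enough to reproduce the trace above; the ‘0’ subtree states are omitted):

FSM = {
    0: {'0': 1, '1': 2},               # state 1 starts the '0' subtree
    2: {'0': (0, 0, 1), '1': 5},
    5: {'0': (0, 1, 1), '1': 6},
    6: {'0': (0, 2, 1), '1': (0, 0, 2)},
}

def fsm_decode(bits):
    state = 0
    for b in bits:                     # one state transition per cycle
        state = FSM[state][b]
        if isinstance(state, tuple):   # reached a leaf: output symbol
            return state
    raise ValueError('incomplete codeword')

print(fsm_decode('1110'))              # (0, 2, 1)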

This type of decoder has the disadvantage that the processing rate depends on the
(variable) rate of the coded stream. It is often more useful to be capable of processing one or
more complete VLCs per clock cycle (for example, to guarantee a certain codeword
throughput), and this leads to the second category of decoder design.

Category (b), n codewords per m cycles. This is analogous to the ‘large look-up table’
approach in a software decoder. K bits (stored in the input shift register) are examined per
cycle, where K is the largest possible VLC size (13, excluding the sign bit, in the example of
H.263/MPEG-4 TCOEF). The ‘Find VL code’ unit in Figure 8.12 checks all combinations of
K bits and finds a matching valid code, Escape code or flags an error. The length of the
matching code (L bits) is fed back and the shift register shifts the input data by L bits (i.e.
L bits are removed from the input buffer). Hence a complete L-bit codeword can be
processed in one cycle.
The shift register can be implemented using a barrel shifter (a shift-register circuit that
shifts its contents by L places in one cycle). The ‘Find VL code’ unit may be implemented
using logic (a PLA). The logic array should minimise effectively since most of the possible
input combinations are ‘don’t cares’. In the TCOEF example, all 13-bit input words
‘10XXXXXXXXXXX’ map to the output (0, 0, 1). It is also possible to implement this
unit as a ROM or RAM look-up table with 2^13 entries.
A decoder that decodes one codeword per cycle is described by Lei and Sun [2], and Chang
and Messerschmitt [4] examine the principles of concurrent VLC decoding. Further examples
of VL decoders can be found elsewhere [5, 6].

8.3.7 Dealing with Errors

An error during transmission may cause the decoder to lose synchronisation with the
sequence of VLCs and this in turn can cause incorrect decoding of subsequent VLCs. These

Figure 8.13 Part of TCOEF tree showing state labels (the path through states 0, 2, 5 and 6 leads to
the leaves 1110 = (0, 2, 1) and 1111 = (0, 0, 2))

decoding errors may continue to occur (propagate) until a resynchronisation point occurs in
the bit stream. The synchronisation markers described in Section 8.2.2 limit the propagation
of errors at the decoder. Increasing the frequency of synchronisation markers in the bit
stream can reduce the effect of an error on the decoded image: however, markers are
‘redundant’ overhead and so this also reduces compression efficiency. Transmission errors
and their effect on coded video are discussed further in Chapter 11.
Error-resilient alternatives to modified Huffman codes have been proposed. For example,
MPEG-4 (video) includes an option to use reversible variable length codes (RVLCs), a class
of codewords that may be successfully decoded in either a forward or backward direction
from a resynchronisation point. When an error occurs, it is usually detectable by the decoder
(since a serious decoder error is likely to violate the encoding syntax). The decoder can
decode the current section of data in both directions, forward from the previous synchro-
nisation point and backward from the next synchronisation point. Figure 8.14 shows an
example. Region (a) is decoded and then an error is identified. The decoder ‘skips’ to the

Figure 8.14 Decoding with RVLCs when an error is detected (region (a) decoded forward from the
header; region (b) decoded backward from the next synchronisation marker)

next resynchronisation point and decodes backwards from there to recover region (b).
Without RVLCs, all of region (b) would be lost.
An interesting recent development is the use of ‘soft-decision’ decoding of VLCs, utilising
information available from the communications receiver about the probability of error in
each codeword to improve decoding performance in the presence of channel noise [7-9].

8.4 ARITHMETIC CODING


Entropy coding schemes based on codewords that are an integral number of bits long (such
as Huffman coding or UVLCs) cannot achieve optimal compression of every set of data.
This is because the theoretical optimum number of bits to represent a data symbol isusually
a fraction (rather than an integer). This optimum number of bits is the 'information content'
logz( U P ) , where P is the probability of occurrence of each data symbol. In Table 8.1, for
example,the motionvector '0.5' should be representedwith 3.31 bits formaximum
compression.Huffman coding producesa5-bitcodewordfor this motionvector and so
the compressed bit stream is likely to be larger than the theoretical maximally compressed
bit stream.
Arithmetic coding provides a practical alternative to Huffman coding and can more
closely approach the theoretical maximum compression [10]. An arithmetic encoder converts a
sequence of data symbols into a single fractional number. The longer the sequence of
symbols, the greater the precision required to represent the fractional number.

Example

Table 8.9 lists five motion vector values (-2, -1, 0, 1, 2). The probability of occurrence of
each vector is listed in the second column. Each vector is assigned a subrange within the

Table 8.9 Subranges

Vector    Probability P    log2(1/P)    Subrange
-2        0.1              3.32         0-0.1
-1        0.2              2.32         0.1-0.3
0         0.4              1.32         0.3-0.7
1         0.2              2.32         0.7-0.9
2         0.1              3.32         0.9-1.0

range 0.0-1.0, depending on its probability of occurrence. In this example, (-2) has a
probability of 0.1 and is given the subrange 0-0.1 (i.e. the first 10% of the total range 0-1.0).
(-1) has a probability of 0.2 and is given the next 20% of the total range, i.e. the subrange
0.1-0.3. After assigning a subrange to each vector, the total range 0-1.0 has been ‘divided’
amongst the data symbols (the vectors) according to their probabilities. The subranges are
illustrated in Figure 8.15.
The encoding procedure is presented below, alongside a worked example for the sequence
of vectors: (0, -1, 0, 2).

Encoding procedure

Subrange Range
Encoding
procedure (L + H) Symbol (L + H) Notes
1. Set the initial range 0 + 1.0
2. For the first data (0) 0.3 + 0.7
symbol, find the
corresponding subrange
(low to high).
3. Set the new range (1) 0.3 + 0.7
to this subrange
4. For the next data symbol, (- 1)0.1 + 0.3 This is the subrange
find the subrange L to H within the interval 0-1
5. Set the new range (2) to 0.34 + 0.42 0.34 is 10% of the range;
this subrange within the 0.42 is 30% of the range
previous range
6. Find the next subrange (0) 0.3 + 0.7
7. Set the new range (3) 0.364 + 0.396 0.364 is 30% of the range;
within the previous range 0.396 is 70% of the range
8. Find the next subrange (2)
0.9 + 1.0
9. Set the new range (4) 0.3928 + 0.396 0.3928 is 90%of the range;
within the previous range 0.396 is 100% of the range

Each time a symbol is encoded, the range (L to H) becomes progressively smaller. At the end
of the encoding process (four steps in this example), we are left with a final range (L to H).
The entire sequence of data symbols can be fully represented by transmitting a fractional
number that lies within this final range. In the example above, we could send any number in

Figure 8.15 Subranges assigned across the total range 0-1

the range 0.3928-0.396: for example, 0.394. Figure 8.16 shows how the initial range (0-1) is
progressively partitioned into smaller ranges as each data symbol is processed. After
encoding the first symbol (vector 0), the new range is (0.3, 0.7). The next symbol (vector -1)
selects the subrange (0.34, 0.42) which becomes the new range, and so on. The final symbol
(vector +2) selects the subrange (0.3928, 0.396) and the number 0.394 (falling within this
range) is transmitted. 0.394 can be represented as a fixed-point fractional number using 9
bits, i.e. our data sequence (0, -1, 0, 2) is compressed to a 9-bit quantity.

Decoding procedure

The sequence of subranges (and hence the sequence of data symbols) can be decoded from
this number as follows.

1. Set the initial range: 0 → 1.
2. Find the subrange in which the received number falls: 0.3 → 0.7. This indicates the first
data symbol, (0).
3. Set the new range (1) to this subrange: 0.3 → 0.7.
4. Find the subrange of the new range in which the received number falls: 0.34 → 0.42.
This indicates the second data symbol, (-1).
5. Set the new range (2) to this subrange within the previous range: 0.34 → 0.42.
6. Find the subrange in which the received number falls, 0.364 → 0.396, and decode the
third data symbol, (0).
7. Set the new range (3) to this subrange within the previous range: 0.364 → 0.396.
8. Find the subrange in which the received number falls, 0.3928 → 0.396, and decode the
fourth data symbol, (2).

The principal advantage of arithmetic coding is that the transmitted number (0.394 in this
case, which can be represented as a fixed-point number with sufficient accuracy using 9 bits)
is not constrained to an integral number of bits for each transmitted data symbol. To achieve
optimal compression, the sequence of data symbols should be represented with a total of
log2(1/P1) + log2(1/P2) + ... bits: in this case, 1.32 + 2.32 + 1.32 + 3.32 = 8.28 bits.
Arithmetic coding achieves 9 bits, which is close to this optimum. A scheme using an
integral number of bits for each data symbol (such as Huffman coding) would not come so
close to the optimum number of bits and, in general, arithmetic coding can outperform
Huffman coding.
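The worked example can be reproduced in a few lines of Python. This is a floating-point sketch for clarity only: as Section 8.4.1 explains, practical CODECs use fixed-point arithmetic with incremental output, and the decoder must know the symbol count by some other means:

subrange = {-2: (0.0, 0.1), -1: (0.1, 0.3), 0: (0.3, 0.7),
            1: (0.7, 0.9), 2: (0.9, 1.0)}   # from Table 8.9

def arith_encode(symbols):
    low, high = 0.0, 1.0
    for s in symbols:                  # narrow the range per symbol
        s_low, s_high = subrange[s]
        span = high - low
        low, high = low + span * s_low, low + span * s_high
    return low, high                   # send any number in [low, high)

def arith_decode(x, n):
    symbols, low, high = [], 0.0, 1.0
    for _ in range(n):
        pos = (x - low) / (high - low)
        for s, (s_low, s_high) in subrange.items():
            if s_low <= pos < s_high:  # which subrange does x select?
                symbols.append(s)
                span = high - low
                low, high = low + span * s_low, low + span * s_high
                break
    return symbols

print(arith_encode([0, -1, 0, 2]))     # approximately (0.3928, 0.396)
print(arith_decode(0.394, 4))          # [0, -1, 0, 2]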

Figure 8.16 Arithmetic coding example

8.4.1 Implementation Issues

A number of practical issues need to be taken into account when implementing arithmetic
coding in software or hardware.

Probability distributions

As with Huffman coding, it is not always practical to calculate symbol probabilities prior to
coding. In several video coding standards (e.g. H.263, MPEG-4, H.26L), arithmetic coding is
provided as an optional alternative to Huffman coding and pre-calculated subranges
are defined by the standard (based on ‘typical’ probability distributions). This has the
advantage of avoiding the need to calculate and transmit probability distributions, but the
disadvantage that compression will be suboptimal for a video sequence that does not exactly
follow the standard probability distributions.

Termination

In our example, we stopped decoding after four steps. However, there is nothing contained in
the transmitted number (0.394) to indicate the number of symbols that must be decoded: it
could equally be decoded as three symbols or five. The decoder must determine when to stop
decoding by some other means. In the arithmetic coding option specified in H.263, for
example, the decoder can determine the number of symbols to decode according to the
syntax of the coded data. Decoding of transform coefficients in a block halts when an end-of-
block code is detected. Fixed-length codes (such as picture start code) are included in the bit
stream and these will ‘force’ the decoder to stop decoding (for example, if a transmission
error has occurred).

Fixed-point arithmetic

Floating-point binary arithmetic is generally less efficient than fixed-point arithmetic and
some processors do not support floating-point arithmetic at all. An efficient implementation
with fixed-point arithmetic can be achieved by specifying the subranges as fixed-precision
binary numbers. For example, in H.263, each subrange is specified as an unsigned 14-bit
integer (i.e. a total range of 0-16383). The subranges for the differential quantisation
parameter DQUANT are listed as an example:

H.263 DQUANT value    Subrange
2                     0-4094
1                     4095-8191
-1                    8192-12286
-2                    12287-16383

Incremental encoding

As more data symbols are encoded, the precision of the fractional number required to
represent the sequence increases. It is possible for the number to exceed the precision of the
processor after a relatively small number of data symbols and a practical arithmetic encoder
must take steps to ensure that this does not occur. This can be achieved by incrementally
encoding bits of the fractional number as they are identified by the encoder. In our example
above, after setting range (3), the range is 0.364-0.396. We know that the final fractional
number will begin with ‘0.3...’ and so we can send the most significant part (e.g. 0.3, or its
binary equivalent) without prejudicing the remaining calculations. At the same time, the
limits of the range are left-shifted to extend the range. In this way, the encoder incrementally
sends the most significant bits of the fractional number whilst continually readjusting the
boundaries of the range to avoid arithmetic overflow.
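A decimal-digit illustration of this renormalisation idea in Python (a sketch of the principle only: real encoders emit binary digits and must also handle the case where low and high straddle a digit boundary):

def renormalise(low, high, emit):
    # whenever low and high agree on their leading digit, that digit
    # of the final fractional number is settled: emit it and rescale
    while int(low * 10) == int(high * 10):
        digit = int(low * 10)
        emit(digit)
        low = low * 10 - digit
        high = high * 10 - digit
    return low, high

digits = []
low, high = renormalise(0.364, 0.396, digits.append)
print(digits, low, high)    # [3] 0.64... 0.96 (range rescaled)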

Patent issues

A number of patents have been filed that cover aspects of arithmetic encoding (such as
IBM’s ‘Q-coder’ arithmetic coding algorithm [11]). It is not entirely clear whether the
arithmetic coding algorithms specified in the image and video coding standards are covered
by patents. Some developers of commercial video coding systems have avoided the use of
arithmetic coding because of concerns about potential patent infringements, despite its
potential compression advantages.

8.5 SUMMARY
An entropy coder maps a sequence of data elements to a compressed bit stream, removing
statistical redundancy in the process. In a block transform-based video CODEC, the main

data elements are transform coefficients (run-level coded to efficiently represent sequences
of zero coefficients), motion vectors (which may be differentially coded) and header
information. Optimum compression requires the probability distributions of the data to be
analysed prior to coding; for practical reasons, video CODECs use standard pre-calculated
look-up tables for entropy coding.
The two most popular entropy coding methods for video CODECs are ‘modified’
Huffman coding (in which each element is mapped to a separate VLC) and arithmetic
coding (in which a series of elements are coded to form a fractional number). Huffman
encoding may be carried out using a series of table look-up operations; a Huffman decoder
identifies each VLC and this is possible because the codes are designed such that no code
forms the prefix of any other. Arithmetic coding is carried out by generating and encoding a
fractional number to represent a series of data elements.
This concludes the discussion of the main internal functions of a video CODEC (motion
estimation and compensation, transform coding and entropy coding). The performance of a
CODEC in a practical video communication system can often be dramatically improved by
filtering the source video (‘pre-filtering’) and/or the decoded video frames (‘post-filtering’).

REFERENCES
1. D. A. Huffman, ‘A method for the construction of minimum-redundancy codes’, Proceedings of
the Institute of Radio Engineers, 40(9), September 1952.
2. S. M. Lei and M.-T. Sun, ‘An entropy coding system for digital HDTV applications’, IEEE Trans.
CSVT, 1(1), March 1991.
3. Hao-Chieh Chang, Liang-Gee Chen, Yung-Chi Chang and Sheng-Chieh Huang, ‘A VLSI archi-
tecture design of VLC encoder for high data rate video/image coding’, 1999 IEEE International
Symposium on Circuits and Systems (ISCAS ’99).
4. S. F. Chang and D. Messerschmitt, ‘Designing high-throughput VLC decoder, Part I: concurrent
VLSI architectures’, IEEE Trans. CSVT, 2(2), June 1992.
5. J. Jeon, S. Park and H. Park, ‘A fast variable-length decoder using plane separation’, IEEE Trans.
CSVT, 10(5), August 2000.
6. B.-J. Shieh, Y.-S. Lee and C.-Y. Lee, ‘A high throughput memory-based VLC decoder with codeword
boundary prediction’, IEEE Trans. CSVT, 10(8), December 2000.
7. A. Kopansky and M. Bystrom, ‘Sequential decoding of MPEG-4 coded bit streams for error
resilience’, Proc. Conf. on Information Sciences and Systems, Baltimore, 1999.
8. J. Wen and J. Villasenor, ‘Utilizing soft information in decoding of variable length codes’, Proc.
IEEE Data Compression Conference, Utah, 1999.
9. S. Kaiser and M. Bystrom, ‘Soft decoding of variable-length codes’, Proc. IEEE International
Communications Conference, New Orleans, 2000.
10. I. Witten, R. Neal and J. Cleary, ‘Arithmetic coding for data compression’, Communications of the
ACM, 30(6), June 1987.
11. J. Mitchell and W. Pennebaker, ‘Optimal hardware and software arithmetic coding procedures for
the Q-coder’, IBM Journal of Research and Development, 32(6), November 1988.
