2 Transform Coding - KLT - Discrete
Transform coding uses transforms to obtain representations of the signal that aid
compression.
In particular, we look at transforms that represent signals in the frequency domain.
Humans perceive signals in only a narrow band of frequencies—spatial frequencies, in the
case of images, and temporal frequencies, in the case of audio.
We use these limitations to discard or deemphasize components of the signal that may
not be perceived, or not be as important for perception.
For signals with correlation structure, transforms let us find representations that
compact most of the signal energy into a few components.
We can then avoid expending resources on the components with little signal energy,
thus obtaining compression.
As an example, look at height and weight as the coordinates of a point in two-dimensional space.
Notice that the output values tend to cluster around the line y = 2.5x.
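We can rotate this set of values by the transformation
\[ \theta = A x \]
where $x = [x_0 \; x_1]^T$ holds the two coordinates, $A$ is the rotation matrix
\[ A = \begin{bmatrix} \cos\varphi & \sin\varphi \\ -\sin\varphi & \cos\varphi \end{bmatrix}, \]
$\varphi$ is the angle between the x-axis and the $y = 2.5x$ line, and
\[ \theta = \begin{bmatrix} \theta_0 \\ \theta_1 \end{bmatrix}. \]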
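The rotation can be tried on a few points to see the energy compaction. The sketch below is only illustrative: the five (height, weight) pairs are made-up values chosen to lie near y = 2.5x, and the names `points`, `rotate`, and `energy_theta1` are hypothetical.

```python
import math

# Hypothetical (height, weight) pairs lying close to the line y = 2.5x.
points = [(60.0, 152.0), (65.0, 160.0), (70.0, 177.0), (72.0, 178.0), (75.0, 190.0)]

# Angle between the x-axis and the line y = 2.5x.
phi = math.atan(2.5)

# Rotation matrix A (assumed form: rotation by phi).
A = [[math.cos(phi), math.sin(phi)],
     [-math.sin(phi), math.cos(phi)]]

def rotate(A, x):
    """Return theta = A x for a 2x2 matrix A and a 2-vector x."""
    return [A[0][0] * x[0] + A[0][1] * x[1],
            A[1][0] * x[0] + A[1][1] * x[1]]

thetas = [rotate(A, p) for p in points]

# Energy carried by each transformed coordinate, and by the original data.
energy_theta0 = sum(t[0] ** 2 for t in thetas)
energy_theta1 = sum(t[1] ** 2 for t in thetas)
total_energy = sum(x ** 2 + y ** 2 for x, y in points)

print("fraction of energy in theta_0:", energy_theta0 / total_energy)
print("fraction of energy in theta_1:", energy_theta1 / total_energy)
```

Because the rotation preserves total energy, almost all of it ends up in the first coordinate for data of this shape, so the second coordinate could be quantized coarsely or dropped with little error.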
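The transforms that we will deal with are linear transforms; that is, we can get the
sequence $\{\theta_n\}$ from the sequence $\{x_n\}$ as
\[ \theta_n = \sum_{i=0}^{N-1} x_i \, a_{n,i}. \]
The original sequence $\{x_n\}$ can be recovered from the transformed sequence $\{\theta_n\}$ via the
inverse transform:
\[ x_n = \sum_{i=0}^{N-1} \theta_i \, b_{n,i}. \]
In matrix form, with $[A]_{n,i} = a_{n,i}$ and $[B]_{n,i} = b_{n,i}$, these become
\[ \theta = A x, \qquad x = B \theta. \]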
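The forward and inverse sums can be written directly as matrix-vector products. This is a minimal sketch that assumes an orthonormal transform, so that B is simply the transpose of A; the 2-point matrix and the helper names `apply_transform` and `transpose` are illustrative choices, not something the slides specify.

```python
import math

def apply_transform(M, v):
    """Compute M v: output n is the sum over i of v[i] * M[n][i]."""
    return [sum(M[n][i] * v[i] for i in range(len(v))) for n in range(len(M))]

def transpose(M):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]

# Assumed example: a 2-point orthonormal transform (scaled sum and difference).
s = 1.0 / math.sqrt(2.0)
A = [[s, s],
     [s, -s]]
B = transpose(A)  # valid as the inverse only because A is orthonormal

x = [3.0, 5.0]
theta = apply_transform(A, x)      # forward transform: theta = A x
x_rec = apply_transform(B, theta)  # inverse transform: x = B theta

print(theta, x_rec)
```

For a transform that is not orthonormal, B would have to be computed as the matrix inverse of A instead.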
The discrete cosine transform (DCT) gets its name from the fact that the rows of the
N × N transform matrix C are obtained as a function of cosines:
\[
[C]_{i,j} =
\begin{cases}
\sqrt{\dfrac{1}{N}} \cos\dfrac{(2j+1)i\pi}{2N}, & i = 0, \; j = 0, 1, \ldots, N-1 \\[2ex]
\sqrt{\dfrac{2}{N}} \cos\dfrac{(2j+1)i\pi}{2N}, & i = 1, 2, \ldots, N-1, \; j = 0, 1, \ldots, N-1
\end{cases}
\]
The DCT is closely related to the discrete Fourier transform (DFT) and, in fact, can be
obtained from the DFT. However, in terms of compression, the DCT performs better than
the DFT.
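The DCT matrix can be built directly from its definition (scale factor sqrt(1/N) for row 0 and sqrt(2/N) for the other rows). The sketch below is a plain-Python illustration; the function names and the 8-sample ramp used as input are assumptions made for the example.

```python
import math

def dct_matrix(N):
    """Build the N x N DCT matrix C row by row from the cosine definition."""
    C = []
    for i in range(N):
        scale = math.sqrt(1.0 / N) if i == 0 else math.sqrt(2.0 / N)
        C.append([scale * math.cos((2 * j + 1) * i * math.pi / (2 * N))
                  for j in range(N)])
    return C

def apply_transform(M, v):
    """Compute the matrix-vector product M v."""
    return [sum(M[n][i] * v[i] for i in range(len(v))) for n in range(len(M))]

N = 8
C = dct_matrix(N)

# Hypothetical smooth input: a slowly increasing ramp.
x = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0]
theta = apply_transform(C, x)

# Share of the total energy captured by the first two coefficients.
energy = [t * t for t in theta]
fraction_in_first_two = (energy[0] + energy[1]) / sum(energy)
print("fraction of energy in first two coefficients:", fraction_in_first_two)
```

For a smooth input like this ramp, nearly all of the energy lands in the lowest-frequency coefficients, which is the compaction property that makes the DCT attractive for compression.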
The discrete Walsh-Hadamard transform (DWHT) matrix H can be obtained from the Hadamard matrix by
multiplying it by a normalizing factor so that $HH^T = I$ instead of $NI$ and by reordering
the rows in increasing order of sequency.
The sequency of a row is half the number of sign changes in that row.
Normalization involves multiplying the matrix by $\sqrt{1/N}$.
Because the matrix without the scaling factor consists only of ±1, the transform operation
consists simply of addition and subtraction.
This transform is therefore useful when computational power is limited.
However, the amount of energy compaction obtained is substantially less than that of the
DCT.
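The construction described above can be sketched as follows. The Hadamard matrix is built here with the Sylvester doubling construction, which is an assumption (the slides do not say how the Hadamard matrix is obtained) and requires N to be a power of two. Rows are sorted by their number of sign changes, which gives the same order as sorting by half that number.

```python
import math

def hadamard(N):
    """Sylvester construction of an N x N Hadamard matrix (N a power of two)."""
    H = [[1]]
    while len(H) < N:
        # Double the size: [[H, H], [H, -H]].
        H = [row + row for row in H] + [row + [-v for v in row] for row in H]
    return H

def sign_changes(row):
    """Count the positions where consecutive entries differ in sign."""
    return sum(1 for a, b in zip(row, row[1:]) if a != b)

def dwht_matrix(N):
    """Reorder Hadamard rows by sign changes, then scale by 1/sqrt(N)."""
    rows = sorted(hadamard(N), key=sign_changes)
    scale = 1.0 / math.sqrt(N)
    return [[scale * v for v in row] for row in rows]

N = 8
H = dwht_matrix(N)
changes = [sign_changes(row) for row in H]
print(changes)
```

Before scaling, every entry is +1 or -1, so applying the unscaled matrix needs only additions and subtractions; the single multiplication by 1/sqrt(N) can be deferred or folded into the quantizer.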
Dr. Waquar Ahmad (NITC) Lecture-6 November 15, 2021 12 / 17
QUANTIZATION AND CODING OF TRANSFORM
COEFFICIENTS
If the amount of information conveyed by each coefficient is different, we should assign
differing numbers of bits to the different coefficients.
There are two approaches to assigning bits. The first approach relies on the average
properties of the transform coefficients, while the second approach assigns bits as needed
by individual transform coefficients.
In the first approach, we first obtain an estimate of the variances of the transform
coefficients.
These estimates can be used by one of two algorithms to assign the number of bits used
to quantize each of the coefficients.
We assume that the relative variance of the coefficients corresponds to the amount of
information contained in each coefficient.
Thus, coefficients with higher variance are assigned more bits than coefficients with
smaller variance.
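The variance estimates used by the first approach can be obtained by transforming a set of training blocks and measuring the spread of each coefficient position. The sketch below assumes a 2-point orthonormal transform and a handful of made-up blocks of correlated samples; `coefficient_variances` is a hypothetical helper name.

```python
import math

def apply_transform(M, v):
    """Compute the matrix-vector product M v."""
    return [sum(M[n][i] * v[i] for i in range(len(v))) for n in range(len(M))]

def coefficient_variances(blocks, A):
    """Estimate the variance of each transform coefficient over all blocks."""
    coeffs = [apply_transform(A, b) for b in blocks]
    variances = []
    for k in range(len(A)):
        values = [c[k] for c in coeffs]
        mean = sum(values) / len(values)
        variances.append(sum((v - mean) ** 2 for v in values) / len(values))
    return variances

# Assumed 2-point orthonormal transform (scaled sum and difference).
s = 1.0 / math.sqrt(2.0)
A = [[s, s],
     [s, -s]]

# Hypothetical training blocks: neighbouring samples are strongly correlated.
blocks = [[10.0, 11.0], [20.0, 19.0], [30.0, 32.0], [40.0, 41.0], [50.0, 48.0]]

variances = coefficient_variances(blocks, A)
print(variances)
```

With correlated inputs like these, the first coefficient has a much larger variance than the second, so under the assumption above it would receive more bits.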
Let us find an expression for the distortion, then find the bit allocation that minimizes the
distortion.
If the average number of bits per sample to be used by the transform coding system is R
and the average number of bits per sample used by the kth coefficient is Rk , then
\[ R = \frac{1}{M} \sum_{k=1}^{M} R_k \tag{1} \]
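The reconstruction error variance of the kth coefficient, $\sigma_{r_k}^2$, is related to the
variance of that coefficient, $\sigma_{\theta_k}^2$, by
\[ \sigma_{r_k}^2 = \alpha_k 2^{-2R_k} \sigma_{\theta_k}^2 \]
where $\alpha_k$ is a factor that depends on the input distribution and the quantizer.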
The total reconstruction error is given by
\[ \sigma_r^2 = \sum_{k=1}^{M} \alpha_k 2^{-2R_k} \sigma_{\theta_k}^2 \tag{2} \]
The objective of the bit allocation procedure is to find Rk to minimize Equation 2 subject
to the constraint of Equation 1.
If we assume that αk is a constant α for all k, then the minimization problem in terms of
Lagrange multipliers is
\[ J = \alpha \sum_{k=1}^{M} 2^{-2R_k} \sigma_{\theta_k}^2 - \lambda \left( R - \frac{1}{M} \sum_{k=1}^{M} R_k \right) \tag{3} \]
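Setting the derivative of J with respect to each R_k to zero makes every term 2^{-2R_k} σ²_{θk} equal, and the rate constraint of Equation 1 then gives R_k = R + (1/2) log2(σ²_{θk} / G), where G is the geometric mean of the coefficient variances. The sketch below implements that closed form and compares the distortion of Equation 2 against a uniform allocation. The variances, the target rate, and the function names are assumptions for illustration, and α is taken as 1.

```python
import math

def allocate_bits(variances, R):
    """Closed-form allocation: R_k = R + 0.5 * log2(var_k / geometric mean)."""
    M = len(variances)
    log_geo_mean = sum(math.log2(v) for v in variances) / M
    return [R + 0.5 * (math.log2(v) - log_geo_mean) for v in variances]

def distortion(variances, rates, alpha=1.0):
    """Total reconstruction error: sum of alpha * 2^(-2 R_k) * var_k."""
    return sum(alpha * 2.0 ** (-2.0 * r) * v for r, v in zip(rates, variances))

# Hypothetical coefficient variances and average rate in bits per sample.
variances = [64.0, 16.0, 4.0, 1.0]
R = 2.0

rates = allocate_bits(variances, R)
uniform_rates = [R] * len(variances)

d_optimized = distortion(variances, rates)
d_uniform = distortion(variances, uniform_rates)
print(rates, d_optimized, d_uniform)
```

The closed form can return non-integer or even negative rates when variances differ widely, so a practical scheme would still need to round the values and handle negative allocations; that step is not covered here.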