Lec8 - Transform Coding (JPG)

The document discusses block transform coding and the discrete cosine transform (DCT). It explains that DCT converts image blocks from the spatial domain to the frequency domain. DCT has good energy compaction properties and can be implemented using fast Fourier transforms. Images are transformed using a two-dimensional DCT by first applying a one-dimensional DCT across rows, then across columns, to transform 8x8 pixel blocks into the frequency domain. The DC coefficient is the top-left value, with other coefficients representing higher frequencies.

Uploaded by

Ali Ahmed

CS 411 : Data Compression

Lecture 8
Block Transform Coding
Block Transform
Image transform
• Two main types:
- Orthogonal transform:
  e.g. Walsh-Hadamard transform, DCT
- Subband transform:
  e.g. Wavelet transform

11/28/2022 3
Transformation Example
2x2 array of pixels:

A B
C D

Transform:
X0 = A
X1 = B - A
X2 = C - A
X3 = D - A

Inverse Transform:
An = X0
Bn = X1 + X0
Cn = X2 + X0
Dn = X3 + X0
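The 2x2 example can be tried out directly. The following is a minimal sketch; the function names and the pixel values are made up for illustration and are not part of the lecture.

```python
def forward(a, b, c, d):
    """Keep A as the reference and store the other pixels as differences from A."""
    return a, b - a, c - a, d - a


def inverse(x0, x1, x2, x3):
    """Add the reference X0 back to each difference to recover the pixels."""
    return x0, x1 + x0, x2 + x0, x3 + x0


block = (120, 121, 119, 122)   # hypothetical pixel values A, B, C, D
coeffs = forward(*block)       # one large value followed by small differences
restored = inverse(*coeffs)    # identical to the original block
print(coeffs, restored)
```

For neighbouring pixels with similar values, X1..X3 are small numbers, which is what makes the transformed block cheaper to code.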
Orthogonal transform
• Orthogonal matrix W

$$
\begin{pmatrix} c_1 \\ c_2 \\ c_3 \\ c_4 \end{pmatrix}
=
\begin{pmatrix}
w_{11} & w_{12} & w_{13} & w_{14} \\
w_{21} & w_{22} & w_{23} & w_{24} \\
w_{31} & w_{32} & w_{33} & w_{34} \\
w_{41} & w_{42} & w_{43} & w_{44}
\end{pmatrix}
\begin{pmatrix} d_1 \\ d_2 \\ d_3 \\ d_4 \end{pmatrix},
\qquad C = W \cdot D
$$

• Reducing redundancy
• Isolating frequencies

Block Transform Coding
Walsh-Hadamard transform (WHT)
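
As an illustration of C = W·D with an orthogonal W, here is a small sketch using a 4-point Walsh-Hadamard matrix. The natural (Hadamard) row ordering, the 1/2 scaling that makes W orthogonal, and the data vector are assumptions chosen for the example; the slide's own figure is not available.

```python
# 4-point Hadamard matrix, scaled by 1/2 so that W times its transpose is the identity.
H = [[1,  1,  1,  1],
     [1, -1,  1, -1],
     [1,  1, -1, -1],
     [1, -1, -1,  1]]
W = [[0.5 * h for h in row] for row in H]


def matvec(m, v):
    """Multiply matrix m by column vector v."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]


def transpose(m):
    return [list(col) for col in zip(*m)]


d = [10, 12, 11, 13]               # hypothetical data vector D
c = matvec(W, d)                   # forward transform: C = W . D
d_back = matvec(transpose(W), c)   # inverse of an orthogonal W is its transpose
print(c, d_back)
```

Most of the signal ends up in the first coefficient, while the remaining ones are small, which is the "reducing redundancy" effect listed for orthogonal transforms.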

Block Transform Coding

Consider a subimage of size n × n whose forward discrete transform T(u, v) can be expressed in terms of the relation

$$T(u, v) = \sum_{x=0}^{n-1} \sum_{y=0}^{n-1} g(x, y)\, r(x, y, u, v)$$

for u, v = 0, 1, 2, ..., n-1, where r(x, y, u, v) is the forward transformation kernel.

Given T(u, v), g(x, y) can similarly be obtained using the generalized inverse discrete transform

$$g(x, y) = \sum_{u=0}^{n-1} \sum_{v=0}^{n-1} T(u, v)\, s(x, y, u, v)$$

for x, y = 0, 1, 2, ..., n-1, where s(x, y, u, v) is the inverse transformation kernel.
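
The two double sums can be written almost literally in code. The sketch below takes the kernels r and s as ordinary functions; the 2x2 Walsh-Hadamard kernel used to exercise it is an assumption picked because it is easy to check by hand (for this kernel r and s happen to be the same function).

```python
def forward_transform(g, r):
    """T(u, v) = sum over x, y of g(x, y) * r(x, y, u, v)."""
    n = len(g)
    return [[sum(g[x][y] * r(x, y, u, v) for x in range(n) for y in range(n))
             for v in range(n)] for u in range(n)]


def inverse_transform(T, s):
    """g(x, y) = sum over u, v of T(u, v) * s(x, y, u, v)."""
    n = len(T)
    return [[sum(T[u][v] * s(x, y, u, v) for u in range(n) for v in range(n))
             for y in range(n)] for x in range(n)]


def wht_kernel_2x2(x, y, u, v):
    """Illustrative 2x2 Walsh-Hadamard kernel (valid for n = 2 only)."""
    return 0.5 * (-1) ** (x * u + y * v)


g = [[1, 2], [3, 4]]                          # hypothetical subimage
T = forward_transform(g, wht_kernel_2x2)
g_back = inverse_transform(T, wht_kernel_2x2)
print(T, g_back)
```

Swapping in a different kernel pair (for example the DCT cosines) changes the transform without touching the two summation routines.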
JPEG

• JPEG is the most common format for storing and transmitting photographic images on the World Wide Web. Its file-format variations are often not distinguished and are simply called JPEG.
• JPEG typically achieves 10:1 compression with little perceptible loss in image quality.
• However, JPEG is not well suited for line drawings and other textual or iconic graphics, where the sharp contrasts between adjacent pixels can cause noticeable artifacts. Such images are better saved in a lossless graphics format such as TIFF, GIF, PNG, or a raw image format.
• Because JPEG is typically used as a lossy compression method, which reduces image fidelity, it is inappropriate for exact reproduction of imaging data (such as some scientific and medical imaging applications and certain technical image processing work).
Why JPEG

• The compression ratio of lossless methods (e.g., Huffman, Arithmetic, LZW) is not high enough for image and video compression.
• JPEG uses transform coding, which is largely based on the following observations:
• Observation 1: A large majority of useful image content changes relatively slowly across the image, i.e., it is unusual for intensity values to go up and down several times in a small area, for example, within an 8 x 8 image block. Translated into the spatial frequency domain, this implies that lower spatial frequency components generally contain more information than the high frequency components, which often correspond to less useful details and noise.
• Observation 2: Experiments suggest that humans are less sensitive to the loss of higher spatial frequency components than to the loss of lower frequency components.
JPEG

Joint Photographic Experts Group

Goals:
• Support still images
• True color/grayscale
• Usually lossy

Subjects:
• Quality
• Steps
• Modes
JPEG Coding

[Block diagram: YCbCr image → 8x8 blocks f(i, j) → DCT → F(u, v) → Quantization (quantization tables) → Fq(u, v) → Zig-Zag Scan → DPCM (DC) and RLC (AC) → Entropy Coding (coding tables) → Header, Tables, Data]

Steps Involved:
1. Discrete Cosine Transform of each 8x8 pixel array: f(x, y) → F(u, v)
2. Quantization using a table or using a constant
3. Zig-Zag scan to exploit redundancy
4. Differential Pulse Code Modulation (DPCM) on the DC component and Run Length Coding (RLC) of the AC components
5. Entropy coding (Huffman) of the final output

Overview
• Based on the Discrete Cosine Transform (DCT):
0) The image is divided into N×N blocks
1) The blocks are transformed with the 2-D DCT
2) The DCT coefficients are quantized
3) The quantized coefficients are encoded
DCT : Discrete Cosine Transform
DCT converts the information contained in a block (8x8) of pixels from the spatial domain to the frequency domain.
• A simple analogy: Consider an unsorted list of 12 numbers between 0 and 3 -> (2, 3, 1, 2, 2, 0, 1, 1, 0, 1, 0, 0). Consider a transformation of the list involving two steps: (1) sort the list, (2) count the frequency of occurrence of each of the numbers -> (4, 4, 3, 1). Through this transformation we lost the spatial information but captured the frequency information.
• Other transformations, e.g., the Fourier transform and the DCT, retain the spatial information and therefore allow us to move back and forth between the spatial and frequency domains.
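
The counting analogy is easy to reproduce. This small sketch (the helper name `histogram` is made up here) shows that two different orderings of the same numbers give the same counts, so the original order cannot be recovered from the counts alone.

```python
from collections import Counter


def histogram(values, levels=4):
    """Count how often each of the values 0..levels-1 occurs."""
    counts = Counter(values)
    return [counts[v] for v in range(levels)]


original = [2, 3, 1, 2, 2, 0, 1, 1, 0, 1, 0, 0]
reordered = sorted(original)     # a different arrangement of the same numbers

print(histogram(original))       # counts of 0s, 1s, 2s, 3s
print(histogram(reordered))      # same counts, although the order differs
```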
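1-D DCT:

$$F(\omega) = \frac{a(\omega)}{2} \sum_{n=0}^{N-1} f(n) \cos\frac{(2n+1)\,\omega\pi}{16}$$

1-D Inverse DCT:

$$f'(n) = \sum_{\omega=0}^{N-1} \frac{a(\omega)}{2}\, F(\omega) \cos\frac{(2n+1)\,\omega\pi}{16}$$

where N = 8 (hence the denominator 16 = 2N) and

$$a(0) = \frac{1}{\sqrt{2}}, \qquad a(p) = 1 \quad (p \neq 0)$$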
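
A direct, unoptimised sketch of the 8-point 1-D DCT and its inverse is given below. It follows the usual form with scale factor a(ω)/2 and cosine argument (2n+1)ωπ/16; the function names and the sample values are illustrative assumptions.

```python
import math

N = 8  # block length; the denominator 16 in the cosine is 2 * N


def a(w):
    """Scale factor: 1/sqrt(2) for the DC term, 1 otherwise."""
    return 1 / math.sqrt(2) if w == 0 else 1.0


def dct_1d(f):
    """Forward 1-D DCT of an 8-sample list."""
    return [a(w) / 2 * sum(f[n] * math.cos((2 * n + 1) * w * math.pi / 16)
                           for n in range(N))
            for w in range(N)]


def idct_1d(F):
    """Inverse 1-D DCT; the scale factor depends on w, so it sits inside the sum."""
    return [sum(a(w) / 2 * F[w] * math.cos((2 * n + 1) * w * math.pi / 16)
                for w in range(N))
            for n in range(N)]


flat = [100] * N                              # a constant signal
F_flat = dct_1d(flat)                         # only the DC term is non-zero
sample = [52, 55, 61, 66, 70, 61, 64, 73]     # hypothetical pixel row
restored = idct_1d(dct_1d(sample))            # matches sample up to rounding error
print([round(v, 3) for v in F_flat])
```

A constant signal has no variation, so all of its energy lands in F(0); this is the simplest case of the energy compaction the DCT is used for.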
 15

Discrete Cosine Transform
• 2D Discrete Cosine Transform (DCT)

$$F[k, l] = \sum_{m=0}^{M-1} \sum_{n=0}^{N-1} f[m, n]\, \alpha(k)\, \alpha(l) \cos\left[\frac{(2m+1)k\pi}{2M}\right] \cos\left[\frac{(2n+1)l\pi}{2N}\right]$$

where k, l = 0, 1, ..., N-1 and

$$\alpha(k) = \begin{cases} \sqrt{1/N} & \text{for } k = 0 \\ \sqrt{2/N} & \text{for } k = 1, 2, \ldots, N-1 \end{cases}$$

• Inverse DCT

$$f[m, n] = \sum_{k=0}^{M-1} \sum_{l=0}^{N-1} F[k, l]\, \alpha(k)\, \alpha(l) \cos\left[\frac{(2m+1)k\pi}{2M}\right] \cos\left[\frac{(2n+1)l\pi}{2N}\right]$$
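
The 2-D formulas can be evaluated term by term. The sketch below is a literal (slow) translation of the double sums, assuming square blocks so that the row and column scale factors use the same block size; the names `alpha`, `dct_2d` and `idct_2d` and the test blocks are illustrative.

```python
import math


def alpha(k, size):
    """Normalisation factor: sqrt(1/size) for k = 0, sqrt(2/size) otherwise."""
    return math.sqrt(1 / size) if k == 0 else math.sqrt(2 / size)


def dct_2d(f):
    """Forward 2-D DCT by direct evaluation of the double sum."""
    M, N = len(f), len(f[0])
    return [[alpha(k, M) * alpha(l, N) * sum(
                 f[m][n]
                 * math.cos((2 * m + 1) * k * math.pi / (2 * M))
                 * math.cos((2 * n + 1) * l * math.pi / (2 * N))
                 for m in range(M) for n in range(N))
             for l in range(N)] for k in range(M)]


def idct_2d(F):
    """Inverse 2-D DCT; the alpha factors depend on k and l, so they stay inside the sum."""
    M, N = len(F), len(F[0])
    return [[sum(alpha(k, M) * alpha(l, N) * F[k][l]
                 * math.cos((2 * m + 1) * k * math.pi / (2 * M))
                 * math.cos((2 * n + 1) * l * math.pi / (2 * N))
                 for k in range(M) for l in range(N))
             for n in range(N)] for m in range(M)]


flat = [[10] * 8 for _ in range(8)]                      # constant block
F_flat = dct_2d(flat)                                    # only F[0][0] is non-zero
ramp = [[m * 8 + n for n in range(8)] for m in range(8)]  # hypothetical block
ramp_back = idct_2d(dct_2d(ramp))                        # equals ramp up to rounding
print(round(F_flat[0][0], 6))
```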
Discrete Cosine Transform
• The basis functions of DCT are real. (DFT has complex basis functions.)
• DCT has very good energy compaction properties.
• DCT can be expressed in terms of DFT; therefore, a Fast Fourier Transform implementation can be used.
• In the case of block-based image compression (e.g., JPEG), DCT produces fewer artifacts along the block boundaries than DFT does.
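
One common way to express the DCT through the DFT is to mirror the N samples into an even-symmetric sequence of length 2N, take its DFT, and apply a phase correction. The sketch below uses a naive O(n²) DFT purely to stay self-contained; in practice an FFT routine would take its place. The normalisation factors are left out, so the functions return only the raw cosine sums.

```python
import cmath
import math


def dft(y):
    """Naive DFT; a fast Fourier transform would return the same values faster."""
    n = len(y)
    return [sum(y[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def dct_via_dft(x):
    """Unnormalised DCT sums computed from the DFT of the mirrored sequence."""
    n = len(x)
    y = list(x) + list(reversed(x))          # even-symmetric extension, length 2N
    Y = dft(y)
    return [0.5 * (cmath.exp(-1j * math.pi * k / (2 * n)) * Y[k]).real
            for k in range(n)]


def dct_direct(x):
    """The same sums evaluated straight from the cosine definition."""
    n = len(x)
    return [sum(x[t] * math.cos((2 * t + 1) * k * math.pi / (2 * n)) for t in range(n))
            for k in range(n)]


x = [52, 55, 61, 66, 70, 61, 64, 73]         # hypothetical samples
via_dft = dct_via_dft(x)
direct = dct_direct(x)
print(max(abs(p - q) for p, q in zip(via_dft, direct)))   # tiny rounding difference
```

The mirrored sequence has no jump where the block repeats, which is one common explanation for the DCT producing weaker boundary artifacts than the DFT.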
2-D DCT
• Images are two-dimensional. How do you perform a 2-D DCT?
– Two series of 1-D transforms result in a 2-D transform, as demonstrated in the figure below.

[Figure: f(i, j) (8x8) → 1-D row-wise → (8x8) → 1-D column-wise → F(u, v) (8x8)]

• F(0,0) is called the DC component and the rest of F(u,v) are called AC components.

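The row-then-column idea can be sketched as follows: apply a 1-D DCT to every row, then to every column of the result. The orthonormal 8-point cosine matrix `C` and the helper names are assumptions for illustration.

```python
import math

N = 8
# C[k][n]: orthonormal 8-point DCT basis, one row per frequency k.
C = [[(math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N))
      * math.cos((2 * n + 1) * k * math.pi / (2 * N)) for n in range(N)]
     for k in range(N)]


def dct_1d(v):
    """1-D DCT of one row or one column."""
    return [sum(C[k][n] * v[n] for n in range(N)) for k in range(N)]


def dct_2d_separable(block):
    rows = [dct_1d(row) for row in block]               # 1-D DCT of every row
    cols = [dct_1d(list(col)) for col in zip(*rows)]    # 1-D DCT of every column
    return [list(r) for r in zip(*cols)]                # transpose back to F[u][v]


flat = [[50] * N for _ in range(N)]             # constant block: DC only
F_flat = dct_2d_separable(flat)
ramp = [[j for j in range(N)] for _ in range(N)]  # varies only along each row
F_ramp = dct_2d_separable(ramp)                 # non-zero terms stay in the first row
print(round(F_flat[0][0], 6))
```

Because the block `ramp` does not change from row to row, every coefficient with a non-zero vertical frequency comes out as zero, leaving only the first row of F(u, v) populated.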
2-D Transform Example
• The following example demonstrates the idea behind a 2-D transform using our own cooked-up transform. The transform computes a running cumulative sum.

My Transform:

$$F_{my}(\omega) = \sum_{n=\omega}^{8} f(n), \qquad \omega = 1, 2, \ldots, 8$$

f(i, j) (8x8):

1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1

After the 1-D row-wise transform (8x8):

8 7 6 5 4 3 2 1
8 7 6 5 4 3 2 1
8 7 6 5 4 3 2 1
8 7 6 5 4 3 2 1
8 7 6 5 4 3 2 1
8 7 6 5 4 3 2 1
8 7 6 5 4 3 2 1
8 7 6 5 4 3 2 1

After the 1-D column-wise transform, Fmy(u, v) (8x8):

64 56 48 40 32 24 16 8
56 49 42 35 28 21 14 7
48 42 36 30 24 18 12 6
40 35 30 25 20 15 10 5
32 28 24 20 16 12 8 4
24 21 18 15 12 9 6 3
16 14 12 10 8 6 4 2
8 7 6 5 4 3 2 1

Note that this is only a hypothetical transform. Do not confuse this with DCT.
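
The cooked-up transform is simple enough to reproduce in a few lines. This sketch applies the running sum (taken from each position to the end of the list) first to the rows and then to the columns of an all-ones 8x8 block; the function names are made up for the example.

```python
def my_transform(v):
    """F(w) = f(w) + f(w+1) + ... + f(last): a running sum accumulated from the right."""
    out, total = [], 0
    for x in reversed(v):
        total += x
        out.append(total)
    return out[::-1]


def my_transform_2d(block):
    rows = [my_transform(r) for r in block]               # row-wise pass
    cols = [my_transform(list(c)) for c in zip(*rows)]    # column-wise pass
    return [list(r) for r in zip(*cols)]                  # transpose back


ones = [[1] * 8 for _ in range(8)]
row_pass = [my_transform(r) for r in ones]
result = my_transform_2d(ones)
for line in result:
    print(line)
```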
Quantization
• Why? -- To reduce the number of bits per sample
F’(u,v) = round(F(u,v)/q(u,v))

• Example: 101101 = 45 (6 bits).
Truncate to 4 bits: 1011 = 11. (Compare 11 x 4 = 44 against 45)
Truncate to 3 bits: 101 = 5. (Compare 5 x 8 = 40 against 45)
Note that the more bits we truncate, the more precision we lose.

• Quantization error is the main source of loss in lossy compression.

• Uniform Quantization:
– q(u,v) is a constant.

• Non-uniform Quantization -- Quantization Tables
– The eye is most sensitive to low frequencies (upper left corner of the frequency matrix) and less sensitive to high frequencies (lower right corner).
– Custom quantization tables can be put in the image/scan header.
– The JPEG standard defines two default quantization tables, one each for luminance and chrominance.
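
Both the bit-truncation example and the rule F’(u,v) = round(F(u,v)/q(u,v)) can be checked with a short sketch. The 2x2 coefficient block and step sizes are hypothetical values chosen for illustration, not a table from the JPEG standard. Note that Python's `round` rounds exact halves to the nearest even integer, which may differ from other rounding conventions.

```python
def quantize(F, q):
    """F'(u, v) = round(F(u, v) / q(u, v))."""
    return [[round(F[u][v] / q[u][v]) for v in range(len(F[0]))]
            for u in range(len(F))]


def dequantize(Fq, q):
    """Approximate reconstruction: multiply each index by its step size."""
    return [[Fq[u][v] * q[u][v] for v in range(len(Fq[0]))]
            for u in range(len(Fq))]


# Bit truncation from the slide: 45 = 0b101101
four_bits = 45 >> 2          # drop 2 bits -> 0b1011
three_bits = 45 >> 3         # drop 3 bits -> 0b101
print(four_bits, four_bits << 2, three_bits, three_bits << 3)

F = [[236, -23], [-11, 3]]   # hypothetical DCT coefficients
q = [[16, 11], [12, 12]]     # hypothetical step sizes
Fq = quantize(F, q)
F_hat = dequantize(Fq, q)    # close to F, but not identical
print(Fq, F_hat)
```

The difference between `F` and `F_hat` is the quantization error; larger step sizes shrink the range of values to code but increase that error.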
Zig-Zag Scan
• Why? -- To group low frequency coefficients at the top of the vector and high frequency coefficients at the bottom
• Maps the 8 x 8 matrix to a 1 x 64 vector

[Figure: zig-zag path over the 8x8 block, unrolled into a 1x64 vector]

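One way to generate the zig-zag order is to walk the anti-diagonals of the block and alternate the direction on each one. The sketch below does this for any square block size; the function names are illustrative.

```python
def zigzag_indices(n=8):
    """(row, col) pairs in zig-zag order: anti-diagonals, alternating direction."""
    order = []
    for s in range(2 * n - 1):                       # s = row + col of the diagonal
        diag = [(i, s - i) for i in range(n) if 0 <= s - i < n]
        order.extend(diag if s % 2 else reversed(diag))
    return order


def zigzag(block):
    """Flatten a square block into a list following the zig-zag order."""
    return [block[i][j] for i, j in zigzag_indices(len(block))]


small = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
print(zigzag(small))
print(zigzag_indices(8)[:6])
```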
JPEG Encoder/Decoder Block diagram
JPEG - Steps
1. Divide the image into 8x8 subimages.
For each subimage do:
2. Shift the gray levels to the range [-128, 127] (i.e., reduce the dynamic range requirements of the DCT).
- Shift values from [0, 2^P-1] to [-2^(P-1), 2^(P-1)-1]
- e.g. if P = 8, shift [0, 255] to [-128, 127]
- DCT requires the range to be centered around 0
- Values in 8x8 pixel blocks are spatial values, and there are 64 sample values in each block
3. Apply DCT; yields 64 coefficients
- 1 DC coefficient: F(0,0)
- 63 AC coefficients: F(u,v)
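
Steps 1 and 2 can be sketched as below. The helper names are made up, and the block splitter assumes the image dimensions are exact multiples of the block size (real encoders pad the image otherwise).

```python
def split_blocks(image, n=8):
    """Yield n x n blocks in raster order; assumes dimensions are multiples of n."""
    for top in range(0, len(image), n):
        for left in range(0, len(image[0]), n):
            yield [row[left:left + n] for row in image[top:top + n]]


def level_shift(block, bits=8):
    """Shift samples from [0, 2^P - 1] to [-2^(P-1), 2^(P-1) - 1]."""
    offset = 1 << (bits - 1)                 # 128 when P = 8
    return [[p - offset for p in row] for row in block]


# Hypothetical 16x16 grayscale image with values 0..255
image = [[(r * 16 + c) % 256 for c in range(16)] for r in range(16)]
blocks = list(split_blocks(image))           # four 8x8 blocks
shifted = [level_shift(b) for b in blocks]
print(len(blocks), shifted[0][0][:4])
```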
Example

[Figure: image block shifted to [-128, 127] and its DCT spectrum]

The low frequency components are around the upper-left corner of the spectrum (not centered!).
JPEG Steps

4. Quantize coefficients (i.e., reduce the amplitude of coefficients that do not contribute a lot).

Q(u,v): quantization array

Computing Q[i][j] - Example
• Quantization Array Q[i][j]
Example (cont’d)

[Figure: coefficients C(u,v), quantization array Q(u,v), and quantized coefficients Cq(u,v)]

Small magnitude coefficients have been truncated to zero!
“quality” controls how many of them will be truncated!
JPEG Steps (cont’d)

5. Order the coefficients using zig-zag ordering
- Creates long runs of zeros (i.e., ideal for RLC)
Example of DCT of image block
Default quantization matrix
Example of DCT of image block
Zig Zag ordering of DCT coefficients
Encoding of quantized DCT coefficients
• The DC coefficient of the current block is predicted from that of the previous block, and the prediction error is coded using Huffman coding.
• AC coefficients:
(a) Huffman code or arithmetic code for non-zeroes
(b) run-length encoding: (number of ’0’s, non-’0’ symbol)
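
The two coding ideas above can be sketched as follows. This is a simplified illustration, not the exact JPEG bitstream format: the first DC value is predicted from 0, an end-of-block pair (0, 0) is always appended (an assumption borrowed from the usual JPEG convention), and the special handling of very long zero runs is left out. The Huffman stage that would follow is not shown.

```python
def dpcm(dc_values):
    """Replace each DC value by its difference from the previous block's DC."""
    prev, out = 0, []
    for dc in dc_values:
        out.append(dc - prev)
        prev = dc
    return out


def run_length(ac):
    """Encode AC coefficients as (number of preceding zeros, non-zero value) pairs."""
    pairs, run = [], 0
    for v in ac:
        if v == 0:
            run += 1
        else:
            pairs.append((run, v))
            run = 0
    pairs.append((0, 0))     # end-of-block marker: only zeros remain
    return pairs


dc_sequence = [150, 155, 149, 152]          # hypothetical DC values of four blocks
ac_sequence = [-2, -1, 0, 0, 3, 0, 0, 0]    # hypothetical zig-zag ordered AC values
print(dpcm(dc_sequence))
print(run_length(ac_sequence))
```

Neighbouring blocks tend to have similar DC values, so the differences are small, and the zig-zag ordering pushes the zeros together so that a few pairs describe the whole block.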
Dequantization
Inverse DCT
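Preparation of AC Coefficients

[Figure: zig-zag sequence over the 8x8 coefficient block, starting at DC, with AC01 ... AC07 in the top row and AC70 ... AC77 in the bottom row]

Computation of the quantized DCT coefficients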
