cp467_12_lecture14_compression1
Still Image
• One page of A4 format at 600 dpi is > 100 MB.
• One color image in digital camera generates 10-30 MB.
• Scanned 3”×7” photograph at 300 dpi is 30 MB.
Digital Cinema
• 4K×2K×3 × 12 bits/pel = 48 MB/frame, or 1 GB/sec, or 70 GB/min.
Why do we need Image Compression?
1) Storage
2) Transmission
3) Data access
1990-2000:
Disk capacities: 100 MB → 20 GB (200 times!)
but seek time: 15 milliseconds → 10 milliseconds
and transfer rate: 1 MB/sec → 2 MB/sec.
• Image scanner
• Digital camera
• Video camera
• Ultrasound (US), Computed Tomography (CT), Magnetic Resonance Imaging (MRI), digital X-ray (XR), infrared
• etc.
Image types
[Diagram: image compression vs. universal compression; image types include binary images, gray-scale images, colour images (true colour and palette), textual images, and video data]
Why can we compress images?
• Statistical redundancy:
1) Spatial correlation
a) Local - Pixels at neighboring locations have
similar intensities.
b) Global - Reoccurring patterns.
2) Spectral correlation – between color planes.
3) Temporal correlation – between consecutive frames.
• Tolerance to fidelity:
1) Perceptual redundancy.
2) Limitation of rendering hardware.
Lossy vs. Lossless compression
Near-lossless compression: medical imaging, remote sensing.
Rate measures
Mean square error (MSE): MSE = (1/N)·Σ_{i=1..N} (y_i − x_i)²
Signal-to-noise ratio (SNR): SNR = 10·log10[σ²/MSE] (decibels)
Entropy: H = −Σ_{i=1..N} p_i·log2(p_i)
Entropy
Self-information of symbol s_i: H(s_i) = −log2(p_i)
The average number of bits for the source S:
H = −Σ_{i=1..N} p_i·log2(p_i)
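A minimal sketch (not from the lecture) of computing this zero-order entropy in Python from symbol counts:

import math
from collections import Counter

def entropy(symbols):
    # H = -sum p_i * log2(p_i), in bits per symbol
    counts = Counter(symbols)
    n = len(symbols)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# Example: a uniform 256-symbol source gives log2(256) = 8 bits
print(entropy(list(range(256))))  # 8.0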
Entropy for binary source: N=2
S = {0, 1}, p0 = p, p1 = 1 − p
H = −(p·log2(p) + (1 − p)·log2(1 − p))
H = 1 bit for p0 = p1 = 0.5
[Figure: two-state source diagram and the plot of H(p), which peaks at 1 bit at p = 0.5]
Entropy for uniform distribution: p_i = 1/N
Examples:
N = 2: p_i = 0.5; H = log2(2) = 1 bit
N = 256: p_i = 1/256; H = log2(256) = 8 bits
How to get the probability distribution?
1) Static modeling:
a) The same code table is applied to all input data.
b) One-pass method (encoding)
c) No side information
2) Semi-adaptive modeling:
a) Two-pass method: (1) analysis and (2) encoding.
b) Side information needed (model, code table)
3) Adaptive (dynamic) modeling:
a) One-pass method: analysis and encoding
b) Updating the model during encoding/decoding
c) No side information
Static vs. Dynamic: Example
H = (1/10)(1.58 + 1.0 + 2.32 + 1.0 + 0.81 + 3.0 + 0.85 + 0.74 + 2.46 + 0.78) = 1.45 bits/char
Semi-adaptive: 1.16  <  Adaptive: 1.45  <  Static: 1.58
Coding methods
• Shannon-Fano Coding
• Huffman Coding
• Predictive coding
• Block coding
• Arithmetic code
• Golomb-Rice codes
Shannon-Fano Code: A top-down approach
1) Sort symbols according to their frequencies (probabilities).
2) Recursively divide into parts, each with approximately the same number of counts (probability).
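A rough sketch (not from the lecture) of this recursive construction in Python; the split point is chosen so that the two halves have counts as close as possible:

def shannon_fano(freqs):
    # freqs: dict symbol -> count; returns dict symbol -> code string
    symbols = sorted(freqs, key=freqs.get, reverse=True)   # step 1: sort by probability
    codes = {s: '' for s in symbols}

    def split(group):
        if len(group) <= 1:
            return
        total = sum(freqs[s] for s in group)
        best_k, best_diff, acc = 1, float('inf'), 0
        for i in range(len(group) - 1):                     # step 2: balanced split
            acc += freqs[group[i]]
            diff = abs(total - 2 * acc)
            if diff < best_diff:
                best_k, best_diff = i + 1, diff
        for s in group[:best_k]:
            codes[s] += '0'
        for s in group[best_k:]:
            codes[s] += '1'
        split(group[:best_k])
        split(group[best_k:])

    split(symbols)
    return codes

print(shannon_fano({'A': 15, 'B': 7, 'C': 6, 'D': 6, 'E': 5}))
# {'A': '00', 'B': '01', 'C': '10', 'D': '110', 'E': '111'}, as in the example below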
Shannon-Fano Code: Example (step 1)
Symbols and counts: A: 15, B: 7, C: 6, D: 6, E: 5 (p_i = 15/39, 7/39, 6/39, 6/39, 5/39)
First split: {A,B} (15+7 = 22) → branch 0, {C,D,E} (6+6+5 = 17) → branch 1
Shannon-Fano Code: Example (step 2)
{A,B} splits into A (15) → 0 and B (7) → 1
{C,D,E} splits into C (6) → 0 and {D,E} (6+5 = 11) → 1
Shannon-Fano Code: Example (step 3)
{D,E} splits into D (6) → 0 and E (5) → 1
Shannon-Fano Code: Example (Result)
Symbol  p_i    −log2(p_i)  Code  Subtotal
A       15/39  1.38        00    2*15
B       7/39   2.48        01    2*7
C       6/39   2.70        10    2*6
D       6/39   2.70        110   3*6
E       5/39   2.96        111   3*5
Total: 89 bits
[Binary tree: 0 → {A = 00, B = 01}; 1 → {C = 10, D = 110, E = 111}]
Average code length: R = 89/39 = 2.28 bits/symbol
Shannon-Fano Code: Encoding
A - 00 Message: B A B A C A C A D E
B - 01 Codes: 01 00 01 00 10 00 10 00 110 111
C - 10
D - 110 Bitstream: 0100010010001000110111
E - 111
[Binary tree as above]
Shannon-Fano Code: Decoding
[Decode by reading the bitstream bit by bit, walking the binary tree from the root and emitting a symbol at each leaf]
Huffman Code: A bottom-up approach
INIT: Put all nodes in an OPEN list; keep it sorted at all times according to their probabilities.
REPEAT:
a) From OPEN pick the two nodes having the lowest probabilities and create a parent node for them.
b) Assign the sum of the children's probabilities to the parent node and insert it into OPEN.
c) Assign codes 0 and 1 to the two branches of the tree, and delete the children from OPEN.
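A minimal Python sketch (not from the lecture) of the same bottom-up procedure, using a heap as the OPEN list:

import heapq
from itertools import count

def huffman(freqs):
    # freqs: dict symbol -> count (or probability); returns dict symbol -> code string
    tie = count()                                   # tie-breaker for equal frequencies
    heap = [(f, next(tie), [[s, '']]) for s, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, g1 = heapq.heappop(heap)             # two nodes with lowest probability
        f2, _, g2 = heapq.heappop(heap)
        for pair in g1:                             # branch '0' for the first child
            pair[1] = '0' + pair[1]
        for pair in g2:                             # branch '1' for the second child
            pair[1] = '1' + pair[1]
        heapq.heappush(heap, (f1 + f2, next(tie), g1 + g2))   # parent = sum of children
    return dict(heap[0][2])

freqs = {'A': 15, 'B': 7, 'C': 6, 'D': 6, 'E': 5}
codes = huffman(freqs)
print(codes)
print(sum(len(codes[s]) * f for s, f in freqs.items()))   # 87 bits, i.e. 87/39 ≈ 2.23 bits/symbol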
Huffman Code: Example
Average code length: R = 87/39 = 2.23 bits/symbol
Huffman Code: Decoding
[Binary tree: A = 0, B = 100, C = 101, D = 110, E = 111]
Properties of Huffman code
• Optimum code for a given data set requires two passes.
• Code construction complexity O(N logN).
• Fast lookup-table based implementation.
• Requires at least one bit per symbol.
• Average codeword length is within one bit of the zero-order entropy (tighter bounds are known): H ≤ R < H+1 bit.
• Susceptible to bit errors.
Unique prefix property
[Figure: images A and B]
Predictive coding: y_i = x_i − x_{i−1}
Entropy: H_o = 7.8 bits/pel (?)   H_r = 5.1 bits/pel (?)
Coding without prediction
f0 = 8;  p0 = p = 8/64 = 0.125
f1 = 56; p1 = (1 − p) = 56/64 = 0.875
Entropy:
H = −((8/64)·log2(8/64) + (56/64)·log2(56/64)) = 0.544 bits/pel
Prediction for binary images by pixel above
f p
16 16/64
48 48/64
Entropy:
H =-((16/64)*log2(16/64)+(48/64)*log2(48/64))=0.811 bits/pel
Wrong predictor!
Prediction for binary images by the pixel to the left
f     p
1     1/64
63    63/64
Entropy:
H = −((1/64)·log2(1/64) + (63/64)·log2(63/64)) = 0.116 bits/pel
• Prediction by pixel above: H = 0.811 bits/pel (bad!)
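A small sketch (not from the lecture) of measuring how good a predictor is on a binary image in Python; the test image is hypothetical, but the idea matches the slides: the entropy of the prediction-error image depends strongly on the predictor.

import numpy as np

def prediction_entropy(img, predictor):
    # Entropy (bits/pel) of the prediction errors for a 0/1 image.
    # predictor: 'above' or 'left'; the first row/column is predicted as 0.
    pred = np.zeros_like(img)
    if predictor == 'above':
        pred[1:, :] = img[:-1, :]
    else:
        pred[:, 1:] = img[:, :-1]
    p = (img != pred).mean()              # probability that the prediction fails
    if p in (0.0, 1.0):
        return 0.0
    return -(p * np.log2(p) + (1 - p) * np.log2(1 - p))

# Hypothetical 8x8 image with vertical stripes: the pixel above is a good
# predictor here, the pixel to the left is a bad one.
img = np.tile(np.array([0, 1, 0, 1, 0, 1, 0, 1]), (8, 1))
print(prediction_entropy(img, 'above'), prediction_entropy(img, 'left'))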
Block coding: n=2
1) Entropy:
H2 = −(0.9801·log2(0.9801) + 0.0099·log2(0.0099) + 0.0099·log2(0.0099) + 0.0001·log2(0.0001))/2 =
= (0.0284 + 0.0659 + 0.0659 + 0.0013)/2 = 0.081 bits/pel
Why is H2 = H1?
2) Huffman code: cA = '0', cB = '10', cC = '110', cD = '111'
LA = 1, LB = 2, LC = 3, LD = 3
Bitrate: R2 = (1·pA + 2·pB + 3·pC + 3·pD)/2 = 0.515 bits/pel
Block coding: n=3
pa=p=0.99, pb=q=0.01
Entropy Hn=0.081 bits/pel
Bitrate for Huffman coder:
n = 1: R1 = 1.0 bit      2 symbols in alphabet
n = 2: R2 = 0.515 bits   4 symbols in alphabet
n = 3: R3 = 0.353 bits   8 symbols in alphabet
If block size n → ∞?  Hn ≤ Rn < Hn + 1/n:
−(1/n)·Σ_{i=1..N} p(Bn)·log2 p(Bn) ≤ R* < −(1/n)·Σ_{i=1..N} p(Bn)·log2 p(Bn) + 1/n
Problem: the alphabet size and the Huffman table size grow exponentially with the number n of symbols blocked.
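A small sketch (not from the lecture) that reproduces these per-pixel rates for the i.i.d. binary source p = 0.99, reusing the huffman() sketch shown earlier:

from itertools import product
import math

p = {'a': 0.99, 'b': 0.01}

def block_rates(n):
    # Per-pixel entropy and Huffman rate when n pixels form one block symbol.
    blocks = {''.join(t): math.prod(p[c] for c in t) for t in product('ab', repeat=n)}
    H = -sum(q * math.log2(q) for q in blocks.values()) / n
    codes = huffman(blocks)                         # huffman() from the earlier sketch
    R = sum(blocks[b] * len(codes[b]) for b in blocks) / n
    return H, R

for n in (1, 2, 3):
    print(n, block_rates(n))
# roughly: n=1 -> (0.081, 1.0), n=2 -> (0.081, 0.515), n=3 -> (0.081, 0.353)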
Block coding: Example 2, n=1
pa=56/64
pb=8/64
1) Entropy:
H = −((56/64)·log2(56/64) + (8/64)·log2(8/64)) = 0.544 bits/pel
2) Huffman code: a = '0'; b = '1'
Bitrate: R = 1 bit/pel
Block coding: Example 2, n=4
pA=12/16
pB=4/16
1) Entropy:
H = −((12/16)·log2(12/16) + (4/16)·log2(4/16))/4 = 0.203 bits/pel
2) Huffman code: A = '0', B = '1'
Bitrate: R = (1·pA + 1·pB)/4 = 0.250 bits/pel
Binary image compression
• Run-length coding
• Predictive coding
• READ code
• Block coding
• G3 and G4
• JBIG: prepared by the Joint Bi-level Image Experts Group in 1992
Compressed file size
Model size
n=1: Model size: pa, pb → 2^1·8 bits
n=2: Model size: pA, pB, pC, pD → 2^2·8 bits
n=k: Model size: {pA, pB, …} → 2^k·8 bits
Compressed data size for S symbols in input file:
R*S bits, where R is bitrate (bits/pel)
Total size: Model size + R*S bits
Run-length encoding (RLE): Idea
• Pre-processing method; good when one symbol occurs with high probability or when symbols are dependent.
• Count how many repeated symbols occur.
• Source 'symbol' = length of run (see the sketch below).
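A minimal sketch (not from the lecture) of extracting run lengths from one scan line in Python; the (value, length) pairs would then be entropy coded, e.g. with the modified Huffman run-length tables used for fax:

def run_lengths(bits):
    # bits: iterable of 0/1 pixels for one scan line -> list of (value, length) runs
    runs, prev, length = [], None, 0
    for b in bits:
        if b == prev:
            length += 1
        else:
            if prev is not None:
                runs.append((prev, length))
            prev, length = b, 1
    if prev is not None:
        runs.append((prev, length))
    return runs

print(run_lengths([1, 1, 1, 1, 0] * 2))   # [(1, 4), (0, 1), (1, 4), (0, 1)]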
Resolution: image 1728 × 1188, or 2 Mbytes
Transmission time: T = 7 min
Run-length encoding: Example
RL    Code
4b    '011'
9w    '10100'
2b    '11'
2w    '0111'
6b    '0010'
6w    '1110'
2b    '11'
Run-length Huffman encoding: 0 ≤ n ≤ 63
...
Run-length Huffman encoding: n > 63
Examples:
n=30w: code=’00000011’
n=94w=64w+30w: code=’11011 00000011’
n=64w=64w+ 0w: code=’11011 00110101’
Predictive coding: Idea
[Figure: prediction success probabilities for different pixels/contexts, ranging from about 61% to 99.8%]
READ Code (1)
• Vertical mode:
The position of each color change is coded with respect
to a nearby change position of the same color on the
reference line, if one exists. "Nearby" is taken to mean within three pixels.
• Horizontal mode:
If there is no nearby change position on the reference line, one-dimensional run-length coding is used.
• Pass code:
The reference line contains a run that has no counterpart in the current line; the next complete run of the opposite color in the reference line should be skipped.
READ: Codes for modes
[Figure: coding one current line against the reference line: vertical mode (−1, 0), horizontal mode (3 white, 4 black), pass code, vertical mode (+2, −2), horizontal mode (4 white, 7 black); generated code: 010 1 001 1000 011 0001 000011 000010 001 1011 00011]
Block Coding: Idea
Hierarchical block encoding: a '0' codes an all-white block; a '1' codes a block containing black pixels, which is divided into four quadrants coded at the next level.
L=1: code '1' (1 bit)
L=2: code '0111' (4 bits)
L=3: codes 0011 0111 1000 (12 bits)
L=4: codes 0111 1111 1111 1100 0101 1010 (24 bits)
Total: 1 + 4 + 12 + 24 = 41 bits
[Figure: the 8×8 example image and its quadtree subdivision]
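A rough Python sketch (not from the lecture) of this quadtree-style block code; the test image is hypothetical:

import numpy as np

def hblock_code(block):
    # '0' for an all-white block; otherwise '1' followed by the codes of the four
    # quadrants; a non-white 2x2 block is sent as '1' plus its four raw pixel bits.
    if not block.any():
        return '0'
    if block.shape == (2, 2):
        return '1' + ''.join(str(int(b)) for b in block.flatten())
    h, w = block.shape
    return '1' + ''.join(hblock_code(q) for q in
                         (block[:h//2, :w//2], block[:h//2, w//2:],
                          block[h//2:, :w//2], block[h//2:, w//2:]))

img = np.zeros((8, 8), dtype=int)
img[5:7, 2:5] = 1                       # a small black patch
code = hblock_code(img)
print(code, len(code), 'bits')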
[Block diagram: pixel buffer (0101101100...) → run lengths → RLE bits; boundary points → READ bits]
CCITT Group 3 (G3)
• Every k-th line is coded by the RLE method and the READ code is applied to the rest of the lines.
• The first (virtual) pixel is white.
• EOL code after every line, to synchronize the code.
• Six EOL codes after every page.
• Binary documents only.
CCITT Group 4 (G4)
7 min → 1 min
Comparison of algorithms
[Bar chart: compression ratios of COMPRESS, GZIP, PKZIP, BLOCK, RLE, 2D-RLE, ORLE, G3, G4 and JBIG; the best methods reach ratios of roughly 18-23]
Input signal x
Quantization error: e(x) = x − q(x)
Distortion measure:
μ = E[x − q(x)] = Σ_{j=1..M} ∫_{a_{j−1}}^{a_j} (x − y_j)·p(x) dx
σ² = E[(x − q(x))²] = Σ_{j=1..M} ∫_{a_{j−1}}^{a_j} (x − y_j)²·p(x) dx
Optimal quantization problem
Given a signal x with probability density function (or histogram) p(x), find a quantizer q(x) of x which minimizes the quantization error variance σ²:
σ²_opt = min_{{a_j},{y_j}} Σ_{j=1..M} ∫_{a_{j−1}}^{a_j} (x − y_j)²·p(x) dx
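A minimal sketch (not from the lecture) of the Lloyd(-Max) iteration for this problem on sampled data; the Laplacian test signal is a hypothetical stand-in for a prediction-error histogram:

import numpy as np

def lloyd_max(x, M, iterations=50):
    # Returns decision levels a (length M-1) and representative levels y (length M).
    y = np.linspace(x.min(), x.max(), M)          # initial representative levels
    a = (y[:-1] + y[1:]) / 2
    for _ in range(iterations):
        a = (y[:-1] + y[1:]) / 2                  # decision levels = midpoints
        cells = np.digitize(x, a)                 # which cell each sample falls into
        for j in range(M):
            if np.any(cells == j):
                y[j] = x[cells == j].mean()       # centroid (mean) of each cell
    return a, y

rng = np.random.default_rng(0)
x = rng.laplace(0.0, 10.0, 10000)
a, y = lloyd_max(x, 4)
print(a, y)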
Lossy image compression
Model
Part 1: DPCM
y_i = x_i − x_{i−1}
Entropy: H_o = 7.8 bits/pel (?)   H_r = 5.1 bits/pel (?)
Prediction error quantization with open loop
ei=xi−xi-1 → q(ei)
yi=yi-1+q(ei)
Variance: σ²_y = σ²_x + (n − 1)·σ²_q
Closed loop: Encoding
e_i = x_i − z_{i−1} → q(e_i)
z_i = z_{i−1} + q(e_i)
Error accumulation? No!
Without quantization: e_n = x_n − z_{n−1}, or x_n = z_{n−1} + e_n;  with quantization: z_n = z_{n−1} + q(e_n)
x_n − z_n = (z_{n−1} + e_n) − (z_{n−1} + q(e_n)) = e_n − q(e_n)
Example
• Open loop: quantization step is 8
x_j:      81   109   129   165   209   221
e_j:            28    20    36    44    12
[e_j/8]:         4     3     5     6     2
q(e_j):         32    24    40    48    16
y_j:      81   113   137   177   225   241
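A small Python sketch (not from the lecture) of open-loop vs. closed-loop DPCM with this step-8 quantizer; the open-loop run reproduces the table above, while the closed-loop run keeps the reconstruction error bounded by the quantization error:

import math

def dpcm(x, closed_loop, step=8):
    q = lambda e: step * math.floor(e / step + 0.5)     # uniform quantizer, round half up
    recon = [x[0]]                                      # the first sample is sent as-is
    for i in range(1, len(x)):
        pred = recon[-1] if closed_loop else x[i - 1]   # closed loop predicts from the reconstruction
        recon.append(recon[-1] + q(x[i] - pred))
    return recon

x = [81, 109, 129, 165, 209, 221]
print(dpcm(x, closed_loop=False))   # [81, 113, 137, 177, 225, 241] - errors accumulate
print(dpcm(x, closed_loop=True))    # [81, 113, 129, 169, 209, 225] - error stays <= step/2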
ΔH = H0 − H1 = log2(σ0/σ1)
σ²1 = 2σ²0·(1 − ρ(Δ)),
where σ²0 is the variance of the data x,
σ²1 is the variance of the prediction error e,
ρ(Δ) is the correlation coefficient of the pixels x_i and x_{i−1},
or ΔH = −0.5·log2[2(1 − ρ(Δ))].
Example: If ρ(Δ) = 0.8 → −log2[2·0.2] = 1.3 bits
If ρ(Δ) = 0.9 → −log2[2·0.1] = 2.3 bits
Optimum linear prediction
• 1-D linear predictor: x̂_i = Σ_{j=1..m} a_j·x_{i−j}
Usually m = 3.
Original block:    Bit-plane:    Reconstructed:
2  9 12 15          0 1 1 1       2 12 12 12
2 11 11  9          0 1 1 1       2 12 12 12
2  3 12 15          0 0 1 1       2  2 12 12
3  3  4 14          0 0 0 1       2  2  2 12
x̄ = 7.94, σ = 4.91, q = 9, a = 2.3 → [a] = 2, b = 12.3 → [b] = 12
1. How to construct the quantizer?
<x> = (1/m)·Σ_{i=1..m} x_i,   <x²> = (1/m)·Σ_{i=1..m} x_i²
σ² = <x²> − <x>²
a = <x> − σ·√(nb/na),   b = <x> + σ·√(na/nb)
2. Optimal scalar quantizer ("AMBTC")
D = min_{a,b,T} { Σ_{x_i<T} (x_i − a)² + Σ_{x_i≥T} (x_i − b)² }
• Max-Lloyd solution:
T = (a + b)/2,   a = (1/na)·Σ_{x_i<T} x_i,   b = (1/nb)·Σ_{x_i≥T} x_i
Original block:    Bit-plane:    Reconstructed:
2  9 12 15          0 1 1 1       2 12 12 12
2 11 11  9          0 1 1 1       2 12 12 12
2  3 12 15          0 0 1 1       2  2 12 12
3  3  4 14          0 0 0 1       2  2  2 12
x̄ = 7.94, σ = 4.91, q = 9, T = 9, na = 7, nb = 9
a = 2.3 → [a] = 2, b = 12.3 → [b] = 12
D = σa² + σb² = 7 + 43 = 50
[Scale 2 … 15 with a = 2, T = 9 and b = 12 marked]
Example of optimal quantizer ("AMBTC")
Original block:    Bit-plane:    Reconstructed:
2  9 12 15          0 1 1 1       3 12 12 12
2 11 11  9          0 1 1 1       3 12 12 12
2  3 12 15          0 0 1 1       3  3 12 12
3  3  4 14          0 0 0 1       3  3  3 12
x̄ = 7.94, σ = 4.91, q = 9, T = 9, na = 7, nb = 9
a = 2.7 → [a] = 3, b = 12.0 → [b] = 12
D = σa² + σb² = 4 + 43 = 47
[Scale 2 … 15 with a = 3, T = 9 and b = 12 marked]
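A compact sketch (not from the lecture) of this two-level block quantizer in Python, thresholding at the block mean and reproducing the levels a ≈ 3, b = 12 of the example above:

import numpy as np

def ambtc(block):
    T = block.mean()                       # threshold (here: the block mean)
    mask = block >= T                      # significance bit-plane
    a = float(block[~mask].mean())         # representative level of the low group
    b = float(block[mask].mean())          # representative level of the high group
    recon = np.where(mask, round(b), round(a))
    return mask.astype(int), round(a), round(b), recon

block = np.array([[2,  9, 12, 15],
                  [2, 11, 11,  9],
                  [2,  3, 12, 15],
                  [3,  3,  4, 14]])
mask, a, b, recon = ambtc(block)
print(a, b)                                # 3 12
print(int(((block - recon) ** 2).sum()))   # squared-error distortion D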
Representative levels compression
• We can treat the set of a's and b's as an image:
1. Predictive encoding of a and b
2. Lossless image compression algorithms (FELICS, JPEG-LS, CALIC)
3. Lossy compression: DCT (JPEG)
Significance bits compression
Binary image:
• Lossless binary image compression methods (JBIG, context modeling with arithmetic coding)
• Lossy image compression (vector quantization, with sub-sampling and interpolation of missing pixels, filtering)
Bitrate and Block size
8×8 blocks
[Block diagram of the decoder: Compressed Image Data → Entropy Decoder → Dequantizer → IDCT → Reconstructed Image Data, with table specifications for the entropy decoder and the dequantizer]
Divide image into N×N blocks
[Figure: an 8×8 block of transform coefficients, ordered from Low-Low (top left) to High-High (bottom right) frequencies]
2-D Transform Coding
[Figure: a block expressed as a weighted sum of basis images, with weights y00, y01, y10, y12, …, y23, …]
1-D DCT basis functions: N=8
[Plots of the basis functions for u = 0, 1, 2, 3]
x_j = Σ_{k=0..N−1} α(k)·C(k)·cos[ (2j+1)kπ / (2N) ],
where α(k) = √(1/N) for k = 0 and α(k) = √(2/N) for k = 1, 2, …, N−1.
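A short sketch (not from the lecture) that builds these orthonormal DCT-II basis functions in Python (scipy.fft.dct/idct with norm='ortho' implement the same transform):

import numpy as np

def dct_basis(N=8):
    # Rows are the N orthonormal 1-D DCT basis functions.
    k = np.arange(N).reshape(-1, 1)          # frequency index
    j = np.arange(N).reshape(1, -1)          # sample index
    basis = np.cos((2 * j + 1) * k * np.pi / (2 * N))
    basis[0] *= np.sqrt(1.0 / N)             # alpha(0) = sqrt(1/N)
    basis[1:] *= np.sqrt(2.0 / N)            # alpha(k) = sqrt(2/N) for k >= 1
    return basis

B = dct_basis(8)
print(np.allclose(B @ B.T, np.eye(8)))       # True: the basis is orthonormal
# The 2-D DCT of an 8x8 block is then B @ block @ B.T.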
Zig-zag ordering of DCT coefficients
Converting a 2-D matrix into a 1-D array so that the frequency (horizontal and vertical) increases in this order, and the coefficient variances decrease in this order (see the sketch below).
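A minimal sketch (not from the lecture) of generating the zig-zag scan order in Python: positions are sorted by anti-diagonal k+l, alternating the direction on every diagonal:

def zigzag_order(n=8):
    # Returns the (row, col) positions of an n x n block in zig-zag order.
    return sorted(((k, l) for k in range(n) for l in range(n)),
                  key=lambda p: (p[0] + p[1],
                                 p[0] if (p[0] + p[1]) % 2 else p[1]))

print(zigzag_order(4)[:8])
# [(0, 0), (0, 1), (1, 0), (2, 0), (1, 1), (0, 2), (0, 3), (1, 2)]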
Example of DCT for image block
Matlab: y=dct(x)
Distribution of DCT coefficients
DC coefficient AC coefficient
h_i = (1/12)·[ ∫ (p_i(x))^(1/3) dx ]³
Optimal bit allocation for DCT coefficients
Distortion: D = N·H·θ²·2^(−2B/N),
where θ² = ( ∏_{k=0..N−1} σ_k² )^(1/N)
yq(k,l)=round[y(k,l)/Q(k,l)]
z (k,l)=yq(k,l)·Q(k,l)
Examples: 236/16 → 15
-22/11 → -2
See: x=idct(y)
Original block
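A tiny Python sketch (not from the lecture) of this quantize / dequantize step:

import numpy as np

def quantize(y, Q):
    return np.round(y / Q).astype(int)      # yq(k,l) = round[y(k,l)/Q(k,l)]

def dequantize(yq, Q):
    return yq * Q                           # z(k,l) = yq(k,l)·Q(k,l)

print(quantize(np.array([236.0, -22.0]), np.array([16.0, 11.0])))   # [15 -2]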
Encoding of quantized DCT coefficients
♦ AC: ?
Encoding of quantized DCT coefficients
For luminance: visual weighting, masking
[Dead-zone quantizer with step δ (figure marks δ and 2δ):]
ν[m,n] = ⌊ |s[m,n]| / δ ⌋,   χ[m,n] = 0 for s[m,n] ≥ 0, 1 for s[m,n] < 0
[Example: quantized coefficients (Q=64): 64, 0, −1, 0, … → coded bitstream 0110110101...]
EBCOT
Dead-zone quantizer (as above): ν[m,n] = ⌊ |s[m,n]| / δ ⌋,   χ[m,n] = 0 for s[m,n] ≥ 0, 1 for s[m,n] < 0
• Sequence-based code:
– ROI coefficients are coded as independent sequences
– Allows random access to the ROI without fully decoding
– Can specify exact quality/bitrate for the ROI and the BG
• Scaling-based mode:
– Scale ROI mask coefficients up (the decoder scales them down)
– During encoding, the ROI mask coefficients are found significant at early stages of the coding
– The ROI is always coded with better quality than the BG
– Can't specify the rate for BG and ROI
Tiling
[Figure: block-based DCT (JPEG) vs. wavelet transform, WT (JPEG 2000)]
JPEG 2000 vs JPEG: Quantization
[Images: JPEG vs JPEG 2000]
JPEG 2000 vs JPEG: 0.3 bpp
[Images: JPEG vs JPEG 2000]
JPEG 2000 vs JPEG: Bitrate = 0.3 bpp
JPEG: MSE = 150, PSNR = 26.2 dB;  JPEG 2000: MSE = 73, PSNR = 29.5 dB
JPEG 2000 vs JPEG: Bitrate = 0.2 bpp
JPEG: MSE = 320, PSNR = 23.1 dB;  JPEG 2000: MSE = 113, PSNR = 27.6 dB