
Image Video Compression V2

Image and video compression aims to remove statistical and perceptual redundancy. Lossy compression, used for grayscale and color images and for video, is irreversible. Fidelity is commonly measured with the PSNR and SSIM metrics, which correlate poorly with human perception. Encoding predicts pixel values from neighboring pixels and prior frames using techniques such as spatial linear estimation, predictive encoding, and motion estimation. Standards such as JPEG, GIF, PNG, and MPEG-1/2 achieve compression with techniques such as the DCT, LZW, filtering, and bi-directional prediction of macroblocks.

Uploaded by

MOH

Image and video compression
CIE 425 Information theory
Compression motivation / aim
The goal of compression is to remove redundancy and reduce irrelevance (perceptual redundancy).
● Statistical redundancy
○ Spatial correlation, local: pixels at neighboring locations have similar intensities.
○ Spatial correlation, global: recurring patterns.
○ Spectral correlation: between color planes.
○ Temporal correlation: between consecutive frames.
● Perceptual redundancy: not all visual information is perceived by the eye/brain.
● Limitations of the rendering hardware.
Lossless vs Lossy Compression
● Lossless compression: reversible and information preserving; used for text, binary images, and palette images.
● Lossy compression: irreversible; used for grayscale images, color images, and video.
● Near-lossless compression: the reconstruction error is kept within a small bound; used in medical imaging and remote sensing.
Fidelity measures
● The peak-signal-to-noise ratio (PSNR) and the structural similarity index
measure (SSIM) are two commonly used pixel-level image quality metrics.
● PSNR and SSIM fail to capture differences at the feature level and correlate
poorly with human perception of image quality.
● Several researchers have therefore attempted to define alternative
measures of compression quality based on the similarity of the features
extracted from the reconstructed and original images and have shown that
these alternative measures correlate better with human subjective
perception.
● SSIM: Structural Similarity Index
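As a rough illustration of the pixel-level metric named above, PSNR can be computed from the mean squared error between the original and the reconstructed image. The sketch below assumes 8-bit samples (peak value 255) and images stored as lists of rows; the function names and the sample values are made up for illustration.

```python
import math

def mse(original, reconstructed):
    """Mean squared error between two equally sized images (lists of rows)."""
    total = 0
    count = 0
    for row_o, row_r in zip(original, reconstructed):
        for o, r in zip(row_o, row_r):
            total += (o - r) ** 2
            count += 1
    return total / count

def psnr(original, reconstructed, peak=255):
    """PSNR in dB; returns infinity when the two images are identical."""
    err = mse(original, reconstructed)
    if err == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / err)

# Hypothetical 3x3 grayscale patch and a slightly distorted reconstruction.
original = [[52, 55, 61], [59, 79, 61], [76, 62, 59]]
reconstructed = [[54, 55, 60], [59, 77, 61], [75, 62, 60]]
print(round(psnr(original, reconstructed), 2), "dB")
```

Because the error is averaged over all pixels, two reconstructions with the same PSNR can look very different, which is one reason the metric correlates poorly with perceived quality.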
Image encoding
● Some gray-level values are more probable than others.
● Pixel values are not i.i.d. (independent and identically distributed).
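One way to see why both observations matter is to compare the empirical entropy of the raw pixel values with the entropy of the differences between horizontal neighbors. The sketch below uses a made-up smooth ramp image; the helper names are illustrative only.

```python
import math
from collections import Counter

def entropy_bits(values):
    """Empirical zeroth-order entropy in bits per symbol."""
    counts = Counter(values)
    n = len(values)
    return sum((c / n) * math.log2(n / c) for c in counts.values())

# Hypothetical image: four identical rows forming a smooth ramp.
image = [[10, 11, 12, 13, 14, 15, 16, 17] for _ in range(4)]

pixels = [p for row in image for p in row]
# Differences between each pixel and its left neighbor (first column skipped).
differences = [row[i] - row[i - 1] for row in image for i in range(1, len(row))]

print("pixel entropy:     ", entropy_bits(pixels))
print("difference entropy:", entropy_bits(differences))
```

Treating the pixels as independent symbols ignores the correlation between neighbors; coding the differences exploits it, which is the idea behind the predictive schemes later in the deck.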
LZW

Widely used: GIF, TIFF, PDF …

A royalty-free alternative, DEFLATE (LZ77 + Huffman coding), is used in PNG, ZIP, …

LZW patent expired


GIF (Graphics Interchange Format)
● One of the earliest widely used compressed image formats (1987)
● Limited to an 8-bit color space: each GIF image can contain at most 256 different colors selected from a 24-bit RGB color space
● Supports animation
● Based on the LZW compression scheme
LZW
GIF: LZW
The color table comes from the global color table block. The colors are listed in the
order in which they appear in the file, and the first color is given index zero. The
codes are sent starting at the top left of the image and working to the right. At the
end of a line, the very next code is the one that starts the next line. (The decoder
will "wrap" the image based on the image dimensions.)
The sample image could be encoded in the following way:

1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2,
1, 1, 1, 0, 0, 0, 0, 2, 2, 2, 1, 1, 1, 0, 0, 0, 0, 2, 2, 2, ...
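To show how an index stream like this shrinks under LZW, here is a minimal textbook LZW encoder and decoder. It is a simplified sketch, not the exact GIF procedure: GIF additionally reserves clear and end-of-information codes and packs variable-width codes into bytes, all of which are omitted here. The 4-entry color table size is an assumption.

```python
def lzw_encode(indices, alphabet_size):
    """Textbook LZW: emit one code per longest sequence already in the table."""
    table = {(i,): i for i in range(alphabet_size)}
    next_code = alphabet_size
    codes = []
    current = ()
    for symbol in indices:
        candidate = current + (symbol,)
        if candidate in table:
            current = candidate
        else:
            codes.append(table[current])
            table[candidate] = next_code  # learn the new sequence
            next_code += 1
            current = (symbol,)
    if current:
        codes.append(table[current])
    return codes

def lzw_decode(codes, alphabet_size):
    """Rebuild the same table on the fly and recover the index stream."""
    if not codes:
        return []
    table = {i: (i,) for i in range(alphabet_size)}
    next_code = alphabet_size
    previous = table[codes[0]]
    output = list(previous)
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        else:
            # The code was created by the encoder in its most recent step.
            entry = previous + (previous[0],)
        output.extend(entry)
        table[next_code] = previous + (entry[0],)
        next_code += 1
        previous = entry
    return output

# First two rows of the index stream above (assumed 4-color table).
stream = [1] * 5 + [2] * 5 + [1] * 5 + [2] * 5
codes = lzw_encode(stream, alphabet_size=4)
print(len(stream), "indices ->", len(codes), "codes")
```

The decoder never receives the table; it reconstructs it from the codes themselves, which is why no dictionary has to be stored in the file.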
Spatial linear estimator

Coefficients are optimized to minimize the MSE
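The estimator on this slide is not reproduced in the text, so the following is only a generic formulation under assumed notation: the current pixel $x$ is predicted as a weighted sum of $K$ previously coded neighbors $x_1, \dots, x_K$ (for example left, upper, and upper-left), with coefficients $a_k$.

```latex
\hat{x} = \sum_{k=1}^{K} a_k \, x_k ,
\qquad
e = x - \hat{x}

% Minimizing the mean squared error E[e^2] over the coefficients a_k
% gives the normal equations:
\sum_{l=1}^{K} a_l \, E[x_k x_l] = E[x \, x_k] ,
\qquad k = 1, \dots, K
```

In matrix form this is $R\,a = r$, where $R$ holds the correlations between the neighbors and $r$ the correlations between each neighbor and the pixel being predicted.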


Predictive encoder
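A minimal sketch of the idea, assuming the simplest possible predictor (the previous pixel in the row) and no quantization, so the scheme is lossless. A lossy predictive encoder would quantize the residual inside the loop; that is not shown here. The function names and the sample row are illustrative.

```python
def dpcm_encode(row):
    """Replace each pixel with its difference from the previous pixel."""
    residuals = []
    prediction = 0  # assumed initial prediction for the first pixel
    for pixel in row:
        residuals.append(pixel - prediction)
        prediction = pixel
    return residuals

def dpcm_decode(residuals):
    """Invert dpcm_encode by accumulating the residuals."""
    row = []
    prediction = 0
    for residual in residuals:
        pixel = prediction + residual
        row.append(pixel)
        prediction = pixel
    return row

row = [100, 102, 103, 103, 105, 110]
residuals = dpcm_encode(row)
print(residuals)  # [100, 2, 1, 0, 2, 5]
```

After the first sample the residuals are small and clustered around zero, so an entropy coder can represent them with fewer bits than the raw pixel values.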
PNG
● Developed in 1996 to be lossless and patent-free
● Adopted as international standard ISO/IEC 15948 in 2003
● Supports both 8-bit and up to 48-bit color
● Uniquely supports full transparency
● Based on DEFLATE (LZ77 + Huffman coding)
PNG: pixel mapping
As in GIF, pixels are first mapped to an index stream. Each row of pixels is called a
“scanline” and contains up to 4 channels per pixel.
Filtering replaces each byte x with the difference between its actual value and a
filtered (predicted) value computed from neighboring bytes. Rows of filtered bytes are
concatenated for compression.
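PNG defines five filter types (None, Sub, Up, Average, Paeth) and prefixes every scanline with a byte naming the one used. The sketch below shows only the Paeth filter on a single scanline and leaves out the filter-type byte; the function names and sample bytes are illustrative, and `bpp` (bytes per pixel) is a parameter name chosen for this example.

```python
def paeth_predictor(a, b, c):
    """Pick whichever of a (left), b (above), c (upper-left) is closest to a + b - c."""
    p = a + b - c
    pa, pb, pc = abs(p - a), abs(p - b), abs(p - c)
    if pa <= pb and pa <= pc:
        return a
    if pb <= pc:
        return b
    return c

def filter_paeth(scanline, previous, bpp):
    """Replace each byte with (actual - predicted) modulo 256."""
    out = []
    for i, x in enumerate(scanline):
        a = scanline[i - bpp] if i >= bpp else 0
        b = previous[i]
        c = previous[i - bpp] if i >= bpp else 0
        out.append((x - paeth_predictor(a, b, c)) % 256)
    return out

def unfilter_paeth(filtered, previous, bpp):
    """Invert filter_paeth using the bytes already reconstructed."""
    out = []
    for i, f in enumerate(filtered):
        a = out[i - bpp] if i >= bpp else 0
        b = previous[i]
        c = previous[i - bpp] if i >= bpp else 0
        out.append((f + paeth_predictor(a, b, c)) % 256)
    return out

# Hypothetical grayscale scanlines (1 byte per pixel).
previous = [10, 20, 30, 40]
scanline = [12, 22, 33, 41]
filtered = filter_paeth(scanline, previous, bpp=1)
print(filtered)
```

For the first scanline of an image the previous row is taken to be all zeros. The filtered bytes are what DEFLATE then compresses.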
GIF vs PNG
JPEG
Fidelity measures
Motion estimation
● Helps in understanding the content of an image sequence.
● Helps reduce the temporal redundancy of video for compression.
● Helps stabilize video by detecting and removing small, noisy global motions, e.g. in a camcorder stabilizer.
● A hard problem in general!
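One common way to estimate motion for compression is exhaustive block matching: for a block in the current frame, try every displacement within a search range and keep the one with the smallest sum of absolute differences (SAD) against the reference frame. The slides do not say which method they use, so this is only an illustrative sketch; the frame contents, block size, and search range are made up.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between an n x n block and a displaced reference block."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total

def best_motion_vector(cur, ref, bx, by, n, search):
    """Full search over [-search, search] in both directions; returns (dx, dy, cost)."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates that fall outside the reference frame.
            if by + dy < 0 or by + dy + n > height:
                continue
            if bx + dx < 0 or bx + dx + n > width:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best[1], best[2], best[0]

# Hypothetical 8x8 reference frame, and a current frame that is the reference
# shifted 2 pixels right and 1 pixel down (uncovered pixels set to 0).
ref = [[16 * y + x for x in range(8)] for y in range(8)]
cur = [[ref[y - 1][x - 2] if y >= 1 and x >= 2 else 0 for x in range(8)]
       for y in range(8)]

print(best_motion_vector(cur, ref, bx=2, by=2, n=4, search=2))
```

The returned vector points from the block in the current frame to its best match in the reference frame. Full search is expensive, which is part of why the problem is considered hard in practice.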
Frame prediction
● I frame: intra frame (also called key frame), coded without reference to other frames
● P frame: forward-predicted frame, predicted from one previously decoded frame
● B frame: bi-directionally predicted frame, predicted from one or two previously coded frames
● Group of Pictures (GoP): I BB P BB P BB P BB … P BB I
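Because a B frame may depend on a later reference frame, the frames cannot be coded in display order: each reference (I or P) has to be sent before the B frames that precede it on screen. The sketch below illustrates that reordering; the function name and the handling of trailing B frames (simply emitted at the end) are assumptions made for this example.

```python
def coding_order(gop):
    """Reorder a display-order GoP string so every reference precedes its B frames."""
    ordered = []
    pending_b = []
    for index, frame_type in enumerate(gop):
        label = f"{frame_type}{index}"
        if frame_type == "B":
            # Hold B frames until the next reference frame has been emitted.
            pending_b.append(label)
        else:
            ordered.append(label)
            ordered.extend(pending_b)
            pending_b = []
    ordered.extend(pending_b)  # trailing B frames with no later reference
    return ordered

print(coding_order("IBBPBBP"))
# ['I0', 'P3', 'B1', 'B2', 'P6', 'B4', 'B5']
```

The number after each letter is the display position, so the output makes the reordering, and the resulting buffering and delay at the decoder, easy to see.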
ISO/IEC MPEG-1
● First video compression standard developed by the ISO.
● Targeted at storage and retrieval of moving pictures and audio on digital media such
as video CDs, with target rates around 1.5 Mbps (quality the same as or better than VHS).
● Similar to H.261 with enhancements.
● B frames in addition to I and P frames.
● Adaptive perceptual quantization: a separate quantization scale factor is applied to each
AC coefficient, matched to human visual perception.
● Only progressive (non-interlaced) video is supported.
● Intraframe coding: I frames in addition to D frames (DC frames, which contain only the DC
components of the blocks; useful for fast search applications).
● Interframe coding: P and B frames.
● Wide range of input resolutions.
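The perceptual quantization point above can be sketched as dividing each transform coefficient by its own weight, with larger weights for higher frequencies, times a global scale. The weights and coefficients below are made up for illustration and this is not the normative MPEG-1 arithmetic; it only shows the principle.

```python
def quantize(coeffs, weights, scale):
    """Divide each coefficient by its weight times a global scale, then round."""
    return [[round(c / (w * scale)) for c, w in zip(c_row, w_row)]
            for c_row, w_row in zip(coeffs, weights)]

def dequantize(levels, weights, scale):
    """Approximate reconstruction: multiply the levels back by weight and scale."""
    return [[q * w * scale for q, w in zip(q_row, w_row)]
            for q_row, w_row in zip(levels, weights)]

# Hypothetical 2x2 corner of a coefficient block; top-left is the lowest frequency.
coeffs = [[240, 33], [-18, 5]]
# Hypothetical weights: coarser steps for higher frequencies.
weights = [[8, 16], [16, 32]]

levels = quantize(coeffs, weights, scale=1)
print(levels)                                  # [[30, 2], [-1, 0]]
print(dequantize(levels, weights, scale=1))    # [[240, 32], [-16, 0]]
```

Small high-frequency coefficients become zero, which is where most of the bit savings come from, and the rounding is where the irreversible loss occurs.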
ISO/IEC MPEG-1
Bi-directional Prediction Advantages
○ Higher coding efficiency.
○ No uncovered background problems.
○ Effect of noise can be decreased by averaging between past
and future reference frames.
○ No prediction error propagation, since B frames are not used as references.
○ Increased frame rate with few extra bits.
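The averaging mentioned in the list above can be sketched as an interpolated prediction: the block predicted from the past reference and the block predicted from the future reference are averaged sample by sample. The rounding convention and the sample values here are assumptions for illustration.

```python
def predict_bidirectional(forward_block, backward_block):
    """Average the forward and backward predictions, rounding halves up."""
    return [[(f + b + 1) // 2 for f, b in zip(f_row, b_row)]
            for f_row, b_row in zip(forward_block, backward_block)]

def residual(actual, predicted):
    """Prediction error that would be transformed and coded."""
    return [[a - p for a, p in zip(a_row, p_row)]
            for a_row, p_row in zip(actual, predicted)]

# Hypothetical 2x2 blocks: the scene brightens between the two references.
past = [[100, 100], [100, 100]]
future = [[110, 112], [108, 110]]
current = [[105, 106], [104, 105]]

prediction = predict_bidirectional(past, future)
print(prediction)                     # [[105, 106], [104, 105]]
print(residual(current, prediction))  # [[0, 0], [0, 0]]
```

Averaging two references also tends to cancel uncorrelated noise in them, which is the noise-reduction effect the slide refers to.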
ISO/IEC MPEG-1
Bi-directional Disadvantages
○ Increasing the number of B frames between references results
in a decreased correlation between reference frames and
between B frames and reference frames.
○ For a large class of scenes, references are placed at about 1/10th-second
intervals, resulting in 2 B frames between consecutive reference (I and P)
frames; example: IBBPBBPBB..PBBI.
○ At least two frames need to be stored at the decoder.
○ Increased delay.
Frame prediction
● Work on each macroblock (MB) (16x16 pixels) independently for reduced complexity.
● Motion compensation is done at the MB level.
● DCT coding is done at the block level (8x8 pixels).
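For reference, the block-level transform can be sketched as a direct 2D DCT-II of an 8x8 block with orthonormal scaling. This is a straightforward, unoptimized version for illustration; real encoders use fast factorizations, and the flat test block is made up.

```python
import math

N = 8  # block size used for DCT coding

def dct_2d(block):
    """Direct 2D DCT-II of an N x N block with orthonormal scaling."""
    def alpha(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):          # vertical frequency
        for v in range(N):      # horizontal frequency
            total = 0.0
            for y in range(N):
                for x in range(N):
                    total += (block[y][x]
                              * math.cos((2 * y + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * x + 1) * v * math.pi / (2 * N)))
            out[u][v] = alpha(u) * alpha(v) * total
    return out

# Hypothetical flat block: all energy should land in the DC coefficient.
flat = [[128] * N for _ in range(N)]
coeffs = dct_2d(flat)
print(round(coeffs[0][0], 3))
```

For smooth image blocks most of the energy ends up in a few low-frequency coefficients, which is what makes the subsequent quantization and entropy coding effective.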


ISO/IEC MPEG-1 Encoder
ISO/IEC MPEG-2 Encoder
Thanks
