Chapter 6 Lossy Compression Algorithms

Lossy compression algorithms selectively discard less important information to reduce file sizes. This type of compression is commonly used for multimedia like audio and video. For images, lossy compression creates approximations of originals that are close perceptually if not identical. Distortion measures quantify information lost during compression. Rate distortion theory establishes the minimum rate needed for a given distortion level, representing the tradeoff between rate and distortion. Popular lossy techniques include quantization, predictive coding, and transform coding like DCT. Video compression exploits temporal redundancy between frames using motion compensation and spatial redundancy reduction.


Lossy Compression Algorithms

Image, Video and Audio Compression Techniques


Chapter SIX
Introduction
• Lossy compression algorithms are data compression methods that
selectively discard certain information in order to reduce the size of
the data.
• The discarded information is typically chosen to be perceptually less
important to the viewer or listener.
• This type of compression is commonly used in multimedia
applications, where large files such as audio and video need to be
compressed to be easily transmitted or stored.
Cont’d…
• For image compression in multimedia applications, where a higher
compression ratio is required, lossy methods are usually adopted.
• In lossy compression, the compressed image is usually not the same
as the original image but is meant to form a close approximation to
the original image perceptually.
• To quantitatively describe how close the approximation is to the
original data, some form of distortion measure is required.
Distortion Measures
• A distortion measure is a mathematical quantity that specifies how
close an approximation is to its original, using some distortion criteria.
• When looking at compressed data, it is natural to think of the
distortion in terms of the numerical difference between the original
data and the reconstructed data.
• However, when the data to be compressed is an image, such a
measure may not yield the intended result.
Cont’d…
• In lossy compression, distortion measures refer to the quantification
of the difference between the original uncompressed data and the
compressed version that has been reconstructed from the
compressed data.
• When data is compressed using lossy compression techniques, some
of the information is lost, and the reconstructed version of the data is
an approximation of the original.
• The distortion measures quantify how much of the original
information has been lost in the compression process.
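Two widely used distortion measures are the mean squared error (MSE) and the peak signal-to-noise ratio (PSNR), which expresses the MSE relative to the peak sample value on a logarithmic dB scale. A minimal sketch:

```python
import math

def mse(original, reconstructed):
    """Mean squared error between two equal-length sample sequences."""
    n = len(original)
    return sum((o - r) ** 2 for o, r in zip(original, reconstructed)) / n

def psnr(original, reconstructed, peak=255):
    """Peak signal-to-noise ratio in dB; higher means less distortion."""
    m = mse(original, reconstructed)
    if m == 0:
        return float("inf")  # identical signals: no distortion
    return 10 * math.log10(peak ** 2 / m)

orig = [52, 55, 61, 66]
recon = [50, 56, 60, 68]
print(mse(orig, recon))             # 2.5
print(round(psnr(orig, recon), 1))  # 44.2
```

As noted above, such numerical closeness does not always match perceptual closeness for images.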
The rate distortion theory
• The rate-distortion function is a mathematical expression that defines
the minimum possible rate of compression for a given level of
distortion.
• The rate-distortion function can be used to determine the optimal
compression rate for a given level of distortion, or the optimal level of
distortion for a given compression rate.
The rate distortion theory
• Lossy compression always involves a tradeoff between rate and distortion.
• Rate is the average number of bits required to represent each source symbol.
• Within this framework, the tradeoff between rate and distortion is represented
in the form of a rate-distortion function R(D).
• Intuitively, for a given source and a given distortion measure, if D is a tolerable
amount of distortion, R(D) specifies the lowest rate at which the source data can
be encoded while keeping the distortion bounded above by D.
• It is easy to see that when D = 0, we have a lossless compression of the source.
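As a standard concrete instance: for a memoryless Gaussian source with variance σ² under the squared-error distortion measure, the rate-distortion function has the closed form R(D) = ½ log₂(σ²/D) bits per sample for 0 < D ≤ σ², and 0 otherwise. A small sketch:

```python
import math

def rate_distortion_gaussian(variance, distortion):
    """R(D) in bits/sample for a memoryless Gaussian source
    under the squared-error distortion measure."""
    if distortion >= variance:
        return 0.0  # tolerable distortion exceeds source variance: send nothing
    return 0.5 * math.log2(variance / distortion)

# Reducing the tolerable distortion by a factor of 4 costs one extra bit/sample:
print(rate_distortion_gaussian(1.0, 0.25))    # 1.0
print(rate_distortion_gaussian(1.0, 0.0625))  # 2.0
```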
Cont’d…
• The rate-distortion function is meant to describe a fundamental limit
for the performance of a coding algorithm and so can be used to
evaluate the performance of different algorithms.
Quantization
• In practice, lossy compression algorithms use a variety of techniques
to achieve the desired trade-off between compression rate and
distortion.
• These techniques include quantization, predictive coding, and
transform coding.
• Quantization involves mapping the continuous values in the input
data to a finite set of discrete values, which reduces the amount of
data that needs to be stored or transmitted.
Cont’d…
• Predictive coding involves using a model to predict the values of the
input data based on past values, which can further reduce the
amount of data that needs to be stored or transmitted.
• Transform coding involves transforming the input data into a new
representation that is more compressible, such as through a Fourier
or wavelet transform.
Uniform quantization:
• This is the simplest form of quantization in which the input range is
divided into equal intervals, and each interval is assigned a
representative value.
• The representative values are usually chosen to be the midpoints of
the intervals.
• For example, for input values ranging from 0 to 255, we might divide this range into 32 intervals, each having a width of 8.
• We would then assign the midpoint of each interval as its representative value: 4, 12, 20, ..., 252.
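A minimal sketch of a uniform quantizer in Python, here using 32 intervals of width 8 over the 0-255 range (a width-1 interval per value would reproduce the input exactly and compress nothing):

```python
def uniform_quantize(x, step):
    """Map x to the midpoint of the width-`step` interval containing it."""
    index = int(x // step)          # which interval x falls into
    return index * step + step / 2  # the interval's midpoint

print(uniform_quantize(0, 8))    # 4.0
print(uniform_quantize(130, 8))  # 132.0
print(uniform_quantize(255, 8))  # 252.0
```

Only the 5-bit interval index needs to be stored or transmitted, instead of the original 8-bit value.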
Non-uniform quantization:
• This type of quantization is used when the input range is not evenly
distributed.
• In non-uniform quantization, the input range is divided into intervals
of varying widths, and a representative value is assigned to each
interval.
• For example, in audio compression, the human ear is more sensitive
to changes in low-frequency sounds than high-frequency sounds.
• Non-uniform quantization can be used to allocate more bits to the
low-frequency sounds to improve the overall quality of the
compressed audio.
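A classic amplitude-domain example of non-uniform quantization is μ-law companding, used for telephone speech in the G.711 standard: the signal is compressed with a logarithmic curve, quantized uniformly, and expanded on playback, which gives quiet sounds finer quantization steps than loud ones. (The frequency example above is analogous: allocate resolution where perception is most sensitive.) A sketch with the standard μ = 255:

```python
import math

MU = 255  # standard mu-law parameter

def mu_compress(x):
    """Compress x in [-1, 1]; small amplitudes get finer resolution."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def mu_expand(y):
    """Inverse of mu_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)

def nonuniform_quantize(x, levels=16):
    """Quantize uniformly in the compressed domain: this yields
    non-uniform interval widths in the original domain."""
    step = 2.0 / levels
    index = min(int((mu_compress(x) + 1) // step), levels - 1)
    midpoint = -1 + index * step + step / 2
    return mu_expand(midpoint)

# Quantization error is much smaller for quiet samples than for loud ones:
print(abs(nonuniform_quantize(0.02) - 0.02) < abs(nonuniform_quantize(0.9) - 0.9))  # True
```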
Vector quantization:
• This type of quantization is used when the input data is in the form of
a vector, such as an image or audio signal.
• Vector quantization involves dividing the input vector space into
smaller subspaces, and assigning a representative vector to each
subspace.
• This can be useful in situations where there are correlations between
the input data, as it allows for more efficient compression of the data.
• For example, in image compression, vector quantization can be used
to compress blocks of pixels that have similar color values.
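A minimal sketch of the encoder side, with a small hypothetical codebook of 2-D vectors: each input vector is replaced by the index of its nearest codeword, and only that index is transmitted (the decoder holds the same codebook):

```python
def nearest_codeword(vector, codebook):
    """Index of the codebook entry with the smallest squared distance."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda i: dist2(vector, codebook[i]))

# Hypothetical codebook; real codebooks are trained on sample data (e.g. with k-means).
codebook = [(0, 0), (128, 128), (255, 255), (0, 255)]
block = (120, 140)
index = nearest_codeword(block, codebook)
print(index, codebook[index])  # 1 (128, 128)
```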
Transform coding
• Transform coding is a popular technique used in lossy compression
algorithms to reduce the size of digital data while minimizing the loss
of quality.
• It involves transforming the original data from the spatial domain to
the frequency domain using a mathematical function known as a
transform.
• The transformed data is then quantized, which means that the data
values are reduced to a smaller range of values, resulting in some loss
of information.
Cont’d…
• The most commonly used transform in lossy compression is the
Discrete Cosine Transform (DCT), which is widely used in image and
video compression algorithms such as JPEG and MPEG.
• The DCT is a mathematical function that transforms an image from
the spatial domain (i.e., the pixel values of the image) to the
frequency domain (i.e., the frequency components of the image).
Cont’d…
• It is the discrete analog of the formula for the coefficients of a Fourier
series.
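A direct (unoptimized) sketch of the 1-D DCT-II from its definition; the 2-D DCT used on image blocks applies this transform along each row and then each column:

```python
import math

def dct_1d(x):
    """DCT-II of a length-N sequence, with orthonormal scaling."""
    N = len(x)
    out = []
    for u in range(N):
        c = math.sqrt(1 / N) if u == 0 else math.sqrt(2 / N)
        s = sum(x[n] * math.cos((2 * n + 1) * u * math.pi / (2 * N))
                for n in range(N))
        out.append(c * s)
    return out

# A constant signal has all its energy in the DC (u = 0) coefficient:
coeffs = dct_1d([10.0] * 8)
print(round(coeffs[0], 4))                     # 28.2843
print(all(abs(c) < 1e-9 for c in coeffs[1:]))  # True
```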
How JPEG compression works?
• This technique of image compression was developed by the Joint Photographic Experts Group, hence the name JPEG.
• It uses a lossy compression algorithm, so some information is removed from the image during compression.
• The JPEG standard works by averaging color variation and discarding the information that the human eye cannot see.
Cont’d…
• JPEG compresses either full-color or grayscale images.
• In the case of color images, RGB is transformed into a luminance-chrominance color space.
• JPEG compression mainly works by identifying areas of similar color inside the image and converting them to exactly the same color code.
• JPEG uses the DCT (Discrete Cosine Transform) method for the coding transformation.
Steps of Compression:
1. The raw image is first converted to a different color model, which separates the color of a pixel from its brightness.
2. The image is divided into small blocks of 8×8 pixels each.
3. Specifically, RGB is converted into Y-Cb-Cr; JPEG uses a Y-Cb-Cr model instead of RGB.
4. After that, DCT is applied to each block of pixels, converting the image from the spatial domain to the frequency domain. For an 8×8 block, the DCT formula is:
F(u,v) = (1/4) C(u) C(v) Σx Σy f(x,y) cos[(2x+1)uπ/16] cos[(2y+1)vπ/16], where C(k) = 1/√2 for k = 0 and C(k) = 1 otherwise.
Cont’d…
5. The resulting coefficients are then quantized. Because human eyes cannot perceive fine high-frequency detail, the high-frequency coefficients are quantized more coarsely than the low-frequency ones.
6. After quantization, a zig-zag scan is performed on each quantized 8×8 block to group the low-frequency coefficients at the start of the sequence.
7. The coefficients are then encoded by run-length and Huffman coding algorithms to get the final image.
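The zig-zag and run-length steps can be sketched as follows; the block below is a toy already-quantized example, and the run-length pairs shown would be Huffman-coded in a real encoder:

```python
def zigzag_order(n=8):
    """Coordinates of an n x n block in zig-zag order: traverse the
    anti-diagonals (constant i + j), alternating direction."""
    coords = [(i, j) for i in range(n) for j in range(n)]
    return sorted(coords, key=lambda p: (p[0] + p[1],
                                         p[0] if (p[0] + p[1]) % 2 else -p[0]))

def run_length(values):
    """(zero_run, value) pairs; a trailing (0, 0) marks end of block."""
    pairs, run = [], 0
    for v in values:
        if v == 0:
            run += 1
        else:
            pairs.append((run, v))
            run = 0
    if run:
        pairs.append((0, 0))  # all remaining coefficients are zero
    return pairs

block = [[0] * 8 for _ in range(8)]
block[0][0], block[0][1] = 16, 8  # quantized DC and one low-frequency AC

scanned = [block[i][j] for i, j in zigzag_order()]
print(run_length(scanned))  # [(0, 16), (0, 8), (0, 0)]
```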
JPEG Standard
Introduction to Video Compression
• Video is a collection of images taken closely together in time.
• Therefore, in most cases, the difference between adjacent images is
not large.
• Video compression techniques take advantage of the repetition of
portions of the picture from one image to another by concentrating
on the changes between neighboring images.
• In other words, there is a lot of redundancy in video frames. There are
two types of redundancy: Spatial and Temporal Redundancy
Cont’d…
• Spatial redundancy: pixel-to-pixel or spectral correlation within the
same frame
• Temporal redundancy: similarity between two or more different frames
Video compression based on motion
compensation
• The MPEG video compression algorithm relies on two basic techniques:
• motion compensation for the reduction of temporal redundancy, and transform-domain (DCT) based compression for the reduction of spatial redundancy.
• Motion-compensated techniques are the techniques that exploit the temporal redundancy of video signals.
• The concept of motion compensation is based on the estimation of motion
between video frames, i.e. if all elements in a video scene are approximately
spatially displaced, the motion between frames can be described by a
limited number of motion parameters (by motion vectors for translatory
motion of pixels).
Cont’d…
• The remaining signal (prediction error) is further compressed with
spatial redundancy reduction (DCT).
• The information relative to motion is based on 16 × 16 blocks and is
transmitted together with the spatial information.
• The motion information is compressed using variable-length codes to
achieve maximum efficiency.
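Motion estimation by exhaustive block matching can be sketched as below, using the sum of absolute differences (SAD) as the matching cost. For brevity this uses 4 × 4 blocks and a ±2-pixel search window, whereas MPEG uses 16 × 16 macroblocks; the vector found points from the current block back to its best match in the previous frame:

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))

def block(frame, top, left, size):
    """Extract a size x size block whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in frame[top:top + size]]

def best_motion_vector(prev, cur, top, left, size=4, search=2):
    """Full search over displacements (dy, dx) within +/- `search`:
    return the one whose block in `prev` best predicts the current block."""
    target = block(cur, top, left, size)
    best_cost, best_mv = None, None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            if 0 <= y <= len(prev) - size and 0 <= x <= len(prev[0]) - size:
                cost = sad(block(prev, y, x, size), target)
                if best_cost is None or cost < best_cost:
                    best_cost, best_mv = cost, (dy, dx)
    return best_mv

# A bright square moves one pixel to the right between frames:
prev = [[0] * 8 for _ in range(8)]
cur = [[0] * 8 for _ in range(8)]
for i in range(2, 6):
    for j in range(2, 6):
        prev[i][j] = 200
        cur[i][j + 1] = 200

print(best_motion_vector(prev, cur, top=2, left=3))  # (0, -1)
```

Only the motion vector plus the (small) prediction error then needs to be coded, rather than the raw block.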
Types of Frames
Cont’d…
• Because of the importance of random access for stored video and the
significant bit-rate reduction afforded by motion-compensated
interpolation, four types of frames are defined in MPEG:
• Intraframes(I-frames),
• Predicted frames(P-frames),
• Interpolated frames (B-frames), and
• DC-frames (D-frames).
I-Frames
• I-frames (Intra-coded frames) are coded independently with no reference to
other frames.
• I-frames provide random access points in the compressed video data, since
the I-frames can be decoded independently without referencing to other
frames.
• With I-frames, an MPEG bit-stream is more editable.
• Also, error propagation due to transmission errors in previous frames will be
terminated by an I-frame since the I-frame does not have a reference to the
previous frames.
• Since I-frames use only transform coding without motion-compensated
predictive coding, they provide only moderate compression.
P-Frames
• P-frames (Predictive-coded frames) are coded using the forward
motion-compensated prediction from the preceding I- or P-frame.
• P-frames provide more compression than the I-frames by virtue of
motion-compensated prediction.
• They also serve as references for B frames and future P-frames
• Transmission errors in the I-frames and P-frames can propagate to the
succeeding frames since the I-frames and P-frames are used to predict
the succeeding frame
B-Frame
• B-frames (Bi-directionally-coded frames) allow macroblocks to be
coded using bidirectional motion-compensated prediction from both
the past and future reference I-frames or P-frames.
• In B-frames, each bi-directional motion-compensated macroblock
can have two motion vectors: a forward motion vector, which
references a best matching block in the previous I-frame or P-frame,
and a backward motion vector, which references a best matching
block in the next I-frame or P-frame.
Cont’d…
• The motion compensated prediction can be formed by the average of
the two referenced motion compensated blocks.
• By averaging between the past and the future reference blocks, the
effect of noise can be decreased. B-frames provide the best
compression compared to I- and P-frames.
• I- and P-frames are used as reference frames for predicting B-frames.
• To keep the structure simple, and since there is no apparent advantage
to using B-frames for predicting other B-frames, B-frames are not
used as reference frames. Hence, errors in B-frames do not propagate.
D-frames
• D-frames (DC-frames) are low-resolution frames obtained by decoding
only the DC coefficient of the Discrete Cosine Transform coefficients
of each macroblock.
• They are not used in combination with I-, P-, or B-frames.
• D-frames are rarely used, but are defined to allow fast searches on
sequential digital storage media
Zig-Zag Scan for Entropy encoding
Apply Huffman Encoding
MPEG Audio Compression
• Psychoacoustics
• The range of human hearing is about 20 Hz to about 20 kHz.
• The frequency range of the voice is typically only from about 500 Hz
to 4 kHz.
• The dynamic range, the ratio of the maximum sound amplitude to the
quietest sound that humans can hear, is on the order of about 120 dB.
Fletcher-Munson Curves
• Equal-loudness curves that display the relationship between
perceived loudness ("Phons", in dB) and a given stimulus sound
volume ("Sound Pressure Level", also in dB), as a function of frequency.
Frequency Masking
• Lossy audio data compression methods, such as MPEG/Audio encoding,
remove some sounds which are masked anyway.
• The general situation in regard to masking is as follows:
• 1. A lower tone can effectively mask (make us unable to hear) a higher tone
• 2. The reverse is not true – a higher tone does not mask a lower tone well
• 3. The greater the power in the masking tone, the wider is its influence –
the broader the range of frequencies it can mask.
• 4. As a consequence, if two tones are widely separated in frequency, then
little masking occurs.
Bark Unit
• The Bark unit is defined as the width of one critical band, for any masking
frequency.
MPEG audio compression
• It takes advantage of psychoacoustic models, constructing a large
multi-dimensional lookup table to transmit masked frequency
components using fewer bits
• MPEG Audio Overview
• 1. Applies a filter bank to the input to break it into its frequency
components
• 2. In parallel, a psychoacoustic model is applied to the data to drive the
bit-allocation block
• 3. The number of bits allocated is used to quantize the information from the
filter bank – providing the compression
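The three steps above can be caricatured as below. A naive DFT stands in for the real MPEG polyphase filter bank, and a fixed threshold stands in for the psychoacoustic model, so the bit-allocation rule here is purely illustrative:

```python
import cmath, math

def dft(x):
    """Naive DFT: a stand-in for the MPEG polyphase filter bank."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]

def allocate_bits(band_energies, threshold=1.0, max_bits=8):
    """Toy bit allocation: zero bits for 'masked' bands below the
    threshold, more bits for louder bands."""
    bits = []
    for e in band_energies:
        if e <= threshold:
            bits.append(0)  # treated as inaudible: transmit nothing
        else:
            bits.append(min(max_bits, 1 + int(math.log2(e / threshold))))
    return bits

# A single tone at bin 4 of a 32-sample window:
x = [math.sin(2 * math.pi * 4 * n / 32) for n in range(32)]
energies = [abs(c) ** 2 for c in dft(x)[:16]]  # first half of the spectrum
bits = allocate_bits(energies)
print(bits)  # all bits go to the band containing the tone
```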
MPEG Audio Layers
• Layer 1 quality can be quite good provided a comparatively high bit-
rate is available – Digital Audio Tape typically uses Layer 1 at around
192 kbps
• Layer 2 has more complexity; was proposed for use in Digital Audio
Broadcasting
• Layer 3 (MP3) is most complex, and was originally aimed at audio
transmission over ISDN lines
• Most of the complexity increase is at the encoder, not the decoder –
accounting for the popularity of MP3 players
MPEG Audio Strategy
• The MPEG approach to compression relies on:
• – Quantization
• The human auditory system is not accurate within the width of a critical band
(in perceived loudness and audibility of a frequency)
• – Bank of filters
• Analyze the frequency ("spectral") components of the audio signal by
calculating a frequency transform of a window of signal values
• Decompose the signal into subbands by using a bank of filters (Layers 1 &
2: "quadrature-mirror"; Layer 3: adds a DCT; psychoacoustic model:
Fourier transform)
Cont’d…
• Frequency masking: a psychoacoustic model is used to estimate the
just-noticeable noise level:
• The encoder balances the masking behavior and the available number of
bits by discarding inaudible frequencies
• Quantization is scaled according to the sound level that is left over,
above the masking levels
The End!!!
