Img Compression
Image compression
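- to reduce the amount of data required to represent an image.
For a standard-definition (SD) television movie using 720 x 480 x 24-bit pixel arrays at 30 frames per
second (fps), SD digital video data must be accessed at
$$30\ \frac{\text{frames}}{\text{sec}} \times (720 \times 480)\ \frac{\text{pixels}}{\text{frame}} \times 3\ \frac{\text{bytes}}{\text{pixel}} = 31{,}104{,}000\ \text{bytes/sec}$$
and a two-hour movie consists of
$$31{,}104{,}000\ \frac{\text{bytes}}{\text{sec}} \times 60^2\ \frac{\text{sec}}{\text{hr}} \times 2\ \text{hrs} \approx 2.24 \times 10^{11}\ \text{bytes}$$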
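The arithmetic behind the SD video figures can be checked with a few lines of Python. This is only a sketch; the function name video_data_rate is made up for illustration.

```python
def video_data_rate(width, height, bytes_per_pixel, fps):
    """Uncompressed data rate in bytes per second."""
    return width * height * bytes_per_pixel * fps

# 720 x 480 pixels, 24 bits (3 bytes) per pixel, 30 frames per second
rate = video_data_rate(720, 480, 3, 30)

# Two hours of video: 60^2 seconds per hour times 2 hours
movie_bytes = rate * 60**2 * 2

print(rate)         # bytes per second
print(movie_bytes)  # bytes for the whole movie
```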
Fundamentals.
Data compression refers to the process of reducing the amount of data required to represent a given
quantity of information. If b and b' denote the number of bits in two representations of the same
information, the relative data redundancy R of the representation with b bits is
$$R = 1 - \frac{1}{C}$$
where C is the compression ratio, defined as
$$C = \frac{b}{b'}$$
If C = 10 (written as 10:1), the larger representation has 10 bits of data for every 1 bit of data in the
smaller representation; that is, for the larger representation R = 0.9, or 90% of its data is redundant.
2-D intensity arrays exhibit three principal types of data redundancy:
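A minimal sketch of the two definitions, with hypothetical function names, reproduces the 10:1 example:

```python
def compression_ratio(b, b_prime):
    """C = b / b', the ratio of bits in the larger and smaller representations."""
    return b / b_prime

def relative_redundancy(c):
    """R = 1 - 1/C for the larger representation."""
    return 1 - 1 / c

c = compression_ratio(10, 1)   # 10:1
r = relative_redundancy(c)     # fraction of redundant data in the larger representation
print(c, r)
```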
1. Coding redundancy:
A code is a system of symbols (letters, numbers, etc.) used to represent a body of
information. Each piece of information is assigned a sequence of code symbols, called a code
word. The number of symbols in each code word is its length (e.g., 8-bit codes).
2. Spatial and temporal redundancy:
Because the pixels of most 2-D intensity arrays are correlated spatially (i.e., each pixel is
similar to or dependent on neighbouring pixels), information is unnecessarily replicated in the
correlated pixels. In a video sequence, temporally correlated pixels (those in nearby frames) also
duplicate information.
3. Irrelevant information:
Most 2-D intensity arrays contain information that is ignored by the human visual system. It
is redundant in the sense that it is not used.
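Assume that r_k in the interval [0, L-1] is used to represent the intensities of an M x N
image and that each r_k occurs with probability p_r(r_k):
$$p_r(r_k) = \frac{n_k}{MN}, \qquad k = 0, 1, 2, \ldots, L-1$$
where L is the number of intensity values and n_k is the number of times that the kth intensity
appears in the image. If the number of bits used to represent each value of r_k is l(r_k), then the
average number of bits required to represent each pixel is
$$L_{avg} = \sum_{k=0}^{L-1} l(r_k)\, p_r(r_k)$$
If the intensities are represented using an m-bit fixed-length code, then l(r_k) = m for every k
and L_avg = m.
Example: a 256 x 256 8-bit image in which only four intensities occur. Code 1 is the 8-bit
fixed-length code and code 2 is a variable-length code.
  r_k                              p_r(r_k)   Code 1 length   Code 2 length
  87                               0.25       8               2
  128                              0.47       8               1
  186                              0.25       8               3
  255                              0.03       8               3
  r_k for k ≠ 87, 128, 186, 255    0          8               0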
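If code 2 is used, the average length of the encoded pixels is
$$L_{avg} = 0.25(2) + 0.47(1) + 0.25(3) + 0.03(3) = 1.81 \text{ bits}$$
The total number of bits required to represent the entire image is M N L_avg = 256 x 256 x 1.81
= 118,621 bits. The resulting compression ratio and relative redundancy are
$$C = \frac{256 \times 256 \times 8}{118{,}621} = \frac{8}{1.81} \approx 4.42$$
$$R = 1 - \frac{1}{4.42} = 0.774$$
Thus 77.4% of the data in the original 8-bit array is redundant. The saving comes from the
variable-length code, which assigns fewer bits to the more probable intensity values.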
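The example can be reproduced numerically. The sketch below uses the four probabilities and code 2 lengths from the L_avg calculation above; the intensity labels 87, 128, 186 and 255 are assumed to pair with them in that order, and the variable names are illustrative.

```python
# Probabilities and code 2 word lengths for the four intensities that occur
probs = {87: 0.25, 128: 0.47, 186: 0.25, 255: 0.03}
code2_lengths = {87: 2, 128: 1, 186: 3, 255: 3}

def average_length(probs, lengths):
    """L_avg = sum over k of l(r_k) * p_r(r_k)."""
    return sum(probs[k] * lengths[k] for k in probs)

l_avg = average_length(probs, code2_lengths)  # average bits per pixel with code 2
c = 8 / l_avg                                 # ratio against the 8-bit fixed-length code
r = 1 - 1 / c                                 # relative redundancy of the 8-bit version

print(round(l_avg, 2), round(c, 2), round(r, 3))
```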
Information that is ignored by the human visual system, or that is extraneous to the intended
use of the image, can be omitted. When an image is a homogeneous field of grey, it can be
represented by its average intensity alone, a single 8-bit value. The original 256 x 256 x 8-bit
array is then reduced to a single byte, and C = (256 x 256 x 8) / 8, or 65,536:1.
Even if several intensity values are present, the human visual system averages these
intensities, perceives only the average value, and ignores the small changes in intensity. A
histogram-equalized version of the image makes these intensity changes visible. Information
that is essential to the application, as in medical imaging, should therefore not be omitted.
The input image f(x, y) is fed into the encoder, which creates a compressed representation of the
input. The decoder generates a reconstructed output image f^(x, y). When f^(x, y) is an exact
replica of f(x, y), the compression is error free, lossless, or information preserving.
Otherwise, the reconstructed output image is distorted and the compression system is
lossy.
The Encoding or Compression Process
In the first stage of the encoding process, a mapper transforms f(x, y) into a format designed to
reduce spatial and temporal redundancy. This operation is generally reversible. Run-length
coding is an example.
The quantizer reduces the accuracy of the mapper's output in accordance with a pre-established
fidelity criterion, to keep irrelevant information out of the compressed
representation. This operation is irreversible, so it is omitted when error-free compression is desired.
The symbol coder generates a fixed- or variable-length code to represent the quantizer
output and maps the output in accordance with that code. The shortest code words are assigned to
the most frequently occurring quantizer output values, thus minimizing coding redundancy. This
operation is reversible.
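The following sketch illustrates the first two stages on a single image row. It assumes a pixel-difference mapper and a uniform quantizer, which are illustrative choices rather than the ones any particular standard uses. The symbol coder stage is left out, and all function names are hypothetical.

```python
def mapper(row):
    """Reversible mapping: keep the first pixel, then store differences between neighbours."""
    return [row[0]] + [row[i] - row[i - 1] for i in range(1, len(row))]

def inverse_mapper(mapped):
    """Undo the difference mapping with a running sum."""
    row, total = [], 0
    for value in mapped:
        total += value
        row.append(total)
    return row

def quantize(values, step):
    """Irreversible stage: a uniform quantizer with the given step size."""
    return [round(v / step) for v in values]

def dequantize(symbols, step):
    return [s * step for s in symbols]

def encode(row, step=1):
    # step == 1 leaves integers unchanged, which corresponds to omitting the quantizer
    return quantize(mapper(row), step)

def decode(symbols, step=1):
    return inverse_mapper(dequantize(symbols, step))

row = [100, 102, 103, 103, 110]          # assumes a non-empty row of integer intensities
lossless = decode(encode(row, 1), 1)     # exact replica of the input
lossy = decode(encode(row, 4), 4)        # distorted reconstruction
print(lossless, lossy)
```

With a step of 1 the round trip is exact. With a step of 4 the small differences are discarded and the reconstruction no longer matches the input, which is the lossy case.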
1. CCITT Group 3 (ITU-T): A facsimile (fax) method for transmitting binary documents over
telephone lines.
2. CCITT Group 4 (ITU-T): A streamlined version of CCITT Group 3 that supports 2-D run-length
coding only.
3. JBIG or JBIG1 (ISO/IEC/ITU-T): A Joint Bi-level Image Experts Group standard for lossless
compression of bi-level images.
4. JBIG2 (ISO/IEC/ITU-T): Similar to JBIG1, for bi-level images in desktop, Internet, and fax
applications.
Continuous tone still images
5. JPEG (ISO/IEC/ITU-T): A Joint Photographic Experts Group standard for images of
photographic quality. Its lossy baseline coding system uses the DCT (Discrete Cosine Transform).
6. JPEG-LS (ISO/IEC/ITU-T)
7. JPEG-2000 (ISO/IEC/ITU-T)
Video
8. DV (IEC): Digital Video.
9. H.261 (ITU-T): A two-way video conferencing standard for ISDN lines. DCT-based
compression is used, together with frame-to-frame prediction differencing.
10. H.262 (ITU-T): Similar to MPEG-2.
11. H.263 (ITU-T): An enhanced version of H.261 designed for telephone modems
(i.e., 28.8 kb/s).
12. H.264 (ITU-T): An extension of H.262 and H.263 for video conferencing, Internet
streaming, and TV broadcasting.
13. MPEG-1 (ISO/IEC): A Motion Pictures Expert Group standard for CD-ROM applications with
non-interlaced video at 1.5 Mb/s. Similar to H.261, but frame predictions are based on the
previous frame, the next frame, or an interpolation of both.
14. MPEG-2 (ISO/IEC): An extension of MPEG-1 designed for DVDs, with transfer rates to 15
Mb/s. Supports interlaced video and HDTV.
15. MPEG-4 (ISO/IEC): An extension of MPEG-2 that supports variable block sizes and
prediction differencing within frames.
16. MPEG-4 AVC (ISO/IEC): Advanced Video Coding, similar to H.264.
25. QuickTime (Apple Computer): A media container supporting formats including DV, H.261,
H.262, H.264, MPEG-1, and MPEG-2.
26. VC-1 / WMV9 (SMPTE / Microsoft): Most used on the Internet. Adopted for HD and Blu-ray
high-definition DVDs.
Some Basic Compression Methods
Some basic compression methods are
⚫ Huffman Coding
⚫ Golomb Coding
⚫ Arithmetic Coding
⚫ LZW Coding
⚫ Run-Length Coding
⚫ One-dimensional CCITT compression
⚫ Two-dimensional CCITT compression
⚫ Symbol-Based Coding
Huffman coding
This is one of the most popular techniques for removing coding redundancy. When coding the
symbols of an information source individually, the Huffman code yields the smallest possible
number of code symbols per source symbol. In terms of Shannon's first theorem, the resulting code
is optimal for a fixed value of n, subject to the constraint that the source symbols be coded
one at a time. (The source symbols may be either the intensities of an image or the output of an
intensity mapping operation such as pixel differences or run lengths.)
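⚫ The first step in Huffman's procedure is to create a series of source reductions by ordering
the probabilities of the symbols and combining the two lowest-probability symbols into a single
symbol that replaces them in the next source reduction. This is repeated until a reduced source
with only two symbols remains.
⚫ The second step in Huffman's procedure is to code each reduced source, starting with the
smallest source and working back to the original source. The code for a two-symbol source
consists of the symbols 0 and 1. These are assigned to the two symbols on the right. Since the
symbol with probability 0.6 was generated by combining two symbols, the 0 used to code it is now
assigned to both of these symbols, and a 0 and a 1 are appended to each to distinguish them.
This is repeated for each reduced source until the original source is reached.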
The symbols are coded one at a time and decoding is done using a simple lookup table. The code
used here is a block code, since each source symbol is mapped into a fixed sequence of code
symbols. It is uniquely decodable.
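The two steps of Huffman's procedure can be sketched in Python with a priority queue. This is a minimal illustration and not a production encoder. The function name huffman_code is hypothetical, and the probabilities are borrowed from the four-intensity example earlier in these notes. Ties may be broken differently from a hand-worked table, so individual code words can differ, but the average length is the same.

```python
import heapq
from itertools import count

def huffman_code(probs):
    """Return a dict mapping each symbol to its binary code word (as a string)."""
    tie = count()  # tie-breaker so the heap never has to compare dictionaries
    heap = [(p, next(tie), {sym: ""}) for sym, p in probs.items()]
    heapq.heapify(heap)
    if len(heap) == 1:  # degenerate single-symbol source
        return {sym: "0" for sym in probs}
    while len(heap) > 1:
        # Source reduction: combine the two least probable (possibly compound) symbols
        p0, _, group0 = heapq.heappop(heap)
        p1, _, group1 = heapq.heappop(heap)
        # Code assignment: prepend 0 to one group and 1 to the other
        merged = {s: "0" + w for s, w in group0.items()}
        merged.update({s: "1" + w for s, w in group1.items()})
        heapq.heappush(heap, (p0 + p1, next(tie), merged))
    return heap[0][2]

probs = {87: 0.25, 128: 0.47, 186: 0.25, 255: 0.03}
code = huffman_code(probs)
avg_len = sum(probs[s] * len(code[s]) for s in probs)
print(code, avg_len)
```

Merging and prefixing in one loop performs the source reduction and the backward code assignment together. The most probable intensity (128) receives a 1-bit code word, and the average length agrees with the 1.81 bits per pixel obtained for code 2.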
Arithmetic coding
This generates nonblock codes. A one-to-one correspondence between source symbols and
code words does not exist. Instead, an entire sequence of source symbols (or message) is assigned a
single arithmetic code word. The code word defines an interval of real numbers between 0 and 1. As
the number of symbols in the message increases, the interval used to represent it becomes smaller
and the number of information units (bits) required to represent the interval becomes larger.
For example, a five-symbol sequence or message, a1 a2 a3 a3 a4, from a four-symbol source is coded.
The interval [0, 1) is subdivided into four regions based on the probabilities of each source symbol.
Symbol a1 is associated with subinterval [0, 0.2), so the message interval is initially narrowed to
[0, 0.2). This interval is then subdivided according to the original source symbol probabilities, and
the process continues with the next message symbol.
Symbol a2 narrows the subinterval to [0.04, 0.08), a3 to [0.056, 0.072), and so on. The
final message symbol, which is reserved as an end-of-message indicator, narrows the range to
[0.06752, 0.0688). Any number within this subinterval, such as 0.068, can be used to represent the
message. Here 3 decimal digits are used to represent the five-symbol message, i.e., 0.6 decimal
digits are used per source symbol.
Figure: Arithmetic coding procedure.
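The interval narrowing can be traced with exact fractions. The sketch below assumes symbol probabilities of 0.2, 0.2, 0.4 and 0.2 for a1 to a4. These values are not stated in the text, but they are consistent with the subintervals quoted for the example. The function name is hypothetical, and only the interval computation is shown, not a full bit-level coder.

```python
from fractions import Fraction

def arithmetic_interval(message, probs):
    """Return the final [low, high) interval for a message of source symbols."""
    # Give each symbol a fixed sub-range of [0, 1) according to its probability
    ranges, start = {}, Fraction(0)
    for sym, p in probs.items():
        ranges[sym] = (start, start + p)
        start += p
    low, high = Fraction(0), Fraction(1)
    for sym in message:
        width = high - low
        sym_low, sym_high = ranges[sym]
        # Narrow the current interval to the sub-range of this symbol
        low, high = low + width * sym_low, low + width * sym_high
    return low, high

# Assumed probabilities (consistent with the intervals in the example)
probs = {
    "a1": Fraction(1, 5),
    "a2": Fraction(1, 5),
    "a3": Fraction(2, 5),
    "a4": Fraction(1, 5),
}
low, high = arithmetic_interval(["a1", "a2", "a3", "a3", "a4"], probs)
print(float(low), float(high))
```

Using Fraction avoids floating-point rounding, so the final bounds can be compared directly with the values 0.06752 and 0.0688 from the example.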
LZW coding
This addresses spatial redundancies in an image. The technique, called Lempel-Ziv-Welch
(LZW) coding, assigns fixed-length code words to variable-length sequences of source symbols.
During the coding process, a codebook or dictionary containing the source symbols to be coded
is constructed. For 8-bit monochrome images, the first 256 words of the dictionary are assigned to
intensities 0, 1, 2, ..., 255. The encoder examines the image pixels sequentially, and intensity
sequences that are not yet in the dictionary are placed in successive unused locations. For example,
the sequence 255-255 might be assigned to location 256. Whenever two consecutive white pixels are
subsequently encountered, code word 256 is used to represent them.
LZW compression has been integrated into a variety of mainstream imaging file formats,
including GIF, TIFF, and PDF. The PNG format was created to get around LZW licensing
requirements.
In LZW coding, the coding dictionary or code book is created while the data are being encoded.
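A compact sketch of the encoder side shows how the dictionary grows while the pixels are scanned. It assumes an unbounded dictionary and integer intensities. Real formats limit the code word size, which is not modelled here. The function name and the sample pixel rows are illustrative.

```python
def lzw_encode(pixels, num_levels=256):
    """Encode a sequence of intensities into a list of dictionary indices."""
    # The first num_levels entries hold the single intensities 0 .. num_levels - 1
    dictionary = {(i,): i for i in range(num_levels)}
    next_code = num_levels
    current = ()   # currently recognized sequence
    output = []
    for p in pixels:
        candidate = current + (p,)
        if candidate in dictionary:
            current = candidate                  # keep extending the recognized sequence
        else:
            output.append(dictionary[current])   # emit the code for the known prefix
            dictionary[candidate] = next_code    # store the new sequence in the next free slot
            next_code += 1
            current = (p,)
    if current:
        output.append(dictionary[current])       # flush the last recognized sequence
    return output

white_run = lzw_encode([255, 255, 255, 255])
row_pattern = lzw_encode([39, 39, 126, 126] * 4)
print(white_run)
print(row_pattern)
```

In the first call the sequence 255-255 is stored at location 256 as soon as it is seen, and the next pair of white pixels is then emitted as the single code word 256.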
In the absolute mode of BMP run-length encoding, the first byte is 0 and the second byte signals
one of four possible conditions, as shown below.
  Second byte value   Condition
  0                   End of line
  1                   End of image
  2                   Move to a new position
  3-255               Specify pixels individually
If the second byte is between 3 and 255, it specifies the number of uncompressed pixels that
follow.
RLE is particularly effective when compressing binary images, because there are only two possible
intensities (black and white).
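As a simple illustration of run-length coding on a binary row, the sketch below stores (value, run length) pairs. This is a generic representation chosen for clarity, not the BMP byte layout described above, and the function names are hypothetical.

```python
def rle_encode(row):
    """Encode a row of pixels as a list of (value, run_length) pairs."""
    runs = []
    for pixel in row:
        if runs and runs[-1][0] == pixel:
            runs[-1] = (pixel, runs[-1][1] + 1)   # extend the current run
        else:
            runs.append((pixel, 1))               # start a new run
    return runs

def rle_decode(runs):
    """Expand (value, run_length) pairs back into the original row."""
    row = []
    for value, length in runs:
        row.extend([value] * length)
    return row

binary_row = [0, 0, 0, 1, 1, 0]
runs = rle_encode(binary_row)
print(runs)
```

Long runs of identical pixels collapse into a single pair, which is why binary images with large uniform areas compress well.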
Bit-plane coding is based on the concept of decomposing a multilevel (monochrome or color) image
into a series of binary images and compressing each binary image via one of several binary
compression methods. The intensities of an m-bit image can be represented by the base-2 polynomial
$$a_{m-1}2^{m-1} + a_{m-2}2^{m-2} + \cdots + a_1 2^1 + a_0 2^0$$
The lowest-order bit plane contains the a_0 bits of each pixel, and the highest-order bit plane
contains the a_{m-1} bits. The disadvantage of this method is that small changes in intensity can
have a significant impact on the complexity of the bit planes.
An alternative method (which reduces the effect of small intensity variations) is to first
represent the image by an m-bit Gray code. The m-bit Gray code g_{m-1} ... g_2 g_1 g_0 that
corresponds to the polynomial can be computed from
$$g_i = a_i \oplus a_{i+1}, \qquad 0 \le i \le m-2$$
$$g_{m-1} = a_{m-1}$$
where ⊕ denotes the exclusive OR operation.
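The effect of the Gray code on neighbouring intensities can be seen in a short sketch. It uses the standard identity that XOR-ing a value with itself shifted right by one bit applies g_i = a_i XOR a_{i+1} to every bit at once. The function names and the test intensities 127 and 128 are chosen for illustration.

```python
def bit_planes(value, m=8):
    """Return the bits [a_0, a_1, ..., a_{m-1}] of an m-bit intensity."""
    return [(value >> i) & 1 for i in range(m)]

def gray_code(value):
    """m-bit Gray code: g_i = a_i XOR a_{i+1}, with g_{m-1} = a_{m-1}."""
    return value ^ (value >> 1)

def differing_bits(x, y):
    """Number of bit planes in which two values differ."""
    return bin(x ^ y).count("1")

# Two adjacent intensities that differ in every plane of the plain binary code
binary_diff = differing_bits(127, 128)
gray_diff = differing_bits(gray_code(127), gray_code(128))
print(binary_diff, gray_diff)
print(bit_planes(gray_code(127)), bit_planes(gray_code(128)))
```

Intensities 127 and 128 differ in all eight binary bit planes, while their Gray codes differ in only one plane. That is the reduction in the effect of small intensity variations that the text refers to.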
A line is an edge segment in which the intensity of the background on either side of the
line is either much higher or much lower than the intensity of the line pixels. An isolated point is a
line whose length and width are equal to one pixel.
Background
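Derivatives of a digital function are defined in terms of differences. The first derivative of a
one-dimensional function f(x) is approximated by
$$\frac{\partial f}{\partial x} = f'(x) = f(x+1) - f(x)$$
and the second derivative by
$$\frac{\partial^2 f}{\partial x^2} = f''(x) = f(x+1) + f(x-1) - 2f(x)$$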
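A short sketch makes the difference-based derivatives concrete. It assumes the usual forward difference f(x+1) - f(x) for the first derivative and f(x+1) + f(x-1) - 2f(x) for the second. The intensity profile is invented for illustration: a downward ramp followed by a step.

```python
def first_derivative(f):
    """Forward difference f(x+1) - f(x), defined for x = 0 .. len(f) - 2."""
    return [f[x + 1] - f[x] for x in range(len(f) - 1)]

def second_derivative(f):
    """Second difference f(x+1) + f(x-1) - 2 f(x), defined for x = 1 .. len(f) - 2."""
    return [f[x + 1] + f[x - 1] - 2 * f[x] for x in range(1, len(f) - 1)]

# Hypothetical scan line: flat, ramp down, flat, step up, flat
profile = [5, 5, 4, 3, 2, 1, 1, 6, 6]
d1 = first_derivative(profile)
d2 = second_derivative(profile)
print(d1)
print(d2)
```

Along the ramp the first derivative is a constant nonzero value, while the second derivative is nonzero only at the start and end of the ramp. At the step the second derivative changes sign.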
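Segmentation algorithms generally are based on one of two basic properties of intensity
values: discontinuity and similarity.
• Discontinuity: to partition an image based on abrupt changes in intensity, such as edges.
• Similarity: to partition an image into regions that are similar according to a set of
predefined criteria.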
Direction-dependent filters localize 1-pixel-thick lines at other orientations (0°, 45°, 90°).
Edge models
4. Use double thresholding (and subsequent connectivity analysis) to detect and link edges
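The double-thresholding step can be sketched as follows. This is a minimal illustration that assumes a gradient-magnitude array is already available, uses 8-connectivity for the linking, and takes the two thresholds as inputs. The function name and the sample values are hypothetical.

```python
from collections import deque

def hysteresis_threshold(magnitude, t_low, t_high):
    """Mark strong pixels (>= t_high) and weak pixels (>= t_low) connected to them."""
    rows, cols = len(magnitude), len(magnitude[0])
    edges = [[False] * cols for _ in range(rows)]
    queue = deque()
    # Strong pixels are accepted immediately and seed the connectivity analysis
    for r in range(rows):
        for c in range(cols):
            if magnitude[r][c] >= t_high:
                edges[r][c] = True
                queue.append((r, c))
    # Grow from the strong pixels into 8-connected weak pixels
    while queue:
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                rr, cc = r + dr, c + dc
                if (0 <= rr < rows and 0 <= cc < cols
                        and not edges[rr][cc]
                        and magnitude[rr][cc] >= t_low):
                    edges[rr][cc] = True
                    queue.append((rr, cc))
    return edges

magnitude = [
    [9, 5, 1, 5],
    [0, 0, 0, 0],
]
edges = hysteresis_threshold(magnitude, t_low=4, t_high=8)
print(edges)
```

The weak pixel next to the strong one is kept because it is connected to it. The isolated weak pixel at the end of the first row is discarded.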