
UNIT - IV: Color Image Processing

Introduction, Color fundamentals, color models, pseudo color image processing, basics of full
color image processing, color transformations, color image smoothing and sharpening, color
segmentation.
Introduction:
 Color of an object is determined by the nature of the light reflected from it.
 When a beam of sunlight passes through a glass prism, the emerging beam of light is not
white but consists instead of a continuous spectrum of colors ranging from violet at one end
to red at the other.
 As Fig.1 shows, the color spectrum may be divided into six broad regions: violet, blue, green,
yellow, orange, and red.
 When viewed in full color (Fig.2), no color in the spectrum ends abruptly, but rather each
color blends smoothly into the next.

Fig. 1 Color spectrum seen by passing white light through a prism

Fig. 2 Wavelengths comprising the visible range of the electromagnetic spectrum.


 As shown in Fig.2, visible light is composed of a narrow band of frequencies in the
electromagnetic spectrum. A body that reflects light that is balanced in all visible wavelengths
appears white to the observer.
 However, a body that favors reflectance in a limited range of the visible spectrum exhibits
some shades of color. For example, green objects reflect light with wavelengths primarily in
the 500 to 570 nm range while absorbing most of the energy at other wavelengths.
 Characterization of light is central to the science of color. If the light is achromatic (void of
color), its only attribute is its intensity, or amount. Achromatic light is what viewers see on
a black-and-white television set.
 Three basic quantities are used to describe the quality of a chromatic light source: radiance,
luminance, and brightness.
 Radiance: Radiance is the total amount of energy that flows from the light source, and it is
usually measured in watts (W).
 Luminance: Luminance, measured in lumens (lm), gives a measure of the amount of energy
an observer perceives from a light source. For example, light emitted from a source
operating in the far infrared region of the spectrum could have significant energy (radiance),
but an observer would hardly perceive it; its luminance would be almost zero.
 Brightness: Brightness is a subjective descriptor that is practically impossible to measure.
It embodies the achromatic notion of intensity and is one of the key factors in describing color
sensation.

Fig.3 Absorption of light by the red, green, and blue cones in the human eye as a function of
wavelength.
 Cones are the sensors in the eye responsible for color vision. Detailed experimental evidence
has established that the 6 to 7 million cones in the human eye can be divided into three
principal sensing categories, corresponding roughly to red, green, and blue. Approximately
65% of all cones are sensitive to red light, 33% are sensitive to green light, and only about
2% are sensitive to blue (but the blue cones are the most sensitive). Figure 3 shows average
experimental curves detailing the absorption of light by the red, green, and blue cones in the
eye. Due to these absorption characteristics of the human eye, colors are seen as variable
combinations of the so-called primary colors red (R), green (G), and blue (B).
 The primary colors can be added to produce the secondary colors of light --magenta (red plus
blue), cyan (green plus blue), and yellow (red plus green). Mixing the three primaries, or a
secondary with its opposite primary color, in the right intensities produces white light.
 The characteristics generally used to distinguish one color from another are brightness, hue,
and saturation. Brightness embodies the achromatic notion of intensity. Hue is an attribute
associated with the dominant wavelength in a mixture of light waves. Hue represents
dominant color as perceived by an observer. Saturation refers to the relative purity or the
amount of white light mixed with a hue. The pure spectrum colors are fully saturated. Colors
such as pink (red and white) and lavender (violet and white) are less saturated, with the
degree of saturation being inversely proportional to the amount of white light added.
 Hue and saturation taken together are called chromaticity, and, therefore, a color may be
characterized by its brightness and chromaticity.
 The amounts of red, green, and blue needed to form any particular color are called the
tristimulus values, and they are denoted X, Y, and Z, respectively. A color is then specified by its
trichromatic coefficients, defined as

x = X / (X + Y + Z)

y = Y / (X + Y + Z)

z = Z / (X + Y + Z)

 It is noted from the above equations that x + y + z = 1


 For any values of x (red) and y (green), the corresponding value of z (blue) is obtained from the
above equation as z = 1 − (x + y)

Color models:

 The purpose of a color model (also called color space or color system) is to facilitate the
specification of colors in some standard, generally accepted way.
 In essence, a color model is a specification of a coordinate system and a subspace within
that system where each color is represented by a single point.
 The RGB model is used for color monitors and color video cameras.
 The CMY and CMYK models are used for color printing.
 The HSI model corresponds closely to the way humans describe and interpret colors, and it
also decouples the color and grayscale information in an image.

The RGB Color Model:

 In the RGB model, each color appears in its primary spectral components of red, green, and
blue. This model is based on a Cartesian coordinate system. The color subspace of interest is
the cube shown in Fig.4, in which RGB values are at three corners; cyan, magenta, and yellow
are at three other corners; black is at the origin; and white is at the corner farthest from the
origin. In this model, the gray scale (points of equal RGB values) extends from black to white
along the line joining these two points. The different colors in this model are points on or
inside the cube, and are defined by vectors extending from the origin. For convenience, the
assumption is that all color values have been normalized so that the cube shown in Fig.4 is
the unit cube. That is, all values of R, G, and B are assumed to be in the range [0, 1].






Fig.4 Schematic of the RGB color cube

 Images represented in the RGB color model consist of three component images, one for each
primary color. When fed into an RGB monitor, these three images combine on the phosphor
screen to produce a composite color image. The number of bits used to represent each
pixel in RGB space is called the pixel depth.
 Consider an RGB image in which each of the red, green, and blue images is an 8-bit image.
Under these conditions each RGB color pixel [that is, a triplet of values (R, G, B)] is said to
have a depth of 24 bits (3 image planes times the number of bits per plane). The term
full-color image is used often to denote a 24-bit RGB color image. The total number of colors
in a 24-bit RGB image is (2^8)^3 = 16,777,216.
 RGB is ideal for image color generation (as in image capture by a color camera or image
display in a monitor screen), but its use for color description is much more limited.
 Although each component color of an RGB image can be displayed with 256 different
intensities, many display systems can reliably reproduce only a subset of them. The 216
colors common to most systems are called the safe RGB colors, or the all-systems-safe colors.
 Each component of a safe RGB triplet can take one of only six values, so the number of safe
colors is (6)^3 = 216.
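
To make this concrete, here is a small Python sketch (added for illustration; the helper name to_safe_rgb is an assumption, not from the notes) that snaps each 8-bit RGB component to the nearest of the six safe levels 0, 51, 102, 153, 204, and 255:

import numpy as np

# Snap each 8-bit RGB channel to the nearest of the six safe levels
# (multiples of 51), giving at most 6^3 = 216 distinct colors.
def to_safe_rgb(img):
    img = np.asarray(img, dtype=np.float64)
    return (np.round(img / 51.0) * 51.0).astype(np.uint8)

rgb = np.random.randint(0, 256, size=(4, 4, 3), dtype=np.uint8)
safe = to_safe_rgb(rgb)
print(np.unique(safe))   # values drawn only from {0, 51, 102, 153, 204, 255}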

CMY color model:


 Cyan, magenta, and yellow are the secondary colors of light or, alternatively, the primary
colors of pigments. For example, when a surface coated with cyan pigment is illuminated
with white light, no red light is reflected from the surface. That is, cyan subtracts red light
from reflected white light, which itself is composed of equal amounts of red, green, and blue
light.
 Most devices that deposit colored pigments on paper, such as color printers and copiers,
require CMY data input or perform an RGB to CMY conversion internally. This conversion
is performed using the simple operation (1) where, again, the assumption is that all color
values have been normalized to the range [0, 1]. Equation (1) demonstrates that light
reflected from a surface coated with pure cyan does not contain red (that is, C = 1 − R in
the equation).

C = 1 − R
M = 1 − G          …… (1)
Y = 1 − B
 Similarly, pure magenta does not reflect green, and pure yellow does not reflect blue.
Equation (1) also reveals that RGB values can be obtained easily from a set of CMY values
by subtracting the individual CMY values from 1. As indicated earlier, in image processing
this color model is used in connection with generating hardcopy output, so the inverse
operation from CMY to RGB generally is of little practical interest.
 Equal amounts of the pigment primaries, cyan, magenta, and yellow should produce black.
In practice, combining these colors for printing produces a muddy-looking black.
 To produce true black (the predominant color in printing), a fourth color, black, is added,
giving the CMYK model (CMY + black).
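
As an illustrative sketch (assuming images normalized to [0, 1]; the function names are not from the notes), the RGB-to-CMY conversion of Eq. (1) and its inverse are one-liners in Python:

import numpy as np

# Eq. (1): CMY components are obtained by subtracting the RGB values from 1.
def rgb_to_cmy(rgb):
    return 1.0 - np.asarray(rgb, dtype=np.float64)

# The inverse operation is the same subtraction.
def cmy_to_rgb(cmy):
    return 1.0 - np.asarray(cmy, dtype=np.float64)

print(rgb_to_cmy([1.0, 0.0, 0.0]))   # pure red -> [0. 1. 1.] (no cyan)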
HSI color model:
 When humans view a color object, we describe it by its hue, saturation, and brightness. Hue
is a color attribute that describes a pure color (pure yellow, orange, or red), whereas saturation
gives a measure of the degree to which a pure color is diluted by white light. Brightness is a
subjective descriptor that is practically impossible to measure. It embodies the achromatic
notion of intensity and is one of the key factors in describing color sensation.
 Intensity (gray level) is the most useful descriptor of monochromatic images. This quantity
definitely is measurable and easily interpretable. The HSI (hue, saturation, intensity) color
model decouples the intensity component from the color-carrying information (hue and
saturation) in a color image. As a result, the HSI model is an ideal tool for developing image
processing algorithms based on color descriptions that are natural and intuitive to humans.

 In Fig 5 the primary colors are separated by 120°. The secondary colors are 60° from the
primaries, which means that the angle between secondaries is also 120°. Figure 5(b) shows
the same hexagonal shape and an arbitrary color point (shown as a dot). The hue of the point
is determined by an angle from some reference point. Usually (but not always) an angle of
0° from the red axis designates 0 hue, and the hue increases counterclockwise from there. The
saturation (distance from the vertical axis) is the length of the vector from the origin to the
point. Note that the origin is defined by the intersection of the color plane with the vertical
intensity axis. The important components of the HSI color space are the vertical intensity axis,
the length of the vector to a color point, and the angle this vector makes with the red axis.
Fig 5 Hue and saturation in the HSI color model.
Conversion from RGB color model to HSI color model:
 Given an image in RGB color format, the H component of each RGB pixel is obtained using
the equation

H = θ             if B ≤ G
H = 360° − θ      if B > G          …… (1)

 With

θ = cos⁻¹ { ½ [(R − G) + (R − B)] / [ (R − G)² + (R − B)(G − B) ]^(1/2) }          …… (2)

 The saturation component is given by

S = 1 − [ 3 / (R + G + B) ] min(R, G, B)          …… (3)

 Finally, the intensity component is given by

I = (R + G + B) / 3          …… (4)
 It is assumed that the RGB values have been normalized to the range [0, 1] and that angle
θ is measured with respect to the red axis of the HSI space. Hue can be normalized to the
range [0, 1] by dividing by 360° all values resulting from Eq. (1). The other two HSI
components already are in this range if the given RGB values are in the interval [0, 1].
Conversion from HSI color model to RGB color model:
 Given values of HSI in the interval [0, 1], one can find the corresponding RGB values in the
same range. The applicable equations depend on the values of H. There are three sectors of
interest, corresponding to the 120° intervals in the separation of primaries.
RG sector (0° ≤ H < 120°):
 When H is in this sector, the RGB components are given by the equations
B = I (1 − S)
R = I [1 + S cos H / cos(60° − H)]
G = 3I − (R + B)
GB sector (120° ≤ H < 240°):
If the given value of H is in this sector, first subtract 120° from it:
H = H − 120°
Then the RGB components are
R = I (1 − S)
G = I [1 + S cos H / cos(60° − H)]
B = 3I − (R + G)
BR sector (240° ≤ H ≤ 360°):
If H is in this range, subtract 240° from it:
H = H − 240°
Then the RGB components are
G = I (1 − S)
B = I [1 + S cos H / cos(60° − H)]
R = 3I − (B + G)
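
A minimal Python sketch of the RGB-to-HSI conversion of Eqs. (1)-(4), for a single RGB triple normalized to [0, 1] (the function name and the small epsilon guards are assumptions added here):

import numpy as np

# A sketch of Eqs. (1)-(4): convert one RGB triple (values in [0, 1]) to HSI.
def rgb_to_hsi(r, g, b):
    num = 0.5 * ((r - g) + (r - b))
    den = np.sqrt((r - g) ** 2 + (r - b) * (g - b))
    theta = np.degrees(np.arccos(num / (den + 1e-12)))  # Eq. (2), /0 guard
    h = theta if b <= g else 360.0 - theta              # Eq. (1)
    s = 1.0 - 3.0 * min(r, g, b) / (r + g + b + 1e-12)  # Eq. (3)
    i = (r + g + b) / 3.0                               # Eq. (4)
    return h / 360.0, s, i    # hue normalized to [0, 1]

print(rgb_to_hsi(1.0, 0.0, 0.0))   # pure red -> hue 0.0, S = 1, I = 1/3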

Manipulating HSI Component Images


 The gray-level values in the hue image (b) correspond to angles.
 For example, because red corresponds to 0°, the red region in image (c) is mapped to a black region in
the hue image.
 To change the individual color of any region in the RGB image, we change the values of the
corresponding region in the hue image .
 Then we convert the new H image, along with the unchanged S and I images, back to RGB.
 To change the saturation (purity) of the color in any region, we follow the same procedure, except
that we make the changes in the saturation image in HSI space.

 In this figure, the outer portions of all circles are now red; the purity of the cyan region was
diminished, and the central region became gray rather than white. Although these
results are simple, they clearly illustrate the power of the HSI color model in allowing
independent control over hue, saturation, and intensity.

A DEVICE-INDEPENDENT COLOR MODEL

 Color transformations can be performed on most desktop computers.

 The effectiveness of the transformations is judged ultimately in print.

 It is necessary to maintain a high degree of color consistency between the monitors used and the
eventual output devices.

 This is best accomplished with a device-independent color model that relates the color gamuts of the
monitors and output devices.

 The model of choice for many color management systems (CMS) is the CIE L* a * b * model.
 The tonal range of an image, also called its key type, refers to its general distribution of color
intensities.

 Most of the information in high-key images is concentrated at high (or light) intensities; the colors
of low-key images are located predominantly at low intensities; middle-key images lie in between.

Basics of full color image processing:


 Full-color image processing approaches fall into two major categories. In the first category,
each component image is processed individually, and a composite processed color image is
then formed from the individually processed components.

 In the second category, one works with color pixels directly. Because full-color images have
at least three components, color pixels really are vectors. For example, in the RGB system,
each color point can be interpreted as a vector extending from the origin to that point in the
RGB coordinate system.

 Let c represent an arbitrary vector in RGB color space:

c = [ cR, cG, cB ]^T = [ R, G, B ]^T          …… (1)

 This equation indicates that the components of c are simply the RGB components of the color
image at a point. If the color components are functions of the coordinates (x, y), we use the
notation

c(x, y) = [ cR(x, y), cG(x, y), cB(x, y) ]^T = [ R(x, y), G(x, y), B(x, y) ]^T          …… (2)

 For an image of size M × N, there are MN such vectors, c(x, y), for x = 0, 1, 2, ..., M − 1; y =
0, 1, 2, ..., N − 1.

 It is important to keep clearly in mind that Eq. (2) depicts a vector whose components are
spatial variables in x and y.

 In order for per-color-component and vector-based processing to be equivalent, two
conditions have to be satisfied: First, the process has to be applicable to both vectors and
scalars. Second, the operation on each component of a vector must be independent of the
other components.

Fig 9 Spatial masks for gray-scale and RGB color images.

 Fig 9 shows neighborhood spatial processing of gray-scale and full-color images. Suppose
that the process is neighborhood averaging. In Fig. 9(a), averaging would be
accomplished by summing the gray levels of all the pixels in the neighborhood and dividing
by the total number of pixels in the neighborhood. In Fig. 9(b), averaging would be done by
summing all the vectors in the neighborhood and dividing each component by the total
number of vectors in the neighborhood. But each component of the average vector is the average
of the neighborhood pixels corresponding to that component, which is the same as the result
that would be obtained if the averaging were done on a per-color-component basis and then
the vector was formed.
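
This equivalence is easy to check numerically; the following Python sketch (illustrative, not from the notes) averages a 3 × 3 neighborhood both ways:

import numpy as np

# Check that vector-based neighborhood averaging equals per-component
# averaging for an RGB image.
rng = np.random.default_rng(0)
img = rng.random((5, 5, 3))                   # small RGB image in [0, 1]

# Vector approach: average the 3 x 3 neighborhood of color vectors at (2, 2).
vector_avg = img[1:4, 1:4, :].reshape(-1, 3).mean(axis=0)

# Per-component approach: average each channel's neighborhood separately.
per_channel_avg = np.array([img[1:4, 1:4, c].mean() for c in range(3)])

print(np.allclose(vector_avg, per_channel_avg))   # True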
Color transformations:
Formulation:
• As with the gray-level transformation techniques, we model color transformations using
the expression g(x, y) = T[f(x, y)].
• Here f(x, y) is a color input image, g(x, y) is the transformed image, and T is an operator on f
over a spatial neighborhood of (x, y).
• The principal difference between color and gray-level transformations is that in a color
transformation each pixel value is a triplet or quartet of components.
• {T1, T2, T3, …, Tn} is a set of transformation or color mapping functions that operate on the
input components ri to produce the output components si.
• If the RGB color space is selected, for example, n = 3 and r1, r2, and r3 denote the red,
green, and blue components of the input image.
• Suppose, for example, that we wish to modify the intensity of the image using g(x, y) = k f(x, y).
• In the HSI color space, this can be done with the simple transformation
s3 = k r3, where s1 = r1 and s2 = r2. Only the HSI intensity component r3 is modified.
• In the RGB color space, three components must be transformed:
si = k ri for i = 1, 2, 3
• The CMY space requires a similar set of linear transformations:
si = k ri + (1 − k) for i = 1, 2, 3
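
A short Python sketch of this intensity-modification example (assuming components normalized to [0, 1]; the value k = 0.7 is an arbitrary choice):

import numpy as np

# Intensity modification g(x, y) = k f(x, y), components in [0, 1].
k = 0.7

def dim_rgb(rgb):
    return k * np.asarray(rgb)               # s_i = k r_i, i = 1, 2, 3

def dim_cmy(cmy):
    return k * np.asarray(cmy) + (1.0 - k)   # s_i = k r_i + (1 - k)

rgb = np.array([0.9, 0.5, 0.1])
print(dim_rgb(rgb))                  # [0.63 0.35 0.07]
print(1.0 - dim_cmy(1.0 - rgb))      # same result via the CMY route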
Color complements
 The hues directly opposite one another on the color circle are called complements, as
shown in the following figure.

Our interest in complements stems from the fact that they are analogous to grayscale
negatives.
As in the grayscale case, color complements are useful for enhancing detail buried in the dark
regions of a color image.
 The RGB complement transformation functions used here may not have a straightforward
HSI space equivalent.
 The saturation component of the complement cannot be computed from the saturation
component of the input image alone.
 In the above figures (c) and (d), the saturation component is unaltered.
Color slicing
 Color slicing is used to highlight a specific range of colors in an image to separate objects
from surroundings.
 The basic idea to this is:
1. Display the colors of interest so that they stand out from the background. or
2. use the regions defined by specified colors as a mask for further processing
 Since a color pixel is an n-dimensional quantity, the color transformation functions are more
complex than their gray-level slicing counterparts.
 One simple way to slice a color image is to map the colors outside some range of interest
to a nonprominent neutral color.
 Using a cube of width W to enclose the reference (prototypical) color with components
(a1, a2, ..., an), the transformation is

si = 0.5   if |rj − aj| > W/2 for any 1 ≤ j ≤ n
si = ri    otherwise,                for i = 1, 2, ..., n

 These transformations highlight the colors around the prototype by forcing all other
colors to the midpoint of the reference color space.
 For the RGB color space, for example, a suitable neutral point is middle gray, the color (0.5,
0.5, 0.5).
 If the color of interest is specified by a sphere of radius R0, the transformation is

si = 0.5   if Σ_{j=1}^{n} (rj − aj)² > R0²
si = ri    otherwise,                for i = 1, 2, ..., n

 Here R0 is the radius of the enclosing sphere and (a1, a2, ..., an) are the components of its
center.
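
A minimal Python sketch of cube-based color slicing (the function name and parameters are assumptions; the image is assumed normalized to [0, 1]):

import numpy as np

# Cube-based color slicing: keep pixels whose every component lies within
# W/2 of the reference color; force all others to middle gray (0.5).
def color_slice_cube(img, ref, width):
    img = np.asarray(img, dtype=np.float64)
    outside = np.any(np.abs(img - ref) > width / 2.0, axis=-1)
    out = img.copy()
    out[outside] = 0.5
    return out

img = np.random.default_rng(1).random((4, 4, 3))
sliced = color_slice_cube(img, ref=np.array([0.8, 0.2, 0.2]), width=0.5)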
Tone and color corrections
 This is used for photo enhancement and color reproduction
 This is Device independent color model from CIE relating the color gamuts
 It use a color profile to map each device to color model
 CIE L*a*b* system
o The CIE L*a*b* system is the most common model for color management systems.
o Its components are given by the following equations:

L* = 116 · h(Y/YW) − 16
a* = 500 [ h(X/XW) − h(Y/YW) ]
b* = 200 [ h(Y/YW) − h(Z/ZW) ]

Where

h(q) = q^(1/3)                    for q > 0.008856
h(q) = 7.787 q + 16/116           for q ≤ 0.008856

 XW, YW, and ZW are the values for reference white, called D65, which is defined by x =
0.3127 and y = 0.3290 in the CIE chromaticity diagram.
 X, Y, and Z are computed from the RGB values using the Rec. 709 primaries; approximately,

X = 0.4125 R + 0.3576 G + 0.1805 B
Y = 0.2127 R + 0.7152 G + 0.0722 B
Z = 0.0193 R + 0.1192 G + 0.9505 B

 Rec. 709 RGB corresponds to the D65 white point
 L*a*b* is colorimetric (colors perceived as matching are encoded identically), perceptually
uniform (color differences among various hues are perceived uniformly), and device
independent
 Not directly displayable on any device but its gamut covers the entire visible spectrum
 L*a*b* decouples intensity from color, making it useful for image manipulation (hue and
contrast editing) and image compression applications
o L* represents lightness or intensity
o a* gives red minus green
o b* gives green minus blue
 Allows tonal and color imbalances to be corrected interactively and independently
 Tonal range refers to general distribution of key intensities in an image
o Adjust image brightness and contrast to provide maximum detail over a range
of intensities
o The colors themselves are not changed
Histogram processing
 Histogram processing provides an automated way to perform enhancement.
 It is similar to histogram equalization of grayscale images.
 The grayscale technique must be adapted to multiple components, because applying it
to the color components independently yields erroneous colors.
 A more logical approach is to spread the intensity values uniformly, leaving the hues
themselves unchanged; the sketch below illustrates this on the intensity component.
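
A possible Python sketch (illustrative, not from the notes; it approximates the HSI intensity as the mean of R, G, and B and rescales the three components equally, which leaves hue and saturation unchanged):

import numpy as np

# Equalize only the intensity of an RGB image (uint8), then rescale the
# RGB components by the same per-pixel factor so hue is left unchanged.
def equalize_intensity(rgb):
    i = rgb.mean(axis=-1)                          # intensity, range [0, 255]
    hist, _ = np.histogram(i, bins=256, range=(0, 256))
    cdf = hist.cumsum() / i.size                   # equalization mapping
    i_new = 255.0 * cdf[i.astype(np.uint8)]
    scale = i_new / (i + 1e-6)                     # per-pixel intensity ratio
    out = rgb * scale[..., None]                   # scale R, G, B equally
    return np.clip(out, 0, 255).astype(np.uint8)

img = np.random.randint(0, 256, size=(64, 64, 3), dtype=np.uint8)
eq = equalize_intensity(img)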


Smoothing and sharpening:
Color image smoothing
 Extend the spatial filtering mask to color smoothing by dealing with the component vectors.
 Let Sxy denote the neighborhood, containing K pixels, centered at (x, y).
 The average of the RGB component vectors in this neighborhood is given by

c̄(x, y) = (1/K) Σ_{(s,t) ∈ Sxy} c(s, t)

which is the same as

c̄(x, y) = [ (1/K) Σ R(s, t),  (1/K) Σ G(s, t),  (1/K) Σ B(s, t) ]^T

(all sums over (s, t) ∈ Sxy), so smoothing by neighborhood averaging can be carried out on a
per-color-plane basis.
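
A short per-plane smoothing sketch in Python (assuming SciPy is available; the 5 × 5 window size is an arbitrary choice):

import numpy as np
from scipy.ndimage import uniform_filter

# Per-plane neighborhood averaging of an RGB image (values in [0, 1]),
# which, as shown above, equals averaging the color vectors directly.
def smooth_rgb(img, size=5):
    img = np.asarray(img, dtype=np.float64)
    return np.stack([uniform_filter(img[..., c], size=size)
                     for c in range(3)], axis=-1)

img = np.random.default_rng(2).random((32, 32, 3))
smoothed = smooth_rgb(img)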

Color image sharpening:


 Image sharpening is done using the Laplacian. From vector analysis we know that the
Laplacian of a vector is defined as a vector whose components are equal to the Laplacian
of the individual scalar components of the input vector.
 In the RGB color system, the Laplacian of the vector c is

∇²c(x, y) = [ ∇²R(x, y), ∇²G(x, y), ∇²B(x, y) ]^T

 This tells us that we can compute the Laplacian of a full-color image by computing the
Laplacian of each component image separately.
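
A minimal Python sketch of Laplacian sharpening applied per color plane (assuming SciPy; the weight parameter is an illustrative addition):

import numpy as np
from scipy.ndimage import laplace

# Sharpen an RGB image (values in [0, 1]) by subtracting a weighted
# Laplacian of each color plane separately.
def sharpen_rgb(img, weight=1.0):
    img = np.asarray(img, dtype=np.float64)
    out = np.stack([img[..., c] - weight * laplace(img[..., c])
                    for c in range(3)], axis=-1)
    return np.clip(out, 0.0, 1.0)

img = np.random.default_rng(3).random((32, 32, 3))
sharp = sharpen_rgb(img)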


Image compression
 The term data compression refers to the process of reducing the amount of data required to
represent a given quantity of information.

 Various amounts of data can be used to represent the same amount of information; representations
that contain irrelevant or repeated information are said to contain redundant data.

 Let b and b′ denote the number of bits (or information-carrying units) in two representations of the
same information. The relative data redundancy, R, of the representation with b bits is

R = 1 − 1/C

where C = b/b′ is called the compression ratio.
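
As a quick illustrative calculation (the numbers are assumed, not from the notes): if an image needs b = 2,097,152 bits in one representation and b′ = 262,144 bits in another, then C = b/b′ = 8 and R = 1 − 1/8 = 0.875; that is, about 87.5% of the data in the first representation is redundant.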
 Two-dimensional intensity arrays suffer from three principal types of data redundancies that can
be identified and exploited:

CODING REDUNDANCY

 A code is a system of symbols (letters, numbers, bits, and the like) used to represent a body of
information or set of events.
 Each piece of information or event is assigned a sequence of code symbols, called a code word.
 The number of symbols in each code word is its length.
 Assume that a discrete random variable rk in the interval [0, L − 1] is used to represent the intensities
of an M × N image, and that each rk occurs with probability pr(rk):

pr(rk) = nk / (MN),   k = 0, 1, 2, ..., L − 1

 where L is the number of intensity values, and nk is the number of times that the kth intensity
appears in the image.
 If l(rk) is the number of bits used to represent each value of rk, the average length of the code words
assigned to the various intensity values is found by summing the products of the number of bits used
to represent each intensity and the probability that the intensity occurs:

Lavg = Σ_{k=0}^{L−1} l(rk) pr(rk)

 The total number of bits required to represent an M × N image is M·N·Lavg.

 If the intensities are represented using a natural m-bit fixed-length code, the right-hand side
reduces to m bits; that is,

Lavg = m
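
The following toy Python computation (the 4-level image and code-word lengths are assumed for illustration) shows Lavg for a variable-length code:

import numpy as np

# Toy 4 x 4 image with intensity levels 0..3; the variable-length code-word
# lengths below are assumed (they form a valid prefix code).
img = np.array([[0, 0, 1, 1],
                [0, 0, 1, 2],
                [0, 1, 2, 3],
                [0, 1, 1, 2]])
probs = np.bincount(img.ravel(), minlength=4) / img.size
lengths = np.array([1, 2, 3, 3])        # l(r_k) in bits
l_avg = np.sum(lengths * probs)
print(l_avg)   # 1.875 bits/pixel, versus 2 bits for a fixed-length code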

Computer-generated 256 * 256 * 8-bit image with coding redundancy

SPATIAL AND TEMPORAL REDUNDANCY


 A computer-generated collection of constant-intensity lines is shown in the diagram.
 In the corresponding 2-D intensity array:
1. All 256 intensities are equally probable. The histogram of the image is uniform.
2. Because the intensity of each line was selected randomly, its pixels are independent of
one another in the vertical direction.
3. Because the pixels along each line are identical, they are maximally correlated in the
horizontal direction.

Computer generated 256 * 256 * 8-bit image with spatial redundancy

The intensity histogram of the image


 A significant spatial redundancy can be eliminated by representing the image as a sequence of
run-length pairs, where each run-length pair specifies the start of a new intensity and the number
of consecutive pixels that have that intensity.

 A run-length based representation compresses the original 2-D, 8-bit intensity array by (256 *
256 * 8)/ [(256 + 256) * 8] or 128:1.
 Each 256-pixel line of the original representation is replaced by a single 8-bit intensity value
and length of 256 in the run-length representation.
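
A minimal run-length encoder for one image line might look like this in Python (an illustrative sketch, not a standard file format):

import numpy as np

# Encode one image line as (intensity, run length) pairs.
def run_length_encode(line):
    runs, start = [], 0
    for i in range(1, len(line) + 1):
        if i == len(line) or line[i] != line[start]:
            runs.append((int(line[start]), i - start))
            start = i
    return runs

line = np.full(256, 197, dtype=np.uint8)   # one constant-intensity line
print(run_length_encode(line))             # [(197, 256)]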

 To reduce the redundancy associated with spatially and temporally correlated pixels, a 2-D
intensity array must be transformed into a more efficient but usually “non-visual”
representation.

 For example, run-lengths or the differences between adjacent pixels can be used.
Transformations of this type
are called mappings.

 A mapping is said to be reversible if the pixels of the original 2-D intensity array can be
reconstructed without error from the transformed data set; otherwise, the mapping is said to be
irreversible.

IRRELEVANT INFORMATION
 One of the simplest ways to compress a set of data is to remove superfluous data from the set.

 In the context of digital image compression, information that is ignored by the human visual
system, or is extraneous to the intended use of an image, is an obvious candidate for omission.

Computer-generated 256 * 256 * 8-bit images with irrelevant information


Histogram of the image in and a histogram equalized version of the image.

 The human visual system averages these intensities, perceives only the average value, and then
ignores the small changes in intensity that are present in this case.

 Histogram-equalized version of the image makes the intensity changes visible and reveals two
previously undetected regions of constant intensity—one oriented vertically, and the other
horizontally.

 If the image is represented by its average value alone, this “invisible” structure is lost.

 The redundancy examined here is fundamentally different from the redundancies discussed in the
previous two sections.

 Its elimination is possible because the information itself is not essential for normal visual
processing and/or the intended use of the image.

 Because its omission results in a loss of quantitative information, its removal is commonly
referred to as quantization.

MEASURING IMAGE INFORMATION

 Information theory provides the mathematical framework.

 Its fundamental premise is that the generation of information can be modeled as a probabilistic
process that can be measured in a manner that
agrees with intuition.

 In accordance with this supposition, a random event E with probability P(E) is said to contain

I(E) = log(1/P(E)) = −log P(E)

units of information. If P(E) = 1 (that is, the event always occurs), I(E) = 0 and no information is
attributed to it.
 Given a source of statistically independent random events from a discrete set of possible
events {a1, a2, ..., aJ} with associated probabilities {P(a1), P(a2), ..., P(aJ)}, the average information
per source output, called the entropy of the source, is

H = − Σ_{j=1}^{J} P(aj) log P(aj)

 The aj in this equation are called source symbols. Because they are statistically independent, the
source itself is called a zero-memory source.
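
A short Python sketch that estimates the entropy of an 8-bit image from its normalized histogram (illustrative; log base 2 gives bits per pixel):

import numpy as np

# Estimate the entropy (bits/pixel) of an 8-bit image from its histogram.
def entropy(img):
    hist = np.bincount(img.ravel(), minlength=256).astype(np.float64)
    p = hist / hist.sum()
    p = p[p > 0]                  # treat 0 log 0 as 0
    return -np.sum(p * np.log2(p))

img = np.random.randint(0, 256, size=(64, 64), dtype=np.uint8)
print(entropy(img))   # close to 8 for uniformly random intensities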
Shannon’s First Theorem

 Shannon's First Theorem, also known as the Noiseless Coding Theorem, guarantees that it is
possible to represent the data from a source with a compression rate approaching the entropy of
that source, which is the theoretical limit for data compression.

 An image can often be represented with fewer bits per pixel than standard fixed-length coding
requires. For example, a variable-length code might compress an image's data to about 1.81
bits per pixel, while the entropy of the image (the minimum average number of bits needed) is
1.664 bits per pixel.
 Shannon's theorem proves that for any given source, the average number of symbols used to
encode data (denoted as Lavg) can approach the entropy H of the source as the block size n
increases.

Where:
 Lavg is the average number of code symbols used for representing n-symbol groups.
 H is the entropy of the source, which represents the average amount of information per symbol in
the source.

 Shannon's theorem applies to both single-symbol sources and extended sources. By grouping
consecutive symbols together (extending to n-symbol blocks), the theorem ensures that
encoding efficiency improves as larger blocks are considered, approaching the theoretical
entropy limit.
FIDELITY CRITERIA
When performing image compression, some information loss is inevitable, and this can result in a
degradation of image quality. The purpose of fidelity criteria is to evaluate how closely the compressed
image resembles the original image, i.e., how much quality has been retained. Two types of fidelity
criteria are generally used:

Objective Fidelity Criteria

Objective fidelity measures provide a mathematical and quantifiable way to compare the original image
with the compressed image. A common method used for this purpose is the Root-Mean-Squared (RMS)
Error.

Root-Mean-Squared (RMS) Error:

 The RMS error is a way of measuring the difference between two images: the original image
f(x, y) and the compressed (or reconstructed) image f^(x, y).
 The error at each pixel is given by:

e(x, y) = f^(x, y) − f(x, y)

 This equation represents the error at each pixel as the difference between the reconstructed pixel
value and the corresponding pixel of the original image.

 The total RMS error over the entire image (of size M × N) is calculated by
summing up the squared errors at each pixel and then taking the square root of the mean of
these squared errors. This is given by:

e_rms = [ (1/MN) Σ_{x=0}^{M−1} Σ_{y=0}^{N−1} ( f^(x, y) − f(x, y) )² ]^(1/2)
Signal-to-Noise Ratio (SNR):


 The signal-to-noise ratio (SNR) quantifies how much the original signal (image) dominates the
noise (the error introduced by compression).

 The mean-squared signal-to-noise ratio, SNRms, is computed as

SNRms = [ Σ_{x=0}^{M−1} Σ_{y=0}^{N−1} f^(x, y)² ] / [ Σ_{x=0}^{M−1} Σ_{y=0}^{N−1} ( f^(x, y) − f(x, y) )² ]
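
Both objective measures are easy to compute; the following Python sketch (function names assumed, not from the notes) implements the RMS error and the mean-squared SNR for two uint8 images:

import numpy as np

# RMS error and mean-squared SNR between an original image f and a
# reconstructed image fhat (both uint8).
def rms_error(f, fhat):
    e = fhat.astype(np.float64) - f.astype(np.float64)
    return np.sqrt(np.mean(e ** 2))

def snr_ms(f, fhat):
    f64, fh64 = f.astype(np.float64), fhat.astype(np.float64)
    return np.sum(fh64 ** 2) / np.sum((fh64 - f64) ** 2)

f = np.random.randint(0, 256, (64, 64), dtype=np.uint8)
noise = np.random.randint(-3, 4, f.shape)
fhat = np.clip(f.astype(int) + noise, 0, 255).astype(np.uint8)
print(rms_error(f, fhat), snr_ms(f, fhat))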

Subjective Fidelity Criteria

While objective measures such as RMS error and SNR are useful for assessing image quality
quantitatively, they may not always align with human perception. Therefore, subjective fidelity
criteria are also used to assess how humans perceive the quality of the compressed image.

Subjective Evaluations:

 In subjective fidelity evaluations, human observers are asked to rate the quality of the
compressed image. This can be done through:
 Absolute Rating Scales: Observers are asked to assign numerical scores based on
how similar the compressed image looks compared to the original. Ratings might
be assigned on scales such as:
 1 (poor quality) to 5 (excellent quality), or
 1 (worst) to 12 (best), depending on the evaluation method.

 Side-by-Side Comparisons: Observers view the original and compressed images side by side
and rate the differences.

 Evaluation Scale: Subjective evaluations often use labels like "much worse," "slightly
worse," "the same," "slightly better," and "much better" to categorize the differences between
the images.

 Importance of Subjective Measures: These evaluations are crucial because human perception
of image quality can be influenced by factors that objective measures like RMS error may not
capture. This is particularly important when dealing with subtle visual differences, where
small RMS errors may still lead to significant perceptual changes.
IMAGE COMPRESSION MODELS

 An image compression system is composed of two distinct functional components: an encoder
and a decoder.

 The encoder performs compression, and the decoder performs the complementary operation of
decompression.

 A codec is a device or program that is capable of both encoding and decoding.

 Input image f(x, y) is fed into the encoder, which creates a compressed representation of the
input.

 When the compressed representation is presented to its complementary decoder, a reconstructed
output image f^(x, y) is generated.

 In still-image applications, the encoded input and decoder output are f(x, y) and f^(x, y),
respectively.

 In general, f^(x, y) may or may not be an exact replica of f(x, y).

 If it is, the compression system is called error-free, lossless, or information-preserving.

 If not, the reconstructed output image is distorted, and the compression system is referred to as
lossy.

The Encoding or Compression Process

 The encoder is designed to remove the redundancies described in the previous sections through a
series of three independent operations.

 In the first stage of the encoding process, a mapper transforms f(x, y) into a (usually nonvisual)
format designed to reduce spatial and temporal redundancy.

 This operation generally is reversible, and may or may not directly reduce the amount of data
required to represent the image.

 Run-length coding is an example of a mapping that normally yields compression in the first step
of the encoding process.

 The quantizer reduces the accuracy of the mapper’s output in accordance with a pre-established
fidelity criterion.

 The goal is to keep irrelevant information out of the compressed representation.

 In the third and final stage of the encoding process, the symbol coder generates a fixed-length or
variable-length code to represent the quantizer output and maps the output in accordance with the
code.

 The shortest code words are assigned to the most frequently occurring quantizer output values,
thus minimizing coding redundancy.

 This operation is reversible.


The Decoding or Decompression Process:

 The decoder contains only two components: a symbol decoder and an inverse mapper.

 They perform, in reverse order, the inverse operations of the encoder’s symbol encoder and
mapper.

 An inverse quantizer block is not included in the general decoder model, because quantization
is irreversible.

 In video applications, decoded output frames are maintained in an internal frame store (not
shown) and used to reinsert the temporal redundancy that was removed at the encoder.

IMAGE FORMATS, CONTAINERS, AND COMPRESSION STANDARDS

 An image file format is a standard way to organize and store image data.

 It defines how the data is arranged and the type of compression (if any) that is used.

 An image container is similar to a file format but handles multiple types of image data.

 Image compression standards, on the other hand, define procedures for compressing and
decompressing images—that is, for reducing the amount of data needed to represent an image.

 International standards are sanctioned by the International Standards Organization
(ISO), the International Electrotechnical Commission (IEC), and/or the International
Telecommunications Union (ITU-T), a United Nations (UN) organization that was once called
the Consultative Committee of the International Telephone and Telegraph
(CCITT).

 Two video compression standards, VC-1 by the Society of Motion Pictures and Television
Engineers (SMPTE) and AVS by the Chinese Ministry of Information
Industry (MII), are also included.
