Unit 1 CV Notes
Part 1: CAMERAS
Pinhole Cameras.
Part 3:
Sources, Shadows, And Shading
Qualitative Radiometry, Sources and Their Effects, Local Shading
Models, Application: Photometric Stereo, Interreflections: Global
Shading Models.
Part 4:
Color
The Physics of Color, Human Color Perception, Representing Color,
A Model for Image Color, Surface Color from Image Color.
Advantages of CV:
Disadvantages:
Limitations:
1. Agriculture:
2. Sports:
3. Healthcare:
4. Transportation:
6. Retail:
7. Construction:
PART 1 CAMERAS
Pinhole Camera:
There are many types of imaging devices, from animal eyes to
video cameras and radio telescopes, and they may or may not be
equipped with lenses.
The visual arts and photography have undergone a major
transformation over the years.
With the advancement of technology, the designs and
functioning of cameras have also changed.
However, the Pinhole Camera, which is one of the fundamentals
of photography, is still an interesting technique today.
The first models of the “camera obscura” (a Latin term meaning dark
chamber), invented in the sixteenth century, did not have lenses.
Instead, they used a pinhole to focus light rays onto a wall or
translucent plate, demonstrating the laws of perspective discovered a
century earlier by Brunelleschi.
The pinhole camera is the simplest kind of camera.
It does not have a lens.
It just makes use of a tiny opening/aperture (a pinhole-sized
opening) to focus all light rays within the smallest possible area to
obtain an image, as clearly as possible.
The core component of the Pinhole Camera, the hole is an
opening through which light passes.
Light rays reflected or radiated from an object pass through
the hole and reach the projection plane.
The hole forces light coming from different points on the object to
pass through a single point before reaching distinct points on the
projection plane.
This prevents the light rays from scattering and allows a
clearer image to fall on the projection plane.
The small size of the hole helps to focus the light rays
accurately.
The simple image formed using a pinhole camera is always
inverted.
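The projection geometry described above, including the inversion, can be sketched with the basic pinhole equations (a minimal illustration, not from the notes; the focal distance f and the coordinate convention are assumptions):

```python
def pinhole_project(X, Y, Z, f=1.0):
    """Project a 3D point (X, Y, Z) through a pinhole at the origin
    onto an image plane at distance f behind the hole.
    The negative signs model the inversion of the image."""
    if Z == 0:
        raise ValueError("point lies in the plane of the pinhole")
    return -f * X / Z, -f * Y / Z

# A point up and to the right of the axis lands down and to the left:
print(pinhole_project(2.0, 1.0, 4.0))  # (-0.5, -0.25)
```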
Advantages:
1. It is simple and has few components.
2. Its design is quite simple.
3. It is low cost.
4. It is portable.
Disadvantages:
1. Due to absence of lens, it has a low light-gathering capacity.
2. Lenses receive more light and expose better, due to their ability to
collect light. In the Pinhole Camera, on the other hand, light is more
difficult to collect, which means longer exposure times may be needed
in low light conditions.
3. Due to absence of lens, there may be a loss of sharpness in
image quality.
4. There are also restrictions on depth of focus because focus
cannot be adjusted in the Pinhole Camera.
Light in Space:
The solid angle that the source subtends is Ωs; for a small, distant
source this behaves approximately as 1/r(x)^2. The radiosity due to
the source is then approximately proportional to

B(x) ∝ ρ(x) (N(x) · S(x)) / r(x)^2

where N(x) is the unit normal to the surface and S(x) is known as the
source vector, a vector from x to the source.
Point source model has limitations: While the point source model
simplifies calculations, it can create unrealistic results for nearby
sources. For example, a point source in the center of a cube wouldn't
accurately predict the darkness of corners in a real room.
We also assume that the line source is infinitely long, with constant
exitance along its surface and diameter ε.
Figure 2.4
On the right, the view of the source from each patch is shown.
We can see that the length of the source on this hemisphere does
not change, but the width does (as ε/r).
Hence, the radiosity due to the source decreases with the reciprocal
of distance.
For points not too far from the source, the radiosity due to an area
source does not change with distance to the source.
Shading models are algorithms that determine how light and color
interact with the surfaces of 3D objects in computer vision.
The models are local because they don't consider other objects at all.
The models are used because they are fast and simple to compute.
They do not require knowledge of the entire scene, only of the current
piece of surface.
This takes into account radiance arriving from sources, and that
arriving from radiating surfaces.
For point sources that are not at infinity, the model becomes:

B(x) = ρ(x) (N(x) · S(x)) / r(x)^2

where r(x) is the distance from the surface point x to the source.
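As a rough sketch, this point-source shading model can be written in code (an illustration under the usual Lambertian assumptions; the function name and the clamping of the cosine term at self-shadowed points are my own choices):

```python
import numpy as np

def point_source_radiosity(albedo, normal, surface_pt, source_pt):
    """Diffuse radiosity at a surface point due to a nearby point source:
    albedo * (N . S) / r^2, clamped at zero where the source is behind
    the surface (self-shadowing)."""
    s = np.asarray(source_pt, float) - np.asarray(surface_pt, float)
    r = np.linalg.norm(s)
    n = np.asarray(normal, float)
    n = n / np.linalg.norm(n)
    cos_term = float(np.dot(n, s / r))
    return albedo * max(cos_term, 0.0) / r ** 2

# A patch facing a source two units away receives albedo / 4:
print(point_source_radiosity(1.0, (0, 0, 1), (0, 0, 0), (0, 0, 2)))  # 0.25
```

Note the 1/r^2 falloff: doubling the distance quarters the radiosity, as the model predicts.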
Shadows cast with a single source can be very crisp and very black,
depending on the size of the source and the albedo of other nearby
surfaces.
Figure 2.6
If there are many sources, the shadows will be less dark, except at
points where no source is visible.
Regions where the source cannot be seen at all are known as the
umbra (Latin word meaning shadow).
Regions where some portion of the source is visible are known as the
penumbra (Latin word meaning almost shadow).
Umbra is the dark part of the shadow whereas the penumbra is the
less dark part of the shadow.
At point 1, one can see all of the source; at point 2, one can see some
of it; and at point 3, one can see none of it.
But in fact, the local shading models miss an important effect: light
bouncing off nearby surfaces can still illuminate shadows.
This effect is especially noticeable, and can be very significant,
in rooms with bright walls and large light sources, because shadows
are illuminated by light reflected from other diffuse surfaces.
Figure 2.8
There are two ways to calculate this term as shown in figure 2.8
above:
In more complex worlds, some surface patches see much less of the
surrounding world than others.
For example, the patch at the base of the groove on the right in figure
2.8 sees relatively little of the outside world, which is modelled as an
infinite polygon of constant radiosity. The view of this polygon is
occluded at some patches; its input hemisphere is shown below.
The result is that the ambient term is smaller for patches that see
less of the world.
Monge Patch:
A Monge patch, named after the French mathematician Gaspard
Monge, is a type of parametric surface used in mathematics,
computer graphics and computer vision to represent a curved surface
in 3D space using a mathematical formula.
A Monge patch is defined by a height function h(u, v): the surface is
the set of points (u, v, h(u, v)), where u and v are parameters that
vary over a two-dimensional domain, often a rectangle.
They are also used in computer graphics and CAD (computer aided
design)to model and render smooth, curved surfaces.
2.5 Application: Photometric Stereo
Photometric stereo is a computer vision technique that recovers the
3D shape of an object by analyzing how light reflects off the surface
from different lighting directions.
The method involves reasoning about the image intensity values for
several different images of a surface in a fixed view, illuminated by
different sources.
Assumptions:
4. Once the surface normal vectors are known for all pixels, we can
reconstruct the 3D shape of the object.
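For a Lambertian surface with known source directions, the per-pixel normal recovery described above reduces to a small least-squares problem. A sketch (the function name and the synthetic data are my own, for illustration only):

```python
import numpy as np

def recover_normal(intensities, light_dirs):
    """Lambertian photometric stereo at one pixel.
    intensities: (k,) brightness values under k known sources.
    light_dirs:  (k, 3) source directions.
    Solves L g = I in the least-squares sense, where g = albedo * normal."""
    L = np.asarray(light_dirs, float)
    I = np.asarray(intensities, float)
    g, *_ = np.linalg.lstsq(L, I, rcond=None)
    albedo = float(np.linalg.norm(g))
    return albedo, g / albedo

# Synthetic check: a patch with albedo 0.8 facing straight up.
true_g = 0.8 * np.array([0.0, 0.0, 1.0])
L = np.array([[0.0, 0.0, 1.0],
              [0.6, 0.0, 0.8],
              [0.0, 0.6, 0.8]])
rho, n = recover_normal(L @ true_g, L)
print(rho, n)  # recovers albedo 0.8 and normal (0, 0, 1)
```

With three or more non-coplanar source directions the system is well determined; extra images make the least-squares solution robust to noise.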
Mathematical Equations:
Disadvantages
Assumes a Lambertian surface
Sensitive to noise and shadows.
Limitations:
Applications:
4. Artistic control
Shading models offer artists and designers precise control over the
appearance of materials and surfaces, helping them achieve specific
aesthetic goals and convey desired moods or atmospheres.
5. Increased efficiency
Advanced shading techniques, such as physically-based rendering,
can help streamline the rendering process by accurately simulating
light behaviour while minimizing the need for manual adjustments,
resulting in more efficient workflows.
6. Versatility
3D shading models are versatile tools that can be applied to
animation, visual effects, video games, architectural visualization,
and product design, providing consistent and high-quality results
across various media and industries.
Thus, in the real world, each surface patch is illuminated not only by
sources, but also by other surface patches.
This leads to a variety of complex shading effects, which are still quite
poorly understood.
Unfortunately, these effects occur widely, and it is not yet known
how to simplify interreflection models without losing essential
qualitative properties.
When a black room with black objects and a white room with white
objects are illuminated by a distant point source, the local shading
model predicts that the two pictures would be indistinguishable (apart
from overall brightness). In practice they look quite different:
surfaces in the black room reflect little light onto other surfaces,
whereas in the white room other surfaces, such as the walls and floor,
are significant sources of radiation, which tend to light up the
corners, which would otherwise be dark.
Where every other patch in the world that the patch under
consideration can see is an area source, with exitance B(v).
Subdivide the world into small, flat patches and approximate the
radiosity as being constant over each patch.
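With the world divided into flat patches of constant radiosity, one standard formulation is B = E + diag(albedo) · F · B, which can be solved iteratively. A toy sketch (the two-patch scene, the form factors, and the function name are invented for illustration):

```python
import numpy as np

def solve_radiosity(emitted, albedo, form_factors, iters=200):
    """Jacobi iteration for B = E + diag(albedo) @ F @ B, where
    form_factors[i, j] is the fraction of light leaving patch j
    that arrives at patch i."""
    E = np.asarray(emitted, float)
    rho = np.asarray(albedo, float)
    F = np.asarray(form_factors, float)
    B = E.copy()
    for _ in range(iters):
        B = E + rho * (F @ B)
    return B

# Two patches facing each other; patch 0 emits, patch 1 only reflects.
B = solve_radiosity(emitted=[1.0, 0.0],
                    albedo=[0.5, 0.5],
                    form_factors=[[0.0, 0.2],
                                  [0.2, 0.0]])
print(B)  # patch 1 is lit purely by interreflection
```

Patch 1 emits nothing, yet ends up with nonzero radiosity — the interreflection effect the local models ignore.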
The model describes the world poorly, and very little is known about
how severely this affects the resulting shape information.
Extracting shape information from an interreflection model is
difficult, for two reasons.
Secondly, there are almost always surfaces that are not visible, but
radiate to the objects in view.
Imagine a stained glass window: the intricate details are lost when
looking at the colored blobs it projects on the floor.
Figure 2.18
This effect is further explained with a simplified model:
A small patch views a plane with sinusoidal radiosity of unit
amplitude as shown in figure 2.18 above.
The graph shows numerical estimates of the gain for patches at ten
equal steps in slant angle, from 0 to π/2, as a function of spatial
frequency on the plane.
The gain falls extremely fast, meaning that large terms at high
spatial frequencies must be local effects, rather than the
result of distant radiators.
Light bounces between a slanted patch and a flat plane. The model
shows that high spatial frequencies (sharp details) have a hard time
traveling between the surfaces.
This is a common effect that people tend not to notice unless they
are consciously looking for it. It is quite often reproduced by
painters.
Some changes are clearly visible while others are not, depending on
the distances between the surveillance camera and subjects.
COLOUR
Color perception: When the reflected light reaches our eyes, the
cone cells in our retina react to the specific wavelengths.
These signals are then sent to the brain, which interprets them as
different colors.
Spectral radiance has units of watts per cubic metre per steradian
(Wm−3sr−1 — cubic metres because of the additional factor of the
wavelength).
Spectral exitance has units of Wm−3 and is used when the angular
distribution of the source is unimportant.
The colour of the light returned to the eye is affected both by the
spectral radiance of the illuminant and by the spectral reflectance of
the surface.
λ is the wavelength.
The colour temperature of a source is the temperature of the black
body that looks most similar.
The colour of sunlight varies with time of day as shown in figure 3.3
and time of year.
Figure 3.3
It's named after the British scientist Lord Rayleigh, who first
described it in the 19th century.
I = I0 (8 π^4 N α^2) / (λ^4 R^2) (1 + cos^2 θ)

Where λ is the wavelength,
N is the number of scatterers,
α is the polarizability,
R is the distance from the scatterer, and
θ is the scattering angle.
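The 1/λ^4 dependence is the key qualitative fact; a quick sketch of its effect (the wavelengths are chosen only for illustration):

```python
def rayleigh_ratio(lambda_a_nm, lambda_b_nm):
    """Relative Rayleigh scattering intensity of two wavelengths:
    scattering strength goes as 1 / lambda^4, so shorter wavelengths
    scatter far more strongly."""
    return (lambda_b_nm / lambda_a_nm) ** 4

# Blue light (~450 nm) scatters roughly 4.4x more strongly than
# red (~650 nm), which is why the sky away from the sun looks blue.
print(round(rayleigh_ratio(450.0, 650.0), 2))  # 4.35
```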
Mie Scattering:
The Mie scattering model is a mathematical description of how light
is scattered by particles that are comparable in size to the
wavelength of the light.
It's named after the German physicist Gustav Mie, who developed it
in the early 20th century.
where:
Qscat is the scattering efficiency, representing the fraction of
incident light scattered by the particle.
Artificial Illumination
Artificial illumination, as opposed to natural light from the sun, refers
to any lighting created by humans.
Most fluorescent bulbs generate light with a bluish tinge, but bulbs
that mimic natural daylight are increasingly available (figure 3.4 (b)).
• In some bulbs, an arc is struck in an atmosphere consisting of
gaseous metals and inert gases. Light is produced by electrons in
metal atoms dropping from an excited state to a lower energy state.
The most common cases are sodium arc lamps, and mercury arc
lamps.
Mercury arc lamps produce a blue-white light, and are often used
for security lighting.
The graph in figure 3.4 (a) shows the relative spectral power
distribution of two standard CIE models, illuminant A — which
models the light from a 100W Tungsten filament light bulb, with
colour temperature 2800K
— and illuminant D-65 — which models daylight.
Figure 3.4 (b) shows a sample of spectra from different light bulbs.
Warm vs. Cool Colours: The colour wheel is also divided into warm
colours (red, orange, yellow) associated with fire and energy, and
cool colours (blue, green, violet) associated with water and
calmness. Understanding this temperature distinction helps create
specific moods in a design.
Additive Matching:
Additive color matching occurs when different colored lights are
combined.
The primary colors in additive matching are red, green, and blue
(RGB).
When all three primary colors are mixed at full intensity, they
produce white light. This is known as Additive colour matching.
This is the principle behind electronic displays such as televisions
and computer monitors.
Figure 3.5 below shows the outline of such additive colour
matching.
Write T for the test light, an equals sign for a match, wi for the
weights, and Pi for the primaries. A match is then written as
T = w1P1 + w2P2 + w3P3.
Figure 3.5
The subtractive colors are cyan, yellow, magenta and black, also
known as CMYK.
Subtractive color begins with white (paper) and ends with black; as
color is added, the result is darker.
Black is referred to as "K," or the key color, and is also used to add
density.
Trichromacy:
Trichromacy, also known as trichromatic vision, is the ability to see
a wide range of colors.
Tri means three and chromacy means colour.
The rods are sensitive to light and help us to see in dim lighting,
whereas the cones allow us to detect color and detail in normal
lighting.
Of the three types of colour receptors, one is most sensitive to
short wavelengths (S), which corresponds to blue; one to medium
wavelengths (M), which corresponds to green; and one to long
wavelengths (L), which corresponds to red.
But the combinations of only two colors could not produce all of the
colors that we are capable of perceiving.
Grassmann’s Laws
Grassmann's Law, also known as Grassmann's Axiom, is a principle in
color perception and computer vision that describes how colors mix
and interact.
1. Firstly, if we mix two test lights, then mixing the matches will
match the result, that is, if
Ta = wa1P1 + wa2P2 + wa3P3
and
Tb = wb1P1 + wb2P2 + wb3P3
Then
Ta + Tb = (wa1 + wb1 )P1 + (wa2 + wb2 )P2
+ (wa3 + wb3 )P3
Where Ta and Tb are two test lights,
P1 , P2 and P3 are three different primaries,
wa1 and wb1 are weights or percentage of intensity of P1 which
contributes to the final colour.
wa2 and wb2 are weights or percentage of intensity of P2 which
contributes to the final colour.
wa3 and wb3 are weights or percentage of intensity of P3 which
contributes to the final colour.
2. Secondly, if two test lights can be matched with the same set of
weights, then they will match each other, that is, if
Ta = w1P1 + w2P2 + w3P3
and
Tb = w1P1 + w2P2 + w3P3
Then
Ta = Tb
Where w1, w2 and w3 are the weights or percentage of intensity of P1,
P2 and P3 respectively, which contributes to the final colour.
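Because matches are specified entirely by their weight vectors, Grassmann's first law is just vector addition over the primaries. A tiny numeric sketch (the weight values are made up for illustration):

```python
import numpy as np

# A match is a weight vector (w1, w2, w3) over the primaries P1, P2, P3.
Ta = np.array([0.2, 0.5, 0.1])   # weights matching test light Ta
Tb = np.array([0.3, 0.1, 0.4])   # weights matching test light Tb

# First law: the mixture Ta + Tb is matched by the sum of the weights.
mix = Ta + Tb
print(mix)  # [0.5 0.6 0.5]
```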
Exceptions
Given the same test light and the same set of primaries, most people
will use the same set of weights to match the test light.
Thus, trichromacy and Grassmann’s laws are about as true as any law
covering biological systems can be.
• some elderly people (whose choice of weights will differ from the
norm, because of the development of macular pigment in the eye);
• very bright lights (whose hue and saturation look different from less
bright versions of the same light);
If two test lights that have different spectra look the same, then they
must have the same effect on these receptors.
The Principle of Univariance
The principle of univariance is a concept that applies to biological
vision systems, but it's relevant to understanding color perception in
computer vision as well.
Overcoming Univariance:
Our brains have multiple cone types, each with a different peak
sensitivity to specific wavelength ranges (red, green, blue).
Cones are responsible for vision in bright light conditions, also known
as photopic vision.
Cones are somewhat less sensitive to light than rods are, meaning
that in low light, colour vision is poor and it is impossible to read.
The figure 3.6 above shows the log of the relative spectral sensitivities
of the three kinds of colour receptors in the human eye.
The three types of cone are properly called S cones, M cones and L
cones (for their peak sensitivity being to short, medium and long
wavelength light respectively).
They are occasionally called blue, green and red cones, which is a
bad practice.
The first two receptors —sometimes called the red and green cones
respectively, but more properly named the long and medium
wavelength receptors — have peak sensitivities at quite similar
wavelengths.
The third type of receptor — sometimes called the blue cones, but more
properly named the short wavelength receptors — has a very different
peak sensitivity.
The response of a receptor to incoming light can be obtained by
summing the product of the sensitivity and the spectral radiance of
the light, over all wavelengths.
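That summation can be written directly (a toy sketch; the five-sample sensitivity curve and flat spectrum are invented values):

```python
import numpy as np

def receptor_response(sensitivity, radiance, d_lambda=1.0):
    """Response of one receptor type: the product of its spectral
    sensitivity and the spectral radiance of the light, summed
    (integrated) over wavelength."""
    return float(np.sum(np.asarray(sensitivity, float) *
                        np.asarray(radiance, float)) * d_lambda)

sens = [0.0, 0.2, 1.0, 0.2, 0.0]   # sensitivity samples, 1 nm apart
flat = [1.0, 1.0, 1.0, 1.0, 1.0]   # flat (equal-energy) light
print(receptor_response(sens, flat))  # 1.4
```

Two lights with different spectra that produce the same three sums are indistinguishable — exactly the matching behaviour described above.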
Rods vs. Cones:
4. There are about 120 million rod cells, whereas there are only
about 6-7 million cone cells.
Many products are closely associated with very specific colours and
manufacturers take a great deal of trouble to ensure that different
batches have the same colour.
Colour matching data yields simple and highly effective linear colour
spaces
(section 3.3.1).
The most common color spaces include RGB (Red, Green, Blue),
CMYK (Cyan, Magenta, Yellow, Black), HSV/HSL (Hue, Saturation,
Value/Lightness), and XYZ.
Linear color spaces are a subset of color spaces that have gained
importance due to their compatibility with various image processing
algorithms, especially those involving linear transformations like
convolution and matrix operations.
Linear colour spaces are easy to obtain and use: a colour is described
by setting up and performing a matching experiment and transmitting
the match weights.
CIE XYZ: The CIE XYZ color space is linear and serves as a
foundational color space for various color models. It represents
colors based on their spectral characteristics and is widely used in
color science and standardization.
Dynamic Range: Linear color spaces may not fully utilize the
available dynamic range of digital sensors and displays, especially
in low-light or high-brightness conditions. Linear spaces can
struggle with the vast dynamic range of real-world scenes. Cameras
often capture data with a limited range, requiring non-linear spaces
for efficient storage and display. However, this limitation can be
addressed through techniques like gamma correction or tone
mapping.
Perceptual Non-uniformity: Linear color spaces may not provide
perceptual uniformity, meaning that equal differences in color
components do not always correspond to equal perceptual changes
in color. However, perceptual uniformity can be achieved through
transformations like CIE Lab.
They quantify the sensitivity of the three types of cone cells (red (L),
green (M) and blue (S)) in the human retina to different wavelengths
of light.
1. CIE 1931 2-degree CMFs: This is the most widely used set,
representing colour matching for a 2-degree field of view (the central
part of human vision).
2. CIE 1964 10-degree CMFs: These CMFs account for a wider
field of view (10 degrees) and are more relevant for peripheral vision.
3. Normalized Functions: The color matching functions are
often normalized so that the area under each curve is equal to one.
Normalization ensures that the total response of the visual system
to all wavelengths of light is consistent across different conditions
and observers.
The colour matching functions, which are written as f1(λ), f2(λ) and
f3(λ); can be obtained from a set of primaries P1, P2 and P3 by
experiment.
The figures (a) and (b) above plot the wavelength on the x-axis and
tristimulus value, i.e., relative sensitivity, on the y-axis.
The figure (a) above shows colour matching functions for the
primaries for the RGB system. The negative values mean that
subtractive matching is required to match lights at that wavelength
with the RGB primaries.
The figure (b) above shows colour matching functions for the CIE X,
Y and Z primaries; the colour matching functions are everywhere
positive, but the primaries are not real.
Applications of Colour Matching Functions:
CMFs play a crucial role in various applications:
1. Standardizing Colour Measurement: They provide a reference
for instruments that measure colour, ensuring consistent results
across different devices. They serve as the basis for defining color
spaces, color models, and color reproduction systems used in various
industries, including printing, imaging, and display technology.
If the primaries are real lights, at least one color matching function
may be negative for some wavelengths.
This may seem problematic, but since color naming systems typically
rely on comparing weights rather than physically creating colors, it's
not a significant issue.
The XYZ color space should be considered the master color space as
it can encompass and describe all other RGB color spaces.
2. Color Space Primaries: The CIE XYZ color space has imaginary
primaries, meaning they do not correspond to any physically real
light sources.
These primaries are defined in such a way that any color can be
represented as a non-negative linear combination of X, Y, and Z
values.
However, given colour matching functions alone, one can specify the
XYZ coordinates of a colour and hence describe it.
Standardization:
The CIE has standardized various aspects of the CIE XYZ color space
to ensure consistency and interoperability across different
applications and industries. These standards include color matching
functions, illuminants, observer conditions, and conversion
algorithms.
The CIE xy color space is widely used in color science, lighting design,
and colorimetry.
x = X / (X + Y + Z)
y = Y / (X + Y + Z)
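These two formulas normalize out the overall brightness; a direct sketch:

```python
def xy_chromaticity(X, Y, Z):
    """CIE xy chromaticity coordinates: brightness is divided out,
    leaving only the quality of the colour."""
    s = X + Y + Z
    return X / s, Y / s

# Equal-energy white sits at (1/3, 1/3):
x, y = xy_chromaticity(1.0, 1.0, 1.0)
print(x, y)
```

Scaling X, Y and Z by any common factor leaves (x, y) unchanged, which is exactly why the diagram is two-dimensional.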
Colour Hue: Hue changes as one moves around the spectral locus.
Colours on the red side of the diagram have a reddish hue, those on
the green side have a greenish hue, and so on.
Device Independence: Like the CIE XYZ color space, the CIE xy
color space is device-independent, making it suitable for color
specification, analysis, and comparison across different devices and
environments.
RGB (Red, Green, Blue) color spaces are fundamental to the digital
world, used in digital imaging, video systems, display technologies,
computer graphics, photography, and digital media, etc.
RGB colour spaces form the foundation for how colors are displayed
on electronic devices like monitors, TVs, and smartphones.
The RGB colour space is a linear colour space that formally uses
single-wavelength primaries (645.16 nm for R, 526.32 nm for G and
444.44 nm for B).
Informally, RGB uses whatever phosphors a monitor has as
primaries.
Core Principles:
Additive Color Mixing: RGB relies on the principle of additive color
mixing.
By combining varying intensities of red, green, and blue light, a vast
range of colors can be produced.
Color Gamut: The color gamut of an RGB color space refers to the
range of colors that can be accurately represented within that color
space.
Different RGB color spaces have different gamuts, with some capable
of representing a wider range of colors than others.
For example:
R=0, G=0, B=0 yields black
R=255, G=255, B=255 yields white
R=251, G=254, B=141 yields a pale yellow
R=210, G=154, B=241 yields a light purple.
Figure 3.12 (a) shows the RGB cube; which is the space of all colours
that can be obtained by combining three primaries (R, G, and B —
usually defined by the colour response of a monitor) with weights
between zero and one.
It is common to view this cube along its neutral axis — the axis from
the origin to the point (1, 1, 1) — to see a hexagon, shown in the
middle.
Figure 3.12 (b) shows a cone obtained from the RGB cube cross-
section, where the distance along a generator of the cone gives the
value (or brightness) of the colour, angle around the cone gives the
hue and distance out gives the saturation of the colour.
Despite its limitations, the sRGB color space is still very widely used
and often considered to be the default if the color space of an image
is not explicitly specified.
It has been endorsed by the W3C, Exif, Intel, Pantone, Corel, and
many other industry players.
Adobe RGB:
The Adobe RGB color space was developed by Adobe Systems in
1998.
The Adobe RGB color space encompasses roughly 50% of the visible
colors specified by the Lab color space, improving upon the gamut of
the sRGB color space primarily in cyan-greens.
It offers a wider gamut (range of colors) compared to sRGB, suitable
for professional image editing where preserving a larger color range
is crucial.
ProPhoto RGB:
ProPhoto RGB, also known as ROMM RGB, is an even wider-gamut
RGB color space designed for high-end professional imaging
applications.
1. Image and Video Display: RGB is the primary color space for
displaying images and videos on various electronic devices.
2. Digital Photography: Most digital cameras capture images in
an RGB color space, often using variations like sRGB for
compatibility.
3. Computer Graphics and Animation: RGB is the foundation for
creating and manipulating colors in computer graphics and
animation software.
In the CMY colour space, each layer of ink reduces the initial
brightness, by absorbing some of the light wavelengths and reflecting
others, depending on its characteristics.
It uses the subtractive colour mixing where colors are created by
subtracting varying amounts of cyan, magenta, and yellow inks from
white light, to create different hues.
Primary Colors:
1. Cyan: Cyan is a greenish-blue color.
It absorbs red light and reflects green and blue light.
In printing, cyan ink is used to subtract red light from white light,
resulting in cyan.
cyan = White − Red (C = W - R); -------- (1)
2. Magenta: Magenta absorbs green light and reflects red and blue
light.
In printing, magenta ink is used to subtract green light from white
light, resulting in magenta.
magenta = White − Green (M = W - G) ---------- (2)
3. Yellow: Yellow absorbs blue light and reflects red and green
light.
In printing, yellow ink is used to subtract blue light from white light,
resulting in yellow.
yellow = White − Blue (Y = W - B) ---------- (3)
Color Mixing:
The appearance of mixtures may be evaluated by reference to the
RGB colour space.
For example,
W = R + G + B ------------------(4)
we assume that ink cannot cause paper to reflect more light than it
does when uninked.
Hence, adding (1) and (2), we can write,
C + M = W – R + W – G --------- (5)
W + W = W --------------- (6)
C + M = W – R – G ---------(7)
Put (4) in (7), we have,
C+M=R+G+B–R–G
C + M = B ----------- (8)
For example:
3. Mixing cyan and yellow inks subtracts red and blue light,
resulting in green.
Adding (3) and (1), we have,
C + Y = W – R + W – B ------- (12)
Using (6), we have,
C + Y = W – R – B ---------(13)
Put (4) in (13), we have,
C+Y=R+G+B–R–B
C + Y = G ------------ (14)
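In normalized units (ink amounts and reflectances between 0 and 1) the derivation above collapses to simple vector arithmetic. A sketch of the subtractive model (the normalization and function name are my own):

```python
import numpy as np

def cmy_to_rgb(c, m, y):
    """Subtractive model: each ink removes its complementary primary
    from white light, so (R, G, B) = (1 - C, 1 - M, 1 - Y)."""
    return 1.0 - np.clip(np.array([c, m, y], float), 0.0, 1.0)

# Full cyan plus full yellow: red and blue are absorbed, green
# survives, matching equation (14) above (C + Y = G).
print(cmy_to_rgb(1.0, 0.0, 1.0))  # [0. 1. 0.]
```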
Black ink provides richer, deeper blacks and sharper details in print
compared to mixing CMY inks.
Hence, the new colour model is known as CMYK colour model, which
is an extension of the CMY model, incorporating a fourth color, black
(K), to improve the quality of color reproduction in printing.
The CMYK color space has a narrower color gamut compared to RGB,
meaning it can represent fewer colors.
In the HSV colour space, H stands for Hue, S stands for Saturation,
and V stands for Value (also called Brightness, which is why the
space is sometimes written HSB).
Hue (H):
The property of a colour that varies in passing from red to green is
known as hue.
In each cylinder, the angle around the central vertical axis
corresponds to "hue".
In the HSL color model, hue represents the actual colour, which
ranges from 0 to 360 degrees, covering the entire spectrum of
colors.
Red is typically at 0 degrees, green at 120 degrees, and blue at 240
degrees, with intermediate hues in between.
2. Saturation (S):
The distance from the axis corresponds to "saturation".
It can also be defined as the property of a colour that varies in
passing from red to pink;
Saturation refers to the intensity or purity of the color.
A saturation value of 0 results in a grayscale color (no hue), while a
saturation value of 100% represents the fully saturated hue.
Saturation is typically expressed as a percentage, ranging from 0%
(gray) to 100% (fully saturated).
3. Value (V):
Value is also known as brightness.
So the HSV colour space is also known as HSB colour space.
Value is the property that varies in passing from black to white.
The distance along the axis corresponds to value or brightness.
Value/Brightness represents the brightness of the color.
The value or brightness dimension is represented on a scale from 0%
to 100%.
Value (sometimes called Brightness) also goes from 0% (black) to
100% (white).
The lower the percentage, the darker the colour will be; the higher
the percentage, the brighter the colour will be.
The value is essentially the brightness of the color irrespective of its
hue.
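Python's standard library can demonstrate these three properties directly (colorsys works in 0-1 units; scaling to degrees and percentages is done by hand here):

```python
import colorsys

# Pure red: hue 0 degrees, fully saturated, full value.
h, s, v = colorsys.rgb_to_hsv(1.0, 0.0, 0.0)
print(h * 360, s * 100, v * 100)   # 0.0 100.0 100.0

# Pink (red desaturated towards white): same hue, lower saturation.
h, s, v = colorsys.rgb_to_hsv(1.0, 0.5, 0.5)
print(h * 360, s * 100, v * 100)   # 0.0 50.0 100.0
```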
Both HSL and HSV require conversion to/from linear spaces like
RGB for calculations and display on devices.
In HSV, the most vivid, fully saturated colours sit at the top of the
cylinder (100% value); moving down the cylinder reduces the value,
darkening every colour towards black.
Figure 3.12 (a) shows the RGB cube, which can be viewed with
weights between 0 and 1. This cube appears as a hexagon, when
viewed along its neutral axis — the axis from the origin to the point
(1, 1, 1).
Figure 3.12 (b) shows the cone obtained from the cross-section of
the hexagon.
The angle around the cone gives hue, and the distance away from
the axis gives the saturation, and the distance from origin along the
vertical axis gives value or brightness.
Applications:
1. Developed in the 1970s for computer graphics applications,
HSL and HSV are used today in color pickers, in image editing
software, and less commonly in image analysis and computer vision.
2. HSV model is used in histogram equalization.
3. Converting grayscale images to RGB color images.
4. Visualization of images is easy as by plotting the H and S
components we can vary the V component or vice-versa and see the
different visualizations.
Figure 2
When these differences are plotted on a colour space, they form the
boundary of a region of colours that are indistinguishable from the
original colours.
These boundaries form ellipses instead of circles because the human
eye has varying sensitivity to color changes across the spectrum.
Figure 3.13
At the center of the ellipse is the colour of a test light; the size of the
ellipse represents the scatter of lights that the human observers
tested would match to the test colour; the boundary shows where the
just noticeable difference is.
The ellipses at the top are larger than those at the bottom of the
figure, and they rotate as they move up.
This means that the magnitude of the difference in x, y coordinates
is a poor indicator of the significance of a difference in colour.
The CIE u'v' color space:
The CIE u'v' color space, also known as the CIE 1976 UCS (Uniform
Chromaticity Scale) diagram, is an important step towards more
perceptually uniform color representation compared to its
predecessor, CIE xy.
The CIE u'v' space was developed to address the issue of non-
uniform perception where in equal distances in the space don't
necessarily correspond to equal perceived color differences.
u' and v' Coordinates: Colors within the diagram are defined by
coordinates u' and v' that are derived from the original x and y values
of CIE xy.
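The transform is a projective map of the xy coordinates. A sketch (the constants here are the standard CIE 1976 values, written from the usual XYZ form rather than taken from the notes):

```python
def xy_to_uv_prime(x, y):
    """CIE 1976 u'v' coordinates from xy chromaticity:
    u' = 4x / (-2x + 12y + 3), v' = 9y / (-2x + 12y + 3)."""
    d = -2.0 * x + 12.0 * y + 3.0
    return 4.0 * x / d, 9.0 * y / d

# Equal-energy white (x = y = 1/3) maps to (4/19, 9/19):
u, v = xy_to_uv_prime(1 / 3, 1 / 3)
print(u, v)
```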
Disadvantage:
The transformation from xy to u'v' involved complex mathematical
formulas, making it less convenient for some applications.
ΔE is given by:
ΔE = √((ΔL*)^2 + (Δa*)^2 + (Δb*)^2)
It's one of the most widely used color spaces in various fields such
as color science, color management, and computer graphics.
The 1976 CIELAB coordinates (L*, a*, b*) in this color space can be
calculated from the tristimulus values XYZ with the following
formulas:
L* = 116 f(Y/Yn) − 16
a* = 500 [ f(X/Xn) − f(Y/Yn) ]
b* = 200 [ f(Y/Yn) − f(Z/Zn) ]
where f(t) = t^(1/3) for t > (6/29)³, and
f(t) = t / (3 (6/29)²) + 4/29 otherwise.
The subscript n denotes the values for the white point.
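The standard CIELAB definitions, together with the CIE76 ΔE colour difference, can be sketched in code (a minimal implementation; Xn, Yn, Zn are the white-point tristimulus values):

```python
def f(t):
    """The CIELAB cube-root nonlinearity, linearized near zero."""
    delta = 6.0 / 29.0
    if t > delta ** 3:
        return t ** (1.0 / 3.0)
    return t / (3.0 * delta ** 2) + 4.0 / 29.0

def xyz_to_lab(X, Y, Z, Xn, Yn, Zn):
    """Compute (L*, a*, b*) from tristimulus values and the white point."""
    fx, fy, fz = f(X / Xn), f(Y / Yn), f(Z / Zn)
    L = 116.0 * fy - 16.0
    a = 500.0 * (fx - fy)
    b = 200.0 * (fy - fz)
    return L, a, b

def delta_e(lab1, lab2):
    """CIE76 colour difference between two (L*, a*, b*) triples."""
    return sum((c1 - c2) ** 2 for c1, c2 in zip(lab1, lab2)) ** 0.5

# The white point itself maps to L* = 100, a* = b* = 0.
```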
Disadvantages of CIELAB:
1. The precision provided by CIELAB is at a level where it requires
significantly more data per pixel, compared to RGB and CMYK
standards.
2. Since the gamut of the standard is larger than that of most
computer displays, occasionally there is some loss of precision;
however, advances in technology have made such issues negligible.
Spatial Effects: The spatial effects relate the objects being viewed
to space.
These focus on how objects are arranged and relate to each other
in 3D space.
Temporal Effects: The temporal effects relate the objects being
viewed to time.
These focus on how objects move and change over time.
Spatial Effects:
These include the following:
1. Spatial Resolution: Spatial resolution refers to the level of
detail in an image and is determined by the number of pixels in an
image.
Higher resolution images contain more detail and provide a clearer
representation of the scene.
Spatial resolution is crucial in tasks like image classification,
object detection, and image segmentation.
2. Spatial Filtering: Spatial filtering involves applying
mathematical operations to an image at each pixel to enhance or
extract certain features. Common spatial filters include blurring
(smoothing), sharpening, edge detection, and noise reduction
filters.
These filters manipulate the spatial arrangement of pixel values to
highlight or suppress specific features in an image.
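The idea of spatial filtering can be sketched with a naive numpy implementation (valid region only, no padding; real code would use an optimized library routine):

```python
import numpy as np

def filter2d(image, kernel):
    """Slide a kernel over the image and take the weighted sum at each
    position (cross-correlation; identical to convolution for the
    symmetric kernels used here)."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out

box = np.ones((3, 3)) / 9.0          # blurring (smoothing) filter
sobel_x = np.array([[-1.0, 0.0, 1.0],
                    [-2.0, 0.0, 2.0],
                    [-1.0, 0.0, 1.0]])  # edge-detection filter
```

On a constant image, the box filter returns the same constant and the edge filter returns zero, since there are no intensity changes to detect.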
Temporal Effects:
1. Temporal Resolution: Temporal resolution refers to the
ability to distinguish between changes in a scene over time.
In video processing, it is determined by the frame rate, which is the
number of frames captured or displayed per second.
Higher frame rates provide smoother motion and better temporal
resolution, which is essential for tasks like motion detection,
tracking, and video analysis.
Chromatic Adaptation:
Predicting the appearance of complex displays of colour is difficult.
In the top row, the graph on the left shows log ρ(x); the centre
shows log I(x); and the one on the right shows their sum, log C(x).
Algorithm:
1. The log albedo map whose gradient is most like the thresholded
gradient is chosen.
2. This is a relatively simple problem, because computing the
gradient of an image is a linear operation.
3. The x-component of the thresholded gradient is scanned into a
vector p and the y-component is scanned into a vector q.
4. The vector representing log-albedo is written as l.
5. Since the process of forming the x derivative is linear, there is
some matrix Mx such that Mxl is the x derivative.
6. For the y derivative, the corresponding matrix is written as My.
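The steps above can be sketched as follows (a dense, toy-scale sketch; the image size and threshold are illustrative, and a real implementation would use sparse matrices):

```python
import numpy as np

def derivative_matrices(h, w):
    """Build dense matrices Mx, My that map a row-major scanned image
    vector of length h*w to its forward-difference x- and y-derivatives
    (zero at the far border)."""
    n = h * w
    Mx = np.zeros((n, n))
    My = np.zeros((n, n))
    for r in range(h):
        for c in range(w):
            i = r * w + c
            if c + 1 < w:            # x derivative: next column minus this
                Mx[i, i] = -1.0
                Mx[i, i + 1] = 1.0
            if r + 1 < h:            # y derivative: next row minus this
                My[i, i] = -1.0
                My[i, i + w] = 1.0
    return Mx, My

def recover_log_albedo(log_C, thresh=0.3):
    """Lightness-recovery sketch: threshold the gradient of log C
    (small gradients are attributed to slowly varying illumination),
    then find the log-albedo l whose gradient best matches the
    thresholded gradient, in the least-squares sense."""
    h, w = log_C.shape
    Mx, My = derivative_matrices(h, w)
    v = log_C.ravel()
    p = Mx @ v                        # thresholded x-component
    q = My @ v                        # thresholded y-component
    p[np.abs(p) < thresh] = 0.0
    q[np.abs(q) < thresh] = 0.0
    A = np.vstack([Mx, My])
    b = np.concatenate([p, q])
    l, *_ = np.linalg.lstsq(A, b, rcond=None)
    return l.reshape(h, w)            # log-albedo, up to an additive constant
```

Since the derivative operators have a constant image in their null space, the recovered log-albedo is determined only up to an additive constant, which is why lightness algorithms need an extra normalization step.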
The main parameters which affect the colour of the pixel are:
1. The camera response to illumination (which might not be
linear);
2. The choice of camera receptors;
3. The amount of light that arrives at the surface;
4. The colour of light arriving at the surface;
5. The dependence of the diffuse albedo on wavelength; and
6. The specular components.
The term i(x) can sometimes be quite small with respect to other
terms and usually changes quite slowly over space.
Thus, if a photoreceptor of the k’th type sees this surface patch, its
response will be:
pk = Σj rj ∫ σk(λ) φj(λ) E(λ) dλ
where the φj(λ) are the basis functions for the model of reflectance,
the rj vary from surface to surface, and σk(λ) is the sensitivity of
the k’th receptor.
Similarly, the illuminant is modelled as E(λ) = Σi ei ψi(λ), where
the ψi(λ) are the basis functions for the model of illumination.
When both models apply, the response of a receptor of the k’th type
is:
pk = Σi Σj ei rj gijk, where gijk = ∫ σk(λ) ψi(λ) φj(λ) dλ,
and we expect that the gijk are known, as they are components of the
world model.
The average of the response of the k’th receptor can then be written
as:
p̄k = Σi Σj ei r̄j gijk
where p̄k and r̄j denote the spatial averages of pk and rj.
We know p̄k and r̄j, and so have a linear system in the unknown
light coefficients ei. We solve this, and then recover reflectance
coefficients at each pixel, as for the case of specularities.
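The two-step recovery can be sketched numerically (a synthetic world model; the tensor g, the basis sizes, and the random coefficients are all illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical world model: 3 receptor types (k), 3 illuminant basis
# functions (i), 3 reflectance basis functions (j). The tensor
# g[i, j, k] = integral of sigma_k * psi_i * phi_j is assumed known.
g = rng.normal(size=(3, 3, 3))

e_true = rng.normal(size=3)        # unknown illuminant coefficients ei
r_bar = rng.normal(size=3)         # known average reflectance coefficients

# Average receptor responses: p_bar_k = sum_i sum_j ei * r_bar_j * g_ijk
p_bar = np.einsum('i,j,ijk->k', e_true, r_bar, g)

# Step 1: solve the linear system p_bar_k = sum_i ei (sum_j r_bar_j g_ijk).
A = np.einsum('j,ijk->ki', r_bar, g)     # (k, i) matrix
e_hat = np.linalg.solve(A, p_bar)

# Step 2: with the light known, each pixel's responses
# p_k = sum_j rj (sum_i ei g_ijk) form a linear system in the rj.
B = np.einsum('i,ijk->kj', e_hat, g)     # (k, j) matrix
r_pixel = rng.normal(size=3)             # a pixel's true reflectance
p_pixel = B @ r_pixel
r_hat = np.linalg.solve(B, p_pixel)      # recovered reflectance
```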
The gamut of an image is the set of different colours that appear in
it, i.e. the collection of all its pixel values, and it carries
information about the light source.
If an image gamut contains two pixel values, say p1 and p2, then it
must be possible to take an image under the same illuminant that
contains the value tp1 + (1 − t)p2 for 0 ≤ t ≤ 1.
This means that the convex hull of the image gamut contains the
illuminant information.
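This convexity property can be illustrated with scipy (the chromaticities here are random synthetic data, purely for illustration):

```python
import numpy as np
from scipy.spatial import ConvexHull, Delaunay

rng = np.random.default_rng(1)
# Synthetic image gamut: 2-D chromaticities of the pixels in one image.
pixels = rng.uniform(0.2, 0.6, size=(200, 2))

hull = ConvexHull(pixels)            # convex hull of the image gamut

# A convex combination t*p1 + (1 - t)*p2 of two image colours is a
# colour achievable under the same illuminant, so it must lie inside
# the hull; Delaunay.find_simplex returns -1 only for points outside.
tri = Delaunay(pixels)
p1, p2 = pixels[0], pixels[1]
mid = 0.5 * p1 + 0.5 * p2
inside = tri.find_simplex(mid) >= 0
```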
Advantages:
Better color reproduction than other approaches.
Disadvantages:
1. Computationally expensive
2. Depends upon sensor sensitivity
3. Assumes uniform illumination distribution
4. Requires the knowledge of the range of illuminant.