COMP5111-W05-imageEnhancement
Heavy rain from Storm Monica has caused severe #floods in southern #France.
Sentinel-2, MultiSpectral Instrument (MSI), 12 March 2024.
1
Image Enhancement Techniques
5.1 Introduction
5.2 Human Visual System
5.3 Contrast Enhancement
5.4 Pseudocolour Enhancement
2
Introduction
A digital image can be considered as a three-dimensional rectangular array or
matrix, with
- the x- and y-axes representing the two spatial dimensions, and
- the z-axis the spectral bands.
The elements of this matrix are numbers in the range 0 to 2^n − 1, where n is the
number of bits used to represent the radiance recorded for any given pixel in the
image.
[Figure: a three-dimensional block of pixels, with axes X and Y and depth equal to
the number of bands, alongside its C/C++ declaration.]
4
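The C/C++ declaration on the slide is not recoverable from the text. As an illustrative sketch of the same idea (not the original declaration), the three-dimensional layout can be expressed as a NumPy array with the band index first:

```python
import numpy as np

# A digital image as a 3D array: bands (z) x rows (y) x columns (x).
# With n = 8 bits per pixel, values lie in the range 0 to 2**8 - 1 = 255.
n_bands, n_rows, n_cols = 3, 4, 5          # illustrative sizes only

image = np.zeros((n_bands, n_rows, n_cols), dtype=np.uint8)

# Radiance recorded in band 0 at row 1, column 2, set to the maximum value.
image[0, 1, 2] = 255
```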
A computer can be used to manipulate the image data and to produce displays that
satisfy the particular needs of the interpreter.
Also, the characteristics of each image in terms of the distribution of pixel values
(PVs) over the 0–255 display range will change from one image to another;
thus, enhancement techniques suited
to one image (for example covering an area of forest)
will differ from the techniques applicable
to an image of another kind of area (for example the Antarctic ice-cap).
5
In this chapter we concentrate on ways of improving the visual interpretability of an
image by one of two methods:
The first group consists of those methods that can be used to compensate for
inadequacies of what, in photographic terminology, would be called ‘exposure’:
a scale running from ‘too dark’ to ‘over-bright’.
In this context, contrast is simply the range and distribution of the PVs over the
0–255 scale.
The second category includes those methods that allow the information content of
a greyscale image to be re-expressed in colour.
The chapter begins with a brief description of the human visual system.
6
5.2 Human Visual System
Light reaching the eye passes through the pupil and is focused onto the retina by
the lens (Figure 5.1a,b). The retina contains large numbers of light-sensitive
photoreceptors, termed rods and cones. These photoreceptors are connected via a
network of nerve fibres to the optic nerve, along which travel the signals that are
interpreted by the brain as images of our environment.
Figure 5.1 (a) Simplified diagram of the human eye. (b) A real retina. Arteries and veins are
clearly visible, and they converge on the optic nerve, which appears in a lighter colour. Rods
and cones on the surface of the retina are tiny, and are only visible at a much greater
magnification.
7
8
There are around
- 100 million rod-shaped cells on the retina, and
- 5 million cone-shaped cells.
Each of these cells is connected to a nerve, the junction being called a synapse.
The way in which these cells respond to light is through alteration of a molecule
known as a chromophore. Changes in the amount of light reaching a chromophore
produce signals that pass through the nerve fibre to the optic nerve.
Signals from the right eye are transmitted through the optic nerve to the left side
of the brain, and vice versa.
9
It is generally accepted that the photoreceptor cells, comprising the rods and cones,
differ in terms of their inherent characteristics.
10
The rod-shaped cells respond to light at low illumination levels, and provide a
means of seeing in such conditions (scotopic vision).
Scotopic vision does not provide any colour information, though different levels
of intensity can be distinguished.
11
Cone or photopic vision allows the distinction of colours or hues and the
perception of the degree of saturation (purity) of each hue as well as the intensity
level.
However, photopic vision requires a higher illumination level than does scotopic
vision.
12
Colour is thought to be associated with cone vision because there are
three kinds of cones, each kind being responsive to one of the
three primary colours of light (red, green and blue (RGB)).
13
A model of ‘colour space’ can be derived from the idea that colours are formed by
adding together differing amounts of RGB.
Figure 5.3 shows a geometrical representation of the RGB colour cube.
14
The origin is at the vertex of the cube marked ‘black’ and the axes
are black–red, black–green and black–blue.
A specific colour can be specified by its coordinates along these three axes.
Black represents the absence of colour.
15
These coordinates are termed (R, G, B) triples. Notice that white
light is formed by the addition of maximum red, maximum green and
maximum blue light. The line joining the black and white vertices of the cube
represents colours formed by the addition of equal amounts of RGB light;
these are shades of grey.
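As a trivial sketch of this property, a colour lies on the black-white diagonal of the RGB cube exactly when its three components are equal (the function name `is_grey` is invented here):

```python
def is_grey(r, g, b):
    """A colour lies on the black-white diagonal of the RGB cube
    when its red, green and blue components are equal."""
    return r == g == b

# Black, white and mid-grey all sit on the diagonal...
assert is_grey(0, 0, 0) and is_grey(255, 255, 255) and is_grey(127, 127, 127)
# ...while pure red does not.
assert not is_grey(255, 0, 0)
```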
16
Colour television makes use of the RGB model of colour vision.
A cathode ray television screen is composed of an array of dots,
each of which contains red-, green- and blue-emitting phosphors. Colours on the
screen are formed by exciting the RGB phosphors in differing proportions.
If the proportions of RGB were equal at each point a greyscale image would be seen.
A colour picture is obtained when the amounts of RGB at each point are unequal,
and so – in terms of the RGB cube – the colour at any pixel is represented by a point
that is located away from the black-white diagonal line.
17
Flat screen technology using a liquid crystal display (LCD) and
thin film transistor (TFT) allows light from a fluorescent source to
pass through RGB liquid crystals with intensity proportional to the voltage at that
point.
The number of points (representing image pixels) is determined by screen size.
Each pixel can be addressed separately and a picture produced in a way that is
conceptually similar to the image on a conventional cathode ray tube (CRT) display.
LCD screens can be much bigger than CRT; commercial TVs with a diagonal screen
size of 52 inches (132 cm) are readily available.
18
Other colour models are available which provide differing views of the nature of
our perception of colour. The hue–saturation–intensity (HSI) model uses the
concepts of hue, saturation and intensity to explain the idea of colour.
19
Hue is represented by the top edge of a six-sided cone (hexcone)
with red at 0◦, green at 120◦ and blue at 240◦, then back to red at 360◦.
Pure, fully saturated colours at maximum intensity lie around the top edge of the
hexcone. Addition of white light produces less saturated, paler colours, and so
saturation can be represented by the distance from the vertical axis of the hexcone.
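Python's standard `colorsys` module implements the closely related HSV hexcone model, whose hue angles match those given above; the helper `hue_degrees` is invented here, and `colorsys` returns hue as a fraction of a full turn:

```python
import colorsys

def hue_degrees(r, g, b):
    """Hue of an RGB triple (components in the range 0-1),
    expressed in degrees around the hexcone."""
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    return h * 360.0

print(hue_degrees(1, 0, 0))   # red:   0 degrees
print(hue_degrees(0, 1, 0))   # green: ~120 degrees
print(hue_degrees(0, 0, 1))   # blue:  ~240 degrees
```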
21
The RGB model of colour is that which is normally used in the study and
interpretation of remotely-sensed images, and in the rest of this chapter we will
deal exclusively with this model.
22
5.3 Contrast Enhancement
If the full dynamic range from 0 to 2^n − 1 levels of the sensor is not used, the
corresponding image is either dull and lacking in contrast, or over-bright.
In such cases, the pixel values (PVs) are clustered around a narrow section of the
black–white axis. Not much detail can be seen on such images, which are either
underexposed or overexposed in photographic terms.
If the range of levels used by the display system could be altered so as to fit the
full range of the black-white axis, then the contrast between the dark and light
areas of the image would be improved while maintaining the relative distribution
of the grey levels.
23
5.3.1 Linear Contrast Stretch
The linear contrast-stretching technique involves the translation of the image PVs
from the observed range Vmin to Vmax to the full range of the display device
(generally 0–255, which assumes an 8-bit display memory).
The PVs are scaled so that Vmin maps to a value of 0 and Vmax maps to a value of
255.
Intermediate values retain their relative positions, so that the observed PV in the
middle of the range from Vmin to Vmax maps to 127.
24
To perform a contrast stretch, we first realize that the number of separate values
contained in the image is calculated as (Vmax − Vmin + 1), which
must be 256 or less for an 8-bit image.
All output values corresponding to input values of Vmin or less are set to 0, while
output values corresponding to input values of Vmax or more are set to 255.
The range from Vmin to Vmax is then linearly mapped onto the range 0–255, as
shown in Figure 5.5.
25
Any pixel in the image having the value 16 (the minimum PV in the
image) is transformed to an output value of 0.
All input values of 191 and more are transformed to output values of 255.
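The stretch just described can be sketched in NumPy; the function name is invented here, the example values 16 and 191 follow the text, and an 8-bit display is assumed:

```python
import numpy as np

def linear_stretch(pv, v_min, v_max):
    """Map PVs so that v_min -> 0 and v_max -> 255, clipping values outside
    the observed range and keeping intermediate values in relative position."""
    pv = np.clip(pv.astype(float), v_min, v_max)
    return np.round((pv - v_min) / (v_max - v_min) * 255.0).astype(np.uint8)

pvs = np.array([10, 16, 103, 191, 200])
print(linear_stretch(pvs, 16, 191))   # values 0, 0, 127, 255, 255
```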
26
Figure 5.6a shows a Landsat-7 ETM+ image of the south-east corner of The Wash
in eastern England.
The histograms are calculated from the image PVs, and are simply counts of the
number of PVs having the value 0, 1, . . . , 255.
27
Figure 5.6
(a) Raw Landsat-7 ETM+ false colour
composite image (using bands 4, 3 and
2 as the RGB inputs) of the south-east
corner of The Wash, an embayment in
eastern England.
(b) Histograms of the PVs in each channel.
Figure 5.7b is the histogram of each channel after the stretch has been applied.
The difference between the stretched and raw image is not great.
29
Figure 5.7
(a) Image shown in Figure 5.6a after a
linear contrast stretch in which the
minimum and maximum histogram values
in each channel are set to 0 and 255
respectively.
(b) Histograms of each channel after the
stretch.
30
If, instead of Vmax and Vmin we use V95% and V5% then we can carry out the contrast
enhancement procedure so that all PVs equal to or less than V5% are output as 0
while all PVs greater than V95% are output as 255.
Those values lying between V5% and V95% are linearly mapped (interpolated), as
before, to the full brightness scale of 0 – 255.
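A sketch of this percentile-based version, using NumPy's `percentile` function (the function name and example data are invented here):

```python
import numpy as np

def percentile_stretch(band, lo=5, hi=95):
    """Contrast stretch using the 5th and 95th percentiles of the image
    histogram instead of the extreme minimum and maximum values."""
    v_lo, v_hi = np.percentile(band, [lo, hi])
    out = (np.clip(band.astype(float), v_lo, v_hi) - v_lo) / (v_hi - v_lo) * 255.0
    return np.round(out).astype(np.uint8)

rng = np.random.default_rng(0)
band = rng.integers(30, 130, size=(100, 100))   # a dull, low-contrast band
stretched = percentile_stretch(band)
print(stretched.min(), stretched.max())         # full 0-255 range is now used
```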
31
Figure 5.8
(a) Linear contrast stretch applied to the
image shown in Figure 5.6a.
The 5th and 95th percentile values of the
cumulative image histograms for the RGB
channels are set to 0 and 255 respectively
and the range between the 5th and 95th
percentiles is linearly interpolated onto the
0–255 scale.
32
5.3.2 Histogram Equalization
The underlying principle of histogram equalization is straightforward.
It is assumed that in a well-balanced image the histogram should be such that each
brightness level contains an approximately equal number of PVs, so that the
histogram of these displayed values is almost uniform.
Firstly, we calculate the target number of PVs in each class of the equalized
histogram:
nt = N / 256;
33
A hypothetical example,
which uses only 16 levels for ease of understanding:
nt = N / 16
nt = 262144 / 16
nt = 16384
34
Next, the histogram of the input image is converted to cumulative
form.
35
Then, target cumulative numbers are computed for each class.
16384 x 1 = 16384
16384 x 2 = 32768
16384 x 3 = 49152
…
16384 x 16 = 262144
36
Finally, a new PV is assigned to each old PV.
Table 5.2 Number of pixels allocated to each class after the application of the equalisation
procedure shown in Figure 5.1a.
Note that the smaller classes in the input have been amalgamated, reducing the contrast in
those areas, while larger classes are more widely spaced, giving greater contrast.
41
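The equalization steps above (target count nt, cumulative histogram, class assignment) can be sketched as follows; `equalize` is a name invented here, and the small 4-level example shows the amalgamation of small classes noted in Table 5.2:

```python
import numpy as np

def equalize(band, levels=16):
    """Map each old PV to the first output class whose target cumulative
    count (a multiple of nt = N / levels) covers its cumulative count."""
    n = band.size
    hist = np.bincount(band.ravel(), minlength=levels)
    cum = np.cumsum(hist)                  # cumulative input histogram
    n_t = n / levels                       # target count per output class
    new_pv = np.ceil(cum / n_t).astype(int) - 1
    return np.clip(new_pv, 0, levels - 1)[band]

grey = np.array([0, 0, 0, 0, 1, 1, 2, 3])
# Old PVs 2 and 3 are amalgamated into one output class, while old PV 0,
# which holds more than nt pixels, keeps a class to itself.
print(equalize(grey, levels=4))           # values 1, 1, 1, 1, 2, 2, 3, 3
```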
The Wash image shown in Figure 5.6a is displayed in Figure 5.9a after a histogram
equalization contrast stretch.
42
Figure 5.9
(a) Histogram equalization contrast
stretch applied to the image shown in
Figure 5.6a.
44
5.4 Pseudocolour Enhancement
In terms of the HSI model, grey values are ranged along the vertical or intensity
axis (Figure 5.4).
No hue or saturation information is present, yet the human visual system is
particularly efficient in detecting variations in hue and saturation, but not so
efficient in detecting intensity variations.
45
A colour rendition of a single band of imagery is called a pseudocolour
image.
46
5.4.1 Density Slicing
The range of contiguous grey levels (such as 0–10 inclusive) is called a ‘slice’.
The greyscale range 0–255 is normally converted to several colour slices.
The effect is
(i) to reduce the number of discrete levels in the image, for several grey levels are
usually mapped onto a single colour and
(ii) to improve the visual interpretability of the image if the slice boundaries and
the colours are carefully selected.
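A minimal density-slicing sketch using NumPy; the slice boundaries and colours below are hypothetical choices, not values from the text:

```python
import numpy as np

def density_slice(band, boundaries, colours):
    """Map each grey level to the colour of the slice it falls in.
    `boundaries` are the upper edges of all but the last slice, e.g.
    [64, 128, 192] gives four slices over the 0-255 range."""
    slice_index = np.digitize(band, boundaries)      # 0 .. len(boundaries)
    return np.asarray(colours, dtype=np.uint8)[slice_index]

# Hypothetical 4-slice scheme: dark blue, light blue, orange, red.
colours = [(0, 0, 128), (0, 128, 255), (255, 128, 0), (255, 0, 0)]
band = np.array([[10, 100], [150, 250]], dtype=np.uint8)
rgb = density_slice(band, [64, 128, 192], colours)
print(rgb.shape)   # one RGB triple per pixel: (2, 2, 3)
```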
47
Consider, for example, a thermal infrared image of the heat emitted by the Earth.
A colour scale ranging from light blue to dark blue, through the yellows and
oranges to red, would be a suitable choice, because most people have an intuitive
feel for the ‘meaning’ of colours in terms of temperature.
48
Figure 5.11a shows a greyscale image which is to be converted to pseudocolour
using the process of density slicing.
This image is band 4 of the Landsat ETM+ false colour image shown in Figure 5.6.
Figure 5.12(a–c) are the density sliced image, the colour wedge giving the
relationship between greyscale and colour, and the histogram of the density sliced
image with colours superimposed.
49
Figure 5.11
(a) Landsat ETM+ Band 4 (NIR) image
of the south-east corner of The Wash,
eastern England.
50
Figure 5.12
(a) Greyscale image of
Figure 5.11a converted
to colour by slicing the
greyscale range 0–255
and allocating RGB
values to each slice
51
5.4.2 Pseudocolour Transform
A greyscale image has equal RGB values at each pixel position.
A pseudocolour transform is carried out by changing the RGB values so that they
become unequal.
52
A pseudocolour transform is carried out by changing the colours in the RGB display
to the format shown in the lower half of Figure 5.13.
The settings shown in the lower part of Figure 5.13 send different colour (RGB)
information to the digital to analogue converter (and hence the screen) for the
same greyscale PV.
53
54
Like the density slicing method, the pseudocolour transform method associates
each of a set of grey levels to a discrete colour.
Usually, the pseudocolour transform uses many more colours than the density slice
method.
The greyscale image on the screen is directly equivalent to the greyscale values in
the image.
The lower part of Figure 5.13 illustrates the same input PVs (N1, N1, N1) but in this
case the RGB are set to transform this triple to the values (N2, N3, N4).
The values N2, N3 and N4 define a colour which is dominated by red and green.
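The transform can be sketched as three lookup tables applied to the same grey value, so that the displayed (R, G, B) triple becomes unequal; the LUTs below are hypothetical examples, not the ones in Figure 5.13:

```python
import numpy as np

def pseudocolour(grey, red_lut, green_lut, blue_lut):
    """Pass the same grey PV through three separate 256-entry lookup
    tables to produce an unequal (R, G, B) triple per pixel."""
    return np.stack([red_lut[grey], green_lut[grey], blue_lut[grey]], axis=-1)

# Hypothetical LUTs: identity for red, reversed for green, constant for blue.
levels = np.arange(256, dtype=np.uint8)
red_lut, green_lut, blue_lut = levels, levels[::-1], np.full(256, 32, np.uint8)

grey = np.array([[0, 128], [200, 255]], dtype=np.uint8)
rgb = pseudocolour(grey, red_lut, green_lut, blue_lut)
print(rgb[0, 1])   # grey level 128 becomes the triple (128, 127, 32)
```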
55
Figure 5.14a shows the Wash area image converted to pseudocolour
using the colour translation wedge shown in Figure 5.14b.
56
57
5.5 Summary
Image enhancement techniques include, but are not limited to, those of
- contrast improvement and
- greyscale to colour transformations.
Other image-processing methods can justifiably be called enhancements.
All these methods alter the visual appearance of the image in such a way as to
bring out or clarify some aspect or property of the image that is of interest to
a user.
58