21CS732 Module 4 Textbook
It is only after years of preparation that the young artist should touch
color—not color used descriptively, that is, but as a means of
personal expression.
Henri Matisse

For a long time I limited myself to one color—as a form of discipline.
Pablo Picasso
Preview
The use of color in image processing is motivated by two principal factors.
First, color is a powerful descriptor that often simplifies object identification
and extraction from a scene. Second, humans can discern thousands of color
shades and intensities, compared to about only two dozen shades of gray.
Color image processing is divided into two major areas: full-color and
pseudocolor processing. In the first category, the images in question typically
are acquired with a full-color sensor, such as a color TV camera or color scan-
ner. In the second category, the problem is one of assigning a color to a partic-
ular monochrome intensity or range of intensities. Until relatively recently,
most digital color image processing was done at the pseudocolor level. How-
ever, in the past decade, color sensors and hardware for processing color im-
ages have become available at reasonable prices. The result is that full-color
image processing techniques are now used in a broad range of applications, in-
cluding publishing, visualization, and the Internet.
It will become evident in the discussions that follow that some of the gray-scale
methods covered in previous chapters are directly applicable to color images.
6.1 Color Fundamentals

In 1666, Sir Isaac Newton discovered that when a beam of sunlight passes
through a glass prism, the emerging beam of light is not white but consists in-
stead of a continuous spectrum of colors ranging from violet at one end to red
at the other. As Fig. 6.1 shows, the color spectrum may be divided into six
broad regions: violet, blue, green, yellow, orange, and red. When viewed in full
color (Fig. 6.2), no color in the spectrum ends abruptly, but rather each color
blends smoothly into the next.
Basically, the colors that humans and some other animals perceive in an object
are determined by the nature of the light reflected from the object. As illustrated
in Fig. 6.2, visible light is composed of a relatively narrow band of frequencies in
the electromagnetic spectrum. A body that reflects light that is balanced in all vis-
ible wavelengths appears white to the observer. However, a body that favors re-
flectance in a limited range of the visible spectrum exhibits some shades of color.
For example, green objects reflect light with wavelengths primarily in the 500 to
570 nm range while absorbing most of the energy at other wavelengths.
FIGURE 6.1 Color spectrum seen by passing white light through a prism. (Courtesy of the General Electric Co., Lamp Business Division.)
FIGURE 6.2 Wavelengths comprising the visible range of the electromagnetic spectrum.
(Courtesy of the General Electric Co., Lamp Business Division.)
Three basic quantities are used to describe the quality of a chromatic light
source: radiance, luminance, and brightness. Radiance is the total amount of
energy that flows from the light source, and it is usually mea-
sured in watts (W). Luminance, measured in lumens (lm), gives a measure of
the amount of energy an observer perceives from a light source. For example,
light emitted from a source operating in the far infrared region of the spec-
trum could have significant energy (radiance), but an observer would hardly
perceive it; its luminance would be almost zero. Finally, brightness is a subjec-
tive descriptor that is practically impossible to measure. It embodies the
achromatic notion of intensity and is one of the key factors in describing
color sensation.
As noted in Section 2.1.1, cones are the sensors in the eye responsible for
color vision. Detailed experimental evidence has established that the 6 to 7 mil-
lion cones in the human eye can be divided into three principal sensing cate-
gories, corresponding roughly to red, green, and blue. Approximately 65% of all
cones are sensitive to red light, 33% are sensitive to green light, and only about
2% are sensitive to blue (but the blue cones are the most sensitive). Figure 6.3
shows average experimental curves detailing the absorption of light by the red,
green, and blue cones in the eye. Due to these absorption characteristics of the
FIGURE 6.3 Absorption of light by the red, green, and blue cones in the human eye as a function of wavelength.
human eye, colors are seen as variable combinations of the so-called primary
colors red (R), green (G), and blue (B). For the purpose of standardization, the
CIE (Commission Internationale de l’Eclairage—the International Commis-
sion on Illumination) designated in 1931 the following specific wavelength val-
ues to the three primary colors: blue = 435.8 nm, green = 546.1 nm, and
red = 700 nm. This standard was set before the detailed experimental curves
shown in Fig. 6.3 became available in 1965. Thus, the CIE standards correspond
only approximately with experimental data. We note from Figs. 6.2 and 6.3 that
no single color may be called red, green, or blue. Also, it is important to keep in
mind that having three specific primary color wavelengths for the purpose of
standardization does not mean that these three fixed RGB components acting
alone can generate all spectrum colors. Use of the word primary has been widely
misinterpreted to mean that the three standard primaries, when mixed in vari-
ous intensity proportions, can produce all visible colors. As you will see shortly,
this interpretation is not correct unless the wavelength also is allowed to vary,
in which case we would no longer have three fixed, standard primary colors.
The primary colors can be added to produce the secondary colors of light—
magenta (red plus blue), cyan (green plus blue), and yellow (red plus green).
Mixing the three primaries, or a secondary with its opposite primary color, in
the right intensities produces white light. This result is shown in Fig. 6.4(a),
which also illustrates the three primary colors and their combinations to pro-
duce the secondary colors.
FIGURE 6.4 Primary and secondary colors of light (additive primaries) and pigments (subtractive primaries). (Courtesy of the General Electric Co., Lamp Business Division.)
Differentiating between the primary colors of light and the primary colors
of pigments or colorants is important. In the latter, a primary color is defined
as one that subtracts or absorbs a primary color of light and reflects or trans-
mits the other two. Therefore, the primary colors of pigments are magenta,
cyan, and yellow, and the secondary colors are red, green, and blue. These col-
ors are shown in Fig. 6.4(b). A proper combination of the three pigment pri-
maries, or a secondary with its opposite primary, produces black.
Color television reception is an example of the additive nature of light col-
ors. The interior of CRT (cathode ray tube) color TV screens is composed of a
large array of triangular dot patterns of electron-sensitive phosphor. When ex-
cited, each dot in a triad produces light in one of the primary colors. The inten-
sity of the red-emitting phosphor dots is modulated by an electron gun inside
the tube, which generates pulses corresponding to the “red energy” seen by
the TV camera. The green and blue phosphor dots in each triad are modulated
in the same manner. The effect, viewed on the television receiver, is that the
three primary colors from each phosphor triad are “added” together and re-
ceived by the color-sensitive cones in the eye as a full-color image. Thirty suc-
cessive image changes per second in all three colors complete the illusion of a
continuous image display on the screen.
CRT displays are being replaced by “flat panel” digital technologies, such as
liquid crystal displays (LCDs) and plasma devices. Although they are funda-
mentally different from CRTs, these and similar technologies use the same
principle in the sense that they all require three subpixels (red, green, and
blue) to generate a single color pixel. LCDs use properties of polarized light to
block or pass light through the LCD screen and, in the case of active matrix
display technology, thin film transistors (TFTs) are used to provide the proper
signals to address each pixel on the screen. Light filters are used to produce
the three primary colors of light at each pixel triad location. In plasma units,
pixels are tiny gas cells coated with phosphor to produce one of the three primary colors.

The amounts of red, green, and blue needed to form any particular color are
called the tristimulus values and are denoted X, Y, and Z, respectively. A color
is then specified by its trichromatic coefficients, defined as

x = X/(X + Y + Z)    (6.1-1)
y = Y/(X + Y + Z)    (6.1-2)
z = Z/(X + Y + Z)    (6.1-3)
It is noted from these equations that†

x + y + z = 1    (6.1-4)

For any wavelength of light in the visible spectrum, the tristimulus values
needed to produce the color corresponding to that wavelength can be ob-
tained directly from curves or tables that have been compiled from extensive
experimental results (Poynton [1996]; see also the early references by Walsh
[1958] and by Kiver [1965]).
Another approach for specifying colors is to use the CIE chromaticity dia-
gram (Fig. 6.5), which shows color composition as a function of x (red) and y
(green). For any value of x and y, the corresponding value of z (blue) is ob-
tained from Eq. (6.1-4) by noting that z = 1 - (x + y). The point marked
green in Fig. 6.5, for example, has approximately 62% green and 25% red con-
tent. From Eq. (6.1-4), the composition of blue is approximately 13%.
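As a tiny numeric check (a sketch in Python, a language choice of ours rather than the book's), the blue coordinate of the point just discussed follows directly from Eq. (6.1-4):

# Eq. (6.1-4): the trichromatic coefficients sum to 1, so z follows from x and y.
x, y = 0.25, 0.62    # approximate red and green content of the "green" point in Fig. 6.5
z = 1.0 - (x + y)    # Eq. (6.1-4)
print(z)             # 0.13, i.e., about 13% blue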
The positions of the various spectrum colors—from violet at 380 nm to red
at 780 nm—are indicated around the boundary of the tongue-shaped chro-
maticity diagram. These are the pure colors shown in the spectrum of Fig. 6.2.
Any point not actually on the boundary but within the diagram represents
some mixture of spectrum colors. The point of equal energy shown in Fig. 6.5
corresponds to equal fractions of the three primary colors; it represents the
CIE standard for white light. Any point located on the boundary of the chro-
maticity chart is fully saturated. As a point leaves the boundary and approach-
es the point of equal energy, more white light is added to the color and it
becomes less saturated. The saturation at the point of equal energy is zero.
The chromaticity diagram is useful for color mixing because a straight-line
segment joining any two points in the diagram defines all the different color
variations that can be obtained by combining these two colors additively. Con-
sider, for example, a straight line drawn from the red to the green points shown
in Fig. 6.5. If there is more red light than green light, the exact point represent-
ing the new color will be on the line segment, but it will be closer to the red
point than to the green point. Similarly, a line drawn from the point of equal
energy to any point on the boundary of the chart will define all the shades of
that particular spectrum color.

† The use of x, y, z in this context follows notational convention. These should not be confused with the use of (x, y) to denote spatial coordinates in other sections of the book.

FIGURE 6.5 Chromaticity diagram. (Courtesy of the General Electric Co., Lamp Business Division.)
FIGURE 6.6 Typical color gamut of color monitors (triangle) and color printing devices (irregular region).

The chromaticity diagram is useful also for defining color gamuts. The triangle
in Fig. 6.6 shows a typical range of colors (called the color gamut) produced by
RGB monitors, and the irregular region inside the triangle is representative of
the color gamut of today's high-quality color printing
devices. The boundary of the color printing gamut is irregular because color
printing is a combination of additive and subtractive color mixing, a process
that is much more difficult to control than that of displaying colors on a
monitor, which is based on the addition of three highly controllable light
primaries.
6.2 Color Models

The purpose of a color model (also called color space or color system) is to
facilitate the specification of colors in some standard, generally accepted way.
Numerous color models are in use today, and it is tempting to dwell on some
of them simply because they are interesting and informative. However,
keeping to the task at hand, the models discussed in this chapter are leading
models for image processing. Having mastered the material in this chapter,
you will have no difficulty in understanding additional color models in use today.
6.2.1 The RGB Color Model

FIGURE 6.7 Schematic of the RGB color cube. Points along the main diagonal have gray values, from black at the origin (0, 0, 0) to white at point (1, 1, 1).
Images represented in the RGB color model consist of three component
images, one for each primary color. When fed into an RGB monitor, these
three images combine on the screen to produce a composite color image, as
explained in Section 6.1. The number of bits used to represent each pixel in
RGB space is called the pixel depth. Consider an RGB image in which each of
the red, green, and blue images is an 8-bit image. Under these conditions each
RGB color pixel [that is, a triplet of values (R, G, B)] is said to have a depth of
24 bits (3 image planes times the number of bits per plane). The term full-color
image is used often to denote a 24-bit RGB color image. The total number of
colors in a 24-bit RGB image is (2^8)^3 = 16,777,216. Figure 6.8 shows the 24-bit
RGB color cube corresponding to the diagram in Fig. 6.7.
EXAMPLE 6.1: Generating the hidden face planes and a cross section of the RGB color cube.

■ The cube shown in Fig. 6.8 is a solid, composed of the (2^8)^3 = 16,777,216
colors mentioned in the preceding paragraph. A convenient way to view these
colors is to generate color planes (faces or cross sections of the cube). This is
accomplished simply by fixing one of the three colors and allowing the other
two to vary. For instance, a cross-sectional plane through the center of the cube
and parallel to the GB-plane in Fig. 6.8 is the plane (127, G, B) for
G, B = 0, 1, 2, ..., 255. Here we used the actual pixel values rather than the
mathematically convenient normalized values in the range [0, 1] because the
former values are the ones actually used in a computer to generate colors.
Figure 6.9(a) shows that an image of the cross-sectional plane is viewed simply
by feeding the three individual component images into a color monitor. In the
component images, 0 represents black and 255 represents white (note that
these are gray-scale images). Finally, Fig. 6.9(b) shows the three hidden surface
planes of the cube in Fig. 6.8, generated in the same manner.
It is of interest to note that acquiring a color image is basically the process
shown in Fig. 6.9 in reverse. A color image can be acquired by using three fil-
ters, sensitive to red, green, and blue, respectively. When we view a color scene
with a monochrome camera equipped with one of these filters, the result is a
monochrome image whose intensity is proportional to the response of that fil-
ter. Repeating this process with each filter produces three monochrome im-
ages that are the RGB component images of the color scene. (In practice,
RGB color image sensors usually integrate this process into a single device.)
Clearly, displaying these three RGB component images in the form shown in
Fig. 6.9(a) would yield an RGB color rendition of the original color scene. ■
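The cross-section construction of Example 6.1 is easy to reproduce. Below is a minimal sketch in Python with NumPy and Matplotlib (a tooling choice of ours, not the book's): fix R at 127 and let G and B each range over 0 to 255.

import numpy as np
import matplotlib.pyplot as plt

# Build the cross-sectional plane (127, G, B) of the RGB color cube.
G, B = np.meshgrid(np.arange(256), np.arange(256))
R = np.full_like(G, 127)                       # red fixed at mid-scale
plane = np.dstack((R, G, B)).astype(np.uint8)  # H x W x 3 RGB image

plt.imshow(plane)                              # display as in Fig. 6.9(a)
plt.title("Cross section (127, G, B) of the RGB cube")
plt.show()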
FIGURE 6.9 (a) Generating the RGB image of the cross-sectional color plane (127, G, B) by feeding the red, green, and blue component images into a color monitor. (b) The three hidden surface planes in the color cube of Fig. 6.8: (R = 0), (G = 0), and (B = 0).
While high-end display cards and monitors provide a reasonable rendition of
the colors in a 24-bit image, many systems in use today are limited to
256 colors. Also, there are numerous applications in which it simply makes no
sense to use more than a few hundred, and sometimes fewer, colors. A good
example of this is provided by the pseudocolor image processing techniques
discussed in Section 6.3. Given the variety of systems in current use, it is of
considerable interest to have a subset of colors that are likely to be repro-
duced faithfully, reasonably independently of viewer hardware capabilities.
This subset of colors is called the set of safe RGB colors, or the set of all-
systems-safe colors. In Internet applications, they are called safe Web colors or
safe browser colors.
On the assumption that 256 colors is the minimum number of colors that
can be reproduced faithfully by any system in which a desired result is likely to
be displayed, it is useful to have an accepted standard notation to refer to
these colors. Forty of these 256 colors are known to be processed differently by
various operating systems, leaving only 216 colors that are common to most
systems. These 216 colors have become the de facto standard for safe colors,
especially in Internet applications. They are used whenever it is desired that
the colors viewed by most people appear the same.
TABLE 6.1 Valid values of each RGB component in a safe color.

Number System | Color Equivalents
Hex           | 00 | 33 | 66  | 99  | CC  | FF
Decimal       | 0  | 51 | 102 | 153 | 204 | 255
Each of the 216 safe colors is formed from three RGB values as before, but
each value can only be 0, 51, 102, 153, 204, or 255. Thus, RGB triplets of these
values give us 6^3 = 216 possible values (note that all values are divisible by
3). It is customary to express these values in the hexadecimal number system, as
shown in Table 6.1. Recall that hex numbers 0, 1, 2, ..., 9, A, B, C, D, E, F
correspond to decimal numbers 0, 1, 2, ..., 9, 10, 11, 12, 13, 14, 15. Recall
also that (0)16 = (0000)2 and (F)16 = (1111)2. Thus, for example,
(FF)16 = (255)10 = (11111111)2 and we see that a grouping of two hex num-
bers forms an 8-bit byte.
Since it takes three numbers to form an RGB color, each safe color is
formed from three of the two digit hex numbers in Table 6.1. For example, the
purest red is FF0000. The values 000000 and FFFFFF represent black and
white, respectively. Keep in mind that the same result is obtained by using the
more familiar decimal notation. For instance, the brightest red in decimal no-
tation has R = 255 (FF) and G = B = 0.
Figure 6.10(a) shows the 216 safe colors, organized in descending RGB val-
ues. The square in the top left array has value FFFFFF (white), the second
square to its right has value FFFFCC, the third square has value FFFF99, and
FIGURE 6.10 (a) The 216 safe RGB colors. (b) All the grays in the 256-color RGB system (grays that are part of the safe color group are shown underlined).
so on for the first row. The second row of that same array has values FFCCFF,
FFCCCC, FFCC99, and so on. The final square of that array has value FF0000
(the brightest possible red). The second array to the right of the one just ex-
amined starts with value CCFFFF and proceeds in the same manner, as do the
other remaining four arrays. The final (bottom right) square of the last array
has value 000000 (black). It is important to note that not all possible 8-bit gray
colors are included in the 216 safe colors. Figure 6.10(b) shows the hex codes
for all the possible gray colors in a 256-color RGB system. Some of these val-
ues are outside of the safe color set but are represented properly (in terms of
their relative intensities) by most display systems. The grays from the safe
color group, (KKKKKK)16, for K = 0, 3, 6, 9, C, F, are shown underlined in
Fig. 6.10(b).
Figure 6.11 shows the RGB safe-color cube. Unlike the full-color cube in
Fig. 6.8, which is solid, the cube in Fig. 6.11 has valid colors only on the sur-
face planes. As shown in Fig. 6.10(a), each plane has a total of 36 colors, so
the entire surface of the safe-color cube is covered by 216 different colors, as
expected.
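Because every safe component is a multiple of 51 in decimal (33 in hex), snapping an arbitrary 8-bit color to its nearest safe color is a per-channel rounding. A minimal sketch (the function name and hex formatting are ours, not the book's):

def to_safe_color(r, g, b):
    # round each channel to the nearest of 0, 51, 102, 153, 204, 255
    snap = lambda v: 51 * round(v / 51)
    r, g, b = snap(r), snap(g), snap(b)
    return (r, g, b), "{:02X}{:02X}{:02X}".format(r, g, b)

print(to_safe_color(200, 130, 40))   # ((204, 153, 51), 'CC9933')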
6.2.2 The CMY and CMYK Color Models
As indicated in Section 6.1, cyan, magenta, and yellow are the secondary colors
of light or, alternatively, the primary colors of pigments. For example, when a
surface coated with cyan pigment is illuminated with white light, no red light is
reflected from the surface. That is, cyan subtracts red light from reflected white
light, which itself is composed of equal amounts of red, green, and blue light.
Most devices that deposit colored pigments on paper, such as color printers
and copiers, require CMY data input or perform an RGB to CMY conversion
internally. This conversion is performed using the simple operation
C = 1 − R
M = 1 − G        (6.2-1)
Y = 1 − B

FIGURE 6.11 The RGB safe-color cube.

where the assumption is that all color values have been normalized to the
range [0, 1]. Equation (6.2-1) demonstrates that light reflected from a
surface coated with pure cyan does not contain red (that is, C = 1 - R in the
equation). Similarly, pure magenta does not reflect green, and pure yellow
does not reflect blue. Equation (6.2-1) also reveals that RGB values can be
obtained easily from a set of CMY values by subtracting the individual CMY
values from 1. As indicated earlier, in image processing this color model is
used in connection with generating hardcopy output, so the inverse opera-
tion from CMY to RGB generally is of little practical interest.
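Equation (6.2-1) and its inverse are one-liners on normalized values. The sketch below (in Python with NumPy, our choice) also adds the common "maximum black" CMY-to-CMYK step, which is a widely used heuristic rather than a formula given in this section:

import numpy as np

def rgb_to_cmy(rgb):
    return 1.0 - np.asarray(rgb, dtype=float)       # Eq. (6.2-1)

def cmy_to_rgb(cmy):
    return 1.0 - np.asarray(cmy, dtype=float)       # subtract from 1 again

def cmy_to_cmyk(c, m, y):
    k = min(c, m, y)                                # extract the black component
    if k == 1.0:
        return 0.0, 0.0, 0.0, 1.0                   # pure black; avoid division by zero
    return (c - k) / (1 - k), (m - k) / (1 - k), (y - k) / (1 - k), k

print(rgb_to_cmy([1.0, 0.0, 0.0]))                  # red -> C = 0, M = Y = 1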
According to Fig. 6.4, equal amounts of the pigment primaries, cyan, ma-
genta, and yellow should produce black. In practice, combining these colors
for printing produces a muddy-looking black. So, in order to produce true
black (which is the predominant color in printing), a fourth color, black, is
added, giving rise to the CMYK color model. Thus, when publishers talk about
“four-color printing,” they are referring to the three colors of the CMY color
model plus black.

6.2.3 The HSI Color Model

When humans view a color object, we describe it by its hue, saturation, and
brightness. Recall from the discussion in Section 6.1 that hue is a color at-
tribute that describes a pure color (pure yellow, orange, or red), whereas satu-
ration gives a measure of the degree to which a pure color is diluted by white
light. Brightness is a subjective descriptor that is practically impossible to mea-
sure. It embodies the achromatic notion of intensity and is one of the key fac-
tors in describing color sensation. We do know that intensity (gray level) is a
most useful descriptor of monochromatic images. This quantity definitely is
measurable and easily interpretable. The model we are about to present, called
the HSI (hue, saturation, intensity) color model, decouples the intensity com-
ponent from the color-carrying information (hue and saturation) in a color
image. As a result, the HSI model is an ideal tool for developing image pro-
cessing algorithms based on color descriptions that are natural and intuitive to
humans, who, after all, are the developers and users of these algorithms. We
can summarize by saying that RGB is ideal for image color generation (as in
image capture by a color camera or image display in a monitor screen), but its
use for color description is much more limited. The material that follows pro-
vides an effective way to do this.
Figure 6.12(a) shows the RGB color cube tilted so that the intensity (gray)
axis joining the black and white vertices is vertical. The intensity component
of any color point can be determined by passing a plane that contains the
point and is perpendicular to the intensity axis; the intersection of that plane
with the intensity axis
would give us a point with intensity value in the range [0, 1]. We also note with a
little thought that the saturation (purity) of a color increases as a function of dis-
tance from the intensity axis. In fact, the saturation of points on the intensity axis
is zero, as evidenced by the fact that all points along this axis are gray.
In order to see how hue can be determined also from a given RGB point,
consider Fig. 6.12(b), which shows a plane defined by three points (black,
white, and cyan). The fact that the black and white points are contained in the
plane tells us that the intensity axis also is contained in the plane. Further-
more, we see that all points contained in the plane segment defined by the in-
tensity axis and the boundaries of the cube have the same hue (cyan in this
case). We would arrive at the same conclusion by recalling from Section 6.1
that all colors generated by three colors lie in the triangle defined by those col-
ors. If two of those points are black and white and the third is a color point, all
points on the triangle would have the same hue because the black and white
components cannot change the hue (of course, the intensity and saturation of
points in this triangle would be different). By rotating the shaded plane about
the vertical intensity axis, we would obtain different hues. From these concepts
we arrive at the conclusion that the hue, saturation, and intensity values re-
quired to form the HSI space can be obtained from the RGB color cube. That
is, we can convert any RGB point to a corresponding point in the HSI color
model by working out the geometrical formulas describing the reasoning out-
lined in the preceding discussion.
FIGURE 6.12 Conceptual relationships between the RGB and HSI color models.
The key point to keep in mind regarding the cube arrangement in Fig. 6.12
and its corresponding HSI color space is that the HSI space is represented by a
vertical intensity axis and the locus of color points that lie on planes
perpendicular to this axis. As the planes move up and down the intensity axis,
the boundaries defined by the intersection of each plane with the faces of the
cube have either a triangular or hexagonal shape. This can be visualized much
more readily by looking at the cube down its gray-scale axis, as shown in
Fig. 6.13(a). In this plane we see that the primary colors are separated by 120°.
The secondary colors are 60° from the primaries, which means that the angle
between secondaries also is 120°. Figure 6.13(b) shows the same hexagonal
shape and an arbitrary color point (shown as a dot). The hue of the point is de-
termined by an angle from some reference point. Usually (but not always) an
angle of 0° from the red axis designates 0 hue, and the hue increases counter-
clockwise from there. The saturation (distance from the vertical axis) is the
length of the vector from the origin to the point. Note that the origin is defined
by the intersection of the color plane with the vertical intensity axis. The impor-
tant components of the HSI color space are the vertical intensity axis, the
length of the vector to a color point, and the angle this vector makes with the
red axis. Therefore, it is not unusual to see the HSI planes defined in terms of
the hexagon just discussed, a triangle, or even a circle, as Figs. 6.13(c) and (d)
show. The shape chosen does not matter because any one of these shapes can
be warped into one of the other two by a geometric transformation. Figure 6.14
shows the HSI model based on color triangles and also on circles.
FIGURE 6.13 Hue and saturation in the HSI color model. The dot is an arbitrary color
point. The angle from the red axis gives the hue, and the length of the vector is the
saturation. The intensity of all colors in any of these planes is given by the position of
the plane on the vertical intensity axis.
FIGURE 6.14 The HSI color model based on (a) triangular and (b) circular color planes. The triangles and circles are perpendicular to the vertical intensity axis.
Converting colors from RGB to HSI

Given an image in RGB color format, the H component of each RGB pixel is
obtained using the equation

H = θ           if B ≤ G
H = 360° − θ    if B > G        (6.2-2)

with†

θ = cos⁻¹ { (1/2)[(R − G) + (R − B)] / [(R − G)² + (R − B)(G − B)]^(1/2) }

The saturation component is given by

S = 1 − [3/(R + G + B)] min(R, G, B)        (6.2-3)

Finally, the intensity component is given by

I = (1/3)(R + G + B)        (6.2-4)
It is assumed that the RGB values have been normalized to the range [0, 1]
and that angle u is measured with respect to the red axis of the HSI space, as
indicated in Fig. 6.13. Hue can be normalized to the range [0, 1] by dividing by
360° all values resulting from Eq. (6.2-2). The other two HSI components al-
ready are in this range if the given RGB values are in the interval [0, 1].
The results in Eqs. (6.2-2) through (6.2-4) can be derived from the geometry
shown in Figs. 6.12 and 6.13. The derivation is tedious and would not add sig-
nificantly to the present discussion.

Converting colors from HSI to RGB

Given values of HSI in the interval [0, 1], we now want to find the correspond-
ing RGB values in the same range. The applicable equations depend on the
values of H. There are three sectors of interest, corresponding to the 120° in-
tervals in the separation of primaries (see Fig. 6.13). We begin by multiplying
H by 360°, which returns the hue to its original range of [0°, 360°].
RG sector (0° ≤ H < 120°): When H is in this sector, the RGB components
are given by the equations

B = I(1 − S)        (6.2-5)

R = I [1 + S cos H / cos(60° − H)]        (6.2-6)

and

G = 3I − (R + B)        (6.2-7)
† It is good practice to add a small number in the denominator of this expression to avoid dividing by 0
when R = G = B, in which case θ will be 90°. Note that when all RGB components are equal, Eq. (6.2-3)
gives S = 0. In addition, the conversion from HSI back to RGB in Eqs. (6.2-5) through (6.2-7) will give
R = G = B = I, as expected, because when R = G = B, we are dealing with a gray-scale image.
GB sector (120° ≤ H < 240°): If the given value of H is in this sector, we
first subtract 120° from it:

H = H − 120°        (6.2-8)

Then the RGB components are

R = I(1 − S)        (6.2-9)

G = I [1 + S cos H / cos(60° − H)]        (6.2-10)

and

B = 3I − (R + G)        (6.2-11)

BR sector (240° ≤ H ≤ 360°): Finally, if H is in this range, we subtract 240°
from it:

H = H − 240°        (6.2-12)

Then the RGB components are

G = I(1 − S)        (6.2-13)

B = I [1 + S cos H / cos(60° − H)]        (6.2-14)

and

R = 3I − (G + B)        (6.2-15)
Uses of these equations for image processing are discussed in several of the
following sections.
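The sector equations translate directly into code. Here is a minimal per-pixel sketch in Python (our function names; vectorizing over whole images uses the same arithmetic). The epsilon guard follows the footnote to Eq. (6.2-2):

import math

def rgb_to_hsi(r, g, b, eps=1e-10):
    """RGB in [0, 1] to HSI; H is normalized to [0, 1] (divided by 360 degrees)."""
    num = 0.5 * ((r - g) + (r - b))
    den = math.sqrt((r - g) ** 2 + (r - b) * (g - b)) + eps   # epsilon per the footnote
    theta = math.degrees(math.acos(max(-1.0, min(1.0, num / den))))
    h = theta if b <= g else 360.0 - theta                    # Eq. (6.2-2)
    s = 1.0 - 3.0 * min(r, g, b) / (r + g + b + eps)          # Eq. (6.2-3)
    i = (r + g + b) / 3.0                                     # Eq. (6.2-4)
    return h / 360.0, s, i

def hsi_to_rgb(h, s, i):
    """HSI in [0, 1] back to RGB via the sector equations (6.2-5) through (6.2-15)."""
    h = (h * 360.0) % 360.0
    if h < 120.0:
        sector, order = h, "brg"           # RG sector: x = B, y = R, z = G
    elif h < 240.0:
        sector, order = h - 120.0, "rgb"   # GB sector: x = R, y = G, z = B
    else:
        sector, order = h - 240.0, "gbr"   # BR sector: x = G, y = B, z = R
    x = i * (1.0 - s)                                         # Eqs. (6.2-5), (6.2-9), (6.2-13)
    y = i * (1.0 + s * math.cos(math.radians(sector))
                 / math.cos(math.radians(60.0 - sector)))     # Eqs. (6.2-6), (6.2-10), (6.2-14)
    z = 3.0 * i - (x + y)                                     # Eqs. (6.2-7), (6.2-11), (6.2-15)
    d = dict(zip(order, (x, y, z)))
    return d["r"], d["g"], d["b"]

print(rgb_to_hsi(0.5, 0.25, 0.25))   # H near 0 (red), S = 0.25, I = 1/3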
EXAMPLE 6.2: The HSI values corresponding to the image of the RGB color cube.

■ Figure 6.15 shows the hue, saturation, and intensity images for the RGB
values shown in Fig. 6.8. Figure 6.15(a) is the hue image. Its most distinguishing
feature is the discontinuity in value along a 45° line in the front (red) plane of
the cube. To understand the reason for this discontinuity, refer to Fig. 6.8, draw
a line from the red to the white vertices of the cube, and select a point in the
middle of this line. Starting at that point, draw a path to the right, following the
cube around until you return to the starting point. The major colors encoun-
tered in this path are yellow, green, cyan, blue, magenta, and back to red. Ac-
cording to Fig. 6.13, the values of hue along this path should increase from 0°
FIGURE 6.15 HSI components of the image in Fig. 6.8. (a) Hue, (b) saturation, and (c) intensity images.
to 360° (i.e., from the lowest to highest possible values of hue). This is precise-
ly what Fig. 6.15(a) shows because the lowest value is represented as black and
the highest value as white in the gray scale. In fact, the hue image was original-
ly normalized to the range [0, 1] and then scaled to 8 bits; that is, it was con-
verted to the range [0, 255], for display.
The saturation image in Fig. 6.15(b) shows progressively darker values to-
ward the white vertex of the RGB cube, indicating that colors become less and
less saturated as they approach white. Finally, every pixel in the intensity
image shown in Fig. 6.15(c) is the average of the RGB values at the corre-
sponding pixel in Fig. 6.8. ■
Manipulating HSI component images
In the following discussion, we take a look at some simple techniques for ma-
nipulating HSI component images. This will help you develop familiarity with
these components and also help you deepen your understanding of the HSI color
model. Figure 6.16(a) shows an image composed of the primary and secondary
RGB colors. Figures 6.16(b) through (d) show the H, S, and I components of
this image, generated using Eqs. (6.2-2) through (6.2-4). Recall from the dis-
cussion earlier in this section that the gray-level values in Fig. 6.16(b) corre-
spond to angles; thus, for example, because red corresponds to 0°, the red
region in Fig. 6.16(a) is mapped to a black region in the hue image. Similarly,
the gray levels in Fig. 6.16(c) correspond to saturation (they were scaled to
[0, 255] for display), and the gray levels in Fig. 6.16(d) are average intensities.
FIGURE 6.16 (a) RGB image and the components of its corresponding HSI image: (b) hue, (c) saturation, and (d) intensity.
FIGURE 6.17 (a)–(c) Modified HSI component images. (d) Resulting RGB image. (See Fig. 6.16 for the original HSI images.)
To change the individual color of any region in the RGB image, we change
the values of the corresponding region in the hue image of Fig. 6.16(b). Then
we convert the new H image, along with the unchanged S and I images, back to
RGB using the procedure explained in connection with Eqs. (6.2-5) through
(6.2-15). To change the saturation (purity) of the color in any region, we follow
the same procedure, except that we make the changes in the saturation image
in HSI space. Similar comments apply to changing the average intensity of any
region. Of course, these changes can be made simultaneously. For example, the
image in Fig. 6.17(a) was obtained by changing to 0 the pixels corresponding to
the blue and green regions in Fig. 6.16(b). In Fig. 6.17(b) we reduced by half
the saturation of the cyan region in component image S from Fig. 6.16(c). In
Fig. 6.17(c) we reduced by half the intensity of the central white region in the
intensity image of Fig. 6.16(d). The result of converting this modified HSI
image back to RGB is shown in Fig. 6.17(d). As expected, we see in this figure
that the outer portions of all circles are now red; the purity of the cyan region
was diminished, and the central region became gray rather than white. Al-
though these results are simple, they illustrate clearly the power of the HSI
color model in allowing independent control over hue, saturation, and intensi-
ty, quantities with which we are quite familiar when describing colors.
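As a short usage sketch, halving the saturation of a cyan pixel with the hypothetical rgb_to_hsi and hsi_to_rgb helpers sketched earlier mirrors the Fig. 6.17(b) manipulation on a per-pixel basis:

h, s, i = rgb_to_hsi(0.0, 1.0, 1.0)            # a pure cyan pixel: H = 0.5, S = 1
r, g, b = hsi_to_rgb(h, 0.5 * s, i)            # halve S; H and I unchanged
print(round(r, 3), round(g, 3), round(b, 3))   # 0.333 0.833 0.833, a desaturated cyan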
6.3 Pseudocolor Image Processing

Pseudocolor (also called false color) image processing consists of assigning
colors to gray values based on a specified criterion. The term pseudo or false
color is used to differentiate the process of assigning colors to monochrome
images from the processes associated with true color images, a topic discussed
starting in Section 6.4. The principal use of pseudocolor is for human
visualization and interpretation of gray-scale events in an image or sequence
of images. As noted at the beginning of this chapter, one of the principal
motivations for using color is the fact that humans can discern thousands of
color shades and intensities, compared to only two dozen or so shades of gray.
6.3.1 Intensity Slicing

The technique of intensity (sometimes called density) slicing and color coding
is one of the simplest examples of pseudocolor image processing. If an image
is interpreted as a 3-D function (intensity versus spatial coordinates), the
method can be viewed as one
of placing planes parallel to the coordinate plane of the image; each plane then
“slices” the function in the area of intersection. Figure 6.18 shows an example of
using a plane at f(x, y) = li to slice the image function into two levels.
If a different color is assigned to each side of the plane shown in Fig. 6.18,
any pixel whose intensity level is above the plane will be coded with one color,
and any pixel below the plane will be coded with the other. Levels that lie on
the plane itself may be arbitrarily assigned one of the two colors. The result is
a two-color image whose relative appearance can be controlled by moving the
slicing plane up and down the intensity axis.
In general, the technique may be summarized as follows. Let [0, L − 1]
represent the gray scale, let level l0 represent black [f(x, y) = 0], and level
lL−1 represent white [f(x, y) = L − 1]. Suppose that P planes perpendicular
to the intensity axis are defined at levels l1, l2, ..., lP. Then, assuming that
0 < P < L − 1, the P planes partition the gray scale into P + 1 intervals,
V1, V2, ..., VP+1. Intensity-to-color assignments are made according to the re-
lation

f(x, y) = ck    if f(x, y) ∈ Vk        (6.3-1)

FIGURE 6.18 Geometric interpretation of the intensity-slicing technique.
FIGURE 6.19 An alternative representation of the intensity-slicing technique.
where ck is the color associated with the kth intensity interval Vk defined by
the partitioning planes at l = k - 1 and l = k.
The idea of planes is useful primarily for a geometric interpretation of the
intensity-slicing technique. Figure 6.19 shows an alternative representation
that defines the same mapping as in Fig. 6.18. According to the mapping func-
tion shown in Fig. 6.19, any input intensity level is assigned one of two colors,
depending on whether it is above or below the value of li. When more levels
are used, the mapping function takes on a staircase form.
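Equation (6.3-1) amounts to a lookup through interval indices, which the sketch below implements with NumPy's digitize; the slicing levels and colors are illustrative values of ours, not ones from the book:

import numpy as np

def intensity_slice(image, levels, colors):
    # levels: P increasing thresholds l1 < l2 < ... < lP
    # colors: P + 1 RGB triplets, one per interval V1 ... V(P+1)
    k = np.digitize(image, levels)                  # interval index of each pixel
    return np.asarray(colors, dtype=np.uint8)[k]    # Eq. (6.3-1): f(x, y) = ck

gray = np.tile(np.arange(256, dtype=np.uint8), (32, 1))      # a gray ramp image
out = intensity_slice(gray, levels=[64, 128, 192],
                      colors=[(0, 0, 255), (0, 255, 0), (255, 255, 0), (255, 0, 0)])
print(out.shape)                                    # (32, 256, 3)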
EXAMPLE 6.3: Intensity slicing.

■ A simple, but practical, use of intensity slicing is shown in Fig. 6.20. Figure
6.20(a) is a monochrome image of the Picker Thyroid Phantom (a radiation
test pattern), and Fig. 6.20(b) is the result of intensity slicing this image into
eight color regions. Regions that appear of constant intensity in the mono-
chrome image are really quite variable, as shown by the various colors in the
sliced image. The left lobe, for instance, is a dull gray in the monochrome
image, and picking out variations in intensity is difficult. By contrast, the color
image clearly shows eight different regions of constant intensity, one for each
of the colors used. ■
FIGURE 6.20 (a) Monochrome image of the Picker Thyroid Phantom. (b) Result of intensity slicing into eight colors.
In the preceding simple example, the gray scale was divided into intervals and
a different color was assigned to each region, without regard for the meaning of
the gray levels in the image. Interest in that case was simply to view the different
gray levels constituting the image. Intensity slicing assumes a much more mean-
ingful and useful role when subdivision of the gray scale is based on physical
characteristics of the image. For instance, Fig. 6.21(a) shows an X-ray image of a
weld (the horizontal dark region) containing several cracks and porosities (the
bright, white streaks running horizontally through the middle of the image). It
is known that when there is a porosity or crack in a weld, the full strength of the
X-rays going through the object saturates the imaging sensor on the other side of
the object.Thus, intensity values of 255 in an 8-bit image coming from such a sys-
tem automatically imply a problem with the weld. If a human were to be the ulti-
mate judge of the analysis, and manual processes were employed to inspect welds
(still a common procedure today), a simple color coding that assigns one color to
FIGURE 6.21 (a) Monochrome X-ray image of a weld. (b) Result of color coding. (Original image courtesy of X-TEK Systems, Ltd.)
level 255 and another to all other intensity levels would simplify the inspector’s
job considerably. Figure 6.21(b) shows the result. No explanation is required to
arrive at the conclusion that human error rates would be lower if images were
displayed in the form of Fig. 6.21(b), instead of the form shown in Fig. 6.21(a). In
other words, if the exact intensity value or range of values one is looking for is
known, intensity slicing is a simple but powerful aid in visualization, especially if
numerous images are involved. The following is a more complex example.
EXAMPLE 6.4: Use of color to highlight rainfall levels.

■ Measurement of rainfall levels, especially in the tropical regions of the
Earth, is of interest in diverse applications dealing with the environment. Accu-
rate measurements using ground-based sensors are difficult and expensive to
acquire, and total rainfall figures are even more difficult to obtain because a
significant portion of precipitation occurs over the ocean. One approach for ob-
taining rainfall figures is to use a satellite. The TRMM (Tropical Rainfall Mea-
suring Mission) satellite utilizes, among others, three sensors specially designed
to detect rain: a precipitation radar, a microwave imager, and a visible and in-
frared scanner (see Sections 1.3 and 2.3 regarding image sensing modalities).
The results from the various rain sensors are processed, resulting in esti-
mates of average rainfall over a given time period in the area monitored by the
sensors. From these estimates, it is not difficult to generate gray-scale images
whose intensity values correspond directly to rainfall, with each pixel repre-
senting a physical land area whose size depends on the resolution of the sen-
sors. Such an intensity image is shown in Fig. 6.22(a), where the area monitored
by the satellite is the slightly lighter horizontal band in the middle one-third of
the picture (these are the tropical regions). In this particular example, the rain-
fall values are average monthly values (in inches) over a three-year period.
Visual examination of this picture for rainfall patterns is quite difficult, if
not impossible. However, suppose that we code intensity levels from 0 to 255
using the colors shown in Fig. 6.22(b). Values toward the blues signify low val-
ues of rainfall, with the opposite being true for red. Note that the scale tops out
at pure red for values of rainfall greater than 20 inches. Figure 6.22(c) shows
the result of color coding the gray image with the color map just discussed. The
results are much easier to interpret, as shown in this figure and in the zoomed
area of Fig. 6.22(d). In addition to providing global coverage, this type of data
allows meteorologists to calibrate ground-based rain monitoring systems with
greater precision than would otherwise be possible. ■
FIGURE 6.22 (a) Gray-scale image in which intensity (in the lighter horizontal band shown) corresponds to
average monthly rainfall. (b) Colors assigned to intensity values. (c) Color-coded image. (d) Zoom of the
South American region. (Courtesy of NASA.)
6.3.2 Intensity to Color Transformations

Other types of transformations are more general, and thus are capable of
achieving a wider range of pseudocolor enhancement results than the simple
slicing technique discussed in the preceding section. An approach that is
particularly attractive is shown in Fig. 6.23: perform three independent
transformations on the intensity of any input pixel and feed the three results
into the red, green, and blue channels of a color monitor.

FIGURE 6.23 Functional block diagram for pseudocolor image processing. fR, fG, and fB are fed into the corresponding red, green, and blue inputs of an RGB color monitor.
EXAMPLE 6.5: Use of pseudocolor for highlighting explosives contained in luggage.

■ Figure 6.24(a) shows two monochrome images of luggage obtained from an
airport X-ray scanning system. The image on the left contains ordinary articles.
The image on the right contains the same articles, as well as a block of simulated
plastic explosives. The purpose of this example is to illustrate the use of intensi-
ty level to color transformations to obtain various degrees of enhancement.
Figure 6.25 shows the transformation functions used. These sinusoidal func-
tions contain regions of relatively constant value around the peaks as well as
regions that change rapidly near the valleys. Changing the phase and frequen-
cy of each sinusoid can emphasize (in color) ranges in the gray scale. For in-
stance, if all three transformations have the same phase and frequency, the
output image will be monochrome. A small change in the phase between the
three transformations produces little change in pixels whose intensities corre-
spond to peaks in the sinusoids, especially if the sinusoids have broad profiles
(low frequencies). Pixels with intensity values in the steep section of the sinu-
soids are assigned a much stronger color content as a result of significant dif-
ferences between the amplitudes of the three sinusoids caused by the phase
displacement between them.
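A minimal sketch of the sinusoidal intensity-to-color idea follows (in Python with NumPy, our tooling). The frequency and phases are illustrative knobs chosen to mimic the behavior just described: equal phases give a monochrome result, and phase offsets color the steep regions most strongly.

import numpy as np

def sinusoidal_pseudocolor(image, freq, phases, L=256):
    f = image.astype(float) / (L - 1)                           # normalize intensities
    channels = [0.5 * (1.0 + np.sin(2 * np.pi * freq * f + p))  # one sinusoid per channel
                for p in phases]
    return (np.stack(channels, axis=-1) * (L - 1)).astype(np.uint8)

gray = np.tile(np.arange(256, dtype=np.uint8), (32, 1))
color = sinusoidal_pseudocolor(gray, freq=3.0, phases=(0.0, 0.6, 1.2))
print(color.shape)    # (32, 256, 3)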
FIGURE 6.24 Pseudocolor enhancement by using the gray level to color transformations in Fig. 6.25. (Original image courtesy of Dr. Mike Hurwitz, Westinghouse.)
FIGURE 6.25 Transformation functions used to obtain the images in Fig. 6.24. Each set plots the red, green, and blue transformations against intensity, with the gray-level bands corresponding to the explosive, garment bag, and background marked along the intensity axis.
The image shown in Fig. 6.24(b) was obtained with the transformation
functions in Fig. 6.25(a), which shows the gray-level bands corresponding to
the explosive, garment bag, and background, respectively. Note that the ex-
plosive and background have quite different intensity levels, but they were
both coded with approximately the same color as a result of the periodicity of
the sine waves. The image shown in Fig. 6.24(c) was obtained with the trans-
formation functions in Fig. 6.25(b). In this case the explosives and garment
bag intensity bands were mapped by similar transformations and thus re-
ceived essentially the same color assignments. Note that this mapping allows
an observer to “see” through the explosives. The background mappings were
about the same as those used for Fig. 6.24(b), producing almost identical color
assignments. ■
FIGURE 6.26 Functional block diagram for pseudocolor image processing based on several monochrome images: each input fK(x, y) is processed by its own transformation TK to produce gK(x, y), and additional processing combines the results into the images fed to an RGB display.
The approach shown in Fig. 6.23 is based on a single monochrome image.
Often, it is of interest to combine several monochrome images into a single
color composite, as shown in Fig. 6.26. A frequent use of this approach (illus-
trated in Example 6.6) is in multispectral image processing, where different
sensors produce individual monochrome images, each in a different spectral
band. The types of additional processes shown in Fig. 6.26 can be techniques
such as color balancing (see Section 6.5.4), combining images, and selecting
the three images for display based on knowledge about response characteris-
tics of the sensors used to generate the images.
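For the simplest case, the Fig. 6.26 idea reduces to stacking three registered monochrome bands into an RGB composite; swapping the near-infrared band into the red channel gives the Fig. 6.27(f) effect. A sketch, with random arrays standing in for real sensor bands:

import numpy as np

def composite(red_band, green_band, blue_band):
    return np.dstack((red_band, green_band, blue_band))    # H x W x 3 composite

# placeholder band images standing in for registered sensor data
b1, b2, b3, b4 = (np.random.randint(0, 256, (64, 64), dtype=np.uint8) for _ in range(4))

rgb  = composite(b3, b2, b1)    # natural color: visible red, green, blue bands
nirc = composite(b4, b2, b1)    # near-infrared in the red channel highlights biomass
print(rgb.shape, nirc.shape)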
EXAMPLE 6.6: Color coding of multispectral images.

■ Figures 6.27(a) through (d) show four spectral satellite images of Washing-
ton, D.C., including part of the Potomac River. The first three images are in the
visible red, green, and blue, and the fourth is in the near infrared (see Table 1.1
and Fig. 1.10). Figure 6.27(e) is the full-color image obtained by combining the
first three images into an RGB image. Full-color images of dense areas are dif-
ficult to interpret, but one notable feature of this image is the difference in
color in various parts of the Potomac River. Figure 6.27(f) is a little more in-
teresting. This image was formed by replacing the red component of Fig. 6.27(e)
with the near-infrared image. From Table 1.1, we know that this band is strong-
ly responsive to the biomass components of a scene. Figure 6.27(f) shows quite
clearly the difference between biomass (in red) and the human-made features
in the scene, composed primarily of concrete and asphalt, which appear bluish
in the image.
V
FIGURE 6.27 (a)–(d) Images in bands 1–4 in Fig. 1.10 (see Table 1.1). (e) Color composite image obtained by treating (a), (b), and (c) as the red, green, blue components of an RGB image. (f) Image obtained in the same manner, but using in the red channel the near-infrared image in (d). (Original multispectral images courtesy of NASA.)
FIGURE 6.28 (a) Pseudocolor rendition of Jupiter Moon Io. (b) A close-up. (Courtesy of NASA.)
from an active volcano on Io, and the surrounding yellow materials are older
sulfur deposits. This image conveys these characteristics much more readily
than would be possible using a monochrome image. ■

7 Wavelets and Multiresolution Processing

This chapter introduces wavelets and multiresolution processing, concentrating
on the areas in which they have received considerable attention.

7.1 Background
When we look at images, generally we see connected regions of similar texture
and intensity levels that combine to form objects. If the objects are small in
size or low in contrast, we normally examine them at high resolutions; if they
are large in size or high in contrast, a coarse view is all that is required. If both
lo
small and large objects—or low- and high-contrast objects—are present simul-
taneously, it can be advantageous to study them at several resolutions. This, of
course, is the fundamental motivation for multiresolution processing.
From a mathematical viewpoint, images are two-dimensional arrays of inten-
sity values with locally varying statistics that result from different combinations
of abrupt features like edges and contrasting homogeneous regions. As illustrated
in Fig. 7.1—an image that will be examined repeatedly in the remainder of the
chapter—local histograms can vary significantly from one part of an image to
another, making statistical modeling over the span of an entire image difficult.
(Local histograms are histograms of the pixels in a neighborhood; see Section 3.3.3.)

FIGURE 7.1 An image and its local histogram variations.
7.1.1 Image Pyramids

A powerful, yet conceptually simple structure for representing images at more
than one resolution is the image pyramid. An image pyramid is a collection of
decreasing-resolution images arranged in the shape of a pyramid. As Fig. 7.2(a)
shows, the base of the pyramid contains a high-resolution representation of the
image being processed; the apex con-
tains a low-resolution approximation. As you move up the pyramid, both size
and resolution decrease. Base level J is of size 2^J × 2^J or N × N, where
J = log2 N, apex level 0 is of size 1 × 1, and general level j is of size 2^j × 2^j,
where 0 ≤ j ≤ J. Although the pyramid shown in Fig. 7.2(a) is composed of
J + 1 resolution levels from 2^J × 2^J to 2^0 × 2^0, most image pyramids are trun-
cated to P + 1 levels, where 1 ≤ P ≤ J and j = J − P, ..., J − 2, J − 1, J.
That is, we normally limit ourselves to P reduced-resolution approximations of
the original image; a 1 × 1 (i.e., single pixel) approximation of a 512 × 512
image, for example, is of little value. The total number of pixels in a (P + 1)-level
pyramid for P > 0 is

N² (1 + 1/4¹ + 1/4² + ... + 1/4^P) ≤ (4/3) N²
Figure 7.2(b) shows a simple system for constructing two intimately related
image pyramids. The level j − 1 approximation output provides the images
needed to build the approximation pyramid, while the level j prediction
residual output is used to build the complementary prediction residual pyramid.

FIGURE 7.2 (a) An image pyramid. (b) A simple system for creating approximation and prediction residual pyramids, built from an approximation filter, a downsampler and upsampler (rows and columns), and an interpolation filter.
(Prediction residuals can be coded more efficiently than 2-D intensity arrays.)
Both pyramids are computed in an iterative fashion. Before the first iteration, the image
to be represented in pyramidal form is placed in level J of the approximation
pyramid. The following three-step procedure is then executed P times—for
j = J, J − 1, ..., and J − P + 1 (in that order):

Step 1. Compute a reduced-resolution approximation of the level j input
image by filtering and downsampling; place the result in level j − 1 of the
approximation pyramid.

Step 2. Create an estimate of the level j input image from the reduced-resolution
approximation of step 1 by upsampling and filtering. The resulting prediction
image has the same dimensions as the level j input image.

Step 3. Compute the difference between the prediction image of step 2 and the
input to step 1; place this prediction residual in level j of the prediction
residual pyramid.

A variety of approximation and interpolation filters can be incorporated
into the system of Fig. 7.2(b). Typically, the filtering is performed in the spatial
domain (see Section 3.4). Useful approximation filtering techniques include
neighborhood averaging (see Section 3.5.1), which produces mean pyramids;
lowpass Gaussian filtering (see Sections 4.7.4 and 4.8.3), which produces
Gaussian pyramids; and no filtering, which results in subsampling pyramids.
Any of the interpolation methods described in Section 2.4.4, including nearest
neighbor, bilinear, and bicubic, can be incorporated into the interpolation fil-
ter. Finally, we note that the upsampling and downsampling blocks of Fig.
7.2(b) are used to double and halve the spatial dimensions of the approxima-
tion and prediction images that are computed. Given an integer variable n and
1-D sequence of samples f(n), the upsampled sequence f2↑(n) is defined as
f2↑(n) = f(n/2) if n is even; 0 otherwise        (7.1-1)

where, as is indicated by the subscript, the upsampling is by a factor of 2. The
complementary operation of downsampling by 2 is defined as

f2↓(n) = f(2n)        (7.1-2)

(In this chapter we work with both continuous and discrete functions and
variables. With the notable exception of 2-D image f(x, y), and unless otherwise
noted, x, y, z, ... are continuous variables; i, j, k, l, m, n, ... are discrete variables.)
The upsampling and downsampling blocks in Fig. 7.2(b)
are annotated to indicate that both the rows and columns of the 2-D inputs on
which they operate are to be up- and downsampled. Like the separable 2-D DFT
in Section 4.11.1, 2-D upsampling and downsampling can be performed by suc-
cessive passes of the 1-D operations defined in Eqs. (7.1-1) and (7.1-2).
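Equations (7.1-1) and (7.1-2) in code, a minimal 1-D sketch in Python with NumPy (our choice); for 2-D arrays the same operations are applied separably to rows and then columns:

import numpy as np

def upsample2(f):                        # Eq. (7.1-1): zero insertion
    out = np.zeros(2 * len(f))
    out[::2] = f                         # even indices carry f(n/2)
    return out

def downsample2(f):                      # Eq. (7.1-2): keep every other sample
    return np.asarray(f, dtype=float)[::2]

f = np.array([1.0, 2.0, 3.0, 4.0])
print(upsample2(f))                      # [1. 0. 2. 0. 3. 0. 4. 0.]
print(downsample2(upsample2(f)))         # [1. 2. 3. 4.]: downsampling undoes upsampling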
EXAMPLE 7.1: Approximation and prediction residual pyramids.

■ Figure 7.3 shows both an approximation pyramid and a prediction residual
pyramid for the vase of Fig. 7.1. A lowpass Gaussian smoothing filter (see
Section 4.7.4) was used to produce the four-level approximation pyramid in
Fig. 7.3(a). As you can see, the resulting pyramid contains the original
512 × 512 resolution image (at its base) and three low-resolution approxima-
tions (of resolution 256 × 256, 128 × 128, and 64 × 64). Thus, P is 3 and levels
9, 8, 7, and 6 out of a possible log2(512) + 1 = 10 levels are present. Note the
reduction in detail that accompanies the lower resolutions of the pyramid. The
level 6 (i.e., 64 * 64) approximation image is suitable for locating the window
stiles (i.e., the window pane framing), for example, but not for finding the stems
of the plant. In general, the lower-resolution levels of a pyramid can be used for
the analysis of large structures or overall image context; the high-resolution
images are appropriate for analyzing individual object characteristics. The
prediction residual pyramid in Fig. 7.3(b) can be used, together with its
top-level approximation, to rebuild the approximation pyramid: we begin with
the level 6 64 × 64 approximation image (the only approxima-
tion image in the prediction residual pyramid), predict the level 7 128 × 128 res-
olution approximation (by upsampling and filtering), and add the level 7
prediction residual. This process is repeated using successively computed ap-
proximation images until the original 512 * 512 image is generated. Note that
the prediction residual histogram in Fig. 7.3(b) is highly peaked around zero; the
approximation histogram in Fig. 7.3(a) is not. Unlike approximation images, pre-
diction residual images can be highly compressed by assigning fewer bits to the
more probable values (see the variable-length codes of Section 8.2.1). Finally, we
note that the prediction residuals in Fig. 7.3(b) are scaled to make small predic-
tion errors more visible; the prediction residual histogram, however, is based on
the original residual values, with level 128 representing zero error. ■
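One iteration of the Fig. 7.2(b) system can be sketched in a few lines. In the sketch below (ours, not the book's), a 3 × 3 box filter stands in for both the approximation and interpolation filters (a Gaussian is the typical choice), and pixel replication serves as the upsampler; reconstruction is exact because the residual absorbs all prediction error:

import numpy as np

def blur(img):                                    # 3 x 3 box filter (stand-in for Gaussian)
    p = np.pad(img, 1, mode="edge")
    return sum(p[i:i + img.shape[0], j:j + img.shape[1]]
               for i in range(3) for j in range(3)) / 9.0

def pyramid_step(level_j):
    approx = blur(level_j)[::2, ::2]                     # step 1: filter, then downsample
    predicted = blur(np.kron(approx, np.ones((2, 2))))   # step 2: upsample, then filter
    residual = level_j - predicted                       # step 3: prediction residual
    return approx, residual

img = np.random.rand(64, 64)
approx, residual = pyramid_step(img)
rebuilt = blur(np.kron(approx, np.ones((2, 2)))) + residual
print(np.allclose(rebuilt, img))                  # True: level j is recovered exactly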
FIGURE 7.3 Two image pyramids and their histograms: (a) an approximation pyramid; (b) a prediction residual pyramid. The approximation pyramid in (a) is called a Gaussian pyramid because a Gaussian filter was used to construct it. The prediction residual pyramid in (b) is often called a Laplacian pyramid; note the similarity in appearance with the Laplacian filtered images in Chapter 3.
7.1.2 Subband Coding

Another imaging technique with ties to multiresolution analysis is subband
coding, in which a signal is decomposed into a set of bandlimited components,
called subbands, from which the original can be reconstructed. The decomposition
and reconstruction are performed with digital filters. The filter of Fig. 7.4(a) is
built from unit delays, multipliers, and adders; each unit delay shifts its input by
one sample, so that, for example,

f(n − 2) = { ..., f(0) for n = 2, f(1) for n = 2 + 1 = 3, ... }
As the grayed annotations in Fig. 7.4(a) indicate, input sequence f(n) =
f(n - 0) and the K - 1 delayed sequences at the outputs of the unit delays,
denoted f(n − 1), f(n − 2), ..., f(n − K + 1), are multiplied by constants
h(0), h(1), ..., h(K − 1), respectively, and summed to produce the filtered
output sequence
f̂(n) = Σ_{k=−∞}^{∞} h(k) f(n − k) = f(n) ★ h(n)        (7.1-3)

where ★ denotes convolution. (If the coefficients of the filter in Fig. 7.4(a) are
indexed using values of n between 0 and K − 1, as we have done, the limits on
the sum in Eq. (7.1-3) can be reduced to 0 to K − 1, like Eq. (4.4-10).) Note
that—except for a change in variables—Eq. (7.1-3) is equivalent to the discrete
convolution defined in Eq. (4.4-10) of Chapter 4. The K multiplication
constants in Fig. 7.4(a) and Eq. (7.1-3) are
FIGURE 7.4 (a) A digital filter; (b) a unit discrete impulse sequence; and (c) the impulse response of the filter.
called filter coefficients. Each coefficient defines a filter tap, which can be
thought of as the components needed to compute one term of the sum in Eq.
(7.1-3), and the filter is said to be of order K.
If the input to the filter of Fig. 7.4(a) is the unit discrete impulse of
Fig. 7.4(b) and Section 4.2.3, Eq. (7.1-3) becomes
f̂(n) = Σ_{k=−∞}^{∞} h(k) δ(n − k) = h(n)        (7.1-4)
That is, by substituting δ(n) for input f(n) in Eq. (7.1-3) and making use of
the sifting property of the unit discrete impulse as defined in Eq. (4.2-13), we
find that the impulse response of the filter in Fig. 7.4(a) is the K-element se-
quence of filter coefficients that define the filter. Physically, the unit impulse
is shifted from left to right across the top of the filter (from one unit delay to
the next), producing an output that assumes the value of the coefficient at
the location of the delayed impulse. Because there are K coefficients, the im-
pulse response is of length K and the filter is called a finite impulse response
(FIR) filter.
(In the remainder of the chapter, “filter h(n)” will be used to refer to the filter whose impulse response is h(n).)
lo
Figure 7.5 shows the impulse responses of six functionally related filters. Fil-
ter h2(n) in Fig. 7.5(b) is a sign-reversed (i.e., reflected about the horizontal
axis) version of h1(n) in Fig. 7.5(a). That is,

h2(n) = −h1(n)        (7.1-5)
FIGURE 7.5 Six functionally related filter impulse responses: (a) reference response; (b) sign reversal; (c) and (d) order reversal (differing by the delay introduced); (e) modulation; and (f) order reversal and modulation.
Filters h3(n) and h4(n) in Figs. 7.5(c) and (d) are order-reversed versions of
h1(n):

h3(n) = h1(−n)        (7.1-6)

h4(n) = h1(K − 1 − n)        (7.1-7)

(Order reversal is often called time reversal when the input sequence is a
sampled analog signal.) Filter h3(n) is a reflection of h1(n) about the vertical
axis; filter h4(n) is a reflected and translated (i.e., shifted) version of h1(n).
Neglecting translation,
the responses of the two filters are identical. Filter h5(n) in Fig. 7.5(e), which is
defined as

h5(n) = (−1)^n h1(n)        (7.1-8)

is a modulated version of h1(n). Finally, filter h6(n) in Fig. 7.5(f) combines
order reversal and modulation:

h6(n) = (−1)^n h1(K − 1 − n)        (7.1-9)

This sequence is included to illustrate the fact that sign reversal, order rever-
sal, and modulation are sometimes combined in the specification of the rela-
tionship between two filters.
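The six relationships in Eqs. (7.1-5) through (7.1-9) reduce to sign flips, array reversals, and alternating signs (a sketch over an arbitrary even-length reference response of our choosing):

import numpy as np

h1 = np.array([1.0, 3.0, -2.0, 0.5])    # reference response, K = 4
n = np.arange(len(h1))

h2 = -h1                                # Eq. (7.1-5): sign reversal
h3 = h1[::-1]                           # Eq. (7.1-6): h1(-n), reflection (up to a shift)
h4 = h1[::-1]                           # Eq. (7.1-7): h1(K - 1 - n), same samples as h3
h5 = (-1.0) ** n * h1                   # Eq. (7.1-8): modulation
h6 = (-1.0) ** n * h1[::-1]             # Eq. (7.1-9): order reversal and modulation
print(h5, h6)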
With this brief introduction to digital signal filtering, consider the two-band
subband coding and decoding system in Fig. 7.6(a). As indicated in the figure,
the system is composed of two filter banks, each containing two FIR filters of
the type shown in Fig. 7.4(a). (A filter bank is a collection of two or more
filters.) Note that each of the four FIR filters is depicted
tu
flp(n) a
b
h0(n) 2T 2c g0(n)
FIGURE 7.6
(a) A two-band
f(n) Analysis filter bank Synthesis filter bank fˆ(n) subband coding
and decoding
system, and (b) its
V
h1(n) 2T 2c g1(n)
spectrum splitting
properties.
fhp(n)
H0() H1()
0 /2
www.EBooksWorld.ir
470 Chapter 7 ■ Wavelets and Multiresolution Processing
as a single block in Fig. 7.6(a), with the impulse response of each filter (and the
convolution symbol) written inside it. The analysis filter bank, which includes
filters h0 (n) and h1(n), is used to break input sequence f(n) into two half-
length sequences flp(n) and fhp(n), the subbands that represent the input. Note
that filters h0 (n) and h1(n) are half-band filters whose idealized transfer char-
acteristics, H0 and H1, are shown in Fig. 7.6(b). Filter h0 (n) is a lowpass filter
whose output, subband flp(n), is called an approximation of f(n); filter h1(n) is
a highpass filter whose output, subband fhp (n), is called the high frequency or
detail part of f(n). Synthesis bank filters g0(n) and g1(n) combine flp(n) and fhp(n) to produce f̂(n). The goal in subband coding is to select h0(n), h1(n), g0(n), and g1(n) so that f̂(n) = f(n)—that is, so that the input and output of the subband coding and decoding system are identical. When this is accomplished,
the resulting system is said to employ perfect reconstruction filters.
There are many two-band, real-coefficient, FIR, perfect reconstruction filter banks described in the filter bank literature. (By real-coefficient, we mean that the filter coefficients are real, not complex, numbers.) In all of them, the synthesis filters are modulated versions of the analysis filters—with one (and only one)
synthesis filter being sign reversed as well. For perfect reconstruction, the impulse responses of the synthesis and analysis filters must be related in one of the following two ways:
$g_0(n) = (-1)^n h_1(n)$,  $g_1(n) = (-1)^{n+1} h_0(n)$   (7.1-10)
or
$g_0(n) = (-1)^{n+1} h_1(n)$,  $g_1(n) = (-1)^n h_0(n)$   (7.1-11)
Equations (7.1-10) through (7.1-14) are described in detail in the filter bank literature (see, for example, Vetterli and Kovacevic [1995]).
Filters h0 (n), h1(n), g0 (n), and g1(n) in Eqs. (7.1-10) and (7.1-11) are said to be
cross-modulated because diagonally opposed filters in the block diagram of
Fig. 7.6(a) are related by modulation [and sign reversal when the modulation
factor is $-(-1)^n$ or $(-1)^{n+1}$]. Moreover, they can be shown to satisfy the following biorthogonality condition:
$\langle h_i(2n-k),\, g_j(k) \rangle = \delta(i-j)\,\delta(n)$,  $i, j = \{0, 1\}$   (7.1-12)
Here, $\langle h_i(2n-k), g_j(k) \rangle$ denotes the inner product of $h_i(2n-k)$ and $g_j(k)$.† When i is not equal to j, the inner product is 0; when i and j are equal, the product is the unit discrete impulse function, δ(n). Biorthogonality will be considered again in Section 7.2.1.
Of special interest in subband coding—and in the development of the fast wavelet transform of Section 7.4—are filters that move beyond biorthogonality and require
$\langle g_i(n),\, g_j(n+2m) \rangle = \delta(i-j)\,\delta(m)$,  $i, j = \{0, 1\}$   (7.1-13)
which defines orthonormality for perfect reconstruction filter banks. In addition to Eq. (7.1-13), orthonormal filters can be shown to satisfy
$g_1(n) = (-1)^n g_0(K_{even} - 1 - n)$,  $h_i(n) = g_i(K_{even} - 1 - n)$,  $i = \{0, 1\}$   (7.1-14)
† The vector inner product of sequences $f_1(n)$ and $f_2(n)$ is $\langle f_1, f_2 \rangle = \sum_n f_1^*(n) f_2(n)$, where the * denotes the complex conjugate operation. If $f_1(n)$ and $f_2(n)$ are real, $\langle f_1, f_2 \rangle = \langle f_2, f_1 \rangle$.
where the subscript on $K_{even}$ is used to indicate that the number of filter coefficients must be divisible by 2 (i.e., an even number). As Eq. (7.1-14) indicates,
synthesis filter g1 is related to g0 by order reversal and modulation. In addi-
tion, both h0 and h1 are order-reversed versions of synthesis filters, g0 and g1,
respectively. Thus, an orthonormal filter bank can be developed around the
impulse response of a single filter, called the prototype; the remaining filters
can be computed from the specified prototype’s impulse response. For
biorthogonal filter banks, two prototypes are required; the remaining filters
can be computed via Eq. (7.1-10) or (7.1-11). The generation of useful prototype filters, whether orthonormal or biorthogonal, is beyond the scope of this
chapter. We simply use filters that have been presented in the literature and
provide references for further study.
Before concluding the section with a 2-D subband coding example, we note
that 1-D orthonormal and biorthogonal filters can be used as 2-D separable
filters for the processing of images. As can be seen in Fig. 7.7, the separable filters are first applied in one dimension (e.g., vertically) and then in the other
(e.g., horizontally) in the manner introduced in Section 2.6.7. Moreover, down-
sampling is performed in two stages—once before the second filtering operation to reduce the overall number of computations.

FIGURE 7.7 A two-dimensional, four-band filter bank for subband image coding.

The resulting filtered
outputs, denoted a(m, n), dV(m, n), dH(m, n), and dD(m, n) in Fig. 7.7, are called the approximation, vertical detail, horizontal detail, and diagonal detail
subbands of the input image, respectively. These subbands can be split into
four smaller subbands, which can be split again, and so on—a property that
will be described in greater detail in Section 7.4.
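To make the mechanics of Fig. 7.7 concrete, here is a minimal Python sketch of one level of the separable four-band split (assuming NumPy; the 2-tap Haar analysis filters of Eq. (7.1-18) and the row/column ordering are illustrative choices, not the book's own code):

```python
import numpy as np

h0 = np.array([1.0, 1.0]) / np.sqrt(2)    # lowpass analysis filter
h1 = np.array([1.0, -1.0]) / np.sqrt(2)   # highpass analysis filter

def filter_and_downsample_rows(x, h):
    """Convolve every row with h, then discard every other column (2x downsampling)."""
    y = np.apply_along_axis(lambda r: np.convolve(r, h), 1, x)
    return y[:, 1::2]

def four_band_split(f):
    """One level of the separable four-band decomposition of Fig. 7.7."""
    lo = filter_and_downsample_rows(f, h0)       # filter along one dimension...
    hi = filter_and_downsample_rows(f, h1)
    a  = filter_and_downsample_rows(lo.T, h0).T  # ...then along the other
    dV = filter_and_downsample_rows(lo.T, h1).T  # vertical detail
    dH = filter_and_downsample_rows(hi.T, h0).T  # horizontal detail
    dD = filter_and_downsample_rows(hi.T, h1).T  # diagonal detail
    return a, dV, dH, dD
```

Which detail subband is labeled horizontal versus vertical depends on the order in which the two dimensions are processed; repeating the split on the approximation a(m, n) produces the multistage decompositions described in Section 7.4.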
EXAMPLE 7.2: A four-band subband coding of the vase in Fig. 7.1.
■ Figure 7.8 shows the impulse responses of four 8-tap orthonormal filters. The coefficients of prototype synthesis filter g0(n) for 0 ≤ n ≤ 7 [in Fig. 7.8(c)] are defined in Table 7.1 (Daubechies [1992]). The coefficients of the remaining orthonormal filters can be computed using Eq. (7.1-14). With the help of Fig. 7.5, note (by visual inspection) the cross modulation of the analysis and synthesis filters in Fig. 7.8. It is relatively easy to show numerically that the filters are
TABLE 7.1 Daubechies 8-tap orthonormal filter coefficients for g0(n) (Daubechies [1992]).

n    g0(n)
0    0.23037781
1    0.71484657
2    0.63088076
3   -0.02798376
4   -0.18703481
5    0.03084138
6    0.03288301
7   -0.01059740
FIGURE 7.8 The impulse responses of four 8-tap Daubechies orthonormal filters: (a) h0(n); (b) h1(n); (c) g0(n); (d) g1(n). See Table 7.1 for the values of g0(n) for 0 ≤ n ≤ 7.
FIGURE 7.9 A four-band split of the vase in Fig. 7.1 using the subband coding system of Fig. 7.7. The four subbands that result are the (a) approximation, (b) horizontal detail, (c) vertical detail, and (d) diagonal detail subbands.
both biorthogonal (they satisfy Eq. 7.1-12) and orthonormal (they satisfy Eq. 7.1-13). As a result, the Daubechies 8-tap filters in Fig. 7.8 support error-free reconstruction of the decomposed input.
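The numerical verification mentioned above is easy to sketch in Python (an illustration assuming NumPy, not the book's code). Starting from the prototype g0(n) of Table 7.1, the remaining filters follow from Eq. (7.1-14), and the orthonormality condition of Eq. (7.1-13) can be checked directly:

```python
import numpy as np

# Prototype synthesis filter g0(n) from Table 7.1
g0 = np.array([0.23037781, 0.71484657, 0.63088076, -0.02798376,
               -0.18703481, 0.03084138, 0.03288301, -0.01059740])
K = len(g0)                      # K_even = 8
n = np.arange(K)

# Remaining filters via Eq. (7.1-14)
g1 = ((-1) ** n) * g0[::-1]      # g1(n) = (-1)^n g0(K-1-n)
h0 = g0[::-1]                    # h0(n) = g0(K-1-n)
h1 = g1[::-1]                    # h1(n) = g1(K-1-n)

def shifted_inner(a, b, m):
    """<a(n), b(n+2m)> with zero padding outside the filter support."""
    pad = 2 * abs(m)
    a_p, b_p = np.pad(a, (pad, pad)), np.pad(b, (pad, pad))
    return np.sum(a_p * np.roll(b_p, -2 * m))

# Check Eq. (7.1-13): <g_i(n), g_j(n+2m)> = delta(i-j) delta(m)
for i, gi in enumerate((g0, g1)):
    for j, gj in enumerate((g0, g1)):
        for m in range(-3, 4):
            expected = 1.0 if (i == j and m == 0) else 0.0
            assert abs(shifted_inner(gi, gj, m) - expected) < 1e-5
```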
A four-band split of the 512 × 512 image of a vase in Fig. 7.1, based on the filters in Fig. 7.8, is shown in Fig. 7.9. Each quadrant of this image is a subband of size 256 × 256. Beginning with the upper-left corner and proceeding in a clockwise manner, the four quadrants contain approximation subband a, horizontal detail subband dH, diagonal detail subband dD, and vertical detail subband dV.
7.1.3 The Haar Transform
The Haar transform, whose basis functions are the oldest and simplest known orthonormal wavelets, can be expressed in matrix form as
$\mathbf{T} = \mathbf{H}\mathbf{F}\mathbf{H}^T$   (7.1-15)
where F is an N × N image matrix, H is the N × N Haar transformation matrix, and T is the resulting N × N transform. The rows of H are samples of the Haar basis functions $h_k(z)$ for $z \in [0, 1]$. For k = 0,
$h_0(z) = h_{00}(z) = \dfrac{1}{\sqrt{N}}, \quad z \in [0, 1]$   (7.1-16)
and, for k > 0, with k decomposed uniquely as $k = 2^p + q - 1$,
$h_k(z) = h_{pq}(z) = \dfrac{1}{\sqrt{N}} \begin{cases} 2^{p/2} & (q-1)/2^p \le z < (q-0.5)/2^p \\ -2^{p/2} & (q-0.5)/2^p \le z < q/2^p \\ 0 & \text{otherwise} \end{cases}$   (7.1-17)
For instance, the
first row of the 2 × 2 Haar matrix is computed using $h_0(z)$ with z = 0/2, 1/2. From Eq. (7.1-16), $h_0(z)$ is equal to $1/\sqrt{2}$, independent of z, so the first row of $\mathbf{H}_2$ has two identical $1/\sqrt{2}$ elements. The second row is obtained by computing $h_1(z)$ for z = 0/2, 1/2. Because $k = 2^p + q - 1$, when k = 1, p = 0 and q = 1. Thus, from Eq. (7.1-17), $h_1(0) = 2^0/\sqrt{2} = 1/\sqrt{2}$, $h_1(1/2) = -2^0/\sqrt{2} = -1/\sqrt{2}$, and the 2 × 2 Haar matrix is
$\mathbf{H}_2 = \dfrac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}$   (7.1-18)
When N = 4, k, p, and q take on the values

k  p  q
0  0  0
1  0  1
2  1  1
3  1  2

and the 4 × 4 transformation matrix is
$\mathbf{H}_4 = \dfrac{1}{\sqrt{4}} \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & -1 & -1 \\ \sqrt{2} & -\sqrt{2} & 0 & 0 \\ 0 & 0 & \sqrt{2} & -\sqrt{2} \end{bmatrix}$   (7.1-19)
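Because Eqs. (7.1-16) and (7.1-17) completely determine H, the Haar matrix for any power-of-two N can be generated programmatically. The following Python sketch (an illustration assuming NumPy, not the book's code) samples the basis functions at z = 0/N, 1/N, ..., (N − 1)/N:

```python
import numpy as np

def haar_matrix(N):
    """N x N Haar transformation matrix; row k holds h_k(z) of
    Eqs. (7.1-16) and (7.1-17) sampled at z = 0/N, ..., (N-1)/N."""
    H = np.zeros((N, N))
    z = np.arange(N) / N
    H[0, :] = 1.0 / np.sqrt(N)                 # Eq. (7.1-16)
    for k in range(1, N):
        p = int(np.log2(k))                    # largest p with 2^p <= k
        q = k - 2 ** p + 1                     # from k = 2^p + q - 1
        pos = ((q - 1) / 2 ** p <= z) & (z < (q - 0.5) / 2 ** p)
        neg = ((q - 0.5) / 2 ** p <= z) & (z < q / 2 ** p)
        H[k, pos] = 2 ** (p / 2) / np.sqrt(N)  # Eq. (7.1-17)
        H[k, neg] = -2 ** (p / 2) / np.sqrt(N)
    return H

print(haar_matrix(2))   # reproduces Eq. (7.1-18)
print(haar_matrix(4))   # reproduces Eq. (7.1-19)
```

The transform itself is then T = H @ F @ H.T, per Eq. (7.1-15).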
Our principal interest in the Haar transform is that the rows of $\mathbf{H}_2$ can be used to define the analysis filters, h0(n) and h1(n), of a 2-tap perfect reconstruction filter bank (see the previous section), as well as the scaling and wavelet vectors (defined in Sections 7.2.2 and 7.2.3, respectively) of the simplest and oldest wavelet transform (see Example 7.10 in Section 7.4). Rather than concluding the section with the computation of a Haar transform, we close with an example that illustrates the influence of the decomposition methods that have been considered to this point on the methods that will be developed in the remainder of the chapter.
EXAMPLE 7.3: Haar functions in a discrete wavelet transform.
■ Figure 7.10(a) shows a decomposition of the 512 × 512 image in Fig. 7.1 that combines the key features of pyramid coding, subband coding, and the Haar transform (the three techniques we have discussed so far). Called the discrete wavelet transform (and developed later in the chapter), the representation is characterized by the following important features:
1. With the exception of the subimage in the upper-left corner of Fig. 7.10(a), the local histograms are very similar. Many of the pixels are close to zero. Because the subimages (except for the subimage in the upper-left corner) have been scaled to make their underlying structure more visible, the displayed histograms are peaked at intensity 128 (the zeroes have been scaled to mid-gray). The large number of zeroes in the decomposition makes the image an excellent candidate for compression (see Chapter 8).
2. In a manner that is similar to the way in which the levels of the prediction
residual pyramid of Fig. 7.3(b) were used to create approximation images
of differing resolutions, the subimages in Fig. 7.10(a) can be used to con-
struct both coarse and fine resolution approximations of the original
vase image in Fig. 7.1.

FIGURE 7.10 (a) A discrete wavelet transform using Haar $\mathbf{H}_2$ basis functions. Its local histogram variations are also shown. (b)–(d) Several different approximations (64 × 64, 128 × 128, and 256 × 256) that can be obtained from (a).

Figures 7.10(b) through (d), which are of size
64 × 64, 128 × 128, and 256 × 256, respectively, were generated from the subimages in Fig. 7.10(a). A perfect 512 × 512 reconstruction of the original image is also possible.
3. Like the subband coding decomposition in Fig. 7.9, a simple real-coefficient, FIR filter bank of the form given in Fig. 7.7 was used to produce Fig. 7.10(a). After the generation of a four subband image like that of Fig. 7.9, the 256 × 256 approximation subband was decomposed and replaced by four 128 × 128 subbands (using the same filter bank), and the resulting approximation subband was again decomposed and replaced by four 64 × 64 subbands. This process produced the unique arrangement of subimages that characterizes discrete wavelet transforms [e.g., the subimage in the upper-right corner of Fig. 7.10(a) captures horizontal edge information in the original image].
Considering this impressive list of features, it is remarkable that the discrete
wavelet transform of Fig. 7.10(a) was generated using two 2-tap digital filters
with a total of four filter coefficients. ■
To say that f(x) ∈ V means that f(x) is in the closed span of {φ_k(x)} and can be written in the form of Eq. (7.2-1).
For any function space V and corresponding expansion set {φ_k(x)}, there is a set of dual functions, denoted {φ̃_k(x)}, that can be used to compute the α_k coefficients of Eq. (7.2-1) for any f(x) ∈ V. These coefficients are computed by taking the integral inner products† of the dual φ̃_k(x) and function f(x). That is,
$\alpha_k = \langle \tilde{\varphi}_k(x), f(x) \rangle = \int \tilde{\varphi}_k^*(x)\, f(x)\, dx$   (7.2-3)
where the * denotes the complex conjugate operation. Depending on the orthogonality of the expansion set, this computation assumes one of three possible forms. Problem 7.10 at the end of the chapter illustrates the three cases using vectors in two-dimensional Euclidean space.
Case 1: If the expansion functions form an orthonormal basis for V, that is,
$\langle \varphi_j(x), \varphi_k(x) \rangle = \delta_{jk}$   (7.2-4)
the basis and its dual are equivalent. That is, $\varphi_k(x) = \tilde{\varphi}_k(x)$, and Eq. (7.2-3) becomes
$\alpha_k = \langle \varphi_k(x), f(x) \rangle$   (7.2-5)
The α_k are computed as the inner products of the basis functions and f(x).
Case 2: If the expansion functions are not orthonormal, but are an orthogonal basis for V, then
$\langle \varphi_j(x), \varphi_k(x) \rangle = 0, \quad j \ne k$   (7.2-6)
and the basis functions and their duals are called biorthogonal. The α_k are computed using Eq. (7.2-3), and the biorthogonal basis and its dual are such that
$\langle \varphi_j(x), \tilde{\varphi}_k(x) \rangle = \delta_{jk}$   (7.2-7)
Case 3: If the expansion set is not a basis for V, but supports the expansion defined in Eq. (7.2-1), it is a spanning set in which there is more than one set of α_k for any f(x) ∈ V. The expansion functions and their duals are said to be overcomplete or redundant. They form a frame in which‡
$A \lVert f(x) \rVert^2 \le \sum_k \left| \langle \varphi_k(x), f(x) \rangle \right|^2 \le B \lVert f(x) \rVert^2$   (7.2-8)
for some A > 0 and B < ∞.
† The integral inner product of two real or complex-valued functions f(x) and g(x) is $\langle f(x), g(x) \rangle = \int f^*(x) g(x)\, dx$. If f(x) is real, $f^*(x) = f(x)$ and $\langle f(x), g(x) \rangle = \int f(x) g(x)\, dx$.
‡ The norm of f(x), denoted $\lVert f(x) \rVert$, is defined as the square root of the absolute value of the inner product of f(x) with itself.
If A = B, the expansion set is called a tight frame, and it can be shown that (Daubechies [1992])
$f(x) = \dfrac{1}{A} \sum_k \langle \varphi_k(x), f(x) \rangle\, \varphi_k(x)$   (7.2-9)
Except for the $A^{-1}$ term, which is a measure of the frame's redundancy, this is identical to the expression obtained by substituting Eq. (7.2-5) (for orthonormal bases) into Eq. (7.2-1).
7.2.2 Scaling Functions
Consider the set of expansion functions composed of integer translations and binary scalings of the real, square-integrable function φ(x); this is the set {φ_{j,k}(x)}, where
$\varphi_{j,k}(x) = 2^{j/2}\, \varphi(2^j x - k)$   (7.2-10)
for all j, k ∈ Z and φ(x) ∈ L²(R).† Here, k determines the position of φ_{j,k}(x) along the x-axis, and j determines the width of φ_{j,k}(x)—that is, how broad or narrow it is along the x-axis. The term $2^{j/2}$ controls the amplitude of the function. Because the shape of φ_{j,k}(x) changes with j, φ(x) is called a scaling function. By choosing φ(x) properly, {φ_{j,k}(x)} can be made to span L²(R), which is the set of all measurable, square-integrable functions.
If we restrict j in Eq. (7.2-10) to a specific value, say j = j₀, the resulting expansion set, {φ_{j₀,k}(x)}, is a subset of {φ_{j,k}(x)} that spans a subspace of L²(R). Using the notation of the previous section, we can define that subspace as
$V_{j_0} = \overline{\operatorname{Span}_k \{\varphi_{j_0,k}(x)\}}$   (7.2-11)
More generally, we will denote the subspace spanned over k for any j as
$V_j = \overline{\operatorname{Span}_k \{\varphi_{j,k}(x)\}}$   (7.2-12)
As will be seen in the following example, increasing j increases the size of V_j, allowing functions with smaller variations or finer detail to be included in the subspace. This is a consequence of the fact that, as j increases, the φ_{j,k}(x) that are used to represent the subspace functions become narrower and separated by smaller changes in x.
† The notation L²(R), where R is the set of real numbers, denotes the set of measurable, square-integrable, one-dimensional functions; Z is the set of integers.
EXAMPLE 7.4: The Haar scaling function.
■ Consider the unit-height, unit-width scaling function (Haar [1910])
$\varphi(x) = \begin{cases} 1 & 0 \le x < 1 \\ 0 & \text{otherwise} \end{cases}$   (7.2-14)
Figures 7.11(a) through (d) show four of the many expansion functions that can be generated by substituting this pulse-shaped scaling function into Eq. (7.2-10). Note that the expansion functions for j = 1 in Figs. 7.11(c) and (d) are half as wide as those for j = 0 in Figs. 7.11(a) and (b). For a given interval on x, we can define twice as many V₁ scaling functions as V₀ scaling functions (e.g., φ_{1,0} and φ_{1,1} of V₁ versus φ_{0,0} of V₀ for the interval 0 ≤ x < 1).
Figure 7.11(e) shows a member of subspace V₁. This function does not belong to V₀, because the V₀ expansion functions in 7.11(a) and (b) are too coarse to represent it. Higher-resolution functions like those in 7.11(c) and (d)
FIGURE 7.11 Haar scaling functions in V₀ and V₁: (a)–(d) expansion functions generated with Eq. (7.2-10); (e) a member of V₁ with coefficients 0.5 (for φ_{1,0}), 1 (for φ_{1,1}), and −0.25 (for φ_{1,4}); (f) the decomposition of φ_{0,0}(x) as φ_{1,0}/√2 + φ_{1,1}/√2.
are required. They can be used, as shown in (e), to represent the function by the three-term expansion
$f(x) = 0.5\,\varphi_{1,0}(x) + \varphi_{1,1}(x) - 0.25\,\varphi_{1,4}(x)$
To conclude the example, Fig. 7.11(f) illustrates the decomposition of φ_{0,0}(x) as a sum of V₁ expansion functions. In a similar manner, any V₀ expansion function can be decomposed using
$\varphi_{0,k}(x) = \dfrac{1}{\sqrt{2}}\,\varphi_{1,2k}(x) + \dfrac{1}{\sqrt{2}}\,\varphi_{1,2k+1}(x)$
Thus, if f(x) is an element of V₀, it is also an element of V₁. This is because all V₀ expansion functions are contained in V₁. Mathematically, we write that V₀ is a subspace of V₁, denoted V₀ ⊂ V₁. ■
The simple scaling function in the preceding example obeys the four funda-
mental requirements of multiresolution analysis (Mallat [1989a]):
MRA Requirement 1: The scaling function is orthogonal to its integer translates.
This is easy to see in the case of the Haar function, because whenever it has a
value of 1, its integer translates are 0, so that the product of the two is 0. The
Haar scaling function is said to have compact support, which means that it is
0 everywhere outside a finite interval called the support. In fact, the width of
the support is 1; it is 0 outside the half open interval [0, 1). It should be noted
that the requirement for orthogonal integer translates becomes harder to satisfy as the width of support of the scaling function becomes larger than 1.
MRA Requirement 2: The subspaces spanned by the scaling function at low
scales are nested within those spanned at higher scales.
FIGURE 7.12 The nested function spaces spanned by a scaling function.

As illustrated in Fig. 7.12, subspaces containing high-resolution functions must also contain all lower-resolution functions; that is,
$V_{-\infty} \subset \cdots \subset V_{-1} \subset V_0 \subset V_1 \subset V_2 \subset \cdots \subset V_{\infty}$   (7.2-15)
The fact that the Haar scaling function meets this requirement
should not be taken to indicate that any function with a support width of 1
automatically satisfies the condition. It is left as an exercise for the reader
to show that the equally simple function
$\varphi(x) = \begin{cases} 1 & 0.25 \le x < 0.75 \\ 0 & \text{elsewhere} \end{cases}$
is not a valid scaling function for a multiresolution analysis (see Problem 7.11).
MRA Requirement 3: The only function that is common to all V_j is f(x) = 0.
If we consider the coarsest possible expansion functions (i.e., j = −∞), the only representable function is the function of no information. That is,
$V_{-\infty} = \{0\}$   (7.2-16)
MRA Requirement 4: Any function can be represented with arbitrary precision.
Though it may not be possible to expand a particular f(x) at an arbitrarily coarse resolution, all measurable, square-integrable functions can be represented by the scaling functions in the limit as j → ∞. That is,
$V_{\infty} = \{L^2(\mathbf{R})\}$   (7.2-17)
Under these conditions, the expansion functions of subspace V_j can be expressed as a weighted sum of the expansion functions of subspace V_{j+1}:
$\varphi_{j,k}(x) = \sum_n \alpha_n\, \varphi_{j+1,n}(x)$
where the index of summation has been changed to n for clarity. Substituting for φ_{j+1,n}(x) from Eq. (7.2-10) and changing variable α_n to h_φ(n)—the α_n are renamed h_φ(n) because they are used later (see Section 7.4) as filter bank coefficients—this becomes
$\varphi_{j,k}(x) = \sum_n h_{\varphi}(n)\, 2^{(j+1)/2}\, \varphi(2^{j+1}x - n)$
Because φ(x) = φ_{0,0}(x), both j and k can be set to 0 to obtain the simpler non-subscripted expression
$\varphi(x) = \sum_n h_{\varphi}(n)\, \sqrt{2}\, \varphi(2x - n)$   (7.2-18)
The h_φ(n) coefficients in this recursive equation are called scaling function coefficients; h_φ is referred to as a scaling vector. Equation (7.2-18) is fundamental
to multiresolution analysis and is called the refinement equation, the MRA
equation, or the dilation equation. It states that the expansion functions of any
subspace can be built from double-resolution copies of themselves—that is,
from expansion functions of the next higher resolution space. The choice of a
reference subspace, V0, is arbitrary.
EXAMPLE 7.5: Haar scaling function coefficients.
■ The scaling function coefficients for the Haar function of Eq. (7.2-14) are h_φ(0) = h_φ(1) = 1/√2, the first row of matrix $\mathbf{H}_2$ in Eq. (7.1-18). Thus, Eq. (7.2-18) yields
$\varphi(x) = \dfrac{1}{\sqrt{2}}\left[\sqrt{2}\,\varphi(2x)\right] + \dfrac{1}{\sqrt{2}}\left[\sqrt{2}\,\varphi(2x-1)\right]$
This decomposition was illustrated graphically for φ_{0,0}(x) in Fig. 7.11(f), where the bracketed terms of the preceding expression are seen to be φ_{1,0}(x) and φ_{1,1}(x). Additional simplification yields φ(x) = φ(2x) + φ(2x − 1). ■
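The recursive nature of Eq. (7.2-18) can also be explored numerically. The following Python sketch (a standard cascade-algorithm illustration assuming NumPy, not the book's code) iterates the refinement equation from a one-sample initial guess; for the Haar scaling vector it converges to samples of the unit box of Eq. (7.2-14):

```python
import numpy as np

def cascade(h_phi, levels=4):
    """Approximate phi(x) by iterating Eq. (7.2-18):
    phi(x) = sum_n h_phi(n) * sqrt(2) * phi(2x - n)."""
    phi = np.array([1.0])              # initial guess
    for _ in range(levels):
        up = np.zeros(2 * len(phi))    # upsample by 2...
        up[::2] = phi
        phi = np.sqrt(2) * np.convolve(up, h_phi)  # ...filter and scale
    return phi

h_haar = np.array([1.0, 1.0]) / np.sqrt(2)  # Haar scaling vector (Example 7.5)
print(cascade(h_haar))  # approximately 1 on [0, 1) and 0 elsewhere
```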
7.2.3 Wavelet Functions
Given a scaling function that meets the MRA requirements of the previous section, we can define a wavelet function ψ(x) that, together with its integer translates and binary scalings, spans the difference between any two adjacent scaling subspaces, V_j and V_{j+1}. The situation is illustrated graphically in Fig. 7.13. We define the set {ψ_{j,k}(x)} of wavelets
$\psi_{j,k}(x) = 2^{j/2}\, \psi(2^j x - k)$   (7.2-19)
for all k ∈ Z that span the W_j spaces, where
$W_j = \overline{\operatorname{Span}_k \{\psi_{j,k}(x)\}}$   (7.2-20)
As with scaling functions, if f(x) ∈ W_j it can be written $f(x) = \sum_k \alpha_k \psi_{j,k}(x)$ (7.2-21). The scaling and wavelet function subspaces in Fig. 7.13 are related by
$V_{j+1} = V_j \oplus W_j$   (7.2-22)
where ⊕ denotes the union of function spaces (like the union of sets). The orthogonal complement of V_j in V_{j+1} is W_j, and all members of V_j are orthogonal to the members of W_j. Thus,
$\langle \varphi_{j,k}(x), \psi_{j,l}(x) \rangle = 0$   (7.2-23)
for all appropriate j, k, l ∈ Z.
The space of all measurable, square-integrable functions can then be expressed as
$L^2(\mathbf{R}) = V_0 \oplus W_0 \oplus W_1 \oplus \cdots$   (7.2-24)
or
$L^2(\mathbf{R}) = V_1 \oplus W_1 \oplus W_2 \oplus \cdots$   (7.2-25)
or even
$L^2(\mathbf{R}) = \cdots \oplus W_{-2} \oplus W_{-1} \oplus W_0 \oplus W_1 \oplus W_2 \oplus \cdots$   (7.2-26)
which eliminates the scaling function and represents a function in terms of wavelets alone [i.e., there are only wavelet function spaces in Eq. (7.2-26)]. Note that if f(x) is an element of V₁, but not V₀, an expansion using Eq. (7.2-24) contains an approximation of f(x) using V₀ scaling functions. Wavelets from W₀ would encode the difference between this approximation and the actual function. Equations (7.2-24) through (7.2-26) can be generalized to yield
$L^2(\mathbf{R}) = V_{j_0} \oplus W_{j_0} \oplus W_{j_0+1} \oplus \cdots$   (7.2-27)
where j₀ is an arbitrary starting scale.
Since wavelet spaces reside within the spaces spanned by the next higher resolution scaling functions (see Fig. 7.13), any wavelet function—like its scaling function counterpart of Eq. (7.2-18)—can be expressed as a weighted sum of shifted, double-resolution scaling functions. That is, we can write
$\psi(x) = \sum_n h_{\psi}(n)\, \sqrt{2}\, \varphi(2x - n)$   (7.2-28)
where the h_ψ(n) are called the wavelet function coefficients and h_ψ is the wavelet vector. Using the condition that wavelets span the orthogonal complement spaces in Fig. 7.13 and that integer wavelet translates are orthogonal, it can be shown that h_ψ(n) is related to h_φ(n) by (see, for example, Burrus, Gopinath, and Guo [1998])
$h_{\psi}(n) = (-1)^n\, h_{\varphi}(1 - n)$   (7.2-29)
Note the similarity of this result and Eq. (7.1-14), the relationship governing
the impulse responses of orthonormal subband coding and decoding filters.
EXAMPLE 7.6: The Haar wavelet function coefficients.
■ In the previous example, the Haar scaling vector was defined as h_φ(0) = h_φ(1) = 1/√2. Using Eq. (7.2-29), the corresponding wavelet vector is h_ψ(0) = (−1)⁰h_φ(1 − 0) = 1/√2 and h_ψ(1) = (−1)¹h_φ(1 − 1) = −1/√2. Note that these coefficients correspond to the second row of matrix $\mathbf{H}_2$ in Eq. (7.1-18). Substituting these values into Eq. (7.2-28), we get
ψ(x) = φ(2x) − φ(2x − 1), which is plotted in Fig. 7.14(a). Thus, the Haar wavelet function is
$\psi(x) = \begin{cases} 1 & 0 \le x < 0.5 \\ -1 & 0.5 \le x < 1 \\ 0 & \text{elsewhere} \end{cases}$   (7.2-30)
Using Eq. (7.2-19), we can now generate the universe of scaled and translated Haar wavelets. Two such wavelets, ψ_{0,2}(x) and ψ_{1,0}(x), are plotted in Figs. 7.14(b) and (c), respectively. Note that wavelet ψ_{1,0}(x) for space W₁ is narrower than ψ_{0,2}(x) for W₀; it can be used to represent finer detail.
Figure 7.14(d) shows a function of subspace V₁ that is not in subspace V₀. This function was considered in an earlier example [see Fig. 7.11(e)]. Although the function cannot be represented accurately in V₀, Eq. (7.2-22) indicates that it can be expanded using V₀ and W₀ expansion functions. The resulting expansion is
$f(x) = f_a(x) + f_d(x)$
FIGURE 7.14 Haar wavelet functions in W₀ and W₁: (a) the Haar wavelet ψ(x); (b) ψ_{0,2}(x); (c) ψ_{1,0}(x); (d) a function of V₁ that is not in V₀; (e) f_a(x) ∈ V₀; (f) f_d(x) ∈ W₀.
where
$f_a(x) = \dfrac{3\sqrt{2}}{4}\,\varphi_{0,0}(x) - \dfrac{\sqrt{2}}{8}\,\varphi_{0,2}(x)$
and
$f_d(x) = -\dfrac{\sqrt{2}}{4}\,\psi_{0,0}(x) - \dfrac{\sqrt{2}}{8}\,\psi_{0,2}(x)$
Here, f_a(x) is an approximation of f(x) using V₀ scaling functions, while f_d(x) is the difference f(x) − f_a(x), expressed as a sum of W₀ wavelets. The two expansions, which are shown in Figs. 7.14(e) and (f), divide f(x) in a manner similar to a lowpass and highpass filter, as discussed in connection with Fig. 7.6. The low frequencies of f(x) are captured in f_a(x)—it assumes the average value of f(x) in each integer interval—while the high-frequency details are encoded in f_d(x). ■
7.3.1 The Wavelet Series Expansions
We begin by defining the wavelet series expansion of function f(x) ∈ L²(R) relative to wavelet ψ(x) and scaling function φ(x). In accordance with Eq. (7.2-27), f(x) can be represented by a scaling function expansion in subspace V_{j₀} [Eq. (7.2-12) defines such an expansion] and some number of wavelet function expansions in subspaces W_{j₀}, W_{j₀+1}, … [as defined in Eq. (7.2-21)]. Thus,
$f(x) = \sum_k c_{j_0}(k)\,\varphi_{j_0,k}(x) + \sum_{j=j_0}^{\infty} \sum_k d_j(k)\,\psi_{j,k}(x)$   (7.3-1)
where j₀ is an arbitrary starting scale and the c_{j₀}(k) and d_j(k) are relabeled α_k from Eqs. (7.2-12) and (7.2-21), respectively. The c_{j₀}(k) are normally called approximation and/or scaling coefficients; the d_j(k) are referred to as detail and/or wavelet coefficients. This is because the first sum in Eq. (7.3-1) uses scaling functions to provide an approximation of f(x) at scale j₀ [unless f(x) ∈ V_{j₀}, in which case the sum of the scaling functions is equal to f(x)]. For each higher scale j ≥ j₀ in the second sum, a finer resolution function—a sum of wavelets—is added to the approximation to provide increasing detail. If the expansion functions form an orthonormal basis or tight frame (which is often the case), the expansion coefficients are calculated as $c_{j_0}(k) = \langle f(x), \varphi_{j_0,k}(x) \rangle$ and $d_j(k) = \langle f(x), \psi_{j,k}(x) \rangle$.
9.1 Preliminaries
You will find it helpful to The language of mathematical morphology is set theory. As such, morpholo-
review Sections 2.4.2 and
2.6.4 before proceeding. gy offers a unified and powerful approach to numerous image processing
problems. Sets in mathematical morphology represent objects in an image.
For example, the set of all white pixels in a binary image is a complete mor-
phological description of the image. In binary images, the sets in question are
members of the 2-D integer space Z² (see Section 2.4.2), where each element
of a set is a tuple (2-D vector) whose coordinates are the (x, y) coordinates
of a white (or black, depending on convention) pixel in the image. Gray-
scale digital images of the form discussed in the previous chapters can be represented as sets whose components are in Z³. In this case, two compo-
nents of each element of the set refer to the coordinates of a pixel, and the
third corresponds to its discrete intensity value. Sets in higher dimensional
spaces can contain other image attributes, such as color and time varying
components.
In addition to the basic set definitions in Section 2.6.4, the concepts of set reflection and translation are used extensively in morphology. The reflection of a set B, denoted B̂, is defined as
B̂ = {w | w = −b, for b ∈ B}   (9.1-1)
(The set reflection operation is analogous to the flipping, or rotating, operation performed in spatial convolution; see Section 3.4.2.) If B is the set of pixels (2-D points) representing an object in an image, then B̂ is simply the set of points in B whose (x, y) coordinates have been replaced by (−x, −y). Figures 9.1(a) and (b) show a simple set and its reflection.† The translation of a set B by point z = (z₁, z₂), denoted (B)_z, is defined as
(B)_z = {c | c = b + z, for b ∈ B}   (9.1-2)
Figure 9.1(c) shows the translation of a set by z.
FIGURE 9.1 (a) A set, (b) its reflection, and (c) its translation by z.
† When working with graphics, such as the sets in Fig. 9.1, we use shading to indicate points (pixels) that
are members of the set under consideration. When working with binary images, the sets of interest are
pixels corresponding to objects. We show these in white, and all other pixels in black. The terms
foreground and background are used often to denote the sets of pixels in an image defined to be objects
and non-objects, respectively.
Morphological operations are based on structuring elements (SEs)—small sets or subimages used to probe an image under study for properties of interest. The first row of Fig. 9.2 shows several examples of structuring elements, where each shaded square denotes a member of the SE. When it does
not matter whether a location in a given structuring element is or is not a
member of the SE set, that location is marked with an “*” to denote a “don’t
care” condition, as defined later in Section 9.5.4. In addition to a definition
of which elements are members of the SE, the origin of a structuring element
also must be specified. The origins of the various SEs in Fig. 9.2 are indicated
by a black dot (although placing the center of an SE at its center of gravity is common, the choice of origin is problem dependent in general). When the
SE is symmetric and no dot is shown, the assumption is that the origin is at
the center of symmetry.
When working with images, we require that structuring elements be rec-
tangular arrays. This is accomplished by appending the smallest possible
number of background elements (shown nonshaded in Fig. 9.2) necessary to form a rectangular array. The first and last SEs in the second row of Fig. 9.2
illustrate the procedure. The other SEs in that row already are in rectangu-
lar form.
As an introduction to how structuring elements are used in morphology, consider Fig. 9.3. Figures 9.3(a) and (b) show a simple set and a structuring element. As mentioned in the previous paragraph, a computer implementation requires that set A be converted also to a rectangular array by adding background elements. The background border is made large enough to accommodate the entire structuring element when its origin is on the border of the original set; this is analogous to padding for spatial correlation and convolution, as discussed in Section 3.4.2. (In future illustrations, we add enough background points to form rectangular arrays, but let the padding be implicit when the meaning is clear, in order to simplify the figures.)
FIGURE 9.3 (a) A set (each shaded square is a member of the set). (b) A structuring element. (c) The set padded with background elements to form a rectangular array and provide a background border. (d) Structuring element as a rectangular array. (e) Set processed by the structuring element.
In this case, the structuring element is of size 3 × 3 with the origin in the center, so a one-element border that encompasses the entire set is sufficient, as Fig. 9.3(c) shows. As in Fig. 9.2, the structuring element is filled with the smallest possible number of background elements necessary to make it into a rectangular array [Fig. 9.3(d)].
Suppose that we define an operation on set A using structuring element B, as follows: Create a new set by running B over A so that the origin of B visits every element of A. At each location of the origin of B, if B is completely contained in A, mark that location as a member of the new set (shown shaded); else mark it as not being a member of the new set (shown not shaded). Figure 9.3(e) shows the result of this operation. We see that, when the origin of B is on a border element of A, part of B ceases to be contained in A, thus eliminating the location on which B is centered as a possible member for the new set. The net result is that the boundary of the set is eroded, as Fig. 9.3(e) shows. When we use terminology such as "the structuring element is contained in the set," we mean specifically that the elements of A and B fully overlap. In other words, although we showed A and B as arrays containing both shaded and nonshaded elements, only the shaded elements of both sets are considered in determining whether or not B is contained in A. These concepts form the basis of the material in the next section, so it is important that you understand the ideas in Fig. 9.3 fully before proceeding.
9.2.1 Erosion
With A and B as sets in Z², the erosion of A by B, denoted A ⊖ B, is defined as
A ⊖ B = {z | (B)_z ⊆ A}   (9.2-1)
In words, this equation indicates that the erosion of A by B is the set of all points z such that B, translated by z, is contained in A. In the following discussion, set B is assumed to be a structuring element. Equation (9.2-1) is the mathematical formulation of the example in Fig. 9.3(e), discussed at the end of the last section. Because the statement that B has to be contained in A is equivalent to B not sharing any common elements with the background, we can express erosion in the following equivalent form:
A ⊖ B = {z | (B)_z ∩ A^c = ∅}   (9.2-2)
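For readers who want to experiment, here is a minimal Python sketch of Eq. (9.2-1) for binary arrays of 0s and 1s (assuming NumPy, with the SE origin at its center; production code would instead use a library routine such as scipy.ndimage.binary_erosion):

```python
import numpy as np

def erode(A, B):
    """Erosion of A by B per Eq. (9.2-1): z is in the output iff B,
    translated to z, is completely contained in A."""
    m, n = B.shape
    Ap = np.pad(A, ((m // 2, m // 2), (n // 2, n // 2)))  # background border
    out = np.zeros_like(A)
    for i in range(A.shape[0]):
        for j in range(A.shape[1]):
            window = Ap[i:i + m, j:j + n]
            out[i, j] = np.all(window[B == 1] == 1)  # does B fit inside A here?
    return out
```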
FIGURE 9.4 (a) Set A. (b) Square structuring element, B. (c) Erosion of A by B, shown shaded. (d) Elongated structuring element. (e) Erosion of A by B using this element. The dotted border in (c) and (e) is the boundary of set A, shown only for reference.
EXAMPLE 9.1: Using erosion to remove image components.
■ Suppose that we wish to remove the lines connecting the center region to the border pads in Fig. 9.5(a). Eroding the image with a square structuring element of size 11 × 11 whose components are all 1s removed most of the lines, as Fig. 9.5(b) shows. The reason the two vertical lines in the center were thinned but not removed completely is that their width is greater than 11 pixels. Changing the SE size to 15 × 15 and eroding the original image again did remove all the connecting lines, as Fig. 9.5(c) shows (an alternate approach would have been to erode the image in Fig. 9.5(b) again using the same 11 × 11 SE). Increasing the size of the structuring element even more would eliminate larger components. For example, the border pads can be removed with a structuring element of size 45 × 45, as Fig. 9.5(d) shows.
FIGURE 9.5 Using erosion to remove image components. (a) A 486 × 486 binary image of a wire-bond mask. (b)–(d) Image eroded using square structuring elements of sizes 11 × 11, 15 × 15, and 45 × 45, respectively. The elements of the SEs were all 1s.
We see from this example that erosion shrinks or thins objects in a bina-
ry image. In fact, we can view erosion as a morphological filtering operation
in which image details smaller than the structuring element are filtered (re-
moved) from the image. In Fig. 9.5, erosion performed the function of a
“line filter.” We return to the concept of a morphological filter in Sections
9.3 and 9.6.3. ■
9.2.2 Dilation
With A and B as sets in Z², the dilation of A by B, denoted A ⊕ B, is defined as
A ⊕ B = {z | (B̂)_z ∩ A ≠ ∅}   (9.2-3)
This equation is based on reflecting B about its origin, and shifting this reflection by z (see Fig. 9.1). The dilation of A by B then is the set of all displacements, z, such that B̂ and A overlap by at least one element. Based on this interpretation, Eq. (9.2-3) can be written equivalently as
A ⊕ B = {z | [(B̂)_z ∩ A] ⊆ A}   (9.2-4)
The basic process of flipping (rotating) B about its origin and then successively sliding it over set A is analogous to spatial convolution, as discussed in Section 3.4.2. Keep in mind, however, that dilation is based on set operations and therefore is a nonlinear operation, whereas convolution is a linear operation.
Unlike erosion, which is a shrinking or thinning operation, dilation
“grows” or “thickens” objects in a binary image. The specific manner and ex-
tent of this thickening is controlled by the shape of the structuring element
used. Figure 9.6(a) shows the same set used in Fig. 9.4, and Fig. 9.6(b) shows a structuring element (in this case B̂ = B because the SE is symmetric about its origin). The dashed line in Fig. 9.6(c) shows the original set for reference, and the solid line shows the limit beyond which any further displacements of the origin of B̂ by z would cause the intersection of B̂ and A to be empty. There-
fore, all points on and inside this boundary constitute the dilation of A by B.
Figure 9.6(d) shows a structuring element designed to achieve more dilation
vertically than horizontally, and Fig. 9.6(e) shows the dilation achieved with
this element.
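A matching Python sketch of Eq. (9.2-3), under the same assumptions as the erosion sketch above (see also scipy.ndimage.binary_dilation):

```python
import numpy as np

def dilate(A, B):
    """Dilation of A by B per Eq. (9.2-3): z is in the output iff the
    reflection of B, translated to z, overlaps A by at least one element."""
    Bhat = B[::-1, ::-1]               # reflection B-hat, per Eq. (9.1-1)
    m, n = Bhat.shape
    Ap = np.pad(A, ((m // 2, m // 2), (n // 2, n // 2)))
    out = np.zeros_like(A)
    for i in range(A.shape[0]):
        for j in range(A.shape[1]):
            window = Ap[i:i + m, j:j + n]
            out[i, j] = np.any(window[Bhat == 1] == 1)  # any overlap?
    return out
```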
FIGURE 9.6 (a) Set A. (b) Square structuring element (the dot denotes the origin). (c) Dilation of A by B, shown shaded. (d) Elongated structuring element. (e) Dilation of A using this element. The dotted border in (c) and (e) is the boundary of set A, shown only for reference.
EXAMPLE 9.2: An illustration of dilation.
■ One of the simplest applications of dilation is for bridging gaps. Figure 9.7(a)
shows the same image with broken characters that we studied in Fig. 4.49 in connection with lowpass filtering. The maximum length of the breaks is known to be two pixels. Figure 9.7(b) shows a structuring element that can be used for repairing the gaps (note that instead of shading, we used 1s to denote the elements of the SE and 0s for the background; this is because the SE is now being treated as a subimage and not as a graphic). Figure 9.7(c) shows the result of dilating the original image with this structuring element. The gaps were bridged. One immediate advantage of the morphological approach over the lowpass filtering method we used to bridge the gaps in Fig. 4.49 is
that the morphological method resulted directly in a binary image, whereas lowpass filtering would require an additional thresholding pass to convert its gray-scale result back to binary form. ■

FIGURE 9.7 (a) Image with broken characters. (b) Structuring element. (c) Result of dilation with the structuring element; the gaps were bridged.
9.2.3 Duality
Erosion and dilation are duals of each other with respect to set complementation and reflection. That is,
(A ⊖ B)^c = A^c ⊕ B̂   (9.2-5)
and
(A ⊕ B)^c = A^c ⊖ B̂   (9.2-6)
Equation (9.2-5) indicates that erosion of A by B is the complement of the dilation of A^c by B̂, and vice versa. The duality property is useful particularly when the structuring element is symmetric with respect to its origin (as often is the case), so that B̂ = B. Then, we can obtain the erosion of an image by B simply by dilating its background (i.e., dilating A^c) with the same structuring element and complementing the result. Similar comments apply to Eq. (9.2-6).
We proceed to prove formally the validity of Eq. (9.2-5) in order to illustrate a typical approach for establishing the validity of morphological expressions. Starting with the definition of erosion, it follows that
(A ⊖ B)^c = {z | (B)_z ⊆ A}^c
If set (B)_z is contained in A, then (B)_z ∩ A^c = ∅, in which case the preceding expression becomes
(A ⊖ B)^c = {z | (B)_z ∩ A^c = ∅}^c
But the complement of the set of z's that satisfy (B)_z ∩ A^c = ∅ is the set of z's such that (B)_z ∩ A^c ≠ ∅. Therefore,
(A ⊖ B)^c = {z | (B)_z ∩ A^c ≠ ∅} = A^c ⊕ B̂
where the last step follows from Eq. (9.2-3). This concludes the proof. A similar line of reasoning can be used to prove Eq. (9.2-6) (see Problem 9.13).
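Using the erode and dilate sketches from earlier in this section, the duality of Eq. (9.2-5) can also be verified numerically on a random binary image (border pixels are excluded from the comparison because the sketches zero-pad, whereas the set identity assumes an unbounded plane):

```python
import numpy as np

rng = np.random.default_rng(0)
A = (rng.random((32, 32)) > 0.5).astype(int)
B = np.ones((3, 3), dtype=int)        # symmetric SE, so B-hat = B

lhs = 1 - erode(A, B)                 # (A erosion B) complemented
rhs = dilate(1 - A, B)                # complement of A dilated by B-hat
assert np.array_equal(lhs[1:-1, 1:-1], rhs[1:-1, 1:-1])
```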
9.3 Opening and Closing
The opening of set A by structuring element B, denoted A ∘ B, is defined as
A ∘ B = (A ⊖ B) ⊕ B   (9.3-1)
Thus, the opening of A by B is the erosion of A by B, followed by a dilation of the result by B. Similarly, the closing of A by B, denoted A • B, is defined as
A • B = (A ⊕ B) ⊖ B   (9.3-2)
which says that the closing of A by B is simply the dilation of A by B, followed by the erosion of the result by B.
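In terms of the erode and dilate sketches given earlier, both operations are two-line compositions (again an illustration, not the book's code):

```python
def opening(A, B):
    """Opening of A by B = (A eroded by B) dilated by B, per Eq. (9.3-1)."""
    return dilate(erode(A, B), B)

def closing(A, B):
    """Closing of A by B = (A dilated by B) eroded by B, per Eq. (9.3-2)."""
    return erode(dilate(A, B), B)
```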
The opening operation has a simple geometric interpretation (Fig. 9.8). Suppose that we view the structuring element B as a (flat) "rolling ball." The boundary of A ∘ B is then established by the points in B that reach the farthest into the boundary of A as B is rolled around the inside of this boundary. This geometric fitting property of the opening operation leads to a set-theoretic formulation, which states that the opening of A by B is obtained by taking the union of all translates of B that fit into A. That is, opening can be expressed as a fitting process such that
A ∘ B = ∪{(B)_z | (B)_z ⊆ A}   (9.3-3)
where ∪{·} denotes the union of all the sets inside the braces.
Closing has a similar geometric interpretation, except that now we roll B on the outside of the boundary (Fig. 9.9). As discussed below, opening and closing are duals of each other, so having to roll the ball on the outside is not unexpected. Geometrically, a point w is an element of A • B if and only if (B)_z ∩ A ≠ ∅ for any translate of (B)_z that contains w. Figure 9.9 illustrates
FIGURE 9.8 (a) Structuring element B "rolling" along the inner boundary of A (the dot indicates the origin of B). (b) Structuring element. (c) The heavy line is the outer boundary of the opening. (d) Complete opening (shaded). We did not shade A in (a) for clarity.
FIGURE 9.9 (a) Structuring element B "rolling" on the outer boundary of set A. (b) The heavy line is the outer boundary of the closing. (c) Complete closing (shaded). We did not shade A in (a) for clarity.
EXAMPLE 9.3: A simple illustration of morphological opening and closing.
■ Figure 9.10 further illustrates the opening and closing operations. Figure 9.10(a) shows a set A, and Fig. 9.10(b) shows various positions of a disk structuring element during the erosion process. When completed, this process resulted in the disjoint figure in Fig. 9.10(c). Note the elimination of the bridge between the two main sections. Its width was thin in relation to the diameter of
FIGURE 9.10 Morphological opening and closing. The structuring element is the small circle shown in various positions in (b). The SE was not shaded here for clarity.
the structuring element; that is, the structuring element could not be complete-
ly contained in this part of the set, thus violating the conditions of Eq. (9.2-1).
The same was true of the two rightmost members of the object. Protruding el-
ements where the disk did not fit were eliminated. Figure 9.10(d) shows the
process of dilating the eroded set, and Fig. 9.10(e) shows the final result of
opening. Note that outward pointing corners were rounded, whereas inward
pointing corners were not affected.
Similarly, Figs. 9.10(f) through (i) show the results of closing A with the
same structuring element. We note that the inward pointing corners were
rounded, whereas the outward pointing corners remained unchanged. The
leftmost intrusion on the boundary of A was reduced in size significantly, because the disk did not fit there. Note also the smoothing that resulted in parts of the object from both opening and closing the set A with a circular structuring element. ■
As in the case with dilation and erosion, opening and closing are duals of each other with respect to set complementation and reflection. That is,
(A • B)^c = (A^c ∘ B̂)   (9.3-4)
and
(A ∘ B)^c = (A^c • B̂)   (9.3-5)
We leave the proof of this result as an exercise (Problem 9.14).
The opening operation satisfies the following properties:
(a) A ∘ B is a subset (subimage) of A.
(b) If C is a subset of D, then C ∘ B is a subset of D ∘ B.
(c) (A ∘ B) ∘ B = A ∘ B.
Similarly, the closing operation satisfies the following properties:
(a) A is a subset (subimage) of A • B.
(b) If C is a subset of D, then C • B is a subset of D • B.
(c) (A • B) • B = A • B.
Note from condition (c) in both cases that multiple openings or closings of a set have no effect after the operator has been applied once.
EXAMPLE 9.4: Use of opening and closing for morphological filtering.
■ Morphological operations can be used to construct filters similar in concept to the spatial filters discussed in Chapter 3. The binary image in Fig. 9.11(a) shows a section of a fingerprint corrupted by noise. Here the noise manifests itself as random light elements on a dark background and as dark elements on the light components of the fingerprint. The objective is to eliminate the noise and its effects on the print while distorting it as little as possible. A morphological filter consisting of opening followed by closing can be used to accomplish this objective.
Figure 9.11(b) shows the structuring element used. The rest of Fig. 9.11 shows a step-by-step sequence of the filtering operation. Figure 9.11(c) is the
FIGURE 9.11 (a) Noisy image. (b) Structuring element (a 3 × 3 array of 1s). (c) Eroded image. (d) Opening of A. (e) Dilation of the opening. (f) Closing of the opening. (Original image courtesy of the National Institute of Standards and Technology.)
result of eroding A with the structuring element. The background noise was completely eliminated in the erosion stage of opening because in this case all noise components are smaller than the structuring element. The size of the noise elements (dark spots) contained within the fingerprint actually increased in size. The reason is that these elements are inner boundaries that increase in size as the object is eroded. This enlargement is countered by performing dilation on Fig. 9.11(c). Figure 9.11(d) shows the result. The noise components contained in the fingerprint were reduced in size or deleted completely.
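In terms of the sketches developed earlier, the complete open-close filter of this example is a single composition (here `noisy` is a hypothetical placeholder 0/1 array standing in for Fig. 9.11(a)):

```python
import numpy as np

rng = np.random.default_rng(1)
noisy = (rng.random((64, 64)) > 0.5).astype(int)  # placeholder for Fig. 9.11(a)
B = np.ones((3, 3), dtype=int)                    # structuring element of 1s

filtered = closing(opening(noisy, B), B)          # opening followed by closing
```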
V
a b
c d ACDE W (W D)
ud
e
f
FIGURE 9.12
E Origin
(a) Set A. (b) A
window, W, and C
the local back-
D
ground of D with
respect to
W, (W - D).
(c) Complement
of A. (d) Erosion
of A by D.
(e) Erosion of Ac
by (W - D).
lo Ac
(A D)
(f) Intersection of
(d) and (e), show-
C
ing the location of
the origin of D, as
desired. The dots Ac (W D)
indicate the
origins of C, D,
and E.
tu
Ac (W D)
V
(A D) (Ac [W D])
9.4 ■ The Hit-or-Miss Transformation 641
Let the origin of each shape be located at its center of gravity. Let D be enclosed by a small window, W. The local background of D with respect to W is defined as the set difference (W − D), as shown in Fig. 9.12(b). Figure 9.12(c) shows the complement of A, which is needed later. Figure 9.12(d) shows the erosion of A by D (the dashed lines are included for reference). Recall that the erosion of A by D is the set of locations of the origin of D, such that D is completely contained in A. Interpreted another way, A ⊖ D may be viewed geometrically as the set of all locations of the origin of D at which D found a match (hit) in A. Keep in mind that in Fig. 9.12 A consists only of the three disjoint sets C, D, and E.
Figure 9.12(e) shows the erosion of the complement of A by the local background set (W − D). The outer shaded region in Fig. 9.12(e) is part of the erosion. We note from Figs. 9.12(d) and (e) that the set of locations for which D exactly fits inside A is the intersection of the erosion of A by D and the erosion of A^c by (W − D), as shown in Fig. 9.12(f). This intersection is precisely the location sought. In other words, if B denotes the set composed of D and its background, the match (or set of matches) of B in A, denoted A ⊛ B, is
A ⊛ B = (A ⊖ D) ∩ [A^c ⊖ (W − D)]   (9.4-1)
We can generalize the notation somewhat by letting B = (B₁, B₂), where B₁ is the set formed from elements of B associated with an object and B₂ is the set of elements of B associated with the corresponding background. From the preceding discussion, B₁ = D and B₂ = (W − D). With this notation, Eq. (9.4-1) becomes
A ⊛ B = (A ⊖ B₁) ∩ (A^c ⊖ B₂)   (9.4-2)
By using the definition of set differences given in Eq. (2.6-19) and the dual relationship between erosion and dilation given in Eq. (9.2-5), we can write Eq. (9.4-2) as
A ⊛ B = (A ⊖ B₁) − (A ⊕ B̂₂)   (9.4-3)
However, Eq. (9.4-2) is considerably more intuitive. We refer to any of the preceding three equations as the morphological hit-or-miss transform.
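Using the erode sketch from Section 9.2.1, Eq. (9.4-2) is a one-liner (an illustration under the earlier assumptions; B1 and B2 are the object and background structuring elements as 0/1 arrays):

```python
def hit_or_miss(A, B1, B2):
    """Hit-or-miss transform per Eq. (9.4-2):
    (A eroded by B1) intersected with (A's complement eroded by B2)."""
    return erode(A, B1) & erode(1 - A, B2)
```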
The reason for using a structuring element B1 associated with objects and
an element B2 associated with the background is based on an assumed defini-
tion that two or more objects are distinct only if they form disjoint (discon-
nected) sets. This is guaranteed by requiring that each object have at least a
one-pixel-thick background around it. In some applications, we may be inter-
ested in detecting certain patterns (combinations) of 1s and 0s within a set, in
which case a background is not required. In such instances, the hit-or-miss
transform reduces to simple erosion. As indicated previously, erosion is still a
set of matches, but without the additional requirement of a background match
for detecting individual objects. This simplified pattern detection scheme is
used in some of the algorithms developed in the following section.
9.5 Some Basic Morphological Algorithms
With the preceding discussion as background, we are now ready to consider some practical uses of morphology. The examples in this section employ small binary images designed to clarify the mechanics of each morphological process as we introduce it. These images are shown graphically with 1s shaded and 0s in white.

9.5.1 Boundary Extraction
The boundary of a set A, denoted β(A), can be obtained by first eroding A by B and then performing the set difference between A and its erosion. That is,
β(A) = A − (A ⊖ B)   (9.5-1)
where B is a suitable structuring element.

FIGURE 9.13 (a) Set A. (b) Structuring element B. (c) A eroded by B. (d) Boundary, given by the set difference between A and its erosion.
FIGURE 9.14 (a) A simple binary image, with 1s represented in white. (b) Result of using Eq. (9.5-1) with the structuring element in Fig. 9.13(b).
EXAMPLE 9.5: Boundary extraction by morphological processing.
■ Figure 9.14 further illustrates the use of Eq. (9.5-1) with a 3 × 3 structuring element of 1s. As for all binary images in this chapter, binary 1s are shown in white and 0s in black, so the elements of the structuring element, which are 1s, also are treated as white. Because of the size of the structuring element used, the boundary in Fig. 9.14(b) is one pixel thick. ■
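With the erode sketch from Section 9.2.1, Eq. (9.5-1) is immediate; because the erosion is a subset of A, ordinary elementwise subtraction implements the set difference:

```python
def boundary(A, B):
    """Boundary extraction per Eq. (9.5-1): beta(A) = A - (A eroded by B)."""
    return A - erode(A, B)
```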
9.5.2 Hole Filling
A hole may be defined as a background region surrounded by a connected border of foreground pixels. Let A denote a set whose elements are 8-connected boundaries, each boundary enclosing a background region (i.e., a hole). Given a point in each hole, the objective is to fill all the holes with 1s. We begin by forming an array, X₀, of 0s (of the same size as the array containing A), except at the locations corresponding to the given point in each hole, which we set to 1. Then the following procedure fills all the holes with 1s:
X_k = (X_{k−1} ⊕ B) ∩ A^c,   k = 1, 2, 3, …   (9.5-2)
where B is the symmetric structuring element in Fig. 9.15(c). The algorithm terminates at iteration step k if X_k = X_{k−1}; the union of X_k and A then contains all the filled holes and their boundaries.
FIGURE 9.15 Hole filling. (a) Set A (shown shaded). (b) Complement of A. (c) Structuring element B. (d) Initial point inside the boundary. (e)–(h) Various steps of Eq. (9.5-2). (i) Final result [union of (a) and (h)].
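A direct Python transcription of Eq. (9.5-2), reusing the dilate sketch from Section 9.2.2 (`seeds` is a 0/1 array marking one point inside each hole):

```python
import numpy as np

def fill_holes(A, seeds, B):
    """Iterate X_k = (X_{k-1} dilated by B) intersected with A's
    complement until convergence; return the union of X and A."""
    X = seeds.copy()
    while True:
        X_next = dilate(X, B) & (1 - A)   # grow, but stay inside the holes
        if np.array_equal(X_next, X):
            return X | A                  # filled holes plus boundaries
        X = X_next
```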
EXAMPLE 9.6: Morphological hole filling.
■ Figure 9.16(a) shows an image composed of white circles with black inner spots. An image such as this might result from thresholding into two levels a scene containing polished spheres (e.g., ball bearings). The dark spots inside the spheres could be the result of reflections. The objective is to eliminate the reflections by hole filling. Figure 9.16(a) shows one point selected inside one of the spheres, and Fig. 9.16(b) shows the result of filling that component. Finally,
FIGURE 9.16 (a) Binary image (the white dot inside one of the regions is the starting point for the hole-filling algorithm). (b) Result of filling that region. (c) Result of filling all holes.
Fig. 9.16(c) shows the result of filling all the spheres. Because it must be known
whether black points are background points or sphere inner points, fully au-
tomating this procedure requires that additional “intelligence” be built into
the algorithm. We give a fully automatic approach in Section 9.5.9 based on
morphological reconstruction. (See also Problem 9.23.) ■
9.5.3 Extraction of Connected Components
The extraction of connected components from a binary image is central to many automated image analysis applications. Let A be a set containing one or more connected components, and form an array X₀ (of the same size as the array containing A) whose elements are 0s (background values), except at each location known to correspond to a point in each connected component in A, which we set to 1 (foreground value). The objective is to start with X₀ and find all the connected components. The following iterative procedure accomplishes this objective:
X_k = (X_{k−1} ⊕ B) ∩ A,   k = 1, 2, 3, …   (9.5-3)
where B is a suitable structuring element (as in Fig. 9.17). The procedure terminates when X_k = X_{k−1}, with X_k containing all the connected components
FIGURE 9.17 Extracting connected components. (a) Structuring element. (b) Array containing a set with one connected component. (c) Initial array containing a 1 in the region of the connected component. (d)–(g) Various steps in the iteration of Eq. (9.5-3).
of the input image. Note the similarity in Eqs. (9.5-3) and (9.5-2), the only difference being the use of A as opposed to A^c. This is not surprising, because here we are looking for foreground points, while the objective in Section 9.5.2 was to find background points.
Figure 9.17 illustrates the mechanics of Eq. (9.5-3), with convergence being achieved for k = 6. Note that the shape of the structuring element used is based on 8-connectivity between pixels. If we had used the SE in Fig. 9.15, which is based on 4-connectivity, the leftmost element of the connected component toward the bottom of the image would not have been detected because it is 8-connected to the rest of the figure. (See Problem 9.24 for an algorithm that does not require that a point in each connected component be known a priori.) As in the hole-filling algorithm, Eq. (9.5-3) is applicable to any finite number of connected components contained in A, assuming that a point is known in each.
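The corresponding sketch of Eq. (9.5-3) differs from the hole-filling code above only in intersecting with A instead of its complement:

```python
import numpy as np

def connected_component(A, seed, B):
    """Iterate X_k = (X_{k-1} dilated by B) intersected with A
    until convergence, per Eq. (9.5-3)."""
    X = seed.copy()
    while True:
        X_next = dilate(X, B) & A
        if np.array_equal(X_next, X):
            return X
        X = X_next
```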
EXAMPLE 9.7: Using connected components to detect foreign objects in packaged food.
■ Connected components are used frequently for automated inspection. Figure 9.18(a) shows an X-ray image of a chicken breast that contains bone fragments. It is of considerable interest to be able to detect such objects in processed food before packaging and/or shipping. In this particular case, the density of the bones is such that their nominal intensity values are different from the background. This makes extraction of the bones from the background
FIGURE 9.18 (a) X-ray image of chicken filet with bone fragments. (b) Thresholded image. (Image courtesy of NTB Elektronische Geraete GmbH, Diepholz, Germany, www.ntbxray.com.)

Connected component    No. of pixels
02                       9
03                       9
04                      39
05                     133
06                       1
07                       1
08                     743
09                       7
10                      11
11                      11
12                       9
13                       9
14                     674
15                      85
a straightforward matter by thresholding, followed by connected component extraction. There are a total of 15 connected components, with four of them being dominant in size. This is enough to determine that significant undesirable objects are contained in the original image. If needed, further characterization (such as shape) is possible using the techniques discussed in Chapter 11. ■
9.5.4 Convex Hull
A set A is said to be convex if the straight-line segment joining any two points in A lies entirely within A. The convex hull H of an arbitrary set S is the smallest convex set containing S, and the set difference H − S is called the convex deficiency of S. A simple morphological procedure for obtaining the convex hull, C(A), of a set A is as follows. Let Bⁱ, i = 1, 2, 3, 4, denote the four structuring elements in Fig. 9.19(a), and implement the equation
X_k^i = (X_{k−1}^i ⊛ Bⁱ) ∪ A,   i = 1, 2, 3, 4 and k = 1, 2, 3, …   (9.5-4)
with X_0^i = A. When the procedure converges (i.e., when X_k^i = X_{k−1}^i), we let Dⁱ = X_k^i. Then the convex hull of A is
C(A) = ∪_{i=1}^{4} Dⁱ   (9.5-5)
FIGURE 9.19 (a) Structuring elements. (b) Set A. (c)–(f) Results of convergence with the structuring elements shown in (a). (g) Convex hull. (h) Convex hull showing the contribution of each structuring element.
9.5.5 Thinning
The thinning of a set A by a structuring element B, denoted A ⊗ B, can be defined in terms of the hit-or-miss transform:
A ⊗ B = A − (A ⊛ B) = A ∩ (A ⊛ B)^c   (9.5-6)
As in the previous section, we are interested only in pattern matching with the structuring elements, so no background operation is required in the hit-or-miss transform. A more useful expression for thinning A symmetrically is based on a sequence of structuring elements:
{B} = {B¹, B², B³, …, Bⁿ}   (9.5-7)
where Bⁱ is a rotated version of B^{i−1}. Using this concept, we define thinning by a sequence of structuring elements as
A ⊗ {B} = ((…((A ⊗ B¹) ⊗ B²) …) ⊗ Bⁿ)   (9.5-8)
The process is to thin A by one pass with B¹, then thin the result with one pass of B², and so on, until A is thinned with one pass of Bⁿ. The entire process is repeated until no further changes occur. Each individual thinning pass is performed using Eq. (9.5-6).
FIGURE 9.20 Result of limiting growth of the convex hull algorithm to the maximum dimensions of the original set of points along the vertical and horizontal directions.
9.5.6 Thickening
Thickening is the morphological dual of thinning and is defined by the expression
A ⊙ B = A ∪ (A ⊛ B)   (9.5-9)
where B is a structuring element suitable for thickening. As in thinning, thickening can be defined as a sequential operation:
A ⊙ {B} = ((…((A ⊙ B¹) ⊙ B²) …) ⊙ Bⁿ)   (9.5-10)
FIGURE 9.21 (a) Sequence of rotated structuring elements used for thinning. (b) Set A. (c) Result of thinning with the first element. (d)–(i) Results of thinning with the next seven elements (there was no change between the seventh and eighth elements). (j) Result of using the first four elements again. (l) Result after convergence. (m) Conversion to m-connectivity.
The structuring elements used for thickening have the same form as those
shown in Fig. 9.21(a), but with all 1s and 0s interchanged. However, a separate
algorithm for thickening is seldom used in practice. Instead, the usual proce-
dure is to thin the background of the set in question and then complement the
result. In other words, to thicken a set A, we form C = Ac, thin C, and then
form C c. Figure 9.22 illustrates this procedure.
Depending on the nature of A, this procedure can result in disconnected
points, as Fig. 9.22(d) shows. Hence thickening by this method usually is fol-
lowed by postprocessing to remove disconnected points. Note from Fig. 9.22(c)
that the thinned background forms a boundary for the thickening process.
This useful feature is not present in the direct implementation of thickening
using Eq. (9.5-10), and it is one of the principal reasons for using background
thinning to accomplish thickening.
9.5.7 Skeletons
As Fig. 9.23 shows, the notion of a skeleton, S(A), of a set A is intuitively simple. We deduce from this figure that
(a) If z is a point of S(A) and (D)z is the largest disk centered at z and con-
C
tained in A, one cannot find a larger disk (not necessarily centered at z)
containing (D)z and included in A. The disk (D)z is called a maximum
disk.
(b) The disk (D)z touches the boundary of A at two or more different places.
FIGURE 9.22 (a) Set A. (b) Complement of A. (c) Result of thinning the complement of A. (d) Thickened set obtained by complementing (c). (e) Final result, with no disconnected points.
FIGURE 9.23 (a) Set A. (b) Various positions of maximum disks with centers on the skeleton of A. (c) Another maximum disk on a different segment of the skeleton of A. (d) Complete skeleton.
The skeleton of A can be expressed in terms of erosions and openings. That is, it can be shown (Serra [1982]) that
S(A) = ∪_{k=0}^{K} S_k(A)   (9.5-11)
with
S_k(A) = (A ⊖ kB) − (A ⊖ kB) ∘ B   (9.5-12)
where B is a structuring element and (A ⊖ kB) indicates k successive erosions of A:
(A ⊖ kB) = ((…((A ⊖ B) ⊖ B) ⊖ …) ⊖ B   (9.5-13)
k times, and K is the last iterative step before A erodes to an empty set; that is,
K = max{k | (A ⊖ kB) ≠ ∅}   (9.5-14)
Conversely, A can be reconstructed from its skeleton subsets by using
A = ∪_{k=0}^{K} (S_k(A) ⊕ kB)   (9.5-15)
where (S_k(A) ⊕ kB) denotes k successive dilations of S_k(A):
(S_k(A) ⊕ kB) = ((…((S_k(A) ⊕ B) ⊕ B) ⊕ …) ⊕ B   (9.5-16)
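A compact Python sketch of Eqs. (9.5-11) and (9.5-12), built on the erode and opening sketches from earlier sections (illustrative only):

```python
import numpy as np

def skeleton(A, B):
    """Morphological skeleton S(A) and its subsets S_k(A)."""
    S = np.zeros_like(A)
    subsets = []
    Ak = A.copy()                  # A eroded k times, starting with k = 0
    while Ak.any():
        Sk = Ak - opening(Ak, B)   # Eq. (9.5-12)
        subsets.append(Sk)
        S |= Sk                    # the union of Eq. (9.5-11)
        Ak = erode(Ak, B)
    return S, subsets
```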
EXAMPLE 9.8: Computing the skeleton of a simple figure.

■ Figure 9.24 illustrates the concepts just discussed. The first column shows the original set (at the top) and two erosions by the structuring element B. Note that one more erosion of A would yield the empty set, so K = 2 in this case. The second column shows the opening of the sets in the first column by B. These results are easily explained by the fitting characterization of the opening operation discussed in connection with Fig. 9.8. The third column simply contains the set differences between the first and second columns.
The fourth column contains two partial skeletons and the final result (at the bottom of the column). The final skeleton not only is thicker than it needs to be but, more important, it is not connected. This result is not unexpected, as nothing in the preceding formulation of the morphological skeleton guarantees connectivity. Morphology produces an elegant formulation in terms of erosions and openings of the given set. However, heuristic formulations such as the algorithm developed in Section 11.1.7 are needed if, as is usually the case, the skeleton must be maximally thin, connected, and minimally eroded.
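For concreteness, here is a minimal Python sketch of Eqs. (9.5-11) and (9.5-12), assuming SciPy is available and that $B$ is the 3 × 3 structuring element of 1s:

```python
import numpy as np
from scipy import ndimage as ndi

def morphological_skeleton(a, b=np.ones((3, 3), bool)):
    # S(A) = union over k = 0..K of S_k(A), Eq. (9.5-11), with
    # S_k(A) = (A eroded k times) minus its opening by B, Eq. (9.5-12).
    skel = np.zeros(a.shape, bool)
    eroded = a.astype(bool)
    while eroded.any():              # K is reached when one more erosion empties A
        opened = ndi.binary_opening(eroded, structure=b)
        skel |= eroded & ~opened     # accumulate S_k(A)
        eroded = ndi.binary_erosion(eroded, structure=b)
    return skel
```

Exactly as in the discussion above, nothing in this construction guarantees a connected result.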
FIGURE 9.24 Implementation of Eqs. (9.5-11) through (9.5-15). The original set is at the top left, and its morphological skeleton is at the bottom of the fourth column. The reconstructed set is at the bottom of the sixth column. (The columns show, for $k = 0, 1, 2$: $A \ominus kB$; $(A \ominus kB) \circ B$; $S_k(A)$; $\bigcup_k S_k(A)$; $S_k(A) \oplus kB$; and $\bigcup_k \bigl(S_k(A) \oplus kB\bigr)$.)
9.5.8 Pruning
Pruning methods are an essential complement to thinning and skeletonizing algorithms because these procedures tend to leave parasitic components that need to be "cleaned up" by postprocessing. We begin the discussion with a pruning problem and then develop a morphological solution based on the material introduced in the preceding sections. Thus, we take this opportunity to illustrate how to go about solving a problem by combining several of the techniques discussed up to this point.
A common approach in the automated recognition of hand-printed characters is to analyze the shape of the skeleton of each character. These skeletons often are characterized by "spurs" (parasitic components). Spurs are caused during erosion by nonuniformities in the strokes composing the characters. We develop a morphological technique for handling this problem, starting with the assumption that the length of a parasitic component does not exceed a specified number of pixels.
Figure 9.25(a) shows the skeleton of a hand-printed "a." The parasitic component on the leftmost part of the character is illustrative of what we are interested in removing. The solution is based on suppressing a parasitic branch by successively eliminating its end point. (We may define an end point as the center point of a 3 × 3 region that satisfies any of the arrangements in Figs. 9.25(b) or (c).) Of course, this also shortens (or eliminates) other branches in the character but, in the absence of other structural information, the assumption in this example is that any branch with three or fewer pixels is to be eliminated. Thinning of an input set A with a sequence of structuring elements designed to detect only end points achieves the desired result. That is, let

$$X_1 = A \otimes \{B\} \tag{9.5-17}$$
where $\{B\}$ denotes the structuring element sequence shown in Figs. 9.25(b) and (c) [see Eq. (9.5-7) regarding structuring-element sequences]. The sequence consists of two different structures, each of which is rotated 90° for a total of eight elements. The * in Fig. 9.25(b) signifies a "don't care" condition, in the sense that it does not matter whether the pixel in that location has a value of 0 or 1. Numerous results reported in the literature on morphology are based on the use of a single structuring element, similar to the one in Fig. 9.25(b), but having "don't care" conditions along the entire first column. This is incorrect. For example, such an element would identify the point located in the eighth row, fourth column of Fig. 9.25(a) as an end point, thus eliminating it and breaking connectivity in the stroke.
Applying Eq. (9.5-17) to $A$ three times yields the set $X_1$ in Fig. 9.25(d). The next step is to "restore" the character to its original form, but with the parasitic
FIGURE 9.25 (a) Original image. (b) and (c) Structuring elements used for deleting end points ($B^1$–$B^4$ are 90° rotations of one element, $B^5$–$B^8$ of the other; * indicates a "don't care" value). (d) Result of three cycles of thinning. (e) End points of (d). (f) Dilation of end points conditioned on (a). (g) Pruned image.
branches removed. To do so first requires forming a set $X_2$ containing all end points in $X_1$ [Fig. 9.25(e)]:

$$X_2 = \bigcup_{k=1}^{8} \bigl(X_1 \circledast B^k\bigr) \tag{9.5-18}$$
where the $B^k$ are the same end-point detectors shown in Figs. 9.25(b) and (c). The next step is dilation of the end points three times, using set $A$ as a delimiter:

$$X_3 = (X_2 \oplus H) \cap A \tag{9.5-19}$$

where $H$ is a 3 × 3 structuring element of 1s and the intersection with $A$ is applied after each step. (Equation (9.5-19) is the basis for morphological reconstruction by dilation.) This conditional dilation prevents the creation of 1-valued elements outside the region of interest, as Fig. 9.25(f) shows. Finally, the union

$$X_4 = X_1 \cup X_3 \tag{9.5-20}$$

yields the pruned result in Fig. 9.25(g). Use of Eq. (9.5-19) sometimes picks up the "tips" of parasitic branches when the end points of these branches are near the skeleton. Although Eq. (9.5-17) may eliminate them, they can be picked up again during dilation because they are valid points in A. Unless entire parasitic elements are picked up again (a rare case if these elements are short with respect to valid strokes), detecting and eliminating them is easy because they are disconnected regions.
A natural thought at this juncture is that there must be easier ways to solve this problem. For example, we could just keep track of all deleted points and simply reconnect the appropriate points to all end points left after application of Eq. (9.5-17). This option is valid, but the advantage of the formulation just presented is that the use of simple morphological constructs solved the entire problem. In practical situations when a set of such tools is available, the advantage is that no new algorithms have to be written. We simply combine the necessary morphological functions into a sequence of operations.
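The following Python sketch strings the four steps together. SciPy is assumed, and `end_point_pairs` stands for the eight (hit, miss) templates of Figs. 9.25(b) and (c), which are not reproduced here:

```python
import numpy as np
from scipy import ndimage as ndi

def prune(a, end_point_pairs, n=3):
    a = a.astype(bool)
    # X1: n cycles of thinning with the end-point detectors {B},
    # Eq. (9.5-17), shortening every branch by n pixels.
    x1 = a
    for _ in range(n):
        for hit, miss in end_point_pairs:
            x1 = x1 & ~ndi.binary_hit_or_miss(x1, structure1=hit,
                                              structure2=miss)
    # X2: union of the end points of X1, Eq. (9.5-18).
    x2 = np.zeros(a.shape, bool)
    for hit, miss in end_point_pairs:
        x2 |= ndi.binary_hit_or_miss(x1, structure1=hit, structure2=miss)
    # X3: dilate the end points n times, using A as a delimiter,
    # Eq. (9.5-19).
    h = np.ones((3, 3), bool)
    x3 = x2
    for _ in range(n):
        x3 = ndi.binary_dilation(x3, structure=h) & a
    # X4: pruned skeleton plus the restored branches, Eq. (9.5-20).
    return x1 | x3
```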
9.5.9 Morphological Reconstruction
The morphological concepts discussed thus far involve an image and a structuring element. Morphological reconstruction involves two images and a structuring element:† one image, the marker $F$, contains the starting points for the transformation; the other, the mask $G$, constrains the transformation. The geodesic dilation of size 1 of the marker with respect to the mask, denoted $D_G^{(1)}(F)$, is defined as

$$D_G^{(1)}(F) = (F \oplus B) \cap G \tag{9.5-21}$$

where $\cap$ denotes the set intersection (here $\cap$ may be interpreted as a logical AND, because the set intersection and logical AND operations are the same for binary sets). The geodesic dilation of size $n$ of $F$ with respect to $G$ is defined as

$$D_G^{(n)}(F) = D_G^{(1)}\bigl[D_G^{(n-1)}(F)\bigr] \tag{9.5-22}$$

with $D_G^{(0)}(F) = F$. In this recursive expression, the set intersection in Eq. (9.5-21) is performed at each step.‡ Note that the intersection operator guarantees that
† In much of the literature on morphological reconstruction, the structuring element is tacitly assumed to be isotropic and typically is called an elementary isotropic structuring element. In the context of this chapter, an example of such an SE is simply a 3 × 3 array of 1s with the origin at the center.
‡
Although it is more intuitive to develop morphological-reconstruction methods using recursive formu-
lations (as we do here), their practical implementation typically is based on more computationally effi-
cient algorithms (see, for example, Vincent [1993] and Soille [2003]). All image-based examples in this
section were generated using such algorithms.
FIGURE 9.26 Illustration of geodesic dilation. (Panels show the structuring element B; the marker F; the mask G; the marker dilated by B; and the geodesic dilation $D_G^{(1)}(F)$.)
mask G will limit the growth (dilation) of marker F. Figure 9.26 shows a sim-
ple example of a geodesic dilation of size 1. The steps in the figure are a direct
implementation of Eq. (9.5-21).
Similarly, the geodesic erosion of size 1 of marker $F$ with respect to mask $G$ is defined as

$$E_G^{(1)}(F) = (F \ominus B) \cup G \tag{9.5-23}$$

where $\cup$ denotes set union (or the OR operation). The geodesic erosion of size $n$ of $F$ with respect to $G$ is defined as

$$E_G^{(n)}(F) = E_G^{(1)}\bigl[E_G^{(n-1)}(F)\bigr] \tag{9.5-24}$$

with $E_G^{(0)}(F) = F$. The set union operation in Eq. (9.5-23) is performed at each iterative step, and guarantees that the geodesic erosion of an image remains greater than or equal to its mask image. As expected from the forms in Eqs. (9.5-21) and (9.5-23), geodesic dilation and erosion are duals with respect to set complementation (see Problem 9.29). Figure 9.27 shows a simple example of geodesic erosion of size 1. The steps in the figure are a direct implementation of Eq. (9.5-23).
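Both one-step operations translate directly into code. A minimal sketch in Python with SciPy, assuming binary NumPy arrays and the elementary 3 × 3 structuring element:

```python
import numpy as np
from scipy import ndimage as ndi

def geodesic_dilation_1(f, g, b=np.ones((3, 3), bool)):
    # D_G^(1)(F) = (F dilated by B) AND G, Eq. (9.5-21).
    return ndi.binary_dilation(f, structure=b) & g

def geodesic_erosion_1(f, g, b=np.ones((3, 3), bool)):
    # E_G^(1)(F) = (F eroded by B) OR G, Eq. (9.5-23).
    return ndi.binary_erosion(f, structure=b) | g
```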
FIGURE 9.27 Illustration of geodesic erosion. (Panels show the structuring element B; the marker F; the mask G; the marker eroded by B; and the geodesic erosion $E_G^{(1)}(F)$.)
Geodesic dilation and erosion of finite images always converge after a finite number of iterative steps, because propagation or shrinking of the marker image is constrained by the mask. The morphological reconstruction by dilation of a mask image $G$ from a marker image $F$, denoted $R_G^D(F)$, is defined as the geodesic dilation of $F$ with respect to $G$, iterated until stability; that is,

$$R_G^D(F) = D_G^{(k)}(F) \tag{9.5-25}$$

with $k$ such that $D_G^{(k)}(F) = D_G^{(k+1)}(F)$.
Figure 9.28 illustrates reconstruction by dilation. Figure 9.28(a) continues the process begun in Fig. 9.26; that is, the next step in reconstruction after obtaining $D_G^{(1)}(F)$ is to dilate this result and then AND it with the mask $G$ to yield $D_G^{(2)}(F)$, as Fig. 9.28(b) shows. Dilation of $D_G^{(2)}(F)$ and masking with $G$ then yields $D_G^{(3)}(F)$, and so on. This procedure is repeated until stability is reached. If we carried this example one more step, we would find that $D_G^{(5)}(F) = D_G^{(6)}(F)$, so the morphologically reconstructed image by dilation is given by $R_G^D(F) = D_G^{(5)}(F)$, as indicated in Eq. (9.5-25). Note that the reconstructed image in this case is identical to the mask because $F$ contained a single 1-valued pixel (this is analogous to convolution of an image with an impulse, which simply copies the image at the location of the impulse, as explained in Section 3.4.2).
In a similar manner, the morphological reconstruction by erosion of a mask image $G$ from a marker image $F$, denoted $R_G^E(F)$, is defined as the geodesic erosion of $F$ with respect to $G$, iterated until stability; that is,

$$R_G^E(F) = E_G^{(k)}(F) \tag{9.5-26}$$

with $k$ such that $E_G^{(k)}(F) = E_G^{(k+1)}(F)$. As an exercise, you should generate a figure similar to Fig. 9.28 for morphological reconstruction by erosion.
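A direct transcription of Eq. (9.5-25), iterating Eq. (9.5-21) until stability, might look as follows (Python with SciPy assumed). As the footnote points out, production code would use a more efficient algorithm; SciPy in fact ships one as `ndimage.binary_propagation`:

```python
import numpy as np
from scipy import ndimage as ndi

def reconstruct_by_dilation(f, g, b=np.ones((3, 3), bool)):
    # R_G^D(F): geodesic dilation of marker F within mask G,
    # iterated until D_G^(k)(F) = D_G^(k+1)(F).
    prev = f.astype(bool)
    while True:
        nxt = ndi.binary_dilation(prev, structure=b) & g
        if np.array_equal(nxt, prev):
            return nxt
        prev = nxt

# Equivalent, and much faster in practice:
# ndi.binary_propagation(f, structure=b, mask=g)
```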
FIGURE 9.28 Illustration of morphological reconstruction by dilation. $F$, $G$, $B$, and $D_G^{(1)}(F)$ are from Fig. 9.26. (Panels show $D_G^{(1)}(F)$ dilated by $B$; $D_G^{(2)}(F)$; $D_G^{(2)}(F)$ dilated by $B$; $D_G^{(3)}(F)$; and so on.)
Reconstruction by dilation and erosion are duals with respect to set com-
plementation (see Problem 9.30).
Sample applications
Morphological reconstruction has a broad spectrum of practical applications,
each determined by the selection of the marker and mask images, by the struc-
turing elements used, and by combinations of the primitive operations defined
in the preceding discussion. The following examples illustrate the usefulness of
these concepts.
In a morphological opening, erosion removes small objects and the subsequent dilation attempts to restore the shape of objects that remain. However, the accuracy of this restoration is highly dependent on the similarity of the shapes of the objects and the structuring element used. Opening by reconstruction restores exactly the shapes of the objects that remain after erosion. The opening by reconstruction of size $n$ of an image $F$ is defined as the reconstruction by dilation of $F$ from the erosion of size $n$ of $F$; that is,

$$O_R^{(n)}(F) = R_F^D\bigl[(F \ominus nB)\bigr] \tag{9.5-27}$$

where $(F \ominus nB)$ indicates $n$ erosions of $F$ by $B$, as explained in Section 9.5.7. Note that $F$ is used as the mask in this application. A similar expression can be written for closing by reconstruction (see Table 9.1).
Figure 9.29 shows an example of opening by reconstruction. In this illustration, we are interested in extracting from Fig. 9.29(a) the characters that contain long, vertical strokes. Opening by reconstruction requires at least one erosion, so we perform that step first. Figure 9.29(b) shows the erosion of Fig. 9.29(a) with a structuring element of size 51 × 1 pixels.
FIGURE 9.29 (a) Text image of size 918 × 2018 pixels. The approximate average height of the tall characters is 50 pixels. (b) Erosion of (a) with a structuring element of size 51 × 1 pixels. (c) Opening of (a) with the same structuring element, shown for reference. (d) Result of opening by reconstruction.
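A sketch of Eq. (9.5-27) in the same vein (SciPy assumed; the variable `text_img` and the 51 × 1 element mirror the example of Fig. 9.29 and are illustrative only):

```python
import numpy as np
from scipy import ndimage as ndi

def opening_by_reconstruction(f, se, n=1, b=np.ones((3, 3), bool)):
    # O_R^(n)(F) = R_F^D[(F eroded n times by se)]: erode, then
    # reconstruct by dilation using F itself as the mask.
    marker = f.astype(bool)
    for _ in range(n):
        marker = ndi.binary_erosion(marker, structure=se)
    return ndi.binary_propagation(marker, structure=b, mask=f)

# Keep only characters containing long vertical strokes:
# tall = opening_by_reconstruction(text_img, np.ones((51, 1), bool))
```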
Morphological reconstruction also leads to a fully automated hole-filling procedure. Let $I(x, y)$ denote a binary image and suppose that we form a marker image $F$ that is 0 everywhere, except at the image border, where it is set to $1 - I$; that is,

$$F(x, y) = \begin{cases} 1 - I(x, y) & \text{if } (x, y) \text{ is on the border of } I \\ 0 & \text{otherwise} \end{cases} \tag{9.5-28}$$

Then

$$H = \bigl[R_{I^c}^D(F)\bigr]^c \tag{9.5-29}$$

is a binary image equal to $I$ with all holes filled.
Let us consider the individual components of Eq. (9.5-29) to see how this expression in fact leads to all holes in an image being filled. Figure 9.30(a) shows a simple image $I$ containing one hole, and Fig. 9.30(b) shows its complement. Note that because the complement of $I$ sets all foreground (1-valued) pixels to background (0-valued) pixels, and vice versa, this operation in effect builds a "wall" of 0s around the hole. Because $I^c$ is used as an AND mask, all we are doing here is protecting all foreground pixels (including the wall around the hole) from changing during iteration of the procedure. Figure 9.30(c) is array $F$ formed according to Eq. (9.5-28), and Fig. 9.30(d) is $F$ dilated with a 3 × 3 SE whose elements are all 1s. Note that marker $F$ has a border of 1s, except at locations corresponding to 1s in the border of $I$. Iterating the geodesic dilation of $F$ with respect to $I^c$ until stability, and complementing the result, yields the image $H$ in Fig. 9.30(f). As desired, the hole is now filled and the rest of image $I$ was unchanged. The operation $H \cap I^c$ yields an image containing 1-valued pixels in the locations corresponding to the holes in $I$, as Fig. 9.30(g) shows.
FIGURE 9.30 Illustration of hole filling on a simple image. (Panels, left to right: $I$; $I^c$; $F$; $F \oplus B$; $(F \oplus B) \cap I^c$; $H$; and $H \cap I^c$.)
FIGURE 9.31 (a) Text image of size 918 × 2018 pixels. (b) Complement of (a) for use as a mask image. (c) Marker image. (d) Result of hole-filling using Eq. (9.5-29).
Figure 9.31 shows a more practical example. Figure 9.31(b) shows the complement of the text image in Fig. 9.31(a), and Fig. 9.31(c) is the marker image, $F$, generated using Eq. (9.5-28). This image has a border of 1s, except at locations corresponding to 1s in the border of the original image. Finally, Fig. 9.31(d) shows the image with all the holes filled.
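A compact sketch of Eqs. (9.5-28) and (9.5-29), with SciPy's `binary_propagation` performing the reconstruction by dilation:

```python
import numpy as np
from scipy import ndimage as ndi

def fill_holes(i, b=np.ones((3, 3), bool)):
    i = i.astype(bool)
    # Marker F: 1 - I on the image border, 0 elsewhere, Eq. (9.5-28).
    f = np.zeros_like(i)
    f[0, :], f[-1, :] = ~i[0, :], ~i[-1, :]
    f[:, 0], f[:, -1] = ~i[:, 0], ~i[:, -1]
    # H = [R_{I^c}^D(F)]^c, Eq. (9.5-29).
    return ~ndi.binary_propagation(f, structure=b, mask=~i)
```

SciPy's `ndimage.binary_fill_holes` implements essentially this idea directly.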
Morphological reconstruction can also be used to clear objects that touch the image border. In this application we use the original image $I$ as the mask and a marker image $F$ that equals $I$ on the border of the image and is 0 elsewhere [Eq. (9.5-30)]. The reconstruction $R_I^D(F)$ then contains only the objects touching the border, so we compute the difference

$$X = I - R_I^D(F) \tag{9.5-31}$$

to obtain an image, $X$, with no objects touching the border.
FIGURE 9.32 Border clearing. (a) Marker image. (b) Image with no objects touching the border. The original image is Fig. 9.29(a).
[Table 9.1, structuring elements I–IV: the origin of each element is at its center and the *'s indicate "don't care" values. $B^i$, $i = 1, 2, 3, 4$ (rotated 90°); $B^i$, $i = 5, 6, 7, 8$ (rotated 90°).]
As an example, consider the text image again. Figure 9.32(a) on the previous page shows the reconstruction $R_I^D(F)$ obtained using a 3 × 3 structuring element of all 1s (note the objects touching the boundary on the right side), and Fig. 9.32(b) shows image $X$, computed using Eq. (9.5-31). If the task at hand were automated character recognition, having an image in which no characters touch the border is most useful, because the problem of having to recognize partial characters (a difficult task at best) is avoided.
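A matching sketch of Eq. (9.5-31), under the same assumptions as the earlier fragments:

```python
import numpy as np
from scipy import ndimage as ndi

def clear_border(i, b=np.ones((3, 3), bool)):
    i = i.astype(bool)
    # Marker F: equal to I on the image border, 0 elsewhere.
    f = np.zeros_like(i)
    f[0, :], f[-1, :] = i[0, :], i[-1, :]
    f[:, 0], f[:, -1] = i[:, 0], i[:, -1]
    # X = I - R_I^D(F): subtract the border-connected objects.
    return i & ~ndi.binary_propagation(f, structure=b, mask=i)
```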
TABLE 9.1 Summary of morphological operations and their properties. (Continued)

Translation: $(B)_z = \{w \mid w = b + z, \text{ for } b \in B\}$. Translates the origin of $B$ to point $z$.

Boundary extraction: $\beta(A) = A - (A \ominus B)$. Set of points on the boundary of set $A$. (I)

Hole filling: $X_k = (X_{k-1} \oplus B) \cap A^c$; $k = 1, 2, 3, \ldots$ Fills holes in $A$; $X_0$ = array of 0s with a 1 in each hole. (II)

Connected components: $X_k = (X_{k-1} \oplus B) \cap A$; $k = 1, 2, 3, \ldots$ Finds connected components in $A$; $X_0$ = array of 0s with a 1 in each connected component. (I)
Geodesic dilation of size 1: $D_G^{(1)}(F) = (F \oplus B) \cap G$. $F$ and $G$ are the marker and mask images, respectively.

Geodesic dilation of size $n$: $D_G^{(n)}(F) = D_G^{(1)}\bigl[D_G^{(n-1)}(F)\bigr]$; $D_G^{(0)}(F) = F$.

Geodesic erosion of size 1: $E_G^{(1)}(F) = (F \ominus B) \cup G$.

Geodesic erosion of size $n$: $E_G^{(n)}(F) = E_G^{(1)}\bigl[E_G^{(n-1)}(F)\bigr]$; $E_G^{(0)}(F) = F$.

Morphological reconstruction by dilation: $R_G^D(F) = D_G^{(k)}(F)$. $k$ is such that $D_G^{(k)}(F) = D_G^{(k+1)}(F)$.

Morphological reconstruction by erosion: $R_G^E(F) = E_G^{(k)}(F)$. $k$ is such that $E_G^{(k)}(F) = E_G^{(k+1)}(F)$.

Opening by reconstruction: $O_R^{(n)}(F) = R_F^D\bigl[(F \ominus nB)\bigr]$. $(F \ominus nB)$ indicates $n$ erosions of $F$ by $B$.

Closing by reconstruction: $C_R^{(n)}(F) = R_F^E\bigl[(F \oplus nB)\bigr]$. $(F \oplus nB)$ indicates $n$ dilations of $F$ by $B$.

Hole filling: $H = \bigl[R_{I^c}^D(F)\bigr]^c$. $H$ is equal to the input image $I$, but with all holes filled. See Eq. (9.5-28) for $F$.