
Image Processing Lecture 1


1. Introduction
An image is a picture: a way of recording and presenting information
visually. Since vision is the most advanced of our senses, it is not
surprising that images play the single most important role in human
perception. The information that can be conveyed in images has been
known throughout the centuries to be extraordinary - one picture is worth
a thousand words.
However, unlike human beings, imaging machines can capture and
operate on images generated by sources that cannot be seen by humans.
These include X-ray, ultrasound, electron microscopy, and computer-
generated images. Thus, image processing has become an essential field
that encompasses a wide and varied range of applications.

2. Basic definitions
• Image processing is a general term for the wide range of
techniques that exist for manipulating and modifying images in
various ways.
• A digital image may be defined as a finite, discrete
representation of the original continuous image. A digital image
is composed of a finite number of elements called pixels, each
of which has a particular location and value.


• The term digital image processing refers to processing digital images by means of a digital computer.

3. Digital image processing and other related areas


There is no general agreement regarding where image processing
stops and other related areas, such as image analysis and computer vision,
start. Sometimes a distinction is made by the following paradigm:
• Image processing is a discipline in which both the input and output
of a process are images. For example, it involves primitive
operations such as image preprocessing to reduce noise and
contrast enhancement.
• Image analysis (also called image understanding) is in between
image processing and computer vision. In this area, the process is
characterized by the fact that its inputs generally are images, but its
outputs are attributes extracted from those images (e.g., edges,
contours, and the identity of individual objects). This area includes
tasks such as image segmentation (partitioning an image into
regions or objects), description of those objects to reduce them to a
form suitable for computer processing, and classification
(recognition) of individual objects.
• Finally, computer vision is a field whose ultimate goal is to use
computers to emulate human vision, including learning and being
able to make inferences of recognized objects and take actions
based on visual inputs. This area itself is a branch of artificial
intelligence (AI) whose objective is to emulate human intelligence.


4. Types of Imaging Systems


Imaging systems vary depending on their energy source (e.g.
visual, X-ray, and so on). The principal energy source for images in use
today is the electromagnetic (EM) spectrum, illustrated in Figure 1.1.
Other important sources of energy include acoustic, ultrasonic, and
electronic (in the form of electron beams used in electron microscopy).
Synthetic images, used for modeling and visualization, are generated by
computer. In this section we discuss briefly how images are generated in
these various categories and the areas in which they are applied.

Figure 1.1 The electromagnetic spectrum arranged according to energy per photon.

4.1 Gamma-ray Imaging


Gamma rays are emitted when a positron, released by certain radioactive
isotopes, collides with an electron and the two annihilate. This occurs naturally
around exploding stars, and it can also be induced artificially (as in nuclear
medicine). Images are produced from the emissions collected by gamma-ray detectors.
Major uses of gamma ray imaging include nuclear medicine and
astronomical observations. In nuclear medicine, a patient is injected with
a radioactive isotope that emits gamma rays as it decays. Figure 1.2(a)
shows a major modality of nuclear imaging called positron emission
tomography (PET) obtained by using gamma-ray imaging. The image in
this figure shows a tumor in the brain and one in the lung, easily visible
as small white masses.


Figure 1.2 Examples of gamma-ray imaging. (a) PET image. (b) Star explosion 15,000 years ago

Figure 1.2(b) shows a star that exploded about 15,000 years ago, imaged in
the gamma-ray band. Unlike the previous example shown in Figure 1.2(a),
this image was obtained using the natural radiation of the object being imaged.

4.2 X-ray Imaging


X-rays are generated using an X-ray tube (a vacuum tube with a cathode
and an anode). The cathode is heated, causing free electrons to be released;
these flow at high speed toward the positively charged anode. When the
electrons strike a nucleus, energy is released in the form of X-ray radiation.
Images are generated either by 1) exposing a film to the resulting X-ray energy
and then digitizing it, or 2) letting the X-rays fall directly onto devices that
convert X-rays to light. The light signal in turn is captured by a light-sensitive
digitizing system.


X-rays are widely used for imaging in medicine, industry and astronomy.
In medicine, the chest X-ray, illustrated in Figure 1.3(a), is widely used for
diagnostics.

(a) (b) (c)


Figure 1.3 Examples of X-Ray imaging. a) Chest X-ray. b) Circuit board. c) Star explosion

In industrial processes, X-rays are used to examine circuit boards (see
Figure 1.3(b)) for manufacturing flaws such as missing components or
broken traces. Figure 1.3(c) shows an example of X-ray imaging in
astronomy. This image is the star explosion of Figure 1.2(b), but imaged
this time in the X-ray band.

4.3 Ultraviolet Imaging


Applications of ultraviolet "light" are varied. They include industrial
inspection, fluorescence microscopy, lasers, biological imaging, and
astronomical observations. For example, Figure 1.4(a) shows a
fluorescence microscope image of normal corn, and Figure 1.4(b) shows
corn infected by "smut," a disease of corn. Figure 1.4(c) shows the entire
"oval" of the auroral emissions at Saturn's South Pole captured with
Cassini's ultraviolet imaging spectrograph.


(a) (b) (c)


Figure 1.4 Examples of ultraviolet imaging (a) Normal corn (b) Smut corn (c) Emissions at
Saturn's South Pole

4.4 Imaging in the Visible and Infrared bands


The visual band of the EM spectrum is the most familiar in all our activities
and has the widest scope of application. The infrared band is often used in
conjunction with visual imaging (multispectral imaging). Applications
include light microscopy, astronomy, remote sensing, industry, and law
enforcement. Figure 1.5(a) shows a microprocessor image magnified 60
times with a light microscope, Figure 1.5(b) illustrates an infrared
satellite image of the Americas, and Figure 1.5(c) shows a multispectral image
of a hurricane taken by a weather satellite.

Figure 1.5 Examples of visible and infrared imaging. (a) Microprocessor magnified 60 times.
(b) Infrared satellite image of the US. (c) Multispectral image of a hurricane

4.5 Imaging in the Microwave band


The dominant application of imaging in the microwave band is radar.
Imaging radar works like a flash camera in that it provides its own
illumination (microwave pulses) to illuminate an area on the ground and
take a snapshot image. Instead of a camera lens, radar uses an antenna
and digital computer processing to record its images. In a radar image,
one can see only the microwave energy that was reflected back toward
the radar antenna. Figure 1.6 shows a radar image covering a rugged
mountainous area.

Figure 1.6 Radar image of mountainous region

4.6 Imaging in the Radio band


The major applications of imaging in the radio band are in medicine and
astronomy. In medicine, radio waves are used in magnetic resonance
imaging (MRI). For MRI, the patient is placed in a powerful magnet, and
radio waves are passed through the body in short pulses. The patient's
tissues respond by emitting pulses of radio waves of their own. The location
and strength of these signals are determined by a computer, which produces
a 2D picture of a section of the patient. Figure 1.7 shows MRI images of a
human knee and spine.


(a) (b)
Figure 1.7 MRI images of a human (a) knee, and (b) spine.

4.7 Other Imaging Modalities


There are a number of other imaging modalities that also are important.
Examples include acoustic imaging, electron microscopy, and synthetic
(computer-generated) imaging.
Imaging using "sound waves" finds application in medicine,
industry and geological exploration. In medicine, ultrasound imaging is
used in obstetrics where unborn babies are imaged to determine the health
of their development. A byproduct of this examination is determining the
sex of the baby. Figure 1.8 shows examples of ultrasound imaging.
The procedure of generating ultrasound images is as follows:
1. The ultrasound system (a computer, ultrasound probe consisting of
a source and receiver, and a display) transmits high-frequency (1 to
5 MHz) sound pulses into the body.
2. The sound waves travel into the body and hit a boundary between
tissues. Then, they are reflected back and picked up by the probe
and relayed to the computer.
3. The computer calculates the distance from the probe to the tissue or
organ boundaries, and then it displays the distances and intensities
of the echoes on the screen, forming a two-dimensional image.


(a) (b)
Figure 1.8 Examples of ultrasound imaging. a) Baby b) another view of baby

Finally, Figure 1.9(a) shows an image of a damaged integrated circuit
magnified 2500 times using an electron microscope. Figure 1.9(b)
shows a fractal image generated by a computer.

(a) (b)
Figure 1.9 (a) image of damaged integrated circuit magnified 2500 times (b) fractal image


5. Digital image processing applications


Image processing is used in a wide range of applications, for example:

• Security (e.g. face, fingerprint and iris recognition)

Figure 1.10 (a) Face recognition system for a PDA. (b) Iris recognition. (c) Fingerprint recognition

• Surveillance (e.g. car number plate recognition)

Figure 1.11 Car number plate recognition

• Medical applications as shown in the previous sections


6. Components of digital image processing system


The basic model of a digital image processing system assumes the
existence of a source of energy, a sensor device to detect the
emitted/reflected energy, a coding system for the range of measurements,
and a display device. However, a modern DIP system requires powerful
computing hardware, specialized software, large storage systems and
communication devices. Figure 1.12 shows the basic components
comprising a typical general-purpose system used for digital image
processing.

Figure 1.12 Components of a general-purpose image processing system

7. Fundamental tasks in digital image processing


Image applications require a variety of techniques that can be divided into
two main categories: image processing techniques whose input and output
are images, and image analysis techniques whose inputs are images, but
whose outputs are attributes extracted from those images.
1. Image processing techniques include:
• Image Enhancement: brings out detail that is obscured, or simply
highlights certain features of interest in an image. A familiar
example of enhancement is increasing the contrast of an image.
• Image Restoration: attempts to reconstruct or recover an image that
has been degraded by using a priori knowledge of the degradation
phenomenon.
• Image Compression: deals with techniques for reducing the storage
required to save an image, or the bandwidth required to transmit it.
2. Image Analysis tasks include:
• Image Segmentation: is concerned with procedures that partition an
image into its constituent parts or objects.
• Image Representation and Description: Image representation
converts the output of a segmentation stage to a form suitable for
computer processing. This form could be either the boundary of a
region or the whole region itself. Image description, also called
feature selection, deals with extracting attributes that result in some
quantitative information of interest or are basic for differentiating
one class of objects from another.
• Image Recognition: is the process that assigns a label (e.g.,
"vehicle") to an object based on its descriptors.

Image Processing Lecture 2

Types of Digital Images


The image types we will consider are: 1) binary, 2) gray-scale, 3) color,
and 4) multispectral.

1. Binary images
Binary images are the simplest type of images and can take on two
values, typically black and white, or 0 and 1. A binary image is referred to
as a 1-bit image because it takes only 1 binary digit to represent each
pixel. These types of images are frequently used in applications where the
only information required is general shape or outline, for example optical
character recognition (OCR).
Binary images are often created from the gray-scale images via a
threshold operation, where every pixel above the threshold value is turned
white (‘1’), and those below it are turned black (‘0’). In the figure below,
we see examples of binary images.

(a) (b)

Figure 2.1 Binary images. (a) Object outline. (b) Page of text used in OCR application.

2. Gray-scale images
Gray-scale images are referred to as monochrome (one-color) images.
They contain gray-level information but no color information. The number
of bits used for each pixel determines the number of different gray levels
available. The typical gray-scale image contains 8 bits/pixel of data, which
allows 256 different gray levels. The figure below shows examples of
gray-scale images.

Figure 2.2 Examples of gray-scale images

In applications like medical imaging and astronomy, 12- or 16-bit/pixel
images are used. These extra gray levels become useful when a small
section of the image is enlarged to discern details.

3. Color images
Color images can be modeled as three-band monochrome image data,
where each band of data corresponds to a different color. The actual
information stored in the digital image data is the gray-level information
in each spectral band.
Typical color images are represented as red, green, and blue (RGB
images). Using the 8-bit monochrome standard as a model, the
corresponding color image would have 24-bits/pixel (8-bits for each of
the three color bands red, green, and blue). The figure below illustrates a
representation of a typical RGB color image.


Figure 2.3 Representation of a typical RGB color image

4. Multispectral images
Multispectral images typically contain information outside the normal
human perceptual range. This may include infrared, ultraviolet, X-ray,
acoustic, or radar data. These are not images in the usual sense because
the information represented is not directly visible by the human system.
However, the information is often represented in visual form by mapping
the different spectral bands to RGB components.

Digital Image File Formats


Types of image data are divided into two primary categories: bitmap and vector.
• Bitmap images (also called raster images) can be represented as two-dimensional functions f(x,y); they have pixel data and the corresponding gray-level values stored in some file format.
• Vector images refer to methods of representing lines, curves, and shapes by storing only the key points. These key points are sufficient to define the shapes. The process of turning them into an image is called rendering. After the image has been rendered, it can be thought of as being in bitmap format, where each pixel has specific values associated with it.

Most of the types of file formats fall into the category of bitmap images,
for example:
• PPM (Portable Pix Map) format
• TIFF (Tagged Image File Format)
• GIF (Graphics Interchange Format)
• JPEG (Joint Photographic Experts Group) format
• BMP (Windows Bitmap)
• PNG (Portable Network Graphics)
• XWD (X Window Dump)

A simple image formation model


• In a mathematical view, a monochromatic image is a two-dimensional function, f(x, y), where x and y are spatial (plane) coordinates, and the amplitude of f at any pair of coordinates (x, y) is called the intensity or gray level of the image at that point.
• The values of a monochromatic image (i.e. intensities) are said to span the gray scale.
• When x, y, and the amplitude values of f are all finite, discrete quantities, the image is called a digital image.

The function f(x, y) must be nonzero and finite; that is,


0 < f(x, y) < ∞


The function f(x, y) is the product of two components: 1) the amount of
source illumination incident on the scene, i(x, y), and 2) the amount of
illumination reflected by the objects in the scene, r(x, y):

f(x, y) = i(x, y) r(x, y)

where 0 < i(x, y) < ∞ and 0 < r(x, y) < 1.
Note that the equation 0<r(x, y)<1 indicates that reflectance is bounded
by 0 (total absorption) and 1 (total reflectance).
The nature of i(x, y) is determined by the illumination source, and r(x, y)
is determined by the characteristics of the imaged objects.
As mentioned earlier, we call the intensity of a monochrome image at any
coordinates (x_a, y_b) the gray level l of the image at that point. That is,

l = f(x_a, y_b)

From the above equations, it is evident that l lies in the range

L_min ≤ l ≤ L_max

where L_min is positive and L_max is finite.

The interval [L_min, L_max] is called the gray scale. Common practice is to
shift this interval numerically to the interval [0, L-1], where l = 0 is
considered black and l = L-1 is considered white on the gray scale. All
intermediate values are shades of gray varying from black to white.

Image Sampling and Quantization


To convert the continuous function f(x,y) to digital form, we need to
sample the function in both coordinates and in amplitude.
• Digitizing the coordinate values is called sampling.
• Digitizing the amplitude values is called quantization.


In the figure below, we show how to convert the continuous image in
Figure 2.1(a) to digital form using the sampling and quantization
processes. The one-dimensional function shown in Figure 2.1(b) is a plot
of amplitude (gray level) values of the continuous image along the line
segment AB in Figure 2.1(a).
To sample this function, we take equally spaced samples along line
AB, as shown in Figure 2.1(c). The samples are shown as small white
squares superimposed on the function. The set of these discrete locations
gives the sampled function.
In order to form a digital function, the gray-level values also must be
converted (quantized) into discrete quantities. The right side of Figure
2.1(c) shows the gray-level scale divided into eight discrete levels,
ranging from black to white. The continuous gray levels are quantized
simply by assigning one of the eight discrete gray levels to each sample.


Figure 2.1 Generating a digital image. (a) Continuous image, (b) A scan line from A to B in
the continuous image (c) Sampling and quantization, (d) Digital scan line.

The digital samples resulting from both sampling and quantization are
shown in Figure 2.1(d). Starting at the top of the image and carrying out
this procedure line by line produces a two-dimensional digital image as
shown in Figure 2.3.


Figure 2.3 Digital image resulting from sampling and quantization

Note that:
• The number of selected values in the sampling process is known as the image spatial resolution. This is simply the number of pixels relative to the given image area.
• The number of selected values in the quantization process is called the gray-level (color-level) resolution. This is expressed in terms of the number of bits allocated to the color levels.
• The quality of a digitized image depends on the resolution parameters of both processes.
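As a concrete illustration (a minimal sketch, not part of the original notes), both resolution parameters can be simulated with NumPy: spatial resolution by keeping every n-th sample, and gray-level resolution by re-quantizing to fewer levels. The array f below is a hypothetical 8-bit image.

import numpy as np

f = np.arange(256, dtype=np.uint8).reshape(16, 16)   # hypothetical 16x16 8-bit "image"

# Sampling: keep every 2nd row and column (halves the spatial resolution)
sampled = f[::2, ::2]

# Quantization: map 256 gray levels down to 8 levels (3 bits per pixel)
levels = 8
step = 256 // levels
quantized = (f // step) * step            # each pixel snapped to one of 8 levels

print(sampled.shape)                      # (8, 8)
print(np.unique(quantized).size)          # 8 distinct gray levels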

Digital Image Representation


The monochrome digital image f(x,y) resulting from sampling and
quantization has finite, discrete coordinates (x,y) and intensities (gray
levels). We shall use integer values for these discrete coordinates and
gray levels. Thus, a monochrome digital image can be represented as a
2-dimensional array (matrix) that has M rows and N columns:

f(x,y) = [ f(0,0)     f(0,1)     ...  f(0,N-1)
           f(1,0)     f(1,1)     ...  f(1,N-1)
           ...        ...             ...
           f(M-1,0)   f(M-1,1)   ...  f(M-1,N-1) ]


Each element of this matrix array is called a pixel. The spatial resolution
(number of pixels) of the digital image is M × N. The gray-level resolution
(number of gray levels) L is

L = 2^k

where k is the number of bits used to represent the gray levels of the
digital image. When an image can have 2^k gray levels, we refer to the
image as a "k-bit image". For example, an image with 256 possible gray-
level values is called an 8-bit image.
The gray levels are integers in the interval [0, L-1]. This interval is called
the gray scale.
The number, b, of bits required to store a digitized image is

b = M × N × k

Example:
For an 8-bit image of size 512×512, determine its gray-scale and storage
size.
Solution: k = 8, M = N = 512
Number of gray levels L = 2^k = 2^8 = 256
The gray scale is [0, 255]
Storage size b = M × N × k = 512 × 512 × 8 = 2,097,152 bits


Spatial and Gray-level Resolution


Spatial resolution is the smallest discernible detail in an image. It is
determined by the sampling process. The spatial resolution of a digital
image reflects the amount of details that one can see in the image (i.e. the
ratio of pixel “area” to the area of the image display). If an image is
spatially sampled at M×N pixels, then the larger M×N the finer the
observed details.
Gray-level resolution refers to the smallest discernible change in gray
level. It is determined by the quantization process. As mentioned earlier,
the number of gray levels is usually an integer power of 2. The most
common number is 8 bits, however, 16 bits is used in some applications
where enhancement of specific gray-level ranges is necessary.

Effect of reducing the spatial resolution


Decreasing the spatial resolution of a digital image, within the same area,
may result in what is known as the checkerboard pattern. Image details are
also lost when the spatial resolution is reduced.
To demonstrate the checkerboard pattern effect, we subsample the
1024×1024 image shown in the figure below to obtain an image of size
512×512 pixels. The 512×512 image is then subsampled to a 256×256
image, and so on down to a 32×32 image. Subsampling means deleting the
appropriate number of rows and columns from the original image. The
number of allowed gray levels was kept at 256 in all the images.


Figure 2.4 A 1024×1024, 8-bit image subsampled down to size 32×32 pixels.

To see the effects resulting from the reduction in the number of samples,
we bring all the subsampled images up to size 1024×1024 by row and
column pixel replication. The resulting images are shown in the figure
below.

Figure 2.5 (a) 1024×1024, 8-bit image. (b) through (f) 512×512, 256×256, 128×128, 64×64,
and 32×32 images resampled into 1024×1024 pixels by row and column duplication


Comparing Figure 2.5(a) with the 512×512 image in Figure 2.5(b), we find
that the level of detail lost is simply too fine to be seen on the printed
page at the scale in which these images are shown. Next, the 256×256
image in Figure 2.5(c) shows a very slight fine checkerboard pattern in
the borders between flower petals and the black background. A slightly
more pronounced graininess throughout the image also is beginning to
appear. These effects are much more visible in the 128×128 image in
Figure 2.5(d), and they become pronounced in the 64×64 and 32×32
images in Figures 2.5(e) and (f), respectively.

Effect of reducing the gray-level resolution


Decreasing the gray-level resolution of a digital image may result in what
is known as false contouring. This effect is caused by the use of an
insufficient number of gray levels in smooth areas of a digital image.
To illustrate the false contouring effect, we reduce the number of
gray levels of the 256-level image shown in Figure 2.6(a) from 256 to 2.
The resulting images are shown in Figures 2.6(b) through (h). This is
achieved by reducing the number of bits from k = 7 to k = 1 while
keeping the spatial resolution constant at 452×374 pixels.
We can clearly see that the 256-, 128-, and 64-level images are
visually identical. However, the 32-level image shown in Figure 2.6(d)
has an almost imperceptible set of very fine ridge-like structures in areas
of smooth gray levels (particularly in the skull). False contouring
generally is quite visible in images displayed using 16 or fewer uniformly
spaced gray levels, as the images in Figures 2.6(e) through (h) show.


(a) (b) (c) (d)

(e) (f) (g) (h)


Figure 2.6 (a) 452×374, 256-level image. (b)-(h) Image displayed in 128, 64, 32, 16, 8, 4, and
2 gray levels, while keeping the spatial resolution constant.

Image Processing Lecture 3

Example
The pixel values of the following 5×5 image are represented by 8-bit
integers:

Determine f with a gray-level resolution of 2^k for (i) k = 5 and (ii) k = 3.

Solution:
Dividing the image by 2 reduces its gray-level resolution by 1 bit.
Hence, to reduce the gray-level resolution from 8 bits to 5 bits,
8 - 5 = 3 bits must be removed.
Thus, we divide the 8-bit image by 2^3 (= 8) to get the following 5-bit
image:

Similarly, to obtain the 3-bit image, we divide the 8-bit image by 2^5 (= 32) to
get:
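(The 5×5 input matrix and the resulting 5-bit and 3-bit matrices appear as figures in the original notes and are not reproduced here.) A minimal sketch of the same reduction in NumPy, using a hypothetical 3×3 array:

import numpy as np

def reduce_gray_levels(f, k):
    # Reduce an 8-bit image f to k-bit gray-level resolution by integer division
    shift = 8 - k
    return (f // (2 ** shift)).astype(np.uint8)

f = np.array([[200, 100, 50],
              [ 25, 255,  0],
              [128,  64, 32]], dtype=np.uint8)   # hypothetical 8-bit values

print(reduce_gray_levels(f, 5))   # values now in [0, 31]
print(reduce_gray_levels(f, 3))   # values now in [0, 7]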


Basic relationships between pixels


1. Neighbors of a Pixel
A pixel p at coordinates (x, y) has the following neighbors:
• 4-neighbors: the four horizontal and vertical neighbors, whose coordinates are
  (x+1, y), (x-1, y), (x, y+1), (x, y-1)
  These are denoted by N4(p).
• Diagonal neighbors: the four neighbors whose coordinates are
  (x+1, y+1), (x+1, y-1), (x-1, y+1), (x-1, y-1)
  These are denoted by ND(p).
• 8-neighbors: both the 4-neighbors and the diagonal neighbors, denoted by N8(p).
Note: some of the neighbors of p lie outside the digital image if (x, y) is
on the border of the image.

2. Distance Measures
For pixels p, q, and z, with coordinates (x, y), (s, t), and (v, w),
respectively, D is a distance function or metric if
a) D(p, q) >= 0 ( D(p,q)=0 iff p = q ),
b) D(p,q) = D(q,p), and
c) D(p,z) <= D(p,q) + D(q,z).
• The Euclidean distance between p and q is defined as
  De(p, q) = sqrt((x - s)^2 + (y - t)^2)
• The city-block distance between p and q is defined as
  D4(p, q) = |x - s| + |y - t|
• The chessboard distance between p and q is defined as
  D8(p, q) = max(|x - s|, |y - t|)
Zooming and Shrinking Digital Images


A) Zooming
Zooming may be viewed as oversampling. It is the scaling of an image
area A of w×h pixels by a factor s while maintaining spatial resolution
(i.e. the output has sw×sh pixels). Zooming requires two steps:
• creation of new pixel locations
• assignment of gray levels to those new locations
There are many methods of gray-level assignments, for example nearest
neighbor interpolation and bilinear interpolation.

Nearest neighbor interpolation (Zero-order hold)


is performed by repeating pixel values, thus creating a checkerboard
effect. Pixel replication (a special case of nearest neighbor interpolation)
is used to increase the size of an image an integer number of times. The
example below shows 8-bit image zooming by 2x (2 times) using nearest
neighbor interpolation:

[Original image → image with rows expanded → image with rows and columns expanded; matrices shown in the original notes]
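A minimal sketch of 2x zooming by pixel replication (nearest neighbor / zero-order hold) using NumPy; the small input array is hypothetical:

import numpy as np

f = np.array([[10, 20],
              [30, 40]], dtype=np.uint8)

# Repeat every row and every column twice: each pixel becomes a 2x2 block
zoomed = np.repeat(np.repeat(f, 2, axis=0), 2, axis=1)
print(zoomed)
# [[10 10 20 20]
#  [10 10 20 20]
#  [30 30 40 40]
#  [30 30 40 40]]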


Bilinear interpolation (First-order hold)


is performed by linear interpolation between adjacent pixels, thus
creating a blurring effect. This can be done by finding the average gray
value between two adjacent pixels and using that as the pixel value between
those two. We do this for the rows first, and then we take that result and
expand the columns in the same way. The example below shows 8-bit
image zooming by 2x (2 times) using bilinear interpolation:

[Original image → image with rows expanded → image with rows and columns expanded; matrices shown in the original notes]

Note that the zoomed image has size (2M-1) × (2N-1). However, we can use
techniques such as padding, which means adding new columns and/or
rows to the original image, in order to perform bilinear interpolation and get
a zoomed image of size 2M × 2N.
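A sketch of the first-order hold described above (rows expanded by averaging, then columns), producing a (2M-1) × (2N-1) output as in the notes; the input array is hypothetical:

import numpy as np

def first_order_hold(f):
    # 2x zoom by linear interpolation between adjacent pixels
    f = f.astype(np.float64)
    m, n = f.shape
    # Expand rows: insert the average of each pair of vertically adjacent pixels
    rows = np.zeros((2 * m - 1, n))
    rows[0::2] = f
    rows[1::2] = (f[:-1] + f[1:]) / 2
    # Expand columns the same way on the row-expanded result
    out = np.zeros((2 * m - 1, 2 * n - 1))
    out[:, 0::2] = rows
    out[:, 1::2] = (rows[:, :-1] + rows[:, 1:]) / 2
    return out.round().astype(np.uint8)

f = np.array([[10, 20],
              [30, 40]], dtype=np.uint8)
print(first_order_hold(f))
# [[10 15 20]
#  [20 25 30]
#  [30 35 40]]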

The figure below shows image zooming using nearest neighbor
interpolation and bilinear interpolation. We can clearly see the
checkerboard and blurring effects. However, the improvements in overall
appearance using bilinear interpolation are clear, especially in the
128×128 and 64×64 cases. The bilinear interpolated images are smoother
than those resulting from nearest neighbor interpolation.


Figure 3.1 Top row: images zoomed from 128×128, 64×64, and 32×32 pixels to 1024×1024
pixels using nearest neighbor interpolation. Bottom row: same sequence, but using bilinear
interpolation

B) Shrinking
Shrinking may be viewed as undersampling. Image shrinking is
performed by row-column deletion. For example, to shrink an image by
one-half, we delete every other row and column.

[Original image → image with rows deleted → image with rows and columns deleted; matrices shown in the original notes]
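A one-line sketch of shrinking by one-half via row-column deletion, using a hypothetical NumPy array:

import numpy as np

f = np.arange(16, dtype=np.uint8).reshape(4, 4)
shrunk = f[::2, ::2]          # keep every other row and column
print(shrunk.shape)           # (2, 2)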


Image algebra
There are two categories of algebraic operations applied to images:
• Arithmetic
• Logic
These operations are performed on a pixel-by-pixel basis between two or
more images, except for the NOT logic operation which requires only one
image. For example, to add images I1 and I2 to create I3:
I3(x,y) = I1(x,y) + I2(x,y)

• Addition is used to combine the information in two images. Applications include the development of image restoration algorithms for modeling additive noise, and special effects such as image morphing in motion pictures, as shown in the figures below.

(a) Original image. (b) Gaussian noise. (c) Addition of images (a) and (b).
Figure 3.2 Image addition (adding noise to the image)


(a) First original. (b) Second original. (c) Addition of images (a) and (b).
Figure 3.3 Image addition (image morphing example)

• Subtraction of two images is often used to detect motion. For example, in a scene where nothing has changed, the image resulting from the subtraction is filled with zeros (a black image). If something has changed in the scene, subtraction produces a nonzero result at the location of movement, as shown in the figure below.

(a) Original scene (b) Same scene at a later time

(c) Subtracting image (b) from (a). Only moving objects appear in the resulting image

Figure 3.4 Image subtraction


• Multiplication and division are used to adjust the brightness of an image. Multiplying the pixel values by a number greater than one will brighten the image, and dividing the pixel values by a factor greater than one will darken the image. An example of brightness adjustment is shown in the figure below.

(a) Original image (b) Image multiplied by 2

(c) Image divided by 2

Figure 3.5 Image multiplication and division
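A short sketch of these arithmetic operations with NumPy. Working in a wider integer type and clipping back to [0, 255] avoids wrap-around of uint8 values; the input arrays are hypothetical.

import numpy as np

I1 = np.array([[100, 200], [50, 250]], dtype=np.uint8)
I2 = np.array([[ 30,  80], [10,  20]], dtype=np.uint8)

add      = np.clip(I1.astype(np.int16) + I2, 0, 255).astype(np.uint8)    # combine two images
diff     = np.abs(I1.astype(np.int16) - I2).astype(np.uint8)             # motion/change detection
brighter = np.clip(I1.astype(np.int16) * 2, 0, 255).astype(np.uint8)     # multiply -> brighten
darker   = (I1 // 2).astype(np.uint8)                                    # divide -> darken

print(add, diff, brighter, darker, sep="\n")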

The logic operations AND, OR, and NOT form a complete set, meaning
that any other logic operation (XOR, NOR, NAND) can be created by a
combination of these basic elements. They operate in a bit-wise fashion
on pixel data.


The AND and OR operations are used to perform masking; that is,
to select subimages in an image, as shown in the figure below.
Masking is also called Region of Interest (ROI) processing.

(a) Original image. (b) AND image mask. (c) Resulting image, (a) AND (b).
(d) Original image. (e) OR image mask. (f) Resulting image, (d) OR (e).
Figure 3.6 Image masking

The NOT operation creates a negative of the original image, as shown in
the figure below, by inverting each bit within each pixel value.


(a) Original image (b) NOT operator applied to image (a)


Figure 3.7 Complement image
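A sketch of ROI masking and the complement with NumPy bitwise operations; the arrays are hypothetical. Pixels where the mask is 255 (all bits set) are kept, pixels where it is 0 are forced to black.

import numpy as np

image = np.array([[ 12,  50,  90],
                  [200, 130,  60],
                  [ 45,  80, 220]], dtype=np.uint8)

# Mask: 255 inside the region of interest, 0 elsewhere
mask = np.array([[  0, 255, 255],
                 [  0, 255, 255],
                 [  0,   0,   0]], dtype=np.uint8)

roi      = np.bitwise_and(image, mask)   # AND: keeps gray values only inside the ROI
negative = np.bitwise_not(image)         # NOT: bitwise complement (negative image)
print(roi)
print(negative)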

Image Histogram
The histogram of a digital image is a plot that records the frequency
distribution of gray levels in that image. In other words, the histogram is
a plot of the gray-level values versus the number of pixels at each gray
value. The shape of the histogram provides us with useful information
about the nature of the image content.
The histogram of a digital image f of size M × N with gray levels
in the range [0, L-1] is the discrete function

h(rk) = nk

where rk is the kth gray level and nk is the number of pixels in the image
having gray level rk.
The next figure shows an image and its histogram.


(a) 8-bit image

(b) Histogram of image (a)

Figure 3.8 Image histogram

Note that the horizontal axis of the histogram plot (Figure 3.8(b))
represents the gray-level values rk, from 0 to 255. The vertical axis represents
the values of h(rk) = nk, i.e. the number of pixels which have the gray level rk.
The next figure shows another image and its histogram.


(a) 8-bit image

(b) Histogram of image (a)

Figure 3.9 Another image histogram

It is customary to "normalize" a histogram by dividing each of its values
by the total number of pixels in the image, i.e. to use the probability
distribution:

p(rk) = nk / (M × N)

Thus, p(rk) represents the probability of occurrence of gray level rk.


As with any probability distribution:
• all the values of a normalized histogram are less than or equal to 1
• the sum of all values is equal to 1
Histograms are used in numerous image processing techniques, such as
image enhancement, compression and segmentation.
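A minimal sketch of computing a histogram and its normalized version with NumPy; the small image array is hypothetical:

import numpy as np

image = np.array([[0, 1, 1, 2],
                  [2, 2, 3, 3],
                  [3, 3, 3, 7]], dtype=np.uint8)
L = 8                                           # number of gray levels (3-bit image)

h = np.bincount(image.ravel(), minlength=L)     # h[k] = number of pixels with gray level k
p = h / image.size                              # normalized histogram: p[k] = h[k] / (M*N)

print(h)          # [1 2 3 5 0 0 0 1]
print(p.sum())    # 1.0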

Image Processing Lecture 4

Image Enhancement
Image enhancement aims to process an image so that the output image is
“more suitable” than the original. It is used to solve some computer
imaging problems, or to improve “image quality”. Image enhancement
techniques include smoothing, sharpening, highlighting features, or
normalizing illumination for display and/or analysis.

Image Enhancement Approaches


Image enhancement approaches are classified into two categories:
• Spatial domain methods: based on direct manipulation of pixels in an image.
• Frequency domain methods: based on modifying the Fourier transform of an image.

Image Enhancement in the Spatial Domain


The term spatial domain refers to the image plane itself, i.e. the
aggregate of pixels composing an image. To enhance an image in the
spatial domain we transform it by changing pixel values or
moving them around. A spatial domain process is denoted by the
expression:

s = T(r)

where r is the input image, s is the processed image, and T is an
operator on r. The operator T is applied at each location (x, y) in r to
yield the output, s, at that location.


Enhancement using basic gray level transformations


Basic gray level transformation functions can be divided into:
• Linear: e.g. image negatives and piecewise-linear transformations
• Non-linear: e.g. logarithm and power-law transformations

Image negatives
The negative of an image with gray levels in the range [0, L-1] is
obtained by using the following expression:

s = L - 1 - r

This type of processing is useful for enhancing white or gray detail
embedded in dark regions of an image, especially when the black areas
are dominant in size.
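A one-line sketch of the negative transform for an 8-bit image (L = 256), on a hypothetical NumPy array:

import numpy as np

img = np.array([[0, 100], [200, 255]], dtype=np.uint8)   # hypothetical 8-bit image
negative = 255 - img                                      # s = L - 1 - r
print(negative)    # [[255 155] [ 55   0]]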


Piecewise-linear transformation
The form of piecewise linear functions can be arbitrarily complex. Some
important transformations can be formulated only as piecewise functions,
for example thresholding:
For any 0 < t < 255 the threshold transform ℎ can be defined as:

[Chart: Thresholding Transform – input gray level r (0-255) on the horizontal axis, output gray level s (0-255) on the vertical axis]

Figure 4.2 Form of thresholding transform

The figure below shows an example of thresholding an image by 80.

(a) Original image (b) Result of thresholding


Figure 4.3 Thresholding by 80


Thresholding has another form, used to generate binary images from
gray-scale images, i.e.:

s = 255 if r ≥ t
s = 0   if r < t
[Chart: Thresholding Transform – input gray level r (0-255) vs output gray level s (0-255)]

Figure 4.4 Form of thresholding transform to produce binary images

The figure below shows a gray-scale image and its binary image resulted
from thresholding the original by 120:

(a) (b)
Figure 4.5 Thresholding. (a) Gray-scale image. (b) Result of thresholding (a) by 120
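A sketch of producing a binary image by thresholding, as in Figure 4.5; the threshold and the small input array are hypothetical:

import numpy as np

img = np.array([[ 30, 130], [119, 240]], dtype=np.uint8)
t = 120
binary = np.where(img >= t, 255, 0).astype(np.uint8)    # s = 255 if r >= t else 0
print(binary)    # [[  0 255] [  0 255]]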

Another more complex piecewise linear function can be defined as:


[Chart: Piecewise Linear Transform – input gray level r (0-255) vs output gray level s (0-255)]

Figure 4.6 Form of previous piecewise linear transform

By applying this transform to the original image in Figure 4.3(a) we get
the following output image:

Figure 4.7 Result of thresholding


Piecewise linear functions are commonly used for contrast enhancement
and gray-level slicing.


Example:
For the following piecewise linear chart, determine the equation of
the corresponding gray-level transform:

[Chart: Piecewise Linear Transform – input gray level r (0-255) vs output gray level s (0-255)]

Solution
We use the straight line formula to compute the equation of each line
segment using two points.


Log transformation
The general form of the log transformation is

s = c * log(1 + r)

where c is a constant, and it is assumed that r ≥ 0. This transformation is
used to expand the values of dark pixels in an image while compressing
the higher-level values, as shown in the figure below.

[Chart: Log Transform – input gray level r (0-255) vs output gray level s (0-255)]

Figure 4.8 Form of Log transform


The figure below shows an example of applying Log transform.

(a) Original image (b) Result of Log transform with c = 1


Figure 4.9 Applying log transformation

Note the wealth of detail visible in the transformed image in comparison with
the original.

Power-law transformation
Power-law transformations have the basic form:

s = c * r^y

where c and y are positive constants. The power y is known as gamma;
hence this transform is also called the gamma transformation. The figure
below shows the form of a power-law transform with different gamma (y)
values.

Figure 4.10 Form of power-law transform with various gamma values (c = 1 in all cases)
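A short sketch of the log and power-law transforms on a NumPy array. The scaling conventions here (choosing c so the log output spans [0, 255], and normalizing to [0, 1] before applying the power) are common practice but are an assumption, not something the notes prescribe; the input values are hypothetical.

import numpy as np

img = np.array([[5, 60], [120, 240]], dtype=np.uint8)

# Log transform: s = c * log(1 + r), with c chosen so the output spans [0, 255]
c = 255.0 / np.log(1.0 + 255.0)
log_img = (c * np.log1p(img.astype(np.float64))).astype(np.uint8)

# Power-law (gamma) transform: s = c * r^y applied to the normalized image
y = 0.4                                  # y < 1 brightens dark images
r = img / 255.0
gamma_img = (255.0 * (r ** y)).astype(np.uint8)

print(log_img)
print(gamma_img)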

Power-law transformations are useful for contrast enhancement. The next
figure shows the use of the power-law transform with gamma values less
than 1 to enhance a dark image.

(a) (b)

(c) (d)
Figure 4.11 (a) Original MRI image of a human spine. (b)-(d) Results of applying power-law
transformation with c = 1 and y = 0.6,0.4, and 0.3, respectively.

We note that, as gamma decreased from 0.6 to 0.4, more detail became
visible. A further decrease of gamma to 0.3 enhanced a little more detail
in the background, but began to reduce contrast ("washed-out" image).

The next figure shows another example of the power-law transform, with
gamma values greater than 1, used to enhance a bright image.


(a) (b)

(c) (d)
Figure 4.12 (a) Original bright image. (b)-(d) Results of applying power-law transformation
with c = 1 and y = 3, 4, and 5, respectively.

We note that suitable results were obtained with gamma values of 3.0
and 4.0. The result obtained with y = 5.0 has areas that are too dark, in
which some detail is lost.

From the two examples, we note that:
• Dark areas become brighter and very bright areas become slightly darker.
• Faint (bright) images can be improved with y > 1, and dark images benefit from using y < 1.

Image Dynamic range, Contrast and Brightness


The dynamic range of an image is the exact subset of gray values
{0,1,…,L-1} that are present in the image. The image histogram gives a
clear indication on its dynamic range.
Image contrast is a combination of the range of intensity values
effectively used within a given image and the difference between the
image's maximum and minimum pixel values.
• When the dynamic range of an image is concentrated on the low side of the gray scale, the image will be a dark image.
• When the dynamic range of an image is biased toward the high side of the gray scale, the image will be a bright (light) image.
• An image with low contrast has a dynamic range that is narrow and centered toward the middle of the gray scale. Low-contrast images tend to have a dull, washed-out gray look, and they can result from 1) poor illumination, 2) lack of dynamic range in the imaging sensor, or 3) a wrong setting of the lens aperture at the image capturing stage.
• When the dynamic range of an image covers a significant proportion (i.e. a broad range) of the gray scale, the image is said to have a high dynamic range, and the image will have high contrast. In high-contrast images, the distribution of pixels is not too far from uniform, with very few vertical lines being much higher than the others.
The next figure illustrates a gray image shown in four basic gray-level
characteristics: dark, light, low-contrast, and high-contrast. The right side
of the figure shows the histograms corresponding to these images.



Figure 4.13 Four basic image types: dark, light, low-contrast, high-contrast, and their
corresponding histograms.

Image Processing Lecture 5

Contrast stretching
aims to increase (expand) the dynamic range of an image. It transforms
the gray levels in the range {0,1,…, L-1} by a piecewise linear function.
The figure below shows a typical transformation used for contrast
stretching.
The locations of points (r1, s1) and (r2, s2) control the shape of the transformation function.

Figure 5.1 Form of transformation function

For example, the following piecewise linear function

[Chart: Contrast Stretching – input gray level r (0-255) vs output gray level s (0-255)]

Figure 5.2 Plot of above piecewise linear function


will be used to increase the contrast of the image shown in the figure
below:

(a) (b)

(c) (d)
Figure 5.3 Contrast stretching. (a) Original image. (b) Histogram of (a). (c) Result of contrast
stretching. (d) Histogram of (c).

For a given plot, we use the equation of a straight line to compute the
piecewise linear function for each line:


For example, for the plot in Figure 5.2 and input gray values in the
interval [28, 75] we get:

y - 28 = ((255 - 28) / (75 - 28)) * (x - 28)

y = (227x - 5040) / 47    if 28 ≤ x ≤ 75

Similarly, we compute the equations of the other lines.

Another form of contrast stretching is called automatic (full) contrast
stretching, as shown in the example below:

s = 0                       if r < 90
s = (255r - 22950) / 48     if 90 ≤ r ≤ 138
s = 255                     if r > 138

[Chart: full contrast-stretching transform – input gray level r (0-255) vs output gray level s (0-255)]

Figure 5.4 Full contrast-stretching

This transform produces a high-contrast image from the low-contrast
image, as shown in the next figure.


(a) (b)

(c) (d)
Figure 5.5 (a) Low-contrast image. (b) Histogram of (a). (c) High-contrast image resulted
from applying full contrast-stretching in Figure 5.4 on (a). (d) Histogram of (c)

Gray-level slicing
Gray-level slicing aims to highlight a specific range [A…B] of gray
levels. It simply maps all gray levels in the chosen range to a high value.
Other gray levels are either mapped to a low value (Figure 5.6(a)) or left
unchanged (Figure 5.6(b)). Gray-level slicing is used for enhancing
features such as masses of water in satellite imagery. Thus it is useful for
feature extraction.


(a) (b)
Figure 5.6 Gray-level slicing

The next figure shows an example of gray level slicing:

(a) Original image

(b) Operation intensifies the desired gray-level range while preserving other values. (c) Result of applying (b) on (a) (background unchanged).


(d) Operation intensifies the desired gray-level range while changing other values to black. (e) Result of applying (d) on (a) (background changed to black).

Figure 5.7 Example of gray level slicing
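A sketch of both slicing variants for a hypothetical range [A, B] and a hypothetical array:

import numpy as np

img = np.array([[ 40, 110], [160, 220]], dtype=np.uint8)
A, B = 100, 180                      # hypothetical range of interest

in_range = (img >= A) & (img <= B)

# Variant 1: highlight the range, leave other gray levels unchanged
sliced_keep = img.copy()
sliced_keep[in_range] = 255

# Variant 2: highlight the range, set everything else to black
sliced_black = np.where(in_range, 255, 0).astype(np.uint8)

print(sliced_keep)
print(sliced_black)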

Enhancement through Histogram Manipulation


Histogram manipulation aims to determine a gray-level transform that
produces an enhanced image that has a histogram with desired properties.
We study two histogram manipulation techniques namely Histogram
Equalization (HE) and Histogram Matching (HM).

Histogram Equalization
is an automatic enhancement technique which produces an output
(enhanced) image that has a near uniformly distributed histogram.
For continuous functions, the intensity (gray level) in an image
may be viewed as a random variable with its probability density function
(PDF). The PDF at a gray level r represents the expected proportion
(likelihood) of occurrence of gray level r in the image. A transformation
function has the form

s = T(r) = (L-1) * ∫ from 0 to r of pr(w) dw

where w is a dummy variable of integration. The right side of this equation is
(apart from the scale factor L-1) the cumulative distribution function (CDF)
of the random variable r.
For discrete gray level values, we deal with probabilities (histogram
values) and summations instead of probability density functions and
integrals. Thus, the transform will be:

sk = T(rk) = (L-1) * Σ from j=0 to k of pr(rj),    k = 0, 1, ..., L-1

The right side of this equation is known as the cumulative histogram of
the input image. This transformation is called histogram equalization or
histogram linearization.
Because a histogram is an approximation to a continuous PDF, perfectly
flat histograms are rare in applications of histogram equalization. Thus,
the histogram equalization results in a near uniform histogram. It spreads
the histogram of the input image so that the gray levels of the equalized
(enhanced) image span a wider range of the gray scale. The net result is
contrast enhancement.


Example:
Suppose that a 3-bit image (L = 8) of size 64 × 64 pixels has the gray
level (intensity) distribution shown in the table below.

rk nk
r0 = 0 790
r1 = 1 1023
r2 = 2 850
r3 = 3 656
r4 = 4 329
r5 = 5 245
r6 = 6 122
r7 = 7 81
Perform histogram equalization on this image, and draw its normalized
histogram, transformation function, and the histogram of the equalized
image.

Solution:
M × N = 4096
We compute the normalized histogram:

rk nk pr (rk ) = nk /MN
r0 = 0 790 0.19
r1 = 1 1023 0.25
r2 = 2 850 0.21
r3 = 3 656 0.16
r4 = 4 329 0.08
r5 = 5 245 0.06
r6 = 6 122 0.03
r7 = 7 81 0.02


Normalized histogram

Then we find the transformation function:

Transformation function
We round the values of s to the nearest integer:


These are the values of the equalized histogram. Note that there are only
five gray levels.

rk nk sk New nk pr (rk ) = New nk /MN


r0 = 0 790 s0 = 1 790 0.19
r1 = 1 1023 s1 = 3 1023 0.25
r2 = 2 850 s2 = 5 850 0.21
r3 = 3 656 s3 = 6 985 0.24
r4 = 4 329 s4 = 6
r5 = 5 245 s5 = 7 448 0.11
r6 = 6 122 s6 = 7
r7 = 7 81 s7 = 7
Thus, the histogram of the equalized image can be drawn as follows:

Histogram of equalized image
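The same computation can be scripted; a minimal sketch that reproduces the 3-bit example above (histogram counts copied from the table):

import numpy as np

L = 8
nk = np.array([790, 1023, 850, 656, 329, 245, 122, 81], dtype=np.float64)
pr = nk / nk.sum()                                    # normalized histogram
sk = np.round((L - 1) * np.cumsum(pr)).astype(int)    # equalization transform
print(sk)                                             # [1 3 5 6 6 7 7 7]

# Histogram of the equalized image: accumulate counts of levels mapped together
new_hist = np.zeros(L)
for k in range(L):
    new_hist[sk[k]] += nk[k]
print(new_hist / nk.sum())   # approx. [0 0.19 0 0.25 0 0.21 0.24 0.11]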

The next figure shows the results of performing histogram equalization
on dark, light, low-contrast, and high-contrast gray images.


Figure 5.8 Left column original images. Center column corresponding histogram equalized
images. Right column histograms of the images in the center column.


Although all the histograms of the equalized images are different, these
images themselves are visually very similar. This is because the
difference between the original images is simply one of contrast, not of
content.
However, in some cases histogram equalization does not lead to a
successful result as shown below.

(a) Original image (b) Histogram of (a)


Figure 5.9 Image of Mars moon and its histogram

The result of performing histogram equalization on the above image is
shown in the figure below.

(a) Result of applying HE (b) Histogram of (a)


Figure 5.10 Result of applying HE on Figure 5.9 (a)


We clearly see that histogram equalization did not produce a good result
in this case. We see that the intensity levels have been shifted to the upper
one-half of the gray scale, thus giving the image a washed-out
appearance. The cause of the shift is the large concentration of dark
components at or near 0 in the original histogram. In turn, the cumulative
transformation function obtained from this histogram is steep, as shown in
the figure below, thus mapping the large concentration of pixels in the
low end of the gray scale to the high end of the scale.

Figure 5.11 HE transformation function of Figure 5.10(a)

In other cases, HE may introduce noise and other undesired effect to the
output images as shown in the figure below.

(a) Original image (b) Result of applying HE on (a)


(c) Original image (d) Result of applying HE on (c)


Figure 5.12 Undesired effects caused by HE

These undesired effects are a consequence of digitization: when the
continuous operations are digitized, rounding leads to approximations.
From the previous examples, we conclude that the effect of HE differs
from one image to another depending on global and local variation in the
brightness and in the dynamic range.

Histogram Matching (Specification)


is another histogram manipulation process which is used to generate a
processed image that has a specified histogram. In other words, it enables
us to specify the shape of the histogram that we wish the processed image
to have. It aims to transform an image so that its histogram nearly
matches that of another given image. It involves the sequential
application of a HE transform of the input image followed by the inverse
of a HE transform of the given image.

The procedure of Histogram Specification is as follows:


1. Compute the histogram pr(r) of the input image, and use it to find
the histogram equalization transformation

   sk = (L-1) * Σ from j=0 to k of pr(rj)

   Then round the resulting values, sk, to the integer range [0, L-1].
2. Compute the specified histogram pz(z) of the given image, and use
it to find the transformation function G using

   G(zq) = (L-1) * Σ from i=0 to q of pz(zi)

   Then round the values of G to integers in the range [0, L-1]. Store
   the values of G in a table.
3. Perform inverse mapping. For every value of sk, use the stored
values of G from step 2 to find the corresponding value of zq so
that G(zq ) is closest to sk and store these mappings from s to z.
When more than one value of zq satisfies the given sk (i.e. the
mapping is not unique), choose the smallest value.
4. Form the output image by first histogram-equalizing the input
image and then mapping every equalized pixel value, sk, of this
image to the corresponding value zq in the histogram-specified
image using the inverse mappings in step 3.
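A sketch of the four steps for discrete histograms, working directly on histogram values rather than a full image. The input histogram below is the one from the example that follows; the specified histogram pz is a hypothetical choice whose G values match the table given in that example.

import numpy as np

def histogram_matching_map(pr, pz, L=8):
    # Step 1: histogram-equalization transform of the input histogram
    s = np.round((L - 1) * np.cumsum(pr)).astype(int)
    # Step 2: transform G obtained from the specified histogram
    G = np.round((L - 1) * np.cumsum(pz)).astype(int)
    # Step 3: inverse mapping - for each s_k pick the smallest z with G(z) closest to s_k
    mapping = np.zeros(L, dtype=int)
    for k in range(L):
        mapping[k] = np.argmin(np.abs(G - s[k]))
    # Step 4: apply mapping[r] to every pixel of the input image (not shown here)
    return mapping

pr = np.array([790, 1023, 850, 656, 329, 245, 122, 81]) / 4096.0
pz = np.array([0.00, 0.00, 0.00, 0.15, 0.20, 0.30, 0.20, 0.15])   # assumed specified histogram
print(histogram_matching_map(pr, pz))   # [3 4 5 6 6 7 7 7]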


Example:
Suppose we have the 3-bit image of size 64 × 64 pixels with the gray-level
distribution shown in the table below, and the specified histogram shown below.

rk nk
r0 = 0 790
r1 = 1 1023
r2 = 2 850
r3 = 3 656
r4 = 4 329
r5 = 5 245
r6 = 6 122
r7 = 7 81
Perform histogram specification on the image, and draw its normalized
histogram, specified transformation function, and the histogram of the
output image.

Solution:
Step 1:
M × N = 4096
We compute the normalized histogram:

rk nk pr (rk ) = nk /MN
r0 = 0 790 0.19
r1 = 1 1023 0.25
r2 = 2 850 0.21
r3 = 3 656 0.16
r4 = 4 329 0.08
r5 = 5 245 0.06
r6 = 6 122 0.03
r7 = 7 81 0.02


Normalized histogram

Then we find the histogram-equalized values:

We round the values of s to the nearest integer:

Step 2:
We compute the values of the transformation G


zq G (zq )
z0 = 0 0
z1 = 1 0
z2 = 2 0
z3 = 3 1
z4 = 4 2
z5 = 5 5
z6 = 6 6
z7 = 7 7
Transformation function obtained
from the specified histogram

Step 3:
We find the corresponding value of zq so that the value G(zq ) is the
closest to sk.


Step 4:

rk nk sk New nk pr (rk ) = New nk /MN


r0 = 0 790 s0 = 3 790 0.19
r1 = 1 1023 s1 = 4 1023 0.25
r2 = 2 850 s2 = 5 850 0.21
r3 = 3 656 s3 = 6 985 0.24
r4 = 4 329 s4 = 6
r5 = 5 245 s5 = 7 448 0.11
r6 = 6 122 s6 = 7
r7 = 7 81 s7 = 7

Thus, the histogram of the output image can be drawn as follows:

To see an example of histogram specification, we consider again the
image below.

(a) Original image (b) Histogram of (a)


Figure 5.13 Image of Mars moon and its histogram


We use the specified histogram shown in the figure below to perform
histogram specification on the image in the previous figure.

Figure 5.14 Specified histogram for image in Figure 5.13(a)

The output image of histogram specification is shown below.

(a) Result of applying HS (b) Histogram of (a)


Figure 5.15 Result of applying HS on Figure 5.13 (a)

Image Processing Lecture 6

Filtering in the spatial domain (Spatial Filtering)


refers to image operators that change the gray value at any pixel (x,y)
depending on the pixel values in a square neighborhood centered at (x,y)
using a fixed integer matrix of the same size. The integer matrix is called
a filter, mask, kernel or a window.
The mechanism of spatial filtering, shown below, consists simply
of moving the filter mask from pixel to pixel in an image. At each pixel
(x,y), the response of the filter at that pixel is calculated using a
predefined relationship (linear or nonlinear).

Figure 6.1 Spatial filtering

Note:
The size of mask must be odd (i.e. 3×3, 5×5, etc.) to ensure it has a
center. The smallest meaningful size is 3×3.


Linear Spatial Filtering (Convolution)


The process consists of moving the filter mask from pixel to pixel in an
image. At each pixel (x,y), the response is given by a sum of products of
the filter coefficients and the corresponding image pixels in the area
spanned by the filter mask.
For the 3×3 mask shown in the previous figure, the result (or response),
R, of linear filtering is:

R = w(-1,-1) f(x-1,y-1) + w(-1,0) f(x-1,y) + ... + w(0,0) f(x,y) + ... + w(1,0) f(x+1,y) + w(1,1) f(x+1,y+1)

In general, linear filtering of an image f of size M × N with a filter mask of
size m × n is given by the expression:

g(x, y) = Σ from s=-a to a  Σ from t=-b to b  w(s, t) f(x+s, y+t)

where a = (m-1)/2 and b = (n-1)/2. To generate a complete filtered
image this equation must be applied for x = 0, 1, 2, ..., M-1 and y = 0, 1,
2, ..., N-1.

Nonlinear Spatial Filtering


The operation also consists of moving the filter mask from pixel to pixel
in an image. The filtering operation is based conditionally on the values
of the pixels in the neighborhood, and they do not explicitly use
coefficients in the sum-of-products manner.
For example, noise reduction can be achieved effectively with a
nonlinear filter whose basic function is to compute the median gray-level
value in the neighborhood in which the filter is located. Computation of
the median is a nonlinear operation.


Example:
Use the following 3×3 mask to perform the convolution process on the
shaded pixels in the 5×5 image below. Write the filtered image.

3×3 mask:
0    1/6  0
1/6  1/3  1/6
0    1/6  0

5×5 image:
30   40   50   70   90
40   50   80   60   100
35   255  70   0    120
30   45   80   100  130
40   50   90   125  140

Solution:
0×30 + (1/6)×40 + 0×50 + (1/6)×40 + (1/3)×50 + (1/6)×80 + 0×35 + (1/6)×255 + 0×70 = 85
0×40 + (1/6)×50 + 0×70 + (1/6)×50 + (1/3)×80 + (1/6)×60 + 0×255 + (1/6)×70 + 0×0 = 65
0×50 + (1/6)×70 + 0×90 + (1/6)×80 + (1/3)×60 + (1/6)×100 + 0×70 + (1/6)×0 + 0×120 = 61
0×40 + (1/6)×50 + 0×80 + (1/6)×35 + (1/3)×255 + (1/6)×70 + 0×30 + (1/6)×45 + 0×80 = 118
(fractional results are truncated to integers)
and so on …

30 40 50 70 90
40 85 65 61 100
Filtered image = 35 118 92 58 120
30 84 77 89 130
40 50 90 125 140

Page 3
Image Processing Lecture 6

Spatial Filters
Spatial filters can be classified by effect into:
1. Smoothing Spatial Filters: also called lowpass filters. They include:
1.1 Averaging linear filters
1.2 Order-statistics nonlinear filters.
2. Sharpening Spatial Filters: also called highpass filters. For example,
the Laplacian linear filter.

Smoothing Spatial Filters


are used for blurring and for noise reduction. Blurring is used in
preprocessing steps to:
• remove small details from an image prior to (large) object extraction
• bridge small gaps in lines or curves.
Noise reduction can be accomplished by blurring with a linear filter and
also by nonlinear filtering.

Averaging linear filters


The response of averaging filter is simply the average of the pixels
contained in the neighborhood of the filter mask.
The output of averaging filters is a smoothed image with reduced "sharp"
transitions in gray levels.
Noise and edges consist of sharp transitions in gray levels. Thus
smoothing filters are used for noise reduction; however, they have the
undesirable side effect that they blur edges.

Page 4


Image Processing Lecture 6

The figure below shows two 3×3 averaging filters.

Standard average filter (× 1/9):      Weighted average filter (× 1/16):
1 1 1                                 1 2 1
1 1 1                                 2 4 2
1 1 1                                 1 2 1
Note:
Weighted average filter has different coefficients to give more
importance (weight) to some pixels at the expense of others. The idea
behind that is to reduce blurring in the smoothing process.

Averaging linear filtering of an image f of size M×N with a filter mask of size m×n is given by the expression:

g(x, y) = [ Σs Σt w(s, t) f(x + s, y + t) ] / [ Σs Σt w(s, t) ]      (s = −a, …, a;  t = −b, …, b)

To generate a complete filtered image this equation must be applied for x = 0, 1, 2, ..., M−1 and y = 0, 1, 2, ..., N−1.

Figure below shows an example of applying the standard averaging filter.

Page 5
Image Processing Lecture 6

(a) (b)

(c) (d)

(e) (f)
Figure 6.2 Effect of averaging filter. (a) Original image. (b)-(f) Results of smoothing with
square averaging filter masks of sizes n = 3,5,9,15, and 35, respectively.

Page 6
Image Processing Lecture 6

As shown in the figure, the effects of averaging linear filter are:


1. Blurring which is increased whenever the mask size increases.
2. Blending (removing) small objects with the background. The size
of the mask establishes the relative size of the blended objects.
3. A dark border, caused by padding the borders of the original image (e.g. with zeros) before filtering.
4. Reduced image quality.

Order-statistics filters
are nonlinear spatial filters whose response is based on ordering (ranking)
the pixels contained in the neighborhood, and then replacing the value of
the center pixel with the value determined by the ranking result.
Examples include Max, Min, and Median filters.

Median filter
It replaces the value at the center by the median pixel value in the
neighborhood, (i.e. the middle element after they are sorted). Median
filters are particularly useful in removing impulse noise (also known as
salt-and-pepper noise). Salt = 255, pepper = 0 gray levels.
In a 3×3 neighborhood the median is the 5th largest value, in a 5×5
neighborhood the 13th largest value, and so on.
For example, suppose that a 3×3 neighborhood has gray levels (10,
20, 0, 20, 255, 20, 20, 25, 15). These values are sorted as
(0,10,15,20,20,20,20,25,255), which results in a median of 20 that
replaces the original pixel value 255 (salt noise).

Page 7
Image Processing Lecture 6

Example:
Consider the following 5×5 image:
20 30 50 80 100
30 20 80 100 110
25 255 70 0 120
30 30 80 100 130
40 50 90 125 140
Apply a 3×3 median filter on the shaded pixels, and write the filtered
image.

Solution
20 30 50 80 100 20 30 50 80 100
30 20 80 100 110 30 20 80 100 110
25 255 70 0 120 25 255 70 0 120
30 30 80 100 130 30 30 80 100 130
40 50 90 125 140 40 50 90 125 140
Sort: Sort
20, 25, 30, 30, 30, 70, 80, 80, 255 0, 20, 30, 70, 80, 80, 100, 100, 255

20 30 50 80 100
30 20 80 100 110
25 255 70 0 120
30 30 80 100 130
40 50 90 125 140
Sort
0, 70, 80, 80, 100, 100, 110, 120, 130

20 30 50 80 100
30 20 80 100 110
Filtered Image = 25 30 80 100 120
30 30 80 100 130
40 50 90 125 140
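The same computation can be scripted; the sketch below (Python/NumPy assumed, the function name is hypothetical) applies a 3×3 median filter at the three shaded positions, which sit in the third row of the image, and reproduces the values 30, 80 and 100 found above:

```python
import numpy as np

def median_at(f, x, y, size=3):
    """Median of the size x size neighborhood centered at (x, y)."""
    k = size // 2
    region = f[x - k:x + k + 1, y - k:y + k + 1]
    return int(np.median(region))

img = np.array([[20,  30, 50,  80, 100],
                [30,  20, 80, 100, 110],
                [25, 255, 70,   0, 120],
                [30,  30, 80, 100, 130],
                [40,  50, 90, 125, 140]])

# the three shaded pixels of the example lie in row index 2
print([median_at(img, 2, c) for c in (1, 2, 3)])   # -> [30, 80, 100]
```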

Page 8
Image Processing Lecture 6

Figure below shows an example of applying the median filter on an


image corrupted with salt-and-pepper noise.

(a) (b)

(c)
Figure 6.3 Effect of median filter. (a) Image corrupted by salt & pepper noise. (b) Result of
applying 3×3 standard averaging filter on (a). (c) Result of applying 3×3 median filter on (a).

As shown in the figure, the effects of median filter are:


1. Noise reduction
2. Less blurring than averaging linear filter

Page 9
Image Processing Lecture 6

Sharpening Spatial Filters


Sharpening aims to highlight fine details (e.g. edges) in an image, or
enhance detail that has been blurred through errors or imperfect capturing
devices.
Image blurring can be achieved using averaging filters, and hence
sharpening can be achieved by operators that invert averaging operators.
In mathematics, averaging is equivalent to the concept of integration, and
differentiation inverts integration. Thus, sharpening spatial filters can be
represented by partial derivatives.

Partial derivatives of digital functions


The first order partial derivatives of the digital image f(x,y) are:

∂f/∂x = f(x+1, y) − f(x, y)      and      ∂f/∂y = f(x, y+1) − f(x, y)

The first derivative must be:


1) zero along flat segments (i.e. constant gray values).
2) non-zero at the outset of gray level step or ramp (edges or
noise)
3) non-zero along segments of continuing changes (i.e. ramps).

The second order partial derivatives of the digital image f(x,y) are:

∂²f/∂x² = f(x+1, y) + f(x−1, y) − 2 f(x, y)

∂²f/∂y² = f(x, y+1) + f(x, y−1) − 2 f(x, y)

The second derivative must be:


1) zero along flat segments.
2) nonzero at the outset and end of a gray-level step or ramp;

Page 10
Image Processing Lecture 6

3) zero along ramps

Consider the example below:

Figure 6.4 Example of partial derivatives

We conclude that:
• 1st derivative detects thick edges while 2nd derivative detects thin
edges.
• 2nd derivative has much stronger response at gray-level step than 1st
derivative.
Thus, we can expect a second-order derivative to enhance fine detail (thin
lines, edges, including noise) much more than a first-order derivative.

Page 11
Image Processing Lecture 6

The Laplacian Filter


The Laplacian operator of an image f(x,y) is:

∇²f = ∂²f/∂x² + ∂²f/∂y²

This equation can be implemented using the 3×3 mask:


−1 −1 −1
−1 8 −1
−1 −1 −1
Since the Laplacian filter is a linear spatial filter, we can apply it using
the same mechanism of the convolution process. This will produce a
laplacian image that has grayish edge lines and other discontinuities, all
superimposed on a dark, featureless background.
Background features can be "recovered" while still preserving the
sharpening effect of the Laplacian operation simply by adding the
original and Laplacian images.
g(x, y) = f(x, y) + ∇²f(x, y)
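A minimal sketch of this sharpening step follows (Python/NumPy with SciPy's convolve is assumed; the function name is hypothetical and the clipping to the 8-bit range is an illustrative choice):

```python
import numpy as np
from scipy.ndimage import convolve   # assuming SciPy is available

laplacian_mask = np.array([[-1, -1, -1],
                           [-1,  8, -1],
                           [-1, -1, -1]])

def laplacian_sharpen(f):
    """g(x,y) = f(x,y) + Laplacian(f)(x,y), clipped back to the 8-bit range."""
    f = f.astype(float)
    lap = convolve(f, laplacian_mask, mode='nearest')   # the Laplacian image
    g = f + lap                                         # add it back to recover the background
    return np.clip(g, 0, 255).astype(np.uint8)
```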
The figure below shows an example of using Laplacian filter to sharpen
an image.

Page 12
Image Processing Lecture 6

(a) (b)

(c)

Figure 6.5 Example of applying Laplacian filter. (a) Original image. (b) Laplacian image.
(c) Sharpened image.

Page 13
Image Processing Lecture 7

Image Enhancement in the Frequency Domain


• The frequency content of an image refers to the rate at which the
gray levels change in the image.
• Rapidly changing brightness values correspond to high frequency
terms, slowly changing brightness values correspond to low
frequency terms.
• The Fourier transform is a mathematical tool that decomposes a signal (e.g. an image) into its spectral components according to their frequencies (i.e. its frequency content).

2D Discrete Fourier Transform


The DFT of a digitized function f(x,y) (i.e. an image) is defined as:

F(u, v) = (1/MN) Σx Σy f(x, y) [ cos 2π(ux/M + vy/N) − j sin 2π(ux/M + vy/N) ]

The domain of u and v values u = 0, 1, ..., M-1, v = 0,1,…, N-1 is called


the frequency domain of f(x,y).

R(u, v) = (1/MN) Σx Σy f(x, y) cos 2π(ux/M + vy/N)      is called the real part

I(u, v) = −(1/MN) Σx Σy f(x, y) sin 2π(ux/M + vy/N)     is called the imaginary part

The magnitude of F(u,v), |F(u,v)| = [R²(u,v) + I²(u,v)]^(1/2), is called the Fourier spectrum of the transform.

Page 1
Image Processing Lecture 7

The phase angle (phase spectrum) of the transform is:

φ(u, v) = tan⁻¹ [ I(u, v) / R(u, v) ]
Note that, F(0,0) = the average value of f(x,y) and is referred to as the dc
component of the spectrum.
It is a common practice to multiply the image f(x,y) by (−1)^(x+y). In this case, the DFT of f(x,y)(−1)^(x+y) has its origin located at the centre of the image, i.e. at (u,v) = (M/2, N/2).

The figure below shows a gray image and its centered Fourier spectrum.

(a)

(b)
Figure 7.1 (a) Gray image. (b) Centered Fourier spectrum of (a)

Page 2
Image Processing Lecture 7

The original image contains two principal features: edges run


approximately at ±45°.
The Fourier spectrum shows prominent components in the same
directions.

Phase spectrum
Phase data contains information about where objects are in the image, i.e.
it holds spatial information as shown in the Figure below.

(a) Original image (b) Phase only image

(c) Contrast enhanced version of image (b) to show detail


Figure 7.2 Phase spectrum

Fourier transform does not provide simultaneously frequency as well as


spatial information.

Page 3
Image Processing Lecture 7

Inverse 2D-DFT
After performing the Fourier transform, if we want to convert the image
from the frequency domain back to the original spatial domain, we apply
the inverse transform. The inverse 2D-DFT is defined as:

f(x, y) = Σu Σv F(u, v) [ cos 2π(ux/M + vy/N) + j sin 2π(ux/M + vy/N) ]

where x = 0,1,…, M-1 and y = 0,1,…, N-1.

Frequency domain vs. Spatial domain


   Frequency domain                            Spatial domain

1. results from the Fourier transform          results from sampling and quantization

2. refers to the space defined by the          refers to the image plane itself, i.e. the
   values of the Fourier transform and         collection of pixels composing an image,
   its frequency variables (u, v)              each with spatial coordinates (x, y)

3. has complex quantities                      has integer quantities

Page 4
Image Processing Lecture 7

Filtering in the Frequency Domain


Filtering in the frequency domain aims to enhance an image through
modifying its DFT. Thus, there is a need for an appropriate filter function
H(u,v).
The filtering of an image f(x,y) works in the following steps:
1. Compute the centered DFT, F(u, v) = ℑ[(−1)^(x+y) f(x, y)]
2. Compute G(u, v) = F(u, v) H(u, v).
3. Compute the inverse DFT of G(u,v), ℑ⁻¹[G(u, v)].
4. Obtain the real part R of ℑ⁻¹[G(u, v)].
5. Compute the filtered image g(x, y) = (−1)^(x+y) R{ℑ⁻¹[G(u, v)]}.

Generally, the inverse DFT is a complex-valued function. However, when


f(x,y) is real then the imaginary part of the inverse DFT vanishes. Thus,
for images step 4, above, does not apply.
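These steps map directly onto standard FFT routines. A minimal sketch (Python/NumPy assumed; fftshift/ifftshift are used here for the centering, which has the same effect as multiplying by (−1)^(x+y)) is:

```python
import numpy as np

def frequency_filter(f, H):
    """Filter image f with a centered transfer function H(u, v) of the same size."""
    F = np.fft.fftshift(np.fft.fft2(f))      # step 1: centered DFT
    G = H * F                                # step 2: multiply by the filter
    g = np.fft.ifft2(np.fft.ifftshift(G))    # step 3: inverse DFT (uncentered first)
    return np.real(g)                        # steps 4-5: keep the real part
```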
The figure below illustrates the filtering in the frequency domain.

Figure 7.3 Basic steps for filtering in the frequency domain

Page 5
Image Processing Lecture 7

Low-pass and High-pass filtering


• Low frequencies in the DFT spectrum correspond to image values
over smooth areas, while high frequencies correspond to detailed
features such as edges & noise.
• A filter that suppresses high frequencies but allows low ones is
called Low-pass filter, while a filter that reduces low frequencies
and allows high ones is called High-pass filter.
• Examples of such filters are obtained from circular Gaussian
functions of 2 variables:

H(u, v) = e^(−(u² + v²)/2σ²)           Low-pass filter

H(u, v) = 1 − e^(−(u² + v²)/2σ²)       High-pass filter

The results of applying these two filters on the image in Figure 7.1(a) are shown in the figure below.

(a) Low-pass filter function (b) Result of lowpass filtering

Page 6
Image Processing Lecture 7

(c) Highpass filter function (d) Result of highpass filtering


Figure 7.4 Low-pass and High-pass Filtering

Low-pass filtering results in blurring effects, while High-pass filtering


results in sharper edges.
In the last example, the highpass filtered image has little smooth gray-
level detail as a result of setting F(0,0) to 0. This can be improved by
adding a constant to the filter, for example we add 0.75 to the previous
highpass filter to obtain the following sharp image.

Figure 7.5 Result of highpass filter modified by adding 0.75 to the filter

Page 7
Image Processing Lecture 8

Smoothing frequency domain filters


Ideal Lowpass Filter (ILPF)
ILPF is the simplest lowpass filter that “cuts off” all high frequency
components of the DFT that are at a distance greater than a specified
distance D0 from the origin of the (centered) transform.
The transfer function of this filter is:

H(u, v) = 1   if D(u, v) ≤ D0
H(u, v) = 0   if D(u, v) > D0

where D0 is the cutoff frequency, and D(u, v) = [(u − M/2)² + (v − N/2)²]^(1/2)

(a) (b) (c)


Figure 8.1 (a) ILPF transfer function. (b) ILPF as an image. (c) ILPF radial cross section

The ILPF indicates that all frequencies inside a circle of radius D0 are
passed with no attenuation, whereas all frequencies outside this circle are
completely attenuated.
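A sketch of how such a transfer function can be built on a grid follows (Python/NumPy assumed; the helper name is hypothetical). It computes D(u, v) about the center and sets H to 1 inside the cutoff circle. The Butterworth and Gaussian lowpass filters defined later in this lecture only replace the last line, with 1 / (1 + (D / D0)**(2 * n)) and np.exp(-(D**2) / (2 * D0**2)) respectively.

```python
import numpy as np

def ideal_lowpass(M, N, D0):
    """Ideal lowpass transfer function H(u, v) of size M x N with cutoff D0."""
    u = np.arange(M).reshape(-1, 1)
    v = np.arange(N).reshape(1, -1)
    D = np.sqrt((u - M / 2) ** 2 + (v - N / 2) ** 2)   # distance from the center of the spectrum
    return (D <= D0).astype(float)
```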

The next figure shows a gray image with its Fourier spectrum. The circles
superimposed on the spectrum represent cutoff frequencies 5, 15, 30, 80
and 230.

Page 1
Image Processing Lecture 8

(a) (b)
Figure 8.2 (a) Original image. (b) its Fourier spectrum

The figure below shows the results of applying ILPF with the previous
cutoff frequencies.

(a) (b)

(c) (d)

Page 2
Image Processing Lecture 8

(e) (f)
Figure 8.3 (a) Original image. (b) - (f) Results of ILPF with cutoff frequencies 5, 15, 30, 80,
and 230 respectively.

We can see the following effects of ILPF:


1. Blurring effect which decreases as the cutoff frequency increases.
2. Ringing effect which becomes finer (i.e. decreases) as the cutoff
frequency increases.

Butterworth Lowpass Filter (BLPF)


The BLPF of order n and with cutoff frequency at a distance D0 from the
origin is defined as:
H(u, v) = 1 / ( 1 + [D(u, v)/D0]^(2n) )

(a) (b) (c)


Figure 8.4 (a) BLPF transfer function. (b) BLPF as an image. (c) BLPF radial cross section

Page 3
Image Processing Lecture 8

Unlike ILPF, the BLPF transfer function does not have a sharp transition
that establishes a clear cutoff between passed and filtered frequencies.
Instead, BLPF has a smooth transition between low and high frequencies.

The figure below shows the results of applying BLPF of order 2 with the
same previous cutoff frequencies.

(a) (b)

(c) (d)

Page 4
Image Processing Lecture 8

(e) (f)
Figure 8.5 (a) Original image. (b) - (f) Results of BLPF of order n = 2 with cutoff frequencies
5, 15, 30, 80, and 230 respectively.

We can see the following effects of BLPF compared to ILPF:


1- Smooth transition in blurring as a function of increasing cutoff
frequency.
2- No ringing is visible because of the smooth transition between low
and high frequencies.
3- BLPF of order 1 has no ringing. Ringing is imperceptible in BLPF
of order 2, but can become significant in BLPF of higher order as
shown in the figure below.

(a) (b)

Page 5
Image Processing Lecture 8

(c) (d)
Figure 8.6 (a) Result of BLPF with order 5. (b) BLPF of order 5. (c) Result of BLPF with
order 20. (d) BLPF of order 20. (cutoff frequency 30 in both cases).

BLPF is the preferred choice in cases where the tight control of the
transition between low and high frequencies are needed. However, the
side effect of this control is the possibility of ringing.

Gaussian Lowpass Filter (GLPF)


The GLPF with cutoff frequency D0 is defined as:

H(u, v) = e^(−D²(u, v)/2D0²)

(a) (b) (c)


Figure 8.7 (a) GLPF transfer function. (b) GLPF as an image. (c) GLPF radial cross section

Unlike ILPF, the GLPF transfer function does not have a sharp transition
that establishes a clear cutoff between passed and filtered frequencies.
Page 6
Image Processing Lecture 8

Instead, GLPF has a smooth transition between low and high frequencies.
The figure below shows the results of applying GLPF.

(a) (b)

(c) (d)

(e) (f)
Figure 8.8 (a) Original image. (b) - (f) Results of GLPF with cutoff frequencies 5, 15, 30, 80,
and 230 respectively.
Page 7
Image Processing Lecture 8

We can see the following effects of GLPF:


1. Smooth transition in blurring as a function of increasing cutoff
frequency.
2. GLPF did not achieve as much smoothing as BLPF of order 2 for
the same cutoff frequency.
3. No ringing effect. This is important in situations (e.g. medical
imaging) where any type of artifact is not acceptable.

Smoothing (lowpass) filtering is useful in many applications. For


example, GLPF can be used to bridge small gaps in broken characters by
blurring it as shown in the figure below. This is useful for automatic
character recognition system.

(a) (b)
Figure 8.9 (a) Text of poor resolution. (b) Result of applying GLPF with cutoff=80 on (a)

GLPF can also be used for cosmetic processing prior to printing and
publishing as shown in the next figure.

Page 8


Image Processing Lecture 8

(a) (b)
Figure 8.10 (a) Original image. (b) Result of filtering using GLPF with cutoff=80

Page 9
Image Processing Lecture 8

Sharpening frequency domain filters


Edges and sudden changes in gray levels are associated with high
frequencies. Thus to enhance and sharpen significant details we need to
use highpass filters in the frequency domain
For any lowpass filter there is a corresponding highpass filter:

Hhp(u, v) = 1 − Hlp(u, v)

Ideal Highpass Filter (IHPF)


The IHPF cuts off all low frequencies of the DFT but maintain the high
ones that are within a certain distance from the center of the DFT.
H(u, v) = 1   if D(u, v) > D0
H(u, v) = 0   if D(u, v) ≤ D0

where D0 is the cutoff frequency, and D(u, v) = [(u − M/2)² + (v − N/2)²]^(1/2)

(a) (b) (c)


Figure 8.11 (a) IHPF transfer function. (b) IHPF as an image. (c) IHPF radial cross section

The IHPF sets to zero all frequencies inside a circle of radius D 0 while
passing, without attenuation, all frequencies outside the circle.
The next figure shows the results of applying IHPF with cutoff
frequencies 15, 30, and 80.

Page 10
Image Processing Lecture 8

(a) (b)

(c) (d)
Figure 8.12 (a) Original image. (b) - (d) Results of IHPF with cutoff frequencies 15, 30, and
80 respectively.

We can see the following effects of IHPF:


1. Ringing effect.
2. Edge distortion (i.e. distorted, thickened object boundaries).
Both effects are decreased as the cutoff frequency increases.

Page 11
Image Processing Lecture 8

Butterworth Highpass Filter (BHPF)


The transfer function of BHPF of order n and with cutoff frequency at
distance D0 is defined as:
H(u, v) = 1 / ( 1 + [D0/D(u, v)]^(2n) )

(a) (b) (c)


Figure 8.13 (a) BHPF transfer function. (b) BHPF as an image. (c) BHPF radial cross section

The figure below shows the results of applying BHPF with cutoff
frequencies 15, 30 and 80.

(a) (b)

Page 12


Image Processing Lecture 8

(c) (d)
Figure 8.14 (a) Original image. (b) - (d) Results of BHPF with cutoff frequencies 15, 30, and
80 respectively.

We can clearly see the following effects of BHPF:


1- BHPF behaves smoother than IHPF.
2- The boundaries are much less distorted than that of IHPF, even for
the smallest value of cutoff frequency.

Gaussian Highpass Filter (GHPF)


The Gaussian Highpass Filter (GHPF) with cutoff frequency at distance
D0 is defined as:

H(u, v) = 1 − e^(−D²(u, v)/2D0²)

(a) (b) (c)


Figure 8.15 (a) GHPF transfer function. (b) GHPF as an image. (c) GHPF radial cross section

Page 13
Image Processing Lecture 8

The figure below shows the results of applying GHPF with cutoff
frequencies 15, 30 and 80.

(a) (b)

(c) (d)
Figure 8.16 (a) Original image. (b) - (d) Results of GHPF with cutoff frequencies 15, 30, and
80 respectively.

The effects of GHPF are:


1. No ringing effect.
2. Less edge distortion.
3. The results are smoother than those obtained by IHPF and BHPF.

Page 14
Image Processing Lecture 9

Wavelets and Multiresolution Processing


Wavelet Transform (WT)
Unlike Fourier transform whose basis functions are sinusoids, wavelet
transform (WT) is based on wavelets. Wavelets (i.e. small waves) are
mathematical functions that represent scaled and translated (shifted)
copies of a finite-length waveform called the mother wavelet as shown in
the figure below.

(a) (b)
Figure 9.1 Functions of (a) Fourier transform and (b) Wavelet transform

Wavelet transform is used to analyze a signal (image) into different


frequency components at different resolution scales (i.e. multiresolution).
This allows revealing image’s spatial and frequency attributes
simultaneously. In addition, features that might go undetected at one
resolution may be easy to spot at another.
Multiresolution theory incorporates image pyramid and subband coding
techniques.

Page 1
Image Processing Lecture 9

Image Pyramid
is a powerful simple structure for representing images at more than one
resolution. It is a collection of decreasing resolution images arranged in
the shape of a pyramid as shown in the figure below.

Figure 9.2 Image pyramid


The base of the pyramid contains a high-resolution representation of the
image being processed; the apex contains a low-resolution
approximation. As we move up the pyramid, both size and resolution
decrease.

Subband Coding
is used to decompose an image into a set of bandlimited components
called subbands, which can be reassembled to reconstruct the original
image without error. Each subband is generated by bandpass filtering the
input image. The next figures show 1D and 2D subband coding.

Page 2
Image Processing Lecture 9

Figure 9.3 1D Two-band subband coding and decoding system


The analysis filter bank consists of:
• a lowpass analysis filter whose output subband is called the approximation subband of the input signal
• a highpass analysis filter whose output subband is called the high-frequency or detail part of the input signal
The two synthesis bank filters combine the approximation and detail subbands to produce a reconstruction of the original signal.
The 2D subband coding is shown in the figure below.

Figure 9.4 2D Four-band subband image coding

Page 3
Image Processing Lecture 9

2D-Discrete Wavelet Transform (2D-DWT)


The DWT provides a compact representation of a signal’s frequency
components with strong spatial support. DWT decomposes a signal into
frequency subbands at different scales from which it can be perfectly
reconstructed.
2D-signals such as images can be decomposed using many wavelet
decomposition filters in many different ways. We study the Haar wavelet
filter and the pyramid decomposition method.

The Haar Wavelet Transform (HWT)


The Haar wavelet is discontinuous and resembles a step function.
For a function f of length N, the HWT is defined as:

f → (a^L | d^L)
a^L = (a1, a2, …, a_(N/2))
d^L = (d1, d2, …, d_(N/2))

where L is the decomposition level, a^L is the approximation subband and d^L is the detail subband, with

a_m = (f_(2m−1) + f_(2m)) / √2,    m = 1, 2, …, N/2
d_m = (f_(2m−1) − f_(2m)) / √2,    m = 1, 2, …, N/2

For example, if f={f1,f2,f3,f4 ,f5 ,f6 ,f7 ,f8 } is a time-signal of length 8, then
the HWT decomposes f into an approximation subband containing the
Low frequencies and a detail subband containing the high frequencies:
Low = a = { f1+f2 , f3+f4 , f5+f6 , f7+f8 } / √2
High = d = { f1−f2 , f3−f4 , f5−f6 , f7−f8 } / √2

Page 4
Image Processing Lecture 9

To apply HWT on images, we first apply a one level Haar wavelet to


each row and secondly to each column of the resulting "image" of the
first operation. The resulted image is decomposed into four subbands: LL,
HL, LH, and HH subbands. (L=Low, H=High). The LL-subband contains
an approximation of the original image while the other subbands contain
the missing details. The LL-subband output from any stage can be
decomposed further.
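A one-level sketch of this row/column procedure is given below (Python/NumPy assumed; the image is taken to have even dimensions, and the exact LH/HL labelling of the detail subbands depends on the layout convention used in the figures):

```python
import numpy as np

def haar_1level(f):
    """One level of the 2D Haar wavelet transform; returns the four subbands."""
    f = f.astype(float)

    def split(x):                                   # pairwise Haar step along the rows of x
        a = (x[:, 0::2] + x[:, 1::2]) / np.sqrt(2)  # approximation of each pair
        d = (x[:, 0::2] - x[:, 1::2]) / np.sqrt(2)  # detail of each pair
        return a, d

    L, H = split(f)             # first pass: along each row
    LL, LH = split(L.T)         # second pass: along each column (via transpose)
    HL, HH = split(H.T)
    # LL.T is the approximation subband; the other three hold the missing details
    return LL.T, LH.T, HL.T, HH.T
```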
The figure below shows the result of one and two level HWT based
on the pyramid decomposition

(a) Decomposition Level 1 (b) Decomposition Level 2


Figure 9.5 Pyramid decomposition using Haar wavelet filter

The next figure shows an image decomposed with 3-level Haar wavelet
transform.

Page 5


Image Processing Lecture 9

(a) Original image

(b) Level 1

Page 6
Image Processing Lecture 9

(c) Level 2

(d) Level 3
Figure 9.6 Example of a Haar wavelet transformed image

Page 7
Image Processing Lecture 9

Wavelet transformed images can be perfectly reconstructed using the four


subbands using the inverse wavelet transform.

Inverse Haar Wavelet Transform (IHWT)


The inverse of the Haar wavelet transform is computed in the reverse
order as follows:
f = ( (a1 + d1)/√2 , (a1 − d1)/√2 , … , (a_(N/2) + d_(N/2))/√2 , (a_(N/2) − d_(N/2))/√2 )
To apply IHWT on images, we first apply a one level inverse Haar
wavelet to each column and secondly to each row of the resulting
"image" of the first operation.

Statistical Properties of Wavelet subbands


The distribution of the LL-subband approximates that of the original
image but all non-LL subbands have a Laplacian distribution. This
remains valid at all depths (i.e. decomposition levels).

(a) (b)

Page 8
Image Processing Lecture 9

(c) (d)
Figure 9.7 Histogram of (a) LL-subband (b) HL-subband (c) LH-subband (d) HH-subband of
subbands in Figure 9.6 (b)

Wavelet Transforms in image processing


Any wavelet-based image processing approach has the following steps:
1. Compute the 2D-DWT of an image
2. Alter the transform coefficients (i.e. subbands)
3. Compute the inverse transform
Wavelet transforms are used in a wide range of image applications. These
include:
• Image and video compression
• Feature detection and recognition
• Image denoising
• Face recognition
Most applications benefit from the statistical properties of the non-LL
subbands (The Laplacian distribution of the wavelet coefficients in these
subbands).

Wavelet-based edge detection


The next figure shows a gray image and its wavelet transform for one-
level of decomposition.

Page 9
Image Processing Lecture 9

(a)

(b)
Figure 9.8 (a) gray image. (b) its one-level wavelet transform

Page 10
Image Processing Lecture 9

Note the horizontal edges of the original image are present in the HL
subband of the upper-right quadrant of the Figure above. The vertical
edges of the image can be similarly identified in the LH subband of the
lower-left quadrant.
To combine this information into a single edge image, we simply zero the
LL subband of the transform, compute the inverse transform, and take the
absolute value.
The next Figure shows the modified transform and resulting edge image.

(a)

Page 11
Image Processing Lecture 9

(b)
Figure 9.9 (a) transform modified by zeroing the LL subband. (b) resulted edge image

Wavelet-based image denoising


The general wavelet-based procedure for denoising the image is as
follows:
1. Choose a wavelet filter (e.g. Haar, symlet, etc…) and number of
levels for the decomposition. Then compute the 2D-DWT of the
noisy image.
2. Threshold the non-LL subbands.
3. Perform the inverse wavelet transform on the original
approximation LL-subband and the modified non-LL subbands.
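A compact sketch of this procedure is shown below (it assumes the PyWavelets package, pywt, is available; the wavelet name, number of levels, threshold value and hard-thresholding mode are illustrative choices, not part of the original notes):

```python
import numpy as np
import pywt   # assuming the PyWavelets package is installed

def wavelet_denoise(noisy, wavelet='haar', levels=2, thr=85):
    """Threshold the non-LL subbands and reconstruct the image."""
    coeffs = pywt.wavedec2(noisy, wavelet, level=levels)
    new_coeffs = [coeffs[0]]                         # keep the LL (approximation) subband
    for (cH, cV, cD) in coeffs[1:]:                  # detail subbands at each level
        new_coeffs.append(tuple(pywt.threshold(c, thr, mode='hard') for c in (cH, cV, cD)))
    return pywt.waverec2(new_coeffs, wavelet)
```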
The next figure shows a noisy image and its wavelet transform for two-
levels of decomposition.

Page 12
Image Processing Lecture 9

(a)

(b)
Figure 9.10 (a) noisy image. (b) its two-level wavelet transform

Page 13
Image Processing Lecture 9

Now we threshold all the non-LL subbands at both decomposition levels


by 85. Then we perform the inverse wavelet transform on the LL-subband
and the modified (i.e. thresholded) non-LL subbands to obtain the
denoised image shown in the next figure.

Figure 9.11 denoised image generated by thresholding all non-LL subbands by 85

In the image above, we can see the following:


• Noise Reduction.
• Loss of quality at the image edges.
The loss of edge detail can be reduced by zeroing the non-LL subbands at
the first decomposition level and only the HH-subband at the second
level. Then we apply the inverse transform to obtain the denoised image
in the figure below.

Page 14
Image Processing Lecture 9

Figure 9.12 denoised image generated by zeroing the non-LL subbands

Page 15
Image Processing Lecture 10

Image Restoration
Image restoration attempts to reconstruct or recover an image that has
been degraded by a degradation phenomenon. Thus, restoration
techniques are oriented toward modeling the degradation and applying
the inverse process in order to recover the original image. As in image
enhancement, the ultimate goal of restoration techniques is to improve an
image in some predefined sense.

Image restoration vs. image enhancement


Image restoration Image enhancement

1. is an objective process is a subjective process

2. formulates a criterion of involves heuristic procedures


goodness that will designed to manipulate an image
yield an optimal estimate of the in order to satisfy the human
desired result visual system

3. Techniques include noise Techniques include contrast


removal and deblurring (removal stretching
of image blur)

Like enhancement techniques, restoration techniques can be performed in


the spatial domain and frequency domain. For example, noise removal is
applicable using spatial domain filters whereas deblurring is performed
using frequency domain filters because image blur are difficult to
approach in the spatial domain using small masks.

Page 1
Image Processing Lecture 10

A Model of Image Degradation & Restoration


As shown in the next figure, image degradation is a process that acts on
an input image f(x,y) through a degradation function H and an additive
noise η(x,y). It results in a degraded image g(x,y) such that:

( , ) = ℎ( , ) ∗ ( , ) + ( , )

where h(x,y) is the spatial representation of the degradation function and


the symbol “ * ” indicates convolution.
Note that we only have the degraded image g(x,y). The objective of restoration is to obtain an estimate f̂(x, y) of the original image. We want the estimate to be as close as possible to the original input image and, in general, the more we know about H and η, the closer f̂(x, y) will be to f(x, y).

Figure 10.1 A model of the image degradation/restoration process

In the frequency domain, this model is equivalent to:


G(u, v) = H(u, v) F(u, v) + N(u, v)

The approach that we will study is based on various types of image


restoration filters. We assume that H is the identity operator, and we deal
only with degradations due to noise.

Page 2
Image Processing Lecture 10

Noise and its characteristics


Noise in digital images arises during:
• Acquisition: environmental conditions (light level & sensor
temperature), and type of cameras
• and/or transmission – interference in the transmission channel
To remove noise we need to understand the spatial characteristics of
noise and its frequency characteristics (Fourier spectrum).
Generally, spatial noise is assumed to be independent of position in
an image and uncorrelated to the image itself (i.e. there is no correlation
between pixel values and the values of noise components). Frequency
properties refer to the frequency content of noise in the Fourier sense.

Noise Models
Spatial noise is described by the statistical behavior of the gray-level
values in the noise component of the degraded image. Noise can be
modeled as a random variable with a specific probability distribution
function (PDF). Important examples of noise models include:
1. Gaussian Noise
2. Rayleigh Noise
3. Gamma Noise
4. Exponential Noise
5. Uniform Noise
6. Impulse (Salt & Pepper) Noise

Page 3
Image Processing Lecture 10

Gaussian Noise
The PDF of Gaussian noise is given by

p(z) = (1 / (√(2π) σ)) e^(−(z − μ)²/2σ²)

Figure 10.2 Gaussian noise PDF

where z is the gray value, μ is the mean and σ is the standard deviation.

Rayleigh Noise
The PDF of Rayleigh noise is given by
p(z) = (2/b)(z − a) e^(−(z − a)²/b)     for z ≥ a
p(z) = 0                                for z < a

Figure 10.3 Rayleigh noise PDF

Page 4
Image Processing Lecture 10

Impulse (Salt & Pepper) Noise


The PDF of impulse noise is given by

p(z) = Pa    for z = a
p(z) = Pb    for z = b
p(z) = 0     otherwise

Figure 10.4 Impulse noise PDF

If b > a, then gray level b appears as a light dot (salt), otherwise the gray
level a appears as a dark dot (pepper).

Determining noise models


The simple image below is well-suited test pattern for illustrating the
effect of adding various noise models.

Figure 10.5 Test pattern used to


illustrate the characteristics of the noise
models

The next figure shows degraded (noisy) images resulted from adding the
previous noise models to the above test pattern image.

Page 5
Image Processing Lecture 10

Gaussian Rayleigh Gamma

Exponential Uniform Salt & Pepper

Figure 10.6 Images and histograms from adding Gaussian, Rayleigh, Gamma, Exponential,
Uniform, and Salt & Pepper noise.

Page 6
Image Processing Lecture 10

To determine the noise model in a noisy image, one may select a


relatively small rectangular sub-image of relatively smooth region. The
histogram of the sub-image approximates the probability distribution of
the corrupting model of noise. This is illustrated in the figure below.

(a) (b) (c)

(d) (e) (f)


Figure 10.7 (a) Gaussian noisy image. (b) sub-image extracted from a. (c) histogram of b
(d) Rayleigh noisy image. (e) sub-image extracted from d. (f) histogram of e

Image restoration in the presence of Noise Only


When the only degradation present in an image is noise, the degradation
is modeled as:
g(x, y) = f(x, y) + η(x, y)
and
G(u, v) = F(u, v) + N(u, v)

Page 7
Image Processing Lecture 10

Spatial filtering is the method of choice in situations when only additive


noise is present. Spatial filters that designed to remove noise include:
1. Order Statistics Filters: e.g. Min, Max, & Median
2. Adaptive Filters: e.g. adaptive median filter

Order-Statistics Filters
We have used one of these filters (i.e. median) in the image enhancement.
We now use additional filters (min and max) in image restoration.

Min filter
This filter is useful for finding the darkest points in an image. Also, it
reduces salt noise as a result of the min operation.

(a) (b)
Figure 10.8 (a) image corrupted by salt noise. (b) Result of filtering (a) with a 3×3 min filter.

Max filter
This filter is useful for finding the brightest points in an image. Also,
because pepper noise has very low values, it is reduced by this filter as a
result of the max operation.

Page 8
Image Processing Lecture 10

(a) (b)
Figure 10.9 (a) image corrupted by pepper noise. (b) Result of filtering (a) with a 3×3 max
filter.

Adaptive Filters
The previous spatial filters are applied regardless of local image
variation. Adaptive filters change their behavior using local statistical
parameters in the mask region. Consequently, adaptive filters outperform
the non-adaptive ones.

Adaptive median filter


The median filter performs well as long as the spatial density of the
impulse noise is not large (i.e. Pa and Pb less than 0.2). Adaptive median
filtering can handle impulse noise with probabilities even larger than
these. Moreover the adaptive median filter seeks to preserve detail while
smoothing non-impulse noise, while the median filter does not do.
The adaptive median filter aims to replace f(x,y) with the median
of a neighborhood up to a specified size as long as the median is different
from the max and min values but f(x,y)=min or f(x,y)=max. Otherwise,
f(x,y) is not changed.

Page 9
Image Processing Lecture 10

Consider the following notation:


Sxy = mask region (neighborhood sub-image)
zmin = minimum gray level value in Sxy
zmax = maximum gray level value in Sxy
zmed = median of gray levels in Sxy
zxy = gray level at coordinates (x, y)
Smax = maximum allowed size of Sxy
The adaptive median filtering algorithm works in two levels A and B as
follows:
Level A: A1 = zmed - zmin
A2 = zmed - zmax
If A1 > 0 AND A2 < 0, Go to level B
Else increase the window size
If window size <= Smax repeat level A
Else output zmed.
Level B: B1 = zxy - zmin
B2 = zxy - zmax
If B1 > 0 AND B2 < 0, output zxy
Else output zmed.
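A direct translation of these two levels into code might look like the sketch below (Python/NumPy assumed; the function name is hypothetical and border handling by edge-replication padding is an illustrative choice):

```python
import numpy as np

def adaptive_median(f, s_max=7):
    """Adaptive median filter with maximum window size s_max (odd)."""
    f = f.astype(int)
    g = f.copy()
    M, N = f.shape
    pad = s_max // 2
    fp = np.pad(f, pad, mode='edge')
    for x in range(M):
        for y in range(N):
            size = 3
            while True:
                k = size // 2
                S = fp[x + pad - k:x + pad + k + 1, y + pad - k:y + pad + k + 1]
                z_min, z_max, z_med = S.min(), S.max(), int(np.median(S))
                if z_min < z_med < z_max:          # level A passed
                    z_xy = f[x, y]
                    if z_min < z_xy < z_max:       # level B: keep the pixel
                        g[x, y] = z_xy
                    else:                          # pixel equals min or max -> impulse
                        g[x, y] = z_med
                    break
                size += 2                          # otherwise grow the window
                if size > s_max:
                    g[x, y] = z_med
                    break
    return g
```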

The next figure shows an example of filtering an image corrupted by salt-


and-pepper noise with density 0.25 using 7×7 median filter and the
adaptive median filter with Smax = 7.

Page 10
Image Processing Lecture 10

(a)

(b) (c)
Figure 10.10 (a) Image corrupted by salt&pepper noise with density 0.25. (b) Result obtained
using a 7×7 median filter. (c) Result obtained using adaptive median filter with Smax = 7.

From this example, we find that the adaptive median filter has three main
purposes:
1. to remove salt-and-pepper (impulse) noise.
2. to provide smoothing of other noise that may not be impulsive.
3. to reduce distortion, such as excessive thinning or thickening of
object boundaries.

Page 11
Image Processing Lecture 11

Morphological Image Processing


• Morphology is concerned with image analysis methods whose
outputs describe image content (i.e. extract “meaning” from an
image).
• Mathematical morphology is a tool for extracting image
components that can be used to represent and describe region
shapes such as boundaries and skeletons.
• Morphological methods include filtering, thinning and pruning.
These techniques are based on set theory. All morphology
functions are defined for binary images, but most have natural
extension to grayscale images.

Basic Concepts of Set Theory


A set is specified by the elements between two braces: { }. The elements
of the sets are the coordinates (x,y) of pixels representing objects or other
features in an image.
Let A be a set in 2D image space Z2:
• If a = (a1, a2) is an element of A, then a ∈ A
• If a is not an element of A, then a ∉ A
• The empty set is the set with no elements and is denoted by ∅
• If every element of a set A is also an element of another set B, then A is said to be a subset of B, denoted A ⊆ B
• The union of two sets A and B is denoted by C = A ∪ B
• The intersection of two sets A and B is denoted by C = A ∩ B
• Two sets A and B are said to be disjoint if they have no common elements; this is denoted by A ∩ B = ∅
• The complement of a set A is the set of elements not contained in A; this is denoted by Ac = { w | w ∉ A }
• The difference of two sets A and B, denoted A − B, is defined as A − B = { w | w ∈ A, w ∉ B } = A ∩ Bc
• The reflection of set B, denoted B̂, is defined as B̂ = { w | w = −b, for b ∈ B }
• The translation of set A by point z = (z1, z2), denoted (A)z, is defined as (A)z = { c | c = a + z, for a ∈ A }

The figure below illustrates the preceding concepts.

Figure 11.1 Basic concepts of Set Theory

Page 2


Image Processing Lecture 11

Logic Operations Involving Binary Images


A binary image is an image whose pixel values are 0 (representing black)
or 1 (representing white, i.e. 255). The usual set operations of
complement, union, intersection, and difference can be defined easily in
terms of the corresponding logic operations NOT, OR and AND. For
example:
• Intersection operation ⋂ is implemented by AND operation
• Union operation ⋃ is implemented by OR operation
The figure below shows an example of using logic operations to perform
set operations on two binary images.

(a) (b)

a&b a|b a – b = a & bc


Figure 11.2 Using logic operations for applying set operations on two binary images

Page 3
Image Processing Lecture 11

Structuring Element
A morphological operation is based on the use of a filter-like binary
pattern called the structuring element of the operation. Structuring
element is represented by a matrix of 0s and 1s; for simplicity, the zero
entries are often omitted.
Examples of structuring elements:

A line, symmetric with respect to its origin:
0 0 0 0 1
0 0 0 1 0
0 0 1 0 0
0 1 0 0 0
1 0 0 0 0

A diamond, symmetric with respect to its origin:
0 1 0
1 1 1
0 1 0

A non-symmetric structuring element is not equal to its own reflection; its reflection on the origin is obtained by rotating it 180° about its origin.

Dilation
Dilation is an operation used to grow or thicken objects in binary images.
The dilation of a binary image A by a structuring element B is defined as:
A ⊕ B = { z : (B̂)z ∩ A ≠ ∅ }
This equation is based on obtaining the reflection of B about its origin
and translating (shifting) this reflection by z. Then, the dilation of A by B
Page 4
Image Processing Lecture 11

is the set of all structuring element origin locations where the reflected
and translated B overlaps with A by at least one element.

Example: Use the following structuring element to dilate the binary


image below.
1 0 0 0 0 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0 0 0 0 0
1 0 0 0 1 1 1 1 1 1 0 0 0
1 0 0 0 1 1 1 1 1 1 0 0 0
Structuring element 0 0 0 1 1 1 1 1 1 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0
Binary image

Solution:
We find the reflection of B:

B= 1 In this case B̂ = B
1
1
1
1

0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 1 1 1 1 1 1 0
0 0 0 0 1 1 1 1 1 1 1 0
0 0 0 1 1 1 1 1 1 1 1 0
⊕ = 0 0 1 1 1 1 1 1 1 1 0 0
0 1 1 1 1 1 1 1 1 0 0 0
0 1 1 1 1 1 1 1 0 0 0 0
0 1 1 1 1 1 1 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0

Page 5
Image Processing Lecture 11

Dilation can be used for bridging gaps, for example, in broken/unclear


characters as shown in the figure below.

(a)

(b)

Figure 11.3 (a) Broken-text binary image. (b) Dilated image.

Page 6
Image Processing Lecture 11

Erosion
Erosion is used to shrink or thin objects in binary images. The erosion of
a binary image A by a structuring element B is defined as:
A ⊖ B = { z : (B)z ∩ Ac = ∅ }
The erosion of A by B is the set of all structuring element origin locations
where the translated B does not overlap with the background of A.

Example: Use the following structuring element to erode the binary


image below.
0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0 0 0 0 0
1 0 0 0 1 1 1 1 1 1 0 0 0
1 0 0 0 1 1 1 1 1 1 0 0 0
Structuring 0 0 0 1 1 1 1 1 1 0 0 0
element 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0
Binary image

Solution
0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0
⊖ = 0 0 0 1 1 1 1 1 1 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0

Page 7
Image Processing Lecture 11

Erosion can be used to remove isolated features (i.e. irrelevant detail)


which may include noise or thin edges as shown in the figure below.

(a) (b)
Figure 11.4 (a) Binary image. (b) Eroded image.
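Both operations can be written compactly by sliding the structuring element over the image. The sketch below (Python/NumPy assumed; 1 marks object pixels, 0 background, the borders are zero-padded, and the function names are hypothetical) implements dilation and erosion for a flat structuring element:

```python
import numpy as np

def dilate(A, B):
    """Binary dilation: 1 wherever the reflected, shifted B overlaps the object pixels of A."""
    m, n = B.shape
    Bref = B[::-1, ::-1]                                             # reflection of B about its origin
    P = np.pad(A, ((m // 2, m // 2), (n // 2, n // 2)), mode='constant')
    out = np.zeros_like(A)
    for i in range(A.shape[0]):
        for j in range(A.shape[1]):
            out[i, j] = np.any(P[i:i + m, j:j + n] * Bref)
    return out

def erode(A, B):
    """Binary erosion: 1 wherever the shifted B fits entirely inside the object pixels of A."""
    m, n = B.shape
    P = np.pad(A, ((m // 2, m // 2), (n // 2, n // 2)), mode='constant')
    out = np.zeros_like(A)
    for i in range(A.shape[0]):
        for j in range(A.shape[1]):
            out[i, j] = np.all(P[i:i + m, j:j + n][B == 1] == 1)
    return out
```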

Combining Dilation & Erosion - Opening Morphology


The opening operation erodes an image and then dilates the eroded image
using the same structuring element for both operations, i.e.
A ∘ B = (A ⊖ B) ⊕ B
where A is the original image and B is the structuring element.
The opening operation is used to remove regions of an object that cannot contain the structuring element, smooth object contours, and break thin connections as shown in the figure below.

(a)

Page 8
Image Processing Lecture 11

(b)

(c)
Figure 11.5 (a) Original binary image. (b) Result of opening with square structuring element
of size 10 pixels. (c) Result of opening with square structuring element of size 20 pixels.

The opening operation can also be used to remove small objects in an


image while preserving the shape and size of larger objects as illustrated
in the figure below.

(a) (b)
Figure 11.6 (a) Original binary image. (b) Result of opening with square structuring element
of size 13 pixels.

Page 9
Image Processing Lecture 11

Combining Dilation & Erosion - Closing Morphology


The closing operation dilates an image and then erodes the dilated image
using the same structuring element for both operations, i.e.
A • B = (A ⊕ B) ⊖ B
where A is the original image and B is the structuring element.
The closing operation fills holes that are smaller than the structuring element, joins narrow breaks, fills gaps in contours, and smooths object contours as shown in the figure below.

(a)

(b)
Figure 11.7 (a) Result of closing with square structuring element of size 10 pixels. (b) Result of closing with square structuring element of size 20 pixels.

Combining Opening & Closing Morphology


Combining opening and closing can be quite effective in removing noise
as illustrated in the next figure.

Page 10
Image Processing Lecture 11

(a)

(b) (c)
Figure 11.8 (a) Noisy fingerprint. (b) Result of opening (a) with square structuring element of
size 3 pixels. (c) Result of closing (b) with the same structuring element.

Note that the noise was removed by opening the image, but this process
introduced numerous gaps in the ridges of the fingerprint. These gaps can
be filled by following the opening with a closing operation.

Page 11
Image Processing Lecture 12

The Hit-or-Miss Transformation


The hit-or-miss transformation of an image A by B is denoted by A⊛B.
B is a pair of structuring elements B= (B1,B2) rather than a single element.
B1: set of elements of B associated with an object
B2 : set of elements of B associated with the background
The hit-or-miss transform is defined as follows:

A ⊛ B = (A ⊖ B1) ∩ (Ac ⊖ B2)

This transform is useful in locating all pixel configurations that match the B1 structure (i.e. a hit) but do not match that of B2 (i.e. a miss). Thus, the hit-or-miss transform is used for shape detection.

Example: Use the hit-or-miss transform to identify the locations of the


following shape pixel configuration in the image below using the two
structuring elements B1 and B2.

Shape pixel configuration:      B1:            B2:
0 1 0                             1            1   1
1 1 1                           1 1 1
0 1 0                             1            1   1

Image A:
00000000000
00100000000
00100111100
01110000000
00100001100
00001001110
00011100100
00001000000
00000000000

Solution:

Page 1
Image Processing Lecture 12

00000000000 11111111111
A ⊖ B1 = 00000000000 Ac = 11011111111
00000000000 11011000011
00100000000 10001111111
00000000000 11011110011
00000000100 11110110001
00001000000 11100011011
00000000000 11110111111
00000000000 11111111111

10101111111 00000000000
Ac ⊖ B2=
10100000001 00000000000
00000111111 A⊛B= 00000000000
10100000001 00100000000
00000000000 00000000000
10000000001 00000000000
11101000000 00001000000
11000000101 00000000000
11101011111 00000000000

The figure below shows an example of applying the hit-or-miss transform


on the image in the previous example.

(a) (b)
Figure 12.1 (a) Binary image. (b) Result of applying hit-or-miss transform.

Page 2


Image Processing Lecture 12

Basic Morphological Algorithms (Applications)


The principle application of morphology is extracting image components
that are useful in the representation and description of shape.
Morphological algorithms are used for boundaries extraction,
skeletonization (i.e. extracting the skeleton of an object), and thinning.

Boundary Extraction
The boundary of a set A, denoted by β(A), can be obtained by:
β(A) = A − (A ⊖ B)
where B is the structuring element.
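In code this is a single subtraction after one erosion; a minimal sketch (assuming SciPy's ndimage module for the erosion; the 3×3 square structuring element matches the one used in the figure that follows) is:

```python
import numpy as np
from scipy.ndimage import binary_erosion   # assuming SciPy is available

def boundary(A, B=np.ones((3, 3), dtype=bool)):
    """Boundary extraction: beta(A) = A - (A eroded by B)."""
    A = A.astype(bool)
    return A & ~binary_erosion(A, structure=B)
```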
The figure below shows an example of extracting the boundary of an
object in a binary image.

(a) (b)
Figure 12.2 (a) Binary image. (b) Object boundary extracted
using the previous equation and 3×3 square structuring element.

Note that, because the size of structuring element is 3×3 pixels, the
resulted boundary is one pixel thick. Thus, using 5×5 structuring element
will produce a boundary between 2 and 3 pixels thick as shown in the
next figure.

Page 3
Image Processing Lecture 12

Figure 12.3 Object boundary extracted


using 5×5 square structuring element

Thinning
Thinning means reducing binary objects or shapes in an image to strokes
that are a single pixel wide. The thinning of a set A by a structuring
element B, is defined as:
A ⊗ B = A − (A ⊛ B)
      = A ∩ (A ⊛ B)c
Since we only match the pattern (shape) with the structuring elements, no background operation is required in the hit-or-miss transform.
Here, B is a sequence of structuring elements:
{B} = {B1, B2, B3, …, Bn}
where Bi is a rotation of Bi−1. Thus, the thinning equation can be written as:
A ⊗ {B} = ((…((A ⊗ B1) ⊗ B2) …) ⊗ Bn)

The entire process is repeated until no further changes occur. The next
figure shows an example of thinning the fingerprint ridges so that each is
one pixel thick.

Page 4
Image Processing Lecture 12

(a) (b)

(c) (d)
Figure 12.4 (a) Original fingerprint image. (b) Image thinned once. (c) Image thinned twice.
(d) Image thinned until stability (no changes occur).

Skeletonization (Skeleton Extraction)


is another way to reduce binary objects to thin strokes that retain
important structural information about the shapes of the original objects.
The skeleton of A can be expressed in terms of erosions and openings as
follows:

S(A) = ⋃ Sk(A)      (union over k = 0, 1, …, K)

with
Sk(A) = (A ⊖ kB) − (A ⊖ kB) ∘ B
where B is a structuring element, and (A ⊖ kB) indicates k successive
erosions of A:

Page 5
Image Processing Lecture 12

(A ⊖ kB) = ((…((A ⊖ B) ⊖ B) ⊖ …) ⊖ B      (B applied k times)

The figure below illustrates an example of extracting a skeleton of an


object in a binary image.

(a) (b)
Figure 12.5 (a) Bone image. (b) Skeleton extracted from (a).

Gray-scale Morphology
The basic morphological operations of dilation, erosion, opening and
closing can also be applied to gray images.

Gray-scale Dilation
The gray-scale dilation of a gray-scale image f by a structure element b is
defined as:
(f ⊕ b)(x, y) = max{ f(x − s, y − t) + b(s, t) | (s, t) ∈ Db }
where Db is the domain of the structuring element b. This process
operates in the same way as the spatial convolution.

Page 6
Image Processing Lecture 12

The figure below shows the result of dilating a gray image using a 3×3
square structuring element.

(a) (b)
Figure 12.6 (a) Original gray image. (b) Dilated image.

We can see that gray-scale dilation produces the following:


1. Bright and slightly blurred image.
2. Small, dark details have been reduced.

Gray-scale Erosion
The gray-scale erosion of a gray-scale image f by a structure element b is
defined as:
(f ⊖ b)(x, y) = min{ f(x + s, y + t) − b(s, t) | (s, t) ∈ Db }

The next figure shows the result of eroding a gray image using a 3×3
square structuring element.

Page 7
Image Processing Lecture 12

(a) (b)
Figure 12.7 (a) Original gray image. (b) Eroded image.
We can see that gray-scale erosion produces the following:
1. Dark image. 2. Small, bright details were reduced.

Gray-scale Opening and Closing


The opening and closing of gray-scale images have the same form as in
the binary images:
Opening:  f ∘ b = (f ⊖ b) ⊕ b
Closing:  f • b = (f ⊕ b) ⊖ b
The figure below shows the result of opening a gray image.

(a) (b)
Figure 12.8 (a) Original gray image. (b) Opened image.

Page 8
Image Processing Lecture 12

Note the decreased sizes of the small, bright details, with no appreciable
effect on the darker gray levels.
The figure below shows the result of closing a gray image.

(a) (b)
Figure 12.9 (a) Original gray image. (b) Closed image.
Note the decreased sizes of the small, dark details, with relatively little
effect on the bright features.

Gray-Scale Morphology Applications


Morphological smoothing
Smoothing is obtained by performing a morphological opening followed
by a closing as shown in the figure below.

(a) (b)
Figure 12.10 (a) Original gray image. (b) Morphological smoothed image.

Page 9
Image Processing Lecture 12

Morphological gradient
is produced from subtracting an eroded image from its dilated version. It
is defined as:
g = (f ⊕ b) − (f ⊖ b)
The resulted image has edge-enhancement characteristics, thus
morphological gradient can be used for edge detection as shown in the
figure below.

(a) (b)
Figure 12.11 (a) Original gray image. (b) Morphological gradient.
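A gray-scale morphological gradient can be computed directly with flat structuring elements; a minimal sketch (assuming SciPy's grey_dilation/grey_erosion; the 3×3 window size is an illustrative choice) is:

```python
import numpy as np
from scipy.ndimage import grey_dilation, grey_erosion   # assuming SciPy is available

def morphological_gradient(f, size=(3, 3)):
    """g = (f dilated by b) - (f eroded by b) for a flat structuring element."""
    f = f.astype(float)
    return grey_dilation(f, size=size) - grey_erosion(f, size=size)
```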

Page 10


Image Processing Lecture 13

Image Segmentation
is one of the image analysis methods; it subdivides an image into its constituent regions or objects, depending on the type of shapes and objects searched for in the image. Image segmentation is an essential first step in most automatic pictorial pattern recognition and scene analysis tasks.

Applications of image segmentation


• Inspecting images of electronic boards for missing components or
broken connections.
• Detecting faces, facial features and other objects for surveillance.
• Detecting certain cellular objects in biomedical images.

Segmentation Approaches
Image segmentation algorithms are based on one of two basic properties
of gray-level values: discontinuity and similarity.
• In the first category, the approach is to partition an image based on
abrupt discontinuity (i.e. change) in gray level, such as edges in an
image.
• In the second category, the approaches are based on partitioning an
image into regions that are similar according to a set of predefined
criteria.

We shall focus on segmentation algorithms to detect discontinuities such


as points, lines and edges. Segmentation methods, studied here, rely on
two steps:

Page 1
Image Processing Lecture 13

1. Choosing appropriate filters that help highlight the required


feature(s).
2. Thresholding.

Point Detection
This is concerned with detecting isolated image points that stand out from their neighborhood, which is an area of nearly constant gray level.

1. Simple method
The simplest point detection method works in two steps:
1. Filter the image with the mask:
-1 -1 -1
-1 8 -1
-1 -1 -1

Then, we take the absolute values of the filtered image.


2. On the filtered image apply an appropriate threshold (e.g. the
maximum pixel value).
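These two steps translate directly into a few lines of code (Python/NumPy with SciPy's convolve assumed; thresholding at the maximum response follows the suggestion above, and the function name is hypothetical):

```python
import numpy as np
from scipy.ndimage import convolve   # assuming SciPy is available

point_mask = np.array([[-1, -1, -1],
                       [-1,  8, -1],
                       [-1, -1, -1]])

def detect_points(f):
    """Isolated-point detection: filter, take absolute values, threshold at the maximum."""
    r = np.abs(convolve(f.astype(float), point_mask, mode='nearest'))
    return r >= r.max()              # binary map of detected points
```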

The next figure shows an example of point detection in a face image


using the simple method.

Page 2


Image Processing Lecture 13

(a)

(b) Result with (c) Result with (d) Result with


threshold=max threshold=220 threshold=168

(e) Result with (f) Result with (g) Result with


threshold=118 threshold=68 threshold=55

Figure 13.1 Example of point detection using simple method. (a) Original face image.
(b)-(g) Results with different Thresholds

Page 3
Image Processing Lecture 13

2. Alternative method
An alternative approach to the simple method is to locate the points in a
window of a given size where the difference between the max and the
min value in the window exceeds a given threshold. This can be done
again in two steps:
1. Obtain the difference between the max value (obtained with the
order statistics max filter) and the min value (obtained with the
order statistics min filter) in the given size mask.
2. On the output image apply an appropriate threshold (e.g. the
maximum pixel value).
The figure below shows an example of point detection in a face image
using the alternative method.

(b) Threshold=max (c) Threshold=90

(a)

(d) Threshold=40 (e) Threshold=30

Figure 13.2 Example of point detection using alternative method. (a) Original face image.
(b)-(e) Results with different Thresholds

Page 4
Image Processing Lecture 13

Line Detection
Detecting a line in a certain direction requires detecting adjacent points in the image along that direction. This can be done using filters that yield a significant response at points aligned in the given direction.
For example, the following filters
Vertical:        Horizontal:
-1  2 -1         -1 -1 -1
-1  2 -1          2  2  2
-1  2 -1         -1 -1 -1

+45°:            −45°:
-1 -1  2          2 -1 -1
-1  2 -1         -1  2 -1
 2 -1 -1         -1 -1  2

highlight lines in the vertical, horizontal, +45° direction , and – 45°


direction, respectively.
This can be done again in two steps:
1. Filter the image using an appropriate filter.
2. Apply an appropriate threshold (e.g. max value).

The next figure illustrates an example of line detection using the filters
above.

Page 5
Image Processing Lecture 13

(a)

(b) (c)

(d) (e)
Figure 13.3 Example of line detection. (a) Original image. (b)-(e) Detected lines in the
vertical, horizontal, +45° direction , and – 45° direction, respectively.

Page 6
Image Processing Lecture 13

Edge detection
Edge detection in images aims to extract meaningful discontinuity in
pixel gray level values. Such discontinuities can be deduced from first
and second derivatives as defined in Laplacian filter.
The 1st-order derivative (gradient) of an image f(x,y) is defined as the vector:

∇f = [ Gx , Gy ]ᵀ = [ ∂f/∂x , ∂f/∂y ]ᵀ
Its magnitude is defined as:

∇f = [ Gx² + Gy² ]^(1/2)

or, using absolute values,

∇f ≈ |Gx| + |Gy|

The 2nd-order derivative is computed using the Laplacian as follows:

∇²f = ∂²f(x, y)/∂x² + ∂²f(x, y)/∂y²

However, Laplacian filter is not used for edge detection because, as a


second order derivative:
• it is sensitive to noise.
• its magnitude produces double edges.
• it is unable to detect edge direction.

There are 1st-order derivative estimators in which we can specify whether


the edge detector is sensitive to horizontal or vertical edges or both. We
study only two edge detectors namely Sobel and Prewitt edge detectors.

Page 7
Image Processing Lecture 13

Sobel edge detector


This detector uses the following masks to approximate the digitally the
1st-order derivatives Gx and Gy:
-1 -2 -1 -1 0 1
0 0 0 -2 0 2
1 2 1 -1 0 1

To detect:
• Horizontal edges, we filter the image f using the left mask above.
• Vertical edges, we filter the image f using the right mask above.
• Edges in both directions, we do the following:
1. Filter the image f with the left mask to obtain Gx
2. Filter the image f again with the right mask to obtain Gy
3. Compute ∇f = [ Gx² + Gy² ]^(1/2) or ∇f ≈ |Gx| + |Gy|
In all cases, we then take the absolute values of the filtered image, then
apply an appropriate threshold.
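A sketch of the both-directions case follows (Python/NumPy with SciPy's convolve assumed; the threshold, a fraction of the maximum gradient magnitude, is an illustrative choice):

```python
import numpy as np
from scipy.ndimage import convolve   # assuming SciPy is available

sobel_h = np.array([[-1, -2, -1],
                    [ 0,  0,  0],
                    [ 1,  2,  1]])        # responds to horizontal edges
sobel_v = sobel_h.T                       # responds to vertical edges

def sobel_edges(f, thr_ratio=0.3):
    """|Gx| + |Gy| followed by a threshold."""
    f = f.astype(float)
    gx = convolve(f, sobel_h, mode='nearest')
    gy = convolve(f, sobel_v, mode='nearest')
    grad = np.abs(gx) + np.abs(gy)
    return grad >= thr_ratio * grad.max()
```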

The next figure shows an example of edge detection using the Sobel
detector.

Page 8
Image Processing Lecture 13

Figure 13.4 Example of Sobel edge detection. (a) Original image. (b)–(d) Edges detected in the vertical, horizontal, and both directions, respectively.

Prewitt edge detector


This detector uses the following masks to approximate Gx and Gy:

Left mask (Gx)        Right mask (Gy)
-1 -1 -1               -1  0  1
 0  0  0               -1  0  1
 1  1  1               -1  0  1
The steps of applying this detector are the same as those of the Sobel detector.
The next figure shows an example of edge detection using the Prewitt
detector.


Figure 13.5 Example of Prewitt edge detection. (a) Original image. (b)–(d) Edges detected in the vertical, horizontal, and both directions, respectively.

We can see that the Prewitt detector produces noisier results than the
Sobel detector. This is because the coefficient with value 2 in the Sobel
detector provides smoothing.

Image Processing Lecture 14

Image Compression
• Image compression means the reduction of the amount of data
required to represent a digital image by removing the redundant data.
It involves reducing the size of image data files, while retaining
necessary information.
• Mathematically, this means transforming a 2D pixel array (i.e. image)
into a statistically uncorrelated data set. The transformation is applied
prior to storage or transmission of the image. At a later time, the
compressed image is decompressed to reconstruct the original
(uncompressed) image or an approximation of it.
• The ratio of the size of the original (uncompressed) image to the size of the compressed image is referred to as the Compression Ratio CR:

C_R = \frac{U_{size}}{C_{size}}

where
U_{size} = uncompressed (original) image file size = number of rows × number of columns × number of bits per pixel (in bits; divide by 8 for bytes)
C_{size} = compressed image file size

Example:
Consider an 8-bit image of 256×256 pixels. After compression, the image
size is 6,554 bytes. Find the compression ratio.
Solution:
Usize = (256 × 256 × 8) / 8 = 65,536 bytes
Compression Ratio = 65536 / 6554 = 9.999 ≈ 10 (also written 10:1)
This means that the original image has 10 bytes for every 1 byte in the
compressed image.
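The same calculation written as a short script (the file sizes are those assumed in the example):

rows, cols, bits_per_pixel = 256, 256, 8
u_size = rows * cols * bits_per_pixel // 8   # 65,536 bytes (uncompressed)
c_size = 6554                                # compressed size from the example
cr = u_size / c_size                         # ~10, usually written 10:1
print(f"CR = {cr:.3f}")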


Image Data Redundancies


There are three basic data redundancies that can be exploited by image
compression techniques:
• Coding redundancy: occurs when the data used to represent the image
are not utilized in an optimal manner. For example, we have an 8-bit
image that allows 256 gray levels, but the actual image contains only
16 gray levels (i.e. only 4 bits are needed).
• Interpixel redundancy: occurs because adjacent pixels tend to be
highly correlated. In most images, the gray levels do not change
rapidly, but change gradually so that adjacent pixel values tend to be
relatively close to each other in value.
• Psychovisual redundancy: means that some information is less
important to the human visual system than other types of information.
This information is said to be psychovisually redundant and can be
eliminated without impairing the image quality.

Image compression is achieved when one or more of these redundancies are reduced or eliminated.

Fidelity Criteria
These criteria are used to assess (measure) image fidelity. They quantify
the nature and extent of information loss in image compression. Fidelity
criteria can be divided into two classes:
1. Objective fidelity criteria
2. Subjective fidelity criteria


Objective fidelity criteria


These are metrics that can be used to measure the amount of information
loss (i.e. error) in the reconstructed (decompressed) image.
Commonly used objective fidelity criteria include:
• root-mean-square error, e_{rms}, between an input image f(x,y) and the output (decompressed) image \hat{f}(x,y):

e_{rms} = \sqrt{\frac{1}{MN} \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} \left[ \hat{f}(x,y) - f(x,y) \right]^2}

where the images are of size M × N. The smaller the value of e_{rms}, the better the compressed image represents the original image.

• mean-square signal-to-noise ratio, SNR_{ms}:

SNR_{ms} = \frac{\sum_{x=0}^{M-1} \sum_{y=0}^{N-1} \hat{f}(x,y)^2}{\sum_{x=0}^{M-1} \sum_{y=0}^{N-1} \left[ \hat{f}(x,y) - f(x,y) \right]^2}

• peak signal-to-noise ratio, SNR_{peak}:

SNR_{peak} = 10 \log_{10} \frac{(L-1)^2}{\frac{1}{MN} \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} \left[ \hat{f}(x,y) - f(x,y) \right]^2}

where L is the number of gray levels.


A larger SNR value implies a better reconstructed image.
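A minimal sketch of the three criteria, assuming f (original) and f_hat (decompressed) are NumPy arrays of the same size and are not identical (otherwise the error sums are zero):

import numpy as np

def fidelity_metrics(f, f_hat, gray_levels=256):
    f, f_hat = f.astype(float), f_hat.astype(float)
    err = f_hat - f
    mse = np.mean(err ** 2)
    e_rms = np.sqrt(mse)                                      # root-mean-square error
    snr_ms = np.sum(f_hat ** 2) / np.sum(err ** 2)            # mean-square SNR
    snr_peak = 10 * np.log10((gray_levels - 1) ** 2 / mse)    # peak SNR (dB)
    return e_rms, snr_ms, snr_peak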


Subjective fidelity criteria


These criteria measure image quality by the subjective evaluations of a
human observer. This can be done by showing a decompressed image to a
group of viewers and averaging their evaluations. The evaluations may be
made using an absolute rating scale, for example {Excellent, Fine,
Passable, Marginal, Inferior, and Unusable}.

Image Compression System


As shown in the figure below, the image compression system consists of
two distinct structural blocks: an encoder and a decoder.

Figure 14.1 A general image compression system

The encoder is responsible for reducing or eliminating any coding, interpixel, or psychovisual redundancies in the input image. It consists of:
• Mapper: it transforms the input image into a nonvisual format
designed to reduce interpixel redundancies in the input image. This
operation is reversible and may or may not reduce directly the
amount of data required.


• Quantizer: it reduces the psychovisual redundancies of the input image. This operation is irreversible.
• Symbol coder: it creates a fixed- or variable-length code to
represent the quantizer output. In a variable-length code, the
shortest code words are assigned to the most frequently occurring
output values, and thus reduce coding redundancy. This operation
is reversible.

The decoder contains only two components: a symbol decoder and an inverse mapper.

Image Compression Types


Compression techniques are classified into two primary types:
• Lossless compression
• Lossy compression

Lossless compression
• It allows an image to be compressed and decompressed without losing
information (i.e. the original image can be recreated exactly from the
compressed image).
• This is useful in image archiving (as in the storage of legal or medical
records).
• For complex images, the compression ratio is limited (2:1 to 3:1). For
simple images (e.g. text-only images) lossless methods may achieve
much higher compression.
• An example of lossless compression techniques is Huffman coding.


Huffman Coding
Huffman coding is a popular technique for removing coding redundancy. It produces a variable-length code, in which the code words have unequal lengths. When source symbols are coded one at a time, Huffman coding yields the smallest possible number of bits per gray-level value.

Example:
Consider the 8-bit gray-level image shown below. Use the Huffman coding technique to eliminate coding redundancy in this image.
119 123 168 119
123 119 168 168
119 119 107 119
107 107 119 119

Solution:
Gray level Histogram Probability
119 8 0.5
168 3 0.1875
107 3 0.1875
123 2 0.125

Huffman source reduction (the two smallest probabilities are combined at each step) and code assignment (codes shown in parentheses, assigned from the final reduction back to the original symbols):

Original             Step 1               Step 2           Step 3
0.5     (1)          0.5     (1)          0.5   (1)        1
0.1875  (00)         0.3125  (01)         0.5   (0)
0.1875  (011)        0.1875  (00)
0.125   (010)

We build a lookup table:


Lookup table:
Gray level Probability Code
119 0.5 1
168 0.1875 00
107 0.1875 011
123 0.125 010

We use this code to represent the gray level values of the compressed
image:
1 010 00 1
010 1 00 00
1 1 011 1
011 011 1 1

Hence, the total number of bits required to represent the gray levels of the compressed image is 29 bits: 10101011010110110000011110011, whereas the original (uncompressed) image requires 4 × 4 × 8 = 128 bits.
Compression ratio = 128 / 29 ≈ 4.4
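A minimal Huffman-coding sketch for this example, using only the Python standard library; ties between equal probabilities may be broken differently from the hand-worked table, so individual code words can differ, but the total of 29 bits (and hence the compression ratio) is the same:

import heapq
from collections import Counter

def huffman_codes(symbols):
    freq = Counter(symbols)
    # Each heap entry: (frequency, tie-breaker, {symbol: code-so-far}).
    heap = [(n, i, {s: ""}) for i, (s, n) in enumerate(freq.items())]
    heapq.heapify(heap)
    while len(heap) > 1:
        n1, _, c1 = heapq.heappop(heap)       # two least probable groups
        n2, i2, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (n1 + n2, i2, merged))
    return heap[0][2]

pixels = [119, 123, 168, 119, 123, 119, 168, 168,
          119, 119, 107, 119, 107, 107, 119, 119]
codes = huffman_codes(pixels)
bits = "".join(codes[p] for p in pixels)
print(codes, len(bits))                       # 29 bits -> CR = 128/29 ≈ 4.4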

Lossy compression
• It allows a loss in the actual image data, so the original uncompressed
image cannot be recreated exactly from the compressed image.
• Lossy compression techniques provide higher levels of data reduction
but result in a less than perfect reproduction of the original image.
• This is useful in applications such as broadcast television and
videoconferencing. These techniques can achieve compression ratios
of 10 or 20 for complex images, and 100 to 200 for simple images.
• Examples of lossy compression techniques are JPEG and JPEG2000 compression.

Image Processing Lecture 16

Object Recognition
The automatic recognition of objects or patterns is one of the important
image analysis tasks. The approaches to pattern recognition are divided
into two principal areas:
• Decision-theoretic methods: deal with patterns described using
quantitative descriptors, such as length, area, and texture.
• Structural methods: deal with patterns best described by qualitative
descriptors (symbolic information), such as the relational
descriptors.

Patterns and Pattern Classes


• A pattern is an arrangement of descriptors (or features).
• A pattern class is a family of patterns that share some common
properties. Pattern classes are denoted w1, w2, . . . , wN where N is
the number of classes.
• Pattern recognition by machine involves techniques for assigning
patterns to their respective classes—automatically and with as little
human intervention as possible.
The object or pattern recognition task consists of two steps:
• feature selection (extraction)
• matching (classification)
There are three common pattern arrangements used in practice:
• Numeric vectors (for quantitative descriptions):

\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}


• Strings and trees (for structural descriptions)


x = abababa….

Recognition Based on Decision-Theoretic Methods


These methods are based on the use of decision functions. Let x = (x1, x2, ..., xn)T represent an n-dimensional pattern vector. For N known pattern classes w1, w2, ..., wN, the idea is to find N decision functions d1(x), d2(x), ..., dN(x) such that an unknown pattern x is said to belong to the ith pattern class if, upon substitution of x into all decision functions, di(x) yields the largest numerical value.

Matching
Recognition techniques based on matching represent each class by a prototype pattern vector. The set of patterns of known classes is called the training set, and the set of patterns of unknown classes is called the testing set. An unknown pattern is assigned to the class to which it is closest in terms of a predefined metric. The simplest approach is the minimum-distance classifier, which, as its name implies, computes the (Euclidean) distance between the unknown pattern and each of the prototype vectors, and then chooses the class with the smallest distance.
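A minimal sketch of a minimum-distance classifier, assuming (as is common) that the prototype of each class is the mean of its training vectors; NumPy only:

import numpy as np

def train_prototypes(vectors, labels):
    # One prototype (mean vector) per class.
    return {c: np.mean([x for x, y in zip(vectors, labels) if y == c], axis=0)
            for c in set(labels)}

def classify(x, prototypes):
    # Assign x to the class whose prototype is closest in Euclidean distance.
    x = np.asarray(x, dtype=float)
    return min(prototypes, key=lambda c: np.linalg.norm(x - prototypes[c]))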

Wavelet-based Face Recognition Application


In the enrolment stage, each face image in the training set is transformed
to the wavelet domain to extract its pattern vector (i.e. subband). The
choice of an appropriate subband varies depending on the operational
circumstances of the face recognition application. The decomposition


level is predetermined based on the efficiency and accuracy requirements and the size of the face image. In the recognition stage, a minimum-distance classification method is used to classify the unknown face images. Figure 16.1 illustrates the stages of this approach.

Figure 16.1 Wavelet-based face recognition

Let F = {fi,j : i = 1, ..., n; j = 1, ..., m} be a training set of face images of n subjects, where each subject i has m images. In the enrolment
stage, wavelet transform is applied on each training image so that a set
Wk(F) of multi-resolution decomposed images result. A new set LLk(F) of
all k-level LL-subbands will be obtained from the transformed face
images in the set Wk(F). The new set LLk(F) forms the set of features for
the training images. Thus, the training face image 1 of subject i (fi,1) is


expressed by its feature vector LLk,i,1. The collection of feature vectors LLk,i,1, LLk,i,2, ..., LLk,i,m represents the stored template of subject i. In a
similar manner, 3 new sets HLk(F), LHk(F), HHk(F) can be created at the
same decomposition level k from the kth-level HL-, LH-, HH-subbands
respectively.

In the recognition phase, a minimum-distance classifier is used to classify the input face image. When a probe face image is introduced to
the system, it is decomposed by wavelet transform, and a certain subband
(e.g. LLk) is chosen to represent the feature vector of the probe image. A
match score Si,j can now be computed between the probe feature vector
and each of the feature vectors j of the subject i in the feature set LLk(F).
Then, the identity (i.e. class) of the training image which gives the
minimum score is assigned to the probe image:

Si = min (Si,j) (j = 1, . . . , m)

Many similarity measures can be used for the minimum-distance classifier, for example the CityBlock (Manhattan) or Euclidean distance functions.
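A minimal sketch of the enrolment and recognition stages described above, assuming the PyWavelets package (pywt), face images of identical size, and illustrative choices of wavelet ('haar'), decomposition level k = 2, and the LL_k subband as the feature vector:

import numpy as np
import pywt

def ll_feature(face_image, level=2, wavelet="haar"):
    # Wavelet-decompose the face and keep the k-level LL subband as the feature vector.
    coeffs = pywt.wavedec2(face_image.astype(float), wavelet, level=level)
    return coeffs[0].ravel()                  # coeffs[0] is the LL_k approximation

def enrol(training_faces):
    # training_faces: {subject_id: [image, image, ...]} -> stored templates.
    return {sid: [ll_feature(img) for img in imgs]
            for sid, imgs in training_faces.items()}

def recognise(probe_image, templates):
    # Score S_i = min_j ||probe - template_(i,j)||; return the subject with the minimum score.
    probe = ll_feature(probe_image)
    scores = {sid: min(np.linalg.norm(probe - t) for t in feats)
              for sid, feats in templates.items()}
    return min(scores, key=scores.get)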

Structural methods
Structural recognition techniques are based on representing objects as
strings, trees or graphs and then defining descriptors and recognition rules
based on those representations.
The key difference between decision-theoretic and structural
methods is that the former uses quantitative descriptors expressed in the
form of numeric vectors, while the structural techniques deal with
symbolic information.

