DIP Notes
References:
1. Rafael C. Gonzalez and Richard E. Woods, "Digital Image Processing", 2/E, Prentice Hall, 2001.
2. Scott E. Umbaugh, "Computer Vision and Image Processing", Prentice Hall, 1998.
3. Nick Efford, "Digital Image Processing: A Practical Approach Using Java", Pearson Education, 2000.
4. John R. Jensen, "Introductory Digital Image Processing", 3/E, Prentice Hall, 2005.
1. Introduction
An image is a picture: a way of recording and presenting information
visually. Since vision is the most advanced of our senses, it is not
surprising that images play the single most important role in human
perception. The information that can be conveyed in images has been
known throughout the centuries to be extraordinary: one picture is worth
a thousand words.
However, unlike human beings, imaging machines can capture and
operate on images generated by sources that cannot be seen by humans.
These include X-ray, ultrasound, electron microscopy, and computer-
generated images. Thus, image processing has become an essential field
that encompasses a wide and varied range of applications.
2. Basic definitions
Image processing is a general term for the wide range of
techniques that exist for manipulating and modifying images in
various ways.
A digital image may be defined as a finite, discrete
representation of the original continuous image. A digital image
is composed of a finite number of elements called pixels, each
of which has a particular location and value.
Figure 1.1 The electromagnetic spectrum arranged according to energy per photon.
(a) (b)
Figure 1.2 Examples of gamma-ray imaging. (a) PET image. (b) Star explosion 15,000 years ago.
Figure 1.2(b) shows a star that exploded about 15,000 years ago, imaged in
the gamma-ray band. Unlike the previous example shown in Figure 1.2(a),
this image was obtained using the natural radiation of the object being
imaged.
X-rays are widely used for imaging in medicine, industry and astronomy.
In medicine, the chest X-ray, illustrated in Figure 1.3(a), is a standard
diagnostic tool.
(a) (b) (c)
Figure 1.5 Examples of visible and infrared imaging. (a) Microprocessor magnified 60 times. (b) Infrared satellite image of the US. (c) Multispectral image of a hurricane.
(a) (b)
Figure 1.7 MRI images of a human (a) knee, and (b) spine.
(a) (b)
Figure 1.8 Examples of ultrasound imaging. (a) Baby. (b) Another view of the baby.
(a) (b)
Figure 1.9 (a) Image of a damaged integrated circuit magnified 2500 times. (b) Fractal image.
(a)
(b) (c)
Figure 1.10 (a) Face recognition system for a PDA. (b) Iris recognition. (c) Fingerprint recognition.
Image processing methods may be divided into image processing
techniques, whose inputs and outputs are images, and image analysis
techniques, whose inputs are images but whose outputs are attributes
extracted from those images.
1. Image processing techniques include:
Image Enhancement: brings out detail that is obscured, or simply
highlights certain features of interest in an image. A familiar
example of enhancement is increasing the contrast of an image.
Image Restoration: attempts to reconstruct or recover an image that
has been degraded by using a priori knowledge of the degradation
phenomenon.
Image Compression: deals with techniques for reducing the storage
required to save an image, or the bandwidth required to transmit it.
2. Image Analysis tasks include:
Image Segmentation: is concerned with procedures that partition an
image into its constituent parts or objects.
Image Representation and Description: Image representation
converts the output of a segmentation stage to a form suitable for
computer processing. This form could be either the boundary of a
region or the whole region itself. Image description, also called
feature selection, deals with extracting attributes that result in some
quantitative information of interest or are basic for differentiating
one class of objects from another.
Image Recognition: is the process that assigns a label (e.g.,
"vehicle") to an object based on its descriptors.
Types of Digital Images
1. Binary images
Binary images are the simplest type of images and can take on two
values, typically black and white, or 0 and 1. A binary image is referred to
as a 1-bit image because it takes only 1 binary digit to represent each
pixel. These types of images are frequently used in applications where the
only information required is general shape or outline, for example optical
character recognition (OCR).
Binary images are often created from gray-scale images via a threshold
operation, where every pixel above the threshold value is turned white
('1') and every pixel below it is turned black ('0'). In the figure below,
we see examples of binary images.
(a) (b)
Figure 2.1 Binary images. (a) Object outline. (b) Page of text used in OCR application.
2. Gray-scale images
Gray-scale images are referred to as monochrome (one-color) images.
They contain brightness information only, with no color information. A
typical gray-scale image uses 8 bits per pixel, allowing 256 different
brightness (gray) levels.
3. Color images
Color images can be modeled as three-band monochrome image data,
where each band of data corresponds to a different color. The actual
information stored in the digital image data is the gray-level information
in each spectral band.
Typical color images are represented as red, green, and blue (RGB
images). Using the 8-bit monochrome standard as a model, the
corresponding color image would have 24-bits/pixel (8-bits for each of
the three color bands red, green, and blue). The figure below illustrates a
representation of a typical RGB color image.
4. Multispectral images
Multispectral images typically contain information outside the normal
human perceptual range. This may include infrared, ultraviolet, X-ray,
acoustic, or radar data. These are not images in the usual sense because
the information represented is not directly visible to the human visual system.
However, the information is often represented in visual form by mapping
the different spectral bands to RGB components.
Most image file formats fall into the category of bitmap images, for
example:
PPM (Portable Pix Map) format
TIFF (Tagged Image File Format)
GIF (Graphics Interchange Format)
JPEG (Joint Photographic Experts Group) format
BMP (Windows Bitmap)
PNG (Portable Network Graphics)
XWD (X Window Dump)
The gray level l of the image at coordinates (x_a, y_b) is
l = f(x_a, y_b)
From the above, it is evident that l lies in the range
L_min ≤ l ≤ L_max
Common practice is to shift this interval to [0, L−1], where l = 0 is
considered black and l = L−1 is considered white on the gray scale. All
intermediate values are shades of gray varying from black to white.
Figure 2.1 Generating a digital image. (a) Continuous image. (b) A scan line from A to B in the continuous image. (c) Sampling and quantization. (d) Digital scan line.
The digital samples resulting from both sampling and quantization are
shown in Figure 2.1(d). Starting at the top of the image and carrying out
this procedure line by line produces a two-dimensional digital image as
shown in Figure 2.3.
Note that:
The number of samples taken in the sampling process is known as
the image spatial resolution. This is simply the number of pixels
relative to the given image area.
The number of values allowed in the quantization process is called
the gray-level (color-level) resolution. This is expressed in terms
of the number of bits allocated to the color levels.
The quality of a digitized image depends on the resolution
parameters of both processes.
Each element of this matrix array is called a pixel. The spatial resolution
(number of pixels) of the digital image is M × N. The gray-level resolution
(number of gray levels) L is
L = 2^k
where k is the number of bits used to represent the gray levels of the
digital image. When an image can have 2^k gray levels, we refer to the
image as a "k-bit image". For example, an image with 256 possible gray-
level values is called an 8-bit image.
The gray levels are integers in the interval [0, L−1]. This interval is called
the gray scale.
The number of bits, b, required to store a digitized image is
b = M × N × k
Example:
For an 8-bit image of size 512×512, determine its gray-scale and storage
size.
Solution: k = 8, M = N = 512
Number of gray levels L = 2^k = 2^8 = 256
The gray scale is [0, 255]
Storage size b = M × N × k = 512 × 512 × 8 = 2,097,152 bits
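The same computation is easy to script. A minimal Python sketch (the function name and printed values are illustrative, not part of the lecture):

    # Gray-level resolution and storage size of a k-bit M x N image.
    def image_storage(M, N, k):
        L = 2 ** k          # number of gray levels
        b = M * N * k       # storage size in bits
        return L, b

    L, b = image_storage(512, 512, 8)
    print(L, b)             # 256 2097152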
Figure 2.4 A 1024×1024, 8-bit image subsampled down to size 32×32 pixels.
To see the effects resulting from the reduction in the number of samples,
we bring all the subsampled images up to size 1024×1024 by row and
column pixel replication. The resulting images are shown in the figure
below.
Figure 2.5 (a) 1024×1024, 8-bit image. (b) through (f) 512×512, 256×256, 128×128, 64×64,
and 32×32 images resampled into 1024×1024 pixels by row and column duplication
Comparing Figure 2.5(a) with the 512×512 image in Figure 2.5(b), we find
that the level of detail lost is simply too fine to be seen on the printed
page at the scale at which these images are shown. Next, the 256×256
image in Figure 2.5(c) shows a very slight fine checkerboard pattern in
the borders between flower petals and the black background. A slightly
more pronounced graininess throughout the image also is beginning to
appear. These effects are much more visible in the 128×128 image in
Figure 2.5(d), and they become pronounced in the 64×64 and 32×32
images in Figures 2.5(e) and (f), respectively.
Example
The pixel values of the following 5×5 image are represented by 8-bit
integers:
Determine f with a gray-level resolution of 2^k for (i) k = 5 and (ii) k = 3.
Solution:
Dividing the image by 2 reduces its gray-level resolution by 1 bit. Hence,
to reduce the gray-level resolution from 8 bits to 5 bits, 8 − 5 = 3 bits
must be removed. Thus, we divide the 8-bit image by 2^3 = 8 to obtain the
corresponding 5-bit image.
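A short NumPy sketch of this reduction by integer division (the toy 2×2 array is ours, not the lecture's image):

    import numpy as np

    # Reduce an 8-bit image to k bits by integer division by 2^(8 - k);
    # for k = 5 this divides by 8, as in the example above.
    def reduce_gray_levels(img8, k):
        return img8 // (2 ** (8 - k))

    img = np.array([[200, 100], [50, 255]], dtype=np.uint8)
    print(reduce_gray_levels(img, 5))   # values now lie in [0, 31]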
2. Distance Measures
For pixels p, q, and z, with coordinates (x, y), (s, t), and (v, w),
respectively, D is a distance function or metric if
a) D(p, q) ≥ 0, with D(p, q) = 0 iff p = q,
b) D(p, q) = D(q, p), and
c) D(p, z) ≤ D(p, q) + D(q, z).
The Euclidean distance between p and q is defined as
D_e(p, q) = √((x − s)² + (y − t)²)
Page 2
Image Processing Lecture 3
A) Zooming
Zooming may be viewed as oversampling.
[Worked matrices: original image → image with rows expanded → image with rows and columns expanded]
Note that the zoomed image has size (2M−1) × (2N−1). However, we can use
techniques such as padding, i.e. adding new columns and/or rows to the
original image, in order to perform bilinear interpolation and obtain a
zoomed image of size 2M × 2N.
Figure 3.1 Top row: images zoomed from 128×128, 64×64, and 32×32 pixels to 1024×1024 pixels using nearest-neighbor interpolation. Bottom row: same sequence, but using bilinear interpolation.
B) Shrinking
Shrinking may be viewed as undersampling. Image shrinking is
performed by row-column deletion. For example, to shrink an image by
one-half, we delete every other row and column.
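Both zooming by replication and shrinking by row-column deletion are one-liners in NumPy; a minimal sketch under these assumptions:

    import numpy as np

    def zoom_replicate(img, factor=2):
        # Nearest-neighbor zoom: repeat each row, then each column.
        return np.repeat(np.repeat(img, factor, axis=0), factor, axis=1)

    def shrink_half(img):
        # Shrink by one-half: keep every other row and column.
        return img[::2, ::2]

    a = np.arange(16).reshape(4, 4)
    print(zoom_replicate(a).shape)   # (8, 8)
    print(shrink_half(a).shape)      # (2, 2)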
Image algebra
There are two categories of algebraic operations applied to images:
Arithmetic
Logic
These operations are performed on a pixel-by-pixel basis between two or
more images, except for the NOT logic operation, which requires only one
image. For example, to add images I1 and I2 to create I3:
I3(x,y) = I1(x,y) + I2(x,y)
(c) Subtracting image (b) from (a). Only moving objects appear in the resulting image
The logic operations AND, OR, and NOT form a complete set, meaning
that any other logic operation (XOR, NOR, NAND) can be created by a
combination of these basic elements. They operate in a bit-wise fashion
on pixel data.
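A short NumPy sketch of both operation classes (the two 2×2 toy images are ours):

    import numpy as np

    i1 = np.array([[100, 150], [200, 250]], dtype=np.uint8)
    i2 = np.array([[ 50,  60], [ 70,  80]], dtype=np.uint8)

    # Arithmetic: pixel-by-pixel addition (cast to int to avoid uint8
    # wrap-around, then clip back to the 8-bit range [0, 255]).
    i3 = np.clip(i1.astype(int) + i2.astype(int), 0, 255).astype(np.uint8)

    # Logic: bit-wise AND, OR, NOT on the pixel data.
    masked  = np.bitwise_and(i1, i2)
    union   = np.bitwise_or(i1, i2)
    negated = np.bitwise_not(i1)     # NOT inverts every bit of each pixel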
The AND and OR operations are used to perform masking, that is, to
select subimages of an image, as shown in the figure below. Masking is
also called Region of Interest (ROI) processing.
(a) Original image (b) AND image mask (c) Resulting image, (a) AND (b)
(d) Original image (e) OR image mask (f) Resulting image, (d) OR (e)
Figure 3.6 Image masking
The NOT operation creates a negative of the original image, as shown in
the figure below, by inverting each bit within each pixel value.
Image Histogram
The histogram of a digital image is a plot that records the frequency
distribution of gray levels in that image. In other words, the histogram is
a plot of the gray-level values versus the number of pixels at each gray
value. The shape of the histogram provides us with useful information
about the nature of the image content.
The histogram of a digital image f of size M × N with gray levels
in the range [0, L−1] is the discrete function
h(r_k) = n_k
where r_k is the kth gray level and n_k is the number of pixels in the image
having gray level r_k.
The next figure shows an image and its histogram.
Note that the horizontal axis of the histogram plot (Figure 3.8(b))
represents the gray-level values r_k, from 0 to 255. The vertical axis
represents the values of h(r_k) = n_k, i.e. the number of pixels which have
the gray level r_k.
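Computing h(r_k) directly from its definition takes only a few lines; a minimal sketch (the toy image is ours):

    import numpy as np

    def histogram(img, L=256):
        # h[k] = number of pixels having gray level k, for k = 0..L-1.
        h = np.zeros(L, dtype=int)
        for k in range(L):
            h[k] = np.sum(img == k)
        return h

    img = np.array([[0, 1, 1], [2, 1, 0]])
    print(histogram(img, L=4))   # [2 3 1 0]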
The next figure shows another image and its histogram.
Image Enhancement
Image enhancement aims to process an image so that the output image is
“more suitable” than the original. It is used to solve some computer
imaging problems, or to improve “image quality”. Image enhancement
techniques include smoothing, sharpening, highlighting features, or
normalizing illumination for display and/or analysis.
Image negatives
The negative of an image with gray levels in the range [0, L-1] is
obtained by using the following expression
s = L − 1 − r
This type of processing is useful for enhancing white or gray detail
embedded in dark regions of an image, especially when the black areas
are dominant in size.
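The negative transform is a one-line point operation; a sketch for an 8-bit image (L = 256):

    import numpy as np

    def negative(img, L=256):
        # s = (L - 1) - r applied to every pixel.
        return (L - 1) - img

    img = np.array([[0, 100], [200, 255]], dtype=np.uint8)
    print(negative(img))   # [[255 155] [ 55   0]]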
Piecewise-linear transformation
The form of piecewise linear functions can be arbitrarily complex. Some
important transformations can be formulated only as piecewise functions,
for example thresholding:
For any 0 < t < 255, the threshold transform h_t can be defined as:
s = h_t(r) = 255   if r ≥ t
s = h_t(r) = 0     if r < t
[Plot: thresholding transform; input gray level r (0-255) vs. output gray level s (0-255)]
Thresholding has another form used to generate binary images from
gray-scale images, i.e.:
s = h_t(r) = 1   if r ≥ t
s = h_t(r) = 0   if r < t
[Plot: thresholding transform for binary output]
The figure below shows a gray-scale image and its binary image resulted
from thresholding the original by 120:
(a) (b)
Figure 4.5 Thresholding. (a) Gray-scale image. (b) Result of thresholding (a) by 120
Example:
For the following piecewise linear chart determine the equation of
the corresponding grey-level transforms:
[Plot of the piecewise-linear transformation omitted]
Solution
We use the straight line formula to compute the equation of each line
segment using two points.
Log transformation
The general form of the log transformation is
s = c · log(1 + r)
where c is a constant and it is assumed that r ≥ 0. This transformation is
used to expand the values of dark pixels in an image while compressing
the higher-level values, as shown in the figure below.
[Plot: log transform; input gray level r vs. output gray level s]
Power-law transformation
Power-law transformations have the basic form:
s = c · r^γ
where c and γ are positive constants. The power γ is known as gamma;
hence this transform is also called the gamma transformation. The figure
below shows the form of the power-law transform for different gamma (γ)
values.
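Both transformations are simple point operations; a hedged NumPy sketch (the scaling choice of c in the log case is ours, picked so the output fills the gray scale):

    import numpy as np

    def log_transform(img, L=256):
        # s = c * log(1 + r), with c = (L-1)/log(L) so the output fills [0, L-1].
        c = (L - 1) / np.log(L)
        return (c * np.log1p(img.astype(float))).astype(np.uint8)

    def gamma_transform(img, gamma, c=1.0, L=256):
        # s = c * r^gamma, computed on gray levels normalized to [0, 1].
        r = img.astype(float) / (L - 1)
        return (np.clip(c * r ** gamma, 0.0, 1.0) * (L - 1)).astype(np.uint8)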
(a) (b)
(c) (d)
Figure 4.11 (a) Original MRI image of a human spine. (b)-(d) Results of applying power-law transformation with c = 1 and γ = 0.6, 0.4, and 0.3, respectively.
We note that, as gamma decreased from 0.6 to 0.4, more detail became
visible. A further decrease of gamma to 0.3 enhanced a little more detail
in the background, but began to reduce contrast, giving a "washed-out" image.
(a) (b)
(c) (d)
Figure 4.12 (a) Original bright image. (b)-(d) Results of applying power-law transformation with c = 1 and γ = 3, 4, and 5, respectively.
We note that suitable results were obtained with gamma values of 3.0
and 4.0. The result obtained with γ = 5.0 has areas that are too dark, in
which some detail is lost.
Figure 4.13 Four basic image types: dark, light, low-contrast, and high-contrast, and their corresponding histograms.
Contrast stretching
aims to increase (expand) the dynamic range of an image. It transforms
the gray levels in the range {0, 1, …, L−1} by a piecewise linear function.
The figure below shows a typical transformation used for contrast
stretching. The locations of the points (r1, s1) and (r2, s2) control the
shape of the transformation function.
[Plot: contrast-stretching transformation]
This transformation will be used to increase the contrast of the image
shown in the figure below:
(a) (b)
(c) (d)
Figure 5.3 Contrast stretching. (a) Original image. (b) Histogram of (a). (c) Result of contrast
stretching. (d) Histogram of (c).
For a given plot, we use the equation of a straight line to compute the
piecewise linear function for each line segment:
For example, for the plot in Figure 5.2, for the input gray values in the
interval [28, 75] we get:
y − 28 = ((255 − 28) / (75 − 28)) (x − 28)
y = (227x − 5040) / 47    if 28 ≤ x ≤ 75
Similarly, we compute the equations of the other line segments.
s = 0                       if r < 90
s = (255r − 22950) / 48     if 90 ≤ r ≤ 138
s = 255                     if r > 138
[Plot: full contrast-stretching transformation]
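This piecewise function translates directly into code; a sketch using the breakpoints of the example above (r1 = 90, r2 = 138):

    import numpy as np

    def stretch(r):
        # s = 0 for r < 90; s = (255r - 22950)/48 for 90 <= r <= 138; s = 255 otherwise.
        r = r.astype(float)
        s = np.where(r < 90, 0.0,
            np.where(r > 138, 255.0, (255 * r - 22950) / 48))
        return s.astype(np.uint8)

    print(stretch(np.array([50, 90, 114, 138, 200])))   # [  0   0 127 255 255]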
(a) (b)
(c) (d)
Figure 5.5 (a) Low-contrast image. (b) Histogram of (a). (c) High-contrast image resulted
from applying full contrast-stretching in Figure 5.4 on (a). (d) Histogram of (c)
Gray-level slicing
Gray-level slicing aims to highlight a specific range [A, B] of gray
levels. It simply maps all gray levels in the chosen range to a high value.
Other gray levels are either mapped to a low value (Figure 5.6(a)) or left
unchanged (Figure 5.6(b)). Gray-level slicing is used for enhancing
features such as masses of water in satellite imagery; thus it is useful for
feature extraction.
Figure 5.6 Gray-level slicing. (b) Operation that intensifies the desired gray-level range while preserving other values; (c) result of applying (b) on (a) (background unchanged). (d) Operation that intensifies the desired gray-level range while changing other values to black; (e) result of applying (d) on (a) (background changed to black).
Histogram Equalization
is an automatic enhancement technique which produces an output
(enhanced) image that has a near uniformly distributed histogram.
For continuous functions, the intensity (gray level) in an image
may be viewed as a random variable with a probability density function
(PDF). The PDF at a gray level r represents the expected proportion
(likelihood) of occurrence of gray level r in the image. For a digital
image, the transformation function has the form
s_k = T(r_k) = (L − 1) Σ_{j=0}^{k} p_r(r_j),   k = 0, 1, …, L − 1
where p_r(r_j) = n_j / MN is the normalized histogram of the input image.
The right side of this equation is known as the cumulative histogram for
the input image. This transformation is called histogram equalization or
histogram linearization.
Because a histogram is an approximation to a continuous PDF, perfectly
flat histograms are rare in applications of histogram equalization. Thus,
the histogram equalization results in a near uniform histogram. It spreads
the histogram of the input image so that the gray levels of the equalized
(enhanced) image span a wider range of the gray scale. The net result is
contrast enhancement.
Example:
Suppose that a 3-bit image (L = 8) of size 64 × 64 pixels has the gray
level (intensity) distribution shown in the table below.
rk nk
r0 = 0 790
r1 = 1 1023
r2 = 2 850
r3 = 3 656
r4 = 4 329
r5 = 5 245
r6 = 6 122
r7 = 7 81
Perform histogram equalization on this image, and draw its normalized
histogram, transformation function, and the histogram of the equalized
image.
Solution:
M × N = 4096
We compute the normalized histogram:
rk nk pr (rk ) = nk /MN
r0 = 0 790 0.19
r1 = 1 1023 0.25
r2 = 2 850 0.21
r3 = 3 656 0.16
r4 = 4 329 0.08
r5 = 5 245 0.06
r6 = 6 122 0.03
r7 = 7 81 0.02
Normalized histogram
Transformation function
We round the values of s to the nearest integer:
These are the values of the equalized histogram. Note that there are only
five gray levels.
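The whole computation fits in a few lines; a sketch that reproduces the rounded values of s for this example:

    import numpy as np

    # s_k = round((L - 1) * cumulative sum of p_r(r_j)) for the 3-bit example.
    p = np.array([0.19, 0.25, 0.21, 0.16, 0.08, 0.06, 0.03, 0.02])
    L = 8
    s = np.round((L - 1) * np.cumsum(p)).astype(int)
    print(s)             # [1 3 5 6 6 7 7 7]
    print(np.unique(s))  # [1 3 5 6 7]  -> only five distinct gray levels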
Figure 5.8 Left column: original images. Center column: corresponding histogram-equalized images. Right column: histograms of the images in the center column.
Although all the histograms of the equalized images are different, these
images themselves are visually very similar. This is because the
difference between the original images is simply one of contrast, not of
content.
However, in some cases histogram equalization does not lead to a
successful result as shown below.
We clearly see that histogram equalization did not produce a good result
in this case. We see that the intensity levels have been shifted to the upper
one-half of the gray scale, thus giving the image a washed-out
appearance. The cause of the shift is the large concentration of dark
components at or near 0 in the original histogram. In turn, the cumulative
transformation function obtained from this histogram is steep, as shown in
the figure below, thus mapping the large concentration of pixels in the
low end of the gray scale to the high end of the scale.
In other cases, HE may introduce noise and other undesired effects into
the output images, as shown in the figure below.
1. Compute the normalized histogram p_r(r) of the input image, and use
it to find the histogram-equalization transformation
s_k = T(r_k) = (L − 1) Σ_{j=0}^{k} p_r(r_j)
Then round the resulting values, s_k, to the integer range [0, L−1].
2. Compute the specified histogram p_z(z) of the desired output image,
and use it to find the transformation function G:
G(z_q) = (L − 1) Σ_{i=0}^{q} p_z(z_i)
Then round the values of G to integers in the range [0, L−1]. Store
the values of G in a table.
3. Perform inverse mapping. For every value of sk, use the stored
values of G from step 2 to find the corresponding value of zq so
that G(zq ) is closest to sk and store these mappings from s to z.
When more than one value of zq satisfies the given sk (i.e. the
mapping is not unique), choose the smallest value.
4. Form the output image by first histogram-equalizing the input
image and then mapping every equalized pixel value, sk, of this
image to the corresponding value zq in the histogram-specified
image using the inverse mappings in step 3.
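A compact sketch of steps 1-3 above (pr is the input histogram, pz the specified one; both are assumed normalized):

    import numpy as np

    def specification_mapping(pr, pz):
        L = len(pr)
        s = np.round((L - 1) * np.cumsum(pr)).astype(int)   # step 1: equalize input
        G = np.round((L - 1) * np.cumsum(pz)).astype(int)   # step 2: transform of pz
        # Step 3: for each s_k choose the smallest z_q whose G(z_q) is closest
        # to s_k (np.argmin returns the first, i.e. smallest, index on ties).
        z = np.array([int(np.argmin(np.abs(G - sk))) for sk in s])
        return s, G, z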
Example:
Consider the 3-bit image of size 64 × 64 pixels with the gray-level
distribution shown in the table, and the specified histogram below.
rk nk
r0 = 0 790
r1 = 1 1023
r2 = 2 850
r3 = 3 656
r4 = 4 329
r5 = 5 245
r6 = 6 122
r7 = 7 81
Perform histogram specification on the image, and draw its normalized
histogram, specified transformation function, and the histogram of the
output image.
Solution:
Step 1:
M × N = 4096
We compute the normalized histogram:
rk nk pr (rk ) = nk /MN
r0 = 0 790 0.19
r1 = 1 1023 0.25
r2 = 2 850 0.21
r3 = 3 656 0.16
r4 = 4 329 0.08
r5 = 5 245 0.06
r6 = 6 122 0.03
r7 = 7 81 0.02
Normalized histogram
Step 2:
We compute the values of the transformation G
zq G (zq )
z0 = 0 0
z1 = 1 0
z2 = 2 0
z3 = 3 1
z4 = 4 2
z5 = 5 5
z6 = 6 6
z7 = 7 7
Transformation function obtained
from the specified histogram
Step 3:
We find the corresponding value of zq so that the value G(zq ) is the
closest to sk.
Step 4:
Note:
The size of mask must be odd (i.e. 3×3, 5×5, etc.) to ensure it has a
center. The smallest meaningful size is 3×3.
g(x, y) = Σ_s Σ_t w(s, t) f(x + s, y + t)
where the sums run over the rows s and columns t of the mask w.
Example:
Use the following 3×3 mask to perform the convolution process on the
shaded pixels in the 5×5 image below. Write the filtered image.

3×3 mask:
0    1/6  0
1/6  1/3  1/6
0    1/6  0

5×5 image:
30  40  50  70  90
40  50  80  60  100
35  255 70  0   120
30  45  80  100 130
40  50  90  125 140
Solution:
0×30 + (1/6)×40 + 0×50 + (1/6)×40 + (1/3)×50 + (1/6)×80 + 0×35 + (1/6)×255 + 0×70 = 85
0×40 + (1/6)×50 + 0×70 + (1/6)×50 + (1/3)×80 + (1/6)×60 + 0×255 + (1/6)×70 + 0×0 = 65
0×50 + (1/6)×70 + 0×90 + (1/6)×80 + (1/3)×60 + (1/6)×100 + 0×70 + (1/6)×0 + 0×120 = 61
0×40 + (1/6)×50 + 0×80 + (1/6)×35 + (1/3)×255 + (1/6)×70 + 0×30 + (1/6)×45 + 0×80 = 118
and so on … (results are truncated to integers)
Filtered image =
30  40  50  70  90
40  85  65  61  100
35  118 92  58  120
30  84  77  89  130
40  50  90  125 140
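The same sliding-mask computation in a short NumPy sketch; borders are copied unchanged and results are truncated to integers, matching the worked example:

    import numpy as np

    def filter2d(img, mask):
        out = img.astype(float).copy()
        m, n = img.shape
        for x in range(1, m - 1):
            for y in range(1, n - 1):
                region = img[x-1:x+2, y-1:y+2].astype(float)
                out[x, y] = np.sum(region * mask)   # sum of products under the mask
        return out.astype(int)                      # truncate, as in the example

    mask = np.array([[0, 1/6, 0], [1/6, 1/3, 1/6], [0, 1/6, 0]])
    img = np.array([[30, 40, 50, 70, 90],
                    [40, 50, 80, 60, 100],
                    [35, 255, 70, 0, 120],
                    [30, 45, 80, 100, 130],
                    [40, 50, 90, 125, 140]])
    print(filter2d(img, mask))   # interior of row 2 becomes 85, 65, 61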
Spatial Filters
Spatial filters can be classified by effect into:
1. Smoothing Spatial Filters: also called lowpass filters. They include:
1.1 Averaging linear filters
1.2 Order-statistics nonlinear filters.
2. Sharpening Spatial Filters: also called highpass filters. For example,
the Laplacian linear filter.
Standard average filter:      Weighted average filter:
(1/9) ×  1 1 1                (1/16) ×  1 2 1
         1 1 1                          2 4 2
         1 1 1                          1 2 1
Note:
Weighted average filter has different coefficients to give more
importance (weight) to some pixels at the expense of others. The idea
behind that is to reduce blurring in the smoothing process.
(a) (b)
(c) (d)
(e) (f)
Figure 6.2 Effect of averaging filter. (a) Original image. (b)-(f) Results of smoothing with
square averaging filter masks of sizes n = 3,5,9,15, and 35, respectively.
Order-statistics filters
are nonlinear spatial filters whose response is based on ordering (ranking)
the pixels contained in the neighborhood, and then replacing the value of
the center pixel with the value determined by the ranking result.
Examples include Max, Min, and Median filters.
Median filter
It replaces the value of the center pixel by the median pixel value in the
neighborhood (i.e. the middle element after the values are sorted). Median
filters are particularly useful in removing impulse noise, also known as
salt-and-pepper noise (salt = 255, pepper = 0 gray levels).
In a 3×3 neighborhood the median is the 5th largest value, in a 5×5
neighborhood the 13th largest value, and so on.
For example, suppose that a 3×3 neighborhood has gray levels (10,
20, 0, 20, 255, 20, 20, 25, 15). These values are sorted as
(0,10,15,20,20,20,20,25,255), which results in a median of 20 that
replaces the original pixel value 255 (salt noise).
Example:
Consider the following 5×5 image:
20 30 50 80 100
30 20 80 100 110
25 255 70 0 120
30 30 80 100 130
40 50 90 125 140
Apply a 3×3 median filter on the shaded pixels, and write the filtered
image.
Solution
For the first shaded pixel, the sorted 3×3 neighborhood is:
20, 25, 30, 30, 30, 70, 80, 80, 255  →  median 30
For the second shaded pixel:
0, 20, 30, 70, 80, 80, 100, 100, 255  →  median 80
For the third shaded pixel:
0, 70, 80, 80, 100, 100, 110, 120, 130  →  median 100

Filtered Image =
20 30 50 80 100
30 20 80 100 110
25 30 80 100 120
30 30 80 100 130
40 50 90 125 140
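A direct sketch of the 3×3 median filter (borders left unchanged, matching the example):

    import numpy as np

    def median3x3(img):
        out = img.copy()
        m, n = img.shape
        for x in range(1, m - 1):
            for y in range(1, n - 1):
                # Median of the 3x3 neighborhood in the original image.
                out[x, y] = np.median(img[x-1:x+2, y-1:y+2])
        return out

    img = np.array([[20, 30, 50, 80, 100],
                    [30, 20, 80, 100, 110],
                    [25, 255, 70, 0, 120],
                    [30, 30, 80, 100, 130],
                    [40, 50, 90, 125, 140]])
    print(median3x3(img))   # the shaded row becomes 30, 80, 100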
(a) (b)
(c)
Figure 6.3 Effect of median filter. (a) Image corrupted by salt & pepper noise. (b) Result of
applying 3×3 standard averaging filter on (a). (c) Result of applying 3×3 median filter on (a).
∂f/∂x = f(x + 1, y) − f(x, y)   and   ∂f/∂y = f(x, y + 1) − f(x, y)
The second-order partial derivatives of the digital image f(x, y) are:
∂²f/∂x² = f(x + 1, y) + f(x − 1, y) − 2 f(x, y)
∂²f/∂y² = f(x, y + 1) + f(x, y − 1) − 2 f(x, y)
We conclude that:
• 1st derivative detects thick edges while 2nd derivative detects thin
edges.
• 2nd derivative has much stronger response at gray-level step than 1st
derivative.
Thus, we can expect a second-order derivative to enhance fine detail (thin
lines, edges, including noise) much more than a first-order derivative.
∇²f = ∂²f/∂x² + ∂²f/∂y²
(a) (b)
(c)
Figure 6.5 Example of applying Laplacian filter. (a) Original image. (b) Laplacian image.
(c) Sharpened image.
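A hedged sketch of Laplacian sharpening (one common variant: add the response of a center-positive Laplacian mask to the image):

    import numpy as np

    def laplacian_sharpen(img):
        # Center-positive Laplacian mask; adding its response to f is
        # equivalent to the classic sharpening g = f - (Laplacian of f).
        lap = np.array([[0, -1, 0], [-1, 4, -1], [0, -1, 0]], dtype=float)
        f = img.astype(float)
        out = f.copy()
        m, n = f.shape
        for x in range(1, m - 1):
            for y in range(1, n - 1):
                out[x, y] = f[x, y] + np.sum(f[x-1:x+2, y-1:y+2] * lap)
        return np.clip(out, 0, 255).astype(np.uint8)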
F(u, v) = (1/MN) Σ_x Σ_y f(x, y) [cos 2π(ux/M + vy/N) − j sin 2π(ux/M + vy/N)]
R(u, v) = (1/MN) Σ_x Σ_y f(x, y) cos 2π(ux/M + vy/N)    is called the real part
I(u, v) = (−1/MN) Σ_x Σ_y f(x, y) sin 2π(ux/M + vy/N)   is called the imaginary part
where the sums run over x = 0, …, M−1 and y = 0, …, N−1.
The figure below shows a gray image and its centered Fourier spectrum.
(a)
(b)
Figure 7.1 (a) Gray image. (b) Centered Fourier spectrum of (a)
Phase spectrum
Phase data contains information about where objects are in the image, i.e.
it holds spatial information as shown in the Figure below.
Inverse 2D-DFT
After performing the Fourier transform, if we want to convert the image
from the frequency domain back to the original spatial domain, we apply
the inverse transform. The inverse 2D-DFT is defined as:
f(x, y) = Σ_u Σ_v F(u, v) [cos 2π(ux/M + vy/N) + j sin 2π(ux/M + vy/N)]
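In practice the 2D-DFT, its centering, and its inverse are computed with the FFT; a short sketch (np.fft implements the same transform up to the placement of the 1/MN constant):

    import numpy as np

    f = np.random.rand(64, 64)               # stand-in for a gray image
    F = np.fft.fft2(f)                       # 2D-DFT
    Fc = np.fft.fftshift(F)                  # move zero frequency to the center
    spectrum = np.log1p(np.abs(Fc))          # log-scaled magnitude for display
    phase = np.angle(Fc)                     # phase spectrum
    f_back = np.real(np.fft.ifft2(F))        # inverse 2D-DFT recovers f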
H(u, v) = e^(−D(u,v)/D0)          low-pass filter
H(u, v) = 1 − e^(−D(u,v)/D0)      high-pass filter
The results of applying these two filters on the image in Figure 6.1(a) are
shown in the figure below.
Figure 7.5 Result of highpass filter modified by adding 0.75 to the filter
The ideal low-pass filter (ILPF) has the transfer function
H(u, v) = 1 if D(u, v) ≤ D0, and 0 if D(u, v) > D0
where D0 is the cutoff frequency and D(u, v) is the distance from the point
(u, v) to the center of the frequency rectangle. The ILPF indicates that all
frequencies inside a circle of radius D0 are passed with no attenuation,
whereas all frequencies outside this circle are completely attenuated.
The next figure shows a gray image with its Fourier spectrum. The circles
superimposed on the spectrum represent cutoff frequencies 5, 15, 30, 80
and 230.
(a) (b)
Figure 8.2 (a) Original image. (b) its Fourier spectrum
The figure below shows the results of applying ILPF with the previous
cutoff frequencies.
(a) (b)
(c) (d)
(e) (f)
Figure 8.3 (a) Original image. (b) - (f) Results of ILPF with cutoff frequencies 5, 15, 30, 80,
and 230 respectively.
The Butterworth low-pass filter (BLPF) of order n has the transfer function
H(u, v) = 1 / (1 + [D(u, v) / D0]^(2n))
Unlike the ILPF, the BLPF transfer function does not have a sharp transition
that establishes a clear cutoff between passed and filtered frequencies.
Instead, the BLPF has a smooth transition between low and high frequencies.
The figure below shows the results of applying a BLPF of order 2 with the
same cutoff frequencies as before.
(a) (b)
(c) (d)
(e) (f)
Figure 8.5 (a) Original image. (b) - (f) Results of BLPF of order n = 2 with cutoff frequencies
5, 15, 30, 80, and 230 respectively.
(a) (b)
(c) (d)
Figure 8.6 (a) Result of BLPF with order 5. (b) BLPF of order 5. (c) Result of BLPF with
order 20. (d) BLPF of order 20. (cutoff frequency 30 in both cases).
The BLPF is the preferred choice in cases where tight control of the
transition between low and high frequencies is needed. However, the
side effect of this control is the possibility of ringing.
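A sketch that builds and applies the three low-pass transfer functions of this lecture (the ideal and Butterworth filters above, and the Gaussian defined next); D(u, v) is the distance to the center of the shifted spectrum:

    import numpy as np

    def distance_grid(M, N):
        u = np.arange(M) - M // 2
        v = np.arange(N) - N // 2
        V, U = np.meshgrid(v, u)
        return np.sqrt(U**2 + V**2)          # D(u, v)

    def lowpass(img, D0, kind="ideal", n=2):
        D = distance_grid(*img.shape)
        if kind == "ideal":                  # ILPF: 1 inside radius D0, 0 outside
            H = (D <= D0).astype(float)
        elif kind == "butterworth":          # BLPF of order n
            H = 1.0 / (1.0 + (D / D0) ** (2 * n))
        else:                                # GLPF
            H = np.exp(-(D ** 2) / (2.0 * D0 ** 2))
        F = np.fft.fftshift(np.fft.fft2(img))
        return np.real(np.fft.ifft2(np.fft.ifftshift(H * F)))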
The Gaussian low-pass filter (GLPF) has the transfer function
H(u, v) = e^(−D²(u,v) / 2D0²)
Unlike the ILPF, the GLPF transfer function does not have a sharp transition
that establishes a clear cutoff between passed and filtered frequencies.
Instead, GLPF has a smooth transition between low and high frequencies.
The figure below shows the results of applying GLPF.
(a) (b)
(c) (d)
(e) (f)
Figure 8.8 (a) Original image. (b) - (f) Results of GLPF with cutoff frequencies 5, 15, 30, 80,
and 230 respectively.
(a) (b)
Figure 8.9 (a) Text of poor resolution. (b) Result of applying GLPF with cutoff=80 on (a)
GLPF can also be used for cosmetic processing prior to printing and
publishing as shown in the next figure.
(a) (b)
Figure 8.10 (a) Original image. (b) Result of filtering using GLPF with cutoff=80
The ideal high-pass filter (IHPF) has the transfer function
H(u, v) = 0 if D(u, v) ≤ D0, and 1 if D(u, v) > D0
The IHPF sets to zero all frequencies inside a circle of radius D0 while
passing, without attenuation, all frequencies outside the circle.
The next figure shows the results of applying IHPF with cutoff
frequencies 15, 30, and 80.
(a) (b)
(c) (d)
Figure 8.12 (a) Original image. (b) - (d) Results of IHPF with cutoff frequencies 15, 30, and
80 respectively.
The Butterworth high-pass filter (BHPF) of order n has the transfer function
H(u, v) = 1 / (1 + [D0 / D(u, v)]^(2n))
The figure below shows the results of applying the BHPF with cutoff
frequencies 15, 30 and 80.
(a) (b)
(c) (d)
Figure 8.14 (a) Original image. (b) - (d) Results of BHPF with cutoff frequencies 15, 30, and
80 respectively.
H(u, v) = 1 − e^(−D²(u,v) / 2D0²)
The figure below shows the results of applying GHPF with cutoff
frequencies 15, 30 and 80.
(a) (b)
(c) (d)
Figure 8.16 (a) Original image. (b) - (d) Results of GHPF with cutoff frequencies 15, 30, and
80 respectively.
(a) (b)
Figure 9.1 Functions of (a) Fourier transform and (b) Wavelet transform
Image Pyramid
is a simple but powerful structure for representing images at more than one
resolution. It is a collection of decreasing-resolution images arranged in
the shape of a pyramid, as shown in the figure below.
Subband Coding
is used to decompose an image into a set of bandlimited components
called subbands, which can be reassembled to reconstruct the original
image without error. Each subband is generated by bandpass filtering the
input image. The next figures show 1D and 2D subband coding.
For example, if f = {f1, f2, f3, f4, f5, f6, f7, f8} is a time signal of length 8,
then the HWT decomposes f into an approximation subband containing the
low frequencies and a detail subband containing the high frequencies:
Low = a = {f1 + f2, f3 + f4, f5 + f6, f7 + f8} / √2
High = d = {f1 − f2, f3 − f4, f5 − f6, f7 − f8} / √2
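One decomposition level is a few lines of NumPy (even-length signal assumed; the test signal is ours):

    import numpy as np

    def haar_1level(f):
        f = np.asarray(f, dtype=float)
        a = (f[0::2] + f[1::2]) / np.sqrt(2)   # approximation (low frequencies)
        d = (f[0::2] - f[1::2]) / np.sqrt(2)   # detail (high frequencies)
        return a, d

    a, d = haar_1level([4, 6, 10, 12, 8, 6, 5, 5])
    # a = [7.07, 15.56, 9.90, 7.07],  d = [-1.41, -1.41, 1.41, 0.]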
The next figure shows an image decomposed with 3-level Haar wavelet
transform.
(b) Level 1
(c) Level 2
(d) Level 3
Figure 9.6 Example of a Haar wavelet transformed image
(a) (b)
(c) (d)
Figure 9.7 Histogram of (a) LL-subband (b) HL-subband (c) LH-subband (d) HH-subband of
subbands in Figure 9.6 (b)
(a)
(b)
Figure 9.8 (a) gray image. (b) its one-level wavelet transform
Note that the horizontal edges of the original image are present in the HL
subband in the upper-right quadrant of the figure above. The vertical
edges of the image can similarly be identified in the LH subband of the
lower-left quadrant.
To combine this information into a single edge image, we simply zero the
LL subband of the transform, compute the inverse transform, and take the
absolute value.
The next Figure shows the modified transform and resulting edge image.
(a)
(b)
Figure 9.9 (a) transform modified by zeroing the LL subband. (b) resulted edge image
(a)
(b)
Figure 9.10 (a) noisy image. (b) its two-level wavelet transform
Image Restoration
Image restoration attempts to reconstruct or recover an image that has
been degraded by a degradation phenomenon. Thus, restoration
techniques are oriented toward modeling the degradation and applying
the inverse process in order to recover the original image. As in image
enhancement, the ultimate goal of restoration techniques is to improve an
image in some predefined sense.
g(x, y) = h(x, y) ∗ f(x, y) + η(x, y)
Noise Models
Spatial noise is described by the statistical behavior of the gray-level
values in the noise component of the degraded image. Noise can be
modeled as a random variable with a specific probability distribution
function (PDF). Important examples of noise models include:
1. Gaussian Noise
2. Rayleigh Noise
3. Gamma Noise
4. Exponential Noise
5. Uniform Noise
6. Impulse (Salt & Pepper) Noise
Gaussian Noise
The PDF of Gaussian noise is given by
p(z) = (1 / (√(2π) σ)) e^(−(z−μ)² / 2σ²)
where z is the gray value, μ is the mean and σ is the standard deviation.
Rayleigh Noise
The PDF of Rayleigh noise is given by
p(z) = (2/b)(z − a) e^(−(z−a)² / b)   for z ≥ a
p(z) = 0                              for z < a
If b > a, then gray level b appears as a light dot (salt), otherwise the gray
level a appears as a dark dot (pepper).
The next figure shows the degraded (noisy) images resulting from adding
the previous noise models to the above test pattern image.
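Two of these models are easy to simulate; a hedged sketch (parameter values are ours):

    import numpy as np

    rng = np.random.default_rng(0)

    def add_gaussian_noise(img, mu=0.0, sigma=10.0):
        noisy = img.astype(float) + rng.normal(mu, sigma, img.shape)
        return np.clip(noisy, 0, 255).astype(np.uint8)

    def add_salt_pepper(img, density=0.05):
        noisy = img.copy()
        r = rng.random(img.shape)
        noisy[r < density / 2] = 0                          # pepper
        noisy[(r >= density / 2) & (r < density)] = 255     # salt
        return noisy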
Figure 10.6 Images and histograms from adding Gaussian, Rayleigh, Gamma, Exponential,
Uniform, and Salt & Pepper noise.
Order-Statistics Filters
We have used one of these filters (i.e. median) in the image enhancement.
We now use additional filters (min and max) in image restoration.
Min filter
This filter is useful for finding the darkest points in an image. Also, it
reduces salt noise as a result of the min operation.
(a) (b)
Figure 10.8 (a) image corrupted by salt noise. (b) Result of filtering (a) with a 3×3 min filter.
Max filter
This filter is useful for finding the brightest points in an image. Also,
because pepper noise has very low values, it is reduced by this filter as a
result of the max operation.
(a) (b)
Figure 10.9 (a) image corrupted by pepper noise. (b) Result of filtering (a) with a 3×3 max
filter.
Adaptive Filters
The previous spatial filters are applied regardless of local image
variation. Adaptive filters change their behavior according to local
statistical parameters within the mask region. Consequently, adaptive
filters generally outperform non-adaptive ones.
(a)
(b) (c)
Figure 10.10 (a) Image corrupted by salt&pepper noise with density 0.25. (b) Result obtained
using a 7×7 median filter. (c) Result obtained using adaptive median filter with Smax = 7.
From this example, we find that the adaptive median filter has three main
purposes:
1. to remove salt-and-pepper (impulse) noise.
2. to provide smoothing of other noise that may not be impulsive.
3. to reduce distortion, such as excessive thinning or thickening of
object boundaries.
Structuring Element
A morphological operation is based on the use of a filter-like binary
pattern called the structuring element of the operation. A structuring
element is represented by a matrix of 0s and 1s; for simplicity, the zero
entries are often omitted.
Symmetric with respect to its origin:
Line:
0 0 0 0 1
0 0 0 1 0
0 0 1 0 0
0 1 0 0 0
1 0 0 0 0
(with the zero entries omitted, only the diagonal of 1s is written)
Diamond:
0 1 0
1 1 1
0 1 0
Non-symmetric:
Reflecting a non-symmetric structuring element about its origin rotates it
by 180°, so the reflected element B̂ differs from B.
Dilation
Dilation is an operation used to grow or thicken objects in binary images.
The dilation of a binary image A by a structuring element B is defined as:
A ⊕ B = { z : (B̂)_z ∩ A ≠ ∅ }
This equation is based on obtaining the reflection of B about its origin
and translating (shifting) this reflection by z. Then, the dilation of A by B
is the set of all structuring element origin locations where the reflected
and translated B overlaps with A by at least one element.
Solution:
We find the reflection of B. Here B is a vertical line of five 1s:
B = [1 1 1 1 1]ᵀ
and in this case B̂ = B.
0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 1 1 1 1 1 1 0
0 0 0 0 1 1 1 1 1 1 1 0
0 0 0 1 1 1 1 1 1 1 1 0
⊕ = 0 0 1 1 1 1 1 1 1 1 0 0
0 1 1 1 1 1 1 1 1 0 0 0
0 1 1 1 1 1 1 1 0 0 0 0
0 1 1 1 1 1 1 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0
(a)
(b)
Erosion
Erosion is used to shrink or thin objects in binary images. The erosion of
a binary image A by a structuring element B is defined as:
A ⊖ B = { z : (B)_z ∩ Aᶜ = ∅ }
The erosion of A by B is the set of all structuring element origin locations
where the translated B does not overlap with the background of A.
Solution
0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0
⊖ = 0 0 0 1 1 1 1 1 1 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0
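SciPy implements both operations directly; a small sketch on a synthetic object:

    import numpy as np
    from scipy.ndimage import binary_dilation, binary_erosion

    A = np.zeros((9, 12), dtype=bool)
    A[3:6, 3:9] = True                     # a 3x6 rectangular object
    B = np.ones((3, 3), dtype=bool)        # 3x3 square structuring element

    dilated = binary_dilation(A, structure=B)   # object grows by one pixel all around
    eroded  = binary_erosion(A, structure=B)    # object shrinks to a single row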
(a) (b)
Figure 11.4 (a) Binary image. (b) Eroded image.
(a)
(b)
(c)
Figure 11.5 (a) Original binary image. (b) Result of opening with square structuring element
of size 10 pixels. (c) Result of opening with square structuring element of size 20 pixels.
(a) (b)
Figure 11.6 (a) Original binary image. (b) Result of opening with square structuring element
of size 13 pixels.
(a)
(b)
Figure 11.7 (a) Result of closing with square structuring element of size 10 pixels. (b) Result of closing with square structuring element of size 20 pixels.
(a)
(b) (c)
Figure 11.8 (a) Noisy fingerprint. (b) Result of opening (a) with square structuring element of
size 3 pixels. (c) Result of closing (b) with the same structuring element.
Note that the noise was removed by opening the image, but this process
introduced numerous gaps in the ridges of the fingerprint. These gaps can
be filled by following the opening with a closing operation.
A ⊛ B = (A ⊖ B1) ∩ (Aᶜ ⊖ B2)
This transform is useful in locating all pixel configurations that match the
B1 structure (i.e. a hit) but do not match that of B2 (i.e. a miss). Thus, the
hit-or-miss transform is used for shape detection.
Example: use the hit-or-miss transform to locate the cross shape in the image A below.

Shape to detect (B1):        B2 (zero entries omitted):
0 1 0                        1   1
1 1 1
0 1 0                        1   1

Image A:
00000000000
00100000000
00100111100
01110000000
00100001100
00001001110
00011100100
00001000000
00000000000

Solution:
A ⊖ B1 =          Ac =
00000000000       11111111111
00000000000       11011111111
00000000000       11011000011
00100000000       10001111111
00000000000       11011110011
00000000100       11110110001
00001000000       11100011011
00000000000       11110111111
00000000000       11111111111

Ac ⊖ B2 =         A ⊛ B =
10101111111       00000000000
10100000001       00000000000
00000111111       00000000000
10100000001       00100000000
00000000000       00000000000
10000000001       00000000000
11101000000       00001000000
11000000101       00000000000
11101011111       00000000000
(a) (b)
Figure 12.1 (a) Binary image. (b) Result of applying hit-or-miss transform.
Boundary Extraction
The boundary of a set A, denoted β(A), can be obtained by:
β(A) = A − (A ⊖ B)
where B is the structuring element.
The figure below shows an example of extracting the boundary of an
object in a binary image.
(a) (b)
Figure 12.2 (a) Binary image. (b) Object boundary extracted
using the previous equation and 3×3 square structuring element.
Note that, because the size of the structuring element is 3×3 pixels, the
resulting boundary is one pixel thick. Using a 5×5 structuring element
would produce a boundary between 2 and 3 pixels thick, as shown in the
next figure.
Thinning
Thinning means reducing binary objects or shapes in an image to strokes
that are a single pixel wide. The thinning of a set A by a structuring
element B, is defined as:
A ⊗ B = A − (A ⊛ B) = A ∩ (A ⊛ B)ᶜ
Since we only match the pattern (shape) with the structuring elements, no
background operation is required in the hit-or-miss transform here.
B denotes a sequence of structuring elements:
{B} = {B¹, B², B³, …, Bⁿ}
where Bⁱ is a rotated version of Bⁱ⁻¹. Thus, the thinning equation can be
written as:
A ⊗ {B} = ((…((A ⊗ B¹) ⊗ B²) …) ⊗ Bⁿ)
The entire process is repeated until no further changes occur. The next
figure shows an example of thinning the fingerprint ridges so that each is
one pixel thick.
(a) (b)
(c) (d)
Figure 12.4 (a) Original fingerprint image. (b) Image thinned once. (c) Image thinned twice.
(d) Image thinned until stability (no changes occur).
S(A) = ∪_{k=0}^{K} S_k(A)
with
S_k(A) = (A ⊖ kB) − (A ⊖ kB) ∘ B
where B is a structuring element, and (A ⊖ kB) indicates k successive
erosions of A:
(A ⊖ kB) = ((…((A ⊖ B) ⊖ B) ⊖ …) ⊖ B
(a) (b)
Figure 12.5 (a) Bone image. (b) Skeleton extracted from (a).
Gray-scale Morphology
The basic morphological operations of dilation, erosion, opening and
closing can also be applied to gray images.
Gray-scale Dilation
The gray-scale dilation of a gray-scale image f by a structuring element b
is defined as:
(f ⊕ b)(x, y) = max{ f(x − s, y − t) + b(s, t) | (s, t) ∈ D_b }
where D_b is the domain of the structuring element b. This process
operates in the same way as spatial convolution.
The figure below shows the result of dilating a gray image using a 3×3
square structuring element.
(a) (b)
Figure 12.6 (a) Original gray image. (b) Dilated image.
Gray-scale Erosion
The gray-scale erosion of a gray-scale image f by a structuring element b
is defined as:
(f ⊖ b)(x, y) = min{ f(x + s, y + t) − b(s, t) | (s, t) ∈ D_b }
The next figure shows the result of eroding a gray image using a 3×3
square structuring element.
(a) (b)
Figure 12.7 (a) Original gray image. (b) Eroded image.
We can see that gray-scale erosion produces: (1) a darker image, and
(2) reduced sizes of small, bright details.
(a) (b)
Figure 12.8 (a) Original gray image. (b) Opened image.
Note the decreased sizes of the small, bright details, with no appreciable
effect on the darker gray levels.
The figure below shows the result of closing a gray image.
(a) (b)
Figure 12.9 (a) Original gray image. (b) Closed image.
Note the decreased sizes of the small, dark details, with relatively little
effect on the bright features.
(a) (b)
Figure 12.10 (a) Original gray image. (b) Morphological smoothed image.
Morphological gradient
is produced by subtracting an eroded image from its dilated version. It
is defined as:
g = (f ⊕ b) − (f ⊖ b)
The resulting image has edge-enhancement characteristics; thus the
morphological gradient can be used for edge detection, as shown in the
figure below.
(a) (b)
Figure 12.11 (a) Original gray image. (b) Morphological gradient.
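A sketch of the morphological gradient using SciPy's gray-scale operators (a flat 3×3 structuring element is assumed):

    import numpy as np
    from scipy.ndimage import grey_dilation, grey_erosion

    img = np.random.randint(0, 256, (64, 64)).astype(np.uint8)   # stand-in image

    gradient = (grey_dilation(img, size=(3, 3)).astype(int)
                - grey_erosion(img, size=(3, 3)).astype(int))    # (f dilated) - (f eroded)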
Image Segmentation
is one of the image analysis methods used to subdivide an image into its
constituent regions or objects, depending on the types of shapes and
objects searched for in the image. Image segmentation is an essential
first step in most automatic pictorial pattern recognition and scene
analysis tasks.
Segmentation Approaches
Image segmentation algorithms are based on one of two basic properties
of gray-level values: discontinuity and similarity.
• In the first category, the approach is to partition an image based on
abrupt discontinuity (i.e. change) in gray level, such as edges in an
image.
• In the second category, the approaches are based on partitioning an
image into regions that are similar according to a set of predefined
criteria.
Point Detection
This is concerned with detecting isolated image points in relation to their
neighborhood, which is an area of nearly constant gray level.
1. Simple method
The simplest point detection method works in two steps:
1. Filter the image with the mask:
-1 -1 -1
-1  8 -1
-1 -1 -1
2. Apply a threshold to the filtered image: an isolated point is detected
where the absolute value of the filtered response exceeds the threshold.
(a)
Figure 13.1 Example of point detection using simple method. (a) Original face image.
(b)-(g) Results with different Thresholds
2. Alternative method
An alternative approach to the simple method is to locate the points in a
window of a given size where the difference between the max and the
min value in the window exceeds a given threshold. This can be done
again in two steps:
1. Obtain the difference between the max value (obtained with the
order statistics max filter) and the min value (obtained with the
order statistics min filter) in the given size mask.
2. On the output image apply an appropriate threshold (e.g. the
maximum pixel value).
The figure below shows an example of point detection in a face image
using the alternative method.
(a)
Figure 13.2 Example of point detection using alternative method. (a) Original face image.
(b)-(e) Results with different Thresholds
Line Detection
Detecting a line in a certain direction requires detecting adjacent points
in the image along the given direction. This can be done using filters
that yield a significant response at points aligned in the given direction.
For example, the following filters detect lines in the vertical, horizontal,
+45°, and −45° directions, respectively:

Vertical:        Horizontal:
-1  2 -1         -1 -1 -1
-1  2 -1          2  2  2
-1  2 -1         -1 -1 -1

+45°:            −45°:
-1 -1  2          2 -1 -1
-1  2 -1         -1  2 -1
 2 -1 -1         -1 -1  2
The next figure illustrates an example of line detection using the filters
above.
(a)
(b) (c)
(d) (e)
Figure 13.3 Example of line detection. (a) Original image. (b)-(e) Detected lines in the vertical, horizontal, +45°, and −45° directions, respectively.
Edge detection
Edge detection in images aims to extract meaningful discontinuities in
pixel gray-level values. Such discontinuities can be deduced from the
first- and second-order derivatives introduced earlier with the Laplacian
filter.
The 1st-order derivative of an image f(x,y) is defined as:
⎡ ⎤
⎢ ⎥
∇ = =⎢ ⎥
⎢ ⎥
⎣ ⎦
Its magnitude is defined as:
∇ = +
The Sobel edge detector uses the following two masks:
Left mask (Gx):     Right mask (Gy):
-1 -2 -1            -1  0  1
 0  0  0            -2  0  2
 1  2  1            -1  0  1
To detect:
• Horizontal edges, we filter the image f using the left mask above.
• Vertical edges, we filter the image f using the right mask above.
• Edges in both directions, we do the following:
1. Filter the image f with the left mask to obtain Gx.
2. Filter the image f again with the right mask to obtain Gy.
3. Compute ∇f = √(Gx² + Gy²) or ∇f ≈ |Gx| + |Gy|.
In all cases, we then take the absolute values of the filtered image and
apply an appropriate threshold.
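A direct sketch of these steps (the threshold T is an assumed parameter):

    import numpy as np

    def sobel_edges(img, T=128):
        gx_mask = np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]], dtype=float)
        gy_mask = gx_mask.T
        f = img.astype(float)
        m, n = f.shape
        mag = np.zeros_like(f)
        for x in range(1, m - 1):
            for y in range(1, n - 1):
                region = f[x-1:x+2, y-1:y+2]
                gx = np.sum(region * gx_mask)      # response to the left mask
                gy = np.sum(region * gy_mask)      # response to the right mask
                mag[x, y] = abs(gx) + abs(gy)      # |Gx| + |Gy| approximation
        return (mag > T).astype(np.uint8) * 255    # thresholded edge map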
The next figure shows an example of edge detection using the Sobel
detector.
(a) (b)
(c) (d)
Figure 13.4 Example of Sobel edge detection. (a) Original image.
(b)-(d) Edges detected in vertical, horizontal, and both directions, respectively.
(a) (b)
(c) (d)
Figure 13.5 Example of Prewitt edge detection. (a) Original image.
(b)-(d) Edges detected in vertical, horizontal, and both directions, respectively.
We can see that the Prewitt detector produces noisier results than the
Sobel detector. This is because the coefficient with value 2 in the Sobel
detector provides smoothing.
Image Compression
• Image compression means the reduction of the amount of data
required to represent a digital image by removing the redundant data.
It involves reducing the size of image data files, while retaining
necessary information.
• Mathematically, this means transforming a 2D pixel array (i.e. an image)
into a statistically uncorrelated data set. The transformation is applied
prior to storage or transmission of the image. At a later time, the
compressed image is decompressed to reconstruct the original
(uncompressed) image or an approximation of it.
• The ratio of the original (uncompressed) image size to the compressed
image size is referred to as the compression ratio C_R:
C_R = U_size / C_size
where
U_size = M × N × k is the size of the uncompressed M×N, k-bit image (in bits)
C_size = the size of the compressed image
Example:
Consider an 8-bit image of 256×256 pixels. After compression, the image
size is 6,554 bytes. Find the compression ratio.
Solution:
Usize = (256 × 256 × 8) / 8 = 65,536 bytes
Compression Ratio = 65536 / 6554 = 9.999 ≈ 10 (also written 10:1)
This means that the original image has 10 bytes for every 1 byte in the
compressed image.
Fidelity Criteria
These criteria are used to assess (measure) image fidelity. They quantify
the nature and extent of information loss in image compression. Fidelity
criteria can be divided into two classes:
1. Objective fidelity criteria
2. Subjective fidelity criteria
A common objective fidelity criterion is the root-mean-square (RMS)
error between the original image f and the reconstructed image f̂:
e_rms = √( (1/MN) Σ_x Σ_y [ f̂(x, y) − f(x, y) ]² )
Lossless compression
• It allows an image to be compressed and decompressed without losing
information (i.e. the original image can be recreated exactly from the
compressed image).
• This is useful in image archiving (as in the storage of legal or medical
records).
• For complex images, the compression ratio is limited (2:1 to 3:1). For
simple images (e.g. text-only images) lossless methods may achieve
much higher compression.
• An example of lossless compression techniques is Huffman coding.
Huffman Coding
is a popular technique for removing coding redundancy. The result of
Huffman coding is a variable-length code, in which the code words have
unequal lengths. Huffman coding yields the smallest possible number of
bits per gray-level value.
Example:
Consider the 8-bit gray image shown below. Use Huffman coding
technique for eliminating coding redundancy in this image.
119 123 168 119
123 119 168 168
119 119 107 119
107 107 119 119
Solution:
Gray level Histogram Probability
119 8 0.5
168 3 0.1875
107 3 0.1875
123 2 0.125
Source reduction (repeatedly merge the two smallest probabilities):
0.125 + 0.1875 = 0.3125,  then 0.1875 + 0.3125 = 0.5,  then 0.5 + 0.5 = 1
Code assignment (working back from the root, appending a 0/1 at each split):
0.5 → 1,  0.1875 → 00,  0.1875 → 011,  0.125 → 010
Lookup table:
Gray level Probability Code
119 0.5 1
168 0.1875 00
107 0.1875 011
123 0.125 010
We use this code to represent the gray level values of the compressed
image:
1 010 00 1
010 1 00 00
1 1 011 1
011 011 1 1
Hence, the total number of bits required to represent the gray levels of the
compressed image is 29: 10100010101000011011101101111.
The original (uncompressed) image requires 4 × 4 × 8 = 128 bits.
Compression ratio = 128 / 29 ≈ 4.4
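The construction generalizes to any image; a compact sketch using a heap (the code assignments may differ from the table above, but the code lengths, and hence the 29-bit total, agree):

    import heapq
    from collections import Counter

    def huffman_codes(pixels):
        # One heap entry per symbol: [frequency, tie-breaker, {symbol: code}].
        heap = [[c, i, {s: ""}] for i, (s, c) in enumerate(Counter(pixels).items())]
        heapq.heapify(heap)
        i = len(heap)
        while len(heap) > 1:
            lo, hi = heapq.heappop(heap), heapq.heappop(heap)
            for s in lo[2]: lo[2][s] = "0" + lo[2][s]   # extend codes on merge
            for s in hi[2]: hi[2][s] = "1" + hi[2][s]
            heapq.heappush(heap, [lo[0] + hi[0], i, {**lo[2], **hi[2]}])
            i += 1
        return heap[0][2]

    pixels = [119, 123, 168, 119, 123, 119, 168, 168,
              119, 119, 107, 119, 107, 107, 119, 119]
    codes = huffman_codes(pixels)
    print(sum(len(codes[p]) for p in pixels))   # 29 bits in total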
Lossy compression
• It allows a loss in the actual image data, so the original uncompressed
image cannot be recreated exactly from the compressed image.
• Lossy compression techniques provide higher levels of data reduction
but result in a less than perfect reproduction of the original image.
• This is useful in applications such as broadcast television and
videoconferencing. These techniques can achieve compression ratios
of 10 or 20 for complex images, and 100 to 200 for simple images.
• An example of lossy compression techniques is JPEG compression
and JPEG2000 compression.
Object Recognition
The automatic recognition of objects or patterns is one of the important
image analysis tasks. The approaches to pattern recognition are divided
into two principal areas:
• Decision-theoretic methods: deal with patterns described using
quantitative descriptors, such as length, area, and texture.
• Structural methods: deal with patterns best described by qualitative
descriptors (symbolic information), such as the relational
descriptors.
Each pattern is arranged in the form of a pattern vector
x = (x1, x2, …, xn)ᵀ
where each component x_i is the ith descriptor.
Matching
Recognition techniques based on matching represent each class by a
prototype pattern vector. The set of patterns of known classes is called
the training set. The set of patterns of unknown classes is called the
testing set.
An unknown pattern is assigned to the class to which it is closest in terms
of a predefined metric. The simplest approach is the minimum-distance
classifier, which, as its name implies, computes the (Euclidean) distance
between the unknown and each of the prototype vectors. Then, it chooses
the smallest distance to make a decision.
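A minimal sketch of a minimum-distance classifier (the two prototype vectors are invented for illustration):

    import numpy as np

    prototypes = np.array([[1.0, 1.0],    # class 0 prototype (mean vector)
                           [5.0, 5.0]])   # class 1 prototype

    def classify(x):
        d = np.linalg.norm(prototypes - x, axis=1)   # Euclidean distances
        return int(np.argmin(d))                     # index of the nearest class

    print(classify(np.array([1.2, 0.9])))   # -> 0
    print(classify(np.array([4.0, 5.5])))   # -> 1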
Let the set F = { fi,1, fi,2, fi,3, . . . , fi,m} be a training set of face
images of n subjects, where each subject i has m images. In the enrolment
stage, wavelet transform is applied on each training image so that a set
Wk(F) of multi-resolution decomposed images result. A new set LLk(F) of
all k-level LL-subbands will be obtained from the transformed face
images in the set Wk(F). The new set LLk(F) forms the set of features for
the training images. Thus, the training face image 1 of subject i (fi,1) is
Si = min (Si,j) (j = 1, . . . , m)
Structural methods
Structural recognition techniques are based on representing objects as
strings, trees or graphs and then defining descriptors and recognition rules
based on those representations.
The key difference between decision-theoretic and structural
methods is that the former uses quantitative descriptors expressed in the
form of numeric vectors, while the structural techniques deal with
symbolic information.