2.2 Edge
An edge is a part of an image that contains a significant variation. Edges provide important
visual information since they correspond to major physical, photometrical, or geometrical
variations in scene objects. Physical edges are produced by variations in the reflectance,
illumination, orientation, and depth of scene surfaces. Since image intensity is often
proportional to scene radiance, physical edges are represented by changes in the intensity
function of an image [2].
The most common edge types are steps, lines, and junctions. Step edges are mainly
produced by a physical edge, an object hiding another, or a shadow on a surface. They generally
occur between two regions having almost constant, but different, grey levels. Step edges
are the points at which the grey-level discontinuity occurs, and they are localized at the inflection
points. They can be detected by using the gradient of the intensity function of the image: step
edges are localized as positive maxima or negative minima of the first-order derivative, or as
zero-crossings of the second-order derivative (Figure 1). It is more realistic to consider a step
edge as a combination of several inflection points. The most commonly used edge model is
the double step edge, of which there are two types: the pulse and the staircase (Figure 2).
Line edges are often created either by mutual illumination between two objects that are
in contact or by a thin object placed over a background. Lines correspond to local extrema
of the intensity function. They are localized as zero-crossings of the first derivative, local
maxima of the Laplacian, or local maxima of the grey-level variance of the smoothed image.
This type of edge is successfully used in remote sensing images, for instance to detect roads
and rivers [2]. Finally, a junction edge is formed where two or more edges meet. A physical
corner is formed at the junction of at least two physical edges; illumination effects or
occlusion, in which one edge occludes another, can also produce a junction edge. Figure 3
depicts profiles of line and junction edges. A junction can be localized in various ways: e.g.,
as a point with high curvature, a point with great variation in gradient direction, or a
zero-crossing of the Laplacian with high curvature or near an elliptic extremum. Although our
study encompasses all types of edges, the majority of the reviewed literature is adapted to
step edges, which are the most common.
The edges extracted from a 2D image of a 3D scene can be classified as either viewpoint
dependent or viewpoint independent. A viewpoint independent edge typically reflects
inherent properties of the 3D objects, such as surface markings and surface shape. A
viewpoint dependent edge may change as the viewpoint changes, and typically reflects the
geometry of the scene, such as objects occluding one another [1].
The second-order derivative of g along the gradient direction can be written as

$$\frac{\partial^2 g}{\partial r^2} = \frac{g_x^2\,g_{xx} + 2\,g_x g_y\,g_{xy} + g_y^2\,g_{yy}}{r^2}, \qquad r^2 = g_x^2 + g_y^2 \tag{5}$$

with

$$g_{xx} = \frac{\partial^2 g}{\partial x^2}, \qquad g_{yy} = \frac{\partial^2 g}{\partial y^2}, \qquad g_{xy} = \frac{\partial^2 g}{\partial x\,\partial y} \tag{6}$$
The Laplacian of g, defined in (7), is an estimation of the second-order derivative along the
gradient direction. In the context of edge detection, it has been shown that the Laplacian is a
good approximation to the second derivative along the gradient direction, provided that the
curvature of the line of constant intensity crossing the point under consideration is small.
Consequently, the Laplacian is useless in the detection of junction edges (zones of high
curvature) [3].
$$\nabla^2 g = g_{xx} + g_{yy} \tag{7}$$
There are at least three major advantages of using the Laplacian instead of the second
derivative along the gradient direction. First, it is simple to use, since it only requires the
computation of two second-order derivatives. Second, it is a linear operator, as opposed to
the second derivative along the gradient, which is non-linear. Finally, and not least important,
the Laplacian is a non-directional operator. This characteristic avoids the need to determine
the most appropriate direction in which to apply the operator.
Prewitt:

$$H_c = \frac{1}{3}\begin{bmatrix} -1 & 0 & 1 \\ -1 & 0 & 1 \\ -1 & 0 & 1 \end{bmatrix}, \qquad H_r = \frac{1}{3}\begin{bmatrix} -1 & -1 & -1 \\ 0 & 0 & 0 \\ 1 & 1 & 1 \end{bmatrix} \tag{14}$$

Sobel:

$$H_c = \frac{1}{4}\begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix}, \qquad H_r = \frac{1}{4}\begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1 \end{bmatrix} \tag{15}$$

Frei-Chen (isotropic):

$$H_c = \frac{1}{2+\sqrt{2}}\begin{bmatrix} -1 & 0 & 1 \\ -\sqrt{2} & 0 & \sqrt{2} \\ -1 & 0 & 1 \end{bmatrix}, \qquad H_r = \frac{1}{2+\sqrt{2}}\begin{bmatrix} -1 & -\sqrt{2} & -1 \\ 0 & 0 & 0 \\ 1 & \sqrt{2} & 1 \end{bmatrix} \tag{16}$$
As seen, except for the Roberts operator, the proposed operators are all odd and aligned with
the image column and row directions. The Roberts operator is calculated using a set of axes
rotated 45 degrees in relation to the usual orientation of the columns and rows. To use these
operators, we compute an inner product between the respective mask and the image, as follows:
$$g'(c,r) = \sum_i \sum_j g(c+i,\, r+j)\,(H)_{ij} \tag{17}$$
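As an illustrative sketch (ours, not from the original text), the following Python function applies the Prewitt masks of (14) through the inner product of (17) and combines the two directional derivatives into a gradient magnitude and orientation; the function and variable names are our own.

```python
import numpy as np
from scipy.ndimage import correlate

# Prewitt masks from (14), normalized by 1/3.
H_C = np.array([[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]]) / 3.0  # column-direction derivative
H_R = np.array([[-1, -1, -1], [0, 0, 0], [1, 1, 1]]) / 3.0  # row-direction derivative

def prewitt_gradient(image):
    """Apply (17) with both masks and combine the directional derivatives."""
    g = image.astype(float)
    g_c = correlate(g, H_C, mode="nearest")  # inner product of mask and neighbourhood
    g_r = correlate(g, H_R, mode="nearest")
    magnitude = np.hypot(g_c, g_r)           # gradient magnitude
    orientation = np.arctan2(g_r, g_c)       # gradient direction in radians
    return magnitude, orientation
```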
All of the above-mentioned approximations have the final objective of calculating the
gradient using (3) and (4). Although computing two directional derivatives is enough to
calculate the gradient, some researchers have used more than two directional derivatives for
noise-suppression reasons [3]. In this case, the gradient would be
(18)
The arrows show the directions of the derivatives approximated by the masks. As can easily be
seen, these masks are generated by rotating the elements around the central element in steps
of 45 degrees. Other sets of directional masks can be obtained by applying similar rotations
to the orthogonal masks of Prewitt and Sobel. The angular resolution allowed by a 3x3
operator is, at most, 45 degrees, which means that we are only able to distinguish four
different directions. For larger angular resolutions we have to use masks with a larger
spatial support.
The second order differences are the simplest approximation to the second order derivative.
We define the second differences along the main axes as
$$g_{cc}(c,r) = g_c(c+1,\,r) - g_c(c,r), \qquad g_{rr}(c,r) = g_r(c,\,r+1) - g_r(c,r) \tag{19}$$
by substituting we obtain:
$$g_{cc}(c,r) = g(c+1,\,r) - 2g(c,r) + g(c-1,\,r), \qquad g_{rr}(c,r) = g(c,\,r+1) - 2g(c,r) + g(c,\,r-1) \tag{20}$$
which can be represented by the following masks:
$$H_{cc} = \begin{bmatrix} 0 & 0 & 0 \\ 1 & -2 & 1 \\ 0 & 0 & 0 \end{bmatrix}, \qquad H_{rr} = \begin{bmatrix} 0 & 1 & 0 \\ 0 & -2 & 0 \\ 0 & 1 & 0 \end{bmatrix} \tag{21}$$
Using the definition of Laplacian and (20) we obtain a discrete approximation to the
Laplacian given by
$$H_{cc+rr} = H_{cc} + H_{rr} = \begin{bmatrix} 0 & 1 & 0 \\ 1 & -4 & 1 \\ 0 & 1 & 0 \end{bmatrix} \tag{22}$$
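A short check, in the same Python setting as above, that the sum of the two second-difference masks (21) reproduces the discrete Laplacian mask (22), together with a minimal application of it:

```python
import numpy as np
from scipy.ndimage import correlate

H_CC = np.array([[0, 0, 0], [1, -2, 1], [0, 0, 0]])  # second difference along columns, (21)
H_RR = np.array([[0, 1, 0], [0, -2, 0], [0, 1, 0]])  # second difference along rows, (21)
H_LAP = H_CC + H_RR                                  # discrete Laplacian mask, (22)
assert (H_LAP == np.array([[0, 1, 0], [1, -4, 1], [0, 1, 0]])).all()

def laplacian(image):
    """Estimate the Laplacian (7) of an image with the mask in (22)."""
    return correlate(image.astype(float), H_LAP, mode="nearest")
```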
3.3 Convolution
In mathematics, convolution is an operation on two functions that produces a third function,
typically viewed as a modified version of one of the original functions. The discrete
convolution of 2D functions f and g is given by
$$(f * g)(r,c) = \sum_i \sum_j f(i,j)\, g(r-i,\, c-j) \tag{23}$$
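For concreteness, here is a direct (unoptimized) evaluation of the double sum in (23); the function name is ours, and in practice a library routine such as scipy.signal.convolve2d would be used instead.

```python
import numpy as np

def convolve2d_full(f, g):
    """Direct evaluation of (23): (f*g)(r,c) = sum_ij f(i,j) g(r-i, c-j).
    'f' is the kernel and 'g' the image; the output is the 'full' convolution,
    matching scipy.signal.convolve2d(f, g) with its default mode."""
    fh, fw = f.shape
    gh, gw = g.shape
    out = np.zeros((gh + fh - 1, gw + fw - 1))
    for i in range(fh):
        for j in range(fw):
            # Each kernel entry f[i, j] contributes a shifted, scaled copy of g.
            out[i:i + gh, j:j + gw] += f[i, j] * g
    return out
```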
In image processing, convolution is a general-purpose filter that can produce a range of
effects by specifying a suitable convolution kernel. It works by determining the new value of
a pixel by adding together weighted values of all its neighbouring pixels. The applied weights
are determined by a 2D array called the convolution kernel or mask. Comparison of (17) and (23)
shows that applying a mask to an image is essentially a convolution, up to a reflection of the
kernel.
$$\frac{d}{dx}(f * g) = \frac{df}{dx} * g, \qquad \frac{d^2}{dx^2}(f * g) = \frac{d^2 f}{dx^2} * g \tag{24}$$
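A quick 1D numerical check of the first identity in (24), under our own choice of kernel and test signal; agreement holds away from the array borders, where discretization effects dominate.

```python
import numpy as np

x = np.linspace(-5, 5, 1001)
dx = x[1] - x[0]
f = np.exp(-x**2 / 2)   # Gaussian kernel (unnormalized)
g = np.tanh(x)          # a smooth test signal

lhs = np.gradient(np.convolve(f, g, mode="same"), dx)  # d/dx (f * g)
rhs = np.convolve(np.gradient(f, dx), g, mode="same")  # (df/dx) * g
assert np.allclose(lhs[100:-100], rhs[100:-100], atol=1e-2)  # interior points agree
```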
Moreover, we can use the second derivative of the kernel to localize the step. As depicted in
Figure 6, a zero-crossing in the modified function denotes the step; here, the second
derivative of the Gaussian function is employed as the convolution kernel. This example
reveals that both the differentiation and the modification of an image (the latter hereafter
called filtering, and discussed in detail in the next sub-section) are realized with
convolution.
Figure 4 - The first derivative of a noise-affected function is not enough to localize a step
within it.
Figure 6 - A zero-crossing in the function modified by convolution with the second derivative
of the kernel marks the step.
Convolution allows us to study the effect of modifications on an image in both the spatial and
frequency domains. The use of a Fourier transform to convert images from the spatial to the
frequency domain makes possible another class of filtering operations. Considering
$\mathcal{F}$ as the Fourier transform operator, the formulation (25) helps us study the effect
of applied filters on an image in the frequency domain. This property of convolution is very
helpful in image smoothing, which is discussed in the next sub-section.
$$\mathcal{F}(f * g) = \mathcal{F}(f) \cdot \mathcal{F}(g) \tag{25}$$
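The identity (25) can be verified numerically. The sketch below (our own construction) checks, for two random 8x8 arrays, that the inverse FFT of the product of spectra equals the circular convolution computed directly from (23) with periodic indices.

```python
import numpy as np

rng = np.random.default_rng(0)
f = rng.standard_normal((8, 8))
g = rng.standard_normal((8, 8))

# Circular (periodic) convolution computed in the frequency domain, per (25).
conv_freq = np.real(np.fft.ifft2(np.fft.fft2(f) * np.fft.fft2(g)))

# The same convolution from the double sum in (23), with periodic wrapping.
conv_direct = np.zeros((8, 8))
for i in range(8):
    for j in range(8):
        conv_direct += f[i, j] * np.roll(np.roll(g, i, axis=0), j, axis=1)

assert np.allclose(conv_freq, conv_direct)
```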
In practice, pixels located at the border of the image can raise problems, since the
convolution kernel extends beyond the borders. A common technique to cope with this problem,
usually referred to as zero boundary superposition, is simply to ignore the problematic pixels
and to perform the convolution operation only on those pixels that are located at a sufficient
distance from the borders. This method has the disadvantage of producing an output image that
is smaller than the input image. Another option is to do the best possible job with the pixels
near the boundary. For example, pixels in a corner only have about a quarter of the neighbours
available for the convolution compared with pixels where the full filter can be used. The sum
over those neighbouring pixels should therefore be normalized by the actual number of pixels
used in the sum, as in (26). This normalization avoids overflow in the output pixels.
$$\hat{g}(c,r) = \frac{g'(c,r)}{n(c,r)} \tag{26}$$

where n(c,r) denotes the number of pixels actually used in the sum at (c,r).
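A sketch of this border normalization for a simple box filter, assuming out-of-image pixels contribute zero to the sum; the function name and the box-filter choice are ours.

```python
import numpy as np
from scipy.ndimage import correlate

def normalized_box_smooth(image, size=3):
    """Box smoothing with the normalization of (26): near the borders, the sum
    over the available neighbours is divided by the number of pixels used."""
    g = image.astype(float)
    kernel = np.ones((size, size))
    total = correlate(g, kernel, mode="constant", cval=0.0)   # out-of-image pixels count as 0
    count = correlate(np.ones_like(g), kernel, mode="constant", cval=0.0)  # n(c, r)
    return total / count
```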
The Sobel operator is the best known among the classical methods. The Sobel edge detector
applies a 2D spatial gradient convolution operation to an image. It uses the convolution masks
shown in (27) to compute the gradient in two directions (i.e. row and column orientations),
then works out each pixel's gradient as g = |g_r| + |g_c|. Finally, the gradient magnitude is
thresholded. The Sobel edge detector is a simple and effective approach, but it is sensitive
to noise. Moreover, the detected edges are thick, which may not be suitable for applications
in which the detection of the outermost contour of an object is required.
$$H_c = \frac{1}{4}\begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix}, \qquad H_r = \frac{1}{4}\begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1 \end{bmatrix} \tag{27}$$
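A minimal sketch of the whole Sobel pipeline described above: convolution with the masks of (27), the combination g = |g_r| + |g_c|, then a fixed threshold. Threshold selection is left to the caller.

```python
import numpy as np
from scipy.ndimage import correlate

H_C = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]) / 4.0  # (27), column direction
H_R = np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]]) / 4.0  # (27), row direction

def sobel_edges(image, threshold):
    """Sobel detector: filter with both masks, combine as |g_r| + |g_c|,
    and threshold the resulting gradient magnitude."""
    g = image.astype(float)
    g_c = correlate(g, H_C, mode="nearest")
    g_r = correlate(g, H_R, mode="nearest")
    magnitude = np.abs(g_r) + np.abs(g_c)
    return magnitude > threshold  # boolean edge map
```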
Although the classical methods did not benefit from an independent smoothing module, they
attempted to ease this drawback by averaging over the image, because a derivative estimated
from a relatively large set of neighbouring pixels is more robust to noise than one estimated
from only two pixels. Following this idea, several operators have been proposed, some of
which are extended versions of the 3x3 detectors mentioned earlier. For example, a mask
adopted by the 7x7 Prewitt-based operator to estimate the horizontal (column orientation)
first-order derivative of the image is presented in (28).
(28)
Right from the early stages of edge detection, it was recognized that operators of several
dimensions can be used. Rosenfeld et al. [8]-[10] proposed an algorithm to detect edges,
commonly known as difference of boxes, that relies on the use of pairs of neighbourhoods
(one neighbourhood on each side of the point under analysis) of several dimensions and
orientations. For convenience, they suggested that the neighbourhoods should have a square
shape. Marr and Hildreth later proposed smoothing the image with the 2D Gaussian filter

$$G_\sigma(x,y) = \frac{1}{2\pi\sigma^2}\exp\!\left(-\frac{x^2+y^2}{2\sigma^2}\right) \tag{29}$$
where σ is the standard deviation and (x, y) are the Cartesian coordinates of the image
pixels. They showed that by applying Gaussian filters of different scales (i.e. different σ)
to an image, a set of images with different levels of smoothing is obtained:
$$g(x, y, \sigma) = G_\sigma(x,y) * g(x,y) \tag{30}$$
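In code, the scale set of (30) is simply a stack of Gaussian-filtered copies of the image, one per σ; a short sketch with an arbitrary choice of scales of our own:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def scale_space(image, sigmas=(1.0, 2.0, 4.0, 8.0)):
    """Stack of smoothed images per (30): one Gaussian-filtered copy per scale."""
    g = image.astype(float)
    return {s: gaussian_filter(g, sigma=s) for s in sigmas}
```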
Then, to detect the edges in these images, they proposed to find the zero-crossings of their
second derivatives. Marr and Hildreth achieved this by using the Laplacian of Gaussian (LOG)
function as the filter. Since the Laplacian is a scalar estimation of the second derivative,
LOG is an orientation-independent filter (i.e. it carries no information about orientation)
that breaks down at corners, at curves, and at locations where the image intensity function
varies in a nonlinear manner along an edge; as a result, it cannot detect edges at such
positions. According to (24), the smoothing and differentiation operations can be implemented
by a single operator consisting of the convolution of the image with the Laplacian of the
Gaussian function. The final form of the filter, known as LOG with scale σ, which should be
convolved with the image, is as follows:
$$f(x,y) = \nabla^2 G_\sigma = -\frac{1}{\pi\sigma^4}\left(1 - \frac{x^2+y^2}{2\sigma^2}\right)\exp\!\left(-\frac{x^2+y^2}{2\sigma^2}\right) \tag{31}$$
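A sketch of the resulting Marr-Hildreth-style detector: filter with the LOG of (31) (here via scipy.ndimage.gaussian_laplace) and mark sign changes of the response as zero-crossings. The particular zero-crossing test below is a common simple choice of ours, not one prescribed by the text.

```python
import numpy as np
from scipy.ndimage import gaussian_laplace

def log_zero_crossings(image, sigma=2.0):
    """Marr-Hildreth-style edges: LOG filtering (31) followed by a
    sign-change test between adjacent pixels."""
    response = gaussian_laplace(image.astype(float), sigma=sigma)
    edges = np.zeros(response.shape, dtype=bool)
    # A zero-crossing occurs where horizontally or vertically adjacent
    # responses have opposite signs.
    edges[:, :-1] |= (response[:, :-1] * response[:, 1:]) < 0
    edges[:-1, :] |= (response[:-1, :] * response[1:, :]) < 0
    return edges
```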
There are advantages of the Gaussian filter that make it unique and so important in edge
detection. The first concerns its output: it is proven that when an image is smoothed by a
Gaussian filter, existing zero-crossings (i.e. detected edges) may disappear as one moves from
fine to coarse scale, but new ones are never created [26]. This unique property makes it
possible to track zero-crossings (i.e. edges) over a range of scales, and also gives the
ability to recover them at sufficiently small scales. Yuille and Poggio [26] proved that, with
the Laplacian, the Gaussian function is the only filter in a wide category that does not
create zero-crossings as the scale increases. They also showed that for second derivatives
there is no filter that does not create zero-crossings as the scale increases. This implies
the importance of the combination made by the Laplacian and the Gaussian. Another issue
concerns the filter's conflicting goals of localization in the spatial and frequency domains:
the optimal smoothing filter should be smooth in the frequency domain while remaining well
localized in the spatial domain, and the derivative of the Gaussian, given in (32), offers a
good compromise between these conflicting goals.

$$f(x) = k\,\frac{dG(x)}{dx} = -k\,\frac{x}{\sigma^2}\exp\!\left(-\frac{x^2}{2\sigma^2}\right) \tag{32}$$
In 2D, Canny assumed the image was affected by white noise and proposed the use of two
filters representing derivatives along the horizontal and vertical directions. In other words,
edge detection is performed by calculating the derivative along two directions of the image
filtered by a Gaussian function. The separability of the 2D Gaussian function allows us to
decompose it into two 1D filters:

$$f(x,y) = f(x)\,G(y), \qquad f(y,x) = f(y)\,G(x) \tag{33}$$

where G(.) denotes the 1D Gaussian function, f(.) its derivative, and f(.,.) the 2D optimal
filter. The filter (33) shows that the filtering can be applied first to columns (rows) and
then to rows (columns), reducing the computational burden. The optimal filter has an
orientation perpendicular to the direction of the detected edge. The method proposed by Canny
can be used for developing filters dedicated to a specific and arbitrary edge profile. For
step edges, Canny's optimal filter is similar to the LOG operator, because the maxima in the
output of a first-derivative operator correspond to the zero-crossings in the Laplacian
operator used by Marr and Hildreth.
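A sketch of the separable filtering implied by (33): each directional derivative is obtained by smoothing along one axis with the 1D Gaussian and differentiating along the other with its derivative (32). The kernel radius of 4σ is our own truncation choice.

```python
import numpy as np
from scipy.ndimage import correlate1d

def dog_gradient(image, sigma=1.5):
    """Gradient estimation with separable derivative-of-Gaussian filters, per (33)."""
    radius = int(4 * sigma)
    x = np.arange(-radius, radius + 1, dtype=float)
    g1d = np.exp(-x**2 / (2 * sigma**2))
    g1d /= g1d.sum()               # 1D Gaussian G(.)
    dg1d = -x / sigma**2 * g1d     # its derivative f(.), cf. (32)
    img = image.astype(float)
    # Column derivative: differentiate along axis 1, smooth along axis 0 (and vice versa).
    g_c = correlate1d(correlate1d(img, dg1d, axis=1), g1d, axis=0)
    g_r = correlate1d(correlate1d(img, g1d, axis=1), dg1d, axis=0)
    return np.hypot(g_c, g_r), np.arctan2(g_r, g_c)
```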
Canny also proposed a scheme for combining the outputs from different scales. His strategy is
fine-to-coarse, and the method is called feature synthesis. It starts by marking all the edges
detected by the smallest operators. It then takes the edges marked by the small operator in a
specific direction and convolves them with a Gaussian normal to the edge direction of this
operator, so as to synthesize the large-operator outputs. It then compares the actual operator
outputs to the synthesized outputs: additional edges are marked if the large operator detects
a significantly greater number of edges than predicted by the synthesis. This process is then
repeated to mark the edges from the second smallest scale that were not marked by the first,
then the edges from the third scale that were not marked by either of the first two, and so
on. In this way, it is possible to include edges that occur at different scales even if they
do not spatially coincide [5].
Canny's edge detector looks for local maxima of the first derivative of the filtered image.
It uses adaptive thresholding with hysteresis to eliminate streaking of edge contours. Two
thresholds are involved, with the lower threshold being used for edge elements belonging to
edge segments that already have points above the higher threshold. The thresholds are set
according to the amount of noise in the image, which is determined by a noise-estimation
procedure.
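A compact sketch of the hysteresis step just described, keeping weak responses only when they are connected to a strong one; the use of scipy.ndimage.label for connectivity is our implementation choice, not Canny's original formulation.

```python
import numpy as np
from scipy import ndimage

def hysteresis(magnitude, low, high):
    """Double-threshold hysteresis: responses above `low` are kept only if their
    connected component contains a response above `high`."""
    weak = magnitude > low
    strong = magnitude > high
    labels, n = ndimage.label(weak)          # connected components of weak pixels
    keep = np.zeros(n + 1, dtype=bool)
    keep[np.unique(labels[strong])] = True   # components touched by a strong pixel
    keep[0] = False                          # label 0 is background
    return keep[labels]                      # boolean edge map
```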
The problem with Canny's edge detection is that the algorithm marks a point as an edge if its
amplitude is larger than that of its neighbours, without checking that the differences between
this point and its neighbours are higher than what is expected from random noise. This makes
the algorithm slightly more sensitive to weak edges, but it also makes it more susceptible to
spurious and unstable boundaries wherever there is an insignificant change in intensity (e.g.,
on smoothly shaded objects and on blurred boundaries) [5].
Many contributions in the last two decades present edge detectors that use either the
Gaussian function directly or filters highly similar to the Gaussian function and its
derivatives. This leads us to believe that the optimum linear filter for the detection of step
edges should not differ much from the derivative of the Gaussian and, therefore, that the
smoothing filter should be based on the Gaussian function. This is not surprising, since the
Gaussian has emerged as a very important function in several areas of image analysis and
processing and, especially, in multi-resolution analysis. Our goal is not to give an inventory
of algorithms, but merely to review significant works that attempt to achieve high-performance
edge detectors using multi-resolution analysis.
$$\sigma^2(x) = \frac{k\,\sigma_n^2}{\sigma_f^2(x) - \sigma_n^2} \tag{35}$$
where k is a scaling factor, σ_n² is the noise variance, and σ_f² is the local variance of the
signal. The major drawback of this algorithm is that it assumes the noise is Gaussian with
known variance; in practical situations, the noise variance has to be estimated. The algorithm
is also very computationally intensive [5].
In [36], Bennamoun et al. present a hybrid detector that divides the tasks of edge
localization and noise suppression between two sub-detectors. This detector combines the
outputs of the Gradient of Gaussian and Laplacian of Gaussian detectors, and performs better
than both the first-order and second-order detectors alone in terms of localization and noise
removal. The authors extended the work to automatically determine the optimal scale and
threshold of the hybrid detector. They do this by deriving a cost function that maximizes the
probability of detecting an edge in the signal while simultaneously minimizing the probability
of detecting an edge in noise [5].
$$g(r,c) = \sum_k \alpha_k \exp\!\left\{-\frac{(r - r_k)^2 + (c - c_k)^2}{2\sigma^2}\right\} + b \tag{36}$$
where the coefficients α_k and b are the solution of a quadratic programming (QP) problem.
They applied the estimated derivatives to both the gradient and zero-crossing methods to
locate the edge positions, and reported performance close to the Canny method with faster
computation.
Bhandarkar et al. [21] presented a genetic algorithm (GA) based optimization technique for
edge detection. The problem of edge detection was formulated as choosing a minimum-cost edge
configuration. The edge configurations were represented as 2D genomes with fitness values
inversely proportional to their costs, and the two basic GA operators (i.e. crossover and
mutation) were described in the context of the 2D genomes. The mutation operator, which
exploits knowledge of the local edge structure, was shown to result in rapid convergence. The
incorporation of meta-level operators and strategies such as the
$$g_h = |i - m| + |j - k| + |j - l|, \qquad g_v = |i - k| + |j - n| + |l - o| \tag{37}$$
The detection of the magnitude and orientation of an edge across the causal template is based
on the difference between g_h and g_v. Finally, the gradient-adjusted prediction procedure
that produces the predictive values is shown below. The experimental results indicate that
both the visual and objective performance evaluations of the detected image in the proposed
approach are superior to the edge detection of Sobel and Canny.
Kang and Wang [11] developed an edge detection algorithm based on maximizing an objective
function. The values of the objective function corresponding to four directions determine the
edge intensity and edge direction of each pixel in the mask. They normalized the image
intensity function into certain levels and used a 3x3 mask with two sets of pixels in four
directions, as shown in Figure 9, to define an objective function. After all pixels in the
image have been processed, the edge map and direction map are generated. They then applied
the non-maxima suppression method to the edge map and the direction map to extract the edge
points (a generic sketch of this step is given below). The proposed method can detect edges
successfully while avoiding double edges, thick edges, and speckles.
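For reference, here is a generic non-maxima suppression sketch in the spirit of this step (our own simplified version, not Kang and Wang's exact scheme): the gradient direction is quantized into four directions and a pixel survives only if it is a local maximum along its direction.

```python
import numpy as np

def non_maxima_suppression(magnitude, direction):
    """Keep a pixel only if its magnitude is a local maximum along its
    (quantized) gradient direction; `direction` is in radians."""
    rows, cols = magnitude.shape
    out = np.zeros_like(magnitude)
    # Map angle to one of four directions: 0, 45, 90, 135 degrees.
    angle = (np.rad2deg(direction) + 180.0) % 180.0
    bins = (np.round(angle / 45.0).astype(int) % 4) * 45
    offsets = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            dr, dc = offsets[bins[r, c]]
            if (magnitude[r, c] >= magnitude[r + dr, c + dc] and
                    magnitude[r, c] >= magnitude[r - dr, c - dc]):
                out[r, c] = magnitude[r, c]
    return out
```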
Liang and Looney [12] introduced a competitive fuzzy edge detection method. They adopted
extended ellipsoidal Epanechnikov functions as fuzzy set membership functions, and a fuzzy
classifier that differentiates image pixels into six classes: background (no edge), speckle
(noisy) edge, and four types of edges (in four directions), as shown in Figure 9.
Chang [22] employed a special design of neural network for edge detection. He introduced a
method called the Contextual Hopfield Neural Network (CHNN) for finding the edges of medical
CT and MRI images. Different from conventional 2D Hopfield neural networks, the CHNN maps the
2D Hopfield network onto the original image plane. With this direct mapping, the network is
capable of incorporating the pixels' contextual information into the pixel-labelling
procedure. As a result, the effect of tiny details or noise is effectively removed by the
CHNN, and the drawback of disconnected fractions can be overcome. Furthermore, the problem of
satisfying strong constraints is alleviated, resulting in fast convergence. The reported
experimental results show that the CHNN can obtain more appropriate and more continuous edge
points than Laplacian-based, Marr-Hildreth's, Canny's, and wavelet-based methods.
Chao and Dhawan [23] presented an edge detection algorithm using a Hopfield neural network.
This algorithm brings up a concept different from conventional differentiation operators such
as Sobel and the Laplacian. In this algorithm, an image is considered a dynamic system that is
completely described by an energy function: the image is represented by a set of
interconnected neurons, where every pixel is represented by a neuron connected to all other
neurons but not to itself. The weight of the connection between two neurons is a function of
the contrast of grey-level values and the