Proceedings of the Swedish Symposium on Image Analysis, SSAB 1998

Review
Active Shape Models - Part II: Image Search and Classification
Rafeef Abu-Gharbieh, Ghassan Hamarneh and Tomas Gustavsson
Department of Signals and Systems, Imaging and Image Analysis Group
Chalmers University of Technology, Göteborg, Sweden.

Abstract

In this paper we review and investigate a method capable of modeling the different appearances of objects in images that are due to natural shape variations, varying lighting conditions, 3D pose and other factors. Objects are represented by well-defined landmark points, and shape variations are modeled using principal component analysis. Gray level variations are also modeled. The first part of the paper describes the shape and gray scale modeling in some detail. The second part describes an iterative algorithm which deforms an initial model to fit data in ways that are consistent with shape variations found in previously acquired training data. An application to image classification is outlined.

1. Introduction

Given a training set of accurately aligned shapes of an object class, we can model these shapes by $x = \bar{x} + Pb$, where $\bar{x}$ is the mean shape, $P$ is the matrix of the first $t$ principal components, and $b$ is a vector of weights. We refer to this model as a Point Distribution Model (PDM). The PDM can be used for generating new shapes by modifying $b$ within certain limits so as to get shapes which are similar to those in the training set. We assume that the first $t$ principal components cover most of the shape variations found in the training set.

In this paper we deal with the problem of using the PDM for finding an instance of the object class in an image not previously included in the training set. We will show how an initial shape, or model template, can be forced to deform in iterative steps until a best match between data and model is found. We will also investigate the use of ASM in image classification.

2. Using ASM for Image Search

The combined technique of PDM and iterative template deformation, or updated model-to-data matching, is called Active Shape Modeling (ASM) and was originally described by Cootes et al. [1].

2.1. The Initial Shape Estimate

We assume that an instance of an object is described as the sum of the mean shape obtained from the training set and a weighted sum of the principal components, with the possibility of this sum being translated, rotated, and scaled. That is, we can express the initial estimate $x_i$ of a shape as a scaled, rotated and translated version of a reference shape $x_l$:

$$x_i = M(s_i, \theta_i)[x_l] + t_i$$

where

$$M(s, \theta) = s \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$$

and

$$t_i = [\, t_{xi} \;\; t_{yi} \;\; t_{xi} \;\; t_{yi} \;\; \dots \;\; t_{xi} \;\; t_{yi} \,]^T .$$

The matrix $M(s, \theta)$ scales the shape by $s$ and rotates it by $\theta$, while the vector $t_i$ translates it in the x and y directions.

The shape $x_l$ can be expressed as $x_l = \bar{x} + dx_l$, where $dx_l = P b_l$. Then, the initial estimate can be written as $x_i = M(s_i, \theta_i)[\bar{x} + dx_l] + t_i$.

2.2. Updating the Shape Estimate

Given the initial estimate, we try to fit it to the image data. By examining the image region surrounding each landmark point of $x_i$, a new desired location $x_i + dx_i$ is obtained.

2.2.1. Finding the Pose Modifications

We need to adjust the pose (scaling, rotation and translation) parameters, as well as the shape parameters (the weights of the principal components), in order to move our current estimate $x_i$ as close as possible to $x_i + dx_i$, while still satisfying the shape constraints imposed to produce an acceptable or allowable shape. To do that, we first find the additional scaling $1 + ds$, rotation $d\theta$ and translation $(dt_x, dt_y)$ required to move $x_i$ as close as possible to $x_i + dx_i$. Put symbolically:

$$x_i = M(s_i, \theta_i)[x_l] + t_i \;\xrightarrow{\;(1+ds),\; d\theta,\; dt\;}\; x_i + dx_i$$

or

$$M(s_i(1+ds), \theta_i + d\theta)[x_l] + t_i + dt \;\rightarrow\; x_i + dx_i$$

We note that there are remaining adjustments which can only be satisfied by deforming the shape $x_l$.

2.2.2. Finding the Shape Modifications

Knowing $1 + ds$, $d\theta$ and $dt$, we need to solve the following equation for $dx$:

$$M(s_i(1+ds), \theta_i + d\theta)[x_l + dx] + t_i + dt = x_i + dx_i$$

or

$$M(s_i(1+ds), \theta_i + d\theta)[x_l + dx] = x_i + dx_i - (t_i + dt)$$
using

$$x_i = M(s_i, \theta_i)[x_l] + t_i$$

we get:

$$M(s_i(1+ds), \theta_i + d\theta)[x_l + dx] = M(s_i, \theta_i)[x_l] + dx_i - dt$$

Since

$$M^{-1}(s, \theta)[\ldots] = M(s^{-1}, -\theta)[\ldots]$$

we have

$$x_l + dx = M\big((s_i(1+ds))^{-1}, -(\theta_i + d\theta)\big)\big[\, M(s_i, \theta_i)[x_l] + dx_i - dt \,\big]$$

which gives

$$dx = M\big((s_i(1+ds))^{-1}, -(\theta_i + d\theta)\big)\big[\, M(s_i, \theta_i)[x_l] + dx_i - dt \,\big] - x_l$$

In general, the resulting vector $dx$ is in 2n-D space, but since there are only $t$ (fewer than 2n) modes of variation described in our model, we can only move the shape in the $t$ dimensions described by the first $t$ principal axes. So we seek the vector that is most similar to $dx$ but lies in the t-D space. If we adopt the least-squares approach, then the solution $dx'$ is the projection of $dx$ onto the t-D space (the space spanned by the vectors of principal components, or the $t$ columns of $P$). Put in equations [2] we have:

$$dx' = A\,dx, \quad \text{where } A = P(P^T P)^{-1} P^T .$$

Since the columns of $P$ are orthonormal and $P$ is no longer square, we have $P^T P = I$ and thus $dx' = P P^T dx$.

So, instead of $x_l$ moving to $x_l + dx$ it will move to $x_l + dx'$. Expressing $dx'$ as $dx' = P\,db'$ and multiplying by $P^T$ from the left, we get $db' = P^T dx'$.

2.2.3. Updating the Pose and Shape Parameters

We are now ready to update the shape and pose parameters of our initial estimate. We obtain a new estimate $x_i^{(1)}$ where

$$x_i^{(1)} = M(s_i(1+ds), \theta_i + d\theta)[x_l + P\,db'] + t_i + dt$$

We then start with $x_i^{(1)}$ in the same way we started with $x_i$ and produce $x_i^{(2)}$, and so on, until no significant change in the shape is noticeable.

A weighted version of the required parameter modifications is used to iteratively update the pose and shape parameters as follows:

$$t_{xi} \rightarrow t_{xi} + w_t\, dt_x$$
$$t_{yi} \rightarrow t_{yi} + w_t\, dt_y$$
$$s_i \rightarrow s_i (1 + w_s\, ds)$$
$$\theta_i \rightarrow \theta_i + w_\theta\, d\theta$$
$$b \rightarrow b + W_b\, db'$$

where $w_t$, $w_s$ and $w_\theta$ are scalars and $W_b$ is a diagonal matrix of weights. To allow faster updates in modes responsible for larger shape variations, the elements of $W_b$ are chosen proportional to the standard deviation of the corresponding shape parameter over the training set. It is important to ensure that the resulting (updated) shape is within the allowable shape domain, which is done by limiting the values of $b$.

2.3. Using Gray Scale Statistics to Find the Desired Movements

Here we describe how the modeling of gray level statistics around each landmark can be used to determine the adjustment of each landmark ($dx_i$) so that we get a better model-to-data fit.

To find such adjustments, we search along a line passing through the landmark and perpendicular to the boundary formed by the landmark and its neighbours. In this way, we obtain a search profile. Within this search profile, we look for a sub-profile with characteristics that match the ones obtained from training. In order to do so, we collect the gray level values along the search profile, compute the derivative and normalise. We then search within the normalised derivative search profile (having length $n_s$) for a sub-profile that matches the mean normalised derivative profile (of length $n_p$) obtained from the training set.

The search profile $s_i$ along the landmark $i$ is given by

$$s_i = [\, s_{i0} \;\; s_{i1} \;\; \dots \;\; s_{i(n_s-1)} \,]^T .$$

The derivative search profile of landmark $i$ will be of length $n_s - 1$ as follows:

$$ds_i = [\, s_{i1} - s_{i0} \;\;\; s_{i2} - s_{i1} \;\;\; \dots \;\;\; s_{i(n_s-1)} - s_{i(n_s-2)} \,]^T$$

The normalised derivative search profile becomes

$$y_{si} = \frac{ds_i}{\sum_{k=0}^{n_s-2} ds_{ik}}$$

We now examine $y_{si}$ for sub-profiles that match $\bar{y}_i$ (the mean normalised derivative profile obtained from the training set of images). Denoting the sub-interval of $y_{si}$ centred at the $d$th pixel of $y_{si}$ by $h(d)$, we then find the value of $d$ that makes the sub-interval $h(d)$ most similar to $\bar{y}_i$. This can be done by defining the following square error function (which decreases as the fit becomes better) and minimising it with respect to $d$:

$$f(d) = (h(d) - \bar{y}_i)^T\, C_{y_i}^{-1}\, (h(d) - \bar{y}_i)$$

where $C_{y_i}^{-1}$ is the inverse of the covariance matrix of $y_i$. In this way the location of the point to which the landmark $i$ should move is determined. The same procedure is repeated for all the landmark points to obtain the vector of suggested movements $dx_i$.
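
Two numerical steps of Section 2 are small enough to sketch directly: the projection of the shape change onto the principal axes (Section 2.2.2) and the search for the best-matching sub-profile (Section 2.3). The following is a minimal NumPy sketch, not the authors' implementation. The function names are hypothetical, the columns of P are assumed orthonormal, and the profile is normalised by the sum of absolute differences, which is an assumption made here so that the scale factor cannot cancel to zero.

```python
import numpy as np


def project_onto_modes(P, dx):
    """Project dx onto the space spanned by the columns of P.

    Assumes the columns of P are orthonormal, so P^T P = I and the
    least-squares solution reduces to db' = P^T dx, dx' = P db'.
    """
    db = P.T @ dx
    return P @ db, db


def normalised_derivative(profile):
    """Differentiate a gray level profile and scale it.

    Assumption: the scale is the sum of absolute differences.
    """
    d = np.diff(np.asarray(profile, dtype=float))
    scale = np.abs(d).sum()
    return d / scale if scale > 0 else d


def best_offset(search_profile, mean_profile, cov_inv):
    """Return (start index, cost) of the sub-profile minimising f.

    f = (h - y_mean)^T C^{-1} (h - y_mean), evaluated for every
    sub-profile h of length n_p inside the normalised derivative
    of the search profile.
    """
    y = normalised_derivative(search_profile)
    n_p = len(mean_profile)
    costs = []
    for start in range(len(y) - n_p + 1):
        r = y[start:start + n_p] - mean_profile
        costs.append(r @ cov_inv @ r)
    best = int(np.argmin(costs))
    return best, costs[best]
```

The returned index is the start of the best sub-profile; its centre, which corresponds to d in the text, is `start + n_p // 2`. Converting that index into a displacement of the landmark along the search line depends on the pixel spacing of the sampled profile, which is left out of this sketch.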

3. Multi-Resolution Image Search

An important issue that affects the image search considerably is choosing the length of the search profile $n_s$. In choosing the length we are faced with two contradicting requirements. On the one hand, the search profile should be long enough to contain within it the target point (the point to which the landmark is supposed to move). On the other hand, we require the search profile to be as short as possible in order to reduce the number of computations. Also, if the search profile is long and the target point is close to the current position of the landmark, then it will be more probable to move too far and miss the target. This problem can be solved by a multi-resolution approach. First, the search is extended to include far points. Then, as the search progresses closer to a target structure, the search is limited to near points.

In order to achieve such multi-resolution search, we generate a pyramid of images with different resolutions. At the base of the pyramid (Level 0) we have the original image and on higher levels (Level 1 to L-1) we step-wise decrease the resolution by a factor of two (see Figure 1).

Figure 1. Pyramid of Images (Level 4, Level 3, Level 2, Level 1 and Level 0).

In order to obtain the pyramid we first smooth the image at the lower level and then sub-sample every second pixel to obtain the image at the higher level [3]. We start by searching at the top level of the pyramid and then continue at a lower level using the search output of the previous level. The procedure is repeated until the lowest level of the pyramid (the original image) is reached. In order to carry out this multi-resolution search, we must use the information about the gray level profiles at each of these levels. This means that during the training stage, we need to obtain the mean normalised derivative profile for each landmark in all the pyramidal levels.

We denote the mean normalised derivative profile for the landmark $i$ at the pyramidal level $l$ by $\bar{y}_{il}$; $0 \le i \le n-1$ and $0 \le l \le L-1$. The mean is obtained by averaging the normalised profile for a certain landmark along the N images of the training set.

Before any search can be carried out, we need a criterion for determining when to change the level of search within the pyramid. One possibility is to move to a lower level when a certain percentage of the landmarks do not change considerably, for example when 95% of the landmarks move only within the central 50% of the search profile [3]. A maximum number of iterations can also be devised to avoid getting stuck at a higher level (see Figures 2 and 3).

Figure 2. The progress of multi resolution image search (shapes in higher levels are also shown on the original image). Panels: a) Search in level 4, b) Search in level 3, c) Search in level 2, d) Search in level 1, e) Search in level 0. Each panel shows "Progress in MR" and "Progress in HiR".

Figure 3. Examples of the desired changes in the shape $x_i + dx_i$ (panels labelled X+dX), before forcing it to be an allowable shape. Shown in level 4 (left), level 3 (middle), and level 0

4. Shape Recognition and Classification

Images of a certain object class present inter- and intra-class variations [4]. Inter-class variations are due to the fact that the objects actually belong to different classes, for example the variation in appearance of two faces belonging to two different individuals. Intra-class variations are those changes in appearance which are due to lighting conditions, 3D pose or facial expression [5].

In shape recognition and classification using ASM, we first collect a rich set of training images representing

many classes. Then, we study the shape and gray level characteristics, or features, of each class. When presented with a new image, we locate the shape, compute its features, and apply a discriminant analysis technique to classify it.

The shape features of a class are represented by the weights of the principal components (the $b$ vector in the PDM model) for that class. The gray level features can be obtained from two main categories. The first is the local gray level information represented by the normalised derivative profile perpendicular to the boundary at the landmark points. The second is the shape-free gray level model obtained by first deforming the shape to its mean shape in a way such that changes in gray levels are minimised. This can be done, for example, using thin-plate splines [6].

5. Discussion

Active Shape Models have a potential for applications to image search and classification. The flexibility of this type of shape and gray scale modeling lends itself to analysis in the biomedical area because images of natural objects belonging to a certain class present a rich variety of different appearances.

Acknowledgements

This study was supported by the Swedish Foundation for Strategic Research under the VISIT program.

References

[1] T. Cootes, C. Taylor, D. Cooper, J. Graham. Active Shape Models - Their Training and Application. Computer Vision and Image Understanding, Vol. 61, No. 1, January 1995, pp. 38-59.
[2] G. Strang. Linear Algebra and Its Applications. Saunders, 1988, p. 170.
[3] T. Cootes, C. Taylor, A. Lanitis. Active Shape Models: Evaluation of a Multi-Resolution Method for Improving Image Search. Proceedings of the British Machine Vision Conference, 1994, pp. 327-336.
[4] A. Lanitis, C. Taylor, T. Cootes. Automatic Interpretation and Coding of Face Images Using Flexible Models. IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 19, No. 7, July 1997, pp. 743-756.
[5] A. Lanitis, C. Taylor, T. Cootes. Recognising Human Faces Using Shape and Grey-Level Information. Third International Conference on Automation, Robotics and Computer Vision, 1994.
[6] F. Bookstein. Principal Warps: Thin-Plate Splines and the Decomposition of Deformations. IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 11, No. 6, June 1989, pp. 567-585.
