
Head Pose Determination from One Image Using a Generic Model

Ikuko Shimizu(1,3)    Zhengyou Zhang(2,3)    Shigeru Akamatsu(3)    Koichiro Deguchi(1)

(1) Faculty of Engineering, University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113, Japan
(2) INRIA, 2004 route des Lucioles, BP 93, F-06902 Sophia-Antipolis Cedex, France
(3) ATR HIP, 2-2 Hikaridai, Seika-cho, Soraku-gun, Kyoto 619-02, Japan
e-mail: [email protected]

Abstract

We present a new method for determining the pose of a human head from its 2D image. It does not use any artificial markers put on the face. The basic idea is to use a generic model of a human head, which accounts for variations in shape and facial expression. In particular, a set of 3D curves is used to model the contours of the eyes, lips, and eyebrows. A technique called Iterative Closest Curve matching (ICC) is proposed, which recovers the pose by iteratively minimizing the distances between the projected model curves and their closest image curves. Because curves contain richer information (such as curvature and length) than points, ICC is both more robust and more efficient than the well-known iterative closest point matching technique (ICP). Furthermore, the image can be taken by a camera with unknown internal parameters, which can be recovered by our technique thanks to the 3D model. Preliminary experiments show that the proposed technique is promising and that an accurate pose estimate can be obtained from just one image with a generic head model.

1. Introduction

This paper deals with techniques for estimating the pose of a human head from its 2D image taken by a camera. Such techniques are useful for the realization of new man-machine interfaces. We present a new method for the accurate estimation of head pose from only one 2D image using a 3D model of human heads. Thanks to a 3D model with characteristic curves, our method does not require any markers on the face and works with an arbitrary camera with unknown parameters.

Several methods have been proposed for head pose estimation which detect facial features and estimate the pose from the locations of these features using a 2D face model [1] or by template matching [3]. Jebara [7] tracked facial features in a sequence of images to generate a 3D model of the face and estimate its pose.

We use 3D models of human heads in order to estimate the pose from only one 2D image. There are some difficulties with such 3D models: head shapes differ from one person to another and, furthermore, facial expressions may vary even for a single person. It is unrealistic to have 3D head models for all persons and for all possible facial expressions. To deal with this problem effectively, we use a generic model of the human head, which is applicable to many persons and is able to accommodate the variety of facial expressions. Such a model is constructed from the results of intensive measurements on the heads of many people. With this 3D generic model, we suppose that an image of a head is the projection of this 3D generic model onto the image plane. The problem is then to estimate this transformation, which is composed of the rigid displacement of the head and a perspective projection.

Our strategy is to define edge curves on the 3D generic model in advance. For edge curves, we use the contours of the eyes, lips, eyebrows, and so on. They are caused by discontinuities of the reflectance and appear in the image independently of the head pose in 3D space. (We call these edges stable edges.) For each edge curve defined on the generic model, we search for its corresponding curves in the image. This is done by first extracting every edge from the image and then applying a relaxation method.

After the correspondences between the edge curves on the model and the edges in the image have been established, we estimate the head pose. For this purpose, we develop the ICC (Iterative Closest Curve) method, which minimizes the distance between the curves on the model and the corresponding curves in the image. The ICC method is similar to the ICP (Iterative Closest Point) method [5][8], which minimizes the distance from points of a 3D model to the corresponding measured points of the object. Because a curve contains much richer information than a point, curve correspondences can be established more robustly and with less ambiguity; therefore, pose estimation based on curve correspondence is expected to be more accurate than that based on point correspondence.

The ICC method is an iterative algorithm and needs a reasonable initial guess. To obtain one, prior to applying the ICC method, we roughly compute the pose of the head and the camera parameters by using the correspondence of conics fitted to the stable edges. This computation is carried out analytically. A more precise pose is then estimated by the ICC method. In this step, in addition to the stable edges, we use variable edges, which are pieces of the occluding contours of the head, e.g. the contour of the face.

Our method is currently applied to a face area extracted from a natural image, or to a face image with a unicolor background. Many techniques have been reported in the literature for extracting the face from a cluttered background.
2. Notation

The coordinates of a 3D point X = (X, Y, Z)^t in a world coordinate system and its image coordinates x = (u, v)^t are related by

    λ (x^t, 1)^t = P (X^t, 1)^t,  or simply  λ x̃ = P X̃,   (1)

where λ is an arbitrary scale factor, P is a 3 × 4 matrix called the perspective projection matrix, X̃ = (X, Y, Z, 1)^t, and x̃ = (u, v, 1)^t. The matrix P can be decomposed as

    P = A T.   (2)

The matrix A maps the coordinates of the 3D point to the image coordinates. The general matrix A can be written as

    A = [ α_u  0    u_o  0 ]
        [ 0    α_v  v_o  0 ]   (3)
        [ 0    0    1    0 ]

α_u and α_v are the products of the focal length with the horizontal and vertical scale factors, respectively. u_o and v_o are the coordinates of the principal point of the camera, i.e., the intersection between the optical axis and the image plane. For simplicity of computation, both u_o and v_o are assumed to be 0 in our case, because the principal point is usually at the center of the image.

The matrix T denotes the positional relationship between the world coordinate system and the image coordinate system. T can be written as

    T = [ R    t ]
        [ 0^t  1 ]   (4)

R is a 3 × 3 rotation matrix and t is a translation vector.

Note that there are eight parameters to be estimated: two camera parameters α_u and α_v, three rotation parameters, and three translation parameters.

We use C_k^I (k = 1, ..., K) to denote the k-th stable curve in the image, and C_l^W(P) (l = 1, ..., L) the l-th stable curve of the model projected by P. Both C_k^I and C_l^W(P) are 2D curves. C_o^I denotes the contour of the face in the image, and C_o^W(P) the contour of the face projected by P.

x_i^{Ik} is the 2D point belonging to the k-th curve C_k^I in the 2D image. X_j^{Wl} is the 3D point belonging to the l-th curve of the 3D model. x_j^{Wl}(P) is the 2D point belonging to the l-th curve C_l^W(P) projected by P. X_j^{Wl} and x_j^{Wl}(P) are related by

    x_j^{Wl}(P) = (a_1/a_3, a_2/a_3)^t  with  (a_1, a_2, a_3)^t = P X̃_j^{Wl}.   (5)

3. Generic Model of a Human Head

We use a generic model of the human head which is able to take into account shape differences between individuals and changes of facial expression. This section explains this generic model.

3.1. Construction of the Generic Model

We represent the deformation of the 3D shape of a human head (i.e., shape differences and changes of facial expression) by the mean X̄ and the variance V[X] of each point on the face. These quantities are calculated from the results of measuring the heads of many people. To do so, we need a method for sampling points consistently across all faces; that is, we need to know which point on one face corresponds to which point on another face. Many methods have been proposed for this purpose and any of them can be used; we use the resampling method [4] developed in our laboratory. This method uses several feature points (such as the corners of the eyes, the vertex of the nose, and so on) as reference points. Using these reference points, the shape of a face is segmented into several regions, and each region is then resampled. We choose the sample points using this method.

3.2. Edge Extraction in the Model

As mentioned earlier, we use two types of edges: stable edges and variable edges. The stable edges are extracted beforehand from the 2D image taken at the same time as the acquisition of the 3D data of a head. They are the contours of the eyes, lips, and eyebrows. We obtain their corresponding curves on the head by back-projecting them onto the 3D model. The variable edges, which are occluding contours and depend on the head pose and camera parameters, are extracted whenever these parameters change. Figure 1 shows an example of images of the generic model with stable and variable edges. It shows that the stable edges (i.e., the eyes and lips) do not change under a change of pose, while the variable edge (i.e., the contour of the face) changes whenever the pose changes.

Figure 1. A generic model of a head. In all poses, the stable edges such as the eyes and lips do not change. The variable edges change because they are occluding contours.

4. Definition of the Distance Between Curves

Here we define the distance between curves. It is the basis of the ICC method, which we will present in a later section, and it is also used for finding corresponding curves.

The squared distance between a 2D curve in the image and the projection of a curve of the 3D model is defined by

    d(C_k^I, C_l^W(P)) = (1/N_k^I) Σ_{x_i^{Ik} ∈ C_k^I}  min_{x_j^{Wl} ∈ C_l^W(P)}  d_m(x_i^{Ik}, x_j^{Wl}(P)),   (6)

where N_k^I is the number of points in C_k^I and d_m(x_i^{Ik}, x_j^{Wl}(P)) is the squared Mahalanobis distance defined below.
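The curve-to-curve distance of equation (6) can be sketched numerically as follows. For simplicity, this stand-in uses an identity metric per model point (plain Euclidean distance) in place of the full Mahalanobis matrix, and the curves are illustrative point lists, not real edge data:

```python
import numpy as np

def curve_distance(img_curve, model_curve_2d, M=None):
    """Eq. (6): average, over the image points, of the smallest squared
    Mahalanobis distance to the projected model curve.  M holds one 2x2
    metric per model point; the default identity reduces the measure to
    plain squared Euclidean distance (a simplification)."""
    if M is None:
        M = np.broadcast_to(np.eye(2), (len(model_curve_2d), 2, 2))
    total = 0.0
    for xi in img_curve:
        diff = model_curve_2d - xi                     # all x_j - x_i
        d2 = np.einsum('ni,nij,nj->n', diff, M, diff)  # (x_j-x_i)^t M (x_j-x_i)
        total += d2.min()                              # closest model point
    return total / len(img_curve)

img = np.array([[0.0, 0.0], [1.0, 0.0]])
model = np.array([[0.0, 1.0], [1.0, 1.0], [2.0, 1.0]])
print(curve_distance(img, model))  # each image point is 1 away -> 1.0
```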
The squared Mahalanobis distance is

    d_m(x_i^{Ik}, x_j^{Wl}(P)) = (x_i^{Ik} − x_j^{Wl}(P))^t M_{ij}^{kl} (x_i^{Ik} − x_j^{Wl}(P)),   (7)

with

    M_{ij}^{kl} = [ (∂x_j^{Wl}(P)/∂X_j^{Wl}) V[X_j^{Wl}] (∂x_j^{Wl}(P)/∂X_j^{Wl})^t ]^{−1}.   (8)

It is possible to give other definitions of the distance between curves. Our definition is based on the following assumptions:

- When the corresponding image edges are found for edges on the 3D model, the projected model curve contains the image curve.
- The generic model is sampled at a higher resolution than the image.
- The variance V[X] of each point can be different and anisotropic.

5. Finding Corresponding Curves by Relaxation

In this section, we explain the method for finding correspondences between 3D model curves and 2D image curves. This is done by matching 2D image curves C_k^I and model curves C_l^W(P_o) projected by P_o, where P_o is an arbitrary projection.

We assume that all of the eyes and lips are visible in the image. Therefore, the edges of a 2D image are expected to include the stable edges. However, they also include noisy edges caused by illumination, measurement error, and so on. Consequently, there are some correspondence ambiguities. We use relaxation techniques to resolve them.

First, we find candidates for corresponding curves using the similarity of curvature. Curvature is not preserved under projection; however, because we assume the pose estimate P_o is reasonable, the curvatures of corresponding curves should be similar. After finding candidates, we resolve ambiguities by the relaxation method.

5.1. Finding Candidates for Corresponding Curves

Both the image edges and the projected model edges are segmented into equi-curvature curves. Candidates for corresponding pairs are found by evaluating the similarity of curvature.

The similarity of curvature s(k, l) is defined as

    s(k, l) = 1.0 / (1.0 + |c(C_k^I) − c(C_l^W(P_o))|),   (9)

where c(C) is the curvature of curve C.

s(k, l) has the following properties: (i) when two curves have exactly the same curvature, s(k, l) equals 1, and (ii) as the difference of curvature between the two curves becomes larger, s(k, l) becomes smaller.

If the value of s(k, l) is higher than a threshold, the pair of curves (C_k^I, C_l^W(P_o)) is selected as a candidate pair.

5.2. Calculating the Strength of Match

If (C_k^I, C_l^W(P_o)) is a correct pair, then for many of the other model curves C_m^W there exist corresponding image curves C_n^I such that the position of C_n^I relative to C_k^I is similar to the position of C_m^W(P_o) relative to C_l^W(P_o). We define the strength of match SM for the pair (C_k^I, C_l^W(P_o)) in a way similar to the one used for point pairs in [10].

5.3. Updating Corresponding Pairs of Curves

The strategy we use for updating corresponding pairs is the "some-winners-take-all" strategy [10]. Consider the corresponding pairs having the highest strength of match for both the image and the model. These pairs are called potential matches and are denoted by {P_i}. For {P_i}, two tables, T_SM and T_UA, are constructed.

T_SM stores the matching strength SM of each P_i, sorted in decreasing order. T_UA stores the value of U_A, which describes unambiguity and is defined as

    U_A = 1 − SM(2)/SM(1),   (10)

where SM(1) is the SM of P_i and SM(2) is the SM of the second best candidate among the pairs which include the curves forming P_i. T_UA is also sorted in decreasing order.

Pairs are selected as "correct" matches if they are among the first q (> 50) percent of pairs in T_SM and the first q percent of pairs in T_UA. Using this method, the pairs which are well matched and unambiguous are selected.

6. Rough Estimation of the Head Pose

In this section, we explain the method for roughly estimating the head pose and camera parameters, which are used as the initial guess in the refinement process.

To roughly estimate the head pose and the camera parameters, co-planar conics are used. Because the eyes and mouth lie approximately on a single plane, the 3D stable edges of the model, such as the edges of the eyes and lips, are projected onto that plane.

We use the intersections and bi-tangent lines of the co-planar conics because they are preserved under projection [2]. At least one pair of co-planar conics is needed to determine all the parameters, but with a single pair two possibilities still remain in our case: the correct one and the upside-down one. Therefore, we use three pairs of conics: left eye and right eye, left eye and lips, right eye and lips.

6.1. Projection to the Face Plane

The edge points of the eyes and lips are almost on one plane, called the face plane.

Consider a coordinate system in which the face plane coincides with z = 0. We call it the plane coordinate system.

The 3D coordinates X of a point on the face plane in the world coordinate system and the coordinates (x_p, y_p, 0)^t of the same point in the plane coordinate system are related by

    X̃ = [ R_p  t_p ] (x_p, y_p, 0, 1)^t = T_p (x_p, y_p, 0, 1)^t,   (11)
        [ 0^t  1   ]

where T_p denotes the positional relationship between the world coordinate system and the plane coordinate system.
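The two scores driving the relaxation matching, the curvature similarity of equation (9) and the unambiguity measure of equation (10), are simple enough to state directly in code (a sketch, not the authors' implementation; the numbers are illustrative):

```python
def curvature_similarity(c_img, c_model):
    """Eq. (9): s = 1 / (1 + |c(C_I) - c(C_W)|).  Equals 1 for identical
    curvatures and decays toward 0 as they differ."""
    return 1.0 / (1.0 + abs(c_img - c_model))

def unambiguity(sm_best, sm_second):
    """Eq. (10): U_A = 1 - SM(2)/SM(1).  Close to 1 when the best match
    clearly beats the runner-up."""
    return 1.0 - sm_second / sm_best

print(curvature_similarity(0.5, 0.5))   # identical curvature -> 1.0
print(curvature_similarity(0.5, 1.5))   # -> 0.5
print(unambiguity(0.8, 0.2))            # -> 0.75
```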
From equations (1) and (11), we have

    x̃ = H (x_p, y_p, 1)^t,   (12)

where H is a 3 × 3 matrix given by

    H = [ α_u r_11  α_u r_12  α_u t_1 ]
        [ α_v r_21  α_v r_22  α_v t_2 ]   (13)
        [ r_31      r_32      t_3     ]

where r_ij is the (i, j)-th component of R' = R R_p and t_i is the i-th component of t' = R t_p + t.

6.2. Intersections and Bi-tangents of Co-planar Conics

A conic in a 2D space is the set of points x̃ that satisfy

    x̃^t Q x̃ = 0,   (14)

where Q is a 3 × 3 symmetric matrix. We fit a conic to the edge points of the right eye, the left eye, and the lips by the gradient-weighted least squares fitting described in [9].

An intersection m̃ of two conics Q_1 and Q_2 satisfies the simultaneous equations

    m̃^t Q_1 m̃ = 0  and  m̃^t Q_2 m̃ = 0.   (15)

Denoting a bi-tangent line of the two conics Q_1 and Q_2 by l̃^t x̃ = 0, l̃ satisfies the simultaneous equations [2]

    l̃^t Q_1^{−1} l̃ = 0  and  l̃^t Q_2^{−1} l̃ = 0.   (16)

m̃ and l̃ are obtained by solving quartic equations analytically.

6.3. Combinations of the Correspondences

There are no real intersections for these pairs of conics; the solutions of the quartic equation are two complex conjugate pairs. In the complex case, there are eight possibilities for corresponding the four points of the image to the four points of the model, because conjugate pairs project to conjugate pairs under a real projection [2]. On the other hand, because all of the bi-tangent lines are real in this case, there are only four possible correspondences. Therefore, there are 32 possible combinations for each pair of conics, and when we use three pairs of conics, the total number of possible combinations is 32^3 (= 32768).

We reduce the number of combinations. Because only two possibilities remain in our case for a single pair of conics (the correct one and the upside-down one), we select two combinations for each pair of conics. Then, using these combinations for the three pairs of conics, all possible values of H are calculated by the linear least squares described in appendix A. The number of possible values of H is thus much reduced, to 32 + 32 + 32 + 2^3 (= 104).

We select the best one among all possible values of H by evaluating each candidate; the evaluation method is described in appendix B. From equation (13), the unknown parameters are then obtained from the components of H (see appendix C). This is the initial guess for the refinement process.

7. Refinement of the Head Pose by the ICC (Iterative Closest Curve) Method

In this section, we explain the method for refining the head pose and camera parameters starting from the initial guess obtained by the method described in the previous section. We employ the ICC method, which minimizes the distance between corresponding curves.

We use the correspondences of two types of edges in this process: stable ones and variable ones. The correspondences of the stable edge curves have been established by the method described in section 5. The variable edges of the generic model, e.g. the contour of the face, must be re-extracted whenever the parameters are updated, because these curves vary whenever the parameters change. However, the correspondence of the contour of the face is known.

Once the correspondences of the curves are established, the squared Mahalanobis distance between corresponding curves is minimized. We minimize the value of the function J:

    J = Σ_k d(C_k^I, C_{l(k)}^W(P)) + d(C_o^I, C_o^W(P))   (17)

      = Σ_k (1/N_k^I) Σ_{x_i^{Ik} ∈ C_k^I} min_{x_j^{Wl} ∈ C_l^W(P)} d_m(x_i^{Ik}, x_j^{Wl}(P))
        + (1/N_o^I) Σ_{x_i^{Io} ∈ C_o^I} min_{x_j^{Wo} ∈ C_o^W(P)} d_m(x_i^{Io}, x_j^{Wo}(P)),   (18)

where l(k) denotes the model curve corresponding to the image curve C_k^I.

We minimize the value of J to find P by iterating these two steps:

- For each image point x_i^{Ik} of each corresponding curve pair (C_k^I, C_l^W(P)), the point x_j^{Wl} which minimizes d_m(x_i^{Ik}, x_j^{Wl}) is found.
- P is updated to minimize J by the Levenberg-Marquardt algorithm.

P includes the head pose and camera parameters of equation (2). We directly estimate the eight parameters, i.e., three rotation parameters, three translation parameters, and two camera parameters, instead of each component of P.

7.1. Non-linear Minimization of the Distance Between Curves

From equation (2), P is decomposed into a perspective projection and a rigid displacement. Non-linear minimization under the constraints of the rotation matrix is complicated; therefore, we rewrite the rigid displacement part using a 3D vector q.
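The two-step ICC iteration described above can be sketched with a drastically simplified stand-in: a 2D, translation-only alignment in which the update step has a closed form (the mean residual), whereas the paper updates all eight pose and camera parameters by Levenberg-Marquardt. The data below are illustrative:

```python
import numpy as np

def icc_translation_2d(img_pts, model_pts, iters=20):
    """Simplified stand-in for the ICC loop of Sec. 7: alternate
    (a) closest-point assignment and (b) re-estimation of the pose,
    here reduced to a single 2D translation."""
    t = np.zeros(2)
    for _ in range(iters):
        moved = model_pts + t
        # (a) for each image point, find the closest projected model point
        d2 = ((img_pts[:, None, :] - moved[None, :, :]) ** 2).sum(-1)
        closest = moved[d2.argmin(axis=1)]
        # (b) update the pose to shrink the summed squared distances
        t += (img_pts - closest).mean(axis=0)
    return t

model = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
img = model + np.array([0.3, -0.2])        # ground-truth shift
print(icc_translation_2d(img, model))      # -> approx [ 0.3 -0.2]
```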
The rigid displacement is rewritten as

    T X̃ = R X + t   (19)
        = X + (2 / (1 + q^t q)) (q × X − (q × X) × q) + t.   (20)

The direction of q is equal to the rotation axis, and the norm of q is equal to tan(θ/2), where θ is the rotation angle. With this parametrization, because the three components of q are independent, the minimization becomes much simpler.

8. Experimental Results

We show in this section some preliminary results obtained with the proposed technique.

Figure 1 shows the model edges constructed from measurements of 36 women's heads, all with neutral facial expressions. Figure 2(a) shows the edges of an image of one woman, extracted by the method described in [6]. Figure 2(b) shows the extracted stable edge curves, i.e., the contours of the eyes, lips, and eyebrows. These edges are extracted by establishing the correspondence between model edges and image edges as described in section 5.

Figure 2. (a) Extracted edges in images of one woman's face and (b) edge curves of the eyes, lips, and eyebrows extracted by the correspondence between the model and the image.

Figure 3(a) shows the co-planar conics fitted to the contours of the eyes and lips in the image shown in figure 2(a). Figure 3(b) shows the result of the rough estimation: the conics of the model are plotted in red and the conics of the image in black.

Figure 3. Edges and conics of the eyes and lips and the result of rough estimation using conics. (a) Edges of a woman's face and co-planar conics. (b) The result of rough estimation using the conics of (a). The conics of the image are plotted in black and the projections of the model conics in red.

The head pose and camera parameters of the image shown in Fig. 2(a) were then estimated. Figure 4 shows the projection of the generic model by the estimated parameters. The pose of the head shown in Fig. 2(a) and that of Fig. 4 are almost the same.

Figure 4. The result of the head pose estimation using ICC.

9. Conclusion

Head pose determination is very important for many applications such as human-computer interfaces and video conferencing. In this paper, we have proposed a new method for accurately estimating the head pose from only one image. To deal with the shape variation of heads among individuals and with different facial expressions, we use a generic 3D model of the human head, built through statistical analysis of range data of many heads. In particular, we use a set of 3D curves to model the contours of the eyes, lips, and eyebrows.

We have proposed the iterative closest curve matching (ICC) method, which estimates the pose directly by iteratively minimizing the squared Mahalanobis distance between the projected model curves and the corresponding curves in the image. The curve correspondence is established by a relaxation technique. Because a curve contains much richer information than a point, curve correspondences can be established more robustly and with less ambiguity; therefore, pose estimation based on ICC is believed to be more accurate than that based on the well-known ICP.

Furthermore, our technique does not assume that the internal parameters of the camera are known. This provides more flexibility in practice because an uncalibrated camera can be used. The unknown parameters are recovered by our technique thanks to the generic 3D model.

Preliminary experimental results show that (i) an accurate head pose can be estimated by our method using the generic model and (ii) the generic model can deal with the shape differences between individuals. The accuracy of the pose estimation depends tightly on whether the image curves can be successfully extracted. More experiments need to be carried out for different facial expressions and for cluttered backgrounds.

We believe that the ICC method is useful not only for 3D-2D pose estimation but also for 2D-2D or 3D-3D pose estimation.
Acknowledgment:
We thank K. Isono for his help in the preparation of the experimental data.

References

[1] A. Lanitis, C. J. Taylor and T. F. Cootes. Automatic Interpretation and Coding of Face Images Using Flexible Models. IEEE Trans. PAMI, 19(7):743-756, 1997.
[2] C. A. Rothwell, A. Zisserman, C. I. Marinos, D. A. Forsyth and J. L. Mundy. Relative Motion and Pose from Arbitrary Plane Curves. IVC, 10(4):250-262, 1992.
[3] D. J. Beymer. Face Recognition Under Varying Pose. In CVPR94, pages 756-761, 1994.
[4] K. Isono and S. Akamatsu. A Representation for 3D Faces with Better Feature Correspondence for Image Generation using PCA. Technical Report HIP96-17, IEICE, 1996.
[5] P. J. Besl and N. D. McKay. A Method for Registration of 3-D Shapes. IEEE Trans. PAMI, 14(2):239-256, 1992.
[6] R. Deriche. Using Canny's Criteria to Derive a Recursively Implemented Optimal Edge Detector. IJCV, 1(2):167-187, 1987.
[7] T. S. Jebara and A. Pentland. Parametrized Structure from Motion for 3D Adaptive Feedback Tracking of Faces. In CVPR97, pages 144-150, 1997.
[8] Z. Zhang. Iterative Point Matching for Registration of Free-Form Curves and Surfaces. IJCV, 13(2):119-152, 1994.
[9] Z. Zhang. Parameter Estimation Techniques: A Tutorial with Application to Conic Fitting. IVC, 15:59-76, 1997.
[10] Z. Zhang, R. Deriche, O. Faugeras and Q. T. Luong. A Robust Technique for Matching Two Uncalibrated Images Through the Recovery of the Unknown Epipolar Geometry. AI Journal, 78:87-119, 1995.

A. Linear Estimation of H

Assume the image point (x, y) and the object point (x_p, y_p) form a corresponding pair. We rewrite the components of H as

    H = [ a  b  c ]
        [ d  e  f ]   (21)
        [ g  h  1 ]

By eliminating λ in equation (12), we get

    a x_p + b y_p + c − g x_p x − h y_p x = x,   (22)
    d x_p + e y_p + f − g x_p y − h y_p y = y.   (23)

From equations (22) and (23), the components of H are calculated by the linear least squares algorithm.

B. Eliminating Ambiguous Solutions for H

We select the correspondence combination which minimizes a criterion function.

If H is correct, a conic Q_P on the face plane and the corresponding image conic Q_I satisfy

    Q_P = λ² H^t Q_I H.   (24)

Using this relation, the criterion function e(H_m) for a candidate H_m is defined as [2]

    e(H_m) = (I_ab3(Q'_P1, Q_P1) − 3)² + (I_ab4(Q'_P1, Q_P1) − 3)²
           + (I_ab3(Q'_P2, Q_P2) − 3)² + (I_ab4(Q'_P2, Q_P2) − 3)²,   (25)

where

    Q'_Pi = H_m^t Q_Ii H_m,   (26)

and

    I_ab3(A, B) = trace[ ((1/det A) A)^{−1} (1/det B) B ],   (27)
    I_ab4(A, B) = trace[ ((1/det B) B)^{−1} (1/det A) A ].   (28)

C. Decomposition of H

From equation (13), the head pose and camera parameters are determined from the components of H. Because R' is a rotation matrix (we write r_ij for its components, as in equation (13)), we have

    r_11² + r_21² + r_31² = 1,   (29)
    r_12² + r_22² + r_32² = 1,   (30)
    r_11 r_12 + r_21 r_22 + r_31 r_32 = 0.   (31)

We use h_ij to denote the (i, j)-th component of H, and λ for the unknown scale relating the estimated H (normalized so that h_33 = 1) to equation (13). From equations (13) and (31), we have

    h_11 h_12 / α_u² + h_21 h_22 / α_v² + h_31 h_32 = 0.   (32)

From equations (29) and (30), we also have

    λ² (h_11² / α_u² + h_21² / α_v² + h_31²) = 1,   (33)
    λ² (h_12² / α_u² + h_22² / α_v² + h_32²) = 1.   (34)

Then, by eliminating λ², we have

    (h_11² − h_12²) / α_u² + (h_21² − h_22²) / α_v² + h_31² − h_32² = 0.   (35)

Let β_u = 1/α_u² and β_v = 1/α_v². Solving the linear system formed by equations (32) and (35) for β_u and β_v gives

    β_u = [ h_21 h_22 (h_31² − h_32²) − h_31 h_32 (h_21² − h_22²) ] / d,   (36)
    β_v = [ h_31 h_32 (h_11² − h_12²) − h_11 h_12 (h_31² − h_32²) ] / d,   (37)

where

    d = h_11 h_12 (h_21² − h_22²) − h_21 h_22 (h_11² − h_12²).   (38)

Once α_u and α_v are estimated, we can compute λ using equation (33) or (34). The pose parameters are then given by

    r_11 = λ h_11 / α_u,  r_21 = λ h_21 / α_v,  r_31 = λ h_31,   (39)
    r_12 = λ h_12 / α_u,  r_22 = λ h_22 / α_v,  r_32 = λ h_32,   (40)
    t_1 = λ h_13 / α_u,   t_2 = λ h_23 / α_v,   t_3 = λ h_33.   (41)

r_i3 (i = 1, ..., 3) can then be computed using the orthogonality of the rotation matrix.
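The closed-form decomposition of appendix C can be checked numerically: build H from known parameters, normalize it so that h_33 = 1, and recover the parameters with equations (36)-(41). This is a sketch of the reconstruction given here, not the authors' code, and all numeric values are illustrative:

```python
import numpy as np

def decompose_H(H):
    """Appendix C: recover alpha_u, alpha_v, the first two rotation
    columns, and the translation from the face-plane homography H
    (Eqs. (32)-(41))."""
    h = H
    d = (h[0, 0]*h[0, 1]*(h[1, 0]**2 - h[1, 1]**2)
         - h[1, 0]*h[1, 1]*(h[0, 0]**2 - h[0, 1]**2))
    beta_u = (h[1, 0]*h[1, 1]*(h[2, 0]**2 - h[2, 1]**2)
              - h[2, 0]*h[2, 1]*(h[1, 0]**2 - h[1, 1]**2)) / d
    beta_v = (h[2, 0]*h[2, 1]*(h[0, 0]**2 - h[0, 1]**2)
              - h[0, 0]*h[0, 1]*(h[2, 0]**2 - h[2, 1]**2)) / d
    alpha_u, alpha_v = 1/np.sqrt(beta_u), 1/np.sqrt(beta_v)
    lam = 1/np.sqrt(h[0, 0]**2*beta_u + h[1, 0]**2*beta_v + h[2, 0]**2)
    r1 = lam*np.array([h[0, 0]/alpha_u, h[1, 0]/alpha_v, h[2, 0]])
    r2 = lam*np.array([h[0, 1]/alpha_u, h[1, 1]/alpha_v, h[2, 1]])
    t = lam*np.array([h[0, 2]/alpha_u, h[1, 2]/alpha_v, h[2, 2]])
    return alpha_u, alpha_v, r1, r2, t

def rot(ax, a):
    """Elementary rotation about axis 'x', 'y', or 'z'."""
    c, s = np.cos(a), np.sin(a)
    m = {"x": [[1, 0, 0], [0, c, -s], [0, s, c]],
         "y": [[c, 0, s], [0, 1, 0], [-s, 0, c]],
         "z": [[c, -s, 0], [s, c, 0], [0, 0, 1]]}[ax]
    return np.array(m)

# Build a synthetic H (Eq. (13)) from known parameters and recover them.
R = rot("x", 0.3) @ rot("y", 0.4) @ rot("z", 0.5)
au, av, t = 800.0, 700.0, np.array([0.1, -0.2, 2.0])
H = np.array([[au*R[0, 0], au*R[0, 1], au*t[0]],
              [av*R[1, 0], av*R[1, 1], av*t[1]],
              [R[2, 0], R[2, 1], t[2]]])
H /= H[2, 2]                          # the least squares fixes h33 = 1
au2, av2, r1, r2, t2 = decompose_H(H)
print(round(au2, 3), round(av2, 3))   # -> 800.0 700.0
```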
