Day 3 - Single Image Orientation
STRUCTURE FROM MOTION (SFM)
http://www.canonoutsideofauto.ca/play/
FIELD OF VIEW (FOV)
Try changing the focal length and observe its effect on the image FOV:
https://camerasim.com/camerasim-free-web-app/
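The simulator shows the effect qualitatively. As a rough numerical sketch, the horizontal FOV of an ideal pinhole camera is 2·atan(w / 2f), where w is the sensor width and f the focal length. The 36 mm sensor width and the function name below are assumptions for illustration and do not come from the slides.

```python
import math

def field_of_view_deg(focal_length_mm, sensor_width_mm=36.0):
    """Horizontal FOV in degrees of an ideal pinhole camera.

    sensor_width_mm=36.0 is an ASSUMED full-frame sensor width.
    """
    return math.degrees(2.0 * math.atan(sensor_width_mm / (2.0 * focal_length_mm)))

# Longer focal length -> narrower field of view.
for f in (18, 35, 50, 100, 200):
    print(f"f = {f:3d} mm -> FOV = {field_of_view_deg(f):5.1f} deg")
```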
PINHOLE CAMERA MODEL
The pinhole camera model describes the mathematical relationship between
the coordinates of a 3D point and its projection onto the image plane.
• $\bar{X}$, $\bar{Y}$ and $\bar{Z}$ are the world coordinates relative to the origin
located at the camera perspective center $O$ (lens).
• $x$, $y$ are the image coordinates.
• $p$ = principal point = image center with coordinates $(u_0, v_0)$.
From trigonometry:
\[ x = f\,\frac{\bar{X}}{\bar{Z}}, \qquad y = f\,\frac{\bar{Y}}{\bar{Z}} \]
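A minimal sketch of this projection, assuming the points are already expressed in the camera frame with the origin at the perspective center. The function name and the sample values are illustrative only.

```python
import numpy as np

def project_pinhole(points_cam, f):
    """Project camera-frame points (N, 3) to image coordinates (N, 2)
    using x = f * X / Z and y = f * Y / Z."""
    points_cam = np.asarray(points_cam, dtype=float)
    X, Y, Z = points_cam[:, 0], points_cam[:, 1], points_cam[:, 2]
    return np.column_stack((f * X / Z, f * Y / Z))

# Two points on the same ray through the perspective center
# (the second is the first scaled by 2) map to the same image point.
pts = np.array([[1.0, 0.5, 10.0],
                [2.0, 1.0, 20.0]])
xy = project_pinhole(pts, f=0.05)  # f in metres (assumed 50 mm lens)
print(xy)
```

The depth is lost in the division by Z, which is why a single image cannot recover the 3D position of a point on its own.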
MATHEMATICAL MODELS OF THE PINHOLE CAMERA
The collinearity equations are based on the geometric
condition that the object point $A(X_A, Y_A, Z_A)$, its
projection on the image $(x_a, y_a)$ and the camera
lens $(X_o, Y_o, Z_o)$ are all collinear (green dotted line).
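The slide names the collinearity condition without writing it out. One way to sketch it in this lecture's notation is shown below. It assumes a rotation matrix $R$ with elements $r_{ij}$ that takes world axes to camera axes, and the positive sign convention $x = f\,\bar{X}/\bar{Z}$; many textbooks use the opposite sign.

```latex
\begin{aligned}
\begin{bmatrix} \bar{X} \\ \bar{Y} \\ \bar{Z} \end{bmatrix}
 &= R \begin{bmatrix} X_A - X_o \\ Y_A - Y_o \\ Z_A - Z_o \end{bmatrix}, \\[4pt]
x_a &= f\,\frac{\bar{X}}{\bar{Z}}
 = f\,\frac{r_{11}(X_A - X_o) + r_{12}(Y_A - Y_o) + r_{13}(Z_A - Z_o)}
           {r_{31}(X_A - X_o) + r_{32}(Y_A - Y_o) + r_{33}(Z_A - Z_o)}, \\[4pt]
y_a &= f\,\frac{\bar{Y}}{\bar{Z}}
 = f\,\frac{r_{21}(X_A - X_o) + r_{22}(Y_A - Y_o) + r_{23}(Z_A - Z_o)}
           {r_{31}(X_A - X_o) + r_{32}(Y_A - Y_o) + r_{33}(Z_A - Z_o)}.
\end{aligned}
```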
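\[
\begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = K \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}
\quad \text{or} \quad
\begin{bmatrix} u \\ v \\ 1 \end{bmatrix} =
\begin{bmatrix} f & 0 & x_0 \\ 0 & f & y_0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} x \\ y \\ 1 \end{bmatrix}
\]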
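[Figure: a point in the world system (X, Y, Z) is brought into the camera frame by a 3D to 3D transformation (rotation + translation, the exterior orientation) and then mapped to the image by a 3D to 2D transformation (the K matrix).]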
ROTATION IN 3D
• Understanding 3D rotation is essential in photogrammetry.
• A positive rotation is clockwise if a left-handed system is adopted, or
anticlockwise if a right-handed system is adopted.
• The orientation of the camera coordinate system with respect to the world
system can be represented by an orthonormal rotation matrix $R$.
• Point $a$, with coordinates $x_a = [X_a \; Y_a \; Z_a]^T$, and point $b$, with
coordinates $x_b = [X_b \; Y_b \; Z_b]^T$, are related by the rotation:
\[ x_a = R_{ab}\, x_b \]
ROTATION IN 3D
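Euler angles are typically denoted as omega $\omega$, phi $\varphi$, and kappa $\kappa$. The $\omega$ angle is
applied around the $X$-axis, $\varphi$ around the $Y$-axis, and $\kappa$ around the $Z$-axis.
\[
R_\omega = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\omega & \sin\omega \\ 0 & -\sin\omega & \cos\omega \end{bmatrix}, \quad
R_\varphi = \begin{bmatrix} \cos\varphi & 0 & -\sin\varphi \\ 0 & 1 & 0 \\ \sin\varphi & 0 & \cos\varphi \end{bmatrix}, \quad
R_\kappa = \begin{bmatrix} \cos\kappa & \sin\kappa & 0 \\ -\sin\kappa & \cos\kappa & 0 \\ 0 & 0 & 1 \end{bmatrix}
\]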
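The three elementary matrices can be combined into a single rotation matrix, as in the sketch below. The slide does not state the order of composition, so the sketch assumes R = R_kappa · R_phi · R_omega, a common photogrammetric convention. The function names and the sample angles are illustrative.

```python
import numpy as np

def rot_omega(omega):
    """Rotation about the X-axis, as written on the slide."""
    c, s = np.cos(omega), np.sin(omega)
    return np.array([[1, 0, 0], [0, c, s], [0, -s, c]])

def rot_phi(phi):
    """Rotation about the Y-axis, as written on the slide."""
    c, s = np.cos(phi), np.sin(phi)
    return np.array([[c, 0, -s], [0, 1, 0], [s, 0, c]])

def rot_kappa(kappa):
    """Rotation about the Z-axis, as written on the slide."""
    c, s = np.cos(kappa), np.sin(kappa)
    return np.array([[c, s, 0], [-s, c, 0], [0, 0, 1]])

def rotation_matrix(omega, phi, kappa):
    # ASSUMPTION: composition order R = R_kappa @ R_phi @ R_omega.
    return rot_kappa(kappa) @ rot_phi(phi) @ rot_omega(omega)

R = rotation_matrix(np.radians(5.0), np.radians(-3.0), np.radians(30.0))

# An orthonormal rotation matrix satisfies R^T R = I and det(R) = +1,
# so its inverse is simply its transpose.
print(np.allclose(R.T @ R, np.eye(3)))
print(np.isclose(np.linalg.det(R), 1.0))
```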
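In homogeneous coordinates, the projection $x = f\,\bar{X}/\bar{Z}$, $y = f\,\bar{Y}/\bar{Z}$ becomes (up to the scale factor $\bar{Z}$):
\[
\begin{bmatrix} x \\ y \\ 1 \end{bmatrix} =
\begin{bmatrix} f & 0 & 0 & 0 \\ 0 & f & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}
\begin{bmatrix} \bar{X} \\ \bar{Y} \\ \bar{Z} \\ 1 \end{bmatrix}
\]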
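$x$, $y$ are coordinates relative to the center of the image at the principal point, so we
need to translate the image coordinates to pixel coordinates, whose origin is at the top-left
corner:
\[
\begin{bmatrix} x \\ y \\ 1 \end{bmatrix} =
\begin{bmatrix} f & 0 & x_0 & 0 \\ 0 & f & y_0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}
\begin{bmatrix} \bar{X} \\ \bar{Y} \\ \bar{Z} \\ 1 \end{bmatrix}
\]
[Figure: image of given width and height with the pixel origin [0,0] at the upper-left corner, the principal point $(x_o, y_o)$, and a point $a$ with image coordinates $(x_a, y_a)$ and pixel coordinates $(u_a, v_a)$.]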
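As a small numerical sketch of this step: the 3x4 matrix is applied to a homogeneous camera-frame point and the result is divided by its last component. The values of f, x0 and y0 (in pixels) and the function name are assumed for illustration.

```python
import numpy as np

def to_pixel(point_cam, f, x0, y0):
    """Map a camera-frame point (X, Y, Z) to pixel coordinates using
    the 3x4 matrix with focal length f and principal point (x0, y0)."""
    M = np.array([[f, 0.0, x0, 0.0],
                  [0.0, f, y0, 0.0],
                  [0.0, 0.0, 1.0, 0.0]])
    Xh = np.append(np.asarray(point_cam, dtype=float), 1.0)  # homogeneous 4-vector
    h = M @ Xh                                               # (f*X + x0*Z, f*Y + y0*Z, Z)
    return h[:2] / h[2]                                      # divide out the scale Z

# ASSUMED values: f = 1000 px, principal point at (320, 240) px.
uv = to_pixel([0.2, -0.1, 4.0], f=1000.0, x0=320.0, y0=240.0)
print(uv)  # equals (f*X/Z + x0, f*Y/Z + y0)
```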
PROJECTION MATRIX
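• $\bar{X}$, $\bar{Y}$, $\bar{Z}$ are the three-dimensional world coordinates of the points after they have been
rotated by $R$ and translated by $t$ into the camera frame.
\[
\underbrace{\begin{bmatrix} x \\ y \\ 1 \end{bmatrix}}_{3 \times 1} =
\underbrace{\begin{bmatrix} f & 0 & x_0 \\ 0 & f & y_0 \\ 0 & 0 & 1 \end{bmatrix}}_{3 \times 3}
\underbrace{\begin{bmatrix} R_{3 \times 3} & -R_{3 \times 3}\, t_{3 \times 1} \end{bmatrix}}_{3 \times 4}
\underbrace{\begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}}_{4 \times 1}
\]
2D point (3x1) = intrinsic matrix K (3x3) · rotation matrix R (3x3) · translation matrix T (3x4) · 3D point X (4x1).
The product of the three matrices K, R and T is the P matrix.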
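The assembly of the P matrix can be sketched as follows. The sketch reads the term -R t as meaning that t is the position of the camera perspective center in the world frame, which is an interpretation of the slide. The function names and all numerical values are assumptions.

```python
import numpy as np

def projection_matrix(K, R, t):
    """Build P = K [R | -R t] (3x4).

    ASSUMPTION: t is the camera position in the world frame, so a
    world point X maps to the camera frame as R (X - t).
    """
    t = np.asarray(t, dtype=float).reshape(3, 1)
    return K @ np.hstack((R, -R @ t))

def project(P, X_world):
    """Project one 3D world point to 2D pixel coordinates."""
    Xh = np.append(np.asarray(X_world, dtype=float), 1.0)
    h = P @ Xh
    return h[:2] / h[2]

# ASSUMED intrinsics and pose: camera 5 units behind the world origin,
# axes aligned with the world axes.
K = np.array([[1000.0, 0.0, 320.0],
              [0.0, 1000.0, 240.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)
t = np.array([0.0, 0.0, -5.0])

P = projection_matrix(K, R, t)
print(P.shape)
print(project(P, [0.0, 0.0, 0.0]))  # point on the optical axis -> principal point
print(project(P, [0.5, 0.0, 0.0]))
```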
When the camera motion (multiple poses) is defined, the structure of
the scene is reconstructed from the images.
Mapping of the scene can then be applied.
More details will be given in the SfM lecture.
[Figure: camera poses P1, P2, P3, P4 along a trajectory in the world system (X, Y, Z).]
DIRECT LINEAR TRANSFORMATION (DLT)
• There are different techniques to solve for the projection matrix,
using either linear or nonlinear methods.
• The DLT equations for one observed point can be formulated as
follows:
\[
\begin{bmatrix}
X & Y & Z & 1 & 0 & 0 & 0 & 0 & -xX & -xY & -xZ & -x \\
0 & 0 & 0 & 0 & X & Y & Z & 1 & -yX & -yY & -yZ & -y
\end{bmatrix}
\begin{bmatrix} p_1 \\ p_2 \\ \vdots \\ p_{12} \end{bmatrix}
=
\begin{bmatrix} 0 \\ 0 \end{bmatrix}
\]
• With six measured image points, twelve equations can be
formed in the DLT; six is the minimum number of points needed to
determine the parameters of the projection matrix.
• The DLT system has the form $Ap = 0$ and is solved using singular value
decomposition (SVD) of the matrix $A$.
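The DLT steps above can be sketched with NumPy as follows. Two rows are stacked per point, and the right singular vector belonging to the smallest singular value is taken as the solution of Ap = 0. The function name, the synthetic ground-truth camera and the random test points are assumptions. The data are noise-free and no coordinate normalization is applied, which real measurements would usually need.

```python
import numpy as np

def dlt_projection_matrix(world_pts, image_pts):
    """Estimate the 3x4 projection matrix (up to scale) from >= 6
    non-coplanar 3D points and their 2D image observations."""
    rows = []
    for (X, Y, Z), (x, y) in zip(world_pts, image_pts):
        rows.append([X, Y, Z, 1, 0, 0, 0, 0, -x * X, -x * Y, -x * Z, -x])
        rows.append([0, 0, 0, 0, X, Y, Z, 1, -y * X, -y * Y, -y * Z, -y])
    A = np.array(rows, dtype=float)
    # Singular values are sorted in descending order, so the last row
    # of Vt is the null-space direction p of A.
    _, _, Vt = np.linalg.svd(A)
    return Vt[-1].reshape(3, 4)

# Synthetic check with an ASSUMED ground-truth camera.
K = np.array([[1000.0, 0.0, 320.0],
              [0.0, 1000.0, 240.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)
t = np.array([[0.0], [0.0], [-5.0]])       # camera position (assumed)
P_true = K @ np.hstack((R, -R @ t))

rng = np.random.default_rng(0)
world_pts = rng.uniform(-1.0, 1.0, size=(8, 3))
world_h = np.hstack((world_pts, np.ones((8, 1))))
h = world_h @ P_true.T
image_pts = h[:, :2] / h[:, 2:3]

P_est = dlt_projection_matrix(world_pts, image_pts)

# P is only defined up to scale, so compare reprojections instead.
h_est = world_h @ P_est.T
reproj = h_est[:, :2] / h_est[:, 2:3]
print(np.abs(reproj - image_pts).max())    # maximum reprojection error in pixels
```

Because P is recovered only up to a scale factor, the check compares reprojected image points rather than the matrix entries themselves.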
WHAT HAVE WE LEARNED?
• Some useful camera settings and how they relate to capturing good images.
• The geometry of a single image, 3D to 2D.
• Rotation in 3D space and some rotational variants.
• Determination of camera pose using linear and nonlinear methods.
Thank you