05_MVS
1 Introduction
    1.1 Recovering 3D Geometry from Images
2 Stereo Reconstruction
    2.1 Classical Two View Stereo Reconstruction [Optional]
        2.1.1 Epipolar Geometry and Stereo Triangulation
    2.2 Learning based Two View Stereo Reconstruction
    2.3 PMVS
    2.4 MVSNet
        2.4.1 Differentiable homography
        2.4.2 MVSNet overall architecture
Bibliography
1 Introduction
• Textureless Surfaces It is hard to infer geometry from a textureless surface (e.g., a white wall), since it looks similar from different viewpoints.
• Occlusions Scene objects may be partly or wholly invisible in some views due to occlusions in the scene.
Zbontar et al. [15] have shown that block matching on feature spaces gives more robust results and can be used for depth perception in a two-view stereo setting. The goal of multi-view stereo (MVS) techniques is to estimate a dense representation of the scene from overlapping calibrated views. Recent learning-based MVS methods [11, 13, 14] achieve more complete scene representations by learning depth maps in feature space.
Figure 1.1: Failure cases of block matching. (Image credit: Andreas Geiger)
2 Stereo Reconstruction
Here, we describe depth estimation in two-view and multi-view settings where the pose information is known, covering both traditional and learning-based methods.
Monocular vision suffers from a scale ambiguity that makes it impossible to triangulate the scene at the correct scale. Intuitively, if the distance of the scene from the camera and the geometry of the scene were both scaled by some positive factor k, the image plane would contain exactly the same projection of the scene, independent of the value of k.
Without any prior information, it is also impossible to perceive scene geometry from a single RGB image. The most common way of constructing and perceiving scene geometry is to move the camera and obtain a different view, as shown in Figure 2.1.
Even having multiple monocular views does not resolve the scale ambiguity without knowing the extrinsic calibration: if the relative pose between the views and the camera-to-scene distance were scaled by the same positive factor k, the image planes would again contain the same projections of the scene, independent of the value of k. Figure 2.2 shows that the point has the same projection regardless of the scale factor k.
This section mainly covers scene triangulation in two-view and multi-view settings with known relative pose transformations.
Figure 2.1: The left frame alone does not tell whether there are one or two spheres in the scene; seeing the right frame as well gives the viewer a better understanding and the perception of two spheres with different colors. (Image credit: Arne Nordmann)
Figure 2.2: Scale ambiguity of the two-view system without knowing the relative pose.
Two-view triangulation
Before diving into the two-view triangulation methods, this subsection introduces the basics and conventions of multiple view geometry. Assume $x_1$ and $x_2$ are the projections of a 3D point $X$, in homogeneous coordinates, in two different frames. $R$ and $T$ are the rotation and translation from the first frame to the second frame, and $\lambda_1$ and $\lambda_2$ are the distances from the camera centers to the 3D point $X$:
$$
\lambda_1 x_1 = X, \qquad \lambda_2 x_2 = R X + T
\qquad\Longrightarrow\qquad
\lambda_2 x_2 = R(\lambda_1 x_1) + T
$$

The hat operator $\hat{v}$ denotes the skew-symmetric matrix of a vector $v$, so that $\hat{v} w = v \times w$ and in particular $\hat{v} v = v \times v = 0$. Multiplying both sides by $\hat{T}$ eliminates the translation, and multiplying by $x_2^T$ then eliminates the left-hand side, since $x_2 \perp \hat{T} x_2$:

$$
\begin{aligned}
\lambda_2 \hat{T} x_2 &= \hat{T} R (\lambda_1 x_1) + \hat{T} T = \hat{T} R (\lambda_1 x_1) \\
\lambda_2 x_2^T (\hat{T} x_2) &= \lambda_1 x_2^T \hat{T} R x_1 \\
x_2^T \hat{T} R x_1 &= 0 \qquad \text{(epipolar constraint)} \\
E &= \hat{T} R \qquad \text{(essential matrix)}
\end{aligned}
\tag{2.2}
$$
With $x_i'$ being the image coordinate of $x_i$ and $K$ being the intrinsic matrix, the equation can be formulated more generically for uncalibrated views.
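Substituting $x_i = K^{-1} x_i'$ into the epipolar constraint gives the standard uncalibrated form (see e.g. [5]):

$$
x_2'^T \, K^{-T} \hat{T} R \, K^{-1} \, x_1' = 0, \qquad
F = K^{-T} E \, K^{-1} \quad \text{(fundamental matrix)}
$$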
Because of sensor noise and the discretization step in image formation, the pixel coordinates of the 3D point projections are noisy. As a result, the rays back-projected through corresponding points in the image planes usually do not intersect in 3D. This noise must be taken into account to obtain an accurate triangulation of corresponding points. There are multiple ways of doing two-view triangulation; two of them are covered here.
Midpoint method
Let $Q_1$ and $Q_2$ be the points on the two bearing rays that are closest to each other. The line passing through $Q_1 Q_2$ must be perpendicular to both bearing vectors for $Q_1 Q_2$ to be the shortest segment between the rays, and the midpoint of $Q_1 Q_2$ is accepted as the 3D triangulation of the corresponding points. With $\lambda_i$ being the scalar distance from the camera center to the point $Q_i$, and $R$ and $T$ being the relative pose from the second camera frame to the first, the approach can be formulated as the small linear system sketched below.
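A minimal numerical sketch of the midpoint method with NumPy (an illustration of the formulation above, not code from any particular library; bearing vectors are assumed to be given in each camera's own frame):

```python
import numpy as np

def midpoint_triangulation(d1, d2, R, T):
    """Midpoint triangulation of two corresponding bearing vectors.

    d1, d2: bearing vectors in the first and second camera frames;
    R, T:   relative pose mapping points from the second camera frame
            to the first (p1 = R @ p2 + T)."""
    d1 = d1 / np.linalg.norm(d1)
    r2 = R @ (d2 / np.linalg.norm(d2))   # second ray direction in frame 1
    # Rays in frame 1: Q1 = lam1 * d1 and Q2 = T + lam2 * r2.
    # Requiring (Q2 - Q1) to be perpendicular to both ray directions
    # gives a 2x2 linear system in (lam1, lam2).
    A = np.array([[d1 @ d1, -(d1 @ r2)],
                  [r2 @ d1, -(r2 @ r2)]])
    b = np.array([d1 @ T, r2 @ T])
    lam1, lam2 = np.linalg.solve(A, b)
    Q1 = lam1 * d1                       # closest point on ray 1
    Q2 = T + lam2 * r2                   # closest point on ray 2
    return 0.5 * (Q1 + Q2)               # midpoint of the shortest segment
```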
Linear triangulation
This method is based on the fact that, in the ideal case, the back-projected ray and the ray from the camera center to the corresponding point on the image plane are aligned, so their cross product is zero. Using this fact, the problem is converted into a set of linear equations that can be solved with the SVD. Let $x$ and $y$ be corresponding points, $P$ and $Q$ the respective perspective projection matrices of the two cameras, and $\lambda_x$ and $\lambda_y$ scalar values:
$$
x = \begin{pmatrix} u_x \\ v_x \\ 1 \end{pmatrix}, \qquad
y = \begin{pmatrix} u_y \\ v_y \\ 1 \end{pmatrix}
$$

$$
\lambda_x x = P X, \quad \lambda_y y = Q X
\qquad\Longrightarrow\qquad
x \times (P X) = 0, \quad y \times (Q X) = 0
$$

Writing the rows of $P$ as $p_1^T, p_2^T, p_3^T$ (and those of $Q$ as $q_i^T$), the cross product for the first view expands to

$$
\begin{pmatrix} u_x \\ v_x \\ 1 \end{pmatrix} \times
\begin{pmatrix} p_1^T \\ p_2^T \\ p_3^T \end{pmatrix} X = 0
\qquad\Longrightarrow\qquad
\begin{pmatrix}
v_x\, p_3^T - p_2^T \\
p_1^T - u_x\, p_3^T \\
u_x\, p_2^T - v_x\, p_1^T
\end{pmatrix} X = 0,
$$

and analogously for $y$ and $Q$. Since the third row of each cross product is linearly dependent on the other two, stacking the two independent rows from each view gives

$$
A X = 0, \qquad
A = \begin{pmatrix}
v_x\, p_3^T - p_2^T \\
p_1^T - u_x\, p_3^T \\
v_y\, q_3^T - q_2^T \\
q_1^T - u_y\, q_3^T
\end{pmatrix}
\tag{2.5}
$$
The solution for $X$ is the right singular vector of $A$ associated with the smallest singular value, easily calculated with the Singular Value Decomposition.
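A short sketch of this DLT-style triangulation with NumPy, building the matrix $A$ of (2.5) directly from the projection-matrix rows (illustrative; the inhomogeneous 3D point is returned):

```python
import numpy as np

def linear_triangulation(x, y, P, Q):
    """DLT triangulation of one correspondence.

    x, y: homogeneous pixel coordinates (u, v, 1) in the two views;
    P, Q: 3x4 perspective projection matrices of the two cameras."""
    A = np.stack([
        x[1] * P[2] - P[1],   # v_x * p3^T - p2^T
        P[0] - x[0] * P[2],   # p1^T  - u_x * p3^T
        y[1] * Q[2] - Q[1],   # v_y * q3^T - q2^T
        Q[0] - y[0] * Q[2],   # q1^T  - u_y * q3^T
    ])
    # X is the right singular vector of A with the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]       # de-homogenize
```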
Multi-view triangulation
For multi-view triangulation, one solution is to find the point $X$ in 3D space that minimizes the sum of squared distances to the bearing rays. The analytical solution for $X$ is found by taking the derivative of the loss function with respect to the 3D point and finding the point where this derivative equals zero. With $C_i$ being the camera center of the $i$-th camera, $d_i$ the unit bearing vector of the $i$-th ray, $P_i$ the point on that ray closest to $X$, $\lambda_i$ the scalar distance between $C_i$ and $P_i$, and $X$ the optimal 3D point, the triangulation result can be formulated as below.
$$
\begin{aligned}
P_i &= C_i + \lambda_i d_i \\
\lambda_i d_i &\approx X - C_i && \text{ideal case with no noise} \\
\lambda_i &= \lambda_i d_i^T d_i \approx d_i^T (X - C_i) && \text{since } \|d_i\| = 1 \\
P_i &= C_i + \lambda_i d_i \approx C_i + d_i d_i^T (X - C_i) \\
r_i &= X - C_i - d_i d_i^T (X - C_i) = (I - d_i d_i^T)(X - C_i) \\
L &= \sum_{i=1}^N \|r_i\|^2 = \sum_{i=1}^N \big\|(I - d_i d_i^T)(X - C_i)\big\|^2 \\
\arg\min_X L &\;\Rightarrow\; \frac{\partial L}{\partial X} = 0 \\
\frac{\partial L}{\partial X} &= 2 \sum_{i=1}^N (I - d_i d_i^T)^2 (X - C_i) = 0 \\
A_i = I - d_i d_i^T &\;\Rightarrow\; \sum_{i=1}^N A_i^T A_i (X - C_i) = 0 \\
X &= \Big(\sum_{i=1}^N A_i^T A_i\Big)^{-1} \sum_{i=1}^N A_i^T A_i C_i
\end{aligned}
\tag{2.6}
$$
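The closed-form solution of (2.6) translates directly into a few lines of NumPy (a sketch; camera centers and unit bearing vectors are assumed to be expressed in a common world frame):

```python
import numpy as np

def multiview_triangulation(centers, bearings):
    """Least-squares triangulation from N rays, following (2.6).

    centers:  (N, 3) camera centers C_i in the world frame;
    bearings: (N, 3) unit bearing vectors d_i in the world frame."""
    S = np.zeros((3, 3))
    b = np.zeros(3)
    for Ci, di in zip(centers, bearings):
        Ai = np.eye(3) - np.outer(di, di)  # projector orthogonal to ray i
        AtA = Ai.T @ Ai                    # equals Ai: symmetric and idempotent
        S += AtA
        b += AtA @ Ci
    # X = (sum A_i^T A_i)^{-1} sum A_i^T A_i C_i
    return np.linalg.solve(S, b)
```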
2.2 Learning based Two View Stereo Reconstruction

[Figure: Siamese patch-comparison architectures of [15]. Left and right input patches pass through shared-weight convolutional branches; the features are either concatenated and fed through fully-connected ReLU/Sigmoid layers, or normalized and combined with a dot product, to produce a similarity score.]

[Figure: Shared-weight convolutional features correlated over a range of disparities to build a cost volume over (height, width, disparity).]

[Figure: End-to-end deep stereo regression pipeline of [7]: input stereo images, 2D convolution, cost volume, multi-scale 3D convolution, 3D deconvolution, soft argmax, disparities.]
2.3 PMVS
Patch-Based Multi-View Stereo [2] has proven quite effective in practice. After an initial feature matching step aimed at constructing a sparse set of photoconsistent patches (that is, patches whose projections in the images where they are visible have similar brightness or color patterns), it divides the input images into small square cells a few pixels across and attempts to reconstruct a patch in each one of them, using the cell connectivity to propose new patches and visibility constraints to filter out incorrect ones.
We assume throughout that $n$ cameras with known intrinsic and extrinsic parameters observe a static scene, and respectively denote by $O_i$ and $I_i$ ($i = 1, \ldots, n$) the optical centers of these cameras and the images they have recorded of the scene. The main elements of the PMVS model of multi-view stereo fusion and scene reconstruction are small rectangular patches, intended to be tangent to the observed surfaces, together with a few key properties of these patches: their geometry, which images they are visible in and whether they are photoconsistent with those, and some notion of connectivity inherited from image topology.
1. Matching: Use feature matching to construct an initial set of patches, and optimize their parameters to make them maximally photoconsistent.
2. Repeat 3 times:
   a) Expansion: Iteratively construct new patches in empty spots near existing ones, using image connectivity and depth extrapolation to propose candidates, and optimizing their parameters as before to make them maximally photoconsistent.
   b) Filtering: Again use the image connectivity to remove patches identified as outliers because their depth is not consistent with a sufficient number of other nearby patches. (A high-level sketch of this loop follows the list.)
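A high-level sketch of the match-expand-filter loop; all helper names (match_features, optimize_patch, propose_candidates, depth_consistent) are hypothetical placeholders for the corresponding steps in [2], not the authors' actual API:

```python
def pmvs(images, cameras, n_iters=3):
    # 1. Matching: seed patches from feature matches, each optimized to
    #    be maximally photoconsistent (hypothetical helpers throughout).
    patches = []
    for seed in match_features(images, cameras):
        p = optimize_patch(seed, images, cameras)
        if p is not None:
            patches.append(p)
    for _ in range(n_iters):
        # 2a. Expansion: propose candidates in empty cells near existing
        #     patches via image connectivity and depth extrapolation.
        for p in list(patches):
            for cand in propose_candidates(p, images, cameras):
                q = optimize_patch(cand, images, cameras)
                if q is not None:
                    patches.append(q)
        # 2b. Filtering: drop patches whose depth disagrees with enough
        #     nearby patches (visibility-based outlier removal).
        patches = [p for p in patches
                   if depth_consistent(p, patches, cameras)]
    return patches
```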
You should read Section 2 (Key Elements of the Proposed Approach) and Section 3 (Algorithm) of the paper [2].
Figure 2.7: PMVS overall approach. From left to right: a sample input image; detected features; reconstructed patches after the initial matching; final patches after expansion and filtering; polygonal surface extracted from reconstructed patches. [2]
2.4 MVSNet
State-of-the-art learning-based MVS approaches adapt photogrammetry-based MVS algorithms by implementing them as a set of differentiable operations defined in feature space. MVSNet [13] achieves good-quality 3D reconstruction by regularizing a cost volume that is computed using a differentiable homography on the feature maps of the reference and source images.
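2.4.1 Differentiable homography

To build the cost volume, the source-view feature maps are warped onto fronto-parallel planes of the reference camera at a set of sampled depths $d$. The planar homography between the $i$-th view and the reference view ($i = 1$) at depth $d$ is, as stated in [13],

$$
H_i(d) = K_i \, R_i \left( I - \frac{(t_1 - t_i)\, n_1^T}{d} \right) R_1^T \, K_1^{-1},
$$

where $K_i$, $R_i$, $t_i$ are the intrinsics, rotation, and translation of camera $i$, and $n_1$ is the principal axis of the reference camera. The warping itself uses bilinear interpolation, so the whole operation stays differentiable and the network remains trainable end to end.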
2.4.2 MVSNet overall architecture

[Figure: MVSNet overall architecture. Shared-weight feature extraction over the input images, differentiable homography warping, a variance-based cost metric, soft argmin depth regression supervised against the ground-truth (GT) depth (Loss0), and a refined depth map. [13]]
The raw cost volume computed from the image features is then regularized: a multi-scale 3D CNN refines it into a probability volume for depth inference. The depth map regressed from the probability volume is further refined using a 2D CNN.
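As a minimal sketch of the depth regression step: softmax along the depth axis turns the regularized cost volume into the probability volume, and the per-pixel expected depth is the soft argmin of [13] (the negative-cost softmax is an assumption about the sign convention):

```python
import numpy as np

def soft_argmin_depth(cost_volume, depth_values):
    """cost_volume:  (D, H, W) regularized matching costs;
    depth_values: (D,) sampled depth hypotheses.
    Returns the (H, W) expected-depth map."""
    p = np.exp(-cost_volume)               # lower cost -> higher probability
    p /= p.sum(axis=0, keepdims=True)      # probability volume along depth
    # Per-pixel expectation of depth: sum_d d * P(d)
    return np.tensordot(depth_values, p, axes=(0, 0))
```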
You should read Section 3 (MVSNet) of the paper [13].
Bibliography
[1] H. Aanæs, R. R. Jensen, G. Vogiatzis, E. Tola, and A. B. Dahl. Large-scale data for multiple-view stereopsis. International Journal of Computer Vision, pages 1–16, 2016.
[2] Y. Furukawa and J. Ponce. Accurate, dense, and robust multi-view stereopsis. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(8):1362–1376, 2010.
[3] S. Galliani, K. Lasinger, and K. Schindler. Massively parallel multiview stereopsis by surface normal diffusion. In IEEE International Conference on Computer Vision (ICCV), June 2015.
[4] D. Gallup, J. Frahm, P. Mordohai, Q. Yang, and M. Pollefeys. Real-time plane-sweeping stereo with multiple sweeping directions. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 1–8, 2007.
[5] R. I. Hartley and A. Zisserman. Multiple View Geometry in Computer Vision. Cambridge University Press, second edition, 2004.
[6] H. Hirschmuller. Stereo processing by semiglobal matching and mutual information. IEEE Transactions on Pattern Analysis and Machine Intelligence, 30(2):328–341, 2007.
[7] A. Kendall, H. Martirosyan, S. Dasgupta, P. Henry, R. Kennedy, A. Bachrach, and A. Bry. End-to-end learning of geometry and context for deep stereo regression. In IEEE International Conference on Computer Vision (ICCV), pages 66–75, 2017.
[8] A. Knapitsch, J. Park, Q.-Y. Zhou, and V. Koltun. Tanks and temples: Benchmarking large-scale scene reconstruction. ACM Transactions on Graphics, 36(4), 2017.
[9] P. Moulon, P. Monasse, R. Perrot, and R. Marlet. OpenMVG: Open multiple view geometry. In International Workshop on Reproducible Research in Pattern Recognition, pages 60–74. Springer, 2016.
[10] J. L. Schönberger, E. Zheng, M. Pollefeys, and J.-M. Frahm. Pixelwise view selection for unstructured multi-view stereo. In European Conference on Computer Vision (ECCV), 2016.
[11] F. Wang, S. Galliani, C. Vogel, P. Speciale, and M. Pollefeys. PatchmatchNet: Learned multi-view PatchMatch stereo. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021.
[12] Wikipedia. Homography (computer vision). Wikipedia, 2013.
[13] Y. Yao, Z. Luo, S. Li, T. Fang, and L. Quan. MVSNet: Depth inference for unstructured multi-view stereo. In European Conference on Computer Vision (ECCV), 2018.
[14] Y. Yao, Z. Luo, S. Li, T. Shen, T. Fang, and L. Quan. Recurrent MVSNet for high-resolution multi-view stereo depth inference. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019.
[15] J. Žbontar and Y. LeCun. Stereo matching by training a convolutional neural network to compare image patches. Journal of Machine Learning Research, 17(1):2287–2318, 2016.