Stereo Matching and Rectification
January 6, 2025
1 Camera Model
• World coordinates → (Coordinate transformation) → Camera coordinates → (Perspective Projection)
→ Image Coordinates
Calibrate the camera's intrinsic and extrinsic parameters to accurately transform and project world coordinates to image coordinates.
• Essential matrix E operates on points in the camera coordinate system (3D points projected to 2D
camera coordinates).
E = [t]× R,    F = K′⁻⊤ E K⁻¹    (2)
• Both E and F are rank 2 matrices, but they have different properties:
– E has two equal non-zero singular values.
– F does not have equal singular values.
1.3 Big Picture: 3 Key Components in 3D
1. 3D Points (Structure): The spatial arrangement and coordinates of points in the 3D scene.
2. Estimate Fundamental Matrix: Determine the relationship between two images of the same scene
to find the corresponding points.
3. Correspondences → Camera (Motion): Use the matched points to determine the relative motion
(rotation and translation) between the two cameras.
2. Construct the M × 9 matrix A: Each row of A is constructed from a pair of corresponding points (x̂, x̂′):

   A = | x̂₁x̂′₁  x̂₁ŷ′₁  x̂₁  ŷ₁x̂′₁  ŷ₁ŷ′₁  ŷ₁  x̂′₁  ŷ′₁  1 |
       |   ⋮      ⋮     ⋮     ⋮      ⋮     ⋮    ⋮     ⋮   ⋮ |
       | x̂ₙx̂′ₙ  x̂ₙŷ′ₙ  x̂ₙ  ŷₙx̂′ₙ  ŷₙŷ′ₙ  ŷₙ  x̂′ₙ  ŷ′ₙ  1 |   (5)
3. Find the SVD of A: Perform Singular Value Decomposition to find the matrix V .
A = U DV ⊤ (6)
4. Extract F : The entries of F are the elements of the column of V corresponding to the smallest
singular value.
5. Enforce rank-2 constraint on F : Modify F to ensure it has rank 2 by setting its smallest singular
value to zero.
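Steps 2–5 can be sketched with NumPy's SVD. This is an illustrative sketch (the function name is made up, and the input points are assumed to be already normalized, e.g. by Hartley's isotropic scaling, which is not shown):

```python
import numpy as np

def estimate_fundamental(pts1, pts2):
    """Eight-point estimate of F from n >= 8 correspondences (steps 2-5).

    pts1, pts2: (n, 2) arrays of matching points, assumed already
    normalized.
    """
    x, y = pts1[:, 0], pts1[:, 1]
    xp, yp = pts2[:, 0], pts2[:, 1]
    # One row per correspondence, in the column order of equation (5).
    A = np.stack([x * xp, x * yp, x,
                  y * xp, y * yp, y,
                  xp, yp, np.ones_like(x)], axis=1)
    # Steps 3-4: the entries of F come from the right-singular vector of A
    # associated with the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)
    # Step 5: enforce rank 2 by zeroing the smallest singular value of F.
    U, S, Vt = np.linalg.svd(F)
    S[2] = 0.0
    return U @ np.diag(S) @ Vt
```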
1.5 Triangulation
• Correspondences → 3D Points (Structure): Given a set of corresponding points in two images,
triangulate to find their 3D coordinates.
• Camera (Motion) → 3D Points (Structure): Use the known camera positions and orientations
to reconstruct the 3D scene from the image points.
• Calibrate the cameras to find the Essential matrix E or the Fundamental matrix F .
• Use F to compute the epipolar line l′ in the second image.
3. Find the matching point along the epipolar line (Stereo matching):
• Search along the epipolar line for the corresponding point using a matching criterion (e.g., normalized cross-correlation).
4. Perform triangulation:
• Use the corresponding points from both images to compute the 3D coordinates of the point via
triangulation.
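The triangulation step can be sketched with the linear (DLT) method, one common choice that the notes do not prescribe; the function name is illustrative:

```python
import numpy as np

def triangulate(P1, P2, pt1, pt2):
    """Linear (DLT) triangulation of one correspondence.

    P1, P2: 3x4 projection matrices of the two cameras.
    pt1, pt2: (x, y) image coordinates of the same point in each view.
    """
    x1, y1 = pt1
    x2, y2 = pt2
    # Each view contributes two linear constraints on the homogeneous X.
    A = np.stack([x1 * P1[2] - P1[0],
                  y1 * P1[2] - P1[1],
                  x2 * P2[2] - P2[0],
                  y2 * P2[2] - P2[1]])
    # Least-squares solution: right-singular vector of the smallest
    # singular value, then dehomogenize.
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]
```

With P1 = [I | 0] and P2 a pure unit translation of the same camera, a point such as (1, 2, 5) is recovered exactly from its two projections.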
2.3 Can I Compute Depth from Any Two Images of the Same Object?
To accurately compute depth from two images of the same object, the following conditions must be met:
1. Sufficient Baseline:
• There must be a sufficient distance between the two camera positions to observe a noticeable
disparity.
• A larger baseline improves depth accuracy but can also make matching points more challenging.
2. Rectified Images:
• The images need to be rectified, which means transforming them such that the epipolar lines are
horizontal and aligned.
• Rectification simplifies the search for correspondences to a 1D problem along horizontal lines.
2.4 Effect of Baseline on Stereo Results
• Large Baseline:
– Advantages: Smaller triangulation error, leading to more accurate depth estimates.
– Disadvantages: Matching points between images becomes more difficult due to greater variation
in perspective.
• Small Baseline:
– Advantages: Easier to match points between images due to less variation in perspective.
– Disadvantages: Higher triangulation error, leading to less accurate depth estimates.
1. Estimate Ẽ, decompose into t and R, and construct Rrect as above.
2. Warp pixels in the first image as follows:
Remarks:
z = bf / d    (12)
where:
• z is the depth.
• b is the baseline (distance between the two cameras).
• f is the focal length of the camera.
• d is the disparity (difference in x-coordinates of the corresponding points in the rectified
images).
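Equation (12) translates directly into code; the helper name and the baseline/focal-length values in the comment are illustrative, not from the text:

```python
import numpy as np

def depth_from_disparity(d, b, f):
    """z = b * f / d (equation 12) for a rectified image pair.

    d: disparity in pixels, b: baseline in meters, f: focal length in
    pixels. Non-positive disparity is mapped to infinite depth.
    """
    d = np.asarray(d, dtype=float)
    return np.where(d > 0, b * f / np.maximum(d, 1e-12), np.inf)

# Illustrative numbers: a 0.54 m baseline and a 721-pixel focal length
# give a depth of about 6.08 m for a 64-pixel disparity.
```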
3.2 Similarity Measure
Commonly used similarity measures for evaluating match scores include:
• Sum of Absolute Differences (SAD):

  SAD = Σ_{(u,v)∈window} |IL(u, v) − IR(u + d, v)|    (13)

• Zero-mean SAD:

  Zero-mean SAD = Σ_{(u,v)∈window} |(IL(u, v) − ĪL) − (IR(u + d, v) − ĪR)|    (15)
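Equations (13) and (15) for a single window might be computed as follows (a sketch; the function name is made up, arrays are indexed [row, column] = [v, u], and the window is assumed to lie fully inside both images):

```python
import numpy as np

def sad_scores(left, right, u, v, d, w=2):
    """SAD (eq. 13) and zero-mean SAD (eq. 15) for one candidate
    disparity d, over a (2w+1) x (2w+1) window centered at (u, v)."""
    pl = left[v - w:v + w + 1, u - w:u + w + 1].astype(float)
    pr = right[v - w:v + w + 1, u + d - w:u + d + w + 1].astype(float)
    sad = np.abs(pl - pr).sum()
    # Subtracting each window's mean makes the score robust to a
    # constant brightness offset between the two images.
    zsad = np.abs((pl - pl.mean()) - (pr - pr.mean())).sum()
    return sad, zsad
```

A constant brightness offset between the images leaves the zero-mean score unchanged while inflating plain SAD, which is the point of equation (15).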
3.5 Disparity Space Image (DSI)
First, we introduce the concept of the Disparity Space Image (DSI). The DSI for one row represents pairwise
match scores between patches along that row in the left and right images.
• c(i, j) is the match score for the patch centered at left pixel i with the patch centered at right pixel j.
• The dissimilarity value for each pair of patches is entered as a column in the DSI.
Greedy Selection:
• For each column of the DSI, simply choose the row with the lowest dissimilarity (the best match score).
• Occlusion:
– What if a pixel in the left image is not seen in the right image?
– What if a pixel in the right image is not seen in the left image?
• Ordering Constraint: If pixels (a, b, c) are ordered in the left image, they should have the same
order in the right image.
– This is not always true and depends on the depth of the objects in the scene.
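A minimal DSI for one scanline, with greedy (lowest-dissimilarity) selection, could be sketched as follows; the 1D window and the SAD cost are simplifying assumptions, and the function names are illustrative:

```python
import numpy as np

def dsi_row(left_row, right_row, w=2, max_disp=8):
    """Dissimilarity c(i, d) between the left patch centered at pixel i
    and the right patch centered at pixel i + d, for one scanline.

    Out-of-image entries are set to +inf so they are never selected.
    """
    n = len(left_row)
    dsi = np.full((n, max_disp), np.inf)
    for i in range(w, n - w):
        pl = left_row[i - w:i + w + 1]
        for d in range(max_disp):
            j = i + d  # candidate matching pixel in the right image
            if j + w < n:
                dsi[i, d] = np.abs(pl - right_row[j - w:j + w + 1]).sum()
    return dsi

def greedy_disparity(dsi):
    # Greedy selection: per left pixel, take the disparity with the
    # lowest dissimilarity (ignoring occlusion and ordering).
    return dsi.argmin(axis=1)
```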
4.4 Occlusions: No Matches
Dealing with occlusions:
• Identify occluded regions: Use left-right consistency checks to detect pixels that do not have corresponding matches in the other image.
• Apply constraints or smoothing techniques to ensure that the disparity values change smoothly along
the scanline, except at depth discontinuities.
• Use dynamic programming or other optimization techniques to find a consistent set of disparities that
minimize a global energy function incorporating both matching costs and smoothness constraints.
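The left-right consistency check mentioned above can be sketched on one scanline; the function name and the sign convention for the right disparity map are assumptions for this sketch:

```python
import numpy as np

def lr_consistency(disp_left, disp_right, tol=1):
    """Left-right consistency check on one scanline (1D sketch).

    Convention follows equation (13): left pixel x matches right pixel
    x + disp_left[x]. disp_right stores each right pixel's displacement
    back to the left image, so a consistent match satisfies
    disp_right[x + disp_left[x]] == -disp_left[x] (within tol).
    Returns a boolean mask: False marks likely occluded pixels.
    """
    n = len(disp_left)
    xr = np.clip(np.arange(n) + disp_left, 0, n - 1)  # matched right pixel
    return np.abs(disp_right[xr] + disp_left) <= tol
```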
4.10 Real Scanline Example
Every pixel in the left scanline is now marked with either a disparity value or an occlusion label. This process is repeated for every scanline in the left image. Because each scanline is solved independently, however, the resulting disparity map can be noisy, with streaky artifacts across scanlines.
To address this:
• Incorporate smoothness constraints into the stereo matching process.
• Penalize large changes in disparity values between neighboring pixels.
• Use techniques such as regularization or optimization algorithms to enforce smoothness in the disparity
map.
1. Match Quality:
• We want each pixel to find a good match in the other image.
• The disparity value should accurately represent the corresponding point in the other image.
2. Smoothness:
• If two pixels are adjacent, they should usually move about the same amount.
• Disparity values should change smoothly across the image, except at depth discontinuities.
To achieve these objectives, we can formulate the stereo matching problem as an energy minimization
task. The energy function combines terms for match quality and smoothness, and the goal is to find the
disparity map that minimizes this energy. Optimization algorithms like graph cuts, belief propagation, or
variational methods can be used to find the optimal solution.
4.11 Stereo as Energy Minimization
In stereo vision, we can view the matching problem as an energy minimization task, where the goal is to
find the disparity map that minimizes an energy function. The energy function is typically defined as:
E(d) = Ed(d) + λ Es(d)
• E(d) is the total energy of the disparity assignment d.
• Ed (d) is the data term, representing the match quality. It ensures that each pixel finds a good match
in the other image, typically obtained from block matching results.
• Es (d) is the smoothness term, representing the smoothness constraint. It ensures that adjacent pixels
usually move about the same amount, promoting smooth disparity maps.
• λ is a weighting parameter that balances the importance of the data term and the smoothness term.
The task is to find the disparity map d that minimizes this energy function, typically achieved using
optimization algorithms such as graph cuts, belief propagation, or variational methods.
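The energy of a candidate scanline disparity assignment can be evaluated as below. This sketch only scores a solution (using a squared-difference smoothness term, one common choice); the minimization itself is left to graph cuts, belief propagation, or dynamic programming, and the function name is illustrative:

```python
import numpy as np

def scanline_energy(disp, cost, lam=0.1):
    """Evaluate E(d) = sum_i Ed(d_i) + lam * sum_i Es(d_i, d_{i+1})
    for one scanline.

    cost: (n, max_disp) table of matching costs (e.g. block matching
    results), so the data term is cost[i, disp[i]]; the smoothness term
    penalizes squared disparity jumps between neighboring pixels.
    """
    data = cost[np.arange(len(disp)), disp].sum()
    smooth = np.square(np.diff(disp)).sum()
    return data + lam * smooth
```

Lower energy means better agreement with the match costs and a smoother disparity profile; λ trades the two off, exactly as described above.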
• Occlusions: Objects blocking the view can cause missing or incorrect depth information.
• Violations of Brightness Constancy (Specular Reflections): Changes in illumination or specular
reflections can violate the assumption of brightness constancy, leading to errors in matching.
• Large Motions: Rapid movements between frames can cause motion blur and mismatches.
• Low Contrast or Low Resolution: Images with low contrast or low resolution may not provide enough information for reliable matching.
1. Train CNN Patch-wise: Train a convolutional neural network (CNN) on pairs of stereo images
along with their ground truth disparity maps. The CNN is trained to learn features that are useful for
matching corresponding pixels between the left and right images.
2. Calculate Features: Once the CNN is trained, use it to extract features for each pixel in both the
left and right images.
3. Correlate Features: Correlate the features between the left and right images, typically using the dot
product operation.
4. Disparity Estimation:
• Winner Takes All (WTA): For each pixel, find the maximum correlated value, indicating the
best match in the other image. This approach is known as winner takes all.
• Global Optimization: Alternatively, run a global optimization algorithm to refine the disparity
estimation across the entire image, considering contextual information and enforcing consistency
constraints.
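Steps 3 and 4 above can be sketched with plain NumPy; the function name is illustrative, and any per-pixel descriptor stands in here for the learned CNN features:

```python
import numpy as np

def correlate_and_wta(feat_left, feat_right, max_disp=8):
    """Correlate per-pixel features by dot product and take the winner.

    feat_left, feat_right: (H, W, C) feature maps (in the method above,
    the output of the trained CNN). Matching convention as in eq. (13):
    left pixel (v, u) against right pixel (v, u + d).
    """
    H, W, _ = feat_left.shape
    corr = np.full((H, W, max_disp), -np.inf)
    corr[:, :, 0] = (feat_left * feat_right).sum(-1)
    for d in range(1, max_disp):
        corr[:, :-d, d] = (feat_left[:, :-d] * feat_right[:, d:]).sum(-1)
    # Winner-takes-all: per pixel, the disparity with maximum correlation.
    return corr.argmax(axis=-1)
```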
By leveraging deep learning techniques and training on large datasets, Siamese networks can learn complex
features and improve stereo matching accuracy compared to traditional methods.
• DispNet: DispNet was one of the first end-to-end trained deep neural networks for stereo matching.
• Architecture: DispNet employs a U-Net like architecture with skip connections to retain details and
capture multi-scale features effectively.
• Correlation Layer: DispNet utilizes a correlation layer, which computes the similarity between
patches at different displacements, typically with a large displacement range compared to the input
image size.
• Multi-Scale Loss: DispNet employs a multi-scale loss function, which considers disparity errors at
multiple scales in the image pyramid. This helps in capturing disparities accurately across different
levels of detail.
• Curriculum Learning: DispNet incorporates curriculum learning, where the network is trained on
a curriculum of increasingly difficult examples. It starts by learning from easy examples and gradually
progresses to harder ones, which helps in faster convergence and better generalization.
By leveraging deep neural networks like DispNet, stereo matching algorithms can achieve state-of-the-art
performance in terms of accuracy and efficiency.
4.21 Stereo Mixture Density Networks (SMD-Nets)
Stereo Mixture Density Networks (SMD-Nets) are a variant of deep neural networks specifically designed
for stereo matching tasks. One of their key features is the ability to predict sharper boundaries at higher
resolution compared to traditional methods.
SMD-Nets leverage mixture density models to capture complex relationships between stereo image pairs
and their corresponding disparities. By using mixture density models, SMD-Nets can represent multimodal
distributions, allowing them to predict not only the most likely disparity value for each pixel but also the
uncertainty associated with the prediction.
This ability to model uncertainty is particularly useful in stereo matching, where the correspondence
between pixels in stereo image pairs may be ambiguous or uncertain. By predicting sharper boundaries,
SMD-Nets can provide more accurate and reliable depth estimates, especially in challenging scenarios such
as textureless regions, occlusions, or depth discontinuities.
Overall, SMD-Nets represent a promising approach to stereo matching, offering improved accuracy and
robustness by explicitly modeling uncertainty and predicting sharper boundaries.
• Middlebury Stereo Datasets: Middlebury provides a benchmark dataset consisting of stereo image pairs along with ground truth disparity maps. These datasets cover a wide range of scenes and variations in lighting, texture, and occlusions.
• KITTI: The KITTI dataset is widely used for evaluating stereo algorithms in the context of autonomous driving. It includes stereo image pairs captured from vehicles equipped with stereo cameras, along with accurate ground truth annotations for depth and motion.
• Synthetic Data: Synthetic datasets such as FlyingThings3D and Monkaa are generated using computer
graphics rendering techniques. These datasets provide stereo image pairs with accurate ground truth
disparities, allowing researchers to train and evaluate stereo matching algorithms under controlled
conditions and diverse environments.
These datasets play a crucial role in the development and evaluation of stereo matching algorithms, enabling researchers to compare the performance of different methods and assess their generalization capabilities across various real-world scenarios.
5.2 Aligning Range Images
A single range scan may not be sufficient to capture the complete surface of a complex object. Therefore,
techniques are required to register multiple range images obtained from different viewpoints. This process
involves aligning the range images to create a coherent 3D representation of the object. This aligning process
is crucial for further analysis and processing of the captured data.
This alignment of range images leads us to the field of multi-view stereo, where the goal is to reconstruct
the 3D geometry of a scene by combining information from multiple viewpoints.