Motion Analysis

Image processing and computer vision notes

Definitions and Paradigms of 3D Vision

1. Marr's Approach (1982)


o Definition: 3D vision involves deriving a precise three-
dimensional geometric description of a scene from one or
more images.
o Focus: Bottom-up reconstruction of 3D shape in a coordinate
system independent of the viewer.
o Assumptions:
   - Single rigid object.
   - Separation from the background is straightforward.
o Goal: Accurate 3D reconstruction to facilitate tasks like
navigation, parts inspection, and object recognition.
2. Aloimonos and Shulman's Approach (1989)
o Definition: Understanding the object or scene and its 3D
properties from a sequence of images, considering both
moving and stationary scenarios.
o Focus: Understanding is key, which complicates the process if
minimal a priori knowledge is available.
o Complexity Spectrum:
   - Minimal Knowledge: Complex understanding is required.
   - Simple Object Matching: Limited interpretations and straightforward solutions.
3. Wechsler's Approach (1990)
o Definition: Vision as a parallel distributed system solving
minimization problems with natural constraints.
o Focus: Control principle in visual tasks, utilizing distributed
computation and active perception.
o Concept: Perception-control-action cycle.
4. Aloimonos' Questions (1993)
o Empirical Questions: How are existing visual systems
designed?
o Normative Questions: What characteristics should ideal
vision systems have?
o Theoretical Questions: What potential mechanisms could
exist in intelligent visual systems?

System Theory and Computer Vision

5. System Theory Framework (Klir, 1991)


o Objective: Use mathematics to understand complex
phenomena through formal models.
o Components:
   - Feature Observability: Determine whether task-relevant information is present in the images.
   - Representation: Choose a model for interpreting the world at various complexity levels.
   - Interpretation: Map data from mathematical models to the real world.

Approaches to Artificial Vision

1. Reconstruction (Bottom-Up)
o Objective: Reconstruct 3D shapes from intensity or range
images.
o Characteristics:
   - Marr's theory emphasizes minimal a priori knowledge.
   - Practical approaches use range images for creating 3D models.
2. Recognition (Top-Down, Model-Based Vision)
o Objective: Recognize objects using a priori knowledge
expressed through 3D models.
o Characteristics:
   - Utilizes CAD models for recognition.
   - Additional constraints embedded in the models aid in solving under-determined vision tasks.
3. Object Recognition without 3D Models
o Geons Approach (Biederman, 1987): Recognize 3D shapes
directly from 2D drawings using qualitative features called
geons.
o Alignment of 2D Views: Use lines or points in 2D views to
align and recognize objects.
o Image-Based Scene Representations: Store collections of
images with established correspondences instead of 3D
models for scene representation.

Summary of Techniques

- Reconstruction: Bottom-up approach focused on deriving 3D shapes from images.
- Recognition: Top-down approach using pre-existing 3D models or qualitative features for object recognition.
- Alignment: Use of 2D features to align and recognize objects from multiple views.

Overview of Marr’s Theory

1. Three Levels of Understanding Computer Vision:


o Computational Theory: Describes what the system should
do and the logic behind it. This level focuses on defining the
goal and the strategy for achieving it.
o Representation and Algorithm: Details how the
computations are carried out, including the information
representations and algorithms used to process this
information.
o Implementation: Refers to the actual physical realization of
the algorithms, including the software and hardware
components.
2. Importance of Theoretical Understanding:
o Marr emphasizes that addressing the theoretical level is
crucial for understanding and solving vision problems, as
opposed to focusing solely on algorithms or physical
implementations.
o He uses examples like the Necker cube illusion to illustrate
how different levels of theory relate to understanding visual
phenomena.

Stages of Marr’s Visual Processing

1. The Primal Sketch:


o Purpose: Captures significant intensity changes in an image
(like edges) in a general way, without assuming physical
meaning at this stage.
o Method (a minimal code sketch follows this list):
   - Uses a range of blurring filters (e.g., Gaussian filters) to isolate features at different scales.
   - Identifies zero-crossings (points where the smoothed second derivative of intensity changes sign) using a Laplacian operator.
   - Groups these zero-crossings based on their location and orientation to identify potential tokens (edges, bars, blobs) in the image.
o Human Vision Analogy: This stage mirrors human visual
processing in detecting features and grouping them.
2. The 2.5D Sketch:
o Purpose: Provides a depth map representing the relative
distances of surfaces from the viewer. It is a midway stage
between 2D and 3D.
o Characteristics:
   - Reconstructs surface orientations and depths, but provides no information about the "other side" of objects.
   - Uses features detected in the primal sketch and incorporates additional cues (such as lighting or motion) for depth estimation.
o Methods: Employs “shape from X” techniques, where X can be texture, shading, or motion.
3. The 3D Representation:
o Purpose: Transforms the depth map into a full 3D object
representation, independent of the viewer’s perspective.
o Characteristics:
   - Requires knowledge about objects and their descriptions, transitioning to an object-centered coordinate system.
   - Involves identifying and describing objects in a viewer-independent manner.
o Challenges:
   - This stage is complex and less guided by physiological insights than the earlier stages.
   - Emphasizes modular representation, where each object is treated in its own coordinate system rather than in a single global system.
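
To make the primal-sketch stage concrete, here is a minimal, illustrative sketch of zero-crossing detection with a Laplacian-of-Gaussian filter. It assumes NumPy and SciPy are available; the scale values are arbitrary choices, not Marr's.

```python
import numpy as np
from scipy import ndimage

def zero_crossings(image, sigma=2.0):
    # Laplacian of Gaussian: blur at scale sigma, then take the Laplacian.
    log = ndimage.gaussian_laplace(image.astype(float), sigma=sigma)
    # Zero-crossings occur where the LoG response changes sign between
    # horizontally or vertically adjacent pixels.
    sign = log > 0
    edges = np.zeros(image.shape, dtype=bool)
    edges[:, 1:] |= sign[:, 1:] != sign[:, :-1]
    edges[1:, :] |= sign[1:, :] != sign[:-1, :]
    return edges

# Running the detector at several scales isolates features of different
# sizes, as the primal sketch requires:
# maps = [zero_crossings(img, s) for s in (1.0, 2.0, 4.0)]
```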

Additional Insights and Considerations

- Modularity: Marr suggests modular approaches in which the different levels of vision processing (low-level, middle-level) are relatively independent but work together to form a comprehensive understanding of the scene.
- Regularization: For ill-posed tasks where multiple solutions are possible, regularization techniques (e.g., requiring continuity and smoothness) are often used to make the problems well-posed and solvable (see the sketch below).
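
As a toy illustration of regularization (our own construction, not from the source text), the sketch below stabilizes an ill-posed linear system by adding a smoothness penalty; lam is a hypothetical weighting parameter.

```python
import numpy as np

def regularized_solve(A, b, lam=0.1):
    """Minimize ||A x - b||^2 + lam * ||D x||^2 (Tikhonov regularization)."""
    n = A.shape[1]
    # D penalizes differences between neighbouring unknowns, i.e. a
    # discrete continuity/smoothness constraint on the solution.
    D = np.eye(n - 1, n) - np.eye(n - 1, n, k=1)
    # Normal equations of the penalized least-squares problem.
    lhs = A.T @ A + lam * (D.T @ D)
    return np.linalg.solve(lhs, A.T @ b)
```
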
Applications in Computer Vision

- Image Rectification: Homographies can rectify images of planar scenes to a fronto-parallel view (see the usage sketch after this list).
- Panorama Stitching: Homographies relate different images of a scene to create panoramic images.
- 3D Reconstruction: Homographies and projective geometry help in reconstructing 3D scenes from multiple images.
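
For instance, a hypothetical rectification of a planar scene with OpenCV might look like the following; the corner coordinates and file name are made up for illustration.

```python
import cv2
import numpy as np

img = cv2.imread("plane.jpg")  # hypothetical input image

# Four image corners of the planar surface and their desired
# fronto-parallel positions (illustrative values).
src = np.float32([[57, 112], [331, 98], [350, 420], [41, 433]])
dst = np.float32([[0, 0], [300, 0], [300, 400], [0, 400]])

H = cv2.getPerspectiveTransform(src, dst)            # exact 4-point homography
rectified = cv2.warpPerspective(img, H, (300, 400))  # fronto-parallel view
```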

Summary

Projective geometry provides a framework for understanding how 3D scenes are projected onto 2D images and how these projections relate to each other across multiple views. This is crucial in computer vision for tasks like image rectification, panorama stitching, and 3D reconstruction, as it allows perspective distortions and transformations between views to be handled.

Maximum Likelihood Estimation (MLE)

- Method: Solve a nonlinear least squares problem, typically using the Levenberg-Marquardt algorithm. This involves two steps:
  1. Initial Estimate: Compute an initial estimate using a simpler, less optimal method.
  2. Refinement: Refine this estimate using a local optimization algorithm (a sketch follows below).
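
A minimal sketch of the refinement step, assuming SciPy is available and that src and dst are N x 2 arrays of corresponding points; H0 stands for the initial linear estimate.

```python
import numpy as np
from scipy.optimize import least_squares

def transfer_residuals(h, src, dst):
    # Reprojection (transfer) error of the homography parametrized
    # by the 9 entries of h.
    H = h.reshape(3, 3)
    proj = np.hstack([src, np.ones((len(src), 1))]) @ H.T
    proj = proj[:, :2] / proj[:, 2:3]          # dehomogenize
    return (proj - dst).ravel()

def refine_homography(H0, src, dst):
    # Levenberg-Marquardt local optimization around the initial estimate.
    result = least_squares(transfer_residuals, H0.ravel(),
                           args=(src, dst), method="lm")
    H = result.x.reshape(3, 3)
    return H / H[2, 2]                         # remove the scale ambiguity
```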

Linear Estimation

- Objective: Find an initial estimate for H by minimizing the algebraic distance, which can be solved linearly.
- Process:
  1. Formulation: Reformulate the problem using matrices G and S (the cross-product matrix), resulting in a linear system.
  2. Solution: Solve the linear system Wh = 0 using Singular Value Decomposition (SVD) or an eigenvalue decomposition.
- Preconditioning: Normalize the points so that they have similar magnitudes. This step helps in obtaining a more accurate solution by reducing numerical issues. (A sketch of the whole procedure follows.)
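
Putting the pieces together, here is a minimal normalized-DLT sketch in NumPy; points are N x 2 arrays, and the variable names are ours, not from the source.

```python
import numpy as np

def normalize(pts):
    # Precondition: translate to the centroid and scale so the mean
    # distance from the origin is sqrt(2).
    centroid = pts.mean(axis=0)
    scale = np.sqrt(2) / np.linalg.norm(pts - centroid, axis=1).mean()
    T = np.array([[scale, 0, -scale * centroid[0]],
                  [0, scale, -scale * centroid[1]],
                  [0, 0, 1]])
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    return (T @ homog.T).T, T

def dlt_homography(src, dst):
    src_n, Ts = normalize(src)
    dst_n, Td = normalize(dst)
    rows = []
    for (x, y, _), (u, v, _) in zip(src_n, dst_n):
        # Two equations per correspondence from the cross product
        # (u, v, 1) x (H (x, y, 1)^T) = 0.
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y, -u])
    W = np.asarray(rows)
    # h is the right singular vector with the smallest singular value.
    _, _, Vt = np.linalg.svd(W)
    Hn = Vt[-1].reshape(3, 3)
    # Undo the normalization to express H in the original coordinates.
    H = np.linalg.inv(Td) @ Hn @ Ts
    return H / H[2, 2]
```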

Robust Estimation

- Objective: Handle cases where the correspondences may contain outliers or gross errors.
- Method: Use algorithms like RANSAC (Random Sample Consensus) to robustly estimate H by iteratively finding the model that best fits the majority of the data while excluding outliers (see the usage sketch below).
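
With OpenCV, the whole robust pipeline is available in one call. The dummy matches below stand in for putative correspondences from feature matching.

```python
import cv2
import numpy as np

# Dummy putative matches: random points and a shifted copy, i.e. an
# exact translation homography (illustrative data only).
rng = np.random.default_rng(0)
src_pts = rng.uniform(0, 640, (50, 1, 2)).astype(np.float32)
dst_pts = src_pts + 5

# RANSAC repeatedly fits H to random minimal samples of 4 matches and
# keeps the model with the most inliers under the pixel threshold.
H, inlier_mask = cv2.findHomography(src_pts, dst_pts,
                                    method=cv2.RANSAC,
                                    ransacReprojThreshold=3.0)
```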

Summary

1. MLE provides the most statistically optimal estimate but is computationally intensive due to its nonlinear nature.
2. Linear Estimation provides a quick initial guess and is useful for getting close to the optimal solution.
3. Robust Estimation is essential when dealing with noisy or erroneous correspondences, to ensure accurate results.

These methods are not only applicable to homography estimation but also
extend to other problems in 3D computer vision, such as camera
calibration, triangulation, and fundamental matrix estimation.

Motion Analysis

Differential Motion Analysis Methods


1. Basic Principle:
o Motion detection using differential analysis involves subtracting consecutive images (f1 and f2) to create a difference image d(i, j).
o The difference image highlights areas where significant changes in pixel values occur, indicating motion.
2. Sources of Difference Image Values:
o Moving Object vs. Static Background: Difference arises when a
moving object is compared to a static background.
o Moving Object vs. Another Moving Object: Difference occurs
when a moving object is compared to another moving object.
o Different Parts of the Same Object: Differences can arise if
different parts of the same moving object are compared.
o Noise and System Errors: Differences can also result from noise
or inaccuracies in camera positioning.
o If a static reference image is not available, it can be constructed by superimposing moving objects onto a static background, or interactively.
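
A minimal frame-differencing sketch of the basic principle above, assuming OpenCV is available; the file names and the noise threshold are illustrative.

```python
import cv2

f1 = cv2.imread("frame1.png", cv2.IMREAD_GRAYSCALE)  # hypothetical frames
f2 = cv2.imread("frame2.png", cv2.IMREAD_GRAYSCALE)

diff = cv2.absdiff(f1, f2)
# d(i, j) = 1 where the intensity change exceeds the noise threshold.
_, d = cv2.threshold(diff, 25, 1, cv2.THRESH_BINARY)
```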

Difference Image Calculation (in notes)

Moving Edges Detectionin notes):

3. Motion Trajectories: Motion analysis typically involves determining motion trajectories, often simplified by segmenting objects from the first image.

4. Limitations and Issues:


o Differential motion analysis might not reveal motion direction and
may struggle with slow motion or small objects.
o Problems like the aperture problem (ambiguous motion
information) and the detection of only part of an object boundary
can affect accuracy.
5. Practical Application:
o Differential motion analysis is often used in digital subtraction
angiography to estimate vessel motion.

Optical Flow and Motion Analysis


Optical flow describes the pattern of apparent motion of objects, surfaces,
and edges in a visual scene, caused by the relative motion between the
observer and the scene. It helps to interpret various types of motion
without necessarily providing quantitative parameters.
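
As a sketch of how such a flow field is obtained in practice, here is dense optical flow with OpenCV's Farneback method, which returns a per-pixel (u, v) displacement field; the file names and parameter values are illustrative, not tuned.

```python
import cv2

prev = cv2.imread("frame1.png", cv2.IMREAD_GRAYSCALE)  # hypothetical frames
curr = cv2.imread("frame2.png", cv2.IMREAD_GRAYSCALE)

# Positional arguments: flow=None, pyr_scale=0.5, levels=3, winsize=15,
# iterations=3, poly_n=5, poly_sigma=1.2, flags=0.
flow = cv2.calcOpticalFlowFarneback(prev, curr, None,
                                    0.5, 3, 15, 3, 5, 1.2, 0)
u, v = flow[..., 0], flow[..., 1]  # motion vector components per pixel
```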

Basic Motion Elements

1. Translation at Constant Distance: Represented by parallel motion vectors.
2. Translation in Depth: Creates a set of vectors converging at a
common focus of expansion (FOE).
3. Rotation at Constant Distance: Results in concentric motion
vectors.
4. Rotation Perpendicular to the View Axis: Forms vectors
originating from straight line segments.

Focus of Expansion (FOE)

The FOE is the point where the flow vectors appear to converge:

- Translation at Constant Distance: The FOE is at infinity.
- Translation in Depth: The FOE is a single point where the vectors converge.
- Multiple Moving Objects: Each object has its own FOE.
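
As a sketch of how the FOE can be recovered from measured flow (our construction, not from the source): each flow vector defines a line through the FOE, and their common intersection can be found by least squares.

```python
import numpy as np

def estimate_foe(xs, ys, us, vs):
    """Least-squares intersection of the flow lines.

    xs, ys: pixel coordinates; us, vs: flow components at those pixels.
    The line through (x, y) with direction (u, v) satisfies
    -v * (X - x) + u * (Y - y) = 0 for any point (X, Y) on it.
    """
    A = np.stack([-vs, us], axis=1)
    b = -vs * xs + us * ys
    foe, *_ = np.linalg.lstsq(A, b, rcond=None)
    return foe  # (X_foe, Y_foe) in image coordinates
```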

Calculating Mutual Velocity

The mutual velocity (cx, cy, cz) of an observer and an object can be found
using optical flow. For a point in the image, its position changes over time
based on velocity components (u, v, w), and the FOE location can be
derived from these equations.
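
To make this concrete (a standard derivation consistent with the above, assuming unit focal length): a scene point (x0, y0, z0) translating with velocity (u, v, w) projects at time t to

x'(t) = (x0 + u t) / (z0 + w t),   y'(t) = (y0 + v t) / (z0 + w t),

and as t grows this image point tends to (u/w, v/w), which is exactly the FOE. The observable flow therefore constrains the direction of the mutual velocity.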

Distance (Depth) Determination

Optical flow can be used to estimate the distance of a moving object from the observer:

- Distance Calculation: If the velocity w of an object and one known distance z1 are known, the distance z2 of another point can be calculated from the ratio of their image velocities and distances (see the relation below).
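
One way to state this relation, from the FOE geometry above (our phrasing, assuming pure translation in depth): if D is a point's image distance from the FOE and Ddot its rate of change, then

z / w = D / Ddot,

so z = w * D / Ddot, and for two points sharing the same w, z2 = z1 * (D2 / Ddot2) / (D1 / Ddot1). A single known depth z1 therefore calibrates all the others.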

Collision Prediction

For robots and autonomous systems, optical flow helps in predicting potential collisions by:

- Determining the path of the observer (robot) relative to objects.
- Calculating the smallest distance of approach, to avoid collisions.
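
Building on the depth relation above, a minimal time-to-collision sketch (our own construction; foe, pts, and flow are assumed inputs, e.g. from the earlier sketches):

```python
import numpy as np

def time_to_collision(foe, pts, flow, dt=1.0):
    # tau = D / Ddot: image distance from the FOE divided by its rate
    # of change. Small tau means the point is approaching quickly.
    rel = pts - foe                                  # offsets from the FOE
    D = np.linalg.norm(rel, axis=1)
    # Radial expansion rate: flow component pointing away from the FOE.
    Ddot = np.sum(flow * (rel / D[:, None]), axis=1) / dt
    return D / Ddot                                  # in time units of dt
```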

Practical Applications

- Obstacle Detection: Using optical flow to identify and avoid obstacles.
- Ego-Motion Estimation: Determining the motion of the observer from multiple camera views.
- Time to Collision: Estimating when a collision might occur using optical flow data.

Additional References

- Motion interpretation and range computation methods.
- Techniques for ego-motion estimation from multiple cameras.
- Obstacle detection and time-to-collision calculations using optical flow.
