IVP notes

Image segmentation is a crucial computer vision technique that partitions digital images into meaningful segments for easier analysis. Various methods exist, including thresholding, edge detection, region growing, clustering, and deep learning approaches such as CNNs. Object detection, another key task, identifies and locates objects within images, combining classification and localization to support applications in areas such as autonomous vehicles and healthcare.

Image segmentation :- Image segmentation is a fundamental technique in
computer vision that involves dividing a digital image into its meaningful
parts. More generally, it is the process of partitioning an image or video
into distinct regions, each of which represents a particular object or class
of objects. It is like breaking a digital image down into smaller pieces, such
as objects or groups of pixels.

Types of image segmentation :-

Thresholding-Based Segmentation :- Divides an image into foreground and
background based on a threshold value. The threshold can be global (a single
threshold for the entire image) or adaptive (different thresholds for
different regions).

Edge-Based Segmentation :- Identifies edges within an image to form the
boundaries of segments. Common edge detectors include Sobel, Canny, and
Prewitt.

Region-Based Segmentation :- Region growing starts from seed points and grows
regions by including neighboring pixels that have similar properties, such as
intensity or color.

Clustering-Based Segmentation :- K-means clustering partitions the image into
K clusters based on pixel values, grouping similar pixels together.

Deep Learning-Based Segmentation :- Convolutional Neural Networks (CNNs) and
variants such as Fully Convolutional Networks (FCNs), U-Net, and Mask R-CNN
perform pixel-wise segmentation. These methods leverage large datasets and
powerful computational resources to achieve high accuracy.
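A minimal code sketch of the thresholding-based and clustering-based approaches above, using OpenCV and NumPy ("image.jpg" is a placeholder path, and K = 3 is an illustrative choice):

```python
import cv2
import numpy as np

img = cv2.imread("image.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Thresholding-based segmentation: a fixed global threshold and Otsu's automatic one.
_, global_mask = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)
_, otsu_mask = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# Clustering-based segmentation: K-means on pixel colors, then recolor each pixel
# with the center of its cluster.
pixels = img.reshape(-1, 3).astype(np.float32)
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0)
_, labels, centers = cv2.kmeans(pixels, 3, None, criteria, 10, cv2.KMEANS_RANDOM_CENTERS)
segmented = centers[labels.flatten()].astype(np.uint8).reshape(img.shape)
```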

Here’s a breakdown of what image segmentation is and what it does:

● Goal: Simplify and analyze images by separating them into different
segments. This makes it easier for computers to understand the content of
the image.

● Process: Assigns a label to each pixel in the image. Pixels with the same
label share certain properties, like color or brightness.

● Benefits:

○ Enables object detection and recognition in images.

○ Allows for more detailed analysis of specific image regions.

○ Simplifies image processing tasks.

Object Detection :- Object detection is a key task in computer vision that
involves identifying and locating objects within an image or video. It combines
aspects of image classification and object localization. The primary goals of
object detection are to classify each detected object and to draw a bounding
box around each detected instance.
Key Concepts in Object Detection:

1. Classification: Determining what objects are present in an image.


2. Localization: Determining where those objects are in the image, usually
represented with bounding boxes.

How Object Detection Works:

1. Image Input: The process begins with an input image or frame from a
video.
2. Feature Extraction: The image is analyzed to extract features using
techniques like convolutional neural networks (CNNs). These features
help in identifying parts of the image that might contain an object.
3. Region Proposals: Potential regions where objects might be located
are proposed. Methods such as Selective Search, Region Proposal
Networks (RPNs), or sliding windows can be used.
4. Classification and Bounding Box Regression: Each proposed
region is classified to determine if it contains an object and which class
the object belongs to. Simultaneously, the bounding box of the object is
refined to more accurately encompass the object.
5. Post-processing: Techniques like Non-Maximum Suppression (NMS)
are used to remove redundant bounding boxes and keep the best ones,
ensuring that each object is detected only once.
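A minimal sketch of the Non-Maximum Suppression step, written in plain NumPy for illustration (boxes are assumed to be in [x1, y1, x2, y2] format; the IoU threshold of 0.5 is a typical but arbitrary choice):

```python
import numpy as np

def non_max_suppression(boxes, scores, iou_thresh=0.5):
    """Keep the highest-scoring boxes, discarding boxes that overlap them too much."""
    order = scores.argsort()[::-1]          # indices sorted by descending confidence
    keep = []
    while order.size > 0:
        i = order[0]                        # best remaining box
        keep.append(i)
        # Intersection of box i with all remaining boxes
        xx1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
        yy1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
        xx2 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
        yy2 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
        inter = np.maximum(0, xx2 - xx1) * np.maximum(0, yy2 - yy1)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        areas = (boxes[order[1:], 2] - boxes[order[1:], 0]) * \
                (boxes[order[1:], 3] - boxes[order[1:], 1])
        iou = inter / (area_i + areas - inter)
        # Drop boxes whose overlap with box i exceeds the threshold
        order = order[1:][iou < iou_thresh]
    return keep
```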
Popular Object Detection Algorithms:

1. R-CNN (Regions with Convolutional Neural Networks):


○ R-CNN: Generates region proposals and uses CNN to classify
these regions.
○ Fast R-CNN: Improves on R-CNN by computing convolutional features
once for the whole image and sharing them across region proposals.
○ Faster R-CNN: Introduces Region Proposal Networks (RPN) to
generate proposals, making it end-to-end trainable.
2. YOLO (You Only Look Once):
○ A single CNN predicts bounding boxes and class probabilities for
multiple objects in one pass. It’s known for its real-time detection
capability.
3. SSD (Single Shot MultiBox Detector):
○ Like YOLO, it detects objects in a single shot using a CNN but with
multiple feature maps to handle objects of different sizes.
4. RetinaNet:
○ Uses a feature pyramid network (FPN) and introduces a novel loss
function called Focal Loss to address the class imbalance issue
during training.
5. Mask R-CNN:
○ Extends Faster R-CNN by adding a branch for predicting
segmentation masks for each Region of Interest (RoI) along with
class and bounding box prediction.
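A hedged usage sketch: running a pretrained Faster R-CNN from torchvision on a single image. The weights argument and the "image.jpg" path are assumptions, and the exact arguments vary between torchvision versions:

```python
import torch
import torchvision
from torchvision.io import read_image
from torchvision.transforms.functional import convert_image_dtype

model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

img = convert_image_dtype(read_image("image.jpg"), torch.float)  # CxHxW in [0, 1]
with torch.no_grad():
    predictions = model([img])           # the model expects a list of images

# Each prediction contains bounding boxes (x1, y1, x2, y2), class labels, and scores.
print(predictions[0]["boxes"], predictions[0]["labels"], predictions[0]["scores"])
```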
Applications of Object Detection:

● Autonomous Vehicles: Detecting pedestrians, vehicles, traffic signs, and
obstacles.
● Surveillance: Identifying suspicious activities and intruders.
● Retail: Inventory management and customer behavior analysis.
● Healthcare: Analyzing medical images to detect diseases or anomalies.
● Robotics: Enabling robots to recognize and interact with objects in their
environment.

Object detection is crucial for many AI applications, providing the foundation
for systems that can understand and interpret visual data similarly to human
vision.

what is "breaking image into parts" in image and video


processing ? : →Breaking image into parts" in image and video
processing refers to the technique of dividing an image into smaller
segments or regions. This is a crucial step in various computer vision
and image analysis tasks.

Techniques and Algorithms are used.

1. K-means Clustering: Used for color-based segmentation by clustering pixel
colors.
2. Graph-based Segmentation: Uses graph theory to model the relationships
between pixels or regions.
3. Watershed Algorithm: Treats the image like a topographic surface and
finds the lines that represent the boundaries of objects.
4. Convolutional Neural Networks (CNNs): For tasks like semantic and
instance segmentation, using architectures like U-Net, Mask R-CNN, etc.
1. Image Segmentation:

● Definition: Image segmentation is the process of partitioning an image
into multiple segments (sets of pixels, also known as image objects) to
simplify the representation of the image and make it more meaningful
and easier to analyze.
● Types:
○ Semantic Segmentation: Assigns a label to each pixel in the image
based on the category of the object it belongs to (e.g., sky, road,
car).
○ Instance Segmentation: Like semantic segmentation, but also
distinguishes between different instances of the same object
category.
○ Panoptic Segmentation: Combines semantic and instance
segmentation, providing a unified view.

2. Sliding Window:

● Definition: A technique where a fixed-size window is moved across the
image to capture different parts or patches of the image. Each patch is
then processed individually.
● Application: Often used in object detection to find objects within
different regions of an image by classifying each patch (see the sketch
after this list).
3. Superpixel Segmentation:

● Definition: Divides an image into clusters of pixels with similar
characteristics, known as superpixels.
● Purpose: Reduces the number of elements to be processed and makes
subsequent tasks like object detection and image segmentation more
efficient.

4. Region Proposal Methods:

● Selective Search: Generates potential object locations (regions) by
combining superpixels based on various criteria like color, texture, size,
and shape compatibility.
● Region Proposal Networks (RPN): Neural networks used in frameworks
like Faster R-CNN to propose candidate object regions in an image.

5. Grid-Based Division:

● Definition: Divides the image into a grid of smaller, equally-sized
patches.
● Application: Common in convolutional neural networks (CNNs) where
feature maps of different layers represent grid-like divisions of the
original image.
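A minimal sliding-window sketch (referenced in point 2 above): step a fixed-size window across an image and yield each patch for downstream classification. The window size and stride below are illustrative values:

```python
import numpy as np

def sliding_windows(image, window=(64, 64), stride=32):
    h, w = image.shape[:2]
    for y in range(0, h - window[1] + 1, stride):
        for x in range(0, w - window[0] + 1, stride):
            # (x, y) is the top-left corner of the current patch
            yield x, y, image[y:y + window[1], x:x + window[0]]

# Example: a 256x256 grayscale image yields 7 x 7 = 49 patches with these settings.
patches = list(sliding_windows(np.zeros((256, 256), dtype=np.uint8)))
print(len(patches))  # 49
```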
Digital Pixel :- A pixel is the smallest unit of a digital image or graphic that can
be displayed and represented on a digital display. They serve as the building
blocks of digital images, used to display everything from text to intricate graphics
and photos.
Every digital image consists of pixels, with each pixel representing a single point
of color – a dot or square – on a display screen. The density of pixels per inch
(PPI) determines the resolution; higher PPI results in a sharper image.

Pixels are arranged in a uniform two-dimensional grid, although different devices
use various sampling patterns. For example, liquid crystal display (LCD) screens
sample the three primary colors across a staggered grid, while digital cameras use
a more regular grid pattern.

Image representation in computer vision refers to the process of converting
an image into a numerical or symbolic form that can be easily understood and
processed by a computer. Images are typically represented as a collection of
pixels, where each pixel corresponds to a specific color or intensity value. The
goal of image representation is to extract relevant features and information from
the image, enabling the computer to perform various tasks, such as object
recognition, image classification, and image segmentation.

Techniques

There are several common techniques for image representation in computer
vision:
Pixel-based representations:

● Grayscale: This represents an image using a single channel, where each pixel
holds a brightness value (often from 0 for black to 255 for white).
● RGB (Red, Green, Blue): This is the standard format for colored images, where
each pixel stores intensity values for red, green, and blue channels.
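A minimal sketch of these pixel-based representations as NumPy arrays (synthetic data, no image file needed): a grayscale image is a 2-D array, and an RGB image is a 3-D array with one value per channel:

```python
import numpy as np

gray = np.zeros((4, 4), dtype=np.uint8)        # 4x4 grayscale image, all black (0)
gray[1, 2] = 255                               # one white pixel at row 1, column 2

rgb = np.zeros((4, 4, 3), dtype=np.uint8)      # 4x4 color image, 3 channels per pixel
rgb[0, 0] = [255, 0, 0]                        # top-left pixel set to pure red

print(gray.shape, rgb.shape)                   # (4, 4) and (4, 4, 3)
```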
Global descriptors: These capture the overall properties of an image in a single
feature vector. They are useful for image classification tasks. Some examples include:

● Bag-of-words models: Treat image features like words and represent the image
by their frequency.
● Histogram of Oriented Gradients (HOG): Captures the distribution of local
gradients in an image.

Deep learning-based representations: Convolutional Neural Networks (CNNs)
have become a dominant force in image representation. They automatically learn
hierarchical features from images, allowing them to capture complex patterns that
are highly effective for various vision tasks.

Common color models utilized in image representation include RGB (Red, Green, Blue),
HSV (Hue, Saturation, Value/Brightness), and CMYK (Cyan, Magenta, Yellow, Black).
Different multimedia applications might prefer one model over another based on their
distinct requirements.

For instance, RGB model is typically used in computer graphics while CMYK model is
largely used for printing purposes.

● RGB: Red (0-255), Green (0-255), Blue (0-255) - Used in computer screens
● CMYK: Cyan (0-100%), Magenta (0-100%), Yellow (0-100%), Black (0-100%) -
Used in professional printing
● HSV: Hue (0-360), Saturation (0-100%), Value/Brightness (0-100%) - Often used
in image editing and color-based analysis
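A hedged sketch of converting between color models with OpenCV ("photo.jpg" is a placeholder path). Note that OpenCV loads color images in BGR channel order, represents hue as 0-179 for 8-bit images, and has no built-in CMYK conversion, so only RGB/BGR and HSV are shown:

```python
import cv2

bgr = cv2.imread("photo.jpg")                  # BGR image, shape (H, W, 3)
rgb = cv2.cvtColor(bgr, cv2.COLOR_BGR2RGB)     # reorder channels to RGB
hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)     # hue 0-179, saturation/value 0-255
```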

Computer vision is a subfield of artificial intelligence that deals with the
processing, understanding, and interpretation of digital images and videos. It involves
the use of algorithms and techniques to analyze visual data, extract features, and
recognize patterns in the images.
Some common applications of computer vision include object detection and recognition,
facial recognition, image segmentation, and tracking.
Core Functionalities:

● Image Processing: Techniques for preprocessing, filtering, and enhancing
images to prepare them for further analysis.
● Feature Extraction: Identifying and extracting key characteristics (features)
from images, such as edges, shapes, colors, and textures. These features serve
as the building blocks for higher-level tasks.
● Object Recognition: Classifying objects within images, determining their
presence and potentially their specific type (e.g., identifying a dog in a photo).
● Image Segmentation: Dividing an image into meaningful regions, often
corresponding to individual objects or distinct parts of a scene.
● Scene Understanding: Interpreting the overall context of an image,
encompassing the relationships between objects, actions, and the
environment.

Key Technologies:

● Machine Learning (ML): Machine learning, especially deep learning with
Convolutional Neural Networks (CNNs), plays a pivotal role in training
computer vision models to recognize patterns and features.
● Image Processing Libraries: Libraries like OpenCV (Open Source Computer
Vision Library) provide tools for image processing, feature extraction, and
other essential tasks.

Real-World Applications:

Computer vision permeates our daily lives in numerous ways, including:

● Self-driving Cars: Recognizing objects, pedestrians, traffic signals, and road
markings to navigate safely.
● Facial Recognition: Used in security systems, social media platforms, and
smartphone unlocking.
● Medical Imaging: Assisting doctors in diagnosing diseases by analyzing
medical scans (X-rays, CT scans, etc.).
● Augmented Reality (AR): Superimposing computer-generated content onto
the real world for enhanced experiences (e.g., Pokémon Go).
● Robotics: Enabling robots to interact with their environment by perceiving
objects and obstacles.
In computer vision, the numerical representation of images is crucial for
computers to process and understand visual data. It essentially boils down to
converting the visual information we see into a format that computers can work
with, which is primarily numbers. Here's a breakdown of the key concepts:

Pixels and Color Channels:

● An image is made up of tiny squares called pixels. Each pixel represents a
single point in the image and holds the numerical information for its color.
● For colored images, the most common representation is the RGB (Red, Green,
Blue) color model. Each pixel stores three values: one for the intensity of red,
green, and blue light. These values typically range from 0 (black) to 255
(brightest white).
● Grayscale images use a single value per pixel, representing the intensity of
light (ranging from 0 for black to 255 for white).

Storing and Representing Image Data:

● Images are typically stored in digital files using formats like JPEG, PNG, or
BMP. These formats encode the pixel values and additional information like
image dimensions and color space.
● There are also specialized image processing libraries like OpenCV that provide
various numerical representations of images suitable for different computer
vision algorithms.

The image augmentation technique is a great way to expand the size of
your dataset: you can create new, transformed images from your original
dataset. Many people take the conservative approach of augmenting images
and storing them in a NumPy array or in a folder; I used to do this too,
until I stumbled upon the Keras ImageDataGenerator class.
Image augmentation is a technique of applying different transformations to
original images, which results in multiple transformed copies of the same
image. Each copy, however, differs from the others in certain aspects
depending on the augmentation techniques applied. Applying these small
variations to the original image does not change its target class but only
provides a new perspective on capturing the object in real life, which is why
it is used so often when building deep learning models.
Key Techniques in Image Augmentation:

1. Geometric Transformations:
○ Rotation: Rotating the image by a certain angle (e.g., -30 to +30 degrees)
to simulate different orientations.
○ Translation: Shifting the image horizontally or vertically to change its
position.
○ Scaling: Resizing the image to make objects appear larger or smaller.
○ Shearing: Applying a shear transformation to slant the image along the x
or y axis.
○ Flipping: Flipping the image horizontally or vertically.

2. Color Space Transformations:


○ Brightness Adjustment: Increasing or decreasing the brightness of the
image.
○ Contrast Adjustment: Altering the contrast to make the image appear
more or less sharp.
○ Saturation Adjustment: Changing the intensity of the colors.
○ Hue Adjustment: Shifting the hue values to modify the colors in the image.

3. Noise Injection:
○ Gaussian Noise: Adding random noise to the image to simulate variations
and imperfections.
○ Salt-and-Pepper Noise: Randomly setting some pixels to maximum and
minimum values.

4. Cropping and Padding:


○ Random Cropping: Cropping a random portion of the image and resizing
it to the original dimensions.
○ Padding: Adding borders around the image to simulate different sizes and
contexts.
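A minimal sketch of several of the augmentations above using the Keras ImageDataGenerator class mentioned earlier. The import path and the parameter values are illustrative and may differ between TensorFlow/Keras versions:

```python
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(
    rotation_range=30,            # random rotation in [-30, +30] degrees
    width_shift_range=0.1,        # horizontal translation (fraction of width)
    height_shift_range=0.1,       # vertical translation (fraction of height)
    shear_range=0.2,              # shear transformation
    zoom_range=0.2,               # scaling in/out
    horizontal_flip=True,         # random horizontal flips
    brightness_range=(0.8, 1.2),  # brightness adjustment
)

# Generate augmented copies of a single dummy RGB image (batch of 1, 64x64 pixels).
batch = np.random.randint(0, 256, size=(1, 64, 64, 3)).astype("float32")
augmented = next(datagen.flow(batch, batch_size=1))
```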

Image augmentation works by applying random transformations to existing images,
creating new variations that the model can learn from. This helps to address several
challenges in deep learning:
Overfitting: When a model is trained on a limited dataset, it can become overly reliant
on specific features in those images and perform poorly on unseen data. Image
augmentation helps prevent this by introducing variations that the model must learn to
generalize from.

Limited Data: Collecting large, high-quality datasets can be expensive and
time-consuming. Image augmentation allows you to effectively leverage a smaller
dataset by creating more training examples.

Benefits of Image Augmentation:

1. Increased dataset size: Augmentation can significantly increase the size of the
training dataset, making it more representative of the real-world data.
2. Improved model robustness: By exposing the model to a wide range of image
variations, augmentation helps improve its robustness to different conditions and
scenarios.
3. Reduced overfitting: Augmentation can reduce overfitting by making it more
difficult for the model to memorize the training data.
4. Better generalization: Augmentation helps the model generalize better to new,
unseen data by making it more adaptable to different image styles and conditions.

Tools and Libraries for Image Augmentation:

1. TensorFlow: tf.image
2. PyTorch: torchvision.transforms
3. OpenCV: cv2.rotate() and cv2.flip()
Image enhancement is the process of digitally manipulating a stored image
using software. The tools used for image enhancement include many different
kinds of software such as filters, image editors and other tools for changing
various properties of an entire image or parts of an image.
Image enhancement is a process that involves improving the quality
and appearance of an image by modifying its color, contrast,
brightness, and other features. The goal is to make the image more
clear and visible, or to highlight certain features.

Image Enhancement Techniques:

1. Histogram equalization: Adjusting the histogram of the image to
improve its contrast and brightness.

Histogram equalization is a technique used to improve the contrast of an image by
adjusting the intensity distribution (brightness levels) of the pixels. The goal is to flatten
the histogram of the image, making it more uniform, which can help to:

● Improve the visibility of details in both bright and dark regions


● Enhance the overall contrast of the image
● Reduce the effect of noise and artifacts

The process involves:

● Calculating the histogram of the image


● Computing the cumulative distribution function (CDF) of the histogram
● Mapping the original pixel values to new values based on the CDF
● Applying the new pixel values to the image

Example:

Suppose we have an image with a histogram that is skewed towards the darker end of
the intensity range. After applying histogram equalization, the histogram becomes more
uniform, and the image appears more balanced, with improved contrast and visibility of
details.
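A minimal sketch of these steps with OpenCV: equalize a grayscale image directly, or equalize only the V (brightness) channel of a color image so that hues are preserved ("dark_photo.jpg" is a placeholder path):

```python
import cv2

gray = cv2.imread("dark_photo.jpg", cv2.IMREAD_GRAYSCALE)
equalized = cv2.equalizeHist(gray)             # remaps intensities using the CDF

# For color images, equalize only the V channel in HSV space.
bgr = cv2.imread("dark_photo.jpg")
hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)
hsv[:, :, 2] = cv2.equalizeHist(hsv[:, :, 2])
result = cv2.cvtColor(hsv, cv2.COLOR_HSV2BGR)
```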
2. Contrast stretching: Stretching the contrast of the image to make it
more visually appealing.

Contrast stretching, also known as normalization, improves the contrast of an
image by stretching the range of its pixel intensity values. The goal is to make the
brightest pixels brighter and the darkest pixels darker, which can help to:

● Enhance the overall contrast of the image


● Improve the visibility of details in both bright and dark regions
● Reduce the effect of noise and artifacts

The process involves:

● Identifying the minimum and maximum pixel values in the image


● Stretching the range of pixel values to occupy the entire intensity range
● Applying the new pixel values to the image

Example:

Suppose we have an image with a limited contrast range, where the brightest pixels are
not very bright and the darkest pixels are not very dark. After applying contrast
stretching, the image appears more vivid, with improved contrast and visibility of
details.
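A minimal contrast-stretching (min-max normalization) sketch in NumPy, following the steps above: map the darkest pixel to 0 and the brightest to 255 (assumes the image is not perfectly flat, i.e. max > min):

```python
import numpy as np

def stretch_contrast(gray):
    lo, hi = float(gray.min()), float(gray.max())
    stretched = (gray.astype(np.float32) - lo) / (hi - lo) * 255.0
    return stretched.astype(np.uint8)

# Example: an image whose values only span 100-150 gets stretched to the full 0-255 range.
narrow = np.random.randint(100, 151, size=(8, 8), dtype=np.uint8)
out = stretch_contrast(narrow)
print(out.min(), out.max())  # 0 255
```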

3. Gamma correction: Adjusting the gamma value of the image to improve
its brightness and contrast.

(Gamma value :- describes the relationship between a color value and its
brightness on a particular device.) Gamma correction is a nonlinear operation
used to adjust the brightness of an image. It helps correct the nonlinear
intensity response of display systems and enhances the perceptual quality of an
image.
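A minimal gamma-correction sketch in NumPy: normalize the pixel values to [0, 1], raise them to the power 1/gamma, and rescale. With this convention, gamma > 1 brightens the image and gamma < 1 darkens it; the value 2.2 below is only an illustrative choice:

```python
import numpy as np

def gamma_correct(image, gamma=2.2):
    normalized = image.astype(np.float32) / 255.0
    corrected = np.power(normalized, 1.0 / gamma)   # nonlinear remapping
    return (corrected * 255.0).astype(np.uint8)
```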
4. Filtering: Applying filters such as Gaussian filters, median filters, and
Wiener filters to remove noise and improve image quality.

Filtering is a technique used to remove noise and artifacts from an image by applying a
mathematical operation to the pixel values. The goal is to:

● Reduce the effect of noise and artifacts


● Improve the overall quality and clarity of the image
● Enhance the visibility of details

There are several types of filters, including:

● Low-pass filters: Remove high-frequency noise and preserve low-frequency
details
● High-pass filters: Remove low-frequency noise and preserve high-frequency
details
● Band-pass filters: Remove noise in a specific frequency range and preserve
details in that range
● Median filters: Remove noise by replacing each pixel value with the median value
of neighboring pixels
● Gaussian filters: Remove noise by applying a Gaussian distribution to the pixel
values

Example:

Suppose we have an image with salt and pepper noise. After applying a median filter,
the noise is reduced, and the image appears smoother and more detailed.
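A minimal denoising sketch with OpenCV, matching the example above: a median filter for salt-and-pepper noise and a Gaussian filter for general smoothing ("noisy.jpg" is a placeholder path; the 5x5 kernel sizes are illustrative):

```python
import cv2

noisy = cv2.imread("noisy.jpg", cv2.IMREAD_GRAYSCALE)
median_filtered = cv2.medianBlur(noisy, 5)               # 5x5 median filter
gaussian_filtered = cv2.GaussianBlur(noisy, (5, 5), 0)   # 5x5 Gaussian, sigma chosen automatically
```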
Types of Image Enhancement:

1. Contrast enhancement: Adjusting the contrast between different regions of the
image to make it more visually appealing.
2. Brightness adjustment: Adjusting the overall brightness of the image to improve
its visibility.
3. Color correction: Adjusting the color balance and saturation of the image to
improve its color accuracy and appeal.
4. Noise reduction: Removing random fluctuations in the image to improve its
clarity and smoothness.
5. Sharpening: Enhancing the edges and details of the image to improve its overall
clarity.

In computer vision, a contour is a digital representation of an object's outline.
It can be described as the series of connected points that define the boundary
of an object, separating and/or highlighting it from the background. These
points tend to share similar color or intensity values, making them distinct
from their surroundings.
Edge Detection:
- Edge detection is a fundamental concept in image processing that focuses on
identifying points in an image where the brightness changes sharply or
discontinuously. These points often correspond to boundaries between different
objects or regions in the image.
- Edge detection algorithms typically work by applying filters or operators to the
image that highlight areas of high gradient or intensity variation. Popular edge
detection techniques include the Sobel operator, Canny edge detector, and
Laplacian of Gaussian.
- Edge detection is more concerned with local changes in intensity and is
primarily used to find the boundaries of objects or regions within an image.

Contour Detection:
- Contour detection is a higher-level concept that involves identifying and
extracting the boundaries of objects or regions in an image. Contours are
continuous curves that outline the boundaries of objects in an image.
- Contour detection algorithms aim to identify these continuous curves that
represent the boundaries of objects based on color, intensity, texture, or other
visual cues.
- Contour detection can involve more complex algorithms than edge detection
and may include additional processing steps such as noise reduction, smoothing,
or curve fitting.
- Contour detection is often used in tasks such as object recognition, shape
analysis, and image segmentation.

In summary, while edge detection is focused on identifying local changes in intensity
that correspond to object boundaries, contour detection is concerned with extracting
continuous curves that represent the boundaries of objects or regions in an image.
Edge detection is a fundamental step in contour detection, as contours are often
composed of edges. Both techniques are essential tools in image processing and
computer vision for extracting meaningful information from images.
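A minimal sketch tying the two ideas together with OpenCV: detect edges with the Canny detector, then extract contours (continuous boundary curves) from a thresholded binary image. The threshold values and the "shapes.jpg" path are illustrative, and the two-value return of findContours assumes OpenCV 4:

```python
import cv2

gray = cv2.imread("shapes.jpg", cv2.IMREAD_GRAYSCALE)

# Edge detection: local, sharp changes in intensity.
edges = cv2.Canny(gray, 100, 200)

# Contour detection: continuous object boundaries extracted from a binary image.
_, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
print(f"Found {len(contours)} contours")
```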

Applications of edge and contour detection include:

● Face recognition
● Computer vision
● Machine vision
● Fingerprint recognition
● Medical imaging
● Vehicle detection (traffic control)
Background subtraction is a widely used technique in computer
vision and image processing. The background subtraction technique
aims to detect moving objects in a sequence of frames from a static
camera. It allows image foreground (moving object) and background
(stationary object) to be extracted for further processing, such as
object recognition.

One of the most commonly used techniques is frame differencing. It
takes the absolute difference of subsequent frames to detect the
change of motion between consecutive frames. After differencing, a
threshold is used to select only the relevant changes in successive
frames. Mathematically it can be written as

|Image(i) - Image(i-1)| > threshold

Assumption:

● Background subtraction typically works well when the camera is
stationary, capturing the scene from a fixed viewpoint. This allows the
background to remain static, while objects move relative to it.

Background Modeling:

● The initial step involves establishing a model of the background. This
can be done by:
○ Averaging a series of frames without any objects in motion.
○ Using statistical techniques to capture the distribution of pixel
values representing the background over time.
● The background model is essentially a reference for what pixels typically
represent the static scene.
Foreground Detection:

● For each frame in the video sequence:


○ The current frame is compared to the background model (pixel by
pixel).
○ Any significant difference is considered a potential foreground
object.
● A threshold value is often used to determine the level of difference that
signifies a foreground pixel.

Techniques for Background Subtraction:

● Simple frame differencing
● Running average
● Gaussian Mixture Models (GMM)
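A minimal sketch of two of these techniques with OpenCV: simple frame differencing against the previous frame, and a Gaussian Mixture Model background subtractor (MOG2). "video.mp4" is a placeholder path and the threshold of 30 is illustrative:

```python
import cv2

cap = cv2.VideoCapture("video.mp4")
mog2 = cv2.createBackgroundSubtractorMOG2()

ok, prev = cap.read()
prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # Frame differencing: |Image(i) - Image(i-1)| > threshold marks moving pixels.
    diff = cv2.absdiff(gray, prev_gray)
    _, motion_mask = cv2.threshold(diff, 30, 255, cv2.THRESH_BINARY)
    prev_gray = gray

    # GMM-based subtraction: learns a per-pixel statistical background model over time.
    fg_mask = mog2.apply(frame)

cap.release()
```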
R-CNN stands for Region-based Convolutional Neural Network. It is a family
of machine learning models used for computer vision tasks, specifically object
detection. Traditionally, object detection was done by scanning every grid
position of an image using different sizes of frames to identify the object’s
location and class. Applying CNN on every frame took a very long time.
R-CNN reduced this problem. It uses Selective Search to select the candidate
region and then applies CNN to each region proposal. However, it was still
slow due to the repeated application of CNN on overlapping candidate regions.
Fast R-CNN extracts features by applying convolution layers on the entire
image. It then selects CNN features on each region proposal obtained by
Selective Search. Thus, Fast R-CNN was more than 200 times faster than
R-CNN but the latency due to region proposal using selective search was still
high.
Faster R-CNN eliminated the bottleneck due to Selective Search by using a
neural network for region proposal. The RPN reduced the latency by 10 times and
the model could run in real time. It proved more efficient because it
used feature maps, whereas Selective Search used raw image pixels. Moreover,
it does not add much overhead because the feature maps are shared between the
RPN and the rest of the network. Refer to the comparison below.

R-CNN vs Fast R-CNN vs Faster R-CNN

| Feature            | R-CNN                                        | Fast R-CNN                       | Faster R-CNN                     |
|--------------------|----------------------------------------------|----------------------------------|----------------------------------|
| Region Proposal    | External algorithm (e.g., Selective Search)  | Pre-generated bounding boxes     | Region Proposal Network (RPN)    |
| Feature Extraction | Separate CNN pass for each proposed region   | Single CNN pass for entire image | Single CNN pass for entire image |
| Processing Speed   | Slowest                                      | Faster than R-CNN                | Fastest                          |
| Feature Redundancy | High (same features computed multiple times) | Lower than R-CNN                 | Lowest                           |
| Year Introduced    | 2013                                         | 2015                             | 2015                             |
[Architecture diagrams (not reproduced here): R-CNN, Fast R-CNN, Faster R-CNN]
Tracking techniques in image processing are methods used to
detect and follow objects over time within a sequence of images or
video frames. Here’s an overview of the three common techniques:
Optical Flow, Kalman Filter, and Particle Filter.

1. Optical Flow

Optical flow is a method that estimates the motion of objects
between two consecutive frames in a video sequence. It is based on
the apparent motion of brightness patterns in the images. The key
assumptions are that the intensity of the objects remains constant
over time and the motion is smooth over the image plane.

● Applications: Object tracking, motion detection, video
compression.
● Algorithms:
○ Lucas-Kanade method: A differential method for optical
flow estimation which assumes a small motion and
solves a set of linear equations to find the flow.
○ Horn-Schunck method: An approach that imposes
smoothness constraints on the flow field, making it more
robust to noise.
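A minimal sparse optical-flow sketch with OpenCV's pyramidal Lucas-Kanade implementation: pick corner points in one frame and estimate where they moved in the next. "frame1.jpg" and "frame2.jpg" are placeholder paths for two consecutive frames:

```python
import cv2

prev_gray = cv2.imread("frame1.jpg", cv2.IMREAD_GRAYSCALE)
next_gray = cv2.imread("frame2.jpg", cv2.IMREAD_GRAYSCALE)

# Choose good points to track (Shi-Tomasi corners), then estimate their motion.
prev_pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=100, qualityLevel=0.3, minDistance=7)
next_pts, status, err = cv2.calcOpticalFlowPyrLK(prev_gray, next_gray, prev_pts, None)

# Keep only the points that were successfully tracked (status == 1).
tracked_old = prev_pts[status.flatten() == 1]
tracked_new = next_pts[status.flatten() == 1]
```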

2. Kalman Filter

The Kalman Filter is a recursive algorithm that estimates the state
of a dynamic system from a series of noisy measurements. It is
widely used in control systems and signal processing.

● Applications: Object tracking, navigation systems, financial
data forecasting.
● Process:
○ Prediction: Estimates the current state and error
covariance.
○ Update: Incorporates new measurements to refine the
estimates.
● Advantages: Optimal for linear systems with Gaussian noise,
provides real-time processing, low computational cost.
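A minimal sketch of a constant-velocity Kalman filter for tracking a 2-D point with OpenCV: the state is (x, y, vx, vy) and the measurement is (x, y). The noise covariance values and the measurements are illustrative:

```python
import cv2
import numpy as np

kf = cv2.KalmanFilter(4, 2)                     # 4 state variables, 2 measured
kf.transitionMatrix = np.array([[1, 0, 1, 0],   # x  <- x + vx
                                [0, 1, 0, 1],   # y  <- y + vy
                                [0, 0, 1, 0],   # vx <- vx
                                [0, 0, 0, 1]], dtype=np.float32)
kf.measurementMatrix = np.array([[1, 0, 0, 0],
                                 [0, 1, 0, 0]], dtype=np.float32)
kf.processNoiseCov = np.eye(4, dtype=np.float32) * 1e-3
kf.measurementNoiseCov = np.eye(2, dtype=np.float32) * 1e-1

for x, y in [(10, 10), (12, 11), (14, 12)]:      # noisy object positions
    prediction = kf.predict()                    # prediction step
    measurement = np.array([[x], [y]], dtype=np.float32)
    estimate = kf.correct(measurement)           # update step with the new measurement
```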
3. Particle Filter

The Particle Filter is a non-parametric method for implementing a
recursive Bayesian filter using Monte Carlo simulation. It represents
the posterior distribution of the state by a set of random samples
(particles) with associated weights.

● Applications: Non-linear and non-Gaussian tracking
problems, robot localization, and navigation.
● Process:
○ Initialization: Generate a set of particles representing
initial state hypotheses.
○ Prediction: Move particles according to the system's
model.
○ Update: Weigh particles based on their likelihood given
the new measurement.
○ Resampling: Resample particles to focus on the most
likely states.
● Advantages: Effective in handling non-linear and
non-Gaussian models, flexible.
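A minimal 1-D particle filter sketch in NumPy, following the four steps above: initialize, predict, weight by measurement likelihood, and resample. The motion model, noise levels, and measurements are all illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
n_particles = 500

# Initialization: spread particles over plausible initial positions.
particles = rng.uniform(0.0, 10.0, size=n_particles)

for measurement in [2.1, 3.0, 3.9, 5.2]:          # noisy position measurements
    # Prediction: move each particle by the assumed motion (+1 per step) plus noise.
    particles += 1.0 + rng.normal(0.0, 0.2, size=n_particles)

    # Update: weight particles by how well they explain the measurement.
    weights = np.exp(-0.5 * ((measurement - particles) / 0.5) ** 2)
    weights /= weights.sum()

    # Resampling: draw particles in proportion to their weights.
    indices = rng.choice(n_particles, size=n_particles, p=weights)
    particles = particles[indices]

    print(f"measurement={measurement:.1f}, estimate={particles.mean():.2f}")
```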
