Digital Image Processing - Unit 5
applications of DIP
- pattern recognition
- video processing
- medical field
- remote sensing
- image enhancement
what is DIP
- the use of a digital computer to process digital images through an algorithm; more broadly, the ability of computers to understand images and videos on their own
Components of DIP (block diagram)
- the problem domain is observed by image sensors, which capture the digital image
- the computer runs the image processing software
- the computer is connected to mass storage, an image display and a hard copy device
- a network links these components together
sampling:
- definition: conversion of continuous signal to discrete signal by taking samples
at regular intervals
- process: captures discrete points from the continuous signal
- purpose: allows digitization of continuous signals for digital processing
- image context: converts continuous image into discrete pixels
- sampling rate: how frequently samples are taken
- results: creates a digital representation of signal with discrete data points
- impact on quality: higher sampling rate, better quality
quantization:
- definition: mapping continuous amplitude values to set of discrete amplitude
levels
- process: assigns discrete digital values to the sampled amplitudes
- purpose: representation of analog signals using limited digital values
- image context: assigns digital values to pixel intensities
- sampling rate: N/A
- result: digital representation of the signal with limited discrete amplitude
levels
- impact on quality: higher bit depth, better quality
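A minimal sketch of both steps, assuming numpy is available; the sine signal, the 8 kHz sampling rate and the 3-bit depth are made-up illustration values, not from the notes.

```python
import numpy as np

# Sampling: take values of a continuous signal at regular intervals.
fs = 8000                                  # sampling rate (samples per second), illustrative
t = np.arange(0.0, 0.01, 1.0 / fs)         # sample instants over 10 ms
samples = np.sin(2 * np.pi * 1000 * t)     # discrete samples of a "continuous" 1 kHz sine

# Quantization: map each continuous amplitude to one of 2**bits discrete levels.
bits = 3                                   # bit depth; higher bit depth -> better quality
levels = 2 ** bits
index = np.round((samples + 1) / 2 * (levels - 1))      # integer level 0..levels-1
quantized = index / (levels - 1) * 2 - 1                # mapped back to the [-1, 1] range

print("max quantization error:", np.max(np.abs(quantized - samples)))
```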
Unit - 5
###Image features:
- keypoint features: specific locations in images, described by the appearance of
patches of pixels surrounding the point location
- Edges: these features can be matched based on their orientation and local appearance; they are good indicators of object boundaries and occlusion events in image sequences. Edges can be grouped into longer curves and straight lines, which can be directly matched or analyzed. Points, edges and lines provide information to both keypoint and region-based descriptors.
###Point features:
- finding feature points and their correspondences, with 2 approaches:
  - a local search technique such as correlation or least squares
  - matching features based on their local appearance
###Feature detection:
- Auto correlation-based keypoint detector algorithm
- compute the gradient at each point in the image
- create the H matrix from the entries in the gradient
- compute the eigenvalues
- find the points with a large response (λmin > threshold)
- choose those points where λmin is local maximum as features
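A rough numpy sketch of the steps above, assuming a grayscale image as a 2D float array and assuming scipy is available; the window size and threshold are arbitrary illustration choices.

```python
import numpy as np
from scipy.ndimage import convolve, maximum_filter   # assuming scipy is available

def autocorrelation_keypoints(img, win=3, thresh=1e-2):
    # 1. compute the gradient at each point in the image
    Iy, Ix = np.gradient(img.astype(float))

    # 2. build the entries of the auto-correlation matrix H, summed over a small window
    box = np.ones((win, win))
    Ixx = convolve(Ix * Ix, box)
    Iyy = convolve(Iy * Iy, box)
    Ixy = convolve(Ix * Iy, box)

    # 3. compute the smaller eigenvalue of H = [[Ixx, Ixy], [Ixy, Iyy]] at each pixel
    trace = Ixx + Iyy
    det = Ixx * Iyy - Ixy * Ixy
    lam_min = trace / 2 - np.sqrt(np.maximum((trace / 2) ** 2 - det, 0.0))

    # 4./5. keep points with a large response that are also local maxima
    is_local_max = lam_min == maximum_filter(lam_min, size=win)
    return np.argwhere((lam_min > thresh) & is_local_max)

# usage on a toy image with a bright square (its corners should respond)
img = np.zeros((32, 32)); img[8:24, 8:24] = 1.0
print(autocorrelation_keypoints(img))
```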
###Feature descriptors:
- it is an algorithm which takes an image and outputs feature descriptors / feature vectors
- it encodes interesting information about an image into a series of numbers that can be used to differentiate one feature from another
#PCA-SIFT:
- it computes x and y gradient derivatives over a 39x39 patch and then reduces the resulting 3042-D vector to 36 dimensions using PCA
###Feature Matching:
- once features and their descriptors are extracted from 2 or more images, establish preliminary feature matches b/w the images
- Euclidean distance:
- Euclidean distances in feature space are used for ranking potential matches
- if certain parameters in a descriptor are more reliable than others, re-scale these axes ahead of time, e.g., by determining how much they vary when compared against other known good matches
- the simplest matching strategy is to set a threshold (maximum distance) and return all matches within it
- setting the threshold too high -> too many false positives
- setting the threshold too low -> too many false negatives
- TP: true positive, FP: false positive, FN: false negative, TN: true negative
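A small sketch of threshold-based matching with Euclidean distance, assuming numpy; the descriptor arrays and the threshold value are made-up for illustration.

```python
import numpy as np

def match_by_distance(desc1, desc2, threshold=0.7):
    """Return (i, j) index pairs whose Euclidean distance is below the threshold."""
    # pairwise Euclidean distances between every descriptor in image 1 and image 2
    diff = desc1[:, None, :] - desc2[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    matches = []
    for i in range(dist.shape[0]):
        j = int(np.argmin(dist[i]))          # nearest neighbour in feature space
        if dist[i, j] < threshold:           # too high -> false positives, too low -> false negatives
            matches.append((i, j))
    return matches

# tiny usage example with random 8-D descriptors
d1 = np.random.rand(5, 8)
d2 = np.random.rand(6, 8)
print(match_by_distance(d1, d2, threshold=0.9))
```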
- Feature Tracking:
- selecting good features to track is closely related to selecting good features for more general recognition applications
- one of the applications of fast feature tracking is performance-driven
animation
- Haar wavelets:
- during the matching structure construction, each 8x8 scaled, oriented and normalized MOPS patch is converted into a three-element index by performing sums over different quadrants of the patch
- the resulting three values are normalized by their standard deviations and then mapped to the 2 nearest 1D bins
- the coefficients in the bin can be used to approximate nearest neighbors for
further processing
#Edge detection:
- Edges are significant local changes of intensity in a digital image
- a set of connected pixels that forms a boundary b/w 2 disjoint regions
- 3 types of edges
- horizontal edges
- vertical edges
- diagonal edges
###Edge Detection:
- edge detection is a method of segmenting an image into regions of discontinuity.
- it is used for pattern recognition, image morphology, feature extraction, etc
- it allows users to observe the features of an image where there is a significant change in grey level
#Sobel operator:
- it is a discrete differentiation operator
- it uses two 3x3 kernels to calculate the vertical and horizontal edge responses (a sketch follows this list)
- advantages:
  - simple and time efficient
  - very easy to search for smooth edges
- limitations:
  - diagonal edges are not detected
  - highly sensitive to noise
  - not very accurate
  - detected edges are thick and rough, which doesn't give good results
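A minimal sketch of applying the two Sobel kernels, assuming numpy and scipy are available; the toy image is made-up for illustration.

```python
import numpy as np
from scipy.ndimage import convolve   # assuming scipy is available

# the two 3x3 Sobel kernels
Kx = np.array([[-1, 0, 1],
               [-2, 0, 2],
               [-1, 0, 1]], dtype=float)   # responds to intensity changes along x (vertical edges)
Ky = Kx.T                                  # responds to intensity changes along y (horizontal edges)

def sobel_edges(img):
    gx = convolve(img.astype(float), Kx)
    gy = convolve(img.astype(float), Ky)
    return np.hypot(gx, gy)                # gradient magnitude combines both responses

# usage: a toy image with a vertical step edge
img = np.zeros((8, 8)); img[:, 4:] = 1.0
print(sobel_edges(img))
```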
#Prewitt operator:
- similar to the Sobel operator
- detects horizontal and vertical edges
- a good way to estimate the orientation and magnitude of edges in an image
- it uses two 3x3 kernels or masks
- advantages:
  - good performance at detecting vertical and horizontal edges
  - best operator to detect the orientation of an image
- limitations:
  - the magnitude of the coefficients is fixed and can't be changed
  - diagonal edges are not detected
#Roberts operator:
- this is a gradient-based operator
- it computes the sum of the squares of the differences b/w diagonally adjacent pixels
- it uses 2x2 kernels or masks
- advantages:
  - detection of edges and their orientation is very easy
  - diagonal edges are also detected
- limitations:
  - very sensitive to noise
  - not very accurate in edge detection
#Laplacian of Gaussian:
- it smooths the image with a Gaussian and then uses the Laplacian to take the second derivative of the image
- advantages:
  - easy to detect edges and their various orientations
  - it has fixed characteristics in all directions
- limitations:
  - very sensitive to noise
  - may produce false edges
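A short sketch using scipy's Gaussian-Laplace filter (assuming scipy is available); edges show up as zero crossings of the response, and the sigma value is an illustrative choice.

```python
import numpy as np
from scipy.ndimage import gaussian_laplace   # assuming scipy is available

def log_edges(img, sigma=2.0):
    # Gaussian smoothing and the Laplacian (second derivative) in one step
    response = gaussian_laplace(img.astype(float), sigma=sigma)
    # edges correspond to zero crossings of the LoG response (checked here along rows only)
    zero_cross = (np.sign(response[:, :-1]) * np.sign(response[:, 1:])) < 0
    return response, zero_cross

img = np.zeros((16, 16)); img[:, 8:] = 1.0       # toy image with a vertical step edge
resp, edges = log_edges(img)
print(edges.sum(), "zero crossings found along rows")
```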
#SIFT vs SURF
- Scale-Invariant Feature Transform (SIFT) vs Speeded-Up Robust Features (SURF)
- SIFT detects a larger no. of features vs SURF detects fewer features
- SIFT is computationally expensive vs SURF is cheaper and faster
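A brief usage sketch, assuming opencv-python >= 4.4 (where SIFT is bundled); SURF normally requires the contrib build, so only SIFT is shown, and the test image is random made-up data.

```python
import numpy as np
import cv2                      # assuming opencv-python >= 4.4 is installed

# a made-up grayscale test image; in practice this would come from cv2.imread
img = (np.random.rand(256, 256) * 255).astype(np.uint8)

sift = cv2.SIFT_create()                                  # SURF would need cv2.xfeatures2d (contrib)
keypoints, descriptors = sift.detectAndCompute(img, None)
print(len(keypoints), "SIFT keypoints detected")
```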
###Object detection:
###convolution:
- convolutional neural networks (CNNs) are deep artificial neural networks that are used primarily to classify images and cluster them by similarity
- they can be used to identify faces, signs, tumors, etc.
# importance of CNN
- CNN was proposed by Yann LeCun in the 90's, when he was inspired by the human visual perception of recognizing things
- CNNs are complex feed-forward neural networks
- CNNs are used for image classification and recognition because of their high accuracy
- CNN follows a hierarchical model which works on building a network, ending in a fully connected layer where all neurons are connected to each other and the output is processed
#Types of layers:
- convolution layers
- feature map or filter
- shared weights
- subsampling or max pooling
- fully connected layer (classification)
# convolutional layer
1. the first step is to extract features from an input image
2. in the second step, convolution preserves the relationship b/w pixels by learning image features
3. convolution is a mathematical operation that takes 2 inputs, such as an image matrix and a filter (kernel); a small sketch follows
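A minimal numpy sketch of step 3: an image matrix convolved with a small filter (kernel). The kernel and image values are arbitrary, and as is usual in CNNs the kernel is applied without flipping (cross-correlation).

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2D convolution of an image matrix with a filter (kernel), no padding or stride."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            # element-wise product of the kernel with the image patch, then sum
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.arange(25, dtype=float).reshape(5, 5)   # toy 5x5 image matrix
kernel = np.array([[1, 0, -1],
                   [1, 0, -1],
                   [1, 0, -1]], dtype=float)       # simple vertical-edge filter
print(conv2d(image, kernel))                       # 3x3 feature map
```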
# max pooling(subsampling)
- takes smaller blocks from the convolutional layer output
- subsamples each block to produce a single output
- several ways - average, maximum, or a learned linear combination of the neurons in the block
- a max pooling layer takes the maximum of each block (see the sketch below)
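A small sketch of 2x2 max pooling over a feature map, assuming numpy; the feature-map values are made-up.

```python
import numpy as np

def max_pool(feature_map, block=2):
    """Take the maximum of each block x block region (subsampling)."""
    h, w = feature_map.shape
    fm = feature_map[:h - h % block, :w - w % block]          # crop so it divides evenly
    fm = fm.reshape(fm.shape[0] // block, block, fm.shape[1] // block, block)
    return fm.max(axis=(1, 3))                                # max over each block

fm = np.array([[1, 3, 2, 0],
               [4, 6, 5, 1],
               [7, 2, 9, 8],
               [0, 1, 3, 4]], dtype=float)
print(max_pool(fm))   # [[6. 5.] [7. 9.]]
```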
# principal components
1. first principal component (PC1)
   - the eigenvector whose eigenvalue has the largest absolute value indicates the direction along which the data have the largest variance, i.e., the direction of greatest variation
2. second principal component (PC2)
   - the direction with the maximum variation left in the data, orthogonal to PC1
- general properties of PCs:
- summary variables
- linear combinations of the original variables
- uncorrelated with each other
- capture as much of the original variance as possible
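A small numpy sketch computing principal components as eigenvectors of the covariance matrix, on made-up correlated 2D data.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2)) @ np.array([[3.0, 0.0], [1.0, 0.5]])   # correlated toy data

Xc = X - X.mean(axis=0)                      # center the data
cov = np.cov(Xc, rowvar=False)               # covariance matrix
eigvals, eigvecs = np.linalg.eigh(cov)       # eigenvalues ascending, eigenvectors in columns

order = np.argsort(eigvals)[::-1]            # sort by variance, largest first
pc1 = eigvecs[:, order[0]]                   # PC1: direction of greatest variation
pc2 = eigvecs[:, order[1]]                   # PC2: orthogonal to PC1

print("variance along PC1:", eigvals[order[0]])
print("variance along PC2:", eigvals[order[1]])
print("PC1 . PC2 =", float(pc1 @ pc2))       # ~0: the components are uncorrelated
```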
#image classification
- it is the process of categorizing and labeling groups of pixels or vectors within an image based on specific rules
- it is a task in computer vision and ML where an algorithm analyses an image and assigns it to one or more classes
- it involves labeling or tagging an entire image based on pre-existing training data
- it has numerous applications across various domains: object recognition, medical diagnosis, autonomous driving, etc.
## supervised learning
- ANNs are trained on a labelled dataset, where each input data point is associated with a target label
- during training, the ANN learns to map inputs to outputs by adjusting its parameters to minimize the difference b/w predicted outputs and actual labels
- supervised classification algorithms:
- CNN
- support vector machine
- random forest
## unsupervised learning
- ANNs can be used for tasks such as clustering, dimensionality reduction and generative modeling
# Hyperplane
- it is a decision boundary that separates data points into different classes in an SVM
- it is a plane of dimension one less than the dimension of the data space
- it should have the largest margin separating the given data into two classes
- the margin is the distance b/w the hyperplane and the closest points of the 2 classes; the SVM picks the hyperplane for which this margin is largest
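A brief sketch with scikit-learn (assuming it is installed), fitting a linear SVM on made-up 2D data; the separating hyperplane is w·x + b = 0 and the fitted classifier maximizes the margin 2/||w||.

```python
import numpy as np
from sklearn.svm import SVC        # assuming scikit-learn is available

# toy 2D data: two linearly separable classes
X = np.array([[1, 1], [2, 1], [1, 2], [5, 5], [6, 5], [5, 6]], dtype=float)
y = np.array([0, 0, 0, 1, 1, 1])

clf = SVC(kernel="linear", C=1.0)
clf.fit(X, y)

w, b = clf.coef_[0], clf.intercept_[0]       # hyperplane: w . x + b = 0
print("hyperplane normal:", w, "bias:", b)
print("margin width:", 2 / np.linalg.norm(w))
print("prediction for [3, 3]:", clf.predict([[3.0, 3.0]]))
```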
# applications of SVM
- text categorization
- image classification
- bioinformatics
- handwritten character recognition
### ANN
- it possesses a large no. of processing elements called nodes/neurons which operate in parallel
- neurons are connected with each other by connection links
- each link is associated with weights which contain info abt the input signal
- each node has an internal state of its own, which is a func of its inputs (activation level)
# models of ANN
- interconnections
- learning rules
- activation function
## Layers in ANN
1. input layer
2. hidden layers
3. output layer
## Learning
- it is a process by which a NN adapts itself to a stimulus, resulting in the desired response
- 2 types
- parameter learning: connection weights are updated
- structure learning: change in network structure
# back propagation
- it is the most fundamental building block in training a NN
- the algorithm is used to effectively train a NN by applying the chain rule
- after each forward pass through the network, backpropagation performs a backward pass while adjusting the model's parameters (weights and biases); a minimal sketch follows this section
Advantages of backpropagation
- it is fast, simple and easy to program
- it has no parameters to tune apart from the no. of inputs
- it is a flexible method as it does not require prior knowledge abt the network
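A compact numpy sketch of a forward pass and a backward pass (chain rule) for a tiny 2-layer network on made-up data; the layer sizes, targets and learning rate are arbitrary illustration choices.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(8, 3))                     # 8 samples, 3 inputs
y = (X.sum(axis=1, keepdims=True) > 0) * 1.0    # made-up binary targets

# parameters of a 3 -> 4 -> 1 network
W1, b1 = rng.normal(size=(3, 4)) * 0.1, np.zeros((1, 4))
W2, b2 = rng.normal(size=(4, 1)) * 0.1, np.zeros((1, 1))
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
lr = 0.5

for step in range(1000):
    # forward pass
    h = sigmoid(X @ W1 + b1)                    # hidden layer
    out = sigmoid(h @ W2 + b2)                  # output layer
    loss = np.mean((out - y) ** 2)              # mean squared error

    # backward pass: chain rule, layer by layer
    d_out = 2 * (out - y) / len(X) * out * (1 - out)
    dW2, db2 = h.T @ d_out, d_out.sum(axis=0, keepdims=True)
    d_h = d_out @ W2.T * h * (1 - h)
    dW1, db1 = X.T @ d_h, d_h.sum(axis=0, keepdims=True)

    # adjust the weights and biases after each pass
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

print("final loss:", loss)
```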