Digital Image Processing - Unit 5

application of dip - pattern recognition - video processing - medical field - remote sensing - image enhancement

Uploaded by

123456magimagi
Copyright
© © All Rights Reserved

Unit - 1

application of dip
- pattern recognition
- video processing
- medical field
- remote sensing
- image enhancement

what is dip
- digital image processing (DIP) is the use of a digital computer to process digital
images through an algorithm; it underlies the ability of computers to interpret
images and videos on their own.

what is image processing


- it is the process of converting an image into digital form and performing
mathematical operations on it to extract useful information.

Types of image processing


- visualization
- recognition
- sharpening and restoration
- pattern recognition
- retrieval

Components of DIP

network
<------------------------------------------------------>
image display<---->computer<----->mass storage
^
|
hard copy device<---->|<----> image processing software
|
V
image sensors
^
|
problem domain

Steps in digital image processing


- image acquisition: retrieving the image from a source, usually a hardware-based
source
- image filtering and enhancement: used to bring out hidden or useful details in an
image
- image restoration: improving the appearance of a degraded image, based on a
mathematical or probabilistic model of the degradation
- color image processing: deals with pseudo-color and full-color processing
- wavelets and multiresolution processing: the foundation for representing images in
various degrees of resolution
- compression: reducing the storage or bandwidth needed for an image
- morphological processing: extracting image components that are useful in the
representation and description of shape
- segmentation
- representation and description
- object detection
- knowledge base

Image sensing and Acquisition


- it refers to the process of capturing visual information from the real world and
converting it into a digital form that can be stored, processed and analyzed.
- how images are captured: images are formed when energy from an illumination source
interacts with objects, and this energy is detected by a sensing device.
- the incoming energy is transformed into a voltage by the combination of input
electrical power and a sensor that is responsive to the particular type of energy
being detected.
- the image acquired at this stage is completely unprocessed.
- different types of sensors
- single sensor
- line sensor
- array sensor
- key steps in image sensing and acquisition
- image sensing
- light capture
- analog-to-digital conversion (ADC)
- digital representation
- image processing
- image storage
- image sensing:
- a device capable of detecting light and converting it into an electrical signal is
used
- common sensors: CCD (charge-coupled device), CMOS, etc.
- light capture:
- when light falls on the image sensor, it generates an electrical charge in each
pixel. The sensor measures the intensity of light at various points on its surface,
effectively capturing the information in the scene.
- ADC:
- the signals generated by the sensor are in analog form. To process and store
these signals in digital devices, they need to be converted into digital format;
this is done by an analog-to-digital converter (ADC)
- it samples the analog signals and assigns digital values to represent the
intensity at each pixel
- digital representation:
- the image from the ADC is represented as a matrix of digital values, with each value
corresponding to the brightness or color at a specific pixel in the image.
- for grayscale images, each pixel has a single value representing the brightness level
- for color images, each pixel has 3 values (R,G,B)
- image processing:
- once the image is in digital form, it can be processed, enhanced or manipulated
using various algorithms and techniques.
- it may include tasks like noise reduction, contrast enhancement, color correction,
etc.
- image storage:
- the processed image can be stored on digital storage media such as hard
drives, memory cards or cloud storage

Image sampling and quantization


- sampling: digitizing the coordinate values
- quantization: digitizing the amplitude values

sampling:
- definition: conversion of a continuous signal to a discrete signal by taking samples
at regular intervals
- process: captures discrete points from the continuous signal
- purpose: allows digitization of continuous signals for digital processing
- image context: converts a continuous image into discrete pixels
- sampling rate: how frequently samples are taken
- result: creates a digital representation of the signal with discrete data points
- impact on quality: higher sampling rate, better quality

quantization:
- definition: mapping continuous amplitude values to set of discrete amplitude
levels
- process: assigns discrete digital values to the sampled amplitudes
- purpose: representation of analog signals using limited digital values
- image context: assigns digital values to pixel intensities
- sampling rate: N/A
- result: digital representation of the signal with limited discrete amplitude
levels
- impact on quality: higher bit depth, better quality
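
The two steps above can be sketched on a single scan line. This is a minimal illustration, not a camera pipeline: the `scan_line` function, the rate of 8 samples and the 2-bit depth are all made-up values chosen to keep the numbers small.

```python
import math

def sample(signal, duration, rate):
    """Sampling: read the continuous signal at rate*duration evenly spaced positions."""
    n = int(duration * rate)
    return [signal(i / rate) for i in range(n)]

def quantize(values, bits, lo=0.0, hi=1.0):
    """Quantization: map each amplitude in [lo, hi] to one of 2**bits integer levels."""
    levels = 2 ** bits
    out = []
    for v in values:
        v = min(max(v, lo), hi)  # clamp to the valid amplitude range
        out.append(min(int((v - lo) / (hi - lo) * levels), levels - 1))
    return out

# Hypothetical continuous brightness along one scan line, in [0, 1].
scan_line = lambda t: 0.5 + 0.5 * math.sin(2 * math.pi * t)

samples = sample(scan_line, duration=1.0, rate=8)  # 8 discrete positions
pixels = quantize(samples, bits=2)                 # 4 gray levels: 0..3
```

Raising `rate` gives more pixels along the line; raising `bits` gives more gray levels per pixel.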

Unit - 5
###Image features:
- keypoint features: specific locations in images, described by the appearance of
patches of pixels surrounding the point location
- Edges: these features can be matched based on their orientation and local
appearance; they are good indicators of object boundaries and occlusion events in
image sequences. Edges can be grouped into longer curves and straight lines, which
can be directly matched or analyzed. Points, edges and lines provide information to
both keypoint and region based descriptors.

###Feature Detection and matching:


- feature detection: each image is searched for locations that are likely to match
well in other images.
- feature description: each region around a detected keypoint is converted into a
more compact descriptor that is stable under changes in illumination, scale,
translation and rotation, and that can be matched against other descriptors.
- feature matching: efficiently searches for likely matching candidates in other
images.
- feature tracking: alternative to feature matching that only searches a small
neighborhood around each detected feature. More suitable for video processing.

#Invariant Local features:


- Geometric invariance: translation, rotation, scale
- photometric invariance: exposure, brightness, contrast, etc

#Advantages of local features:


- locality: features are local, so robust to occlusion and clutter
- distinctiveness: can differentiate a large database of objects
- quantity: hundreds or thousands in a single image
- efficiency: real time performance achievable
- generality: exploit different types of features in different situations

#Interest Point features/corners:


- a corner is the point at which the direction of the object boundary changes abruptly,
or the point of intersection b/w 2 or more edge segments.
- it is used to find a sparse set of corresponding locations in different images as a
precursor to computing camera pose.
- denser sets of correspondences are used to align different images, e.g. for video
stabilization
- used to perform object instance and category recognition
- feature based correspondence techniques are used in stereo matching, image stitching
and automated 3D modeling applications.

###Point features:
- there are 2 approaches to finding feature points and their correspondences:
- track features using a local search technique such as correlation or least squares
- detect features independently in each image and match them based on their local
appearance

###Feature detection:
- Auto correlation-based keypoint detector algorithm
- compute the gradient at each point in the image
- create the H matrix from the entries in the gradient
- compute the eigenvalues
- find the points with large response(λmin > threshold)
- choose those points where λmin is local maximum as features
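
The steps above can be sketched for a tiny synthetic image. This is a rough illustration under several assumptions that are not in the notes: central-difference gradients, an unweighted 3x3 summation window, and a made-up 7x7 test image containing one bright square.

```python
import math

def gradients(img):
    """Central-difference gradients (Ix, Iy); border pixels are left at 0."""
    h, w = len(img), len(img[0])
    ix = [[0.0] * w for _ in range(h)]
    iy = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            ix[y][x] = (img[y][x + 1] - img[y][x - 1]) / 2.0
            iy[y][x] = (img[y + 1][x] - img[y - 1][x]) / 2.0
    return ix, iy

def h_matrix(ix, iy, cy, cx, r=1):
    """Entries (a, b, c) of H = [[a, b], [b, c]] summed over a (2r+1)x(2r+1) window."""
    a = b = c = 0.0
    for y in range(cy - r, cy + r + 1):
        for x in range(cx - r, cx + r + 1):
            a += ix[y][x] ** 2
            b += ix[y][x] * iy[y][x]
            c += iy[y][x] ** 2
    return a, b, c

def lambda_min(a, b, c):
    """Smaller eigenvalue of the symmetric 2x2 matrix [[a, b], [b, c]]."""
    return (a + c) / 2.0 - math.sqrt(((a - c) / 2.0) ** 2 + b * b)

# Hypothetical test image: bright square in the lower-right, corner at (3, 3).
img = [[1.0 if (y >= 3 and x >= 3) else 0.0 for x in range(7)] for y in range(7)]
ix, iy = gradients(img)
corner = lambda_min(*h_matrix(ix, iy, 3, 3))  # both gradient directions present
edge = lambda_min(*h_matrix(ix, iy, 5, 3))    # gradient in one direction only
flat = lambda_min(*h_matrix(ix, iy, 1, 1))    # no gradient at all
```

Only the corner window has a large λmin, so thresholding λmin keeps corners and rejects straight edges and flat regions.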

###Harris operator/harris corner detector


- it uses local maxima of rotationally invariant scalar measures.
- these measures are derived from the auto-correlation matrix and are used to locate
keypoints for sparse feature matching.
- it uses a Gaussian weighting window instead of square patches, which makes the
detector response insensitive to in-plane rotations.
- the minimum eigenvalue λ0 (λmin above) is not the only quantity that can be used to
find keypoints.
- Harris and Stephens use f = det(H) - α trace(H)^2, with α around 0.06; a related
measure is the harmonic mean of the eigenvalues, f = det(H)/trace(H)
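
The measure f = det(H)/trace(H) can be computed directly from the entries of H, without finding the eigenvalues. A small sketch follows; the three (a, b, c) triples stand for H = [[a, b], [b, c]] and are made-up values for a corner, an edge and a flat window.

```python
def harmonic_mean(a, b, c):
    """f = det(H) / trace(H) for H = [[a, b], [b, c]]; 0 when the trace is 0."""
    trace = a + c
    return (a * c - b * b) / trace if trace > 0 else 0.0

# Hypothetical H entries for three kinds of window.
corner_f = harmonic_mean(1.0, 0.25, 1.0)  # strong gradients in both directions
edge_f = harmonic_mean(1.0, 0.0, 0.0)     # gradient in one direction only
flat_f = harmonic_mean(0.0, 0.0, 0.0)     # no gradient
```

Since det(H) is the product of the eigenvalues and trace(H) is their sum, f is large only when both eigenvalues are large, which is the corner case.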

###Feature detector properties:


- Adaptive non-maximal suppression(ANMS)
- measuring repeatability
- scale invariance
- rotational invariance and orientation estimation
- affine invariance

- Adaptive Non-Maximal Suppression(ANMS):


- most feature detectors simply look for local maxima in the interest function, which
can lead to an uneven distribution of feature points across the image (denser in
regions of higher contrast)
- to mitigate this problem, only detect features that are both local maxima and
whose response value is significantly (10%) greater than that of all of their
neighbors within a radius r.
- this can be done by first sorting the features by response strength and then
creating a second list sorted by decreasing suppression radius.
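
A brute-force sketch of the suppression-radius idea is given below. It is a simplification: it compares every pair of points instead of using the sorted lists, the factor `c_robust = 0.9` encodes the "10% greater" rule, and the four sample points are invented.

```python
import math

def anms(points, n, c_robust=0.9):
    """points: list of (x, y, response). Keep the n points with the largest
    suppression radius, i.e. distance to the nearest significantly stronger point."""
    radii = []
    for i, (xi, yi, ri) in enumerate(points):
        radius = math.inf  # the strongest points are never suppressed
        for j, (xj, yj, rj) in enumerate(points):
            if j != i and ri < c_robust * rj:  # point j is significantly stronger
                radius = min(radius, math.hypot(xi - xj, yi - yj))
        radii.append((radius, i))
    radii.sort(key=lambda t: -t[0])  # decreasing suppression radius
    return [points[i] for _, i in radii[:n]]

# Hypothetical detections: a tight cluster near the origin and one isolated point.
detections = [(0, 0, 10.0), (1, 0, 9.5), (10, 0, 5.0), (0, 1, 8.0)]
kept = anms(detections, 3)
```

The isolated point with response 5.0 is kept ahead of the stronger point (0, 1, 8.0), because the latter sits right next to stronger neighbors. This is how ANMS spreads features more evenly over the image.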
- Measuring repeatability:
- repeatability is the ratio b/w the no. of keypoints simultaneously present in all
images of a series and the total no. of detections.
- it is used to assess keypoint detection performance
- it measures the detector's ability to identify the same feature despite variations
in viewing conditions.
- Scale Invariance:
- scale invariance is the property of features that do not change when the length
scale is varied
- sufficient features may not exist at a single scale, so features are extracted at a
variety of scales
- Rotational invariance and orientation estimation:
- the simplest orientation estimate is the average gradient within a region around
the keypoint
- a more reliable estimate uses the histogram of gradient orientations around the
keypoint.
- the dominant orientation is computed by creating a histogram of all the gradient
orientations and finding the significant peaks in the distribution.
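
The histogram approach can be sketched as follows. The choice of 36 bins (10 degrees each) and magnitude-weighted votes is an assumption borrowed from common SIFT practice, and the gradient list is made up.

```python
import math

def dominant_orientation(gradients, bins=36):
    """gradients: list of (gx, gy). Returns the center angle, in degrees,
    of the histogram bin that collects the most gradient magnitude."""
    hist = [0.0] * bins
    width = 360.0 / bins
    for gx, gy in gradients:
        angle = math.degrees(math.atan2(gy, gx)) % 360.0
        hist[int(angle // width) % bins] += math.hypot(gx, gy)  # weighted vote
    peak = max(range(bins), key=lambda k: hist[k])
    return (peak + 0.5) * width

# Hypothetical gradients around a keypoint: three near 45 degrees, one outlier.
patch_gradients = [(1.0, 1.0), (2.0, 2.1), (0.0, -1.0), (1.5, 1.4)]
theta = dominant_orientation(patch_gradients)
```

Rotating the patch by `-theta` before computing the descriptor is what makes the descriptor rotation invariant.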
- Affine invariance:
- features on a surface should be detected as the same under affine transformations
(an affine function is the composition of a linear function with a translation)

###Feature descriptors:
- a feature descriptor is an algorithm which takes an image and outputs feature
descriptors (feature vectors)
- it encodes interesting information about the image into a series of numbers that can
be used to differentiate one feature from another.

- some commonly used descriptors are:


- bias and gain normalization or multi-scale oriented patches (MOPS)
- scale invariant feature transform (SIFT)
- PCA-SIFT
- Gradient location-orientation histogram (GLOH)

#Bias and gain normalization or multi scale oriented patches(MOPS):


- for tasks that do not exhibit large amounts of foreshortening, such as image
stitching, simple normalized intensity patches perform reasonably well. Patch
intensities are re-scaled to compensate for affine photometric variations (bias and
gain).

#Scale Invariant Feature Transform(SIFT)


- SIFT is invariant to image scale and rotation.
- the descriptor is formed by computing the gradient at each pixel in a 16x16 window
around the detected keypoint, using the level of the Gaussian pyramid at which the
keypoint was detected.
- the window is divided into a 4x4 grid of cells, and each cell accumulates an 8-bin
gradient orientation histogram.
- to reduce the effects of location and dominant orientation mis-estimation, each
of the 256 gradients is softly added to 2x2x2 neighboring histogram bins
- this gives 4 x 4 x 8 = 128 non-negative values that form the raw version of the SIFT
descriptor vector.
- to reduce the effects of contrast or gain, the 128-D vector is normalized to unit
length.
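
The descriptor length and the final normalization step can be checked with a few lines. The raw values below are placeholders rather than real gradient histograms; only the 4 x 4 x 8 layout and the unit-length normalization are being illustrated.

```python
import math

CELLS_PER_SIDE = 4    # the 16x16 window is split into a 4x4 grid of cells
ORIENTATION_BINS = 8  # orientation histogram bins per cell
DESCRIPTOR_LENGTH = CELLS_PER_SIDE * CELLS_PER_SIDE * ORIENTATION_BINS

def normalize_descriptor(raw):
    """Scale the raw descriptor to unit Euclidean length."""
    norm = math.sqrt(sum(v * v for v in raw))
    return [v / norm for v in raw] if norm > 0 else list(raw)

# Placeholder raw descriptor (not real image data).
raw = [float(i % 8) for i in range(DESCRIPTOR_LENGTH)]
unit = normalize_descriptor(raw)

# A gain (contrast) change multiplies every gradient by the same factor.
brighter = normalize_descriptor([3.0 * v for v in raw])
```

`unit` and `brighter` come out identical, which is why unit-length normalization removes the effect of a contrast change.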

#PCA-SIFT:
- it computes x and y derivatives over a 39x39 patch and then uses PCA to reduce the
resulting 3042-D vector to 36 dimensions.

#Gradient Location-Orientation Histogram(GLOH):


- it is a variant of SIFT that uses a log-polar binning structure instead of the
four quadrants used by SIFT.

###Feature Matching:
- once features and their descriptors are extracted from 2 or more images, the next
step is to establish preliminary feature matches b/w the images.

- Matching strategy and error rates:


- determining which feature matches are reasonable to process further depends on
the context in which the matching is being performed.

- Euclidean distance:
- Euclidean distance in feature space is used for ranking potential matches.
- if certain parameters in a descriptor are more reliable than others, re-scale these
axes ahead of time, e.g. by determining how much they vary when compared against
other known good matches.
- the simplest matching strategy is to set a distance threshold and return all matches
within it
- setting the threshold too high -> too many false positives
- setting the threshold too low -> too many false negatives
- TP: true positive, FP: false positive, FN: false negative, TN: true negative
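
The effect of the threshold can be seen on a toy example. The 2-D descriptors below are invented (real descriptors have many more dimensions), and the correct matches are assumed to be (0, 0) and (1, 1).

```python
import math

def euclidean(d1, d2):
    """Euclidean distance between two descriptors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(d1, d2)))

def match(desc_a, desc_b, threshold):
    """Return every pair (i, j) whose descriptor distance is within the threshold."""
    return [(i, j)
            for i, da in enumerate(desc_a)
            for j, db in enumerate(desc_b)
            if euclidean(da, db) <= threshold]

# Hypothetical descriptors from two images.
desc_a = [(0.0, 0.0), (5.0, 5.0)]
desc_b = [(0.1, 0.0), (5.0, 5.2), (9.0, 9.0)]

too_low = match(desc_a, desc_b, 0.15)   # misses (1, 1): a false negative
good = match(desc_a, desc_b, 0.5)       # exactly the two true matches
too_high = match(desc_a, desc_b, 6.0)   # also returns (1, 2): a false positive
```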

- Efficient matching using indexing structure approach:


- an indexing structure is a multi-dimensional search tree or a hash table, used to
rapidly search for features near a given feature.
- such indexing structures can either be built for each image independently or
globally for all images in a given DB.
- a hash table maps descriptors into fixed size buckets based on some function.
- at matching time, each new feature is hashed into a bucket, and a search of nearby
buckets returns potential matches.
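
A hash-table index of this kind might look like the sketch below. The grid-cell hash in `bucket_key` is a stand-in chosen for illustration, and the descriptors are 2-D only so that the neighboring buckets are easy to enumerate.

```python
from collections import defaultdict

def bucket_key(descriptor, cell=1.0):
    """Hypothetical hash function: quantize each coordinate to a grid cell index."""
    return tuple(int(v // cell) for v in descriptor)

def build_index(descriptors, cell=1.0):
    """Map each bucket key to the ids of the descriptors that fall into it."""
    index = defaultdict(list)
    for i, d in enumerate(descriptors):
        index[bucket_key(d, cell)].append(i)
    return index

def candidates(index, query, cell=1.0):
    """Collect ids from the query's bucket and the 8 adjacent buckets."""
    kx, ky = bucket_key(query, cell)
    found = []
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            found.extend(index.get((kx + dx, ky + dy), []))
    return sorted(found)

# Hypothetical database of 2-D descriptors.
database = [(0.2, 0.3), (0.9, 1.1), (5.5, 5.5), (1.8, 0.1)]
index = build_index(database)
near = candidates(index, (0.95, 0.9))
far = candidates(index, (5.1, 5.9))
```

Only the returned candidates need an exact distance computation, so most of the database is never touched.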

- Feature Tracking:
- the process of selecting good features to track is closely related to selecting
good features for more general recognition applications.
- one of the applications of fast feature tracking is performance-driven
animation

- Haar wavelets:
- during construction of the matching structure, each 8x8 scaled, oriented and
normalized MOPS patch is converted into a three-element index by performing sums
over different quadrants of the patch.
- the resulting three values are normalized by their standard deviations and then
mapped into the 2 nearest 1D bins.
- the coefficients in the bins can be used to find approximate nearest neighbors for
further processing

- Locality sensitive hashing


- it uses unions of independently computed hashing functions to index the features

#Edges:
- edges are significant local changes of intensity in a digital image.
- an edge is a set of connected pixels that forms a boundary b/w 2 disjoint regions
- 3 types of edges
- horizontal edges
- vertical edges
- diagonal edges

###Edge Detection:
- edge detection is a method of segmenting an image into regions by locating
discontinuities.
- it is used for pattern recognition, image morphology, feature extraction, etc.
- it allows users to inspect the features of an image where there is a significant
change in grey level.

- edge detection operators are of 2 types


- gradient based operators
- Sobel operator
- Prewitt operator
- Roberts operator
- Gaussian based operators
- Canny edge detector
- Laplacian of Gaussian

#Sobel operator:
- it is a discrete differentiation operator
- it uses two 3x3 kernels to calculate the vertical and horizontal edges
- advantages:
- simple and time efficient
- good at finding smooth edges
- limitations:
- diagonal edges are not detected well
- highly sensitive to noise
- not very accurate
- does not give good results on thick or rough edges
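
The two 3x3 Sobel kernels can be applied as in the sketch below. The 5x5 test image with a single vertical step edge is made up, and border pixels are simply left at zero.

```python
import math

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # responds to vertical edges
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # responds to horizontal edges

def apply_kernel(img, kernel, y, x):
    """Weighted sum of the 3x3 neighborhood centered at (y, x)."""
    return sum(kernel[j][i] * img[y + j - 1][x + i - 1]
               for j in range(3) for i in range(3))

def sobel_magnitude(img):
    """Gradient magnitude sqrt(gx^2 + gy^2) at every interior pixel."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = apply_kernel(img, SOBEL_X, y, x)
            gy = apply_kernel(img, SOBEL_Y, y, x)
            out[y][x] = math.hypot(gx, gy)
    return out

# Hypothetical image: dark columns 0-1, bright columns 2-4 (one vertical edge).
image = [[0, 0, 10, 10, 10] for _ in range(5)]
magnitude = sobel_magnitude(image)
```

The response is large next to the step (columns 1 and 2) and zero inside the uniform bright region (column 3).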

#Prewitt operator:
- similar to the Sobel operator
- detects horizontal and vertical edges
- well suited to estimating the orientation and magnitude of edges in an image
- it uses kernels (masks)
- advantages:
- good performance on detecting vertical and horizontal edges
- well suited to detecting the orientation of edges
- limitations:
- the magnitude of the coefficients is fixed and cannot be changed
- diagonal edges are not detected well

#Roberts operator:
- this is a gradient based operator
- it computes the sum of squares of the differences b/w diagonally adjacent pixels
- it uses 2x2 kernels (masks)
- advantages:
- detection of edges and their orientation is very easy
- diagonal edges are also detected
- limitations:
- very sensitive to noise
- not very accurate in edge detection

#Laplacian of Gaussian:
- it smooths the image with a Gaussian and then uses the Laplacian to take the second
derivative of the image
- advantages:
- easy to detect edges and their various orientations
- the characteristics are the same in all directions
- limitations:
- very sensitive to noise
- can produce false edges

#Canny edge detector:


- less susceptible to noise, because the image is smoothed with a Gaussian first
- it extracts image features without affecting or altering the image
- it is a multi-stage algorithm
- detects edges based on 3 criteria
- low error rate
- edge points must be well localized
- there should be just one single response per edge
- advantages:
- good localization
- extracts image features without altering the image
- less sensitive to noise
- limitations:
- false zero crossing
- complex computation and time consuming

#Speeded up Robust Features(SURF)


- SURF was first presented by Herbert Bay at the 2006 European Conference on Computer
Vision
- it is an algorithm to detect and describe local features of an image
- used for tasks such as object recognition, classification, etc.
- SURF uses an integer approximation of the determinant-of-Hessian blob detector
- its feature descriptor is based on the sum of the Haar wavelet responses.

#SIFT VS SURF
- full name: scale invariant feature transform vs speeded up robust features
- speed: slower, takes more time vs faster, takes less time
- no. of features: more features are detected vs fewer features are detected
- computational cost: expensive vs cheaper

###Object detection:
###convolution:
- convolutional neural networks are deep artificial neural networks that are used
primarily to classify images and cluster them by similarity
- they can be used to identify faces, signs, tumors, etc.

# importance of CNN
- CNNs were proposed by Yann LeCun in the 90's, inspired by the way human visual
perception recognizes things.
- CNNs are complex feed forward neural networks.
- CNNs are used for image classification and recognition because of their high accuracy
- a CNN follows a hierarchical model: the network is built up layer by layer and ends
in a fully connected layer, where all neurons are connected to each other and the
output is processed

input image --> cnn --> output label(image class)

#Types of layers:
- convolution layers
- feature map or filter
- shared weights
- subsampling or max pooling
- fully connected layer (classification)

# convolutional layer
1. it is the first layer, used to extract features from an input image
2. convolution preserves the relationship b/w pixels by learning image features
3. it is a mathematical operation that takes 2 inputs: an image matrix and a filter
(kernel)
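
The operation on an image matrix and a filter can be written out directly. Note the assumptions: this is the "valid", stride-1 form with no padding, and like most CNN libraries it slides the kernel without flipping it (strictly speaking, cross-correlation). The image and kernel values are arbitrary.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image (stride 1, no padding) and take the
    sum of element-wise products at each position."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[y + j][x + i] * kernel[j][i]
                 for j in range(kh) for i in range(kw))
             for x in range(out_w)]
            for y in range(out_h)]

# Hypothetical 4x4 image and 2x2 filter (a diagonal difference).
image = [[1, 2, 3, 0],
         [4, 5, 6, 1],
         [7, 8, 9, 2],
         [0, 1, 2, 3]]
kernel = [[1, 0],
          [0, -1]]
feature_map = conv2d_valid(image, kernel)
```

A 4x4 input and a 2x2 filter give a 3x3 feature map. In a CNN the filter values are the shared weights learned during training.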

# max pooling(subsampling)
- takes small blocks from the output of the convolutional layer
- subsamples each block to produce a single output
- several ways: average, maximum or a learned linear combination of the neurons
- max pooling layers take the maximum of each block
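
Max pooling over non-overlapping blocks is short enough to write out. The 2x2 block size and the 4x4 feature map are example values.

```python
def max_pool(feature_map, size=2):
    """Keep the maximum of each non-overlapping size x size block."""
    rows = range(0, len(feature_map) - size + 1, size)
    cols = range(0, len(feature_map[0]) - size + 1, size)
    return [[max(feature_map[y + j][x + i]
                 for j in range(size) for i in range(size))
             for x in cols]
            for y in rows]

# Hypothetical 4x4 feature map.
fmap = [[1, 3, 2, 0],
        [4, 6, 1, 1],
        [7, 8, 9, 2],
        [0, 1, 2, 3]]
pooled = max_pool(fmap)
```

Each 2x2 block collapses to one value, so the spatial resolution is halved in both directions.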

# fully connected layer


- performs the high level reasoning in the NN
- takes all neurons from the previous layer and connects them to every neuron it has
- its neurons are not spatially located, therefore there can be no convolutional
layers after a fully connected layer

### convolutional neural network(CNN)


- the network structure is designed to extract relevant features by restricting the
neural weights of one layer to a local field in the previous layer.
- a degree of shift and distortion invariance is achieved by reducing the spatial
resolution of the feature map.

From NNs to Convolutional NNs


- local connectivity
- shared weights ("tied")
- convolution with 1D filter
- multiple feature maps
- pooling

### Region-based Convolutional Neural Network(RCNN)


- selective search is used for region proposals
- it does hierarchical clustering at different scales
- such clustering is important as we may find different objects at different scales
- proposed regions are cropped to form mini-images
- each mini-image is scaled to match the CNN input size
- any CNN can be trained and used for feature extraction
- outputs from the fc7 layer are taken as features
- the CNN is fine tuned using ground truth object images

Region proposals: selective search


feature extraction: CNN
classifier: linear

## Principal Components Analysis (PCA)


- an exploratory technique used to reduce the dimensionality of a dataset, often to 2D
or 3D
- used to
- reduce the number of dimensions in data
- find patterns in high-dimensional data
- visualize data of high dimensionality
- examples: face recognition, image compression, gene expression analysis

# principal components
1. first principal component (PC1)
- the eigenvector of the covariance matrix with the largest eigenvalue; it is the
direction along which the data have the greatest variance
2. second principal component (PC2)
- the direction with maximum variation left in the data, orthogonal to PC1

- in general, only a few directions manage to capture most of the variability in the
data.

- general properties of PCs
- summary variables
- linear combinations of the original variables
- uncorrelated with each other
- capture as much of the original variance as possible
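
For 2-D data the principal components can be found in closed form from the 2x2 covariance matrix. The sketch below assumes that setting; the five sample points are invented and lie roughly along the line y = x.

```python
import math

def pca_2d(points):
    """Return ((l1, pc1), (l2, pc2)): eigenvalues and unit eigenvectors of the
    sample covariance matrix of 2-D points, largest eigenvalue first."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    mean = (sxx + syy) / 2.0
    diff = math.sqrt(((sxx - syy) / 2.0) ** 2 + sxy ** 2)
    l1, l2 = mean + diff, mean - diff
    if sxy != 0:
        v = (sxy, l1 - sxx)        # solves (C - l1*I) v = 0
    else:
        v = (1.0, 0.0) if sxx >= syy else (0.0, 1.0)
    norm = math.hypot(v[0], v[1])
    pc1 = (v[0] / norm, v[1] / norm)
    pc2 = (-pc1[1], pc1[0])        # orthogonal to PC1
    return (l1, pc1), (l2, pc2)

# Hypothetical data scattered around the line y = x.
data = [(1, 1), (2, 2.1), (3, 2.9), (4, 4.2), (5, 4.8)]
(l1, pc1), (l2, pc2) = pca_2d(data)
```

Here PC1 points roughly along (0.7, 0.7) and l1 is far larger than l2, so projecting onto PC1 alone keeps almost all of the variance.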

### Image classification using SVM-ANN-Feed forward and Back Propagation

#image classification
- it is the process of categorizing and labeling groups of pixels or vectors within
an image based on specific rules.
- it is a task in computer vision and ML where an algorithm analyses an image and
assigns it to one or more classes
- it involves labeling or tagging an entire image based on pre-existing training
data
- it has numerous applications across various domains: object recognition, medical
diagnosis, autonomous driving, etc.

# two general methods of IC


- supervised
- unsupervised

## supervised learning
- ANNs are trained on a labelled dataset, where each input data point is associated
with a target label
- during training, the ANN learns to map inputs to outputs by adjusting its parameters
to minimize the difference b/w predicted o/p and actual labels
- supervised classification algorithms
- CNN
- support vector machine
- random forest

## unsupervised learning
- ANN can be used for tasks such as clustering, dimensionality reduction and
generative modeling.

- unsupervised classification algorithms


- K-means clustering
- hierarchical clustering
- self-organizing maps (SOM)

## image classification using SVM


- SVM is a type of supervised learning used in ML to solve classification and
regression problems
- a linear SVM is applicable to data that are linearly separable
- for non-linear data, kernel functions are used
- SVM can be used for text detection, image classification, spam detection, etc.

# Three properties of SVM


1. SVMs construct a maximum margin separator - a decision boundary with the largest
possible distance to the example points
2. SVMs create a linear separating hyperplane, but they have the ability to embed
data into a higher-dimensional space
3. SVMs are a nonparametric method - they retain training examples and potentially
need to store them all.

# Hyperplane
- it is a decision boundary that separates data points into different classes in
SVM.
- it is a plane of dimension one less than the dimension of the data space.
- it should have the largest margin when separating the given data into two classes.
- the margin is the distance b/w the hyperplane and the closest points of the 2
classes; SVM chooses the hyperplane for which this distance is largest

some commonly used kernel functions


1. linear
2. polynomial of degree d
3. Gaussian radial basis function
4. tanh kernel
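
The four kernels can be written as plain functions of two vectors. The parameter names and default values (`d`, `c`, `gamma`, `kappa`, `theta`) are illustrative choices, not values taken from the notes.

```python
import math

def dot(x, y):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))

def linear_kernel(x, y):
    return dot(x, y)

def polynomial_kernel(x, y, d=2, c=1.0):
    """(x . y + c)^d, a polynomial of degree d."""
    return (dot(x, y) + c) ** d

def rbf_kernel(x, y, gamma=0.5):
    """Gaussian radial basis function: exp(-gamma * ||x - y||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def tanh_kernel(x, y, kappa=1.0, theta=0.0):
    """tanh(kappa * x . y + theta)."""
    return math.tanh(kappa * dot(x, y) + theta)

u, v = (1.0, 2.0), (3.0, 4.0)
values = {
    "linear": linear_kernel(u, v),
    "polynomial": polynomial_kernel(u, v),
    "rbf": rbf_kernel(u, v),
    "tanh": tanh_kernel(u, v),
}
```

Each kernel returns the inner product the two points would have after being embedded in some higher-dimensional space, without ever computing that embedding.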

# applications of SVM
- text categorization
- image classification
- bioinformatics
- hand written character recognition

### ANN
- it possesses a large no. of processing elements called nodes/neurons which operate
in parallel
- neurons are connected to each other by connection links
- each link is associated with a weight, which contains info about the input signal
- each node has an internal state of its own, which is a function of its inputs
(the activation level)

# models of ANN
- interconnections
- learning rules
- activation function

## Layers in ANN
1. input layer
2. hidden layers
3. output layer

## Learning
- it is a process by which NN adapts itself to a stimulus, resulting in desired
response
- 2 types
- parameter learning: connection weights are updated
- structure learning: change in network structure

## multilayer Neural network


- feed forward propagation
- Error back-propagation

# feed forward propagation


- it is a neural network in which the connections b/w the nodes do not form a cycle
- it was the first and simplest type of ANN devised.
- in this network, the info moves in only one direction - forward
- the goal of this network is to approximate some function f*
- when feedforward NNs are extended to include feedback connections, they are called
recurrent NNs
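
A forward pass through a small network can be sketched as below. The sigmoid activation, the 2-2-1 layout and all weight values are assumptions made for the example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def layer(inputs, weights, biases):
    """One layer: each neuron takes a weighted sum of all inputs plus a bias,
    then applies the activation function."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def forward(x, network):
    """Information moves in one direction only: input -> hidden -> output."""
    for weights, biases in network:
        x = layer(x, weights, biases)
    return x

# Hypothetical network: 2 inputs, 2 hidden neurons, 1 output neuron.
network = [
    ([[0.5, -0.5], [0.3, 0.8]], [0.0, 0.1]),  # hidden layer
    ([[1.0, -1.0]], [0.0]),                   # output layer
]
output = forward([1.0, 2.0], network)
```

There is no loop back from a later layer to an earlier one, which is the property that distinguishes this from a recurrent NN.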

# back propagation
- it is the most fundamental building block in training a NN
- the algorithm trains a NN efficiently by using the chain rule to compute the
gradient of the error with respect to each weight
- after each forward pass through the network, backpropagation performs a backward pass
while adjusting the model's parameters (weights and biases)
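
The forward and backward passes can be traced by hand on the smallest possible network: one input, one hidden neuron, one output neuron. Everything specific here is an assumption for illustration: the sigmoid activation, the squared-error loss, the learning rate of 0.5, the starting parameters and the single training example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_step(params, x, target, lr=0.5):
    """One forward pass, one backward pass (chain rule), one parameter update."""
    w1, b1, w2, b2 = params
    # Forward pass.
    h = sigmoid(w1 * x + b1)
    y = sigmoid(w2 * h + b2)
    loss = 0.5 * (y - target) ** 2
    # Backward pass: apply the chain rule from the output back to the input.
    dz2 = (y - target) * y * (1.0 - y)  # dL/dy * dy/dz2
    dw2, db2 = dz2 * h, dz2
    dz1 = dz2 * w2 * h * (1.0 - h)      # dL/dz2 * dz2/dh * dh/dz1
    dw1, db1 = dz1 * x, dz1
    # Gradient descent update of weights and biases.
    new_params = (w1 - lr * dw1, b1 - lr * db1, w2 - lr * dw2, b2 - lr * db2)
    return new_params, loss

params = (0.1, 0.0, 0.1, 0.0)  # hypothetical starting weights and biases
losses = []
for _ in range(200):
    params, loss = train_step(params, x=1.0, target=0.9)
    losses.append(loss)
```

The recorded loss falls as the updates are repeated, which is the behavior backpropagation is meant to produce.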

Advantages of backpropagation
- it is fast, simple and easy to program
- it has no parameters to tune apart from the no. of inputs
- it is a flexible method as it does not require prior knowledge about the network
