Computer Vision SM-2

The document is a study material for a Computer Vision course at Visvesvaraya Technological University, covering topics such as image processing, neighborhood operators, Fourier transforms, and wavelets. It discusses various filtering techniques, including median filtering, bilateral filtering, and morphological operations, along with their applications in image analysis. Additionally, it explains the Fourier Transform's significance in signal processing and image filtering, as well as the concepts of pyramids and wavelets for multi-resolution analysis.


VISVESVARAYA TECHNOLOGICAL UNIVERSITY

BELGAUM

COMPUTER VISION

(Course Code: BPLCK105B)

STUDY MATERIAL

VI-SEMESTER

Dr. Krishna Prasad K


Associate Professor, Dept of ISE

A J INSTITUTE OF ENGINEERING & TECHNOLOGY


DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
(A unit of Laxmi Memorial Education Trust. (R))
NH - 66, Kottara Chowki, Kodical Cross - 575 006

Computer Vision BCS613B

MODULE -2

Syllabus: Image processing: More neighborhood operators, Fourier transforms, Pyramids and
wavelets, and Geometric transformations

2|Page Dept of CSE, AJIET Mangaluru



2.1 More neighborhood operators

Neighborhood operators go beyond traditional linear filtering techniques and include non-
linear filters that are more effective at preserving image structures such as edges while
removing noise.

2.1.1 Median Filtering

• Median filtering is a non-linear filter that replaces each pixel with the median value
from its neighborhood.
• It is highly effective for salt-and-pepper noise (impulse noise) because the extreme
values are ignored.
• Unlike Gaussian smoothing, median filtering preserves edges because it does not
average intensities.
• Computational efficiency:
o A straightforward implementation sorts the n values in each window, which costs
O(n log n) time per pixel.
o Optimized implementations (e.g., Perreault and Hébert, 2007) can compute it in
constant time per pixel, independent of the window size.
Example of Median Filtering
Noisy 5×5 image patch (a column of impulse values, 255, runs through the middle):
[ 8, 7, 255, 8, 9]
[ 7, 6, 255, 7, 8]
[ 6, 5, 255, 6, 7]
[ 7, 6, 255, 7, 8]
[ 8, 7, 255, 8, 9]
Sorted values in the central 3×3 window:
[5, 6, 6, 6, 7, 7, 255, 255, 255]
Output pixel = 7 (the median, i.e. the 5th of the 9 sorted values)
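The example above can be reproduced with a short sketch. This is a minimal pure-Python illustration, not an optimized implementation; the function name median_filter and the choice to leave border pixels unchanged are assumptions made for the example.

```python
from statistics import median

def median_filter(image, size=3):
    """Apply a size x size median filter; border pixels are left unchanged."""
    r = size // 2
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]
    for i in range(r, h - r):
        for j in range(r, w - r):
            # Collect the neighborhood around (i, j) and take its median.
            window = [image[i + di][j + dj]
                      for di in range(-r, r + 1)
                      for dj in range(-r, r + 1)]
            out[i][j] = median(window)
    return out

noisy = [
    [8, 7, 255, 8, 9],
    [7, 6, 255, 7, 8],
    [6, 5, 255, 6, 7],
    [7, 6, 255, 7, 8],
    [8, 7, 255, 8, 9],
]
filtered = median_filter(noisy)
print(filtered[2][2])  # center pixel: 255 is replaced by 7
```

The impulse value 255 never reaches the output at the center because it sits at the upper end of the sorted window.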
2.1.2 α-Trimmed Mean Filter
• It is a combination of mean and median filtering.
• Instead of using all pixels, it discards the α-fraction of extreme values and averages
the rest.
• It is useful when Gaussian noise dominates but some impulse noise is also present.
Example Calculation
For a 5×5 window (25 pixels), if α = 20%:
1. Sort the values.
2. Remove the smallest 10% and the largest 10% (about 2 pixels from each end, since
10% of 25 is 2.5).
3. Compute the mean of the remaining pixels.
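The three steps can be sketched as follows. The function name alpha_trimmed_mean, the made-up window values, and the choice to round the number of trimmed samples down are assumptions for illustration.

```python
def alpha_trimmed_mean(values, alpha):
    """Discard alpha/2 of the samples at each end, then average the rest."""
    ordered = sorted(values)
    k = int(len(ordered) * alpha / 2)  # samples trimmed per end (rounded down)
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)

# Hypothetical 5x5 window flattened to 25 values: mild noise plus four impulses.
window = [10, 11, 12, 13, 14] * 4 + [9, 0, 0, 255, 255]
trimmed = alpha_trimmed_mean(window, alpha=0.20)
plain = sum(window) / len(window)
print(round(trimmed, 2), round(plain, 2))
```

With α = 20% the two smallest and two largest values (the impulses) are dropped, so the trimmed mean stays near the true gray level while the plain mean is pulled upward by the 255 values.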
2.1.3 Weighted Median Filtering
• Instead of treating all pixels equally, this filter assigns weights to nearby pixels.
• This technique helps preserve small details while filtering out noise.
• It is commonly used in medical image processing and edge-preserving denoising.
Example Calculation
A weighted 3×3 filter:
[ 1, 2, 1 ]
[ 2, 5, 2 ]
[ 1, 2, 1 ]
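The center pixel has the highest weight: its value is counted 5 times when the values are
sorted, while edge and corner neighbors are counted 2 times and once, respectively.
The final output is the median of this weighted list instead of the simple median.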
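A minimal sketch of the weighted median, assuming integer weights so that each value can simply be repeated. The patch values are made up to show how the heavy center weight preserves a small detail that a plain median would remove.

```python
def weighted_median(values, weights):
    """Median of the list obtained by repeating each value by its integer weight."""
    expanded = []
    for value, weight in zip(values, weights):
        expanded.extend([value] * weight)
    expanded.sort()
    return expanded[len(expanded) // 2]

weights = [1, 2, 1, 2, 5, 2, 1, 2, 1]            # the 3x3 mask above, row by row
patch = [10, 10, 10, 10, 50, 10, 200, 200, 200]  # hypothetical 3x3 neighborhood
plain_median = sorted(patch)[len(patch) // 2]
print(weighted_median(patch, weights), plain_median)
```

Here the center value 50 survives the weighted median, whereas the unweighted median would replace it with 10.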
2.1.4 Edge-Preserving Non-Linear Filters
• Standard Gaussian filters blur edges along with the noise, which is undesirable when edges must stay sharp.
• Bilateral filtering and anisotropic diffusion preserve edges while smoothing noise.
2.2 Bilateral Filtering
2.2.1 Introduction
• Bilateral filtering extends the idea of Gaussian smoothing by incorporating pixel
intensity differences.
• Unlike median filters, it does not discard outliers but weights them based on
similarity.
2.2.2 How Bilateral Filtering Works
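The output g(i, j) at pixel (i, j) is computed as:
\[ g(i,j) = \frac{\sum_{k,l} f(k,l)\, w(i,j,k,l)}{\sum_{k,l} w(i,j,k,l)} \]
where the weight function w(i, j, k, l) is the product of a spatial (domain) Gaussian and an
intensity (range) Gaussian:
\[ w(i,j,k,l) = \exp\left( -\frac{(i-k)^2 + (j-l)^2}{2\sigma_d^2} - \frac{\lVert f(i,j) - f(k,l) \rVert^2}{2\sigma_r^2} \right) \]
Here σ_d controls the spatial extent of the filter and σ_r controls how large an intensity
difference is still treated as similar.
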

2.2.3 Interpretation
• Pixels that are spatially close and intensity-similar contribute more.
• Unlike median filters, it does not completely discard different pixels but weights them
less.
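2.2.4 Example

For a noisy image with a sharp edge:

Pixel value   Weight
20            0.8
22            0.9
100           0.05
21            0.85

The pixel on the other side of the edge (100) receives a very low weight. The normalized
weighted average is (20·0.8 + 22·0.9 + 100·0.05 + 21·0.85) / (0.8 + 0.9 + 0.05 + 0.85)
= 58.65 / 2.6 ≈ 22.6, so the output stays close to the dark side of the edge.
The filter smooths noise without blurring edges.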
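The behaviour in this example can be sketched on a 1D row of pixels. This is an illustrative simplification (one dimension, hypothetical pixel values, assumed parameters sigma_d and sigma_r), using Gaussian weights on both spatial distance and intensity difference.

```python
import math

def bilateral_1d(signal, radius=2, sigma_d=2.0, sigma_r=10.0):
    """Bilateral filter on a 1D signal with Gaussian domain and range weights."""
    out = []
    for i, center in enumerate(signal):
        num = den = 0.0
        for k in range(max(0, i - radius), min(len(signal), i + radius + 1)):
            # Weight falls off with spatial distance and with intensity difference.
            w = math.exp(-((i - k) ** 2) / (2 * sigma_d ** 2)
                         - ((center - signal[k]) ** 2) / (2 * sigma_r ** 2))
            num += w * signal[k]
            den += w
        out.append(num / den)
    return out

step = [20, 22, 21, 19, 100, 102, 99, 101]  # noisy step edge (made-up values)
smoothed = bilateral_1d(step)
print([round(v, 1) for v in smoothed])
```

Pixels on the far side of the edge get weights close to zero, so each side is averaged only with itself and the step stays sharp.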
2.2.5 Speedup Methods
• Bilateral filtering is computationally expensive.
• Faster approaches:
o Bilateral grid (Chen, Paris & Durand, 2007)
o Permutohedral lattice approach (Adams, Baek, Davis, 2010)
2.3 Binary Image Processing
2.3.1 Introduction
• Binary images contain only two intensity values (0 and 1).
• Used in OCR, medical imaging, object recognition.
2.3.2 Morphological Operations
Morphology operations modify shapes in binary images.
2.3.2.1 Dilation
• Expands white regions (adds pixels to object boundaries).
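• Formula: dilate(f, s) = θ(c, 1), where c = f ⊗ s is the number of 1-pixels of the image f
under the structuring element s, and θ(c, t) = 1 if c ≥ t, 0 otherwise.
• Used for bridging gaps in objects.
2.3.2.2 Erosion
• Shrinks white regions (removes boundary pixels).
• Formula: erode(f, s) = θ(c, S), where S is the number of pixels in the structuring
element, so a pixel stays white only if its whole neighborhood is white.
• Used to remove noise.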
2.3.2.3 Opening & Closing
• Opening = Erosion → Dilation (removes small objects).
• Closing = Dilation → Erosion (fills small gaps).
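The four operations can be sketched by counting the 1-pixels under a 3×3 structuring element and thresholding the count. This is a minimal illustration; the 3×3 square element, the treatment of pixels outside the image as 0, and the test image are assumptions.

```python
def count_ones(image, i, j):
    """Number of 1-pixels in the 3x3 neighborhood (outside the image counts as 0)."""
    h, w = len(image), len(image[0])
    return sum(image[y][x]
               for y in range(i - 1, i + 2)
               for x in range(j - 1, j + 2)
               if 0 <= y < h and 0 <= x < w)

def threshold_counts(image, t):
    """Output 1 where the neighborhood count is at least t, else 0."""
    return [[1 if count_ones(image, i, j) >= t else 0
             for j in range(len(image[0]))]
            for i in range(len(image))]

def dilate(image):
    return threshold_counts(image, 1)   # any white neighbor is enough

def erode(image):
    return threshold_counts(image, 9)   # all 9 neighbors must be white

def opening(image):
    return dilate(erode(image))

def closing(image):
    return erode(dilate(image))

img = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0],
    [0, 1, 1, 1, 0, 1, 0],   # isolated pixel at column 5
    [0, 1, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
opened = opening(img)
```

Opening removes the isolated pixel but restores the 3×3 block to its original size.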
2.3.3 Distance Transform
• Converts binary images into distance maps.
• For each object pixel, computes the distance to the nearest background pixel (i.e. to
the object boundary).
• Applications:
o Object segmentation
o Feature extraction
o Skeletonization (medial axis computation)
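A distance map can be computed with two raster passes. The sketch below assumes the city-block (Manhattan) metric and measures, for every pixel, the distance to the nearest 0-pixel; the function name and test image are illustrative.

```python
def city_block_distance(binary):
    """Two-pass city-block distance from each pixel to the nearest 0-pixel."""
    h, w = len(binary), len(binary[0])
    inf = h + w  # larger than any possible distance inside the image
    d = [[0 if binary[i][j] == 0 else inf for j in range(w)] for i in range(h)]
    # Forward pass: propagate distances from the top and left neighbors.
    for i in range(h):
        for j in range(w):
            if i > 0:
                d[i][j] = min(d[i][j], d[i - 1][j] + 1)
            if j > 0:
                d[i][j] = min(d[i][j], d[i][j - 1] + 1)
    # Backward pass: propagate distances from the bottom and right neighbors.
    for i in reversed(range(h)):
        for j in reversed(range(w)):
            if i < h - 1:
                d[i][j] = min(d[i][j], d[i + 1][j] + 1)
            if j < w - 1:
                d[i][j] = min(d[i][j], d[i][j + 1] + 1)
    return d

square = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
dist = city_block_distance(square)
```

The largest value lies at the center of the square, which is why ridges of the distance map are used for skeletonization.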
2.3.4 Connected Component Analysis
• Identifies groups of connected pixels.
• Methods:
o N4 connectivity (horizontal + vertical).
o N8 connectivity (includes diagonal).
Example Application: OCR
• Segments letters from scanned text images.
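The difference between N4 and N8 connectivity can be seen with a small labeling sketch. It uses a breadth-first flood fill, which is one of several possible labeling strategies; the names and the test image are illustrative.

```python
from collections import deque

N4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]
N8 = N4 + [(-1, -1), (-1, 1), (1, -1), (1, 1)]

def label_components(binary, neighbors=N4):
    """Label connected groups of 1-pixels; returns (label image, component count)."""
    h, w = len(binary), len(binary[0])
    labels = [[0] * w for _ in range(h)]
    count = 0
    for i in range(h):
        for j in range(w):
            if binary[i][j] and not labels[i][j]:
                count += 1
                labels[i][j] = count
                queue = deque([(i, j)])
                while queue:  # flood-fill everything reachable from (i, j)
                    y, x = queue.popleft()
                    for dy, dx in neighbors:
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and binary[ny][nx] and not labels[ny][nx]):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count

stroke = [
    [1, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]
_, count4 = label_components(stroke, N4)
_, count8 = label_components(stroke, N8)
```

The diagonal stroke splits into three components under N4 but forms a single component under N8, which matters when segmenting thin character strokes.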
2.3.5 Summary
• Non-linear filtering outperforms linear methods in preserving edges.
• Bilateral filtering is a powerful method for edge-aware smoothing.
• Binary image processing enables morphological operations and object segmentation.


2.4 Fourier transforms


2.4.1 Introduction to Fourier Transforms
The Fourier Transform (FT) is one of the most important tools in signal processing, image
processing, and computer vision. It allows us to analyze signals and images in terms of
frequency components, rather than just spatial information.
2.4.2 What Does Fourier Transform Do?
• Converts a signal/image from the spatial domain to the frequency domain.
• Decomposes a function into sinusoidal basis functions (sine & cosine waves).
• Helps in filtering, compression, denoising, and feature extraction.
2.4.3 Example: Understanding Fourier Transform
Consider a simple waveform, such as a sum of two sine waves:
\[ f(t) = A_1 \sin(2\pi \cdot 3t) + A_2 \sin(2\pi \cdot 10t), \qquad A_1 > A_2 \]
This is a signal that consists of two frequencies:
• A 3 Hz wave (large amplitude)
• A 10 Hz wave (smaller amplitude)
When we compute the Fourier Transform, it will show two peaks at 3 Hz and 10 Hz,
representing the frequency components of the signal.
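The two peaks can be checked numerically. The sketch below uses a direct (slow) DFT rather than an FFT library; the amplitudes 1.0 and 0.5 and the sampling rate of 64 samples per second are assumptions chosen for illustration.

```python
import cmath
import math

def dft(samples):
    """Direct O(N^2) discrete Fourier transform."""
    n = len(samples)
    return [sum(s * cmath.exp(-2j * math.pi * k * x / n) for x, s in enumerate(samples))
            for k in range(n)]

fs = 64  # samples per second (assumed); one second of signal, so bin k is k Hz
signal = [1.0 * math.sin(2 * math.pi * 3 * t / fs)
          + 0.5 * math.sin(2 * math.pi * 10 * t / fs)
          for t in range(fs)]

magnitudes = [abs(c) for c in dft(signal)[:fs // 2]]
peaks = sorted(range(len(magnitudes)), key=lambda k: magnitudes[k], reverse=True)[:2]
print(sorted(peaks))  # [3, 10]
```

The bin at 3 Hz is twice as tall as the bin at 10 Hz, mirroring the 2:1 ratio of the assumed amplitudes.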
2.4.4 Mathematical Formulation
2.4.4.1 Continuous Fourier Transform (CFT)
For a continuous function h(x), the Fourier Transform is defined as:
\[ H(\omega) = \int_{-\infty}^{\infty} h(x)\, e^{-j\omega x}\, dx \]
Example: Fourier Transform of a Gaussian Function
A Gaussian function:
\[ h(x) = e^{-x^2} \]
has the Fourier Transform (under the definition H(ω) = ∫ h(x) e^{−jωx} dx):
\[ H(\omega) = \sqrt{\pi}\, e^{-\omega^2/4} \]
This means that a Gaussian function in the spatial domain remains a Gaussian function in
the frequency domain; a narrow Gaussian in one domain becomes a wide Gaussian in the other.
2.4.4.2 Discrete Fourier Transform (DFT)
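Since images and signals are discrete, we use the Discrete Fourier Transform (DFT) of a
sequence h(x) of length N:
\[ H(k) = \sum_{x=0}^{N-1} h(x)\, e^{-j 2\pi k x / N}, \qquad k = 0, \dots, N-1 \]
Example: Fourier Transform of a Simple Signal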
Consider the sequence:
h(x)=[1,2,3,4]
Applying the DFT formula, we get frequency domain values:
H(k)=[10,−2+2j,−2,−2−2j]
This tells us how the original signal is composed of different frequencies.
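The values above can be verified with a direct evaluation of the unnormalized DFT sum, H(k) = Σ h(x) e^{−j2πkx/N}. This is a minimal sketch, not an FFT.

```python
import cmath
import math

def dft(h):
    """Direct evaluation of H(k) = sum_x h(x) * exp(-j*2*pi*k*x/N)."""
    n = len(h)
    return [sum(value * cmath.exp(-2j * math.pi * k * x / n) for x, value in enumerate(h))
            for k in range(n)]

H = dft([1, 2, 3, 4])
print([complex(round(c.real, 6), round(c.imag, 6)) for c in H])
```

H(0) = 10 is simply the sum of the samples (the DC component); the remaining coefficients come in complex-conjugate pairs because the input is real.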
2.4.5 Fast Fourier Transform (FFT)
2.4.5.1 Why FFT?
The DFT requires O(N²) computations, which is slow for large data.
The Fast Fourier Transform (FFT) reduces this complexity to O(N log N), making it much
faster.
2.4.5.2 How FFT Works
• Breaks down the DFT computation into smaller parts.
• Uses the divide-and-conquer approach: an N-point DFT is split into two N/2-point DFTs
over the even-indexed and odd-indexed samples.
• Butterfly operations combine the partial results at each stage.
Example: FFT in Image Processing
When applying FFT to an image, it helps analyze and filter frequency components.
1. Compute the FFT of the image.
2. Modify the frequency spectrum (e.g., remove high frequencies).
3. Compute the Inverse FFT (IFFT) to reconstruct the image.
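The three steps can be sketched on a single image row. For brevity this uses a direct DFT on 1D data instead of a 2D FFT; the signal values and the cutoff are made up for illustration.

```python
import cmath
import math

def dft(values, inverse=False):
    """Direct DFT (or inverse DFT, including the 1/N factor)."""
    n = len(values)
    sign = 1 if inverse else -1
    out = [sum(v * cmath.exp(sign * 2j * math.pi * k * x / n) for x, v in enumerate(values))
           for k in range(n)]
    return [c / n for c in out] if inverse else out

def low_pass(row, cutoff):
    """Transform, zero every bin above the cutoff frequency, transform back."""
    n = len(row)
    spectrum = dft(row)
    # Bins k and n-k carry the same frequency, so both are kept or both removed.
    kept = [c if min(k, n - k) <= cutoff else 0 for k, c in enumerate(spectrum)]
    return [c.real for c in dft(kept, inverse=True)]

n = 32
smooth = [100 + 50 * math.sin(2 * math.pi * x / n) for x in range(n)]
noisy = [s + 10 * math.cos(2 * math.pi * 12 * x / n) for x, s in enumerate(smooth)]
restored = low_pass(noisy, cutoff=4)
```

Because the added disturbance lives entirely in bin 12, removing everything above bin 4 recovers the smooth row. Real image noise is spread over many frequencies, so low-pass filtering there trades noise reduction against blurring.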
Applications:
• Audio processing
• Image compression (JPEG)
• Medical imaging (MRI)
2.4.6 2D Fourier Transform for Images

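Images are 2D signals, so we use a 2D Fourier Transform:
\[ H(\omega_x, \omega_y) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} h(x,y)\, e^{-j(\omega_x x + \omega_y y)}\, dx\, dy \]
For discrete M×N images, the 2D Discrete Fourier Transform (2D DFT) is:
\[ H(k_x, k_y) = \sum_{x=0}^{M-1}\sum_{y=0}^{N-1} h(x,y)\, e^{-j 2\pi \left( k_x x / M + k_y y / N \right)} \]
Understanding Frequency Domain in Images: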


• Low frequencies = smooth regions.
• High frequencies = edges, textures.
2.4.7 Fourier Transform in Image Filtering
Filtering in the frequency domain is often faster than direct convolution when the filter kernel is large.
2.4.7.1 Low-Pass Filtering (LPF)
• Removes high-frequency details (edges, noise).
• Used for blurring and denoising.
2.4.7.2 High-Pass Filtering (HPF)
• Removes low-frequency details (smooth regions).
• Enhances edges and sharp details.

Example: Edge Detection Using Fourier Transform


1. Compute FFT of an image.
2. Apply high-pass filter in the frequency domain.
3. Compute the Inverse FFT (IFFT) to recover edges.
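2.4.8 Discrete Cosine Transform (DCT)
2.4.8.1 Why Use DCT?
• The Discrete Cosine Transform (DCT) is widely used in JPEG image compression.
• Unlike the DFT, it produces only real-valued coefficients (no imaginary components).
2.4.8.2 2D Discrete Cosine Transform (DCT)
For an N×N block f(i, j), up to normalization constants:
\[ F(k,l) = \sum_{i=0}^{N-1}\sum_{j=0}^{N-1} f(i,j)\, \cos\left[ \frac{\pi}{N}\left(i + \tfrac{1}{2}\right) k \right] \cos\left[ \frac{\pi}{N}\left(j + \tfrac{1}{2}\right) l \right] \]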
Example: JPEG Compression


1. Split the image into 8×8 blocks and convert each block to DCT coefficients.
2. Quantize the coefficients, which zeroes out most of the small high-frequency ones.
3. Store only the remaining significant frequency components.
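2.4.9 Fourier Transform and Image Restoration
2.4.9.1 Wiener Filtering
A blurry, noisy image can be represented as:
\[ g(x,y) = h(x,y) * f(x,y) + n(x,y) \]
where f is the original image, h is the blur kernel (point spread function), * denotes
convolution, and n is additive noise. Wiener filtering estimates f from g in the frequency
domain while limiting the amplification of noise.
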

2.4.10 Properties of Fourier Transform (FT)


The Fourier Transform (FT) has several important mathematical properties that make it
extremely useful for signal processing, image processing, and physics. These properties help
in simplifying complex problems and performing operations efficiently.
2.4.10.1 Linearity Property
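Statement: for any constants a and b, with f(x) ↔ F(ω) and g(x) ↔ G(ω):
\[ \mathcal{F}\{ a f(x) + b g(x) \} = a F(\omega) + b G(\omega) \]
Example: the transform of 3 f(x) + 2 g(x) is 3 F(ω) + 2 G(ω).
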
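2.4.10.2 Time Shifting Property
Statement: shifting a signal by x_0 multiplies its transform by a phase factor:
\[ f(x - x_0) \leftrightarrow e^{-j\omega x_0} F(\omega) \]
Example: shifting f(x) by x_0 = 2 gives e^{−j2ω} F(ω); the magnitude |F(ω)| is unchanged and
only the phase changes.
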
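2.4.10.3 Frequency Shifting Property
Statement: multiplying a signal by a complex exponential shifts its spectrum:
\[ e^{j\omega_0 x} f(x) \leftrightarrow F(\omega - \omega_0) \]
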
2.4.10.4 Scaling property
Statement: for any real constant a ≠ 0:
\[ f(a x) \leftrightarrow \frac{1}{|a|}\, F\!\left( \frac{\omega}{a} \right) \]
This shows that compressing a function in time expands it in frequency.


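2.4.10.5 Convolution Theorem
Statement:
The Fourier Transform of the convolution of two signals is the product of their Fourier
Transforms:
\[ \mathcal{F}\{ f(x) * g(x) \} = F(\omega)\, G(\omega) \]
Similarly, multiplication in time domain corresponds to convolution in frequency domain
(the factor 1/2π applies to the angular-frequency convention F(ω) = ∫ f(x) e^{−jωx} dx):
\[ \mathcal{F}\{ f(x)\, g(x) \} = \frac{1}{2\pi}\, F(\omega) * G(\omega) \]
Example: blurring an image with a Gaussian kernel can be computed by multiplying the
image's transform by the kernel's transform and applying the inverse transform.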
This is useful in image filtering and signal processing.
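A small numerical check of the theorem for discrete signals. For the DFT the theorem holds for circular (wrap-around) convolution, which is the form used in this sketch; the two sequences are made up for illustration.

```python
import cmath
import math

def dft(values, inverse=False):
    """Direct DFT (or inverse DFT, including the 1/N factor)."""
    n = len(values)
    sign = 1 if inverse else -1
    out = [sum(v * cmath.exp(sign * 2j * math.pi * k * x / n) for x, v in enumerate(values))
           for k in range(n)]
    return [c / n for c in out] if inverse else out

def circular_convolve(f, g):
    """Circular convolution: indices wrap around modulo the signal length."""
    n = len(f)
    return [sum(f[m] * g[(x - m) % n] for m in range(n)) for x in range(n)]

f = [1, 2, 3, 4]
g = [1, 1, 0, 0]
direct = circular_convolve(f, g)
product = [a * b for a, b in zip(dft(f), dft(g))]   # multiply the spectra
via_dft = [c.real for c in dft(product, inverse=True)]
print(direct, [round(v, 6) for v in via_dft])
```

Both routes give the same result. For large kernels the frequency-domain route is cheaper once a fast transform replaces the direct DFT used here.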


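2.4.10.6 Parseval's Theorem
Statement: the energy of a signal is the same in both domains (with the convention
F(ω) = ∫ f(x) e^{−jωx} dx):
\[ \int_{-\infty}^{\infty} |f(x)|^2\, dx = \frac{1}{2\pi} \int_{-\infty}^{\infty} |F(\omega)|^2\, d\omega \]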
This property is useful in signal power analysis.


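2.4.10.7 Duality Property
Statement: if f(x) ↔ F(ω), then (with the convention F(ω) = ∫ f(x) e^{−jωx} dx):
\[ F(x) \leftrightarrow 2\pi\, f(-\omega) \]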
This shows a symmetry between the time and frequency domains.


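2.4.10.8 Differentiation Property
Statement:
If:
\[ f(x) \leftrightarrow F(\omega) \]
then differentiation in the time domain corresponds to multiplication by jω in the
frequency domain:
\[ \frac{d f(x)}{dx} \leftrightarrow j\omega\, F(\omega) \]
Example:
For:
\[ f(x) = e^{-x^2}, \qquad F(\omega) = \sqrt{\pi}\, e^{-\omega^2/4} \]
Using Fourier Transform:
\[ f'(x) = -2x\, e^{-x^2} \leftrightarrow j\omega \sqrt{\pi}\, e^{-\omega^2/4} \]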
This property is used in edge detection and feature extraction.


2.4.11 Applications of Fourier Transforms


1. Image Compression (JPEG, MPEG)
2. Edge Detection & Enhancement
3. Image Filtering (Sharpening, Smoothing)
4. Pattern Recognition
5. Medical Imaging (MRI, CT Scan)
6. Speech and Audio Processing
7. Optical Character Recognition (OCR)
Conclusion
• Fourier Transform is crucial in signal and image processing.
• FFT speeds up computation.
• 2D Fourier Transforms help in filtering and compression.
• DCT is widely used in JPEG compression.
• Fourier-based methods help in denoising and image restoration.
2.5 Pyramids and wavelets
2.5.1 Introduction
2.5.1.1 What are Pyramids and Wavelets?
• Pyramids and Wavelets are multi-resolution techniques used in image
processing, signal processing, and computer vision.
• They help analyze images at different scales and frequencies, making them
useful for:
o Image compression
o Feature detection
o Noise removal
o Object recognition

Feature              Pyramids                                     Wavelets
Basis Function       Gaussian/Laplacian                           Wavelet functions (Haar, Daubechies, etc.)
Decomposition Type   Multi-scale resolution reduction             Multi-scale frequency decomposition
Redundancy           High (overcomplete representation)           Low (compact representation)
Applications         Image pyramids, blending, object detection   Compression (JPEG2000), denoising, feature extraction

2.5.2. Image Pyramids


2.5.2.1 What is an Image Pyramid?
• An image pyramid is a set of images at different resolutions, arranged
hierarchically.
• It helps analyze images at multiple scales.
2.5.2.2 Types of Pyramids
2.5.2.2.1 Gaussian Pyramid
• A Gaussian Pyramid is created by:
1. Smoothing an image using a Gaussian filter.
2. Downsampling by a factor of 2.
3. Repeating the process to create multiple levels.
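Mathematical Formulation:
\[ G_{l+1}(i,j) = \sum_{m}\sum_{n} w(m,n)\, G_l(2i + m,\; 2j + n) \]
where G_l is the image at level l (G_0 is the input) and w(m, n) is a Gaussian filter kernel.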


Example:
• Input: 256×256 image
• Gaussian Pyramid Levels:
o Level 0: 256×256
o Level 1: 128×128
o Level 2: 64×64
o Level 3: 32×32
o Level 4: 16×16
o Level 5: 8×8
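The smooth-then-downsample loop can be sketched as below. The 5-tap binomial kernel [1, 4, 6, 4, 1]/16 is a common approximation of a Gaussian and is an assumption here, as are the edge handling (repeating border pixels) and the made-up test pattern.

```python
KERNEL = [1 / 16, 4 / 16, 6 / 16, 4 / 16, 1 / 16]  # binomial approximation of a Gaussian

def blur_rows(image):
    """Filter every row with KERNEL, repeating edge pixels at the borders."""
    width = len(image[0])
    return [[sum(k * row[min(max(j + d - 2, 0), width - 1)] for d, k in enumerate(KERNEL))
             for j in range(width)]
            for row in image]

def transpose(image):
    return [list(column) for column in zip(*image)]

def reduce_level(image):
    """Blur horizontally and vertically, then keep every second row and column."""
    blurred = transpose(blur_rows(transpose(blur_rows(image))))
    return [row[::2] for row in blurred[::2]]

def gaussian_pyramid(image, levels):
    pyramid = [image]
    for _ in range(levels - 1):
        pyramid.append(reduce_level(pyramid[-1]))
    return pyramid

image = [[float((i + j) % 7) for j in range(16)] for i in range(16)]  # made-up pattern
pyramid = gaussian_pyramid(image, 4)
sizes = [(len(level), len(level[0])) for level in pyramid]
print(sizes)  # [(16, 16), (8, 8), (4, 4), (2, 2)]
```

Each level has half the width and height of the previous one, just as in the 256×256 example above.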


2.5.2.2.2 Laplacian Pyramid


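• A Laplacian Pyramid stores the band-pass (high-frequency) detail that is lost between
successive levels of a Gaussian Pyramid.
• It is formed by subtracting from each Gaussian level the upsampled version of the next
coarser level.
Mathematical Formulation:
\[ L_l = G_l - \mathrm{expand}(G_{l+1}) \]
where:
• G_l is level l of the Gaussian Pyramid,
• expand(·) upsamples by a factor of 2 and interpolates,
• the coarsest level is kept as-is (L_N = G_N), so the image can be rebuilt by expanding
and adding the levels back.
Example
For a 256×256 input, L_0 = G_0 − expand(G_1) is a 256×256 detail image, L_1 is 128×128,
and so on.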
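The construction and exact reconstruction can be sketched in 1D. To keep the example short it makes two simplifying assumptions: reduce averages pairs of samples instead of applying a Gaussian kernel, and expand repeats each sample instead of interpolating. The signal values are made up.

```python
def reduce_level(signal):
    """Halve the resolution by averaging neighboring pairs (simplified reduce)."""
    return [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]

def expand_level(signal):
    """Double the resolution by repeating every sample (simplified expand)."""
    return [value for value in signal for _ in range(2)]

def laplacian_pyramid(signal, levels):
    gaussian = [signal]
    for _ in range(levels - 1):
        gaussian.append(reduce_level(gaussian[-1]))
    # Each detail level is a Gaussian level minus the expanded coarser level.
    laplacian = [[a - b for a, b in zip(fine, expand_level(coarse))]
                 for fine, coarse in zip(gaussian, gaussian[1:])]
    laplacian.append(gaussian[-1])  # the coarsest level is stored as-is
    return laplacian

def reconstruct(laplacian):
    current = laplacian[-1]
    for detail in reversed(laplacian[:-1]):
        current = [d + e for d, e in zip(detail, expand_level(current))]
    return current

signal = [4, 6, 10, 12, 8, 8, 2, 0]
pyramid = laplacian_pyramid(signal, 3)
print(pyramid[0])            # finest detail level
print(reconstruct(pyramid))  # equals the original signal
```

Adding the expanded levels back recovers the input exactly, which is why the pyramid can be used for progressive transmission and blending.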

2.5.3 Applications of Image Pyramids


2.5.3.1 Image Compression
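• The Laplacian Pyramid was originally proposed as a compact image code (Burt and
Adelson, 1983); JPEG2000 itself uses wavelets rather than a Laplacian Pyramid.
• High-resolution details are stored separately and, being mostly close to zero, can be
coded with few bits, reducing storage requirements.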
2.5.3.2 Object Detection
• Pyramids help detect objects at different scales.
• Used in Haar cascades for face detection.
2.5.3.3 Image Blending
• Used to blend two images smoothly without visible seams.
2.5.4 Wavelet Transform
2.5.4.1 What are Wavelets?
• Wavelets are mathematical functions that analyze a signal in both time and
frequency domains.
• Unlike Fourier Transform, which analyzes only global frequency, wavelets provide
localized frequency information.
2.5.4.2 Wavelet Transform Process
1. Apply high-pass and low-pass filters to an image.
2. Downsample each output by a factor of 2 to remove redundancy.
3. Repeat on the low-pass output for multiple levels to obtain a multi-level frequency
decomposition.
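Mathematical Formulation (1D Wavelet Transform):
\[ W(s,t) = \int_{-\infty}^{\infty} f(x)\, \psi_{s,t}(x)\, dx, \qquad \psi_{s,t}(x) = \frac{1}{\sqrt{s}}\, \psi\!\left( \frac{x - t}{s} \right) \]
where ψ is the wavelet function, s is the scale, and t is the translation.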


2.5.4.3 2D Wavelet Transform
• A 2D wavelet transform decomposes an image into four frequency components:
1. LL (Low-Low): Low-pass in both directions (blurred image).
2. LH (Low-High): Horizontal edges.
3. HL (High-Low): Vertical edges.
4. HH (High-High): Diagonal edges.
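One level of this decomposition can be sketched with the Haar wavelet, the simplest choice. The sketch uses an averaging (unnormalized) variant and assumes even image dimensions; the test image with a single horizontal edge is made up.

```python
def haar_2d(image):
    """One level of an averaging 2D Haar transform on an even-sized image."""
    ll, lh, hl, hh = [], [], [], []
    for i in range(0, len(image), 2):
        ll_row, lh_row, hl_row, hh_row = [], [], [], []
        for j in range(0, len(image[0]), 2):
            a, b = image[i][j], image[i][j + 1]
            c, d = image[i + 1][j], image[i + 1][j + 1]
            ll_row.append((a + b + c + d) / 4)  # local average (blurred image)
            lh_row.append((a + b - c - d) / 4)  # top minus bottom: horizontal edges
            hl_row.append((a - b + c - d) / 4)  # left minus right: vertical edges
            hh_row.append((a - b - c + d) / 4)  # diagonal differences
        ll.append(ll_row)
        lh.append(lh_row)
        hl.append(hl_row)
        hh.append(hh_row)
    return ll, lh, hl, hh

image = [
    [10, 10, 10, 10],
    [50, 50, 50, 50],   # horizontal edge between row 0 and row 1
    [50, 50, 50, 50],
    [50, 50, 50, 50],
]
LL, LH, HL, HH = haar_2d(image)
print(LH, HL)
```

Only the LH band responds to the horizontal edge; HL and HH stay at zero, which illustrates the orientation sensitivity of wavelets.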
2.5.5. Comparison: Pyramids vs. Wavelets

Feature                   Pyramids                                          Wavelets
Type                      Resolution-based decomposition                    Frequency-based decomposition
Orientation Sensitivity   Low                                               High
Redundancy                High                                              Low
Compression               Laplacian pyramid coding (not part of JPEG2000)   JPEG2000 (Wavelets)
Applications              Face detection, image blending                    Denoising, feature extraction

2.5.6. Applications of Wavelets


2.5.6.1 JPEG2000 Image Compression
• Uses wavelet decomposition instead of DCT.
• Provides better compression with fewer artifacts.
2.5.6.2 Denoising
• Removes high-frequency noise while preserving edges.
2.5.6.3 Feature Extraction
• Used in fingerprint recognition and texture analysis.


2.5.6.4 Medical Imaging


• MRI and X-ray images use wavelets for noise removal.
2.6 Geometric Transformations
2.6.1 Introduction to Geometric Transformations
Geometric transformations involve modifying the spatial arrangement of pixels in an image
while maintaining its intensity values. These transformations are crucial for image registration,
object tracking, image warping, and 3D reconstruction.
Types of Transformations
• Affine Transformations: Preserve parallel lines.
• Projective Transformations: Preserve straight lines but not parallelism.
• Non-linear Transformations: Involve deformations like warping.
2.6.2. Basic Transformations
2.6.2.1 Translation
• Shifts an image by (Tx, Ty) units.
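• Given an image point (x, y), after translation:
\[ x' = x + T_x, \qquad y' = y + T_y \]
The homogeneous coordinate representation:
\[ \begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & T_x \\ 0 & 1 & T_y \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} \]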
Example:
Shifting an image by (50, 30) pixels moves each pixel 50 units right and 30 units down.
2.6.2.2 Scaling
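• Changes the size of an image by factors S_x and S_y:
\[ x' = S_x\, x, \qquad y' = S_y\, y \]
In Matrix form:
\[ \begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = \begin{bmatrix} S_x & 0 & 0 \\ 0 & S_y & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} \]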

Example:
Scaling an image by Sx = 2, Sy = 1.5 makes it twice as wide and 1.5 times taller.
2.6.2.3 Rotation
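Rotates an image around the origin by an angle θ.
Using trigonometric functions:
\[ x' = x\cos\theta - y\sin\theta, \qquad y' = x\sin\theta + y\cos\theta \]
Matrix Form
\[ \begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = \begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} \]
Example:
Rotating an image by 90° counterclockwise maps each point (x, y) to (−y, x).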
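2.6.2.4 Shearing
• Slants an image by shifting each coordinate in proportion to the other:
\[ x' = x + Sh_x\, y, \qquad y' = y + Sh_y\, x \]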
2.6.4 Affine Transformations


• A linear transformation + translation.
• Preserves parallelism, but not angles or lengths.
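• General form:
\[ \begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} t_x \\ t_y \end{bmatrix} \]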

Example Uses
1. Face Detection: Detect faces at different scales.
2. Template Matching: Align objects in images.
3. Optical Character Recognition (OCR): Normalize text images.
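The basic transformations and their combination into one affine transform can be sketched with 3×3 homogeneous matrices. The helper names (translation, scaling, rotation, compose, apply) are illustrative, and the sketch transforms point coordinates only, not pixel data.

```python
import math

def translation(tx, ty):
    return [[1, 0, tx], [0, 1, ty], [0, 0, 1]]

def scaling(sx, sy):
    return [[sx, 0, 0], [0, sy, 0], [0, 0, 1]]

def rotation(theta_degrees):
    c = math.cos(math.radians(theta_degrees))
    s = math.sin(math.radians(theta_degrees))
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]

def compose(*matrices):
    """Multiply matrices left to right; the rightmost one is applied first."""
    result = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
    for m in matrices:
        result = [[sum(result[i][k] * m[k][j] for k in range(3)) for j in range(3)]
                  for i in range(3)]
    return result

def apply(matrix, point):
    """Apply a homogeneous affine matrix to a 2D point (x, y)."""
    x, y = point
    return (matrix[0][0] * x + matrix[0][1] * y + matrix[0][2],
            matrix[1][0] * x + matrix[1][1] * y + matrix[1][2])

moved = apply(translation(50, 30), (10, 10))   # (60, 40)
scaled = apply(scaling(2, 1.5), (10, 10))      # (20, 15.0)
rotated = apply(rotation(90), (1, 0))          # approximately (0, 1)
# Rotate by 90 degrees first, then translate by (50, 30).
combined = apply(compose(translation(50, 30), rotation(90)), (1, 0))
```

Because every step is a matrix, a chain of translations, scalings, and rotations collapses into a single affine matrix that is applied once per point.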
2.6.5 Projective Transformations (Homography)
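• Generalizes affine transforms: straight lines remain straight, but parallel lines need not
remain parallel.
• General form (a 3×3 matrix defined up to scale):
\[ \begin{bmatrix} \tilde{x} \\ \tilde{y} \\ w \end{bmatrix} = \begin{bmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} \]
The final coordinates are obtained by dividing by w:
\[ x' = \tilde{x} / w, \qquad y' = \tilde{y} / w \]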

Example Uses
1. Panoramic Stitching: Aligns overlapping images.
2. Perspective Correction: Rectifies tilted images.
3. Augmented Reality (AR): Maps virtual objects onto real-world images.
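Applying a homography to a point, including the division by w, can be sketched as follows. The matrix H below is a made-up example with a small perspective term, not a homography estimated from real images.

```python
def apply_homography(H, point):
    """Map a 2D point through a 3x3 homography and divide by w."""
    x, y = point
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (xh / w, yh / w)

# Hypothetical homography: identity plus a small perspective term in the last row.
H = [[1, 0, 0],
     [0, 1, 0],
     [0.001, 0, 1]]
near = apply_homography(H, (0, 50))     # w = 1.0, point unchanged
far = apply_homography(H, (100, 50))    # w = 1.1, point pulled toward the origin
print(near, far)
```

Points with larger x are divided by a larger w, so lines that were parallel in the input converge in the output; multiplying H by any non-zero constant leaves the mapping unchanged.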
2.6.6 Non-Linear Transformations
• Warping: Distorts images non-linearly.
• Thin-Plate Splines: Used for facial expression transfer.
Example Uses
1. Medical Imaging: Warping scans for alignment.
2. Facial Recognition: Normalizing expressions.
2.6.7. Interpolation Methods
Since geometric transformations map pixels to non-integer locations, interpolation is needed.
2.6.7.1 Nearest Neighbor Interpolation


• Takes the closest pixel value.


• Fast but causes blocky artifacts.
2.6.7.2 Bilinear Interpolation
• Uses four nearest pixels for smooth blending.
• Better quality than nearest neighbor.
2.6.7.3 Bicubic Interpolation
• Uses 16 nearest pixels.
• Best for smooth results.
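Nearest-neighbor and bilinear sampling at a non-integer location can be compared with a short sketch. The function names and the 2×2 test image are illustrative; coordinates are given as (x, y) = (column, row), and bicubic interpolation is omitted for brevity.

```python
import math

def nearest(image, x, y):
    """Value of the pixel closest to the location (x, y)."""
    return image[math.floor(y + 0.5)][math.floor(x + 0.5)]

def bilinear(image, x, y):
    """Blend the four pixels surrounding (x, y) by their distances."""
    x0, y0 = math.floor(x), math.floor(y)
    x1 = min(x0 + 1, len(image[0]) - 1)  # stay inside the image at the border
    y1 = min(y0 + 1, len(image) - 1)
    fx, fy = x - x0, y - y0
    top = image[y0][x0] * (1 - fx) + image[y0][x1] * fx
    bottom = image[y1][x0] * (1 - fx) + image[y1][x1] * fx
    return top * (1 - fy) + bottom * fy

img = [[10, 20],
       [30, 40]]
print(nearest(img, 0.4, 0.6))   # 30: snaps to the pixel in row 1, column 0
print(bilinear(img, 0.5, 0.5))  # 25.0: average of all four pixels
```

Nearest neighbor jumps between pixel values, which causes the blocky look, while bilinear interpolation varies smoothly between them.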
2.6.8 Applications of Geometric Transformations

Application                  Transformation Used
Image Stitching              Homography
Face Detection               Scaling, Rotation
OCR (Text Recognition)       Affine Transform
Augmented Reality            Perspective Warping
Medical Image Registration   Non-linear Warping

2.6.9 Conclusion
• Geometric transformations enable image scaling, rotation, translation, and warping.
• Affine and projective transformations are widely used in computer vision tasks.
• Interpolation techniques ensure smooth transformation results.
• Applications include panoramic stitching, face detection, and OCR.
Important Questions
1. Demonstrate various types of neighborhood operators with examples.
2. Given an image with noise, apply a Fourier Transform technique to improve its quality.
Explain your choice.
3. Given a distorted image, explain how you would apply geometric transformations
to correct and enhance the image. Provide real-world examples
where such transformations are crucial in image processing applications.
4. You are given an image with multiple objects of varying sizes. How would you use
image pyramids and wavelet transforms to perform object detection efficiently?
Explain your approach with practical examples where multi-scale analysis is essential.


5. What is bilateral filtering, and how does it help in edge-preserving smoothing? How is
it different from median filtering?
6. Why is Fourier Transform important in image processing? How does it help in image
enhancement and compression?
7. How does Optical Character Recognition (OCR) benefit from geometric
transformations? What role does interpolation play in improving image quality?
8. What is a Laplacian pyramid in image compression? How does it compare with
wavelet-based compression techniques like JPEG2000?
9. How are high-pass and low-pass filters used in medical imaging? What aspects of
medical images do they improve?
10. What are non-linear geometric transformations like warping? How are they useful in
medical imaging, augmented reality, and facial recognition?

Brief Answers for the Questions


1. Demonstrate various types of neighborhood operators with examples.
Neighborhood operators process a pixel based on its surrounding pixels, commonly used for
noise removal and edge preservation.
• Median Filtering
o Replaces each pixel with the median value from its neighborhood.
o Example: Effective for salt-and-pepper noise, as extreme values are ignored.
o Given a 5×5 noisy window, sorting and selecting the middle value removes
noise.
• α-Trimmed Mean Filter
o Discards α fraction of extreme values and averages the remaining pixels.
o Application: Suitable when both Gaussian and impulse noise are present.
• Weighted Median Filtering
o Assigns different weights to nearby pixels before computing the median.
o Used in medical imaging and edge-preserving denoising.
2. Given an image with noise, apply Fourier Transform technique to improve its quality.
Explain your choice.
The Fourier Transform (FT) converts an image from the spatial domain to the frequency
domain, making noise removal more efficient.
Steps to improve noisy images using FT:
1. Apply Fast Fourier Transform (FFT): Converts the image into frequency
components.
2. Identify noise frequencies: Noise often appears as high-frequency components in
the spectrum.
3. Apply Low-Pass Filtering (LPF):
o Retains low frequencies (important image details).
o Removes high frequencies (noise).
4. Use Inverse Fourier Transform (IFFT): Converts the filtered image back to the
spatial domain.
Why Fourier Transform?
• Noise often manifests in high-frequency regions, which can be selectively removed.
• Helps in image denoising, sharpening, and feature extraction.
3. Given a distorted image, explain how you would apply geometric transformations to
correct and enhance the image. Provide real-world examples where such transformations
are crucial in image processing applications.
Geometric transformations modify the pixel arrangement while preserving intensity values.
• Translation: Moves the image to align objects correctly.
• Scaling: Enlarges or reduces the image (uniform scaling, Sx = Sy, maintains the aspect ratio).
• Rotation: Adjusts orientation (e.g., correcting tilted scanned documents).
• Affine Transformations: Preserve parallelism and are useful for face detection and
template matching.
• Homography (Projective Transformation): Used in panorama stitching and
perspective correction.
Real-World Examples:
• Augmented Reality (AR): Mapping virtual objects onto real-world scenes.
• Medical Imaging: Aligning CT scans for proper diagnosis.
• Document Scanning: Correcting skewed or distorted scanned text.
4. You are given an image with multiple objects of varying sizes. How would you use
image pyramids and wavelet transforms to perform object detection efficiently? Explain
your approach with practical examples where multi-scale analysis is essential.
Approach:
• Gaussian Pyramid:
o Downsamples an image progressively to detect objects at different scales.
• Laplacian Pyramid:
o Captures high-frequency details to reconstruct the image after processing.
• Wavelet Transform:
o Decomposes an image into frequency sub-bands for analyzing different levels
of detail.
Practical Applications:
• Face Detection: Haar cascades use pyramids to detect faces of different sizes.
• Fingerprint Recognition: Multi-scale analysis helps in feature extraction.
• Texture Analysis: Used in medical imaging and industrial inspection.
5. What is bilateral filtering, and how does it help in edge-preserving smoothing? How is
it different from median filtering?
Bilateral Filtering:
• A non-linear filtering technique that smooths images while preserving edges.
• Weights are assigned based on spatial closeness and intensity similarity.
• Example: Neighbors whose intensity differs significantly from the center pixel receive a
low weight, so the center pixel is not averaged across an edge, unlike traditional blurring.
Comparison with Median Filtering:

Feature              Bilateral Filtering   Median Filtering
Edge Preservation    Excellent             Moderate
Noise Removal        Good for Gaussian     Best for impulse noise
Computational Cost   High                  Lower

6. Why is Fourier Transform important in image processing? How does it help in image
enhancement and compression?
Importance of FT in Image Processing:
• Converts an image into frequency components.
• Useful for denoising, edge detection, and compression.
Applications:
• Enhancement: High-pass filtering sharpens images, and low-pass filtering removes
noise.
• Compression:
o Discrete Cosine Transform (DCT) (used in JPEG) keeps significant
frequency components while discarding less important ones.
o Wavelet Transform (JPEG2000) provides better compression efficiency.
7. How does Optical Character Recognition (OCR) benefit from geometric
transformations? What role does interpolation play in improving image quality?
OCR benefits from:
• Geometric Transformations:
o Rotation correction for misaligned text.
o Scaling to adjust font size for better recognition.
o Affine transformations to normalize text shapes.
• Interpolation in OCR:
o Nearest Neighbor: Fast but blocky.
o Bilinear/Bicubic: Produces smoother text images, improving OCR accuracy.
8. What is a Laplacian pyramid in image compression? How does it compare with
wavelet-based compression techniques like JPEG2000?
Laplacian Pyramid:
• Stores high-frequency details separately to allow multi-resolution image
reconstruction.
• Used in progressive image encoding and seamless image blending.
Comparison with Wavelet Compression (JPEG2000):

Feature               Laplacian Pyramid                      Wavelet Transform
Compression Ratio     Moderate                               High
Detail Preservation   Good                                   Excellent
Redundancy            High                                   Low
Used In               Image blending, Progressive decoding   JPEG2000 compression

9. How are high-pass and low-pass filters used in medical imaging? What aspects of
medical images do they improve?
• High-Pass Filters:
o Enhance sharp details like edges.


o Used in tumor detection, MRI scans, and X-ray imaging.


• Low-Pass Filters:
o Remove noise and smoothen images.
o Used in ultrasound imaging and noise removal from CT scans.
10. What are non-linear geometric transformations like warping? How are they useful in
medical imaging, augmented reality, and facial recognition?
• Warping:
o Non-linear transformation that reshapes images.
Applications:
• Medical Imaging: Aligning MRI scans for accurate diagnosis.
• Augmented Reality: Adjusting perspective for realistic overlays.
• Facial Recognition: Aligning facial features for consistent comparison.
