3.1 - Image Fundamentals

The document covers fundamental concepts in data science, focusing on image processing and computer vision. It discusses image representation, pixel structures, convolution operations, and the importance of scaling and cropping images. Additionally, it introduces convolutional neural networks (CNNs) as a key topic for further study.

Uploaded by

toufeeqdata
Copyright
© All Rights Reserved

Current Topics in Data Science

Image Fundamentals
Leandro L. Minku
Video Presentation Assignment
• Presentation about a data science problem of your choice.
• Max 5 min.
• Your proposed approach to solve it.
• No need to implement or run this approach, just propose it!
• It can be any data science approach, not restricted to the approaches learned in this module.
• Recommendation: note that one of the marking criteria is the suitability of the proposed approach.
• Explain why you believe your proposed approach is suitable!

Module Topics
1. Time series

2. Natural language processing

3. Computer vision

4. Fairness

Computer Vision
• Image fundamentals.
• Convolutional Neural Networks and variations.
• Image classification, image segmentation.

[Image source: https://miro.medium.com/v2/resize:fit:720/format:webp/1*uAeANQIOQPqWZnnuH-VEyw.jpeg]
[Image source: Deep Learning with Python book]

Outline
• Pixels — the building blocks of images.

• Grayscale and RGB colour space.

• Images as arrays.

• Scaling, aspect ratios, cropping.

• Kernels and convolutions.

Pixels

• An image can be seen as a grid.

• A pixel is the colour or intensity at one position in the grid.

• It is the smallest building block of an image.

Pixel Representation
• Grayscale
• Scalar between 0 (black) and 255 (white), i.e., in {0,…,255}.

Pixel Representation
• Colour
• Various colour representations exist.

• Red, Green, Blue (RGB): each pixel has three channels.
• R: scalar in {0,…,255}.
• G: scalar in {0,…,255}.
• B: scalar in {0,…,255}.

RGB: Three Channels

[Image source: Deep Learning for Computer Vision with Python book]

Images as Arrays: 1 Channel (2d Matrices)

• Images are matrices, indexed by (row, column).

$$
M = \begin{bmatrix}
M_{1,1} & M_{1,2} & M_{1,3} & \cdots \\
M_{2,1} & M_{2,2} & M_{2,3} & \cdots \\
M_{3,1} & M_{3,2} & M_{3,3} & \cdots
\end{bmatrix}
$$

Images as Arrays: 1 Channel (2d Matrices)

• Images are arrays, indexed by (row, column).

M = M[0,0], M[0,1], M[0,2], …
    M[1,0], M[1,1], M[1,2], …
    M[2,0], M[2,1], M[2,2], …

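The (row, column) indexing above can be tried out with a small array. This is a minimal sketch that assumes NumPy as the array library; the 3 x 4 image and its values are made up for illustration.

```python
import numpy as np

# Hypothetical 3 x 4 grayscale image (values chosen arbitrarily).
M = np.array([[0, 50, 100, 150],
              [10, 60, 110, 160],
              [20, 70, 120, 170]], dtype=np.uint8)

# The shape is (rows, columns), i.e. (height, width).
rows, cols = M.shape
print(rows, cols)   # 3 4

# Indexing is zero-based: row first, then column.
print(M[0, 0])      # top-left pixel: 0
print(M[1, 2])      # row 1, column 2: 110
```

Note that the row index comes first, so M[1, 2] is the pixel one row down and two columns across, not the other way round.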
Images as Arrays: 3 Channels (3d Tensors)

• Images are arrays, indexed by (row, column, depth).

[Figure labels: M[0,0,2], M[0,0,1], M[0,0,0], the three channel values of the pixel at row 0, column 0.]

Note: the above corresponds to the OpenCV library, which stores pixels in BGR order, so the red value of that pixel is M[0,0,2]. Other libraries, like matplotlib, assume RGB, such that the red value would be M[0,0,0].

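The difference between BGR and RGB ordering can be illustrated with plain array slicing. This is a sketch under stated assumptions: NumPy as the array library and a made-up 1 x 2 image; in OpenCV itself, cv2.cvtColor with cv2.COLOR_BGR2RGB performs the same reordering.

```python
import numpy as np

# Hypothetical 1 x 2 colour image in BGR order (as OpenCV would store it):
# a pure red pixel followed by a pure blue pixel.
bgr = np.array([[[0, 0, 255], [255, 0, 0]]], dtype=np.uint8)

# Reversing the last (depth) axis swaps the B and R channels: BGR -> RGB.
rgb = bgr[:, :, ::-1]

print(bgr.shape)      # (1, 2, 3): rows, columns, depth
print(bgr[0, 0, 2])   # red value of the first pixel in BGR order: 255
print(rgb[0, 0, 0])   # red value of the first pixel in RGB order: 255
```

Forgetting this conversion before plotting is a common reason for images appearing with red and blue swapped.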
Image Resolution

[Figure: an image with its Width and Height marked.]

• Level of detail.
• Resolution of 1000 x 750 means
• 1000 pixels wide and
• 750 pixels tall.
• I.e., 750,000 pixels in total.

Note: a pixel may have one or more channels (depth of image). This is not considered for the resolution.

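The 1000 x 750 example can be checked against an array's shape. A minimal sketch, assuming NumPy and a hypothetical all-black image; note that the shape lists the height (rows) before the width (columns).

```python
import numpy as np

# Hypothetical blank colour image: 750 rows (height), 1000 columns (width), 3 channels.
image = np.zeros((750, 1000, 3), dtype=np.uint8)

height, width, channels = image.shape

# The resolution counts pixels only; the channels (depth) are not included.
total_pixels = width * height

print(f"{width} x {height} = {total_pixels} pixels, {channels} channels")
```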
Demo
• image-fundamentals.ipynb
• Load an image with OpenCV library
• Convert from BGR to RGB
• Convert image to Grayscale
• Plot each channel of the image

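The notebook itself is not reproduced here. As a rough stand-in for the channel-splitting and grayscale steps, the sketch below uses NumPy only. The 2 x 2 image is made up, and the luma weights (0.299, 0.587, 0.114) are an assumption about how the grayscale conversion is done; they are the weights commonly used for this purpose.

```python
import numpy as np

# Hypothetical 2 x 2 RGB image: red, green, blue and white pixels.
rgb = np.array([[[255, 0, 0], [0, 255, 0]],
                [[0, 0, 255], [255, 255, 255]]], dtype=np.uint8)

# Each channel is a 2-D array with the same height and width as the image.
r, g, b = rgb[:, :, 0], rgb[:, :, 1], rgb[:, :, 2]

# Grayscale as a weighted sum of the channels (assumed weights, see above),
# rounded back to integers in {0, ..., 255}.
gray = np.rint(0.299 * r + 0.587 * g + 0.114 * b).astype(np.uint8)

print(r)     # the red channel on its own
print(gray)  # one intensity per pixel
```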
Scaling (Resizing)
• Increasing or decreasing the size of an image in terms of width and height.
• We will often need to do this when using machine learning.

• Aspect ratio: width / height.

Note: aesthetically, maintaining the aspect ratio is important, but it is not always the case for image machine learning problems.

[Image source: Deep Learning for Computer Vision with Python book]

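To keep the aspect ratio when resizing, the new height can be derived from the new width. The helper below is a hypothetical illustration (its name and signature are not from the lecture) that only computes the target dimensions; the actual resampling would be done by an image library.

```python
def resize_keep_aspect(width: int, height: int, new_width: int) -> tuple[int, int]:
    """Return (new_width, new_height) that preserves width / height."""
    aspect_ratio = width / height
    new_height = round(new_width / aspect_ratio)
    return new_width, new_height


# A 1000 x 750 image scaled down to a width of 400 keeps its 4:3 ratio.
print(resize_keep_aspect(1000, 750, 400))  # (400, 300)
```

When a machine learning model expects a fixed input size, the aspect ratio is often ignored and both dimensions are set directly.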
Cropping

Demo
• image-fundamentals.ipynb
• Rescaling an image
• Rescaling by factor
• Cropping an image

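Since images are arrays, cropping amounts to slicing a range of rows and columns. This is a minimal sketch assuming NumPy; the 6 x 8 image and the crop coordinates are invented for illustration and are not taken from the notebook.

```python
import numpy as np

# Hypothetical 6 x 8 grayscale image whose pixel values are simply 0..47.
image = np.arange(48, dtype=np.uint8).reshape(6, 8)

# Crop rows 1..3 and columns 2..5 (the end index of a slice is exclusive).
top, bottom, left, right = 1, 4, 2, 6
crop = image[top:bottom, left:right]

print(crop.shape)   # (3, 4)
print(crop[0, 0])   # the value originally at image[1, 2], i.e. 10
```

The slice is a view onto the original array; calling crop.copy() gives an independent image if the crop will be modified.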
Image Convolution
• Application of a filter to an image, e.g., to:
• Blur.
• Sharpen.
• Detect edges.
• Etc.

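Convolution
• Based on an element-wise multiplication of two matrices, followed by a sum.

$$
\begin{bmatrix} 0 & -1 & 0 \\ -1 & 5 & -1 \\ 0 & -1 & 0 \end{bmatrix}
\star
\begin{bmatrix} 93 & 139 & 101 \\ 26 & 252 & 196 \\ 135 & 230 & 18 \end{bmatrix}
$$

$$
\begin{aligned}
&= (0 \times 93) + (-1 \times 139) + (0 \times 101) \\
&\quad + (-1 \times 26) + (5 \times 252) + (-1 \times 196) \\
&\quad + (0 \times 135) + (-1 \times 230) + (0 \times 18) \\
&= 669
\end{aligned}
$$

Note: mathematically, this operation is called cross-correlation, but computer vision uses the term convolution for it.
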
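The arithmetic above can be reproduced in a couple of lines. A minimal sketch, assuming NumPy, using the kernel and the 3 x 3 image patch from this slide.

```python
import numpy as np

kernel = np.array([[0, -1, 0],
                   [-1, 5, -1],
                   [0, -1, 0]])

patch = np.array([[93, 139, 101],
                  [26, 252, 196],
                  [135, 230, 18]])

# Element-wise multiplication of the two matrices, followed by a sum.
result = int(np.sum(kernel * patch))
print(result)  # 669
```

Here `*` is the element-wise product, not matrix multiplication (which would be `@` in NumPy).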
Convolution
• Operation between a “large” matrix (image) and a “small” matrix (filter, a.k.a. kernel) that produces another matrix (image).

• The kernel slides over the image from left to right, top to bottom.

Kernel:
  0  -1   0
 -1   5  -1
  0  -1   0

Image:
 93 139 101   4   3   5   4   3   5   4
 26 252 196   3   2   5   3   2   5   3
135 230  18   2   4   3   2   4   3   2
  1   3   5   4   3   5   4   3   5   4
  7   2   5   3   2   5   3   2   5   3
  1   4   3   2   4   3   2   4   3   2
  1   3   5   4   3   5   4   3   5   4
  7   2   5   3   2   5   3   2   5   3
  1   4   3   2   4   3   2   4   3   2
  1   3   5   4   3   5   4   3   5   4

Output at the first kernel position (top-left 3 x 3 block, centred on 252): 669.

Note: the convolution is performed by applying the kernel to the original values of the image!

Note: convolution is usually applied separately to different channels.

Note: the resolution of the new image is smaller.

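The sliding procedure can be sketched with two nested loops. This is an illustrative implementation, assuming NumPy; the function name convolve2d_valid is hypothetical, and only the top-left 4 x 5 corner of the slide's image is used to keep the example short. Libraries provide much faster routines for real use.

```python
import numpy as np


def convolve2d_valid(image, kernel):
    """Slide the kernel over the image (no padding), left to right, top to bottom."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    output = np.zeros((out_h, out_w), dtype=np.result_type(image, kernel))
    for row in range(out_h):
        for col in range(out_w):
            # Always read from the original image, never from the output.
            patch = image[row:row + kh, col:col + kw]
            output[row, col] = np.sum(patch * kernel)
    return output


kernel = np.array([[0, -1, 0],
                   [-1, 5, -1],
                   [0, -1, 0]])

# Top-left 4 x 5 corner of the image shown on the slide.
image = np.array([[93, 139, 101, 4, 3],
                  [26, 252, 196, 3, 2],
                  [135, 230, 18, 2, 4],
                  [1, 3, 5, 4, 3]])

result = convolve2d_valid(image, kernel)
print(result.shape)   # (2, 3): smaller than the 4 x 5 input
print(result[0, 0])   # 669
```

With a 3 x 3 kernel and no padding, the output loses one row and one column on each side, so a 10 x 10 image would give an 8 x 8 result.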
Zero Padding

Kernel:
  0  -1   0
 -1   5  -1
  0  -1   0

Image padded with a border of zeros:
  0   0   0   0   0   0   0   0   0   0   0   0
  0  93 139 101   4   3   5   4   3   5   4   0
  0  26 252 196   3   2   5   3   2   5   3   0
  0 135 230  18   2   4   3   2   4   3   2   0
  0   1   3   5   4   3   5   4   3   5   4   0
  0   7   2   5   3   2   5   3   2   5   3   0
  0   1   4   3   2   4   3   2   4   3   2   0
  0   1   3   5   4   3   5   4   3   5   4   0
  0   7   2   5   3   2   5   3   2   5   3   0
  0   1   4   3   2   4   3   2   4   3   2   0
  0   1   3   5   4   3   5   4   3   5   4   0
  0   0   0   0   0   0   0   0   0   0   0   0

Note: other forms of padding exist, like replicating the values at the borders.

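Both kinds of padding mentioned above can be tried with NumPy's np.pad. This is a sketch assuming NumPy, using only the top-left 3 x 3 block of the slide's image.

```python
import numpy as np

image = np.array([[93, 139, 101],
                  [26, 252, 196],
                  [135, 230, 18]])

# Zero padding: surround the image with a one-pixel border of zeros.
zero_padded = np.pad(image, pad_width=1, mode="constant", constant_values=0)

# Alternative: replicate the values at the borders.
edge_padded = np.pad(image, pad_width=1, mode="edge")

print(zero_padded.shape)  # (5, 5)
print(zero_padded[0])     # first row is all zeros
print(edge_padded[0])     # first row repeats the border: 93, 93, 139, 101, 101
```

With a one-pixel border and a 3 x 3 kernel, the convolved output has the same height and width as the original, unpadded image.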
Intensities Outside Range [0,255]
• Clip to the nearest allowed value, e.g.:
• -1 becomes 0.
• 300 becomes 255.

• Rescale, e.g.:
• values between -1 and 300 are rescaled to the range {0,…,255}.

$$
\text{new\_value} = 255 \times \frac{\text{old\_value} - \text{old\_min}}{\text{old\_max} - \text{old\_min}}
$$

• E.g.: rescale value of 150 considering that the values are between -1 and 300.

$$
255 \times \frac{150 - (-1)}{300 - (-1)} \approx 128
$$

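Both options can be compared on the example values from this slide. A minimal sketch, assuming NumPy; rounding the rescaled values to the nearest integer is an assumption made here so that the result lands in {0,…,255}.

```python
import numpy as np

# Example values from the slide: the minimum, an intermediate value, the maximum.
values = np.array([-1, 150, 300])

# Option 1: clip to the nearest allowed value.
clipped = np.clip(values, 0, 255)

# Option 2: rescale linearly so that old_min -> 0 and old_max -> 255.
old_min, old_max = values.min(), values.max()
rescaled = np.rint(255 * (values - old_min) / (old_max - old_min)).astype(np.uint8)

print(clipped)   # -1 -> 0, 150 -> 150, 300 -> 255
print(rescaled)  # -1 -> 0, 150 -> 128, 300 -> 255
```

Clipping leaves in-range values untouched but saturates the extremes, while rescaling changes every value and preserves their relative order and spacing.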
Demo
• image-fundamentals.ipynb
• Applying convolutions

Examples of Kernels

Sharpen:
  0  -1   0
 -1   5  -1
  0  -1   0

Blur:
1/9 1/9 1/9
1/9 1/9 1/9
1/9 1/9 1/9

Edge Detection:
 -1  -1  -1
 -1   8  -1
 -1  -1  -1

Laplacian:
  0   1   0
  1  -4   1
  0   1   0

Sobel X:
 -1   0   1
 -2   0   2
 -1   0   1

Sobel Y:
 -1  -2  -1
  0   0   0
  1   2   1

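One way to get a feel for these kernels is to look at the sum of their weights and at their response to a simple patch. The sketch below assumes NumPy; the dictionary keys and the test patch are made up for illustration.

```python
import numpy as np

kernels = {
    "sharpen": np.array([[0, -1, 0], [-1, 5, -1], [0, -1, 0]]),
    "blur": np.full((3, 3), 1 / 9),
    "edge_detection": np.array([[-1, -1, -1], [-1, 8, -1], [-1, -1, -1]]),
    "laplacian": np.array([[0, 1, 0], [1, -4, 1], [0, 1, 0]]),
    "sobel_x": np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]),
    "sobel_y": np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]]),
}

# Sharpen and blur weights sum to 1 (overall brightness is preserved);
# the edge-type kernels sum to 0 (flat regions give no response).
for name, kernel in kernels.items():
    print(f"{name}: weights sum to {float(kernel.sum()):.0f}")

# Hypothetical 3 x 3 patch with a vertical edge: dark on the left, bright on the right.
edge_patch = np.array([[0, 0, 255],
                       [0, 0, 255],
                       [0, 0, 255]])

sobel_x_response = int(np.sum(kernels["sobel_x"] * edge_patch))
sobel_y_response = int(np.sum(kernels["sobel_y"] * edge_patch))
print(sobel_x_response)  # 1020: strong response to the horizontal intensity change
print(sobel_y_response)  # 0: no intensity change in the vertical direction
```

Responses such as 1020 fall outside {0,…,255}, which is why clipping or rescaling is needed after convolution.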
Kernels and Computer Vision Problems
• Kernels were typically designed to create features to be given as inputs to machine learning algorithms.

• Designing kernels is challenging and problem-dependent.

• What about learning kernels automatically?

Summary
• Images are represented as arrays storing their pixels.
• The width is represented by the columns.
• The height by the rows.
• The channels by the depth.

• Image resolution corresponds to the level of detail in number of pixels.

• Scaling, aspect ratio, cropping.

• Convolution, filters/kernels, padding, dealing with values outside range.

• Next: Convolutional Neural Networks (CNNs).

References
• Deep Learning for Computer Vision with Python, Chp 3 and Chp 11 until 11.1.4 (inclusive).
• If you are keen on programming, you can also read Section 11.1.5.
• If you are not keen on programming, you can still read this section, but you don’t need to pay attention to the coding part.
• Whether or not you read this section, you need to understand that we are sliding the kernel over the image and may need padding, as explained in this lecture.