Vector Space Analysis and Linear Algebra

Matrix: an n-by-m entity with n*m values (2D)
• e.g.: $\begin{bmatrix} 4 & 6 & 75 & 8.4 \\ -8 & 5 & 6 & 55 \\ 0 & 0 & 42 & 54 \end{bmatrix}$ is a 3-by-4 matrix (3 rows and 4 columns)

Tensor: an n-by-m-by-l entity with n*m*l values (3D)

Vector operations:
• Addition: $\begin{bmatrix} 8 \\ 5 \\ 4 \end{bmatrix} + \begin{bmatrix} 0 \\ -2 \\ 14 \end{bmatrix} = \begin{bmatrix} 8 \\ 3 \\ 18 \end{bmatrix}$
• Scalar product: $3 \begin{bmatrix} 8 \\ 5 \\ 4 \end{bmatrix} = \begin{bmatrix} 24 \\ 15 \\ 12 \end{bmatrix}$

Vector operations:
• Dot product: $\begin{bmatrix} 8 \\ 5 \\ 4 \end{bmatrix} \cdot \begin{bmatrix} 0 \\ -2 \\ 14 \end{bmatrix} = 8 \cdot 0 + 5 \cdot (-2) + 4 \cdot 14 = 46$
• The result of a vector dot product (aka inner product) is a scalar (just a number!)
• The dot product of a vector with itself is the square of its magnitude: $\begin{bmatrix} 8 \\ 5 \\ 4 \end{bmatrix} \cdot \begin{bmatrix} 8 \\ 5 \\ 4 \end{bmatrix} = 8 \cdot 8 + 5 \cdot 5 + 4 \cdot 4 = 105$
• The dot product is also related to the angle between the two vectors: $\vec{A} \cdot \vec{B} = |\vec{A}| |\vec{B}| \cos\theta$, where $\theta$ is the angle between the two vectors.
• If $\vec{A} \cdot \vec{B} = 0$, it means that the two vectors are perpendicular to each other.
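These vector operations map directly onto NumPy. Below is a minimal sketch, reusing the example vectors above (NumPy is the only assumed dependency):

```python
import numpy as np

a = np.array([8, 5, 4])
b = np.array([0, -2, 14])

print(a + b)           # addition -> [ 8  3 18]
print(3 * a)           # scalar product -> [24 15 12]
print(a @ b)           # dot product -> 46
print(a @ a)           # squared magnitude -> 105
print(np.cross(a, b))  # cross product -> [ 78 -112  -16]

# angle between the two vectors, from A.B = |A||B| cos(theta)
cos_theta = (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))
print(np.degrees(np.arccos(cos_theta)))
```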
Vector Space Analysis and Linear Algebra

Vector operations:
• Cross product: $\begin{bmatrix} 8 \\ 5 \\ 4 \end{bmatrix} \times \begin{bmatrix} 0 \\ -2 \\ 14 \end{bmatrix} = \begin{bmatrix} 5 \cdot 14 - (-2) \cdot 4 \\ -(8 \cdot 14 - 4 \cdot 0) \\ 8 \cdot (-2) - 0 \cdot 5 \end{bmatrix} = \begin{bmatrix} 78 \\ -112 \\ -16 \end{bmatrix}$
• The result of a vector cross product is another vector (NOT just a number!)
• The resultant vector is perpendicular to both original vectors.

Covariance and correlation
• Covariance is a mathematical term to quantitatively measure how much two vectors are related to each other. Similarly, it measures how two vectors change with respect to one another.
• $\mathrm{cov}(x, y) = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{N - 1}$; no need to remember this formula, as we will use Python to calculate covariance.
• A problem with covariance is that it is difficult to interpret. What is a high vs. low covariance value? Is 10, 50, 5000 high or low?
• In order to solve that problem, we need a normalized metric, i.e. correlation.
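A minimal NumPy sketch of covariance and correlation, on hypothetical sample data (the x and y arrays are made up for illustration):

```python
import numpy as np

x = np.array([2.1, 2.5, 3.6, 4.0])   # hypothetical sample data
y = np.array([8.0, 10.0, 12.0, 14.0])

# np.cov / np.corrcoef return 2x2 matrices; the off-diagonal entry
# is cov(x, y) / corr(x, y) respectively
print(np.cov(x, y)[0, 1])        # sample covariance (N-1 denominator)
print(np.corrcoef(x, y)[0, 1])   # correlation, normalized to [-1, 1]
```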
Vector Space Analysis and Linear Algebra

Correlation
• $\rho_{x,y} = 1$ means that there is perfect correlation.
• $\rho_{x,y} = 0$ means that there is no correlation.
• $\rho_{x,y} = -1$ means that there is perfect inverse correlation.

Matrix operations:
• Element-wise addition: $\begin{bmatrix} a & b \\ c & d \end{bmatrix} \pm \begin{bmatrix} e & f \\ g & h \end{bmatrix} = \begin{bmatrix} a \pm e & b \pm f \\ c \pm g & d \pm h \end{bmatrix}$
• Multiplication: $\begin{bmatrix} a & b \\ c & d \end{bmatrix} \cdot \begin{bmatrix} e & f \\ g & h \end{bmatrix} = \begin{bmatrix} ae + bg & af + bh \\ ce + dg & cf + dh \end{bmatrix}$
• Is AB = BA? Try with $A = \begin{bmatrix} 1 & 2 \\ 7 & 4 \end{bmatrix}$ and $B = \begin{bmatrix} 4 & 1 \\ -2 & 0 \end{bmatrix}$ (see the sketch after this list).
• $\begin{bmatrix} \alpha & 0 \\ 0 & \alpha \end{bmatrix}$ can be used to scale a vector by $\alpha$: $\begin{bmatrix} \alpha & 0 \\ 0 & \alpha \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} \alpha x \\ \alpha y \end{bmatrix} = \alpha \begin{bmatrix} x \\ y \end{bmatrix}$

Matrix operations:
• Identity matrix: $I_n = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \ddots & 0 \\ 0 & 0 & 1 \end{bmatrix}$, with 1 on the main diagonal and zero elsewhere.
• Inverse of a matrix ($A^{-1}$): $A A^{-1} = A^{-1} A = I_n$
• The above simple inverse is only defined for square matrices.
• Even for square matrices, an inverse may NOT always exist.
• Transpose of a matrix ($A'$ or $A^T$): flipping a matrix over its diagonal; switching the row and column indices of the matrix.
• Determinant of a 2-by-2 matrix: $\det A = |A| = \begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc$
• Determinant of a 3-by-3 matrix: $\begin{vmatrix} a & b & c \\ d & e & f \\ g & h & i \end{vmatrix} = a(ei - fh) - b(di - fg) + c(dh - eg)$
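These matrix operations can be checked in NumPy. A minimal sketch, reusing the A and B matrices from the AB = BA question above:

```python
import numpy as np

A = np.array([[1, 2], [7, 4]])
B = np.array([[4, 1], [-2, 0]])

print(A + B)             # element-wise addition
print(A @ B)             # matrix multiplication -> [[0 1] [20 7]]
print(B @ A)             # -> [[11 12] [-2 -4]]: AB != BA in general
print(A.T)               # transpose
print(np.linalg.det(A))  # determinant: 1*4 - 2*7 = -10
print(np.linalg.inv(A))  # inverse exists here since det(A) != 0
print(np.eye(3))         # 3-by-3 identity matrix I_3
```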
SCHOOL OF COMPUTER TECHNOLOGY
AASD 4001
Mathematical Concepts for Machine Learning
Lecture 2
Reza Moslemi, Ph.D., P.Eng.

What is Natural Language Processing (NLP)?
• Natural language processing (NLP) is a branch of artificial intelligence that helps computers understand, interpret and manipulate human language. NLP draws from many disciplines, including computer science and computational linguistics, in its pursuit to fill the gap between human communication and computer understanding.

Why NLP?
• There is a wealth of audio- and text-based resources. If we can use NLP to our advantage, we gain worthy information.

Building blocks of human language are words, but machine learning algorithms usually work with vectors of features.

Text → vector of features

Let's see how it can be done.

It is common to improve the bag of words by using the TF-IDF (Term Frequency – Inverse Document Frequency) method.
• TF: importance of the term in the document
• IDF: importance of the term in all documents (corpus)

TF-IDF is a popular scoring approach used to weigh terms for NLP tasks because it assigns a value to a term according to its importance in a document scaled by its importance across all documents in your corpus, which mathematically eliminates naturally occurring words, and selects words that are more descriptive of your text.

$\mathrm{IDF} = \log \frac{N}{df_x}$
where $df_x$ is the number of documents containing x, and $N$ is the total number of documents.

• e.g.: Let's assume the size of the corpus is 100 documents. If there are 20 documents that contain the term "student", then the IDF is: $\log \frac{100}{20} = \log 5 = 0.70$

Mathematically, the TF-IDF ($W_{x,y}$) of a word x in a document y is obtained from:
$W_{x,y} = TF_{x,y} \times IDF_x$
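A minimal Python sketch of the TF-IDF computation above, using a hypothetical three-document toy corpus and one common TF definition (term count normalized by document length, which the slides do not pin down):

```python
import math

corpus = {  # hypothetical toy corpus: document id -> tokenized text
    "d1": ["the", "student", "passed"],
    "d2": ["the", "exam", "was", "hard"],
    "d3": ["student", "student", "exam"],
}

def tf_idf(term, doc_id):
    doc = corpus[doc_id]
    tf = doc.count(term) / len(doc)                 # term frequency in this document
    df = sum(term in d for d in corpus.values())    # number of documents containing the term
    idf = math.log10(len(corpus) / df)              # IDF = log(N / df_x), base 10 as in the slide
    return tf * idf                                 # W_{x,y} = TF_{x,y} * IDF_x

print(tf_idf("student", "d3"))
```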
In machine learning, regression algorithms are supervised models for estimating the relationships between a dependent variable and one or more independent variables.
• Linear Regression
• Logistic Regression (for classification)
• Etc.

What is a supervised learning model?

Which one to choose?
[Figure: data points Y vs. X with two candidate fitted lines, l1 and l2.]
• Constraints: ground truth, intercept, etc.
• We will adopt a least square loss function
• A metric to minimize the overall error (sum of squared residuals)
• $LSLF = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2$, where $y_i$ and $\hat{y}_i$ are the actual and estimated y-values at $x_i$, respectively.
[Figure: data points and the fitted line, with residuals such as e3 and e4 marked.]

$R^2 = 1 - \frac{SS_{res}}{SS_{tot}} = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}$

Logistic regression employs the so-called logistic function (sigmoid):
$p = \frac{1}{1 + e^{-l}}$
$\mathrm{logit}: \; l = \ln\left(\frac{p}{1 - p}\right)$

Let's look at this in 2D: [Figure: sigmoid curves plotted in 2D.]

Using the logistic function, it can predict values that lie between 0 and 1. Then, using a threshold (usually 0.5), it can classify into 2 classes. Logistic regression has discrete outputs as opposed to linear regression; the dependent variable is categorical.

The first 2 are binary classification (two classes only, usually 0 and 1), whereas the last example is multiclass classification (Multinomial Logistic/Multi-class Logistic Regression, using the Softmax function instead of the Sigmoid).

We try to avoid FP and FN predictions:

                      Actual Class
                      P        N
Predicted Class  P    TP       FP
                 N    FN       TN

where:
P = Positive; N = Negative
TP = True Positive; FP = False Positive; TN = True Negative; FN = False Negative

A couple of definitions, using the example confusion matrix below:

                      Actual Class
                      P        N
Predicted Class  P    40       15
                 N    5        140

• Accuracy: (TP+TN)/TOTAL = 180/200 = 90%
• Misclassification error: (FP+FN)/TOTAL = 20/200 = 10%
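A minimal NumPy sketch tying the sigmoid, the 0.5 threshold, and the confusion-matrix metrics together; the scores and labels are hypothetical:

```python
import numpy as np

def sigmoid(l):
    """Logistic function: p = 1 / (1 + e^{-l})."""
    return 1.0 / (1.0 + np.exp(-l))

# hypothetical logit scores and ground-truth labels
scores = np.array([2.3, -1.1, 0.4, -3.0, 1.7])
y_true = np.array([1, 0, 1, 0, 0])

y_pred = (sigmoid(scores) >= 0.5).astype(int)   # threshold at 0.5

tp = np.sum((y_pred == 1) & (y_true == 1))
fp = np.sum((y_pred == 1) & (y_true == 0))
fn = np.sum((y_pred == 0) & (y_true == 1))
tn = np.sum((y_pred == 0) & (y_true == 0))

accuracy = (tp + tn) / len(y_true)   # (TP + TN) / TOTAL
error = (fp + fn) / len(y_true)      # (FP + FN) / TOTAL
print(tp, fp, fn, tn, accuracy, error)
```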
AASD 4001
Mathematical Concepts for Machine Learning
Lecture 3
Reza Moslemi, Ph.D., P.Eng.

[Figure: candidate decision-tree splits on attributes A, B, and C (A=0/A=1, B=0/B=1, C=0/C=1), showing how each split partitions the Y and Z class labels into child nodes.]
Gradient Descent algorithm
• What is gradient?
• What is a gradient descent algorithm?
• What about stochastic gradient descent algorithm?

Gradient:
• The gradient vector can be interpreted as the "direction and rate of fastest increase"
• Consider a room where the temperature is given by T(x, y, z). At each point in the room, the gradient of T at that point will show the direction in which the temperature rises most quickly, moving away from (x, y, z). The magnitude of the gradient will determine how fast the temperature rises in that direction.
• Consider a surface whose height above sea level at point (x, y) is H(x, y). The gradient of H at a point is a vector pointing in the direction of the steepest slope at that point.

Gradient Descent algorithm (cont'd)
• There is a chance that you miss the minimum if the steps are too large.
• Move in smaller steps, i.e. $-\nabla f(p)$ multiplied by a small learning rate, usually 0.01 as a starting point (see the sketch after this list).
• A learning rate that is too large can cause the model to converge too quickly to a suboptimal solution, whereas a learning rate that is too small can cause the process to get stuck.
• We can take larger steps when we are far from the minimum and smaller steps when close to the minimum. This process is called learning rate scheduling and is often carried out automatically by the gradient descent libraries.
• Avoid getting stuck at local minima or saddle points.
• Randomness helps to arrive at a global minimum and avoid getting stuck at local minima.
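A minimal sketch of the gradient-descent update described above, on a toy bowl-shaped function (the function and starting point are made up for illustration):

```python
import numpy as np

def grad_f(p):
    """Gradient of f(x, y) = (x - 3)^2 + (y + 1)^2, a toy bowl-shaped function."""
    return np.array([2 * (p[0] - 3), 2 * (p[1] + 1)])

p = np.array([10.0, 10.0])    # arbitrary starting point
lr = 0.01                     # small learning rate, as suggested above

for step in range(2000):
    p = p - lr * grad_f(p)    # move against the gradient

print(p)   # converges towards the minimum at (3, -1)
```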
Support Vector Machines

What are Support Vector Machines?
• Supervised machine learning algorithms to analyze data and recognize patterns
• Used for both regression and also classification
• Can perform linear and non-linear analysis (decision boundaries)
• Work well with high-dimensional data (data with more than a few features), but can be computationally expensive
• It chooses extreme vectors, or support vectors, to create the hyperplane
• Support vectors are defined as the data points which are closest to the hyperplane and have some effect on its position. As these vectors are supporting the hyperplane, they are named support vectors.
• For non-linear decision boundaries, the data can be converted into a linear one using higher dimensions.
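A minimal scikit-learn sketch of an SVM classifier (assumes scikit-learn is installed; the Iris dataset stands in for real data):

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = datasets.load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = SVC(kernel="rbf")    # non-linear decision boundary via the RBF kernel
clf.fit(X_train, y_train)
print(clf.score(X_test, y_test))    # classification accuracy
print(clf.support_vectors_.shape)   # the support vectors found
```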
K-Means clustering
• Attempts to group similar clusters of data together (based on their distance from K centroids)
• Minimizes within-cluster variance
• Unsupervised machine learning algorithm – you do not need/have target information
• In simple terms: only need $x_1, \ldots, x_n$ but not $y_1, \ldots, y_n$
• What are possible applications of K-Means clustering (and clustering, in general)?
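A minimal scikit-learn sketch of K-Means on two hypothetical blobs of points; note that no labels y are needed:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# two hypothetical blobs of 2D points; there are no labels y
X = np.vstack([rng.normal(0, 0.5, (50, 2)),
               rng.normal(5, 0.5, (50, 2))])

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(km.cluster_centers_)   # the K centroids
print(km.inertia_)           # within-cluster sum of squares being minimized
```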
SCHOOL OF COMPUTER TECHNOLOGY
AASD 4001
Mathematical Concepts for Machine Learning
Lecture 4
Reza Moslemi, Ph.D., P.Eng.

o Matrix Factorization
o Mathematics of Digital Signal Processing

What is Singular Value Decomposition (SVD)?
The singular value decomposition of a complex m-by-n matrix M is given by:
• $M = U \Sigma V^T$
where the m-by-m $U$ and n-by-n $V$ are orthogonal matrices, and $\Sigma$ is an m-by-n rectangular diagonal matrix with non-negative real numbers on the diagonal.

Orthogonal matrix:
• a square matrix whose columns and rows are orthonormal vectors
• In simple terms: $V^T V = V V^T = I$, or $V^T = V^{-1}$

• e.g.:
$\begin{bmatrix} 1 & 0 & 0 & 0 & 2 \\ 0 & 0 & 3 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 2 & 0 & 0 & 0 \end{bmatrix} = \begin{bmatrix} 0 & -1 & 0 & 0 \\ -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & -1 \\ 0 & 0 & -1 & 0 \end{bmatrix} \begin{bmatrix} 3 & 0 & 0 & 0 & 0 \\ 0 & \sqrt{5} & 0 & 0 & 0 \\ 0 & 0 & 2 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} 0 & 0 & -1 & 0 & 0 \\ -\sqrt{0.2} & 0 & 0 & 0 & -\sqrt{0.8} \\ 0 & -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ -\sqrt{0.8} & 0 & 0 & 0 & \sqrt{0.2} \end{bmatrix}$
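NumPy can reproduce the worked example above (a minimal sketch; np.linalg.svd returns the diagonal of Σ as a 1-D array):

```python
import numpy as np

# the 4x5 matrix from the worked example above
M = np.array([[1, 0, 0, 0, 2],
              [0, 0, 3, 0, 0],
              [0, 0, 0, 0, 0],
              [0, 2, 0, 0, 0]], dtype=float)

U, s, Vt = np.linalg.svd(M)     # s holds the diagonal of Sigma
print(s)                        # [3, sqrt(5), 2, 0]

Sigma = np.zeros_like(M)        # rebuild the m-by-n rectangular Sigma
Sigma[:4, :4] = np.diag(s)
print(np.allclose(M, U @ Sigma @ Vt))   # True: M = U Sigma V^T
```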
Factorization:
• $x^2 - y^2 = (x - y)(x + y)$
• $x^2 + y^2 + 2xy = (x + y)^2$

The same idea can be applied to matrices and is called matrix factorization!
• There are various methods for matrix factorization, each with its own properties and applications:
• LU (lower-upper) Decomposition
• Cholesky Decomposition
• QR Decomposition
• Singular Value Decomposition (SVD)
• Etc.

What is most relevant to us in this course, especially for recommender systems, is SVD.

How does SVD relate to ML and recommendation systems in particular?

Consider the following rating matrix (user-item, user-movie here):

            Movie #1   Movie #2   Movie #3   Movie #4
User #1        5          3          -          1
User #2        4          -          -          1
User #3        1          1          -          5
User #4        1          -          -          4
User #5        -          1          5          4

where each user has rated some movies 0-5 (the "-" in the table means that the user has not rated that movie).
• e.g.: Netflix released such a database for a competition on Kaggle (approx. 500,000 users and 17,000 movies).

Let's assume that there are n users, m movies, and k ($k \le n$, to be chosen by the algorithm designer) latent features, based on which the users have rated the movies.

If we form an R matrix consisting of the data shown in the movie database, we can use an SVD-like decomposition to write:
$R \approx P Q^T = \hat{R}$

Each row of the n-by-k $P$ matrix denotes the association btw a user and the features.
Each row of the m-by-k $Q$ matrix denotes the association btw a movie and the features.

But how do we find the P and Q matrices?
• SVD only works for matrices without missing values, but the rating matrix has a lot of missing values
• We can form an optimization problem to solve that issue

One approach is to initialize the P and Q matrices with some random values, calculate $\hat{R}$, and calculate how different it is from the actual $R$ (the error).

In simple terms, we need to minimize the following loss function containing the squared errors:
$\min_{P,Q} \sum_{(i,x)} \left( r_{xi} - \mathbf{p}_i \cdot \mathbf{q}_x^T \right)^2$
and the iterative update formulas for the P and Q matrices are given by:
• $P \leftarrow P - \eta \, \nabla_P$
• $Q \leftarrow Q - \eta \, \nabla_Q$
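A minimal NumPy sketch of this update loop on the rating matrix from the slide; the latent dimension k, the learning rate, and the epoch count are illustrative choices:

```python
import numpy as np

# rating matrix from the slide; np.nan marks the missing ("-") entries
R = np.array([[5, 3, np.nan, 1],
              [4, np.nan, np.nan, 1],
              [1, 1, np.nan, 5],
              [1, np.nan, np.nan, 4],
              [np.nan, 1, 5, 4]])
n, m = R.shape
k = 2                                     # number of latent features, k <= n
rng = np.random.default_rng(0)
P = rng.normal(scale=0.1, size=(n, k))    # user-feature matrix
Q = rng.normal(scale=0.1, size=(m, k))    # movie-feature matrix
eta = 0.01                                # learning rate

for epoch in range(5000):
    for i in range(n):
        for x in range(m):
            if np.isnan(R[i, x]):
                continue                  # only observed ratings enter the loss
            e = R[i, x] - P[i] @ Q[x]     # error on this rating
            P[i] += eta * e * Q[x]        # P <- P - eta * grad_P
            Q[x] += eta * e * P[i]        # Q <- Q - eta * grad_Q

print(np.round(P @ Q.T, 1))   # R-hat: predictions fill in the missing entries
```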
Even functions that are not periodic but whose area under the curve is finite can be expressed as the integral of sines and cosines multiplied by a weighing function. The formulation in this case is the Fourier transform, and its utility is even greater than the Fourier series in most practical applications.

This core technology allowed for the first time practical processing and meaningful interpretation of a wide range of signals, from medical monitors and scanners to modern electronic communication.

The discrete Fourier transform is
$F(u) = \frac{1}{M} \sum_{t=0}^{M-1} f(t) \left( \cos\frac{2\pi u t}{M} - j \sin\frac{2\pi u t}{M} \right)$ for u = 0, 1, 2, ..., M−1

and its inverse is
$f(t) = \sum_{u=0}^{M-1} F(u) \, e^{j 2\pi u t / M}$ for t = 0, 1, 2, ..., M−1

Each term of the Fourier transform is composed of the sum of all values of the function f(t). The values of f(t) are multiplied by sines and cosines of various frequencies. The domain (values of u) over which the values of u range is called the frequency domain, because u determines the frequency of the components of the transform. Each of the M terms of F(u) is called a frequency component of the transform.

[Figure: a signal decomposed as a weighted sum of sinusoidal basis functions, a·f1 + b·f2 + c·f3 + d·f4.]
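A minimal NumPy sketch of the DFT pair above; np.fft uses the same sign convention but puts the 1/M factor on the inverse, so the factor is moved here to match the slide:

```python
import numpy as np

M = 64
t = np.arange(M)
f = np.sin(2 * np.pi * 5 * t / M)    # a pure tone at u = 5

F = np.fft.fft(f) / M                # forward DFT with the 1/M convention above
print(np.argmax(np.abs(F)))          # -> 5: the dominant frequency component

f_back = np.fft.ifft(F * M)          # inverse DFT recovers f(t)
print(np.allclose(f, f_back.real))   # True
```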
2D Discrete Fourier Transform

We define the 2D Fourier spectrum
$|F(u, v)| = \left[ R^2(u, v) + I^2(u, v) \right]^{1/2}$
the phase angle
$\phi(u, v) = \tan^{-1} \frac{I(u, v)}{R(u, v)}$
and the power spectrum as
$P(u, v) = |F(u, v)|^2 = R^2(u, v) + I^2(u, v)$
where R(u, v) and I(u, v) are the real and imaginary parts of F(u, v).

It is a common practice to multiply the input image function by $(-1)^{x+y}$ prior to computing the Fourier transform. It has been shown mathematically that
$\mathcal{F}\{ f(x, y) (-1)^{x+y} \} = F(u - M/2, \; v - N/2)$
The equation states that the origin of the Fourier transform of $f(x, y)(-1)^{x+y}$ is located at u = M/2 and v = N/2, which is the centre of the M x N area occupied by the 2D DFT. We refer to this area as the frequency domain.

Visual interpretation of 2D DFT / 2D DFT Magnitude and Phase
[Figure: a satellite image of Florida and the Gulf of Mexico showing noise, its Fourier spectra, the FT magnitude and FT phase, and the effect of ignoring the phase.]

Filtering in the frequency domain

Basic filters and their properties
• Notch filter: used to remove repetitive "spectral" noise from an image. A notch filter is a filter that contains nulls in its frequency response. They are used in many applications where specific frequency components must be eliminated.
• Assuming that the FT has been centered, we can mathematically define the notch filter for the illustration on the right-hand side as
$H(u, v) = \begin{cases} 0 & \text{if } v = 0 \\ 1 & \text{otherwise} \end{cases}$
The filter would set the central vertical line to 0 and leave all other frequency components untouched.
[Figure: the noisy image, its FT, the noise captured by the notch pass filter, and the image after notch filtering (image → FT → notch filter → inverse FT).]
• Lowpass filter and highpass filter.

How can we obtain the processed image?
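A minimal NumPy sketch of the 2D DFT spectrum, phase, and power spectrum, and a check that centring via np.fft.fftshift matches modulating the input by (−1)^(x+y); the test image is hypothetical:

```python
import numpy as np

# hypothetical image: a bright square on a dark background
img = np.zeros((128, 128))
img[48:80, 48:80] = 1.0

F_centered = np.fft.fftshift(np.fft.fft2(img))   # origin moved to the centre

R, I = F_centered.real, F_centered.imag
spectrum = np.sqrt(R**2 + I**2)   # |F(u, v)|
phase = np.arctan2(I, R)          # phi(u, v)
power = spectrum**2               # P(u, v)

# equivalently, centre by modulating the input with (-1)^(x+y)
x, y = np.meshgrid(np.arange(128), np.arange(128), indexing="ij")
F_mod = np.fft.fft2(img * (-1.0) ** (x + y))
print(np.allclose(F_centered, F_mod))   # True
```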
$f(x, y) * h(x, y) = \sum_{m} \sum_{n} f(m, n) \, h(x - m, y - n)$

[Figure: Gaussian filters in the spatial and frequency domains.]
AASD 4001
Mathematical Concepts for Machine Learning
Lecture 6

Convolution
Convolution/spatial filtering
The response of the filter at pixel (x, y) is given by the sum of products of the filter coefficients and the corresponding image pixel values in the area covered by the filter mask. For a 3 × 3 mask, the response of filtering with the filter at a point (x, y) in the image is
R = w(-1, -1) f(x-1, y-1) + w(-1, 0) f(x-1, y) + ... + w(0, 0) f(x, y) + ... + w(1, 0) f(x+1, y) + w(1, 1) f(x+1, y+1)
For a mask of size m × n, we assume that m = 2a+1 and n = 2b+1, a > 0, b > 0. In practice, we use filters of odd sizes, with the smallest meaningful size being 3 × 3.

Geometric transformations of images
Geometric transformations are widely used for the removal of image artifacts and for image registration (geographical mapping).
It is often necessary to perform a geometric transformation of the image coordinate system in order to:
- Align images that were taken at different times or with different sensors
- Correct images for lens distortion
- Correct effects of camera orientation
- Create special effects by morphing images

Affine Transformation
A shear in the x direction, shown in the graph below, is produced by
u = x + 0.2y
v = y
Convolution/spatial filtering
In general, filtering of an image f(x, y) of size M × N with a filter mask w of size m × n is given by:
$g(x, y) = \sum_{s=-a}^{a} \sum_{t=-b}^{b} w(s, t) \, f(x + s, y + t)$  (*)

Convolution: $g(x, y) = w * f(x, y) = \sum_{s=-a}^{a} \sum_{t=-b}^{b} w(s, t) \, f(x - s, y - t)$

What happens when the centre of the filter approaches the border of the image? If the centre of the mask moves any closer to the border, one or more rows or columns of the mask will be located outside of the image plane.

Geometric transformations of images
In a geometric/spatial transformation, each point (x, y) of image f(x, y) is mapped to a point (u, v) in a new coordinate system:
u = f1(x, y)
v = f2(x, y)
A digital image array has an implicit grid that is mapped to discrete points in the new domain. These points may not fall on grid points in the new domain.

Affine transformation
This transformation produces both a shear and a rotation:
u = x + 0.2y
v = -0.3x + y
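A minimal sketch of spatial filtering with SciPy (assumed installed); scipy.ndimage.convolve computes the convolution sum above, and its border mode is one answer to the edge question:

```python
import numpy as np
from scipy.ndimage import convolve

img = np.random.default_rng(0).random((64, 64))   # hypothetical image

w = np.ones((3, 3)) / 9.0   # simple 3x3 averaging mask

# convolve flips the mask, i.e. computes the convolution sum
# g(x,y) = sum_s sum_t w(s,t) f(x-s, y-t); mode="nearest" replicates
# border pixels when the mask extends past the image edge
g = convolve(img, w, mode="nearest")
print(g.shape)   # same size as the input image
```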
Affine transformation
We can evolve a sequence of basic affine transformations into a complex affine transform.
Combinations of transformations are most easily described in terms of matrix operations. To use matrix operations, we introduce homogeneous coordinates. These enable all affine operations to be expressed as a matrix multiplication; otherwise, translation is an exception.
The rotation of a point, straight line or an entire image on the screen, about a point other than the origin, is achieved by first moving the image until the point of rotation occupies the origin, then performing the rotation, and finally moving the image back to its original position.
Translation of a point by a change of coordinates cannot be combined with other transformations by using simple matrix multiplication. Such a combination is essential if we wish to rotate an image about a point other than the origin by translation, rotation, and again translation.
To combine these three transformations into a single transformation, homogeneous coordinates are used. In a homogeneous coordinate system, two-dimensional Cartesian coordinate positions (x, y) are represented by the triple coordinates (h·x, h·y, h), h ≠ 0.

How to Find the Transformation
Suppose that you are given a pair of images to align. You want to try an affine transform to register one to the coordinate system of the other. How do you find the transform parameters?
Affine transformation
As such, the affine equations are expressed as:
$\begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = \begin{bmatrix} a & b & c \\ d & e & f \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$
An equivalent expression using matrix notation is
q = Tp
where
$\mathbf{q} = \begin{bmatrix} u \\ v \\ 1 \end{bmatrix}, \quad \mathbf{T} = \begin{bmatrix} a & b & c \\ d & e & f \\ 0 & 0 & 1 \end{bmatrix}, \quad \mathbf{p} = \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$

Affine Transformations
The transformation matrix of a sequence of affine transformations T1, T2, T3 is:
$T = T_3 T_2 T_1$
The composite transformation for the example above is
$T = T_3 T_2 T_1 = \begin{bmatrix} 0.92 & 0.39 & -1.56 \\ -0.39 & 0.92 & 2.35 \\ 0.0 & 0.0 & 1.0 \end{bmatrix}$
Any combination of affine transformations in this way is an affine transformation.
The inverse transform is:
$T^{-1} = T_1^{-1} T_2^{-1} T_3^{-1}$

How to Find the Transformation
Find a number of points {p0, p1, . . . , pn−1} in image A that match points {q0, q1, . . . , qn−1} in image B. Use the homogeneous coordinate representation of each point as a column in matrices P and Q; then we can write:
Q = HP
We need to solve for H in order to find the appropriate transformation.
Affine transformation
The transformation matrices can be used as building blocks. How do the transformation matrices for translation and scaling look?
Matrices for two-dimensional transformations in homogeneous coordinates:

Translation by (x0, y0): $\begin{bmatrix} 1 & 0 & x_0 \\ 0 & 1 & y_0 \\ 0 & 0 & 1 \end{bmatrix}$
Scaling by s1 and s2: $\begin{bmatrix} s_1 & 0 & 0 \\ 0 & s_2 & 0 \\ 0 & 0 & 1 \end{bmatrix}$
Rotating by θ (counter-clockwise; use −θ for clockwise): $\begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix}$

Composite Affine Transformation
Given rotation, scaling, and translation matrices R, S, T, obtain the formula for their product H = RST.
You will usually want to translate the center of the image to the origin of the coordinate system, do any rotations and scalings, and then translate it back.
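A minimal NumPy sketch of composing homogeneous-coordinate transforms and recovering H from matched points by least squares; all points here are hypothetical:

```python
import numpy as np

def translate(x0, y0):
    return np.array([[1, 0, x0], [0, 1, y0], [0, 0, 1.0]])

def scale(s1, s2):
    return np.array([[s1, 0, 0], [0, s2, 0], [0, 0, 1.0]])

def rotate(theta):
    """Counter-clockwise rotation by theta (use -theta for clockwise)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1.0]])

# rotate and scale about the image centre (cx, cy): translate the centre
# to the origin, rotate and scale, then translate back (T = T3 T2 T1)
cx, cy = 64, 64
T = translate(cx, cy) @ scale(1.2, 1.2) @ rotate(np.pi / 6) @ translate(-cx, -cy)

p = np.array([100, 64, 1.0])   # a point in homogeneous coordinates
print(T @ p)                   # q = Tp

# recovering H from matched points: with homogeneous points as columns
# of P and Q, Q = HP, so solving P^T H^T = Q^T by least squares gives H
P = np.array([[0, 1, 0, 1],
              [0, 0, 1, 1],
              [1, 1, 1, 1.0]])   # four hypothetical points in image A
Q = T @ P                       # their (noise-free) matches in image B
Ht, *_ = np.linalg.lstsq(P.T, Q.T, rcond=None)
print(np.round(Ht.T, 3))        # recovers T
```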
Photographic negative
Assume input pixel intensities are in the range [0, L], where L = 255.
[Figure: (a) original digital mammogram f(x, y); (b) negative image obtained using the negative transformation g(x, y) = 255 − f(x, y).]

Logarithmic transformations
Logarithmic transformation can be used to brighten the intensities in an image. It is used to increase the detail (contrast) of lower intensity values. It is especially useful for bringing out detail in Fourier transforms. The logarithmic transform of image f(x, y) is:
g(x, y) = c log(1 + f(x, y))
The constant c is typically used to scale the range of the log function to match the intensity range of the original image:
c = 255/log(1 + 255)
It can also be used to further increase contrast: the higher the c, the brighter the image will appear. Log transformation compresses the dynamic range of images with large variations in pixel values.

Contrast-stretching transformations
Contrast-stretching transformations increase the contrast between the darks and the lights. Sometimes, we want to increase the intensity around a certain grey level. As a result, dark colours become a lot darker and light colours become a lot lighter, with only a few levels of grey around the level of interest. A contrast-stretching transformation can be created with the following function:
$g(x, y) = \frac{1}{1 + \left( \frac{m}{f(x, y) + \epsilon} \right)^{E}}$
E controls the slope of the function and m is the mid-line where we want to switch from dark to light values. ϵ is a small constant used to prevent division by 0.
[Figure: contrast-stretching transformations g plotted against grey levels f(x, y) for changing E. What is the midline point m in this plot?]
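A minimal NumPy sketch of the three intensity transformations above, applied to a hypothetical grey-level ramp image; m = 128 and E = 4 are illustrative choices:

```python
import numpy as np

f = np.linspace(0, 255, 256).reshape(16, 16)   # hypothetical grey-level image

negative = 255 - f                 # photographic negative

c = 255 / np.log(1 + 255)          # scale the log output back to [0, 255]
log_t = c * np.log(1 + f)          # brightens low intensities

def contrast_stretch(f, m=128, E=4, eps=1e-6):
    """g = 1 / (1 + (m / (f + eps))^E); output lies in [0, 1]."""
    return 1.0 / (1.0 + (m / (f + eps)) ** E)

g = contrast_stretch(f)
print(negative.min(), log_t.max(), g.min(), g.max())
```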
Smoothing filters
Given below is a general form of a 3 × 3 smoothing filter.
[Figure: a 3 × 3 mask of filter coefficients w(s, t).]

Given below is another example of the smoothing filter. What is the constant multiplier in front of the mask equal to?

$\frac{1}{16} \times \begin{bmatrix} 1 & 2 & 1 \\ 2 & 4 & 2 \\ 1 & 2 & 1 \end{bmatrix}$

This mask is called a weighted average. The image pixels are multiplied by different coefficients, thus giving more importance to some pixels at the expense of others. The pixel at the centre of the mask is multiplied by a higher value than any other, thus giving this pixel more importance in the calculation of the average. The other pixels are inversely weighted as a function of distance from the centre of the mask. The diagonal terms are further away from the centre than the orthogonal neighbours by a factor of √2 and thus are weighted less. The basic strategy behind weighing the centre point the highest and then reducing the value of the coefficients as a function of increasing distance from the origin is simply an attempt to reduce blurring in the smoothing process.
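The weighted-average mask can be applied with SciPy (assumed installed); a minimal sketch on a hypothetical image, which also answers the question above (the coefficients sum to 16, hence the 1/16 multiplier):

```python
import numpy as np
from scipy.ndimage import convolve

w = np.array([[1, 2, 1],
              [2, 4, 2],
              [1, 2, 1]]) / 16.0   # coefficients sum to 16, hence 1/16

img = np.random.default_rng(0).random((64, 64))   # hypothetical image
smoothed = convolve(img, w, mode="nearest")
print(smoothed.shape)
```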
Final exam (30 pts) will focus on the material that was covered during the course sessions:
• You need to have knowledge of the underlying concepts and mathematics of the topics covered in the classroom.
• The final exam will NOT have any (python) coding questions.
• You CAN use your course material during the exam.
• This is NOT a group exam. Each student shall only use his/her knowledge to answer the questions.
• You cannot communicate (in any form) with your classmates or other individuals to answer the questions.
• Failure to comply with GBC exam policies results in academic consequences.