MLFA Spring 2024
2. K-Nearest Neighbor
(a) In certain situations, K in a KNN classifier can be neither too small nor too large. Explain what those situations are. (2)
(b) Explain the difference between KNN and weighted KNN. State whether weighted KNN solves the problem of too small a K or too large a K. Justify. (2)
(c) In the context of KNN, explain (i) why we need to normalize the input features before KNN classification, and (ii) what the curse of dimensionality is. (2)
(d) A KNN classifier assigns a test instance the majority class associated with its
K nearest training instances. Distance between instances is measured using
Euclidean distance. Suppose we have the following training set of positive (+)
and negative (-) instances and a single test instance (o). All instances are
projected onto a vector space of two real-valued features (X and Y). Answer
the following questions. Assume "unweighted" KNN (every nearest neighbor
contributes equally to the final vote).
[Figure: positive (+) and negative (-) training instances and a test instance (o) plotted against the two features X and Y; answer options labeled (A)-(F).] (2)
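The plotted points themselves are not recoverable from this copy, but the procedure the question describes is easy to pin down. The following is a minimal sketch of unweighted KNN with Euclidean distance; the training set and test point below are hypothetical placeholders, not the figure's data.

```python
from collections import Counter
import math

def knn_predict(train, test_point, k):
    """Unweighted KNN: majority class among the k nearest training
    instances under Euclidean distance, as the question specifies."""
    # Sort training instances by Euclidean distance to the test point.
    by_distance = sorted(train, key=lambda p: math.dist(p[0], test_point))
    # Each of the k nearest neighbors contributes one equal vote.
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical ((X, Y), class) training pairs -- NOT the figure's data.
train = [((1.0, 2.0), '+'), ((2.0, 1.5), '+'),
         ((4.0, 4.0), '-'), ((5.0, 3.5), '-'), ((4.5, 5.0), '-')]
print(knn_predict(train, (3.0, 3.0), k=3))  # majority vote of the 3 nearest
```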
A random sample of eight drivers insured with a company and having similar auto insurance policies was selected. The following table lists their driving experiences (in years) and monthly auto insurance premiums.
Driving Experience (years) | Monthly Auto Insurance Premium (USD)
5                          | 64
2                          | 87
12                         | 50
9                          | 71
15                         | 44
6                          | 56
25                         | 42
16                         | 60
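The sub-parts of this question are not preserved in this copy, but problems of this form typically ask for the least-squares regression line of premium on experience. A minimal sketch of that computation on the table's data (an illustration, not the official solution):

```python
# Least-squares fit of monthly premium (y, USD) on driving experience (x, years),
# using the closed-form formulas b1 = Sxy / Sxx and b0 = ybar - b1 * xbar.
x = [5, 2, 12, 9, 15, 6, 25, 16]
y = [64, 87, 50, 71, 44, 56, 42, 60]

n = len(x)
xbar = sum(x) / n
ybar = sum(y) / n
sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
sxx = sum((xi - xbar) ** 2 for xi in x)

b1 = sxy / sxx          # slope: premium change per extra year of experience
b0 = ybar - b1 * xbar   # intercept
print(f"premium ~ {b0:.2f} + {b1:.2f} * experience")
```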
3. Naive Bayes
(a) Here's a naive Bayes model with the following conditional probability table and the following prior probabilities over classes.
[Table: conditional probabilities of each word type under classes b and c, with class priors; only the word type "best wishes" is legible in this copy.]
INDIAN INSTITUTE OF TECHNOLOGY KHARAGPUR
Mid Spring Semester Examination 2023-24
Date of Examination: 20-02-2024  Session: FN  Duration: 2 Hrs  Full Marks: 40
Subject No.: AI42001  Subject: MACHINE LEARNING FOUNDATIONS AND APPLICATIONS
Department/Center/School: Artificial Intelligence
Specific charts, graph paper, log book etc., required: No (Ensure question paper has 10 questions)
Special Instructions (if any): Calculators are allowed. Rough work must be present in the answer script itself.
Define explicitly the cost/error function E, assuming that a set of training examples D is provided, where each training example d ∈ D is associated with the target output t_d. [2]
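For reference, a standard choice in this setting is the sum-of-squared-errors cost (the symbol $o_d$ for the learner's output on example $d$ is an assumption, as it is not preserved above):

$$E(\mathbf{w}) = \frac{1}{2} \sum_{d \in D} \left( t_d - o_d \right)^2$$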
3. Which of the following statements are true for k-NN classifiers (provide all answers that are correct)? [1]
a) The classification accuracy is better with larger values of k.
B. Now assume we only observe a single input for each output (that is, a set of {x, y} pairs). We would like to compare the following two models on our input dataset (for each one we split into training and testing sets to evaluate the learned model). Assume we have an unlimited amount of data:
Model A: y = wx,
Model B: y = wx².
Which of the following is correct (choose the answer that best describes the outcome)? Justify.
a. There are datasets for which A would perform better than B
b. There are datasets for which B would perform better than A
c. Both a and b are correct.
d. They would perform equally well on all datasets
C. For the data above we are now comparing the following two models:
Model A: y = w₁x + w₂x,
Model B: y = wx.
Note that model A now uses two parameters (though both multiply the same input value, x). Again, we assume unlimited data. Which of the following is correct (choose the answer that best describes the outcome)? Justify your answer.
a. There are datasets for which A would perform better than B
b. There are datasets for which B would perform better than A
c. Both a and b are correct.
d. They would perform equally well on all datasets.
7. Suppose you are given a linear classification problem with the dataset in Fig. 1. [3]
[Fig. 1: dataset for the linear classification problem.]
We would like to use the following classification model: y = w₀ + w₁x₁ + w₂x₂. Illustrate, using suitable decision boundaries, what the impact of regularization is on the weights: (a) no regularization, (b) L1 regularization, and (c) L2 regularization. We must aim for the least loss value and assume that we can neglect at most 1 misclassified data point, if needed.
8. Suppose we have the following table, which has attributes of different fruits. Apply Naïve Bayes to predict, if a fruit has the properties {Yellow, Sweet, Long}, which type of fruit it is. [3]
Frequency Table:
Fruit  | Yellow | Sweet | Long | Total
Mango  | 350    | 450   | 0    | 650
Banana | 400    | 300   | 350  | 400
Others | 50     | 100   | 50   | 150
Total  | 800    | 850   | 400  | 1200
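A minimal sketch of the computation the question asks for, reading the priors and class-conditional likelihoods directly off the frequency table; posteriors are left unnormalized, which is enough to compare classes:

```python
# Per-class totals and attribute counts taken from the frequency table.
totals = {'Mango': 650, 'Banana': 400, 'Others': 150}
counts = {
    'Mango':  {'Yellow': 350, 'Sweet': 450, 'Long': 0},
    'Banana': {'Yellow': 400, 'Sweet': 300, 'Long': 350},
    'Others': {'Yellow': 50,  'Sweet': 100, 'Long': 50},
}
n = sum(totals.values())  # 1200 fruits in total

# Unnormalized posterior: P(class) * product over attributes of P(attr | class).
scores = {}
for fruit, total in totals.items():
    score = total / n  # prior P(class)
    for attr in ('Yellow', 'Sweet', 'Long'):
        score *= counts[fruit][attr] / total  # likelihood P(attr | class)
    scores[fruit] = score

print(scores)                       # Mango scores 0, since P(Long | Mango) = 0
print(max(scores, key=scores.get))  # -> 'Banana'
```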
9. ISRO intends to include a module in Pragyan, the lunar probe of Chandrayaan-3, that will discriminate between igneous rocks found on the Moon (M) and igneous rocks found on Earth (E) based on the following characteristics (attributes): Water content ∈ {N, Y}, Number of distinct textures ∈ {> 10, < 10}, Size ∈ {S, L}, Smelly ∈ {N, Y}. Available training data is as follows. [5+3+2]
Index | Type | Water | No. of Textures | Size | Smelly
1     | ?    | Y     | > 10            | ?    | Y
2     | M    | N     | < 10            | L    | N
3     | ?    | N     | > 10            | ?    | N
4     | M    | ?     | < 10            | S    | ?
5     | ?    | Y     | < 10            | ?    | ?
6     | E    | Y     | < 10            | S    | ?
7     | ?    | Y     | < 10            | ?    | ?
8     | E    | N     | < 10            | S    | N
(cells marked "?" are illegible in this copy)
(a) Train a decision tree using the above data and draw the tree (provide all your calculations).
(b) Write the learned concept for an igneous rock found on the Moon as a set of conjunctive rules (using AND and OR operators).
(c) Figure 2 shows a decision tree with depth two. Show that this decision tree perfectly classifies the given data. Though this decision tree gives a simpler hypothesis with zero error, why does the approach employed in part (a) fail to output this kind of simpler decision tree?
[Fig. 2: depth-two decision tree; the root splits on Size, and its children split on Smelly and Water, respectively.]
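Part (a) turns on entropy and information-gain calculations; below is a minimal sketch of those two computations. The class counts passed in at the bottom are placeholders, since several entries of the training table are illegible above.

```python
import math

def entropy(counts):
    """Shannon entropy of a class distribution given raw label counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

def information_gain(parent_counts, child_counts_list):
    """Entropy reduction achieved by splitting the parent node into children."""
    total = sum(parent_counts)
    weighted = sum(sum(c) / total * entropy(c) for c in child_counts_list)
    return entropy(parent_counts) - weighted

# Placeholder example: 4 M vs 4 E at the root, split into two branches.
print(information_gain([4, 4], [[3, 1], [1, 3]]))  # ~0.189 bits
```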
10. Explain the concept of the bias-variance trade-off in Machine Learning. You should mathematically derive the bias-variance relation. Also explain how we use bagging and boosting methods to counteract the bias-variance trade-off, and which part of the bias-variance trade-off they will individually help. [5]
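For reference, the decomposition the question asks you to derive, for squared loss with $y = f(x) + \varepsilon$, $\mathrm{Var}(\varepsilon) = \sigma^2$, and a predictor $\hat{f}$ learned from a random training sample:

$$\mathbb{E}\big[(y - \hat{f}(x))^2\big] = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2} + \underbrace{\mathbb{E}\big[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2\big]}_{\text{variance}} + \sigma^2$$

Bagging averages many high-variance learners and so chiefly attacks the variance term; boosting fits successive learners to the remaining errors and so chiefly attacks the bias term.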
Good luck!
AI42001 - Machine Learning Foundations and Applications
Class Test 2
Instructions: Please answer all questions. The maximum points of this test is 15, and you are allowed 30 minutes to complete the test. This is a closed-book test, and the use of electronic devices other than non-programmable calculators is not permitted during the duration of the test. Good luck!
Question 1
Pick the correct option in each of the following questions. Some questions may have more than one correct option, and you need to identify all of them to receive full credit.
C. k₁ = 2401, k₂ = 49
D. k₁ = 330, k₂ = 0
3. The figure shows two decision boundaries obtained using soft-margin SVM classifiers, A and B, trained using the soft-margin SVM formulation as discussed in class:

$$\min_{\mathbf{w},\, b,\, \boldsymbol{\xi}} \;\; \frac{1}{2}\|\mathbf{w}\|^2 + C \sum_{i=1}^{M} \xi_i \quad \text{s.t.} \;\; y_i(\mathbf{w}^T \mathbf{x}_i + b) \geq 1 - \xi_i \;\; \forall i, \quad \xi_i \geq 0 \;\; \forall i$$

[Figure: the two learned decision boundaries, labeled A and B.]
The values of the hyperparameter C are C_A and C_B for the learned classifiers A and B. What is the relationship between C_A and C_B? [1 point]
A. C_A < C_B
B. C_A > C_B
C. C_A = C_B
D. Cannot be determined from available information
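As a hedged illustration of what C controls in the objective above: training a linear soft-margin SVM with a small and a large C on synthetic data (the data below is made up, not the figure's) shows the small-C model tolerating more margin violations. This assumes scikit-learn is available.

```python
import numpy as np
from sklearn.svm import SVC

# Two overlapping Gaussian blobs as placeholder training data.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1, 1.0, (20, 2)), rng.normal(+1, 1.0, (20, 2))])
y = np.array([-1] * 20 + [+1] * 20)

for C in (0.01, 100.0):
    clf = SVC(kernel='linear', C=C).fit(X, y)
    # Small C tolerates slack (wide margin, many support vectors);
    # large C penalizes violations hard (narrower margin).
    print(C, '-> support vectors:', len(clf.support_))
```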
4. Suppose you have a deep CNN model having several layers that performs an image classification task by learning on a dataset of 256 × 256 images, and a logistic regression model operating on 10 features extracted from the same dataset of images to perform the same classification task. Which ensemble
learning technique would you apply to improve the bias-variance tradeoff of
each learner? [1 point]
A. Boosting for deep CNN, bagging for logistic regression
B. Boosting for both learners
C. Bagging for both learners
D. Bagging for deep CNN, boosting for logistic regression
Question 2
Design a 2-input XOR gate using 3 units (artificial neurons). You may assume that the inputs x₁ and x₂, as well as the output y, take binary values, i.e. x₁, x₂, y ∈ {−1, 1}, while the weights and biases can take integer values (positive or negative). Clearly show the structure of the network and specify the weights and biases of each unit. [5 points]
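One possible construction, offered as an illustrative sketch rather than the unique answer: two hidden threshold units computing AND and OR of the inputs, and an output unit that fires when OR is true but AND is not. All weights and biases are integers, as required.

```python
def sign(z):
    """Threshold activation mapping into the {-1, +1} encoding used above."""
    return 1 if z >= 0 else -1

def xor(x1, x2):
    h_and = sign(x1 + x2 - 1)       # +1 only when both inputs are +1 (AND)
    h_or  = sign(x1 + x2 + 1)       # +1 when at least one input is +1 (OR)
    return sign(-h_and + h_or - 1)  # OR but not AND == XOR

for x1 in (-1, 1):
    for x2 in (-1, 1):
        print(x1, x2, '->', xor(x1, x2))  # +1 exactly when the inputs differ
```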
Question 3
The receptive field of a layer with respect to the input is defined as the number of pixels of the input that influence each element of the feature map produced by the corresponding layer. Assume that a 64 × 64 × 3 image I is passed through two convolutional layers followed by a max pooling layer to produce a feature map F, as shown in the figure below.
[Figure: I → C₁ → C₂ → max pooling → F.]
(i) What is the receptive field of the first convolutional layer C₁? [1 point]
(ii) What is the receptive field of the second convolutional layer C₂? [2 points]
(iii) Calculate the receptive field of the max pooling layer. [2 points]
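Since the figure carrying the layer hyperparameters is missing from this copy, the parts above cannot be answered numerically here; the sketch below implements the standard receptive-field recurrence, with 3×3 stride-1 convolutions and a 2×2 stride-2 pool assumed purely as placeholder hyperparameters.

```python
def receptive_fields(layers):
    """Receptive field w.r.t. the input via r = r + (k - 1) * jump,
    where jump is the product of the strides of all earlier layers."""
    r, jump = 1, 1
    out = []
    for kernel, stride in layers:
        r += (kernel - 1) * jump  # each layer widens the field by (k-1)*jump
        jump *= stride            # later layers step over more input pixels
        out.append(r)
    return out

# Assumed hyperparameters (the real ones are in the missing figure):
# C1: 3x3 conv, stride 1; C2: 3x3 conv, stride 1; max pool: 2x2, stride 2.
print(receptive_fields([(3, 1), (3, 1), (2, 2)]))  # -> [3, 5, 6]
```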