
Term IV 2023: End-Term Exam Question Paper - Part B

Subject: Data Science and Machine Learning
Program: MBA / MBA (Analytics)
Faculty: Prof. K. Venkataraghavan
Date of Exam: 22.09.2023

Part B
Use of Internet: Yes
Use of Laptop: Allowed
Open book: Yes
Calculator: Yes
Duration: 75 Minutes
Part B Marks: 20 Marks

Instructions

Submit

1. One single .ipynb file containing code, output and interpretation
for all questions.
2. Do not submit a .py file. No marks will be given for it.
3. Datasets can be downloaded from the End-Term Exam Placeholder in
the Google Classroom of the DSML course.
4. The single .ipynb file must be uploaded to the End-Term Exam
Placeholder in Google Classroom.
5. It is your responsibility to ensure that you upload the .ipynb
file before the deadline in Google Classroom. Late submissions will
be summarily disregarded; late submissions are clearly visible in
Google Classroom.
6. No files will be accepted later. Emailing of files is strictly
prohibited.
7. Emailing of Part B will attract penalty marks in Part A.
8. No excuses (laptop crashed, could not connect to the Internet)
will be entertained.
9. If you copy, you will be awarded zero for the End-Term.

Note

1. If your .ipynb does not run, you may not get any marks.
2. It is your responsibility to ensure that the files you submit
are complete in all respects.
3. Do not forget to mention your name and roll number in the file
name.
4. You should use your own computer to run the code. Sharing of code
files is strictly prohibited.

Datasets Supplied
1. German Credit Data.csv

Q1: Use the dataset German Credit Data.csv. Your task is as follows.

1. Divide the data into train and test using the last 3 digits of
your roll number as random state. [2 Marks]
2. Build three classification models using the train dataset by
applying (a) logistic regression, (b) SVM with RBF kernel and (c)
decision trees. [2 Marks]
3. Get the predicted probabilities for the test dataset for each of
the above three models. Store the three predicted probabilities in a
dataframe named "result". [2 Marks]
4. Obtain the predicted classes in each case - (a) logistic
regression, (b) Naive-Bayes and (c) decision trees - using
appropriate thresholds. Store the three predicted classes in the
dataframe "result". [2 Marks]
5. Approach I - Take a majority vote of the three predicted classes
to get the predicted class. Call this predicted class
"pred_class_voted". Store it in the dataframe "result". [2 Marks]
6. Combine the predicted probabilities of (a) logistic regression,
(b) Naive-Bayes and (c) decision trees, weighted by their AUC values.
Call it "pred_prob_all" and store it in the dataframe "result".
[2 Marks]
7. Approach II - Obtain the predicted class from "pred_prob_all".
Call the class "pred_class_by_prob" and store it in the dataframe
"result". [2 Marks]
8. Obtain two confusion matrices for the predicted classes - from
Step 5 and Step 7 - and display them. [4 Marks]
9. Which approach is better based on the results? [2 Marks]
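The steps above can be sketched end to end. This is a minimal sketch, not a model answer: it uses a synthetic dataset in place of German Credit Data.csv so it runs as-is, a placeholder random state of 123 where the question requires your roll-number digits, and a 0.5 threshold throughout. Note the question names SVM-RBF in step 2 but Naive-Bayes in steps 4 and 6; the sketch simply reuses the three models built in step 2.

```python
# Sketch of the Q1 ensemble workflow (synthetic data stands in for
# German Credit Data.csv; random_state=123 stands in for roll digits).
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import roc_auc_score, confusion_matrix

X, y = make_classification(n_samples=500, random_state=0)
# Step 1: split, with the roll-number digits as random state.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=123)

# Step 2: the three classifiers.
models = {
    "logit": LogisticRegression(max_iter=1000),
    "svm_rbf": SVC(kernel="rbf", probability=True),
    "tree": DecisionTreeClassifier(max_depth=4, random_state=0),
}
result = pd.DataFrame(index=range(len(X_te)))
for name, m in models.items():
    m.fit(X_tr, y_tr)
    result[f"prob_{name}"] = m.predict_proba(X_te)[:, 1]              # Step 3
    result[f"class_{name}"] = (result[f"prob_{name}"] >= 0.5).astype(int)  # Step 4

# Step 5 (Approach I): majority vote of the three predicted classes.
result["pred_class_voted"] = (
    result[[f"class_{n}" for n in models]].sum(axis=1) >= 2
).astype(int)

# Step 6: probabilities combined with AUC weights (normalized to sum to 1).
aucs = {n: roc_auc_score(y_te, result[f"prob_{n}"]) for n in models}
total = sum(aucs.values())
result["pred_prob_all"] = sum(
    result[f"prob_{n}"] * (aucs[n] / total) for n in models
)
# Step 7 (Approach II): class from the combined probability.
result["pred_class_by_prob"] = (result["pred_prob_all"] >= 0.5).astype(int)

# Step 8: confusion matrices for the two approaches.
cm_vote = confusion_matrix(y_te, result["pred_class_voted"])
cm_prob = confusion_matrix(y_te, result["pred_class_by_prob"])
print(cm_vote)
print(cm_prob)
```

For step 9, compare the two matrices on whichever metric matters for the credit setting (e.g. recall of the bad-credit class), not accuracy alone.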


Term IV 2023: End-Term Exam Question Paper - Part A

Subject: Data Science and Machine Learning
Program: MBA / MBA (Analytics)
Faculty: Prof. K. Venkataraghavan
Date of Exam: 22.09.2023

Mode: Pen and Paper Exam

Use of Internet: No
Open book: No
Calculator: No
Duration: 50 Minutes
Part A Marks: 30 Marks

PART A
Q1. State True or False [Total Marks 09]

1. Gini Index = 0 indicates high impurity.
2. The higher the entropy, the higher the impurity.
3. Accuracy is a good indicator of classifier performance with
class-imbalanced data.
4. F1-Score is the harmonic mean of accuracy and recall.
5. In SVM, increasing the cost of misclassification is a good idea.
6. Gamma is a hyperparameter of the RBF kernel.
7. Random forest is a type of feature bagging.
8. SGB helps avoid overfitting.
9. Feature importance can be determined for tree-based models.

PART B
Q1. Find the Gini Index of the following nodes. Show the steps in
deriving the answer. [3 Marks]

Node A: N=120, Class A = 70, Class B = 50
Node B: N=70, Class A = 50, Class B = 20
Node C: N=50, Class A = 20, Class B = 30
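The Gini values for these nodes follow directly from G = 1 - sum(p_i^2); a short check:

```python
# Gini index of a node from its class counts: G = 1 - sum(p_i^2).
def gini(counts):
    n = sum(counts)
    return 1 - sum((c / n) ** 2 for c in counts)

# Node A ~ 0.4861, Node B ~ 0.4082, Node C = 0.4800
for name, counts in {"A": (70, 50), "B": (50, 20), "C": (20, 30)}.items():
    print(f"Node {name}: Gini = {gini(counts):.4f}")
```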

Q2. Find the Entropy of the following nodes. Show the steps in
deriving the answer. [3 Marks]

Node A: N=120, Class A = 70, Class B = 50
Node B: N=70, Class A = 50, Class B = 20
Node C: N=50, Class A = 20, Class B = 30
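Entropy uses the same class proportions as Q1, H = -sum(p_i * log2(p_i)); a quick check:

```python
import math

# Entropy (base 2) of a node from its class counts.
def entropy(counts):
    n = sum(counts)
    return -sum((c / n) * math.log2(c / n) for c in counts if c)

# Node A ~ 0.9799, Node B ~ 0.8631, Node C ~ 0.9710
for name, counts in {"A": (70, 50), "B": (50, 20), "C": (20, 30)}.items():
    print(f"Node {name}: Entropy = {entropy(counts):.4f}")
```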
Q3. Calculate the Log-Loss values in each case. [3 Marks]

1. Actual Class = 1, Predicted Probability = 0.9
2. Actual Class = 1, Predicted Probability = 0.5
3. Actual Class = 0, Predicted Probability = 0.4
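Per-observation log-loss is -[y*log(p) + (1-y)*log(1-p)], using the natural log (the scikit-learn convention); the three cases compute as:

```python
import math

# Log-loss of a single observation with actual class y and predicted
# probability p of class 1 (natural logarithm).
def log_loss_single(y, p):
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

# Case 1 ~ 0.1054, case 2 ~ 0.6931, case 3 ~ 0.5108
for y, p in [(1, 0.9), (1, 0.5), (0, 0.4)]:
    print(f"y={y}, p={p}: log-loss = {log_loss_single(y, p):.4f}")
```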
Q4. You are a marketing manager in ABC Inc. You have a customer base
of 1 million customers and the average response rate for a campaign
is 10%. The cost of the campaign is Rs 10 per customer. So, what will
be the cost of the campaign if you need 30000 responses? Show the
steps in deriving the answer. [3 Marks]
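The arithmetic is: customers to contact = responses needed / response rate, then multiply by the per-customer cost.

```python
# Campaign cost without a model: contact responses_needed / rate customers.
rate = 0.10               # flat response rate
cost_per_customer = 10    # Rs
responses_needed = 30_000

customers_to_contact = responses_needed / rate      # 300,000 customers
cost = customers_to_contact * cost_per_customer     # Rs 3,000,000
print(customers_to_contact, cost)
```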

Q5. Following on from the previous question, assume that a classifier
tells you that the response rates in the first, second and third
deciles of predicted probabilities (in descending order) are 20%, 10%
and 10%. So, what will be the cost of the campaign if you need 30000
responses? Show the steps in deriving the answer. Is there any
financial benefit of using the classifier? [3 Marks]
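With the classifier, customers are contacted decile by decile (each decile of the 1-million base is 100,000 customers) until the response target is met; a sketch of that walk:

```python
# Contact top deciles in order until responses_needed is reached.
decile_size = 100_000                # 10% of a 1,000,000 customer base
decile_rates = [0.20, 0.10, 0.10]    # top three deciles, descending
cost_per_customer = 10               # Rs
responses_needed = 30_000

responses = 0.0
contacted = 0.0
for rate in decile_rates:
    if responses >= responses_needed:
        break
    still_needed = responses_needed - responses
    # customers needed from this decile, capped at the decile size
    take = min(decile_size, still_needed / rate)
    contacted += take
    responses += take * rate

cost = contacted * cost_per_customer
print(contacted, cost)   # 200,000 customers, Rs 2,000,000
```

Against the Rs 3,000,000 cost in Q4, the classifier saves Rs 1,000,000, which is the financial benefit asked for.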

Q6. Answer the following questions from the following figure. [3 Marks]

[Figure: decision tree, reconstructed as text]

Root: checkin_acc_A14 <= 0.5
  gini = 0.419, samples = 700, value = [491, 209], class = Good Credit

True branch: duration <= 33.0
  gini = 0.484, samples = 425, value = [251, 174], class = Good Credit
  True leaf:  gini = 0.458, samples = 343, value = [221, 122], class = Good Credit
  False leaf: gini = 0.464, samples = 82,  value = [30, 52],  class = Bad Credit

False branch: inst_plans_A143 <= 0.5
  gini = 0.222, samples = 275, value = [240, 35], class = Good Credit
  True leaf:  gini = 0.423, samples = 46,  value = [32, 14],  class = Good Credit
  False leaf: gini = 0.167, samples = 229, value = [208, 21], class = Good Credit
1. What is the probability of being good credit if Checkin_acc_A14 = 0 and duration is 20?
2. What is the probability of being good credit if Checkin_acc_A14 = 0 and duration is 35?
3. What is the probability of being good credit if Checkin_acc_A14 = 1 and Inst_plans_A143 = 0?
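At a leaf, P(good) is the good-class count divided by the leaf's sample count. A sketch, assuming (as in scikit-learn trees) that a feature value of 0 satisfies the `<= 0.5` split and 1 does not, and that value = [good, bad]:

```python
# Leaf class counts read off the Q6 tree: value = [good, bad].
leaves = {
    "A14=0, duration=20 (duration<=33 True)":  (221, 122),
    "A14=0, duration=35 (duration<=33 False)": (30, 52),
    "A14=1, plans=0 (plans<=0.5 True)":        (32, 14),
}
probs = {path: good / (good + bad) for path, (good, bad) in leaves.items()}
for path, p in probs.items():
    print(f"{path}: P(good) = {p:.3f}")
```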

Q7. Answer the following questions from the confusion matrix below. [3 Marks]

In a given confusion matrix,

TN = 188
TP = 30
FP = 21
FN = 61

I increase the threshold, which makes TN = 208 and FN = 85.

Find the accuracy, sensitivity and specificity for the new confusion
matrix.
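The new TP and FP follow because the actual class totals are fixed by the data: negatives = TN + FP = 209 and positives = TP + FN = 91 do not change when the threshold moves. A sketch of the derivation:

```python
# Actual class totals are fixed: raising the threshold only moves
# predictions across the boundary, so TN + FP and TP + FN are constant.
negatives = 188 + 21   # TN + FP = 209
positives = 30 + 61    # TP + FN = 91

TN, FN = 208, 85       # new values after raising the threshold
FP = negatives - TN    # = 1
TP = positives - FN    # = 6

accuracy = (TP + TN) / (positives + negatives)
sensitivity = TP / positives   # true positive rate (recall)
specificity = TN / negatives   # true negative rate
print(f"accuracy={accuracy:.4f}, sensitivity={sensitivity:.4f}, "
      f"specificity={specificity:.4f}")
```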
