CS4038D Data Mining Assignment 2 - 2024 (1)

The assignment for CS4038D Data Mining at NIT Calicut requires students to complete three tasks involving data analysis of Iris flower species. Students must implement classifiers and clustering algorithms, submit their code and documentation in specified formats, and ensure team collaboration for submission. The deadline for submission is November 10, 2024, and evaluation will include a viva/quiz to assess individual contributions.

Uploaded by

SHREE KRIPA SHANKARI S


National Institute of Technology Calicut

Department of Computer Science and Engineering

CS4038D DATA MINING MONSOON 2024 - Assignment 2

Submission deadline (on or before):

10th November, 2024 10:00:00 PM

Complete the following assignment questions and submit your work on the Moodle (Eduserver)
course page on or before the submission deadline. Only one member of your team should make the
submission in Eduserver on behalf of the entire team. Include a README.PDF that contains the names
and roll numbers of the group members. A total of 7 files (3 Zip files and 3 PDF files, one of each for
each of the 3 following tasks, plus the README file) is expected as part of this assignment.

During evaluation, the genuineness of the submission and the contribution of each member will be
checked through a viva or quiz. The assignment carries a total of 12 marks. Marks will be awarded based
on the uploaded documents and the viva/quiz.

Assignment Questions

Perform the following tasks and submit the outcomes described for each task.

Dataset: The Iris dataset consists of 150 samples of three species of Iris flowers (Iris setosa, Iris
versicolor, and Iris virginica). From each class, randomly take 75% for training (including 5% for
validation) and 25% for testing. Store the splits in train.csv, train-valid.csv and test.csv. These files
should be included in the respective Python code folders.
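
The per-class random split described above could be sketched as follows. This is only an illustration under stated assumptions: the function name `stratified_split`, the `label_of` callback, and the reading of "5% validation" as 5% of each class's samples are hypothetical choices, not part of the assignment text. Writing the three lists to train.csv, train-valid.csv and test.csv (for example with Python's csv module) is left out, since the handout does not state exactly which rows belong in which file.

```python
import random
from collections import defaultdict


def stratified_split(rows, label_of, test_frac=0.25, valid_frac=0.05, seed=0):
    """Split rows class by class into (train, valid, test) lists.

    Assumption: both fractions are taken of each class's sample count
    and truncated to whole samples; the remainder goes to training.
    """
    rng = random.Random(seed)  # fixed seed so the split is reproducible

    # Group the rows by class label.
    by_class = defaultdict(list)
    for row in rows:
        by_class[label_of(row)].append(row)

    train, valid, test = [], [], []
    for label in sorted(by_class):
        members = by_class[label][:]
        rng.shuffle(members)
        n_test = int(len(members) * test_frac)
        n_valid = int(len(members) * valid_frac)
        test.extend(members[:n_test])
        valid.extend(members[n_test:n_test + n_valid])
        train.extend(members[n_test + n_valid:])
    return train, valid, test
```

Splitting inside each class, rather than over the whole file at once, keeps the three species equally represented in every partition.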

1. Task 1: Implement a Naive Bayes classifier from scratch (do not use libraries), discretizing each
numeric feature into three equal-width bins. Use Laplacian correction. After
calculating the likelihood and prior probabilities in trainNBayes.py, save the probability values to
a text file. testNBayes.py should read the probabilities from that text file, use them to
predict the class labels for the test dataset, and finally compute accuracy, precision and recall by
creating the confusion matrix.
Outcome: Zip file (T1CODE_<TeamNumber>.zip) containing all the Python code. Document
(T1_<TeamNumber>.PDF) which describes the experiment setup (if needed) and tabulates the
evaluation measures - Accuracy, Precision and Recall, obtained for the task. Also, write down
your inference on the misclassified samples.
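
As a rough illustration of the Task 1 ingredients, the sketch below combines equal-width binning with Laplace-corrected likelihoods. All function names are hypothetical, and it assumes that the Python standard library (math, collections) is acceptable under "do not use libraries". Saving the probabilities to a text file and reading them back, as the task requires, is not shown.

```python
import math
from collections import Counter, defaultdict

N_BINS = 3


def fit_bins(column, n_bins=N_BINS):
    """Return (minimum, width) describing equal-width bins for one feature."""
    lo, hi = min(column), max(column)
    width = (hi - lo) / n_bins
    return lo, (width if width > 0 else 1.0)  # guard against a constant column


def to_bin(value, lo, width, n_bins=N_BINS):
    """Map a value to a bin index, clamping values outside the training range."""
    idx = int((value - lo) / width)
    return min(max(idx, 0), n_bins - 1)


def train_nb(X, y, n_bins=N_BINS):
    """Estimate bin edges, class priors and Laplace-corrected likelihoods."""
    n_features = len(X[0])
    bins = [fit_bins([row[j] for row in X], n_bins) for j in range(n_features)]
    class_counts = Counter(y)
    counts = defaultdict(int)  # (class, feature index, bin index) -> count
    for row, label in zip(X, y):
        for j, v in enumerate(row):
            counts[(label, j, to_bin(v, *bins[j], n_bins))] += 1
    priors = {c: n / len(y) for c, n in class_counts.items()}
    # Laplace correction: add 1 to every count and n_bins to the denominator.
    likelihood = {
        (c, j, b): (counts[(c, j, b)] + 1) / (class_counts[c] + n_bins)
        for c in class_counts
        for j in range(n_features)
        for b in range(n_bins)
    }
    return bins, priors, likelihood


def predict_nb(row, bins, priors, likelihood, n_bins=N_BINS):
    """Return the class with the highest log-posterior for one sample."""
    def log_posterior(c):
        return math.log(priors[c]) + sum(
            math.log(likelihood[(c, j, to_bin(v, *bins[j], n_bins))])
            for j, v in enumerate(row)
        )
    return max(priors, key=log_posterior)
```

With the correction, a bin never seen for a class still gets probability 1 / (class count + 3) instead of zero, so a single unseen bin cannot wipe out the whole product.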
2. Task 2.1: Implement a k-Nearest Neighbor classifier from scratch, without using libraries, with ‘k’ =
5 and the Euclidean distance measure.
Task 2.2: Use the scikit-learn library to create a k-Nearest Neighbor classifier. Experiment with
different distance measures and different ‘k’ values (use grid search for fine-tuning the parameters).
You are allowed to do any sort of tuning on the parameters.
Outcome: Zip file (T2CODE_<TeamNumber>.zip) containing the Python code. Document
(T2_<TeamNumber>.PDF) which describes the experiment setup (if needed) and tabulates the
evaluation measures - Accuracy, Precision and Recall, obtained for the tasks. Also, write down your
inference on the misclassified samples. The optimized model parameters and the performance of
the model should be tabulated well.
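
A minimal sketch of the Task 2.1 idea, together with the confusion-matrix measures named in the outcome, might look like this. The function names are hypothetical, the standard library is assumed to be acceptable for the "from scratch" part, and the tie-breaking behaviour (the tied label that appears first among the nearest neighbours wins) is an arbitrary choice rather than anything the assignment prescribes.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_X, train_y, query, k=5):
    """Majority vote among the k training points closest to the query."""
    neighbours = sorted(
        zip(train_X, train_y), key=lambda pair: euclidean(pair[0], query)
    )[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


def confusion_matrix(y_true, y_pred, labels):
    """Rows are true classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def evaluate(matrix, labels):
    """Return overall accuracy and per-class (precision, recall)."""
    total = sum(sum(row) for row in matrix)
    accuracy = sum(matrix[i][i] for i in range(len(labels))) / total
    report = {}
    for i, label in enumerate(labels):
        tp = matrix[i][i]
        predicted = sum(row[i] for row in matrix)  # column sum
        actual = sum(matrix[i])                    # row sum
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        report[label] = (precision, recall)
    return accuracy, report
```

The off-diagonal cells of the matrix identify which species are confused with which, a natural starting point for the inference on misclassified samples.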

3. Task 3: Implement k-means clustering using the scikit-learn library. Use the ground truth
class labels only for cluster evaluation. Find the optimal number of clusters using any automated
method. Experiment with different distance measures.
Outcome: Zip file (T3CODE_<TeamNumber>.zip) containing the Python code. Document
(T3_<TeamNumber>.PDF) which describes the experiment setup (if needed) and tabulates the
extrinsic and intrinsic cluster evaluation measures obtained for the task. The optimized model
parameters and the performance of the model should be tabulated well.
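
One possible way to automate the choice of cluster count for Task 3 is to compare silhouette scores over a range of k, as sketched below. This is an assumption-laden illustration, not the required method: the silhouette criterion, the candidate range 2 to 6, the helper name `pick_k_by_silhouette`, and the use of the adjusted Rand index as the extrinsic measure are all choices made here for the example. Note that scikit-learn's KMeans works with Euclidean distance, so experimenting with other distance measures would need a different route, which is not shown.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.metrics import adjusted_rand_score, silhouette_score


def pick_k_by_silhouette(X, k_values, seed=0):
    """Fit KMeans for each k and return (best_k, {k: silhouette score})."""
    scores = {}
    for k in k_values:
        labels = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(X)
        scores[k] = silhouette_score(X, labels)  # intrinsic: no labels needed
    best_k = max(scores, key=scores.get)
    return best_k, scores


iris = load_iris()
X, y_true = iris.data, iris.target

best_k, scores = pick_k_by_silhouette(X, range(2, 7))
labels = KMeans(n_clusters=best_k, n_init=10, random_state=0).fit_predict(X)

# Extrinsic measure: the ground truth is used only here, for evaluation.
ari = adjusted_rand_score(y_true, labels)
print(best_k, scores[best_k], ari)
```

The automatically selected k need not equal the number of species. Reporting the intrinsic score for each candidate k alongside the extrinsic score makes that visible in the results table.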
