A1388404476 - 64039 - 23 - 2023 - Machine Learning II
SEMESTER- VII
COURSE OUTCOMES
Upon successful completion of this course, students will be able to:
CO1 Evaluate the effectiveness of machine learning models and select appropriate techniques for model
selection, regularization, and hyperparameter tuning using cross-validation.
CO2 Analyze and apply different linear regression techniques, including simple linear regression, multiple
linear regression, polynomial regression, and regularization, to model and make predictions on datasets.
CO3 Analyze and apply support vector machine (SVM) techniques, including the concept of a hyperplane,
maximal margin classifier, soft margin classifier, slack variables, and cost of misclassification, to classify
data in both two and three dimensions.
CO4 Evaluate and apply kernel methods, including feature transformation and the kernel trick, to map nonlinear
data to a linear feature space and build nonlinear models using support vector machines (SVM) in Python.
CO5 Evaluate and apply decision tree techniques, including building decision trees, measuring impurity and
feature importance, and choosing hyperparameters, to model and make predictions on both classification
and regression datasets in Python.
CO6 Evaluate and apply ensemble techniques, including random forests, to improve model performance and
feature importance, and apply these techniques to real-world datasets such as telecom churn prediction.
Introduction to Model Selection, Model and Learning Algorithm, Simplicity, Complexity and Overfitting, Bias-
Variance Tradeoff, Regularization and Hyperparameters, Model Evaluation and Cross Validation, Model Evaluation:
Python Demonstration-I, Model Evaluation: Python Demonstration-II, Cross-Validation: Motivation, Cross-Validation:
Python Demonstration, Cross-Validation: Hyperparameter Tuning
Linear Regression - Review, Estimating Coefficients in SLR, Matrix Representation for SLR, Estimating Coefficients
in MLR, Assumptions of Linear Regression, Multiple Linear Regression in Python, Identifying Nonlinearity in Data,
Polynomial Regression, Data Transformation, Nonlinear Regression, Linear Regression Pitfalls, Regularization -
Introduction, Ridge Regression, Ridge Regression - Python Implementation, Lasso Regression, Regularization - Python
Demo, Geometrical Representation of Ridge and Lasso
Introduction to Kernels, Mapping Non-Linear Data to Linear Data, Feature Transformation, The Kernel Trick,
Building Non-Linear Models in Python, Shiny Apps - Types of Kernels, Choosing a Kernel Function, Letter
Recognition using SVM
Introduction to Decision Trees, Interpreting a Decision Tree, Building Decision Trees, Comprehension - Decision Tree
Classification in Python, Tree Models over Linear Models, Splitting and Homogeneity, Impurity Measures,
Comprehension: The GINI Index, Feature Importance in Decision Trees, Disadvantages of Decision Trees, Tree
Truncation, Building Decision Trees in Python, Choosing Tree Hyperparameters in Python, Comprehension -
Hyperparameters, Decision Tree Regression, Decision Tree Regression in Python
Ensembles, Comprehension - Ensembles, Introduction to Random Forests, Comprehension - OOB (Out-of-Bag) Error,
Feature Importance in Random Forests, Random Forests in Python, Random Forest Regression in Python, Telecom
Churn Prediction.
PRACTICAL LIST
For the following three questions, use the Boston Housing dataset provided here.
Unit 1 | Lab 1
How can we determine the optimal complexity of a model to prevent overfitting while maintaining good performance
on the test dataset?
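One way to answer this is to sweep a single complexity knob, here the polynomial degree, and watch where test error starts rising while training error keeps falling. A minimal sketch in Python, assuming the dataset sits in a local boston.csv with the target in a column named MEDV (both names are placeholders):

# Compare train vs. test error as model complexity (degree) grows.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import PolynomialFeatures, StandardScaler
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.metrics import mean_squared_error

df = pd.read_csv("boston.csv")                  # placeholder path
X, y = df.drop(columns="MEDV"), df["MEDV"]      # placeholder target name
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=42)

for degree in range(1, 4):
    model = make_pipeline(StandardScaler(), PolynomialFeatures(degree),
                          LinearRegression())
    model.fit(X_tr, y_tr)
    print(degree,
          mean_squared_error(y_tr, model.predict(X_tr)),   # train MSE keeps falling
          mean_squared_error(y_te, model.predict(X_te)))   # test MSE bottoms out

The degree at which test MSE stops improving marks the complexity beyond which the model starts to overfit.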
Unit 1 | Lab 2
How do different regularization techniques, such as L1 and L2 regularization, affect the bias-variance tradeoff in a
model, and how can we select the optimal regularization hyperparameters for a given dataset?
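A minimal sketch of one approach, reusing X and y from Lab 1: fit ridge (L2) and lasso (L1) pipelines and let cross-validation pick alpha, the regularization strength.

# Select the regularization hyperparameter alpha by 5-fold CV.
from sklearn.linear_model import Ridge, Lasso
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

for name, est in [("ridge", Ridge()), ("lasso", Lasso(max_iter=10000))]:
    pipe = Pipeline([("scale", StandardScaler()), ("reg", est)])
    grid = GridSearchCV(pipe, {"reg__alpha": [0.01, 0.1, 1, 10, 100]},
                        cv=5, scoring="neg_mean_squared_error")
    grid.fit(X, y)
    print(name, grid.best_params_, -grid.best_score_)

Increasing alpha adds bias and removes variance; L1 regularization additionally drives some coefficients exactly to zero, which makes the two tradeoffs easy to compare.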
Unit 1 | Lab 3
How does cross-validation help us evaluate the generalization performance of a model, and how can we use cross-
validation to tune hyperparameters for a given model?
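A minimal sketch, again assuming X and y are loaded: k-fold CV averages the score over k held-out folds, and running it at several hyperparameter values turns it into a tuning procedure.

# Estimate generalization performance and tune alpha with the same folds.
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold, cross_val_score

folds = KFold(n_splits=5, shuffle=True, random_state=42)
for alpha in [0.1, 1, 10]:
    scores = cross_val_score(Ridge(alpha=alpha), X, y, cv=folds, scoring="r2")
    print(alpha, scores.mean(), scores.std())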
Unit 2 | Lab 1
Assumptions:
1. Linear relationship between target and independent variables: The response variable y should be linearly
related to the explanatory variables X.
2. No or Little multicollinearity between independent variables: Linear regression assumes that there is little or no
multicollinearity in the data. Multicollinearity occurs when the independent variables are too highly correlated
with each other.
3. Residual errors must be normally distributed: The residual errors should be normally distributed.
4. Residual errors must be homoscedastic: The residual errors should have constant variance; otherwise, they
are said to be heteroscedastic. A sketch of quick diagnostic checks for these assumptions follows.
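A minimal diagnostic sketch in Python, assuming X and y are already loaded as pandas objects:

# Fit OLS and inspect the assumptions: multicollinearity, residual
# normality, linearity, and homoscedasticity.
import matplotlib.pyplot as plt
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

Xc = sm.add_constant(X)
res = sm.OLS(y, Xc).fit()

# Assumption 2: VIF per feature (rule of thumb: VIF > 5 is high)
for i, col in enumerate(Xc.columns):
    if col != "const":
        print(col, variance_inflation_factor(Xc.values, i))

# Assumption 3: points on a Q-Q plot of residuals should hug the line
sm.qqplot(res.resid, line="45", fit=True)

# Assumptions 1 and 4: residuals vs. fitted values should form a
# patternless, constant-width band around zero
plt.figure()
plt.scatter(res.fittedvalues, res.resid)
plt.axhline(0, color="red")
plt.xlabel("fitted values"); plt.ylabel("residuals")
plt.show()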
Unit 2 | Lab 2
Advanced Regression
A US-based housing company named Surprise Housing has decided to enter the Australian market. The company uses
data analytics to purchase houses at a price below their actual value and sell them at a higher price. For this
purpose, the company has collected a data set from the sale of houses in Australia. The data is provided in the CSV
file below.
The company is looking at prospective properties to buy to enter the market. You are required to build a regression
model using regularisation in order to predict the actual value of the prospective properties and decide whether to invest
in them or not.
The company wants to know which variables are significant in predicting the price of a house and how well those
variables describe the price. Also, determine the optimal value of lambda for ridge and lasso regression.
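A minimal sketch for the lambda search, assuming the housing CSV has been cleaned into a feature matrix X and target y (scikit-learn calls lambda `alpha`):

# RidgeCV and LassoCV search a grid of lambdas by cross-validation.
import numpy as np
from sklearn.linear_model import LassoCV, RidgeCV

alphas = np.logspace(-3, 3, 50)
ridge = RidgeCV(alphas=alphas, cv=5).fit(X, y)
lasso = LassoCV(alphas=alphas, cv=5, max_iter=10000).fit(X, y)
print("optimal ridge lambda:", ridge.alpha_)
print("optimal lasso lambda:", lasso.alpha_)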
Unit 2 | Lab 3
Regression
Stock Price Prediction
Develop a stock price predictor using multiple linear regression, then take it further with Lasso and Ridge
regression models, and test it on the Tesla stock (2010 to 2020) dataset from Kaggle.
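One possible framing, assuming the Kaggle file is TSLA.csv with Date and Close columns (placeholder names): predict the next close from lagged closes, keeping the time order intact when splitting.

# Lag-feature regression on Tesla closing prices.
import pandas as pd
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.metrics import mean_squared_error

df = pd.read_csv("TSLA.csv", parse_dates=["Date"]).sort_values("Date")
for lag in range(1, 6):                       # five lagged-close features
    df[f"lag_{lag}"] = df["Close"].shift(lag)
df = df.dropna()

X = df[[f"lag_{lag}" for lag in range(1, 6)]]
y = df["Close"]
split = int(len(df) * 0.8)                    # no shuffling: time series
X_tr, X_te, y_tr, y_te = X[:split], X[split:], y[:split], y[split:]

for model in (LinearRegression(), Ridge(alpha=1.0), Lasso(alpha=0.1)):
    model.fit(X_tr, y_tr)
    print(type(model).__name__, mean_squared_error(y_te, model.predict(X_te)))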
For the following two questions, make use of the Iris data set provided here.
Unit 3 | Lab 1
How does the concept of a hyperplane in SVMs help us classify data points in higher-dimensional spaces, and how can
we visualize this process using tools such as 3D plots or decision boundary plots?
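A minimal visualization sketch on two Iris features, where the separating hyperplane reduces to a line and the decision regions can be drawn by evaluating the classifier on a grid:

# Linear SVM decision regions on the first two Iris features.
import matplotlib.pyplot as plt
import numpy as np
from sklearn.datasets import load_iris
from sklearn.svm import SVC

iris = load_iris()
X, y = iris.data[:, :2], iris.target          # sepal length, sepal width

clf = SVC(kernel="linear").fit(X, y)

xx, yy = np.meshgrid(np.linspace(X[:, 0].min() - 1, X[:, 0].max() + 1, 300),
                     np.linspace(X[:, 1].min() - 1, X[:, 1].max() + 1, 300))
Z = clf.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)

plt.contourf(xx, yy, Z, alpha=0.3)            # decision regions
plt.scatter(X[:, 0], X[:, 1], c=y, edgecolor="k")
plt.xlabel(iris.feature_names[0]); plt.ylabel(iris.feature_names[1])
plt.show()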
Unit 3 | Lab 2
How can we use SVMs to classify data points that are not linearly separable, and what are some common kernel
functions that can be used to transform the data into a higher-dimensional space where linear separation is possible?
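A quick self-contained illustration (on synthetic concentric circles rather than Iris, since the effect is starker there): a linear kernel fails, while RBF and polynomial kernels implicitly map the data to a space where it is separable.

# Kernel comparison on data that is not linearly separable in 2-D.
from sklearn.datasets import make_circles
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = make_circles(n_samples=400, factor=0.3, noise=0.1, random_state=0)
for kernel in ("linear", "poly", "rbf"):
    print(kernel, cross_val_score(SVC(kernel=kernel), X, y, cv=5).mean())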
Unit 3 | Lab 3
Dataset
Problem Description
Predict Heart Disease using the concepts of support vector machines based on given attributes.
● 0 - NO HEART DISEASE
● 1 - HEART DISEASE
Attribute Information:
1. age
2. sex (1 = male; 0 = female)
3. chest pain type (4 values): 1 = typical angina; 2 = atypical angina; 3 = non-anginal pain; 4 = asymptomatic
4. resting blood pressure
5. serum cholesterol in mg/dl
6. fasting blood sugar > 120 mg/dl
7. resting electrocardiographic results (values 0, 1, 2): 0 = normal; 1 = ST-T wave abnormality (T wave
inversions and/or ST elevation or depression of > 0.05 mV); 2 = probable or definite left ventricular
hypertrophy by Estes' criteria
8. maximum heart rate achieved
9. exercise induced angina
10. oldpeak = ST depression induced by exercise relative to rest
11. the slope of the peak exercise ST segment
12. number of major vessels (0-3) coloured by fluoroscopy
13. thal: 3 = normal; 6 = fixed defect; 7 = reversible defect
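A minimal modelling sketch, assuming the data is in a local heart.csv whose label column is named target (placeholder names); scaling matters because SVMs are sensitive to feature ranges.

# Scaled RBF SVM for heart-disease prediction.
import pandas as pd
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

df = pd.read_csv("heart.csv")                     # placeholder path
X, y = df.drop(columns="target"), df["target"]    # 1 = disease, 0 = none
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=42)

clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
clf.fit(X_tr, y_tr)
print(classification_report(y_te, clf.predict(X_te)))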
For the following two questions, you could use the USPS dataset, which contains images of handwritten
digits (0-9) and can be used for tasks such as classification and recognition.
Unit 4 | Lab 1
How can we use different types of kernel functions (e.g., linear, polynomial, radial basis function) to transform non-
linear data into a linearly separable form for classification using SVMs, and how do we choose the appropriate kernel
function for a given dataset?
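In practice the kernel is usually chosen empirically. A minimal sketch, assuming USPS pixel features X and digit labels y are already loaded: put each kernel and its parameters into one grid and let cross-validation decide.

# Let GridSearchCV choose both the kernel and its parameters.
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

param_grid = [
    {"kernel": ["linear"], "C": [0.1, 1, 10]},
    {"kernel": ["poly"], "degree": [2, 3], "C": [1, 10]},
    {"kernel": ["rbf"], "gamma": ["scale", 0.01], "C": [1, 10]},
]
grid = GridSearchCV(SVC(), param_grid, cv=3).fit(X, y)
print(grid.best_params_, grid.best_score_)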
Unit 4 | Lab 2
How can we apply SVMs with kernel methods to the task of letter recognition, and what are some challenges and
limitations of this approach?
Unit 4 | Lab 3
Description
The Pima Indians Diabetes dataset contains observations on various health-related attributes, such as plasma glucose
concentration and body mass index (BMI). Each row contains a patient’s attributes and whether they had diabetes.
Now, build a linear SVM model with cost C = 1 to predict whether a given patient has diabetes.
After you train the model, use the test data to make predictions. The test data can be accessed here.
/data/test/diabetes_test.csv
Write the predictions to the file given below, carefully noting the names of the columns in the required format.
/code/output/diabetes_predictions.csv
Datasets
● Training dataset
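A minimal end-to-end sketch; the training path and the label/output column names below are assumptions, so adjust them to the dataset actually provided.

# Train a linear SVM (C = 1) and write predictions in the required file.
import pandas as pd
from sklearn.svm import SVC

train = pd.read_csv("/data/train/diabetes_train.csv")   # assumed path
test = pd.read_csv("/data/test/diabetes_test.csv")

X_tr, y_tr = train.drop(columns="diabetes"), train["diabetes"]  # assumed label column
clf = SVC(kernel="linear", C=1).fit(X_tr, y_tr)

out = pd.DataFrame({"diabetes": clf.predict(test)})     # assumed output column
out.to_csv("/code/output/diabetes_predictions.csv", index=False)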
Unit 4 | Lab 4
Description
You have already built a linear SVM model on the Pima Indians Diabetes dataset, which contains observations on
various health-related attributes, such as plasma glucose concentration and Body Mass Index (BMI).
In this question, you will find the optimal value of the hyperparameter ‘C’ using GridSearchCV() and then build a
linear SVM model to predict whether a given patient has diabetes.
To find the optimal value of ‘C’, you can plot training and test accuracy versus ‘C’ using matplotlib (the code is already
written; you will see the plot displayed below the coding console).
After you train the model, use the test data to make predictions. The test data can be accessed here.
/data/test/diabetes_test.csv
Carefully note the names of the columns in the format provided in the dataset given below.
Datasets
● Training dataset
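A minimal sketch of the plot-then-search workflow, reusing X_tr and y_tr from the previous lab:

# Plot train/validation accuracy against C, then confirm with GridSearchCV.
import matplotlib.pyplot as plt
import numpy as np
from sklearn.model_selection import GridSearchCV, validation_curve
from sklearn.svm import SVC

Cs = np.logspace(-2, 2, 10)
train_sc, val_sc = validation_curve(SVC(kernel="linear"), X_tr, y_tr,
                                    param_name="C", param_range=Cs, cv=5)
plt.semilogx(Cs, train_sc.mean(axis=1), label="train")
plt.semilogx(Cs, val_sc.mean(axis=1), label="validation")
plt.xlabel("C"); plt.ylabel("accuracy"); plt.legend(); plt.show()

grid = GridSearchCV(SVC(kernel="linear"), {"C": Cs}, cv=5).fit(X_tr, y_tr)
print("best C:", grid.best_params_["C"])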
Unit 4 | Lab 5
You are required to develop a support vector machine model that correctly classifies handwritten digits from 0–9
based on the pixel values given as features. Thus, this is a 10-class classification problem.
You can download the dataset from Kaggle here. You can use train.csv to train the model and test.csv to evaluate
the results.
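A minimal sketch, assuming Kaggle's train.csv has a label column and 784 pixel columns; SVC handles the 10 classes internally via one-vs-one.

# Multiclass SVM on handwritten digits.
import pandas as pd
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

train = pd.read_csv("train.csv")
X = train.drop(columns="label") / 255.0       # scale pixels to [0, 1]
y = train["label"]
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.2,
                                            stratify=y, random_state=42)

# Training on the full set is slow; subsample (e.g., the first 5000 rows)
# while experimenting.
clf = SVC(kernel="rbf").fit(X_tr, y_tr)
print("validation accuracy:", accuracy_score(y_val, clf.predict(X_val)))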
For the upcoming two questions, you could use the Boston Housing dataset, which contains information
about various housing features such as crime rate, number of rooms, and median value. You could use
decision trees to model the relationship between these features and the target variable (median value), and
explore feature importance and hyperparameter tuning to optimize the model's performance.
Unit 5 | Lab 2
How can we build a decision tree classifier using Python, and how can we interpret and visualize the resulting tree to
gain insights into the underlying data and decision-making process?
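A minimal sketch of fitting and visualizing a shallow tree (shown on Iris as a stand-in for any classification dataset):

# Fit a depth-limited tree and plot it to read off the split logic.
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, plot_tree

iris = load_iris()
clf = DecisionTreeClassifier(max_depth=3, random_state=0)
clf.fit(iris.data, iris.target)

plt.figure(figsize=(12, 6))
plot_tree(clf, feature_names=iris.feature_names,
          class_names=iris.target_names, filled=True)
plt.show()
print(dict(zip(iris.feature_names, clf.feature_importances_)))

Each node of the plot shows the splitting feature and threshold, the impurity, and the class distribution, so the path from root to leaf reads as a decision rule.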
Unit 5 | Lab 3
How can we use decision tree regression to model a given dataset and make predictions on new data, and what are some
strategies for tuning hyperparameters such as the maximum tree depth and minimum sample split size?
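A minimal tuning sketch for a regression tree on the Boston features X and target y loaded earlier:

# Tune tree depth and minimum split size by cross-validation.
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeRegressor

grid = GridSearchCV(DecisionTreeRegressor(random_state=0),
                    {"max_depth": [3, 5, 8, None],
                     "min_samples_split": [2, 10, 50]},
                    cv=5, scoring="neg_mean_squared_error").fit(X, y)
print(grid.best_params_, -grid.best_score_)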
Unit 6 | Lab 1
Problem Statement
Predict the median value of owner-occupied homes using a decision tree regression model.
Unit 6 | Lab 2
Problem Statement
● Identify the variables affecting house prices, such as the area and the number of rooms and bathrooms,
● Create a linear model that quantitatively relates house prices to these variables, and
● Identify the variables that significantly contribute towards predicting house prices (see the sketch below).
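A minimal OLS sketch; the file name and the column names below are assumptions. The fitted summary reports a coefficient (the quantitative relation) and a p-value (the significance) for every variable.

# Fit OLS and read significance off the summary table.
import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("housing.csv")                              # assumed path
X = sm.add_constant(df[["area", "bedrooms", "bathrooms"]])   # assumed columns
model = sm.OLS(df["price"], X).fit()                         # assumed target
print(model.summary())   # low p-values mark the significant variables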
Problem Statement
A telecom company has all its clients' data. The main types of attributes are demographics (e.g., age and gender),
services availed (e.g., internet packs purchased and special offers taken), and expenses (e.g., monthly recharge
amounts).
Based on all this data, you want to construct a model that predicts whether a customer would churn, i.e., switch service
providers. So, the target variable is ‘Churn’, which tells us whether a customer has churned. 1 signifies that the customer
has churned, while 0 means they haven’t.
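A minimal random-forest sketch, assuming the cleaned data sits in churn.csv with the target in a Churn column; the out-of-bag score and the impurity-based importances tie directly into the Unit 6 topics.

# Random forest with OOB error and feature importances.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

df = pd.read_csv("churn.csv")                   # assumed path
X, y = df.drop(columns="Churn"), df["Churn"]    # 1 = churned, 0 = retained

rf = RandomForestClassifier(n_estimators=300, oob_score=True,
                            random_state=42).fit(X, y)
print("OOB accuracy:", rf.oob_score_)
print(pd.Series(rf.feature_importances_, index=X.columns)
        .sort_values(ascending=False).head(10))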