BMI 704 - Machine Learning Lab
Lab
030719
Topics
• Introduction to Supervised Learning
• Introduction to Unsupervised Learning
• Features
• i.e. variables (Xs)
• Inputs you are using to predict outcome
Models
• 1) Pick a person
• 2) Substitute his or her features into a model, e.g.
Diabetes = 0.5*age + 0.2*sex + 2.1*BMI + …
Height = 0.2*age + 0.8*sex + 1.3*weight + …
• 3) Now you know the predicted outcome
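Step 2 above can be sketched in code. The lab itself uses R, but here is a minimal Python sketch; the coefficients are the illustrative ones from the slide, and the person's feature values are made up for the example.

```python
# A minimal sketch of "substitute the features into the model".
# Coefficients come from the slide's illustrative diabetes model;
# the input values (age=50, sex=1, bmi=30) are assumptions.
def predict_diabetes(age, sex, bmi):
    return 0.5 * age + 0.2 * sex + 2.1 * bmi

score = predict_diabetes(age=50, sex=1, bmi=30)
print(score)  # 0.5*50 + 0.2*1 + 2.1*30 = 88.2
```

Once the coefficients are known, prediction is just plugging numbers in; the hard part, covered next, is where those coefficients come from.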
Where does the predictive model come from?
• 1) Pick an algorithm
• Linear model
• Y = X1 + X2 + X3
• 2) Split your data set into training and test sets (e.g. 80/20, 70/30)
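The 80/20 split in step 2 can be sketched in a few lines of Python (the toy data set of 100 rows is an assumption; in R you would typically index with `sample()`):

```python
import random

# A minimal sketch of an 80/20 train/test split on toy data.
def train_test_split(data, test_frac=0.2, seed=0):
    rows = list(data)
    random.Random(seed).shuffle(rows)     # shuffle before splitting
    n_test = int(len(rows) * test_frac)
    return rows[n_test:], rows[:n_test]   # train, test

data = list(range(100))                   # assumed toy data set
train, test = train_test_split(data)
print(len(train), len(test))              # 80 20
```

Shuffling first matters: if the data are ordered (e.g. by outcome), an unshuffled split gives unrepresentative training and test sets.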
Simple Regression
• R² – the amount of variance explained
• If Y = 1 or 0:
• High sensitivity:
• Y = 1 ➙ Ŷ = 1
• High specificity:
• Y = 0 ➙ Ŷ = 0
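These two definitions translate directly into code: sensitivity is the fraction of true Y = 1 cases predicted as 1, and specificity the fraction of true Y = 0 cases predicted as 0. A minimal Python sketch (the toy labels and predictions are assumptions):

```python
# Sensitivity = TP / (TP + FN); Specificity = TN / (TN + FP).
def sensitivity_specificity(y_true, y_pred):
    tp = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 0)
    tn = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

y_true = [1, 1, 1, 0, 0, 0, 0, 1]   # assumed toy labels
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]   # assumed toy predictions
sens, spec = sensitivity_specificity(y_true, y_pred)
print(sens, spec)  # 0.75 0.75 on this toy data
```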
Which model (algorithm) should you use?
Unsupervised Learning
• Not interested in predicting Y; instead, exploratory analysis of the Xs
• discovering patterns
• Find subgroups that you don't know in advance
• Visualize the results
• A few latent variables capture most of the information in the data
• i.e. the variance explained
[PCA biplot: loadings and scores, each axis labeled with its % variance explained]
Unsupervised Learning
• Clustering
• PCA looks to find a low-dimensional representation of the observations that explains a good fraction of the variance;
• Clustering looks to find homogeneous subgroups among the observations.
• K-means clustering
• hierarchical clustering
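The "fraction of the variance explained" by PCA can be computed by hand in two dimensions: the eigenvalues of the 2×2 covariance matrix are the variances along the principal components. A Python sketch with assumed toy data (in practice you would use `prcomp` in R):

```python
import math

# Hand-rolled 2-D PCA: eigenvalues of the sample covariance matrix
# [[a, b], [b, c]] give the variance along PC1 and PC2.
def variance_explained_2d(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) ** 2 for x in xs) / (n - 1)
    c = sum((y - my) ** 2 for y in ys) / (n - 1)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    disc = math.sqrt((a - c) ** 2 + 4 * b ** 2)
    lam1, lam2 = (a + c + disc) / 2, (a + c - disc) / 2
    return lam1 / (lam1 + lam2)   # fraction of variance on PC1

xs = [1.0, 2.0, 3.0, 4.0, 5.0]    # assumed, highly correlated toy data
ys = [1.1, 1.9, 3.2, 3.9, 5.1]
print(variance_explained_2d(xs, ys))
```

Because the two toy variables are nearly collinear, a single principal component captures almost all of the variance, which is exactly the "few latent variables" idea from the previous slide.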
K-means clustering
• partitioning a data set into K distinct, non-overlapping clusters.
• Specify how many clusters you want
• The algorithm finds a local optimum
• Run it a few times to compare the different results
Hierarchical clustering
• Tree-based representation of the observations, called a dendrogram
• Bottom-up clustering
Algorithms and Packages
• ML Algorithms (many, many, many!)
• Basics: linear-based
• Shrinkage Methods
• Lasso and Ridge regression
• ElasticNet
• Non-linear methods
• Spline
• Support Vector Machines
• Tree based methods
• Decision trees
• Random Forests
• Packages in R
• Individual packages for each algorithm, e.g. glmnet
• Meta packages – caret
Unsupervised Learning (cont'd)
• Clustering
• Partitional methods
• K-means: partition {x1,…,xn} into K clusters, where K is predefined
• Build a new partition by associating each point with the nearest centroid
• Compute the centroid (mean point) for each cluster; repeat until convergence
• “kmeans” function in R.
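The two alternating steps above — assign points to the nearest centroid, then recompute each centroid as its cluster mean — can be sketched in Python for the 1-D case. This is only a toy version of what R's `kmeans` function does in general dimensions; the data, starting centroids, and K = 2 are assumptions.

```python
# Minimal 1-D k-means: alternate assignment and update steps.
def kmeans_1d(points, centroids, n_iter=10):
    for _ in range(n_iter):
        # assignment step: each point joins its nearest centroid
        clusters = [[] for _ in centroids]
        for p in points:
            i = min(range(len(centroids)),
                    key=lambda j: abs(p - centroids[j]))
            clusters[i].append(p)
        # update step: recompute each centroid as the cluster mean
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

points = [1.0, 1.2, 0.8, 9.8, 10.0, 10.2]   # assumed toy data
centroids, clusters = kmeans_1d(points, centroids=[0.0, 5.0])
print(centroids)   # converges near [1.0, 10.0] for this data
```

Different starting centroids can converge to different local optima, which is why the slide recommends running the algorithm several times.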
Unsupervised Learning
• Not interested in predicting, but in discovering patterns
• Find subgroups that you don't know in advance
• Visualize the results
• Principal components
• Clustering
• Hierarchical clustering – build a hierarchy of clusters
• Agglomerative: A “bottom up” approach. You start with each element in a separate
cluster, then merge them according to a given property.
• Divisive: A “top down” approach. All elements start in one all-inclusive cluster, then you
split recursively.
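The agglomerative ("bottom up") approach can be sketched on 1-D toy data: start with one cluster per point, then repeatedly merge the two closest clusters. Single linkage (distance between the closest members) is just one possible merge property; the data and cluster count are assumptions. In the lab you would use R's `hclust` instead.

```python
# Minimal agglomerative clustering with single linkage on 1-D points.
def agglomerative(points, n_clusters):
    clusters = [[p] for p in points]   # start: each point alone
    while len(clusters) > n_clusters:
        # find the pair of clusters with the smallest single-link distance
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(abs(a - b)
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]   # merge closest pair
        del clusters[j]
    return clusters

clusters = agglomerative([0.5, 1.0, 1.4, 8.0, 8.4, 9.0], n_clusters=2)
print(clusters)
```

Recording the order and height of the merges is what produces the dendrogram; cutting the tree at a chosen height yields the final clusters.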