
MACHINE LEARNING
IN A NUTSHELL

TOPICS COVERED

 Introduction
 Building an ML Model
 Types of ML Algorithms
 ML Drawbacks
 Neural Nets
MACHINE LEARNING

• Machine Learning is the study of making machines learn by training algorithms on data, without explicit programming.
• ML is a subset of Artificial Intelligence.
BUILDING AN ML MODEL

 Split the Data : First, we split the entire dataset into training and testing data.
 Train the Model : (The model is the ML algorithm itself.) We feed training data to the model so that it learns the features and their respective class labels. (Features are the columns of the data; the class label is the expected output.)
 Loss Calculation : A loss function calculates the difference between the actual output (y) and the output predicted by the model (y hat). Many different functions are available to calculate loss.
 Retrain the Model after Tuning Parameters : After calculating the loss, we tune the parameters so that the algorithm improves at predicting the output.

 Validate the Model : We hold back part of the training data to validate the model before "testing" it. Over-fitting or under-fitting during training can be identified at this step, which also helps attain the best version of the model. (If performance stops improving on the validation set, we can stop training to prevent overfitting.)
 Test the Model : The test data is used to check the model's performance on unseen data (in supervised learning we don't provide the actual output to the model at this stage).
 Performance Metrics : The results obtained on the test data are used to calculate the performance of the model through the different "performance metrics" available in machine learning.
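As a concrete illustration, here is a minimal sketch of this pipeline in scikit-learn. The Iris dataset, the decision-tree model, and accuracy as the metric are illustrative assumptions, not choices prescribed above:

    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.metrics import accuracy_score

    X, y = load_iris(return_X_y=True)            # features and class labels

    # split into train / validation / test portions
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
    X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.25, random_state=0)

    model = DecisionTreeClassifier()             # the "model" is the ML algorithm itself
    model.fit(X_train, y_train)                  # train on labelled data

    print("validation accuracy:", accuracy_score(y_val, model.predict(X_val)))
    print("test accuracy:", accuracy_score(y_test, model.predict(X_test)))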

TYPES OF ML ALGORITHMS

1. Supervised Learning
   • Classification Methods
   • Regression Methods
2. Unsupervised Learning
   • Clustering
   • Dimensionality Reduction
3. Reinforcement Learning
SUPERVISED LEARNING

 Models which are trained with the "expected output" are called supervised learning techniques. (They train on labelled data.)

 Typically, the expected output is the last column in the dataset. It is known by different names: target variable, class label.

Supervised learning techniques are further classified into:

• Classification
• Regression tasks
CLASSIFICATION TASKS

 A model which assigns a data point (essentially a row in the dataset) to one of the class labels available in the dataset is said to perform classification.

 Some common classification algorithms are:

1. KNN – K Nearest Neighbours
2. SVM – Support Vector Machine
3. Naïve Bayes
4. Decision Trees
5. Random Forests

K NEAREST NEIGHBORS

• The KNN algorithm classifies data points based on distance metrics.

Algorithm:
 Select a number "K"; the algorithm considers K neighbours around a data point to classify it.
 When a new data point is introduced, KNN calculates the distance between that point and all existing points.
 It then selects the K data points with the smallest distances and considers them the nearest neighbours.
 The new instance is assigned to whichever class appears most often among those K points.
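A minimal from-scratch sketch of this algorithm, assuming NumPy arrays for the data (the function name knn_predict is ours, not a library call):

    import numpy as np
    from collections import Counter

    def knn_predict(X_train, y_train, x_new, k=3):
        # distance from the new point to every existing point (Euclidean)
        dists = np.linalg.norm(X_train - x_new, axis=1)
        # indices of the k points with the smallest distances
        nearest = np.argsort(dists)[:k]
        # majority vote among the neighbours' class labels
        return Counter(y_train[nearest]).most_common(1)[0][0]

For example, knn_predict(np.array([[0, 0], [0, 1], [5, 5]]), np.array([0, 0, 1]), np.array([0.2, 0.5]), k=3) returns 0, since two of the three nearest neighbours belong to class 0.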

DISTANCE METRICS

Some popular distance metrics are:

1. Euclidean Distance
2. Manhattan Distance
3. Cosine Distance
4. Squared Euclidean
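These metrics can be written directly in NumPy; a small sketch for feature vectors a and b:

    import numpy as np

    def euclidean(a, b):    return np.sqrt(np.sum((a - b) ** 2))
    def manhattan(a, b):    return np.sum(np.abs(a - b))
    def cosine_dist(a, b):  return 1 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    def sq_euclidean(a, b): return np.sum((a - b) ** 2)

Squared Euclidean skips the square root, which is cheaper to compute and preserves the ordering of neighbours.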

NAÏVE BAYES
 Naïve Bayes is a family of classification algorithms based on Bayes' theorem. It classifies based on probability.
 The key assumption in all these algorithms is that "every single feature contributes equally to the probability of the class, without being influenced by the other features."
 The model predicts the probability that an instance belongs to a class, given a set of feature values.
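Underlying this is Bayes' theorem: P(class | features) ∝ P(class) · Π P(feature_i | class). A minimal sketch with scikit-learn's GaussianNB, which additionally assumes each feature is Gaussian within a class (the toy data is illustrative):

    import numpy as np
    from sklearn.naive_bayes import GaussianNB

    X = np.array([[1.0, 2.0], [1.2, 1.9], [7.0, 8.0], [6.8, 8.2]])  # toy features
    y = np.array([0, 0, 1, 1])                                      # class labels

    model = GaussianNB()
    model.fit(X, y)                 # learns P(class) and per-feature P(x_i | class)
    print(model.predict_proba(np.array([[1.1, 2.1]])))  # probability of each class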
SVM
• SVMs classify using a "hyperplane".

• They find a hyperplane which separates the data points into different classes.

• The hyperplane lives in an N-dimensional space, where N is the number of features.

• With 3 input features the separating hyperplane is a 2D plane in 3D space; with 2 features it is a line in the plane.

• In the picture there are 2 input features, and the data points are separated by multiple candidate hyperplanes (green lines).

• We only need one plane, so the goal is to find the best plane among them.

• The best hyperplane has the maximum distance between the plane and the closest data points (support vectors) on each side.
• Separation margin / marginal distance = the distance between the plane and the support vectors on each side.
• So, we choose the hyperplane which maximizes the separation margin; this is called the "maximum margin hyperplane" or "hard margin".

• So, from the figure we choose the L2 hyperplane.



• In the picture, one data point (a blue ball) lies outside its class (an outlier), so we cannot draw a clean separating hyperplane.
• In such cases the SVM ignores the outliers and finds the best hyperplane as discussed above.
• In addition, for every data point which falls inside the margin or is misclassified, a penalty is added.
• A slack variable (ξi) is introduced into the objective function for every data point, and the penalty is added through it.
• E.g.: properly classified (no penalty), ξ1 = 0
  falls inside the margin, ξ2 = 0.5
  misclassified, ξ3 > 1

• Non-Linear SVM : In most cases the data is not linearly separable, so the non-linear SVM transforms the data into a higher-dimensional space using the kernel trick.

The kernel trick allows transforming data into a higher dimension without explicitly calculating the coordinates in that higher dimension, with the help of "kernel functions".
Some common kernel functions include linear, polynomial, radial basis function (RBF), and sigmoid.
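A minimal sketch of a non-linear SVM in scikit-learn; make_moons generates a toy dataset that is not linearly separable, so the RBF kernel is a natural (illustrative) choice:

    from sklearn.datasets import make_moons
    from sklearn.svm import SVC

    X, y = make_moons(noise=0.1, random_state=0)   # two interleaving half-circles
    clf = SVC(kernel="rbf", C=1.0)                 # C scales the slack-variable penalty
    clf.fit(X, y)
    print(clf.score(X, y))                         # training accuracy

Swapping kernel="rbf" for "linear", "poly", or "sigmoid" selects the other common kernel functions.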
DECISION TREES

• A decision tree is a tree-like structure which classifies a data point through conditional branching.
• A decision tree consists of a root node, internal nodes, branches, and leaf nodes.
• It selects the best attribute according to metrics such as Gini index, entropy, or information gain.
• Each time it selects the best attribute, it splits the tree based on the values that feature holds.
• Finally, the tree ends at the leaf nodes, where the class labels are present.

METRICS USED IN DECISION TREES


Information Gain
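The standard definitions, where p_i is the proportion of class i in set S and S_v is the subset of S in which attribute A takes value v:

    Entropy(S) = - Σ_i p_i · log2(p_i)
    Gini(S) = 1 - Σ_i p_i²
    Gain(S, A) = Entropy(S) - Σ_v (|S_v| / |S|) · Entropy(S_v)

The attribute with the highest gain (or lowest Gini) is chosen for the split.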

RANDOM FORESTS
• The random forests algorithm is based on the technique of "ensemble learning".

• Ensemble learning suggests combining multiple classifiers to increase the accuracy of the model and to solve a complex problem.

• A random forest is a classifier that contains more than one decision tree; the trees feed on various subsets of the dataset, and the forest takes a majority vote to improve the accuracy of the prediction.

• This prevents the problem of overfitting and improves accuracy, as the trees feed on different subsets of the data.

• Many techniques are available to split the data into different subsets to feed the model.
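A minimal scikit-learn sketch (the dataset and tree count are illustrative assumptions):

    from sklearn.datasets import load_iris
    from sklearn.ensemble import RandomForestClassifier

    X, y = load_iris(return_X_y=True)
    # each of the 100 trees trains on a bootstrap sample of the rows and
    # considers a random subset of features at each split; the forest
    # predicts by majority vote over the trees
    forest = RandomForestClassifier(n_estimators=100, random_state=0)
    forest.fit(X, y)
    print(forest.predict(X[:3]))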
REGRESSION TASKS

 A model which predicts a continuous variable from a given data point is said to perform a regression task.

 These models determine the relationship between one or more independent variables and one dependent variable.

 The dependent variable is the expected output, i.e. the class label, and the independent variables are the features of the dataset.

 The model tries to find the mapping function which maps input to output:

ŷ = F(x), where x = features of the dataset and ŷ = predicted output.

 Some common regression techniques are:

1. Simple Linear Regression
2. Polynomial Regression
3. Logistic Regression

SIMPLE LINEAR REGRESSION

• Simple linear regression maps a single independent variable (a feature) to a dependent variable (the class label):

y = b1·x + b0

• where b1 = the slope of the line and b0 = the intercept of the line equation.
• The algorithm should find the best fit (the best line equation) that models our data.
• The best fit decreases the residuals as much as possible.
• So, linear regression finds a linear relationship between a single independent variable and the dependent variable.
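The closed-form least-squares estimates can be computed directly; a small sketch with illustrative toy values:

    import numpy as np

    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])    # single independent variable
    y = np.array([2.1, 4.2, 6.1, 7.9, 10.2])   # dependent variable

    b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)  # slope
    b0 = y.mean() - b1 * x.mean()               # intercept
    residuals = y - (b1 * x + b0)
    print(b1, b0, np.sum(residuals ** 2))       # the best fit minimizes this sum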
MULTIPLE LINEAR REGRESSION

• Most of the data available in the real world has more than one feature.

• Multiple linear regression has "multiple" independent variables and a single dependent variable:

y = b0 + b1x1 + b2x2 + b3x3 + …
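The same fit extends to several features; a minimal scikit-learn sketch with illustrative toy values:

    import numpy as np
    from sklearn.linear_model import LinearRegression

    X = np.array([[1, 2], [2, 1], [3, 4], [4, 3]])  # two independent variables
    y = np.array([5.0, 4.1, 11.2, 10.1])            # dependent variable

    reg = LinearRegression().fit(X, y)
    print(reg.intercept_, reg.coef_)                # b0 and [b1, b2]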
LOGISTIC REGRESSION

This regression technique is mostly used for classification tasks.
The algorithm uses a sigmoid function, whose output lies between 0 and 1.
It takes the features as inputs and maps them through the sigmoid function.
If the sigmoid output falls below 0.5 the point is assigned to one class, otherwise to the other (conventionally, outputs ≥ 0.5 map to the positive class).
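A small sketch of this decision rule (the weights and bias here are illustrative, standing in for values learned during training):

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))   # squashes any real number into (0, 1)

    w = np.array([0.8, -0.4])             # illustrative learned weights
    b = 0.1                               # illustrative learned bias
    x = np.array([2.0, 1.0])              # one data point's features

    p = sigmoid(np.dot(w, x) + b)         # probability of the positive class
    label = 1 if p >= 0.5 else 0          # the usual 0.5 decision threshold
    print(p, label)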

UNSUPERVISED LEARNING
 These models train on datasets without a class label (unlabelled data).

 They are supposed to find unknown patterns in the data.

 They are classified into:

1. Clustering
2. Dimensionality Reduction

CLUSTERING

 The process of grouping similar data points together is called clustering.
 Example: K Means Clustering
K MEANS CLUSTERING

The clustering algorithm groups all data points into "K" clusters, where "K" is defined by the programmer.

Algorithm (a sketch follows below):

 Randomly choose "K" points and call them "centroids".

 Calculate the distance from every data point to all the centroids.

 Assign each data point to its closest centroid.

 Compute "K" new centroids (the mean of the points in each cluster).

 Repeat the process until the results stay the same.

 Hard Clustering : the clusters don't overlap.

 Soft Clustering : clusters may overlap.
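A minimal sketch with scikit-learn (the toy points and K=2 are illustrative):

    import numpy as np
    from sklearn.cluster import KMeans

    X = np.array([[1, 1], [1.5, 2], [0.5, 1.2], [8, 8], [8, 8.5], [9, 8]])
    km = KMeans(n_clusters=2, n_init=10, random_state=0)  # the programmer picks K
    km.fit(X)
    print(km.labels_)            # the cluster each point was assigned to
    print(km.cluster_centers_)   # the final centroids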



DIMENSIONALITY REDUCTION

• In ML every feature, i.e. every column, is considered a dimension.

• Real-world data may have hundreds of features in the dataset, which is very hard to handle.

• So, we try to obtain the important features from the dataset; this is known as "dimensionality reduction".

• Algorithm for dimensionality reduction : PCA (Principal Component Analysis)

PRINCIPAL COMPONENT ANALYSIS

• A technique which addresses the "CURSE OF DIMENSIONALITY".

• The "curse of dimensionality" refers to the difficulties that arise for algorithms in high-dimensional spaces.
• To work with high-dimensional data we might try to consider only a few features, but this risks losing important information provided by the other features.
• So, PCA considers all the features and combines them to form new features, called principal components, and then maps the data onto them (without losing important information).
• The principal components are ranked from most important to least important:

PC1 > PC2 > PC3 > …… > PCn-1 > PCn

• The directions PCA picks (for the principal components) are the eigenvectors of the covariance matrix, i.e. PC1 = eigenvector 1, PC2 = eigenvector 2.
• So, PCA converts the correlations of all the features into a lower-dimensional space.
• As a result, data points which are highly correlated cluster together.
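A minimal scikit-learn sketch, reducing the 4 Iris features to the 2 top-ranked components (the dataset and component count are illustrative):

    from sklearn.datasets import load_iris
    from sklearn.decomposition import PCA

    X, _ = load_iris(return_X_y=True)      # 4 original features
    pca = PCA(n_components=2)              # keep PC1 and PC2
    X_reduced = pca.fit_transform(X)       # projects data onto the top eigenvectors
    print(pca.explained_variance_ratio_)   # PC1 explains the most variance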

REINFORCEMENT LEARNING
• Reinforcement learning is a technique based on trial and error.

• An agent works out its moves through trial and error, guided by rewards and penalties.

• When the agent (e.g. an AI robot) is placed in an environment to attain a goal, it performs a move and then receives a reward or a penalty based on that move.

• It completes the task in such a way that it attains the maximum reward over the entire process of reaching the goal.
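One common way to implement this reward-driven trial and error is Q-learning; a minimal sketch of its update rule (the state/action counts, learning rate, and discount factor are illustrative assumptions, and Q-learning is one technique among several):

    import numpy as np

    n_states, n_actions = 5, 2
    Q = np.zeros((n_states, n_actions))   # expected future reward per (state, action)
    alpha, gamma = 0.1, 0.9               # learning rate and discount factor

    def q_update(s, a, reward, s_next):
        # nudge Q(s, a) toward the reward plus the best value of the next state
        Q[s, a] += alpha * (reward + gamma * np.max(Q[s_next]) - Q[s, a])

    q_update(0, 1, reward=1.0, s_next=2)  # the agent tried action 1 in state 0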

DRAWBACKS OF ML

1. Classical ML models scale poorly to huge datasets.

2. They struggle to capture highly non-linear patterns in data.

3. The features have to be picked manually (feature engineering) before the data is given to the model.

4. Unstructured data (images, audio, free text) cannot be handled well by traditional ML.



NEURAL NETS

• Neural networks are inspired by the human brain.
• NN models overcome the drawbacks that ML models hold.
• NN models are based on an architecture, unlike classical ML models.
• A single neuron in a neural net consists of inputs, an output, weights, and a bias term.
• The input neurons accept the features of a data point, and every input has its own weight:

Z = Σ wi·xi + b

where Z = the weighted sum of the inputs
      b = bias term
      w = weights
      x = inputs

• Z (the weighted sum of the inputs) is given to the activation function, and the activation function returns the neuron's output.

• SOME COMMON ACTIVATION FUNCTIONS :

1. Sigmoid function
2. Tanh function
3. ReLU function
4. Leaky ReLU
5. Softmax
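A minimal sketch of a single neuron using the formula above, with ReLU as the (illustrative) activation choice:

    import numpy as np

    def relu(z):
        return np.maximum(0.0, z)        # returns z if positive, else 0

    x = np.array([0.5, -1.0, 2.0])       # one data point's features
    w = np.array([0.2, 0.4, -0.1])       # one weight per input
    b = 0.3                              # bias term

    z = np.dot(w, x) + b                 # Z = Σ wi·xi + b
    print(relu(z))                       # the neuron's output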

SIMPLE ARCHITECTURE OF A NEURAL NET


• A simple neural net consists of 3 important layers :
1. Input layer
2. Hidden layer
3. Output layer
• The input layer accepts the features; each neuron in the next layer takes the weighted sum of its inputs, passes it through the activation function to produce a single output, and passes that output on to all the neurons in the subsequent layer, and this process continues until the output layer.
• The hidden layers are the layers between the input and output layers.
• A NN always flows from left to right, and as we go from left to right the model gets deeper into the intricacies of the features, i.e. it becomes more complex.
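A minimal NumPy sketch of one such left-to-right (forward) pass; the layer sizes and random weights are illustrative:

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    rng = np.random.default_rng(0)
    x = np.array([0.5, -1.0, 2.0])                  # input layer: 3 features
    W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)   # hidden layer: 4 neurons
    W2, b2 = rng.normal(size=(1, 4)), np.zeros(1)   # output layer: 1 neuron

    h = sigmoid(W1 @ x + b1)             # each hidden neuron: weighted sum + activation
    y_hat = sigmoid(W2 @ h + b2)         # the output layer receives the hidden outputs
    print(y_hat)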
 SOME IMPORTANT ARCHITECTURES OF NNs are:
1. CNNs (Convolutional Neural Networks)
2. RNNs (Recurrent Neural Networks)
3. Transformers
4. Auto-Encoders
5. GANs (Generative Adversarial Networks)
THANK YOU

Praneetha.G
[email protected]
