Machine Learning (Part 1)
IYKRA DATA FELLOWSHIP BATCH 3
Outline
• Introduction to Machine
Learning
• Regression
• Linear Regression
• Classification
• Logistic Regression
• Naïve Bayes
• Support Vector Machine
• K-Nearest Neighbours
• Decision Tree
• Random Forest
[Machine Learning is the] field of
study that gives computers the
ability to learn without being
explicitly programmed.
SIMPLE LINEAR REGRESSION
This is the most basic regression model, in which predictions are formed from a single, univariate feature of the data.

MULTIPLE LINEAR REGRESSION
As the name implies, in this regression model the predictions are formed from multiple features of the data.
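The simple (univariate) case above can be sketched in a few lines using the closed-form least-squares solution; the function name and toy data below are illustrative, not from the slides.

```python
# Minimal sketch of simple linear regression via closed-form least squares.
def fit_simple_linear(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]                    # generated from y = 2x + 1
slope, intercept = fit_simple_linear(xs, ys)
print(slope, intercept)                  # → 2.0 1.0
```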
Applications
❖Forecasting or Predictive Analysis
❖Optimization
❖Error Correction
❖Economics
❖Finance
Gradient Descent and Cost
Function
Gradient descent is an optimization
algorithm used to minimize some function
by iteratively moving in the direction of
steepest descent as defined by the negative
of the gradient. In machine learning, we use
gradient descent to update
the parameters of our model. Parameters
refer to coefficients in Linear
Regression and weights in neural networks.
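The update rule described above can be sketched for the mean-squared-error cost of simple linear regression; the function name, learning rate, and toy data are illustrative assumptions.

```python
# Gradient descent on the MSE cost of simple linear regression.
def gradient_descent(xs, ys, lr=0.05, steps=2000):
    w, b = 0.0, 0.0                      # parameters: coefficient and intercept
    n = len(xs)
    for _ in range(steps):
        # Gradient of (1/n) * sum((w*x + b - y)^2) w.r.t. w and b
        grad_w = (2 / n) * sum((w * x + b - y) * x for x, y in zip(xs, ys))
        grad_b = (2 / n) * sum((w * x + b - y) for x, y in zip(xs, ys))
        w -= lr * grad_w                 # step in the direction of steepest descent
        b -= lr * grad_b
    return w, b

xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]                    # generated from y = 2x + 1
w, b = gradient_descent(xs, ys)
print(round(w, 3), round(b, 3))          # converges toward w = 2, b = 1
```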
Logistic
Regression
Logistic regression is a
supervised learning
classification algorithm used to
predict the probability of a
target variable. The nature of
target or dependent variable is
dichotomous, which means
there would be only two
possible classes.
In simple words, the dependent
variable is binary in nature
having data coded as either 1
(stands for success/yes) or 0
(stands for failure/no).
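A minimal sketch of the idea on one feature, assuming a simple stochastic gradient update of the log-likelihood (the names and toy data below are illustrative):

```python
import math

# Logistic regression on a single feature, trained by stochastic gradient
# ascent on the log-likelihood.
def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.1, epochs=2000):
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            p = sigmoid(w * x + b)       # predicted probability of class 1
            w += lr * (y - p) * x        # gradient of the log-likelihood
            b += lr * (y - p)
    return w, b

xs = [0, 1, 2, 3, 4, 5]
ys = [0, 0, 0, 1, 1, 1]                  # dichotomous target: 0 = no, 1 = yes
w, b = fit_logistic(xs, ys)
print(sigmoid(w * 0 + b) < 0.5, sigmoid(w * 5 + b) > 0.5)   # → True True
```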
Types of Logistic Regression
BINARY OR BINOMIAL
The target variable has only two possible categories, e.g. 1/0 or yes/no.

MULTINOMIAL
The target variable has three or more possible categories with no natural ordering.

ORDINAL
The target variable has three or more categories with a natural ordering, such as low, medium, high.
Types of Naïve Bayes

GAUSSIAN
It is the simplest Naïve Bayes classifier, with the assumption that the data from each label is drawn from a simple Gaussian distribution.

MULTINOMIAL
The features are assumed to be drawn from a simple multinomial distribution.

BERNOULLI
Another important model is Bernoulli Naïve Bayes, in which features are assumed to be binary (0s and 1s).
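The Gaussian variant can be sketched on one feature: summarise each class by the mean and variance of its values, then predict the class with the highest log posterior. The names and toy data below are illustrative.

```python
import math

# Gaussian Naive Bayes on a single feature.
def gaussian_log_pdf(x, mean, var):
    return -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)

def fit_gnb(values, labels):
    model = {}
    for c in set(labels):
        vals = [v for v, l in zip(values, labels) if l == c]
        mean = sum(vals) / len(vals)
        var = sum((v - mean) ** 2 for v in vals) / len(vals)
        model[c] = (mean, var, len(vals) / len(values))   # mean, var, prior
    return model

def predict_gnb(model, x):
    # Pick the class maximising log prior + log likelihood.
    return max(model, key=lambda c: math.log(model[c][2]) +
               gaussian_log_pdf(x, model[c][0], model[c][1]))

values = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
labels = ['a', 'a', 'a', 'b', 'b', 'b']
model = fit_gnb(values, labels)
print(predict_gnb(model, 1.1), predict_gnb(model, 5.1))   # → a b
```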
Pros and Cons of Naïve Bayes
Algorithm
PROS
➢ Multi-class prediction
➢ Text classification
➢ Recommendation systems

CONS
➢ Assumes all features are independent, which rarely holds in real data
➢ A category unseen in the training data is assigned zero probability (the "zero-frequency" problem)
Support Vector
Machine
A set of supervised learning
methods which learn from
the dataset and can be used
for both regression and
classification
Working of SVM
• Support Vectors − Data points that are closest to the
hyperplane are called support vectors. The separating line
is defined with the help of these data points.
• Hyperplane − The decision plane or boundary that
divides a set of objects having different classes.
• Margin − The gap between the two lines on the closest
data points of different classes. It can be calculated as
the perpendicular distance from the line to the support
vectors. A large margin is considered a good margin and
a small margin is considered a bad margin.
Kernels
The kernel method is used by SVM to
perform non-linear classification.
Kernels take a low-dimensional input
space and convert it into a higher-
dimensional space, turning non-
separable classes into separable ones:
they find a way to separate the data on
the basis of the labels we define.
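The lifting idea can be shown with an explicit feature map rather than a full SVM solver (a hedged illustration; the data and threshold are made up): points labelled by whether |x| < 0.5 are not linearly separable in one dimension, but become separable after mapping x → (x, x²).

```python
# Kernel idea via an explicit feature map: lift 1-D points to 2-D, where a
# single hyperplane (a threshold on x^2) separates the classes.
xs = [-0.9, -0.6, -0.2, 0.1, 0.4, 0.8]
labels = [1 if abs(x) < 0.5 else 0 for x in xs]   # not separable by one cut in 1-D

phi = [(x, x * x) for x in xs]                     # lift to higher-dimensional space
preds = [1 if x2 < 0.25 else 0 for _, x2 in phi]   # hyperplane: x^2 = 0.25
print(preds == labels)                             # → True
```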
Pros and Cons associated with
SVM
PROS
➢ It works really well with a clear margin of separation.
➢ It is effective in high-dimensional spaces.
➢ It is effective in cases where the number of dimensions is greater than the number of samples.
➢ It uses a subset of training points in the decision function (called support vectors), so it is also memory efficient.

CONS
➢ It doesn't perform well on large datasets, because the required training time is higher.
➢ It also doesn't perform very well when the dataset has more noise, i.e. the target classes are overlapping.
➢ SVM doesn't directly provide probability estimates; these are calculated using an expensive five-fold cross-validation, included in the related SVC method of the Python scikit-learn library.
K-Nearest
Neighbours
Works by finding the
distances between a query
and all the examples in the
data, selecting the specified
number of examples (K)
closest to the query, then
voting for the most frequent
label (in the case of
classification) or averaging
the labels (in the case of
regression).
Working of KNN
1. Load the dataset
2. Choose a value of K
3. Compute the distance between the query and every example in the data
4. Select the K examples closest to the query
5. Take the most frequent label (classification) or the average of the labels (regression)

Pros and Cons of KNN

PROS
➢ It is very useful for nonlinear data, because this algorithm makes no assumption about the data.

CONS
➢ High memory storage is required compared to other supervised learning algorithms.
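The steps above can be sketched on one numeric feature (the function name and toy data are illustrative):

```python
from collections import Counter

# KNN classification: measure distances, keep the K closest training
# examples, and vote for the most frequent label.
def knn_predict(train, query, k=3):
    by_dist = sorted(train, key=lambda pl: abs(pl[0] - query))   # distances
    votes = Counter(label for _, label in by_dist[:k])           # K closest
    return votes.most_common(1)[0][0]                            # majority vote

train = [(1.0, 'a'), (1.2, 'a'), (0.9, 'a'),
         (4.0, 'b'), (4.3, 'b'), (3.8, 'b')]
print(knn_predict(train, 1.1), knn_predict(train, 4.1))   # → a b
```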
Types of Decision Tree

BINARY VARIABLE
A Decision Tree that has a binary target variable is called a Binary Variable Decision Tree.

CONTINUOUS VARIABLE
A Decision Tree that has a continuous target variable is called a Continuous Variable Decision Tree.
Advantages and
Disadvantages of Decision
Tree
ADVANTAGES
• Easy to understand
• Useful in data exploration
• Less data cleaning required

DISADVANTAGES
• Prone to overfitting
• Not fit for continuous variables
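The smallest possible decision tree, a one-level "stump" on a single numeric feature, can be sketched as follows; the names and toy data are illustrative assumptions, and real trees grow such splits recursively.

```python
# A one-level decision tree ("stump"): try every candidate threshold and
# keep the split with the fewest misclassifications.
def fit_stump(values, labels):
    best = None
    for t in sorted(set(values)):                 # candidate split points
        for left, right in ((0, 1), (1, 0)):      # label assignment per side
            preds = [left if v <= t else right for v in values]
            errors = sum(p != y for p, y in zip(preds, labels))
            if best is None or errors < best[0]:
                best = (errors, t, left, right)
    return best[1:]                               # (threshold, left, right)

def predict_stump(stump, v):
    t, left, right = stump
    return left if v <= t else right

values = [1, 2, 3, 10, 11, 12]
labels = [0, 0, 0, 1, 1, 1]                       # binary target variable
stump = fit_stump(values, labels)
print(predict_stump(stump, 2), predict_stump(stump, 11))   # → 0 1
```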