
MACHINE LEARNING ALGORITHM
Unit-II

Linear regression
Before we start linear regression we have to perform cleaning and initial data analysis:

● Look at the summary of numerical variables.
● See the distribution of variables.
● Look for possible correlations.
● Explore any possible outliers.
● Look for data errors with data sanity checks.
● Make sure data types are correct.
Country  Age  Salary  Purchased
France   44   72000   No
Spain    27   48000   Yes
Germany  30   54000   No
Spain    38   61000   No
Germany  40           Yes
France   35   58000   Yes
Spain         52000   No
France   48   79000   Yes
Germany  50   83000   No
France   37   67000   Yes

(The missing Age and Salary values are the kind of data errors the cleaning steps above should catch.)
Linear regression

• Regression simply gives us the linear relationship of two or more variables within a dataset.
• We have a dependent variable and independent variables.

Country  Age  Salary  Purchased
France   44   72000   No
Spain    27   48000   Yes
Germany  30   54000   No
Spain    38   61000   No
Germany  40   56400   Yes
France   35   58000   Yes
Spain    50   52000   No
France   48   79000   Yes
Germany  50   83000   No
France   37   67000   Yes

• A linear relationship between variables means that when the value of one or more independent variables changes (increases or decreases), the value of the dependent variable will also change accordingly (increase or decrease).
• Mathematically, the relationship can be represented with the help of the following equation:
Y = mX + c
• Here, Y is the dependent variable we are trying to predict.
• X is the independent variable we are using to make predictions.
• m is the slope of the regression line, which represents the effect X has on Y.
• c is a constant (the intercept).
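The slope m and intercept c can be estimated with the standard least-squares formulas. A minimal sketch (the data points below are made up so that y = 2x + 1 exactly; the function name is my own):

```python
# Least-squares fit of y = m*x + c (simple linear regression).
def fit_line(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # m = covariance(x, y) / variance(x)
    m = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) \
        / sum((x - mean_x) ** 2 for x in xs)
    c = mean_y - m * mean_x          # the fitted line passes through the means
    return m, c

m, c = fit_line([1, 2, 3, 4, 5], [3, 5, 7, 9, 11])
print(m, c)   # -> 2.0 1.0
```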

Linear regression
A linear regression model represents the linear relationship between a dependent variable and independent variable(s) via a sloped straight line.

Types of Linear regression

Positive Linear Relationship
A linear relationship is called positive if both the independent and dependent variables increase together.
Types of Linear regression

Negative Linear Relationship
A linear relationship is called negative if the independent variable increases while the dependent variable decreases.
Simple Linear Regression (SLR)

● It is the most basic version of linear regression, which predicts a response using a single feature. The assumption in SLR is that the two variables are linearly related.
● In simple linear regression, the dependent variable depends only on a single independent variable.
● For simple linear regression, the form of the model is:
Y = β0 + β1X
Here,
• Y is the dependent variable.
• X is the independent variable.
• β0 and β1 are the regression coefficients.
• β0 is the intercept point.
• β1 is the slope of the line.
Simple Linear Regression
• There are the following 3 possible cases:

Case-01: β1 < 0
• It indicates that variable X has a negative impact on Y.
• If X increases, Y will decrease and vice-versa.
Simple Linear Regression

Case-02: β1 = 0
• It indicates that variable X has no impact on Y.
• If X changes, there will be no change in Y.
Simple Linear Regression

Case-03: β1 > 0
• It indicates that variable X has a positive impact on Y.
• If X increases, Y will increase and vice-versa.
• Ex: the weight of a person depends on the height of the person.
Multiple Linear Regression
• In multiple linear regression, the dependent variable depends on more than one independent variable.
• For multiple linear regression, the form of the model is:
Y = β0 + β1X1 + β2X2 + β3X3 + …… + βnXn
Here,
• Y is the dependent variable.
• X1, X2, …, Xn are independent variables.
• β0, β1, …, βn are the regression coefficients.
• βj (1 <= j <= n) is the slope or weight that specifies the factor by which Xj has an impact on Y.
Multiple Linear Regression
Y = β0 + β1X1 + β2X2 + β3X3 + …… + βnXn
Exp: the price of a flat depends on the size of the flat, floor, location, modular kitchen, etc.
Y = 0.9 + 1.2·X1 + 2·X2 + 4·X3 + 1·X4
This indicates which factor is more important for predicting the price of the flat (Y).
Multiple Linear Regression
Y = β0 + β1X1 + β2X2 + β3X3 + …… + βnXn

Let X3 be the most important factor for this prediction, so we keep the regression coefficient value 4 for X3.
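The flat-price equation above can be read directly as code. A small sketch (only the coefficients 0.9, 1.2, 2, 4 and 1 come from the slide; the feature values passed in are hypothetical):

```python
# Y = 0.9 + 1.2*X1 + 2*X2 + 4*X3 + 1*X4  (flat-price example)
def predict_price(x1, x2, x3, x4):
    return 0.9 + 1.2 * x1 + 2 * x2 + 4 * x3 + 1 * x4

# X3 has the largest coefficient (4), so a one-unit change in X3
# moves the predicted price more than a one-unit change in any
# other single feature.
print(round(predict_price(1, 1, 1, 1), 2))   # -> 9.1
```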
Linear regression
Example: on the basis of the size of a house, predict the selling price of the house.
Polynomial Regression

● What if your linear regression model cannot model the relationship between the target variable and the predictor variable?
● In other words, what if they don't have a linear relationship?
● Polynomial regression is a special case of linear regression where we fit a polynomial equation on data with a curvilinear relationship between the target variable and the independent variables.
● Polynomial Regression is a form of linear regression in which the relationship between the independent variable x and dependent variable y is modeled as an nth degree polynomial.

Polynomial Regression
● Polynomial regression fits a nonlinear relationship between the value of x and the corresponding conditional mean of y, denoted E(y | x).
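A hedged sketch of polynomial regression using NumPy's polyfit (assuming NumPy is available; the data below is generated from y = x² + 1 purely for illustration):

```python
import numpy as np

# Fit an n-th degree polynomial (here n = 2) by least squares.
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0, 3.0])
y = x ** 2 + 1                      # curvilinear target

coeffs = np.polyfit(x, y, deg=2)    # highest power first: [a2, a1, a0]
model = np.poly1d(coeffs)

print(np.round(coeffs, 6))          # recovers roughly [1, 0, 1]
print(round(float(model(4.0)), 6))  # prediction near 17.0
```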

Finding the minimum
[Figure: loss plotted as a function of w]
How can we do this for a function?
One approach: gradient descent
Gradient descent is an optimization algorithm used to find the values of the parameters (coefficients) of a function (f) that minimize a cost function (cost).
Gradient descent is best used when the parameters cannot be calculated analytically (e.g. using linear algebra) and must be searched for by an optimization algorithm.
One approach: gradient descent

Partial derivatives give us the slope (i.e. the direction to move) in that dimension.
[Figure: loss plotted as a function of w]
One approach: gradient descent

Approach:
• pick a starting point (w)
• repeat:
  • pick a dimension
  • move a small amount in that dimension towards decreasing loss (using the derivative)
Gradient Descent Procedure

● The derivative of the cost is calculated.
● The derivative is a concept from calculus and refers to the slope of the function at a given point.
● We need to know the slope so that we know the direction (sign) to move the coefficient values in order to get a lower cost on the next iteration.
delta = derivative(cost)
Gradient descent
○ pick a starting point (w)
○ repeat until loss doesn't decrease in all dimensions:
  ■ pick a dimension
  ■ move a small amount in that dimension towards decreasing loss (using the derivative)

w_new = w_old − η · dL/dw

What does this do?

Gradient descent
○ pick a starting point (w)
○ repeat until loss doesn't decrease in all dimensions:
  ■ pick a dimension
  ■ move a small amount in that dimension towards decreasing loss (using the derivative)

w_j = w_j − η · d loss(w)/dw_j

η is the learning rate (how much we want to move in the error direction; often this will change over time).
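The update rule above can be sketched directly in code. As an illustration (the quadratic loss L(w) = (w − 3)² and the learning rate 0.1 are made-up choices, not from the slides):

```python
# Gradient descent on L(w) = (w - 3)^2, whose minimum is at w = 3.
def d_loss(w):
    return 2 * (w - 3)         # derivative dL/dw

w = 0.0                        # pick a starting point
eta = 0.1                      # learning rate
for _ in range(100):           # repeat: step against the gradient
    w = w - eta * d_loss(w)    # w_new = w_old - eta * dL/dw

print(round(w, 4))             # -> 3.0 (converged to the minimum)
```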

Decision trees

● In decision analysis, a decision tree can be used to visually and explicitly represent decisions and decision making.
● Decision trees can be constructed by an algorithmic approach that can split the dataset in different ways based on different conditions.
● Decision trees are among the most powerful algorithms that fall under the category of supervised algorithms.
● The two main entities of a tree are decision nodes, where the data is split, and leaves, where we get the outcome.
● An example is a binary tree for predicting whether a person will get a loan or not based on his income, credit score, etc.
Before we start to design a decision tree, the following four steps are important:
1. Find the target / class attribute (Profit).
2. Find the Information Gain of the target attribute.
3. Find the Entropy of each attribute (for deciding the root of the tree).
4. At the end, find the Gain of each attribute.
DataSet

Sr. No  Age  Competition  Type  Profit
1       Old  Yes          SW    Down
2       Old  No           SW    Down
3       Old  No           HW    Down
4       Mid  Yes          SW    Down
5       Mid  Yes          HW    Down
6       Mid  No           HW    Up
7       Mid  No           SW    Up
8       New  Yes          SW    Up
9       New  No           HW    Up
10      New  No           SW    Up

● Now, find the Information Gain of the target attribute:

IG = − [ P/(P+N) · log2( P/(P+N) ) + N/(P+N) · log2( N/(P+N) ) ]

where P = number of Down records and N = number of Up records.

● Then find the Entropy of a given attribute, E(A): for each value of the attribute, take the information gain of that value's subset multiplied by the probability of that value, and sum over all values.

● Finally, find the Gain for each attribute (here for 3 attributes); the attribute whose Gain is greatest is called the root of the tree.

Gain = IG − E(A)

Implementing Decision Tree Algorithm
Now, find the Information Gain of the target attribute. The dataset has P = 5 records with Profit = Down and N = 5 with Profit = Up, so:

IG = − [ 5/10 · log2(5/10) + 5/10 · log2(5/10) ] = 1
Decision Tree
Frequency table for Age:

Age  Down  Up
Old   3    0
Mid   2    2
New   0    3
Now, find the Entropy of each attribute. Let's start with attribute Age. Using the frequency counts (Old: 3 Down / 0 Up; Mid: 2 Down / 2 Up; New: 0 Down / 3 Up):

E(Age) = 3/10 · I(3,0) + 4/10 · I(2,2) + 3/10 · I(0,3)
       = 0.3 · 0 + 0.4 · 1 + 0.3 · 0
       = 0.4

Decision Tree
Now, find the Gain of all attributes, where Gain(A) = IG − E(A):

Gain(Age) = IG − E(Age) = 1 − 0.4 = 0.6
Gain(Competition) = IG − E(Competition) = 0.124
Gain(Type) = IG − E(Type) = 0

Gain(Age) is the greatest, so Age is selected as the root of the tree.
Decision Tree
The resulting tree:

Age
  Old → Down
  Mid → Competition
          Yes → Down
          No  → Up
  New → Up
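The whole entropy/gain calculation above can be checked with a short script. A sketch (the helper names and column indices are my own; the dataset is the one from the slides):

```python
import math

# ID3-style gain calculation for the Age/Competition/Type dataset.
data = [  # (Age, Competition, Type, Profit)
    ("Old", "Yes", "SW", "Down"), ("Old", "No", "SW", "Down"),
    ("Old", "No", "HW", "Down"),  ("Mid", "Yes", "SW", "Down"),
    ("Mid", "Yes", "HW", "Down"), ("Mid", "No", "HW", "Up"),
    ("Mid", "No", "SW", "Up"),    ("New", "Yes", "SW", "Up"),
    ("New", "No", "HW", "Up"),    ("New", "No", "SW", "Up"),
]

def info(p, n):
    """I(p, n) = -(p/(p+n))log2(p/(p+n)) - (n/(p+n))log2(n/(p+n))."""
    total, result = p + n, 0.0
    for count in (p, n):
        if count:
            frac = count / total
            result -= frac * math.log2(frac)
    return result

P = sum(1 for r in data if r[3] == "Down")
N = len(data) - P
IG = info(P, N)                      # Information Gain of the target

def gain(col):
    e = 0.0                          # E(attribute): weighted entropy
    for value in {r[col] for r in data}:
        rows = [r for r in data if r[col] == value]
        pv = sum(1 for r in rows if r[3] == "Down")
        e += len(rows) / len(data) * info(pv, len(rows) - pv)
    return IG - e

print(IG)                 # 1.0  (5 Down, 5 Up)
print(gain(0))            # Gain(Age) = 0.6
print(round(gain(1), 3))  # Gain(Competition) ~ 0.125 (slides round to 0.124)
print(gain(2))            # Gain(Type) = 0.0
```

Age has the greatest gain, confirming it as the root.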

Overfitting

01 Overfitting: too much data is given to the machine, so that it becomes confused in things!
02 Underfitting: so little data is given to the machine that it is not able to understand things.
Overfitting
"Overfitting": a hypothesis h is said to overfit the training data if there is another hypothesis h', such that h' has more error than h on the training data but h' has less error than h on the test data.
Exp: if a small decision tree has higher error on training data and lower error on test data compared to a larger decision tree, which has smaller error on training data and higher error on test data, then we say that overfitting has occurred.
Overfitting
Overfitting: when the model is too complex. Here, your model is trying to cover every data point of the output on the X, Y plot.
Underfitting
Underfitting: when the model is too simple. Here, your model is trying to cover very few data points of the output on the X, Y plot.
Underfitting
Example: let us consider that you have to train your model so that if any object is sphere in shape, it calls it a ball.

Sphere → It's a Ball

Here, we are providing only one attribute to identify the object, i.e. Shape = Sphere.
Example: let us consider that you provide a large number of attributes, like:

Sphere
Play
Not Eat
Radius = 5 cm

Here, we are providing lots of attributes to identify the object.

Instance Based Learning / Lazy Algorithm
K Nearest Neighbor Algorithm

● The KNN algorithm can be used for both classification and regression predictive problems. However, it is more widely used for classification problems in industry.
● The KNN algorithm at the training phase just stores the dataset, and when it gets new data it classifies that data into the category most similar to the new data.
● KNN works by finding the distances between a query and all the examples in the data, selecting the specified number of examples (K) closest to the query, then voting for the most frequent label (in the case of classification) or averaging the labels (in the case of regression).

Instance Based Learning / Lazy Algorithm
K Nearest Neighbor Algorithm

Let's take a simple case to understand this algorithm. Following is a spread of red circles (RC) and green squares (GS).
Instance Based Learning / Lazy Algorithm
K Nearest Neighbor Algorithm

● We want to find out the class of the blue star (BS).
● BS can either be RC or GS and nothing else. The "K" in the KNN algorithm is the number of nearest neighbors we wish to take the vote from.
● Let's say K = 3. Hence, we will now make a circle with BS as the center just big enough to enclose only three data points on the plane.
● Refer to the following diagram for more details:
● The three closest points to BS are all RC.
● Hence, with a good confidence level, we can say that BS should belong to the class RC.
● Here, the choice became very obvious as all three votes from the closest neighbors went to RC.

Instance Based Learning / Lazy Algorithm
K Nearest Neighbor Algorithm

● To find the distance between these values, we use the Euclidean distance:

d = √( |x_o1 − x_a1|² + |x_o2 − x_a2|² )

where x_o is the observed (query) value and x_a is the actual (stored) value.

● Let us take another example.
● Query: X = (Math = 6, Comp Sci = 8). Is the student Pass or Fail?
● Here we take K = 3 (any value of K may be chosen) to find the nearest neighbors.

Sr. No  Math  Comp Sci  Result
1       4     3         Fail
2       6     7         Pass
3       7     8         Pass
4       5     5         Fail
5       8     8         Pass
X       6     8         ????

Instance Based Learning / Lazy Algorithm
K Nearest Neighbor Algorithm

d1 = √( |6 − 4|² + |8 − 3|² ) = √29 = 5.38
d2 = √( |6 − 6|² + |8 − 7|² ) = 1
d3 = √( |6 − 7|² + |8 − 8|² ) = 1
d4 = √( |6 − 5|² + |8 − 5|² ) = √10 = 3.16
d5 = √( |6 − 8|² + |8 − 8|² ) = 2

Instance Based Learning / Lazy Algorithm
K Nearest Neighbor Algorithm

The three nearest neighbors have distances (1, 1, 2), i.e. records 2, 3 and 5:

Sr. No  Math  Comp Sci  Result
2       6     7         Pass
3       7     8         Pass
5       8     8         Pass

All three votes are Pass (3 Pass vs 0 Fail, and 3 > 0), so X = (6, 8) is classified as Pass.
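The worked example above can be reproduced in a few lines. A sketch (the function and variable names are my own; `math.dist` computes the Euclidean distance):

```python
import math
from collections import Counter

# 3-NN classification of the student query X = (Math=6, Comp Sci=8).
train = [  # (Math, Comp Sci, Result)
    (4, 3, "Fail"), (6, 7, "Pass"), (7, 8, "Pass"),
    (5, 5, "Fail"), (8, 8, "Pass"),
]

def knn_predict(query, rows, k=3):
    # Sort records by Euclidean distance to the query...
    nearest = sorted(rows, key=lambda r: math.dist(query, r[:2]))[:k]
    # ...then take a majority vote among the k closest labels.
    return Counter(r[2] for r in nearest).most_common(1)[0][0]

print(knn_predict((6, 8), train))   # -> Pass (neighbors 2, 3 and 5)
```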

Instance Based Learning / Lazy Algorithm
K Nearest Neighbor Algorithm

Temp (X) in °C   Humidity (Y) %   Rain Condition
27.8             76               Yes
28.2             76               Yes
28.7             80               No
28.6             81.6             Yes
27.7             89.4             Yes
30.5             89.9             No
26.7             81.4             Yes
25.9             85               No
36               90               No
31.8             88               Yes
35.7             70               No

Using the KNN algorithm, find the Rain Condition when Temp = 29.6 °C and Humidity = 78%. Let K = 3.
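One way to answer this exercise is the same majority-vote sketch as before (assuming plain Euclidean distance on the raw units, with no feature scaling):

```python
import math
from collections import Counter

data = [  # (Temp in C, Humidity %, Rain Condition)
    (27.8, 76, "Yes"), (28.2, 76, "Yes"), (28.7, 80, "No"),
    (28.6, 81.6, "Yes"), (27.7, 89.4, "Yes"), (30.5, 89.9, "No"),
    (26.7, 81.4, "Yes"), (25.9, 85, "No"), (36, 90, "No"),
    (31.8, 88, "Yes"), (35.7, 70, "No"),
]

query = (29.6, 78)
# The 3 records closest to the query...
nearest = sorted(data, key=lambda r: math.dist(query, r[:2]))[:3]
# ...vote on the Rain Condition.
prediction = Counter(r[2] for r in nearest).most_common(1)[0][0]
print([r[2] for r in nearest], prediction)   # 2 Yes vs 1 No -> Yes
```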

KNN
01 What is KNN?
02 Why do we need KNN?
03 How does the KNN algorithm work?
04 How do we choose the factor 'K'?
05 When do we use KNN?
06 Use case: predict whether a person will have diabetes or not.
Why KNN?
What is KNN?
The KNN algorithm assumes that similar things exist in close proximity. In other words, similar things are near to each other.

How do we choose 'k'?
