UNIT 2
ALGORITHM
Linear Regression
Before we start Linear Regression, we have to perform cleaning and
initial data analysis.
• Linear regression is simply the linear relationship of two or more
variables within a dataset.
• We have a dependent variable and independent variables.

Sample dataset:

Country  Age  Salary  Purchased
France   44   72000   No
Spain    27   48000   Yes
Germany  30   54000   No
Spain    38   61000   No
Germany  40   56400   Yes
France   35   58000   Yes
Spain    50   52000   No
France   48   79000   Yes
Germany  50   83000   No
France   37   67000   Yes
• Linear relationship between variables means that when the value of one
or more independent variables will change (increase or decrease), the
value of dependent variable will also change accordingly (increase or
decrease).
• Mathematically, the relationship can be represented with the help of
the following equation:
    Y = mX + c
• Here, Y is the dependent variable we are trying to predict.
• X is the independent variable we are using to make predictions.
• m is the slope of the regression line, which represents the effect X
has on Y.
• c is a constant (the intercept).
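As a quick illustration of fitting Y = mX + c, here is a minimal sketch using NumPy's least-squares polynomial fit; the data points below are invented for this example:

```python
# A minimal sketch of fitting Y = mX + c by least squares with NumPy.
# The data values are made up for illustration only.
import numpy as np

X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([3.1, 4.9, 7.2, 9.1, 10.8])  # roughly Y = 2X + 1

# np.polyfit with degree 1 returns [m, c] for the best-fit line
m, c = np.polyfit(X, Y, 1)
print(f"slope m = {m:.2f}, intercept c = {c:.2f}")
```

The fitted slope m and intercept c are the values that minimize the squared prediction error over the sample.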
Linear Regression
A linear regression model represents the linear relationship between a
dependent variable and independent variable(s) via a sloped straight
line.
Types of Linear Relationship

Positive Linear Relationship
A linear relationship is called positive if the dependent variable
increases as the independent variable increases.

Negative Linear Relationship
A linear relationship is called negative if the dependent variable
decreases as the independent variable increases.
Simple Linear Regression (SLR)

Case-02: β1 = 0
• It indicates that variable X has no impact on Y.
• If X changes, there will be no change in Y.

Case-03: β1 > 0
• It indicates that variable X has a positive impact on Y.
• If X increases, Y will increase, and vice versa.
• Example: the weight of a person depends on the height of the person.
Multiple LinearRegression
• In multiple linear regression, the dependent variable
depends on more than one independent variables. • For
multiple linear regression, the form of the model is Y = β0
+ β1X1 + β2X2 + β3X3 + …… + βnXn Here,
• Y is a dependent variable.
• X1, X2, …., Xn are independent variables. • β0, β1,…,
βn are the regression coefficients • βj (1<=j<=n) is the
slope or weight that specifies the factor by which Xj has
an impact on Y.
Multiple Linear Regression
Y = β0 + β1X1 + β2X2 + β3X3 + …… + βnXn
Example: the price of a flat depends on the size of the flat, the
floor, the location, a modular kitchen, etc.
Y = 0.9 + 1.2·X1 + 2·X2 + 4·X3 + 1·X4
The coefficients indicate which factor is more important for predicting
the price of the flat (Y).
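The flat-price model above can be evaluated directly once the coefficients are known. A minimal sketch, where the feature values (size, floor, location score, kitchen flag) are hypothetical and chosen only to show the computation:

```python
# Evaluating the slide's example model Y = 0.9 + 1.2·X1 + 2·X2 + 4·X3 + 1·X4.
# The feature values in x are hypothetical, for illustration only.
import numpy as np

beta = np.array([0.9, 1.2, 2.0, 4.0, 1.0])  # [β0, β1, β2, β3, β4]
x = np.array([1.0, 50.0, 3.0, 8.0, 1.0])    # [1, X1, X2, X3, X4]

Y = beta @ x  # β0·1 + β1·X1 + β2·X2 + β3·X3 + β4·X4
print(Y)
```

Note the leading 1 in x, which pairs with the intercept β0 so the whole model is a single dot product.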
● What if your linear regression model cannot model the relationship
between the target variable and the predictor variable?
Polynomial Regression
● Polynomial regression fits a nonlinear relationship between the value
of x and the corresponding conditional mean of y, denoted E(y|x).
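A minimal polynomial-regression sketch with NumPy: the data below is constructed (exactly quadratic) so that a straight line cannot model it well but a degree-2 fit can.

```python
# Fit a degree-2 polynomial to data a straight line cannot capture.
# The data points are made up for illustration.
import numpy as np

x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
y = x**2 + 1.0                    # exactly quadratic: y = x^2 + 1

coeffs = np.polyfit(x, y, 2)      # [a, b, c] for y = a·x^2 + b·x + c
y_hat = np.polyval(coeffs, 1.5)   # estimated E(y | x = 1.5)
print(coeffs, y_hat)
```

Under the hood this is still linear regression: the model is linear in the coefficients, with x and x² as features.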
Finding the Minimum Loss
How can we find the minimum of a loss function over the weights w?
One approach: gradient descent.
Gradient descent is an optimization algorithm
used to find the values of the parameters
(coefficients) of a function (f) that minimize a
cost function (cost).
Gradient descent is best used when the
parameters cannot be calculated analytically
(e.g. using linear algebra) and must be searched
for by an optimization algorithm.
One Approach: Gradient Descent
• pick a starting point (w)
• repeat:
  • pick a dimension
  • move a small amount in that dimension towards decreasing loss
    (using the derivative)

Gradient Descent Procedure
The weight update in each step is
    wn = wo − η · (dL/dw)
where wn is the new weight, wo is the old weight, η is the learning
rate, and dL/dw is the derivative of the loss. Applied per dimension j:
    wj = wj − η · (d/dwj) loss(w)
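The update rule above can be sketched on a toy loss L(w) = (w − 3)², whose minimum is at w = 3; the learning rate η = 0.1 and the starting point are arbitrary choices for illustration:

```python
# Gradient descent with the update rule w_new = w_old − η · dL/dw,
# applied to the toy loss L(w) = (w − 3)^2, which is minimized at w = 3.
def dL_dw(w):
    return 2.0 * (w - 3.0)  # derivative of (w − 3)^2

w = 0.0    # starting point
eta = 0.1  # learning rate η (arbitrary small step size)
for _ in range(100):
    w = w - eta * dL_dw(w)  # move a small amount against the gradient

print(w)  # converges towards 3
```

Each step moves w a fraction of the way towards the minimum; too large an η can overshoot and diverge, which is why the step size is kept small.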
Decision Trees
Dataset

Sr.No  Age  Competition  Type  Profit
1      Old  Yes          SW    Down
2      Old  No           SW    Down
3      Old  No           HW    Down
6      Mid  No           HW    Up
7      Mid  No           SW    Up
8      New  Yes          SW    Up
9      New  No           HW    Up
10     New  No           SW    Up
Entropy = − [ P/(P+N) · log2( P/(P+N) ) + N/(P+N) · log2( N/(P+N) ) ]
where P is the number of positive (Up) examples and N is the number of
negative (Down) examples.
● Finally, find the Gain for all attributes (here, for 3 attributes);
the attribute whose Gain is greatest should be called the Root of the
Tree.
Decision Tree
Now, find the entropy of each attribute. Let us start with the
attribute Age. Counting Down/Up outcomes for each Age value gives:

Age   Down  Up
Old   3     0
Mid   2     2
New   0     3
Decision Tree
Where Entropy(Old) = 0 (all Old rows are Down) and Entropy(New) = 0
(all New rows are Up), while Entropy(Mid) = 1 (2 Down and 2 Up).
Decision Tree (resulting tree)
Age = Old → Down
Age = Mid → Competition
              Yes → Down
              No  → Up
Age = New → Up
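The entropy and information-gain computation for the attribute Age can be sketched from the Down/Up counts in the table above (Old: 3/0, Mid: 2/2, New: 0/3); the helper function names here are our own:

```python
# Entropy and information gain for attribute Age, from the slide's counts.
import math

def entropy(p, n):
    """Entropy of a node with p positive (Up) and n negative (Down) examples."""
    total = p + n
    e = 0.0
    for count in (p, n):
        if count > 0:
            frac = count / total
            e -= frac * math.log2(frac)
    return e

counts = {"Old": (0, 3), "Mid": (2, 2), "New": (3, 0)}  # (Up, Down) per Age value
total = sum(p + n for p, n in counts.values())          # 10 examples: 5 Up, 5 Down

parent = entropy(5, 5)  # entropy of the whole dataset = 1.0
remainder = sum((p + n) / total * entropy(p, n) for p, n in counts.values())
gain = parent - remainder
print(f"Gain(Age) = {gain:.2f}")  # Age has the highest gain, so it becomes the root
```

Only the Mid branch contributes to the remainder (the Old and New branches are pure), which is why Age scores so well and is chosen as the root.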
Overfitting and Underfitting
01 Overfitting: too much data is given to the machine, so that it
becomes confused about things.
02 Underfitting: so little data is given to the machine that it is not
able to understand things.
Overfitting
"Overfitting": a hypothesis h is said to overfit the training data if
there is another hypothesis h' such that h' has more error than h on
the training data, but h' has less error than h on the test data.
Example: if we have a small decision tree that has higher error on the
training data and lower error on the test data compared to a larger
decision tree, which has smaller error on the training data and higher
error on the test data, then we say that overfitting has occurred.
Overfitting: when the model is too complex. Here, your model is trying
to cover every data point of the output on the X-Y plot.
Underfitting: when the model is too simple. Here, your model covers
only very few data points of the output on the X-Y plot.
Underfitting
Example: let us consider that you have to train your model so that if
any object looks sphere-shaped, it calls it a Ball.
Underfitting case: here we provide only one attribute to identify the
object, i.e. Shape = Sphere → "It's a Ball".
Overfitting case: now suppose we provide a large number of attributes,
like: Sphere, Play, Not Eat, Radius = 5 cm.
● The KNN algorithm can be used for both classification and regression
predictive problems. However, it is more widely used for classification
problems in industry.
● At the training phase, the KNN algorithm just stores the dataset;
when it gets new data, it classifies that data into the category most
similar to the new data.
● KNN works by finding the distances between a query and all the
examples in the data, selecting the specified number of examples (K)
closest to the query, then voting for the most frequent label (in the
case of classification) or averaging the labels (in the case of
regression).
Instance-Based Learning / Lazy Algorithm: K-Nearest Neighbor

Sr.No  Math  Comp Sci  Result
1      4     3         Fail
2      6     7         Pass
3      7     8         Pass
4      5     5         Fail
5      8     8         Pass
X      6     8         ????

● To find the distance between these values, we use the Euclidean
distance:
    d = √( (x01 − xi1)² + (x02 − xi2)² )
● Distances from X = (6, 8) to each row:
    d1 = √((6−4)² + (8−3)²) = √29 ≈ 5.39
    d2 = √((6−6)² + (8−7)²) = 1
    d3 = √((6−7)² + (8−8)²) = 1
    d4 = √((6−5)² + (8−5)²) = √10 ≈ 3.16
    d5 = √((6−8)² + (8−8)²) = 2
● With K = 3, the three nearest neighbors are rows 2, 3, and 5:

Sr.No  Math  Comp Sci  Result
2      6     7         Pass
3      7     8         Pass
5      8     8         Pass

● All three neighbors vote Pass (3 Pass, 0 Fail, and 3 > 0), so
X = (6, 8) is classified as Pass.
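The worked example above can be reproduced with a small from-scratch KNN vote; this is a minimal sketch (the helper names are our own), classifying X = (6, 8) against the five (Math, Comp Sci) rows with K = 3:

```python
# From-scratch K-nearest-neighbor classification of X = (6, 8), K = 3.
import math
from collections import Counter

data = [((4, 3), "Fail"), ((6, 7), "Pass"), ((7, 8), "Pass"),
        ((5, 5), "Fail"), ((8, 8), "Pass")]
query, K = (6, 8), 3

def euclidean(a, b):
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

# Sort the rows by distance to the query and keep the K closest
neighbors = sorted(data, key=lambda row: euclidean(row[0], query))[:K]
labels = [label for _, label in neighbors]
prediction = Counter(labels).most_common(1)[0][0]
print(prediction)  # "Pass" — rows 2, 3 and 5 are the three nearest
```

Note that no model is fitted at all: the "training" step is just storing `data`, which is why KNN is called a lazy, instance-based algorithm.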
Sample data for the use case (two numeric features and a Yes/No label):
27.8  76    Yes
28.2  76    Yes
28.7  80    No
30.5  89.9  No
25.9  85    No
36    90    No
31.8  88    Yes
35.7  70    No
KNN: Topics
• What is KNN?
• Why do we need KNN?
• When do we use KNN?
• How do we choose the factor 'K'?
• How does the KNN algorithm work?
• Use case: predict whether a person will have diabetes or not
Why KNN?
What is KNN?
The KNN algorithm assumes that similar things exist in close proximity.
In other words, similar things are near to each other.
What is the KNN Algorithm?
How do we choose 'k'?