1_logistic_regression

The document discusses the implementation of logistic regression for binary classification using Python, including data preparation, model formulation, and parameter optimization techniques such as gradient descent. It covers the sigmoid function, loss functions, and various approaches to finding model parameters, along with testing methods to ensure accuracy. The document also highlights challenges faced during implementation, such as convergence issues and the importance of selecting appropriate learning rates.

In [0]: import numpy as np

import pandas as pd
import seaborn as sns
from pylab import rcParams
import matplotlib.pyplot as plt
from collections import OrderedDict
from scipy.special import expit
import unittest

%matplotlib inline

sns.set(style='whitegrid', palette='muted', font_scale=1.5)

rcParams['figure.figsize'] = 14, 8

RANDOM_SEED = 42

np.random.seed(RANDOM_SEED)

def run_tests():
    unittest.main(argv=[''], verbosity=1, exit=False)

Your data
In [0]: data = OrderedDict(
    amount_spent=[50, 10, 20, 5, 95, 70, 100, 200, 0],
    send_discount=[0, 1, 1, 1, 0, 0, 0, 0, 1]
)

In [0]: df = pd.DataFrame.from_dict(data)
df

Out[0]:    amount_spent  send_discount
        0            50              0
        1            10              1
        2            20              1
        3             5              1
        4            95              0
        5            70              0
        6           100              0
        7           200              0
        8             0              1
In [0]: df.plot.scatter(x='amount_spent', y='send_discount', s=108, c="blue");

Making decisions with Logistic regression


Logistic regression is used for classification problems where the dependent/target variable is binary, that is, its values are either true or false. It is one of the most popular and widely used algorithms in practice.

Some examples of problems that can be solved with Logistic regression are:

Email - deciding whether it is spam or not
Online transactions - fraudulent or not
Tumor - malignant or benign
Customer upgrade - will the customer buy the premium upgrade or not

We want to predict the outcome of a variable y, such that:

y ∈ {0, 1}

where 0 is the negative class (e.g. the email is not spam) and 1 is the positive class (e.g. the email is spam).

Can't we just use Linear regression?

The response (target) variable of the Linear regression model is not restricted to the [0, 1] interval, so its output can't be interpreted directly as a class label or a probability.
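To see this on our data, here is a minimal sketch (an editor's illustration, not part of the original notebook) that fits a straight line to the df defined above using np.polyfit and inspects its predictions; for large spending amounts the line drops below 0, which makes no sense as a class label or a probability:

    # Fit a straight line to (amount_spent, send_discount) with ordinary least squares.
    # This is only an illustration of why Linear regression is a poor fit here.
    slope, intercept = np.polyfit(df['amount_spent'], df['send_discount'], deg=1)

    for amount in [0, 100, 200, 500]:
        # For the larger amounts the "prediction" falls outside the [0, 1] interval.
        print(amount, slope * amount + intercept)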

Logistic regression model


Given our problem, we want a model that uses one variable (predictor), $x_1$ (amount_spent), to predict whether or not we should send a discount to the customer:

$$h_w(x) = w_1 x_1 + w_0$$

where the coefficients $w_i$ are parameters of the model. Let the coefficient vector $W$ be:

$$W = \begin{pmatrix} w_1 \\ w_0 \end{pmatrix}$$

Then we can represent $h_w(x)$ in a more compact form:

$$h_w(x) = w^T x$$

That is the Linear regression model.

We want to build a model that outputs values between 0 and 1, so we need a hypothesis that satisfies $0 \leq h_w(x) \leq 1$. For Logistic regression, we modify the Linear regression model and introduce another function $g$:

$$h_w(x) = g(w^T x)$$

We're going to define $g$ as:

$$g(z) = \frac{1}{1 + e^{-z}}$$

where $z \in \mathbb{R}$. $g$ is also known as the sigmoid function or the logistic function. So, after substitution, we end up with the following definition for our hypothesis:

$$h_w(x) = \frac{1}{1 + e^{-w^T x}}$$

A closer look at the sigmoid function


Recall that the sigmoid function is defined as:

$$g(z) = \frac{1}{1 + e^{-z}}$$

where $z \in \mathbb{R}$. Let's translate that to a Python function:

In [0]: def sigmoid(z):
    # equivalent to 1 / (1 + np.exp(-z)); expit is SciPy's numerically stable version
    return expit(z)

In [0]: class TestSigmoid(unittest.TestCase):

    def test_at_zero(self):
        self.assertAlmostEqual(sigmoid(0), 0.5)

    def test_at_negative(self):
        self.assertAlmostEqual(sigmoid(-100), 0)

    def test_at_positive(self):
        self.assertAlmostEqual(sigmoid(100), 1)

In [0]: run_tests()

...
----------------------------------------------------------------------
Ran 3 tests in 0.006s

OK

In [0]: x = np.linspace(-10., 10., num=100)
sig = sigmoid(x)

plt.plot(x, sig, label="sigmoid")
plt.xlabel("x")
plt.ylabel("y")
plt.legend(prop={'size': 16})
plt.show()

How can we find the parameters for our model?


Let's examine some approaches to find good parameters for our model. But what does good mean in this context?

Loss function
We have a model that we can use to make decisions, but we still have to find the parameters $W$. To do that, we need an objective measurement of how good a given set of parameters is. For that purpose, we will use a loss (cost) function:
$$J(W) = \frac{1}{m} \sum_{i=1}^{m} \mathrm{Cost}(h_w(x^{(i)}), y^{(i)})$$

$$\mathrm{Cost}(h_w(x), y) = \begin{cases} -\log(h_w(x)) & \text{if } y = 1 \\ -\log(1 - h_w(x)) & \text{if } y = 0 \end{cases}$$

This is also known as the Log loss or Cross-entropy loss function.

We can compress the above function into one:

$$J(W) = \frac{1}{m} \left( -y \log(h_w) - (1 - y) \log(1 - h_w) \right)$$

where

$$h_w(x) = g(w^T x)$$

Let's implement it in Python:

In [0]: def loss(h, y):
    return (-y * np.log(h) - (1 - y) * np.log(1 - h)).mean()

In [0]: class TestLoss(unittest.TestCase):

    def test_zero_h_zero_y(self):
        self.assertLess(loss(h=0.000001, y=.000001), 0.0001)

    def test_one_h_zero_y(self):
        self.assertGreater(loss(h=0.9999, y=.000001), 9.0)

    def test_zero_h_one_y(self):
        self.assertGreater(loss(h=0.000001, y=0.9999), 9.0)

    def test_one_h_one_y(self):
        self.assertLess(loss(h=0.999999, y=0.999999), 0.0001)

In [0]: run_tests()

.......
----------------------------------------------------------------------
Ran 7 tests in 0.010s

OK

Approach #1 - I'm thinking of a number(s)


Let's think of a number to use for the coefficient $w_1$ (we'll ignore the intercept $w_0$ for now) and see how well it does.

In [0]: X = df['amount_spent'].astype('float').values
y = df['send_discount'].astype('float').values

def predict(x, w):
    return sigmoid(x * w)

def print_result(y_hat, y):
    print(f'loss: {np.round(loss(y_hat, y), 5)} predicted: {y_hat} actual: {y}')

y_hat = predict(x=X[0], w=.5)
print_result(y_hat, y[0])

loss: 25.0 predicted: 0.999999999986112 actual: 0.0

I am pretty lazy, and this approach seems like too much hard work for me.

Approach #2 - Try out many numbers


Alright, computers are pretty fast these days: 6+ core laptops are everywhere, and even your phone is pretty performant. Let's use that power for good™ and try to find the parameters by simply trying out a lot of numbers:

In [0]: for w in np.arange(-1, 1, 0.1):
    y_hat = predict(x=X[0], w=w)
    print(loss(y_hat, y[0]))

0.0
0.0
0.0
6.661338147750941e-16
9.359180097590508e-14
1.3887890837434982e-11
2.0611535832696244e-09
3.059022736706331e-07
4.539889921682063e-05
0.006715348489118056
0.6931471805599397
5.006715348489103
10.000045398900186
15.000000305680194
19.999999966169824
24.99999582410784
30.001020555434774
34.945041100449046
inf
inf
/usr/local/lib/python3.6/dist-packages/ipykernel_launcher.py:2: RuntimeWarning: divide by zero encountered in log

Approach #3 - Gradient descent


Gradient descent algorithms (yes, there are a lot of them) provide us with a way to find a minimum of some function $f$. They work by iteratively taking steps in the direction of steepest descent, defined by the negative of the gradient.

In Machine Learning, we use gradient descent algorithms to find "good" parameters for our models (Logistic Regression, Linear Regression, Neural Networks, etc...).

Somewhat deeper look into how Gradient descent works (Source: PyTorchZeroToAll)

Starting somewhere, we take our first step downhill in the direction specified by the negative gradient. Next, we recalculate the negative gradient and take another step in the direction it specifies. This process
continues until we get to a point where we can no longer move downhill - a local minimum.
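Here is a minimal, self-contained sketch of that loop (an illustration added for this write-up, not from the original notebook) on the one-dimensional function f(x) = (x - 3)^2, whose gradient is 2(x - 3) and whose minimum sits at x = 3:

    def gradient_descent_1d(grad, start, lr=0.1, n_iter=100):
        # Repeatedly step against the gradient until we (hopefully) reach a minimum.
        x = start
        for _ in range(n_iter):
            x -= lr * grad(x)
        return x

    # f(x) = (x - 3)**2 has gradient 2 * (x - 3); starting from 0 we should end up near 3.
    x_min = gradient_descent_1d(lambda x: 2 * (x - 3), start=0.0)
    print(x_min)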

First derivative of the sigmoid function


The first derivative of the sigmoid function is given by the following equation:


$$g'(z) = g(z)(1 - g(z))$$

Complete derivation can be found here.
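As a quick numerical sanity check (an addition for illustration; it assumes the sigmoid function defined above), we can compare this identity against a central finite-difference approximation of the derivative:

    # Compare the analytic derivative g(z) * (1 - g(z)) with a finite-difference estimate.
    z = np.linspace(-5, 5, 11)
    eps = 1e-6
    numeric = (sigmoid(z + eps) - sigmoid(z - eps)) / (2 * eps)
    analytic = sigmoid(z) * (1 - sigmoid(z))
    print(np.allclose(numeric, analytic, atol=1e-6))  # expected: True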

First derivative of the cost function


Recall that the cost function was given by the following equation:

$$J(W) = \frac{1}{m} \left( -y \log(h_w) - (1 - y) \log(1 - h_w) \right)$$

Given $g'(z) = g(z)(1 - g(z))$, applying the chain rule gives:

$$\frac{\partial J(W)}{\partial W} = \frac{1}{m} \left( -y(1 - h_w) + (1 - y) h_w \right) x = \frac{1}{m} (h_w - y) x$$
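To convince ourselves the algebra is right, here is a small numerical check (added for illustration; it assumes the sigmoid, loss, X and y defined above, with X still the 1-D array of amount_spent values) comparing the analytic gradient with a finite-difference estimate for a single weight:

    # Analytic gradient of J with respect to a single weight w (no intercept yet),
    # compared with a central finite difference of the loss itself.
    w = 0.1
    eps = 1e-6
    analytic = np.dot(X, sigmoid(X * w) - y) / y.size
    numeric = (loss(sigmoid(X * (w + eps)), y) - loss(sigmoid(X * (w - eps)), y)) / (2 * eps)
    print(analytic, numeric)  # the two values should agree closely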

Updating our parameters W


The parameter updating rule we're going to use is defined by:

$$W := W - \alpha \frac{1}{m} (h_w - y) x$$

The parameter $\alpha$ is known as the learning rate. A high learning rate can converge quickly, but risks overshooting the lowest point. A low learning rate allows for confident moves in the direction of the negative gradient, but it is time-consuming and can take many iterations to converge.

Big vs Small learning rate (Source: TowardsDataScience)
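For some intuition, here is a tiny hypothetical experiment (not from the original notebook) on f(x) = x^2, whose gradient is 2x: a learning rate that is too high makes the iterates overshoot and diverge, while a very small one converges only slowly:

    def descend(lr, start=1.0, n_iter=20):
        # Minimize f(x) = x**2 (gradient 2x) and return the final iterate.
        x = start
        for _ in range(n_iter):
            x -= lr * 2 * x
        return x

    print(descend(lr=1.1))    # |x| grows every step: overshooting / divergence
    print(descend(lr=0.001))  # barely moves toward 0 in 20 steps
    print(descend(lr=0.1))    # a reasonable rate gets close to 0 quickly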

The gradient descent algorithm


Repeat until convergence {
1. Calculate gradient average
2. Multiply by learning rate
3. Subtract from weights
}

In [0]: def predict(X, W):
    return sigmoid(np.dot(X, W))

def fit(X, y, n_iter=100000, lr=0.01):
    W = np.zeros(X.shape[1])

    for i in range(n_iter):
        z = np.dot(X, W)
        h = sigmoid(z)
        gradient = np.dot(X.T, (h - y)) / y.size
        W -= lr * gradient
    return W

In [0]: class TestGradientDescent(unittest.TestCase):

    def test_correct_prediction(self):
        global X
        global y
        if len(X.shape) != 2:
            X = X.reshape(X.shape[0], 1)
        w, _ = fit(X, y)
        y_hat = predict(X, w).round()
        self.assertTrue((y_hat == y).all())

In [0]: run_tests()

E.......
======================================================================
ERROR: test_correct_prediction (__main__.TestGradientDescent)
----------------------------------------------------------------------
Traceback (most recent call last):
File "<ipython-input-15-59757b6f0a6f>", line 8, in test_correct_prediction
w, _ = fit(X, y)
ValueError: not enough values to unpack (expected 2, got 1)

----------------------------------------------------------------------
Ran 8 tests in 0.888s

FAILED (errors=1)

Well, that's not good. After all that hustle, we're nowhere near achieving our goal of finding good parameters for our model. But what went wrong? Let's start by checking whether our algorithm improves over time. We can use our loss metric for that:

In [0]: def fit(X, y, n_iter=100000, lr=0.01):
    W = np.zeros(X.shape[1])
    errors = []
    for i in range(n_iter):
        z = np.dot(X, W)
        h = sigmoid(z)
        gradient = np.dot(X.T, (h - y)) / y.size
        W -= lr * gradient

        if i % 10000 == 0:
            e = loss(h, y)
            print(f'loss: {e} \t')
            errors.append(e)

    return W, errors

In [0]: run_tests()

loss: 0.6931471805599453
loss: 0.41899283818630056
loss: 0.41899283818630056
loss: 0.41899283818630056
loss: 0.41899283818630056
loss: 0.41899283818630056
loss: 0.41899283818630056
loss: 0.41899283818630056
loss: 0.41899283818630056
F.......
loss: 0.41899283818630056
======================================================================
FAIL: test_correct_prediction (__main__.TestGradientDescent)
----------------------------------------------------------------------
Traceback (most recent call last):
File "<ipython-input-15-59757b6f0a6f>", line 10, in test_correct_prediction
self.assertTrue((y_hat == y).all())
AssertionError: False is not true

----------------------------------------------------------------------
Ran 8 tests in 0.938s

FAILED (failures=1)

In [0]: _, errors = fit(X, y)

plt.plot(np.arange(len(errors)), errors)
plt.xlabel("iterations (x10,000)")
plt.ylabel("error")
plt.ylim(0, 1)
plt.show()

loss: 0.6931471805599453
loss: 0.41899283818630056
loss: 0.41899283818630056
loss: 0.41899283818630056
loss: 0.41899283818630056
loss: 0.41899283818630056
loss: 0.41899283818630056
loss: 0.41899283818630056
loss: 0.41899283818630056
loss: 0.41899283818630056

Good, we found a possible cause of our problem. Our loss doesn't get low enough; in other words, our algorithm gets stuck at a point that is not a good enough minimum for us. How can we fix this? Perhaps we can try a different learning rate, or initialize our parameters with different values?

In [0]: def fit(X, y, n_iter=100000, lr=0.001):
    W = np.zeros(X.shape[1])
    errors = []

    for i in range(n_iter):
        z = np.dot(X, W)
        h = sigmoid(z)
        gradient = np.dot(X.T, (h - y)) / y.size
        W -= lr * gradient

        if i % 10000 == 0:
            e = loss(h, y)
            print(f'loss: {e} \t')
            errors.append(e)

    return W, errors

In [0]: run_tests()

loss: 0.6931471805599453
loss: 0.41899283818630056
loss: 0.41899283818630056
loss: 0.41899283818630056
loss: 0.41899283818630056
loss: 0.41899283818630056
loss: 0.41899283818630056
loss: 0.41899283818630056
loss: 0.41899283818630056
F.......
loss: 0.41899283818630056
======================================================================
FAIL: test_correct_prediction (__main__.TestGradientDescent)
----------------------------------------------------------------------
Traceback (most recent call last):
File "<ipython-input-15-59757b6f0a6f>", line 10, in test_correct_prediction
self.assertTrue((y_hat == y).all())
AssertionError: False is not true

----------------------------------------------------------------------
Ran 8 tests in 0.903s

FAILED (failures=1)

Hmm, how about adding one more parameter (an intercept) for our model to find/learn?

In [0]: def add_intercept(X):
    intercept = np.ones((X.shape[0], 1))
    return np.concatenate((intercept, X), axis=1)

def predict(X, W):
    X = add_intercept(X)
    return sigmoid(np.dot(X, W))

def fit(X, y, n_iter=100000, lr=0.01):
    X = add_intercept(X)
    W = np.zeros(X.shape[1])

    errors = []

    for i in range(n_iter):
        z = np.dot(X, W)
        h = sigmoid(z)
        gradient = np.dot(X.T, (h - y)) / y.size
        W -= lr * gradient

        if i % 10000 == 0:
            e = loss(h, y)
            errors.append(e)
    return W, errors

In [0]: run_tests()

........
----------------------------------------------------------------------
Ran 8 tests in 0.837s

OK

In [0]: _, errors = fit(X, y)

plt.plot(np.arange(len(errors)), errors)
plt.xlabel("iterations (x10,000)")
plt.ylabel("error")
plt.ylim(0, 1)
plt.show();

Hiding the complexity of the algorithm


Knowing all of the details of the inner workings of Gradient descent is good, but when solving problems in the wild, we might be hard pressed for time. In those situations, a simple, easy-to-use interface for fitting a Logistic Regression model might save us a lot of time. So, let's build one!

But first, let's write some tests:

In [0]: class TestLogisticRegressor(unittest.TestCase):

    def test_correct_prediction(self):
        global X
        global y
        X = X.reshape(X.shape[0], 1)
        clf = LogisticRegressor()
        y_hat = clf.fit(X, y).predict(X)
        self.assertTrue((y_hat == y).all())

In [0]: run_tests()

.E.......
======================================================================
ERROR: test_correct_prediction (__main__.TestLogisticRegressor)
----------------------------------------------------------------------
Traceback (most recent call last):
File "<ipython-input-25-b3cbfe737800>", line 7, in test_correct_prediction
clf = LogisticRegressor()
NameError: name 'LogisticRegressor' is not defined

----------------------------------------------------------------------
Ran 9 tests in 0.858s

FAILED (errors=1)

In [0]: class LogisticRegressor:

    def _add_intercept(self, X):
        intercept = np.ones((X.shape[0], 1))
        return np.concatenate((intercept, X), axis=1)

    def predict_probs(self, X):
        X = self._add_intercept(X)
        return sigmoid(np.dot(X, self.W))

    def predict(self, X):
        return self.predict_probs(X).round()

    def fit(self, X, y, n_iter=100000, lr=0.01):
        X = self._add_intercept(X)
        self.W = np.zeros(X.shape[1])

        for i in range(n_iter):
            z = np.dot(X, self.W)
            h = sigmoid(z)
            gradient = np.dot(X.T, (h - y)) / y.size
            self.W -= lr * gradient
        return self

In [0]: run_tests()

.........
----------------------------------------------------------------------
Ran 9 tests in 1.695s

OK

Using our Regressor to decide who should receive discount codes


Now that we're done with the "hard" part, let's use the model to predict whether or not we should send discount codes.

In [0]: X_test = np.array([10, 250])

X_test = X_test.reshape(X_test.shape[0], 1)
y_test = LogisticRegressor().fit(X, y).predict(X_test)

In [0]: y_test

Out[0]: array([1., 0.])
