Logistic Regression
Logistic regression is used to predict the class (0 or 1) in binary classification problems.
localhost:8888/notebooks/ML/ECCE5001/Logistic.ipynb 1/22
7/2/22, 11:32 PM Logistic - Jupyter Notebook
Terminology
• Types of classification outcomes:
– True positive (m11): Example of class 1 predicted as class 1.
– False positive (m01): Example of class 0 predicted as class 1. Type I error.
– True negative (m00): Example of class 0 predicted as class 0.
– False negative (m10): Example of class 1 predicted as class 0. Type II error.
• Total number of instances: m = m00 + m01 + m10 + m11
Confusion matrix
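As a minimal sketch of the terminology above, the four counts can be tallied into a 2x2 array m, where m[i, j] is the number of class-i examples predicted as class j. The helper name confusion_counts and the labels are made up for illustration; they are not taken from the notebook.

```python
import numpy as np

def confusion_counts(y_true, y_pred):
    """Return a 2x2 array m where m[i, j] counts class-i examples predicted as class j."""
    y_true = np.asarray(y_true).ravel()
    y_pred = np.asarray(y_pred).ravel()
    m = np.zeros((2, 2), dtype=int)
    for actual, predicted in zip(y_true, y_pred):
        m[actual, predicted] += 1
    return m

# Made-up labels, for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]

m = confusion_counts(y_true, y_pred)
m00, m01, m10, m11 = m.ravel()   # TN, FP, FN, TP in the notation above
print(m)
print("TN =", m00, "FP =", m01, "FN =", m10, "TP =", m11)
print("m =", m00 + m01 + m10 + m11)
```

Rows index the actual class and columns the predicted class, which is also the layout scikit-learn uses for its confusion matrix.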
Common measures
• Accuracy = (TP + TN) / (TP + FP + FN + TN)
• Precision = True positives / Total number of declared positives
= TP / (TP + FP)
• Recall = True positives / Total number of actual positives
= TP / (TP + FN)
• Sensitivity is the same as recall.
• Specificity = True negatives / Total number of actual negatives
= TN / (FP + TN)
• False positive rate = FP / (FP + TN) = 1 - Specificity
• F1 measure = 2 * (Precision * Recall) / (Precision + Recall)
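The measures above can be computed directly from the four counts. The sketch below uses TP = 14, FP = 2, FN = 0, TN = 4, which are the counts of the 2x2 confusion matrix printed later in this notebook (assuming the usual layout with rows as actual classes). The function name classification_measures is hypothetical, and the sketch does not guard against zero denominators.

```python
def classification_measures(tp, fp, fn, tn):
    """Compute the common measures from the four confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)          # also called sensitivity
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "specificity": tn / (fp + tn),
        "false_positive_rate": fp / (fp + tn),
        "f1": 2 * precision * recall / (precision + recall),
    }

measures = classification_measures(tp=14, fp=2, fn=0, tn=4)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

With these counts the accuracy comes out at 0.90, which agrees with the accuracy row of the classification report shown later.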
data =
      Exam 1     Exam 2  Admitted
0  34.623660  78.024693         0
1  30.286711  43.894998         0
2  35.847409  72.902198         0
3  60.182599  86.308552         1
4  79.032736  75.344376         1
5  45.083277  56.316372         0
6  61.106665  96.511426         1
7  75.024746  46.554014         1
8  76.098787  87.420570         1
9  84.432820  43.533393         1
data.describe() =
           Exam 1      Exam 2    Admitted
count  100.000000  100.000000  100.000000
mean    65.644274   66.221998    0.600000
std     19.458222   18.582783    0.492366
min     30.058822   30.603263    0.000000
25%     50.919511   48.179205    0.000000
50%     67.032988   67.682381    1.000000
75%     80.212529   79.360605    1.000000
max     99.827858   98.869436    1.000000
Out[4]: 1    60
        0    40
        Name: Admitted, dtype: int64
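The outputs above look like the result of the usual pandas inspection calls. As a hedged sketch, the same calls are shown below on a DataFrame built from only the ten rows printed above, because the notebook's data file is not shown; the counts therefore differ from the full 100-row output (60 admitted, 40 not admitted).

```python
import pandas as pd

# The ten rows printed above; the full data set has 100 rows.
rows = [
    (34.623660, 78.024693, 0),
    (30.286711, 43.894998, 0),
    (35.847409, 72.902198, 0),
    (60.182599, 86.308552, 1),
    (79.032736, 75.344376, 1),
    (45.083277, 56.316372, 0),
    (61.106665, 96.511426, 1),
    (75.024746, 46.554014, 1),
    (76.098787, 87.420570, 1),
    (84.432820, 43.533393, 1),
]
data = pd.DataFrame(rows, columns=["Exam 1", "Exam 2", "Admitted"])

print(data.head(10))                       # first rows of the table
print(data.describe())                     # per-column summary statistics
counts = data["Admitted"].value_counts()   # class balance
print(counts)
```

Checking the class balance first is useful because accuracy alone can be misleading when one class dominates.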
X.shape = (100, 3)
theta.shape = (3,)
y.shape = (100, 1)
cost = 0.6931471805599453
accuracy = 89%
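The printed cost equals ln 2 (about 0.693), which is what the cross-entropy cost gives when theta is all zeros, since every predicted probability is then 0.5. The sketch below reproduces that value with the same array shapes. It assumes X carries a leading column of ones for the intercept and uses random stand-in data rather than the exam scores; the function names are illustrative, and the 89% accuracy (obtained after fitting theta) is not reproduced here.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(theta, X, y):
    """Mean cross-entropy cost for logistic regression."""
    y = y.ravel()                    # (m, 1) -> (m,), avoids accidental (m, m) broadcasting
    h = sigmoid(X @ theta)           # predicted probabilities, shape (m,)
    return -np.mean(y * np.log(h) + (1 - y) * np.log(1 - h))

def predict(theta, X, threshold=0.5):
    """Predict class 1 when the estimated probability reaches the threshold."""
    return (sigmoid(X @ theta) >= threshold).astype(int)

rng = np.random.default_rng(0)
m = 100
features = rng.uniform(30, 100, size=(m, 2))    # stand-in for the two exam scores
X = np.column_stack([np.ones(m), features])     # intercept column -> shape (100, 3)
y = rng.integers(0, 2, size=(m, 1))             # stand-in labels, shape (100, 1)
theta = np.zeros(3)

initial_cost = cost(theta, X, y)
print(X.shape, theta.shape, y.shape)
print("cost =", initial_cost)
```

Because y has shape (100, 1) while X @ theta has shape (100,), flattening y before the elementwise products matters; without it NumPy would broadcast to a 100x100 array.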
C:\Users\ralhm\anaconda3\lib\site-packages\sklearn\utils\validation.py:63: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
  return f(*args, **kwargs)
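The warning above appears when y is passed to fit as a column vector of shape (n_samples, 1). A minimal sketch of the fix it suggests, using synthetic stand-in data (not the notebook's data set):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 2))
# Column vector of labels, shape (40, 1), as in the notebook.
y = (X[:, 0] + X[:, 1] > 0).astype(int).reshape(-1, 1)

model = LogisticRegression()
model.fit(X, y.ravel())      # ravel() flattens y to shape (40,), which fit expects
print(y.shape, y.ravel().shape)
```

Flattening at the call site leaves y itself unchanged, so code that relies on the (n_samples, 1) shape elsewhere keeps working.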
Confusion Matrix is :
[[ 4 2]
[ 0 14]]
[[1]
[0]
[0]
[1]
[0]
[0]
[1]
[1]
[1]
[1]
[0]
[1]
[1]
[0]
[1]
[1]
[1]
[1]
[1]
[1]]
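The 20-entry column above contains 6 zeros and 14 ones, which matches the row sums of the confusion matrix printed earlier, so it is plausibly the vector of test labels. Under that assumption, the sketch below shows how scikit-learn lays out a binary confusion matrix. The predictions are constructed by hand so that the counts match; they are hypothetical and are not the notebook's actual predictions.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Labels copied from the column above (assumed to be the test labels).
y_true = np.array([1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1])

# Hypothetical predictions: identical to y_true except that two class-0
# examples are predicted as class 1 (two false positives).
y_pred = y_true.copy()
y_pred[[1, 2]] = 1

cm = confusion_matrix(y_true, y_pred)   # rows: actual class, columns: predicted class
tn, fp, fn, tp = cm.ravel()
print(cm)
print("TN =", tn, "FP =", fp, "FN =", fn, "TP =", tp)
```

In this layout the top-right entry is the false-positive count and the bottom-left entry is the false-negative count.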
from sklearn.metrics import roc_curve, roc_auc_score

#----------------------------------------------------
# Calculating Receiver Operating Characteristic:
# roc_curve(y_true, y_score, pos_label=None, sample_weight=None,drop_inter
# Note: if y_pred holds hard 0/1 labels (as returned by predict), the curve
# has only one interior point; predicted probabilities give the full curve.

fprValue, tprValue, thresholdsValue = roc_curve(y_test, y_pred)
print('fpr Value : ', fprValue)
print('tpr Value : ', tprValue)
print('thresholds Value : ', thresholdsValue)

#----------------------------------------------------
# Calculating ROC AUC Score:
# roc_auc_score(y_true, y_score, average='macro', sample_weight=None,max_f

ROCAUCScore = roc_auc_score(y_test, y_pred, average='micro')  # it can be :
print('ROCAUC Score : ', ROCAUCScore)
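The ROC functions take scores as their second argument, so a common variant of the cell above passes the predicted probability of class 1 instead of hard labels. The sketch below illustrates this on synthetic stand-in data; the data, the split, and the variable names are assumptions for illustration and are not the notebook's.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve, roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
# Noisy linear rule, so the two classes overlap a little.
y = (X[:, 0] + X[:, 1] + rng.normal(scale=0.5, size=200) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)
model = LogisticRegression().fit(X_train, y_train)

# Probability of class 1 for each test example, used as the ranking score.
y_score = model.predict_proba(X_test)[:, 1]

fpr, tpr, thresholds = roc_curve(y_test, y_score)
auc = roc_auc_score(y_test, y_score)
print("number of ROC points:", len(fpr))
print("AUC:", auc)
```

Each threshold on the score yields one (false positive rate, true positive rate) pair, which is why probabilities trace out a curve while hard labels give a single interior point.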
              precision    recall  f1-score   support

    accuracy                           0.90        20
   macro avg       0.94      0.83      0.87        20
weighted avg       0.91      0.90      0.89        20
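The macro and weighted averages in the report can be checked by hand from the 2x2 confusion matrix printed earlier. The sketch below assumes rows are actual classes and columns are predicted classes: the macro average is the plain mean of the per-class values, and the weighted average weights each class by its support.

```python
import numpy as np

cm = np.array([[4, 2],
               [0, 14]])            # rows: actual class, columns: predicted class

support = cm.sum(axis=1)            # actual examples per class
precision = np.diag(cm) / cm.sum(axis=0)
recall = np.diag(cm) / support
f1 = 2 * precision * recall / (precision + recall)

for name, values in [("precision", precision), ("recall", recall), ("f1-score", f1)]:
    macro = values.mean()
    weighted = np.average(values, weights=support)
    print(f"{name:9s} macro={macro:.2f} weighted={weighted:.2f}")

accuracy = np.trace(cm) / cm.sum()
print("accuracy =", accuracy)
```

Rounded to two decimals, these values agree with the macro avg, weighted avg, and accuracy rows of the report.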
(150, 4)
[[5.1 3.5 1.4 0.2]
[4.9 3. 1.4 0.2]]
[[9.69810844e-01 3.01885609e-02 5.94808016e-07]
[9.55989484e-01 4.40095854e-02 9.30437140e-07]]
score = 0.9666666666666667
No of iterations = [85]
Classes = [0 1 2]
[[50 0 0]
[ 0 47 3]
[ 0 2 48]]
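The outputs above (a 150x4 feature matrix, three classes, a 3x3 confusion matrix summing to 150) are consistent with logistic regression fitted on the Iris data set and evaluated on the same data. A hedged sketch of such a cell follows; the hyperparameters, including max_iter=1000, are assumptions, so the iteration count, probabilities, and score need not match the values printed above.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix

X, y = load_iris(return_X_y=True)
print(X.shape)
print(X[:2])

# Hyperparameters here are an assumption, not the notebook's settings.
model = LogisticRegression(max_iter=1000)
model.fit(X, y)

print(model.predict_proba(X[:2]))          # one probability per class, rows sum to 1
score = model.score(X, y)                  # accuracy on the training data
print("score =", score)
print("No of iterations =", model.n_iter_)
print("Classes =", model.classes_)

cm = confusion_matrix(y, model.predict(X))
print(cm)
```

Because the score is computed on the data used for fitting, it is an optimistic estimate; a held-out test split gives a fairer picture.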