Example Report
A. Logistic Regression: Logistic Regression is a statistical model used to predict the probability of
an event occurring. It models the log-odds of the outcome as a linear function of the input:

logit(p_i) = ln( p_i / (1 − p_i) ) = β0 + β1·x_i
The response variable is modeled with a Bernoulli distribution, giving the log-likelihood:

ln(L) = Σ_{i=1}^{N} [ ln(1 − p_i) + y_i · ln( p_i / (1 − p_i) ) ]

To maximise this log-likelihood, we need to find the optimal values of p_i (equivalently, of the
coefficients β).
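As a sanity check on the log-likelihood above, the sketch below (with purely illustrative y and p values) verifies numerically that it is algebraically identical to the more familiar Bernoulli form Σ [ y_i·ln(p_i) + (1 − y_i)·ln(1 − p_i) ]:

```python
import numpy as np

# Hypothetical labels and predicted probabilities (illustrative values only).
y = np.array([1, 0, 1, 1, 0])
p = np.array([0.9, 0.2, 0.7, 0.6, 0.1])

# Log-likelihood in the form used above: sum of ln(1 - p_i) + y_i * ln(p_i / (1 - p_i))
ll_report_form = np.sum(np.log(1 - p) + y * np.log(p / (1 - p)))

# Standard Bernoulli form: sum of y_i * ln(p_i) + (1 - y_i) * ln(1 - p_i)
ll_standard = np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

print(np.isclose(ll_report_form, ll_standard))  # the two forms agree
```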
B. Lasso Regression: Lasso, the Least Absolute Shrinkage and Selection Operator, is a linear
regression model that incorporates variable selection and regularisation to improve prediction
accuracy. Regularisation adds a penalty term to the cost function; Lasso uses the L1 penalty:

R(θ) = Σ_{j=1}^{n} |θ_j|
In Lasso Regression, unimportant features are driven to weights of approximately 0, so they
contribute nothing when predicting new values. In our case, the hyperparameter that needs tuning is
the C value used to set the alpha parameter:

α = 1 / (2C)

So alpha is inversely proportional to the C value.
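A short sketch of the α = 1/(2C) mapping, using the same candidate C values tested later for the logistic model (the choice of grid here is illustrative):

```python
# Candidate C values; alpha = 1 / (2 * C) as defined above, so a large C
# corresponds to a small alpha, i.e. weaker regularization.
c_values = [0.001, 0.01, 0.1, 1, 10, 100]
for c in c_values:
    alpha = 1 / (2 * c)
    print(f"C = {c:>7} -> alpha = {alpha}")
```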
C. k-Fold Lasso Regression: To improve the model, we used k-fold cross-validation to tune the
hyperparameters of the previous Lasso model.
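A minimal sketch of k-fold tuning with scikit-learn, assuming k=5 and a small alpha grid; the data here is a synthetic stand-in, since the report's actual features are the vectorized review texts:

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.model_selection import GridSearchCV, KFold

# Synthetic stand-in data (hypothetical); real inputs are vectorized reviews.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = X[:, 0] * 2.0 - X[:, 1] + rng.normal(scale=0.1, size=200)

# 5-fold cross-validation over a grid of alpha values, scored by RMSE.
search = GridSearchCV(
    Lasso(max_iter=10_000),
    param_grid={"alpha": [0.001, 0.01, 0.1, 1.0]},
    cv=KFold(n_splits=5, shuffle=True, random_state=0),
    scoring="neg_root_mean_squared_error",
)
search.fit(X, y)
print(search.best_params_)  # alpha with the lowest cross-validated RMSE
```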
D. Ridge Regression: Ridge regression is used to analyse data with multicollinearity. When the data
contain multicollinearity, the least-squares estimates are unbiased but their variances are large,
which leads to a drastic reduction in prediction accuracy. Ridge regression adds an L2 penalty to
the cost function:

J(θ) = (1/2) Σ_{i=1}^{m} ( h_θ(x_i) − y_i )² + λ Σ_{j=1}^{n} θ_j²
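The ridge cost above can be written out directly. The sketch below uses hypothetical numbers and a linear hypothesis h_θ(x) = x·θ; note it penalizes all coefficients, whereas implementations often exclude the intercept from the penalty:

```python
import numpy as np

def ridge_cost(theta, X, y, lam):
    """J(theta) = (1/2) * sum((h_theta(x_i) - y_i)^2) + lam * sum(theta_j^2)."""
    residuals = X @ theta - y  # h_theta(x_i) - y_i for a linear hypothesis
    return 0.5 * np.sum(residuals ** 2) + lam * np.sum(theta ** 2)

# Tiny illustrative example (hypothetical values).
X = np.array([[1.0, 2.0], [1.0, 3.0]])
y = np.array([2.0, 3.0])
theta = np.array([0.0, 1.0])
print(ridge_cost(theta, X, y, lam=0.5))  # residuals are 0, so only the L2 term remains -> 0.5
```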
E. Sequential Model: A Sequential model is a linear stack of layers in which each layer has exactly
one input and one output.
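A conceptual sketch of that idea in plain NumPy (not the Keras API): a "sequential" model is just a list of layers applied one after another, the output of each feeding the next. The layer sizes here are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)

def dense(in_dim, out_dim):
    """A dense layer: random weights, zero bias, ReLU activation."""
    W = rng.normal(scale=0.1, size=(in_dim, out_dim))
    b = np.zeros(out_dim)
    return lambda x: np.maximum(0.0, x @ W + b)

# Linear stack of single-input, single-output layers.
layers = [dense(8, 16), dense(16, 16), dense(16, 1)]

def forward(x, layers):
    for layer in layers:
        x = layer(x)  # each layer consumes the previous layer's output
    return x

x = rng.normal(size=(4, 8))      # a batch of 4 input vectors
print(forward(x, layers).shape)  # (4, 1)
```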
F. K-Nearest Neighbours Model: This model is used to solve both regression and classification
problems. The KNN model assumes that similar points lie close to one another, groups them
accordingly, and then predicts a new point's value from the group it would fall into.
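A minimal classification sketch with scikit-learn; the data is a synthetic stand-in (the report's actual inputs are vectorized review texts), and k=5 is an arbitrary illustrative choice:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in data (hypothetical).
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# k-NN predicts a point's class from the majority class among its k nearest neighbours.
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)
print(knn.score(X_test, y_test))  # accuracy on the held-out 20%
```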
Experiments/Results/Discussion
We first generated a wordcloud for the entire set of review texts, and then separate wordclouds for
the extreme ends of the rating variable, to observe the most commonly occurring words in each case.
The data was divided into an 80/20 train/test split. The wordclouds for the extreme ratings were not
up to our expectations: we expected the rating=5 wordcloud to contain positive, complimentary words,
and the rating=1 wordcloud negative ones. Since both wordclouds shared several of the same frequently
occurring words, we realised we needed to preprocess the data to remove highly repetitive but neutral
words from the review texts. The wordclouds are as follows:
The first wordcloud in the first row is for the entire dataset. The second wordcloud in the first
row is for reviews with rating=5 (the maximum). The wordcloud in the second row is for reviews with
rating=1 (the minimum).
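The preprocessing step described above amounts to filtering neutral, highly repetitive words before counting frequencies. A small sketch of that idea, using a hypothetical two-review corpus and an illustrative stop-word list:

```python
import re
from collections import Counter

# Hypothetical mini-corpus; the report uses the full set of user reviews.
reviews = [
    "The product is great, really great quality",
    "The delivery was terrible and the product broke",
]

# Neutral, highly repetitive words to exclude (illustrative stop-word list).
stopwords = {"the", "is", "was", "and", "a", "of"}

# Tokenize, lowercase, drop stop words, and count what remains.
words = re.findall(r"[a-z']+", " ".join(reviews).lower())
counts = Counter(w for w in words if w not in stopwords)
print(counts.most_common(3))
```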
Logistic Regression:
L2 regularized Logistic Regression
A Logistic Regression Model is used to predict the ratings from the user reviews. We have tested the
model against the following values for C: [0.001, 0.01, 0.1, 1, 10, 100]
As observed from the plot of true positive rate vs false positive rate (the ROC curve), the model
with the L2 penalty term performs best when C is set to 100. The AUC score on the testing data is
81.08%.
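A sketch of the C sweep described above, using the same grid of C values; the data is a synthetic stand-in for the vectorized reviews, so the AUC numbers it prints are not the report's:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (hypothetical); real inputs are vectorized reviews.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Sweep the tested C values; larger C means weaker L2 regularization.
for c in [0.001, 0.01, 0.1, 1, 10, 100]:
    model = LogisticRegression(penalty="l2", C=c, max_iter=1000)
    model.fit(X_train, y_train)
    auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
    print(f"C={c:>7}: AUC={auc:.3f}")
```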
Lasso Regression:
From the above outputs, we can observe that the root mean square error on the testing data is lowest
when the alpha value is set to 0.001. As we keep increasing the alpha value, the root mean square
error increases.
On viewing the features selected at the alpha value with the lowest root mean square error, we can
confirm that this is the best-fitting model, as it focuses on the negative words in the reviews, on
which the review rating depends the most.
Ridge Regression:
From the above outputs, we can observe that the root mean square error on the testing data is lowest
when the alpha value is set to 0.1. As we keep decreasing the alpha value, the root mean square
error increases.
On viewing the features selected at the alpha value with the lowest root mean square error, we can
confirm that this is the best-fitting model, as it focuses on the positive words in the reviews, on
which the review rating depends the most.
Sequential Model:
From the above plots, it can be seen that as the number of epochs (training iterations) increases,
the loss decreases and the accuracy increases steadily.
The accuracy score above shows that the sequential model's accuracy on the testing data is 74.3%,
which is confirmed by the classification report.
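Cross-checking an accuracy score against a classification report can be sketched as below; the true and predicted ratings here are purely illustrative, not the model's actual outputs:

```python
from sklearn.metrics import accuracy_score, classification_report

# Hypothetical true vs predicted ratings (illustrative values only).
y_true = [5, 1, 5, 3, 1, 5, 3, 1]
y_pred = [5, 1, 5, 5, 1, 5, 3, 3]

# Overall accuracy matches the accuracy line of the classification report.
print(accuracy_score(y_true, y_pred))  # 6 of 8 correct -> 0.75
print(classification_report(y_true, y_pred))
```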
Summary

Model              Score
Lasso Regression   74
Ridge Regression   44
Sequential Model   75