
CS 6375

ASSIGNMENT 1: Linear Regression using Gradient Descent

Names of students in your group:


1. Ameya Kulkarni (ANK190006)
2. Yamini Thota (YRT190003)

Number of free late days used:


___0_________________
Note: You are allowed a total of 4 free late days for the entire semester. You can use at most
2 for each assignment. After that, there will be a penalty of 10% for each late day.

Please list clearly all the sources/references that you have used in this assignment.
- We haven't used any resources directly.

Part 1:
Dataset used:
https://archive.ics.uci.edu/ml/datasets/Metro+Interstate+Traffic+Volume

Note: For the purpose of code execution we have already provided the dataset file; you need not download it.

Plot of each input feature against the output:
The independent features are holiday, temp, rain_1h, snow_1h, clouds_all, and weather_main.
The dependent (target) feature is traffic_volume.

1. Before feature scaling

Fig 1

2. After feature scaling

Fig 2
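The scatter plots in Fig 1 and Fig 2 were produced with a loop along these lines (a minimal sketch, not our exact plotting code; the file name Metro_Interstate_Traffic_Volume.csv is an assumption for the provided dataset file):

import pandas as pd
import matplotlib.pyplot as plt

# Assumed file name for the dataset file provided with the submission
df = pd.read_csv('Metro_Interstate_Traffic_Volume.csv')

# Scatter each independent feature against the target, traffic_volume
features = ['holiday', 'temp', 'rain_1h', 'snow_1h', 'clouds_all', 'weather_main']
fig, axes = plt.subplots(2, 3, figsize=(15, 8))
for ax, col in zip(axes.flat, features):
    ax.scatter(df[col], df['traffic_volume'], s=2)
    ax.set_xlabel(col)
    ax.set_ylabel('traffic_volume')
plt.tight_layout()
plt.show()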

Pre-processing activities include:

- We don't have any rows with null values.
- We don't have any redundant rows; we verified this using drop_duplicates().
- We removed the weather_description column because we already have weather_main.
- We used Min-Max scaling for temp, rain_1h, clouds_all, and traffic_volume (see the sketch after the mappings below).
- We converted dates like "10/2/2012 9:00:00 AM" to the hour of the day (e.g., "9"), because we generally need to find traffic volume based on the hour.
- We categorized the holiday and weather_main features:
# Encode weather_main categories as integers
df['weather_main'] = df['weather_main'].map({
'Clouds': 0,
'Clear': 1,
'Drizzle': 2,
'Fog': 3,
'Haze': 4,
'Mist': 5,
'Rain': 6,
'Smoke': 7,
'Snow': 8,
'Squall': 9,
'Thunderstorm': 10,
})

# Encode holiday categories as integers ('None' marks non-holidays)
df['holiday'] = df['holiday'].map({
'None': 0,
'Columbus Day': 1,
'Veterans Day': 2,
'Thanksgiving Day': 3,
'Christmas Day': 4,
'New Years Day': 5,
'Washingtons Birthday': 6,
'Memorial Day': 7,
'Independence Day': 8,
'State Fair': 9,
'Labor Day': 10,
'Martin Luther King Jr Day': 11
})
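For reference, the scaling and date-to-hour steps above can be written as follows (a minimal sketch, assuming pandas and the df from the loading step; date_time is the dataset's timestamp column):

import pandas as pd

# Min-Max scaling: rescale each listed column to the [0, 1] range
for col in ['temp', 'rain_1h', 'clouds_all', 'traffic_volume']:
    df[col] = (df[col] - df[col].min()) / (df[col].max() - df[col].min())

# Convert timestamps like "10/2/2012 9:00:00 AM" to the hour of the day (e.g., 9)
df['date_time'] = pd.to_datetime(df['date_time']).dt.hour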

Observations:
How we obtained the observations below:
- We ran our code with the epoch count ranging from 1000 down to 1, reducing it by 20% each time (see the grid-search sketch after this list).
- For each epoch count we varied the learning rate from 0.001 to 0.010 in steps of 0.001.
- This gave a total of 270 observations.
- We sorted these observations by training MSE; the minimum MSE on the training data was 0.193553234889187.
- We allowed a tolerance of 0.05 to find the smallest epoch count and largest learning rate.
- The table below shows the MSE values in the range 0.19 to 0.25.
- We feel that the highlighted rows are the best values of epochs and learning rate.
- All recordings were taken with the same starting weights.
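The grid search was structured roughly as sketched below (an outline only; gradient_descent stands in for our training routine, and X_train, y_train, X_test, y_test are our scaled train/test split):

import numpy as np

# 27 epoch values (1000 shrinking by 20% down to 1) x 10 learning rates = 270 runs
results = []
epochs = 1000
while epochs >= 1:
    for lr in [i / 1000 for i in range(1, 11)]:   # 0.001 .. 0.010
        w = gradient_descent(X_train, y_train, epochs=epochs, lr=lr)  # our routine (not shown)
        mse_train = np.mean((X_train @ w - y_train) ** 2)
        mse_test = np.mean((X_test @ w - y_test) ** 2)
        results.append((epochs, lr, mse_train, mse_test))
    epochs = int(epochs * 0.8)                    # reduce the epoch count by 20%

results.sort(key=lambda row: row[2])              # sort by MSE on training data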

Index   Epochs   Learning Rate   MSE Training Data   MSE Testing Data

199     12       0.009           0.24955098          0.237985157
198     12       0.008           0.25529565          0.243804941
189     16       0.009           0.23630345          0.224245793
188     16       0.008           0.24095059          0.229016238
187     16       0.007           0.24731688          0.23557902
186     16       0.006           0.25589799          0.244425476
179     20       0.009           0.23011224          0.217891765
178     20       0.008           0.23314507          0.22099012
177     20       0.007           0.23763966          0.225608025
176     20       0.006           0.24425232          0.23242054
180     20       0.01            0.24986636          0.239225368
175     20       0.005           0.25387417          0.242340392
169     26       0.009           0.22611866          0.213869919
168     26       0.008           0.22762091          0.21536825
170     26       0.01            0.22793378          0.215984987
167     26       0.007           0.23009179          0.217871243
166     26       0.006           0.23418695          0.222060797
165     26       0.005           0.24095276          0.229021169
164     26       0.004           0.25199203          0.24040105

Below is the graph for one of the optimal values of epochs and learning rate from our observations.

Fig 3

Fig 4

Fig 5
How Satisfied?
- We are satisfied with our values. We trained our model with different epochs and learning rates and recorded the observations. From these observations we checked for the minimum MSE on the training data and picked the best values for epochs and learning rate.
- Fig 3 shows the output for 100 epochs with a learning rate of 0.008. From the graph it is clear that the model converges somewhere near iteration 20-25; the remaining iterations are not required. Fig 4 is a zoomed-in version of Fig 3.
- To find this convergence point we added a condition that training continues only while the difference in error between iterations is more than 0.001; training stops once the difference drops below 0.001 (see the sketch below). Fig 5 shows that training stopped at the convergence point.
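The stopping condition amounts to the following check inside the training loop (a sketch of the idea only; gradient is a hypothetical helper for the MSE gradient):

import numpy as np

tol = 0.001                                        # convergence tolerance from above
prev_error = float('inf')
for i in range(epochs):
    w = w - lr * gradient(w, X_train, y_train)     # hypothetical gradient step
    error = np.mean((X_train @ w - y_train) ** 2)  # MSE after this iteration
    if prev_error - error < tol:
        break                                      # converged; stop training early
    prev_error = error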

Part 2:
- We used the scikit-learn SGD linear regression package (a usage sketch follows this list).
- We applied the same logic as in Part 1 to obtain the observations.
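A minimal usage sketch for one grid point (assuming scikit-learn's SGDRegressor, since the target is continuous, and the same train/test split as Part 1):

from sklearn.linear_model import SGDRegressor
from sklearn.metrics import mean_squared_error

# tol=None forces the model to run for the full max_iter epochs
model = SGDRegressor(max_iter=epochs, eta0=lr, learning_rate='constant', tol=None)
model.fit(X_train, y_train)
mse_train = mean_squared_error(y_train, model.predict(X_train))
mse_test = mean_squared_error(y_test, model.predict(X_test))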

Index   Epochs   Learning Rate   MSE Training Data   MSE Testing Data

131     53       0.001           0.254060491         0.258383816
1       1000     0.001           0.254148093         0.272112047
261     1        0.001           0.254416362         0.255711369
222     5        0.002           0.25441734          0.26322152
105     105      0.005           0.254566409         0.305966067
181     16       0.001           0.255023464         0.265665126
103     105      0.003           0.255304489         0.292813149
81      166      0.001           0.256135017         0.255118766
91      132      0.001           0.256341374         0.295609517
211     7        0.001           0.256353388         0.254889851
163     26       0.003           0.256754949         0.25831221
172     20       0.002           0.256846168         0.265325789
144     42       0.004           0.256873059         0.289084395
36      512      0.006           0.257148897         0.265194518
124     67       0.004           0.257982637         0.387657682
21      640      0.001           0.258289628         0.254007627
82      166      0.002           0.259437483         0.300478048
204     9        0.004           0.259596137         0.269058609

- We feel that the highlighted rows are the best values of epochs and learning rate.
- We used number of epochs = "9" and learning rate = "0.004" and got the values below.

Plot of average loss versus number of iterations in scikit-learn with SGDClassifier:

How Satisfied?
We are happy that we got values similar to those in Part 1.
We have a concern with the SGDClassifier: its fit function does not return a list of per-iteration MSE values that could be used later to plot a graph.
We had to use custom code to plot the graph of iterations versus average loss, as sketched below.
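The workaround runs one epoch at a time with partial_fit and records the loss manually (a sketch, assuming SGDRegressor; the same partial_fit pattern applies to SGDClassifier):

import matplotlib.pyplot as plt
from sklearn.linear_model import SGDRegressor
from sklearn.metrics import mean_squared_error

model = SGDRegressor(eta0=0.004, learning_rate='constant')
losses = []
for epoch in range(9):                      # 9 epochs, as chosen above
    model.partial_fit(X_train, y_train)     # one pass over the training data
    losses.append(mean_squared_error(y_train, model.predict(X_train)))

plt.plot(range(1, len(losses) + 1), losses)
plt.xlabel('Iteration')
plt.ylabel('Average loss (MSE)')
plt.show()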
