
MLFA Assignment - 2

Name - Anil Kumar Yogi

Roll No. - 19MA20004

Linear and Ridge Regression


Experiment-1

Plots generated for exploratory data analysis (a code sketch follows the list):

- Pair-wise correlation heat-map
- Distributions of the various features
- Distribution plot of the Purchase amount
- Count histogram for Occupation
- Count plot for Age
- Count plot for Marital Status
- Purchase amount vs. Occupation distribution
- Count plot for Product_Category_1
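
A minimal sketch of how these plots could be generated, assuming a pandas DataFrame df holding the purchase dataset; the file name and the column names ('Purchase', 'Occupation', 'Age', 'Marital_Status', 'Product_Category_1') are inferred from the captions above, not confirmed by the report.

# Sketch of the Experiment-1 exploratory plots (file/column names assumed).
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

df = pd.read_csv("train.csv")  # hypothetical file name

# Pair-wise correlation heat-map over the numeric columns
sns.heatmap(df.select_dtypes("number").corr(), annot=True, cmap="coolwarm")
plt.title("Pair-wise correlation")
plt.show()

# Distribution of the Purchase amount
sns.histplot(df["Purchase"], kde=True)
plt.show()

# Count plots for the categorical features
for col in ["Occupation", "Age", "Marital_Status", "Product_Category_1"]:
    sns.countplot(x=col, data=df)
    plt.show()

# Purchase amount vs. Occupation
sns.boxplot(x="Occupation", y="Purchase", data=df)
plt.show()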

Experiment-2
Results are reported on the test data, which is 20% of the total data.

        Without feature scaling    With feature scaling
MSE     21917565.940552283         21922113.530379087
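
A minimal sketch of how the Experiment-2 comparison could be computed: an 80/20 train/test split, a closed-form (normal-equation) linear model, and MSE evaluated with and without standardisation. The placeholder data and helper names below are illustrative assumptions, not the assignment's own implementation.

# Sketch of Experiment-2: closed-form linear regression with/without scaling.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import mean_squared_error

def fit_closed_form(X, y):
    # Normal equation with a bias column: w = (X^T X)^+ X^T y
    Xb = np.c_[np.ones(len(X)), X]
    return np.linalg.pinv(Xb.T @ Xb) @ Xb.T @ y

def predict(w, X):
    return np.c_[np.ones(len(X)), X] @ w

# Placeholder data; the assignment would use the preprocessed purchase features.
rng = np.random.default_rng(0)
X, y = rng.random((1000, 8)), rng.random(1000) * 20000

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=42)

w = fit_closed_form(X_tr, y_tr)
print("MSE without scaling:", mean_squared_error(y_te, predict(w, X_te)))

scaler = StandardScaler().fit(X_tr)
w = fit_closed_form(scaler.transform(X_tr), y_tr)
print("MSE with scaling:   ", mean_squared_error(y_te, predict(w, scaler.transform(X_te))))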


Experiment-3
Plot: MSE vs. learning rate

The optimal learning rate is 0.0001.

Values are as follows:

Learning rate    MSE
0.00001          28823496.419600222
0.0001           22126124.834601153
0.001            22126276.82352324
0.01             22145885.458371203
0.05             22260542.10649711
0.1              22325013.14734848

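A minimal sketch of the learning-rate sweep, assuming batch gradient descent on the MSE loss with a 50-iteration cap (the cap is mentioned in the Experiment-5 observation); the data and other settings below are placeholders.

# Sketch of Experiment-3: learning-rate sweep for batch gradient descent.
import numpy as np

def gradient_descent(X, y, lr, n_iters=50):
    Xb = np.c_[np.ones(len(X)), X]              # prepend a bias column
    w = np.zeros(Xb.shape[1])
    n = len(y)
    for _ in range(n_iters):
        grad = (2.0 / n) * Xb.T @ (Xb @ w - y)  # gradient of the MSE loss
        w -= lr * grad
    return w

def mse(w, X, y):
    Xb = np.c_[np.ones(len(X)), X]
    return np.mean((Xb @ w - y) ** 2)

# Placeholder data; the assignment would use the scaled train/test split.
rng = np.random.default_rng(0)
X_tr, y_tr = rng.random((800, 8)), rng.random(800) * 20000
X_te, y_te = rng.random((200, 8)), rng.random(200) * 20000

results = {lr: mse(gradient_descent(X_tr, y_tr, lr), X_te, y_te)
           for lr in [0.00001, 0.0001, 0.001, 0.01, 0.05, 0.1]}
print("Optimal learning rate:", min(results, key=results.get))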


Experiment-4
Plot: MSE vs. alpha

Optimal value of alpha: 1

Values are as follows:

Alpha    MSE
0        22126124.83619915
0.1      22126053.346791495
0.2      22125985.61116677
0.3      22125921.736663796
0.4      22125861.565309893
0.5      22125805.17178949
0.6      22125752.367283475
0.7      22125703.43527322
0.8      22125658.03194989
0.9      22125616.444388248
1        22125578.425521277

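A minimal sketch of the alpha sweep, assuming ridge regression is trained with the same gradient-descent setup plus an L2 penalty; whether LIN_MODEL_RIDGE is gradient-based or closed-form is an assumption, and the data below are placeholders.

# Sketch of Experiment-4: alpha sweep for ridge regression (gradient descent
# with an L2 penalty added to the MSE gradient; the bias term is not penalised).
import numpy as np

def ridge_gd(X, y, alpha, lr=0.0001, n_iters=50):
    Xb = np.c_[np.ones(len(X)), X]
    w = np.zeros(Xb.shape[1])
    n = len(y)
    for _ in range(n_iters):
        penalty = 2.0 * alpha * w
        penalty[0] = 0.0                        # leave the bias unregularised
        w -= lr * ((2.0 / n) * Xb.T @ (Xb @ w - y) + penalty)
    return w

def mse(w, X, y):
    Xb = np.c_[np.ones(len(X)), X]
    return np.mean((Xb @ w - y) ** 2)

# Placeholder data; the assignment would use the scaled train/test split.
rng = np.random.default_rng(0)
X_tr, y_tr = rng.random((800, 8)), rng.random(800) * 20000
X_te, y_te = rng.random((200, 8)), rng.random(200) * 20000

scores = {round(a / 10, 1): mse(ridge_gd(X_tr, y_tr, a / 10), X_te, y_te)
          for a in range(11)}                   # alpha = 0.0, 0.1, ..., 1.0
print("Optimal alpha:", min(scores, key=scores.get))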

Experiment-5
MSE values on the test data (with feature scaling):

        LIN_MODEL_CLOSED      LIN_MODEL_GRAD        LIN_MODEL_RIDGE
MSE     21922113.530379087    153741517.51052088    152150704.97248355
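
The observation below refers to a graph of MSE vs. models; a minimal sketch of how such a chart could be drawn from the three values in the table above (matplotlib usage is assumed, the values are copied verbatim from the table).

# Sketch of the "MSE vs. models" chart referenced in the observation below.
import matplotlib.pyplot as plt

models = ["LIN_MODEL_CLOSED", "LIN_MODEL_GRAD", "LIN_MODEL_RIDGE"]
mse_values = [21922113.530379087, 153741517.51052088, 152150704.97248355]

plt.bar(models, mse_values)
plt.ylabel("MSE on test data")
plt.title("MSE vs. models (with feature scaling)")
plt.show()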


Observation:
The above graph (MSE vs. models) shows that LIN_MODEL_CLOSED outperformed
LIN_MODEL_GRAD and LIN_MODEL_RIDGE. This is consistent with theory: the
closed-form solution is exact, while the gradient-descent solutions are only
approximate. Another reason for this large difference may be that the maximum
number of iterations was limited to 50; more epochs might have brought the
gradient-descent models closer to the closed-form MSE.
LIN_MODEL_RIDGE performs slightly better than LIN_MODEL_GRAD, which indicates
that LIN_MODEL_GRAD has overfit the data; LIN_MODEL_RIDGE corrects this to some
extent, resulting in its slightly better performance.
