
LASSO REGRESSION
Lasso Regression
◦ LASSO stands for Least Absolute Shrinkage and Selection Operator.
◦ Lasso regression is a regularization technique that applies a penalty to prevent overfitting and enhance the accuracy of statistical models.
◦ Lasso regression shrinks the magnitude of the coefficients so that only the more relevant variables remain in the model.
◦ Like Ridge Regression, Lasso Regression adds a regularization term to the linear regression objective function.
◦ The difference lies in the penalty used: Lasso Regression uses L1 regularization, which penalizes the sum of the absolute values of the coefficients multiplied by the penalty factor λ, whereas Ridge penalizes the sum of their squares.
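As a rough illustration (not part of the original slides), the Python sketch below fits a lasso model with scikit-learn on synthetic data; the alpha argument plays the role of the penalty factor λ, and the dataset settings are assumptions made for the example:

from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso

# Synthetic data: 10 features, only 3 actually informative
# (an illustrative assumption, not taken from the slides).
X, y = make_regression(n_samples=200, n_features=10, n_informative=3,
                       noise=10.0, random_state=0)

lasso = Lasso(alpha=1.0)  # alpha corresponds to the penalty factor λ
lasso.fit(X, y)
print(lasso.coef_)        # coefficients of uninformative features shrink toward 0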
Bias-Variance Tradeoff in Lasso Regression

◦ The balance between bias (error resulting from overly simple assumptions in the model) and variance (error resulting from sensitivity to small variations in the training data) is known as the bias-variance tradeoff.
◦ In lasso regression, the penalty term (L1 regularization) significantly lowers the variance of the model by shrinking the coefficients of less significant features towards zero. This helps prevent overfitting, in which the model fits noise in the training set rather than the underlying patterns. However, raising the regularization strength to reduce variance may also increase bias, because the model can become overly simplistic and unable to represent the true underlying relationships in the data.
◦ Thus, bias and variance are traded off in lasso regression, just as in other regularization strategies. Achieving the ideal balance usually entails minimizing the total prediction error (MSE) by adjusting the regularization parameter using methods like cross-validation, as in the sketch below.
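To make the tradeoff concrete, here is a small sketch (synthetic data and an arbitrary λ grid, both assumptions) that sweeps a few λ values and compares training and test MSE; very small λ tends to give low training error but higher test error (variance), while very large λ underfits (bias):

from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Illustrative synthetic data: many features, few informative ones.
X, y = make_regression(n_samples=100, n_features=30, n_informative=5,
                       noise=15.0, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for alpha in [0.01, 0.1, 1.0, 10.0, 100.0]:  # alpha plays the role of λ
    model = Lasso(alpha=alpha).fit(X_tr, y_tr)
    train_mse = mean_squared_error(y_tr, model.predict(X_tr))
    test_mse = mean_squared_error(y_te, model.predict(X_te))
    print(f"lambda={alpha:>6}: train MSE={train_mse:10.1f}, test MSE={test_mse:10.1f}")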
Linear Regression Model
◦ LASSO regression starts with the standard linear regression model, which assumes a linear relationship between the independent variables (features) and the dependent variable (target). The linear regression equation can be represented as follows:

y = β₀ + β₁x₁ + β₂x₂ + … + βₚxₚ + ε

◦ Where:
• y is the dependent variable (target).
• β₀, β₁, β₂, …, βₚ are the coefficients (parameters) to be estimated.
• x₁, x₂, …, xₚ are the independent variables (features).
• ε represents the error term.

L1 Regularization
◦ LASSO regression introduces an additional penalty term based on the absolute values of the coefficients. The L1 regularization term is the sum of the absolute values of the coefficients multiplied by a tuning parameter λ:

L₁ = λ * (|β₁| + |β₂| + … + |βₚ|)

◦ Where:
• λ is the regularization parameter that controls the amount of regularization applied.
• β₁, β₂, …, βₚ are the coefficients.
Objective Function
◦ The objective of LASSO regression is to find the values of the coefficients that minimize the sum of the squared differences between the predicted values and the actual values, while also minimizing the L1 regularization term:

minimize RSS + L₁ = Σᵢ (yᵢ − ŷᵢ)² + λ * (|β₁| + |β₂| + … + |βₚ|)

Where:
• RSS is the residual sum of squares, which measures the error between the predicted values and the actual values.
• ŷᵢ is the predicted value for observation i.
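A direct translation of this objective into code may help; the sketch below computes RSS + L₁ by hand for given coefficients (note, as a caveat, that scikit-learn's Lasso minimizes a rescaled version, RSS/(2n) + λ·Σ|βⱼ|, so its alpha is not numerically identical to the λ here):

import numpy as np

def lasso_objective(X, y, beta0, beta, lam):
    """RSS + L1, as defined on this slide."""
    residuals = y - (beta0 + X @ beta)   # actual minus predicted values
    rss = np.sum(residuals ** 2)         # residual sum of squares
    l1 = lam * np.sum(np.abs(beta))      # λ * (|β₁| + ... + |βₚ|)
    return rss + l1

# Example usage with arbitrary, made-up numbers:
X = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
y = np.array([1.0, 2.0, 3.0])
print(lasso_objective(X, y, beta0=0.5, beta=np.array([0.3, 0.0]), lam=1.0))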
Shrinking Coefficients
◦ By adding the L1 regularization term, LASSO regression can shrink the coefficients towards zero. When λ is sufficiently large, some coefficients are driven to exactly zero.
◦ This property of LASSO makes it useful for feature selection, as the variables with zero coefficients are effectively removed from the model.
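The sketch below (synthetic data and an illustrative λ grid, both assumptions) counts how many coefficients are driven to exactly zero as λ grows:

from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso

X, y = make_regression(n_samples=150, n_features=20, n_informative=4,
                       noise=5.0, random_state=1)

for alpha in [0.1, 1.0, 10.0, 50.0]:  # alpha plays the role of λ
    coef = Lasso(alpha=alpha).fit(X, y).coef_
    print(f"lambda={alpha:<5} zero coefficients: {(coef == 0).sum()} of {coef.size}")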
Tuning parameter λ
◦ The choice of the regularization parameter λ is crucial in LASSO regression. A larger λ value increases the amount of regularization, leading to more coefficients being pushed towards zero. Conversely, a smaller λ value reduces the regularization effect, allowing more variables to have non-zero coefficients.
◦ Unlike Ridge Regression, Lasso Regression can force coefficients of less significant features to be exactly zero.
◦ As a result, Lasso Regression performs both regularization and feature selection simultaneously.
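In practice, λ is usually chosen by cross-validation; a minimal sketch with scikit-learn's LassoCV (5 folds and synthetic data, both arbitrary choices for illustration) might look like this:

from sklearn.datasets import make_regression
from sklearn.linear_model import LassoCV

# Synthetic data for illustration only.
X, y = make_regression(n_samples=200, n_features=25, n_informative=5,
                       noise=10.0, random_state=2)

# LassoCV tries a grid of alpha (λ) values and keeps the one with the
# lowest cross-validated error.
cv_model = LassoCV(cv=5, random_state=0).fit(X, y)
print("best lambda:", cv_model.alpha_)
print("non-zero coefficients:", (cv_model.coef_ != 0).sum())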

• Penalty: The L1 penalty λ ∑ⱼ₌₁ᵐ |βⱼ| shrinks some coefficients to zero, which can lead to sparse models.
• Feature Selection: High regularization leads to fewer, more relevant features, making it a good choice when you suspect many features may be irrelevant.
• Trade-Off: A larger λ leads to a simpler model (more coefficients zeroed out), while a smaller λ allows for more complex models.
The lasso procedure encourages simple, sparse models (i.e., models with fewer parameters). This type of regression is well suited to settings with high levels of multicollinearity, or when you want to automate parts of model selection, like variable selection/parameter elimination, as in the toy demonstration below.
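As a toy demonstration of the multicollinearity point (entirely synthetic, not from the slides), lasso given two nearly identical features will typically keep one and zero out the other:

import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + rng.normal(scale=0.01, size=200)  # x2 is almost a copy of x1
X = np.column_stack([x1, x2])
y = 3 * x1 + rng.normal(size=200)

coef = Lasso(alpha=0.1).fit(X, y).coef_
print(coef)  # typically one coefficient near 3, the other exactly 0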
Evaluate the performance of your model

◦ Generally, we might print out a few values to understand model performance, specifically R² and MSE. R² tells us the proportion of variance in our dependent variable (or response variable) which is explained by the independent variables. By comparing MSE values for different values of λ, you can check whether the regularization strength has been tuned effectively.
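A short sketch of such a report (synthetic data and arbitrary λ values, both assumptions) using scikit-learn's r2_score and mean_squared_error:

from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=300, n_features=15, n_informative=5,
                       noise=20.0, random_state=3)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=3)

for alpha in [0.1, 1.0, 10.0]:  # alpha plays the role of λ
    pred = Lasso(alpha=alpha).fit(X_tr, y_tr).predict(X_te)
    print(f"lambda={alpha:<5} R2={r2_score(y_te, pred):.3f} "
          f"MSE={mean_squared_error(y_te, pred):.1f}")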
When to use Lasso Regression

• Feature Selection: By reducing the coefficients of less significant features to zero, lasso regression automatically chooses a subset of features. This is helpful when you have many features and want to find the ones that are most significant.
• Collinearity: Lasso regression can be useful when there is multicollinearity, that is, when the predictor variables have a high degree of correlation with one another, because it shrinks the coefficients of correlated variables and tends to choose one of them.
• Regularization: By penalizing big coefficients, lasso regression can help prevent overfitting. This becomes particularly significant when the number of predictors approaches or surpasses the number of observations.
• Interpretability: Compared to conventional linear regression models that incorporate all features, lasso regression often yields sparse models with fewer non-zero coefficients. This can make the final model simpler to understand.
Advantages of Lasso Regression

• Feature Selection: Lasso regression eliminates the need to manually select the most relevant features, so the resulting regression model is simpler and more explainable.
• Regularization: Lasso constrains large coefficients, producing a lower-variance model that is more robust and general in its predictions.
• Interpretability: Lasso often induces sparse models, which are easier to interpret and explain; this is essential in fields like health care and finance.
• Handles Large Feature Spaces: Lasso lends itself to high-dimensional data, such as that found in genomic and imaging studies.
Disadvantages of Lasso Regression

• Selection Bias: Lasso might arbitrarily choose one variable from a group of highly correlated variables rather than another, which can bias the final model.
• Sensitive to Scale: Features on very different scales affect the L1 penalty unequally, which can harm the model's precision, so features should be standardized before fitting (see the sketch after this list).
• Impact of Outliers: Lasso can be strongly affected by outliers in the data, which can distort the fitted coefficients.
• Model Instability: When there are many correlated variables, lasso's variable selection can be unstable, yielding a different variable subset after even tiny changes in the data.
• Tuning Parameter Selection: Choosing among different λ (alpha) values can be difficult; cross-validation is the usual remedy.
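One common mitigation for the scale sensitivity noted above is to standardize features before fitting; a minimal sketch using a scikit-learn pipeline (synthetic data and an arbitrary λ, both assumptions) follows:

from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_regression(n_samples=200, n_features=10, noise=5.0, random_state=4)
X[:, 0] *= 1000  # put one feature on a wildly different scale

# StandardScaler brings every feature to zero mean and unit variance,
# so the L1 penalty treats all coefficients on a comparable scale.
model = make_pipeline(StandardScaler(), Lasso(alpha=1.0))
model.fit(X, y)
print(model.named_steps["lasso"].coef_)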
