Regression Analysis in Machine Learning - Unit 2 Notes

1. Introduction to Regression
What is Regression?
Regression is a statistical method used in Machine Learning to estimate the relationships between variables. It predicts
a continuous target variable (dependent variable) from one or more predictor variables (independent variables).
Terminologies in Regression:
• Dependent Variable (Target): The variable we aim to predict.
• Independent Variables (Features): The variables used to predict the target.
• Regression Coefficients: Parameters that quantify the relationship between each predictor and the target.
• Residuals: The differences between observed and predicted values.
Applications of Regression:
• Predicting house prices.
• Estimating sales revenue.
• Analyzing the impact of advertising on sales.

2. Types of Regression
1. Linear Regression:
o Predicts the target variable by fitting a linear relationship.
o Equation: y = β0 + β1x + ε.
2. Logistic Regression:
o Used for classification problems.
o Predicts the probability of a binary outcome.
o Uses the sigmoid function to map outputs to probabilities.
3. Polynomial Regression:
o Extends linear regression by adding polynomial terms.
o Captures non-linear relationships.
4. Ridge and Lasso Regression:
o Add regularization penalties (L2 for Ridge, L1 for Lasso) to prevent overfitting.
5. Multiple Linear Regression:
o Extends linear regression to multiple predictors.
o Equation: y = β0 + β1x1 + β2x2 + ... + ε.
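
A concrete comparison helps here. Below is a minimal sketch, assuming NumPy and scikit-learn are available, that fits a plain linear model and a degree-2 polynomial model with ridge regularization to the same synthetic non-linear data; the dataset and the alpha value are illustrative choices, not part of these notes.

import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(100, 1))
# Quadratic target with noise, so a straight line cannot fit it well.
y = 0.5 * X[:, 0] ** 2 - X[:, 0] + rng.normal(scale=0.5, size=100)

linear = LinearRegression().fit(X, y)
poly_ridge = make_pipeline(PolynomialFeatures(degree=2), Ridge(alpha=1.0)).fit(X, y)

print("linear R^2:", linear.score(X, y))          # underfits the curve
print("poly+ridge R^2:", poly_ridge.score(X, y))  # captures the quadratic shape

The ridge penalty (alpha) shrinks the polynomial coefficients, which is what keeps the more flexible model from overfitting noisy data.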

3. Logistic Regression
Overview:
Logistic Regression models the probability of a binary outcome.
Key Concepts:
• Sigmoid Function: Converts raw model scores into probabilities.
o P(y=1 | X) = 1 / (1 + e^(−z)), where z = β0 + β1x.
• Threshold: Determines the predicted class (e.g., predict 1 when the probability exceeds 0.5).
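
A minimal sketch of these two ideas using only NumPy; the coefficients below are made-up illustrative values, not fitted ones.

import numpy as np

def sigmoid(z):
    # Map a real-valued score z to a probability in (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

beta_0, beta_1 = -1.0, 2.0                # illustrative, not estimated
x = np.array([-2.0, 0.0, 0.5, 3.0])
z = beta_0 + beta_1 * x                   # linear score
p = sigmoid(z)                            # P(y=1 | x)
label = (p >= 0.5).astype(int)            # apply the 0.5 threshold
print(p, label)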
Applications:
• Spam detection.
• Disease diagnosis.

4. Simple Linear Regression
Overview:
• Models the relationship between one independent variable and one dependent variable.
• Equation: y = β0 + β1x.
Assumptions:
1. Linear relationship between the variables.
2. Homoscedasticity (constant variance of the residuals).
3. Independence of observations.
4. Normally distributed residuals.
Model Building:
1. Define the dependent and independent variables.
2. Estimate the parameters β0 and β1.
3. Fit the regression line to the data.
Ordinary Least Squares (OLS):
• Minimizes the sum of squared residuals.
• Estimates the coefficients β by minimizing:
o SSE = Σ (yi − ŷi)².
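
For the simple (one-predictor) case, OLS has a closed-form solution: β1 = Σ(xi − x̄)(yi − ȳ) / Σ(xi − x̄)² and β0 = ȳ − β1x̄. A minimal NumPy sketch with illustrative data:

import numpy as np

def ols_fit(x, y):
    # Closed-form OLS estimates for simple linear regression.
    x_bar, y_bar = x.mean(), y.mean()
    beta_1 = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
    beta_0 = y_bar - beta_1 * x_bar
    return beta_0, beta_1

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])   # illustrative data
beta_0, beta_1 = ols_fit(x, y)
residuals = y - (beta_0 + beta_1 * x)
sse = np.sum(residuals ** 2)              # the quantity OLS minimizes
print(beta_0, beta_1, sse)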
Properties of OLS Estimators:
• Unbiased estimators.
• Minimum variance among linear unbiased estimators (the Gauss-Markov theorem).
Interval Estimation:
• Confidence intervals for coefficients and predictions.
• Provides uncertainty bounds around the point estimates.
Residuals:
• Differences between observed and predicted values.
• A residual plot is used to check model fit.

5. Multiple Linear Regression
Overview:
• Extends simple linear regression to multiple predictors.
• Equation: y = β0 + β1x1 + β2x2 + ... + βkxk + ε.
Assumptions:
1. Linear relationship between the predictors and the target.
2. No multicollinearity (predictors are not highly correlated with one another).
3. Homoscedasticity.
4. Independence of errors.
Model Evaluation:
• R-Squared (R²): Proportion of the variance in the target explained by the model.
• Adjusted R-Squared: R² adjusted for the number of predictors; penalizes unnecessary variables.
• Standard Error: Measures the typical size of the prediction errors.
• F-Statistic: Tests the overall significance of the model.
• P-Values: Test the significance of individual predictors.
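
The two R² metrics can be computed by hand. A minimal NumPy sketch, where y is the observed target, y_hat the model's predictions, and k the number of predictors (all values below are illustrative):

import numpy as np

def r_squared(y, y_hat):
    ss_res = np.sum((y - y_hat) ** 2)     # residual sum of squares
    ss_tot = np.sum((y - y.mean()) ** 2)  # total sum of squares
    return 1.0 - ss_res / ss_tot

def adjusted_r_squared(y, y_hat, k):
    # Penalizes R² for the number of predictors k.
    n = len(y)
    r2 = r_squared(y, y_hat)
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)

y = np.array([3.0, 5.0, 7.0, 9.0, 11.0])
y_hat = np.array([2.8, 5.3, 6.9, 9.2, 10.8])  # illustrative predictions
print(r_squared(y, y_hat), adjusted_r_squared(y, y_hat, k=2))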
Interpretation:
• Each coefficient gives the expected change in the target for a one-unit change in its predictor, holding the other predictors constant.
Assessing Fit:
1. R²: Higher values indicate a better fit.
2. Residual analysis: Check residual plots for patterns.

6. Feature Selection and Dimensionality Reduction
Importance:
• Improves model performance.
• Reduces computational complexity.
Techniques:
1. Principal Component Analysis (PCA):
o Converts correlated features into uncorrelated principal components.
o Retains the components that explain the most variance (see the sketch after this list).
o Applications: data visualization, noise reduction.
2. Linear Discriminant Analysis (LDA):
o Finds a linear combination of features that best separates the classes.
o Commonly used in classification.
3. Independent Component Analysis (ICA):
o Decomposes data into statistically independent components.
o Applications: blind signal separation.
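
A minimal PCA sketch, assuming scikit-learn is installed: it projects 4-dimensional data (with two deliberately correlated features) onto the two components that explain the most variance.

import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
X[:, 1] = 0.8 * X[:, 0] + 0.2 * X[:, 1]   # make two features correlated

pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)          # uncorrelated principal components
print(pca.explained_variance_ratio_)      # share of variance per component
print(X_reduced.shape)                    # (200, 2)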

End of Notes
These notes cover regression analysis, including logistic regression, simple and multiple linear regression, and
techniques for feature selection and dimensionality reduction.
