
Assignment 5

Name: Satyajit Shinde


Div: TY AI C Roll No.: 41
PRN: 12211701

Understanding Polynomial Regression


Polynomial regression is a form of regression analysis in which the
relationship between the independent variable x and the dependent
variable y is modelled as an nth-degree polynomial:
y = b0 + b1x + b2x² + … + bnxⁿ + ε. This technique is
particularly useful when the data exhibits a curvilinear relationship that
cannot be well captured by a simple linear regression model.
Key Steps in Polynomial Regression
1. Data Preparation:
● Gather and preprocess the dataset, ensuring that it is clean
and ready for analysis.
● Identify the independent variable(s) (features) and the
dependent variable (target).
2. Choose Polynomial Degree:
● Determine the degree of the polynomial n based on the
nature of the data and the underlying relationship you wish to
model.
3. Feature Transformation:
● Transform the independent variable(s) into polynomial
features. For example, if you have a single feature x, you
would create features such as x², x³, …, xⁿ.
4. Model Fitting:
● Fit a polynomial regression model to the transformed features
using a suitable algorithm (e.g., ordinary least squares).
5. Model Evaluation:
● Evaluate the model's performance using appropriate metrics
(e.g., R-squared, Mean Squared Error) to assess how well it
fits the data.
6. Visualization:
● Plot the original data points along with the fitted polynomial
curve to visually assess how well the model captures the
underlying trend.
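
The six steps above can be sketched end-to-end with scikit-learn. The quadratic dataset below is synthetic and purely illustrative (it is not the assignment's salary data):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures
from sklearn.metrics import r2_score

# 1. Data preparation: synthetic points following y = 2x^2 + 3x + 5 plus noise
rng = np.random.default_rng(42)
X = np.linspace(0, 10, 50).reshape(-1, 1)
y = 2 * X.ravel() ** 2 + 3 * X.ravel() + 5 + rng.normal(0, 4, 50)

# 2-3. Choose a degree and transform features: each x becomes [1, x, x^2]
poly = PolynomialFeatures(degree=2)
X_poly = poly.fit_transform(X)

# 4. Fit an ordinary-least-squares model on the transformed features
model = LinearRegression().fit(X_poly, y)

# 5. Evaluate the fit on the training data
print(f"R-squared: {r2_score(y, model.predict(X_poly)):.3f}")
```

Step 6 (visualization) would then plot the scatter of (X, y) against the fitted curve, exactly as the assignment code does below.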
Benefits of Polynomial Regression
● Captures Non-Linear Relationships: Polynomial regression can
model complex relationships that linear regression cannot, making
it suitable for datasets with non-linear patterns.
● Flexibility: By adjusting the degree of the polynomial, you can
create a wide range of models that can fit various types of data
distributions.
● Improved Predictions: In cases where relationships are inherently
non-linear, polynomial regression can lead to better predictive
performance compared to linear models.
Choosing the Degree of Polynomial
● Low Degree (1-2): Suitable for simple curves; less risk of
overfitting.
● Moderate Degree (3-5): Often provides a good balance between
flexibility and overfitting; commonly used in practice.
● High Degree (>5): Can fit very complex relationships but risks
overfitting and may lead to poor generalization on unseen data.

Problem Statement: Write a program (WAP) to implement Polynomial Regression


Code:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures
from sklearn.metrics import mean_squared_error, mean_absolute_error

# Load the dataset
data = pd.read_csv('Position_Salaries.csv')
X = data.iloc[:, 1:2].values  # Independent variable (Position level)
y = data.iloc[:, 2].values    # Dependent variable (Salary)

# Handle missing values in y (if any)
y = np.nan_to_num(y, nan=np.nanmean(y))

# Linear Regression using scikit-learn
lin_reg = LinearRegression()
lin_reg.fit(X, y)
linear_predictions = lin_reg.predict(X)

# Calculate error metrics for Linear Regression
linear_mse = mean_squared_error(y, linear_predictions)
linear_mae = mean_absolute_error(y, linear_predictions)
print(f"Linear Regression - MSE: {linear_mse:.2f}, MAE: {linear_mae:.2f}")

# Polynomial Regression Degree 2 using scikit-learn
poly_features_2 = PolynomialFeatures(degree=2)
X_poly_2 = poly_features_2.fit_transform(X)
poly_reg_2 = LinearRegression()
poly_reg_2.fit(X_poly_2, y)
poly_predictions_2 = poly_reg_2.predict(X_poly_2)

# Calculate error metrics for Polynomial Regression Degree 2
poly_mse_2 = mean_squared_error(y, poly_predictions_2)
poly_mae_2 = mean_absolute_error(y, poly_predictions_2)
print(f"Polynomial Regression Degree 2 - MSE: {poly_mse_2:.2f}, MAE: {poly_mae_2:.2f}")

# Polynomial Regression Degree 4 using scikit-learn
poly_features_4 = PolynomialFeatures(degree=4)
X_poly_4 = poly_features_4.fit_transform(X)
poly_reg_4 = LinearRegression()
poly_reg_4.fit(X_poly_4, y)
poly_predictions_4 = poly_reg_4.predict(X_poly_4)

# Calculate error metrics for Polynomial Regression Degree 4
poly_mse_4 = mean_squared_error(y, poly_predictions_4)
poly_mae_4 = mean_absolute_error(y, poly_predictions_4)
print(f"Polynomial Regression Degree 4 - MSE: {poly_mse_4:.2f}, MAE: {poly_mae_4:.2f}")

# Visualization for Linear Regression
plt.scatter(X, y, color='red')
plt.plot(X, linear_predictions, color='blue')
plt.title('Truth or Bluff (Linear Regression)')
plt.xlabel('Position level')
plt.ylabel('Salary')
plt.show()

# Visualization for Polynomial Regression Degree 2
X_grid = np.arange(X.min(), X.max(), 0.01).reshape(-1, 1)  # For smoother curve plotting
X_grid_poly_2 = poly_features_2.transform(X_grid)
plt.scatter(X, y, color='red')
plt.plot(X_grid, poly_reg_2.predict(X_grid_poly_2), color='blue')
plt.title('Polynomial Regression Degree 2')
plt.xlabel('Position level')
plt.ylabel('Salary')
plt.show()

# Visualization for Polynomial Regression Degree 4
X_grid_poly_4 = poly_features_4.transform(X_grid)
plt.scatter(X, y, color='red')
plt.plot(X_grid, poly_reg_4.predict(X_grid_poly_4), color='blue')
plt.title('Polynomial Regression Degree 4')
plt.xlabel('Position level')
plt.ylabel('Salary')
plt.show()

Output:
Linear Regression - MSE: 26695878787.88, MAE: 128454.55

Polynomial Regression Degree 2 - MSE: 6758833333.33, MAE: 70218.18

Polynomial Regression Degree 4 - MSE: 210343822.84, MAE: 12681.82