Experiment-1
Objective: To implement Linear Regression in Python
Task 1: Implementing Linear Regression in Python
Code:
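import numpy as np
from sklearn.linear_model import LinearRegression
# Inputs must be 2-D (one column per feature); the response can be 1-D
x = np.array([5,15,25,35,45,55]).reshape((-1,1))
y = np.array([5,20,14,32,22,38])
print(x)
print(y,"\n")
# Fit the model y = w0 + w1*x by ordinary least squares
model = LinearRegression()
model.fit(x,y)
# Coefficient of determination (R^2) on the training data
r_sq = model.score(x,y)
print("coefficient of determination: ",r_sq,"\n")
print("intercept w0: ",model.intercept_)
print("Slope w1: ",model.coef_,"\n")
# With a 2-D response, intercept_ becomes a 1-D array and coef_ a 2-D array
new_model = LinearRegression().fit(x,y.reshape((-1,1)))
print("intercept w0: ",new_model.intercept_)
print("Slope w1: ",new_model.coef_,"\n")
# Predict the response for the training inputs
y_pred = model.predict(x)
print('predicted response: ',y_pred,sep='\n')
print('\n')
# Same prediction computed by hand (the result is a 2-D column here)
y_pred = model.intercept_ + model.coef_ * x
print('predicted response: ',y_pred,sep='\n')
print('\n')
# Predict the response for new inputs 0..5
x_new = np.arange(6).reshape((-1,1))
print(x_new,'\n')
y_new = model.predict(x_new)
print(y_new,'\n')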
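The intercept and slope reported in Task 1 can be cross-checked against the closed-form least-squares formulas. The following is a minimal sketch using the same x and y as the listing above; the variable names w0, w1, and y_hat are chosen here for illustration and are not part of the original code.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Same data as in Task 1
x = np.array([5, 15, 25, 35, 45, 55])
y = np.array([5, 20, 14, 32, 22, 38])

# Closed-form least-squares estimates for y = w0 + w1 * x
x_mean, y_mean = x.mean(), y.mean()
w1 = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
w0 = y_mean - w1 * x_mean

# Coefficient of determination: 1 - SS_res / SS_tot
y_hat = w0 + w1 * x
r_sq = 1 - np.sum((y - y_hat) ** 2) / np.sum((y - y_mean) ** 2)

# Compare with scikit-learn, which needs a 2-D input array
X = x.reshape(-1, 1)
model = LinearRegression().fit(X, y)
print("closed form :", w0, w1, r_sq)
print("scikit-learn:", model.intercept_, model.coef_[0], model.score(X, y))
```

Both lines should print the same intercept, slope, and coefficient of determination up to floating-point rounding.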
Machine Learning 21BEC505
Output:
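# Multiple linear regression: x is expected to be a 2-D array with two feature columns
model = LinearRegression().fit(x,y)
# Coefficient of determination (R^2) on the training data
r_sq = model.score(x,y)
print("coefficient of determination: ",r_sq,"\n")
print("intercept w0: ",model.intercept_)
# Predict the response for the training inputs
y_pred = model.predict(x)
print('predicted response: ',y_pred,sep='\n')
print('\n')
# Same prediction computed by hand: w0 plus the sum over features of w_j * x_j
y_pred = model.intercept_ + np.sum(model.coef_ * x, axis=1)
print('predicted response: ',y_pred,sep='\n')
print('\n')
# Predict for new inputs: five samples with two features each
x_new = np.arange(10).reshape((-1,2))
print(x_new,'\n')
y_new = model.predict(x_new)
print(y_new,'\n')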
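The listing above relies on arrays x and y whose definitions do not appear in this section. As a self-contained illustration of the same steps, the sketch below assumes a small two-feature dataset; the values are hypothetical and chosen only so that the code runs, they are not the data used in the experiment.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical two-feature data, for illustration only (not the report's data)
x = np.array([[1, 2], [2, 1], [3, 4], [4, 3], [5, 6], [6, 5]], dtype=float)
y = np.array([3.0, 4.5, 7.0, 8.5, 11.0, 12.5])

model = LinearRegression().fit(x, y)
print("intercept w0:", model.intercept_)
print("slopes w1, w2:", model.coef_)

# predict() and the explicit formula w0 + w1*x1 + w2*x2 give the same values
y_pred = model.predict(x)
y_manual = model.intercept_ + np.sum(model.coef_ * x, axis=1)
print("max difference:", np.max(np.abs(y_pred - y_manual)))

# New inputs must have the same number of columns as the training inputs
x_new = np.arange(10).reshape((-1, 2))
y_new = model.predict(x_new)
print(y_new)
```

The expression `np.sum(model.coef_ * x, axis=1)` multiplies each column by its coefficient and sums across columns, which is the same as the matrix product `x @ model.coef_`.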
Output:
Task 3: Write a program to predict the salary of an employee using Linear Regression.
Code:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.linear_model import LinearRegression
dataset = pd.read_csv(r'E:\Jay\NIRMA\Sem6\ML\Exp1\Salary_Data.csv')
X = dataset.iloc[:,:-1].values
y = dataset.iloc[:,1].values
model = LinearRegression().fit(X,y)
r_sq = model.score(X,y)
Exercise:
1. Apply Linear Regression to predict the stock market price.
Code:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
dataset = pd.read_csv(r'E:\Jay\NIRMA\Sem6\ML\Exp1\prices-split-adjusted.csv')
y = dataset.iloc[:30,3].values
dataset.drop('close', inplace=True, axis=1)
X = dataset.iloc[:30,2:].values
print('Y: ',y)
print("\n")
print('X: ',X)
print("\n")
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X,y,test_size=1/3,random_state=0)
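The listing stops at the train/test split. A possible continuation is to fit the model on the training split and measure its error on the held-out split. The sketch below assumes synthetic stand-in data generated with NumPy, because the CSV file is not available here; the column meanings (open, low, high, volume) and all values are hypothetical and are not real market data.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the 30 rows used above (NOT real market data)
rng = np.random.default_rng(0)
open_ = 100 + rng.normal(0, 5, 30)
low = open_ - rng.uniform(0.5, 2.0, 30)
high = open_ + rng.uniform(0.5, 2.0, 30)
volume = rng.uniform(1e6, 5e6, 30)
X = np.column_stack([open_, low, high, volume])
y = (low + high) / 2 + rng.normal(0, 0.3, 30)  # synthetic "close" price

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=1/3, random_state=0)

# Fit on the training split only, then evaluate on the unseen test split
model = LinearRegression().fit(X_train, y_train)
y_pred = model.predict(X_test)

mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
print("test MSE:", mse)
print("test R^2:", r2)
```

Reporting the error on the test split rather than the training split gives an estimate of how the model behaves on data it has not seen.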
Output:
2. Create a regression model for an oscillating sinusoidal function corrupted with Gaussian noise of zero
mean and 0.25 variance as the output.
Generate polynomial models for all degrees from 1 through 9 on the training set.
Get the predicted output for each fitted model.
Analyze the error on the training data for each polynomial degree.
Investigate the best-fit model to obtain its coefficients and constant term, and test it on the test data
set.
Evaluate the error on the test set.
Conclude which regression model best represents the corrupted signal.
Identify the underfit and overfit models, if any.
Code:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
x = np.linspace(-5, 5, 1000)
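# np.random.normal takes the standard deviation, so a variance of 0.25 means a scale of sqrt(0.25) = 0.5
y = np.sin(x) + np.random.normal(0, np.sqrt(0.25), len(x))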
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=1/3, random_state=0)
plt.scatter(x_train, y_train, s=5, label='Training data')
plt.scatter(x_test, y_test, s=5, label='Test data')
plt.legend()
plt.show()
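# Fit a polynomial model of each degree from 1 to 9 and record the errors
degrees = list(range(1, 10))
train_errors, test_errors = [], []
for degree in degrees:
    poly = PolynomialFeatures(degree=degree)
    X_train_poly = poly.fit_transform(x_train.reshape(-1, 1))
    X_test_poly = poly.transform(x_test.reshape(-1, 1))
    model = LinearRegression().fit(X_train_poly, y_train)
    train_errors.append(mean_squared_error(y_train, model.predict(X_train_poly)))
    test_errors.append(mean_squared_error(y_test, model.predict(X_test_poly)))
# Plot the training and test errors for each polynomial degree
plt.plot(degrees, train_errors, label='Training error')
plt.plot(degrees, test_errors, label='Test error')
plt.xlabel('Polynomial degree')
plt.ylabel('Mean squared error')
plt.legend()
plt.show()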
Output:
Conclusion:
This experiment showed how to split a dataset into training and test sets for machine learning applications
and how to use linear regression to predict stock prices. Linear regression can be a useful baseline for this
task, but stock prices depend on many factors that a linear model does not capture, so accurate prediction may
require additional machine learning models and techniques.
For the noisy sinusoid, the degree-9 polynomial model gave the lowest error on the test data and was therefore
the best representation of the corrupted signal among the degrees tried. This suggests that the higher-degree
models captured the underlying pattern in the data better than the lower-degree ones. A model underfits when
it is too simple to capture the underlying pattern, which shows up as high error on both the training set and
the test set; the degree-1 model behaves this way here. A model overfits when it is complex enough to fit the
noise in the training data, which shows up as a low training error but a high test error. No overfitting was
observed in this experiment.
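The underfit and overfit criteria described above can be checked programmatically by comparing the training and test error of each degree. The sketch below is one possible way to do this, not the procedure used in the experiment: the helper label_fit, the 1.5x thresholds, and the fixed random seed are assumptions introduced here for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)  # fixed seed (assumption) so the run is repeatable
x = np.linspace(-5, 5, 1000).reshape(-1, 1)
y = np.sin(x).ravel() + rng.normal(0, np.sqrt(0.25), len(x))  # noise variance 0.25
x_tr, x_te, y_tr, y_te = train_test_split(x, y, test_size=1/3, random_state=0)

# results[degree] = (training MSE, test MSE)
results = {}
for degree in range(1, 10):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(x_tr, y_tr)
    results[degree] = (mean_squared_error(y_tr, model.predict(x_tr)),
                       mean_squared_error(y_te, model.predict(x_te)))

# The best model is the one with the lowest test error
best = min(results, key=lambda d: results[d][1])


def label_fit(train_mse, test_mse, best_train_mse, factor=1.5):
    """Heuristic label; the factor is an assumed threshold, not a standard rule."""
    if train_mse > factor * best_train_mse:
        return "underfit (high training and test error)"
    if test_mse > factor * train_mse:
        return "possible overfit (low training error, high test error)"
    return "reasonable fit"


labels = {d: label_fit(tr, te, results[best][0]) for d, (tr, te) in results.items()}
for d, (tr, te) in results.items():
    print(f"degree {d}: train MSE {tr:.3f}, test MSE {te:.3f} -> {labels[d]}")
print("lowest test error at degree", best)
```

Which degree comes out best depends on the random noise and on the split, so the printed labels are specific to the seed used in this sketch.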