Exp 1

The document describes experiments implementing linear regression in Python to predict stock prices and other variables. It shows how to divide datasets into training and test sets, fit linear regression models, calculate errors, and evaluate the performance of polynomial regression models of different degrees on noisy data. The highest degree polynomial model of 9 was found to best represent the corrupted signal, while the first degree model underfit the data, demonstrating the importance of model selection.

Uploaded by

jay
Machine Learning 21BEC505

Experiment-1
Objective: To implement Linear Regression in Python
Task 1: Implementing Linear Regression in Python
Code:
import numpy as np
from sklearn.linear_model import LinearRegression
x = np.array([5,15,25,35,45,55]).reshape((-1,1))
y = np.array([5,20,14,32,22,38])
print(x)
print(y,"\n")

model = LinearRegression()
model.fit(x,y)
r_sq = model.score(x,y)
print("coefficient of determination: ",r_sq,"\n")
print("intercept w0: ",model.intercept_)
print("Slope w1: ",model.coef_,"\n")

new_model = LinearRegression().fit(x,y.reshape((-1,1)))
print("intercept w0: ",new_model.intercept_)
print("Slope w1: ",new_model.coef_,"\n")

y_pred = model.predict(x)
print('predicted response: ',y_pred,sep='\n')
print('\n')
y_pred = model.intercept_ + model.coef_ * x
print('predicted response: ',y_pred,sep='\n')
print('\n')

x_new = np.arange(6).reshape((-1,1))
print(x_new,'\n')
y_new = model.predict(x_new)
print(y_new,'\n')

Output:
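As a cross-check on the Task 1 fit, the intercept, slope, and coefficient of determination can also be computed directly from the closed-form least-squares formulas. The sketch below is an illustration using only NumPy and the Task 1 data; the names w0, w1, ss_res, and ss_tot are chosen here and do not appear in the original code.

```python
import numpy as np

# Same data as Task 1 (x kept one-dimensional for the closed-form formulas)
x = np.array([5, 15, 25, 35, 45, 55], dtype=float)
y = np.array([5, 20, 14, 32, 22, 38], dtype=float)

# Closed-form least-squares estimates for y = w0 + w1 * x
x_mean, y_mean = x.mean(), y.mean()
w1 = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
w0 = y_mean - w1 * x_mean

# Coefficient of determination: R^2 = 1 - SS_res / SS_tot
y_hat = w0 + w1 * x
ss_res = np.sum((y - y_hat) ** 2)
ss_tot = np.sum((y - y_mean) ** 2)
r_sq = 1 - ss_res / ss_tot

print("intercept w0:", w0)
print("slope w1:", w1)
print("coefficient of determination:", r_sq)
```

For this data the slope works out to 945 / 1750 = 0.54 and the intercept to about 5.633, which should agree with model.coef_ and model.intercept_ from the scikit-learn fit.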

Task 2: Multiple Linear Regression With scikit-learn


Code:
import numpy as np
from sklearn.linear_model import LinearRegression
x = np.array([[0,1],[5,1],[15,2],[25,5],[35,11],[45,15],[55,34],[60,35]])
y = np.array([4,5,20,14,32,22,38,43])
print(x,'\n')
print(y,"\n")

model = LinearRegression().fit(x,y)
r_sq = model.score(x,y)
print("coefficient of determination: ",r_sq,"\n")
print("intercept w0: ",model.intercept_)
print("Slope w1: ",model.coef_,"\n")

y_pred = model.predict(x)
print('predicted response: ',y_pred,sep='\n')
print('\n')
y_pred = model.intercept_ + np.sum(model.coef_ * x, axis=1)
print('predicted response: ',y_pred,sep='\n')
print('\n')

x_new = np.arange(10).reshape((-1,2))
print(x_new,'\n')
y_new = model.predict(x_new)
print(y_new,'\n')
Output:
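The manual prediction in Task 2, model.intercept_ + np.sum(model.coef_ * x, axis=1), is the row-by-row form of a matrix product. The following sketch, which assumes only NumPy, solves the same least-squares problem with a design matrix that has a leading column of ones and confirms that the two ways of writing the prediction agree. The names A, w, w0, and coef are illustrative.

```python
import numpy as np

# Same data as Task 2
x = np.array([[0, 1], [5, 1], [15, 2], [25, 5],
              [35, 11], [45, 15], [55, 34], [60, 35]], dtype=float)
y = np.array([4, 5, 20, 14, 32, 22, 38, 43], dtype=float)

# Design matrix: a column of ones (for the intercept) followed by the two inputs
A = np.column_stack([np.ones(len(x)), x])

# Least-squares solution of A @ w ~= y; w[0] is the intercept, w[1:] the slopes
w, _, rank, _ = np.linalg.lstsq(A, y, rcond=None)
w0, coef = w[0], w[1:]

# Two equivalent ways to write the prediction
y_pred_sum = w0 + np.sum(coef * x, axis=1)   # form used in Task 2
y_pred_mat = A @ w                           # matrix form

print("intercept w0:", w0)
print("slopes w1, w2:", coef)
print("predictions agree:", np.allclose(y_pred_sum, y_pred_mat))
```

At the least-squares solution the residual y - A @ w is orthogonal to every column of A, which is a quick way to check that a fit has converged to the right answer.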

Task 3: Write a program to predict the salary of an employee using Linear Regression.
Code:

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

dataset = pd.read_csv(r'E:\Jay\NIRMA\Sem6\ML\Exp1\Salary_Data.csv')
X = dataset.iloc[:,:-1].values
y = dataset.iloc[:,1].values

from sklearn.model_selection import train_test_split


X_train, X_test, y_train, y_test = train_test_split(X,y,test_size=1/3,random_state=0)

from sklearn.linear_model import LinearRegression


regressor = LinearRegression()
regressor.fit(X_train, y_train)
y_pred = regressor.predict(X_test)

plt.scatter(X_train, y_train, color='red')


plt.plot(X_train,regressor.predict(X_train),color='blue')
plt.title('Salary vs Experience(Training set)')
plt.xlabel('Years of Experience')
plt.ylabel('Salary')
plt.show()

plt.scatter(X_test, y_test, color='red')


plt.plot(X_train,regressor.predict(X_train),color='blue')
plt.title('Salary vs Experience(Test set)')
plt.xlabel('Years of Experience')
plt.ylabel('Salary')
plt.show()

model = LinearRegression().fit(X,y)
r_sq = model.score(X,y)
print("coefficient of determination: ",r_sq,"\n")


print("intercept w0: ",model.intercept_)
print("Coef w1: ",model.coef_,"\n")
Output:
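Task 3 reports the coefficient of determination for a model refit on the whole dataset, so the held-out split is used only for plotting. A minimal sketch of scoring the test split instead is given below. Since Salary_Data.csv is not reproduced here, the sketch generates synthetic stand-in data; the seed, the 30 samples, and the salary formula are assumptions made purely for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for Salary_Data.csv (assumed values, not the real file)
rng = np.random.default_rng(0)
years = rng.uniform(1, 10, size=30)
salary = 25000 + 9500 * years + rng.normal(0, 3000, size=30)

X = years.reshape(-1, 1)
y = salary

# Same split settings as Task 3
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=1/3, random_state=0)

# Fit on the training split only, then score on the unseen test split
regressor = LinearRegression().fit(X_train, y_train)
y_pred = regressor.predict(X_test)

test_mse = mean_squared_error(y_test, y_pred)
test_r2 = r2_score(y_test, y_pred)  # same value as regressor.score(X_test, y_test)

print("Test MSE:", test_mse)
print("Test R^2:", test_r2)
```

Scoring on data the model has not seen gives a more honest estimate of how the fit generalizes than the R^2 of a model evaluated on its own training data.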

Exercise:
1. Apply Linear Regression to predict the stock market price.
Code:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

dataset = pd.read_csv(r'E:\Jay\NIRMA\Sem6\ML\Exp1\prices-split-adjusted.csv')
y = dataset.iloc[:30,3].values
dataset.drop('close', inplace=True, axis=1)
X = dataset.iloc[:30,2:].values
print('Y: ',y)
print("\n")
print('X: ',X)
print("\n")
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X,y,test_size=1/3,random_state=0)

from sklearn.linear_model import LinearRegression


regressor = LinearRegression()
regressor.fit(X_train, y_train)
y_pred = regressor.predict(X_test)
np.set_printoptions(precision=2)
print('y_predicted: \n')
print(np.concatenate((y_pred.reshape(len(y_pred),1), y_test.reshape(len(y_test),1)),1))

Output:

2. Create a regression model for an oscillating sinusoidal function corrupted with Gaussian noise of 0 mean and 0.25 variance as the output.
• Generate polynomial models for all degrees from 1 through 9 on the training set data.
• Get the predicted output for each fitted model.
• Analyze the error on the training data for each polynomial degree.
• Investigate the best-fit model to obtain its coefficients and constants, and test it on the test data set.
• Evaluate the error on the test set.
• Conclude which regression model best represents the corrupted signal.
• Identify the underfit and overfit models, if there are any.

Code:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

x = np.linspace(-5, 5, 1000)
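# np.random.normal takes the standard deviation, so a variance of 0.25 means scale = sqrt(0.25) = 0.5
y = np.sin(x) + np.random.normal(0, np.sqrt(0.25), len(x))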
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=1/3, random_state=0)
plt.scatter(x_train, y_train, s=5, label='Training data')
plt.scatter(x_test, y_test, s=5, label='Test data')
plt.legend()
plt.show()

# Generate polynomial models of degrees 1 through 9 and fit to training data
train_errors = []
test_errors = []
degrees = range(1, 10)
for degree in degrees:
    poly = PolynomialFeatures(degree)
    x_poly_train = poly.fit_transform(x_train.reshape(-1, 1))

    # Fit model to training data
    model = LinearRegression()
    model.fit(x_poly_train, y_train)

    # Get predicted outputs for training and test data
    # (the test inputs reuse the transformer fitted on the training inputs)
    y_pred_train = model.predict(x_poly_train)
    y_pred_test = model.predict(poly.transform(x_test.reshape(-1, 1)))

    # Calculate errors on training and test data
    train_error = mean_squared_error(y_train, y_pred_train)
    test_error = mean_squared_error(y_test, y_pred_test)
    train_errors.append(train_error)
    test_errors.append(test_error)

    # Print coefficients and constants for polynomial regression model
    print('Degree:', degree)
    print('Coefficients:', model.coef_)
    print('Constants:', model.intercept_)
    print('Training error:', train_error)
    print('Test error:', test_error)
    print()

# Plot the training and test errors for each polynomial degree
plt.plot(degrees, train_errors, label='Training error')
plt.plot(degrees, test_errors, label='Test error')
plt.legend()
plt.show()

# Find the degree with the lowest test error


best_degree = np.argmin(test_errors) + 1
print('Best fit model has degree:', best_degree)

Output:

Conclusion:
This experiment taught us how to divide a dataset into training and test sets for machine learning
applications and how to fit linear regression models, including one that predicts stock prices. Linear
regression can be a useful baseline for stock prices, but many other factors influence them, so accurate
predictions may require additional machine learning models and strategies. For the noisy sinusoid, the
polynomial model of degree 9 had the lowest test error and was therefore the most accurate representation
of the corrupted signal among the models tried. This suggests that the higher-degree polynomial models
captured the underlying pattern in the data better than the lower-degree ones. A model that is too simple
to capture the underlying pattern is an underfit model, which shows up as high training and test errors.
The first-degree polynomial model has both, so it appears to be underfitting the data in this instance.
A model that is too complex fits the noise in the training data and is an overfit model, which shows up
as a low training error but a high test error. We did not see any overfitting in this case.
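
Overfitting is easier to observe when the training set is small relative to the number of model parameters. The sketch below is a hypothetical variation on Exercise 2, not part of the original experiment: it trains on only 20 noisy samples and compares degrees 1, 5, and 15. The seed, sample sizes, and degree choices are assumptions, and numpy.polynomial.Polynomial.fit is swapped in for PolynomialFeatures with LinearRegression because it rescales the inputs, which keeps the high-degree fit numerically stable.

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(0)  # fixed seed (assumption) so runs are repeatable

def noisy_sine(n):
    """Sample n points of sin(x) on [-5, 5] with Gaussian noise of variance 0.25."""
    x = rng.uniform(-5, 5, n)
    y = np.sin(x) + rng.normal(0, np.sqrt(0.25), n)
    return x, y

def mse(y_true, y_pred):
    """Mean squared error."""
    return np.mean((y_true - y_pred) ** 2)

x_train, y_train = noisy_sine(20)    # deliberately small training set
x_test, y_test = noisy_sine(500)     # larger test set drawn from the same process

train_errors, test_errors = {}, {}
for degree in (1, 5, 15):
    # Least-squares polynomial fit on the training data only
    p = Polynomial.fit(x_train, y_train, degree)
    train_errors[degree] = mse(y_train, p(x_train))
    test_errors[degree] = mse(y_test, p(x_test))
    print(f"degree {degree:2d}: train MSE = {train_errors[degree]:.4f}, "
          f"test MSE = {test_errors[degree]:.4f}")
```

Because each higher-degree model contains the lower-degree ones, the training error cannot increase with degree. The test error has no such guarantee, and a degree-15 fit to 20 points is expected to show a large gap between the two, which is the signature of overfitting.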
