Simple Linear Regression in Machine Learning

Simple Linear Regression is a statistical method that models the relationship between a dependent variable and a single independent variable using a linear equation. The process involves data pre-processing, fitting the model to training data, making predictions on test data, and visualizing results to assess model performance. The implementation in Python includes using libraries like NumPy, Matplotlib, and Scikit-learn to facilitate data handling and visualization.


Simple Linear Regression is a type of regression algorithm that models the relationship
between a dependent variable and a single independent variable. The relationship shown by a
Simple Linear Regression model is linear, that is, a sloped straight line, hence the name
Simple Linear Regression.
The key point in Simple Linear Regression is that the dependent variable must be a
continuous/real value. The independent variable, however, can be either continuous or
categorical.
The Simple Linear Regression algorithm has two main objectives:
o Model the relationship between two variables, such as the relationship between
income and expenditure, or between experience and salary.
o Forecast new observations, such as the weather according to temperature, or the
revenue of a company according to its investments in a year.
Simple Linear Regression Model:
The Simple Linear Regression model can be represented using the equation below:
y = a0 + a1*x + ε
Where,
a0 = the intercept of the regression line (the value of y when x = 0)
a1 = the slope of the regression line, which tells whether the line is increasing or
decreasing, and by how much y changes for each unit change in x
ε = the error term, the part of y that the line does not explain (for a well-fitting model it is small)
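The equation above does not say how a0 and a1 are found. A minimal sketch, assuming they are estimated by ordinary least squares (the method scikit-learn's LinearRegression uses), is shown below. The toy numbers are made up for illustration and chosen so that the error term is exactly zero.

```python
import numpy as np

# Hypothetical toy data: x could be years of experience, y a salary figure.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([3.0, 5.0, 7.0, 9.0, 11.0])  # generated as y = 1 + 2x, no noise

# Least-squares estimates of the slope (a1) and the intercept (a0).
x_mean, y_mean = x.mean(), y.mean()
a1 = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
a0 = y_mean - a1 * x_mean

# The residuals are the per-observation estimates of the error term.
residuals = y - (a0 + a1 * x)

print("intercept a0:", a0)
print("slope a1:", a1)
print("residuals:", residuals)
```

With noisy real data the residuals would not be zero; the least-squares line is the one that makes the sum of their squares as small as possible.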
Implementation of Simple Linear Regression Algorithm using Python
Problem Statement example for Simple Linear Regression:
Here we are taking a dataset that has two variables: salary (dependent variable) and experience
(independent variable). The goals of this problem are:
o To find out whether there is any correlation between these two variables.
o To find the best-fit line for the dataset.
o To see how the dependent variable changes as the independent variable changes.
In this section, we will create a Simple Linear Regression model to find out the best fitting line
for representing the relationship between these two variables.
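Before fitting a line, the first goal (checking for correlation) can be sketched with NumPy's corrcoef function. The experience and salary values below are invented for illustration and are not taken from the tutorial's dataset.

```python
import numpy as np

# Hypothetical sample: years of experience and salary (made-up numbers).
experience = np.array([1.1, 2.0, 3.2, 4.5, 5.9, 7.1, 8.7, 10.3])
salary = np.array([39000, 43500, 54000, 61000, 81000, 98000, 109000, 122000])

# np.corrcoef returns a 2x2 correlation matrix; the off-diagonal entry
# is the Pearson correlation coefficient between the two variables.
r = np.corrcoef(experience, salary)[0, 1]

print(f"Pearson correlation: {r:.3f}")
```

A value close to +1 or -1 suggests that a straight line is a reasonable model; a value near 0 suggests it is not.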
To implement the Simple Linear regression model in machine learning using Python, we need
to follow the below steps:
Step-1: Data Pre-processing
The first step for creating the Simple Linear Regression model is data pre-processing. We have
already done it earlier in this tutorial, but there are some changes, which are given in the
steps below:
o First, we will import the three libraries that help us load the dataset, plot the graphs,
and work with numerical arrays.
    import numpy as nm                 # numerical arrays
    import matplotlib.pyplot as mtp    # plotting
    import pandas as pd                # loading and handling the dataset
o Next, we will load the dataset into our code:
    data_set = pd.read_csv('Salary_Data.csv')
After executing the above line of code (Ctrl+Enter in the Spyder IDE), we can view the dataset
by opening data_set in the variable explorer.

The dataset has two variables: Salary and Experience.
Note: In Spyder IDE, the folder containing the code file must be saved as a working directory,
and the dataset or csv file should be in the same folder.
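Outside an IDE, the loaded dataset can also be inspected from code. The sketch below reads a small in-memory stand-in for the CSV file; the column names YearsExperience and Salary and all the values are assumptions made for illustration, not the contents of Salary_Data.csv.

```python
import io
import pandas as pd

# Stand-in for the CSV file; column names and values are hypothetical.
csv_text = """YearsExperience,Salary
1.0,40000
2.0,45000
3.0,52000
4.0,58000
"""

data_set = pd.read_csv(io.StringIO(csv_text))

print(data_set.shape)    # number of rows and columns
print(data_set.head())   # first few rows
print(data_set.dtypes)   # data type of each column
```

With the real file, the same three calls work after pd.read_csv('Salary_Data.csv').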
o After that, we need to extract the dependent and independent variables from the given
dataset. The independent variable is years of experience, and the dependent variable is
salary. The code for it is below:
    x = data_set.iloc[:, :-1].values   # every column except the last
    y = data_set.iloc[:, 1].values     # the second column
In the above lines of code, the column slice :-1 used for the x variable selects every column
except the last one, which leaves only the experience column. For the y variable, the column
index 1 selects the second column (salary), since indexing starts from zero.
After executing the above lines of code, the x (independent) and y (dependent) variables
extracted from the dataset appear in the variable explorer.
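The two iloc expressions return arrays of different shapes, which matters because scikit-learn expects the features as a 2-D array. A small sketch on a hypothetical two-column frame (column names and values are made up):

```python
import pandas as pd

# Hypothetical two-column frame standing in for the salary dataset.
data_set = pd.DataFrame({
    "YearsExperience": [1.0, 2.0, 3.0, 4.0],
    "Salary": [40000, 45000, 52000, 58000],
})

x = data_set.iloc[:, :-1].values   # slice of columns -> 2-D array
y = data_set.iloc[:, 1].values     # single column index -> 1-D array

print(x.shape)  # (4, 1)
print(y.shape)  # (4,)
```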
o Next, we will split both variables into a training set and a test set. We have 30
observations, so we will take 20 observations for the training set and 10 observations
for the test set. We split the dataset so that we can train the model on one part of
the data and then test it on data it has not seen. The code for this is given
below:
    # Splitting the dataset into training and test set.
    from sklearn.model_selection import train_test_split
    x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=1/3, random_state=0)
By executing the above code, we get the x_train, x_test, y_train and y_test arrays, which
can be viewed in the variable explorer.
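The 20/10 split can be checked from code rather than by eye. The sketch below uses 30 made-up observations in place of the real dataset and the same test_size and random_state as above.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# 30 made-up observations, matching the dataset size used in this tutorial.
x = np.arange(30, dtype=float).reshape(-1, 1)
y = 2.0 * x.ravel() + 1.0

x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=1/3, random_state=0
)

print("training rows:", len(x_train))
print("test rows:", len(x_test))

# A fixed random_state makes the shuffle reproducible between runs.
_, x_test_again, _, _ = train_test_split(x, y, test_size=1/3, random_state=0)
print("same test rows on a second call:", np.array_equal(x_test, x_test_again))
```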
o For Simple Linear Regression, we will not use feature scaling. An ordinary least squares
fit on a single feature gives the same predictions whether or not the feature is rescaled,
so we don't need to perform it here. Now our dataset is prepared, and we can start
building a Simple Linear Regression model for the given problem.
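One way to see that feature scaling is not needed here is to fit the same model on a raw and on a standardized copy of the feature and compare the predictions. This is a minimal sketch on synthetic data; the numbers used to generate it are arbitrary assumptions.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data: made-up years of experience and salaries with some noise.
rng = np.random.default_rng(0)
x = rng.uniform(1, 10, size=(30, 1))
y = 25000 + 9000 * x.ravel() + rng.normal(0, 3000, size=30)

# Standardized copy of the same feature (zero mean, unit variance).
x_scaled = (x - x.mean()) / x.std()

pred_raw = LinearRegression().fit(x, y).predict(x)
pred_scaled = LinearRegression().fit(x_scaled, y).predict(x_scaled)

# The slope and intercept differ between the two fits, but the predictions agree.
print(np.allclose(pred_raw, pred_scaled))
```

Scaling does matter for other methods, for example regularized or gradient-based models, which is why it appears as a standard pre-processing step elsewhere.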
Step-2: Fitting the Simple Linear Regression to the Training Set:
The second step is to fit our model to the training dataset. To do so, we will import
the LinearRegression class from the linear_model module of scikit-learn. After importing
the class, we create an object of the class named regressor. The code for this
is given below:
    # Fitting the Simple Linear Regression model to the training dataset
    from sklearn.linear_model import LinearRegression
    regressor = LinearRegression()
    regressor.fit(x_train, y_train)
In the above code, we have used the fit() method to fit our Simple Linear Regression object to
the training set. We pass x_train and y_train to fit(), which are the training data for the
independent and the dependent variable respectively. Fitting the regressor object to the
training set lets the model learn the relationship between the predictor and target
variables. After executing the above lines of code, we will get the below output.
Output:
Out[7]: LinearRegression(copy_X=True, fit_intercept=True, n_jobs=None, normalize=False)
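After fitting, the estimated intercept and slope (a0 and a1 in the model equation) can be read from the intercept_ and coef_ attributes of the fitted object. The sketch below uses four made-up, noise-free points so the expected values are easy to check by hand.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up training data that lies exactly on the line y = 1 + 2x.
x_train = np.array([[1.0], [2.0], [3.0], [4.0]])
y_train = np.array([3.0, 5.0, 7.0, 9.0])

regressor = LinearRegression()
regressor.fit(x_train, y_train)

a0 = regressor.intercept_   # intercept of the fitted line
a1 = regressor.coef_[0]     # slope for the single feature

print("intercept a0:", a0)
print("slope a1:", a1)
```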
Step-3: Prediction of Test Set Result:
Our model has now been trained on the dependent variable (salary) and the independent variable
(experience), so it is ready to predict the output for new observations. In this step, we will
provide the test dataset (new observations) to the model to check whether it can predict the
correct output or not.
We will create two prediction vectors, y_pred and x_pred, which will contain the predictions
for the test dataset and for the training set respectively.
    # Prediction of Test and Training set result
    y_pred = regressor.predict(x_test)    # predicted salaries for the test set
    x_pred = regressor.predict(x_train)   # predicted salaries for the training set
On executing the above lines of code, two variables named y_pred and x_pred appear in
the variable explorer; they contain the salary predictions for the test set and the training
set respectively.
You can inspect both variables in the variable explorer of the IDE and compare the values in
y_pred with those in y_test. Comparing these values shows how well our model is performing.
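The comparison between y_pred and y_test can also be summarized with numbers. The sketch below, on synthetic data generated with arbitrary assumed parameters, uses two metrics from sklearn.metrics: the mean absolute error (in the same units as salary) and the R² score.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the salary dataset (all numbers are made up).
rng = np.random.default_rng(0)
x = rng.uniform(1, 10, size=(30, 1))
y = 25000 + 9000 * x.ravel() + rng.normal(0, 3000, size=30)

x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=1/3, random_state=0
)
regressor = LinearRegression().fit(x_train, y_train)
y_pred = regressor.predict(x_test)

mae = mean_absolute_error(y_test, y_pred)   # average size of the prediction error
r2 = r2_score(y_test, y_pred)               # 1.0 would be a perfect fit

print(f"mean absolute error: {mae:.0f}")
print(f"R^2 on the test set: {r2:.3f}")
```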
Step-4: Visualizing the Training Set Results:
In this step, we will visualize the training set result. To do so, we will use the scatter()
function of the pyplot library, which we have already imported in the pre-processing step.
The scatter() function creates a scatter plot of the observations.
On the x-axis we will plot the years of experience of employees, and on the y-axis their
salary. To the function we pass the real values of the training set, that is, the years of
experience x_train and the salaries y_train, together with the color of the observations. Here
we use green for the observations, but it can be any color.
Next, we need to plot the regression line, for which we will use the plot() function of the
pyplot library. To this function we pass the years of experience for the training set x_train,
the predicted salary for the training set x_pred, and the color of the line.
Then we give the plot a title using the title() function of the pyplot library, passing the
name "Salary vs Experience (Training Dataset)".
After that, we assign labels to the x-axis and y-axis using the xlabel() and ylabel() functions.
Finally, we display the graph using show(). The code is given below:
    mtp.scatter(x_train, y_train, color="green")   # actual training observations
    mtp.plot(x_train, x_pred, color="red")         # fitted regression line
    mtp.title("Salary vs Experience (Training Dataset)")
    mtp.xlabel("Years of Experience")
    mtp.ylabel("Salary(In Rupees)")
    mtp.show()
Output:
Executing the above lines of code produces a scatter plot of the training observations with the
fitted regression line drawn through them.

In the plot, the actual observations appear as green dots and the predicted values lie on
the red regression line. The regression line shows the relationship between the
dependent and independent variables.
The goodness of fit can be measured by calculating the differences between the actual values
and the predicted values. In the plot, most of the observations are close
to the regression line, hence our model fits the training set well.
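The difference between actual and predicted values mentioned above can be computed directly. The sketch below uses synthetic training data generated with arbitrary assumed parameters; x_pred follows the naming used in this tutorial for the training-set predictions.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic training data (made-up experience and salary values).
rng = np.random.default_rng(1)
x_train = rng.uniform(1, 10, size=(20, 1))
y_train = 25000 + 9000 * x_train.ravel() + rng.normal(0, 3000, size=20)

regressor = LinearRegression().fit(x_train, y_train)
x_pred = regressor.predict(x_train)   # predictions for the training set

# Residual = actual value minus predicted value, one per observation.
residuals = y_train - x_pred

# Root mean squared error: the typical distance from the line, in salary units.
rmse = np.sqrt(np.mean(residuals ** 2))

print(f"mean residual: {residuals.mean():.6f}")
print(f"RMSE: {rmse:.0f}")
```

For a least-squares line with an intercept, the training residuals average out to zero (up to floating-point error), so the RMSE is the more informative number.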
Step-5: Visualizing the Test Set Results:
In the previous step, we visualized the performance of our model on the training set. Now
we will do the same for the test set. The code remains the same as above, except that the
scatter plot uses x_test and y_test instead of x_train and y_train. The regression line is
still drawn from x_train and x_pred, because the fitted line is the same in both plots.
Here we also change the color of the observations to differentiate between
the two plots, but this is optional.
    # Visualizing the Test set results
    mtp.scatter(x_test, y_test, color="blue")   # actual test observations
    mtp.plot(x_train, x_pred, color="red")      # same fitted regression line as before
    mtp.title("Salary vs Experience (Test Dataset)")
    mtp.xlabel("Years of Experience")
    mtp.ylabel("Salary(In Rupees)")
    mtp.show()
Output:
Executing the above lines of code produces a scatter plot of the test observations with the
same regression line.

In the plot, the observations are shown in blue, and the predictions are given by the
red regression line. Most of the observations are close to the regression line,
hence we can say that our Simple Linear Regression model is a good one and is able to make
good predictions.
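The two plots from Step 4 and Step 5 can also be drawn side by side in one figure. The following is a sketch on synthetic data, not the tutorial's dataset. It evaluates the fitted line on an evenly spaced grid so the line spans both panels, uses the non-interactive Agg backend so it runs without a display, and saves to a hypothetical file name.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; works without a display
import matplotlib.pyplot as mtp
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the salary dataset (all numbers are made up).
rng = np.random.default_rng(0)
x = rng.uniform(1, 10, size=(30, 1))
y = 25000 + 9000 * x.ravel() + rng.normal(0, 3000, size=30)
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=1/3, random_state=0
)

regressor = LinearRegression().fit(x_train, y_train)

# Evaluate the fitted line on a grid covering the full range of x.
x_line = np.linspace(x.min(), x.max(), 50).reshape(-1, 1)
y_line = regressor.predict(x_line)

fig, axes = mtp.subplots(1, 2, figsize=(10, 4), sharey=True)
panels = [
    (axes[0], x_train, y_train, "green", "Training Dataset"),
    (axes[1], x_test, y_test, "blue", "Test Dataset"),
]
for ax, xs, ys, color, name in panels:
    ax.scatter(xs, ys, color=color)        # actual observations
    ax.plot(x_line, y_line, color="red")   # same fitted line in both panels
    ax.set_title(f"Salary vs Experience ({name})")
    ax.set_xlabel("Years of Experience")
axes[0].set_ylabel("Salary")

fig.savefig("salary_vs_experience.png")  # hypothetical output file name
```

Sharing the y-axis makes it easier to judge by eye whether the test observations sit as close to the line as the training observations do.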
