R Unit 4th and 5th

Simple linear regression allows modeling of the relationship between one independent variable and one continuous dependent variable. It finds the linear function that best predicts the dependent variable from the independent variable. The equation is y = a + bx, where y is the predicted value, a is the y-intercept, x is the independent variable, and b is the slope. Multiple linear regression generalizes this to model the relationship between a single dependent variable and multiple independent variables. Random forest is an ensemble machine learning method that utilizes multiple decision trees and aggregates their predictions to improve accuracy.

Uploaded by

Arshad Beg

R | Simple Linear Regression

Linear Regression: It is a commonly used type of predictive analysis. It is a
statistical approach for modelling the relationship between a dependent variable
and a given set of independent variables.
There are two types of linear regression.
• Simple Linear Regression
• Multiple Linear Regression

Simple Linear Regression:

It is a statistical method that allows us to summarize and study the relationship
between two continuous (quantitative) variables. One variable, denoted x, is
regarded as the independent variable and the other, denoted y, is regarded as the
dependent variable. It is assumed that the two variables are linearly related.
Hence, we try to find a linear function that predicts the response value (y) as
accurately as possible as a function of the feature or independent variable (x).
To understand the concept, consider a salary dataset that gives the value of the
dependent variable (salary) for each value of the independent variable (years of
experience).

Salary dataset:

Years experienced    Salary
1.1                  39343.00
1.3                  46205.00
1.5                  37731.00
2.0                  43525.00
2.2                  39891.00
2.9                  56642.00
3.0                  60150.00
3.2                  54445.00
3.2                  64445.00
3.7                  57189.00

Scatter plot of given dataset: (figure not reproduced)

Now, we have to find a line that fits the scatter plot, through which we can
predict the response y for any value of x.
The line which best fits is called the regression line.

The equation of regression line is given by:

y = a + bx
where y is the predicted response value, a is the y-intercept, x is the feature value
and b is the slope.
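The intercept a and slope b can also be computed by hand. The following is a minimal sketch, assuming the best-fitting line is the ordinary least-squares line (which is what lm() fits); it uses the salary table above, and the variable names years, salary and fit are chosen here for illustration.

```r
# Salary dataset from the table above.
years  <- c(1.1, 1.3, 1.5, 2.0, 2.2, 2.9, 3.0, 3.2, 3.2, 3.7)
salary <- c(39343, 46205, 37731, 43525, 39891,
            56642, 60150, 54445, 64445, 57189)

# Least-squares estimates for y = a + b*x.
b <- cov(years, salary) / var(years)   # slope
a <- mean(salary) - b * mean(years)    # y-intercept

# Cross-check against the coefficients estimated by lm().
fit <- lm(salary ~ years)
print(c(intercept = a, slope = b))
print(coef(fit))
```

Both print statements should show the same two numbers, since lm() solves the same least-squares problem.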
The basic syntax for a regression analysis in R is
lm.r = lm(Y ~ model)
coef(lm.r)
where Y is the object containing the dependent variable to be predicted and model
is the formula for the chosen mathematical model.
Printing the object returned by lm( ) shows only the model's coefficients. Further
statistical information, such as standard errors and R-squared, is available by
calling summary( ) on that object.

x <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)

# Apply the lm() function.

relation <- lm(y~x)

print(relation)

When we execute the above code, it produces the following result −

Call:
lm(formula = y ~ x)

Coefficients:
(Intercept)            x
   -38.4551       0.6746
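Once the model is fitted it can be inspected and used for prediction. The sketch below reuses the x and y vectors from the example above (heights in cm and weights in kg, going by the axis labels used later in these notes); the value 170 and the name new_person are assumptions made for illustration.

```r
# Heights (cm) and weights (kg) from the example above.
x <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)
relation <- lm(y ~ x)

# Full statistical summary: standard errors, t-values, R-squared and so on.
print(summary(relation))

# Predict the weight of a person who is 170 cm tall.
# The column name must match the predictor name used in the formula.
new_person <- data.frame(x = 170)
predicted_weight <- predict(relation, newdata = new_person)
print(predicted_weight)
```

With the coefficients printed above, the prediction is -38.4551 + 0.6746 × 170, which is about 76.2 kg.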

Visualize the Regression Graphically

# Create the predictor (height in cm) and response (weight in kg) variables.
x <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)
relation <- lm(y~x)

# Give the chart file a name.
png(file = "linearregression.png")

# Plot the chart with the predictor on the x-axis and the response on the y-axis.
plot(x,y,col = "blue",main = "Height & Weight Regression",
cex = 1.3,pch = 16,xlab = "Height in cm",ylab = "Weight in Kg")

# Add the fitted regression line. abline() is called after plot(),
# not passed to it as an argument.
abline(relation)

# Save the file.
dev.off()

Multiple Linear Regression :

It is the most common form of Linear Regression. Multiple Linear Regression
basically describes how a single response variable Y depends linearly on a number
of predictor variables.
The basic examples where Multiple Regression can be used are as follows:

1. The selling price of a house can depend on the desirability of the location,
the number of bedrooms, the number of bathrooms, the year the house
was built, the square footage of the lot, and a number of other factors.
2. The height of a child can depend on the height of the mother, the height of
the father, nutrition, and environmental factors.

The general mathematical equation for multiple regression is −

y = a + b1x1 + b2x2 + ... + bnxn
Following is the description of the parameters used −
• y is the response variable.
• a, b1, b2...bn are the coefficients.
• x1, x2, ...xn are the predictor variables.

Syntax
The basic syntax for lm() function in multiple regression is −
lm(y ~ x1+x2+x3...,data)

Input data (first rows of the mpg, disp, hp and wt columns of the built-in mtcars
data set):

                   mpg disp  hp    wt
Mazda RX4         21.0  160 110 2.620
Mazda RX4 Wag     21.0  160 110 2.875
Datsun 710        22.8  108  93 2.320
Hornet 4 Drive    21.4  258 110 3.215
Hornet Sportabout 18.7  360 175 3.440
Valiant           18.1  225 105 3.460

Create Relationship Model & get the Coefficients

input <- mtcars[,c("mpg","disp","hp","wt")]

# Create the relationship model.

model <- lm(mpg~disp+hp+wt, data = input)

# Show the model.

print(model)

# Get the Intercept and coefficients as vector elements.

cat("# # # # The Coefficient Values # # # ","\n")

a <- coef(model)[1]
print(a)

Xdisp <- coef(model)[2]

Xhp <- coef(model)[3]
Xwt <- coef(model)[4]

print(Xdisp)
print(Xhp)
print(Xwt)
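The extracted coefficients can be plugged into the equation y = a + b1x1 + b2x2 + b3x3 to predict mileage for a new car. The sketch below is one way to do this; the car's values (disp = 221, hp = 102, wt = 2.91) and the names new_car, manual and auto are hypothetical, chosen only for illustration.

```r
input <- mtcars[, c("mpg", "disp", "hp", "wt")]
model <- lm(mpg ~ disp + hp + wt, data = input)

# Hypothetical new car: displacement 221, horsepower 102, weight 2.91.
new_car <- data.frame(disp = 221, hp = 102, wt = 2.91)

# Manual evaluation of y = a + b1*x1 + b2*x2 + b3*x3.
b <- coef(model)
manual <- b["(Intercept)"] + b["disp"] * new_car$disp +
  b["hp"] * new_car$hp + b["wt"] * new_car$wt

# The same prediction through predict().
auto <- predict(model, newdata = new_car)

print(unname(manual))
print(unname(auto))
```

The two printed values should agree, because predict() evaluates the same linear equation.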

Stepwise Regression in R :
The stepwise regression (or stepwise selection) consists of iteratively adding and removing predictors, in
the predictive model, in order to find the subset of variables in the data set resulting in the best performing
model, that is a model that lowers prediction error.
There are three strategies of stepwise regression (James et al. 2014; P. Bruce and Bruce 2017):
1. Forward selection, which starts with no predictors in the model, iteratively adds the most
contributive predictors, and stops when the improvement is no longer statistically significant.
2. Backward selection (or backward elimination), which starts with all predictors in the model
(full model), iteratively removes the least contributive predictors, and stops when you have a
model where all predictors are statistically significant.
3. Stepwise selection (or sequential replacement), which is a combination of forward and
backward selections. You start with no predictors, then sequentially add the most contributive
predictors (like forward selection). After adding each new variable, remove any variables that no
longer provide an improvement in the model fit (like backward selection).
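The three strategies can be tried with the step() function from the stats package. This is a hedged sketch rather than the method the cited authors use: step() compares models by AIC instead of by the significance tests described above, and the mtcars columns and object names here are assumptions made for illustration.

```r
input <- mtcars[, c("mpg", "disp", "hp", "wt")]

# Smallest candidate model (intercept only) and largest candidate model.
null_model <- lm(mpg ~ 1, data = input)
full_model <- lm(mpg ~ disp + hp + wt, data = input)
search_scope <- list(lower = ~ 1, upper = ~ disp + hp + wt)

# 1. Forward selection: start with no predictors and add them one at a time.
forward <- step(null_model, scope = search_scope,
                direction = "forward", trace = 0)

# 2. Backward elimination: start with the full model and drop predictors.
backward <- step(full_model, direction = "backward", trace = 0)

# 3. Stepwise selection: predictors may be added or dropped at each step.
both <- step(null_model, scope = search_scope,
             direction = "both", trace = 0)

print(formula(forward))
print(formula(backward))
print(formula(both))
```

Setting trace = 0 hides the step-by-step log; remove it to see which predictor is added or dropped at each iteration.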

Decision Tree Classifiers in R Programming

A decision tree is a graph that represents choices and their results in the form of a tree. The nodes in the
graph represent an event or choice and the edges of the graph represent the decision rules or conditions.
Decision trees are mostly used in machine learning and data mining applications using R.

Examples of the use of decision trees include predicting whether an email is spam or not spam, predicting
whether a tumor is cancerous, and predicting whether a loan is a good or bad credit risk, based on the
factors in each of these.

A decision tree is a flowchart-like tree structure in which each internal node
represents a feature (or attribute), each branch represents a decision rule, and each
leaf node represents the outcome. A decision tree consists of:
• Nodes: Test the value of a certain attribute.
• Edges/Branches: Represent a decision rule and connect to the next node.
• Leaf nodes: Terminal nodes that represent class labels or a class
distribution.

Syntax
The basic syntax for creating a decision tree in R is −
ctree(formula, data)
Following is the description of the parameters used −
• formula is a formula describing the predictor and response variables.
• data is the name of the data set used.

This algorithm can easily be implemented in the R language. Some important
points about decision tree classifiers are:
• They are easy to interpret
• They automatically handle decision-making
• They bisect the feature space into smaller regions
• They are prone to overfitting
• They can be trained on a small training set
• They are strongly affected by noise
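To illustrate how a tree bisects the space, the sketch below searches for the best threshold on a single feature using Gini impurity. This is an assumption made for illustration: it is the CART-style criterion, not the conditional inference tests that ctree() applies, and the toy age and label vectors are hypothetical.

```r
# Gini impurity of a vector of class labels: 1 - sum(p_k^2).
gini <- function(labels) {
  p <- table(labels) / length(labels)
  1 - sum(p^2)
}

# Weighted impurity of the two groups produced by splitting at `threshold`.
split_impurity <- function(x, labels, threshold) {
  left  <- labels[x <= threshold]
  right <- labels[x >  threshold]
  (length(left) * gini(left) + length(right) * gini(right)) / length(labels)
}

# Toy data (hypothetical): one numeric feature and a two-class label.
age   <- c(5, 6, 7, 8, 9, 10, 11, 12)
label <- c("no", "no", "no", "no", "yes", "yes", "yes", "yes")

# Candidate thresholds: midpoints between consecutive sorted values.
sorted     <- sort(unique(age))
candidates <- (head(sorted, -1) + tail(sorted, -1)) / 2

# Pick the threshold with the lowest weighted impurity.
impurities <- sapply(candidates, function(t) split_impurity(age, label, t))
best <- candidates[which.min(impurities)]
print(best)  # 8.5
```

A full tree repeats this search over every feature and then recurses into each of the two resulting regions, which is also why an unpruned tree can overfit noisy data.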

Example

The first rows of the readingSkills data set, which ships with the party package,
are:

  nativeSpeaker age shoeSize    score
1           yes   5 24.83189 32.29385
2           yes   6 25.95238 36.63105
3            no  11 30.42170 49.60593
4           yes   7 28.66450 40.28456
5           yes  11 31.88207 55.46085
6           yes  10 30.07843 52.83124

We will use the ctree() function to create the decision tree and see its graph.

# Load the party package. It will automatically load other

# dependent packages.
library(party)

# Create the input data frame.

input.dat <- readingSkills[c(1:105),]

# Give the chart file a name.

png(file = "decision_tree.png")

# Create the tree.

output.tree <- ctree(
nativeSpeaker ~ age + shoeSize + score,
data = input.dat)

# Plot the tree.

plot(output.tree)

# Save the file.

dev.off()

Random Forest
Random forest is a machine learning algorithm that combines a collection of decision
trees to give more flexible and more accurate predictions. It usually outperforms a
single decision tree, because a single tree tends to give poorer accuracy. In simple
words, the random forest approach improves the performance of decision trees. It is a
popular algorithm because it can be used for both classification and regression. Being
a supervised learning algorithm, random forest applies the bagging method to decision
trees and, as a result, increases the accuracy of the learning model.
Random forest searches for the best feature within a random subset of the features,
which adds randomness to the model and generally results in a better, more accurate
model. Let us learn about the random forest approach with an example. Suppose a
man named Bob wants to buy a T-shirt from a store. The salesman first asks him
about his favourite colour. This constitutes a decision tree based on the colour feature.
Further, the salesman asks more about the T-shirt, such as size, type of fabric, type of
collar and so on. More criteria for selecting a T-shirt make more decision trees.
Together, all the decision trees constitute the random forest approach to selecting,
based on many features, a T-shirt that Bob would like to buy from the store.
Syntax
The basic syntax for creating a random forest in R is −
randomForest(formula, data)
Following is the description of the parameters used −
• formula is a formula describing the predictor and response variables.
• data is the name of the data set used.
Example
We will use the randomForest() function to create the forest and view its results.

# Load the party package (for the readingSkills data set) and
# the randomForest package.
library(party)
library(randomForest)

# Create the forest.
output.forest <- randomForest(nativeSpeaker ~ age + shoeSize + score,
data = readingSkills)

# View the forest results.
print(output.forest)

# Importance of each predictor (type = 2: mean decrease in node impurity).
print(importance(output.forest,type = 2))

K-Means Clustering in R Programming

K-means clustering is an unsupervised, non-linear algorithm that clusters data
based on similarity. It seeks to partition the observations into a pre-specified
number of clusters. The data are segmented so that each training example is
assigned to a segment called a cluster. Unsupervised algorithms rely heavily on
the raw data, and checking the relevance of the results can require a large amount
of manual review. K-means is used in a variety of fields such as banking,
healthcare, retail and media.
Theory
K-means clustering groups the data into similar groups. The algorithm is as follows:
1. Choose the number of clusters, K.
2. Select K points at random as the initial centroids (not necessarily from the
given data).
3. Assign each data point to the closest centroid; this forms K clusters.
4. Compute and place the new centroid of each cluster.
5. Reassign each data point to the new closest centroid. If any point changed
cluster, go back to step 4.
When no point changes cluster any more, the clusters are final.
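In R these steps are carried out by the kmeans() function from the stats package. The following is a minimal sketch on synthetic data; the three groups, their means and the object names are assumptions made for illustration. Note that kmeans() picks its random initial centroids from the rows of the data.

```r
set.seed(42)  # make the random start reproducible

# Synthetic 2-D data (hypothetical): three well-separated groups of 20 points.
group_means <- c(0, 5, 10)
points <- do.call(rbind, lapply(group_means, function(m) {
  cbind(x = rnorm(20, mean = m, sd = 0.3),
        y = rnorm(20, mean = m, sd = 0.3))
}))

# Steps 1-2: K = 3 with random initial centroids; nstart = 25 repeats the
# procedure from 25 random starts and keeps the best solution.
# Steps 3-5: kmeans() alternates assignment and centroid updates until
# the clusters stop changing.
result <- kmeans(points, centers = 3, nstart = 25)

print(result$centers)         # final centroids
print(table(result$cluster))  # number of points in each cluster
```

Several random starts are used because a single unlucky choice of initial centroids can leave the algorithm in a poor local solution.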

R - Pie Charts
R Programming language has numerous libraries to create charts and graphs. A pie chart is a
representation of values as slices of a circle with different colors. The slices are labeled, and the
numbers corresponding to each slice are also shown in the chart.
In R the pie chart is created using the pie() function, which takes a vector of positive numbers as input.
Additional parameters are used to control labels, color, title etc.

Syntax
The basic syntax for creating a pie-chart using the R is −
pie(x, labels, radius, main, col, clockwise)
Following is the description of the parameters used −
• x is a vector containing the numeric values used in the pie chart.
• labels is used to give description to the slices.
• radius indicates the radius of the circle of the pie chart.(value between −1 and +1).
• main indicates the title of the chart.
• col indicates the color palette.
• clockwise is a logical value indicating if the slices are drawn clockwise or anti
clockwise.
Example
A very simple pie-chart is created using just the input vector and labels. The below script will
create and save the pie chart in the current R working directory.

# Create data for the graph.

x <- c(21, 62, 10, 53)
labels <- c("London", "New York", "Singapore", "Mumbai")

# Give the chart file a name.

png(file = "city.png")

# Plot the chart.

pie(x,labels)

# Save the file.

dev.off()

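The above code saves a pie chart with one labeled slice per city. The chart can be extended with a
title and colors, and with slice percentages and a legend:

# Plot the chart with title and rainbow color palette.
pie(x, labels, main = "City pie chart", col = rainbow(length(x)))

# Compute the percentage of each slice, to be used as its label.
piepercent <- round(100 * x / sum(x), 1)

# Plot the chart with percentage labels and a legend.
pie(x, labels = piepercent, main = "City pie chart",col = rainbow(length(x)))
legend("topright", c("London","New York","Singapore","Mumbai"), cex = 0.8,
fill = rainbow(length(x)))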

R - Bar Charts
A bar chart represents data as rectangular bars, with the length of each bar proportional to the value
of the variable. R uses the function barplot() to create bar charts. R can draw both vertical and
horizontal bars in the bar chart. Each of the bars can be given a different color.

Syntax
The basic syntax to create a bar-chart in R is −
barplot(H,xlab,ylab,main, names.arg,col)
Following is the description of the parameters used −

• H is a vector or matrix containing numeric values used in bar chart.

• xlab is the label for x axis.
• ylab is the label for y axis.
• main is the title of the bar chart.
• names.arg is a vector of names appearing under each bar.
• col is used to give colors to the bars in the graph.
Example
A simple bar chart is created using just the input vector and the name of each bar.
The below script will create and save the bar chart in the current R working directory.

# Create the data for the chart
H <- c(7,12,28,3,41)

# Give the chart file a name

png(file = "barchart.png")

# Plot the bar chart

barplot(H)

# Save the file

dev.off()

When we execute the above code, it saves a bar chart with one bar per value of H to barchart.png.

Example
The below script will create and save the bar chart in the current R working directory.

# Create the data for the chart

H <- c(7,12,28,3,41)
M <- c("Mar","Apr","May","Jun","Jul")

# Give the chart file a name

png(file = "barchart_months_revenue.png")

# Plot the bar chart

barplot(H,names.arg=M,xlab="Month",ylab="Revenue",col="blue",
main="Revenue chart",border="red")

# Save the file

dev.off()

When we execute the above code, it saves a bar chart with blue bars, red borders and month labels
to barchart_months_revenue.png.
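Because H may also be a matrix, barplot() can draw a stacked bar chart with one bar per column. The sketch below shows this; the revenue figures, region names and output file name are hypothetical, and the chart is written to a temporary PDF file only so that the example runs without a display.

```r
# Hypothetical revenue figures: rows are regions, columns are months.
months  <- c("Mar", "Apr", "May", "Jun", "Jul")
regions <- c("East", "West", "North")
values  <- matrix(c(2, 9, 3, 11, 9,
                    4, 8, 7, 3, 12,
                    5, 2, 8, 10, 11),
                  nrow = 3, ncol = 5, byrow = TRUE,
                  dimnames = list(regions, months))
colors  <- c("green", "orange", "brown")

# Give the chart file a name (a temporary PDF in this sketch).
out_file <- file.path(tempdir(), "barchart_stacked.pdf")
pdf(file = out_file)

# A matrix gives one stacked bar per column; names.arg labels the bars.
midpoints <- barplot(values, names.arg = months, xlab = "Month",
                     ylab = "Revenue", main = "Total revenue", col = colors)

# Add a legend that maps colors to regions.
legend("topleft", legend = regions, cex = 0.8, fill = colors)

# Save the file.
dev.off()
```

barplot() returns the x-positions of the bar midpoints, which is useful when adding text or lines on top of the bars.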

R – Boxplots
A boxplot shows how the values in a data set are distributed. It divides the data set into four parts
using three quartiles. The graph represents the minimum, maximum, median, first quartile and third
quartile of the data set. It is also useful for comparing the distribution of data across data sets by
drawing a boxplot for each of them.
Boxplots are created in R by using the boxplot() function.
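The numbers behind a boxplot can be inspected without drawing it. The sketch below is one way to do so, assuming the built-in mtcars data set with mpg grouped by cyl; the row labels are added here for readability. Note that the whiskers end at the most extreme points that are not flagged as outliers, so they equal the minimum and maximum only when there are no outliers.

```r
# Compute the boxplot statistics without drawing anything.
stats <- boxplot(mpg ~ cyl, data = mtcars, plot = FALSE)

# One column per group (4, 6 and 8 cylinders). Rows: lower whisker,
# lower hinge (about the first quartile), median, upper hinge (about
# the third quartile) and upper whisker.
five_numbers <- stats$stats
colnames(five_numbers) <- stats$names
rownames(five_numbers) <- c("lower whisker", "lower hinge", "median",
                            "upper hinge", "upper whisker")
print(five_numbers)
```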

Syntax
The basic syntax to create a boxplot in R is −
boxplot(x, data, notch, varwidth, names, main)
Following is the description of the parameters used −
• x is a vector or a formula.
• data is the data frame.
• notch is a logical value. Set as TRUE to draw a notch.
• varwidth is a logical value. Set as true to draw width of the box proportionate to the
sample size.
• names are the group labels which will be printed under each boxplot.
• main is used to give a title to the graph.

Creating the Boxplot

The below script will create a boxplot graph for the relation between mpg (miles per gallon) and
cyl (number of cylinders).

# Give the chart file a name.

png(file = "boxplot.png")

# Plot the chart.

boxplot(mpg ~ cyl, data = mtcars, xlab = "Number of Cylinders",
ylab = "Miles Per Gallon", main = "Mileage Data")

# Save the file.

dev.off()

When we execute the above code, it saves the boxplot to boxplot.png, with one box for each number
of cylinders.

R – Histograms
A histogram represents the frequencies of values of a variable bucketed into ranges. A histogram is
similar to a bar chart, but the difference is that it groups the values into continuous ranges. The
height of each bar in a histogram represents the number of values present in that range.
R creates a histogram using the hist() function. This function takes a vector as input and uses some
more parameters to plot the histogram.

Syntax
The basic syntax for creating a histogram using R is −
hist(v,main,xlab,xlim,ylim,breaks,col,border)
Following is the description of the parameters used −
• v is a vector containing numeric values used in histogram.
• main indicates title of the chart.
• col is used to set color of the bars.
• border is used to set border color of each bar.
• xlab is used to give description of x-axis.
• xlim is used to specify the range of values on the x-axis.
• ylim is used to specify the range of values on the y-axis.
• breaks is used to specify the breakpoints between bars (or a suggested number of bars), which
determines the width of each bar.
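To see how breaks buckets the values, hist() can be called with plot = FALSE, which returns the counts instead of drawing. This is a small sketch using the same vector as the example in this section; the choice of breakpoints every 10 units is an assumption made for illustration.

```r
v <- c(9, 13, 21, 8, 36, 22, 12, 41, 31, 33, 19)

# Explicit breakpoints: ranges (0,10], (10,20], (20,30], (30,40], (40,50].
h <- hist(v, breaks = seq(0, 50, by = 10), plot = FALSE)

print(h$breaks)  # 0 10 20 30 40 50
print(h$counts)  # 2 3 2 3 1
```

Each count is the height of the corresponding bar; for example, only 8 and 9 fall in the first range, so the first bar has height 2.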
Example
A simple histogram is created using input vector, label, col and border parameters.
The script given below will create and save the histogram in the current R working directory.

# Create data for the graph.

v <- c(9,13,21,8,36,22,12,41,31,33,19)

# Give the chart file a name.

png(file = "histogram.png")

# Create the histogram.

hist(v,xlab = "Weight",col = "yellow",border = "blue")

# Save the file.

dev.off()

When we execute the above code, it saves the histogram to histogram.png.

R - Line Graphs
A line chart is a graph that connects a series of points by drawing line segments between them.
The points are ordered by one of their coordinates (usually the x-coordinate). Line charts are
usually used to identify trends in data.
The plot() function in R is used to create the line graph.

Syntax
The basic syntax to create a line chart in R is −
plot(v,type,col,xlab,ylab)
Following is the description of the parameters used −
• v is a vector containing the numeric values.
• type takes the value "p" to draw only the points, "l" to draw only the lines and "o" to
draw both points and lines.
• xlab is the label for x axis.
• ylab is the label for y axis.
• main is the Title of the chart.
• col is used to give colors to both the points and lines.
Example
A simple line chart is created using the input vector and the type parameter as "o". The below
script will create and save a line chart in the current R working directory.

# Create the data for the chart.

v <- c(7,12,28,3,41)

# Give the chart file a name.

png(file = "line_chart.png")

# Plot the line chart.

plot(v,type = "o")

# Save the file.

dev.off()

When we execute the above code, it saves the line chart to line_chart.png.
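More than one series can be drawn on the same chart by calling lines() after plot(). The sketch below also uses the col, xlab, ylab and main parameters; the second series t, the axis labels and the output file name are hypothetical, and a temporary PDF file is used only so that the example runs without a display.

```r
# Create the data for the chart.
v <- c(7, 12, 28, 3, 41)
t <- c(14, 7, 6, 19, 3)   # hypothetical second series

# Give the chart file a name (a temporary PDF in this sketch).
out_file <- file.path(tempdir(), "line_chart_two_series.pdf")
pdf(file = out_file)

# Plot the first series with points and lines, a title and axis labels.
plot(v, type = "o", col = "red", xlab = "Month", ylab = "Rain fall",
     main = "Rain fall chart")

# Add the second series to the existing plot.
lines(t, type = "o", col = "blue")

# Save the file.
dev.off()
```

The axis range is fixed by the first plot() call, so a later series with values outside that range would need ylim set explicitly in plot().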

R – Scatterplots
Scatterplots show many points plotted in the Cartesian plane. Each point represents the values of
two variables. One variable is chosen in the horizontal axis and another in the vertical axis.
The simple scatterplot is created using the plot() function.
Syntax
The basic syntax for creating scatterplot in R is −
plot(x, y, main, xlab, ylab, xlim, ylim, axes)
Following is the description of the parameters used −
• x is the data set whose values are the horizontal coordinates.
• y is the data set whose values are the vertical coordinates.
• main is the title of the graph.
• xlab is the label in the horizontal axis.
• ylab is the label in the vertical axis.
• xlim is the limits of the values of x used for plotting.
• ylim is the limits of the values of y used for plotting.
• axes indicates whether both axes should be drawn on the plot.

Example
We use the data set "mtcars" available in the R environment to create a basic scatterplot. Let's
use the columns "wt" and "mpg" in mtcars.

input <- mtcars[,c('wt','mpg')]

print(head(input))

When we execute the above code, it produces the following result −

                     wt  mpg
Mazda RX4         2.620 21.0
Mazda RX4 Wag     2.875 21.0
Datsun 710        2.320 22.8
Hornet 4 Drive    3.215 21.4
Hornet Sportabout 3.440 18.7
Valiant           3.460 18.1
Creating the Scatterplot
The below script will create a scatterplot graph for the relation between wt(weight) and mpg(miles
per gallon).

# Get the input values.

input <- mtcars[,c('wt','mpg')]

# Give the chart file a name.

png(file = "scatterplot.png")

# Plot the chart for cars with weight between 2.5 and 5 and mileage between 15 and 30.
plot(x = input$wt,y = input$mpg,
xlab = "Weight",
ylab = "Mileage",
xlim = c(2.5,5),
ylim = c(15,30),
main = "Weight vs Mileage"
)
# Save the file.
dev.off()

When we execute the above code, it saves the scatterplot to scatterplot.png.
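A scatterplot is often combined with the regression line from lm() to make the trend visible. The following is a minimal sketch of that idea on the same wt and mpg columns; the object names and output file name are assumptions, and a temporary PDF file is used only so that the example runs without a display.

```r
# Get the input values.
input <- mtcars[, c("wt", "mpg")]

# Fit a simple linear regression of mileage on weight.
fit <- lm(mpg ~ wt, data = input)

# Give the chart file a name (a temporary PDF in this sketch).
out_file <- file.path(tempdir(), "scatterplot_with_line.pdf")
pdf(file = out_file)

# Draw the points first, then overlay the fitted line.
plot(x = input$wt, y = input$mpg, xlab = "Weight", ylab = "Mileage",
     main = "Weight vs Mileage")
abline(fit, col = "red")

# Save the file.
dev.off()

print(coef(fit))
```

The slope for wt is negative, which matches the downward trend of the points: heavier cars tend to have lower mileage.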
