Computing with R

A.A. Ayenigba
October 31, 2019

Contents
History and Overview of R
Advantages of R
R for Data Science
What you will learn
R and RStudio
R as a calculator
Comment in R
Variable assignment
Variable assignment and data types in R
Naming Rules for Variables
Rules for naming variables
Basic classes of objects

Basic data structure or types
Create a vector
Naming a vector
Create a vector
Arithmetic operation with vectors
Vector selection
Short group work
Create special vectors
Short group work
Group work

Matrices
Short group work
Progressing from vector to matrix
Naming a matrix
Other examples
Matrices selection
Arithmetic Operation
Inverse
Short group work
Short group work
System of linear equation
Eigenvalues and Eigenvectors
Eigenvalues and Eigenvectors
Short group work

Dataframe
Quick, have a look at your dataset
Using built-in datasets in R

Statistical modelling in R
Simple linear regression
Example
Multiple linear regression
Example

History and Overview of R

Figure 1: R programming

R is a dialect of S, a language developed by John Chambers and others at the old Bell Telephone
Laboratories, originally part of AT&T Corp. S was initiated in 1976 as an internal statistical
analysis environment, originally implemented as Fortran libraries.
R came into use well after S had been developed. One key limitation of the S language
was that it was only available in a commercial package, S-PLUS. In 1991, R was created by Ross Ihaka
and Robert Gentleman in the Department of Statistics at the University of Auckland. In 1993, the first
announcement of R was made to the public.
R is a programming language and free software environment for statistical computing, data cleaning, data
analysis and graphical representation of data. It is widely used among statisticians and data
miners for developing statistical software and for data analysis.

Advantages of R
1. Availability:
R is open source, which makes it cost effective for a project of any size. Because it is open source,
development in R happens at a rapid pace and the community of developers is huge. All of this, along
with a tremendous amount of learning resources, makes R a good choice for beginning to learn programming
for data science. Because many new developers are exploring R, it is easier and more cost-effective to
recruit or outsource to R developers.
2. Academia:
R is a very popular language in academia. Many researchers and scholars use R for data analysis, and many
popular books and learning resources on statistics use R as well. Because many people study R during their
academic years, there is a large pool of skilled statisticians who bring a good working knowledge of R with
them when they move to industry. This in turn increases the traction of the language.
3. Data wrangling:
Data wrangling is the process of cleaning messy and complex data sets to enable convenient consumption and
further analysis. This is a very important and time-consuming process in data science. R has an extensive
library of tools for data manipulation and wrangling. Some of the popular packages for data manipulation in R
include:
dplyr - Created and maintained by Hadley Wickham, dplyr is best known for its data exploration and
transformation capabilities and its highly adaptive chaining syntax.
data.table - It allows faster manipulation of data sets with minimal coding. It simplifies data aggregation
and drastically reduces compute time.
readr - It helps in reading various forms of data into R. It does not convert characters into factors, and it
is reported to be around 10x faster than the equivalent base R functions.
4. Data visualization:
Data visualization is the representation of data in graphical form. It allows data to be analyzed from
angles that are not clear in unorganized or tabulated data. R has many tools that can help in data
visualization, analysis, and representation. The R packages ggplot2 and ggedit are popular plotting
packages. While ggplot2 is focused on visualizing data, ggedit helps users bridge the gap between making
a plot and getting all of those pesky plot aesthetics precisely correct.
5. Specificity:
R is a language designed especially for statistical analysis and data reconfiguration. R libraries focus
on one thing: making data analysis easier, more approachable and more detailed. Many new statistical
methods are first made available through R libraries, which makes R a strong choice for data analysis and
projection. Members of the R community are very active and supportive, and they have great knowledge of
statistics as well as programming. All of this gives R a special edge for data science projects.
6. Machine learning:
At some point in data science, a programmer may need to train an algorithm and bring in automation and
learning capabilities to make predictions possible. R provides ample tools to train and evaluate an
algorithm and predict future events. Thus, R makes machine learning (a branch of data science) much
easier and more approachable. The list of R packages for machine learning is extensive. It includes
mice (for handling missing values), rpart and party (for recursive partitioning, i.e. tree-based models),
caret (for classification and regression training), randomForest (for random forests, which combine many
decision trees) and many more.

R for Data Science
Data science is an exciting discipline that allows you to turn raw data into understanding, insight, and
knowledge. The goal of R for Data Science is to help you learn the most important tools in R that will allow
you to do data science. Data science is a huge field, and there’s no way you can master it by reading a single
book.

What you will learn

Figure 2: Data science phases

R and RStudio
R is a statistical programming language for data analysis and visualization, while RStudio is an integrated
development environment (IDE) for R programming. RStudio makes programming in R easier.

Figure 3: RStudio

In this section, you will take your first steps with R. You will learn how to use the console as a calculator and
how to assign variables. You will also get to know the basic data types in R. Let’s get started!

R as a calculator
In its most basic form, R can be used as a simple calculator. Consider the following arithmetic operations:
• Addition: +
• Subtraction: -
• Multiplication: *
• Division: /
• Exponentiation: ^
• Modulo: %%
Calculate 6 + 12
6 + 12

## [1] 18
Calculate 800 − 900
800 - 900

## [1] -100
Calculate 4 × 5
4 * 5

## [1] 20
Calculate $\frac{2018}{2}$
2018 / 2

## [1] 1009
Calculate $2^3$
2^3

## [1] 8
Calculate 20%%3
20 %% 3

## [1] 2
Calculate the square root of 4, $\sqrt{4}$
sqrt(4)

## [1] 2
Calculate $(\sqrt{4})^2$
(sqrt(4))^2

## [1] 4
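
The modulo operator pairs naturally with integer division, and operator precedence can surprise newcomers. The following is a minimal sketch of both; the `%/%` operator and the variable names are not used elsewhere in these notes and appear here only for illustration.

```r
# Integer division keeps the whole-number quotient
quotient <- 20 %/% 3
# Modulo keeps the remainder
remainder <- 20 %% 3
quotient
remainder

# Quotient and remainder together rebuild the original number
quotient * 3 + remainder

# Exponentiation is evaluated before the unary minus,
# so brackets change the result
-2^2
(-2)^2
```

When in doubt about precedence, adding brackets makes the intended order explicit.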

Comment in R
R uses the # sign to add comments, so that you and others can understand what the R code is about.
Just like Twitter! Comments are not run as R code, so they will not influence your result. R ignores
everything from # to the end of the line, so typing # 3 + 4 at the console produces no output.
# 3+4

Variable assignment
A basic concept in statistical programming is called a variable. A variable allows you to store a value (e.g. 5)
or an object (e.g. a function description) in R. You can then later use this variable’s name to easily access
the value or the object that is stored with this variable.
Example
Store the value 4 in a variable named after your first name
ezekiel <- 4

To see what is stored in the variable, type its name in the console and press the Return key on the
keyboard
ezekiel

## [1] 4

Variable assignment and data types in R

x <- 3
y <- 4
z <- 10

x + y

## [1] 7
z - x - y

## [1] 3
x * y

## [1] 12
z^x

## [1] 1000

Naming Rules for Variables

The best naming convention is to choose a variable name that tells the reader of the program what the
variable represents.

Rules for naming variables

• A variable name must begin with a letter (R also allows a leading dot, provided it is not followed by a
digit).
• After the initial letter, variable names can also contain underscores (_), dots (.) and numbers. Spaces
and other special characters are not allowed.
• Uppercase characters are different from lowercase characters (in R and also in Python).

Example

Samples of acceptable variable names    Samples of unacceptable variable names

Grade                                   Grade(Test)
GradeOnTest                             GradeTest#1
Ibadan_R_users                          Ibadan R users
sales_price_2017                        2017sales_price

Basic classes of objects

R works with numerous atomic classes of objects. Some of the most basic atomic data types to get started
are:
• Decimal values like 4.7 are called numeric
• Whole numbers like 4L are called integers (the L suffix tells R to store the value as an integer; a plain 4
is stored as numeric). Integers are also numeric
• Boolean values (TRUE or FALSE) are called logical
• Text (or string) values are called characters
• Factors : Categorical variable where each level is a category
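
The classes listed above can be inspected with the base function `class()`. This is a small sketch; the example values and the `sizes` variable are made up for illustration.

```r
# class() reports the class of an object
class(4.7)       # decimal value
class(4L)        # the L suffix requests an integer
class(4)         # without the suffix, R stores a numeric
class(TRUE)      # logical
class("Ibadan")  # character

# A factor stores a categorical variable; its categories are its levels
sizes <- factor(c("small", "large", "small"))
class(sizes)
levels(sizes)
```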

Basic data structure or types

1. Vector : A collection of elements of the same class
2. Matrix : All columns must uniformly contain only one variable type
3. data.frame : The columns can contain different classes
4. List : Can hold objects of different classes and lengths

Create a vector
Vectors are one-dimensional arrays that can hold numeric data, character data, or logical data. In R, you can
create a vector with the combine function c(). You place the vector elements, separated by commas, between
the parentheses.
For example
character.vector <- c('Ayenigba', 'Emmanuel', 'Ezekiel', 'Ajayi', 'Ebun')

numeric_vector <- c(1, 2, 3, 6, 7, 10)

Notice
Adding a space after the commas in the c() function improves the readability of your code.

Naming a vector
As a data analyst, it is important to have a clear view of the data that you are using. Understanding what
each element refers to is essential. You can give a name to the elements of a vector with the names()
function.

Create a vector
Example
sales_tax <- c(140000, 200000, 600000, 180000, 170000)
names(sales_tax) <- c(
  "Monday", "Tuesday", "Wednesday",
  "Thursday", "Friday"
)
sales_tax

##    Monday   Tuesday Wednesday  Thursday    Friday
##    140000    200000    600000    180000    170000

Arithmetic operation with vectors

When you add two vectors in R, the result is the element-wise sum.
Example
a <- c(1, 2, 3, 4, 5)
b <- c(6, 7, 8, 9, 10)
c <- a + b
c

## [1] 7 9 11 13 15
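
The other arithmetic operators work element by element in the same way. The sketch below uses made-up `price` and `quantity` vectors (not part of the original notes) to show element-wise multiplication, how a single number is reused for every element, and two summary functions.

```r
price <- c(100, 250, 400)
quantity <- c(2, 1, 3)

# Element-wise product: first with first, second with second, ...
revenue <- price * quantity
revenue

# A single number is recycled across every element of the vector
price * 1.05

# Summary functions reduce a vector to one value
sum(revenue)
mean(price)
```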

Vector selection
To select elements of a vector (and later matrices and data frames), you can use square brackets [ ]. Between
the square brackets, you indicate which elements to select.
To select the first element of vector a, you type a[1].
To select the second element of the vector, you type a[2], etc.
Example
a
a[1]
a[2]
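
Square brackets also accept a vector of positions, negative positions, or names. A minimal sketch with a hypothetical named vector `scores`:

```r
scores <- c(first = 12, second = 7, third = 30, fourth = 18)

scores[c(1, 3)]   # the first and third elements
scores[2:4]       # the second to the fourth element
scores[-1]        # everything except the first element
scores["third"]   # selection by name
```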

Short group work

What does it do?
a[a>3]

Create special vectors

a <- 1:10 # Create sequence 1 to 10
b <- 10:1 # Create sequence 10 to 1

To create a sequence from 1 to 16 with an increment of 2, we can use the seq() function, e.g.
seq(1, 16, 2)
seq(1, 20, 0.1)
seq(20, 1, -0.1)

If you know the start of a sequence, its increment and its length, but not its last element, you can
give the length instead of the end value, e.g.
seq(5, by = 2, length.out = 50)
length(seq(5, by = 2, length.out = 50))

Repeating elements a certain number of times

rep(5, 10) # Repeat 5 ten times
rep(1:4, 5) # Repeat the sequence 1 to 4 five times
rep(1:4, each = 3) # Repeat each element of 1 to 4 three times

Short group work

What is the output of the following code?
rep(1:4, each = 3, times = 2)
rep(1:4, 1:4)
rep(1:4, c(4, 1, 8, 2))

Group work

Figure 4: Group work

Matrices
In R, a matrix is a collection of elements of the same data type (numeric, character, or logical) arranged into
a fixed number of rows and columns.
Since we are only working with rows and columns, a matrix is called a two-dimensional array.
You can construct a matrix in R with the matrix() function.
Example
A <- matrix(1:9, nrow = 3, byrow = TRUE)
A
##      [,1] [,2] [,3]
## [1,]    1    2    3
## [2,]    4    5    6
## [3,]    7    8    9

• The first argument is the collection of elements that R will arrange into the rows and columns of
the matrix. Here, we use 1:9, which is a shortcut for c(1, 2, ..., 9).
• The argument byrow indicates that the matrix is filled by rows. If we want the matrix to be filled
by columns, we set byrow = FALSE.
• The argument nrow indicates that the matrix should have 3 rows
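
The effect of the byrow argument is easiest to see side by side. The sketch below uses a hypothetical 2 x 3 matrix (different from the examples in these notes) and the base function `dim()` to report the number of rows and columns.

```r
# Same elements, two different fill orders
by_row <- matrix(1:6, nrow = 2, byrow = TRUE)
by_col <- matrix(1:6, nrow = 2, byrow = FALSE)

by_row   # rows are filled first: 1 2 3 / 4 5 6
by_col   # columns are filled first: 1 3 5 / 2 4 6

# dim() returns the number of rows, then the number of columns
dim(by_row)
```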

Short group work

Construct a matrix with 3 rows containing the numbers 1 up to 9 filled column-wise

Progressing from vector to matrix

fiscal_year2016_17 <- c(140, 134)
fiscal_year2017_18 <- c(160, 158)

performance_analysis <- matrix(
  c(fiscal_year2016_17, fiscal_year2017_18),
  nrow = 2, ncol = 2, byrow = TRUE
)
performance_analysis

##      [,1] [,2]
## [1,]  140  134
## [2,]  160  158

Naming a matrix
To help you understand what is stored in the performance_analysis matrix, it is good to name the rows
and columns. Not only does this help you read the data, it is also useful for selecting certain
elements from the matrix.
rownames(performance_analysis) <-
c(
"Fiscal year July-June 2016/17",
"Fiscal year July-June 2017/18"
)

colnames(performance_analysis) <- c("Actual", "Target")

performance_analysis

## Actual Target
## Fiscal year July-June 2016/17 140 134
## Fiscal year July-June 2017/18 160 158
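
Row and column names can be used inside square brackets in place of positions. The sketch below rebuilds a similar matrix with shorter, made-up row labels so that it runs on its own; the `dimnames` argument of matrix() sets both sets of names at once.

```r
performance <- matrix(
  c(140, 134, 160, 158),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("2016/17", "2017/18"), c("Actual", "Target"))
)

# One element, selected by row name and column name
performance["2017/18", "Actual"]

# A whole column, selected by name
performance[, "Target"]

# Difference between actual and target for each fiscal year
performance[, "Actual"] - performance[, "Target"]
```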

Other examples
A <- matrix(c(1, 3, 5, 7, 9, 11, 13, 15, 17),
  ncol = 3,
  byrow = FALSE
)
A

##      [,1] [,2] [,3]
## [1,]    1    7   13
## [2,]    3    9   15
## [3,]    5   11   17
B <- matrix(c(2, 4, 6, 8, 10, 12, 14, 16, 18),
  ncol = 3,
  byrow = FALSE
)
B

##      [,1] [,2] [,3]
## [1,]    2    8   14
## [2,]    4   10   16
## [3,]    6   12   18

Matrices selection
To select elements in a matrix, we can use square brackets [ , ]. Between the square brackets, you give
the row position, then the column position, of the element to select.
To select the element in the first row and second column of matrix A, you type A[1, 2].
To select the element in the third row and second column of matrix A, you type A[3, 2], etc.
Example
A
A[1, 2]
A[3, 2]
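
Leaving the row or the column position empty selects a whole row or column. A short sketch using the same matrix A as above:

```r
A <- matrix(c(1, 3, 5, 7, 9, 11, 13, 15, 17), ncol = 3)

A[1, ]        # the entire first row
A[, 2]        # the entire second column
A[1:2, 2:3]   # rows 1 to 2 of columns 2 to 3, returned as a matrix
```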

Arithmetic Operation
We can perform the usual arithmetic operations on matrices.
• Addition
C <- A + B
C

## [,1] [,2] [,3]

## [1,] 3 15 27
## [2,] 7 19 31
## [3,] 11 23 35
• Subtraction
D <- B - A
D

## [,1] [,2] [,3]

## [1,] 1 1 1
## [2,] 1 1 1
## [3,] 1 1 1
• Multiplication (%*% is the matrix product; * would multiply element by element)
AB <- A %*% B
AB

##      [,1] [,2] [,3]
## [1,]  108  234  360
## [2,]  132  294  456
## [3,]  156  354  552
• Transpose

$$G = t(A) = \begin{pmatrix} 1 & 7 & 13 \\ 3 & 9 & 15 \\ 5 & 11 & 17 \end{pmatrix}^{T}$$

G <- t(A)
G

## [,1] [,2] [,3]

## [1,] 1 3 5
## [2,] 7 9 11
## [3,] 13 15 17
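• Determinant

$$G = \det(A) = \begin{vmatrix} 1 & 7 & 13 \\ 3 & 9 & 15 \\ 5 & 11 & 17 \end{vmatrix}$$

The rows of A are linearly dependent, so its exact determinant is 0; the tiny nonzero value that R
returns is floating-point rounding error.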

G <- det(A)
G

## [1] 4.263256e-14
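
The difference between the element-wise product and the matrix product is easy to check on small matrices. This is a sketch with two made-up 2 x 2 matrices, M and N, chosen so the results can be verified by hand.

```r
M <- matrix(c(1, 2, 3, 4), nrow = 2, byrow = TRUE)
N <- matrix(c(0, 1, 1, 0), nrow = 2, byrow = TRUE)

M * N     # element-wise product: each entry of M times the matching entry of N
M %*% N   # matrix product: rows of M combined with columns of N

t(M)      # transpose: rows become columns
det(M)    # determinant: 1 * 4 - 2 * 3
```

Here M %*% N swaps the columns of M, while M * N keeps only the off-diagonal entries, so the two operators clearly give different results.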

Inverse
To find the inverse of a matrix, we use solve(), a base function in R
H <- solve(B)
H
Did you encounter a problem? B is singular (its rows are linearly dependent), so solve(B) stops with an error.
Be of good cheer; for I have overcome the world!- Jesus Christ in John 16:33
Inverse function to tackle the problem
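inverse <- function(A, tol = 1e-8) {
  # A determinant that is numerically zero means the matrix is singular;
  # abs() keeps negative determinants from being mistaken for zero
  if (abs(det(A)) < tol) {
    cat("Since the given matrix is singular.\nSorry, I can't find inverse\n")
  } else {
    solve(A)
  }
}
inverse(A)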

## Since the given matrix is singular.

## Sorry, I can't find inverse
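
For a matrix that does have an inverse, the result of solve() can be checked by multiplying it with the original matrix, which should give the identity matrix. A minimal sketch with a made-up invertible matrix M (its determinant is 5):

```r
M <- matrix(c(2, 1, 1, 3), nrow = 2, byrow = TRUE)
M_inv <- solve(M)
M_inv

# M times its inverse should be the identity matrix,
# up to floating-point rounding
product <- M %*% M_inv
round(product, 10)

# all.equal() compares with a small numerical tolerance
all.equal(product, diag(2))
```

Comparing with all.equal() rather than == avoids false mismatches caused by rounding in the last decimal places.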

Short group work
Use the function that you wrote to find the inverse of matrix J, where J is:

$$J = \begin{pmatrix} 5 & 1 & 0 \\ 3 & -1 & 2 \\ 4 & 0 & -1 \end{pmatrix}$$

Note
Assign the matrix to J and call inverse(J) in R

Short group work

Can you also confirm the result with the base function solve(J)?
solve(J)

Are they the same? Try it with this R-code

inverse(J) == solve(J)

System of linear equation

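We can use matrices to solve a system of linear equations whose coefficient matrix is invertible.
Solve the following system of equations

$$\begin{aligned} x - y &= 3 \\ 2x + 3y &= -4 \end{aligned}$$

Matrices preparation

$$A = \begin{pmatrix} 1 & -1 \\ 2 & 3 \end{pmatrix}, \quad B = \begin{pmatrix} x \\ y \end{pmatrix}, \quad C = \begin{pmatrix} 3 \\ -4 \end{pmatrix}$$

The system can be written as $AB = C$, so

$$B = A^{-1} C$$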

Code in R
A <- matrix(c(1, -1, 2, 3), nrow = 2, byrow = TRUE)
A

## [,1] [,2]
## [1,] 1 -1
## [2,] 2 3
C <- matrix(c(3, -4), nrow = 2, byrow = TRUE)
C

## [,1]
## [1,] 3
## [2,] -4

B <- solve(A) %*% C
B

## [,1]
## [1,] 1
## [2,] -2
x <- B[1, 1]
x

## [1] 1
y <- B[2, 1]
y

## [1] -2
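
As an alternative sketch, solve() also accepts the right-hand side as a second argument, which solves the system directly without forming the inverse first. The name `solution` is introduced here for illustration.

```r
A <- matrix(c(1, -1, 2, 3), nrow = 2, byrow = TRUE)
C <- c(3, -4)

# solve(A, C) returns the vector B that satisfies A %*% B = C
solution <- solve(A, C)
solution

# Check: multiplying A by the solution should reproduce C
A %*% solution
```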

Eigenvalues and Eigenvectors

Consider the following matrix

$$B = \begin{pmatrix} 1 & -6 \\ 3 & -8 \end{pmatrix}$$

1. Determine the eigenvalues of B

2. Determine the eigenvectors corresponding to each eigenvalue of B
Solution
B <- matrix(c(1, -6, 3, -8),
nrow = 2, ncol = 2,
byrow = TRUE
)
print(B) # To see the matrix

## [,1] [,2]
## [1,] 1 -6
## [2,] 3 -8

Eigenvalues and Eigenvectors

The function for calculating eigenvalues and eigenvectors is eigen(). Note that eigen() returns its
results as a list, a data structure that can hold objects of different classes and lengths.
eigen(B)

## eigen() decomposition
## $values
## [1] -5 -2
##
## $vectors
## [,1] [,2]
## [1,] 0.7071068 0.8944272
## [2,] 0.7071068 0.4472136
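
Because eigen() returns a list, its parts can be extracted with `$`. The sketch below pulls out the values and vectors and checks the defining property of an eigenpair, namely that multiplying B by an eigenvector gives the same result as multiplying that eigenvector by its eigenvalue. The names `result`, `lambda` and `V` are introduced for illustration.

```r
B <- matrix(c(1, -6, 3, -8), nrow = 2, byrow = TRUE)
result <- eigen(B)

lambda <- result$values    # the eigenvalues
V <- result$vectors        # one eigenvector per column

# The first column of V belongs to the first eigenvalue
B %*% V[, 1]
lambda[1] * V[, 1]

# Both sides agree up to floating-point rounding
all.equal(as.vector(B %*% V[, 1]), lambda[1] * V[, 1])
```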

Short group work
Consider the following matrix

$$B = \begin{pmatrix} 4 & 5 & -5 \\ 0 & 4 & 1 \\ 0 & 1 & 2 \end{pmatrix}$$

1. Determine the eigenvalues of B

2. Determine the eigenvectors corresponding to each eigenvalue of B

Dataframe
Dataframes are another way to put data in tables! Unlike matrices, dataframes can hold different types of
data in different columns!
A dataframe has the variables of a data set as columns and the observations as rows. This will be a familiar
concept for those coming from other software packages such as Excel, SPSS, or STATA.
The function for creating a dataframe is data.frame().
Example
# Make a dataframe with columns named a and b
data.frame(a = 2:4, b = 5:7)

##   a b
## 1 2 5
## 2 3 6
## 3 4 7

The numbers 1, 2 and 3 at the left of your console are row labels; they are not a column of the dataframe.
Each column in a dataframe is a vector!
Example
a <- c(6, 5, 1)

b <- c(1, 1, 3)

data <- data.frame(a, b) # The output is ?
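
Since each column is a vector, a column can be extracted with `$`, and rows can be selected with square brackets just as in a matrix. A minimal sketch with a made-up dataframe `scores`:

```r
scores <- data.frame(
  name = c("Ade", "Bola", "Chidi"),
  maths = c(62, 75, 58)
)

scores$maths    # one column, returned as a vector
scores[2, ]     # the second row, all columns

# A new column is added by assigning a vector of matching length
scores$passed <- scores$maths >= 60
scores

nrow(scores)    # number of observations
ncol(scores)    # number of variables
```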

Group work
Create a dataframe and call it data for the following vectors:
# Set the same seed to get the same sample
set.seed(123)
height <- rnorm(n = 100, mean = 135, sd = 12)
weight <- rnorm(n = 100, mean = 55, sd = 9)

Quick, have a look at your dataset

Working with large datasets is common in data science. When you work with (extremely) large datasets and
dataframes, your first task as a data analyst is to develop a clear understanding of their structure and main
elements. Therefore, it is often useful to show only part of the entire dataset.
1. head(): enables you to show the first observations of a dataframe.
2. tail(): enables you to print out the last observations in your dataset.
Both head() and tail() print a top line called the header, which contains the names of the different
variables in your data set.
Another method that is often used to get a rapid overview of your dataset is the function str().
3. str(): shows you the structure of your dataset.
The structure of a dataframe tells you:
1. The total number of observations
2. The total number of variables
3. A full list of the variable names
4. The first observations
Note
Applying the str() function will often be the first thing that you do when receiving a new dataset or
dataframe. It is a great way to get more insight into your dataset before diving into the real analysis.
Example
Consider the vectors:
height <- rnorm(n = 120, mean = 135, sd = 12)
weight <- rnorm(n = 120, mean = 55, sd = 9)

Create a dataframe for it.

data <- data.frame(height, weight)

str(data)

## 'data.frame': 120 obs. of 2 variables:

## $ height: num 161 151 132 142 130 ...
## $ weight: num 57.1 66 43 60.9 50.3 ...
Example
head(data, 5)

height weight
161.3857 57.13687
150.7490 65.96298
131.8183 42.95103
141.5183 60.94738
130.0279 50.29379

tail(data, 3)

height weight
118 119.8962 64.76298
119 155.2132 34.97511
120 145.9367 66.12124
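
A few other base functions complement head(), tail() and str() when taking a first look at a dataframe. The sketch below builds a small made-up dataframe `demo`; the seed value is arbitrary and only makes the random numbers repeatable.

```r
set.seed(1)  # arbitrary seed, assumed here for repeatability
demo <- data.frame(
  height = rnorm(20, mean = 135, sd = 12),
  weight = rnorm(20, mean = 55, sd = 9)
)

dim(demo)       # number of observations, then number of variables
names(demo)     # the variable names
summary(demo)   # minimum, quartiles, mean and maximum of each column
```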

Using built-in datasets in R
There are several ways to find the included datasets in R. Using data() will give you a list of the datasets of
all loaded packages.
data()

Example
library(datasets)

data <- airquality

str(data)

To get help for the proper description of the dataset

?airquality
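
Real datasets often contain missing values, and airquality is one example. The following sketch counts the missing values in each column and shows how they affect a simple summary; the variable name `aq` is introduced for illustration.

```r
aq <- airquality

dim(aq)                 # observations and variables
colSums(is.na(aq))      # number of missing values (NA) in each column

# A mean over a column containing NA is itself NA ...
mean(aq$Ozone)
# ... unless the missing values are removed first
mean(aq$Ozone, na.rm = TRUE)
```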

Statistical modelling in R
In this section, we will use R for statistical modelling.

Simple linear regression

Simple linear regression models a linear relationship between two variables.
A simple linear model has one independent variable (X) that is related to one dependent variable (Y).
The simple linear regression model is:
$Y = b_0 + b_1 X + \varepsilon$, where $b_0$ is the intercept on the Y-axis, $b_1$ is the slope and $\varepsilon$ is the error term.

Example
We shall use the women dataset in R. The description of the women dataset can be seen by using ?women, i.e.
?women
data <- women

head(data)

height weight
58 115
59 117
60 120
61 123
62 126
63 129

str(women)

## 'data.frame': 15 obs. of 2 variables:

## $ height: num 58 59 60 61 62 63 64 65 66 67 ...
## $ weight: num 115 117 120 123 126 129 132 135 139 142 ...
In this data, the dependent variable is height and the independent variable is weight.

model <- lm(height ~ weight, data = data)

The function lm() is used to fit the linear model and ~ is used to separate the dependent variable from the
independent variable. We specify the name of our data in the argument data.
To see the results:
model

##
## Call:
## lm(formula = height ~ weight, data = data)
##
## Coefficients:
## (Intercept) weight
## 25.7235 0.2872
From the results, we see that:
height = 25.7235 + 0.2872 × weight.
To see the full results, we use the summary() function, i.e.
summary(model)

##
## Call:
## lm(formula = height ~ weight, data = data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.83233 -0.26249 0.08314 0.34353 0.49790
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 25.723456 1.043746 24.64 2.68e-12 ***
## weight 0.287249 0.007588 37.85 1.09e-14 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.44 on 13 degrees of freedom
## Multiple R-squared: 0.991, Adjusted R-squared: 0.9903
## F-statistic: 1433 on 1 and 13 DF, p-value: 1.091e-14
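
Once a model is fitted, its parts can be extracted and used for prediction. This is a minimal sketch, assuming the same model as above; the weights 120 and 150 in `new_women` are made-up values chosen to lie within the range of the data.

```r
model <- lm(height ~ weight, data = women)

# Intercept and slope as a named vector
coef(model)

# Predicted heights for new weights; the column name must match
# the independent variable used in the formula
new_women <- data.frame(weight = c(120, 150))
predicted <- predict(model, newdata = new_women)
predicted

# R-squared taken from the summary object
summary(model)$r.squared
```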

Multiple linear regression

Multiple linear regression is an extension of simple linear regression in which we have more than one
independent variable.
The statistical model for multiple linear regression is:
$Y = b_0 + b_1 X_1 + b_2 X_2 + \cdots + b_p X_p + \varepsilon$, where $p$ is the number of independent variables in the model and $\varepsilon$ is the
error term.

Example
We shall use the attitude dataset in R. The description of the dataset can be seen by using ?attitude,
i.e.
?attitude
dataset <- attitude

head(dataset)

rating complaints privileges learning raises critical advance

43 51 30 39 61 92 45
63 64 51 54 63 73 47
71 70 68 69 76 86 48
61 63 45 47 54 84 35
81 78 56 66 71 83 47
43 55 49 44 54 49 34

str(dataset)

## 'data.frame': 30 obs. of 7 variables:

## $ rating : num 43 63 71 61 81 43 58 71 72 67 ...
## $ complaints: num 51 64 70 63 78 55 67 75 82 61 ...
## $ privileges: num 30 51 68 45 56 49 42 50 72 45 ...
## $ learning : num 39 54 69 47 66 44 56 55 67 47 ...
## $ raises : num 61 63 76 54 71 54 66 70 71 62 ...
## $ critical : num 92 73 86 84 83 49 68 66 83 80 ...
## $ advance : num 45 47 48 35 47 34 35 41 31 41 ...
In this data, the dependent variable is rating.
model <- lm(rating ~ ., data = dataset)

The function lm() is used to fit the linear model. In the formula, ~ separates the dependent variable from
the independent variables, and . includes all the remaining variables in the dataset as independent
variables. We specify the name of our data in the argument data.
To see the results:
model

##
## Call:
## lm(formula = rating ~ ., data = dataset)
##
## Coefficients:
## (Intercept) complaints privileges learning raises
## 10.78708 0.61319 -0.07305 0.32033 0.08173
## critical advance
## 0.03838 -0.21706
From the results, we see that:
rating = 10.78707639 + 0.61318761(complaints) − 0.07305014(privileges) + 0.32033212(learning) +
0.08173213(raises) + 0.03838145(critical) − 0.21705668(advance)
To see the full results, we use the summary() function, i.e.
summary(model)

##
## Call:

## lm(formula = rating ~ ., data = dataset)
##
## Residuals:
## Min 1Q Median 3Q Max
## -10.9418 -4.3555 0.3158 5.5425 11.5990
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 10.78708 11.58926 0.931 0.361634
## complaints 0.61319 0.16098 3.809 0.000903 ***
## privileges -0.07305 0.13572 -0.538 0.595594
## learning 0.32033 0.16852 1.901 0.069925 .
## raises 0.08173 0.22148 0.369 0.715480
## critical 0.03838 0.14700 0.261 0.796334
## advance -0.21706 0.17821 -1.218 0.235577
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 7.068 on 23 degrees of freedom
## Multiple R-squared: 0.7326, Adjusted R-squared: 0.6628
## F-statistic: 10.5 on 6 and 23 DF, p-value: 1.24e-05
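
In the summary above, only complaints (and, more weakly, learning) have small p-values. One possible follow-up, sketched here as an illustration rather than as part of the original analysis, is to fit a smaller model with just those two variables and compare it with the full model. The names `full` and `reduced` are introduced for this sketch.

```r
full <- lm(rating ~ ., data = attitude)
reduced <- lm(rating ~ complaints + learning, data = attitude)

# Adjusted R-squared penalises extra variables, so it is a fairer
# comparison between models of different size than R-squared
summary(full)$adj.r.squared
summary(reduced)$adj.r.squared

# F-test of whether the extra variables in the full model
# improve the fit significantly
anova(reduced, full)
```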

12 - Поиск в Google
No ratings yet
12 - Поиск в Google
5 pages
Deployment Diagram
No ratings yet
Deployment Diagram
6 pages
Oops Alv in Abap
No ratings yet
Oops Alv in Abap
7 pages
How To Save PDF
No ratings yet
How To Save PDF
4 pages
Hikivision YancyFang
No ratings yet
Hikivision YancyFang
17 pages
R-Binder
No ratings yet
R-Binder
176 pages
Quickbooks Error 1723 (Troubleshooting Steps)
No ratings yet
Quickbooks Error 1723 (Troubleshooting Steps)
5 pages
License Agreement Template - Sample
No ratings yet
License Agreement Template - Sample
117 pages
Argentina Class I-II Registration Revalidation Form (EN)
No ratings yet
Argentina Class I-II Registration Revalidation Form (EN)
4 pages
1.5 - Centos 7 Installation Ver3 - VMware
No ratings yet
1.5 - Centos 7 Installation Ver3 - VMware
28 pages
Application of E-Filing System at Department of Purchasing PT Indofood CBP Sukses Makmur Tbk. Noodle Division Semarang
No ratings yet
Application of E-Filing System at Department of Purchasing PT Indofood CBP Sukses Makmur Tbk. Noodle Division Semarang
9 pages
Exam1 - Sample Questions-SOLUTIONS
No ratings yet
Exam1 - Sample Questions-SOLUTIONS
8 pages
Y4 Assessment Term 1
No ratings yet
Y4 Assessment Term 1
5 pages
R Programming and Development From Basics to Advanced Topics
No ratings yet
R Programming and Development From Basics to Advanced Topics
154 pages
NIELIT Centres PDF
No ratings yet
NIELIT Centres PDF
23 pages
LT1076 5
No ratings yet
LT1076 5
8 pages
