Writing Simple Functions in R: Bootstrapping as an Example

Shu Fai Cheung @ University of Macau

Sep 2024

Table of contents
1 Aim and Scope
1.1 Aim
1.2 Target Audience
1.3 The Sample Functions are for Learning
1.4 Style

2 Defining Functions
2.1 Defining a Simple Function
2.2 Defining a More Complicated Function
2.3 Local Variables
2.4 Default Value for an Argument

3 Useful Statements in R
3.1 if and else
3.1.1 if without else
3.1.2 if … else if … else if … else
3.1.3 if and NA
3.2 for … in … and while

4 Examples
4.1 Nonparametric Bootstrapping Confidence Intervals
4.1.1 Pearson’s r

5 Optional Topics
5.1 Style
5.2 Pass-By-Value
5.3 Dotdotdot

6 Final Remarks

7 Further References

1 Aim and Scope

1.1 Aim
This document helps you understand more about functions by writing some simple ones. Even if
you are just a user and will never develop anything for others, you can still:
a. write some simple functions for tasks that you will do again and again in your own work, and
b. write functions to form the nonparametric bootstrapping confidence interval for a statistic.
Bootstrapping will be used as the example.

Note that, although the focus is on writing simple functions, functions not covered in previous documents
may also be introduced if necessary.

1.2 Target Audience

This document is for users, not for programmers. Therefore, some technical details may be omitted or
simplified, sometimes “oversimplified” if the details are not essential for common scenarios.

1.3 The Sample Functions are for Learning

Note that the sample functions presented in this document are for learning. Some tasks can already be
done by existing functions. Some functions can be improved in efficiency or be written in a more “R” way.
The sample functions in this document are written for illustrating the concepts to be introduced. In actual
research, there may be better ways to write them.

1.4 Style
In this and other documents, I will use my own personal style. Feel free to use whatever style you like in
your own work. See the section on style in R as a Language Part 1.

2 Defining Functions
2.1 Defining a Simple Function
You have already learned about calling a function and specifying its arguments. Now we are going to
define a very simple function that
• receives two numbers;
• adds them together;
• returns the result.
This is an example:

my_addition <- function(x, y) {
    x + y
  }

Recall that a function is an object. Therefore, we assign it to a name, my_addition, using <-. A function
definition starts with, well, function.
After function, there must be a pair of parentheses, ( and ). Between them are the arguments of this
function. Arguments are either named or, when the number of arguments is unknown, denoted by ... (dotdotdot).
Dotdotdot is useful in some cases but will be introduced later when we need it. In the example above,
the arguments are x and y, in this order. Arguments are separated by commas.
Note that the order is important because if users do not name the arguments, the values will be assigned
based on this order.
After the parentheses, the body of a function must be enclosed between a pair of curly brackets, { and },
unless it is one simple expression after the parentheses.1 We write code as usual between the brackets
to define the body of a function.
In the example above, the body is x + y. This is what we will do to add two variables, x and y.
Try and see what will happen by calling this function and then learn how it works:

1
For example, my_addition <- function(x, y) x + y is acceptable.

SSGC 8802 (Sept 2024) Writing Simple Functions

my_addition(x = 1, y = 2)
my_addition(30, 5)

It should do what we expect. With just two arguments, we do not have to use names.
How about supplying two “names” of objects? Try this:

a <- 15
b <- 21
my_addition(a, b)

We can use the names of variables.

We can even use expressions. Try this:

a <- 15
my_addition(a, 3 * 7)

Now you should have confirmed that it works. Let us see how it works.
When we call the function by my_addition(1, 2), 1 is assigned to x, and 2 is assigned to y. R will then
run the code in the body of my_addition() using these values.2
A function finishes its operation normally either because it finishes the last line, or because it calls
return() (introduced later) to return something. If it finishes its last line, the value of this line will be
returned.
In my_addition(), the last line is x + y. Therefore, the output of x + y is automatically returned.
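
To see why the order of the arguments matters, a small sketch may help. It uses subtraction instead of addition, because swapping the two values then changes the result. The function my_subtraction() is made up for this illustration and is not part of the handout.

```r
# A hypothetical function for illustration: subtraction is
# sensitive to the order of its arguments, unlike addition.
my_subtraction <- function(x, y) {
    x - y
  }

# Unnamed values are matched by position: x = 10, y = 3
my_subtraction(10, 3)

# Named values can be given in any order: still x = 10, y = 3
my_subtraction(y = 3, x = 10)

# Swapping unnamed values changes the result: x = 3, y = 10
my_subtraction(3, 10)
```

The first two calls should give the same result, while the third should give its negative.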

2.2 Defining a More Complicated Function

Let us write a slightly more complicated function which returns the minimum and the maximum of a vector
of numbers3 :

my_range <- function(x) {
    x_min <- min(x)
    x_max <- max(x)
    c(min = x_min,
      max = x_max)
  }

This function has only one argument, x. The function first finds the minimum using min(), and assigns it
to x_min. It then finds the maximum using max(), and assigns it to x_max. It then creates a named vector
from these two numbers. This is the last line and so the result of this line will be returned.
Let’s try it:

out <- my_range(c(2, 3, 5, 1, 5, 10))

out

Good! We wrote our own function to find the range! This function has something new. New variables,
x_min and x_max, are created inside the body. This leads to the next topic …

2.3 Local Variables

You may wonder what will happen if x_min and x_max already exist outside the function. For example:

2
The arguments actually will be evaluated only when they are used. This is called lazy evaluation. This is not covered here.
3
There is a base function, range(), for doing this. This example is for learning about writing functions by finding the range
ourselves.

x_min <- 100

x_max <- 10
x <- c(2, 3, 5, 1, 5, 9)
my_range(x)

What is the result? Will x_min and x_max be changed? Try this:

x_min <- 100

x_max <- 10
x <- c(2, 3, 5, 1, 5, 9)
my_range(x)
x_min
x_max

First, we find that my_range gives the correct result. x_min and x_max we created before calling my_range
do not affect its operation.
Second, x_min and x_max we created, interestingly, are not affected, even if variables with the same
names are used in the function (x_min <- min(x) and x_max <- max(x)).
This introduces the idea of local variables. x_min and x_max, created by <- inside the function, are local.
They are created inside the function and so are different from what exists “out there”.4 This behavior is
useful because we do not need to worry about overwriting variables that exist in the environment calling
the function.5 These local variables will disappear after the function ends.
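
The last point, that local variables disappear after the function ends, can be checked with a short sketch. The names demo_local() and tmp_local are made up for this illustration, and the sketch assumes that no variable called tmp_local has been created elsewhere in your session.

```r
# A hypothetical function that creates a local variable
demo_local <- function() {
    tmp_local <- 42
    tmp_local
  }

# The value of the local variable is returned ...
out <- demo_local()
out

# ... but tmp_local itself existed only while demo_local() was running.
# Assuming no variable of this name was created elsewhere,
# this should be FALSE:
exists("tmp_local")
```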

2.4 Default Value for an Argument

There is a problem with my_range. Recall that min and max have an argument na.rm to control how to
handle missing values. How can we let users of my_range set na.rm if they want to, and use FALSE,
the default value, if they do not set it? This is an improved version:

my_range2 <- function(x,
                      my.na.rm = FALSE) {
    x_min <- min(x,
                 na.rm = my.na.rm)
    x_max <- max(x,
                 na.rm = my.na.rm)
    c(min = x_min,
      max = x_max)
  }

First, we add an argument, my.na.rm. We set the default value of my.na.rm to FALSE. If my.na.rm is
provided, then the provided value will be used. If not, then my.na.rm = FALSE. In the calls to min and max,
we set the argument na.rm of them to the value of our argument, my.na.rm.
Let’s see how it works by trying this:

x2 <- c(4, 2, 1, 10, NA, 7)

my_range2(x2)
my_range2(x2,
my.na.rm = TRUE)

Now users can decide how to handle missing values, and the function falls back on a default way to
handle them if users give no instruction.
4
Technically, in the parent frame.
5
A function can overwrite variables outside it, by using <<-. However, this should be avoided. Use this only if there are no other
solutions.

Setting the default values of arguments makes a function easier to use, if the default values are the
values users usually want.

3 Useful Statements in R
This section introduces a few statements useful for writing R functions. Note that all these statements
can also be used in R scripts, not just in a function.

3.1 if and else

Suppose we want to write a function to check if a number is less than a cutoff or not, for example, whether
a p-value is less than the desired level of significance (alpha), say, .05. This is a description of what we
want to do:
Get the number and the cutoff.
If the number is less than the cutoff:
    Return a string: "sig."
Else:
    Return a string: "n.s."
This is one way to write this function:

is_sig <- function(pvalue,
                   alpha = .05) {
    if (pvalue < alpha) {
        return("sig.")
      } else {
        return("n.s.")
      }
    print("This sentence will never be printed")
  }

Note that we set the default value of alpha to .05, the usual maximum level of significance.
Let’s test this function:

is_sig(pvalue = .04)
# We can omit the name
is_sig(.06)
# We set another level of significance
is_sig(.04, alpha = .01)

It should work. This function uses if ( ... ) { ... } else { ... }. After if is the condition enclosed
in a pair of parentheses. This condition should be a one-element logical vector, or an expression
that will result in a one-element logical vector. In is_sig, pvalue < alpha should result in TRUE or FALSE
(though NA is possible).
If the condition is TRUE, then the expression inside the next pair of curly brackets will be run. If FALSE,
then the expression in the pair of curly brackets after else will be run. (The case of NA will be covered
later.)
NOTE: Be careful when writing a condition. The version we used above results in an error if (pvalue <
alpha) does not return one single logical value (in R 4.2.0 or later; earlier versions gave a warning and
used only the first element). Try this:

ps <- c(.04, .06)

is_sig(pvalue = ps)

Do you know why it results in an error?
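
The call fails because pvalue < alpha now gives two logical values, while if needs exactly one. If several p-values need to be checked at once, the sketch below shows two possible ways. Neither is part of the original handout: sapply() calls is_sig() once for each element, and ifelse() is a base R function that works element by element. The definition of is_sig() is repeated (without the print() line) so that the code can be run on its own.

```r
# Same logic as is_sig() above
is_sig <- function(pvalue,
                   alpha = .05) {
    if (pvalue < alpha) {
        return("sig.")
      } else {
        return("n.s.")
      }
  }

ps <- c(.04, .06)

# The condition has two elements, not one
ps < .05

# Option 1: call is_sig() once for each element
out1 <- sapply(ps, is_sig)
out1

# Option 2: use ifelse(), which works element by element
out2 <- ifelse(ps < .05, "sig.", "n.s.")
out2
```

Both options should return "sig." for the first p-value and "n.s." for the second.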

This example also introduces a new function, return(). This is used to tell the function to end and return
the argument of return immediately. Because the if ... else structure already covers all possibilities,
and return() is used in all possibilities, the line print("This sentence will never be printed"), although
being the last line, will never be run.

3.1.1 if without else

There are other variants of this structure. For example, we can omit else. Let’s improve the is_sig()
function:

is_sig2 <- function(pvalue, alpha = 0.05) {
    if ((alpha <= 0) || (alpha >= 1)) {
        stop("The level of significance (alpha) is invalid.")
      }
    if (pvalue < alpha) {
        return("sig.")
      } else {
        return("n.s.")
      }
  }

This introduces the idea of testing an argument. The level of significance should not be zero or less (no
p-value can be less than 0) and should not be one or higher (nearly every p-value is less than 1). Therefore,
before checking the p value, we check alpha first. If either (alpha <= 0) or (alpha >= 1) is TRUE, then the
line stop(...) will be run. There is no need for else because we only need to check whether a condition is
met. If not, then we can proceed as usual.
NOTE: || and && are usually used in an if condition because they return a single logical value, unlike | and
&, which work element by element.
This example also uses a new function, stop(). This function is commonly used in a function. It, obviously,
“stops” a function. But it does not just stop the function. It will “raise” an error, and the argument
is the error message.
Let’s try this version, is_sig2:6

is_sig2(pvalue = .04, alpha = 1)

Error in is_sig2(pvalue = 0.04, alpha = 1): The level of significance (alpha) is invalid.

is_sig2(pvalue = .04, alpha = 0)

Error in is_sig2(pvalue = 0.04, alpha = 0): The level of significance (alpha) is invalid.

is_sig2(pvalue = .04, alpha = .05)

[1] "sig."
Instead of stopping the function and raising an error, we can also return NA, that is, replace the call to
stop by return(NA). However, sometimes we may prefer raising an error in this case because NA can
also be interpreted as missing, for example, p-value is NA (although an error will actually occur if pvalue
is NA, for a reason described later).
Certainly, we can also apply a similar test to pvalue, which should range from 0 to 1. I will leave it as an
exercise for you.
6
The error messages may be printed outside the margin. I cannot yet solve this problem. (formatR and tidy do not work as
some suggested.)

3.1.2 if … else if … else if … else

Suppose we want to check the p value and then return the conventional symbols we use in psychology
to denote the achieved level of significance, i.e., *, **, and *** for p value less than .05, .01, and .001,
respectively. We can use else if:

pstar <- function(pvalue) {
    if (pvalue < .001) {
        return("***")
      } else if (pvalue < .01) {
        return("**")
      } else if (pvalue < .05) {
        return("*")
      } else {
        return("n.s.")
      }
  }

Let’s try this function:

pstar(.06)
pstar(.04)
pstar(.009)
pstar(.00000001)

Note that, whenever a condition is met, the code inside the next curly brackets will be run, and all
remaining conditions will not be checked. If p < .001, then p is also < .01 and < .05. Therefore, we need
to check p < .001 first.
Having many conditions can be difficult to read. If appropriate, we can consider using switch(). We can
also simply remove else and else if.

pstar2 <- function(pvalue) {
    if (pvalue < .001) {
        return("***")
      }
    if (pvalue < .01) {
        return("**")
      }
    if (pvalue < .05) {
        return("*")
      }
    "n.s."
  }

This function works like pstar does.

pstar2(.06)
pstar2(.04)
pstar2(.009)
pstar2(.00000001)

This version uses if only. If the condition of an if block is not met, then R will proceed to the line after
this block.
Which version to use depends on the context and personal preference. Using else or else if may
make the code look organized. However, sometimes it can be more difficult to read than just having a
sequence of if blocks, especially when we have a lot of lines inside an if block.
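
The text above mentions switch() as an alternative when there are many conditions. It is most suitable when the choice depends on matching one value, such as a string, against several options, rather than on ranges of numbers as in pstar(). The sketch below is only an illustration: the function cor_label() and its labels are made up and are not part of the handout.

```r
# A hypothetical function: return a label for a type of correlation
cor_label <- function(type) {
    switch(type,
           pearson = "Pearson's r",
           spearman = "Spearman's rho",
           kendall = "Kendall's tau",
           # The last unnamed argument is used if nothing matches
           stop("Unknown type of correlation."))
  }

cor_label("pearson")
cor_label("kendall")
```

Each option takes one line, which can be easier to read than a long chain of else if. Calling cor_label() with any other string should raise the error in the last line.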

3.1.3 if and NA
Note that, when testing the condition, NA is neither FALSE nor TRUE. It will result in an error. Therefore,
the following call will result in an error:

is_sig2(pvalue = NA)

Error in if (pvalue < alpha) {: missing value where TRUE/FALSE needed

It is because the condition is pvalue < alpha. NA < .05 is NA, and so the condition is if (NA) {...},
resulting in the error message.
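
One possible way to avoid this error is to test for a missing value first, using is.na(), before the comparison is made. The sketch below is an assumption about how one might do this, not the handout's own solution. The name is_sig_na() is made up, and returning NA is only one of several reasonable choices (raising an error by stop() is another).

```r
# A hypothetical variant of is_sig() that handles a missing p-value
is_sig_na <- function(pvalue, alpha = .05) {
    # Check for a missing value first, so that NA never
    # reaches the condition pvalue < alpha
    if (is.na(pvalue)) {
        return(NA_character_)
      }
    if (pvalue < alpha) {
        return("sig.")
      }
    "n.s."
  }

is_sig_na(NA)
is_sig_na(.04)
is_sig_na(.06)
```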

3.2 for … in … and while

These statements will not be covered here, though they may be introduced later if they are needed for a task.
A for … in … block, usually called a for loop, is common in many programming languages. However,
in R, the same task can usually be done by the family of apply functions (e.g., lapply, sapply, etc.).
These functions are useful for some tasks and will be covered as the needs arise.
If you only need to write some functions to simplify some tasks, while is usually not necessary. It is
used for repeating a process while a condition is true, which is common for algorithms (e.g., maximum
likelihood estimation). However, you may rarely need to use it.
You can use help("for") and help("while") to learn more about them. They share the same help
page.
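
Although these statements are not covered in detail, a minimal sketch may help to show how a for loop and an apply function can do the same task. The data are made up for this illustration.

```r
# Some made-up scores from three groups
scores <- list(g1 = c(2, 4, 6),
               g2 = c(1, 3, 5, 7),
               g3 = c(10, 20))

# Using a for loop: prepare the output first, then fill it in
means_loop <- numeric(length(scores))
names(means_loop) <- names(scores)
for (i in seq_along(scores)) {
    means_loop[i] <- mean(scores[[i]])
  }
means_loop

# Using sapply(): apply mean() to each element of the list
means_apply <- sapply(scores, mean)
means_apply
```

The two results should be the same. The version using sapply() is shorter because it creates the output and keeps the names for us.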

4 Examples
4.1 Nonparametric Bootstrapping Confidence Intervals
(This section assumes that you have learned about nonparametric bootstrapping, including its pros and
cons.)
R comes with a package boot that can do nonparametric bootstrapping. More and more packages can
form nonparametric bootstrapping confidence intervals (e.g., lavaan can do this for parameter estimates
in structural equation modeling, and psych::alpha() can do this for Cronbach’s alpha). Nevertheless,
there may be cases in which such a function has not yet been developed (or it has but you could not find
it). Even if there is such a function, it is still good practice to learn to write a function to do this.
The boot() function in the boot package does not know how to compute the statistic. It requires users to
supply the function to compute the statistic. Its job is to draw the bootstrap samples, call this function on
each sample, and return the results to the users.

4.1.1 Pearson’s r
Let us consider a practical scenario: forming a nonparametric bootstrapping confidence interval for a
Pearson’s r.
We already know that psych::cor.ci() can do this. Let’s try to do it using our own function.

4.1.1.1 Write a Function to Compute the Statistics

To use boot::boot(), we first need to write a function which must have at least these two arguments as
the first and second argument:
• First argument: The data frame.
• Second argument: A numeric vector to select rows (cases) from the data frame.

This function must return a vector of the statistic.

In our case, the statistic is the Pearson’s r.
So this is the form of the function (names of the first and second arguments do not matter):

my_r <- function(data,
                 index) {
    # Resampling: Select rows from data.
    # Compute and return Pearson's r
  }

Let’s try to compute the correlation first, using the dataset similar to the one used in the handout
SSGC_8802_Correlation_in_R, but with 100 cases:

library(readxl)
dat <- read_excel("correlation_example_100_cases.xlsx")
cor(dat)

                       SelfEsteem Happiness EmotionalIntelligence
SelfEsteem             1.00000000 0.5375000           -0.09598361
Happiness              0.53750000 1.0000000            0.18428854
EmotionalIntelligence -0.09598361 0.1842885            1.00000000
So, we know cor() can compute the correlations. But we need a vector of correlations. We can use the
subsetting techniques for matrices. However, there is a convenient function, as.vector(), to convert a
matrix to a vector:

as.vector(cor(dat))

[1] 1.00000000 0.53750000 -0.09598361 0.53750000 1.00000000

[6] 0.18428854 -0.09598361 0.18428854 1.00000000
So, we can use as.vector() and then extract the three correlations:

as.vector(cor(dat))[c(2, 3, 6)]

[1] 0.53750000 -0.09598361 0.18428854
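
Why elements 2, 3, and 6? as.vector() reads a matrix column by column, so in a 3 by 3 matrix the elements below the diagonal are at these positions. The sketch below checks this using a made-up correlation matrix. It also shows lower.tri(), a base R function not used in this handout, which finds these positions for us and so would still work if the number of variables changed.

```r
# A made-up 3 by 3 correlation matrix
r_mat <- matrix(c( 1.0, 0.5, -0.1,
                   0.5, 1.0,  0.2,
                  -0.1, 0.2,  1.0),
                nrow = 3,
                ncol = 3)

# as.vector() reads the matrix column by column
as.vector(r_mat)

# Positions of the elements below the diagonal
which(lower.tri(r_mat))

# Two ways to extract these elements
as.vector(r_mat)[c(2, 3, 6)]
r_mat[lower.tri(r_mat)]
```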

Done! Now we adapt this to the body of the function:

my_r <- function(data,
                 index) {
    # Resampling: Select rows from data.
    # Compute and return Pearson's r
    out <- as.vector(cor(data))[c(2, 3, 6)]
    return(out)
  }

Let’s check the function:

my_r(data = dat)

[1] 0.53750000 -0.09598361 0.18428854

It works, but the output is not easy to read. We can add names to the vector using names():

my_r <- function(data,
                 index) {
    # Resampling: Select rows from data.
    # Compute and return Pearson's r
    out <- as.vector(cor(data))[c(2, 3, 6)]
    names(out) <- c("SE-HP", "SE-EI", "HP-EI")
    return(out)
  }

Let’s check the function again:

my_r(data = dat)

      SE-HP       SE-EI       HP-EI
 0.53750000 -0.09598361  0.18428854
Now we have completed the part to compute Pearson’s r. Let’s write the part to select cases. We can
just apply the technique we used before for selecting rows:

my_r <- function(data,
                 index) {
    # Resampling: Select rows from data.
    data0 <- data[index, ]
    # Compute and return Pearson's r
    out <- as.vector(cor(data0))[c(2, 3, 6)]
    names(out) <- c("SE-HP", "SE-EI", "HP-EI")
    return(out)
  }

I named the resampled data data0, instead of overwriting data, just to make it obvious that the correlations
are computed on the sampled rows of data.
Let’s test the function again:

my_r(data = dat,
index = 1:5)

     SE-HP      SE-EI      HP-EI
 0.6736833 -0.2859402  0.3886301
The correct answers:

cor(dat[1:5, ])

                      SelfEsteem Happiness EmotionalIntelligence
SelfEsteem             1.0000000 0.6736833            -0.2859402
Happiness              0.6736833 1.0000000             0.3886301
EmotionalIntelligence -0.2859402 0.3886301             1.0000000
This function is now ready to be used in boot::boot().
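
Before handing such a function to boot(), it may help to see roughly what happens to it. The sketch below does a simple version of the process by hand: draw row numbers with replacement, call the statistic function with these row numbers as index, and repeat many times. This is only a conceptual illustration, not how boot() is actually implemented. The data (toy) and the function toy_r() are made up so that the code can be run without the data file.

```r
# Made-up data: two correlated variables, 100 cases
set.seed(1234)
n <- 100
x <- rnorm(n)
toy <- data.frame(x = x,
                  y = .5 * x + rnorm(n))

# A statistic function in the form described above:
# first argument is the data, second argument selects rows
toy_r <- function(data,
                  index) {
    data0 <- data[index, ]
    cor(data0$x, data0$y)
  }

# Draw row numbers with replacement, compute the statistic,
# and repeat many times
R <- 2000
boot_est <- numeric(R)
for (i in seq_len(R)) {
    index_i <- sample(n, size = n, replace = TRUE)
    boot_est[i] <- toy_r(data = toy, index = index_i)
  }

# Estimate in the original sample: all rows, each used once
toy_r(data = toy, index = seq_len(n))

# A simple percentile interval from the bootstrap estimates
quantile(boot_est, probs = c(.025, .975))
```

This shows why the second argument is needed: it is how the rows of each bootstrap sample are passed to our function.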

4.1.1.2 Do Nonparametric Bootstrapping

After we confirmed that this function works as expected, we use it in boot. These are the arguments that
we will use (see help("boot") for further details):
• data: The dataset to be resampled.
• statistic: A function that will compute the target statistic.
• R: The number of bootstrap samples. For confidence intervals, it should be at least 2000 or even
5000.
This is the code:

library(boot)
set.seed(23456)
boot_r <- boot(data = dat,
               statistic = my_r,
               R = 2000)

set.seed() is used to set the seed for the random number generator, to make the results reproducible.
The output, boot_r, stores the results from the 2000 bootstrap samples. We can use plot to examine
the distribution of the 2000 bootstrap estimates of Pearson’s r.
Note that our function my_r() returns three correlations for each sample. Therefore, we need to add
index to indicate the statistic we need. Let’s add index = 1 to plot the 2000 bootstrap correlations
between self-esteem and happiness:

plot(boot_r,
     index = 1)

[Figure, index = 1. Left panel: "Histogram of t", the density of the bootstrap estimates t*. Right panel: t* plotted against the quantiles of the standard normal.]

These are the plots for the other two correlations:

plot(boot_r,
     index = 2)

[Figure, index = 2. Left panel: "Histogram of t", the density of the bootstrap estimates t*. Right panel: t* plotted against the quantiles of the standard normal.]

plot(boot_r,
     index = 3)

[Figure, index = 3. Left panel: "Histogram of t", the density of the bootstrap estimates t*. Right panel: t* plotted against the quantiles of the standard normal.]

To get the bootstrap confidence interval, we can use boot::boot.ci(). In this example, we will only
use the percentile bootstrap confidence interval. Therefore, we set type to "perc" (percentile). The level
of confidence is 95%, or .95. Therefore, we set conf to .95. (See help("boot.ci") for further details.)
Note that we also need to add index in this case because we computed three correlations:

boot.ci(boot_r,
        index = 1,
        conf = .95,
        type = "perc")

BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS

Based on 2000 bootstrap replicates

CALL :
boot.ci(boot.out = boot_r, conf = 0.95, type = "perc", index = 1)

Intervals :
Level Percentile
95% ( 0.3440, 0.6856 )
Calculations and Intervals on Original Scale
In this example, the nonparametric bootstrap percentile 95% confidence interval of the Pearson’s r
between self-esteem and happiness is 0.3440 to 0.6856.
We can compare the results with psych::cor.ci():

library(psych)

Attaching package: 'psych'

The following object is masked from 'package:boot':

logit

set.seed(23456)
cor_ci_out <- cor.ci(dat,
n.iter = 2000,
plot = FALSE)
print(cor_ci_out,
digits = 4)

Call:corCi(x = x, keys = keys, n.iter = n.iter, p = p, overlap = overlap,

poly = poly, method = method, plot = plot, minlength = minlength,
n = n)

Coefficients and bootstrapped confidence intervals

SlfEs Hppns EmtnI
SelfEsteem 1.00
Happiness 0.54 1.00
EmotionalIntelligence -0.10 0.18 1.00

scale correlations and bootstrapped confidence intervals

lower.emp lower.norm estimate upper.norm upper.emp p
SlfEs-Hppns 0.3472 0.3513 0.5375 0.6892 0.6909 0.0000
SlfEs-EmtnI -0.2715 -0.2693 -0.0960 0.0866 0.0919 0.3080
Hppns-EmtnI 0.0032 -0.0052 0.1843 0.3635 0.3630 0.0594
The nonparametric bootstrap CIs are not exactly the same because boot::boot.ci() and
psych::cor.ci() use slightly different ways to find the confidence limits. However, they are close
enough for practical purposes.
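
To see how different rules for finding the limits can lead to slightly different intervals, we can look inside the object returned by boot(). The sketch below uses made-up data and a made-up function, toy_r(), so that it can be run without the data file. It compares the percentile limits from boot.ci() with those from quantile() applied to the same bootstrap estimates. It only illustrates the general point and does not show how psych::cor.ci() works internally.

```r
library(boot)

# Made-up data: two correlated variables, 100 cases
set.seed(1234)
n <- 100
x <- rnorm(n)
toy <- data.frame(x = x,
                  y = .5 * x + rnorm(n))

toy_r <- function(data,
                  index) {
    data0 <- data[index, ]
    cor(data0$x, data0$y)
  }

set.seed(5678)
toy_boot <- boot(data = toy,
                 statistic = toy_r,
                 R = 2000)

# t0: the estimate in the original sample
toy_boot$t0

# t: a matrix of bootstrap estimates, one row per bootstrap sample
dim(toy_boot$t)

# Percentile limits from boot.ci():
# the last two columns of the element percent
toy_ci <- boot.ci(toy_boot,
                  conf = .95,
                  type = "perc")
toy_ci$percent[4:5]

# Limits from quantile(), which uses a different rule
# to find the percentiles
quantile(toy_boot$t[, 1], probs = c(.025, .975))
```

The two sets of limits should be close but need not be identical, even though they are based on exactly the same 2000 bootstrap estimates.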

5 Optional Topics
5.1 Style
Some align the closing curly bracket with the first line of the definition:

my_addition <- function(x, y) {
  x + y
}

I indent the closing brackets too, as in this document, simply because this is consistent with how we
indent lines in Python. I prefer a (personal) style that is similar across languages.
Some use four whitespace characters for indentation:

my_addition <- function(x, y) {
    x + y
}

Using four whitespace characters is a common practice in programming. I use two whitespace characters
simply because I usually work on a small screen or window.
Some write one argument per line:

my_addition <- function(x,
                        y) {
    x + y
  }

I will just do whatever is easy to type and read, for me. :)

If you write the code just for yourself, being consistent is enough, in my opinion. Use whatever style
suits your own needs and preferences.

5.2 Pass-By-Value
R functions use pass-by-value in handling argument values. Therefore, a function normally will not
change the sources of its arguments, although it can return a modified version of its arguments.
For example:

demo_pass_by_value <- function(x) {
    x <- x^2
    x
  }
x_origin <- 10
x_squared <- demo_pass_by_value(x = x_origin)
x_squared

[1] 100

x_origin

[1] 10
Even though we set x to x_origin and then x is changed inside the function, x_origin is not changed.
It is because it is the value of x_origin that is passed to x, not x_origin itself.
Certainly, we can update x_origin to the result of demo_pass_by_value(), but this is just a
reassignment, not a consequence of demo_pass_by_value():

x_origin <- 10
x_origin <- demo_pass_by_value(x = x_origin)
x_origin

[1] 100

5.3 Dotdotdot
The argument ... is sometimes used by one function to pass arguments to another function. You may
notice that boot() has this argument (see help("boot")). This section illustrates how ... can be used
to do bootstrapping.
In doing bootstrapping, the function used to compute the target statistic may have its own arguments.
boot() collects these arguments using ..., and passes them to the function assigned to statistic.
We can use this feature to revise my_r() such that we can form the bootstrapping confidence interval
for any two variables we want:

my_r_any2 <- function(data,
                      index,
                      x,
                      y) {
    # Resampling: Select rows from data.
    data0 <- data[index, ]
    # Compute and return Pearson's r
    out <- cor(data0[c(x, y)])[2, 1]
    return(out)
  }

my_r_any2(data = dat,
          index = 1:50,
          x = "SelfEsteem",
          y = "Happiness")

[1] 0.4033987

# Check the correlation

cor(dat[1:50, c("SelfEsteem", "Happiness")])

SelfEsteem Happiness
SelfEsteem 1.0000000 0.4033987
Happiness 0.4033987 1.0000000
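Let’s first run the original version, my_r(), again, and then run the new version, my_r_any2(), for
comparison. With the new version, there is no need for index in boot.ci() because we compute only
one correlation: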

set.seed(23456)
boot_r <- boot(data = dat,
               statistic = my_r,
               R = 2000)
boot.ci(boot_r,
        index = 1,
        type = "perc")

BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS

Based on 2000 bootstrap replicates

CALL :
boot.ci(boot.out = boot_r, type = "perc", index = 1)

Intervals :
Level Percentile
95% ( 0.3440, 0.6856 )
Calculations and Intervals on Original Scale

set.seed(23456)
boot_r2 <- boot(data = dat,
                statistic = my_r_any2,
                R = 2000,
                x = "SelfEsteem",
                y = "Happiness")
boot.ci(boot_r2,
        type = "perc")

BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS

Based on 2000 bootstrap replicates

CALL :
boot.ci(boot.out = boot_r2, type = "perc")

Intervals :
Level Percentile
95% ( 0.3440, 0.6856 )
Calculations and Intervals on Original Scale
You can see that the two confidence intervals are identical.7
In your own research, whether you will use this technique depends on how flexible you want the function
to be.
• If you are pretty sure that you only need bootstrap CI for a very specific scenario (e.g., only the
statistic for a set of variables computed in a specific way), then no need to use these additional
arguments.
• However, if you think you may need to adjust the analysis, such as trying other variables or options
for the analysis (e.g., using another measure of correlation), then you may want to write a more
general function.
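
As an illustration of the second case, a function can itself use ... to pass options to the function that does the computation. The sketch below is a hypothetical variant of my_r_any2(), called my_r_dots() here, which forwards any additional arguments to cor(). The data are made up so that the code can be run without the data file.

```r
# A hypothetical variant of my_r_any2() that forwards
# additional arguments to cor() through ...
my_r_dots <- function(data,
                      index,
                      x,
                      y,
                      ...) {
    data0 <- data[index, ]
    cor(data0[[x]], data0[[y]], ...)
  }

# Made-up data: two correlated variables, 100 cases
set.seed(1234)
n <- 100
x <- rnorm(n)
toy <- data.frame(x = x,
                  y = .5 * x + rnorm(n))

# No additional argument: cor() uses its default, Pearson's r
my_r_dots(data = toy, index = 1:n, x = "x", y = "y")

# method is not an argument of my_r_dots().
# It is collected by ... and passed to cor()
my_r_dots(data = toy, index = 1:n, x = "x", y = "y",
          method = "spearman")
```

With this design, options of cor() can be used without listing each of them as an argument of our own function.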

6 Final Remarks
There are a lot of issues about functions not covered here. I myself also still have a lot to learn. The goal
of this document is not to make you a programmer (I am also not a programmer). The goal is to help you
to know how writing functions can help us to do analysis in our research. We have learned how writing
functions can make it easier to do several tasks again and again for different models or variables. We
have also learned how we can write a function to compute something that we need. This will be useful
if you learn about some new statistic or measure that you want to use but that is not yet available in an
existing function.
You will definitely encounter some problems when you try to write your own functions. Learn what you
need when using R in your research. Certainly, if you have some experience in programming, or maybe
you are already an experienced programmer, you can consider reading some books on programming in
R to learn more about the technical details in R.

7 Further References
The book by Fox and Weisberg (2019) on doing regression analysis using R also has a chapter
on programming in R (Chapter 10), aimed at researchers. You can see if this chapter is suitable for you:
• Fox, J., & Weisberg, S. (2019). An R companion to applied regression (3rd ed.). Sage.
URL https://socialsciences.mcmaster.ca/jfox/Books/Companion/index.html. (UM library has
the 2nd edition: https://umlibrary.primo.exlibrisgroup.com/permalink/853UOM_INST/1jn7l3f/alma991007321009706306)
7
I call set.seed() before each call to boot(), and use the same seed. We usually do not do this. However, these two versions
are fitted to the same dataset. Therefore, to make the results comparable, we will want these two bootstrapping analyses to have
the same bootstrap samples. This can be done by using the same number in set.seed() right before calling boot().
The present document focuses on writing functions, not on bootstrapping. To learn more about doing
bootstrapping in R, you can read the following article.
• Rousselet, G. A., Pernet, C. R., & Wilcox, R. R. (2021). The percentile bootstrap: A primer with
step-by-step instructions in R. Advances in Methods and Practices in Psychological Science, 4(1),
1-10. https://doi.org/10.1177/2515245920911881
If you are interested in learning more about the technical details not covered here, you can read the
official documentation on functions:
• https://cran.r-project.org/doc/manuals/r-release/R-lang.html#Functions

32 pages
Intermediate R
No ratings yet
Intermediate R
13 pages
Unit 5 - DS - 1st year
No ratings yet
Unit 5 - DS - 1st year
19 pages
R Imp Funtions
No ratings yet
R Imp Funtions
10 pages
Functions in R Programming
No ratings yet
Functions in R Programming
22 pages
654495996 Functions in R Programming
No ratings yet
654495996 Functions in R Programming
22 pages
R Programming 101 Part 1
No ratings yet
R Programming 101 Part 1
53 pages
Functional Programming: Hadley Wickham
No ratings yet
Functional Programming: Hadley Wickham
58 pages
RBigData NTL
No ratings yet
RBigData NTL
24 pages
R For Programmers: Important Notice
No ratings yet
R For Programmers: Important Notice
104 pages
Introduction To R PDF
No ratings yet
Introduction To R PDF
56 pages
R Programming Slides
No ratings yet
R Programming Slides
73 pages
AnalyticsEdge Rmanual PDF
100% (1)
AnalyticsEdge Rmanual PDF
44 pages
Document (1)
No ratings yet
Document (1)
32 pages
R
No ratings yet
R
13 pages
Lec 4
No ratings yet
Lec 4
18 pages
Functions: FRE6871 & FRE7241, Fall 2020
No ratings yet
Functions: FRE6871 & FRE7241, Fall 2020
57 pages
Uni T - 2 - R Programming
No ratings yet
Uni T - 2 - R Programming
10 pages
Unit 2 R
No ratings yet
Unit 2 R
16 pages
An R Tutorial Starting Out
No ratings yet
An R Tutorial Starting Out
9 pages
R Software - Notes
No ratings yet
R Software - Notes
18 pages
STTN 225 R Summary
No ratings yet
STTN 225 R Summary
18 pages
Getting Started in R
No ratings yet
Getting Started in R
39 pages
Open Data Structures: An Introduction
From Everand
Open Data Structures: An Introduction
Pat Morin
4/5 (4)
The Satisfiability Problem: Algorithms and Analyses
From Everand
The Satisfiability Problem: Algorithms and Analyses
Uwe Schöning
No ratings yet
Control Systems
From Everand
Control Systems
Francisco Luis Pagola y de las Heras
No ratings yet
Basic Research and Technologies for Two-Stage-to-Orbit Vehicles: Final Report of the Collaborative Research Centres 253, 255 and 259
From Everand
Basic Research and Technologies for Two-Stage-to-Orbit Vehicles: Final Report of the Collaborative Research Centres 253, 255 and 259
Dieter Jacob
No ratings yet
Presentations with LaTeX: Which package, which command, which syntax?
From Everand
Presentations with LaTeX: Which package, which command, which syntax?
Herbert Voß
No ratings yet
Mastering Python Advanced Concepts and Practical Applications
From Everand
Mastering Python Advanced Concepts and Practical Applications
Aissa Younes
No ratings yet
Business Object Modeling (BOM) workbook: A pattern-based approach to creating, managing and using an enterprise data model
From Everand
Business Object Modeling (BOM) workbook: A pattern-based approach to creating, managing and using an enterprise data model
Martine Alaerts
No ratings yet
A Framework For Discourse Analysis Ii
From Everand
A Framework For Discourse Analysis Ii
Dr. Wilbur Pickering, Thm, Phd
No ratings yet
Developing Intelligent Agent Systems: A Practical Guide
From Everand
Developing Intelligent Agent Systems: A Practical Guide
Lin Padgham
3/5 (1)
SSGC 8802 - Correlation
No ratings yet
SSGC 8802 - Correlation
30 pages
SSGC 8802 - Descriptive Statistics
No ratings yet
SSGC 8802 - Descriptive Statistics
32 pages
SSGC 8802 - Inferential Statistics
No ratings yet
SSGC 8802 - Inferential Statistics
30 pages
SSGC 8802 A Brief Introduction To R - GUI
No ratings yet
SSGC 8802 A Brief Introduction To R - GUI
28 pages
2.definition:: Is Not Equal To 0. Polynomial Functions of Only One
No ratings yet
2.definition:: Is Not Equal To 0. Polynomial Functions of Only One
6 pages
Answer Key: Understand
No ratings yet
Answer Key: Understand
3 pages
11 19 20 Elective 3 QUIZ
No ratings yet
11 19 20 Elective 3 QUIZ
3 pages
HDPE Material of Pipe
No ratings yet
HDPE Material of Pipe
6 pages
Drilling Practical 1
No ratings yet
Drilling Practical 1
6 pages
Material Data:: Foundation For Pipe Support
100% (1)
Material Data:: Foundation For Pipe Support
8 pages
Yang Et Al. - 2020 - Experimental Study On Single-Phase Hybrid Microcha
No ratings yet
Yang Et Al. - 2020 - Experimental Study On Single-Phase Hybrid Microcha
11 pages
Tech Paper - Reheat Steam Temperature Control Concept in Once-Through Boilers - A Review
No ratings yet
Tech Paper - Reheat Steam Temperature Control Concept in Once-Through Boilers - A Review
5 pages
Model 780-001 Indoor Explosion-Proof Single Party Handset Station
No ratings yet
Model 780-001 Indoor Explosion-Proof Single Party Handset Station
2 pages
Lec # 26 Brushless DC Motor
No ratings yet
Lec # 26 Brushless DC Motor
12 pages
Ficha Tecnica Erp025-030vc
No ratings yet
Ficha Tecnica Erp025-030vc
8 pages
Whats New in Asme A 2010
No ratings yet
Whats New in Asme A 2010
19 pages
KVL and KCL
100% (1)
KVL and KCL
14 pages
Monosaccharides Lecture
No ratings yet
Monosaccharides Lecture
24 pages
Module 2 Exercise2
No ratings yet
Module 2 Exercise2
2 pages
Keyboard Layout Author Date Source Notes: Planck/rev6 LAYOUT - Ortho - 4x12
No ratings yet
Keyboard Layout Author Date Source Notes: Planck/rev6 LAYOUT - Ortho - 4x12
2 pages
Assessing Emotional Intelligence
No ratings yet
Assessing Emotional Intelligence
16 pages
BAS-PRC031-EN, 10 Jun 2013 Product Catalog
No ratings yet
BAS-PRC031-EN, 10 Jun 2013 Product Catalog
209 pages
Class 1: Prepared By: Dynamo Robotics Academy Abdurahman Shemsedin
No ratings yet
Class 1: Prepared By: Dynamo Robotics Academy Abdurahman Shemsedin
9 pages
Physics-Informed Neural Nets for Control of Dynamical Systems
No ratings yet
Physics-Informed Neural Nets for Control of Dynamical Systems
23 pages
Xna Multi Threading
No ratings yet
Xna Multi Threading
36 pages
Lec 5-Stacks and Queues
No ratings yet
Lec 5-Stacks and Queues
71 pages
3120999146-Mining Bilateral Reviews For Online Transaction Prediction
No ratings yet
3120999146-Mining Bilateral Reviews For Online Transaction Prediction
12 pages
Economic Incentive For Intermittent Operation of Air Separation Plants With Variable Power Cost
No ratings yet
Economic Incentive For Intermittent Operation of Air Separation Plants With Variable Power Cost
8 pages
MCQ Computer Operation
No ratings yet
MCQ Computer Operation
18 pages
Student Exploration: Refraction
No ratings yet
Student Exploration: Refraction
8 pages