R Bootstrap PDF
These notes work through a simple example to show how one can program R to do both
jackknife and bootstrap sampling. We start with bootstrapping.
Bootstrap Calculations
R has a number of nice features for easy calculation of bootstrap estimates and confidence
intervals. To see how to use these features, consider the following 25 observations:
8.26 6.33 10.4 5.27 5.35 5.61 6.12 6.19 5.2 7.01 8.74 7.78
7.02 6 6.5 5.8 5.12 7.41 6.52 6.21 12.28 5.6 5.38 6.6
8.74
Suppose we wish to estimate the coefficient of variation, $CV = \sqrt{\mathrm{Var}}/\bar{x}$. Let’s do this with
a bootstrap estimator.
First, let’s put the data into a vector, which we will call x,
> x <-c(8.26, 6.33, 10.4, 5.27, 5.35, 5.61, 6.12, 6.19, 5.2,
7.01, 8.74, 7.78, 7.02, 6, 6.5, 5.8, 5.12, 7.41, 6.52, 6.21,
12.28, 5.6, 5.38, 6.6, 8.74)
Now let’s define a function in R, which we will call CV, to compute the coefficient of
variation,
> CV <- function(x) sqrt(var(x))/mean(x)
So, let’s compute the CV
> CV(x)
[1] 0.2524712
To generate a single bootstrap sample from this data vector, we use the command
> sample(x,replace=T)
which generates a bootstrap sample of the data vector x by sampling with replacement.
Hence, to compute the CV using a single bootstrap sample,
> CV(sample(x,replace=T))
[1] 0.2242572
The particular value that R returns for you will be different as the sample is random.
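Repeating this resampling step many times gives a whole vector of bootstrap values of the CV. Here is a minimal sketch of one way to do that. The name boot for the result, the number of replicates B = 1000, and the seed are all assumptions made for illustration, so the numbers you get will not match any values quoted in these notes.

```r
# Data vector and CV function, as defined earlier in these notes
x <- c(8.26, 6.33, 10.4, 5.27, 5.35, 5.61, 6.12, 6.19, 5.2,
       7.01, 8.74, 7.78, 7.02, 6, 6.5, 5.8, 5.12, 7.41, 6.52, 6.21,
       12.28, 5.6, 5.38, 6.6, 8.74)
CV <- function(x) sqrt(var(x)) / mean(x)

set.seed(1)   # assumed seed, only so that a rerun gives the same draws
B <- 1000     # assumed number of bootstrap replicates

# Each replicate: resample x with replacement, then compute the CV
boot <- replicate(B, CV(sample(x, replace = TRUE)))

mean(boot)    # average of the bootstrap CV values
var(boot)     # variance of the bootstrap CV values
```

Using replicate() avoids writing an explicit loop; a for loop that fills a preallocated vector would do the same job.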
Some other useful commands:
> sum(x) returns the sum of the elements in x
> mean(x) returns the mean of the elements in x
> var(x) returns the sample variance, i.e., $\sum_i (x_i - \bar{x})^2/(n - 1)$
> length(x) returns the number of items in x (i.e., the sample size n)
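As a quick check on how these commands fit together, the sample variance can be rebuilt by hand from sum and length and compared with var. This is only a sketch; the names n, xbar and v_manual are made up for the illustration.

```r
x <- c(8.26, 6.33, 10.4, 5.27, 5.35, 5.61, 6.12, 6.19, 5.2,
       7.01, 8.74, 7.78, 7.02, 6, 6.5, 5.8, 5.12, 7.41, 6.52, 6.21,
       12.28, 5.6, 5.38, 6.6, 8.74)

n    <- length(x)        # sample size
xbar <- sum(x) / n       # mean computed by hand

# Sum of squared deviations from the mean, divided by n - 1
v_manual <- sum((x - xbar)^2) / (n - 1)

all.equal(xbar, mean(x))      # TRUE if the hand-computed mean agrees
all.equal(v_manual, var(x))   # TRUE if the hand-computed variance agrees
```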
2 Bootstrap/Jackknife Calculations in R
Efron’s confidence limits (Equation 11 in the resampling notes) have upper and lower values
of
> quantile(boot,0.975)
[1] 0.3176385
and
> quantile(boot,0.025)
[1] 0.153469
Hall’s confidence limits (Equation 12), in contrast, have upper and lower values of
> 2*CV(x) - quantile(boot,0.025)
[1] 0.3514734
and
> 2*CV(x) - quantile(boot,0.975)
[1] 0.1873039
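The two sets of limits use the same pair of bootstrap quantiles: Efron’s limits are the quantiles themselves, while Hall’s limits reflect them about the original estimate CV(x). The sketch below collects both in one place. It regenerates boot with an assumed 1000 replicates and an assumed seed, so its output will differ from the values printed above.

```r
x <- c(8.26, 6.33, 10.4, 5.27, 5.35, 5.61, 6.12, 6.19, 5.2,
       7.01, 8.74, 7.78, 7.02, 6, 6.5, 5.8, 5.12, 7.41, 6.52, 6.21,
       12.28, 5.6, 5.38, 6.6, 8.74)
CV <- function(x) sqrt(var(x)) / mean(x)

set.seed(1)                                   # assumed seed
boot <- replicate(1000, CV(sample(x, replace = TRUE)))  # assumed 1000 replicates

# 2.5% and 97.5% quantiles of the bootstrap CV values
q <- quantile(boot, c(0.025, 0.975), names = FALSE)

# Efron: the quantiles themselves
efron <- c(lower = q[1], upper = q[2])

# Hall: quantiles reflected about the original estimate,
# so the 97.5% quantile gives the lower limit and vice versa
hall <- c(lower = 2 * CV(x) - q[2], upper = 2 * CV(x) - q[1])

efron
hall
```

Because Hall’s limits are a reflection of Efron’s, the two intervals always have the same width; they differ only in where they sit relative to CV(x).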
Jackknife Calculations
We now turn to jackknifing the sample. Recall from the randomization notes that this
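involves two steps. First, we generate a jackknife sample which has value $x_i$ removed and
then compute the $i$th partial estimate of the test statistic, $\hat{\theta}_i$, using this sample.
We then turn this $i$th partial estimate into the $i$th pseudovalue $\hat{\theta}_i^*$ using (Equation 5c in
random notes)
$\hat{\theta}_i^* = n\hat{\theta} - (n - 1)\hat{\theta}_i$
where $\hat{\theta}$ is the estimate computed from the full sample of size $n$.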
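The two steps translate fairly directly into R. What follows is a sketch rather than the code used to produce the numbers in these notes. The names partial, pseudo, jack_est, jack_var and jack_ci are made up for the illustration. It assumes the usual jackknife summaries: the mean of the pseudovalues as the estimate, their variance divided by n as its variance, and a t-based interval with n − 1 degrees of freedom.

```r
x <- c(8.26, 6.33, 10.4, 5.27, 5.35, 5.61, 6.12, 6.19, 5.2,
       7.01, 8.74, 7.78, 7.02, 6, 6.5, 5.8, 5.12, 7.41, 6.52, 6.21,
       12.28, 5.6, 5.38, 6.6, 8.74)
CV <- function(x) sqrt(var(x)) / mean(x)
n  <- length(x)

# Step 1: i-th partial estimate, the CV with observation i left out
# (x[-i] drops the i-th element of x)
partial <- sapply(seq_len(n), function(i) CV(x[-i]))

# Step 2: i-th pseudovalue, n * theta_hat - (n - 1) * theta_hat_i
pseudo <- n * CV(x) - (n - 1) * partial

jack_est <- mean(pseudo)       # jackknife estimate of the CV
jack_var <- var(pseudo) / n    # variance of the jackknife estimate

# Assumed t-based 95% interval with n - 1 degrees of freedom
jack_ci <- jack_est + c(-1, 1) * qt(0.975, n - 1) * sqrt(jack_var)

jack_est
jack_var
jack_ci
```

A handy sanity check on the pseudovalue formula: if the statistic is the mean instead of the CV, the pseudovalues are exactly the original observations.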
Here’s a summary of the various estimated values, variances, and confidence intervals:

Method                  Estimated CV   Variance   95% interval
Original Estimate       0.252
Jackknife               0.262          0.0029     0.150 - 0.373
Bootstrap               0.264          0.0019
Bootstrap (normality)                             0.178 - 0.351
Bootstrap (Efron)                                 0.153 - 0.318
Bootstrap (Hall)                                  0.187 - 0.351