HW 9 Bootstrap, Jackknife, and Permutation Tests
STAT 5400
Due: Nov 13, 2023 11:59 PM
Problems
Submit your solutions as an .Rmd file and accompanying .pdf file.
1. Use echo=TRUE (and include=FALSE where appropriate) so that all of your code is provided but only the important output is included. Try to write your homework in the form of a neat report and do not pile up redundant or irrelevant output.
2. Always interpret your results whenever necessary. Make sure the interpretation can be understood by readers with a moderate level of statistical knowledge.
Reading assignments.
Here is an undergraduate-level introduction to the bootstrap.
https://statweb.stanford.edu/~tibs/stat315a/Supplements/bootstrap.pdf
Problems
1. Bootstrap and jackknife
Consider the air-conditioning data listed below:
Suppose the mean of the underlying distribution is $\mu$ and our interest is to estimate $\log(\mu)$. To estimate it, we use the log of the sample mean, i.e., $\log(\bar{X})$, as an estimator.
a. Carry out a nonparametric bootstrap analysis to estimate the bias of $\log(\bar{X})$.
b. Based on the bootstrap analysis, is the bias of $\log(\bar{X})$ positive or negative? (In other words, does $\log(\bar{X})$ overestimate or underestimate $\log(\mu)$?) Can you explain the observation? (Hint: Jensen's inequality)
c. Also run a nonparametric bootstrap to estimate the standard error of the log of the sample mean. In terms of
the mean square error of the estimator, do you think the bias is large given the standard error?
d. Carry out a parametric bootstrap analysis to estimate the bias of the log of the sample mean. Assume that the population distribution of failure times of air-conditioning equipment is exponential.
e. Plot both the histograms of the bootstrap replications from nonparametric and parametric bootstrap.
f. Produce 95% confidence intervals by the standard normal, basic, percentile, and BCa methods. (You may
need to attend the lecture on Oct 17 for this question.)
g. Use jackknife to estimate the standard error and bias of the log of the sample mean.
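For reference, here is a minimal sketch of the nonparametric bootstrap (parts a-c) and the jackknife (part g), assuming the failure times are stored in a numeric vector data; the statistic name log_mean_stat is illustrative, and bootstrap_results matches the object used in the confidence-interval code later in this report.
library(boot)
# Statistic: log of the sample mean of a bootstrap resample
log_mean_stat <- function(d, i) log(mean(d[i]))
# Nonparametric bootstrap (parts a-c)
bootstrap_results <- boot(data, statistic = log_mean_stat, R = 1000)
bias_np <- mean(bootstrap_results$t) - bootstrap_results$t0  # bootstrap bias estimate
se_np <- sd(bootstrap_results$t)                             # bootstrap standard error
# Jackknife (part g): leave out one observation at a time
n <- length(data)
theta_jack <- sapply(seq_len(n), function(i) log(mean(data[-i])))
bias_jack <- (n - 1) * (mean(theta_jack) - log(mean(data)))
se_jack <- sqrt((n - 1) / n * sum((theta_jack - mean(theta_jack))^2))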
# Parametric bootstrap (part d); note that boot() resamples nonparametrically
# unless sim = "parametric" is supplied (see the sketch below the output)
param_bootstrap_results <- boot(data, statistic = log_mean_exp, R = 1000)
# Bias estimate: mean of the bootstrap replicates minus the original statistic
param_bias_estimate <- mean(param_bootstrap_results$t) - param_bootstrap_results$t0
# Print the bias estimate
param_bias_estimate
## [1] -0.006103388
param_bootstrap_results
##
## ORDINARY NONPARAMETRIC BOOTSTRAP
##
##
## Call:
## boot(data = data, statistic = log_mean_exp, R = 1000)
##
##
## Bootstrap Statistics :
## original bias std. error
## t1* -0.026407 -0.006103388 0.3065978
The bias estimate is about -0.006, which is small relative to the standard error of about 0.307, so the bias contributes little to the mean squared error of the estimator.
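The printout above is headed "ORDINARY NONPARAMETRIC BOOTSTRAP" because boot() resamples the observed data unless sim = "parametric" is supplied. A minimal sketch of a genuinely parametric bootstrap under the exponential model of part (d), assuming the failure times are in the vector data (the helper names log_mean_param and exp_ran_gen are illustrative):
library(boot)
log_mean_param <- function(d) log(mean(d))   # statistic: takes a data set, no index argument
exp_ran_gen <- function(d, mle) rexp(length(d), rate = mle)  # generates one parametric resample
param_boot <- boot(data, statistic = log_mean_param, R = 1000,
                   sim = "parametric", ran.gen = exp_ran_gen,
                   mle = 1 / mean(data))     # ML estimate of the exponential rate
mean(param_boot$t) - param_boot$t0           # parametric bootstrap bias estimate
sd(param_boot$t)                             # parametric bootstrap standard error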
# Plot histograms
par(mfrow = c(1, 2))
hist(bootstrap_results$t, main = "Nonparametric Bootstrap", xlab = "Log Mean")
hist(param_bootstrap_results$t, main = "Parametric Bootstrap", xlab = "Log Mean")
# Helper to compute and print a bootstrap confidence interval, returning the
# interval endpoints for the requested method
calculate_ci <- function(results, method) {
  ci_output <- boot.ci(results, type = method)
  print(ci_output)
  switch(method,
         norm  = ci_output$normal[, 2:3],
         basic = ci_output$basic[, 4:5],
         perc  = ci_output$percent[, 4:5],
         bca   = ci_output$bca[, 4:5])
}
# Standard Normal CI
standard_normal_ci <- calculate_ci(bootstrap_results, "norm")
## BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
## Based on 1000 bootstrap replicates
##
## CALL :
## boot.ci(boot.out = results, type = method)
##
## Intervals :
## Level Normal
## 95% ( 4.031, 5.464 )
## Calculations and Intervals on Original Scale
# Basic CI
basic_ci <- calculate_ci(bootstrap_results, "basic")
# Percentile CI
percentile_ci <- calculate_ci(bootstrap_results, "perc")
# BCa CI
bca_ci <- calculate_ci(bootstrap_results, "bca")
## BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
## Based on 1000 bootstrap replicates
##
## CALL :
## boot.ci(boot.out = results, type = method)
##
## Intervals :
## Level BCa
## 95% ( 4.022, 5.391 )
## Calculations and Intervals on Original Scale
## Some BCa intervals may be unstable
When you observe 0 successes with the nonparametric model, the resulting confidence interval degenerates to (0, 0).
3. Bootstrap estimate of the standard error of the trimmed mean
Let $\hat{\theta}$ be the 25% trimmed mean, which is computed by deleting the two smallest and the two largest numbers and then taking the average of the remaining four numbers.
a. Calculate $\widehat{\mathrm{se}}_B$ for B = 25, 100, 200, 500, 1000, 2000. From these results, estimate the ideal bootstrap estimate $\widehat{\mathrm{se}}_\infty$.
b. Repeat part (a) using twenty different random number seeds. Comment on the trend of the variability of each $\widehat{\mathrm{se}}_B$.
# Given data
data <- c(1, 3, 4.5, 6, 6, 6.9, 13, 19.2)
# Values of B
B_values <- c(25, 100, 200, 500, 1000, 2000)
# Print results (the computation of `results` for each B is not shown here;
# a sketch follows after the output)
cat("Bootstrap Standard Errors for Different B values:\n")
print(results)
##
## Ideal Bootstrap Estimate (B = Inf):
print(ideal_bootstrap_estimate)
## [1] NA
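The printed results above are empty and the ideal estimate is NA because the computation of `results` is not shown. A minimal sketch of how the se_B values could be computed, with the 25% trimmed mean taken as the average of the four middle observations as described in the problem (the seed and the large B used to approximate the ideal estimate are illustrative):
# 25% trimmed mean: for n = 8, trim = 0.25 averages the four middle observations
trimmed_mean <- function(x) mean(x, trim = 0.25)
# Bootstrap standard error of the trimmed mean based on B resamples
se_B <- function(B, x = data) {
  reps <- replicate(B, trimmed_mean(sample(x, replace = TRUE)))
  sd(reps)
}
set.seed(5400)  # illustrative seed
results <- sapply(B_values, se_B)
names(results) <- B_values
# The ideal bootstrap estimate se_Inf can be approximated with a very large B
ideal_bootstrap_estimate <- se_B(20000)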
The mean of the estimated standard errors decreases as B increases: with a larger number of bootstrap samples, the estimates tend to stabilize and become more precise. The standard deviation of the estimated standard errors also decreases with increasing B, indicating reduced variability in the estimates across seeds.
The plot shows a decreasing trend in the mean estimated standard error as B increases. As B increases, the bootstrap procedure becomes more stable, providing more consistent and reliable estimates of the standard error. In practice, it is common to choose a sufficiently large B to achieve stable results.
attach(chickwts)
boxplot(formula(chickwts))
Use a two-sample t-test to see whether the population mean weights for the soybean and linseed groups are the same or not. Comment on the assumptions of the t-test.
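A minimal sketch, assuming the comparison is between the soybean and linseed groups of the built-in chickwts data:
# Extract the two feed groups from chickwts
x <- chickwts$weight[chickwts$feed == "soybean"]
y <- chickwts$weight[chickwts$feed == "linseed"]
# Welch two-sample t-test; assumes independent samples and approximately
# normal group distributions (or large enough samples), with unequal variances allowed
t.test(x, y)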
where $z_i$ is the pooled sample, $F_n$ is the empirical CDF of $X_1, \ldots, X_n$, and $G_m$ is the empirical CDF of $Y_1, \ldots, Y_m$. You may want to try ks.test in R.
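A sketch of the permutation test based on the two-sample K-S statistic, reusing the vectors x and y defined above; the number of permutations and the seed are illustrative choices (ties in the weights will trigger warnings from ks.test):
set.seed(5400)  # illustrative seed
z <- c(x, y)
n <- length(x)
D0 <- ks.test(x, y)$statistic  # observed K-S statistic
perm_D <- replicate(999, {
  idx <- sample(length(z))     # random permutation of the pooled sample
  ks.test(z[idx[1:n]], z[idx[-(1:n)]])$statistic
})
p_value <- (1 + sum(perm_D >= D0)) / (1 + length(perm_D))
p_value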
In fact, the K-S statistic is primarily used for univariate distributions. Try another alternative to the K-S statistic: perform a permutation test based on the Cramér-von Mises statistic:
$$W = \frac{mn}{(m+n)^2}\left[\sum_{i=1}^{n}\big(F_n(x_i) - G_m(x_i)\big)^2 + \sum_{j=1}^{m}\big(F_n(y_j) - G_m(y_j)\big)^2\right].$$
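A sketch of the corresponding permutation test, again reusing x and y from above; the helper name cvm_stat is illustrative:
# Two-sample Cramer-von Mises statistic as defined above
cvm_stat <- function(x, y) {
  n <- length(x); m <- length(y)
  Fn <- ecdf(x); Gm <- ecdf(y)
  (m * n / (m + n)^2) *
    (sum((Fn(x) - Gm(x))^2) + sum((Fn(y) - Gm(y))^2))
}
set.seed(5400)  # illustrative seed
z <- c(x, y); n <- length(x)
W0 <- cvm_stat(x, y)  # observed statistic
perm_W <- replicate(999, {
  idx <- sample(length(z))
  cvm_stat(z[idx[1:n]], z[idx[-(1:n)]])
})
p_value <- (1 + sum(perm_W >= W0)) / (1 + length(perm_W))
p_value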