
HWK3_324

This document outlines the homework assignment for Statistics 324, detailing submission guidelines and exercises involving probability distributions, expected values, and standard deviations. It includes calculations related to customer orders, patient blood pressure, defective items in shipments, and bonding strength of glue, with specific statistical methods and interpretations provided. The document emphasizes the importance of including explanations and relevant code outputs for R in the homework submissions.

Uploaded by

jonathanolden9
Copyright
© © All Rights Reserved

Statistics 324 Homework #3

Jonathan Nolden

*Submit your homework to Canvas by the due date and time. Email your lecturer if you have extenuating
circumstances and need to request an extension.
*If an exercise asks you to use R, include a copy of the code and output. Please edit your code and output
to be only the relevant portions.
*If a problem does not specify how to compute the answer, you may use any appropriate method. I may
ask you to use R or use manual calculations on your exams, so practice accordingly.
*You must include an explanation and/or intermediate calculations for an exercise to be complete.
*Be sure to submit the HWK3 Auto grade Quiz which will give you ~20 of your 40 accuracy points.
*50 points total: 40 points accuracy, and 10 points completion

Exercise 1: A chemical supply company ships a certain solvent in 10-gallon drums. Let X represent the
number of drums ordered by a randomly chosen customer. Assume X has the following probability mass
function (pmf). The mean and variance of X are µX = 2.2 and σX² = 1.76 = 1.32665²:

X P(X=x)
1 0.4
2 0.3
3 0.1
4 0.1
5 0.1
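
As a quick check on the mean and variance quoted in the exercise statement, both can be recomputed from the pmf. This is only a minimal sketch; the variable names (x, p_x, mu_x, var_x, sd_x) are illustrative and not part of the assignment.

```r
# pmf of X (number of drums ordered), copied from the table above
x   <- 1:5
p_x <- c(0.4, 0.3, 0.1, 0.1, 0.1)

mu_x  <- sum(x * p_x)              # E(X) = sum of x * P(X = x)
var_x <- sum(p_x * (x - mu_x)^2)   # Var(X) = sum of P(X = x) * (x - mu)^2
sd_x  <- sqrt(var_x)               # SD(X) is the square root of the variance

c(mean = mu_x, variance = var_x, sd = sd_x)
```

The same three lines work for any discrete pmf, as long as the probabilities sum to 1.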

a. Calculate P (X ≤ 2) and describe what it means in the context of the problem.

P(X ≤ 2) = 0.7, so there is a 70% chance that a randomly chosen customer orders 2 or fewer drums of the
solvent. Most customers place small orders of 1-2 drums, and the remaining 30% order 3 or more.

0.4 + 0.3

## [1] 0.7

b. Let Y be the number of gallons ordered, so Y = 10X. Complete the probability mass
function of Y.

y P(Y=y)
10 0.4
20 0.3
30 0.1
40 0.1
50 0.1

c. Calculate µY . Interpret this value.

The mean number of gallons ordered by a randomly chosen customer is 22 gallons. This value represents
the average order size across all the customers after many random samples. It makes sense that this value
is 10 times the previous mean since y is 10*x.

y = c(10,20,30,40,50)
px=c(0.4, 0.3, 0.1, 0.1, 0.1)
meany = sum(y*px)
meany

## [1] 22

d. Calculate σY . Interpret this value.

σY is about 13.27 gallons. This means the number of gallons ordered by a customer typically differs from
the mean of 22 gallons by about 13.27 gallons. The standard deviation describes how spread out the order
sizes are around the mean.

vary = sum(px*(y-meany)^2)
vary

## [1] 176

SD = sqrt(vary)
SD

## [1] 13.2665
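
Because Y = 10X is a linear rescaling of X, the same answers also follow from the rules E(aX) = a·E(X) and SD(aX) = |a|·SD(X), without rebuilding the pmf of Y. A short sketch of that shortcut, starting from the mean 2.2 and variance 1.76 given in the exercise statement (variable names are illustrative):

```r
mu_x  <- 2.2    # mean of X, given in the exercise statement
var_x <- 1.76   # variance of X, given in the exercise statement
a     <- 10     # gallons per drum, so Y = a * X

mu_y  <- a * mu_x               # E(aX) = a * E(X)
var_y <- a^2 * var_x            # Var(aX) = a^2 * Var(X)
sd_y  <- abs(a) * sqrt(var_x)   # SD(aX) = |a| * SD(X)

c(mean = mu_y, variance = var_y, sd = sd_y)
```

Note that the variance scales by a² = 100 while the standard deviation scales only by 10.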

Exercise 2: Four patients make appointments to have their blood pressure checked at a clinic. Let X be the
number of them who have high blood pressure. Based on data from the National Health and Nutrition Examination
Survey, the approximate probability distribution of X based on long term data for this type of patient is:

x 0 1 2 3 4
P(X=x) 0.22 0.40 0.28 0.09 0.01

a. Compute the value of P (X ≥ 2). What does this value mean in the context of the question?

The probability that at least 2 of the 4 patients have high blood pressure is 0.38. This means there is a
38% chance that 2 or more of the four patients will have high blood pressure, and a 62% chance that only
0 or 1 of them will.

0.28+0.09+0.01

## [1] 0.38

b. Compute the probability that at least one of the 4 patients will have high blood pressure.

There is a 78% chance that at least 1 of 4 patients will have high blood pressure.

1 -.22

## [1] 0.78

c. Compute the expected value of X, µX. What does this value mean in the context of the
question?

The expected number of patients out of 4 with high blood pressure is 1.27. This means that, averaged over
many groups of 4 such patients, about 1.27 patients per group have high blood pressure, so a typical group
has 1 or 2.

P_patients = c(0.22,.4,.28,.09,.01)
x_p = c(0,1,2,3,4)

u_x = sum(x_p*P_patients)
u_x

## [1] 1.27

d. Compute the standard deviation of X, σX . What does this value mean?

The standard deviation of X is about 0.94. This means the number of patients out of 4 with high blood
pressure typically differs from the mean of 1.27 by about 0.94 patients.

var_x = sum((P_patients*(x_p-u_x)^2))
var_x

## [1] 0.8771

SD = sqrt(var_x)
SD

## [1] 0.9365362

e. Consider using a binomial random variable with n=4 to approximate the distribution of
X given above, X ∼ Bin(n = 4, π =??). What is an approximate probability of a single
patient of this type having high blood pressure when they make an appointment, π?

The approximate probability of a single patient of this type having high blood pressure when they make an
appointment is 0.3175. This is shown in the calculations below.

n=4
pi_hat = u_x/n # for a binomial, E(X) = n*pi, so pi is approximately u_x/n (also avoids overwriting R's built-in pi)
pi_hat

## [1] 0.3175
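
One way to judge how reasonable this binomial approximation is would be to compare the Bin(4, 0.3175) probabilities with the distribution given in the exercise. The sketch below does that; the names p_given, pi_hat, and p_binom are illustrative choices, not part of the assignment.

```r
# Given distribution of X (patients out of 4 with high blood pressure)
x       <- 0:4
p_given <- c(0.22, 0.40, 0.28, 0.09, 0.01)

n      <- 4
pi_hat <- sum(x * p_given) / n   # match the means: n * pi = E(X)

# Binomial probabilities under the matched value of pi
p_binom <- dbinom(x, size = n, prob = pi_hat)

# Side-by-side comparison, rounded for readability
round(data.frame(x, given = p_given, binomial = p_binom), 4)
```

Under these assumptions every binomial probability lands within 0.01 of the corresponding given value, which suggests the approximation is reasonable for this table.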

Exercise 3: A customer receives a very large shipment of items. The customer assumes 15% of the items
in the shipment are defective. You can assume that the defectiveness of items is independent within the
shipment and use a 0.15 probability of defectiveness for each item.
Someone on the quality assurance team samples 4 items. Let X be the random variable for the number of
defective items in the sample.

a. Determine the probability distribution of X (write out the pmf) using probability theory.

x 0 1 2 3 4
P(X=x) 0.5220 0.3685 0.0975 0.0115 0.0005

n=4

dbinom(0:4 ,size=n, prob = 0.15)

## [1] 0.52200625 0.36847500 0.09753750 0.01147500 0.00050625
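
Since part (a) asks for the pmf "using probability theory", it may help to see the binomial formula P(X = x) = C(n, x) π^x (1 − π)^(n − x) written out rather than hidden inside dbinom. A minimal sketch (the name pmf_by_hand is illustrative):

```r
n <- 4      # items sampled
p <- 0.15   # assumed probability that any one item is defective
x <- 0:n    # possible numbers of defective items

# Binomial pmf written out: C(n, x) * p^x * (1 - p)^(n - x)
pmf_by_hand <- choose(n, x) * p^x * (1 - p)^(n - x)

round(pmf_by_hand, 4)

# Check that the hand-written formula agrees with dbinom
all.equal(pmf_by_hand, dbinom(x, size = n, prob = p))
```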

b. Compute P(X>0). What does this value mean in the context of the scenario?

There is a 47.8% chance that at least one of the items in the random sample of 4 items will be defective
from the shipment.

1-.5220

## [1] 0.478

c. Compute the expected value for X, µX . What does that value mean in the context of the
scenario?

The expected value µX is 0.6. This means that, averaged over many samples of 4 items from the shipment,
you can expect about 0.6 defective items per sample. A typical sample will therefore contain 0 or 1
defective items.

Defective = c(0.52200625, 0.36847500, 0.09753750, 0.01147500, 0.00050625)


x_d = c(0,1,2,3,4)
mu_x = sum(x_d*Defective)
mu_x

## [1] 0.6

d. Compute the standard deviation for X, σX .

The standard deviation is 0.714 as shown below.

var_x = sum((Defective*(x_d-mu_x)^2))
var_x

## [1] 0.51

SD = sqrt(var_x)
SD

## [1] 0.7141428
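
Because X is binomial here, the mean and standard deviation can also be cross-checked with the closed-form results µ = nπ and σ = sqrt(nπ(1 − π)). A brief sketch (variable names are illustrative):

```r
n <- 4      # items sampled
p <- 0.15   # assumed probability that any one item is defective

mu_binom <- n * p                   # binomial mean: n * pi
sd_binom <- sqrt(n * p * (1 - p))   # binomial SD: sqrt(n * pi * (1 - pi))

c(mean = mu_binom, sd = sd_binom)
```

These agree with the values obtained from the generic discrete formulas in parts (c) and (d).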

e. Update the following simulation and use it to check your answers for (at least) part (3a).
You will need to change a few values and functions to reflect the random process correctly.
(Why did I define IsDefective as I did? What values would be helpful stored into the
CountDefective vector & how can we compute those? What does the histogram show?)

The histogram matches the probability distribution of X found in 3a. The relative frequencies fall as the
number of defects rises, and a sample of four with 0 defects is the most likely outcome. For example,
P(X=0) = 0.522 according to part a, and the simulated proportion is 0.5218. The simulation also agrees
with part b, where P(X>0) = 1 - 0.522 = 0.478. IsDefective was defined that way to create a vector of 100
items where 15 are defective (1) and 85 are non-defective (0), so each draw with replacement is defective
with probability 0.15.

IsDefective=c(rep(1,15), rep(0,85))

manytimes=100000 ###Run it more times for more accuracy

CountDefective=rep(0,manytimes)

set.seed(1)

for (i in 1:manytimes){
  samp=sample(IsDefective, 4, replace=TRUE) ### sample size changed to 4 items

  CountDefective[i]=sum(samp) ### sum counts the defective items (1s) in the sample
}

hist(CountDefective, labels=TRUE,
ylim=c(0,.7*manytimes), breaks=seq(-0.5, 4.5, 1),main = "Number of defective in 4 sample trials", x

sum(CountDefective==0)/manytimes
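
To check all five probabilities from part (3a) at once, rather than only P(X = 0), the simulated proportions can be tabulated next to the theoretical pmf. The sketch below is a self-contained variant that swaps the sample() loop for rbinom(), which draws the number of defectives in 4 independent items directly; it is an alternative under that assumption, not the assignment's own simulation, and the names sim_counts, sim_prop, and theory are illustrative.

```r
set.seed(1)
manytimes <- 100000

# Number of defective items in each simulated sample of 4
sim_counts <- rbinom(manytimes, size = 4, prob = 0.15)

# Empirical proportions for 0..4 defectives
# (tabulate counts the values 1..5, hence the + 1 shift)
sim_prop <- tabulate(sim_counts + 1, nbins = 5) / manytimes
theory   <- dbinom(0:4, size = 4, prob = 0.15)

round(rbind(simulated = sim_prop, theoretical = theory), 4)
```

With this many repetitions the two rows should differ only in the third or fourth decimal place.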

f. Suppose the quality assurance employee is now going to look at 20 items from the shipment.
They still believe it is reasonable to use a Binomial model (n=20, π = 0.15) to describe the
number of items in those 20 that will have a defect.

fi. Compute the probability that exactly 5 of those 20 items have a defect.

The probability of exactly 5 being defective is about 10.3%, as shown in the calculations below.

n = 20
k = 5
pi = 0.15

prob_5 = dbinom(k,size=n,prob=pi)
prob_5

## [1] 0.1028452

fii. Compute the probability that 5 or more of those 20 items have a defect.

The probability that 5 or more of those 20 will have a defect is about 17.0%, as shown below.

prob_5_o_mo = sum(dbinom(5:20,size=n,prob=pi))
prob_5_o_mo

## [1] 0.1701532
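
The same upper-tail probability can also be obtained from the binomial cdf, since P(X ≥ 5) = 1 − P(X ≤ 4). A short sketch using pbinom (variable names are illustrative):

```r
n <- 20     # items inspected
p <- 0.15   # assumed probability that any one item is defective

# P(X >= 5) = P(X > 4): upper tail of the binomial cdf
upper_tail <- pbinom(4, size = n, prob = p, lower.tail = FALSE)

# Equivalent complement form: 1 - P(X <= 4)
complement <- 1 - pbinom(4, size = n, prob = p)

c(upper_tail = upper_tail, complement = complement)
```

The cutoff is 4, not 5, because pbinom(q, ...) returns P(X ≤ q) and the event of interest includes X = 5.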

fiii. Which histogram given below correctly shows the pmf for the binomial model
described in f?

par(mfrow=c(2,2), mar=c(4,4,2,1))
barplot(names=0:5,dbinom(0:5, 15, prob=0.85),
xlab="", ylab="Probability", main="Graph A", space=0)
barplot(names=0:20, dbinom(0:20, 20, prob=0.85),
xlab="", ylab="Probability", main="Graph B", space=0)
barplot(names=0:20, dbinom(0:20, 20, prob=0.15),
xlab="", ylab="Probability", main="Graph C", space=0)
barplot(names=0:5, dbinom(0:5, 15, prob=0.15),
xlab="", ylab="Probability", main="Graph D", space=0)

[Figure: 2x2 grid of bar plots titled Graph A, Graph B, Graph C, and Graph D, each with Probability on the vertical axis; Graphs A and D show x = 0 to 5, Graphs B and C show x = 0 to 20.]

par(mfrow=c(1,1), mar=c(5.1, 4.1, 4.1, 2.1))

Exercise 4: The bonding strength S of a drop of plastic glue from a particular manufacturer is thought
to be well approximated by a normal distribution with mean 98 lbs and standard deviation 7.5 lbs:
S ∼ N(98, 7.5²). Compute the following values using a normal model assumption.
Thinking about studying for the midterm... Make sure you could use the output of this code in your
solutions:

Figure 1: R Output

a. Compute the proportion of drops of plastic glue that will have a bonding strength between
95 and 104 lbs according to this model.

The proportion of drops of plastic glue that will have a bonding strength between 95 and 104 lbs is .444
according to the calculations below.

mu = 98
sd = 7.5
bot_val = pnorm(95,mu,sd)
top_val = pnorm(104,mu,sd)
top_val - bot_val

## [1] 0.4435663
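
For exam practice with a standard normal table, the same proportion can be found by first standardizing the endpoints with z = (x − µ)/σ. A minimal sketch (the names sigma, z_low, z_high, and prop are illustrative):

```r
mu    <- 98    # mean bonding strength (lbs)
sigma <- 7.5   # standard deviation (lbs)

z_low  <- (95 - mu) / sigma    # z-score of 95 lbs, equal to -0.4
z_high <- (104 - mu) / sigma   # z-score of 104 lbs, equal to 0.8

# P(95 < S < 104) = P(z_low < Z < z_high) for standard normal Z
prop <- pnorm(z_high) - pnorm(z_low)
prop
```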

b. A single drop of that glue had a bonding strength that is 0.5 standard deviations above the
mean. Compute the proportion of glue drops that have a bonding strength that is higher.

The proportion of glue with a bonding strength higher than 0.5 standard deviations above the mean is 0.309
as shown below.

z = .5
1- pnorm(z)

## [1] 0.3085375

c. Compute the 90th percentile bonding strength for the drops of glue.

The 90th percentile bonding strength is about 107.6 lbs, as shown below.

qnorm(.90,mu,sd)

## [1] 107.6116

d. Compute the IQR of bonding strength for drops of glue from this manufacturer.

Q3 = qnorm(.75,mu,sd)
Q1 = qnorm(.25,mu,sd)

IQR = Q3-Q1

cat("The IQR for this data set is", IQR)

## The IQR for this data set is 10.11735
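
Both the 90th percentile and the IQR can also be built from standard normal quantiles via x = µ + zσ, which is the route a z-table would take. A brief sketch under the same N(98, 7.5²) model (the names sigma, p90, and iqr are illustrative):

```r
mu    <- 98    # mean bonding strength (lbs)
sigma <- 7.5   # standard deviation (lbs)

# 90th percentile: shift and scale the standard normal quantile
p90 <- mu + qnorm(0.90) * sigma

# IQR: Q3 and Q1 sit symmetrically at +/- qnorm(0.75) standard deviations
iqr <- 2 * qnorm(0.75) * sigma

c(p90 = p90, iqr = iqr)
```

Note that the IQR depends only on σ, not on µ, because the mean cancels in Q3 − Q1.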

e. Drops of a similar plastic glue from another manufacturer (manufacturer B) are claimed
to have bonding strength well approximated by a normal distribution with mean 43 kg
and standard deviation of 3.5 kg: WB.kg ∼ N(43, 3.5²). Compute the probability that a
drop of manufacturer B’s glue will have a strength above the 90th percentile strength of
manufacturer A’s glue.

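The probability that a drop of manufacturer B’s glue will have a strength above the 90th percentile of
manufacturer A’s glue is about 4.8%. Manufacturer A’s 90th percentile (107.6116 lbs) is first converted
to kilograms so that both values are in the same units, as shown in the calculations below.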

mu = 43
sd = 3.5

#Previous 90th percentile = 107.6116 lbs

other_90 = 107.6116*(0.45359237) #convert to kgs

1- pnorm(other_90,mu,sd)

## [1] 0.0484055
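
The unit conversion can go in either direction. As a cross-check, the sketch below converts manufacturer B's mean and standard deviation to pounds instead and confirms that the probability is unchanged. It uses the exact definition 1 lb = 0.45359237 kg; all variable names are illustrative.

```r
kg_per_lb <- 0.45359237   # exact definition of the pound in kilograms

# 90th percentile of manufacturer A's glue, in pounds
p90_A_lb <- qnorm(0.90, mean = 98, sd = 7.5)

# Manufacturer B's parameters converted from kg to lbs
mu_B_lb <- 43 / kg_per_lb
sd_B_lb <- 3.5 / kg_per_lb

# Same probability computed in pounds and in kilograms
prob_lb <- 1 - pnorm(p90_A_lb, mean = mu_B_lb, sd = sd_B_lb)
prob_kg <- 1 - pnorm(p90_A_lb * kg_per_lb, mean = 43, sd = 3.5)

c(in_pounds = prob_lb, in_kilograms = prob_kg)
```

The two results match because rescaling both the cutoff and the distribution by the same factor leaves the z-score unchanged.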
