Homework 1 Solution
1. Numerical summaries. The “cold start ignition time” of an automobile engine is being
investigated by a gasoline manufacturer. The following times (in seconds) were obtained
for a test vehicle: 1.75, 1.84, 2.12, 1.92, 2.62, 2.35, 3.09, 3.15, 2.53, 1.91, 3.25, 2.83.
(a) Calculate the sample mean, sample variance, and sample standard deviation.
(b) Construct a box plot of the data by hand (not using R or other statistical software).
Show the derivation for the box plot.
Answer:
(a) The sample mean:
$$\bar{x} = \frac{1.75 + 1.84 + 2.12 + 1.92 + 2.62 + 2.35 + 3.09 + 3.15 + 2.53 + 1.91 + 3.25 + 2.83}{12} = 2.4467.$$
Variance:
$$s^2 = \frac{(1.75 - \bar{x})^2 + \cdots + (2.83 - \bar{x})^2}{12 - 1} = 0.2975.$$
Standard deviation:
$$s = \sqrt{s^2} = 0.5454.$$
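The arithmetic in part (a) can be cross-checked with a short script. This is only a verification sketch using Python's standard `statistics` module; the variable names are illustrative and not part of the original solution.

```python
import statistics

# Cold-start ignition times (seconds) from the problem statement.
times = [1.75, 1.84, 2.12, 1.92, 2.62, 2.35, 3.09, 3.15, 2.53, 1.91, 3.25, 2.83]

mean = statistics.mean(times)          # sample mean
variance = statistics.variance(times)  # sample variance, divisor n - 1
std_dev = statistics.stdev(times)      # square root of the sample variance

print(round(mean, 4), round(variance, 4), round(std_dev, 4))
```

Note that `statistics.variance` divides by n − 1, which matches the definition used above; `statistics.pvariance` would divide by n instead.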
(b) First sort the data and find the quartiles:
1.75 1.84 1.91 1.92 2.12 2.35 2.53 2.62 2.83 3.09 3.15 3.25
Since the sample size n = 12 is even, the median is the average of the 6th and 7th
values:
$$M = \frac{2.35 + 2.53}{2} = 2.44.$$
To compute the first quartile, we first find the location (index) of the 25th percentile.
Since i = 0.25 × 12 = 3 is an integer, the first quartile (the 25th percentile) is the
average of the 3rd and 4th values:
$$Q_1 = \frac{1.91 + 1.92}{2} = 1.915.$$
To compute the third quartile, we again find the location first. Since i = 0.75 × 12 = 9 is
an integer, the third quartile (the 75th percentile) is the average of the 9th and 10th
values:
$$Q_3 = \frac{2.83 + 3.09}{2} = 2.960.$$
The IQR is
$$\mathrm{IQR} = Q_3 - Q_1 = 2.960 - 1.915 = 1.045.$$
From
$$Q_3 + 1.5 \times \mathrm{IQR} = 2.960 + 1.5 \times 1.045 = 4.5275, \qquad Q_1 - 1.5 \times \mathrm{IQR} = 1.915 - 1.5 \times 1.045 = 0.3475,$$
we can see that all data points are larger than Q1 − 1.5 × IQR and smaller than Q3 +
1.5 × IQR. Therefore, there are no outliers, and the whiskers of the box plot are at the
minimum and maximum values (1.75 and 3.25).
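The quartile rule used in part (b) can be sketched in code as a check on the hand calculation. The helper name `percentile_by_position` is hypothetical. The integer case (average the i-th and (i+1)-th sorted values) follows the solution above; the non-integer case (round i up) is an assumed textbook convention that this data set never exercises.

```python
import math

def percentile_by_position(sorted_values, p):
    """Percentile via the location rule i = p * n (hypothetical helper)."""
    n = len(sorted_values)
    i = p * n
    if abs(i - round(i)) < 1e-9:
        # i is an integer: average the i-th and (i+1)-th values (1-based).
        k = int(round(i))
        return (sorted_values[k - 1] + sorted_values[k]) / 2
    # Assumption, not stated in the solution: round i up to the next position.
    return sorted_values[math.ceil(i) - 1]

data = sorted([1.75, 1.84, 2.12, 1.92, 2.62, 2.35,
               3.09, 3.15, 2.53, 1.91, 3.25, 2.83])

q1 = percentile_by_position(data, 0.25)
median = percentile_by_position(data, 0.50)
q3 = percentile_by_position(data, 0.75)
iqr = q3 - q1

# Fences for the 1.5 * IQR outlier rule.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print(q1, median, q3, iqr, lower_fence, upper_fence, outliers)
```

Statistical packages often use other quartile conventions (interpolation between order statistics, for example), so their Q1 and Q3 may differ slightly from the values derived by hand here.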
2. A life insurance company issues standard, preferred, and ultra-preferred policies. Of the
company’s policyholders of a certain age, 60% have standard policies and a probability of
0.01 of dying in the next year, 30% have preferred policies and a probability of 0.008 of
dying in the next year, and 10% have ultra-preferred policies and a probability of 0.007
of dying in the next year. A policyholder of that age dies in the next year. What are
the conditional probabilities of the deceased having had a standard, a preferred, and an
ultra-preferred policy?
Answer: Let B be the event that the policyholder dies. Let A1 , A2 , A3 be the events
that this policyholder has a standard, preferred and ultra-preferred policy respectively.
We know that
$$P(A_1) = 0.6, \quad P(A_2) = 0.3, \quad P(A_3) = 0.1,$$
$$P(B \mid A_1) = 0.01, \quad P(B \mid A_2) = 0.008, \quad P(B \mid A_3) = 0.007.$$
Hence by Bayes' theorem,
$$P(A_1 \mid B) = \frac{P(A_1)P(B \mid A_1)}{P(A_1)P(B \mid A_1) + P(A_2)P(B \mid A_2) + P(A_3)P(B \mid A_3)} = 0.659,$$
$$P(A_2 \mid B) = \frac{P(A_2)P(B \mid A_2)}{P(A_1)P(B \mid A_1) + P(A_2)P(B \mid A_2) + P(A_3)P(B \mid A_3)} = 0.264,$$
$$P(A_3 \mid B) = \frac{P(A_3)P(B \mid A_3)}{P(A_1)P(B \mid A_1) + P(A_2)P(B \mid A_2) + P(A_3)P(B \mid A_3)} = 0.077.$$
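The three posterior probabilities share the same denominator, the total probability of death P(B). A small sketch makes that explicit; the dictionary names are illustrative, and the numbers come directly from the problem statement.

```python
# Prior share of each policy type and its one-year death probability.
priors = {"standard": 0.6, "preferred": 0.3, "ultra-preferred": 0.1}
death_prob = {"standard": 0.01, "preferred": 0.008, "ultra-preferred": 0.007}

# Law of total probability: P(B) = sum over i of P(A_i) * P(B | A_i).
p_death = sum(priors[k] * death_prob[k] for k in priors)

# Bayes' theorem: P(A_i | B) = P(A_i) * P(B | A_i) / P(B).
posterior = {k: priors[k] * death_prob[k] / p_death for k in priors}

for policy, prob in posterior.items():
    print(policy, round(prob, 3))
```

Here P(B) = 0.006 + 0.0024 + 0.0007 = 0.0091, and the three posteriors sum to 1, which is a quick sanity check on the rounding.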
3. Place eight chips in a bowl: Three have the number 1 on them, two have the number 2,
and three have the number 3. Suppose each chip has probability 1/8 of being drawn at
random, and let the random variable X be the number on the chip that is selected, so that the
space of X is S = {1, 2, 3}. Make reasonable probability assignments to each of these three
outcomes, and compute the mean µ and the variance σ² of this probability distribution.
Answer: We have
$$f(1) = P(X = 1) = \frac{3}{8}, \quad f(2) = \frac{2}{8}, \quad f(3) = \frac{3}{8}.$$
Therefore, by definition,
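$$\mu = E(X) = 1 \times \frac{3}{8} + 2 \times \frac{2}{8} + 3 \times \frac{3}{8} = 2,$$
$$\sigma^2 = E(X^2) - \mu^2 = 1^2 \times \frac{3}{8} + 2^2 \times \frac{2}{8} + 3^2 \times \frac{3}{8} - 2^2 = \frac{3}{4}.$$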
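The same computation can be done in exact arithmetic as a check. This sketch uses Python's `fractions.Fraction`; the names `pmf`, `mu`, and `sigma2` are illustrative.

```python
from fractions import Fraction

# Probability mass function: three chips show 1, two show 2, three show 3.
pmf = {1: Fraction(3, 8), 2: Fraction(2, 8), 3: Fraction(3, 8)}

mu = sum(x * p for x, p in pmf.items())                # E(X)
second_moment = sum(x**2 * p for x, p in pmf.items())  # E(X^2)
sigma2 = second_moment - mu**2                         # Var(X) = E(X^2) - mu^2

print(mu, second_moment, sigma2)
```

Working with fractions avoids floating-point rounding, so the result can be compared with the hand calculation exactly.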
4. Flaws in a certain type of drapery material appear on the average of one in 150 square feet.
If we assume a Poisson distribution, find the probability of at most one flaw appearing in
225 square feet.
Answer: For a Poisson distribution, we first find the parameter λ, which is the average
number of flaws appearing in 225 square feet. So λ = (1/150) × 225 = 1.5 and X ∼
Poisson(1.5). Then
$$P(X \le 1) = P(X = 0) + P(X = 1) = \frac{1.5^0 \, e^{-1.5}}{0!} + \frac{1.5^1 \, e^{-1.5}}{1!} = 0.558.$$
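As a numerical check, the Poisson probabilities can be evaluated directly from the mass function. The helper `poisson_pmf` is a hypothetical name introduced only for this sketch.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return lam**k * math.exp(-lam) / math.factorial(k)

# One flaw per 150 square feet, scaled to 225 square feet.
lam = 225 / 150

p_at_most_one = poisson_pmf(0, lam) + poisson_pmf(1, lam)
print(round(p_at_most_one, 3))
```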
5. Suppose that 2000 points are selected independently and at random from the unit square
{(x, y) : 0 ≤ x < 1, 0 ≤ y < 1}. Let W equal the number of points that fall into
A = {(x, y) : x² + y² < 1}. Note that the area of a unit circle (i.e. the area of A) is π.
Answer:
(a) Let S = {(x, y) : 0 ≤ x < 1, 0 ≤ y < 1}. Then the area of S is 1 and the area of A in
S (i.e. A ∩ S) is π/4. So the probability that a point falls into A is
$$p = \frac{\pi}{4}.$$
Since the 2000 points are selected independently, W ∼ Binomial(2000, π/4).
(b) Using the properties of a binomial distribution, we have
$$E(W) = 2000 \times \frac{\pi}{4} = 500\pi,$$
$$\mathrm{Var}(W) = 2000 \times \frac{\pi}{4} \times \left(1 - \frac{\pi}{4}\right) = 337,$$
$$\sigma = \sqrt{337} = 18.36.$$
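The binomial moments can be checked numerically, and a single simulated experiment gives a feel for how close W typically lands to 500π ≈ 1570.8. This is an illustrative sketch, not part of the original solution: the seed is arbitrary, and `random.random()` draws from [0, 1), matching the half-open unit square in the problem.

```python
import math
import random

n = 2000
p = math.pi / 4            # probability that a point lands inside A

mean_w = n * p             # E(W) = n p
var_w = n * p * (1 - p)    # Var(W) = n p (1 - p)
sd_w = math.sqrt(var_w)

# One simulated experiment: count points with x^2 + y^2 < 1.
random.seed(0)             # arbitrary seed, chosen only for reproducibility
w = sum(1 for _ in range(n)
        if random.random() ** 2 + random.random() ** 2 < 1)

print(round(mean_w, 1), round(var_w), round(sd_w, 2), w)
```

Because W/2000 estimates π/4, the quantity 4w/2000 from one run is a rough Monte Carlo estimate of π.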
$$P(X > x + y \mid X > x) = \frac{P(X > x + y)}{P(X > x)} = \frac{e^{-(x+y)/\mu}}{e^{-x/\mu}} = e^{-y/\mu} = P(X > y),$$
where µ = E(X).
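The identity can be checked numerically, assuming X is exponential with mean µ so that P(X > t) = e^(−t/µ), as the survival function in the display suggests. The values of `mu`, `x`, and `y` below are arbitrary illustrative choices, not taken from the source.

```python
import math

def survival(t, mu):
    """P(X > t) for an exponential random variable with mean mu (assumed model)."""
    return math.exp(-t / mu)

# Arbitrary illustrative values, not from the original solution.
mu, x, y = 2.0, 1.5, 0.7

conditional = survival(x + y, mu) / survival(x, mu)  # P(X > x + y | X > x)
unconditional = survival(y, mu)                      # P(X > y)

print(conditional, unconditional)
```

The two printed values agree up to floating-point error for any positive `mu`, `x`, and `y`, which is the memoryless property.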
The sampling distribution of X1 /n1 − X2 /n2 is also a normal distribution,
where µ1, µ2 are the true means of Y and Z, and σ1², σ2² are the true variances of Y and Z.
However, we do not know the true means and variances; the estimated values are