hw_1

The document outlines a homework assignment for an M Tech (CS) course on Statistical Methods, covering topics such as descriptive statistics, sampling distributions, and probability. It includes various problems related to data analysis, including calculating means, variances, percentiles, and correlation coefficients, as well as drawing plots and detecting outliers. The assignment also explores theoretical concepts like the Central Limit Theorem and properties of different statistical distributions.

Statistical Methods: M Tech (CS) Jan-May 2025

Homework 1
(descriptive statistics, sampling distributions, order statistics, probability and distributional convergence, CLT)

1. The average particulate concentration, in micrograms per cubic meter, was measured in a petrochemical
complex at 36 randomly chosen times, with the following concentrations resulting:

5, 18, 15, 7, 23, 220, 130, 85, 103, 25, 80, 7, 24, 6, 13, 65, 37, 25,
24, 65, 82, 95, 77, 15, 70, 110, 44, 28, 33, 81, 29, 14, 45, 92, 17, 53

(a) Obtain a grouped frequency distribution with number of classes = 5.


(b) Draw a histogram corresponding to the grouped frequency distribution.
(c) What can you say about the skewness of the histogram?
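One possible way to build the grouped frequency distribution in part (a) is sketched below. The choice of five equal-width classes spanning the minimum to the maximum, with the last class closed on the right, is an assumption; other class boundaries are equally acceptable and will give different counts. The function name `grouped_frequencies` is hypothetical.

```python
# Grouped frequency distribution for the particulate data of Problem 1.
# Assumption: five equal-width classes covering [min, max]; the last class
# is closed on the right so that the maximum value is counted.
data = [
    5, 18, 15, 7, 23, 220, 130, 85, 103, 25, 80, 7, 24, 6, 13, 65, 37, 25,
    24, 65, 82, 95, 77, 15, 70, 110, 44, 28, 33, 81, 29, 14, 45, 92, 17, 53,
]

def grouped_frequencies(values, num_classes):
    """Return (class edges, class counts) for equal-width classes."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / num_classes
    counts = [0] * num_classes
    for v in values:
        # Clamp so that v == hi falls into the last class.
        k = min(int((v - lo) / width), num_classes - 1)
        counts[k] += 1
    edges = [lo + i * width for i in range(num_classes + 1)]
    return edges, counts

edges, counts = grouped_frequencies(data, 5)

# A crude text histogram: one star per observation in the class.
for left, right, c in zip(edges, edges[1:], counts):
    print(f"[{left:6.1f}, {right:6.1f})  {c:2d}  {'*' * c}")
```

The text histogram printed at the end gives a quick visual impression of the shape before a proper plot is drawn for part (b).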

2. The following data represent the scores of 28 students in an SAT exam:

43, 52, 60, 72, 83, 90, 91, 85, 74, 62, 55, 46, 55, 63,
74, 85, 94, 87, 75, 64, 56, 58, 66, 77, 88, 77, 66, 78

(a) Draw a stem and leaf plot of the scores. What can you say about the skewness of the data from
this plot?
(b) Compute the 30th and 80th percentiles.
(c) Find the five number summary of the data.
(d) Draw a box plot.
(e) Detect outliers if there exist any.
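Percentile conventions differ between textbooks and software, so the sketch below fixes one explicitly. It assumes the convention in which, for n observations and the k-th percentile, the value at position ceil(nk/100) of the sorted data is used when nk/100 is not an integer, and the average of the values at positions nk/100 and nk/100 + 1 is used when it is. The 1.5 × IQR fences for outliers are likewise an assumed rule, and the helper name `percentile` is hypothetical.

```python
# Percentiles, five-number summary and outlier fences for Problem 2.
# Assumed convention: position ceil(n*k/100) when n*k/100 is not an integer,
# otherwise the average of positions n*k/100 and n*k/100 + 1 (1-based).
scores = [
    43, 52, 60, 72, 83, 90, 91, 85, 74, 62, 55, 46, 55, 63,
    74, 85, 94, 87, 75, 64, 56, 58, 66, 77, 88, 77, 66, 78,
]
xs = sorted(scores)

def percentile(sorted_values, k):
    """Sample k-th percentile for an integer k with 0 < k < 100."""
    n = len(sorted_values)
    q, r = divmod(n * k, 100)  # integer arithmetic avoids float round-off
    if r == 0:
        return (sorted_values[q - 1] + sorted_values[q]) / 2
    return sorted_values[q]  # 1-based position q + 1 = ceil(n*k/100)

p30, p80 = percentile(xs, 30), percentile(xs, 80)
q1, median, q3 = percentile(xs, 25), percentile(xs, 50), percentile(xs, 75)
five_number = (xs[0], q1, median, q3, xs[-1])

# Assumed outlier rule: points beyond 1.5 * IQR from the quartiles.
iqr = q3 - q1
outliers = [x for x in xs if x < q1 - 1.5 * iqr or x > q3 + 1.5 * iqr]

print("30th and 80th percentiles:", p30, p80)
print("five-number summary:", five_number)
print("outliers:", outliers)
```

Under a different percentile convention (for example, linear interpolation between order statistics) the quartiles may shift slightly, so answers should state which convention was used.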

3. Suppose data values $\{y_1, \ldots, y_n\}$ are obtained from the data values $\{x_1, \ldots, x_n\}$ using a linear
transformation $y_i = \alpha + \beta x_i$, where $i = 1, \ldots, n$ and $\alpha$ and $\beta$ are constants.

(a) Find the relation between the sample means x̄ and ȳ, i.e., express ȳ in terms of x̄.
(b) What is the relation between the sample medians x̃ and ỹ? Express ỹ in terms of x̃.
(c) Determine the relation between the sample variances
$S_x^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2$ and $S_y^2 = \frac{1}{n-1}\sum_{i=1}^{n} (y_i - \bar{y})^2$.
(d) What will be the relation between the sample coefficient of skewness for x-samples and y-samples?
(e) If β < 0, show all steps to find the sample correlation coefficient r of the paired observations
(xi , yi ), i = 1, . . . , n.
4. (a) Find Pearson's sample correlation coefficient for the data

       x | −3  −2  −1   1   2   3
       y |  9   4   1   1   4   9

(b) Draw a scatter diagram. Do you see any pattern that the points are following?
(c) Show that the sample correlation coefficient can be alternatively expressed as
\[
r = \frac{\sum_{i=1}^{n} x_i y_i - n\bar{x}\bar{y}}{\sqrt{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2}\,\sqrt{\sum_{i=1}^{n} y_i^2 - n\bar{y}^2}}.
\]

(d) Suppose a survey finds that each woman marries a man who is exactly 4 years older than she is.
What would be the correlation coefficient between the ages of husbands and wives?
(e) State true or false, and explain briefly: “If y is usually less than x, the correlation between x and
y will be negative.”
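For part (a), the coefficient can be computed directly from its definition. The sketch below is one straightforward way to do so; the function name `pearson_r` is hypothetical, and the code is meant only as a numerical check on a hand calculation.

```python
# Pearson's sample correlation coefficient for the data of Problem 4(a),
# computed from the centered sums of squares and cross-products.
from math import sqrt

x = [-3, -2, -1, 1, 2, 3]
y = [9, 4, 1, 1, 4, 9]

def pearson_r(xs, ys):
    """Sample correlation; the 1/(n-1) factors cancel, so they are omitted."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(xs, ys))
    sxx = sum((a - x_bar) ** 2 for a in xs)
    syy = sum((b - y_bar) ** 2 for b in ys)
    return sxy / sqrt(sxx * syy)

r = pearson_r(x, y)
print(f"r = {r:.6f}")
```

Comparing the printed value with the scatter diagram of part (b) is a useful reminder that r measures only linear association.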

5. Sample variance involves the sum of all squared deviations from the mean. For a data set x1 , . . . , xn ,
a new measure of spread based on “sum of all squared deviations from the median” is defined as
\[
\frac{1}{n-1}\sum_{i=1}^{n} (x_i - \tilde{x})^2,
\]
where $\tilde{x}$ is the sample median.

Prove that the new measure is never smaller than the usual sample variance. In other
words, prove that
\[
\frac{1}{n-1}\sum_{i=1}^{n} (x_i - \tilde{x})^2 \;\ge\; \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2,
\]
where $\bar{x}$ is the sample mean.
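Before attempting the proof, it can help to see the inequality on a concrete sample. The sketch below uses a small, made-up data set (the values in `sample` are hypothetical) and compares the spread about the median with the spread about the mean. It also checks numerically that the gap equals n/(n−1) times the squared distance between mean and median, which follows from expanding the square around the mean; a numerical check of this kind is not a proof.

```python
# Numerical illustration for Problem 5 on a hypothetical skewed sample.
from statistics import mean, median

def spread_about(values, center):
    """Sum of squared deviations from `center`, divided by n - 1."""
    n = len(values)
    return sum((v - center) ** 2 for v in values) / (n - 1)

sample = [1, 2, 2, 3, 50]  # hypothetical data, chosen so mean != median
n = len(sample)

about_median = spread_about(sample, median(sample))
about_mean = spread_about(sample, mean(sample))  # usual sample variance

# Gap predicted by expanding (x_i - c)^2 around the mean.
predicted_gap = n / (n - 1) * (mean(sample) - median(sample)) ** 2

print("spread about median:", about_median)
print("spread about mean:  ", about_mean)
print("difference:         ", about_median - about_mean)
print("predicted gap:      ", predicted_gap)
```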

6. Let X be a Laplace random variable with density function


 
\[
f_X(x) = \frac{1}{2b}\exp\left(-\frac{|x - \mu|}{b}\right), \quad \text{where } -\infty < \mu < \infty \text{ and } b > 0.
\]

Find the excess kurtosis of X.

7. Suppose $\{(x_i, y_i), i = 1, \ldots, n\}$ are n pairs of observations and let $r_i$ ($s_i$) be the rank of $x_i$ ($y_i$) among
the x-observations (y-observations). Show that Spearman's rank correlation $r_{SP}$ is the same as
Pearson's correlation coefficient r based on the rank variables $\{(r_i, s_i) : i = 1, \ldots, n\}$ if all ranks are
distinct integers.

8. Suppose $\{(x_i, y_i), i = 1, \ldots, n\}$ are n pairs of observations and let $r_i$ ($s_i$) be the rank of $x_i$ ($y_i$)
among the x-observations (y-observations). Show that Pearson's correlation coefficient r based on
$\{(a_{ij}, b_{ij}) : i, j = 1, \ldots, n\}$ equals Kendall's $\tau$ (if all ranks are distinct integers), where for each pair
$(i, j)$,
\[
a_{ij} = \begin{cases} +1 & \text{if } r_i < r_j \\ -1 & \text{if } r_i > r_j \\ 0 & \text{if } r_i = r_j \end{cases}
\qquad \text{and} \qquad
b_{ij} = \begin{cases} +1 & \text{if } s_i < s_j \\ -1 & \text{if } s_i > s_j \\ 0 & \text{if } s_i = s_j. \end{cases}
\]

9. A die is rolled. Let X be the face value that turns up, and $X_1, X_2$ be two independent observations
on X. Find the probability mass function of the sample mean $\bar{X} = (X_1 + X_2)/2$.
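A hand-derived probability mass function for Problem 9 can be checked by brute-force enumeration. The sketch below assumes the die is fair (the problem does not say so explicitly) and lists all 36 equally likely ordered pairs, using exact fractions so that no rounding is involved.

```python
# Exact PMF of the mean of two independent rolls of a die (Problem 9).
# Assumption: the die is fair, so all 36 ordered pairs are equally likely.
from collections import Counter
from fractions import Fraction
from itertools import product

faces = range(1, 7)

# Count how many ordered pairs (a, b) give each value of (a + b) / 2.
pair_counts = Counter(Fraction(a + b, 2) for a, b in product(faces, repeat=2))

pmf = {value: Fraction(count, 36) for value, count in sorted(pair_counts.items())}

for value, prob in pmf.items():
    print(f"P(mean = {float(value):.1f}) = {prob}")
```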

10. Let $X_1, \ldots, X_n$ be a random sample from $N(\mu, \sigma^2)$ and let $\bar{X}$ and $S^2$, respectively, be the sample mean
and the sample variance. Let $X_{n+1} \sim N(\mu, \sigma^2)$, and assume that $X_1, \ldots, X_n, X_{n+1}$ are independent.
Find the sampling distribution of $[(X_{n+1} - \bar{X})/S]\sqrt{n/(n+1)}$.

11. Derive the moment generating function (MGF) of a $\chi^2_n$ random variable. Suppose $X_1$, $X_2$, and Y are
random variables such that $X_1 \sim \chi^2_{n_1}$, $Y := X_1 + X_2 \sim \chi^2_n$, and $X_1$ and $X_2$ are independent. Show
that $X_2 \sim \chi^2_{n - n_1}$.

12. A manufacturer of booklets packages them in boxes of 100. It is known that, on the average, the
booklets weigh 1 ounce, with a standard deviation of 0.05 ounce. The manufacturer is interested in
calculating the probability that 100 booklets weigh more than 100.4 ounces, a number that would
help detect whether too many booklets are being put in a box. Explain how you would calculate the
approximate value of this probability. Mention any relevant theorems or assumptions needed.
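The calculation asked for in Problem 12 can be sketched numerically as follows. The sketch assumes the 100 booklet weights are independent and identically distributed, so that the Central Limit Theorem justifies treating their total as approximately normal; both the independence assumption and the variable names are additions for illustration.

```python
# Normal (CLT) approximation for the total weight of 100 booklets (Problem 12).
# Assumption: weights are i.i.d. with mean 1 oz and standard deviation 0.05 oz.
from math import sqrt
from statistics import NormalDist

n = 100            # booklets per box
mu = 1.0           # mean weight of one booklet (ounces)
sigma = 0.05       # standard deviation of one booklet (ounces)
threshold = 100.4  # total weight of interest (ounces)

# Under the CLT the total is approximately N(n * mu, n * sigma^2).
total = NormalDist(mu=n * mu, sigma=sigma * sqrt(n))
z = (threshold - total.mean) / total.stdev
p_exceeds = 1 - total.cdf(threshold)

print(f"z = {z:.2f}, P(total > {threshold}) is approximately {p_exceeds:.4f}")
```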

13. Each of the batteries in a collection of 40 batteries is equally likely to be either a type A or a type
B battery. Type A batteries last for an amount of time that has mean 50 and standard deviation 15;
type B batteries last for an amount of time that has mean 30 and standard deviation 6.

(a) Approximate the probability that the total life of all 40 batteries exceeds 1700.
(b) Suppose it is known that 20 of the batteries are type A and 20 are type B. Approximate the
probability that the total life of all 40 batteries exceeds 1700.
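The two parts of Problem 13 differ only in how the variance of the total is obtained, which the following sketch makes explicit. It assumes battery lifetimes are independent and that a normal approximation to the total is adequate; in part (a) each battery's type is treated as an independent fair coin flip, so a single lifetime is a 50/50 mixture.

```python
# Normal approximations for the total life of 40 batteries (Problem 13).
# Assumptions: independent lifetimes; normal approximation for the total.
from math import sqrt
from statistics import NormalDist

mean_a, sd_a = 50.0, 15.0  # type A
mean_b, sd_b = 30.0, 6.0   # type B
n, threshold = 40, 1700.0

# (a) Each battery is A or B with probability 1/2: use mixture moments.
mix_mean = 0.5 * mean_a + 0.5 * mean_b
mix_second_moment = 0.5 * (sd_a**2 + mean_a**2) + 0.5 * (sd_b**2 + mean_b**2)
mix_var = mix_second_moment - mix_mean**2
p_a = 1 - NormalDist(n * mix_mean, sqrt(n * mix_var)).cdf(threshold)

# (b) Exactly 20 of each type: the type is no longer random.
total_mean = 20 * mean_a + 20 * mean_b
total_var = 20 * sd_a**2 + 20 * sd_b**2
p_b = 1 - NormalDist(total_mean, sqrt(total_var)).cdf(threshold)

print(f"(a) variance of total = {n * mix_var:.1f}, P = {p_a:.4f}")
print(f"(b) variance of total = {total_var:.1f}, P = {p_b:.4f}")
```

The mean of the total is the same in both parts; only the variance changes, because in part (a) the uncertainty about each battery's type adds to the spread.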

14. The opening prices per share X1 , X2 , . . . , X5 of five similar stocks are independent random variables,
each with a density function given by
\[
f(x) = \frac{1}{2}\, e^{-\frac{x-3}{2}}\, I_{\{x \ge 3\}}.
\]
On a given day, an investor buys shares of whichever stock is least expensive. Find the

(a) probability density function for the price per share that the investor will pay.
(b) expected cost per share that the investor will pay.
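A closed-form answer to Problem 14 can be sanity-checked by simulation. The sketch below reads the density as f(x) = (1/2) exp(−(x − 3)/2) for x ≥ 3, that is, 3 plus an exponential variable with mean 2, and simulates the minimum of five independent prices. The seed, the number of trials and the function name `cheapest_price` are arbitrary choices for illustration.

```python
# Monte Carlo check for Problem 14: price paid = minimum of five i.i.d. prices.
# Assumed density: f(x) = (1/2) * exp(-(x - 3) / 2) for x >= 3,
# i.e. 3 plus an exponential random variable with rate 1/2 (mean 2).
import random

random.seed(0)  # fixed seed so the run is reproducible

def cheapest_price(num_stocks=5):
    """Simulate one day: draw each stock's price and return the smallest."""
    return min(3.0 + random.expovariate(0.5) for _ in range(num_stocks))

trials = 100_000
prices = [cheapest_price() for _ in range(trials)]

sim_mean = sum(prices) / trials
sim_min = min(prices)

print(f"simulated expected cost per share: {sim_mean:.3f}")
print(f"smallest simulated price: {sim_min:.4f}")
```

The simulated mean should agree with the expectation derived in part (b) to within Monte Carlo error, and a histogram of `prices` can be compared with the density derived in part (a).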

15. Let $X_1, \ldots, X_n$ be independent observations from an exponential distribution with mean $\theta$. Assume
$n = 2m + 1$ for some positive integer m. Find the probability density functions of $X_{(1)}$, $X_{(n)}$ and the
sample median $\tilde{X}$.
16. Let X1 , X2 , . . . be a sequence of random variables such that Xn ∼ Geometric(λ/n) for n = 1, 2, 3, . . .
and constant λ > 0. Show that Xn /n converges in distribution to Exponential(λ) as n → ∞.
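The convergence in Problem 16 can be observed numerically by comparing tail probabilities. The sketch below assumes the convention in which a Geometric(p) variable takes values 1, 2, 3, ... (the number of trials up to and including the first success), so that P(X > k) = (1 − p)^k for integers k ≥ 0. The values of `lam` and `t` are arbitrary illustrative choices, and the function name is hypothetical.

```python
# Numerical look at Problem 16: tail of X_n / n versus the Exponential tail.
# Assumed convention: Geometric(p) is supported on {1, 2, ...}, so
# P(X > k) = (1 - p)**k for every integer k >= 0.
from math import exp, floor

def tail_scaled_geometric(n, lam, t):
    """P(X_n / n > t) when X_n ~ Geometric(lam / n) and t >= 0."""
    p = lam / n
    return (1 - p) ** floor(n * t)

lam, t = 2.0, 0.7  # illustrative values
limit_tail = exp(-lam * t)  # P(Z > t) for Z ~ Exponential with rate lam

for n in (10, 100, 10_000, 1_000_000):
    approx = tail_scaled_geometric(n, lam, t)
    print(f"n = {n:>9}: P(X_n/n > t) = {approx:.6f}, exp(-lam*t) = {limit_tail:.6f}")
```

Agreement of the tails at every t is equivalent to convergence of the distribution functions, which is what the problem asks to be shown analytically.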

17. Let $X_1, X_2, \ldots$ be a sequence of i.i.d. $N(0, \sigma^2)$ random variables. Define $Y_{n+1} = \frac{Y_n + X_{n+1}}{3}$ for $n = 0, 1, 2, \ldots$
with $Y_0 = 0$. Which random variable does $Y_n$ converge in distribution to as $n \to \infty$?

18. Suppose random variables Xn , Yn and Y are such that Yn = Y Xn for n = 1, 2, . . . where P (|Y | ≤
M ) = 1 for some positive constant M . If E(Xn ) = µ and V (Xn ) = σ 2 /n, then which random variable
does Yn converge in probability to as n → ∞?

19. Let $X_1, X_2, \ldots$ be a sequence of i.i.d. Cauchy random variables with density function $f(x) = \frac{1}{\pi}\,\frac{1}{1 + x^2}$.
To what does $X_n/n$ converge in probability as $n \to \infty$?

20. Let $X_1, X_2, \ldots$ be a sequence of i.i.d. random variables with $V(X_1) = \sigma^2$. Show that the sample
variance $S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2$ converges in probability to $\sigma^2$.
