
Sample Paper 3

The document consists of a series of questions related to advanced topics in statistics, probability, regression analysis, metric spaces, and linear algebra. Each question requires proofs, derivations, or explanations of concepts such as the Gauss-Markov theorem, Central Limit Theorem, and methods for variable selection in regression. Additionally, it includes practical applications and implications in data science, as well as mathematical tools for analyzing data.

HARD

Q1 – Compulsory (5 × 3 = 15 marks)

a) Prove that the OLS estimator is the Best Linear Unbiased Estimator (BLUE) using the
Gauss-Markov Theorem.

b) Let A be a 3×3 matrix with distinct eigenvalues. Prove that its eigenvectors form a linearly
independent set.

c) Define a complete metric space. Give one example of a space that is not complete and explain
why.

d) Derive the likelihood ratio test for comparing two nested regression models.

e) Differentiate between log-transformations, Box-Cox transformations, and standardization,
including when each is most appropriate.

UNIT I – Probability, Distributions & Hypothesis Testing

Q2

a) Derive the moment generating function (MGF) of the standard normal distribution. Use it to
compute the first two moments. (6 marks)

b) Let X_1, X_2, ..., X_n be i.i.d. N(μ, σ^2). Prove that the sample mean X̄ and the sample
variance S^2 are independent. (6 marks)

c) A researcher claims that the mean recovery time for a new drug is less than 10 days. A sample
of 16 patients had a mean recovery time of 9.4 days and standard deviation of 1.2 days. Test the
claim at the 1% level using a one-tailed t-test. (3 marks)
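
Numerical check for Q2(c), offered as a rough sketch rather than a model answer. It assumes the hypotheses H0: μ ≥ 10 against H1: μ < 10 and takes the critical value 2.602 for 15 degrees of freedom from a standard t-table. The variable names are illustrative.

```python
import math

# Data stated in Q2(c)
n = 16
sample_mean = 9.4
sample_sd = 1.2
mu0 = 10.0

# Left-tailed test: H0: mu >= 10 versus H1: mu < 10
standard_error = sample_sd / math.sqrt(n)
t_stat = (sample_mean - mu0) / standard_error
df = n - 1

# Assumption: one-tailed critical value t_{0.01, 15} = 2.602 from a t-table
t_crit = 2.602
reject_h0 = t_stat < -t_crit

print(f"t = {t_stat:.3f}, df = {df}, reject H0: {reject_h0}")
```

Under these assumptions the statistic is t = -2.0, which does not fall below -2.602, so the sample does not support the claim at the 1% level.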

Q3
a) State and prove the Central Limit Theorem for independent and identically distributed (i.i.d.)
variables. Discuss two practical implications in data science. (8 marks)

b) A six-sided die is rolled 90 times and the results are:
Face 1: 10, Face 2: 18, Face 3: 16, Face 4: 14, Face 5: 16, Face 6: 16
Test whether the die is fair using the Chi-Square Goodness of Fit Test at α=0.05. (7 marks)
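
Numerical check for Q3(b), again only a sketch. It assumes the null hypothesis that all six faces are equally likely and uses the table value 11.070 for the chi-square distribution with 5 degrees of freedom at α = 0.05.

```python
# Observed counts for faces 1..6, as stated in Q3(b)
observed = [10, 18, 16, 14, 16, 16]

total = sum(observed)               # number of rolls
expected = total / len(observed)    # expected count per face under a fair die

# Pearson chi-square statistic: sum of (O - E)^2 / E over the six faces
chi2_stat = sum((o - expected) ** 2 / expected for o in observed)
df = len(observed) - 1

# Assumption: critical value chi^2_{0.05, 5} = 11.070 from a chi-square table
chi2_crit = 11.070
reject_h0 = chi2_stat > chi2_crit

print(f"chi2 = {chi2_stat:.3f}, df = {df}, reject H0: {reject_h0}")
```

The statistic works out to 38/15 ≈ 2.53, well below 11.070, so under these assumptions there is no evidence against fairness.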

UNIT II – Regression, Model Selection & Diagnostics


Q4

a) Derive the normal equations for multiple linear regression using matrix notation. Explain
conditions under which the matrix X^T X is invertible. (8 marks)

b) Describe how you would use stepwise regression for variable selection. What are its
limitations? (4 marks)

c) How would you detect and deal with multicollinearity in a regression model? Explain with
mathematical tools (e.g., VIF). (3 marks)

Q5

a) Design and analyze a 2^3 factorial experiment. Show how interaction effects can be isolated.
(7.5 marks)

b) The residuals of your regression model show a clear funnel shape when plotted against fitted
values.
i) What assumption is violated?
ii) Propose and explain a remedy using transformation techniques. (4.5 marks)

c) Explain how Ridge regression and Lasso regression help in high-dimensional data modeling.
Include objective functions. (3 marks)
UNIT III – Metric Spaces, Sequences & Convergence
Q6

a) Prove that every compact metric space is complete and totally bounded. (6 marks)

b) Let (x_n) be a sequence in a metric space such that every subsequence has a convergent
subsequence. Prove that the closure of the set {x_n : n ∈ N} is compact. (5 marks)

c) Define a Cauchy sequence. Construct an example of a Cauchy sequence that does not
converge in the space Q. (4 marks)
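
One possible construction for Q6(c), sketched with exact rational arithmetic. It is not the only valid example. The Newton iteration x_{k+1} = (x_k + 2/x_k)/2 started at x_0 = 1 stays inside Q at every step, while its limit in R is √2, which is irrational. The function name and the number of steps are illustrative choices.

```python
from fractions import Fraction

def sqrt2_approximations(steps):
    """Return the rational Newton iterates for x^2 = 2, starting from 1."""
    x = Fraction(1)
    terms = [x]
    for _ in range(steps):
        x = (x + 2 / x) / 2   # stays rational: sums and quotients of rationals
        terms.append(x)
    return terms

terms = sqrt2_approximations(5)

# Distances between consecutive terms shrink rapidly
gaps = [abs(terms[i + 1] - terms[i]) for i in range(len(terms) - 1)]

# x^2 - 2 is never exactly zero for a rational x, but it tends to zero
errors = [abs(t * t - 2) for t in terms]

for t, e in zip(terms, errors):
    print(t, float(e))
```

The printed terms begin 1, 3/2, 17/12, 577/408. The sketch only illustrates the behaviour. A written answer still needs the argument that the sequence is Cauchy and that no rational number squares to 2.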

Q7

a) Define an open cover and compactness. Show that a closed interval [a, b] in R is
compact using the open cover definition. (7.5 marks)

b) Give an example of a sequence in a non-complete metric space that is Cauchy but not
convergent. Explain why. (4.5 marks)

c) Prove or disprove: Every bounded subset of a metric space is compact. (3 marks)

UNIT IV – Linear Algebra in Data Analytics


Q8

a) Let

A = [ 4 1 0 ]
    [ 1 4 1 ]
    [ 0 1 4 ]

Compute the eigenvalues and eigenvectors of A, and verify orthogonality. (8 marks)

b) Explain how eigenvalues and eigenvectors are used in Principal Component Analysis (PCA).
Derive how the first principal component maximizes variance. (7 marks)
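
Numerical check for Q8(a), as a sketch. The candidate eigenpairs below come from the characteristic polynomial (4 − λ)[(4 − λ)^2 − 2] = 0 and should be derived by hand in a written answer. The code only verifies that A v = λ v holds and that the eigenvectors are mutually orthogonal. The helper names are illustrative.

```python
import math

A = [[4, 1, 0],
     [1, 4, 1],
     [0, 1, 4]]

r2 = math.sqrt(2)

# Candidate (eigenvalue, eigenvector) pairs to be verified
candidates = [
    (4 + r2, [1, r2, 1]),
    (4, [1, 0, -1]),
    (4 - r2, [1, -r2, 1]),
]

def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def dot(u, v):
    """Standard inner product of two vectors."""
    return sum(a * b for a, b in zip(u, v))

# Largest entry of |A v - lambda v| for each candidate pair
residuals = []
for lam, v in candidates:
    Av = matvec(A, v)
    residuals.append(max(abs(a - lam * x) for a, x in zip(Av, v)))

# Pairwise inner products of distinct eigenvectors
vectors = [v for _, v in candidates]
inner_products = [dot(vectors[i], vectors[j])
                  for i in range(3) for j in range(i + 1, 3)]

print("residuals:", residuals)
print("inner products:", inner_products)
```

All residuals and inner products come out at rounding-error level, as expected for a real symmetric matrix with distinct eigenvalues.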

Q9

a) Let V=span{(1,2,1),(2,4,2),(3,6,3)}.
i) Determine if the vectors are linearly independent.
ii) Find a basis for V. (6 marks)

b) Define a subspace. Prove that the set of all solutions to a homogeneous linear system Ax=0
forms a subspace of R^n. (6 marks)

c) Explain the significance of spectral decomposition in data reduction and clustering. (3 marks)
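
Numerical check for Q9(a), as a sketch. It computes the rank of the three spanning vectors by Gaussian elimination in exact rational arithmetic. The `rank` helper is written here for illustration and is not a library function.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a list of row vectors via Gaussian elimination over Q."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0  # number of pivots found so far
    for col in range(len(m[0])):
        # Find a row at or below position r with a nonzero entry in this column
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        # Eliminate the entries below the pivot
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

vectors = [(1, 2, 1), (2, 4, 2), (3, 6, 3)]
dim_V = rank(vectors)
linearly_independent = dim_V == len(vectors)

print("dim V =", dim_V, "| linearly independent:", linearly_independent)
```

The rank is 1 because the second and third vectors are 2 and 3 times the first. The vectors are therefore linearly dependent, and {(1,2,1)} is one basis for V.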
