Viva questions imp

The document outlines important questions and answers related to statistical tests including t-tests, ANOVA, correlation tests, chi-square tests, and regression analysis. It covers definitions, interpretations, and applications of these tests, emphasizing the significance of p-values, confidence intervals, and model selection criteria like AIC. Additionally, it explains the differences between various regression methods and the implications of coefficients in both linear and logistic regression.

Uploaded by

gmp24hdqck

Important Questions for VIVA – Unit 4 RMLAB

T-Tests
1. What is a t-test, and what is its main purpose?
2. Explain the difference between a one-sample, two-sample, and paired t-test. (Explain
with examples.)
3. How do you interpret the t-value in a t-test?
4. What does the p-value indicate in the results of a t-test?
5. In a paired t-test, what does a negative t-value indicate?
6. What is a confidence interval, and how does it help in interpreting t-test results?
7. What does it mean if the confidence interval includes zero in a t-test?
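Ans- It means the difference is not statistically significant at that confidence level.
Zero is a plausible value for the difference between the sample mean and the population
mean (or between the means of the 2 groups), so we fail to reject the null hypothesis.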
8. What is null and alternative hypothesis?
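The link between the t-value, the confidence interval and the null hypothesis can be seen in a small one-sample example. This is only a sketch: the marks, the hypothesized mean of 50 and the variable names are made up for illustration, and the critical value 2.262 is the two-sided 95% value for 9 degrees of freedom read from a t-table.

```python
import math
import statistics

# Hypothetical marks of 10 students. H0: population mean = 50, H1: it is not 50.
marks = [52, 55, 48, 60, 57, 53, 49, 58, 54, 56]
mu0 = 50

n = len(marks)
mean = statistics.mean(marks)
sd = statistics.stdev(marks)      # sample standard deviation (divides by n - 1)
se = sd / math.sqrt(n)            # standard error of the mean

# t-value: how many standard errors the sample mean is away from mu0.
t_value = (mean - mu0) / se

# Two-sided 95% critical value for df = n - 1 = 9, taken from a t-table.
t_crit = 2.262

# 95% confidence interval for the difference (mean - mu0).
ci_low = (mean - mu0) - t_crit * se
ci_high = (mean - mu0) + t_crit * se

reject_h0 = abs(t_value) > t_crit
print(f"t = {t_value:.3f}")
print(f"95% CI for the difference: ({ci_low:.2f}, {ci_high:.2f})")
print("Reject H0" if reject_h0 else "Fail to reject H0")
```

The two checks always agree: the interval excludes zero exactly when the absolute t-value is larger than the critical value.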
ANOVA
9. What is ANOVA, and why is it used in data analysis?
10. Describe the purpose of a one-way ANOVA versus a two-way ANOVA.
11. What does a significant p-value in ANOVA indicate about the groups?
Ans- A significant p-value is one that is less than alpha, the significance level (e.g.
0.05, which is 5%). It means we reject the null hypothesis. The null hypothesis in ANOVA
states that all group means are equal. Rejecting it means that at least one group mean
differs from the others. ANOVA alone does not tell us which group differs.
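A one-way ANOVA can be worked by hand to show where the F statistic comes from. The sketch below uses made-up scores for three hypothetical teaching methods. The critical value 3.89 is the F-table value for (2, 12) degrees of freedom at the 5% level.

```python
import statistics

# Hypothetical exam scores for three teaching methods (5 students each).
groups = {
    "A": [60, 62, 65, 63, 60],
    "B": [70, 72, 68, 71, 69],
    "C": [80, 78, 82, 79, 81],
}

all_values = [v for g in groups.values() for v in g]
grand_mean = statistics.mean(all_values)
k = len(groups)          # number of groups
n = len(all_values)      # total number of observations

# Between-group variation: how far each group mean is from the grand mean.
ss_between = sum(
    len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups.values()
)
# Within-group variation: how far each value is from its own group mean.
ss_within = sum(
    (v - statistics.mean(g)) ** 2 for g in groups.values() for v in g
)

ms_between = ss_between / (k - 1)
ms_within = ss_within / (n - k)
f_value = ms_between / ms_within

f_crit = 3.89  # F-table value for df = (2, 12) at alpha = 0.05
print(f"F = {f_value:.2f}")
print("Reject H0: at least one group mean differs" if f_value > f_crit
      else "Fail to reject H0")
```

A large F means the variation between groups is large compared with the variation inside the groups.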
Correlation Tests
13. Explain Pearson’s correlation test and what it measures.
14. What is the difference between Pearson, Spearman, and Kendall correlation methods?
Ans- Pearson correlation measures the strength of the linear relationship between two
continuous variables. It assumes that the data are normally distributed.

Spearman rank correlation uses ranked data, making it suitable for ordinal variables
(1st, 2nd) or when the data are not normally distributed.

Kendall correlation also uses ranked data. It is based on counting concordant and
discordant pairs, which helps when there are ties between ranks and when the data
contain extreme values.

15. What does a negative correlation coefficient mean in Pearson’s test?


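16. Why might Spearman’s or Kendall’s correlation be preferred over Pearson’s?
Ans- When the data are ranked or the variables are ordinal, when the data are not
normally distributed, or when the relationship is monotonic but not linear.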
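The difference between Pearson and Spearman shows up clearly on data that always increase but not in a straight line. The sketch below writes both coefficients by hand. The helper names (pearson, ranks, spearman) and the data are illustrative, and the ranking helper assumes there are no ties.

```python
import math

def pearson(x, y):
    """Pearson correlation: strength of the linear relationship."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def ranks(values):
    """Rank 1 = smallest value. Assumes there are no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# y always increases with x, but along a curve (y = x cubed).
x = [1, 2, 3, 4, 5, 6]
y = [v ** 3 for v in x]

r_pearson = pearson(x, y)
r_spearman = spearman(x, y)
print(f"Pearson  = {r_pearson:.3f}")
print(f"Spearman = {r_spearman:.3f}")
```

Spearman treats the relationship as perfect because the ranks agree exactly, while Pearson is below 1 because the points do not lie on a straight line.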
Chi-Square Test
17. What is the Chi-Square test used for in data analysis?
Ans- The Chi-Square test is used to determine whether there is a significant association
between two categorical variables. It compares the observed counts with the counts we
would expect if the variables were independent.
For example, we could use a Chi-Square test to see if there is a relationship between
gender and favorite color.
18. Explain the meaning of the chi-squared statistic and how it is interpreted.
19. What does it mean if the p-value is less than 0.05 in a Chi-Square test?
Ans- If the p-value is less than 0.05, we reject the null hypothesis at the 5% level of
significance. The null hypothesis in a Chi-Square test states that the variables are not
related. Rejecting it means that the variables are related to each other.
20. Does chi square tell direction of relationship, positive or negative?
Ans- No. It only tells whether the variables are related or not.
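The gender and favorite color example can be worked through for a 2 x 2 table. The counts below are invented for illustration. The sketch applies no continuity correction. It uses the fact that, for 1 degree of freedom, the chi-square p-value equals erfc(sqrt(chi2 / 2)), so only the standard library is needed.

```python
import math

# Hypothetical counts. Rows: male, female. Columns: prefers blue, prefers red.
observed = [
    [30, 20],
    [15, 35],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

# Chi-square statistic: sum of (observed - expected)^2 / expected,
# where expected is the count we would see if the variables were independent.
chi2 = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / total
        chi2 += (obs - expected) ** 2 / expected

# A 2 x 2 table has (2 - 1) * (2 - 1) = 1 degree of freedom.
p_value = math.erfc(math.sqrt(chi2 / 2))

print(f"chi-square = {chi2:.3f}, p-value = {p_value:.4f}")
print("Variables are related" if p_value < 0.05 else "No evidence of a relationship")
```

The statistic is always zero or positive, which is why it cannot show whether a relationship is positive or negative.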
Regression Analysis
20. Describe the purpose of linear regression and when it would be used.
21. What is regression analysis, and why do we use it?
22. Explain the difference between simple linear regression and multiple linear
regression. Give example
23. What is the role of the intercept in a regression model?
24. What does the slope coefficient represent in a linear regression model?
25. How do you interpret the coefficient of an independent variable in a linear regression?
(give example)
Ans- The coefficient of an independent variable represents the average change in the
dependent variable for a one-unit increase in that independent variable, while holding
all other independent variables constant. E.g. if we are predicting the marks of a
student (Y) based on hours of study (X1), the regression equation is Y = b0 + b1*X1 + e.
If b1 is 10, it means that for every additional hour of study, marks increase by 10
units on average, assuming all other things constant.
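The slope and intercept in the marks and hours example can be estimated by least squares. The data below are invented so that the slope comes out as 10, and the variable names are only for illustration.

```python
# Hypothetical data: hours of study (X1) and marks scored (Y).
hours = [1, 2, 3, 4, 5]
marks = [40, 51, 58, 71, 80]

n = len(hours)
mean_x = sum(hours) / n
mean_y = sum(marks) / n

# Least-squares estimates for Y = b0 + b1*X1 + e.
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(hours, marks))
s_xx = sum((x - mean_x) ** 2 for x in hours)
b1 = s_xy / s_xx              # slope: change in marks per extra hour
b0 = mean_y - b1 * mean_x     # intercept: predicted marks at 0 hours

# The error term e is estimated by the residuals (actual minus predicted).
residuals = [y - (b0 + b1 * x) for x, y in zip(hours, marks)]

print(f"marks = {b0:.1f} + {b1:.1f} * hours")
print("residuals:", [round(r, 2) for r in residuals])
```

A positive b1 means marks go up as hours go up. A negative b1 would mean marks go down as hours go up.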
26. What does a positive vs. negative coefficient tell you about the relationship between
variables?
27. What is error term?
28. What is the meaning of the R-squared value in a regression output?
Ans- It is a statistical measure that gives the proportion of variance in the dependent
variable that is explained by the independent variable(s) in the model. It ranges from 0
to 1, e.g. R-squared = 0.80 means 80% of the variation in Y is explained by the model.
29. Explain adjusted R-squared. Why is it often preferred over R-squared in multiple
regression?
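Ans- Adjusted R-squared adds a penalty to R-squared for model complexity. R-squared
never decreases when a new independent variable is added to a model, even if that
variable is irrelevant. Adjusted R-squared increases only when the new variable improves
the model more than would be expected by chance, so it is a fairer way to compare models
with different numbers of predictors.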
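The penalty can be seen with the usual formula, adjusted R-squared = 1 - (1 - R-squared)(n - 1)/(n - p - 1), where n is the number of observations and p the number of predictors. The R-squared values below are hypothetical, chosen to show a weak extra predictor raising R-squared slightly while lowering adjusted R-squared.

```python
def adjusted_r_squared(r_squared, n, p):
    """Adjusted R-squared for n observations and p predictors."""
    return 1 - (1 - r_squared) * (n - 1) / (n - p - 1)

n = 30  # hypothetical sample size

# Hypothetical model with 2 predictors.
r2_before = 0.800
adj_before = adjusted_r_squared(r2_before, n, p=2)

# Same model after adding a third, weak predictor: R-squared barely moves.
r2_after = 0.802
adj_after = adjusted_r_squared(r2_after, n, p=3)

print(f"2 predictors: R2 = {r2_before:.3f}, adjusted R2 = {adj_before:.4f}")
print(f"3 predictors: R2 = {r2_after:.3f}, adjusted R2 = {adj_after:.4f}")
```

Here R-squared goes up but adjusted R-squared goes down, which suggests the third predictor is not worth keeping.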
30. What does the p-value of a coefficient indicate in a regression model?
Ans- The p-value is the probability of obtaining results at least as extreme as those
observed if the null hypothesis were true. The null hypothesis for a coefficient is that
there is no relationship between that independent variable and the dependent variable
(the coefficient is zero). If the p-value is less than the significance level, say 0.05
(5%), we reject the null hypothesis and conclude that the relationship is significant.
31. In multiple regression, why might some predictors have high p-values?
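Ans- A high p-value means the explanatory variable (X) does not reliably predict the
dependent variable (Y) once the other predictors are in the model, i.e. there is no
significant relationship. This can also happen when predictors are strongly correlated
with each other (multicollinearity).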
32. What is logistic regression, and when is it used instead of linear regression?
Ans- Logistic regression is used when the dependent variable is binary (1 or 0), e.g.
yes or no, pass or fail. It estimates the probability that an event will occur. Linear
regression is used when the dependent variable (Y) is continuous, e.g. marks scored can
be 50, 65, 67.
33. What does the log-odds represent in logistic regression?
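Ans- The log-odds is the natural log of the odds of the event, log(p / (1 - p)), where p
is the probability that the event occurs.
• A positive log-odds indicates that the event is more likely to occur than not (p > 0.5).
• A negative log-odds indicates that the event is less likely to occur than not (p < 0.5).
• The magnitude of a predictor's coefficient (in log-odds units) reflects the strength of
the association between that predictor (X) and the outcome (Y).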
34. Explain how to interpret the coefficients in a logistic regression model.
Ans- E.g.
Y: Buying a product
b0 (Intercept): -2.5
b1 (Income): 0.03
b2 (Age): -0.02
Interpreting the coefficients in log-odds:
• Intercept: When income and age are both 0 (hypothetically), the log-odds of buying
the product is -2.5.
• Income: For every one-unit increase in income, the log-odds of buying the product
increases by 0.03, holding age constant. This means that higher income is associated
with a higher likelihood of purchasing the product.
• Age: For every one-unit increase in age, the log-odds of buying the product decreases
by 0.02, holding income constant. This suggests that older people are less likely to buy
the product.
The coefficients are on the log-odds scale. Exponentiating a coefficient gives the odds
ratio, e.g. exp(0.03) is about 1.03, so each one-unit increase in income multiplies the
odds of buying by about 1.03 (roughly a 3% increase in the odds).
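The coefficients in this example can be turned into odds ratios and a predicted probability. The sketch uses the three coefficients given above. The customer's income of 100 units and age of 30 are made-up inputs, and the function name is only illustrative.

```python
import math

# Coefficients from the example above (log-odds scale).
b0 = -2.5        # intercept
b_income = 0.03  # change in log-odds per one-unit increase in income
b_age = -0.02    # change in log-odds per one-unit increase in age

def predicted_probability(income, age):
    """Convert the log-odds for one customer into a probability."""
    log_odds = b0 + b_income * income + b_age * age
    return 1 / (1 + math.exp(-log_odds))

# Odds ratio = exp(coefficient): the factor by which the odds are multiplied
# for a one-unit increase in the predictor.
odds_ratio_income = math.exp(b_income)
odds_ratio_age = math.exp(b_age)

# Hypothetical customer: income = 100 units, age = 30 years.
p = predicted_probability(income=100, age=30)

print(f"Odds ratio for income: {odds_ratio_income:.4f}")
print(f"Odds ratio for age:    {odds_ratio_age:.4f}")
print(f"Probability of buying: {p:.3f}")
```

An odds ratio above 1 goes with a positive coefficient and an odds ratio below 1 goes with a negative coefficient.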
35. What does a p-value of 1 for a predictor in logistic regression indicate?
Ans- It means that the predictor (X) is not significant: the data show no evidence of a
relationship. We fail to reject the null hypothesis because the p-value is greater than
the significance level (1%, 5% or 10%).
36. Explain stepwise regression and its purpose.

Ans- Stepwise regression is a statistical method used to select the most significant
predictor variables from a large pool of potential predictors for inclusion in a
regression model. It iteratively adds or removes independent variables in steps.

Forward : Starts with no variables and adds the most significant one at each step.

Backward : Starts with all variables and removes the least significant one at each
step.

37. What is AIC ?


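Ans- AIC (Akaike Information Criterion) is like a judge for different regression models.
It balances how well each model fits the data against how complex it is:
AIC = 2k - 2 ln(L), where k is the number of estimated parameters and L is the maximized
likelihood. When comparing models fitted to the same data, the model with the lowest AIC
is usually the best choice.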
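A small comparison shows how AIC trades fit against complexity, using AIC = 2k - 2 ln(L). The three models, their log-likelihoods and their parameter counts below are all hypothetical numbers chosen for illustration, not results from real data.

```python
def aic(log_likelihood, k):
    """Akaike Information Criterion for a model with k estimated parameters."""
    return 2 * k - 2 * log_likelihood

# Hypothetical models fitted to the same data: (log-likelihood, parameters).
models = {
    "marks ~ hours": (-120.5, 3),
    "marks ~ hours + sleep": (-118.0, 4),
    "marks ~ hours + sleep + shoe_size": (-117.9, 5),
}

scores = {name: aic(ll, k) for name, (ll, k) in models.items()}
best = min(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name}: AIC = {score:.1f}")
print("Lowest AIC:", best)
```

The third model fits very slightly better than the second, but the gain is too small to pay for the extra parameter, so the second model has the lowest AIC.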
