Metrics Practice Test 1 Group 7

The document discusses a dataset on infant weights and various maternal factors, including smoking and prenatal care. It outlines steps for statistical analysis, including summary statistics, hypothesis testing, and regression analysis to determine the impact of smoking on birth weight. The results indicate that smoking significantly decreases birth weight, with detailed interpretations of coefficients from simple and log-transformed regressions.

Uploaded by

Nguyễn Lâm
[METRICS] PRACTICE EXERCISE 1 - SIMPLE REGRESSION

This dataset pertains to the weights of infants and includes the following variables: bwght, age and education
level of both mother and father, the number of cigarettes the mother smoked per day during pregnancy (cigs),
the number of alcoholic drinks the mother consumed per day during pregnancy (drink), the number of
prenatal care visits during pregnancy (npvis), and the ethnicity and race of both mother and father (mwhte,
mblck, moth, fwhte, fblck, foth), among others.

1. Provide summary statistics (mean, median, standard deviation, maximum value, minimum value) for the
variables bwght, cigs, drink, npvis, mwhte, and fwhte.
2. Run a simple regression with bwght as the dependent variable and cigs as the independent variable.
3. State the null and alternative hypotheses to test whether smoking during pregnancy affects birth
weight.
4. Using the provided output, conduct the test at the 5% significance level. You can use the rule of thumb
for t-tests.
5. Interpret the coefficient associated with cigs.
6. Do you think that npvis has an impact on birth weight? If so, in the context of the regression in Question
2, state the Gauss-Markov assumption SLR.3 in your own words. Without performing data analysis, do
you believe this assumption is reasonable in this context?
7. The variable lbwght is the log-transformed birth weight. Regress this variable on cigs and interpret the
results.
8. Create a new variable called lcigs such that lcigs = ln(cigs + 1). Regress lbwght on lcigs and interpret the
results.

1. Provide summary statistics (mean, median, standard deviation, maximum value, minimum value) for the
variables bwght, cigs, drink, npvis, mwhte, and fwhte.

           bwght    cigs   drink   npvis   mwhte   fwhte
mean     3401.12    1.09    0.02   11.62    0.89    0.89
median   3425.00    0.00    0.00   12.00    1.00    1.00
sd        576.54    4.22    0.29    3.68    0.32    0.31
min       360.00    0.00    0.00    0.00    0.00    0.00
max      5204.00   40.00    8.00   40.00    1.00    1.00
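
The code that produced this table is hidden in the source. A minimal sketch of one way to build such a table in base R is shown below. The data frame `toy` and its values are made up for illustration; with the real data, the same `sapply` call would be applied to the six columns of interest.

```r
# Hypothetical stand-in data; the real data frame is not reproduced here.
toy <- data.frame(
  bwght = c(3400, 3550, NA, 2900, 3720),
  cigs  = c(0, 0, 10, 20, NA),
  npvis = c(12, 10, 8, NA, 14)
)

# One row per statistic, one column per variable.
# na.rm = TRUE drops missing values variable by variable.
summ <- sapply(toy, function(x) c(
  mean   = mean(x, na.rm = TRUE),
  median = median(x, na.rm = TRUE),
  sd     = sd(x, na.rm = TRUE),
  min    = min(x, na.rm = TRUE),
  max    = max(x, na.rm = TRUE)
))

print(round(summ, 2))
```

Computing each statistic with `na.rm = TRUE` matters here because the regression output reports observations dropped due to missingness.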

2. Run a simple regression with bwght as the dependent variable and cigs as the independent variable.

Call:
lm(formula = bwght ~ cigs, data = df)

Residuals:
     Min       1Q   Median       3Q      Max
-3061.71  -323.71     8.29   365.54  1782.29

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 3421.711     14.145 241.897  < 2e-16 ***
cigs         -11.478      3.245  -3.538 0.000414 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 568.4 on 1720 degrees of freedom
  (110 observations deleted due to missingness)
Multiple R-squared: 0.007223, Adjusted R-squared: 0.006646
F-statistic: 12.51 on 1 and 1720 DF, p-value: 0.0004145

3. State the null and alternative hypotheses to test whether smoking during pregnancy affects birth weight.
Null Hypothesis: Smoking during pregnancy has no effect on birth weight. In other words, the coefficient
that measures the impact of cigarette consumption on birth weight is zero:

H0: β_cigs = 0

Alternative Hypothesis: Smoking during pregnancy does affect birth weight, meaning the coefficient is
not zero:

HA: β_cigs ≠ 0

4. Using the provided output, conduct the test at the 5% significance level. You can use the rule of thumb for
t-tests.

Call:
lm(formula = bwght ~ cigs, data = data)

Residuals:
     Min       1Q   Median       3Q      Max
-3061.71  -323.71     8.29   365.54  1782.29

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 3421.711     14.145 241.897  < 2e-16 ***
cigs         -11.478      3.245  -3.538 0.000414 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 568.4 on 1720 degrees of freedom
  (110 observations deleted due to missingness)
Multiple R-squared: 0.007223, Adjusted R-squared: 0.006646
F-statistic: 12.51 on 1 and 1720 DF, p-value: 0.0004145

T-value for cigs: -3.537605

Since |t| = 3.537605 is greater than 2, the rule-of-thumb critical value for a two-sided test at the
5% significance level, we reject the null hypothesis. The estimated effect of smoking during
pregnancy on birth weight is statistically significant.
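
The rule of thumb can be checked against the exact critical value. The sketch below recomputes the t statistic from the estimate and standard error reported in the output above and compares it with the two-sided 5% critical value of the t distribution with 1720 degrees of freedom. The variable names are illustrative, not taken from the hidden code.

```r
# Values copied from the regression output above.
estimate  <- -11.478
std_error <- 3.245
df_resid  <- 1720

# Test statistic for H0: beta_cigs = 0.
t_stat <- estimate / std_error

# Exact two-sided 5% critical value and p-value from the t distribution.
t_crit <- qt(0.975, df = df_resid)
p_val  <- 2 * pt(-abs(t_stat), df = df_resid)

reject <- abs(t_stat) > t_crit

cat("t =", round(t_stat, 3),
    "| critical value =", round(t_crit, 3),
    "| p-value =", signif(p_val, 3),
    "| reject H0:", reject, "\n")
```

With this many degrees of freedom the exact critical value is very close to the normal value of 1.96, so the rule of thumb and the exact test lead to the same decision.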

5. Interpret the coefficient associated with cigs.

Call:
lm(formula = bwght ~ cigs, data = df)

Residuals:
     Min       1Q   Median       3Q      Max
-3061.71  -323.71     8.29   365.54  1782.29

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 3421.711     14.145 241.897  < 2e-16 ***
cigs         -11.478      3.245  -3.538 0.000414 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 568.4 on 1720 degrees of freedom
  (110 observations deleted due to missingness)
Multiple R-squared: 0.007223, Adjusted R-squared: 0.006646
F-statistic: 12.51 on 1 and 1720 DF, p-value: 0.0004145

The coefficient associated with cigs is -11.47833

The coefficient for cigarettes (cigs) is -11.478, which means that each additional cigarette smoked per day
during pregnancy is associated with a predicted birth weight that is lower by approximately 11.48 grams.
Because this is a simple regression, no other factors are controlled for, so the estimate has a ceteris paribus
interpretation only if the zero conditional mean assumption holds. The estimate is statistically significant at
the 1% level (p = 0.000414).
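
To make the size of the coefficient concrete, the sketch below plugs a few smoking levels into the fitted line using the estimates reported above. The grid of cigarette counts is chosen for illustration only.

```r
# Estimates copied from the regression output above.
b0 <- 3421.711   # intercept: predicted birth weight for a non-smoker
b1 <- -11.478    # slope on cigs

# Illustrative smoking levels (cigarettes per day).
cigs_grid <- c(0, 1, 10, 20)

# Fitted values from the estimated line bwght = b0 + b1 * cigs.
fitted_bwght <- b0 + b1 * cigs_grid

print(data.frame(cigs = cigs_grid, fitted_bwght = round(fitted_bwght, 1)))
```

The predicted gap between a non-smoker and a mother smoking 20 cigarettes per day is 20 times the slope, about 230 grams.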

6. Do you think that npvis has an impact on birth weight? If so, in the context of the regression in Question 2,
state the Gauss-Markov assumption SLR.3 in your own words. Without performing data analysis, do you
believe this assumption is reasonable in this context?

Npvis (number of prenatal visits) likely has an impact on birth weight since frequent prenatal care can greatly
affect the trajectory of fetal development.

The Gauss-Markov assumption SLR.3 (zero conditional mean) in this context: The expected value of the error
term, given any number of cigarettes smoked, should be zero. In mathematical notation:

E(u ∣ cigs) = 0

This assumption is likely violated in this context. Other factors that affect birth weight (e.g., npvis) are
left out of the simple regression and therefore sit in the error term. Women who smoke during pregnancy might
also have other behaviors or characteristics that affect birth weight (such as diet, alcohol consumption, or
socioeconomic status). If these omitted variables are correlated with smoking behavior, the error term is
correlated with cigs and the estimated coefficient is biased.
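
The mechanism can be illustrated with simulated data. The sketch below is entirely hypothetical: it invents an unobserved factor `health` that raises birth weight and is negatively related to smoking, sets the true effect of one cigarette to -5, and compares the simple regression with one that controls for `health`. None of the numbers come from the dataset used in this exercise.

```r
set.seed(42)
n <- 5000

# Hypothetical unobserved factor (higher = healthier behaviour overall).
health <- rnorm(n)

# In this made-up world, less healthy mothers smoke more.
cigs <- pmax(0, round(5 - 3 * health + rnorm(n, sd = 2)))

# True effect of one cigarette is -5 by construction; health also matters.
bwght <- 3400 - 5 * cigs + 100 * health + rnorm(n, sd = 300)

# Simple regression: health is left in the error term.
short <- coef(lm(bwght ~ cigs))["cigs"]

# Regression that controls for the otherwise omitted factor.
long <- coef(lm(bwght ~ cigs + health))["cigs"]

cat("slope without health:", round(short, 2), "\n")
cat("slope with health:   ", round(long, 2), "\n")
```

Because `health` is negatively correlated with `cigs` and positively related to `bwght`, the simple regression attributes part of the health effect to smoking and its slope is more negative than the true value.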

7. The variable lbwght is the log-transformed birth weight. Regress this variable on cigs and interpret the
results.

Call:
lm(formula = lbwght ~ cigs, data = data)

Residuals:
     Min       1Q   Median       3Q      Max
-2.23489 -0.08507  0.01932  0.11840  0.43619

Coefficients:
             Estimate Std. Error  t value Pr(>|t|)
(Intercept)  8.120997   0.004940 1643.877  < 2e-16 ***
cigs        -0.003436   0.001133   -3.032  0.00246 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.1985 on 1720 degrees of freedom
  (110 observations deleted due to missingness)
Multiple R-squared: 0.005317, Adjusted R-squared: 0.004739
F-statistic: 9.194 on 1 and 1720 DF, p-value: 0.002465

The estimated coefficient on cigs is −0.003436. In this log-level model, the coefficient multiplied by 100
approximates the percentage change in birth weight, so one additional cigarette per day is associated with
about a 0.34% decrease in birth weight. The estimate is statistically significant at the 1% level
(p = 0.002465), although the R-squared of 0.005 shows that smoking explains very little of the variation in
log birth weight.
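
The 100 × coefficient reading is an approximation that works well for small changes. A short sketch of the comparison with the exact percentage change, 100 × (exp(β × Δcigs) − 1), using the reported estimate (the variable names are illustrative):

```r
# Slope copied from the log-level regression output above.
b1 <- -0.003436

# Approximate percentage change for one extra cigarette per day.
approx_pct <- 100 * b1

# Exact percentage change implied by the log-level model.
exact_pct <- 100 * (exp(b1) - 1)

# Same comparison for a larger change of 10 cigarettes per day.
approx_pct_10 <- 100 * (10 * b1)
exact_pct_10  <- 100 * (exp(10 * b1) - 1)

cat("1 cigarette:   approx", round(approx_pct, 3),
    "% exact", round(exact_pct, 3), "%\n")
cat("10 cigarettes: approx", round(approx_pct_10, 2),
    "% exact", round(exact_pct_10, 2), "%\n")
```

For a single cigarette the two figures agree to two decimal places. The gap widens slightly for a 10-cigarette change.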

8. Create a new variable called lcigs such that lcigs = ln(cigs + 1). Regress lbwght on lcigs and interpret the
results.

Call:
lm(formula = lbwght ~ lcigs, data = data)

Residuals:
     Min       1Q   Median       3Q      Max
-2.23611 -0.08305  0.01883  0.11890  0.43497

Coefficients:
             Estimate Std. Error  t value Pr(>|t|)
(Intercept)  8.122209   0.004983 1629.974  < 2e-16 ***
lcigs       -0.023707   0.006750   -3.512 0.000456 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.1983 on 1720 degrees of freedom
  (110 observations deleted due to missingness)
Multiple R-squared: 0.00712, Adjusted R-squared: 0.006543
F-statistic: 12.33 on 1 and 1720 DF, p-value: 0.0004562

The estimated coefficient on lcigs is −0.023707. In this log-log model the coefficient is an elasticity: a 1%
increase in cigs + 1 (the quantity that is logged) is associated with approximately a 0.024% decrease in birth
weight. This relationship is statistically significant at the 1% level (p = 0.000456).
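
Because the regressor is ln(cigs + 1) rather than ln(cigs), a percentage change in cigs itself is not defined for non-smokers. A sketch of how the transformation is built and what the reported coefficient implies for a few smoking levels relative to a non-smoker (the grid of values is illustrative):

```r
# Illustrative smoking levels (cigarettes per day).
cigs <- c(0, 1, 5, 10, 20)

# The +1 keeps the log defined at cigs = 0, where log(cigs) would be -Inf.
lcigs <- log(cigs + 1)

# Slope copied from the regression of lbwght on lcigs above.
b1 <- -0.023707

# Predicted change in lbwght relative to a non-smoker (lcigs = 0).
d_lbwght <- b1 * (lcigs - lcigs[1])

# Exact percentage change in predicted birth weight.
pct <- 100 * (exp(d_lbwght) - 1)

print(data.frame(cigs = cigs, lcigs = round(lcigs, 3), pct_change = round(pct, 2)))
```

Under this specification the first cigarette carries the largest marginal change (about 1.6%), and each further cigarette adds less.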
