Analysis of Variance

1. Analysis of variance (ANOVA) is a statistical technique used to test differences between group means, i.e. the relationship between a categorical variable and a continuous variable. 2-way ANOVA examines the effects of two factors on a continuous outcome variable.
2. Logistic regression analyzes the relationship between a binary dependent variable and independent variables. It estimates odds ratios to describe how the odds of the outcome change with the predictor variables.
3. Linear regression finds the linear relationship between a continuous dependent variable and one or more independent variables. It estimates the coefficients of the independent variables to predict the dependent variable.


Analysis of variance

» testing the differences between group means

1-way analysis of variance (x: 1 factor)

H0: μ1 = μ2 = … = μk (all group means are equal)

2-way analysis of variance (x: +1 factor)
» H01: the visit time length does not change with the age of the nurse / nurse age does not influence or predict the visit time / time is the same among the different age groups
» H02: the time of patient…
» H03: there is no interaction between the age of the nurse and the type of patient on the visit time length

Models: 1 (fixed) or 2 (random)


1: factors specifically chosen
H0: μ1 = μ2 = μ3

2: factors are randomly selected from a population


H0: general statement without focusing on the group means presented
Beyond the variance within the groups, we further calculate s²A (the added variability introduced by the groups): expected mean square between groups = s² (error) + s²A

As we want the relative magnitude (%) and not the absolute values → s²A as a % of the total = intraclass correlation coefficient, ICC = s²A / (s² + s²A): the proportion of the total variability that is due to differences between the groups; the higher this value, the larger the differences between groups relative to the variability within them (a value near zero means all groups are similar)

The higher the F, the larger the differences between groups in comparison to the differences within → the higher the chance of a significant association

Report
1. Description of variables / definition of the μ's for stating the hypotheses
2. Definition of the hypotheses
3. Alpha and p values
4. Test statistic: F value + DF, p value, conclusion (reject H0 or do not reject)
5. 95%CI if applicable

Aim: the objective of this study was to…


Methods: what experiment was done / what was done; what was collected (variables + units); how they were described and what test was used for the association; conditions tested (normality of residuals: skewness and deviations from normality / outliers; rule of thumb: ratio of the highest to the lowest sd < 2, some sources accept 3); a two-sided significance level of 5% was used; statistical analysis was done using SPSS
Results: sample size; descriptive statistics; result of the test
Conclusion:
NOTE: ANOVA is a robust test, not sensitive to small departures from normality; even if the rule of thumb for the homogeneity of variances is not met, if the design of the experiment is balanced (meaning that the groups have the same n), it may be okay to apply ANOVA, even if risky.
Logistic regression
» y (outcome) is binary

Logit is a transformation that allows us to relate x to an outcome that has either a yes or a no value.
logit(p) = ln(p / (1 − p)) = ln(odds) = b0 + b1x; a one-unit increase in x changes the log odds by b1, so OR = e^b1 (and e^(b1·c) for an increase of c units)

Using the inverse, we can calculate the probability of the outcome Y (0 or 1), given a certain
value of x.
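A minimal sketch of that inverse step, assuming hypothetical fitted coefficients b0 = −2 and b1 = 0.5:

```python
import math

# Hypothetical fitted coefficients of a logistic model: logit(p) = b0 + b1*x
b0, b1 = -2.0, 0.5

def prob(x):
    """Inverse logit: P(Y=1 | x) = 1 / (1 + exp(-(b0 + b1*x)))."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))

odds_ratio = math.exp(b1)   # OR for a one-unit increase in x
print(f"OR per unit of x = {odds_ratio:.2f}")
print(f"P(Y=1 | x=4) = {prob(4):.3f}")
```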

OR: crude and adjusted


Interpretation: in comparison to A, those who B have an increased “risk” (approximated by the
odds ratio) of approximately 3 times.

Output
Hypothesis
Alpha, p values
Test statistic: Wald= , value
Model equation:
OR= , 95%CI

How to check for confounding:


- Associated with the outcome (risk factor)
- Associated with the exposure, but not a consequence of it
- When adding the variable, the OR changes by >10%
- When stratifying, the odds ratios are similar between the strata but different from
the crude OR
- This means that the true association between a and b is masked by the presence of a 3rd
variable that is associated with both

How to check for effect modifiers:


- Different ORs between strata
- The association between a and b depends on the variable c
- If the interaction term is significant, then we keep the 3 variables.

We can obtain the OR using either the cross tabs or the output from logistic regression.
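From a cross tab, the crude OR is just the cross-product ratio; a sketch with a made-up 2×2 table:

```python
# Hypothetical 2x2 cross tab: exposure (rows) x outcome (columns)
#                outcome yes   outcome no
a, b = 30, 70    # exposed
c, d = 10, 90    # unexposed

# Crude OR = (odds of outcome among exposed) / (odds among unexposed)
crude_or = (a * d) / (b * c)   # cross-product ratio
print(f"Crude OR = {crude_or:.2f}")
```

A logistic regression of the outcome on the exposure alone would return e^b1 equal to this same crude OR; the adjusted OR comes from adding the other covariates to the model.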

Linear regression

Y = b0 + b1x + e

Residuals= observed-predicted
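The least squares fit and the observed − predicted residuals can be sketched with made-up data:

```python
# Hypothetical (x, y) data; fit y = b0 + b1*x by least squares, then residuals.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 8.0, 9.8]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Least squares slope and intercept
b1 = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
     sum((x - mean_x) ** 2 for x in xs)
b0 = mean_y - b1 * mean_x

# Residual = observed - predicted
residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
print(f"b0 = {b0:.3f}, b1 = {b1:.3f}")
print("residuals:", [round(r, 3) for r in residuals])
# With an intercept in the model, the residuals sum to (numerically) zero.
```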

Conditions:
» observations are independent of each other
» residuals have mean of zero and constant variance (histogram)
» residuals follow a normal distribution (residual plot -2;2) + histogram
Look for: R²; sig.; partial correlation (the importance of that variable in the adjusted model when excluding the other predicting variables); ANOVA (whether the model as a whole explains the outcome)

» changes in b → confounding
» changes in the standard error → collinearity
» interaction! Add the interaction term to the variables alone
→ If there is interaction, 2 predictors interact with each other, thus influencing the
true association of a and b

NOTE: collinearity: when predictors are associated with each other


VIF and tolerance

Tolerance  R2 (higher R2, lowe tolerance  higher chances of collinearity)


VIF: inverse of the tolerance (how much is the variance of the beta estimates inflated compared to what it would be if the predictors were not associated?)

Tolerance < 0.2; VIF > 5 → collinearity!!
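A sketch of these cut-offs, assuming hypothetical R² values from regressing each predictor on the other predictors (the variable names are only illustrative):

```python
# Hypothetical R^2 of each predictor regressed on the remaining predictors.
r_squared = {"age": 0.30, "weight": 0.85, "head_circumference": 0.92}

for name, r2 in r_squared.items():
    tolerance = 1.0 - r2          # share of the predictor NOT explained by the rest
    vif = 1.0 / tolerance         # variance inflation factor
    flag = "collinear!" if tolerance < 0.2 or vif > 5 else "ok"
    print(f"{name}: tolerance={tolerance:.2f}, VIF={vif:.2f} -> {flag}")
```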


- Remove the variable introducing collinearity! E.g. head circumference

NOTE: same as for confounding, if an added variable leads the first one to cease being significant, we eliminate the first one.

When accounting for the effect of b on c, a stops affecting c; meaning that the apparent association between a and c was hiding the effect of a third variable.

--- Residuals and looking for outliers


» residuals: distance between the observed value and the predicted value obtained from the
model equation fitted by the least squares method; < 3
» deviance: includes the residual distance but not only; distance between the 2 equations: the
predicted one and the true one
» Cook's distance: measures how much the fitted equation changes when removing one
observation; < 1
» leverage: how distant an observation is from the majority of the data on the x axis; < 0.05
» Mahalanobis distance: in sd's; < 7
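For simple regression, leverage has a closed form, h_i = 1/n + (x_i − x̄)² / Sxx; a sketch with made-up data where the last point sits far out on the x axis:

```python
# Hypothetical x values; the last point is far from the rest on the x axis.
xs = [1.0, 2.0, 3.0, 4.0, 20.0]

n = len(xs)
mean_x = sum(xs) / n
sxx = sum((x - mean_x) ** 2 for x in xs)

# Leverage of each observation: h_i = 1/n + (x_i - mean)^2 / Sxx
leverages = [1 / n + (x - mean_x) ** 2 / sxx for x in xs]
for x, h in zip(xs, leverages):
    print(f"x={x:5.1f}: leverage={h:.3f}")
# Cut-offs for "high" leverage vary by source; the outlying point clearly
# dominates here, and the leverages always sum to the number of coefficients.
```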
