GraphPad Prism Slides
Experimental design
Choice of a Statistical test
Experiment(s)
Data exploration
Technical Biological
n=1 n=3
Power Analysis
The power analysis depends on the relationship
between 6 variables:
6 The alternative:
• One or two-sided test?
• Fix any five of the variables and a
mathematical relationship can be used to
estimate the sixth.
e.g. What sample size do I need to have an 80% probability (power) of
detecting this particular effect (difference and standard deviation) at a
5% significance level using a 2-sided test?
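The question above can be sketched in a few lines of Python. This is only a normal-approximation sketch for comparing two group means, not the method G*Power uses. The function name and the example difference and standard deviation are hypothetical. Because it ignores the t-distribution, dedicated software may give a slightly larger answer.

```python
import math
from statistics import NormalDist


def sample_size_per_group(difference, sd, alpha=0.05, power=0.80, two_sided=True):
    """Approximate sample size per group for comparing two means.

    Normal approximation: n = 2 * ((z_alpha + z_power) / d)^2,
    where d = difference / sd is the standardised effect size.
    """
    effect_size = difference / sd
    tail = alpha / 2 if two_sided else alpha
    z_alpha = NormalDist().inv_cdf(1 - tail)  # critical value for the significance level
    z_power = NormalDist().inv_cdf(power)     # quantile matching the desired power
    n = 2 * ((z_alpha + z_power) / effect_size) ** 2
    return math.ceil(n)                       # round up to whole subjects


# Hypothetical effect: a difference of 5 units with a standard deviation of 10.
n_per_group = sample_size_per_group(difference=5.0, sd=10.0)
print(n_per_group)
```

Fixing the effect, significance level, power and sidedness leaves the sample size as the only unknown, which is the "fix five, estimate the sixth" idea.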
Sample size
• Free packages:
• G*Power and InVivoStat
• Russ Lenth's power and sample-size page:
• http://www.divms.uiowa.edu/~rlenth/Power/
• Qualitative data = not numerical
• = values taken = usually names (also nominal)
• e.g. causes of death in hospital
• Values can be numbers but not numerical
• e.g. group number = numerical label but not unit of
measurement
• Qualitative variable with an intrinsic order in its
categories = ordinal
• Particular case: qualitative variable with 2 categories:
binary or dichotomous
• e.g. alive/dead or male/female
Analysis of qualitative data
Example of data (cats and dogs.xlsx):
• Cats and dogs trained to line dance
• 2 different rewards: food or affection
• Is there a difference between the rewards?
Expected frequency: (32/68)*(32/68) = 0.22
22% of 68 = 15.1
Did they dance? * Type of Training * Animal Crosstabulation
                                          Type of Training
Animal                                    Food as Reward   Affection as Reward   Total
Cat   Did they dance?  Yes  Count               26                  6             32
                            Expected Count      15.1               16.9           32.0
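The expected counts in the table can be reproduced from the marginal totals alone. The sketch below reads the slide's formula as (row total / N) * (column total / N) * N. The food column total of 32 comes from that formula. The affection column total of 36 is an assumption derived as 68 - 32. The function and variable names are hypothetical.

```python
def expected_count(row_total, column_total, grand_total):
    """Expected cell count under independence: row total * column total / N."""
    return row_total * column_total / grand_total


grand_total = 68                            # all cats
danced_total = 32                           # cats that danced (row total)
food_total = 32                             # cats rewarded with food (column total)
affection_total = grand_total - food_total  # 36, derived rather than quoted

expected_food = expected_count(danced_total, food_total, grand_total)
expected_affection = expected_count(danced_total, affection_total, grand_total)

print(round(expected_food, 1), round(expected_affection, 1))
```

The observed count of 26 cats dancing for food is well above the expected 15.1, which is what the chi-square test picks up.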
[Figure: bar charts of counts (Dance Yes / Dance No) by reward (Food, Affection), one panel for Dog and one for Cat]
• In our example:
cats are more likely to line dance if they are given food as
reward than affection (p<0.0001) whereas dogs don’t mind
(p>0.99).
Quantitative data
• They take numerical values (units of measurement)
Sum of errors = 0
• No errors!
– Positive and negative errors cancel each other out.
Sum of Squared errors (SS)
• To avoid the problem of the direction of the error: we
square them
– Instead of sum of errors: sum of squared errors (SS):
• Small S.D.: data close to the mean: mean is a good fit of the data
• Large S.D.: data distant from the mean: mean is not an accurate representation
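A small sketch of why the errors are squared. The data values are hypothetical and only illustrate that deviations from the mean cancel out while their squares do not.

```python
from statistics import mean

values = [1.0, 2.0, 3.0, 4.0, 5.0]   # hypothetical measurements
m = mean(values)

errors = [x - m for x in values]      # deviation of each value from the mean
sum_of_errors = sum(errors)           # positive and negative errors cancel out
sum_of_squares = sum(e ** 2 for e in errors)   # SS: the direction no longer matters

variance = sum_of_squares / (len(values) - 1)  # SS divided by the degrees of freedom
sd = variance ** 0.5                           # standard deviation

print(sum_of_errors, sum_of_squares, variance, sd)
```

Dividing SS by N - 1 gives the variance, and its square root is the standard deviation quoted on the slides.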
SD and SEM (SEM = SD/√N)
The SD quantifies the scatter of the data. The SEM quantifies how much we expect
sample means to vary.
SD or SEM ?
• Range of values that we can be 95% confident contains the true mean of the
population.
- So limits of 95% CI: [Mean - 1.96 SEM; Mean + 1.96 SEM] (SEM = SD/√N)
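The three quantities can be computed together, as a sketch with hypothetical data and a hypothetical function name. It uses the 1.96 multiplier from the slide, which is a large-sample approximation. For small samples, a multiplier from the t-distribution would give a wider interval.

```python
import math
from statistics import mean, stdev


def mean_sd_sem_ci95(values):
    """Return the mean, SD, SEM and approximate 95% CI limits of a sample."""
    n = len(values)
    m = mean(values)
    sd = stdev(values)            # sample SD (N - 1 in the denominator)
    sem = sd / math.sqrt(n)       # SEM = SD / sqrt(N)
    ci = (m - 1.96 * sem, m + 1.96 * sem)
    return m, sd, sem, ci


m, sd, sem, ci = mean_sd_sem_ci95([4, 6, 8, 10, 12])  # hypothetical data
print(m, round(sd, 3), round(sem, 3), [round(limit, 3) for limit in ci])
```

The SD stays roughly the same as more data are collected, whereas the SEM and the width of the CI shrink with √N.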
3) Interval data
• The distance between points of the scale should
be equal at all parts along the scale
4) Independence
• Data from different subjects are independent
– Values corresponding to one subject do not influence
the values corresponding to another subject.
– Important in repeated measures experiments
Analysis of quantitative data
• Is there a difference between my groups regarding the
variable I am measuring?
– e.g.: are the mice in group A heavier than those in
group B?
• Tests with 2 groups:
– Parametric: t-test
– Non-parametric: Mann-Whitney/Wilcoxon rank sum test
• Tests with more than 2 groups:
– Parametric: Analysis of variance (one-way ANOVA)
– Non-parametric: Kruskal-Wallis test
[Figure: pairs of groups A and B (y-axis: Dependent variable) plotted with SE or CI error bars, showing how the gap or overlap between error bars relates to the p-value]
SE error bars:
• gap ~ 2 x SE: p~0.05; gap ~ 4.5 x SE: p~0.01
• n>=10: gap ~ 1 x SE: p~0.05; gap ~ 2 x SE: p~0.01
CI error bars:
• n=3: overlap ~ 1 x CI: p~0.05; overlap ~ 0.5 x CI: p~0.01
• n>=10: overlap ~ 0.5 x CI: p~0.05; overlap ~ 0 x CI: p~0.01
t-test (4)
• 3 types:
– Independent t-test
• it compares means for two independent
groups of cases.
– Paired t-test
• it looks at the difference between two
variables for a single group:
– the second sample is the same as the first after
some treatment has been applied
– One-Sample t-test
• it tests whether the mean of a single variable
differs from a specified constant (often 0)
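The three types can be sketched as t statistics, using only the standard library. The function names and data are hypothetical. The independent version assumes equal variances (pooled estimate). P-values are left out because they need the t-distribution, which dedicated software provides.

```python
import math
from statistics import mean, stdev, variance


def independent_t(a, b):
    """t statistic and degrees of freedom for two independent groups (pooled variance)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    se = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean(a) - mean(b)) / se, na + nb - 2


def one_sample_t(values, constant=0.0):
    """t statistic and degrees of freedom for a sample mean against a constant."""
    sem = stdev(values) / math.sqrt(len(values))
    return (mean(values) - constant) / sem, len(values) - 1


def paired_t(before, after):
    """Paired t-test: a one-sample t-test on the within-subject differences."""
    differences = [y - x for x, y in zip(before, after)]
    return one_sample_t(differences, 0.0)


# Hypothetical measurements
print(independent_t([1, 2, 3], [4, 5, 6]))
print(paired_t([1, 2, 3], [2, 4, 6]))
print(one_sample_t([1, 2, 3, 4, 5], 0.0))
```

Writing the paired test as a one-sample test on the differences makes the link between the last two types explicit.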
Example: coyote.xlsx
• 1 Power Analysis
• 2 Plot the data
• 3 Check the assumptions for parametric test
• 4 Statistical analysis: Independent t-test
G*Power
Independent t-test
Example case:
[Figure: box plot for Male and Female, annotated with the upper quartile (Q3, 75th percentile) and an outlier]
Assumptions for parametric tests
[Figure: Histogram of Coyote: Freq. dist. (histogram), Female and Male panels; x-axis: Bin Center, y-axis: Counts; repeated with different bin widths]
Counts OK here, but if several groups of different sizes, go for percentages.
Normality
Independent t-test: example
coyote.xlsx
[Figure: coyotes, Female vs Male, plotted three times with different error bars: standard error, standard deviation and confidence interval; axis labels as extracted: Body Mass, Length (cm)]
Independent t-test: results
coyote.xlsx
Homogeneity in variance
Normality
Paired t-test: Results
working memory.xlsx
[Figure: protein expression by cell group (A to E)]
Parametric tests assumptions
[Figure: protein expression (Log) by cell group (A to E)]
Parametric tests assumptions
Normality
Analysis of variance: Post hoc tests
Homogeneity of variance
F=0.6727/0.08278=8.13
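The F ratio above is the mean square between groups divided by the mean square within groups. The sketch below shows how those two mean squares arise in a one-way ANOVA. The function name and the three small groups are hypothetical. The last line only repeats the slide's own arithmetic.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (MS between, MS within, F) for a list of groups of values."""
    all_values = [x for group in groups for x in group]
    grand_mean = mean(all_values)

    # Between-group variability: group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    df_between = len(groups) - 1

    # Within-group variability: values around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_within = len(all_values) - len(groups)

    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between, ms_within, ms_between / ms_within


# Hypothetical data: three groups of three values
ms_between, ms_within, f_ratio = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(ms_between, ms_within, f_ratio)

# The ratio quoted on the slide
print(round(0.6727 / 0.08278, 2))
```

A large F means the group means differ by more than the scatter within groups would explain.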
[Figure: scatter plot of body mass against parasite burden (Female)]
Correlation: example
roe deer.xlsx
Stimulation: Y=Bottom + (Top-Bottom)/(1+10^((LogEC50-X)*HillSlope))
Inhibition: Y=Bottom + (Top-Bottom)/(1+10^((X-LogIC50)))
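The two equations can be written as plain Python functions to see how the parameters shape the curve. This is only a sketch: the function names and example parameter values are hypothetical, and the fitting itself is left to GraphPad Prism.

```python
def stimulation(x, bottom, top, log_ec50, hill_slope):
    """Y = Bottom + (Top-Bottom)/(1+10^((LogEC50-X)*HillSlope))."""
    return bottom + (top - bottom) / (1 + 10 ** ((log_ec50 - x) * hill_slope))


def inhibition(x, bottom, top, log_ic50):
    """Y = Bottom + (Top-Bottom)/(1+10^((X-LogIC50)))."""
    return bottom + (top - bottom) / (1 + 10 ** (x - log_ic50))


# Hypothetical parameters: response from 0 to 100, midpoints at log concentrations -7 and -6
for log_concentration in (-10, -7, -4):
    print(log_concentration,
          round(stimulation(log_concentration, 0, 100, -7, 1), 2),
          round(inhibition(log_concentration, 0, 100, -6), 2))
```

When X equals LogEC50 (or LogIC50), the exponent is zero and Y sits halfway between Bottom and Top, which is how the EC50 is read off the curve.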
Curve fitting: example
Inhibition data.xlsx
[Figure: response against log([Agonist], M), with No inhibitor and Inhibitor data]
1- Choose a Fit:
not necessary to normalize
should choose it when values defining 0 and 100 are precise
variable slope better if plenty of data points (variable slope or 4 parameters)
3- Constrain:
depends on your experiment
depends if your data don’t define the top or the bottom of the curve
4- Weights:
important if you have unequal scatter among replicates
5- Initial values:
defaults usually OK unless the fit looks funny
6- Range:
defaults usually OK unless you are not interested in the full range of the x-variable (e.g. time)
7- Output:
summary table presents same results in a … summarized way.
8- Diagnostics:
check for normality (weights) and outliers (but keep them in the analysis)
check Replicates test
Curve fitting: example
Inhibition data.xlsx
[Figure: fitted curves of Response against log(Agonist), No inhibitor and Inhibitor, with EC50 marked, in four panels: Non-normalized data 3 parameters; Non-normalized data 4 parameters; Normalized data 3 parameters (Response in %); Normalized data 4 parameters (Response in %)]
Curve fitting: example
Inhibition data.xlsx
[Figure: the four fits (non-normalized and normalized data, 3 and 4 parameters), each plotted as Response against log(Agonist) with EC50 marked, alongside the values below]
Values as extracted, listed under the panel title they follow (columns: No inhibitor, Inhibitor; one pair of values per panel carries no label in the extraction):
Non-normalized data, 4 parameters
  SD replicates        22.71     25.52
  SD lack of fit       41.84     32.38
  Discrepancy (F)      3.393     1.610
  P value              0.0247    0.1989
  (unlabelled)         -7.158    -6.011
Non-normalized data, 3 parameters
  Discrepancy (F)      2.982     1.438
Normalized data, 4 parameters
  (unlabelled)         -7.017    -5.943
Normalized data, 3 parameters
  P value              0.0036    0.1246
  (unlabelled)         -7.031    -5.956
My email address if you need some help with GraphPad: