ChiSquare Examples
Example 1: 152 patients were randomly assigned to 4 dose groups in a clinical study. During the
course of the study, some patients dropped out. Is there a difference in dropout rates among dose
groups?
                  Dropout
               Yes     No    Total
Dose   10        5     35       40
       20        6     29       35
       40       10     28       38
       80       12     27       39
       Total    33    119      152
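/* One record per dose-by-dropout cell; count is the cell frequency */
data a1;
input dose dropout $ count @@;  /* @@ reads several records per input line */
cards;
10 yes 5 10 no 35
20 yes 6 20 no 29
40 yes 10 40 no 28
80 yes 12 80 no 27
;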
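The PROC FREQ step that produced the output titled 'Testing equal rates in all doses' is not shown. A minimal sketch of such a step follows; the options `chisq`, `norow`, and `nopercent` are assumptions inferred from the output (frequencies, column percentages, and Cramer's V), not the author's confirmed code.

```sas
/* Sketch only: option choices are assumptions inferred from the printed output */
proc freq data=a1;
tables dose*dropout / chisq norow nopercent; /* chisq prints Pearson chi-square and Cramer's V */
weight count;                                /* count holds the cell frequencies */
title1 'Testing equal rates in all doses';
run;
```

The `weight` statement is needed because each record in a1 is a table cell rather than a single patient.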
Testing equal rates in all doses

Frequency
Col Pct        no       yes     Total
10             35         5        40
            29.41     15.15
20             29         6        35
            24.37     18.18
40             28        10        38
            23.53     30.30
80             27        12        39
            22.69     36.36

Cramer's V     0.1774
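The reported Cramer's V can be checked by hand from the observed counts. The sketch below uses the usual Pearson statistic with expected counts computed from the margins; the value of the statistic (about 4.783) is recomputed here from the table and does not appear in the output shown.

```latex
E_{ij} = \frac{n_{i\cdot}\, n_{\cdot j}}{n},
\qquad \text{e.g. } E_{10,\text{yes}} = \frac{40 \cdot 33}{152} \approx 8.68

\chi^2 = \sum_{i,j} \frac{(O_{ij} - E_{ij})^2}{E_{ij}} \approx 4.783,
\qquad \text{df} = (4-1)(2-1) = 3

V = \sqrt{\frac{\chi^2}{n \, \min(r-1,\; c-1)}}
  = \sqrt{\frac{4.783}{152 \cdot 1}} \approx 0.1774
```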
/* Any different inference in exact test? */
proc freq data=a1;
tables dose*dropout / fisher;
weight count;
title1 'Fishers exact test';
run;
Pr <= P 0.1888
Example 2: 86 patients were treated with an experimental drug for 3 months; pre- and post-study
bilirubin levels were recorded. Many patients exhibited abnormally high bilirubin levels.
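Two representations of the data:

Representation 1 (counts at each time point)
          Normal   High
Pre           74     12
Post          66     20

Representation 2 (each patient cross-classified by pretest and posttest level)
                         Posttest Level
                         Normal   High
Pretest Level   Normal       60     14
                High          6      6

Pearson chi-square on the first representation: χ2=2.46, P-value=0.1170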
Frequency
Row Pct
Col Pct      high    normal     Total
post           20        66        86
            23.26     76.74
            62.50     47.14
pre            12        74        86
            13.95     86.05
            37.50     52.86

Cramer's V   0.1195
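As a check on the figures above, the 2x2 shortcut formula for the Pearson statistic reproduces χ2=2.46 and Cramer's V = 0.1195. Note that n = 172 here because every one of the 86 patients is counted twice (once pre, once post), so this analysis treats paired measurements as if they were independent.

```latex
\chi^2 = \frac{n\,(ad - bc)^2}{(a+b)(c+d)(a+c)(b+d)}
       = \frac{172\,(20 \cdot 74 - 66 \cdot 12)^2}{86 \cdot 86 \cdot 32 \cdot 140}
       \approx 2.457

V = \sqrt{\frac{\chi^2}{n}} = \sqrt{\frac{2.457}{172}} \approx 0.1195
```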
/* An alternative representation of the data */
data a2w; input pretrt $ posttrt $ count; cards;
normal normal 60
normal high 14
high normal 6
high high 6
;
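The PROC FREQ step for this count-level data set is not shown. A minimal sketch, assuming it mirrors the patient-level call used later in this example (the `agree` option requests McNemar's test) with a `weight` statement added because each record is a cell count:

```sas
/* Sketch only: assumed to parallel the patient-level PROC FREQ call */
proc freq data=a2w;
tables pretrt*posttrt / agree norow nocol; /* agree prints McNemar's test for a 2x2 table */
weight count;                              /* count holds the cell frequencies */
title1 'McNemars test';
run;
```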
McNemars test
The FREQ Procedure
Statistics for Table of pretrt by posttrt
McNemar's Test
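The numeric results of McNemar's test are not reproduced here. As a worked sketch, the standard (uncorrected) McNemar statistic uses only the two discordant cells of the paired table, 14 (normal to high) and 6 (high to normal); the values below are recomputed from those counts rather than copied from the output.

```latex
S = \frac{(n_{12} - n_{21})^2}{n_{12} + n_{21}}
  = \frac{(14 - 6)^2}{14 + 6} = \frac{64}{20} = 3.2,
\qquad \text{df} = 1, \qquad p \approx 0.074
```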
/* Get equivalent results using patient-level data:
pre = 1 iff abnormally high pre-test
post = 1 iff abnormally high post-test
*/
data a2; input pre post @@; cards;
0 0 0 0 0 0 0 0 0 0 0 1
1 1 0 0 0 0 0 0 0 0 1 0
0 0 0 1 0 0 0 0 0 0 0 0
0 0 0 1 0 0 1 0 0 0 0 0
1 0 0 0 0 0 1 1 0 1 0 1
0 0 0 0 1 0 0 0 0 0 0 0
0 0 0 1 0 1 0 0 0 0 0 1
0 0 1 0 0 0 0 0 1 1 0 0
0 0 0 1 1 0 0 0 0 1 0 0
1 1 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 1 0 1 1 1 0 0
0 0 0 1 0 0 0 1 0 0 0 0
0 0 0 0 1 1 0 0 0 0 0 0
0 0 0 0
;
proc freq data=a2;
tables pre*post / agree norow nocol;
title1 'McNemars test, again';
run;
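One way to confirm that the patient-level data match the count-level representation is to collapse them into cell counts. A minimal sketch, where the output data set name a2counts is a hypothetical name chosen for illustration:

```sas
/* Sketch only: a2counts is a hypothetical data set name */
proc freq data=a2 noprint;
tables pre*post / out=a2counts; /* writes one row per pre-by-post cell with COUNT and PERCENT */
run;

proc print data=a2counts;
run;
```

If the two representations agree, the four counts should be 60, 14, 6, and 6.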
Example 3: Two tests are being considered (call them Method A and Method B) to check blood
samples and diagnose whether or not a patient has a particular condition. 100 patients provide
blood samples, and each sample is run through both methods. We are interested in whether there
is a difference in the two methods' diagnostic abilities.
/* Define data */
data a1; input methodA $ methodB $ count; cards;
Y Y 18
N Y 27
Y N 35
N N 20
;
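The PROC FREQ step for this paired table is not shown. A minimal sketch, assuming the same `agree` approach used in Example 2; the title text is a hypothetical label added for illustration:

```sas
/* Sketch only: assumed to follow the Example 2 pattern; title is hypothetical */
proc freq data=a1;
tables methodA*methodB / agree norow nocol; /* McNemar's test on the paired diagnoses */
weight count;                               /* count holds the cell frequencies */
title1 'Example 3: McNemars test';
run;
```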
DF 1
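For reference, the uncorrected McNemar statistic for the paired table can be worked out from the two discordant cells, 35 (A yes, B no) and 27 (A no, B yes). These values are recomputed from the data above and are not copied from the output.

```latex
S = \frac{(35 - 27)^2}{35 + 27} = \frac{64}{62} \approx 1.03,
\qquad \text{df} = 1, \qquad p \approx 0.31
```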
/* Define re-configured data based on column and row sums */
data a2; input method $ diagnosis $ count; cards;
A Y 53
A N 47
B Y 45
B N 55
;
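The PROC FREQ step for the re-configured data is not shown. A minimal sketch follows; requesting both `chisq` and `agree` is an assumption, as is the title text. Keep in mind that this table has 200 entries for 100 patients, so it no longer records which diagnoses belong to the same blood sample.

```sas
/* Sketch only: option choices and title are assumptions */
proc freq data=a2;
tables method*diagnosis / chisq agree norow nocol;
weight count;                               /* count holds the cell frequencies */
title1 'Example 3: re-configured data';
run;
```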
Pr > S 0.8474
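The P-value 0.8474 is consistent with McNemar's statistic computed on the re-configured table, whose off-diagonal cells are 53 (method A, Y) and 55 (method B, N). The check below is a recomputation under that assumption. These two cells are marginal totals rather than discordant pairs, so the result does not address whether the two methods disagree on the same samples.

```latex
S = \frac{(53 - 55)^2}{53 + 55} = \frac{4}{108} \approx 0.037,
\qquad \text{df} = 1, \qquad p \approx 0.847
```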