Using R For Nonparametric Analysis
Binomial Probabilities
> pbinom(b,n,p)
will yield the value of P( B ≤ b ) when B is binomial with n trials and P( S ) = p
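As a quick check on what pbinom returns, the cumulative probability can be compared with a sum of individual probabilities from dbinom. This is only a sketch: the values n = 15 and p = .4 are borrowed from the binomial example in this handout, and the variable names are illustrative.

```r
# Cumulative binomial probability P(B <= b) for n trials, success probability p
n <- 15
p <- 0.4
b <- 2

lower_tail <- pbinom(b, n, p)          # P(B <= 2)
by_summing <- sum(dbinom(0:b, n, p))   # P(B = 0) + P(B = 1) + P(B = 2)
upper_tail <- 1 - pbinom(b, n, p)      # P(B >= 3), the complement

print(c(lower_tail, by_summing, upper_tail))
```

Upper-tail probabilities such as P( B ≥ 3 ) are obtained from the complement, which is why the later examples use 1-pbinom(...).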
Normal Probabilities
> qnorm(p,μ,σ) will yield the value y0 such that P( Y < y0 ) = p when Y is normal with mean = μ and std. dev. = σ
> pnorm(y,μ,σ) will yield the value of P( Y < y ) when Y is normal with mean = μ and std. dev. = σ
Note: if you leave out the values for μ and σ, R will assume you want to use the standard normal Z
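The relationship between the two commands can be sketched as follows. The mean 5 and standard deviation .31 are taken from the hypothesis testing example in this handout; the variable names are illustrative only.

```r
# Leaving out the mean and standard deviation gives the standard normal Z
z_default  <- qnorm(0.05)
z_explicit <- qnorm(0.05, 0, 1)

# pnorm undoes qnorm: P(Z < z_default) is 0.05 again
p_back <- pnorm(z_default)

# A normal quantile is the mean plus z times the standard deviation
y0 <- qnorm(0.05, 5, 0.31)
by_hand <- 5 + z_default * 0.31

print(c(z_default, z_explicit, p_back, y0, by_hand))
```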
Example: Hypothesis Testing Example 2, Testing H0: μ = 5 versus H1: μ < 5
s = 3.1, n = 100
Find the RR for the large sample test at α = .05
> qnorm(.05) or qnorm(.05,0,1)
[1] -1.644854 ( we used Z < -1.645 )
> qnorm(.05,5,.31)
[1] 4.490095
> pnorm(4.4,5,.31)
[1] 0.02646547
or
> pnorm(-1.94)
[1] 0.02618984 ( we got .0262 )
> pnorm(4.49,4,.31)
[1] 0.9430204
or
> pnorm(1.58)
[1] 0.9429466 ( we got .9429 )
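The numbers in this example can be reproduced from s and n directly. The sketch below assumes the standard error is s / sqrt(n) = .31, which is consistent with the values passed to qnorm and pnorm above; the variable names are illustrative.

```r
# Large-sample lower-tailed test of H0: mu = 5 with s = 3.1 and n = 100
mu0 <- 5
s <- 3.1
n <- 100
alpha <- 0.05

se <- s / sqrt(n)               # standard error of the sample mean
z_cut <- qnorm(alpha)           # reject when Z < z_cut
ybar_cut <- mu0 + z_cut * se    # same cutoff on the scale of the sample mean

# Probability that the sample mean falls below 4.49 when the true mean is 4
prob_below <- pnorm(4.49, 4, se)

print(c(se, z_cut, ybar_cut, prob_below))
```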
> qbinom(.05,15,.4)
[1] 3 means we should use B ≤ 2
> pbinom(2,15,.4)
[1] 0.027114 ( = .027 )
> qbinom(.95,15,.4)
[1] 9 means we should use B ≥ 10
> 1-pbinom(9,15,.4)
[1] 0.0338333 ( = .034 )
> qbinom(.025,15,.4)
[1] 2
and
> qbinom(.975,15,.4)
[1] 10 means use B ≤ 1 or B ≥ 11
> pbinom(1,15,.4)
[1] 0.005172035
> 1-pbinom(10,15,.4)
[1] 0.009347661 ( = .005 + .009 = .014 )
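The two-tailed case above can be sketched as a short script. It assumes n = 15, p0 = .4 and α = .05, as the commands suggest, and steps one value past each quantile because qbinom returns the smallest b with P( B ≤ b ) at least the requested probability. The variable names are illustrative.

```r
# Exact two-tailed rejection region for a binomial test (n = 15, p0 = .4)
n <- 15
p0 <- 0.4
alpha <- 0.05

# Step one below the lower quantile and one above the upper quantile
# (conservative if a cumulative probability happens to equal alpha/2 exactly)
lower_cut <- qbinom(alpha / 2, n, p0) - 1        # reject when B <= lower_cut
upper_cut <- qbinom(1 - alpha / 2, n, p0) + 1    # reject when B >= upper_cut

# Achieved significance level of the resulting region
achieved <- pbinom(lower_cut, n, p0) + (1 - pbinom(upper_cut - 1, n, p0))

print(c(lower_cut, upper_cut, achieved))
```

Because B is discrete, the achieved level is below the nominal α.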
> binom.test(B,n,p=p0,"g" or "l" or "t")
> binom.test(3,15,p=.4,"g")
Exact binomial test
data: 3 and 15
number of successes = 3, number of trials = 15, p-value = 0.9729 (.973)
alternative hypothesis: true probability of success is greater than 0.4
95 percent confidence interval:
0.05684687 1.00000000
sample estimates:
probability of success
0.2
> binom.test(3,15,p=.4,"l")
Exact binomial test
data: 3 and 15
number of successes = 3, number of trials = 15, p-value = 0.0905 (.091)
alternative hypothesis: true probability of success is less than 0.4
95 percent confidence interval:
0.0000000 0.4397844
sample estimates:
probability of success
0.2
> binom.test(3,15,p=.4,"t")
Exact binomial test
data: 3 and 15
number of successes = 3, number of trials = 15, p-value = 0.1855 (.182)
alternative hypothesis: true probability of success is not equal to 0.4
95 percent confidence interval:
0.04331201 0.48089113
sample estimates:
probability of success
0.2
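The "g", "l" and "t" arguments are abbreviations that R matches to "greater", "less" and "two.sided". A minimal sketch with the alternative spelled out, keeping only the p-values, might look like this (the object names are illustrative):

```r
# Same three tests with the alternative written in full
alts <- c("greater", "less", "two.sided")
p_values <- sapply(alts, function(a) {
  binom.test(3, 15, p = 0.4, alternative = a)$p.value
})
print(round(p_values, 4))

# The one-sided p-values are binomial tail probabilities
p_greater <- 1 - pbinom(2, 15, 0.4)   # P(B >= 3)
p_less    <- pbinom(3, 15, 0.4)       # P(B <= 3)
print(c(p_greater, p_less))
```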
Example: A random sample of 200 registered voters yields 95 who say they would like to see the Health
Care Bill repealed. Using α = .01, test whether the true proportion of registered voters who would like to see
the Health Care Bill repealed is greater than, less than, or different from .50.
Obtain the P-values for each of the three possible alternatives.
> binom.test(95,200,.5,"g")
Exact binomial test
data: 95 and 200
number of successes = 95, number of trials = 200, p-value = 0.7816
alternative hypothesis: true probability of success is greater than 0.5
> binom.test(95,200,.5,"l")
Exact binomial test
data: 95 and 200
number of successes = 95, number of trials = 200, p-value = 0.2623
alternative hypothesis: true probability of success is less than 0.5
> binom.test(95,200,.5,"t")
Exact binomial test
data: 95 and 200
number of successes = 95, number of trials = 200, p-value = 0.5246
alternative hypothesis: true probability of success is not equal to 0.5
Find the rejection region for the exact two-tailed test at α = .01
> qbinom(.005,200,.5)
[1] 82
and
> qbinom(.995,200,.5)
[1] 118 means use B ≤ 81 or B ≥ 119
> pbinom(81,200,.5)
[1] 0.00436
> 1-pbinom(118,200,.5)
[1] 0.00436 [ = 2(.00436) = .00872 ]
Find the RR for the large sample two-tailed test at α = .01
> qnorm(.995)
[1] 2.575829 means use |B*| > 2.576
Find the RR in terms of B using large sample:
> qnorm(.995,100,7.07107)
[1] 118.2139
use 119
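The large-sample calculation for the voter example can be sketched as below. It assumes B* denotes the standardized count ( B - n p0 ) / sqrt( n p0 (1 - p0) ), which matches the mean 100 and standard deviation 7.07107 used in the qnorm call; the variable names are illustrative.

```r
# Large-sample version of the voter example: B = 95 successes, n = 200, p0 = .5
B <- 95
n <- 200
p0 <- 0.5

mean_B <- n * p0                     # 100
sd_B <- sqrt(n * p0 * (1 - p0))      # about 7.07107
B_star <- (B - mean_B) / sd_B        # standardized statistic

z_cut <- qnorm(0.995)                # two-tailed cutoff at alpha = .01
reject <- abs(B_star) > z_cut

# Upper cutoff on the scale of B, rounded up to the next whole count
B_cut <- ceiling(qnorm(0.995, mean_B, sd_B))

print(c(B_star, z_cut, B_cut))
print(reject)
```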
X = Before   Y = After   Z = Y - X   |Z|     Ri   ψi
51.2         45.8          -5.4       5.4    4    0
46.5         41.3          -5.2       5.2    3    0
24.1         15.8          -8.3       8.3    6    0
10.2         11.1           0.9       0.9    1    1
65.3         58.5          -6.8       6.8    5    0
92.1         70.3         -21.8      21.8    8    0
30.3         31.6           1.3       1.3    2    1
49.2         35.4         -13.8      13.8    7    0
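The columns of the table above can be rebuilt in R, which also gives the signed rank statistic T+ and the sign statistic B directly. This is a sketch with illustrative variable names; psi stands for the indicator of a positive difference.

```r
# Before/after data from the table above
x <- c(51.2, 46.5, 24.1, 10.2, 65.3, 92.1, 30.3, 49.2)
y <- c(45.8, 41.3, 15.8, 11.1, 58.5, 70.3, 31.6, 35.4)

z <- y - x                  # differences
r <- rank(abs(z))           # ranks of the absolute differences
psi <- as.numeric(z > 0)    # 1 when the difference is positive, 0 otherwise

T_plus <- sum(r * psi)      # signed rank statistic
B <- sum(psi)               # sign statistic

print(data.frame(x, y, z = round(z, 1), abs_z = round(abs(z), 1), r, psi))
print(c(T_plus = T_plus, B = B))
```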
> wilcox.test(y,x,paired=T,"g" or "l" or "t")
[1] 0.05468
> SIGN.test(y,x,0,"g" or "l" or "t")
You could use the qbinom or pbinom commands to find the RR for the sign test
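SIGN.test is not part of base R; it appears to come from the BSDA package (an assumption here, since the handout does not say). The sign test itself needs only pbinom, as in the following sketch for the before/after data, where dropping zero differences is the usual convention and the variable names are illustrative.

```r
# Sign test done by hand for paired data (values from the before/after table)
x <- c(51.2, 46.5, 24.1, 10.2, 65.3, 92.1, 30.3, 49.2)
y <- c(45.8, 41.3, 15.8, 11.1, 58.5, 70.3, 31.6, 35.4)

d <- y - x
d <- d[d != 0]              # zero differences carry no sign and are dropped
n <- length(d)
B <- sum(d > 0)             # number of positive differences

p_less    <- pbinom(B, n, 0.5)           # H1: median difference < 0
p_greater <- 1 - pbinom(B - 1, n, 0.5)   # H1: median difference > 0
p_two     <- min(1, 2 * min(p_less, p_greater))

print(c(B = B, n = n, less = p_less, greater = p_greater, two.sided = p_two))
```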
Example: Does a certain prescription drug affect heart rate? The resting heart rates of a sample of 10 patients were
recorded ( X ). All 10 patients were given a dose of the drug in question and, after thirty minutes, their resting
heart rates were again recorded ( Y ). Use the data below to test, at α = .10, whether this drug has any effect on
resting heart rate.
Patient   X    Y    Z = Y - X   Ri
1         68   70
2         73   72
3         75   80
4         77   80
5         78   79
6         78   78
7         80   79
8         81   83
9         84   81
10        87   89
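One way to set this example up in R is sketched below. It assumes the first column of values is X and the second is Y, as in the problem statement, and uses exact = FALSE (the normal approximation) because the differences contain ties and a zero. The variable names are illustrative.

```r
# Resting heart rates before (x) and 30 minutes after (y) the dose
x <- c(68, 73, 75, 77, 78, 78, 80, 81, 84, 87)
y <- c(70, 72, 80, 80, 79, 78, 79, 83, 81, 89)

z <- y - x                   # differences Y - X
nz <- z[z != 0]              # the zero difference is dropped before ranking
r <- rank(abs(nz))           # tied absolute differences share average ranks
T_plus <- sum(r[nz > 0])     # signed rank statistic

# Two-sided test using the normal approximation
res <- wilcox.test(y, x, paired = TRUE, alternative = "two.sided", exact = FALSE)

print(z)
print(T_plus)
print(res$p.value)
```

The P-value printed on the last line is then compared with α = .10.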
Student   X = Before   Y = After
1         20           20
2         21           22
3         25           10
4         26           16
5         32           11
6         27           20
7         38           20
8         34           19
9         28           13
10        20           21
11        29           12
Using α = .05, test to try to show that anxiety is reduced by more than 3 points by taking the course.
Exact Signed Rank Test
> x<- c(20,21,25,26,32,27,38,34,28,20,29)
> y<- c(20,22,10,16,11,20,20,19,13,21,12)
> wilcox.test(y,x,paired=T,mu=-3,"l")
Wilcoxon signed rank test with continuity correction
data: y and x
V = 7, p-value = 0.01142 (.009)
alternative hypothesis: true location shift is less than -3
Warning message:
In wilcox.test.default(y, x, paired = T, mu = -3, "l") :
cannot compute exact p-value with ties
Exact Sign Test
> SIGN.test(y,x,-3,"l")
Dependent-samples Sign-Test
data: y and x
S = 3, p-value = 0.1133
alternative hypothesis: true median difference is less than -3
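Both statistics above can be checked by hand: testing a shift of -3 amounts to adding 3 to each difference and testing against 0. The following is a sketch with illustrative variable names.

```r
x <- c(20, 21, 25, 26, 32, 27, 38, 34, 28, 20, 29)
y <- c(20, 22, 10, 16, 11, 20, 20, 19, 13, 21, 12)

# Testing H1: median of (y - x) < -3 is the same as testing whether
# the shifted differences d = (y - x) - (-3) have median below 0
d <- (y - x) + 3

V <- sum(rank(abs(d))[d > 0])        # signed rank statistic
S <- sum(d > 0)                      # sign statistic
p_sign <- pbinom(S, length(d), 0.5)  # lower-tail sign test p-value

print(c(V = V, S = S, p_sign = p_sign))
```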
Estimation of θ using the signed rank and the sign procedures
CI and Point estimate for the safety program data
> wilcox.test(y-x,conf.int=T,conf.level=.9)
Wilcoxon signed rank test
data: y - x
V = 3, p-value = 0.03906
alternative hypothesis: true location is not equal to 0
90 percent confidence interval:
-13.60 -2.15
sample estimates:
(pseudo)median
-6.6
> wilcox.test(y-x,conf.int=T,conf.level=.95)
95 percent confidence interval:
-14.30 -1.95
> wilcox.test(y-x,conf.int=T,conf.level=.99)
99 percent confidence interval:
-21.8 1.3
> SIGN.test(y,x,0,"t",conf.level=.9)
Dependent-samples Sign-Test
data: y and x
S = 2, p-value = 0.2891
alternative hypothesis: true median difference is not equal to 0
90 percent confidence interval:
-13.05357143 0.07214286
sample estimates:
median of x-y
-6.1
                    Conf.Level     L.E.pt    U.E.pt
Lower Achieved CI       0.7109    -8.3000   -5.2000
Interpolated CI         0.9000   -13.0536    0.0721
Upper Achieved CI       0.9297   -13.8000    0.9000
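The achieved confidence levels in this table come from the binomial distribution: the interval from the k-th smallest to the k-th largest difference has level 1 - 2 P( B ≤ k - 1 ), where B is binomial with n trials and p = .5. A sketch for the safety program data follows; the variable names are illustrative, and the interpolated row is not reproduced.

```r
x <- c(51.2, 46.5, 24.1, 10.2, 65.3, 92.1, 30.3, 49.2)
y <- c(45.8, 41.3, 15.8, 11.1, 58.5, 70.3, 31.6, 35.4)
z <- sort(y - x)
n <- length(z)

# Order-statistic intervals and their exact confidence levels
k <- 1:3
ci <- data.frame(
  k = k,
  conf_level = 1 - 2 * pbinom(k - 1, n, 0.5),
  lower = z[k],
  upper = z[n + 1 - k]
)
print(ci)
```

No order-statistic interval has level exactly .90, which is why SIGN.test reports the two neighbouring achieved intervals along with an interpolated one.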
> SIGN.test(y,x,0,"t",conf.level=.95)
                    Conf.Level   L.E.pt   U.E.pt
Lower Achieved CI       0.9297
Interpolated CI         0.9500
Upper Achieved CI       0.9922    -21.8     1.30

                    Conf.Level   L.E.pt   U.E.pt
Lower Achieved CI                -1.000    3.000
Interpolated CI         0.9900   -2.176    4.176
Upper Achieved CI       0.9980   -3.000    5.000

One Sample Signed Rank and Sign
Example 1: A large bank wishes to test whether the median time its customers spend in line waiting for a teller
is less than 3 minutes. A random sample of 11 customers was taken and their waiting times ( to the nearest
second ) were recorded. Use the data below to test the hypotheses of interest to the bank using α = .05.
Zi      Zi - 3   |Zi - 3|   Rank
2:45    -0:15    0:15       2
3:12     0:12    0:12       1
0:21    -2:39    2:39       7
0:00    -3:00    3:00       9.5
1:34    -1:26    1:26       5
6:54     3:54    3:54       11
0:12    -2:48    2:48       8
5:32     2:32    2:32       6
0:00    -3:00    3:00       9.5
3:56     0:56    0:56       4
3:31     0:31    0:31       3
H0: θ = 3
H1: θ < 3
TS: T+ = 25 = 1 + 11 + 6 + 4 + 3
RR ( α = .042 ): T+ ≤ 13 = 66 - 53
P-Value: P( T+ ≤ 25 ) = .260, same as P( T+ ≥ 41 )
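The table lookups above can also be done with psignrank, the null distribution function of T+ in base R. The sketch below ignores the tied ranks in the data, as a printed table does, and uses illustrative variable names.

```r
n <- 11
t_plus <- 25                       # observed T+

p_value <- psignrank(t_plus, n)    # P(T+ <= 25) for the lower-tailed test

# T+ is symmetric about n(n+1)/4, so P(T+ <= 25) = P(T+ >= 66 - 25)
total <- n * (n + 1) / 2           # 66
p_upper <- 1 - psignrank(total - t_plus - 1, n)   # P(T+ >= 41)

# Largest cutoff whose lower-tail probability does not exceed .05
cutoffs <- 0:total
c_max <- max(cutoffs[psignrank(cutoffs, n) <= 0.05])

print(c(p_value, p_upper, c_max, psignrank(c_max, n)))
```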
> SIGN.test(z,conf.level=.9)
                    Conf.Level    L.E.pt     U.E.pt
Lower Achieved CI       0.7734   21.0000   211.0000
Interpolated CI         0.9000   13.9309   230.6364
Upper Achieved CI       0.9346   12.0000   236.0000
> SIGN.test(z,conf.level=.95)
                    Conf.Level    L.E.pt     U.E.pt
Lower Achieved CI       0.9346   12.0000   236.0000
Interpolated CI         0.9500    8.5527   263.5782
Upper Achieved CI       0.9883    0.0000   332.0000
> SIGN.test(z,conf.level=.99)
                    Conf.Level    L.E.pt     U.E.pt
Lower Achieved CI       0.9883         0     332.00
Interpolated CI         0.9900         0     345.12
Upper Achieved CI       0.9990         0     414.00
Example 3: A random sample of 5 healthy males between the ages of 19 - 30 was taken. These males
were all non-smokers and either doctors or medical research workers. The forced vital capacity ( a measure of
aerobic health ) of each male in the sample was measured. Using the 5 values given below, find a 90 %
confidence interval for the true median forced vital capacity of males in this group. Also, find a point estimate
for the median.
n = 5, so there are 5(6) / 2 = 15 Walsh averages
P( T+ ≥ 15 ) = .031
Thus tα/2 = 15 and 15 + 1 - 15 = 1
A 93.8 % C.I. for θ would be [ W(1) , W(15) ] and the point estimate for θ would be W(8) .
Zi      4290     5280     5280     5555     5610
4290    4290.0   4785.0   4785.0   4922.5   4950.0
5280             5280.0   5280.0   5417.5   5445.0
5280                      5280.0   5417.5   5445.0
5555                               5555.0   5582.5
5610                                        5610.0
The 93.8 % C.I. is ( 4290, 5610 ) and the point estimate for the median is 5280 .
> z <- c(4290,5280,5280,5555,5610)
> wilcox.test(z,conf.int=T,conf.level=.9)
Wilcoxon signed rank test with continuity correction
90 percent confidence interval:
4290 5610
sample estimates:
(pseudo)median
5280
Warning messages:
1: In wilcox.test.default(z, conf.int = T, conf.level = 0.9) :
cannot compute exact p-value with ties
2: In wilcox.test.default(z, conf.int = T, conf.level = 0.9) :
cannot compute exact confidence interval with ties
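The Walsh averages for this example can be generated directly instead of being filled in by hand. The sketch below uses outer to form all pairwise averages; the variable names are illustrative.

```r
z <- c(4290, 5280, 5280, 5555, 5610)
n <- length(z)

# All pairwise averages (z_i + z_j)/2 with i <= j: n(n+1)/2 of them
avg <- outer(z, z, "+") / 2
walsh <- sort(avg[upper.tri(avg, diag = TRUE)])

# P(T+ >= 15) for n = 5; the interval [W(1), W(15)] has level 1 - 2 * this
tail_prob <- 1 - psignrank(14, n)
conf_level <- 1 - 2 * tail_prob

print(length(walsh))
print(c(lower = walsh[1], estimate = median(walsh), upper = walsh[length(walsh)]))
print(c(tail_prob, conf_level))
```

With only 5 observations the widest available interval has level 93.75 %, the nearest attainable level to the requested 90 %, which is why the worked answer is reported as a 93.8 % interval.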
Example 4: The values below are the effective doses of a drug for 9 different patients. Use these data to find
a 90 % confidence interval for the true median effective dose and also find a point estimate for the true median
effective dose.
n = 9, so there are 9(10) / 2 = 45 Walsh averages
P( T+ ≥ 37 ) = .049
Thus tα/2 = 37 and 45 + 1 - 37 = 9
A 90.2 % C.I. for θ would be [ W(9) , W(37) ] and the point estimate for θ would be W(23) .
Zi      .41    .45    .52    .68    .75    .78    .82    .91    1.06
.41     .410   .430   .465   .545   .580   .595   .615   .660   .735
.45            .450   .485   .565   .600   .615   .635   .680   .755
.52                   .520   .600   .635   .650   .670   .715   .790
.68                          .680   .715   .730   .750   .795   .870
.75                                 .750   .765   .785   .830   .905
.78                                        .780   .800   .845   .920
.82                                               .820   .865   .940
.91                                                      .910   .985
1.06                                                            1.060
The 90.2 % C.I. is ( .580, .845 ) and the point estimate for the median is .715 .
> z <- c(.41,.45,.52,.68,.75,.78,.82,.91,1.06)
> wilcox.test(z,conf.int=T,conf.level=.9)
0.715
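The hand calculation for Example 4 can be sketched in R as well, this time finding the cutoff 37 from the signed rank distribution with psignrank rather than from a table. The variable names are illustrative.

```r
z <- c(.41, .45, .52, .68, .75, .78, .82, .91, 1.06)
n <- length(z)
M <- n * (n + 1) / 2                        # number of Walsh averages

avg <- outer(z, z, "+") / 2
walsh <- sort(avg[upper.tri(avg, diag = TRUE)])

# Smallest t with P(T+ >= t) <= .05, found from the signed rank distribution
t_vals <- 0:M
upper_tail <- 1 - psignrank(t_vals - 1, n)  # P(T+ >= t)
t_cut <- min(t_vals[upper_tail <= 0.05])
k <- M + 1 - t_cut                          # index of the lower endpoint

ci <- c(walsh[k], walsh[t_cut])
conf_level <- 1 - 2 * (1 - psignrank(t_cut - 1, n))

print(c(t_cut, k))
print(round(c(ci, median(walsh), conf_level), 3))
```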