The document discusses non-parametric methods in statistics, which do not rely on strict assumptions about population parameters, unlike parametric methods. It outlines the advantages and disadvantages of non-parametric techniques, including their applicability in various data situations and their relative insensitivity to outliers. Additionally, it describes the runs test, a specific non-parametric test for assessing randomness in data sequences.


Non-parametric Methods
By Prof. S.U. Gulumbe
Introduction
1. The first techniques of statistical inference made a good many assumptions about the nature of the
population from which the data were drawn

• These are called “parametric”, because the population values are parameters

• For example:

• the assumption could be that the set of scores was drawn from a normally distributed
population, or

• the assumption could be that the sets of scores were drawn from populations having the same
variance

2. Nowadays, a large number of techniques have been developed which do not make stringent
assumptions about parameters.

• These are called “distribution-free” or “non-parametric” techniques

3. A parametric procedure, wherever suitable, is more powerful than the corresponding approximate
non-parametric procedure.

4. Different techniques are called for, depending on the nature of the data, which may be of any of the
following types: Nominal, Ordinal, Interval or Ratio

5. Examples: Nominal (e.g. sex, colour), Ordinal (e.g. status of health: good, poor, or mediocre),
Interval (e.g. temperature) and Ratio (e.g. height, weight, etc.)
Some advantages of NP methods
NP methods have certain desirable properties that hold under mild assumptions
regarding the underlying populations from which the data are obtained

1. NP methods require few assumptions about the underlying populations

2. NP procedures enable the user to obtain an exact p-value for a test, an exact coverage
probability for a confidence interval, etc.

3. NP techniques are usually (though not always) easier to apply than their normal-theory
counterparts

4. NP methods are often easier to understand

5. NP techniques are often more efficient than their normal-theory competitors when the
underlying populations are not normal

6. NP methods are relatively insensitive to outlying observations

7. They are applicable in many situations where normal-theory procedures cannot be
utilised

8. The development of computer software has facilitated fast computation of exact
p-values for individual NP tests
Some disadvantages of NP Tests

1. NP tests are less powerful:

• Parametric tests use more of the information available
in a set of numbers.

• Parametric tests make use of information consistent
with an interval scale of measurement, whereas NP tests
typically make use of ordinal information only.

2. Parametric tests are much more flexible, and allow you
to test a greater range of hypotheses.
Some NP Counterparts of Parametric Procedures

Parametric → Non-parametric

1. Pearson coefficient of correlation → Spearman coefficient of correlation

2. One-sample t-test for location → Sign test, Wilcoxon signed-rank test (WSiRT)

3. Paired t-test → Sign test, WSiRT

4. Two-sample t-test → Wilcoxon rank-sum test (WSuRT), Mann-Whitney test

5. One-way ANOVA → Kruskal-Wallis test

6. Two-way ANOVA → Friedman test
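To illustrate one such counterpart: the Spearman coefficient is simply the Pearson correlation computed on the ranks of the observations rather than on the raw values, which is what makes it insensitive to the actual scale of the data. A minimal pure-Python sketch (the function names `ranks` and `spearman` are illustrative, not from any particular library):

```python
def ranks(xs):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    rk = [0.0] * len(xs)
    i = 0
    while i < len(xs):
        j = i
        # extend j over any block of tied values
        while j + 1 < len(xs) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1          # average rank of the tied block
        for k in range(i, j + 1):
            rk[order[k]] = avg
        i = j + 1
    return rk

def spearman(xs, ys):
    """Spearman coefficient = Pearson correlation of the ranks."""
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5

print(spearman([1, 2, 3, 4], [1, 4, 9, 16]))   # 1.0: monotone, though non-linear
```

Note that any strictly increasing relationship, however non-linear, yields a Spearman coefficient of exactly 1, whereas the Pearson coefficient would fall below 1.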


Runs Test
• This is a one-sample test intended to test whether a sample is random or
not. It is based on the number of runs which the sample exhibits.

• Definition: A run is defined as a succession of identical symbols
which is followed and preceded by different symbols, or by no symbol at all.

• The runs test (Bradley, 1968) can be used to decide if a data set is from a random process. In a
random data set, the probability that the (i+1)th value is larger or smaller than the ith value follows
a binomial distribution, which forms the basis of the runs test.

• The first step in the runs test is to count the number of runs in the data sequence. There are several ways to
define runs in the literature; however, in all cases the formulation must produce a dichotomous sequence of
values. For example, a series of 20 coin tosses might produce the following sequence of heads (H) and tails (T).
HHTTHTHHHHTHHTTTTTHH
The number of runs for this series is nine. There are 11 heads and 9 tails in the sequence.
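Counting runs amounts to counting the changes of symbol in the sequence and adding one. A small sketch (assuming the dichotomous sequence is supplied as a string of two symbols; the function name `count_runs` is illustrative):

```python
def count_runs(seq):
    """Number of runs = 1 + the number of adjacent symbol changes."""
    if not seq:
        return 0
    return 1 + sum(1 for a, b in zip(seq, seq[1:]) if a != b)

tosses = "HHTTHTHHHHTHHTTTTTHH"
print(count_runs(tosses))                     # 9 runs
print(tosses.count("H"), tosses.count("T"))   # 11 heads, 9 tails
```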

• The hypotheses of the runs test are defined as:

• H0: the sequence was produced in a random manner

• H1: the sequence was not produced in a random manner


Runs test Method
Let n1 = the number of events of one type and n2 = the number of events of another type. Then n1 + n2 is the total
number of events.

Determine the value of r, the number of runs.

If both n1 and n2 are equal to or less than 20 (or in some cases, less than 10), tables give the critical values of r under H0 for
α = 0.05.

If the observed value of r falls between the critical values, we accept H0.

If the observed value of r is equal to or more extreme than one of the critical values, we reject H0.

• For the small-sample runs test, there are tables to determine critical values that depend on the values of n1 and n2.

• Two tables are given: FI and FII

• FI gives values of r which are so small that the probability associated with their occurrence under H0 is p = 0.025

• FII gives values of r which are so large that the probability associated with their occurrence under H0 is p = 0.025

• Any observed value of r which is equal to or less than the value shown in FI, OR is equal to or larger than the value
shown in FII, is in the region of rejection for α = 0.05
Runs test on large samples
• The test statistic is given by

Z = (R − R̄) / SR

where R is the observed number of runs, R̄ is the expected number of runs and SR is the standard deviation
of the number of runs. The values of R̄ and SR² are computed by

R̄ = 2n1n2 / (n1 + n2) + 1

SR² = 2n1n2(2n1n2 − n1 − n2) / [(n1 + n2)²(n1 + n2 − 1)]

where n1 and n2 are the numbers of type I and type II events in the sequence respectively. Thus n1 + n2 gives
the total number of observations.

• Critical Region: The runs test rejects the null hypothesis if

| Z | > Z1−α/2

For the large-sample runs test (n1 > 10 and n2 > 10), the test statistic is compared to a standard normal table.
That is, at the 5% significance level, a test statistic with absolute value greater than 1.96 indicates non-
randomness.
Examples
• Example 1. Is the sequence of turn choices in a maze random? The sequence
is: LLLRLRRRRLLLLLLLRRRRLR

In this case there are n1 = 12 left turns and n2 = 10 right turns in the sequence. The
number of runs is 8.

Thus, we have

R̄ = 2(12)(10) / (12 + 10) + 1 = 11.91

σR² = 5.147, σR = 2.269

The test statistic is

Z = (R − R̄) / σR = (8 − 11.91) / 2.269 = −1.723

Since |Z| = 1.723 < 1.96, we do not reject H0 at the 5% significance level: there is no
evidence that the sequence of turns is non-random.
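The large-sample computation can be reproduced directly from the formulas for R̄ and SR². A sketch applied to the maze sequence (the function name `runs_test_z` is illustrative):

```python
import math

def runs_test_z(seq):
    """Large-sample runs test statistic Z = (R - Rbar) / S_R."""
    kinds = sorted(set(seq))                     # the two symbol types
    n1 = sum(1 for s in seq if s == kinds[0])
    n2 = len(seq) - n1
    n = n1 + n2
    r = 1 + sum(1 for a, b in zip(seq, seq[1:]) if a != b)   # observed runs
    r_bar = 2 * n1 * n2 / n + 1                              # expected runs
    s_r2 = 2 * n1 * n2 * (2 * n1 * n2 - n1 - n2) / (n ** 2 * (n - 1))
    return (r - r_bar) / math.sqrt(s_r2)

z = runs_test_z("LLLRLRRRRLLLLLLLRRRRLR")
print(round(z, 3))   # -1.723; |Z| < 1.96, so randomness is not rejected
```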

• Example 2:
