19 MAY - NR - Tests of Significance

Uploaded by

Esha Kutti

Tests of significance

In applied investigations, one is often interested in comparing some characteristic of a group (such as its
mean or variance) with a specified value, or in comparing it between two groups. In making such a comparison,
one cannot rely on the mere numerical magnitude of the index of comparison, such as the mean or variance. The
reason is that each group is represented by a sample of observations, and the numerical value will change if
another sample is drawn. These sampling fluctuations affect the observed differences between groups and can
conceal the real differences. Statistical science provides an objective procedure for deciding whether an
observed difference signifies a real difference between the groups. Such a procedure is called a
‘test of significance’.
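
Sampling fluctuation can be seen directly by drawing several samples from one population. The sketch below is only an illustration: the population (mean 50, SD 10), the sample size and the seed are assumptions chosen for the demonstration, not values from these notes.

```python
import random
import statistics

random.seed(1)  # fixed seed so that the run is repeatable

# Hypothetical population: 10,000 values with mean near 50 and SD near 10.
population = [random.gauss(50, 10) for _ in range(10_000)]

# Draw five independent samples of size 10 and record each sample mean.
sample_means = [
    statistics.fmean(random.sample(population, 10)) for _ in range(5)
]

for i, m in enumerate(sample_means, start=1):
    print(f"sample {i}: mean = {m:.2f}")

# The spread of the sample means is the sampling fluctuation.
spread = max(sample_means) - min(sample_means)
print(f"range of sample means = {spread:.2f}")
```

Every sample comes from the same population, yet the means differ. A test of significance asks whether an observed difference is larger than this kind of fluctuation would explain.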

A. Student's t-test

A t-test is a statistical hypothesis test in which the test statistic follows a Student's t distribution
under the null hypothesis. It is used when the population standard deviation is unknown and has to be
estimated from the sample, so that the test statistic does not follow a normal distribution. The t-test is
used for small samples, for which the size ‘n’ is < 30.
The t-statistic was introduced in 1908 by William S. Gosset, who published under the pseudonym ‘Student’.

The most frequently used types of t-test are:

1. A one sample test: to know whether the mean of a population has a value specified in a null
hypothesis.
2. A two sample test: to know whether the means of two populations are equal.
3. Another test is to see whether the difference between two responses measured on the same
statistical unit has a mean value of zero. For example, suppose we measure the size of a cancer
patient's tumor before and after a treatment. If the treatment is effective, we expect the tumor
size for many of the patients to be smaller following the treatment. This is often referred to as
the "paired" or "repeated measures" t-test.
4. A test of whether the slope of a regression line differs significantly from 0.

Tests based on Student’s ‘t’ Distribution


Testing sample mean against population mean (one sample ‘t’ test)
The null hypothesis is set as $H_0: \mu = \mu_0$. The alternate hypothesis can be any one of the following.
(1) $H_1: \mu \neq \mu_0$ (2) $H_1: \mu < \mu_0$ (3) $H_1: \mu > \mu_0$
When $\sigma$, the population SD, is unknown, ‘t’ is calculated as

$t = \dfrac{\bar{x} - \mu_0}{S / \sqrt{n-1}}$, df = n − 1

Note: Here the SD is calculated as $S = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n}}$. The divisor is ‘n’ only.

If the SD is instead calculated using $S = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n-1}}$, the formula for ‘t’
becomes $t = \dfrac{\bar{x} - \mu_0}{S / \sqrt{n}}$.
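
The two versions of the formula give the same value of ‘t’. A short derivation makes this explicit. The subscripts $S_n$ and $S_{n-1}$ are introduced here only to tell the two SD definitions apart; they are not notation used elsewhere in these notes.

```latex
S_n^2 = \frac{\sum (x - \bar{x})^2}{n}, \qquad
S_{n-1}^2 = \frac{\sum (x - \bar{x})^2}{n-1}
\;\Longrightarrow\; n\,S_n^2 = (n-1)\,S_{n-1}^2
\;\Longrightarrow\; \frac{S_n}{\sqrt{n-1}} = \frac{S_{n-1}}{\sqrt{n}}

\text{Hence}\quad
t = \frac{\bar{x} - \mu_0}{S_n/\sqrt{n-1}} = \frac{\bar{x} - \mu_0}{S_{n-1}/\sqrt{n}} .
```

So the choice of divisor changes only the form of the denominator, not the result.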
The t value calculated is compared with t 2 for  = 0.05 or = 0.01, if alternate hypothesis used is
H1 : <0 or H1 : >0 [ Known as one tailed tests]
The t value calculated is compared with t  for  = 0.05 or = 0.01, if alternate hypothesis used
is H1 : 0. [ Known as two tailed tests]
From ‘t’ tables, the values of tfor two tailed tests will be: For  = 0.05, t will be t0.05. For  = 0.01, t will
be t0.01.
From ‘t’ tables, the values of t 2for one tailed tests will be: For  = 0.05, t2 will be t0.10. For  = 0.01, t2
will be t0.02.
If the t value calculated is <t& t2 for ‘df’ (n-1), then H0 is accepted.
If the t value calculated is >t& t2 for ‘df’ (n-1), then H0 is rejected.
Write the inference compulsorily after the analysis is over.

One sample ‘t’ test Examples and Exercises


Example (university question)
An injection given to each of 12 infected patients produced blood pressure changes of 1, 5, 2, 8, -1, 3, 0, 6,
-2, 5, 0 and 4. Is it possible to conclude that the injection will, in general, increase the BP? ($\alpha$ = 0.05;
t0.10 = 1.796 and t0.05 = 2.2 for 11 df)
In this example, note that the population value is ‘zero’: under the null hypothesis we assume that the
injection does not change the BP. Hence $\mu_0 = 0$.

$H_0: \mu = \mu_0$ (= 0); $H_1: \mu > \mu_0$ (> 0); $t = \dfrac{\bar{x} - \mu_0}{S / \sqrt{n-1}}$; df = n − 1 = 11; $\bar{x}$ = 2.58, S = 2.96,

$t = \dfrac{2.58}{2.96} \times \sqrt{11} = 2.89$. The ‘t’ value from the table is 1.796. t > 1.796, hence H0 is rejected.

Inference: It can be concluded that the injection increases BP.
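
The arithmetic of this example can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the variable names are illustrative. It computes ‘t’ with both SD conventions (divisor n and divisor n − 1) to show that they agree.

```python
import math
import statistics

# Blood pressure changes for the 12 patients in the example.
changes = [1, 5, 2, 8, -1, 3, 0, 6, -2, 5, 0, 4]
mu0 = 0            # hypothesised mean change under H0
n = len(changes)

mean = statistics.fmean(changes)
s_n = statistics.pstdev(changes)    # SD with divisor n
s_n1 = statistics.stdev(changes)    # SD with divisor n - 1

# The two equivalent forms of the t statistic.
t_from_n = (mean - mu0) / (s_n / math.sqrt(n - 1))
t_from_n1 = (mean - mu0) / (s_n1 / math.sqrt(n))

table_value = 1.796  # t0.10 for 11 df, as given in the question
reject_h0 = t_from_n > table_value  # H1: mu > mu0, so only a large positive t rejects

print(f"mean = {mean:.2f}, S = {s_n:.2f}, t = {t_from_n:.2f}, reject H0: {reject_h0}")
```

At full precision t is about 2.90. The hand value differs in the second decimal place because the mean and S were rounded before dividing; the conclusion is unchanged.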

Example (university question)

Given below are observations of a sample from a Normal population. Test whether the population mean is 55. Use
the 5 % level of significance. t0.05 = 2.776 for 4 df
47, 49, 63, 45, 53
$H_0: \mu = \mu_0$ (= 55); $H_1: \mu \neq \mu_0$ (≠ 55)
|t| = 1.13. This is less than the table value for the 5 % level, 2.776. Hence H0 is accepted.
Inference: The population mean may be taken as 55.

Example
It was observed from a random sample of size 5 that the mean was 51.4 mm and the SD was 6.37
mm. Test whether the mean of the population is 55 mm. (t0.05 = 2.78 for 4 df)
$\bar{x}$ = 51.4, S = 6.37, n = 5, $\mu_0$ = 55
$H_0: \mu = \mu_0$ (55 mm); $H_1: \mu \neq \mu_0$

$t = \dfrac{51.4 - 55}{6.37} \times \sqrt{4} = -1.13$; |t| = 1.13; df = 4; |t| < 2.78
Hence H0 is accepted. It can be concluded that the mean of the population may be taken as 55 mm.
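
When only the summary figures are given, as in this example, the same formula can be applied directly. The sketch below assumes that the quoted SD was computed with divisor n, which is the convention this example uses; the function name `t_from_summary` is hypothetical.

```python
import math

def t_from_summary(mean, sd_n, n, mu0):
    """t statistic from a sample mean and an SD computed with divisor n."""
    return (mean - mu0) / (sd_n / math.sqrt(n - 1))

t = t_from_summary(mean=51.4, sd_n=6.37, n=5, mu0=55)

table_value = 2.78                  # t0.05 for 4 df
accept_h0 = abs(t) < table_value    # two tailed test: compare |t| with the table value

print(f"t = {t:.2f}, accept H0: {accept_h0}")
```

The statistic is negative because the sample mean lies below 55; for a two tailed test only its magnitude is compared with the table value.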
Example
A certain type of rat shows a mean gain in weight of 6.5 gm during the first 6 months of life. Twelve rats
were fed a particular diet from birth until they reached 6 months. The following weight gains were
found in them.
5.5, 6.2, 5.4, 5.8, 6.4, 6.0, 6.2, 5.9, 6.7, 6.2, 6.1, 6.5
Is there any reason to believe that the diet causes a decrease from the normal gain? Test using the 5 % level of
significance. (t0.10 = 1.796 for 11 df)
$H_0: \mu = \mu_0$ (6.5 gm); $H_1: \mu < \mu_0$

t = −3.833. |t| = 3.833 is greater than the $t_{2\alpha}$ value 1.796, and t is negative as H1 requires. Hence H0 is rejected.
Inference: The diet causes a decrease in weight gain.
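
The decision rule for one tailed and two tailed tests can be written as a small helper. This is a sketch only: the function `decide` and its `alternative` labels are names introduced here for illustration. It is applied to the weight gain data above.

```python
import math
import statistics

def decide(t, table_value, alternative):
    """Return True when H0 is rejected.

    alternative: "two-sided" (H1: mu != mu0), "less" (H1: mu < mu0)
    or "greater" (H1: mu > mu0). table_value is the positive value read
    from the t table (t_alpha for two tailed, t_2alpha for one tailed).
    """
    if alternative == "two-sided":
        return abs(t) > table_value
    if alternative == "less":
        return t < -table_value
    if alternative == "greater":
        return t > table_value
    raise ValueError(f"unknown alternative: {alternative!r}")

gains = [5.5, 6.2, 5.4, 5.8, 6.4, 6.0, 6.2, 5.9, 6.7, 6.2, 6.1, 6.5]
mu0 = 6.5
n = len(gains)

# SD with divisor n, so the denominator uses sqrt(n - 1).
t = (statistics.fmean(gains) - mu0) / (statistics.pstdev(gains) / math.sqrt(n - 1))

reject_h0 = decide(t, table_value=1.796, alternative="less")
print(f"t = {t:.3f}, reject H0: {reject_h0}")
```

For a one tailed test the direction matters: a t of +3.833 would not support H1: μ < μ0, even though its magnitude exceeds the table value.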

Exercises
1. A random sample of size 15 from a normal population yielded a mean value of 2.23 and a
variance of 7.33. Does this support the hypothesis that the population mean is ‘zero’? (t0.05 =
2.145 for 14 df)
2. Nine students are chosen at random from a group and their heights are found to be 63, 63, 64,
65, 69, 69, 70, 70, 71. Test whether the mean height may be taken to be 65. (t0.05 = 2.306 for 8
df)
3. A health survey revealed that the serum protein value of children is 7.0 g/100 ml. A group of 16
children showed a mean serum value of 7.4 g/100 ml with SD 0.56. Test the sample mean against
the population mean. (t0.05 = 2.131 for 15 df)
4. The weights of ten objects were observed to be 63, 63, 64, 65, 66, 69, 69, 70, 70, 71. Test whether the
population mean is 66. (t0.05 = 2.262 for 9 df)

Paired ‘t’ test

A paired t-test is used to compare two population means where we have two samples in which
observations in one sample are paired with observations in the other sample. Examples of this are:

• Before-and-after observations on the same subjects (e.g. students’ diagnostic test results before and
after a module or course).

• A comparison of two different methods of measurement or two different treatments where the
measurements/treatments are applied to the same subjects (e.g. blood pressure measurements using a
stethoscope and a Dinamap).

Procedure for carrying out a paired t-test

Suppose a sample of ‘n’ students were given a test before studying a module and then again after
completing the module. We want to find out whether, in general, the teaching leads to improvements in
students’ knowledge/skills (i.e. test scores). We can use the results from our sample of students to draw
conclusions about the impact of this module in general. Let B = test score before the module, A = test
score after the module.

Let ‘x’ denote the differences. Take x = Value A – Value B if ‘A’ values are generally greater than ‘B’ values.

On the other hand, if ‘B’ values are generally greater than ‘A’ values, take x = Value B – Value A.
[ Note this ]

The hypotheses are set as: $H_0: \mu_x = 0$; $H_1: \mu_x \neq 0$

The procedure is as follows:

1. Calculate the difference (x = A − B) OR ( x = B − A) between the two observations on each pair, making
sure to distinguish between positive and negative differences [ positive values taken as positive &
negative values taken as negative values ].

2. Calculate the mean difference, $\bar{x}$.

3. Calculate the standard deviation of the differences, ‘S’.

4. Calculate the t-statistic as

$t = \dfrac{\bar{x} - 0}{S / \sqrt{n-1}}$; df = n − 1

IN THE NUMERATOR, SUBSTITUTE THE VALUE OF $\bar{x}$. THE POPULATION VALUE IS ALWAYS “ZERO”.


Use tables of the t-distribution to compare the calculated value of ‘t’ with t0.05 or t0.01 for assessing
significance.

Paired ‘t’ test Examples and Exercises

Example
A group of 10 children were tested to find the number of digits they could memorize and repeat
after hearing them once. A test was given initially and their performance was noted. The children
were then given special training and the test was repeated after a week. The performance was again
noted. Is there any significant effect due to the training? t0.05 = 2.262 for 9 df
Test 1: 6 5 4 7 8 6 7 5 6 8
Test 2: 7 7 6 7 9 6 8 6 6 10
$H_0: \mu_x = 0$; $H_1: \mu_x \neq 0$

The differences ‘x’ (Test 2 − Test 1) are 1, 2, 2, 0, 1, 0, 1, 1, 0, 2


$\bar{x}$ = 1 and S = 0.775

$t = \dfrac{\bar{x} - 0}{S / \sqrt{n-1}} = \dfrac{1}{0.775} \times \sqrt{9} = 3.87$ [ the absolute value of t is used ]
t0.05 = 2.262. ‘t’ is > t0.05 and hence the null hypothesis H0 is rejected.
Inference: The training is effective.
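
The paired procedure reduces to a one sample test on the differences, which a short script makes clear. This is a minimal sketch with illustrative variable names, using the scores from this example and the SD with divisor n.

```python
import math
import statistics

test1 = [6, 5, 4, 7, 8, 6, 7, 5, 6, 8]    # before training
test2 = [7, 7, 6, 7, 9, 6, 8, 6, 6, 10]   # after training

# Step 1: difference on each pair, keeping the sign.
diffs = [after - before for before, after in zip(test1, test2)]
n = len(diffs)

# Steps 2 and 3: mean and SD (divisor n) of the differences.
mean_diff = statistics.fmean(diffs)
s_n = statistics.pstdev(diffs)

# Step 4: t statistic; the hypothesised mean difference is zero.
t = (mean_diff - 0) / (s_n / math.sqrt(n - 1))

table_value = 2.262                 # t0.05 for 9 df
reject_h0 = abs(t) > table_value

print(f"mean difference = {mean_diff:.2f}, S = {s_n:.3f}, t = {t:.2f}, reject H0: {reject_h0}")
```

Only the ten differences enter the calculation; the original scores are not used again once the differences are formed.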

Example
A sleeping drug and a neutral control preparation were each administered to 10 patients selected from a
hospital. The numbers of hours they slept are given below. Test
whether the drug has any greater effect than the control preparation. Use the 5 % level of significance. t0.05
= 2.26 for 9 df
Drug: 10.6 7.5 9 5.4 6.1 10.2 7.1 9.7 8.4 7.9
Control: 8.6 7.3 9.4 5.1 5.4 9 6.5 7.9 8.7 6.9
$H_0: \mu_x = 0$; $H_1: \mu_x \neq 0$; t = 2.78. t0.05 = 2.26. df = 9. ‘t’ is > t0.05. Hence H0 is rejected.

Inference: The drug is better than the control preparation.
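
The intermediate steps behind t = 2.78 are not written out in this example. Taking x = Drug − Control (an assumption about the direction of differencing, chosen because the drug values are generally larger) and the SD with divisor n, they can be reconstructed as:

```latex
x:\; 2.0,\ 0.2,\ -0.4,\ 0.3,\ 0.7,\ 1.2,\ 0.6,\ 1.8,\ -0.3,\ 1.0
\qquad \sum x = 7.1, \quad \sum x^2 = 10.91

\bar{x} = \frac{7.1}{10} = 0.71, \qquad
S = \sqrt{\frac{10.91 - (7.1)^2/10}{10}} = \sqrt{0.5869} \approx 0.766

t = \frac{\bar{x} - 0}{S/\sqrt{n-1}} = \frac{0.71}{0.766} \times \sqrt{9} \approx 2.78
```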


Exercises
1. The scores in the 1st and 5th tests are given below. Before the 5th test, coaching was given to
the participants. Test whether there is improvement in the scores. t0.05 = 2.262 for 9 df.

1st test: 50 52 53 60 65 67 48 69 72 80
5th test: 65 55 65 65 60 67 49 82 74 86
2. A weight reduction teaching programme was conducted. The weights of 10 people before and after it were

Before: 86 92 100 93 88 80 88 92 95 106


After: 77 84 92 87 80 74 80 85 95 96
Test whether there is improvement in the weights. t0.05 = 2.262 for 9 df.

Hypothesis testing procedure / Significance test Procedure – ESSAY QUESTION

It is a method for testing a claim about a population parameter, using data measured in a sample.
Through this, we test a hypothesis by determining how likely it is that the observed sample statistic could
have been obtained from the population if the hypothesis were true.
The method of hypothesis testing can be summarized in five steps.
Step 1: State the hypotheses.
Step 2: Set the criteria for a decision.
Step 3: Decide the appropriate test criterion.
Step 4: Calculate the value of the test statistic using the appropriate formula.
Step 5: Take a decision.

Step 1:

Stating the hypotheses

We begin by stating the value of a population mean in a null hypothesis, which we presume is true. The
null hypothesis (H0) is a statement about a population parameter, such as the population mean, that is
assumed to be true.

We will test whether the value stated in the null hypothesis is likely to be true or it is contradicted. That
is, we test whether the parameter is less than, greater than, or not equal to the value stated in the null
hypothesis.
One of these will be stated in the alternate hypothesis. The alternative hypothesis is thus a rival
hypothesis which states what we think is wrong about the null hypothesis.

Step 2:

Setting the criteria for a decision

To set the criteria for a decision, we state the level of significance for the test. This level is typically set
at 5% [ 0.05] or 1 % [ 0.01], in research studies.

Step 3:

Decide the test criterion

The test statistic is a mathematical formula that allows researchers to determine how likely the observed
sample outcome would be if the null hypothesis were true.

The value of the test statistic is used to decide regarding the null hypothesis.

The common test criteria are

Z test
‘t’ test
Chi square test
F test

Step 4:

Obtaining the value of the test statistic based on the selected test criterion.

Step 5:

Take a decision

For this the value of the statistic is compared with the corresponding value in the standard statistical
table.

If the calculated statistic value is LESS THAN OR EQUAL to the table value, the null hypothesis is accepted.

If the calculated statistic value is GREATER THAN the table value, the null hypothesis is rejected.
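
The five steps can be traced in a short script. The sketch below applies them to the five observations from the earlier university question (testing whether the population mean is 55), with the SD computed using divisor n as elsewhere in these notes. Variable names are illustrative.

```python
import math
import statistics

# Step 1: state the hypotheses. H0: mu = 55, H1: mu != 55 (two tailed).
mu0 = 55
sample = [47, 49, 63, 45, 53]

# Step 2: set the criterion for a decision (level of significance).
alpha = 0.05

# Step 3: decide the test criterion. n < 30 and sigma is unknown, so a 't' test.
n = len(sample)
df = n - 1
table_value = 2.776   # t0.05 for 4 df, read from the t table

# Step 4: calculate the value of the test statistic.
mean = statistics.fmean(sample)
s_n = statistics.pstdev(sample)              # SD with divisor n
t = (mean - mu0) / (s_n / math.sqrt(n - 1))

# Step 5: take a decision by comparing |t| with the table value.
decision = "accept H0" if abs(t) <= table_value else "reject H0"

print(f"alpha = {alpha}, df = {df}, t = {t:.2f}, decision: {decision}")
```

The table value is entered by hand here, just as it would be read from a printed t table.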

Parametric and Non - Parametric Methods in Statistical analysis

Parametric methods deal with the estimation of population parameters (like the mean).

The non-parametric methods are distribution free methods. They rely on ordering (ranking) of
observations.

More specifically, the distribution of the data is important in the choice between parametric and
non-parametric procedures.

If the populations can be assumed to be normally distributed, then parametric methods can be used.

If we are not sure or we suspect that they do not behave normally, then we use non-parametric
methods.

Similarly, the scale of the data is important.

Categorical (nominal) or ordinal scale data demand the use of non-parametric methods.

Even for interval and ratio scale data, non-parametric methods must be used where population normality
cannot be assumed. The reverse does not hold: parametric methods cannot be applied to nominal or ordinal data.
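
The selection rule described above can be summarized as a small function. This is only a sketch of the rule as stated in these notes; the function name `choose_method` and the scale labels are assumptions made for illustration.

```python
def choose_method(scale: str, normality_assumed: bool) -> str:
    """Pick a family of methods from the measurement scale and normality."""
    scale = scale.lower()
    if scale in {"nominal", "ordinal"}:
        # Categorical or ranked data always call for non-parametric methods.
        return "non-parametric"
    if scale in {"interval", "ratio"}:
        # Parametric methods only when normality can be assumed.
        return "parametric" if normality_assumed else "non-parametric"
    raise ValueError(f"unknown measurement scale: {scale!r}")

print(choose_method("ratio", normality_assumed=True))
print(choose_method("ratio", normality_assumed=False))
print(choose_method("ordinal", normality_assumed=True))
```

Note that the normality flag is ignored for nominal and ordinal data, which reflects the statement that the rule does not work in reverse.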

A few common parametric studies used are:

Correlation and regression studies

1. Testing Correlation Coefficient

2. Testing Regression coefficient

For Large Samples [‘n’ ≥ 30]:

1. Comparison of Population mean and sample mean – ‘Z’ test

2. Comparison of Means of two independent samples – ‘Z’ test

3. Paired test for testing difference between two related samples – ‘Z’ test

4. Comparison of Population proportion and sample proportion – ‘Z’ test

5. Comparison of proportions of two independent samples – ‘Z’ test

For Small samples [‘n’ < 30]:

1. Comparison of Population mean and sample mean – ‘t’ test

2. Comparison of Means of two independent samples – ‘t’ test

3. Paired ‘t’ test for testing difference between two related samples.

‘F’ test:

1. Testing Variances – one sample and two samples

2. Analysis of Variance - ANOVA tests.

3. Multivariate analysis of Variance - MANOVA tests

4. Analysis of Covariance – ANCOVA tests

A few common non-parametric studies used are:

1. Wilcoxon rank-sum test tests the difference between two independent samples

2. Wilcoxon signed-rank test tests the difference between two related samples

3. Sign test tests whether two related samples are different

4. Median test

5. Chi-square test

6. Kruskal Wallis test

7. Mann-Whitney test

8. Kolmogorov-Smirnov test

9. McNemar tests

Statistical Packages

Statistical packages are collections of software designed to aid in statistical analysis. Most
quantitative and statistical analysis relies upon statistical packages for its execution. An understanding of
statistical packages is essential to the correct and efficient application of many quantitative and statistical
methods.
Some important packages are STATISTICA, SPSS, SAS, MATLAB, Minitab, R software, StatistiXL and
Vassarstat.net.

A. SPSS PACKAGE

SPSS stands for Statistical Package for the Social Sciences. The first version of SPSS was released in 1968.
SPSS is now owned by IBM. IBM SPSS Statistics 21.0 was released in August 2012.
With SPSS we can analyse data in three basic ways.

Describe data using descriptive statistics.
Examine relationships between variables.
Compare groups to determine whether there are significant differences between them, for example using a t-test or
ANOVA.
There are two different windows in SPSS.
Data Editor Window: In the data editor we can create variables, enter data and carry out statistical
functions.
Output Viewer Window: The output window shows the results produced by the analyses that have been run.

Prepared By
Mr. Balasubramanian N K
External Faculty, Statistics
