Paired T-Test: A Project Report On
Paired T-Test
● Paired sample t-test is used in ‘before-after’ studies, or when the samples are the
matched pairs, or when it is a case-control study.
● The paired t-test is based on the differences between the values of each pair, that is, one
value subtracted from the other. In the formula for a paired t-test, this difference is denoted
d. The test statistic is the ratio of the sum of the differences to the square root of the
following quantity: n times the sum of the squared differences, minus the square of the sum
of the differences, all divided by n - 1.
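Written out, the formula described above takes the following form. Here d_i is the difference for pair i and n is the number of pairs; the subscript i is notation added for this sketch. The second form shows that it is algebraically the same as dividing the mean difference by its standard error.

```latex
t = \frac{\sum d_i}{\sqrt{\dfrac{n \sum d_i^2 - \left(\sum d_i\right)^2}{n - 1}}}
  = \frac{\bar{d}}{\sqrt{s^2 / n}},
\qquad
s^2 = \frac{\sum \left(d_i - \bar{d}\right)^2}{n - 1},
\qquad
\text{df} = n - 1
```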
For example, if we give training to a company's employees and want to know whether the
training had any impact on their efficiency, we could use the paired sample t-test. We collect
data from each employee on a seven-point rating scale, before and after the training. Using
the paired sample t-test, we can then conclude statistically whether the training has improved
the employees' efficiency. In medicine, the paired sample t-test can help establish whether a
medicine has a significant effect on a particular illness.
Let us assume two paired sets Xi and Yi, for i = 1, 2, …, n, such that the paired differences
Xi - Yi are independent and identically normally distributed. The paired t-test then determines
whether the two sets differ from each other in a significant way.
The paired t-test, also referred to as the paired-samples t-test or dependent t-test, is used to
determine whether the mean of a dependent variable (e.g., weight, anxiety level, salary,
reaction time, etc.) is the same in two related groups (e.g., two groups of participants that are
measured at two different "time points" or who undergo two different "conditions"). For
example, you could use a paired t-test to understand whether there was a difference in
managers' salaries before and after undertaking a PhD (i.e., your dependent variable would be
"salary", and your two related groups would be the two different "time points"; that is,
salaries "before" and "after" undertaking the PhD). Alternately, you could use a paired t-test
to understand whether there was a difference in smokers' daily cigarette consumption 6 weeks
after wearing nicotine patches compared with wearing patches that did not contain nicotine,
known as a "placebo" (i.e., your dependent variable would be "daily cigarette consumption",
and your two related groups would be the two different "conditions" participants were
exposed to; that is, cigarette consumption values after wearing "nicotine patches" (the
treatment group) compared to after wearing the "placebo" (the control group)). Specifically,
you use a paired t-test to determine whether the mean difference between two groups is
statistically significantly different to zero.
Degrees of Freedom under paired t-test
The degrees of freedom (df) of a set of data relate to the number of values in that data.
1. For a paired t-test, df is the number of pairs minus 1.
2. For an independent t-test, df is the number of people in sample 1 plus the number of
people in sample 2 minus 2.
● The 95% confidence interval is derived from the differences between the two sets of paired
observations.
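As a minimal sketch of how such a confidence interval can be obtained from the paired differences, the Python snippet below computes the mean difference plus or minus the table value of t times the standard error. The function name, the five pairs of scores and the constant T_CRIT_DF4 are hypothetical and chosen only for illustration; the critical value is read from a t-table rather than computed.

```python
import math
import statistics


def paired_confidence_interval(before, after, t_critical):
    """Return (lower, upper) for the mean paired difference (after - before)."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    # statistics.stdev is the sample standard deviation (divides by n - 1).
    std_err = statistics.stdev(diffs) / math.sqrt(n)
    margin = t_critical * std_err
    return mean_diff - margin, mean_diff + margin


# Hypothetical scores for 5 pairs (illustrative only, not data from this report).
before = [10, 12, 9, 11, 13]
after = [12, 15, 10, 14, 15]

# Two-tailed 95% table value of t for df = 5 - 1 = 4 (assumed read from a t-table).
T_CRIT_DF4 = 2.776

lower, upper = paired_confidence_interval(before, after, T_CRIT_DF4)
print(f"95% CI for the mean difference: ({lower:.2f}, {upper:.2f})")
```

If the interval does not contain zero, the data are inconsistent with a true mean difference of zero at the 5% level.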
1.3 Hypotheses
Like many statistical procedures, the paired sample t-test has two competing hypotheses, the
null hypothesis and the alternative hypothesis. The null hypothesis assumes that the true
mean difference between the paired samples is zero. Under this model, all observable
differences are explained by random variation. Conversely, the alternative hypothesis
assumes that the true mean difference between the paired samples is not equal to zero. The
alternative hypothesis can take one of several forms depending on the expected outcome. If
the direction of the difference does not matter, a two-tailed hypothesis is used. Otherwise, an
upper-tailed or lower-tailed hypothesis can be used to increase the power of the test. The null
hypothesis remains the same for each type of alternative hypothesis. The paired sample t-test
hypotheses are formally defined below:
1. The null hypothesis (H0) assumes that the true mean difference (μd) is equal to zero.
2. The two-tailed alternative hypothesis (H1) assumes that μd is not equal to zero.
3. The upper-tailed alternative hypothesis (H1) assumes that μd is greater than zero.
4. The lower-tailed alternative hypothesis (H1) assumes that μd is less than zero.
The mathematical representations of the null and alternative hypotheses are defined below:
H0: μd = 0
H1: μd ≠ 0 (two-tailed)
H1: μd > 0 (upper-tailed)
H1: μd < 0 (lower-tailed)
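The three forms of the alternative hypothesis lead to three different rejection rules. The sketch below is one possible way to express them in Python; the function name reject_null and the t statistic of -1.90 are hypothetical, and the critical values for df = 19 are assumed to come from a t-table at the 5% level.

```python
def reject_null(t_stat, critical_value, alternative="two-tailed"):
    """Decide whether H0: mu_d = 0 is rejected.

    critical_value is the positive t-table value for the chosen tail area
    (alpha/2 for a two-tailed test, alpha for a one-tailed test).
    """
    if alternative == "two-tailed":      # H1: mu_d != 0
        return abs(t_stat) > critical_value
    if alternative == "upper-tailed":    # H1: mu_d > 0
        return t_stat > critical_value
    if alternative == "lower-tailed":    # H1: mu_d < 0
        return t_stat < -critical_value
    raise ValueError(f"unknown alternative: {alternative!r}")


# Hypothetical t statistic with df = 19 (illustrative only).
t_stat = -1.90
print(reject_null(t_stat, 2.093, "two-tailed"))    # False
print(reject_null(t_stat, 1.729, "upper-tailed"))  # False
print(reject_null(t_stat, 1.729, "lower-tailed"))  # True
```

The same t statistic is rejected only under the lower-tailed alternative, which illustrates why a one-tailed hypothesis has more power when the direction is known in advance.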
Note. It is important to remember that hypotheses are never about data, they are about the
processes which produce the data. In the formulas above, the value of μd is unknown. The
goal of hypothesis testing is to determine the hypothesis (null or alternative) with which the
data are more consistent.
1.4 STEPS FOR CONDUCTING PAIRED t-Test:
1. Set up hypotheses: We set up two hypotheses. The first is the null hypothesis, which
assumes that the means of the two paired samples are equal. The second is the
alternative hypothesis, which assumes that the means of the two paired samples are not equal.
2. Select the level of significance: After setting up the hypotheses, we choose the level of
significance. In most cases the significance level is set at 5%. In medicine, a stricter level
of 1% is often used because of the critical nature of the decisions involved.
3. Calculate the test statistic: The following formula is used:
\[ t = \frac{\bar{d}}{\sqrt{s^2 / n}} \]
where \bar{d} (d bar) is the mean difference between the two samples, s² is the sample variance
of the differences, n is the number of pairs, and t follows a t-distribution with n - 1 degrees
of freedom.
4. Compare with the table value: If the calculated value (in absolute terms for a two-tailed
test) is greater than the table value, we reject the null hypothesis for the paired sample
t-test. If the calculated value is less than the table value, we fail to reject the null
hypothesis and conclude that there is no significant mean difference between the two paired
samples.
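The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name paired_t_statistic, the six pairs of scores and the constant TABLE_VALUE are hypothetical, and the table value is assumed to be read from a t-table for a two-tailed test at the 5% level with df = 5.

```python
import math


def paired_t_statistic(before, after):
    """Return (t, df) for paired samples, with differences taken as after - before."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    d_bar = sum(diffs) / n                                       # mean difference
    s_squared = sum((d - d_bar) ** 2 for d in diffs) / (n - 1)   # sample variance
    t = d_bar / math.sqrt(s_squared / n)                         # test statistic
    return t, n - 1


# Hypothetical scores for 6 pairs (illustrative only, not data from this report).
before = [72, 65, 80, 58, 77, 69]
after = [75, 70, 79, 64, 82, 73]

t, df = paired_t_statistic(before, after)

# Assumed two-tailed 5% table value of t for df = 5.
TABLE_VALUE = 2.571

decision = "reject H0" if abs(t) > TABLE_VALUE else "fail to reject H0"
print(f"t = {t:.2f}, df = {df}, decision: {decision}")
```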
Assumptions for paired t-test:
1. The dependent variable should be measured at the interval or ratio level (i.e., it is
continuous).
7. There should be no significant outliers in the differences between the two related groups.
The data needs to be entered in SPSS in two columns, where one column indicates the pre-mark
and the other has the post-mark – see right (a third column provides the case identity
numbers). For the paired samples t-test to be valid the differences between the paired values
should be approximately normally distributed. To calculate the differences between pre- and
post-marks, from the Data Editor in SPSS, choose Transform - Compute Variable and
complete the boxes as shown below right.
The normality of Diff should first be checked; see the Checking normality for parametric tests
worksheet.
There is no evidence for us to suspect that the data is not normally distributed.
3. Running the paired samples t-test
• Select the two paired variables as the Paired Variables, selecting the after variable first
(Post), followed by the before variable (Pre) as shown below and click OK.
1.5 Assumptions
There are four "assumptions" that underpin the paired t-test. If any of these four assumptions
are not met, you cannot analyse your data using a paired t-test because you will not get a
valid result. Since assumptions #1 and #2 relate to your study design and choice of variables,
they cannot be tested for using Stata. However, you should decide whether your study meets
these assumptions before moving on.
Fortunately, you can check assumptions #3 and #4 using Stata. When moving on to
assumptions #3 and #4, we suggest testing them in this order because it represents an order
where, if a violation of the assumption is not correctable, you will no longer be able to use a
paired t-test. In fact, do not be surprised if your data fails one or more of these assumptions
since this is fairly typical when working with real-world data rather than textbook examples,
which often only show you how to carry out a paired t-test when everything goes well.
However, don’t worry because even when your data fails certain assumptions, there is often a
solution to overcome this (e.g., transforming your data or using another statistical test
instead). Just remember that if you do not check that your data meets these assumptions or
you test for them incorrectly, the results you get when running a paired t-test might not be
valid.
o Assumption #3: There should be no significant outliers in the differences between
the two related groups. An outlier is simply a single data point within your data that
does not follow the usual pattern (e.g., in a study of 100 students' IQ scores, where the
mean score was 108 with only a small variation between students, one student had a
score of 156, which is very unusual, and may even put her in the top 1% of IQ scores
globally). The problem with outliers is that they can have a negative effect on the
paired t-test, distorting the differences between the two related groups (whether
increasing or decreasing the scores on the dependent variable), which reduces the
accuracy of your results. In addition, they can affect the statistical significance of the
test. Fortunately, when using Stata to run a paired t-test on your data, you can easily
detect possible outliers.
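Outside Stata, one common screening rule flags any difference lying more than 1.5 times the interquartile range beyond the quartiles (the rule used for boxplot whiskers). The sketch below applies that rule in Python. It is a substitute technique shown for illustration, not the Stata procedure referred to above, and the function name iqr_outliers and the sample differences are hypothetical.

```python
import statistics


def iqr_outliers(diffs, k=1.5):
    """Return the differences lying more than k * IQR outside the quartiles."""
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, _median, q3 = statistics.quantiles(diffs, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [d for d in diffs if d < low or d > high]


# Hypothetical paired differences (illustrative only); 30 is far from the rest.
diffs = [4, 6, 5, 7, 3, 5, 6, 4, 5, 30]
print(iqr_outliers(diffs))  # [30]
```

A flagged value is only a candidate outlier; whether to keep, correct or remove it remains a judgment about the study itself.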
In practice, checking for assumptions #3 and #4 will probably take up most of your time
when carrying out a paired t-test. However, it is not a difficult task, and Stata provides all the
tools you need to do this.
CHAPTER 2 - COMPANY PROFILE
Jennex was established in 1996 with a mission to provide global quality stone building-products
for the international construction market. The product spectrum of Jennex includes natural
stone, marble, granite, sandstone, limestone, slate, slabs, tiles, mosaic, countertops, and cut to
size stone products.
The corporate office of Jennex is located in New Delhi, the Indian national capital. Its
associate companies are Jennex Granites Industries Pvt. Ltd., Jennex International Exports,
Stonex, Jennex International and Jennex Rocks and Minerals Pvt. Ltd. Jennex has
established purely export-oriented manufacturing units equipped with state-of-the-art stone
works technology at Hosur in the state of Tamil Nadu. Jennex brings together the latest
technology and impeccable professionalism and dedication to quality.
Under the leadership of Mr. Yogesh Anand, Jennex has grown steadily since its inception and
is a reputed name among Indian stone exporters today. It has a strong and growing clientele
across the UK, the USA, Canada, South Africa, Australia, the Middle East and the Near East. A
regular recipient of the (CAPEXIL) Chemical and Allied Products Exports Promotion
Council award for excellence, Jennex has become a hallmark of quality stone products across
the globe.
At Jennex, we focus on two areas: first, on our customers, and second, on continued
investment across our businesses and workforce in order to maximize productivity and
sustain our product quality at all times.
As one of the prominent developers and exporters of marbles, granites and natural stones, our
priority is to ensure that we continue to provide existing and future customers with the
service and support that is required. To us, Customer Service is not a department, but an
attitude!
Current status of the company: Active
The company's products conform to various international standards (BS1139, EN74, ANZ, JIS
and ARAMCO), depending on customer requirements.
Company Factsheet
Nature of business: Manufacturer
Legal status of firm: Private Ltd. Co. Registered under Companies Act 1956
The company has established a presence in several countries, covering markets in the Middle
East, the United States, Canada, Singapore and India.
Noida-201305, (U.P.)
CHAPTER 3 - HYPOTHETICAL PROBLEM
We collected data from the company, which has around 100 employees in total, and selected a
random sample of 20 people.
Suppose that, according to the company, an efficient employee is one who can perform an
assigned task in the minimum time. We therefore took one or two people from each of various
teams and recorded the observations given to us. The company judges efficiency by the
number of marble tile packs or mixtures a person makes in one hour.
We compared the efficiency of the workers before and after the training, as follows:
BEFORE TRAINING    AFTER TRAINING
45 70
60 75
28 50
55 72
61 70
48 52
62 65
70 78
58 55
34 52
15 40
30 40
18 20
26 52
66 80
72 98
50 62
39 50
61 91
28 60
Solution:
Step I:
H0: The company claims that its workers are already trained, so the training makes no
difference to their efficiency (μd = 0).
H1: The training changes the efficiency of the workers (μd ≠ 0).
Step II:
(BT) (AT) D = (AT - BT) (D - d̄) (D - d̄)²
45 70 25 9.7 94.09
60 75 15 -0.3 0.09
28 50 22 6.7 44.89
55 72 17 1.7 2.89
61 70 9 -6.3 39.69
48 52 4 -11.3 127.69
62 65 3 -12.3 151.29
70 78 8 -7.3 53.29
58 55 -3 -18.3 334.89
34 52 18 2.7 7.29
15 40 25 9.7 94.09
30 40 10 -5.3 28.09
18 20 2 -13.3 176.89
26 52 26 10.7 114.49
66 80 14 -1.3 1.69
72 98 26 10.7 114.49
50 62 12 -3.3 10.89
39 50 11 -4.3 18.49
61 91 30 14.7 216.09
28 60 32 16.7 278.89
Total: ΣD = 306, Σ(D - d̄)² = 1910.2
α = 5%
Step III:
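\[ \bar{d} = \frac{\sum D}{n} = \frac{306}{20} = 15.3 \]
\[ s = \sqrt{\frac{\sum (D - \bar{d})^2}{n - 1}} = \sqrt{\frac{1910.2}{19}} \approx 10.03 \]
\[ \text{S.E.} = \frac{s}{\sqrt{n}} = \frac{10.03}{4.47} \approx 2.24 \]
\[ t = \frac{\bar{d}}{\text{S.E.}} = \frac{15.3}{2.24} \approx 6.83 \]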
α/2 = 0.025 (area in each tail for a two-tailed test at the 5% level)
df = n - 1 = 19
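Table value: for df = 19 and a tail area of 0.025, the table value of t is 2.093.
The calculated value (6.83) is greater than the table value, so the null hypothesis is rejected.
Hence, we can say that the efficiency of workers has improved after the training, and the
claim of the company is not supported.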
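The hand calculation in Step III can be cross-checked with a short Python script. This is a sketch using only the standard library, with the BT and AT values copied from the Step II table; the variable names are our own. Carried at full precision the statistic comes out at about 6.82, marginally below the hand-computed 6.83, because the manual working rounds the standard error to 2.24 before dividing.

```python
import math
import statistics

# Before-training (BT) and after-training (AT) values from the Step II table.
bt = [45, 60, 28, 55, 61, 48, 62, 70, 58, 34,
      15, 30, 18, 26, 66, 72, 50, 39, 61, 28]
at = [70, 75, 50, 72, 70, 52, 65, 78, 55, 52,
      40, 40, 20, 52, 80, 98, 62, 50, 91, 60]

diffs = [a - b for b, a in zip(bt, at)]   # D = AT - BT
n = len(diffs)
d_bar = statistics.mean(diffs)            # mean difference
s = statistics.stdev(diffs)               # sample SD of differences (n - 1)
se = s / math.sqrt(n)                     # standard error of the mean difference
t = d_bar / se                            # paired t statistic, df = n - 1

print(f"sum(D) = {sum(diffs)}, d_bar = {d_bar:.1f}")
print(f"s = {s:.2f}, S.E. = {se:.2f}, t = {t:.2f}, df = {n - 1}")
```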
CALCULATION USING SPSS: -
CHAPTER 4 - LEARNING AND SUGGESTION
Since the null hypothesis was rejected in favour of the alternative hypothesis, we conclude
that training has an impact on the performance of employees.
Further, the positive value of "d bar" indicates that the change is positive, so we can say
that the efficiency of employees improved after the training was provided.
The company should focus on providing training to workers after employing them, instead of
relying on hiring trained workers, so that the efficiency of workers can be increased
regularly. The workers could then pack at least 100 pallets in one hour.