
Paired T-Test: A Project Report On

The document is a project report on conducting a paired t-test. It discusses key topics such as: 1) What a paired t-test is and when it is used, such as to compare means from before-after studies with matched pairs. 2) The assumptions of a paired t-test, including that the differences between pairs are independent and normally distributed. 3) The steps for conducting a paired t-test, including setting hypotheses, selecting a significance level, calculating the test statistic, and making decisions based on comparing the statistic to critical values.

Uploaded by

Tarun kumar

A Project Report

On

Paired T-Test

Submitted To: Dr Sumeet Singh Jasial

Submitted By:

1. Vaibhav Nagpal (38)

2. Tarun Kumar (28)

3. Sherin Alex (47)

4. Sourav Das (26)


CHAPTER 1-INTRODUCTION

1.1 Paired T-Test


● Paired sample t-test is a statistical technique that is used to compare two population
means in the case of two samples that are correlated.

● Paired sample t-test is used in ‘before-after’ studies, or when the samples are the
matched pairs, or when it is a case-control study.

● The paired t-test is based on the difference between the two values of each pair, that is,
one value subtracted from the other. In the formula for a paired t-test, this difference is
denoted d. The test statistic is the ratio of the sum of the differences to the square root
of the following quantity: n times the sum of the squared differences, minus the square of
the sum of the differences, all divided by n - 1.
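
The verbal formula above can be checked against the more familiar form, in which t is the mean difference divided by its standard error. The derivation below is a sketch using the standard definitions of the mean and the sample variance of the differences; the symbols \bar{d} (mean difference) and s^2 (sample variance of the differences) are assumed here and are only introduced later in the report.

```latex
\begin{align*}
\bar{d} &= \frac{\sum d}{n}, \qquad
s^2 = \frac{\sum d^2 - \frac{(\sum d)^2}{n}}{n-1}
    = \frac{n\sum d^2 - (\sum d)^2}{n(n-1)} \\[4pt]
t &= \frac{\bar{d}}{\sqrt{s^2/n}}
   = \frac{\sum d / n}{\sqrt{\dfrac{n\sum d^2 - (\sum d)^2}{n^2(n-1)}}}
   = \frac{\sum d}{\sqrt{\dfrac{n\sum d^2 - (\sum d)^2}{n-1}}}
\end{align*}
```

Both forms therefore give the same value of t; the second one is simply convenient when only the sums of d and d squared are available.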

For example, if a company trains its employees and we want to know whether or not the
training had any impact on their efficiency, we could use the paired sample t-test. We
collect data from each employee on a seven-point rating scale, before the training and after
the training. Using the paired sample t-test, we can then test statistically whether or not
the training has improved the efficiency of the employees. In medicine, the paired sample
t-test can be used to assess whether a medicine changes a measured outcome in the same
patients before and after treatment.

Let us assume two paired sets Xi and Yi for i = 1, 2, …, n such that their paired differences
are independent and identically normally distributed. Then the paired t-test determines
whether the two sets differ from each other in a significant way.

The paired t-test, also referred to as the paired-samples t-test or dependent t-test, is used to
determine whether the mean of a dependent variable (e.g., weight, anxiety level, salary,
reaction time, etc.) is the same in two related groups (e.g., two groups of participants that are
measured at two different "time points" or who undergo two different "conditions"). For
example, you could use a paired t-test to understand whether there was a difference in
managers' salaries before and after undertaking a PhD (i.e., your dependent variable would be
"salary", and your two related groups would be the two different "time points"; that is,
salaries "before" and "after" undertaking the PhD). Alternately, you could use a paired t-test
to understand whether there was a difference in smokers' daily cigarette consumption 6 weeks
after wearing nicotine patches compared with wearing patches that did not contain nicotine,
known as a "placebo" (i.e., your dependent variable would be "daily cigarette consumption",
and your two related groups would be the two different "conditions" participants were
exposed to; that is, cigarette consumption values after wearing "nicotine patches" (the
treatment group) compared to after wearing the "placebo" (the control group)). Specifically,
you use a paired t-test to determine whether the mean difference between two groups is
statistically significantly different to zero.
Degrees of Freedom under paired t-test
The degrees of freedom (df) of a set of data are the number of values that remain free to
vary once the quantities estimated from the data (here, the mean) have been fixed.

1. For a paired t-test, df is the number of pairs minus 1.

2. For an independent t-test, df is the number of people in sample 1 plus the number of
people in sample 2, minus 2.
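
As a small illustration of the two rules above, the following Python sketch computes the degrees of freedom for each test. The function names are made up for this example and do not come from any library.

```python
def paired_df(n_pairs: int) -> int:
    """Degrees of freedom for a paired t-test: number of pairs minus 1."""
    if n_pairs < 2:
        raise ValueError("a paired t-test needs at least two pairs")
    return n_pairs - 1


def independent_df(n1: int, n2: int) -> int:
    """Degrees of freedom for an independent t-test: n1 + n2 - 2."""
    if n1 < 1 or n2 < 1 or n1 + n2 < 3:
        raise ValueError("the two samples are too small for a t-test")
    return n1 + n2 - 2


print(paired_df(20))           # 20 pairs -> 19
print(independent_df(20, 20))  # two samples of 20 -> 38
```

With the same 40 measurements, the paired design has 19 degrees of freedom and the independent design has 38, because the paired test works on the 20 differences rather than on the 40 raw values.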

Level of Significance for Paired t-test


The level of significance, usually denoted by ‘alpha’, is specified before the samples are
drawn, so that the results obtained should not influence the choice of the decision maker. The
level of significance is specified in terms of the likelihood (probability) of rejecting a null
hypothesis when it is true, i.e. it is the risk that a decision maker takes of rejecting the null
hypothesis when it is really true.

Usually alpha = .05 is used for consumer research projects, .01 for quality assurance, and
.10 for political polling.
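
The meaning of alpha as "the probability of rejecting a true null hypothesis" can be illustrated with a short simulation. This is only a sketch under stated assumptions: the differences are drawn from a normal distribution with mean 0 (so the null hypothesis is true by construction), each sample has 20 pairs, and the two-tailed 5% table value for df = 19 (2.093) is hard-coded. All names are hypothetical.

```python
import math
import random
import statistics

T_CRIT = 2.093   # two-tailed 5% table value for df = 19
N_PAIRS = 20
N_TRIALS = 2000


def paired_t(diffs):
    """t statistic for a list of paired differences."""
    mean_d = statistics.mean(diffs)
    se = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return mean_d / se


def false_rejection_rate(seed=1):
    """Share of simulated samples in which a true null hypothesis is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(N_TRIALS):
        # The true mean difference is 0, so the null hypothesis holds.
        diffs = [rng.gauss(0.0, 10.0) for _ in range(N_PAIRS)]
        if abs(paired_t(diffs)) > T_CRIT:
            rejections += 1
    return rejections / N_TRIALS


rate = false_rejection_rate()
print(f"false rejection rate: {rate:.3f}")  # expected to be close to 0.05
```

The printed rate should land near the chosen alpha of .05, which is the risk the decision maker accepts before looking at the data.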


1.2 Paired versus Unpaired T-Test
Paired and unpaired t-tests are similar in that both assume the data come from a normal
distribution.

Characteristics of Unpaired T-Test:


● The two groups taken should be independent.

● The sample size of the two groups need not be equal.

● It compares the mean of the data of the two groups.

● 95% confidence interval for the mean difference is calculated.

Characteristics of Paired T-Test:


● The data is taken from subjects who have been measured twice.

● 95% confidence interval is derived from the difference between the two sets of paired
observations.
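
For the paired case, the confidence interval is built from the differences alone: mean difference plus or minus the table value times the standard error. The sketch below shows one way to compute it with the Python standard library. The function name and the five pairs of scores are made up for illustration, and the table value must be looked up for n - 1 degrees of freedom.

```python
import math
import statistics


def mean_difference_ci(before, after, t_crit):
    """Confidence interval for the mean of the paired differences.

    t_crit is the two-tailed table value for n - 1 degrees of freedom,
    e.g. 2.093 for a 95% interval with 20 pairs.
    """
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for b, a in zip(before, after)]
    mean_d = statistics.mean(diffs)
    se = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return mean_d - t_crit * se, mean_d + t_crit * se


# Made-up scores for five subjects measured twice (illustration only).
before = [10, 12, 9, 14, 11]
after = [13, 14, 10, 17, 12]
low, high = mean_difference_ci(before, after, t_crit=2.776)  # df = 4
print(f"95% CI for the mean difference: ({low:.2f}, {high:.2f})")
```

If the interval does not contain zero, the two-tailed paired t-test at the same level rejects the null hypothesis.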

1.3 Hypotheses
Like many statistical procedures, the paired sample t-test has two competing hypotheses, the
null hypothesis and the alternative hypothesis. The null hypothesis assumes that the true
mean difference between the paired samples is zero. Under this model, all observable
differences are explained by random variation. Conversely, the alternative hypothesis
assumes that the true mean difference between the paired samples is not equal to zero. The
alternative hypothesis can take one of several forms depending on the expected outcome. If
the direction of the difference does not matter, a two-tailed hypothesis is used. Otherwise, an
upper-tailed or lower-tailed hypothesis can be used to increase the power of the test. The null
hypothesis remains the same for each type of alternative hypothesis. The paired sample t-test
hypotheses are formally defined below:
1. The null hypothesis (H0) assumes that the true mean difference (μd) is equal to zero.
2. The two-tailed alternative hypothesis (H1) assumes that μd is not equal to zero.
3. The upper-tailed alternative hypothesis (H1) assumes that μd is greater than zero.
4. The lower-tailed alternative hypothesis (H1) assumes that μd is less than zero.
The mathematical representations of the null and alternative hypotheses are defined below:
 H0: μd = 0
 H1: μd ≠ 0 (two-tailed)
 H1: μd > 0 (upper-tailed)
 H1: μd < 0 (lower-tailed)
Note. It is important to remember that hypotheses are never about the data; they are about
the processes which produce the data. In the formulas above, the value of μd is unknown. The
goal of hypothesis testing is to determine the hypothesis (null or alternative) with which the
data are more consistent.
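
The three forms of the alternative hypothesis lead to three different rejection rules. A minimal Python sketch of these rules is given below; the function name and the string labels are hypothetical, and the caller is assumed to supply the table value that matches the chosen alternative.

```python
def reject_null(t_stat, t_crit, alternative="two-tailed"):
    """Return True if H0: mu_d = 0 is rejected.

    t_crit must be a positive table value matching the alternative:
    the two-tailed value for "two-tailed", the one-tailed value otherwise.
    """
    if t_crit <= 0:
        raise ValueError("t_crit must be positive")
    if alternative == "two-tailed":      # H1: mu_d != 0
        return abs(t_stat) > t_crit
    if alternative == "upper-tailed":    # H1: mu_d > 0
        return t_stat > t_crit
    if alternative == "lower-tailed":    # H1: mu_d < 0
        return t_stat < -t_crit
    raise ValueError(f"unknown alternative: {alternative}")


# df = 19 at the 5% level: 2.093 (two-tailed) and 1.729 (one-tailed).
print(reject_null(-2.5, 2.093, "two-tailed"))    # True
print(reject_null(-2.5, 1.729, "upper-tailed"))  # False
print(reject_null(-2.5, 1.729, "lower-tailed"))  # True
```

The same t statistic can therefore lead to different decisions depending on which alternative was fixed before the data were collected.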
1.4 STEPS FOR CONDUCTING PAIRED t-Test:
1. Set up hypothesis: We set up two hypotheses. The first is the null hypothesis, which
assumes that the means of the two paired samples are equal. The second is the alternative
hypothesis, which assumes that the means of the two paired samples are not equal.

2. Select the level of significance: After setting up the hypotheses, we choose the level of
significance. In most cases it is set at 5%; in medicine it is often set at 1% because of
the critical nature of the decisions involved.

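3. Calculate the test statistic: To calculate the test statistic the following formula is used:

$$t = \frac{\bar{d}}{\sqrt{s^2 / n}}$$

Where d bar is the mean difference between the two samples, s² is the sample variance of the
differences, n is the number of pairs, and t follows a t-distribution with n-1 degrees of
freedom.

An alternate formula for the paired sample t-test is:

$$t = \frac{\sum d}{\sqrt{\dfrac{n \sum d^2 - \left(\sum d\right)^2}{n - 1}}}$$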

4. Testing of hypothesis or decision making: After calculating the test statistic, we
compare the calculated value with the table value.

● If the absolute value of the calculated value is greater than the table value, then we
reject the null hypothesis for the paired sample t-test. If it is less than the table
value, then we fail to reject the null hypothesis and say that there is no significant mean
difference between the two paired samples.
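
The four steps can be sketched in Python as follows. This is a minimal illustration, not the report's own procedure: the function names and the five pairs of scores are made up, the table value is typed in by hand, and only the standard library is used. The two functions implement the two usual forms of the paired t statistic so that their agreement can be checked.

```python
import math
import statistics


def paired_t_statistic(sample_1, sample_2):
    """t = d_bar / sqrt(s^2 / n), with d = sample_2 - sample_1."""
    diffs = [y - x for x, y in zip(sample_1, sample_2)]
    n = len(diffs)
    d_bar = statistics.mean(diffs)
    s2 = statistics.variance(diffs)   # sample variance, divisor n - 1
    return d_bar / math.sqrt(s2 / n)


def paired_t_from_sums(sample_1, sample_2):
    """Alternate form: sum(d) / sqrt((n*sum(d^2) - sum(d)^2) / (n - 1))."""
    diffs = [y - x for x, y in zip(sample_1, sample_2)]
    n = len(diffs)
    sum_d = sum(diffs)
    sum_d2 = sum(d * d for d in diffs)
    return sum_d / math.sqrt((n * sum_d2 - sum_d ** 2) / (n - 1))


# Made-up paired scores (illustration only).
x = [10, 12, 9, 14, 11]
y = [13, 14, 10, 17, 12]
t1 = paired_t_statistic(x, y)
t2 = paired_t_from_sums(x, y)
table_value = 2.776   # two-tailed 5% table value for df = 4

print(f"t = {t1:.3f} (alternate form: {t2:.3f})")
print("reject H0" if abs(t1) > table_value else "fail to reject H0")
```

Both functions return the same value, which is one quick way to confirm that a hand calculation used the formula correctly.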
Assumptions for paired t-test:
1. The dependent variable should be measured at the interval or ratio level (i.e., it is
continuous).

2. The independent variable should consist of two categories, "related groups" or "matched
pairs".

3. Only the matched pairs can be used to perform the test.

4. The differences between the paired values are assumed to be approximately normally
distributed.

5. The two samples need not have equal variances, because the test is carried out on the
paired differences.

6. The pairs must be independent of each other.

7. There should be no significant outliers in the differences between the two related groups.

1.5 STEPS FOR CONDUCTING PAIRED T-TEST USING SPSS


1. Enter the data

The data needs to be entered in SPSS in two columns, where one column indicates the pre-
mark and the other has the post-mark – see right (a third column provides the case identity
numbers). For the paired samples t-test to be valid the differences between the paired values
should be approximately normally distributed. To calculate the differences between pre- and
post-marks, from the Data Editor in SPSS, choose Transform - Compute Variable and
complete the boxes as shown below right.

2. Check the test assumptions

The normality of Diff should first be checked– see Checking normality for parametric tests
worksheet.

There is no evidence for us to suspect that the data is not normally distributed.
3. Running the paired samples t-test

• Select Analyze – Compare Means – Paired Samples T-test:

• Select the two paired variables as the Paired Variables, selecting the after variable first
(Post), followed by the before variable (Pre) as shown below and click OK.
1.6 Assumptions
There are four "assumptions" that underpin the paired t-test. If any of these four assumptions
are not met, you cannot analyse your data using a paired t-test because you will not get a
valid result. Since assumptions #1 and #2 relate to your study design and choice of variables,
they cannot be tested for using Stata. However, you should decide whether your study meets
these assumptions before moving on.

o Assumption #1: Your dependent variable should be measured at the interval or ratio level
(i.e., they are continuous). Examples of such dependent variables include height (measured
in feet and inches), temperature (measured in °C), salary (measured in US dollars), revision
time (measured in hours), intelligence (measured using IQ score), reaction time (measured in
milliseconds), test performance (measured from 0 to 100), sales (measured in number of
transactions per month), and so forth.

o Assumption #2: Your independent variable should consist of two categorical, "related
groups" or "matched pairs". "Related groups" indicates that the same subjects are present in
both groups. The reason that it is possible to have the same subjects in each group is
because each subject has been measured on two occasions on the same dependent variable. For
example, you might have measured 50 participants' typing speed using a keyboard (i.e., the
dependent variable) before and after they underwent a touch-typing course designed to improve
typing speed (i.e., the two "time points" where participants' typing speed was measured,
"before" and "after" the touch-typing course, reflect the two "related groups" of the
independent variable). Since the same participants were measured at these two time points,
the groups are related. It is also common for related groups to reflect different conditions
that all participants undergo (i.e., these conditions are sometimes called interventions,
treatments or trials). For example, 30 participants undergo a hypnotherapy programme
(condition A) and a drug programme (condition B) to determine which is more effective (if
any) at treating depression.

Fortunately, you can check assumptions #3 and #4 using Stata. When moving on to
assumptions #3 and #4, we suggest testing them in this order because it represents an order
where, if a violation of the assumption is not correctable, you will no longer be able to use a
paired t-test. In fact, do not be surprised if your data fails one or more of these assumptions
since this is fairly typical when working with real-world data rather than textbook examples,
which often only show you how to carry out a paired t-test when everything goes well.
However, don’t worry because even when your data fails certain assumptions, there is often a
solution to overcome this (e.g., transforming your data or using another statistical test
instead). Just remember that if you do not check that your data meets these assumptions or
you test for them incorrectly, the results you get when running a paired t-test might not be
valid.
o Assumption #3: There should be no significant outliers in the differences between
the two related groups. An outlier is simply a single data point within your data that
does not follow the usual pattern (e.g., in a study of 100 students' IQ scores, where the
mean score was 108 with only a small variation between students, one student had a
score of 156, which is very unusual, and may even put her in the top 1% of IQ scores
globally). The problem with outliers is that they can have a negative effect on the
paired t-test, distorting the differences between the two related groups (whether
increasing or decreasing the scores on the dependent variable), which reduces the
accuracy of your results. In addition, they can affect the statistical significance of the
test. Fortunately, when using Stata to run a paired t-test on your data, you can easily
detect possible outliers.

o Assumption #4: The distribution of the differences in the dependent variable between the
two related groups should be approximately normally distributed. We talk about the paired
t-test only requiring approximately normal data because it is quite "robust" to violations
of normality, meaning that the assumption can be a little violated and still provide valid
results. You can test for normality using the Shapiro-Wilk test of normality, which is
easily tested for using Stata.

In practice, checking for assumptions #3 and #4 will probably take up most of your time
when carrying out a paired t-test. However, it is not a difficult task, and Stata provides all the
tools you need to do this.
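
Outside Stata, a first screen for assumption #3 can be done in a few lines of Python. The sketch below uses the common 1.5 × IQR rule, which is an assumption of this example and not a method prescribed by the text; the function name and the data are made up. It only flags candidate outliers in the differences and does not replace a normality test such as Shapiro-Wilk.

```python
import statistics


def flag_outliers(diffs, k=1.5):
    """Return the differences lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(diffs, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [d for d in diffs if d < low or d > high]


# Made-up differences; 40 is planted as an obvious outlier.
diffs = [3, 5, 4, 6, 2, 5, 4, 3, 40, 5]
print(flag_outliers(diffs))
```

A flagged value should be investigated (data-entry error, unusual subject) before deciding whether to keep it, transform the data, or switch to another test.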
CHAPTER 2 - COMPANY PROFILE

Jennex was established in 1996 with a mission to provide stone building products of global
quality for the international construction market. The product spectrum of Jennex includes
natural stone, marble, granite, sandstone, limestone, slate, slabs, tiles, mosaic,
countertops, and cut-to-size stone products.
The corporate office of Jennex is located in New Delhi, the Indian national capital. Its
associate companies are Jennex Granites Industries Pvt. Ltd., Jennex International Exports,
Stonex, Jennex International and Jennex Rocks and Minerals Pvt. Ltd. Jennex has
established purely export-oriented manufacturing units equipped with state-of-the-art stone
works technology at Hosur in the state of Tamil Nadu. Jennex brings together the latest
technology and impeccable professionalism and dedication to quality.
Under the leadership of Mr. Yogesh Anand, Jennex has grown steadily since its inception and
is a reputed name among Indian stone exporters today. It has a strong and growing clientele
across the UK, the USA, Canada, South Africa, Australia, the Middle East and the Near East. A
regular recipient of the Chemical and Allied Products Exports Promotion Council (CAPEXIL)
award for excellence, Jennex has become a hallmark of quality stone products across
the globe.
At Jennex, we focus on two areas; firstly, on the customers and secondly, continuing with the
investments across our businesses and workforce in order to maximize productivity that will
sustain our product quality at all times.
As one of the prominent developers and exporters of marbles, granites and natural stones, our
priority is to ensure that we continue to provide existing and future customers with the
service and support that is required. To us, Customer Service is not a department, but an
attitude!
The current status of the company is Active.
Its products conform to various international standards (BS1139, EN74, ANZ, JIS and
ARAMCO), depending on customer requirements.

Company Factsheet
Nature of business: Manufacturer

Total no. of employees: 100 to 150 people

Year of establishment: 1996

Company CEO: Mr. Yogesh Anand

Legal status of firm: Private Ltd. Co. Registered under Companies Act 1956

Annual Turnover: more than Rs. 80.00 crore

The company has marked its presence in various countries and covers the markets of the Middle
East, the United States, Canada, Singapore and India.

It has its production unit situated at

Noida special economic zone

Noida-201305, (U.P.)
CHAPTER 3-Hypothetical Problem
We have collected the data of the company. In total there are around 100 employees in the
company. We have selected a sample of 20 people at random.

Suppose that, according to the company, an efficient employee is one who can perform a
specific task assigned to him in minimum time. So, we took one or two people from various
teams and recorded the observations given to us. The company measures efficiency by the
number of marble tile packs or mixtures made by a person in one hour.

We have compared the efficiency of workers before the training and after the training, as
follows:

BEFORE TRAINING    AFTER TRAINING
45 70
60 75
28 50
55 72
61 70
48 52
62 65
70 78
58 55
34 52
15 40
30 40
18 20
26 52
66 80
72 98
50 62
39 50
61 91
28 60
Solution:

Step I:
Ho: The company claims that its workers are already trained, so training will not increase
their efficiency (μd = 0, where d = AT - BT).

Ha: Training will increase the efficiency of workers (μd > 0).

Step II:
(BT) (AT) D = (AT-BT) (D - d̄) (D - d̄)²
45 70 25 9.7 94.09
60 75 15 -0.3 0.09
28 50 22 6.7 44.89
55 72 17 1.7 2.89
61 70 9 -6.3 39.69
48 52 4 -11.3 127.69
62 65 3 -12.3 151.29
70 78 8 -7.3 53.29
58 55 -3 -18.3 334.89
34 52 18 2.7 7.29
15 40 25 9.7 94.09
30 40 10 -5.3 28.09
18 20 2 -13.3 176.89
26 52 26 10.7 114.49
66 80 14 -1.3 1.69
72 98 26 10.7 114.49
50 62 12 -3.3 10.89
39 50 11 -4.3 18.49
61 91 30 14.7 216.09
28 60 32 16.7 278.89
Total: ΣD = 306, Σ(D - d̄)² = 1910.2

α = 5%
Step III:

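d̄ = ΣD/n = 306/20 = 15.3

S.D = √(Σ(D - d̄)²/(n - 1)) = √(1910.2/19) = √100.536 = 10.03

S.E = S.D/√n = 10.03/4.47 = 2.24

t = d̄/S.E = 15.3/2.24 = 6.83

α/2 = 0.025 (the table value below is the two-tailed 5% value, with 0.025 in each tail)

df = n - 1 = 19

Table Value = 2.093 (the one-tailed 5% value for df = 19 is 1.729, so the decision is the
same with either value)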

Conclusion: Calculated value > Table value (6.83>2.093)


Therefore, we reject Ho and accept Ha.

Hence, we can say that the efficiency of workers has improved after the training and the
claim of the company is not supported by the data.
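
The hand calculation above can be cross-checked with a short Python script. This is a sketch using only the standard library, not the SPSS procedure used in the report; the variable names are made up, the before/after values are copied from the Step II table, and the table value 2.093 is typed in as given. Because the script does not round intermediate values, its result (t ≈ 6.82) differs marginally from the hand calculation.

```python
import math
import statistics

# Before-training (BT) and after-training (AT) values from the Step II table.
bt = [45, 60, 28, 55, 61, 48, 62, 70, 58, 34,
      15, 30, 18, 26, 66, 72, 50, 39, 61, 28]
at = [70, 75, 50, 72, 70, 52, 65, 78, 55, 52,
      40, 40, 20, 52, 80, 98, 62, 50, 91, 60]

diffs = [a - b for b, a in zip(bt, at)]   # D = AT - BT
n = len(diffs)
d_bar = statistics.mean(diffs)            # mean difference
sd = statistics.stdev(diffs)              # standard deviation, divisor n - 1
se = sd / math.sqrt(n)                    # standard error of the mean difference
t_stat = d_bar / se

TABLE_VALUE = 2.093                       # df = 19, as used in the report

print(f"sum of D = {sum(diffs)}, d_bar = {d_bar}")
print(f"S.D = {sd:.2f}, S.E = {se:.2f}, t = {t_stat:.2f}")
print("reject Ho" if t_stat > TABLE_VALUE else "fail to reject Ho")
```

The script reaches the same decision as the manual solution: the calculated value is well above the table value, so Ho is rejected.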
Table value:
CALCULATION USING SPSS: -
CHAPTER 4- LEARNING AND SUGGESTION
As the alternative hypothesis was accepted, it is concluded that training has an impact on
the performance of employees.

Further, the "d bar" value indicates that the change is positive; hence we can say there is
an improvement in the efficiency of employees after providing training.

Therefore, the company's claim is not supported.

The company should focus on providing training to workers after employing them instead of
relying on hiring trained workers, so that the efficiency of workers can be increased
regularly. They can pack at least 100 pallets in 1 hour.

The company will also benefit from conducting such trainings in the organization.
