RM File

Uploaded by

rahullal11122
Copyright
© © All Rights Reserved

Business Research Methodology Practical File

Submitted in partial fulfilment of the requirements for the


award of the degree of
Bachelor of Business Administration (G)

Sri Guru Tegh Bahadur Institute of


Management & Information Technology

Affiliated to Guru Gobind Singh Indraprastha University,


Delhi (2022-2025)

Submitted to – Ms. Mansi Ahuja Submitted by – I


Enroll No.-
3rd semester (Morning Shift)

INDEX
S.NO. Particulars

1. Worksheet – 1: Frequency Distribution
2. Worksheet – 2: Measures of Central Tendency
3. Worksheet – 3: Outlier Testing
4. Worksheet – 4: One Sample t-test
5. Worksheet – 5: Paired Sample t-test
6. Worksheet – 6: Independent Sample t-test
7. Worksheet – 7: One-way ANOVA
8. Worksheet – 8: Chi-Square test

LIST OF TABLES
S.NO. Table No. Particulars

1. 1.1 Table representing data of workers in small- and medium-scale enterprises
2. 1.2 Table representing dataset of coding of different variables
3. 1.3 Table representing SPSS output of frequency distribution
4. 2.1 Table representing monthly sales figures of 50 consecutive months of an enterprise
5. 2.2 Table representing SPSS output of measures of central tendency
6. 3.1 Table representing data of 50 players
7. 3.2 Table representing dataset of coding of different variables
8. 3.3 Table representing SPSS output of outlier testing
9. 4.1 Table representing data of weight lost by 50 customers a month after joining the weight loss program
10. 4.2 Table representing final SPSS output
11. 4.3 Table representing final SPSS output
12. 5.1 Table representing data of the performance of employees
13. 6.1 Table representing data relating to performance score, age and gender
14. 6.2 Table representing SPSS output of independent sample T-Test
15. 6.3 Table representing SPSS output of independent sample T-Test
16. 7.1 Table representing data of salaries and qualification
17. 7.2 Table representing dataset of coding of different variables
18. 7.3 Table representing SPSS output of One-way ANOVA
19. 7.4 Table representing SPSS output of One-way ANOVA
20. 7.5 Table representing SPSS output of One-way ANOVA
21. 7.6 Table representing SPSS output of One-way ANOVA
22. 8.1 Table representing codes provided to sub-categories
23. 8.2 Table representing data of 100 internet users
24. 8.3 Table representing SPSS output of Chi-Square test
25. 8.4 Table representing SPSS output of Chi-Square test
26. 8.5 Table representing SPSS output of Chi-Square test
27. 8.6 Table representing SPSS output of Chi-Square test

LIST OF FIGURES
S.NO. Figure No. Particulars

1. 1.1 Figure representing screenshot of frequency distribution (step 1)
2. 1.2 Figure representing screenshot of frequency distribution (step 2)
3. 1.3 Figure representing screenshot of frequency distribution (step 3)
4. 1.4 Figure representing bar chart of education of workers
5. 2.1 Figure representing screenshot of measures of central tendency (step 1)
6. 2.2 Figure representing screenshot of measures of central tendency (step 2)
7. 2.3 Figure representing screenshot of measures of central tendency (step 3)
8. 3.1 Figure representing screenshot of outlier testing (step 1)
9. 3.2 Figure representing screenshot of outlier testing (step 2)
10. 3.3 Figure representing screenshot of outlier testing (step 3)
11. 3.4 Figure representing screenshot of SPSS output-box plot diagram
12. 4.1 Figure representing screenshot of one sample T-Test (step 1)
13. 4.2 Figure representing screenshot of one sample T-Test (step 2)
14. 4.3 Figure representing screenshot of one sample T-Test (step 3)
15. 4.4 Figure representing screenshot of one sample T-Test (step 4)
16. 5.1 Figure representing screenshot of distribution of before and after training
17. 5.2 Figure representing screenshot of transfer of variables
18. 6.1 Figure representing screenshot of independent sample T-Test (step 1)
19. 6.2 Figure representing screenshot of independent sample T-Test (step 2)
20. 6.3 Figure representing screenshot of independent sample T-Test (step 3)
21. 7.1 Figure representing screenshot of one-way ANOVA (step 1)
22. 7.2 Figure representing screenshot of one-way ANOVA (step 2)
23. 7.3 Figure representing screenshot of one-way ANOVA (step 3)
24. 7.4 Figure representing screenshot of one-way ANOVA (step 4)
25. 7.5 Figure representing screenshot of SPSS output-graphical form
26. 8.1 Figure representing screenshot of chi-square test (step 1)
27. 8.2 Figure representing screenshot of chi-square test (step 2)
28. 8.3 Figure representing screenshot of chi-square test (step 3)
29. 8.4 Figure representing screenshot of chi-square test (step 4)

WORKSHEET - 1 Frequency Distribution
Frequency distribution is a method of displaying the frequency (the number of
times a particular value of a variable repeats in the data) of the different values
of a variable in a dataset. It represents the counts of all outcomes of a variable in
a sample. The frequency distribution of a variable can be represented in tabular
as well as graphical forms.

Frequency distribution is a very common and important method of analyzing the
nominal (categorical) and ordinal (ranking) variables in a dataset. In every
questionnaire, one section is dedicated to demographic profiles. The different
categories of demographic profiles in a dataset are normally represented by
frequency distribution in tabular as well as graphical forms.

Objective: To calculate the frequency distribution and present a bar chart of the
education profiles of the workers.

The dataset of workers working in small- and medium-scale enterprises in a city
of India is shown in Table 1.1.

Table 1.1: Data of workers in small- and medium- scale enterprises

S.NO. Gender Age Group Religion Education S.NO. Gender Age Group Religion Education
1 1 1 3 2 17 1 1 1 5
2 1 4 2 1 18 1 5 1 5
3 1 3 3 4 19 1 5 2 2
4 1 3 1 3 20 1 2 2 5
5 2 4 1 1 21 2 5 2 2
6 1 4 1 1 22 1 2 2 1
7 2 2 1 1 23 1 2 3 1
8 1 2 3 1 24 1 2 1 5
9 1 2 2 1 25 2 5 2 5
10 2 2 2 2 26 1 3 3 2
11 1 3 1 2 27 1 1 1 2
12 1 3 1 3 28 1 2 2 2

13 1 4 1 4 29 1 2 2 4
14 2 1 2 3 30 1 2 2 2
15 1 5 2 2 31 1 3 3 5
16 2 2 2 2 32 1 2 2 1
S.NO. Gender Age Group Religion Education S.NO. Gender Age Group Religion Education
33 2 2 2 2 42 1 2 3 1
34 1 5 2 1 43 1 3 2 1
35 2 5 1 2 44 1 5 2 5
36 2 5 2 3 45 1 2 1 2
37 2 2 3 4 46 2 5 2 3
38 2 5 2 32 47 2 2 1 2
39 1 3 3 3 48 1 3 3 4
40 1 5 2 2 49 1 4 2 4
41 1 2 1 1 50 2 1 1 1

The coding details of the different variables in the dataset are shown in Table 1.2.

Table 1.2: Dataset of coding of different variables

Variables Numeric codes

Gender 1= Male

2= Female

Age Group 1= Less than 25 yrs. old

2= 26-35 yrs.

3= 36-45 yrs.

4= 46-55 yrs.

5= 56 and above

Religion 1= Hindu
2= Muslim

3= Other Religion

Education 1= Below 10th

2= High School

3= Intermediate

4= Technical Diploma

5= Degree Level

SPSS Commands
Step 1: Click Analyze → Descriptive Statistics → Frequencies

Figure 1.1: Screenshot of frequency distribution (step 1)

Step 2: Transfer the variable education to variable window.

Figure 1.2: Screenshot of frequency distribution (step 2)
Step 3: Select the type of chart.
Example: Bar chart as shown in the Figure 1.3.

Figure 1.3: Screenshot of frequency distribution (step 3)


Step 4: Finally, click “Continue” and then “OK”.

The final SPSS output in tabular form is shown below in Table 1.3.

Table 1.3: SPSS output of frequency distribution

Education Of Workers

                            Frequency  Percent  Valid Percent  Cumulative Percent
Valid  Below 10                    13     26.0           26.0                26.0
       High School                 18     36.0           36.0                62.0
       Intermediate                 6     12.0           12.0                74.0
       Technical Diploma            6     12.0           12.0                86.0
       Degree Level                 7     14.0           14.0               100.0
       Total                       50    100.0          100.0

SPSS output in graphical form is shown below in Figure 1.4.

Figure 1.4: Bar Chart of education of workers

Conclusion: The education levels of 50 different workers were tabulated, and it
was found that the numbers of workers at the below 10th grade, high school,
intermediate, technical diploma and degree levels are 13 (26%), 18 (36%),
6 (12%), 6 (12%) and 7 (14%) respectively.
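
The tabulation behind Table 1.3 can also be sketched outside SPSS. The following is a minimal Python sketch, not the SPSS procedure itself: the list of education codes is rebuilt from the counts reported in Table 1.3 (an assumption made to keep the example short), and the names `LABELS`, `education` and `frequency_table` are illustrative only.

```python
from collections import Counter

# Education codes and labels as given in Table 1.2
LABELS = {1: "Below 10th", 2: "High School", 3: "Intermediate",
          4: "Technical Diploma", 5: "Degree Level"}

# Assumption: coded responses rebuilt from the counts in Table 1.3
education = [1] * 13 + [2] * 18 + [3] * 6 + [4] * 6 + [5] * 7


def frequency_table(codes):
    """Return (label, frequency, percent, cumulative percent) per category."""
    counts = Counter(codes)
    n = len(codes)
    rows, running = [], 0
    for code in sorted(counts):
        running += counts[code]
        rows.append((LABELS[code], counts[code],
                     round(100 * counts[code] / n, 1),
                     round(100 * running / n, 1)))
    return rows


table = frequency_table(education)
for label, freq, pct, cum in table:
    print(f"{label:<18}{freq:>4}{pct:>8.1f}{cum:>8.1f}")
```

Each row carries the same three quantities as the SPSS table: the count, its share of the 50 cases, and the running (cumulative) share.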

WORKSHEET – 2 Measures Of Central Tendency

There are three main measures of central tendency. These are as follows:

• Arithmetic mean
• Median
• Mode
Let us discuss these three in detail.

Arithmetic Mean
The mean of a variable represents its average value. It can be calculated by
using the following formula:
$\bar{X} = \frac{\sum_i f_i X_i}{\sum_i f_i}$
Where $\bar{X}$ represents the mean, $X_i$ represents the i-th observation and
$f_i$ represents the frequency of the i-th observation of the variable.
One of the problems with the arithmetic mean is that it is highly sensitive to
the presence of outliers in the data of the related variable. To avoid this
problem, the trimmed mean of the variable can be estimated. The trimmed
mean is the value of the mean of a variable after removing some extreme
observations (for example, 2.5 percent from both tails of the distribution)
from the frequency distribution.
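
To illustrate how trimming reduces the effect of an extreme value, here is a minimal Python sketch. The data values and the 10 percent trimming proportion are hypothetical and chosen only so that one observation is dropped from each tail; `trimmed_mean` is an illustrative helper, not an SPSS function.

```python
def trimmed_mean(values, proportion):
    """Mean after dropping `proportion` of the observations from each tail."""
    data = sorted(values)
    k = int(len(data) * proportion)      # observations dropped per tail
    kept = data[k:len(data) - k] if k else data
    return sum(kept) / len(kept)


# Hypothetical sample with one extreme observation (50)
sample = [2, 3, 3, 4, 4, 5, 5, 6, 6, 50]

plain_mean = sum(sample) / len(sample)
trimmed = trimmed_mean(sample, 0.10)     # drops the values 2 and 50

print(plain_mean, trimmed)
```

The plain mean is pulled upward by the single extreme value, while the trimmed mean stays close to the bulk of the observations.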
Median
Median is known as the ‘positional average’ of a variable. If we arrange
the observations of a variable in an ascending or descending order, the
value of the observation that lies in the middle of the series is known as
median. The value of the median divides the observations of a variable
into two equal halves. Half of the observations of the variable are higher
than the median value and the other half of the observations are lower than
the median value. Quartiles, deciles and percentiles are extensions of the
median.
Mode
The mode of a variable is the observation with the highest frequency or
highest concentration of frequencies.
Objective: To calculate the mean, median, mode and quartiles of the monthly
sales of a company.

The dataset of monthly sales figures (in crores) of an enterprise for 50
consecutive months is given in Table 2.1.

Table 2.1: Monthly sales figures of 50 consecutive months of an enterprise

Month Sales Month Sales Month Sales Month Sales Month Sales Month Sales Month Sales
1 60 9 70 17 15 25 120 33 68 41 34 49 89
2 70 10 65 18 40 26 130 34 65 42 56 50 100
3 45 11 54 19 54 27 23 35 70 43 97
4 90 12 72 20 56 28 32 36 60 44 34
5 110 13 45 21 25 29 54 37 30 45 54
6 40 14 24 22 43 30 34 38 40 46 70
7 90 15 12 23 56 31 45 39 110 47 98
8 50 16 8 24 120 32 49 40 150 48 45

SPSS Commands
Step 1: Click Analyze → Descriptive Statistics → Frequencies

Figure 2.1: Screenshot of measures of central tendency (step 1)

Step 2: Transfer the variable to variable window and click ‘statistics’ as shown
in the Figure 2.2.

Figure 2.2: Screenshot of measures of central tendency (step 2)


Step 3: Select the option ‘mean’, ‘median’, ‘mode’ and ‘quartiles’ and click
‘continue’ and then ‘ok’ as shown in Figure 2.3.

Figure 2.3: Screenshot of measures of central tendency (step 3)

SPSS output is shown in Table 2.2.

Table 2.2: SPSS output of measures of central tendency

Statistics
sales of an enterprise

N            Valid     50
             Missing   0
Mean                   61.42
Median                 55.00
Mode                   45a
Percentiles  25        40.00
             50        55.00
             75        76.25
a. Multiple modes exist. The smallest value is shown

Conclusion

Table 2.2 represents the SPSS output.

Mean value of the sales figures is 61.42.

Median value of the sales figures is 55.00.

Mode value of the sales figures is 45 (multiple modes exist; the smallest value
is shown).

Percentile (25) value is 40.00.

Percentile (50) value is 55.00.

Percentile (75) value is 76.25.
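
The same measures can be cross-checked with Python's standard `statistics` module. This is a sketch under one assumption: the `"exclusive"` quantile method is used because it interpolates at the (n + 1)p position, which appears to agree with the percentiles in Table 2.2; SPSS itself is not involved. The list `sales` holds the Table 2.1 values in month order.

```python
import statistics

# Monthly sales (in crores) for months 1-50, from Table 2.1
sales = [
    60, 70, 45, 90, 110, 40, 90, 50,      # months 1-8
    70, 65, 54, 72, 45, 24, 12, 8,        # months 9-16
    15, 40, 54, 56, 25, 43, 56, 120,      # months 17-24
    120, 130, 23, 32, 54, 34, 45, 49,     # months 25-32
    68, 65, 70, 60, 30, 40, 110, 150,     # months 33-40
    34, 56, 97, 34, 54, 70, 98, 45,       # months 41-48
    89, 100,                              # months 49-50
]

mean = statistics.mean(sales)
median = statistics.median(sales)
modes = sorted(statistics.multimode(sales))   # every value with the top frequency
q1, q2, q3 = statistics.quantiles(sales, n=4, method="exclusive")

print(mean, median, modes, q1, q2, q3)
```

Because several values share the highest frequency, `multimode` returns all of them; SPSS reports only the smallest, which is why Table 2.2 carries a footnote on the mode.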

WORKSHEET – 3 Outlier Testing

Outliers are:

• The extreme observations lying in the tails of the probability distribution


of the variables.
• The observations with the highest residuals for a relation model
(regression model).
• The observations that, if not included in the analysis, cause a significant
difference in the result.

On the basis of the cases mentioned above, outliers can be divided into three
different types:

1. Extreme values or univariate outliers


2. Multivariate outliers
3. Influencers

Two popular methods of detecting outliers are

1. Extreme values
2. Box plot

Objective: To detect if any outlier(s) is present in the given data.

The dataset of 50 players is given in Table 3.1.

S.NO Gender Age Sports Hours S.NO Gender Age Sports Hours S.NO. Gender Age Sports Hours

1 1 1 1 2.0 12 2 2 4 1.5 23 1 2 1 4.0

2 1 2 2 3.0 13 2 3 4 1.5 24 2 3 2 2.0

3 2 1 3 4.0 14 2 1 1 3.5 25 1 3 2 3.0

4 1 3 4 4.5 15 1 2 3 1.5 26 1 2 2 4.5

5 1 4 5 2.5 16 1 3 1 2.0 27 2 3 2 4.0

6 2 1 2 3.0 17 1 3 1 2.0 28 1 2 4 4.0

7 1 2 2 2.5 18 1 3 5 2.5 29 1 3 3 5.0

8 1 3 3 5 19 2 1 2 3.5 30 1 1 1 1.0

9 2 2 5 2.0 20 1 1 4 3.0

10 2 2 5 1.0 21 2 2 3 3.0

11 1 2 5 1.0 22 1 2 5 13.0

Table 3.1: Data of 50 players

The coding details of the different variables in the dataset are shown in Table 3.2.

Table 3.2: Dataset of coding of different variables

Variables Numeric codes

Gender 1= Male

2= Female

Age 1= Under 20 yrs.

2= 20-24 yrs.

3= 25-29 yrs.

Sports 1 = Badminton

2= Hockey

3= Cricket

4= Rugby

5= Football

SPSS Commands
Step 1: Click Analyze → Descriptive Statistics → Explore

Figure 3.1: Screenshot of outlier testing (step 1)

Step 2: Send the ‘hours spent’ variable to the dependent list and then click
‘Statistics’.

Figure 3.2: Screenshot of outlier testing (step 2)

Step 3: Select ‘outliers’ and click ‘continue’ as shown in Figure 3.3.

Figure 3.3: Screenshot of outlier testing (step 3).

The required output is shown in Table 3.3 and box plot diagram is shown in
Figure 3.4.

Table 3.3: SPSS output of outlier testing

Extreme Values
Number of hours played by the player

                Case Number  Value
Highest   1              22  13.00
          2               8   5.00
          3              29   5.00
          4               4   4.50
          5              26   4.50
Lowest    1              30   1.00
          2              11   1.00
          3              10   1.00
          4              15   1.50
          5              13   1.50a

a. Only a partial list of cases with the value 1.50 are shown in the table of lower
extremes.

Figure 3.4: Screenshot of SPSS output-box plot diagram.

Conclusion

Table 3.3 represents the SPSS output of outliers. It lists the five highest and
five lowest values of the sportsman dataset. Case numbers 22, 8, 29, 4 and 26
have the highest values and case numbers 30, 11, 10, 15 and 13 have the lowest
values. Figure 3.4 shows that case number 22 is an outlier.
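
The box plot reading can be approximated with the usual 1.5 × IQR rule. The sketch below is an illustration rather than SPSS's exact algorithm: it assumes the quartiles from Python's `statistics.quantiles` (SPSS box plots use their own hinge definition), and the `hours` list is typed from the 30 rows printed in Table 3.1 in case-number order.

```python
import statistics

# Hours played, cases 1-30 as printed in Table 3.1
hours = [
    2.0, 3.0, 4.0, 4.5, 2.5, 3.0, 2.5, 5.0, 2.0, 1.0,    # cases 1-10
    1.0, 1.5, 1.5, 3.5, 1.5, 2.0, 2.0, 2.5, 3.5, 3.0,    # cases 11-20
    3.0, 13.0, 4.0, 2.0, 3.0, 4.5, 4.0, 4.0, 5.0, 1.0,   # cases 21-30
]

q1, _, q3 = statistics.quantiles(hours, n=4, method="exclusive")
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Keep (case number, value) pairs lying outside the fences
outliers = [(case, value) for case, value in enumerate(hours, start=1)
            if value < lower_fence or value > upper_fence]

print(q1, q3, lower_fence, upper_fence, outliers)
```

Only observations beyond the fences are flagged, so a value can appear in the "highest" list of Table 3.3 without being an outlier.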
WORKSHEET – 4

Test of Difference: One Sample T-Test

In many situations, we come across claims made by marketers about their
products. For example, a car manufacturer may claim that the average mileage
of a car is, say, 199.9 kmpl, or a business school may claim that the average
package offered to its students is Rs. 12 lakh per annum. A researcher may be
interested in analyzing the truthfulness of these claims. For this analysis, the
researcher needs to randomly pick a small sample from the population and
compare its mean with the claimed population mean. The sample mean and the
population mean may be different from each other. In order to test whether this
difference is statistically significant, we should apply the one-sample t-test.

The null hypothesis of the one-sample t-test is:

“H0: There is no significant difference between sample mean and population
mean.”

Objective: To find out the difference between population mean and sample
mean.

H0: There is no difference between population mean and sample mean.

H1: There is a difference between population mean and sample mean.

The dataset of weight lost (in kg) by 50 customers a month after joining the
weight loss program is shown in Table 4.1.

S.NO. Weight lost S.NO. Weight lost S.NO. Weight lost

1 2 18 4 35 4

2 3 19 3 36 4

3 2 20 4 37 3

4 4 21 5 38 4

5 5 22 6 39 3

6 3 23 4 40 4

7 3 24 5 41 5

8 2 25 6 42 4

9 3 26 5 43 3

10 4 27 4 44 4

11 2 28 4 45 5

12 3 29 5 46 5

13 3 30 5 47 4

14 4 31 6 48 5

15 3 32 2 49 6

16 4 33 5 50 5

17 5 34 5

Table 4.1: Data of weight lost by 50 customers a month after joining the weight
loss program

Step 1: Click Analyze → Compare Means → One-Sample T Test

Figure 4.1: Screenshot of One sample T-Test (step 1)

Step 2: Transfer the variable ‘weight loss’ to test variable window.

Figure 4.2: Screenshot of one sample T-Test (step 2)

Step 3: Click ‘Options’


Figure 4.3: Screenshot of one sample T-Test (step 3)
Step 4: Click ‘Continue’ and then ‘ok’.

Figure 4.4: Screenshot of one sample T-Test (step 4)

The final SPSS (Statistical Package for the Social Sciences) output in tabular
form is shown below in Table 4.2 and Table 4.3 respectively.

Table 4.2

One-Sample Statistics

                           N   Mean  Std. Deviation  Std. Error Mean
weight loss by customer   50   4.02           1.116             .158

Table 4.3

One-Sample Test

Test Value = 0

                                                                 95% Confidence Interval
                                                                 of the Difference
                          t       df  Sig. (2-tailed)  Mean Difference  Lower  Upper
weight loss by customer   25.481  49  .000             4.020            3.70   4.34

When the significance value (p value) is less than 0.05, H0 gets rejected.

Conclusion: The sample mean is 4.02 kgs, which is less than the claimed
population mean of 5 kgs. In Table 4.3 the t statistic is found to be 25.481 with
a p value of .000; this output was produced with Test Value = 0, and the 95%
confidence interval of the mean (3.70 to 4.34 kgs) does not contain the claimed
value of 5 kgs. Hence, with 95% confidence level, the null hypothesis of no
difference between sample mean and population mean cannot be accepted and it
can be concluded that the sample mean is significantly different from the
population mean. Therefore, the company is making a wrong statement about
the weight loss of its customers.
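
The arithmetic of the one-sample t-test can be sketched in a few lines of Python. This is an illustration, not SPSS output: `one_sample_t` is a hypothetical helper, the data are the 50 values of Table 4.1 in S.NO. order, and the critical value 2.0096 for 49 degrees of freedom is an approximate value taken from standard t tables. The test is run both against 0 (the Test Value shown in Table 4.3) and against the claimed mean of 5 kgs.

```python
import math
import statistics

# Weight lost (kg) by the 50 customers, in S.NO. order (Table 4.1)
weight_lost = [
    2, 3, 2, 4, 5, 3, 3, 2, 3, 4, 2, 3, 3, 4, 3, 4, 5,   # S.NO. 1-17
    4, 3, 4, 5, 6, 4, 5, 6, 5, 4, 4, 5, 5, 6, 2, 5, 5,   # S.NO. 18-34
    4, 4, 3, 4, 3, 4, 5, 4, 3, 4, 5, 5, 4, 5, 6, 5,      # S.NO. 35-50
]


def one_sample_t(sample, test_value):
    """Return (t statistic, degrees of freedom) for H0: mean == test_value."""
    n = len(sample)
    mean = statistics.mean(sample)
    std_error = statistics.stdev(sample) / math.sqrt(n)
    return (mean - test_value) / std_error, n - 1


T_CRITICAL = 2.0096   # assumption: approximate two-tailed 5% value for df = 49

t_zero, df = one_sample_t(weight_lost, 0)   # Test Value = 0, as in Table 4.3
t_claim, _ = one_sample_t(weight_lost, 5)   # claimed population mean of 5 kgs

print(round(t_zero, 3), round(t_claim, 3), df)
print("reject H0 against 5 kgs:", abs(t_claim) > T_CRITICAL)
```

Entering the claimed value as the Test Value changes the t statistic but not the sample mean or its standard error, so the two runs differ only in the numerator.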

Worksheet – 5 Paired sample t – test

A paired sample t-test is also known as repeated sample t-test because data
(response) is collected from the same respondents but at different time periods.
A paired sample t-test should be used when we want to test the impact of an
event or experiment on the variable under study. In this case, the data is
collected from the same respondents before and after the event. After this, the
means are compared. The null hypothesis of the paired sample t-test is that the
means of the pre-sample and post-sample are equal. Some of the instances
where the paired sample t-test can be applied are as follows:

a. Analyzing the effectiveness of training program on the performance of


employees of a business enterprise.

b. Analyzing the impact of a new advertisement on the sales of a product.

c. Analyzing the impact of a policy on the volatility in the stock market.

d. Analyzing the difference of the respondents of the same group to two


different treatments.

Example: The HR manager of a business enterprise wants to analyze the impact


of a training program conducted for 30 employees. The purpose of conducting
the training program was to improve the performance of employees. The
performance scores of the employees are noted before and after the training
program. Now, the paired sample t-test is applied in order to analyze the impact
of the training program.
Objective: To find out the difference between performance before training and
after training.

The data is given in Table 5.1.

Table 5.1: Data of the performance of employees.

Pre-training Post-training Pre-training Post-training Pre-training Post-training
score        score         score        score         score        score

56 82 38 67 65 68
45 76 44 56 53 56
56 78 76 91 49 53
34 64 34 48 42 56
56 62 38 68 53 76
42 60 42 67 58 82
43 68 83 90 34 45
56 69 72 87 43 76
70 78 47 64 45 67
56 87 48 53 65 72

Step-1: Click Analyze → Compare Means → Paired-Samples T Test.

Fig 5.1 Screenshot of distribution of before and after training.

Step-2: Click on the variable pre training score. Then click on the post training
variable. Now, move the paired variable into the paired variables box by
clicking on the right arrow button. Finally, click on “OK” as shown in fig 5.2.

Fig 5.2 Screenshot of transfer of variables (step 2)

SPSS Output

Paired Samples Statistics

                                                Mean     N   Std. Deviation  Std. Error Mean
Pair 1  pre-training score of employees      51.4333   30        12.79192          2.33548
        post training score of employees     68.8000   30        12.41634          2.26690

Paired Samples Correlations

                                                                     N   Correlation  Sig.
Pair 1  pre-training score of employees &
        post training score of employees                            30          .712  .000

Paired Samples Test

Pair 1: pre-training score of employees - post training score of employees

Paired Differences
Mean                                             -17.36667
Std. Deviation                                     9.56460
Std. Error Mean                                    1.74625
95% Confidence Interval of the Difference, Lower -20.93815
95% Confidence Interval of the Difference, Upper -13.79519
t                                                   -9.945
df                                                      29
Sig. (2-tailed)                                       .000

Since the significance value is .000, which is less than the 5% significance
level, we can state with 95% confidence level that the null hypothesis is
rejected. Hence, there is a significant difference between performance before
and after training.
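
The paired test works on the differences within each pair, which the following Python sketch makes explicit. It is an illustration only: the scores are typed from Table 5.1 as printed, `paired_t` is a hypothetical helper, and the results may come out close to, rather than identical with, the SPSS figures if any printed score differs from the data file that was analysed.

```python
import math
import statistics

# Scores from Table 5.1 as printed, read column pair by column pair
pre = [56, 45, 56, 34, 56, 42, 43, 56, 70, 56,
       38, 44, 76, 34, 38, 42, 83, 72, 47, 48,
       65, 53, 49, 42, 53, 58, 34, 43, 45, 65]
post = [82, 76, 78, 64, 62, 60, 68, 69, 78, 87,
        67, 56, 91, 48, 68, 67, 90, 87, 64, 53,
        68, 56, 53, 56, 76, 82, 45, 76, 67, 72]


def paired_t(before, after):
    """Return (mean difference, t statistic, df) for before - after."""
    diffs = [b - a for b, a in zip(before, after)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    std_error = statistics.stdev(diffs) / math.sqrt(n)
    return mean_diff, mean_diff / std_error, n - 1


mean_diff, t_stat, df = paired_t(pre, post)
print(round(mean_diff, 4), round(t_stat, 3), df)
```

A negative mean difference here means that post-training scores are higher on average, which matches the sign of the mean in the Paired Samples Test table.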
Worksheet – 6

Test of difference: Independent Sample T- Test

When we want to test the difference between two independent sample means,
we use independent-sample-T-Test. The independent samples may belong to the
same population or different population. Some of the instances in which the
independent samples t-test can be used are as follows:

1. Testing difference in the average level of performance between employees


with the MBA degree and employees without the MBA degree.

2. Testing difference in the average wages received by labor in two different


industries.

3. Testing difference in the average monthly sales of the two firms.

‘H0: There is no significant difference between the sample means of two
independent groups.’

In SPSS, the independent samples t-test is conducted in two stages. At stage
one, SPSS compares the variances of the two samples. The statistical method
of comparing two sample variances is known as Levene’s test of homogeneity
of variance. The null hypothesis of this test is that equal variances can be
assumed, i.e., there is no significant difference between the sample variances of
the two independent samples. In other words, the two samples are comparable.
On the basis of Levene’s test of homogeneity, SPSS gives two values of the
t-statistic. If Levene’s test is not significant (equal variances), the t-statistic in
the ‘Equal variances assumed’ row should be considered for final analysis. If
the sample variances are significantly different, the t-statistic in the ‘Equal
variances not assumed’ row should be considered.

Objective: To analyze the difference in the average performance of the


employees of an enterprise in the age groups, below and above 40 years of age.
H0: There is no significant difference in the average performance of the
employees for age groups below and above 40 years of age.

H1: There is a significant difference in the average performance of the


employees for age groups below and above 40 years of age.

A researcher is interested to analyze the difference in the average performance


of employees of an enterprise in different demographic profiles. He divides
employees on the basis of gender and their age group.

The data is given below in Table 6.1

Performance Gender Age Performance Gender Age Performance Gender Age


score score score

56 Male 34 56 Female 30 78 Male 54


60 Female 45 76 Female 34 87 Female 42
45 Female 40 78 Male 45 90 Male 55
65 Male 60 54 Male 34 45 Female 35
73 Male 54 87 Male 23 56 Male 54
45 Female 42 67 Female 38 76 Female 39
60 Female 55 98 Female 43 76 Male 38
34 Male 35 89 Female 72 78 Female 29
56 Male 54 54 Male 56 98 Male 60
59 Female 39 34 Male 32 34 Female 32
35 Female 38 45 Male 26 23 Male 25
65 Male 29 56 Female 34 45 Female 23

45 Male 60 34 Female 54 65 Male 26
58 Female 32 56 Male 34 63 Female 30
32 Female 25 76 Female 45 68 Male 34
65 Male 23 87 Male 40 87 Female 45
34 Male 26 54 Female 60

Table 6.1: Data relating to performance score, age and gender

SPSS Commands

Step 1: Click Analyze → Compare Means → Independent-Samples T Test.

Figure 6.1: Screenshot of independent sample T-Test. (step 1)

Step 2: Send the test variable ‘performance score’ to the test variable(s) window.
Then send ‘age’ variable in the grouping variable and click ‘Define Groups’ as
shown in Figure 6.2.

Figure 6.2: Screenshot of independent sample T-Test (step 2)
Step 3: Now define the cut point as 40. Next click ‘continue’ as shown in Figure
6.3.

Figure 6.3: Screenshot of independent sample T-Test (step 3)

The final SPSS (Statistical Package for the Social Sciences) output in tabular
form is shown below in Table 6.2 and Table 6.3 respectively.

Table 6.2: SPSS Output of independent sample T-Test

Group Statistics

                    age        N   Mean     Std. Deviation  Std. Error Mean
Performance score   >= 40.00   22  68.8636  19.07453        4.06670
                    < 40.00    28  55.0714  16.77725        3.17060

Table 6.3: SPSS Output of independent sample T-Test


Independent Samples Test (performance.score)

Levene's Test for Equality of Variances
F      Sig.
1.408  .241

t-test for Equality of Means
                                                                                       95% Confidence Interval
                                                                                       of the Difference
                             t      df      Sig. (2-tailed)  Mean Difference  Std. Error Difference  Lower    Upper
Equal variances assumed      2.717  48      .009             13.79221         5.07660                3.58502  23.99940
Equal variances not assumed  2.675  42.170  .011             13.79221         5.15663                3.38696  24.19746

Conclusion: As shown in Table 6.2, the average performance score of
employees aged 40 years and above is 68.86 with standard deviation 19.075,
and the average performance score of employees below 40 years of age is 55.07
with standard deviation 16.777. Table 6.3 shows the result of Levene’s test,
which assumes the null hypothesis that the sample variances are the same. The
significance value of 0.241 indicates that, at 95% level of confidence, the null
hypothesis of equal variances is accepted. Table 6.3 also shows that the t
statistic is 2.717 and its p value (.009) is less than 5% level of significance.
Hence, with 95% confidence level, the null hypothesis of no significant
difference in the average performance of the employees below and above 40
years of age is not accepted.
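
Both rows of the Independent Samples Test table can be approximated from the group summaries alone. The sketch below is a hedged illustration: `t_from_summary` is a hypothetical helper, the inputs are the N, mean and standard deviation values of Table 6.2, and small rounding differences from the SPSS figures are to be expected because the inputs are already rounded.

```python
import math


def t_from_summary(n1, mean1, sd1, n2, mean2, sd2):
    """Pooled-variance and unequal-variance (Welch) t from group summaries."""
    diff = mean1 - mean2
    # Equal variances assumed: pool the two sample variances
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    se_pooled = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    # Equal variances not assumed: separate variance terms and adjusted df
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2
    se_welch = math.sqrt(v1 + v2)
    df_welch = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return {
        "t_pooled": diff / se_pooled, "df_pooled": n1 + n2 - 2,
        "t_welch": diff / se_welch, "df_welch": df_welch,
    }


# Group statistics from Table 6.2 (age >= 40 first, age < 40 second)
result = t_from_summary(22, 68.8636, 19.07453, 28, 55.0714, 16.77725)
for name, value in result.items():
    print(name, round(value, 3))
```

The two rows share the same mean difference and differ only in the standard error and degrees of freedom, which is why Levene's test decides which row to read.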

Worksheet – 7

One-way ANOVA


Concept of ANOVA

The independent-samples t-test can be applied to situations where there are only
two independent samples. In other words, we can use independent-samples
t-tests for comparing the means of two populations (such as males and females).
When we have more than two independent samples, the t-test is inappropriate.
The Analysis of Variance (ANOVA) has an advantage over the t-test when the
researcher wants to compare the means of a larger number of populations (i.e.,
three or more). ANOVA is a parametric test that is used to study the difference
among more than two groups in the dataset. It helps in explaining the amount
of variation in the dataset. In a dataset, two main types of variation can occur.
One type of variation occurs due to chance, while the other occurs due to
specific reasons. These variations are studied separately in ANOVA to identify
the actual cause of variation and help the researcher in taking effective
decisions.

In case of more than two independent samples, the ANOVA test explains three
types of variance. These are as follows:

• Total variance
• Between group variance
• Within group variance

The ANOVA test is based on the logic that if the between group variance is
significantly greater than the within group variance, it indicates that the means
of different samples are significantly different.

There are two main types of ANOVA, namely, one-way ANOVA and two-way
ANOVA. One-way ANOVA determines whether all the independent samples
(groups) have the same group means or not. On the other hand, two-way
ANOVA is used when you need to study the impact of two categorical variables
on a scale variable.

Objective: To find out the difference between the salaries of graduates,
postgraduates and PhDs.

H0: There is no difference between the salaries of graduates, postgraduates and
PhDs.

H1: There is a difference between the salaries of graduates, postgraduates and
PhDs.

Table 7.1: Data of salaries and qualification

Salary Qualification
65000 Postgraduate
60000 Postgraduate
45000 Graduate
40000 PhD
35000 Graduate
56000 Postgraduate
36000 PhD
45000 PhD
40000 Postgraduate
35000 Graduate
56000 PhD
36000 PhD
25000 Graduate
23000 Graduate
40000 Graduate
35000 Postgraduate
56000 PhD
36000 Postgraduate
45000 PhD

40000 Graduate
35000 Postgraduate
56000 PhD
37000 Postgraduate
25000 Graduate
85000 PhD
32000 Postgraduate
29000 Graduate
25000 Graduate
Table 7.2: Dataset of coding of different variables

Variables Numeric codes

Qualification 1 = Graduate

2 = Post Graduate

3 = PhD

SPSS Commands

Step 1: Click Analyze → Compare Means → One-Way ANOVA.

Figure 7.1: Screenshot of one-way ANOVA (step 1)

Step 2: Transfer the variable ‘salary’ to dependent list window and variable
‘qualification’ to factor window.

Figure 7.2: Screenshot of one-way ANOVA (step 2)

Step 3: Select ‘Post hoc’ and then click ‘Tukey’ as shown below in Figure 7.3.

Figure 7.3: Screenshot of one-way ANOVA (step 3)

Step 4: Click ‘options’ and select ‘Homogeneity of variance test’ and ‘Means
plot’ as shown in Figure 7.4.

Figure 7.4: Screenshot of one-way ANOVA (step 4)

The final SPSS (Statistical Package for the Social Sciences) output is shown in
Table 7.3, Table 7.4, Table 7.5, Table 7.6 and Figure 7.5 respectively.

Table 7.3: SPSS Output of One-way ANOVA.

Test of Homogeneity of Variances


Salary of different qualifications

Levene Statistic df1 df2 Sig.

1.900 2 25 .171

Table 7.4: SPSS Output of One-way ANOVA.

ANOVA
Salary of different qualifications

                 Sum of Squares   df  Mean Square     F      Sig.
Between Groups   1558603571.429    2  779301785.714   5.132  .014
Within Groups    3796075000.000   25  151843000.000
Total            5354678571.429   27

Table 7.5: SPSS Output of One-way ANOVA.


Multiple Comparisons

Dependent Variable: salary of different qualifications
Tukey HSD

(I) different    (J) different    Mean Difference  Std. Error  Sig.  95% Confidence Interval
qualifications   qualifications   (I-J)                              Lower Bound  Upper Bound
graduate         post graduate    -13000.000       5510.771    .066  -26726.40       726.40
                 3                -17675.000*      5845.056    .015  -32234.04     -3115.96
post graduate    graduate          13000.000       5510.771    .066    -726.40     26726.40
                 3                 -4675.000       5845.056    .707  -19234.04      9884.04
3                graduate          17675.000*      5845.056    .015    3115.96     32234.04
                 post graduate      4675.000       5845.056    .707   -9884.04     19234.04
*. The mean difference is significant at the 0.05 level.

Table 7.6: SPSS Output of One-way ANOVA.

Salary of different qualifications

Tukey HSD a,b
Different qualifications   N    Subset for alpha = 0.05
                                1          2
graduate                   10   32200.00
post graduate              10   45200.00   45200.00
3                           8              49875.00
Sig.                            .080       .697
Means for groups in homogeneous subsets are displayed.
a. Uses Harmonic Mean Sample Size = 9.231.
b. The group sizes are unequal. The harmonic mean of the group sizes is used.
Type I error levels are not guaranteed.

Figure 7.5: Screenshot of SPSS Output-Graphical form.

Conclusion: Table 7.3 represents the Levene test, which assumes the null
hypothesis that all sample variances are the same. The significance value of
0.171 indicates that, at 95% level of confidence, the null hypothesis can be
accepted. The homogeneity of variance is one of the desired conditions of the
one-way ANOVA test. Table 7.4 represents the results of the F test in one-way
ANOVA. As shown in Table 7.4, the p value (.014) of the F statistic (5.132) is
less than 5% level of significance. Hence, with 95% confidence level, the null
hypothesis of equal group means cannot be accepted. Thus it can be concluded
that the average salaries of graduates, postgraduates and PhDs are not the same.
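
The between-group and within-group logic of the F test can be sketched in plain Python. This is an illustration, not the SPSS routine: `one_way_anova` is a hypothetical helper and the salaries are grouped exactly as printed in Table 7.1, which lists 10 graduate, 9 postgraduate and 9 PhD rows. Table 7.6 reports group sizes of 10, 10 and 8, so the F value obtained here is not expected to equal the 5.132 of Table 7.4, although the total sum of squares should agree.

```python
def one_way_anova(groups):
    """Return (SS between, SS within, df between, df within, F)."""
    all_values = [x for values in groups.values() for x in values]
    grand_mean = sum(all_values) / len(all_values)
    ss_between = 0.0
    ss_within = 0.0
    for values in groups.values():
        group_mean = sum(values) / len(values)
        ss_between += len(values) * (group_mean - grand_mean) ** 2
        ss_within += sum((x - group_mean) ** 2 for x in values)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, df_between, df_within, f_stat


# Salaries grouped by the qualification labels printed in Table 7.1
salaries = {
    "Graduate": [45000, 35000, 35000, 25000, 23000,
                 40000, 40000, 25000, 29000, 25000],
    "Postgraduate": [65000, 60000, 56000, 40000, 35000,
                     36000, 35000, 37000, 32000],
    "PhD": [40000, 36000, 45000, 56000, 36000,
            56000, 45000, 56000, 85000],
}

ss_b, ss_w, df_b, df_w, f_stat = one_way_anova(salaries)
print(round(ss_b + ss_w, 3), df_b, df_w, round(f_stat, 3))
```

A large F arises when the group means are spread far apart relative to the spread of salaries inside each group, which is the comparison described in the concept section above.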
Worksheet – 8 Chi-Square Test
Chi-square test is one of the most popular non-parametric tests. It is used in two
cases which are as follows:

45
• To test the association between nominal variables in research.
• To test the difference between the expected and observed frequencies of
an event.

The chi-square test compares the actual observed frequencies with the
calculated expected frequencies of different combinations of nominal
variables. The difference between observed and expected frequencies indicates
whether the categorical variables may be associated. The chi-square statistic
compares the observed count in each table cell with the count that would be
expected for that combination of row and column classifications under the
assumption of no association. A negligible difference between observed and
expected frequencies may indicate no association, whereas a large difference
may indicate the possibility of association.
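
In symbols, the comparison described above can be sketched as follows. The notation is introduced here only for illustration and does not come from the worksheet: O_ij and E_ij are the observed and expected counts in row i and column j, R_i and C_j are the row and column totals, N is the sample size, and r and c are the numbers of rows and columns.

```latex
E_{ij} = \frac{R_i \, C_j}{N}, \qquad
\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}, \qquad
df = (r - 1)(c - 1)
```

When observed and expected counts are close in every cell, each term of the sum is near zero and the statistic stays small.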

Objective: To analyze the association between education background and level
of familiarity with the internet.

H0: There is no significant association between education background and level
of familiarity with the internet.

H1: There is a significant association between education background and level
of familiarity with the internet.

Table 8.2 has the data collected from 100 internet users. The data consists of
two nominal variables ‘Level of familiarity with the internet’ and ‘Education
Background.’ The details of the codes provided to different sub-categories
of these nominal variables are shown in Table 8.1.
Table 8.1: Codes provided to sub-categories

Codes for the variable ‘Level of        Codes for the variable ‘Education
familiarity with the internet’          Background’
1 = Low Familiarity                     1 = Humanities
2 = Medium                              2 = Management
3 = High                                3 = Technology
                                        4 = IT

Table 8.2: Data of 100 internet users


S.NO. Level of Education S.NO. Level of Education
familiarity with background familiarity with background
the internet the internet
1 3 1 39 1 1
2 2 3 40 2 3
3 3 1 41 2 2
4 3 1 42 1 1
5 3 4 43 2 3
6 3 4 44 2 4
7 3 1 45 2 2
8 3 1 46 3 1
9 3 1 47 3 3
10 3 3 48 2 2
11 2 1 49 3 2
12 1 1 50 2 2
13 3 1 51 1 2
14 3 1 52 2 2
15 3 3 53 1 4
16 2 4 54 3 2
17 2 2 55 2 2
18 2 4 56 2 4
19 2 2 57 1 3
20 2 4 58 1 4
21 3 1 59 3 4
22 3 1 60 3 1
23 3 4 61 1 2
24 3 1 62 1 2
25 3 2 63 2 2
26 3 2 64 1 2
27 3 4 65 2 2
28 3 3 66 1 2
29 2 2 67 2 2
30 2 1 68 2 3
31 1 3 69 1 2
32 3 2 70 3 1
33 2 4 71 2 2
34 3 2 72 2 3
35 2 2 73 1 1
36 1 2 74 2 2
37 2 1 75 2 2
38 2 4 76 2 1
77 1 1 89 2 1
78 2 3 90 2 4
79 1 2 91 2 1
80 1 1 92 1 3
81 1 1 93 1 4
82 1 3 94 2 1
83 1 1 95 1 1
84 1 1 96 1 3
85 1 2 97 1 2
86 1 1 98 1 1
87 2 1 99 1 1
88 1 2 100 1 3

SPSS Commands

Step 1: Click Analyze → Descriptive Statistics → Crosstabs.

Figure 8.1: Screenshot of chi-square test (step 1)

Step 2: Transfer ‘education background’ to the row(s) window and ‘familiarity
with the internet’ to the column(s) window. Click ‘Statistics’ as shown in
Figure 8.2.

Figure 8.2: Screenshot of chi-square test (step 2)

Step 3: Select the ‘chi-square’ and ‘Phi and Cramer’s V’ and click ‘continue’ as
shown in Figure 8.3.

Figure 8.3: Screenshot of chi-square test (step 3)

Step 4: Click on ‘Cells’ and select ‘Observed’ and ‘Expected’ and click
‘Continue’ as shown in Figure 8.4.

Figure 8.4: Screenshot of chi-square test (step 4)
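
Outside SPSS, the cross-tabulation requested in the steps above amounts to counting how often each pair of codes occurs. A minimal Python sketch, using only the first ten records of Table 8.2 to stay short; the helper name `crosstab` is hypothetical and is not an SPSS feature.

```python
from collections import Counter

# First ten records of Table 8.2 as (familiarity code, education code) pairs.
records = [(3, 1), (2, 3), (3, 1), (3, 1), (3, 4),
           (3, 4), (3, 1), (3, 1), (3, 1), (3, 3)]

def crosstab(pairs, row_codes=(1, 2, 3, 4), col_codes=(1, 2, 3)):
    """Count records per (education row, familiarity column) cell."""
    counts = Counter((edu, fam) for fam, edu in pairs)
    return [[counts[(r, c)] for c in col_codes] for r in row_codes]

table = crosstab(records)
for edu_code, row in zip((1, 2, 3, 4), table):
    print(edu_code, row)
```

Each printed row is one education code, with counts for low, medium and high familiarity in that order. Running the same function on all 100 records would give a full observed-count table.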


The final SPSS (Statistical Package for the Social Sciences) output in tabular
form is shown below in Table 8.3, Table 8.4, Table 8.5 and Table 8.6.
Table 8.3: SPSS Output of Chi-Square test

Case Processing Summary
                                               Cases
                              Valid            Missing          Total
                              N     Percent    N     Percent    N     Percent
Educational qualification *   100   100.0%     0     0.0%       100   100.0%
level of familiarity

Table 8.4: SPSS Output of Chi-square test

Educational qualification * level of familiarity Crosstabulation
                                            Level of familiarity
Educational                                 low
qualification                               familiarity   medium   high    Total
humanities        Count                     13            6        15      34
                  Expected Count            11.2          12.6     10.2    34.0
management        Count                     11            17       6       34
                  Expected Count            11.2          12.6     10.2    34.0
technology        Count                     6             6        4       16
                  Expected Count            5.3           5.9      4.8     16.0
it                Count                     3             8        5       16
                  Expected Count            5.3           5.9      4.8     16.0
Total             Count                     33            37       30      100
                  Expected Count            33.0          37.0     30.0    100.0
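
The expected counts in Table 8.4 follow from the row and column totals alone. A minimal Python sketch, assuming the observed counts are entered exactly as in the crosstabulation; the variable names are illustrative.

```python
# Observed counts from Table 8.4.
# Rows: humanities, management, technology, IT.
# Columns: low, medium, high familiarity.
observed = [
    [13, 6, 15],
    [11, 17, 6],
    [6, 6, 4],
    [3, 8, 5],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count for a cell = row total * column total / grand total.
expected = [[r * c / n for c in col_totals] for r in row_totals]

for row in expected:
    print([round(e, 1) for e in row])
```

Rounded to one decimal place, these values agree with the Expected Count rows printed by SPSS.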

Table 8.5: SPSS Output of Chi-square test

Chi-Square Tests
                      Value        df    Asymp. Sig. (2-sided)
Pearson Chi-Square    11.226 (a)   6     .082
Likelihood Ratio      12.019       6     .062
N of Valid Cases      100

a. 2 cells (16.7%) have expected count less than 5. The minimum expected
count is 4.80.
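
The Pearson chi-square value, its degrees of freedom and its p value can be recomputed from the observed counts in Table 8.4. The sketch below uses only the Python standard library, so the p value comes from a closed-form expression that holds only when the degrees of freedom are even (here df = 6). The helper `chi2_sf_even_df` is a hypothetical name. In practice a statistics library such as SciPy would normally be used instead.

```python
import math

# Observed counts from Table 8.4 (rows: education; columns: familiarity).
observed = [
    [13, 6, 15],
    [11, 17, 6],
    [6, 6, 4],
    [3, 8, 5],
]
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Pearson chi-square: sum of (observed - expected)^2 / expected over all cells.
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(observed) - 1) * (len(observed[0]) - 1)

def chi2_sf_even_df(x, dof):
    """Upper-tail chi-square probability; this closed form needs an even dof."""
    if dof % 2 != 0:
        raise ValueError("closed form used here requires even degrees of freedom")
    half = x / 2
    return math.exp(-half) * sum(half ** k / math.factorial(k)
                                 for k in range(dof // 2))

p_value = chi2_sf_even_df(chi_square, df)

# Cells with an expected count below 5 (footnote a of the SPSS output).
small_cells = sum(e < 5 for row in expected for e in row)

print(round(chi_square, 3), df, round(p_value, 3), small_cells)
```

The count of cells with expected frequency below 5 is worth checking because the chi-square approximation becomes less reliable as that count grows.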

Table 8.6: SPSS Output of Chi-square test

Symmetric Measures
                                    Value    Approx. Sig.
Nominal by Nominal    Phi           .335     .082
                      Cramer's V    .237     .082
N of Valid Cases                    100
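
Phi and Cramer's V are both derived from the chi-square value and the sample size. A minimal Python sketch, taking the Pearson chi-square (11.226), the number of valid cases (100) and the table dimensions (4 rows, 3 columns) from the output above as inputs; the variable names are illustrative.

```python
import math

chi_square = 11.226      # Pearson chi-square reported in Table 8.5
n = 100                  # number of valid cases
n_rows, n_cols = 4, 3    # education categories x familiarity levels

# Phi: square root of chi-square divided by the sample size.
phi = math.sqrt(chi_square / n)

# Cramer's V rescales by the smaller table dimension minus one.
cramers_v = math.sqrt(chi_square / (n * (min(n_rows, n_cols) - 1)))

print(round(phi, 3), round(cramers_v, 3))
```

Both measures are computed from the same chi-square test, which is why SPSS reports the same approximate significance (.082) for each.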

Conclusion: The p value (0.082) is greater than the 5% level of significance,
so the null hypothesis of no association between education background and
level of familiarity with the internet cannot be rejected.
