Introduction to Statistics in Psychology, Chapter 6: Relationships between variables
Relationships between
two or more variables
Diagrams and tables
Overview
• Most research in psychology involves the relationships between two or more variables.
Preparation
You should be aware of the meaning of variables, scores and the different scales of
measurement, especially the difference between nominal (category) measurement and
numerical scores.
60 PART 1 DESCRIPTIVE STATISTICS
6.1 Introduction
Although it is fundamental and important to be able to describe the characteristics of
each variable in your research both diagrammatically and numerically, interrelationships
between variables are more characteristic of research in most areas of psychology and
the social sciences. Public opinion polling is the most common use of single-variable
statistics that most of us come across. Opinion pollsters ask a whole series of questions
about political leaders and voting intentions which are generally reported separately.
However, researchers often report relationships between two variables. So, for example,
to ask whether the voting intentions of men and women differ is really to enquire
whether there is a relationship between the variable ‘gender’ and the variable ‘voting
intention’. Similarly, if one asks whether the popularity of the President of the USA changed
over time, this really implies that there may be a relationship between the variable ‘time’
and the variable ‘popularity of the President’. Many of these questions seem so familiar
to us that we regard them almost as common sense. Given this, we should not have any
great difficulty in understanding the concept of interrelationships among variables.
Interrelationships between variables form the bedrock of virtually all psychological
research. It is rare in psychology to have research questions which require data from
only one variable at a time. Much of psychology concerns explanations of why things
happen – what causes what – which clearly is about relationships between variables.
This chapter describes some of the main graphical and tabular methods for presenting
interrelationships between variables. Diagrams and tables often overlap in function as
will become apparent in the following discussion. We should emphasise that graphs and
tables are not simply ways of smartening up a report or dissertation. Their function in
statistical analysis is much deeper than this and they are at the heart of the analytic work
of the researcher. Graphs and tables should be the mainstay of a good statistical
analysis, not the end product. Their role is crucial from the start of the analysis as part
of the process of familiarising yourself with your data, which leads to an understanding
of what is going on in the data. The initial stage is to look at charts which give the
distribution of each of the variables in your study. This can lead you to identify problems
such as very skewed distributions for a variable or bunching and clustering around
particular data points. Then you can move on to the graphs and tables which allow you to
understand the relationships between two variables. This may well be your first indication
that your expectations are being confirmed by your data. But it may show that the
relationships that you are expecting are more complex than you imagined, or that
outliers may be spuriously creating the appearance of a relationship between your
variables when there is no relationship for the bulk of the data. You have to enter this
phase with an open mind since it involves getting to understand your data and
becoming familiar with its characteristics. This is why you do research. Graphs and tables
may seem like very basic procedures compared with the riches of more advanced statistics
but they are basic because they are the base on which your analysis is built. Figure 6.1 gives the
key steps to consider when describing relationships between two variables in diagram
and table form.
Table 6.1 Types of relationships based on nominal categories and numerical scores
FIGURE 6.1 Conceptual steps for showing relationships between two variables
data. If we are considering the interrelationships between two variables (X and Y) then
the types of variable involved are as shown in Table 6.1.
Once you have decided to which category your pair of variables belongs, it is easy to
suggest appropriate descriptive statistics. We have classified different situations as type
A, type B and type C. Thus type B has both variables measured on the nominal category
scale of measurement.
FIGURE 6.2 The dramatic fall in share price in the Timeshare Office Company
Time is no different, statistically speaking, from a wide range of other numerical scores.
Figure 6.3 is an example of a scattergram from a psychological study. You will see that
the essential features remain the same. In Figure 6.3, the point marked with an arrow
represents a case whose score on the X-variable is 8 and whose score on the Y-variable
is 120. It is sometimes possible to see that the points of a scattergram fall more or less
on a straight line. The straight line that best fits the points of a scattergram is called the
regression line. Figure 6.3 includes the regression line for the points of the scattergram.
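To make the idea of a regression line concrete, the following is a minimal sketch in Python. It assumes the usual least-squares definition of the best-fitting line, which this section does not spell out, and the function name `regression_line` and the scores are invented purely for illustration.

```python
def regression_line(xs, ys):
    """Return (slope, intercept) of the least-squares line predicting Y from X."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations, and sum of squared X deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical scores, invented for illustration only.
x_scores = [1, 2, 3, 4, 5]
y_scores = [2, 4, 5, 4, 5]

slope, intercept = regression_line(x_scores, y_scores)
print(f"Y = {intercept:.1f} + {slope:.1f} * X")
```

The slope tells you how much the Y score changes, on average, for each one-unit increase in the X score; the intercept is where the line crosses the Y-axis.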
One complication you sometimes come across is where several points on the scattergram
overlap completely. In these circumstances you may well see a number next to a point which
corresponds to the number of overlapping points at that position on the scattergram.
In line with general mathematical notation, the horizontal axis or horizontal dimen-
sion is described as the X-axis and the vertical axis or vertical dimension is called the
Y-axis. It is helpful if you remember to label one set of scores the X scores since these
belong on the horizontal axis, and the other set of scores the Y scores because these
belong on the vertical axis (Figure 6.4).
FIGURE 6.4 A scattergram with the X- and Y-axes labelled and overlapping points illustrated
Variable X Variable Y
0–9 15 7 6 3 4
10–19 7 12 3 5 4
20–29 4 9 19 8 4
30–39 1 3 2 22 3
40–49 3 2 3 19 25
In Figure 6.4, overlapping points are marked not with a number but with lines around
the point on the scattergram. These are called ‘sunflowers’ – the number of ‘petals’ is the
number of cases overlapping at the same point. So if there are two ‘petals’ then there are
two people with the same pattern of scores on the two variables. If there are three
‘petals’ then three people have exactly the same pattern of scores on the two variables.
Another way of indicating overlaps is simply to put the number of overlaps next to the
scattergram point.
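Whichever way overlaps are marked, you first need to know how many cases share each point. The following is one possible sketch in Python of that counting step; the pairs of scores are hypothetical and the variable names are our own.

```python
from collections import Counter

# Hypothetical (X, Y) score pairs, invented for illustration only.
pairs = [(3, 7), (5, 9), (3, 7), (8, 12), (5, 9), (3, 7)]

# Count how many cases fall on each point of the scattergram.
overlap_counts = Counter(pairs)

# Keep only the points shared by more than one case; each count is the
# number of 'petals' (or the number to print next to the point).
overlapping = {point: n for point, n in overlap_counts.items() if n > 1}

for point, n in overlapping.items():
    print(point, "is shared by", n, "cases")
```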
Apart from cumbersomely listing all of your pairs of scores, it is often difficult to
think of a succinct way of presenting data from pairs of numerical scores in tabular
form. The main possibility is to categorise each of your score variables into ‘bands’ of
scores and express the data in terms of frequencies of occurrence in these bands; a table
like Table 6.2 might be appropriate.
Such tables are known as ‘crosstabulation’ or ‘contingency’ tables. In Table 6.2 there
does seem to be a relationship between variable X and variable Y. People with low scores
on variable X also tend to get low scores on variable Y. High scorers on variable X also
tend to score highly on variable Y. However, the trend in the table is less easily discerned
than in the equivalent scattergram.
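The banding described above can be sketched in a few lines of Python. This is only an illustration: it assumes whole-number, non-negative scores and bands ten points wide (0-9, 10-19 and so on), and the helper `band` and the score pairs are hypothetical rather than taken from the table.

```python
from collections import Counter


def band(score, width=10):
    """Return the label of the band containing a score, e.g. 17 -> '10-19'."""
    lower = (score // width) * width
    return f"{lower}-{lower + width - 1}"


# Hypothetical (X, Y) score pairs, invented for illustration only.
pairs = [(4, 7), (12, 15), (25, 28), (8, 3), (17, 11), (23, 35)]

# Each cell of the crosstabulation is the frequency of one combination
# of an X band and a Y band.
table = Counter((band(x), band(y)) for x, y in pairs)

for (x_band, y_band), f in sorted(table.items()):
    print(f"X {x_band}, Y {y_band}: f = {f}")
```

Wider bands give a more compact table but blur the trend further, so the band width is a judgement call.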
Table 6.3 Gender and whether previously hospitalised for a set of 89 people
Person   Gender   Previously hospitalised
1        male     yes
2        male     no
3        male     no
4        male     yes
5        male     no
...      ...      ...
85       female   yes
86       female   yes
87       female   no
88       female   no
89       female   yes
                              Male      Female
Previously hospitalised       f = 20    f = 25
Not previously hospitalised   f = 30    f = 14
You probably think that Table 6.5 is not much of an improvement in clarity. An alter-
native is to express the frequencies as percentages of males and percentages of females
(Table 6.6). By presenting the percentages based on males and females separately, it is
easier to see the trend for females to have been previously hospitalised relatively
more frequently than males.
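As a rough sketch of how such percentages could be worked out, the Python below takes the male and female frequencies given earlier (20, 25, 30 and 14) and expresses each as a percentage of its own gender group. The dictionary layout and the rounding to one decimal place are our own choices.

```python
# Frequencies of previous hospitalisation for the 89 people, by gender.
frequencies = {
    "Male": {"Previously hospitalised": 20, "Not previously hospitalised": 30},
    "Female": {"Previously hospitalised": 25, "Not previously hospitalised": 14},
}

percentages = {}
for gender, counts in frequencies.items():
    total = sum(counts.values())  # number of people of this gender
    percentages[gender] = {
        category: round(100 * f / total, 1) for category, f in counts.items()
    }

for gender, row in percentages.items():
    for category, pct in row.items():
        print(f"{gender}, {category}: {pct}%")
```

Because there are 50 males but only 39 females, the raw frequencies (20 against 25) understate the difference that the percentages make obvious.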
The same data can be expressed as a compound bar chart. In a compound bar chart
information is given about the subcategories based on a pair of variables. Figure 6.5
shows one example in which the proportions are expressed as percentages of the males
and females separately.
The golden rule for such data is to ensure that the number of categories is manageable.
In particular, avoid having too many empty or near-empty categories. The compound
bar chart shown in Figure 6.6 is a particularly bad example and is not to be copied. This
chart fails any reasonable clarity test and is too complex to decipher quickly.
FIGURE 6.5 Compound percentage bar chart showing gender trends in previous hospitalisation
Low-tech industry 7 18 3 1
High-tech industry 17 7 0 0
Table 6.8 Comparison of the statistical characteristics of anxiety in two different types of industry
which gives the mean, median, mode, etc. for the anxiety scores of the two different
groups.
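A comparison of this kind could be produced along the following lines. This is a sketch only: the anxiety scores are invented for illustration and are not the data behind the table, and Python's standard `statistics` module is used for the calculations.

```python
import statistics

# Hypothetical anxiety scores, invented for illustration only.
anxiety_scores = {
    "Low-tech industry": [3, 5, 5, 6, 8],
    "High-tech industry": [2, 2, 3, 4, 9],
}

# One row of descriptive statistics per group.
summary = {
    group: {
        "mean": statistics.mean(scores),
        "median": statistics.median(scores),
        "mode": statistics.mode(scores),
    }
    for group, scores in anxiety_scores.items()
}

for group, stats in summary.items():
    print(group, stats)
```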
Key points
• Never assume that your tables and diagrams are good enough at the first attempt. They could
probably be improved with a little care and adjustment.
• Do not forget that tables and diagrams are there to present clearly the major trends in your data (or
lack of them). There is not much point in having tables and diagrams that do not clarify your data.
• Your tables and diagrams are not a means of tabulating your unprocessed data. If you need to present
your data in full then most of the methods to be found in this chapter will not help you much.
• Labelling tables and diagrams clearly and succinctly is an important part of the task. Without clear
titling and labelling you are probably wasting your time.
COMPUTER ANALYSIS
The SPSS Statistics instruction book accompanying this text is Dennis Howitt and Duncan Cramer (2011), Introduction to SPSS
Statistics in Psychology: For version 19 and earlier, Harlow: Pearson. Chapters 8 (tables) and 9 (diagrams) in that book give
detailed step-by-step procedures for the statistics described in this chapter together with advice on how to report the
results. Figure 6.8 gives the SPSS Statistics steps for producing contingency tables, compound charts and histograms.
FIGURE 6.8 SPSS Statistics steps for contingency tables, compound charts and histograms