PSY123 Lecture 10-1
PSY123-Introduction to
research methods and
statistics
• Chapter 2 (pgs. 26-38), Reader (pgs. 21-27)
• Step 4: data analysis – analysing quantitative
data (introduction to statistics)
• Coding – assigning numerical values to
variables or responses
• Cleaning data set
• Descriptive statistics vs inferential statistics
• Summarising data: frequency distributions –
i.e., ungrouped frequency distributions or
grouped frequency distributions
• Stem and leaf
Lecture 10
PSY123-Introduction to research methods and
statistics
Reader (pgs. 36-39)
• Graphic representation of frequency distributions
Reader (pgs. 40-43), Chapter 2 (pgs. 39-40)
• Measures of central tendency for grouped and ungrouped frequency
distributions
Reader (pgs. 43-46), Chapter 2 (pg. 40)
• Measures of variability
Graphic presentation of data
• The stem and leaf display can also be viewed as a pictorial or
graphic presentation of scores.
• However, when we think of graphs, we often think of traditional
ones such as bar charts, histograms and the frequency polygon
• Graphic presentation is another way of presenting data and
information. Usually, graphs (as discussed above) are used to
present frequency distributions.
• Most graphs use two lines that are perpendicular (placed at right
angles) to one another
• The horizontal line is called the x-axis and the vertical line is called
the y-axis
• To summarise frequencies (the occurrence of scores), the score is
listed on the x-axis, while the frequencies are listed on the y-axis.
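As a minimal sketch of how scores and their frequencies map onto the two axes, the Python snippet below counts how often each score occurs. The scores are hypothetical and invented for illustration only; they do not come from the lecture.

```python
from collections import Counter

# Hypothetical scores, invented for illustration only.
scores = [3, 5, 4, 3, 5, 5, 2, 4, 5, 3]

# Count how often each score occurs (an ungrouped frequency distribution).
frequencies = Counter(scores)

# Scores go on the x-axis (sorted low to high), frequencies on the y-axis.
x_values = sorted(frequencies)
y_values = [frequencies[score] for score in x_values]

for score, freq in zip(x_values, y_values):
    print(f"score {score}: frequency {freq}")
```

Each printed pair corresponds to one point or bar on a graph of the distribution.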
Bar chart vs Histogram
• Two of the traditional graphs that we may be familiar with are the bar
chart and the histogram
• So how different or alike are these graphs?
Gender example
Bar chart
• The bar chart is normally used for variables
with discrete or distinct categories (such as
gender or seasons)
• It is a pictorial representation of data that uses
bars to compare different categories of data.
• The vertical bars on the chart do not touch
each other, indicating discontinuity.
• Categories are listed on the x-axis and the
frequencies on the y-axis
• The height of each bar corresponds to the
frequency for each category
• The gaps between bars are used to indicate
discrete/distinct categories
Gender example
Histogram
• The histogram is normally used for continuous,
numerical data (ranging from low to high) or
data that is grouped in different classes.
• It represents the frequency distribution of
continuous variables.
• The vertical bars on this chart do touch each
other.
• The x-axis represents groups of scores
organized into classes, or numbers which are
categorised together to represent ranges of
data (in the example, we note that the
midpoint of each class is listed on the x-axis and
not the various classes)
• The height of each bar corresponds to the
frequency for each class
Frequency polygon
• An alternative to the histogram to show frequencies is to use a frequency
polygon
What is a frequency polygon?
• Like the bar chart and histogram, it is a graphical form of representation of data. It
is used to depict the shape of the data and to depict trends.
• It is also known as a line graph
• It is drawn by calculating and plotting the frequencies of the different data
values and then connecting the plotted dots or midpoints with straight
lines
• The midpoint or class mark can be calculated as follows:
first add the lower class limit and the upper class limit, then
divide the total by 2
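The midpoint calculation described above can be sketched in Python as follows. The class intervals are hypothetical, and the function name `class_midpoint` is an illustrative choice rather than anything from the lecture.

```python
def class_midpoint(lower_limit, upper_limit):
    """Return the midpoint (class mark) of a class interval."""
    return (lower_limit + upper_limit) / 2

# Hypothetical class intervals given as (lower limit, upper limit).
classes = [(10, 14), (15, 19), (20, 24)]

midpoints = [class_midpoint(lower, upper) for lower, upper in classes]
print(midpoints)
```

For the class 10-14, for instance, the midpoint is (10 + 14) / 2 = 12.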
…continued
• In this polygon the midpoint of
each class is listed on the x-axis.
• Above each midpoint a solid dot is
placed corresponding to the
frequency of that class. The dots are
then joined together with the line.
• Also note that two additional
midpoints were added, one below
the lowest class (16) and one above
the highest class (49). This is done to
anchor the lines of the frequency
polygon to the x-axis.
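The anchoring step can be illustrated with a short Python sketch. The midpoints and frequencies below are hypothetical, and the sketch assumes equally wide classes, so that each added midpoint lies one class width beyond the outermost real midpoint and has a frequency of zero.

```python
# Hypothetical class midpoints and their frequencies (illustration only).
midpoints = [12, 17, 22, 27]
frequencies = [3, 7, 5, 2]

# Assumption: all classes are equally wide, so the class width equals
# the distance between two neighbouring midpoints.
width = midpoints[1] - midpoints[0]

# Add one extra midpoint below the lowest class and one above the highest,
# each with a frequency of 0, so the polygon is anchored to the x-axis.
x_points = [midpoints[0] - width] + midpoints + [midpoints[-1] + width]
y_points = [0] + frequencies + [0]

# The (x, y) pairs are the dots that are joined by straight lines.
polygon = list(zip(x_points, y_points))
print(polygon)
```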
Shapes of distribution
• We know now that graphs are a way of summarising and presenting data visually.
• However, they also play an important role in describing the shape of the distribution of scores
• The shape of a distribution plays a crucial role in the selection of appropriate statistical
techniques.
• The 4 shapes below are the most common shapes of a distribution of scores
…continued
• The figure represented shows a distribution
in statistics that is often desired and thus
known as a normal distribution
• In this distribution the majority of scores
are clustered in the centre of the
distribution, and then trail off towards the
upper and lower extremes
• The normal distribution is bell shaped
• It is also called a symmetrical distribution
since the left and right halves of the
distribution are mirror images of each
other.
…continued
First add all the scores to get the total (= 43). Then divide by the total
number of observations (in this case 7)
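The two steps above describe the calculation of the mean. As a small sketch, the arithmetic can be checked in Python using only the total and the number of observations given in the slide; the individual scores are not listed here, so they are not reproduced.

```python
# Values taken from the slide: the scores add up to 43 and there are 7 of them.
total = 43
n_observations = 7

# Mean = sum of all scores divided by the number of observations.
mean = total / n_observations
print(round(mean, 2))
```

Rounded to two decimals, the mean is 6.14.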
…continued
• It is possible to calculate the mode, median and mean for grouped
data by using special formulae, but these produce only approximate
statistics, since the results are not based on
actual scores
• However, it is advised to avoid calculating measures of central
tendency for grouped frequency distributions.
Skewness – comparing the measures of
central tendency
• Skewness refers to the tendency for
scores to cluster at one side of the
distribution rather than around the centre.
• In a normal distribution (C) the mean,
mode and median are exactly the
same.
• In a positively skewed distribution (A),
most scores occur at the lower end of
the distribution below the mean with
the median and mode less than the
mean.
• In a negatively skewed distribution
(B), most scores appear above the
mean and the mode and median are
also greater than the mean.
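The ordering of the three measures in skewed distributions can be checked with a small Python sketch. Both data sets are hypothetical and were chosen only so that one has a long tail of high scores and the other a long tail of low scores.

```python
import statistics

def central_tendency(scores):
    """Return (mean, median, mode) for a list of scores."""
    return (
        statistics.mean(scores),
        statistics.median(scores),
        statistics.mode(scores),
    )

# Hypothetical positively skewed scores: most are low, one is very high.
positive = [1, 2, 2, 2, 3, 3, 4, 10]
# Hypothetical negatively skewed scores: most are high, one is very low.
negative = [1, 7, 8, 8, 9, 9, 9, 10]

pos_mean, pos_median, pos_mode = central_tendency(positive)
neg_mean, neg_median, neg_mode = central_tendency(negative)

print("positive skew:", pos_mean, pos_median, pos_mode)
print("negative skew:", neg_mean, neg_median, neg_mode)
```

In the first set the mode and median fall below the mean; in the second they lie above it.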
Measures of variability
• Although measures of central tendency are extremely useful, they only
provide us with a partial description of our data.
• We also need to consider what measures contribute to a mean, mode or
median.
• For example, the mean of 30 and 30 is 30; however, the mean of 60 and 0
is also 30. Thus, to fully understand a data set, we also need to consider
measures of variability.
• A measure of variability provides an indication of how diverse or variable
the spread of scores is.
• When scores are spread out, the variability is high, whereas when
scores are clustered together, the variability is low.
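The example of two data sets with the same mean can be sketched in Python. The spread is measured here simply as the distance between the highest and lowest score, which is one possible choice of measure for illustration.

```python
import statistics

# The two pairs of scores from the example above.
set_a = [30, 30]
set_b = [60, 0]

# Both sets have the same mean.
mean_a = statistics.mean(set_a)
mean_b = statistics.mean(set_b)

# The distance between the highest and lowest score differs sharply.
spread_a = max(set_a) - min(set_a)
spread_b = max(set_b) - min(set_b)

print(mean_a, spread_a)
print(mean_b, spread_b)
```

The means are identical, yet the first set has no spread at all while the second covers 60 points.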
…continued
• The most common measures of variability are the range, the
variance, & the standard deviation.
• These are the most important descriptive statistics, as they form the
basis for most advanced statistical procedures
• The range indicates the distance between the highest and lowest
scores, whilst the other measures of variability relate to how far
scores vary from a typical score (i.e., the mean).
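The three measures can be sketched in Python as shown below. The scores are hypothetical. The sketch also assumes the population formula, in which the sum of squared deviations is divided by the number of scores N; many textbooks use the sample formula, which divides by N - 1 instead, so check which one the course material prescribes.

```python
import statistics

# Hypothetical scores, invented for illustration only.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

# Range: highest score minus lowest score.
score_range = max(scores) - min(scores)

# Variance: average squared distance of each score from the mean.
# Assumption: population formula (divide by N, not N - 1).
mean = statistics.mean(scores)
squared_deviations = [(score - mean) ** 2 for score in scores]
variance = sum(squared_deviations) / len(scores)

# Standard deviation: square root of the variance.
standard_deviation = variance ** 0.5

print(score_range, variance, standard_deviation)
```

The standard deviation is expressed in the same units as the original scores, which makes it easier to interpret than the variance.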
…continued
So, what is the range?
• The range is defined as the highest
score minus the lowest score.
• In our gym attendance example,
the highest score is 47 and the
lowest is 19
• Therefore, the range is: 47 − 19 = 28
4. A group of seven people have taken a quiz and their scores are 8, 9, 10, 5, 8, 9, and 10. What is the standard deviation?
5. The _______ is normally used for _______, ________ or data that is grouped in different classes.
6. A distribution is unimodal when:
(A) it has only one major peak
(B) it is symmetrical
(C) the mean, mode and the median are equal
(D) both A and B
7. You are given the following scores: 3, 5, 7, 12, 13, 14, 21, 23, 23, 23, 23, 29, 40, 56. What is the median of the scores?
8. True or False? A high standard deviation indicates that values are clustered very close to the mean.