
StatsSummary JCHL

Statistics can be used to summarize large amounts of data. There are several key concepts: 1) Data can be numerical, categorical, discrete, continuous, ordinal or nominal. Common ways to collect data include surveys, questionnaires, and existing records. It is important to select a random sample that represents the overall population. 2) Common graphs used to visualize data include bar charts, line plots, pie charts and histograms. These graphs must be clearly labeled. 3) Key metrics for analyzing data numerically include the mean, median, and mode which measure center, and the range which measures spread. Frequency distributions organize large datasets into grouped intervals.

Uploaded by

rukhsararaisa8

Topic 6: Statistics

1) The Basics:

a) Terminology:

• Numerical: data is numbers
e.g.s shoe size, height, rainfall, number of kids in a family
• Categorical: data is text
e.g.s favourite phone brand, tv programme, hair colour
• Discrete: numerical data that can only take on set values (generally whole numbers)
e.g.s shoe size, number of kids in family
• Continuous: numerical data that can take on a range of values (can be decimals)
e.g.s rainfall in mm, weight, height
• Ordinal: categorical data that can be put into order
e.g. grades in an exam A, B, C….
• Nominal: categorical data that cannot be put into order
e.g. phone brand
• Primary Data: data collected by person who's going to use it
• Secondary Data: data that's already available e.g. internet, magazines
• The population is the entire group being studied.
• A sample is a group that is selected from the population.
• A census is a survey of the whole population.
• A sampling frame is a list of all those within a population who can be sampled.
• An outlier is an extreme value that is not typical of other values in the data set.
• Bias can mean something which sways a respondent in a particular way or another, in a survey/questionnaire. The term bias can also be used if a sample doesn’t reflect the population. E.g. selecting people coming out of Lidl and asking them their opinion on shopping in non-Irish owned retailers.

b) Collecting Data:

Notes: When selecting people to survey it is important that:
➢ the sample is selected randomly to avoid bias
➢ the sample represents the population
➢ the sample is sufficiently large

Methods of Collecting Data:
• Phone Interview:
Advs: questions can be explained, can select sample from entire population
Disadvs: expensive compared to post or online
• Online Questionnaire:
Advs: cheap, anonymous so answers are more honest
Disadvs: people may not respond, not representative of entire population...only those that are online
• Face to Face Interview:
Advs: questions can be explained
Disadvs: people might not answer honestly when asked in person, expensive and not random
• Postal Questionnaire:
Advs: not expensive
Disadvs: people don't always respond
• Observation:
Advs: low cost, easy to carry out
Disadvs: not suitable for some surveys, questions can't be explained

Tips for designing a questionnaire:
• Use clear & simple language
• Begin with simple questions
• Accommodate all possible answers
• Contain no leading questions
• Be as brief as possible
• Be clear where answers should be recorded
• Avoid personal questions
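
The note on selecting a sample randomly from the population can be sketched in Python. This is only an illustration and not part of the original notes: the sampling frame of 100 student labels, the sample size of 10 and the function name `pick_sample` are all assumptions made for the example.

```python
import random


def pick_sample(sampling_frame, sample_size, seed=None):
    """Return a simple random sample, chosen without replacement."""
    rng = random.Random(seed)  # passing a seed makes the choice repeatable
    return rng.sample(sampling_frame, sample_size)


# Hypothetical sampling frame: a list of everyone who could be surveyed.
sampling_frame = [f"Student {i}" for i in range(1, 101)]

# Every student has the same chance of being picked, which helps avoid bias.
sample = pick_sample(sampling_frame, 10, seed=1)
print(sample)
```

Because the sample is drawn without replacement, nobody can be selected twice. Whether 10 is a sufficiently large sample depends on the survey.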

2) Graphing Data from Junior Cert:

a) Bar Charts:

Notes:
➢ Individual bars must be labelled and axes labelled
➢ Must be an even scale on vertical (e.g. going up in 25s in example above)
➢ Bars and axes drawn with ruler
➢ Can be used for categorical data

b) Trend Graphs:

Note:
➢ Axes labelled and scaled evenly

c) Line Plots:

Notes:
➢ Clear columns and rows of 'x' (See Diagram)
➢ Each column labelled
➢ Can be used for categorical data

d) Pie Charts:

Notes:
➢ Circle drawn with compass
➢ Angles measured with protractor
➢ Label sectors and angles of the pie chart
➢ Can be used for categorical data

e) Histograms:

Notes:
➢ Different to bar chart as there is a scale along the bottom as well
➢ Axes labelled and evenly scaled
➢ Axes and bars drawn with ruler
➢ Can be used for continuous numerical data

f) Stem & Leaf Plots:

Notes:
➢ Clear columns and rows of numbers (see diagram)
➢ Key MUST be included (see diagram)
➢ Can use comma separated list for leaves also
➢ Can use back-to-back plots to compare two sets of data (bivariate data)
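
As a rough illustration of the stem & leaf notes (clear rows of numbers, and a key), the sketch below builds a text plot in Python. The marks are invented for the example and the function name `stem_and_leaf` is hypothetical. It assumes non-negative whole numbers, with the tens digit as the stem and the units digit as the leaf.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Build stem-and-leaf rows for whole numbers (stem = tens, leaf = units)."""
    rows = defaultdict(list)
    for value in sorted(values):  # sorting first keeps the leaves in order
        rows[value // 10].append(value % 10)
    lines = []
    # Walk every stem from smallest to largest so empty stems still get a row.
    for stem in range(min(rows), max(rows) + 1):
        leaves = " ".join(str(leaf) for leaf in rows.get(stem, []))
        lines.append(f"{stem} | {leaves}".rstrip())
    return lines


# Hypothetical test marks, invented for this example.
marks = [42, 57, 45, 61, 68, 53, 57, 49, 70, 66]

plot = stem_and_leaf(marks)
for line in plot:
    print(line)
print("Key: 4 | 2 means 42")  # the key must always be included
```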

3) Analysing Data:

a) Measures of Centre:

1. Mean: the sum of all the values divided by the number of values
e.g. Data: 1, 4, 3, 5, 4, 2, 1
Mean = (1 + 4 + 3 + 5 + 4 + 2 + 1) / 7 = 20/7 ≈ 2.86
• Only used with numerical data
• Advs: uses all the data
• Disadvs: affected by outliers

2. Mode: the value that appears the most often
e.g. Data: 2, 3, 1, 2, 5, 4, 2, 1, 2
Mode = 2 (as it appears 4 times)
• Can be used for numerical data, and is the only measure of centre that can be used for categorical data
• Advs: Not affected by outliers, can be used for any data
• Disadvs: There is not always a mode, does not use all the data

3. Median: the middle value (list must be in ascending order)
e.g. Data: 2, 1, 3, 3, 2, 5, 3, 2, 1
Rearrange in order first: 1, 1, 2, 2, 2, 3, 3, 3, 5
=> Median = 2
• Used only with numerical data
• Advs: Easy to calculate, not heavily affected by outliers
• Disadvs: Does not use all the data

b) Measures of Spread:

Note: For the following, the list of values should be in ascending order
Range: the difference between the max and the min value
e.g. Data: 20, 40, 40, 45, 60 => Range = 60 – 20 = 40

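
The worked examples for the mean, mode, median and range can be checked with Python's standard `statistics` module. This is just one way to verify the arithmetic; the variable names are chosen for this sketch and the data lists are the ones used in the examples in this section.

```python
import statistics

# Data sets copied from the worked examples in this section.
mean_data = [1, 4, 3, 5, 4, 2, 1]
mode_data = [2, 3, 1, 2, 5, 4, 2, 1, 2]
median_data = [2, 1, 3, 3, 2, 5, 3, 2, 1]
range_data = [20, 40, 40, 45, 60]

# Mean: sum of the values divided by how many there are (20 / 7).
mean = round(statistics.mean(mean_data), 2)

# Mode: the value that appears most often.
mode = statistics.mode(mode_data)

# Median: the middle value; the function sorts the list for us.
median = statistics.median(median_data)

# Range: the largest value minus the smallest value.
data_range = max(range_data) - min(range_data)

print(mean, mode, median, data_range)  # 2.86 2 2 40
```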
4) Frequency Distributions:

a) Frequency Distributions:

• A frequency distribution is a way of grouping together a large amount of data into a table. E.g.

No. in Household    2    3    4    5    6    7
No. of People       6    8   14   11    4    1

• Always remember what this table represents…..i.e. a full list of data: 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4……………….

b) Mean, Mode and Median of a Frequency Distribution:

Mode: Can be read straight away from the table in (a)
=> Mode = 4 as it appears the most often (14 times)

Mean:
o We could add up all the values in the full list, shown below the table in (a), and then divide by 44
o Quicker way is to multiply the columns together from the table i.e. (2x6)+(3x8)+(4x14)+(5x11)+(6x4)+(7x1) = 178
o We then divide this by 44 to get a mean of 4.05 (correct to two decimal places)

Median:
o Count up how many values we have in total by adding the bottom row i.e. 6 + 8 + 14 + 11 + 4 + 1 = 44
o This means that the median here will be the average of the 22nd and 23rd values.
o We can find the 22nd and 23rd values from the table in (a) i.e. the first 14 values are '2' and '3' and the next 14 values are '4', which would include the 22nd and 23rd values
=> Median = (4 + 4) / 2 = 4

c) Grouped Frequency Distributions:

• If the frequency distribution is a grouped frequency distribution, all the calculations shown above are the same except we use mid-interval values instead. E.g.

Age     0-10   10-20   20-30   30-40
Freq       2       5       4       8

• The mid-interval values for the age row are 5, 15, 25 and 35.

We now proceed to find mean, median and mode as in (b).
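
The frequency-distribution calculations can be sketched in Python as a check on the arithmetic. The function names (`expand`, `freq_mean`, `freq_mode`, `freq_median`) are made up for this example and are not part of the notes. The data are the household table and the grouped age table from this section. Note that 178 ÷ 44 = 4.0454..., which rounds to 4.05 to two decimal places.

```python
def expand(values, freqs):
    """Rebuild the full list of data that the table represents."""
    full = []
    for value, freq in zip(values, freqs):
        full.extend([value] * freq)
    return full


def freq_mean(values, freqs):
    """Multiply each value by its frequency, add up, divide by the total."""
    return sum(v * f for v, f in zip(values, freqs)) / sum(freqs)


def freq_mode(values, freqs):
    """The value with the highest frequency."""
    return values[freqs.index(max(freqs))]


def freq_median(values, freqs):
    """Middle value of the full list (values must be in ascending order)."""
    full = expand(values, freqs)
    n = len(full)
    mid = n // 2
    if n % 2 == 1:
        return full[mid]
    return (full[mid - 1] + full[mid]) / 2  # average of the two middle values


# Table from (a): number in household and how often each occurs.
household = [2, 3, 4, 5, 6, 7]
people = [6, 8, 14, 11, 4, 1]

mean = round(freq_mean(household, people), 2)  # 178 / 44
mode = freq_mode(household, people)
median = freq_median(household, people)  # average of 22nd and 23rd values
print(mean, mode, median)  # 4.05 4 4.0

# Grouped table from (c): use the mid-interval value of each age group.
intervals = [(0, 10), (10, 20), (20, 30), (30, 40)]
freq = [2, 5, 4, 8]
mids = [(low + high) / 2 for low, high in intervals]
grouped_mean = round(freq_mean(mids, freq), 2)  # an estimate of the mean
print(mids, grouped_mean)
```

With grouped data the exact values are not known, so the mean found from mid-interval values is only an estimate.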
