Untitled document 4

Copyright
© © All Rights Reserved

Statistics

Measures of central tendency

Key word(s):

central tendency
mean
median
mode
range

What is the central tendency? When you study data, one of the most common things to
look at is an average. The most common measures of central tendency are mean,
median, and mode. Another statistic that many are interested in when analyzing data is
the range.

Mean:

The average number - add all the numbers, then divide by how many numbers there are

Median:

The middle number - put the numbers in order and find the one in the middle; if there is
an even amount of numbers, add the middle two and divide by 2

Mode:

The most common number - find the number that appears most often

Range:

Largest number minus smallest number - subtract the smallest number from the largest
number

Example:

Student      1    2    3    4    5    6    7    8    9    10   11   12
Height (cm)  160  165  164  162  161  162  160  159  170  162  160  161

In order: 159, 160, 160, 160, 161, 161, 162, 162, 162, 164, 165, 170

Mean: 162.2

Median: 161.5

Mode: 160, 162

Range: 11
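
The four values can be checked with a short Python sketch. It uses the standard `statistics` module; the variable names are only illustrative.

```python
from statistics import mean, median, multimode

# Heights (cm) of the 12 students in the example.
heights = [160, 165, 164, 162, 161, 162, 160, 159, 170, 162, 160, 161]

average = mean(heights)               # sum of the values divided by how many there are
middle = median(heights)              # averages the two middle values when the count is even
most_common = multimode(heights)      # every value tied for the highest count
spread = max(heights) - min(heights)  # largest minus smallest

print(round(average, 1))  # 162.2
print(middle)             # 161.5
print(most_common)        # [160, 162]
print(spread)             # 11
```

With 12 values the median is the average of the 6th and 7th values in order, 161 and 162.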

Frequency Tables, Line Plots, and Histograms

Key word(s):

frequency table
line plot
histogram
data

The first step in analyzing data is displaying it in an organized way. Tables of
information are the first step to doing so. They provide a way to organize the information
so that someone else can understand it. Data tables can be difficult to understand when
there is a lot of information, so frequency tables, line plots, and histograms are three
ways to make the data more manageable.

Frequency Tables

A frequency table is usually made from a tally table. A frequency table is much easier to
read and interpret.

A tally table uses tally marks to record data.

Example:

Favorite Dinosaur

Name Tally

Tyrannosaurus Rex IIII

Stegosaurus II

A frequency table uses numbers to record data.

Favorite Dinosaur

Name Number

Tyrannosaurus Rex 4

Stegosaurus 2
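
As a rough sketch, the step from tally marks to a frequency table can be done in Python with `collections.Counter`. The `votes` list is an assumed way of recording the tallies, one entry per tally mark.

```python
from collections import Counter

# One entry per vote, i.e. one entry per tally mark in the tally table.
votes = ["Tyrannosaurus Rex"] * 4 + ["Stegosaurus"] * 2

# Counter adds up how many times each name appears.
frequency = Counter(votes)
for name, number in frequency.items():
    print(name, number)
```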

Line Plots

This line plot tells you how many students spent certain numbers of hours studying in a
week:

Histograms

A histogram is a graphical or visual display of data using bars of different heights
representing the frequency of the information.

There are some differences between a bar graph and a histogram. Notice that a histogram
has number ranges along the bottom instead of categories.
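
A minimal sketch of how values are grouped into number ranges for a histogram. The `hours` data and the helper `histogram_counts` are made up for illustration and do not come from the original figure.

```python
def histogram_counts(values, start, width, bins):
    """Count how many values fall in each range of the given width, beginning at start."""
    counts = [0] * bins
    for v in values:
        i = int((v - start) // width)  # index of the range this value belongs to
        if 0 <= i < bins:
            counts[i] += 1
    return counts

# Hypothetical data: hours studied in a week by 10 students.
hours = [1, 2, 2, 3, 5, 6, 6, 7, 9, 11]
counts = histogram_counts(hours, start=0, width=4, bins=3)

# Print one text "bar" per range; a longer bar means a higher frequency.
for i, count in enumerate(counts):
    low = i * 4
    print(f"{low}-{low + 3} hours: " + "#" * count)
```

Each bar stands for a range of numbers (0-3, 4-7, 8-11) rather than a category, which is what separates a histogram from a bar graph.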

Box-and-whiskers plots

Key word(s):

box-and-whiskers plot
box plot
quartile
interquartile range

When we have a set of data, we can describe it with the measures of central tendency:
the mean, the median, and the mode.

Mean

Mean is calculated by dividing the sum of the data points by the total number of data
points. The mean is also known as the average and gives us one value that represents
the entire set of data.

Mode

Mode is the value in the data set that occurs most often. You can have no mode, one
mode, or more than one mode in a data set. The mode tells us the data point that is
most frequent. Mode is often used when there is a voting situation.

Median

The median is the number that is in the middle of a set of data after the data is placed in
order. The median splits data into two equal parts and shows the middle of these two
parts.

Quartiles are the values that divide a list of numbers into quarters. To separate values
into quartiles, begin by putting the list of numbers in order. Then cut the list into four
equal parts.

The quartiles are at the "cuts," as shown in the example below.

Example: 6, 8, 4, 4, 7, 3, 5

Put in order from lowest to highest: 3, 4, 4, 5, 6, 7, 8


3, 4, 4, 5, 6, 7, 8
   ^     ^     ^
   Q1    Q2    Q3

Q1 = lower quartile

Q2 = median (middle) quartile

Q3 = upper quartile

So for our example, we can say that:

Quartile 1 (Q1) = 4
Quartile 2 (Q2) = 5
Quartile 3 (Q3) = 7

The interquartile range (IQR) is the difference between the upper quartile and the lower
quartile. Quartile 3 was 7 and Quartile 1 was 4; thus, the IQR is 7 - 4 = 3.
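
The quartiles and the IQR for this example can be sketched in Python as follows. The helper `quartiles` is an illustrative name, and it assumes the "median of each half" convention, leaving out the middle value when the count is odd. Other conventions can give slightly different quartiles.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q2, Q3) by taking the median of each half of the ordered list."""
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]
    # With an odd count, the middle value belongs to neither half.
    upper = ordered[half + 1:] if len(ordered) % 2 else ordered[half:]
    return median(lower), median(ordered), median(upper)

data = [6, 8, 4, 4, 7, 3, 5]
q1, q2, q3 = quartiles(data)
iqr = q3 - q1

# Minimum, quartiles, and maximum are the five values marked on a box-and-whisker graph.
five_number_summary = (min(data), q1, q2, q3, max(data))

print(q1, q2, q3, iqr)       # 4 5 7 3
print(five_number_summary)   # (3, 4, 5, 7, 8)
```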

A box-and-whisker graph looks like this. Note that the maximum, minimum, and quartile
values are all indicated on the graph. You can see why it is sometimes called a
box-and-whisker graph--because it looks like a box with whiskers.

Circle Graphs

Key word(s):

circle graph
pie chart
sector

Circle graphs are used to picture fractional parts of a whole and allow for quick
understanding of the distribution of data. They are also called pie charts.

The graph is divided into pieces called sectors. Each sector represents a percent, and all
sectors must add up to 100%.

Example:

This circle graph represents the animal favorites of students that were surveyed. Each
type of animal is featured on one of the sectors of the graph. All of the sectors together
make a circle. The percentages add up to 100%: 30 + 25 + 18 + 13 + 9 + 5 = 100. From
this chart, we cannot tell how many students were surveyed, but we can tell what
percentages of students like certain animals.

Sometimes circle graphs include more than just percentages. For example, this circle
graph shows movie preferences of 20 people. It includes not just a percentage, but a
number on each sector. Knowing that 20 people were surveyed, 6 liked Romance movies
the best. That is also 30% of those that were surveyed: 6/20 = 0.30 = 30%.

You can also find out how many data points are in each sector if you do not know the
actual number but know the percent and the total number represented in the circle
graph.

x/20 = 0.30, so x = (0.30)(20) = 6

Here are the steps needed to transform data into your own circle graph.

Step 1: Find the whole. What is the total value for the items on your graph?

Step 2: Find the parts. Each item to be graphed represents a part of the whole. To
complete the circle graph, you must find exactly what fraction or percent each item
represents. The easiest way to do this is to divide the part by the whole and then
convert the result to a percent.

Step 3: Find the degrees for each part. Every circle is made up of 360 degrees. To find
the angle measure for each item, use this formula:

angle measure for an item = percent the item represents (written as a decimal) × 360 degrees.

Step 4: Draw a circle and a radius. Use a protractor to draw each angle. Each new
angle should be measured from the previously drawn line segment. Draw the angles
from largest to smallest in a clockwise direction.
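
Steps 1 to 3 can be sketched in Python. The helper `sector_angles` is an illustrative name. The percentages are the ones from the animal example, and the labels A to F are placeholders because the animal names are not listed in the text.

```python
def sector_angles(parts):
    """Map each label to (percent of the whole, angle in degrees)."""
    whole = sum(parts.values())            # Step 1: find the whole
    result = {}
    for label, value in parts.items():
        percent = 100 * value / whole      # Step 2: part divided by whole, as a percent
        degrees = 360 * value / whole      # Step 3: the same fraction of 360 degrees
        result[label] = (percent, degrees)
    return result

# Placeholder labels; values are the percentages 30, 25, 18, 13, 9, 5 from the example.
parts = {"A": 30, "B": 25, "C": 18, "D": 13, "E": 9, "F": 5}
angles = sector_angles(parts)

for label, (percent, degrees) in angles.items():
    print(f"{label}: {percent:.0f}% -> {degrees:.1f} degrees")
```

The 30% sector, for example, gets 0.30 × 360 = 108 degrees, and all the angles together add up to 360 degrees.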

Outliers

Key word(s):

outlier

Student Time

Zac 0:58

Ben 0:56

Kenny 1:05

Jared 1:03

Simon 1:34

Ron 1:04

Bobby 1:02

Kareem 0:59

Notice that most of the boys' times range from 56 seconds to 1 minute and 5 seconds,
except for Simon's. Simon's time is 1 minute and 34 seconds. This is an example of an
outlier. It lies well above the majority of the data points in the table.

Student number Height jumped

1 14

2 18

3 28

4 28

5 29

6 29

7 29

8 30

9 30

10 31

11 33

12 34

13 35

14 36

15 42

Did you notice after looking at the data that one student jumped much higher than the
rest of the students? There are many possible reasons for that.

• he may be much taller,
• he may just be able to jump higher, or
• there may have been an error in the measurement of his jump height.

*NOTE*

It is important to pay attention to these outliers because they can affect your measures
of central tendency calculations.
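
A small sketch of this effect, using the times from the first table in this section converted to seconds. The variable names are only illustrative.

```python
from statistics import mean, median

# Times from the Student/Time table, converted to seconds (for example, 1:34 = 94).
times = {"Zac": 58, "Ben": 56, "Kenny": 65, "Jared": 63,
         "Simon": 94, "Ron": 64, "Bobby": 62, "Kareem": 59}

all_times = list(times.values())
without_simon = [t for name, t in times.items() if name != "Simon"]

# Compare the mean and the median with and without the outlier.
print(f"with Simon:    mean {mean(all_times):.1f}, median {median(all_times):.1f}")
print(f"without Simon: mean {mean(without_simon):.1f}, median {median(without_simon):.1f}")
```

Removing the one outlier moves the mean by about 4 seconds (65.1 to 61.0) but moves the median by only half a second (62.5 to 62.0), so the mean is the measure that is affected most.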

Examining Samples

Key words:

sample
random
generalization
population

A population is a group of objects, plants, animals, or people. A sample is a part of a
population.

To gather data that reflects characteristics of the population, you need a random
sample. When a random sample is being chosen, all members of a population have
equal chances of being selected.

Suppose that you are a member of a class with 120 students, and you want to know the
opinion of the students on a certain aspect of the class. You cannot survey everyone, so
you decide to take a sample. Put cards numbered from 1 to 120, one number per card, in a
box. Because the class has assigned seating by number, each numbered card corresponds
to a student in the class. Reach in and randomly select six cards. Each card, and therefore
each student, has an equal chance of being selected. Then, use the opinions about the
course from the six randomly selected students to generalize about the course opinion for
the entire 120-student population.
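
The card draw can be imitated in Python with the standard `random` module, as a minimal sketch. The numbers chosen will differ each time it runs.

```python
import random

seat_numbers = range(1, 121)             # cards numbered 1 to 120, one per student
chosen = random.sample(seat_numbers, 6)  # six different cards, each equally likely

print(sorted(chosen))
```

`random.sample` never picks the same number twice, which matches drawing six cards without putting any back in the box.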

A sampling method is a procedure for picking sample elements from a population. An
entire population is often too large to survey, so you need to pick a small portion, or
sample, of the population.

Simple Random Sampling

Simple random sampling helps the chosen sample represent the population and avoids
selection bias, although no method can guarantee a perfectly representative sample. The
sample is chosen completely by chance, and each member of the population has the same
probability of being chosen.

Systematic Sampling

In systematic sampling, every nth person is taken from a population. For example, if
you take a directory of people, your sample might consist of every 19th person in
the directory.
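
A minimal sketch of taking every 19th person. The `directory` list is a made-up stand-in for a real directory.

```python
# Hypothetical directory of 100 people, made up for illustration.
directory = [f"Person {i}" for i in range(1, 101)]

step = 19
sample = directory[step - 1::step]  # the 19th, 38th, 57th, ... entries

print(sample)
# ['Person 19', 'Person 38', 'Person 57', 'Person 76', 'Person 95']
```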

Convenience Sampling

In convenience sampling, persons are chosen because of their proximity or accessibility
to the data collection site.

One of the ways that statistics can be misleading is when there is a bias involved. Bias
is a tendency that pushes results in one direction. It can happen without anyone
intending it and is easy to overlook.

For example, suppose you are conducting a study on what kind of movies people like to
see. You start with creating a survey for people to complete that allows you to gather
data. Take a look at the following two questions:

1. What kind of movies do you like to see?

Please choose from the following:


Horror
Comedy
Romance
Action/Adventure
Foreign
Other

2. Do you like going to see:

boring romance movies
exciting action and adventure movies

Question:

Do you notice any difference in the two questions? Is one more biased than another?

Answer:

The first question shows no bias because a person is free to pick any of the types of
movies without being pointed in any direction. On the other hand, notice that the second
question calls romance movies "boring" while calling action and adventure movies
"exciting." This may lead a person to choose one type of movie over another. This is not
good when collecting data.

Comparing Data Sets

Key words:

trends
data sets

A scatter plot is a graph of plotted points that compares and shows a relationship
between two sets of data.

For example, let's say we are tracking the level of the glaciers over some period of time.
From the scatter plot you can see whether there is a correlation, or relationship, between
the two sets of data.

Correlation is positive when the values increase together.
Correlation is negative when one value decreases as the other increases.
There is no correlation when neither of these situations occurs.
Correlation ranges from +1 (perfect positive correlation) down to -1 (perfect
negative correlation). No correlation is represented with the value 0.

We can see correlation as follows:

Example:
Suppose you collected data about the relationship of students' height and shoe size by
looking at two different classrooms. Now you have data from two different rooms, and
you need to organize and analyze the data.

You could put the data on one scatter plot, assigning each room a different color and
then compare the data not only for your classroom, but compare the two classrooms.
You can also graph the data on separate scatter plots.

Height     Shoe size   Shoe size
(inches)   (group 1)   (group 2)
50         5           5.5
52         5.5         6
54         5.5         6
56         6           6.5
58         6.5         7
60         7.5         8
62         8           8.5
64         7.5         9
68         10          11

This is the data as box plots:

The box plots show that the middle 50% of shoe sizes is smaller for group 1 than for
group 2. This agrees with the scatter plot where the points for group 1 are lower than the
points for group 2. The median, minimum, and maximum points are also lower for group 1
than for group 2. It looks like students in group 1 generally have smaller feet even though
they tend to be the same height. There is also stronger agreement in the data when the
shoe sizes are on the lower end of the range: the quartile ranges are smaller on both
plots in quartile 1 and 2.

This is the data as a scatter plot:


The scatterplot shows a positive correlation between student height and shoe size. For
both groups, the shoe size increases as the student height increases.
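
As a sketch, the strength of this relationship can be put into a number between -1 and +1. The code below assumes the Pearson correlation coefficient is the measure meant, since the text does not name a formula. The helper name `pearson_r` is illustrative, and the data comes from the height and shoe size table.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How the two lists vary together, and how each varies on its own.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

heights = [50, 52, 54, 56, 58, 60, 62, 64, 68]
group1 = [5, 5.5, 5.5, 6, 6.5, 7.5, 8, 7.5, 10]
group2 = [5.5, 6, 6, 6.5, 7, 8, 8.5, 9, 11]

r1 = pearson_r(heights, group1)
r2 = pearson_r(heights, group2)
print(round(r1, 2), round(r2, 2))
```

Both values come out close to +1, which matches the positive correlation seen on the scatter plot.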
