
Stats Notes

Uploaded by

reeshaworks
Copyright
© © All Rights Reserved

4.1 Data sampling methods


Simple/random → A method such as drawing names from a hat, where every member of the population is equally likely to be chosen.

Convenience → A method that is the most accessible for the sampler, such as asking the people in the room for some data. Taking samples from the members of the population that you have access to until you have a sample of the desired size.

Quota → Decide how many members of each group you want to sample, and take samples from the population until you have a large enough sample for each group. Similar to stratified but not random: if your population has certain percentages of genders, ages, races etc., then your sample would maintain these percentages.

Systematic → When you randomly select the first data point, then select the rest at regular intervals, e.g. every 10th person. k = N/n, where you find a sample of size n from a population of size N by selecting every kth member.

Stratified → Selecting a random sample where numbers in certain categories are proportional to the numbers in the population. E.g. choosing 5 people from 8th Grade, 9th Grade etc. It uses simple random sampling once the categories have been generated.
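
The systematic method above (k = N/n) can be sketched in a few lines of Python. This is only an illustration: the function name, the seed argument and the population of 100 ID numbers are made up, and it assumes n is no larger than N.

```python
import random

def systematic_sample(population, n, seed=None):
    """Pick n members at a fixed interval k = N // n from a random start."""
    rng = random.Random(seed)
    N = len(population)
    k = N // n                     # sampling interval k = N/n, rounded down
    start = rng.randrange(k)       # random first data point inside the first interval
    return [population[start + i * k] for i in range(n)]

students = list(range(1, 101))     # hypothetical population of 100 ID numbers
sample = systematic_sample(students, 10, seed=1)
```

With N = 100 and n = 10 the interval is k = 10, so the sample is every 10th ID starting from a random one of the first ten.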

4.7 Discrete random variables

Syllabus and formula

Discrete random variables just means that we have a random experiment where the outcomes only take discrete values, and each has an assigned probability.
E.g. → rolling a dice, each value has a probability of 1/6.

Tips
E(X) is not a probability, so it does not need to be between 0 and 1.
A 'fair game' is an experiment where E(X) = 0, no gain or loss, on average.

Notes

X is a discrete random variable.
Discrete → X can be found by counting.
Random → X is the result of a random process, e.g. rolling a dice.
Variable → the quantity, value.

We notice that:
1. The sum of the probabilities always equals 1
2. The x's can take numerical or non-numerical values.

Expected value:
This is similar to the mean, it is what you would expect the mean to be if you were to repeat the experiment many times.

Therefore, it works in a similar way to calculating the mean from a frequency table, (Σ f_i x_i)/n, but because we have probabilities instead of frequencies, it essentially negates the need for dividing by n. This gives us the formula:
μ = E(X) = Σ x · P(X = x) (given in formula booklet).
It reads as, the sum of the outcome times the probability.
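
As a quick check of the formula E(X) = Σ x · P(X = x), here is a small Python sketch. The fair die comes from the notes; the game (win 5 with probability 1/6, lose 1 otherwise) is a made-up example of a fair game.

```python
from fractions import Fraction

def expected_value(distribution):
    """E(X) = sum of x * P(X = x) over a {value: probability} table."""
    assert sum(distribution.values()) == 1, "probabilities must sum to 1"
    return sum(x * p for x, p in distribution.items())

# Fair six-sided die: each face has probability 1/6
die = {face: Fraction(1, 6) for face in range(1, 7)}
mean_roll = expected_value(die)

# Hypothetical game: win 5 with probability 1/6, lose 1 with probability 5/6
game = {5: Fraction(1, 6), -1: Fraction(5, 6)}
mean_gain = expected_value(game)
```

Here mean_roll is 7/2 (3.5, which is not a value the die can show) and mean_gain is 0, so the game is fair.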



4.11 X^2 goodness of fit

Syllabus and formula

Do not include totals and additional columns or rows in the matrix.

General steps
1. Define the Null Hypothesis (H0)
2. Define the Alternate Hypothesis
3. Conduct the test
4. Assess the results, lay out your equation and decide whether or not we reject the null hypothesis.
Reject if: p < S.L or X^2 > C.V
Fail to reject if: p > S.L or X^2 < C.V

Expected outcome:
Probability times the number of trials.

E.g. → Find the expected frequency of fields with bees yielding a crop of more than three tonnes:
1. 24 + 21 = 45 (> 3 tonnes)
2. 24 + 32 + 20 + 14 = 90 (bees)
3. All added together = 175
4. 45 ∗ 90/175 = 23.1

Notes

X^2 goodness of fit
It is a similar concept to a t-test, but we compare a relative frequency table of a population to a frequency table from a random sample.
It measures how likely it is that the sample could have come from that original population, i.e. the goodness of fit.
You will take an expected frequency table (done on calculator) and compare it to an observed frequency table (actual data from your sample).

How to do on calculator:
In a spreadsheet, fill in one row of the observed values, one of the expected values
Menu → Stat tests → X^2 GOF
This gives a p-value, which can be compared to the significance level.

If the p-value of the test is greater than the predetermined significance level, then we fail to reject the null hypothesis. If the p-value is less than the predetermined significance level, then we should reject the null hypothesis. (Same as t-test.)

If they give you the critical value then:
Reject the null hypothesis if X^2 > C.V
Do not reject if X^2 < C.V

Degrees of freedom:
n-1 → say there are 4 columns of data, then the degrees of freedom is just 3.
At HL, for every column combined, an additional -1 is taken for the degrees of freedom.
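
The goodness of fit comparison can also be sketched by hand in Python, which shows what the calculator is doing. The observed counts below are invented (60 rolls of a die tested against a fair die), and 11.070 is the 5% critical value for 5 degrees of freedom taken from standard chi-squared tables.

```python
def chi_squared_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [8, 12, 9, 11, 6, 14]   # hypothetical counts from 60 rolls of a die
expected = [60 / 6] * 6            # probability times number of trials, for a fair die

chi2 = chi_squared_statistic(observed, expected)
df = len(observed) - 1             # n - 1 degrees of freedom
critical_value = 11.070            # 5% level, 5 degrees of freedom (from tables)
reject_h0 = chi2 > critical_value
```

Here X^2 = 4.2, which is below the critical value, so there is insufficient evidence to reject the null hypothesis that the die is fair.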



4.11 X^2 test for independence

Syllabus and formula

A chi-squared (X^2) test for independence can be performed to find out if two data sets are independent of each other. It can be done at various significance levels.

Hypotheses
The null hypothesis is that they are independent, and the alternate is that they are not independent.

Reject if: p < S.L or X^2 > C.V
Fail to reject if: p > S.L or X^2 < C.V

In exams, the degrees of freedom may be written as v, and it will always be greater than 1.

Expected values must be greater than 5. If there are expected values less than 5 then you will need to combine rows or columns. See 4.12 notes.

Notes

Each data point is categorized with regard to two variables, and the frequencies are put into a contingency table.
We then compare these frequencies to what values would be expected, based on the row and column totals.

Calculating expected frequencies:
(Row total x column total) / total

Degrees of freedom:
D of freedom = (rows - 1) * (columns - 1)

How to do on calculator:
Menu → Matrix & Vector → Create → Matrix → Fill in → ctrl var (sto →) → fill in name → menu → Statistics → Stat tests → X^2 2-way Test

Results:
A p-value will be calculated here as well, which you will need to compare against the significance level provided, or your X^2 value may be compared against a provided critical value.
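
The expected frequencies and degrees of freedom described above can be checked with a short Python sketch. The 2 x 3 contingency table is invented for illustration, and the function name is an assumption, not a calculator command.

```python
def expected_frequencies(table):
    """Expected count for each cell = row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]

observed = [[10, 20, 30],          # hypothetical contingency table (no totals included)
            [20, 20, 20]]
expected = expected_frequencies(observed)

chi2 = sum((o - e) ** 2 / e
           for obs_row, exp_row in zip(observed, expected)
           for o, e in zip(obs_row, exp_row))
df = (len(observed) - 1) * (len(observed[0]) - 1)   # (rows - 1) * (columns - 1)
```

For this table every expected value is above 5, X^2 is about 5.33 and there are 2 degrees of freedom. The 5% critical value for 2 degrees of freedom is 5.991 (from tables), so the null hypothesis of independence would not be rejected.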



4.12 Choice, validity and interpretation of tests

Collection of data
It is important to be aware of any possible errors in the collection methods (4.1).
The errors may be systematic → the errors have a non-zero mean and often follow some kind of pattern.
The errors may be random → due to natural variation or other unknown factors, which might be expected to have a zero mean.
Example of a systematic error → a weighing machine that always adds on a fixed amount. A systematic error may still give a good correlation, but any line of regression will be affected, so it will give misleading results.

Selecting the sample
See 2.1 notes from sampling methods list.
It is important that the right method is used for the question you are trying to answer.

Collecting data using questionnaires
A multiple-choice or short-answer survey will provide data that is easy to analyse, but it needs to be carefully designed because:
Answers may be restrictive, so not enough information is obtained
People might not answer honestly
The question may be interpreted in different ways
The questionnaire needs to be complete, as extra questions cannot be asked later

Data mining
Data mining occurs when lots of pairs of variables are considered to see if any significant results can be found.
If enough variables are compared it is very likely that such results can then be found, particularly at a 5 or 10% significance level.
It can be useful for highlighting possible connections between variables, but should be repeated to see if similar results are obtained.

Choosing a valid test
A valid test is one that measures the quality it claims to be measuring.
A test which covers all the content being assessed is said to have content validity. It measures the extent to which the items on a test are fairly representative of the entire domain the test seeks to measure.
Example → A chemistry test about the periodic table provides data of how well the students have learnt the periodic table.
Things such as feelings are not as easy to test with content validity as maths and science.
Criterion validity measures how well one measure predicts an outcome for another measure.
Example → Entry test score affecting future performance.
If it also relates particularly to future events then it is also referred to as predictive validity.

Reliability of tests
A test is reliable if it produces similar results on each occasion it is carried out in similar circumstances. Its key attribute is repeatability.
A reliable test is not necessarily valid.
Example → A well-rounded, reliable quadratics test is not valid for assessing biological knowledge.
All valid tests have to be reliable and use data from a reliable source.

Test-retest
To test if the means of collecting data is reliable, the same test can be given to the same people after a period of time.
If it is reliable, there should be a strong correlation between the results on the two occasions.
There may be intervening factors between the tests, but it should still have a high correlation.

Parallel forms
Parallel forms are two tests that are equivalent in the sense that they contain the same kinds of items of equal difficulty, but not the same items. They are administered to the same individuals.
If the test is reliable, there should be a strong correlation between the scores on each test.

Disadvantages:
You have to create a large number of questions of equal difficulty to measure the same quality.
Proving two parallel tests are equivalent is difficult.
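
To see why data mining is "very likely" to turn up significant results, here is a rough Python sketch. It assumes the tests are independent of each other and that every null hypothesis is actually true; the function name is made up.

```python
def chance_of_false_positive(num_tests, significance_level):
    """P(at least one 'significant' result) when every null hypothesis is true."""
    return 1 - (1 - significance_level) ** num_tests

p_20_tests = chance_of_false_positive(20, 0.05)
p_50_tests = chance_of_false_positive(50, 0.10)
```

Under these assumptions, comparing 20 pairs of variables at the 5% level gives roughly a 64% chance of at least one falsely significant result, which is why the notes say the test should be repeated.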



4.12 Estimating parameters for X^2

Often in an X^2 test, you are not comparing the data with a fixed distribution, but just want to test if it is normal, binomial or Poisson.
In these cases, it makes no sense to choose an arbitrary mean, standard deviation or probability, but instead use values estimated from the observed data.
Expected values must be greater than 5. If there are expected values less than 5 you will need to combine rows or columns.

Degrees of freedom
Estimating parameters from the data does affect the degrees of freedom used.
To obtain the number of degrees of freedom, take the number of cells minus one, and then subtract one for each of the parameters estimated.



4.13 R^2 and SSR
If you have data, such as a table:
1. Put it into a graph and determine what type of regression it is
2. On the spreadsheet, fill in this type of regression → Menu → Statistics → Stat Calculations → type of regression
3. This gives you the R^2 value

Calculating SSR
SSR = (actual value − predicted value)^2 + (actual value − predicted value)^2 etc., with one term for each data point

What does this R^2 value mean?
E.g. R^2 is 0.931. That means that 93.1% of the variation in the second data item, such as ice creams sold, can be explained by the variation in, e.g., temperature. The remaining 6.9% is left to be explained by other variables, such as day of the week. R^2 gives the proportion of variability in the second variable accounted for by the chosen model. The closer to 1, the better.

What is SSR?
It looks at the difference between the actual value of each data point (y) and the value predicted by the line of regression (y hat). These differences (the residuals) are squared and added together.



Awareness that R^2 = 1 - SSR/SST may enhance understanding but will not be examined.
Awareness that many factors affect the validity of a model and the coefficient of determination, by itself, is not a good way to decide between different models.

To understand what this SSR value actually means, we compare it to the SST.
SST is the total of the squared differences between the actual values and the mean.
We do this by doing 1 - SSR/SST.

SSR = Σ(Yi − Yhat)^2
SST = Σ(Yi − Ybar)^2

Not given in formula booklet!

Yhat = model's prediction of y
Yi = data's actual y-values
Ybar = mean of y values
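
The relationship R^2 = 1 - SSR/SST can be checked with a short Python sketch. The data values are invented, and the least squares line is fitted by hand here rather than by the calculator's regression menu.

```python
def linear_fit(xs, ys):
    """Least squares line y = slope * x + intercept."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

def r_squared(ys, predictions):
    """R^2 = 1 - SSR/SST, with SSR the sum of squared residuals."""
    y_bar = sum(ys) / len(ys)
    ssr = sum((y - y_hat) ** 2 for y, y_hat in zip(ys, predictions))
    sst = sum((y - y_bar) ** 2 for y in ys)
    return 1 - ssr / sst

xs = [1, 2, 3, 4, 5]               # hypothetical first variable
ys = [2, 4, 5, 4, 5]               # hypothetical second variable
slope, intercept = linear_fit(xs, ys)
predictions = [slope * x + intercept for x in xs]
r2 = r_squared(ys, predictions)
```

For this data SSR = 2.4 and SST = 6, so R^2 = 0.6: the line accounts for 60% of the variation in the second variable.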

4.14 Unbiased estimates of σ and μ

Syllabus and formula

To find the unbiased estimate of the mean:
you do it as expected, add all the values and divide by the number of data items.

To find the unbiased estimate of population variance:
fill the data into a spreadsheet, look at the population standard deviation (σx), square it to get the variance of the sample, then times it by n/(n-1).
Or, look at the sx standard deviation and square it. They should be the same.

Notes

n = sample size
s_n^2 = variance of sample
s_(n−1)^2 = an estimate of the population variance using the sample
Sx = unbiased estimator using the data as a sample
σx = calculator assumes the data you've entered is the entire population

If asked to find the standard deviation of the sample means, you take s_(n−1) and divide it by the square root of the sample size.
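
The two routes to the unbiased estimate of the population variance can be compared with Python's statistics module, where pvariance treats the data as a whole population (like σx squared) and variance divides by n − 1 (like sx squared). The data values are invented.

```python
import statistics
from math import sqrt

data = [4, 7, 9, 10, 15]                        # hypothetical sample
n = len(data)

sample_mean = statistics.mean(data)             # unbiased estimate of the mean
var_n = statistics.pvariance(data)              # s_n^2: data treated as the whole population
unbiased_var = var_n * n / (n - 1)              # s_(n-1)^2 = s_n^2 * n/(n-1)
sx_squared = statistics.stdev(data) ** 2        # sx squared, should match unbiased_var

sd_of_sample_mean = statistics.stdev(data) / sqrt(n)   # s_(n-1) over root n
```

Both routes give 16.5 for this data (s_n^2 is 13.2, and 13.2 × 5/4 = 16.5).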



4.15 Sample means/ Central Limit Theorem

Syllabus and formula

Xbar ~ N(μ, σ^2/n), so the standard deviation of the sample mean is σ/√n (not provided in formula booklet)

The Central Limit Theorem says that if a sample has a sufficiently large size (n > 30), the sample mean will approach a normal distribution.
Generally, Xbar approaches normality for large n. How large depends on the distribution from which the sample is taken.
We use n > 30.

Notes

A linear combination of n independent normal variables is normally distributed, following the formula shown above, which is not given in the formula booklet, but it is very similar to the working out for a normal distribution question.
In order for the CLT to work at all, you have to be able to calculate a mean from your sample, which you will be able to in the exam.
For the sample mean to be approximately normally distributed, n > 30.
Some questions may just be a straightforward normal distribution. To identify a sample means question, look for 'sample mean' or a sample size.
If it is sample means we do the usual formula except that it is the standard deviation over the square root of n.
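
A sample means question works like an ordinary normal distribution question with the standard deviation divided by √n. Here is a sketch using Python's statistics.NormalDist; the population mean 50, standard deviation 12 and sample size 36 are invented.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 50, 12, 36                        # hypothetical population and sample size

individual = NormalDist(mu, sigma)               # distribution of a single item
sample_mean = NormalDist(mu, sigma / sqrt(n))    # distribution of Xbar: sd is sigma / root n

p_item_above_53 = 1 - individual.cdf(53)         # one item greater than 53
p_mean_above_53 = 1 - sample_mean.cdf(53)        # mean of 36 items greater than 53
```

A single item exceeds 53 about 40% of the time, but the mean of 36 items does so only about 6.7% of the time, because the sample mean has standard deviation 12/6 = 2.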



4.16 Confidence intervals
As opposed to testing data obtained from a sample against a single value for the population mean, it is more convenient to have a range of values within which the mean of the population is likely to lie.
These intervals always have a confidence level. For example, a 95% confidence interval means that on 95% of all occasions that such a sample was selected, the population mean would fall within the calculated boundaries.

The interval can be stated in words or as 'The 95% confidence interval is (number, number)'.
When the population from which the sample is taken can be regarded as being normally distributed, confidence intervals for the population mean can easily be worked out.

There are two types: t-distribution confidence intervals and the Z-interval (normal distribution).

Confidence intervals: when we don't know what a population's mean is, we use a sample to give us a better idea of what it might be. Unbiased estimator of population standard deviation often needed (t-distribution).

When we sample data, it's unlikely that our sample's mean is the true population mean. However, if we use the sample data to work out an unbiased estimator for the population standard deviation, then, in combination with our sample mean, we can calculate values within which we are 90%, 95% or 99% confident that the real population mean lies.

With confidence intervals, we are not testing a hypothesis, we are using samples to tell us something about the population's mean.

Working out:
Type of distribution → Example X ~ N(mean, standard deviation)
Lower bound < mean < upper bound

In terms of a general mean and a general population standard deviation, we use mean 0 and standard deviation 1.
So for a general population 99% confidence interval: -2.576 < z < 2.576
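
The Z-interval can be sketched in Python as sample mean ± z × σ/√n, where z comes from the inverse normal with mean 0 and standard deviation 1. This assumes the population standard deviation is known; the sample mean 20, σ = 4 and n = 64 are invented values.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(sample_mean, sigma, n, confidence):
    """Confidence interval for the population mean when sigma is known."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)   # e.g. 0.975 for a 95% interval
    margin = z * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin

z_99 = NormalDist().inv_cdf(0.995)                   # the 2.576 quoted in the notes
lower, upper = z_interval(20, 4, 64, 0.95)
```

This gives z_99 ≈ 2.576 and a 95% confidence interval of roughly (19.02, 20.98).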



4.17 Poisson distribution

When?
An event can occur any number of times (no known upper limit on the number of occurrences) during a time period.

Independent events: in other words, if an event occurs, it does not affect the probability of another event occurring in the same time period.

Rate of occurrence is constant: that is, the rate does not change based on time (mean is constant for any given time period).

Probability of an event occurring is proportional to the length of the time period. For example, it should be twice as likely for an event to occur in a 2 hour period as it is in a 1 hour time period. If the mean number of calls to a call centre in 1 hour = 1000 calls, then in ½ hr = half of 1000 calls, in ¼ hr = ¼ of 1000 calls, in 10 mins = 1/6 of 1000 calls etc.

It must be a discrete variable.

The events cannot occur simultaneously.

The events must be random/ unpredictable.



Working out:
X ~ Po(λ) where λ = the mean number of events in the interval

Poisson PDF = probability of a specific number of events in the interval

Poisson CDF = probability of a range of events in the interval

X = discrete random variable, e.g. number of cars

Lambda (λ) = expected result (the mean)

In a Poisson distribution, the mean = the standard deviation squared (the variance). This may be useful in checking whether data does follow a Poisson distribution, because the two values should be approximately equal.

Poisson uses discrete data

If X ~ Po(λ) then E(X) = Var(X) = λ

Critical values:
Spreadsheet: first column = 0s, 2nd column = values around the mean, 3rd column = PoissonCDF(mean, title of 1st column, title of 2nd column). You have to label the 3rd column as well.
Or (this actually works): 1st column = values around the mean, then in another column do PoissCDF(mean, 0, a1) and drag down (to drag lots, hold and drag when the cursor shows the four-way arrow).
Or you have to guess values until the probability is just on the edge of the significance level: put in the mean, 0, then a guessed value.
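
For reference, here is what Poisson PDF and Poisson CDF compute, written out in Python. The function names copy the calculator's argument order (mean, lower bound, upper bound) as an assumption, and the mean of 3 is an invented value.

```python
from math import exp, factorial

def poisson_pdf(lam, k):
    """P(X = k) for X ~ Po(lam)."""
    return exp(-lam) * lam ** k / factorial(k)

def poisson_cdf(lam, lower, upper):
    """P(lower <= X <= upper) for X ~ Po(lam)."""
    return sum(poisson_pdf(lam, k) for k in range(lower, upper + 1))

lam = 3                                   # hypothetical mean number of events per interval
p_exactly_2 = poisson_pdf(lam, 2)         # about 0.224
p_at_most_2 = poisson_cdf(lam, 0, 2)      # about 0.423
p_at_least_3 = 1 - p_at_most_2            # no upper limit, so use the complement
```

Because there is no upper limit on X, "at least" probabilities are found from the complement rather than by entering infinity.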



4.18 Paired and unpaired t-test
Paired t-test/ paired samples
What is a paired t-test?
A paired t-test compares the means and standard deviations of two related groups to determine if there is a significant difference between the two groups. It means that both samples consist of the same test subjects. A paired t-test is equivalent to a one-sample t-test carried out on the differences between the pairs.

What are the hypotheses?


There are two possible hypotheses in a paired t-test.
● The null hypothesis (H0) states that there is no significant difference between the means of the two groups, but it is rephrased to say the difference in the two means is equal to zero, and the test is done on the differences, rather than on the two samples separately. (mu1 = mu2) or (mu1 − mu2 = 0)
● The alternative hypothesis (H1) states that there is a significant difference between the two population means, and that this difference is unlikely to be caused by sampling error or chance.
mu1 ≠ mu2, mu1 < mu2 or mu1 > mu2

What are the assumptions of a paired t-test?


● The dependent variable is normally distributed.
● The observations are sampled independently.

● The independent variables must consist of two related groups or matched pairs.

How to run a paired t-test:


Fill in data onto a spreadsheet

Menu → Statistics → Stat tests → t-test on the column of differences. If you have the data of both groups, work out the differences first (the 2-sample t-test treats the groups as independent, so it is the one to use for unpaired data).

Paired interval test → Menu → Statistics → Confidence intervals → t-interval

Unpaired t-test/ two-sample tests


What is an unpaired t-test?
An unpaired t-test, also known as an independent t-test, compares the means of two
independent or unrelated groups to determine if there is a significant difference between
the two. It means that both samples consist of distinct test subjects. An unpaired t-test is
equivalent to a two-sample t-test.

What are the hypotheses?


● The null hypothesis (H0) states that there is no significant difference between the
means of the two groups.
● The alternative hypothesis (H1) states that there is a significant difference between
the two population means, and that this difference is unlikely to be caused by sampling
error or chance.

What are the assumptions?


● The dependent variable is normally distributed.
● The observations are sampled independently.
● The variance of data is the same between groups, meaning that they have the same
standard deviation, so the test is just whether or not the populations they come from
have the same mean.
● The independent variables must consist of two independent groups.

The subjects volunteered and so the data is randomized.

Example:



If you wanted to conduct an experiment to see how drinking an energy drink increases
heart rate, you could do it two ways.
The "paired" way would be to measure the heart rate of 10 people before they drink the
energy drink and then measure the heart rate of the same 10 people after drinking the
energy drink. These two samples consist of the same test subjects, so you would
perform a paired t-test on the means of both samples.
The "unpaired" way would be to measure the heart rate of 10 people before drinking an
energy drink and then measure the heart rate of some other group of people who
have drunk energy drinks. These two samples consist of different test subjects, so you
would perform an unpaired t-test on the means of both samples.
https://socratic.org/questions/what-is-a-paired-and-unpaired-t-test-what-are-the-differences
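
The "paired" version of the energy drink example can be sketched in Python by working on the differences, which is what makes it a one-sample test. The heart rates are invented, only five people are used to keep it short, and 2.132 is the one-tailed 5% critical value for 4 degrees of freedom from t tables.

```python
from math import sqrt
from statistics import mean, stdev

before = [68, 72, 70, 75, 66]      # hypothetical heart rates before the drink
after = [72, 75, 71, 80, 70]       # the same five people after the drink

differences = [a - b for a, b in zip(after, before)]
n = len(differences)

# One-sample t statistic on the differences, testing H0: mean difference = 0
t_stat = mean(differences) / (stdev(differences) / sqrt(n))
df = n - 1
critical_value = 2.132             # one-tailed, 5% level, 4 degrees of freedom (from tables)
reject_h0 = t_stat > critical_value
```

Here t is about 5.01, well above the critical value, so in this made-up data there is sufficient evidence that heart rate increased.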



4.18 Product moment correlation coefficient test

Syllabus and formula

Steps to do a hypothesis test for the population correlation coefficient:
1. Write the null hypothesis: ρ = 0
2. Decide on the alternate hypothesis, which will depend on if you're looking for a positive correlation, negative correlation or any linear relationship.
3. Find and write the p-value.
4. Compare the p-value with the significance level and write a conclusion.

You should not calculate a least squares regression line unless there is significant evidence of a linear relationship.

How to do on calculator
Menu → Stats → Linear reg t Test

Notes

If you have a set of bivariate (two variables) data, then this can be shown on a scatter graph, and the correlation coefficient (r) can be found.
This is unlikely to be the same r value as the r value for the whole population, which is normally written as ρ.
For a large sample size, it is likely to be quite similar.
If you need to calculate a line of best fit for your data, you need to be sure that there is a linear correlation. This can be tested by testing the null hypothesis: H0: ρ = 0
H0: ρ = 0 is tested against the alternatives Ha: ρ < 0 or ρ > 0 or ρ ≠ 0
This depends on whether you are testing if there is negative correlation, positive correlation or any correlation.
To see if the sample correlation is big enough so that the null hypothesis is rejected, a p-value is found.
This is the probability of the sample correlation being at least as large as the one obtained if the two populations are not correlated (ρ = 0).

Conclusion of the test
If the p-value is less than the significance level, then there is strong evidence that the two variables have a linear correlation, so calculating the least squares regression line is appropriate.

If the p-value is less than the significance level:
"reject the null hypothesis that there is no correlation between the two variables."

If the p-value is greater than the significance level:
"there is insufficient evidence to reject the null hypothesis that there is no correlation between the two variables."
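
Behind the calculator's p-value is a t statistic with n − 2 degrees of freedom, t = r√(n − 2)/√(1 − r^2). The Python sketch below computes r and t by hand for invented data. The notes compare a p-value, so the comparison against a critical value from t tables (3.182 for a two-tailed 5% test with 3 degrees of freedom) is a stand-in used here to avoid needing the t-distribution itself.

```python
from math import sqrt

def correlation(xs, ys):
    """Sample product moment correlation coefficient r."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

xs = [1, 2, 3, 4, 5]               # hypothetical bivariate data
ys = [2, 4, 5, 4, 5]
n = len(xs)

r = correlation(xs, ys)
t_stat = r * sqrt(n - 2) / sqrt(1 - r ** 2)   # test statistic for H0: rho = 0
critical_value = 3.182                        # two-tailed 5%, 3 degrees of freedom (tables)
reject_h0 = abs(t_stat) > critical_value
```

With r ≈ 0.775 from only five points, t ≈ 2.12 is below the critical value, so there is insufficient evidence of a linear correlation and a regression line would not be appropriate.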



4.18 Test for population mean for normal distribution

Syllabus and formula

Paired t-test → compares the means and standard deviations of two related groups to determine if there is a significant difference between the two groups. It means that both samples consist of the same test subjects. A paired t-test is equivalent to a one-sample t-test.

Unpaired t-test → also known as an independent t-test, compares the means of two independent or unrelated groups to determine if there is a significant difference between the two. It means that both samples consist of distinct test subjects. An unpaired t-test is equivalent to a two-sample t-test.

See other document for more details on paired/ unpaired t-tests.

A critical value is a point on the distribution of the test statistic under the null hypothesis that defines a set of values that call for rejecting the null hypothesis.

Notes

Test for population mean for normal distribution
If you are testing whether or not your data has come from a normal distribution, you will need to know an estimate for the mean and the standard deviation to work out the expected values.
The best choice would be to use the sample mean as an estimate for the population mean: Xbar ~ N(μ, σ/√n), where σ/√n is the standard deviation of the sample mean.

For the population mean we use the formula given above (from the CLT/ sample means). We find out whether the value of Xbar (because it is a mean) falls in the critical region. If it does, the null hypothesis is rejected.

So we are provided with a significance level, then we complete an equation like so: P(Xbar > a) = 0.05 (or whatever the significance level is).

Draw a diagram of the different critical regions to figure out which critical region you are looking for.

Use Inverse Normal and fill in your data values → Don't forget over the square root of n (sample size)

If there is a big difference between the mean you found and the mean given, then you reject H0. (See Example 5 of Oxford → page 640)

Reminder: Degrees of freedom is n-1
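
The equation P(Xbar > a) = 0.05 can be solved with an inverse normal, as in this Python sketch. The null mean 100, standard deviation 15, sample size 36 and observed sample mean 105.5 are all invented, and the test is taken to be one-tailed (H1: μ > 100).

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n = 100, 15, 36        # hypothetical null mean, known sd and sample size
alpha = 0.05                       # significance level

# Distribution of Xbar if H0 is true: sd is sigma over root n
null_dist = NormalDist(mu0, sigma / sqrt(n))

# Upper-tail critical value a, chosen so that P(Xbar > a) = alpha
critical_value = null_dist.inv_cdf(1 - alpha)

sample_mean = 105.5                # hypothetical observed sample mean
reject_h0 = sample_mean > critical_value
```

The critical region is Xbar > 104.11 (approximately), so a sample mean of 105.5 falls inside it and H0 would be rejected.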



4.18 Test for population mean using Poisson distribution

Syllabus and formula

If unclear, look at example 3 in Oxford textbook, page 634.

It will always only be a one-tailed test.

Notes

If a distribution is known to be Poisson, or if it has the characteristics of a Poisson distribution, a hypothesis test can be done to test for a particular value for the mean.
This is very similar to testing hypotheses with the binomial distribution.

Steps
1. State the null and alternate hypotheses (e.g. λ = 7.3, λ > 7.3)
2. To find the critical region, do PoissonCDF: λ = mean, lower bound = your estimate for the critical value, upper bound = a large number (infinity doesn't work).
3. You have found the critical value when the probability crosses the boundary. E.g. if the significance level is 0.05, the critical value is the smallest value c for which P(X ≥ c) is at most 0.05 (equivalently, P(X ≤ c − 1) is at least 0.95), and the critical region is X ≥ c.
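
The guess-and-check search for the critical value can be automated, as in this Python sketch. It uses the λ = 7.3 from step 1 with an upper-tailed test at the 5% level; the function names are made up.

```python
from math import exp, factorial

def poisson_cdf(lam, upper):
    """P(X <= upper) for X ~ Po(lam); returns 0 when upper is negative."""
    return sum(exp(-lam) * lam ** k / factorial(k) for k in range(upper + 1))

def upper_critical_value(lam, alpha):
    """Smallest c with P(X >= c) <= alpha, for H1: mean > lam."""
    c = 0
    while 1 - poisson_cdf(lam, c - 1) > alpha:   # P(X >= c) = 1 - P(X <= c - 1)
        c += 1
    return c

critical_value = upper_critical_value(7.3, 0.05)
tail_probability = 1 - poisson_cdf(7.3, critical_value - 1)
```

For λ = 7.3 this gives a critical value of 13: P(X ≥ 13) is about 0.036, while P(X ≥ 12) is about 0.068, which is above 5%. The critical region is X ≥ 13.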



4.18 Test for proportion using binomial distribution

Syllabus and formula

The null hypothesis is rejected if either the test statistic falls in the critical region or if the p-value is less than the significance level.

The test will always be one-tailed only, on one side or the other side, never both.

Notes

Each statistic obtained from a sample has an associated p-value.
This p-value is the probability of obtaining this value (or a more extreme one) if the null hypothesis is true.
If the p-value is less than the significance level the test is significant and the null hypothesis is rejected.

How to:
1. State the null and alternate hypothesis.
2. Write your equation: P(X ≥ r) ≤ 0.05 (or whatever the significance level is).
3. BinomialCDF(number of trials, probability, test a value, number of trials) (or InverseBinomial, where you have to add or subtract one to get > or ≥).
4. When it is less than 0.05 (or, working from the other side, more than 0.95), then that is the critical region.

See example 2 of Oxford textbook, page 631.
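
Steps 2 to 4 can be sketched in Python by testing values of r until P(X ≥ r) drops to the significance level. The numbers (50 trials, probability 0.02, 5% level, upper-tailed test) are chosen for illustration, and the function names are made up.

```python
from math import comb

def binomial_pdf(n, p, k):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def upper_tail(n, p, r):
    """P(X >= r), the same as BinomialCDF(n, p, r, n)."""
    return sum(binomial_pdf(n, p, k) for k in range(r, n + 1))

def upper_critical_value(n, p, alpha):
    """Smallest r with P(X >= r) <= alpha."""
    r = 0
    while upper_tail(n, p, r) > alpha:
        r += 1
    return r

critical_value = upper_critical_value(50, 0.02, 0.05)
p_in_region = upper_tail(50, 0.02, critical_value)
```

Here the critical value is 4, so the critical region is X ≥ 4. P(X ≥ 4) is about 0.018, not exactly 0.05, because the distribution is discrete.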

4.18 Type I and Type II errors

Syllabus and formula

Type I error
If we falsely reject H0 while it was true, there has been a Type I error. It can be called a 'false positive'.
It is the 'more serious error'.
We can write its probability as P(C│H0 true) where C stands for the test statistic being in the critical region.
For the normal or t-distributions, this is the same as the significance level, and for discrete distributions, it is equal to the probability of rejecting H0, which might not be exactly the same as the quoted significance level.

It is very important to consider the balance between the two. In some cases, it is more important to avoid a Type I error than to avoid a Type II error.

Finding the probability of a Type I error
Find the critical value(s).
It is only a two-tailed test if it is μ ≠ μ0.
This can be done by inverse normal/t/binomial of the significance level of the test.
Then put in the values to the t cdf/ normal cdf/ binomial cdf.
Binomial → Number of trials, probability of success, 0/ lower bound, 1 less than the critical value.
Example → 50, 0.02, 0, 3 (critical value is 4)
1 - P(X ≤ 3)

Notes

Conclusions
There are two acceptable conclusions to a hypothesis test:
1. Sufficient evidence to reject H0 at the significance level.
2. Insufficient evidence to reject H0 at the significance level (fail to reject).

Type II error
Accepting H0 when it is not true is a Type II error, also called 'a false negative'.
The probability can be written as P(C'│H1 true). The actual value of the probability depends on the particular value of the parameter.
It needs an alternative hypothesis (an alternative mean and standard deviation) to work this out.
The probability of this happening depends mostly on sample size and variance.

Chances
For a given sample size, reducing the chance of a Type I error increases the chance of a Type II error, and vice versa.

α and β
α is the level of significance.
β is the probability of incorrectly failing to reject a false null hypothesis.
Power → the ability of the test to correctly reject a false null hypothesis (power = 1 − β).

Finding the probability of a Type II error
Find the critical value(s).
It is only a two-tailed test if it is μ ≠ μ0.
This can be done by inverse normal/t/binomial of the significance level of the test.
Then use normal cdf/ t cdf/ binomial cdf in the same way as for a Type I error, but with the values from the alternative hypothesis:
Lower bound = lower critical value
Upper bound = upper critical value
Mean
Standard deviation/ Standard deviation over the square root of the sample size
If it is using sample means/ the Central Limit Theorem, then don't forget to do the standard deviation over the square root of the sample size.
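
Both error probabilities can be worked through in one Python sketch for a test on a sample mean. All the numbers are invented: H0: μ = 100 against H1: μ > 100, with σ = 15 known, n = 36, a 5% significance level, and a true mean of 105 assumed for the Type II calculation.

```python
from math import sqrt
from statistics import NormalDist

mu0, mu1 = 100, 105                # hypothetical null mean and assumed true mean
sigma, n, alpha = 15, 36, 0.05
se = sigma / sqrt(n)               # standard deviation over root n (sample means)

# Critical value from inverse normal under H0 (one-tailed, upper)
critical_value = NormalDist(mu0, se).inv_cdf(1 - alpha)

# Type I error: P(test statistic in critical region | H0 true)
type_1 = 1 - NormalDist(mu0, se).cdf(critical_value)

# Type II error: P(test statistic NOT in critical region | true mean is mu1)
type_2 = NormalDist(mu1, se).cdf(critical_value)
power = 1 - type_2
```

The Type I probability equals the significance level (0.05) because the distribution is continuous. The Type II probability here is about 0.36, so the power is about 0.64, and it changes if a different true mean is assumed.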



4.16/18 One sample t-test, when σ
is unknown
In earlier questions about the mean of the normal distribution, we assumed
that we knew the population variance. When σ is unknown, we use the unbiased
estimator of the population variance instead → s²ₙ₋₁

An unbiased estimator is one that will, on average, tend towards the value of
the parameter being estimated, in this case σ².

This adds an extra degree of uncertainty, which depends on the sample size.
→ If the sample is large then there is not much uncertainty because the
estimate will be close to the actual value.

This extra degree of uncertainty means the standardised sample mean,
(x̄ − μ) / (sₙ₋₁/√n), no longer follows a normal distribution, but instead
follows a t-distribution.

For a sample size of n, we use the T(n − 1) distribution where n − 1 is the
degrees of freedom, and is often written as v.

Reminder about unbiased estimator
n = sample size
s²ₙ = variance of sample
s²ₙ₋₁ = an estimate of the population variance using the sample
Sx = unbiased estimator using the data as a sample/ estimate of the
population standard deviation using the sample
σx = calculator assumes the data you've entered is the entire population
Don't forget to square root it if it's the variance.

Which standard deviation to use:
σ → population SD known & n>30
sₙ₋₁ → t-distribution, σ unknown or n<30
sₙ₋₁/√n → testing x̄, t-distribution, σ unknown or n<30

t-distribution on calculator (data)
Find the mean, this is the mean that will be tested against the population
mean. (Or use the mean given.)

Statistics → Stat tests → t-Test → Data
μ0 → population mean, the assumed mean


List → list of data
Frequency list → how many times each entry occurs

This gives a p-value which can be compared against the significance level to
decide whether to reject or fail to reject the null hypothesis.

It also gives a t number (test statistic). A positive t number means that the
score is on the right side of the graph, a negative means it is on the left,
with 0 in the middle (where the mean would be).

t-distribution on calculator (stats)
Fill data into column.
Statistics → Stat tests → t-Test → Stats
μ0 → population mean, the assumed mean
x̄ → sample mean
Sx → unbiased estimator, data as a sample
n → sample size

The normal distribution is used when the population variance/ σ² is known.
The T(n − 1) distribution is used when the population variance/ σ² has to be
estimated from the data.

The t-distribution approaches the normal distribution for large values of n.
The t-distribution is used regardless of the sample size.

Finding the t-value
Finding the t-value can be used for finding the equivalent of the critical
value.
For this, we do inverse-t and put in the area (it always has to be the area
below the value) and the degrees of freedom, which is n-1. That gives us the
critical value.

When to use normal vs t-test
Z-test → sample means
T-test → sample means
Norm Cdf → individual items
T Cdf → individual items
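As a rough cross-check of the calculator's t number, the test statistic can be worked out by hand. The sketch below uses made-up data and an assumed μ0 of 5.0; the Python standard library has no t-distribution, so the p-value or critical value would still come from the calculator's t-Test or inverse-t.

```python
from math import sqrt
from statistics import mean, stdev

# Assumed sample data and assumed population mean (illustrative only)
data = [5.1, 4.9, 5.6, 5.3, 4.8, 5.5]
mu0 = 5.0

n = len(data)
x_bar = mean(data)        # sample mean
s = stdev(data)           # Sx: divides by n - 1, i.e. the unbiased estimator
degrees_of_freedom = n - 1

# t number: how many estimated standard errors the sample mean is from mu0
t = (x_bar - mu0) / (s / sqrt(n))

print(round(x_bar, 3), round(s, 3), degrees_of_freedom, round(t, 3))
```

A positive t here means the sample mean lies to the right of μ0, as described above.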


Definitions
A dependent variable is a variable whose variations depend on another
variable—usually the independent variable.

An independent variable is a variable whose variations do not depend on
another variable, but are controlled by the researcher running the experiment.

Discrete data is information that can only take certain values.

Continuous data is data that can take any value. Height, weight,
temperature and length are all examples of continuous data.

Mutually exclusive refers to two (or more) events that cannot both occur
when the random experiment is performed.

The Critical region is the set of values of the test statistic for which the
null hypothesis is rejected.

The Critical value is the value which determines the boundary of the critical
region.
4.7 Discrete random variables

Syllabus and formula
Discrete random variables just means that we have a random experiment where
the outcomes only take discrete values, and each has an assigned probability.
E.g → rolling a dice, each value has a probability of 1/6.

Tips
E(X) is not a probability, so it does not need to be between 0 and 1.
A 'fair game' is an experiment where E(X) = 0, no gain or loss, on average.

Notes
X is a discrete random variable.
Discrete → X can be found by counting.
Random → X is the result of a random process, e.g rolling a dice.
Variable → the quantity, value.

We notice that:
1. The sum of the probabilities always equals 1
2. The x's can take numerical or non-numerical values.

Expected value:
This is similar to the mean, it is what you would expect the mean to be if you
were to repeat the experiment many times.

Therefore, it works in a similar way to calculating the mean from a frequency
table, (Σ fᵢxᵢ)/n, but because we have probabilities instead of frequencies,
there is no need to divide by n. This gives us the formula:
μ = E(X) = Σ x × P(X = x) (given in formula booklet).
It reads as the sum of each outcome times its probability.
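The expected value formula can be checked with a short Python sketch. The fair die comes from the notes; the betting game (pay 1 to play, receive 6 back on a six) is a hypothetical example added only to illustrate a 'fair game'.

```python
from fractions import Fraction

# Fair die: each value 1..6 has probability 1/6
die = {x: Fraction(1, 6) for x in range(1, 7)}
total_probability = sum(die.values())          # must equal 1

# E(X) = sum of outcome times probability
expected_die = sum(x * p for x, p in die.items())

# Hypothetical game: net gain +5 with probability 1/6, net loss -1 otherwise
game = {5: Fraction(1, 6), -1: Fraction(5, 6)}
expected_game = sum(x * p for x, p in game.items())

print(total_probability, expected_die, expected_game)
```

The die's expected value is larger than 1, which shows that E(X) is not a probability, and the game's expected value of 0 is what makes it fair.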


4.11 X^2 goodness of fit

Syllabus and formula

χ² goodness of fit
Do not include totals and additional columns or rows in the matrix.

General steps
1. Define the Null Hypothesis (H0)
2. Define the Alternate Hypothesis
3. Conduct the test
4. Assess the results, lay out your equation and decide whether or not we
reject the null hypothesis.
Reject if: P < S.L or χ² > C.V
Fail to reject if: P > S.L or χ² < C.V

Expected outcome:
Probability times the number of trials.

E.g → Find the expected frequency of fields with bees yielding a crop of more
than three tonnes:
1. 24 + 21 = 45 (> 3 tonnes)
2. 24 + 32 + 20 + 14 = 90 (bees)
3. All added together = 175
4. 45 ∗ 90/175 = 23.1

Notes

It is a similar concept to a t-test but we compare a relative frequency table
of a population to a frequency table from a random sample.

It measures how likely it is that the sample could have come from that
original population, i.e the goodness of fit.

You will take an expected frequency table (done on calculator) and compare it
to an observed frequency table (actual data from your sample).

How to do on calculator:
In a spreadsheet, fill in one row of the observed values, one of the expected
values.
Menu → Stat tests → χ² GOF
This gives a p-value, which can be compared to the significance level.

If the p-value of the test is greater than the predetermined significance
level, then we fail to reject the null hypothesis. If the p-value is less than
the predetermined significance level, then we should reject the null
hypothesis. (Same as t-test.)

If they give you the critical value then:
Reject the null hypothesis if χ² > C.V
Do not reject if χ² < C.V

Degrees of freedom:
n-1 → say there are 4 columns of data, then the degrees of freedom is just 3.
At HL, for every column combined, an additional -1 is taken for the degrees of
freedom.
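To see what the calculator's χ² GOF test is doing, the statistic can be computed by hand as the sum of (observed − expected)² / expected. The sketch below uses assumed counts and assumed population proportions; only the 5% critical value for 3 degrees of freedom (7.815) is a standard table value.

```python
# Assumed observed counts from a sample of 100 (illustrative only)
observed = [18, 22, 30, 30]
# Assumed relative frequencies in the population
proportions = [0.25, 0.25, 0.25, 0.25]

total = sum(observed)
# Expected outcome = probability times the number of trials
expected = [p * total for p in proportions]

# Chi-squared statistic
chi_squared = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

degrees_of_freedom = len(observed) - 1   # 4 columns of data -> 3
critical_value = 7.815                   # 5% level, 3 degrees of freedom

reject_h0 = chi_squared > critical_value
print(round(chi_squared, 2), degrees_of_freedom, reject_h0)
```

Because the statistic is below the critical value here, the conclusion would be to fail to reject the null hypothesis.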


4.11 X^2 test for independence

Syllabus and formula

A chi-squared (χ²) test for independence can be performed to find out if two
variables are independent of each other. It can be done at various
significance levels.

Hypotheses
The null hypothesis is that they are independent, and the alternate is that
they are not independent.

Reject if: P < S.L or χ² > C.V
Fail to reject if: P > S.L or χ² < C.V

In exams, the degrees of freedom may be written as v, and it will always be
greater than 1.

Expected values must be greater than 5. If there are expected values less than
5 then you will need to combine rows or columns. See 4.12 notes.

Notes

Each data point is categorized with regard to two variables, and the
frequencies are put into a contingency table.

We then compare these frequencies to the values that would be expected if the
variables were independent, based on the row and column totals.

Calculating expected frequencies:
(Row total × column total) / total

Degrees of freedom:
D of freedom = (rows - 1) * (columns - 1)

How to do on calculator:
Menu → Matrix & Vector → Create → Matrix → Fill in → ctrl var (sto →) → fill
in name → menu → Statistics → Stat tests → χ² 2-way Test

Results:
A p-value will be calculated here as well, which you will need to compare
against the significance level provided, or your χ² value may be compared
against a provided critical value.
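The expected frequencies and degrees of freedom can be sketched in Python as a check on the calculator's χ² 2-way Test. The 2 × 3 contingency table below is an assumed example; the last line repeats the bees calculation from the goodness of fit notes (45 × 90 / 175).

```python
# Assumed observed contingency table: 2 rows, 3 columns (no totals included)
table = [
    [20, 30, 10],
    [30, 20, 40],
]

row_totals = [sum(row) for row in table]
column_totals = [sum(column) for column in zip(*table)]
grand_total = sum(row_totals)

# Expected frequency = (row total x column total) / total
expected = [
    [r * c / grand_total for c in column_totals]
    for r in row_totals
]

chi_squared = sum(
    (o - e) ** 2 / e
    for observed_row, expected_row in zip(table, expected)
    for o, e in zip(observed_row, expected_row)
)

degrees_of_freedom = (len(table) - 1) * (len(table[0]) - 1)

# The bees example from the notes: row total 45, column total 90, total 175
bees_expected = 45 * 90 / 175

print(expected, round(chi_squared, 3), degrees_of_freedom, round(bees_expected, 1))
```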


4.12 Choice, validity and
interpretation of tests

Collection of data
It is important to be aware of any possible errors in the collection
methods (4.1).

The errors may be systematic → the errors have a non-zero mean and often
follow some kind of pattern.

The errors may be random → due to natural variation or other unknown factors,
which might be expected to have a zero mean.

Example of a systematic error → a weighing machine that always adds on a fixed
amount. A systematic error may still give a good correlation, but any line of
regression will be affected, so it will give misleading results.

Selecting the sample
See 4.1 notes for the list of sampling methods.
It is important that the right method is used for the question you are trying
to answer.

Collecting data using questionnaires
A multiple-choice or short-answer survey will provide data that is easy to
analyse, but it needs to be designed carefully because:
Answers may be restrictive, so not enough information is obtained
People might not answer honestly
The question may be interpreted in different ways
The questionnaire needs to be complete, as extra questions cannot be asked
later

Data mining
Data mining occurs when lots of pairs of variables are considered to see if
any significant results can be found.

If enough variables are compared it is very likely that such results can then
be found, particularly at a 5 or 10% significance level.

It can be useful for highlighting possible connections between variables, but
should be repeated to see if similar results are obtained.

Choosing a valid test
A valid test is one that measures the quality it claims to be measuring.

A test which covers all the content being assessed is said to have content
validity. It measures the extent to which the items on a test are fairly
representative of the entire domain the test seeks to measure.

Example → A chemistry test about the periodic table provides data of how well
the students have learnt the periodic table.

Things such as feelings are not as easy to test with content validity as maths
and science.

Criterion validity measures how well one measure predicts an outcome for
another measure.

Example → Entry test score affecting future performance.

If it also relates particularly to future events then it is also referred to
as predictive validity.

Reliability of tests
A test is reliable if it produces similar results on each occasion it is
carried out in similar circumstances. Its key attribute is repeatability.

A reliable test is not necessarily valid.

Example → A well-rounded, reliable quadratics test is not valid for assessing
biological knowledge.

All valid tests have to be reliable and use data from a reliable source.

Test-retest
To test if the method of collecting data is reliable, the same test can be
given to the same people after a period of time.

If it is reliable, there should be a strong correlation between the results on
the two occasions.

There may be intervening factors between the tests, but it should still have a
high correlation.

Parallel forms
Parallel forms are two tests that are equivalent in the sense that they
contain the same kinds of items of equal difficulty, but not the same items.
Both are administered to the same individuals.

If the test is reliable, there should be a strong correlation between the
scores on each test.

Disadvantages:
You have to create a large number of questions of equal difficulty to measure
the same quality.
Proving two parallel tests are equivalent is difficult.


4.12 Estimating parameters for
x^2
Often in a χ² test, you are not comparing the data with a fixed distribution,
but just want to test if it is normal, binomial or Poisson.
In these cases, it makes no sense to choose an arbitrary mean, standard
deviation or probability, but instead use values estimated from the observed
data.

Expected values must be greater than 5. If there are expected values less than
5 you will need to combine rows or columns.

Degrees of freedom
Estimating parameters from the data does affect the degrees of freedom used.

To obtain the number of degrees of freedom, take the number of cells minus
one, and then subtract one for each of the parameters estimated.


4.13 R^2 and SSR
If you have data - such as a table
1. Put it into a graph, determine what type of regression it is

2. On the spreadsheet, fill in this type of regression → Menu → Statistics →
Stat Calculations → type of regression

3. This gives you the R^2 value

What does this R^2 value mean?
E.g R^2 is 0.931. That means that 93.1% of the variation in the second data
item, such as ice creams sold, can be explained by the variation in, e.g,
temperature. The remaining 6.9% is explained by other variables, such as day
of the week. R^2 gives the proportion of variability in the second variable
accounted for by the chosen model. The closer to 1, the better the fit.

What is SSR?
SSR is the sum of the squared residuals. For each data point, it looks at the
difference between the actual value (yᵢ) and the value predicted by the line
of regression (ŷᵢ), squares it, and adds these up:
SSR = (difference for point 1)^2 + (difference for point 2)^2 etc

Awareness that R^2 = 1 - SSR/SST may enhance understanding but will not
be examined. To understand what this SSR value actually means, we compare it
to the SST.
Awareness that many factors affect the validity of a model and the coefficient
of determination, by itself, is not a good way to decide between different
models.

SST is the total of the squared differences between the actual values and the
mean.

We compare them by working out R^2 = 1 - SSR/SST

SSR = Σ(yᵢ − ŷᵢ)²
SST = Σ(yᵢ − ȳ)²

Not given in formula booklet!

ŷᵢ = model's prediction of y
yᵢ = data's actual y-values
ȳ = mean of y values
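A minimal Python sketch of R^2 = 1 − SSR/SST for a linear model, using made-up data. It assumes SSR means the sum of squared residuals (actual minus predicted) and SST the sum of squared differences from the mean, and it fits the line with the usual least squares formulas rather than a calculator.

```python
from statistics import mean

# Assumed illustrative data (e.g. x = temperature, y = ice creams sold)
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

x_bar, y_bar = mean(x), mean(y)

# Least squares line y = intercept + slope * x
slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
intercept = y_bar - slope * x_bar
y_hat = [intercept + slope * xi for xi in x]   # model's predictions

# SSR: squared differences between actual and predicted values
ssr = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))
# SST: squared differences between actual values and their mean
sst = sum((yi - y_bar) ** 2 for yi in y)

r_squared = 1 - ssr / sst
print(round(slope, 3), round(intercept, 3), round(r_squared, 4))
```

A small SSR relative to SST gives an R^2 close to 1, meaning the model accounts for most of the variation in y.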
4.14 unbiased estimates of σ and
μ

Syllabus and formula

To find the unbiased estimate of the mean:
you do it as expected, add all the values and divide by the number of data
items

To find the unbiased estimate of population variance:
fill the data into a spreadsheet, look at the population standard deviation
(σx), square it to get the variance, then multiply by n/(n-1)
Or, look at the Sx standard deviation and square it; they should be the same

Notes

n = sample size
s²ₙ = variance of sample
s²ₙ₋₁ = an estimate of the population variance using the sample
Sx = unbiased estimator using the data as a sample
σx = calculator assumes the data you've entered is the entire population

If asked to find the standard deviation of the sample means, you take sₙ₋₁
and divide it by the square root of the sample size
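The two routes to the unbiased estimate can be compared in Python, where `statistics.pvariance` plays the role of σx² and `statistics.variance` the role of Sx². The data set is an assumed example.

```python
from math import isclose, sqrt
from statistics import mean, pvariance, stdev, variance

data = [4, 7, 9, 10, 15]   # assumed sample data
n = len(data)

sample_mean = mean(data)                 # unbiased estimate of the mean

# Route 1: variance treating the data as the whole population, times n/(n-1)
variance_n = pvariance(data)             # s_n squared (sigma-x squared)
unbiased_route_1 = variance_n * n / (n - 1)

# Route 2: square the Sx-style standard deviation (divides by n - 1)
unbiased_route_2 = variance(data)

# Standard deviation of the sample means: s_(n-1) over the square root of n
sd_of_sample_means = stdev(data) / sqrt(n)

print(sample_mean, variance_n, unbiased_route_1, round(sd_of_sample_means, 3))
```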


4.15 Sample means/ Central Limit
Theorem

Syllabus and formula

A linear combination of n independent normal variables is normally
distributed. The formula is not given in the formula booklet, but the working
is very similar to the working out for a normal distribution question.

The Central Limit Theorem says that if the sample size is sufficiently large
(n>30), the distribution of the sample mean will approach a normal
distribution.
Generally, Xbar approaches normality for large n; how large depends on the
distribution from which the sample is taken.
We use n > 30.

Notes

In order for the CLT to work at all, you have to be able to calculate a mean
from your sample, which you will be able to do in the exam.
For the sample mean to be treated as normally distributed, n > 30.
Some questions may just be a straightforward normal distribution. To identify
a sample means question, look for 'sample mean' or a sample size.

If it is sample means we do the usual working, except that the standard
deviation is divided by the square root of n.
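A short Python sketch of a sample means question, using the standard deviation over the square root of n. The population values (mean 100, standard deviation 15) and the sample size of 36 are assumptions chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist

# Assumed population parameters and sample size (illustrative only)
mu, sigma, n = 100.0, 15.0, 36

# Distribution of the sample mean: same mean, SD divided by sqrt(n)
sample_mean_dist = NormalDist(mu, sigma / sqrt(n))

# P(Xbar > 104), the equivalent of normal cdf with lower bound 104
p_sample_mean_above = 1 - sample_mean_dist.cdf(104)

# For comparison: P(X > 104) for a single individual item
p_individual_above = 1 - NormalDist(mu, sigma).cdf(104)

print(round(p_sample_mean_above, 4), round(p_individual_above, 4))
```

The sample mean is much less likely than a single item to be far from μ, because its standard deviation is smaller.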


4.16 Confidence intervals
As opposed to testing data obtained from a sample against a single value for
the population mean, it is more convenient to have a range of values within
which the mean of the population is likely to lie.

These intervals always have a confidence level. For example, a 95% confidence
interval means that on 95% of all occasions that such a sample was selected,
the population mean would fall within the calculated boundaries.

The interval can be stated in words or as 'The 95% confidence interval is
(number, number)'.

When the population from which the sample is taken can be regarded as being
normally distributed, confidence intervals for the population mean can easily
be worked out.

Confidence intervals - when we don't know what a population's mean is, we use
a sample to give us a better idea of what it might be. Unbiased estimator of
population standard deviation often needed (t-distribution).

When we sample data, it's unlikely that our sample's mean is the true
population mean. However, if we use the sample data to work out an unbiased
estimator for the population standard deviation, then, in combination with our
sample mean, we can calculate values within which we are 90%, 95% or 99%
confident that the real population mean lies.

With confidence intervals, we are not testing a hypothesis, we are using
samples to tell us something about the population's mean.

T-distribution confidence intervals (σ unknown):
x̄ ± t × sₙ₋₁/√n

Z-interval (normal distribution, σ known):
x̄ ± z × σ/√n

Working out:
Type of distribution → Example X ~ N (mean, standard deviation)

Lower bound < mean < upper bound

In terms of a general mean, and a general population standard deviation,
we use mean 0 and standard deviation 1.
So for a general population 99% confidence interval: -2.576 < z < 2.576
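A minimal sketch of a z-interval in Python, which also reproduces the ±2.576 bounds for 99% confidence. The sample values (x̄ = 50, σ = 8, n = 64) are assumptions for illustration; a t-interval would need the calculator, since the standard library has no inverse-t.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(x_bar, sigma, n, confidence):
    """Confidence interval for the population mean when sigma is known."""
    # Split the leftover area equally between the two tails
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin

# Assumed illustrative sample summary
lower_95, upper_95 = z_interval(x_bar=50.0, sigma=8.0, n=64, confidence=0.95)

# General 99% bounds using mean 0 and standard deviation 1
z_99 = NormalDist().inv_cdf(0.995)

print(round(lower_95, 2), round(upper_95, 2), round(z_99, 3))
```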


4.17 Poisson distribution

When?
An event can occur any number of times (no known upper limit on the number of
occurrences) during a time period.

Independent events; In other words, if an event occurs, it does not affect the
probability of another event occurring in the same time period.

Rate of occurrence is constant; that is, the rate does not change based on time
(mean is constant for any given time period).

The mean number of events is proportional to the length of the time period.
For example, you should expect twice as many events in a 2 hour period as in a
1 hour period. If mean number of calls to a call centre in 1 hour = 1000
calls, then in ½ hr = half of 1000 calls, in ¼ hr = ¼ of 1000 calls, in
10 mins = 1/6 of 1000 calls etc.

The variable must be discrete

The events cannot occur simultaneously

The events must be random/ unpredictable


Working out:
X~Po(λ) where λ = mean number of events in the interval

Poisson PDF = probability of a specific number of events in the interval

Poisson CDF = probability of a range of events in the interval

X = discrete random variable, e.g number of cars

Lambda (λ) = expected number of events

Poisson uses discrete data

If X~Po(λ) then E(X) = Var(X) = λ

In a Poisson distribution, the mean = the variance (the standard deviation
squared). This may be useful in testing whether data does follow a Poisson
distribution, because the two values should be approximately equal.

Critical values:
On a spreadsheet, put values around the mean in the first column, and in
another column enter PoissCdf(mean, 0, a1), then drag the formula down (to
drag lots of cells, hold and drag when the four-way arrow appears). You have
to label the columns as well.
Or you can guess values until the probability is just on the edge of the
significance level: put in the mean, 0 as the lower bound, then the guessed
value as the upper bound.
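The Poisson PDF and CDF calculations can be mirrored in Python with the standard formula P(X = k) = e^(−λ) λ^k / k!. The rate of 6 events per hour is an assumption for illustration; the sketch also shows the mean being scaled to a 10 minute interval, as in the call centre example.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(X = k): probability of exactly k events (Poisson PDF)."""
    return exp(-lam) * lam ** k / factorial(k)

def poisson_cdf(lower, upper, lam):
    """P(lower <= X <= upper): probability of a range of events (Poisson CDF)."""
    return sum(poisson_pmf(k, lam) for k in range(lower, upper + 1))

# Assumed rate: 6 events per hour, so the mean for 10 minutes is 1/6 of that
rate_per_hour = 6
lam_10_min = rate_per_hour / 6

p_exactly_2 = poisson_pmf(2, lam_10_min)
p_at_most_2 = poisson_cdf(0, 2, lam_10_min)

print(round(p_exactly_2, 4), round(p_at_most_2, 4))
```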


4.18 Paired and unpaired t-test
Paired t-test/ paired samples
What is a paired t-test?
A paired t-test compares the means of two related groups to determine if there
is a significant difference between the two groups. It means that both samples
consist of the same test subjects. A paired t-test is equivalent to a
one-sample t-test carried out on the differences.

What are the hypotheses?

There are two hypotheses in a paired t-test.
● The null hypothesis (H0) states that there is no significant difference
between the means of the two groups, but it is rephrased to say the difference
in the two means is equal to zero, and the test is done on the differences,
rather than on the two samples separately. (μ1 = μ2) or (μ1 − μ2 = 0)
● The alternative hypothesis (H1) states that there is a significant difference
between the two population means, and that this difference is unlikely to be
caused by sampling error or chance.
μ1 ≠ μ2, μ1 < μ2 or μ1 > μ2

What are the assumptions of a paired t-test?


● The dependent variable is normally distributed.
● The observations are sampled independently.

● The independent variables must consist of two related groups or matched pairs.

How to run a paired t-test:


Fill in data onto a spreadsheet

Menu → Statistics → Stat tests → t-Test on the differences (μ0 = 0). If you
have the data of both groups, work out the differences in a new column first.
The 2-sample t-test is for unpaired data.

Paired interval test → Menu → Statistics → Confidence intervals → t-interval

Unpaired t-test/ two-sample tests


What is an unpaired t-test?
An unpaired t-test, also known as an independent t-test, compares the means of two
independent or unrelated groups to determine if there is a significant difference between
the two. It means that both samples consist of distinct test subjects. An unpaired t-test is
equivalent to a two-sample t-test.

What are the hypotheses?


● The null hypothesis (H0) states that there is no significant difference between the
means of the two groups.
● The alternative hypothesis (H1) states that there is a significant difference between
the two population means, and that this difference is unlikely to be caused by sampling
error or chance.

What are the assumptions?


● The dependent variable is normally distributed.
● The observations are sampled independently.
● The variance of data is the same between groups, meaning that they have the same
standard deviation, so the test is just whether or not the populations they come from
have the same mean.
● The independent variables must consist of two independent groups.

The subjects volunteered and so the data is randomized.

Example:



If you wanted to conduct an experiment to see how drinking an energy drink increases
heart rate, you could do it two ways.
The "paired" way would be to measure the heart rate of 10 people before they drink the
energy drink and then measure the heart rate of the same 10 people after drinking the
energy drink. These two samples consist of the same test subjects, so you would
perform a paired t-test on the means of both samples.
The "unpaired" way would be to measure the heart rate of 10 people before drinking an
energy drink and then measure the heart rate of some other group of people who
have drunk energy drinks. These two samples consist of different test subjects, so you
would perform an unpaired t-test on the means of both samples.
https://socratic.org/questions/what-is-a-paired-and-unpaired-t-test-what-are-the-differences
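For the paired version of the energy drink example, the test is a one-sample t-test on the differences. The sketch below uses made-up heart rates for 10 people; it only computes the t statistic, so the p-value would still come from the calculator's t-Test.

```python
from math import sqrt
from statistics import mean, stdev

# Assumed heart rates (beats per minute) for the same 10 people
before = [72, 68, 75, 80, 66, 71, 77, 69, 74, 70]
after = [78, 70, 79, 85, 70, 74, 83, 72, 80, 73]

# Work with the differences, one per person
differences = [a - b for a, b in zip(after, before)]

n = len(differences)
mean_difference = mean(differences)
s_difference = stdev(differences)        # unbiased estimator, divides by n - 1
degrees_of_freedom = n - 1

# H0: mean difference = 0, so the t statistic is the mean over its standard error
t = (mean_difference - 0) / (s_difference / sqrt(n))

print(mean_difference, degrees_of_freedom, round(t, 3))
```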


4.18 Product moment correlation
coefficient test

Syllabus and formula

Steps to do a hypothesis test for the population correlation coefficient:
1. Write the null hypothesis: ρ = 0
2. Decide on the alternate hypothesis, which will depend on if you're looking
for a positive correlation, negative correlation or any linear relationship.
3. Find and write the p-value.
4. Compare the p-value with the significance level and write a conclusion.

You should not calculate a least squares regression line unless there is
significant evidence of a linear relationship.

How to do on calculator
Menu → Stats → Linear reg t Test

Notes

If you have a set of bivariate (two variables) data, then this can be shown on
a scatter graph, and the correlation coefficient (r) can be found.

This is unlikely to be the same r value as the r value for the whole
population, which is normally written as ρ.
For a large sample size, it is likely to be quite similar.

If you need to calculate a line of best fit for your data, you need to be sure
that there is a linear correlation. This can be tested by testing the null
hypothesis: H0: ρ = 0

H0: ρ = 0 is tested against the alternatives Ha: ρ < 0 or ρ > 0 or ρ ≠ 0

This depends on whether you are testing if there is negative correlation,
positive correlation or any correlation.

To see if the sample correlation is big enough so that the null hypothesis is
rejected, a p-value is found.

This is the probability of the sample correlation being at least as large as
the one obtained if the two variables are not correlated (ρ = 0).

Conclusion of the test
If the p-value is less than the significance level, then there is strong
evidence that the two variables have a linear correlation, so calculating the
least squares regression line is appropriate.

If the p-value is less than the significance level:
"reject the null hypothesis that there is no correlation between the two
variables."

If the p-value is greater than the significance level:
"there is insufficient evidence to reject the null hypothesis that there is
no correlation between the two variables."
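A sketch of what sits behind the calculator's Linear reg t Test: the sample correlation r and the t statistic t = r√(n − 2) / √(1 − r²), which has n − 2 degrees of freedom. The bivariate data is assumed for illustration, and the p-value itself would still be read from the calculator.

```python
from math import sqrt
from statistics import mean

# Assumed bivariate data (e.g. temperature and ice creams sold)
x = [18, 21, 24, 27, 30]
y = [40, 52, 58, 71, 80]

n = len(x)
x_bar, y_bar = mean(x), mean(y)

s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_yy = sum((yi - y_bar) ** 2 for yi in y)

# Product moment correlation coefficient of the sample
r = s_xy / sqrt(s_xx * s_yy)

# Test statistic for H0: rho = 0
degrees_of_freedom = n - 2
t = r * sqrt(degrees_of_freedom) / sqrt(1 - r ** 2)

print(round(r, 4), degrees_of_freedom, round(t, 2))
```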


4.18 Test for population mean for
normal distribution

Syllabus and formula

Paired t-test → compares the means of two related groups to determine if there
is a significant difference between the two groups. It means that both samples
consist of the same test subjects. A paired t-test is equivalent to a
one-sample t-test on the differences.

Unpaired t-test → also known as an independent t-test, compares the means of
two independent or unrelated groups to determine if there is a significant
difference between the two. It means that both samples consist of distinct
test subjects. An unpaired t-test is equivalent to a two-sample t-test.

See other document for more details on paired/ unpaired t-tests.

Reminder: Degrees of freedom is n-1

A critical value is a point on the distribution of the test statistic under
the null hypothesis that defines a set of values that call for rejecting the
null hypothesis.

Notes

Test for population mean for normal distribution
If you are testing whether or not your data has come from a normal
distribution, you will need to know an estimate for the mean and the standard
deviation to work out the expected values.

The best choice would be to use the sample mean as an estimate for the
population mean: X̄ ~ N(μ, σ²/n), i.e the standard deviation is σ/√n

For the population mean we use the formula given above (from the CLT/ sample
means). If the value of Xbar (because it is a mean) falls in the critical
region then the null hypothesis is rejected.

So we are provided with a significance level, then we complete an equation
like so: P(Xbar>a)=0.05 (or whatever the significance level is).

Draw a diagram to figure out which critical region you are looking for (lower
tail, upper tail, or both tails).

Use Inverse Normal and fill in your data values → Don't forget over the
square root of n (sample size)

If there is a big difference between the mean you found and the mean given,
so that your mean is in the critical region, then you reject H0. (See
Example 5 of Oxford → page 640)
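The Inverse Normal step for P(Xbar > a) = 0.05 can be sketched in Python. All the numbers (μ0 = 100, σ = 15, n = 36, observed sample mean 105) are assumptions for illustration; for a two-tailed test the significance level would be split between the two tails.

```python
from math import sqrt
from statistics import NormalDist

# Assumed values (illustrative only)
mu0, sigma, n = 100.0, 15.0, 36
alpha = 0.05
observed_sample_mean = 105.0

# Distribution of Xbar under H0: standard deviation over the square root of n
xbar_dist = NormalDist(mu0, sigma / sqrt(n))

# One-tailed (upper): solve P(Xbar > a) = alpha with inverse normal
a_upper = xbar_dist.inv_cdf(1 - alpha)

# Two-tailed: alpha / 2 in each tail
a_low = xbar_dist.inv_cdf(alpha / 2)
a_high = xbar_dist.inv_cdf(1 - alpha / 2)

reject_h0 = observed_sample_mean > a_upper   # decision for the upper-tail test
print(round(a_upper, 3), round(a_low, 3), round(a_high, 3), reject_h0)
```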


4.18 Test for population mean
using Poisson distribution

Syllabus and formula

If unclear, look at example 3 in Oxford textbook, page 634.

It will always only be a one-tailed test.

Notes

If a distribution is known to be Poisson, or if it has the characteristics of
a Poisson distribution, a hypothesis test can be done to test for a particular
value for the mean.

This is very similar to testing hypotheses with the binomial distribution.

Steps
1. State the null and alternate hypotheses (e.g λ = 7.3, λ > 7.3)

2. To find the critical region, do PoissonCDF: λ = mean, lower bound = your
estimate for the critical value, upper bound = a large number (infinity
doesn't work).

3. You have found the critical value when this probability first drops to the
significance level or below, e.g 0.05 (equivalently, when the probability of
being below the critical value is 0.95 or above). The critical region is that
value and everything beyond it.
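The guess-and-check search for a Poisson critical value can be automated. This sketch uses the λ = 7.3, λ > 7.3 hypotheses from the steps above and assumes a 5% significance level; the helper names are hypothetical.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Po(lam)."""
    return exp(-lam) * lam ** k / factorial(k)

def upper_tail(c, lam):
    """P(X >= c), worked out as 1 - P(X <= c - 1)."""
    return 1 - sum(poisson_pmf(k, lam) for k in range(c))

lam0 = 7.3      # mean under H0 (H1: lambda > 7.3)
alpha = 0.05    # assumed significance level

# Smallest c whose upper-tail probability is at or below the significance level
critical_value = 0
while upper_tail(critical_value, lam0) > alpha:
    critical_value += 1

# Actual probability of a Type I error for this discrete test
actual_significance = upper_tail(critical_value, lam0)
print(critical_value, round(actual_significance, 4))
```

Because the distribution is discrete, the actual probability of landing in the critical region is usually a little below the quoted significance level.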


4.18 Test for proportion using
binomial distribution

Syllabus and formula

The null hypothesis is rejected if either the test statistic falls in the
critical region or if the p-value is less than the significance level.

The test will always be one-tailed only, on one side or the other, never both.

Notes

Each statistic obtained from a sample has an associated p-value.

This p-value is the probability of obtaining this value (or a more extreme
one) if the null hypothesis is true.

If the p-value is less than the significance level the test is significant and
the null hypothesis is rejected.

How to:
1. State the null and alternate hypothesis.

2. Write your equation: P(X ≥ r) ≤ 0.05 (or whatever the significance level
is).

3. BinomialCDF(number of trials, probability, test a value, number of trials)
(Or InverseBinomial - you have to add or minus one to get > / =)

4. When the upper-tail probability is less than 0.05 (or the probability below
the value is more than 0.95), then that is the critical region.

See example 2 of Oxford textbook, page 631.
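The 'test a value' step can be sketched as a loop in Python. The hypotheses (H0: p = 0.3, H1: p > 0.3), the 20 trials and the 5% level are assumptions for illustration, and `binomial_cdf` is a hypothetical helper that mirrors the calculator's BinomialCDF(n, p, lower, upper).

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def binomial_cdf(n, p, lower, upper):
    """P(lower <= X <= upper), same argument order as the calculator."""
    return sum(binomial_pmf(k, n, p) for k in range(lower, upper + 1))

# Assumed test: H0: p = 0.3, H1: p > 0.3, 20 trials, 5% significance level
n_trials, p0, alpha = 20, 0.3, 0.05

# Test values of r until P(X >= r) is at or below the significance level
r = 0
while binomial_cdf(n_trials, p0, r, n_trials) > alpha:
    r += 1

critical_region_probability = binomial_cdf(n_trials, p0, r, n_trials)
print(r, round(critical_region_probability, 4))
```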
4.18 Type I and Type II errors

Syllabus and formula

Type I error
If we falsely reject H0 while it was true, there has been a Type I error. It
can be called a 'false positive'.

It is the 'more serious error'.

We can write its probability as P(C│H0 true) where C stands for the test
statistic being in the critical region.
For the normal or t-distributions, this is the same as the significance level,
and for discrete distributions, it is equal to the probability of rejecting
H0, which might not be exactly the same as the quoted significance level.

It is very important to consider the balance between the two types of error.
In some cases, it is more important to avoid a Type I error than to avoid a
Type II error.

Notes

Conclusions
There are two acceptable conclusions to a hypothesis test:
1. Sufficient evidence to reject H0 at the significance level.
2. Insufficient evidence to reject H0 at the significance level. (fail to
reject)

Type II error
Failing to reject H0 when it is not true is a Type II error, also called 'a
false negative'.
The probability can be written as P(C'│H1 true). The actual value of the
probability depends on the particular value of the parameter.
It needs an alternative hypothesis (an alternative mean and standard
deviation) to work this out.
The probability of this happening depends mostly on sample size and variance.

Chances
For a given sample size, reducing the chance of a Type I error increases the
chance of a Type II error, and vice versa.

α and β
α is the level of significance, the probability of a Type I error.

β is the probability of incorrectly failing to reject a false null hypothesis
(a Type II error).

Power → the ability of the test to correctly reject a false null hypothesis,
1 − β.

Finding the probability of a Type I error
Find the critical value(s).

It is only a two-tailed test if H1 is μ ≠ μ0.

The critical value(s) can be found with inverse normal/t/binomial, using the significance level of the test.

Then put the values into t cdf/normal cdf/binomial cdf.

Finding the probability of a Type II error
Find the critical value(s).

It is only a two-tailed test if H1 is μ ≠ μ0.

The critical value(s) can be found with inverse normal/t/binomial, using the significance level of the test.

Then use normal cdf/t cdf/binomial cdf in the same way as for a Type I error, but with the alternative (true) parameter value and the region outside the critical region, since the probability is P(C' | H1 true).
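As an illustration of these steps for a normal test on a sample mean, here is a minimal Python sketch. All the numbers (`mu0`, `mu1`, `sigma`, `n`, `alpha`) are hypothetical, and an upper-tailed test (H1: μ > μ0) with known σ is assumed. `NormalDist.inv_cdf` plays the role of inverse normal and `NormalDist.cdf` the role of normal cdf.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical values, chosen only for illustration
mu0 = 100.0    # mean under H0
mu1 = 104.0    # assumed true mean under H1
sigma = 10.0   # known population standard deviation
n = 25         # sample size
alpha = 0.05   # significance level

# Sample means: standard deviation over the square root of the sample size
se = sigma / sqrt(n)
null_dist = NormalDist(mu0, se)
alt_dist = NormalDist(mu1, se)

# Step 1: critical value from inverse normal (upper tail)
critical_value = null_dist.inv_cdf(1 - alpha)

# Step 2: Type I error = P(in critical region | H0 true)
type_i = 1 - null_dist.cdf(critical_value)

# Step 3: Type II error = P(not in critical region | H1 true)
type_ii = alt_dist.cdf(critical_value)
power = 1 - type_ii

print(f"critical value = {critical_value:.3f}")
print(f"P(Type I)  = {type_i:.4f}")
print(f"P(Type II) = {type_ii:.4f}")
print(f"power      = {power:.4f}")
```

For a continuous distribution the Type I error comes out equal to the significance level, which matches the statement in the notes. The Type II error changes if a different value of `mu1` is assumed.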



If the test uses sample means (the Central Limit Theorem), don't forget to divide the standard deviation by the square root of the sample size.

Normal cdf inputs:
Lower bound = lower critical value
Upper bound = upper critical value
Mean
Standard deviation, or standard deviation over the square root of the sample size

Binomial cdf inputs → number of trials, probability of success, 0 (lower bound), 1 less than the critical value.
Example → 50, 0.02, 0, 3 (critical value is 4)
1 − P(X ≤ 3)
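The binomial example (50 trials, probability 0.02, critical value 4) can be checked with a short Python sketch. It assumes an upper-tailed test that rejects H0 when X ≥ 4. The alternative probability `p1 = 0.10` used for the Type II error is hypothetical and does not come from the notes.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ B(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

n = 50
p0 = 0.02            # probability of success under H0
critical_value = 4   # assumed critical region: X >= 4

# Type I error: 1 - P(X <= 3) under H0
type_i = 1 - binom_cdf(critical_value - 1, n, p0)

# Type II error: P(X <= 3) when the true probability is p1 (hypothetical)
p1 = 0.10
type_ii = binom_cdf(critical_value - 1, n, p1)

print(f"P(Type I)  = {type_i:.4f}")
print(f"P(Type II) = {type_ii:.4f}")
```

Because the binomial distribution is discrete, the Type I error here is not exactly equal to a round significance level such as 5%.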



4.16/18 One sample t-test, when σ is unknown

In earlier questions about the mean of a normal distribution, we assumed that the population variance was known. When it is not known, we use the unbiased estimator of the population variance → s²ₙ₋₁

An unbiased estimator is one that will, on average, tend towards the value of the parameter being estimated, in this case σ².

This adds an extra degree of uncertainty, which depends on the sample size.
→ If the sample is large then there is not much uncertainty because the estimate will be close to the actual value.

This extra degree of uncertainty means that the sample mean, standardised using sₙ₋₁, no longer follows a normal distribution, but instead follows a t-distribution.

For a sample size of n, we use the T(n − 1) distribution, where n − 1 is the degrees of freedom, often written as v.

Reminder about unbiased estimator
n = sample size
s²ₙ = variance of the sample
s²ₙ₋₁ = an estimate of the population variance using the sample
Sx = unbiased estimate of the population standard deviation, treating the data as a sample
σx = calculator assumes the data you've entered is the entire population
Don't forget to take the square root of s²ₙ₋₁ when a standard deviation is needed, because it is a variance.

Which standard deviation to use: σ (population SD known and n > 30), sₙ₋₁ (t-distribution, σ unknown or n < 30), or sₙ₋₁/√n (testing x̄, t-distribution, σ unknown or n < 30)

The normal distribution is used when the population variance σ² is known.
The T(n − 1) distribution is used when the population variance σ² has to be estimated from the data.

The t-distribution approaches the normal distribution for large values of n.
The t-distribution is used regardless of the sample size.

t-distribution on calculator (data)
Find the mean; this is the mean that will be tested against the population mean. (Or use the mean given.)

Statistics → Stat tests → t-Test → Data
μ0 → population mean, the assumed mean
List → list of data
Frequency list → how many times each entry in the list occurs

This gives a P-value which can be compared against the significance level for rejecting or failing to reject the null hypothesis.

It also gives a t number (test score). A positive t number means that the score is on the right side of the graph, a negative one means it is on the left, with 0 in the middle (where the mean would be).

t-distribution on calculator (stats)
Fill data into column.

Statistics → Stat tests → t-Test → Stats

μ0 → population mean, the assumed mean
x̄ → sample mean
Sx → unbiased estimator, data as a sample
n → sample size

Finding the t-value
Finding the t-value can be used for finding the equivalent of the critical value.
For this, we do inverse-t and put in the area (it always has to be the area below the value).
We put in the area and the degrees of freedom, which is n − 1, and that gives us the critical value.

When to use normal vs t-test
Z-test → sample means
T-test → sample means
Norm Cdf → individual items
T Cdf → individual items
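To see what the calculator's t-Test does, here is a minimal Python sketch using only the standard library. The data list and `mu0` are hypothetical, and an upper-tailed test (H1: μ > μ0) is assumed. The helpers `t_pdf` and `t_cdf` are illustrative names; the cdf is approximated with Simpson's rule rather than taken from a statistics library.

```python
from math import gamma, pi, sqrt
from statistics import mean, pstdev, stdev

def t_pdf(x, v):
    """Density of the t-distribution with v degrees of freedom."""
    coeff = gamma((v + 1) / 2) / (sqrt(v * pi) * gamma(v / 2))
    return coeff * (1 + x * x / v) ** (-(v + 1) / 2)

def t_cdf(x, v, steps=2000):
    """P(T <= x), using Simpson's rule on [0, |x|] plus symmetry about 0."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0, v) + t_pdf(a, v)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, v)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

# Hypothetical sample and assumed population mean
data = [5.1, 4.9, 5.6, 5.8, 5.2, 5.4, 5.0, 5.5]
mu0 = 5.0

n = len(data)
x_bar = mean(data)
s_n = pstdev(data)    # divides by n     (calculator's σx)
s_n1 = stdev(data)    # divides by n - 1 (calculator's Sx)
df = n - 1            # degrees of freedom

# Test statistic: standard deviation over the square root of the sample size
t_stat = (x_bar - mu0) / (s_n1 / sqrt(n))
p_value = 1 - t_cdf(t_stat, df)   # upper tail

print(f"x_bar = {x_bar:.4f}, s_n = {s_n:.4f}, s_n-1 = {s_n1:.4f}")
print(f"t = {t_stat:.3f}, df = {df}, p-value = {p_value:.4f}")
```

The two standard deviations differ only in the divisor: n · s²ₙ = (n − 1) · s²ₙ₋₁, so the unbiased estimate is always slightly larger. The P-value would then be compared with the significance level as described above.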



Definitions
A dependent variable is a variable whose variations depend on another
variable—usually the independent variable.

An independent variable is a variable whose variations do not depend on another variable, only on the researcher running the experiment.

Discrete data is information that can only take certain values.

Continuous data is data that can take any value. Height, weight,
temperature and length are all examples of continuous data.

Mutually exclusive refers to two (or more) events that cannot both occur
when the random experiment is performed.

The critical region is the set of values of the test statistic for which the null hypothesis is rejected.

The critical value is the value which determines the boundary of the critical
region.
