Tutorial Notes (Complete) 2
Variance: s^2 = sum of squares / (n - 1) = 49.2 / (5 - 1) = 49.2 / 4 = 12.3
Std dev: s = sqrt(12.3) = 3.5 days
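A quick check of this arithmetic, as a minimal sketch. It assumes the sum of squared deviations is 49.2 and that n = 5 (so n - 1 = 4), which is what the numbers in the notes suggest; the variable names are illustrative.

```python
import math

sum_of_squares = 49.2   # sum of squared deviations from the mean (from the notes)
n = 5                   # assumed sample size, so that n - 1 = 4

variance = sum_of_squares / (n - 1)   # sample variance
std_dev = math.sqrt(variance)         # sample standard deviation

print(round(variance, 2))  # 12.3
print(round(std_dev, 1))   # 3.5
```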
Mean = avg (centre)
Std dev = distance from mean (spread)
In this context understand severity of
virus and its consensus effects
3 to dirt quant
graph display frequency
[Sketches: dot plots of a right-skewed and a left-skewed distribution]
Right skew: median < mean
Left skew: mean < median
Data: 3, 17, 17, 17, 22, 25, 25, 29 (n = 8)
Mean = 155 / 8 = 19.375
Median = (17 + 22) / 2 = 19.5
Maish
1
Z score = standardized value
Cannot compare apples to oranges (or C to F) w/o standardizing
Tells us how many SD an observation is from the mean, i.e. z = (x - µ) / σ
Basis to compare values w/ different means and SDs
where x = obs, µ = mean, σ = SD
2
Standard normal tables give us Z scores
and corresponding area under the normal
curve
e.g. if z = 1.52, proportion under curve?
ANSWER: 0.9357 or 93.57%
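The table lookup can be reproduced numerically. This is a minimal sketch using the error function from Python's standard library in place of a printed Z table; the function names `z_score` and `normal_cdf` are illustrative, not from the notes.

```python
import math

def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

def normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# Proportion under the curve to the left of z = 1.52
print(round(normal_cdf(1.52), 4))  # 0.9357
```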
a Tx
alwaysunder
III stand
3
68-95-99.7 rule, aka empirical rule
Based on normal distribution w/ mean µ and SD σ
68% of obs w/in 1 SD of µ
95% of obs w/in 2 SD of µ
99.7% of obs w/in 3 SD of µ
[Sketch: normal curve, axis marked 2, 4, 6, 8, 10, 12]
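The three percentages can be checked against the standard normal curve. A minimal sketch; `within_k_sd` is an illustrative helper name.

```python
import math

def normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def within_k_sd(k):
    """Proportion of a normal distribution within k SDs of the mean."""
    return normal_cdf(k) - normal_cdf(-k)

for k in (1, 2, 3):
    # Prints the percentage of observations within k SDs of the mean
    print(k, round(100 * within_k_sd(k), 1))
```

The printed values come out close to 68, 95 and 99.7, which is where the rule gets its name.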
4
Multiplying each obs in a data set by value b multiplies the mean by b and the SD by |b|
Addition of value a to each obs in a data set adds a to the mean; SD is unchanged
First: 2.9 SDs above mean
Second: 0.6 SDs below mean
µ = 15.5, σ = 0.7
a) z = (x - µ) / σ = (13 - 15.5) / 0.7 = -3.57
Using Z table: P(Z < -3.57) = 0.0002 = 0.02%
P(13 < X < 16) = P(X < 16) - P(X < 13)
= 0.7611 - 0.0002
= 0.7609
OR
1 - P(X > 16) - P(X < 13)
= 1 - (1 - 0.7611) - 0.0002
= 0.7609
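The same calculation as a minimal sketch, assuming µ = 15.5 and σ = 0.7 as read from the notes. It uses the exact normal curve rather than a printed Z table, so the result differs slightly from 0.7609, because the table rounds z to two decimals.

```python
import math

def normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

mu, sigma = 15.5, 0.7          # mean and SD as read from the notes

z_low = (13 - mu) / sigma      # z score for x = 13
z_high = (16 - mu) / sigma     # z score for x = 16

# P(13 < X < 16) = P(X < 16) - P(X < 13)
p_between = normal_cdf(z_high) - normal_cdf(z_low)

print(round(z_low, 2))         # -3.57
print(round(p_between, 4))     # close to the table answer of 0.7609
```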
3
Mean: multiply by 1.8 and add 32 = 96.6
SD: only multiply by 1.8 = 0.9
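A minimal sketch of why the SD ignores the "+ 32". The three readings are hypothetical (the original data are not in the notes); they were picked so the SD is 0.5 before conversion and 0.9 after, matching the factor of 1.8.

```python
import statistics

celsius = [35.5, 36.0, 36.5]                   # hypothetical readings, not from the notes
fahrenheit = [1.8 * c + 32 for c in celsius]   # multiply by b = 1.8, then add a = 32

mean_c, sd_c = statistics.mean(celsius), statistics.stdev(celsius)
mean_f, sd_f = statistics.mean(fahrenheit), statistics.stdev(fahrenheit)

print(mean_c, sd_c)                            # mean and SD before conversion
print(round(mean_f, 1), round(sd_f, 1))        # mean is scaled and shifted, SD is only scaled
```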
1
2
Direction: positive, negative or neither
[Sketches: +ve (x and y both increase), -ve (y decreases as x increases), neither]
Strength: weak, moderate, strong
[Sketches: scatterplots showing weak, moderate and strong relationships]
Scatterplots -> correlation -> linear regression
Scatterplots: relationship of 2 quant variables, visualized
Correlation: want to understand correlation of x and y
r = correlation coefficient (Pearson's): direction and strength of linear relationship
r = 1 / (n - 1) * sum of [ ((x - x̄) / Sx) * ((y - ȳ) / Sy) ]
where
x, y = obs
x̄, ȳ = mean
n = sample size
Sx, Sy = sample SD
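A minimal sketch of Pearson's r computed directly from the definitions above (obs, means, sample size, sample SDs). The data are hypothetical and `pearson_r` is an illustrative name.

```python
import math

def pearson_r(xs, ys):
    """Pearson's correlation coefficient from standardized x and y values."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sample SDs (divide by n - 1)
    s_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    # Average product of z scores, using n - 1
    return sum(((x - mean_x) / s_x) * ((y - mean_y) / s_y)
               for x, y in zip(xs, ys)) / (n - 1)

xs = [1, 2, 3, 4, 5]   # hypothetical data, not from the notes
ys = [2, 4, 5, 4, 5]

print(round(pearson_r(xs, ys), 3))  # 0.775
```

Swapping `xs` and `ys` gives the same r, and r carries no units because each term is a product of z scores.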
PROPERTIES of r
2 quantitative variables
no units
+ve or -ve
b/w -1 and +1
strength increases as r gets closer to -1 or +1
not resistant to outliers
x y
Linear regression = least squares regression line
LOBF (line of best fit): smallest residuals
How does y change with x?
Allows us to predict response
ŷ = b0 + b1 x
b1 = slope = r (Sy / Sx)
b0 = intercept = ȳ - b1 x̄
Sy, Sx = sample SD
r = correlation coefficient
x̄, ȳ = sample mean
Coefficient of determination R^2
literally correlation squared
how much variation is explained by regression model
Residuals
residual = y - ŷ
difference b/w obs and predicted value for response
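A minimal sketch tying the regression line, R^2 and residuals together. It uses the standard least-squares relations (slope = r * Sy / Sx, intercept = mean of y minus slope times mean of x); the data are hypothetical.

```python
import statistics

xs = [1, 2, 3, 4, 5]   # hypothetical data, not from the notes
ys = [2, 4, 5, 4, 5]
n = len(xs)

mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
s_x, s_y = statistics.stdev(xs), statistics.stdev(ys)   # sample SDs

# Correlation coefficient
r = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / ((n - 1) * s_x * s_y)

b1 = r * s_y / s_x            # slope
b0 = mean_y - b1 * mean_x     # intercept

predicted = [b0 + b1 * x for x in xs]                        # y-hat for each obs
residuals = [y - y_hat for y, y_hat in zip(ys, predicted)]   # obs minus predicted
r_squared = r ** 2                                           # coefficient of determination

print(round(b0, 2), round(b1, 2))   # 2.2 0.6
print(round(r_squared, 2))          # 0.6
```

For a least-squares line the residuals sum to zero (up to rounding), which is a handy check on hand calculations.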
Sample Surveys
Question: How many hours on average do STAB22H3 students spend studying for this course per week?
Population: parameters (unknown)
Sample: statistics (known)
Use sample statistics to infer population parameters
Action: Ask all students physically present at lecture this week
[Sketches of sampling designs]
Stratified sample: strata, improves accuracy
Cluster sample
Multistage sample: hierarchy of clusters
Convenience sample
Observation vs Experiment
Observation: passively watching situation and drawing conclusions; retrospective or prospective
Experiment: actively manipulating variables (aka treatment) and recording outcomes; determine causal relationships
Experimental Design
Factors: fertilizer, water
Levels: high, med, low
Random phenomenon
It totaftone
outcomes
Theoretical Probability
e P A 2 or
15
g
I P S
at
at
No overlap: no outcomes in common
Overlap: outcomes overlap, can occur simultaneously, aka P(A ∩ B)
Important to distinguish b/w the two
Multiplication rule (for independent events): P(A and B) = P(A) x P(B)
Quiz 7 Q2 b
Probability that student chosen at random is neither smart nor funny
Correct answer:
1 - P(S or F) = 1 - 0.49 = 0.51
Common mistake:
1 - 0.36 - 0.26 - 0.13 = 0.25
Calculating this way double-counts people: people in the smart category also fall into the S and F category
[Venn diagram: S, F; region outside both circles = not smart or funny]
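A minimal sketch contrasting the correct answer with the common mistake. It assumes 0.36 = P(S), 0.26 = P(F) and 0.13 = P(S and F); the notes do not label these numbers, but that reading is the one that gives P(S or F) = 0.49.

```python
p_smart = 0.36   # assumed to be P(S)
p_funny = 0.26   # assumed to be P(F)
p_both = 0.13    # assumed to be P(S and F)

# Addition rule: subtract the overlap once so it is not counted twice
p_smart_or_funny = p_smart + p_funny - p_both
p_neither = 1 - p_smart_or_funny

# Common mistake: subtracting the overlap as if it were a separate group
mistake = 1 - p_smart - p_funny - p_both

print(round(p_smart_or_funny, 2))  # 0.49
print(round(p_neither, 2))         # 0.51
print(round(mistake, 2))           # 0.25
```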
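Conditional probability (general multiplication rule): probability of the conditioning event must be greater than 0
P(B | A): B given A
A and B can be any events
Order is important: what would P(A | B) be?
P(A | B) = P(B and A) / P(B)
A and B are independent if P(A | B) = P(A)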
"At least one" type questions
P(at least one) = 1 - P(none)
e.g. albinism example from lecture: 3 children, at least one albino
Either calculate all scenarios, or 1 - P(none)
Punnett square (Aa x Aa): AA, Aa, Aa, aa, so P(albino) = 1/4, P(not albino) = 3/4
1 - P(none) = 1 - (3/4 x 3/4 x 3/4) = 1 - 27/64 = 37/64
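A minimal sketch showing that both routes give the same answer: the complement shortcut, and adding up every scenario with at least one albino child. It assumes P(albino) = 1/4 for each child, independently, as in the Aa x Aa cross.

```python
from fractions import Fraction
from itertools import product

p_albino = Fraction(1, 4)   # assumed probability for each child (Aa x Aa cross)
n_children = 3

# Shortcut: 1 - P(none)
p_none = (1 - p_albino) ** n_children
p_at_least_one = 1 - p_none

# Long way: add up the probability of every scenario with at least one albino child
total = Fraction(0)
for outcome in product([True, False], repeat=n_children):
    if any(outcome):
        prob = Fraction(1)
        for is_albino in outcome:
            prob *= p_albino if is_albino else 1 - p_albino
        total += prob

print(p_at_least_one)   # 37/64
print(total)            # 37/64
```

Exact fractions are used here so the result matches the hand calculation without rounding.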
Eg
Ya Ya
fo
Focus on discrete variables
Examples: tossing a coin, number of students in a
r.v.: outcomes of events denoted as numbers
Probability distribution: probability of each outcome
Special r.v.: boolean outcomes only (yes/no, 0/1)
Distribution of r.v. X = # of heads (3 coin tosses), all possible outcomes:
X    P
0    1/8
1    3/8
2    3/8
3    1/8
(same table, diff orientation)
Mean (expected value) of r.v.
aka weighted average, µ_X
X: 1    2    3
P: 0.2  0.3  0.5
µ_X = 1(0.2) + 2(0.3) + 3(0.5) = 2.3
Median of r.v.
aka midpoint: value of X at which cumulative P passes 0.5
Variance of r.v.
sqrt(Var(X)) = SD(X) = σ (sigma)
Binomial Model
type of probability model dist Gottman
outcome either success or failure
X ~ B(n, p): Binomial distribution w/ parameters n and p
p = prob of success
n = # of observations
Solving questions
Formula or Binomial Table
Example: 25% of people believe in astrology
you select 4 people at random
a) P(none believe)
b) P(at least one believes)
Formulas
X ~ B(n, p)
where
n = # trials
X = # successes (# of people who believe)
p = prob of success
q = 1 - p = prob of failure
a) P(X = 0) = 4C0 (0.25)^0 (1 - 0.25)^(4 - 0)
= (1)(1)(0.75)^4
= 0.31640625
Note: nCr key (where r = x) can also be written as combination, "n choose x" (or k)
P(X = k) = nCk p^k (1 - p)^(n - k)
or manually: nCk = n! / (k! (n - k)!)
b) P(X >= 1) = 1 - P(X = 0)
= 1 - 0.31640625
= 0.68359375
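The formula route can be checked with a few lines of Python. A minimal sketch for the astrology example (n = 4, p = 0.25); `binomial_pmf` is an illustrative name, and `math.comb` needs Python 3.8 or later.

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p): nCk * p^k * (1 - p)^(n - k)."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 4, 0.25                    # 4 people, 25% believe in astrology

p_none = binomial_pmf(0, n, p)    # a) P(none believe)
p_at_least_one = 1 - p_none       # b) P(at least one believes)

print(p_none)           # 0.31640625
print(p_at_least_one)   # 0.68359375
```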
Binomial table: a) P(X = 0)
table values 0.4096 (p = 0.2) and 0.2401 (p = 0.3)
midpoint = 0.32485
(large diff due to rounding)
b) P(X >= 1) = 1 - P(X = 0)
= 1 - 0.32485
= 0.67515
or sum up values for x >= 1
Note: Make sure to use correct table
Not z or t tables
Also, x is often interchangeable w/ k as # of successes, but not always
a) µ_X = 2(0.3) + 5(0.7)
= 0.6 + 3.5
= 4.1
b) Var(X) = sum of (x - µ)^2 P(x)
= (2 - 4.1)^2 (0.3) + (5 - 4.1)^2 (0.7)
= 1.89
SD(X) = sqrt(Var(X))
= sqrt(1.89)
= 1.37
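A minimal sketch of the mean, variance and SD of a discrete r.v., using the values from this question (X = 2 with P = 0.3, X = 5 with P = 0.7). The variable names are illustrative.

```python
import math

values = [2, 5]       # possible values of X
probs = [0.3, 0.7]    # matching probabilities, must sum to 1

# Mean = weighted average of the values
mean = sum(x * p for x, p in zip(values, probs))

# Variance = weighted average of squared distances from the mean
variance = sum((x - mean) ** 2 * p for x, p in zip(values, probs))
sd = math.sqrt(variance)

print(round(mean, 2))      # 4.1
print(round(variance, 2))  # 1.89
print(round(sd, 2))        # 1.37
```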
p = 0.1 = prob of event happening, i.e. drop out before graduation
n = 10
a) k = 3: P(X = 3) = 0.0574
b) k < 3 (not including 3): P(X < 3) = 0.9298
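Both answers can be reproduced from the binomial formula instead of the table. A minimal sketch, assuming n = 10 and p = 0.1 as stated; part b) adds up k = 0, 1, 2 because 3 is not included.

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 10, 0.1   # 10 students, each with a 0.1 chance of dropping out

p_exactly_3 = binomial_pmf(3, n, p)                             # a) k = 3
p_fewer_than_3 = sum(binomial_pmf(k, n, p) for k in range(3))   # b) k = 0, 1, 2

print(round(p_exactly_3, 4))     # 0.0574
print(round(p_fewer_than_3, 4))  # 0.9298
```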
Review
Proportion: µ, σ
Count: µ, σ
Use Z table
SRS = simple random sample
n = size of sample
p = pop'n probability of success
X = count of successes in sample
p̂ = sample proportion of successes, i.e. X / n
Rules
np >= 10 AND n(1 - p) >= 10
When n is large, sampling dist approaches normal
Mean of sample proportions approaches pop'n prob of success p
In proportion
pop
e.g. StatKey: Sampling distribution for proportion
Dataset: Coin flip
Sample size: n = 10 vs n = 100
Generate 1, 10, 100, 1000 samples
p = 0.21
n = 300
a) X ~ N(µ, σ)
µ = np = 300(0.21) = 63
SD = sqrt(np(1 - p))
= sqrt(63(1 - 0.21))
= 7.0548
Z score
1 0.5
P(X >= 80), where x = 80 is the x val
z = (x - µ) / σ
= (80 - 63) / 7.0548
= 2.41
P(Z >= 2.41)
Using Z table: area to the left of z = 2.41 is 0.9920
P(Z >= 2.41) = 1 - 0.9920
= 0.008
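The whole normal-approximation calculation as a minimal sketch, assuming n = 300, p = 0.21 and a cutoff of 80 as in the notes. The error function stands in for the Z table, and no continuity correction is applied, matching the hand calculation.

```python
import math

def normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

n, p = 300, 0.21

mu = n * p                           # mean of the count
sigma = math.sqrt(n * p * (1 - p))   # SD of the count

z = (80 - mu) / sigma                # z score for x = 80
p_at_least_80 = 1 - normal_cdf(z)    # area to the right of z

print(round(mu, 1), round(sigma, 4))   # 63.0 7.0548
print(round(z, 2))                     # 2.41
print(round(p_at_least_80, 3))         # 0.008
```

The rules np >= 10 and n(1 - p) >= 10 hold here (63 and 237), so the normal approximation is reasonable.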
all the best