Mod 1 Stats

Uploaded by

jennylehuynh29
Copyright
© All Rights Reserved

Module 1: Descriptive Statistics

Lecture 1: Introduction and Descriptive Statistics


Data Types
Qualitative/categorical
● Mutually exclusive labels (one label cannot mean two things)
● Usually not numbers; if numbers are used, they have no mathematical meaning
- Nominal: ordering/ranking makes no sense, numerical labels are arbitrary
- Ordinal: ordering/ranking has meaning/can be interpreted, numerical labels
respect the ordering
Quantitative/numerical
● Numbers used to record certain events, numbers have mathematical meaning
- Interval: the difference between two quantities is meaningful, but their ratio is not;
zero has no natural meaning
- Ratio: both the difference and the ratio of two quantities are meaningful; zero is
meaningful

Using categorical/qualitative data


Frequency distribution
● Frequency: the total number of occurrences for each
category


● Relative frequency: the fraction of the total number of
items belonging to a category (eg. 102/808 = 0.1262)
● Percent frequency: relative frequency x 100%
Histograms
● Categories on x-axis
● Frequency, relative frequency, percent frequency on y-axis
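
The three frequency measures above can be sketched in a few lines. This is a minimal illustration, not part of the original notes: the list `plans` is made-up categorical data, and all variable names are assumptions.

```python
from collections import Counter

# Hypothetical categorical sample: the plan chosen by each of 8 customers.
plans = ["$29", "$49", "$49", "$79", "$29", "$49", "$29", "$49"]

frequency = Counter(plans)               # occurrences per category
total = sum(frequency.values())          # total number of items
relative = {k: v / total for k, v in frequency.items()}   # fraction of total
percent = {k: 100 * r for k, r in relative.items()}       # relative x 100%

for category in sorted(frequency):
    print(category, frequency[category], relative[category], percent[category])
```

The relative frequencies always sum to 1 and the percent frequencies to 100%, which is a quick sanity check on any frequency table.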

Using numerical/quantitative data


Frequency distributions and histograms
● Categories on x-axis are grouped (eg. 0-5, 5-10, 10-15)
● Density frequency
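
Grouping numerical data into intervals can be sketched as follows. The data, the bin width of 5 and the left-closed convention [0, 5), [5, 10), [10, 15) are assumptions made for illustration. "Density" is interpreted here as relative frequency divided by bin width, which is the usual histogram convention but is not spelled out in the notes.

```python
# Hypothetical numerical sample.
values = [1, 3, 4, 6, 7, 7, 9, 11, 12, 14]
width = 5
edges = [0, 5, 10, 15]                   # bins [0,5), [5,10), [10,15]

counts = [0] * (len(edges) - 1)
for v in values:
    # Left-closed bins; a value equal to the last edge goes in the last bin.
    idx = min(int((v - edges[0]) // width), len(counts) - 1)
    counts[idx] += 1

n = len(values)
# Density: relative frequency per unit of bin width (assumed definition).
density = [c / (n * width) for c in counts]

print(counts)
print(density)
```

With this definition the bar areas (density x width) add up to 1.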

Probability theory
● Random variable (r.v.) - a variable whose value is determined by a random process
● Population - the complete pool of possible observations of a random variable
● Sample - a random collection of a certain size drawn from the population

Probability distribution
● Probability distribution - the general shape of probability for
values that a random variable may take

Notation
● Random variable denoted by X, Y (capital letters)
- Eg. X: number of children in household
- Eg. Y: amount of time spent by husband on
housework per day
● realisations/observations of a random variable denoted by xᵢ, yᵢ (lowercase letters
with subscript)
- Eg. x₁: number of children in household is 1
- Eg. y₁₃₇: amount of time spent by husband on housework per day is 137
● N and n denote the size or number of observations.
- N denotes the population size
- n denotes the sample size

Descriptive Statistics
Central tendency
● Measures of central tendency yield info about the centre of a set of numbers (the
distribution of a r.v.); they do not focus on the span of the dataset or how far values
are from the middle
● Give an idea of the typical, middle, or average value that a r.v. can take
● Sometimes called measures of location

Three measures of central tendency

Mode ● most frequently occurring value in a set of data


● If there are 2 modes, the 2 modes are listed and the data is said to be bimodal
● Datasets with 3 or more modes are referred to as multimodal
● Concept of mode is often used in determining sizes
● Appropriate descriptive summary measure for categorical data

Median ● middle value in an ordered array of numbers


● Locate the median by finding the (n+1)/2-th term in the ordered array (for even n,
average the two middle terms)
● Large and small values do not inordinately influence the median, hence it is the
best measure of location to use in the analysis of variables in which extreme but
acceptable values can occur at just one end of the data
● Not all info from the dataset is used
● Data must be quantitative or be able to be ranked

Mean ● Average of a set of numbers


● Sample mean is represented by X̄
● Population mean is represented by µ
● Data should be quantitative as it needs to be summed
● Affected by all values – advantage because it reflects all the data, but
disadvantage because extreme values pull the mean towards extremes
● To calculate the mean forecast value (the expected value), multiply each possible value by
its probability and sum the products.

- If we denote the r.v. by X: $E(X) = \sum_i x_i \, P(X = x_i)$
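
The three measures of central tendency, and the probability-weighted mean E(X) = Σ xᵢ·P(X = xᵢ), can be sketched with Python's standard `statistics` module. The sample `data` and the probability table `distribution` are hypothetical values chosen for illustration.

```python
import statistics

# Hypothetical sample with one extreme value (40).
data = [2, 3, 3, 5, 7, 10, 40]

mode = statistics.mode(data)       # most frequent value
median = statistics.median(data)   # (n+1)/2 = 4th term of the ordered array
mean = statistics.mean(data)       # sum of values divided by n

# Expected value of a discrete r.v.: each value times its probability, summed.
# Assumed distribution: number of children X with P(0)=0.2, P(1)=0.5, P(2)=0.3.
distribution = {0: 0.2, 1: 0.5, 2: 0.3}
expected = sum(x * p for x, p in distribution.items())

print(mode, median, mean, expected)
```

The extreme value 40 pulls the mean (10) well above the median (5), which illustrates why the median is preferred when extreme values occur at one end of the data.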


Variability
● Measures of variability yield info about how likely a realisation of the r.v. is to fall far
from the centre of its distribution; they describe the spread/dispersion of a dataset
● Give an idea of fluctuation and volatility across realisations of the r.v.
● The more variability in a dataset, the less typical the central values are of the whole set
● Using measures of variability in conjunction with measures of central tendency
makes possible a more complete numerical description of the data (a measure of
variability is necessary to complement the mean value when describing data)
● The more spread out the r.v. is, the larger its variability (risk/dispersion)
● Also called measures of scale, spread, dispersion or risk
● Measures of variability
- Variance (Var) - average of squared distance from the mean
- Standard deviation (std): square root of variance
- Coefficient of variation - standard deviation/ mean x100%

Variability formulas
Variance
● It computes the average squared distance between data points and their mean; the
formula depends on whether the data are a population or a sample
● Population variance
- Finite population
- Denoted by σ² (sigma squared) or
Var(X)/Variance of X
● Sample variance
- Denoted by s²
Standard deviation
● Standard deviation solves the problem of squared units. It has the same unit as the
original data
● Population standard deviation
- Denoted by σ (sigma) or std(X)
● Sample standard deviation
- Denoted by s
Coefficient of variation
● Measures standard deviation per unit of
mean
● In finance, when the r.v. X denotes asset returns, CV measures risk per unit of
expected return
● It is unit free, because both the numerator and denominator have the same unit as
the original data and they cancel each other
● Population CV: CV = σ/µ x 100%
- when σ increases (µ fixed), CV increases
- when µ increases (σ fixed), CV decreases
- Ratio between risk and expected return
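
The formulas for this section did not survive in the notes. The sketch below uses the standard textbook definitions, assumed to match the lecture: population variance σ² = (1/N)·Σ(xᵢ − µ)², sample variance s² = (1/(n − 1))·Σ(xᵢ − x̄)², standard deviation as the square root of variance, and CV = σ/µ x 100%. The `returns` data are hypothetical.

```python
import statistics

# Hypothetical asset returns (in %).
returns = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

mu = statistics.mean(returns)             # mean
pop_var = statistics.pvariance(returns)   # population variance: divides by N
samp_var = statistics.variance(returns)   # sample variance: divides by n - 1
pop_std = statistics.pstdev(returns)      # population standard deviation
samp_std = statistics.stdev(returns)      # sample standard deviation
cv = pop_std / mu * 100                   # coefficient of variation, in %

print(mu, pop_var, samp_var, pop_std, samp_std, cv)
```

Because the sample formula divides by n − 1 instead of N, the sample variance is always slightly larger than the population variance computed from the same numbers.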
Skewness
Shape
● Central tendency and variability are useful to describe and summarise data or the
distribution of r.v.’s
● Skewness - a measure of asymmetry
● Mode: value on the horizontal axis where the high point of the curve occurs
● Mean: towards the tail of the distribution (drawn towards the extreme values)
● Median: generally located somewhere between the mode and the mean
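
For a distribution with a long right tail, the three bullets above imply the ordering mode < median < mean. A minimal check on made-up right-skewed data (the list `skewed` is an assumption for illustration):

```python
import statistics

# Hypothetical right-skewed sample: most values are small, a few are large.
skewed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 15]

mode = statistics.mode(skewed)       # high point of the distribution
median = statistics.median(skewed)   # middle of the ordered data
mean = statistics.mean(skewed)       # pulled towards the long right tail

print(mode, median, mean)
```

For a left-skewed dataset the ordering typically reverses, with the mean below the median.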

Lecture 2: Probability theory


● Multi-dimensional data
● Experiment: a random process that creates outcomes (eg. the data collection
procedure)
● Sample space: the set of all possible outcomes
● Event: a set of outcomes (can contain no outcome, single outcome or multiple
outcomes) of an experiment to which probability is assigned. So an event is a subset
of the sample space
● Relative frequency: outcomes receive probability corresponding to their number of
occurrences → P(outcomeᵢ) = number of occurrences of outcomeᵢ ÷ total number of
occurrences of all outcomes

Law of addition
Joint vs marginal probabilities
● Distinguish joint and marginal probability through multidimensional outcomes
● Joint probability: denotes relative frequency when asking about all dimensions
- Eg. what is the relative frequency with which a customer bought a $49 plan on a weekday?
● Marginal probability: displays relative frequency when only asking about a single
dimension
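
The difference between joint and marginal probabilities can be sketched with a small two-dimensional table. The counts below are invented for illustration (plan price by day type); only the $49 plan and the weekday come from the example in the notes.

```python
# Hypothetical counts of purchases by (plan, day type).
counts = {
    ("$29", "weekday"): 30, ("$29", "weekend"): 20,
    ("$49", "weekday"): 35, ("$49", "weekend"): 15,
}
total = sum(counts.values())

# Joint probability: relative frequency of each (plan, day) combination.
joint = {key: c / total for key, c in counts.items()}
p_49_and_weekday = joint[("$49", "weekday")]

# Marginal probability: sum the joint probabilities over the other dimension.
p_49 = sum(p for (plan, _), p in joint.items() if plan == "$49")
p_weekday = sum(p for (_, day), p in joint.items() if day == "weekday")

print(p_49_and_weekday, p_49, p_weekday)
```

Each marginal probability is obtained by adding up a row or column of joint probabilities, which is why marginals traditionally appear in the margins of the table.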

Law of total probability, version 1


● The complement of an event A is denoted A’ (pronounced “A prime”) and means “not A”: the
prime (dash) at the top indicates that the outcome does not occur
● When referring to joint probability, we use the intersection symbol “∩”. The event A∩B
(read: “the intersection of A and B” or “A intersection B”) is the event where both A and
B occur
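
The formula for this law is missing from the notes. The standard statement in terms of joint probabilities, assumed here to be what "version 1" refers to, splits event A according to whether B occurs or not:

```latex
P(A) = P(A \cap B) + P(A \cap B')
```

In words: a marginal probability is the sum of the joint probabilities over an event and its complement.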

Venn diagram: visualisation of probability


● Venn diagram shows logic relations across sets
● The external rectangle indicates the whole sample space
● The internal circle indicates some event A
Joint events
● Joint events such as A ∩ B is the intersection (∩) of A and B
Union of events
● Indicates the event A or B happens
● This is denoted by A∪B, pronounced as the union of A and B or A union B.
So P(A∪B) indicates the probability that A or B is true or that A or B occurs

General rule of addition
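
The formula under this heading is missing from the notes. The standard general rule of addition, assumed to be the one intended, is:

```latex
P(A \cup B) = P(A) + P(B) - P(A \cap B)
```

The joint probability is subtracted because outcomes in both A and B would otherwise be counted twice.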

Mutually exclusive events


● If events A and B cannot occur at the same time (A can occur only if B does not), we
say A and B are mutually exclusive (events)
● Any event and its complement are mutually exclusive. Either “A
occurs” or “A does not occur”
● P(A∩A’) = 0

Collectively exhaustive events


● If the occurrence of events A and B covers the whole sample
space, we say A and B are collectively exhaustive (events)
● Any event and its complement are collectively exhaustive. “A
occurs” and “A does not occur” make up all possible outcomes
● P( A∪A’) = 1
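
The set relations above can be checked by counting outcomes. This sketch assumes a fair six-sided die as the experiment; the events A (even number) and B (greater than 3) are hypothetical choices, and `prob` is an illustrative helper.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}   # fair die: all outcomes equally likely
A = {2, 4, 6}                       # event: even number
B = {4, 5, 6}                       # event: greater than 3
A_prime = sample_space - A          # complement of A ("not A")

def prob(event):
    """Probability of an event as (outcomes in event) / (all outcomes)."""
    return Fraction(len(event & sample_space), len(sample_space))

# Addition rule: P(A or B) = P(A) + P(B) - P(A and B)
union = prob(A | B)
by_rule = prob(A) + prob(B) - prob(A & B)

print(union, by_rule)               # both sides of the addition rule
print(prob(A & A_prime))            # mutually exclusive: probability 0
print(prob(A | A_prime))            # collectively exhaustive: probability 1
```

Exact fractions are used instead of floats so that the two sides of the addition rule compare equal without rounding error.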

Conditional probability and independence


Conditional probabilities
● P(A|B) denotes the probability that event A occurs, given that B occurs.
● The symbol P(X=x|Y=y) denotes the probability of r.v. X taking value x, conditional on
the r.v. Y taking value y
● formula:

● Bayes rule:
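
Both formulas are missing from the notes. The standard definitions, assumed to be the ones intended, are the conditional probability formula (for P(B) > 0) followed by Bayes' rule:

```latex
P(A \mid B) = \frac{P(A \cap B)}{P(B)}
\qquad
P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}
```

Bayes' rule follows from the first formula by writing the joint probability as P(A ∩ B) = P(B|A)·P(A).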

Law of total probability


● Joint probability = conditional probability multiplied by the marginal probability
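
The product relation above, P(A∩B) = P(A|B)·P(B), combined with splitting A over B and its complement, gives the conditional form of the law of total probability: P(A) = P(A|B)·P(B) + P(A|B’)·P(B’). A minimal numeric sketch follows; the scenario and all probabilities are hypothetical.

```python
# Hypothetical setting: B = "item comes from machine 1", A = "item is defective".
p_b = 0.6                  # marginal P(B)
p_not_b = 1 - p_b          # P(B') by the complement rule
p_a_given_b = 0.02         # conditional P(A|B)
p_a_given_not_b = 0.05     # conditional P(A|B')

# Joint probability = conditional probability x marginal probability.
p_a_and_b = p_a_given_b * p_b

# Law of total probability: sum the joint probabilities over B and B'.
p_a = p_a_given_b * p_b + p_a_given_not_b * p_not_b

# Bayes' rule: reverse the conditioning.
p_b_given_a = p_a_and_b / p_a

print(p_a_and_b, p_a, p_b_given_a)
```

Although machine 1 produces 60% of the items, it accounts for only about 37.5% of the defective ones, because its defect rate is lower.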

Independent events: formula


● If A and B are independent events, whether or not B occurs should not affect the
probability that A occurs; also, whether or not A occurs should not affect the probability
that B occurs
● Formula:

● Bayes rule:
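
Both formulas are missing from the notes. For independent events the standard results, assumed to be the ones intended, are the product rule and the collapse of the conditional probabilities to the marginal ones:

```latex
P(A \cap B) = P(A)\, P(B)
\qquad
P(A \mid B) = P(A), \quad P(B \mid A) = P(B)
```

The second pair follows by substituting the product rule into P(A|B) = P(A ∩ B)/P(B).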

Implications of formulas

Binomial experiments
● Eg. toss a coin 3 times in a row and you are interested in how likely it is that you get
exactly two heads
● A binomial experiment assesses the number of a certain outcome from repeated
independent trials
● Each trial has two possible outcomes (eg. heads or tails, success or failure)

Binomial tree
● When two outcomes are independent, P(A|B) = P(A)
● Suppose we have three products; each can be defective
(D) with probability p or functional (F) with probability
q = 1 − p
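
The binomial tree for three products can be enumerated directly. In this sketch the defect probability `p = 0.1` is an assumed value, and independence between products is what lets each path probability be computed as a simple product.

```python
from itertools import product

p = 0.1        # assumed probability that a product is defective (D)
q = 1 - p      # probability that a product is functional (F)

# Every path through the tree is a sequence of three outcomes, e.g. "DFD".
paths = {}
for path in product("DF", repeat=3):
    prob = 1.0
    for outcome in path:
        prob *= p if outcome == "D" else q   # independence: multiply along path
    paths["".join(path)] = prob

# Probability of exactly two defective products: add up the matching paths.
p_two_defects = sum(pr for path, pr in paths.items() if path.count("D") == 2)

print(len(paths), p_two_defects)
```

Three paths (DDF, DFD, FDD) contain exactly two defects, each with probability p²·q, which is where the combinatorial factor in the binomial distribution comes from.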
Binomial distribution
● A r.v. X taking values in {0, 1, ..., n} is said to follow the binomial distribution, denoted by
𝑋 ~ 𝐵𝑖𝑛(𝑛, 𝑝), with $P(X = x) = \binom{n}{x} p^x (1-p)^{n-x}$

● $p^x$: the probability of x successes
● $(1-p)^{n-x}$: the probability of n-x failures. So in total we have n trials
● The factor (combinatorial operator) $\binom{n}{x} = \frac{n!}{x!\,(n-x)!}$

- computes the number of cases/combinations of choosing x
objects from the set of n objects. Remember the factorial
operator m! = 1 x 2 x 3 x … x (m-1) x m

● Properties of binomial distribution:


- Almost all distributions have expectation (i.e. mean) and variance (and thus
standard deviation).
- Every distribution (their pdf) is characterised by some parameters.
→ The binomial distribution has two parameters, 𝒏 (the number of trials) and
𝒑 (the success probability or success rate)
→ the mean (expectation) and variance of 𝑋~𝐵𝑖𝑛(𝑛, 𝑝) are given by:
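
The formulas are missing from the notes. The standard results for 𝑋~𝐵𝑖𝑛(𝑛, 𝑝), assumed to be the ones intended, are E(X) = np and Var(X) = np(1 − p). The sketch below checks them numerically for the coin example (n = 3 tosses, p = 0.5); `binom_pmf` is an illustrative helper, not a library function.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for X ~ Bin(n, p): C(n, x) * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 3, 0.5                       # three tosses of a fair coin
pmf = [binom_pmf(x, n, p) for x in range(n + 1)]

p_two_heads = pmf[2]                # probability of exactly two heads

# Mean and variance computed directly from the distribution.
mean = sum(x * pr for x, pr in enumerate(pmf))
variance = sum((x - mean) ** 2 * pr for x, pr in enumerate(pmf))

print(pmf, p_two_heads)
print(mean, n * p)                  # compare with n*p
print(variance, n * p * (1 - p))    # compare with n*p*(1-p)
```

Computing the mean and variance from the probability table and comparing them with np and np(1 − p) is a quick way to confirm the closed-form expressions for any small n.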
