
Quantitative Analysis

CFA Level I

By Dr. Jatin Thukral


Outline
• Probability Concepts
• Common Probability Distributions
• Sampling and Estimation
• Hypothesis Testing
Probability Concepts
• A random variable is an uncertain quantity
• An outcome is an observed value of a random
variable
• An event is a single outcome or a set of
outcomes
• Mutually exclusive events are those that cannot
occur at the same time; exhaustive events are those
that include all possible outcomes
Defining Properties of Probability
• The probability of occurrence of any event Ei is
between 0 and 1
0 ≤ P(Ei) ≤ 1
• If a set of events E1, E2, E3, …, EN is mutually
exclusive and exhaustive, then
∑ P(Ei) = 1
Types of Probabilities

• Empirical probability: based on historical data
• A priori probability: based on reasoning
• Subjective probability: based on personal judgment
Probability in terms of Odds

• An alternative way of expressing probabilities
• The odds that an event with probability 0.2 will
occur are 0.2 / (1 – 0.2) = 0.25, i.e., one to four
Unconditional and Conditional Probabilities

• Unconditional Probability: The marginal
probability that an event occurs, regardless of
any past or future event
• Conditional Probability: The probability of an
event given that another event (or events) has
occurred or will occur
An example illustrating unconditional, conditional and joint
probabilities
Rules of Probability
• Multiplication Rule:
Joint Probability P(AB) = P(A|B) P(B)
• Addition Rule:
P(A or B) = P(A) + P(B) – P(AB)
• Total Probability Rule:
P(A) = P(A|B1) P(B1) + P(A|B2) P(B2)
+ … + P(A|BN) P(BN)
where B1, B2 ,…, BN is a mutually exclusive and
exhaustive set of outcomes
Applications of Probability Rules
• Multiplication rule is used to compute joint
probability
• Addition rule is used to determine the probability
that at least one of two events occurs
• Joint probability of any number of
independent events is calculated as
P(ABCD)=P(A) P(B) P(C) P(D)
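A minimal sketch in Python of how the three rules fit together; the events (A = "stock beats its index", B1/B2 = states of the economy) and all probabilities are hypothetical, chosen only for illustration:

```python
# Hypothetical inputs: B1 and B2 are mutually exclusive and exhaustive.
p_b1, p_b2 = 0.6, 0.4            # P(B1), P(B2)
p_a_given_b1 = 0.7               # P(A | B1)
p_a_given_b2 = 0.3               # P(A | B2)

# Multiplication rule: joint probability P(AB1) = P(A | B1) * P(B1)
p_a_and_b1 = p_a_given_b1 * p_b1

# Total probability rule: P(A) = P(A|B1) P(B1) + P(A|B2) P(B2)
p_a = p_a_given_b1 * p_b1 + p_a_given_b2 * p_b2

# Addition rule: P(A or B1) = P(A) + P(B1) - P(AB1)
p_a_or_b1 = p_a + p_b1 - p_a_and_b1

print(p_a_and_b1, p_a, p_a_or_b1)    # ≈ 0.42, 0.54, 0.72
```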
Independent Events
• Independent Events: Occurrence of one event
has no influence on occurrence of other events
P(A|B) = P(A)
Expected Value

• Expected value is the weighted average of
the possible outcomes of a random variable,
where the weights are the probabilities that
the outcomes will occur
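A minimal sketch, using a hypothetical discrete distribution of EPS outcomes (both the values and the probabilities are made up for illustration):

```python
# Hypothetical EPS outcomes and their probabilities (must sum to 1).
outcomes = [1.80, 2.00, 2.20]
probs    = [0.25, 0.50, 0.25]

# Expected value: probability-weighted average of the outcomes.
expected_value = sum(p * x for p, x in zip(probs, outcomes))

# Variance: probability-weighted average squared deviation from the mean.
variance = sum(p * (x - expected_value) ** 2 for p, x in zip(probs, outcomes))

print(expected_value)   # ≈ 2.00
print(variance)         # ≈ 0.02
```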
Diagramming an investment problem using
a tree diagram
Covariance

• Covariance is a measure of how two assets move together
• Variance is the covariance of an asset with itself
• Covariance can range from negative infinity to positive infinity
Correlation Coefficient

• Correlation is covariance divided by the product of
the standard deviations
• Measures the strength of the linear relationship
between two random variables
• Has no units and ranges from –1 to +1
• If Corr(Ri, Rj) = 1, the random variables have perfect
positive correlation
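A short NumPy sketch, using hypothetical return series, showing that the correlation coefficient is just the covariance scaled by the two standard deviations:

```python
import numpy as np

# Hypothetical monthly returns for two assets (illustrative only).
r_i = np.array([0.02, -0.01, 0.03, 0.01, -0.02])
r_j = np.array([0.01, -0.02, 0.02, 0.00, -0.01])

# Sample covariance matrix: variances on the diagonal, Cov(Ri, Rj) off it.
cov_ij = np.cov(r_i, r_j)[0, 1]

# Correlation = covariance / (std_i * std_j); always between -1 and +1.
corr_ij = cov_ij / (r_i.std(ddof=1) * r_j.std(ddof=1))

print(cov_ij, corr_ij)
print(np.corrcoef(r_i, r_j)[0, 1])   # same correlation, computed directly
```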
Portfolio Expected Value and Variance

• Portfolio Expected Value
• Portfolio Variance
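In standard form, for a portfolio of n assets with weights w_i, these are:

```latex
E(R_P) = \sum_{i=1}^{n} w_i \, E(R_i)
\qquad\qquad
\operatorname{Var}(R_P) = \sum_{i=1}^{n} \sum_{j=1}^{n} w_i \, w_j \operatorname{Cov}(R_i, R_j)
```

For two assets the variance reduces to w1²σ1² + w2²σ2² + 2 w1 w2 Cov(R1, R2).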
Bayes’ Formula
• Bayes’ formula is used to update a given set of prior
probabilities for a given event in response to the arrival of
new information
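In standard form, the updated (posterior) probability of event A given new information B is:

```latex
P(A \mid B) \;=\; \frac{P(B \mid A)\, P(A)}{P(B)}
\;=\; \frac{P(B \mid A)\, P(A)}{P(B \mid A)\, P(A) + P(B \mid A^{c})\, P(A^{c})}
```

That is, the prior probability scaled by the probability of the new information given the event, divided by the unconditional probability of the new information.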
Labeling
• There are n items, and each can receive one
of k different labels
• The number that receive label 1 is n1, the number
that receive label 2 is n2, and so on, such that
n1 + n2 + … + nk = n
• The total number of ways that the labels can be
assigned is n! / (n1! × n2! × … × nk!)
Combination Formula
• A special case of labeling where k=2
• n1=r and n2=n-r
• Choosing r items from a set of n items
• nCr is the number of possible ways of selecting r
items from n items when order of selection is
not important
Permutation Formula
• nPr is the number of possible ways to select r
items from a set of n items when the order of
selection is important
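A quick numerical check of all three counting formulas with Python's math module (Python 3.8+); the example of 8 stocks labeled buy/hold/sell is hypothetical:

```python
import math

# Labeling: ways to assign k labels to n items when n_i items get label i.
n_labels = [3, 3, 2]            # e.g., 3 "buy", 3 "hold", 2 "sell" out of 8 stocks
n = sum(n_labels)

labeling_ways = math.factorial(n)
for n_i in n_labels:
    labeling_ways //= math.factorial(n_i)   # n! / (n1! * n2! * ... * nk!)
print(labeling_ways)            # 560

# Combination (order not important) and permutation (order important).
print(math.comb(8, 3))          # 8C3 = 8! / (5! * 3!) = 56
print(math.perm(8, 3))          # 8P3 = 8! / 5!        = 336
```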
Outline
• Probability Concepts
• Common Probability Distributions
• Sampling and Estimation
• Hypothesis Testing
Probability Distribution
• A probability distribution describes the
probabilities of all the possible outcomes for a
random variable
• A discrete random variable is one for which the
number of possible outcomes can be counted
• A continuous random variable has an infinite
number of possible outcomes
Probability Function
• A probability function p(x) specifies the
probability that a random variable is equal to
a specific value x

• The two key properties are:
0 ≤ p(x) ≤ 1
∑ p(x) = 1
Probability Density Function (pdf) and
Cumulative Distribution Function (cdf)

• A pdf is a function f(x) used to generate the
probability that outcomes of a continuous
distribution lie within a particular range
• A cdf gives the probability that a random
variable X takes on a value less than or equal
to a specific value x
Pdf and Cdf of Normal Distribution
Discrete Uniform Random Variable and a
Binomial Random Variable
• A discrete uniform random variable:
probabilities for all possible outcomes are
equal
• A binomial random variable is the number of
“successes” in a given number of trials, where the
outcome of each trial can be either “success” or “failure”
Binomial Random Variable
• The probability of success p is constant for
each trial and the trials are independent
• For number of trials = 1, the binomial random
variable is called a Bernoulli random variable
• The expected value of a binomial variable is n × p
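A minimal sketch with hypothetical n and p, building the binomial probability function from the combination formula and confirming that the mean equals n × p:

```python
import math

n, p = 10, 0.4        # hypothetical number of trials and success probability

def binom_pmf(k: int, n: int, p: float) -> float:
    """P(exactly k successes in n independent trials)."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

mean = sum(k * binom_pmf(k, n, p) for k in range(n + 1))

print(mean)                      # ≈ 4.0, i.e. n * p
print(binom_pmf(4, n, p))        # ≈ 0.2508
print(sum(binom_pmf(k, n, p) for k in range(n + 1)))   # ≈ 1.0
```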
Stock Price Movements as Binomial Tree
Continuous Uniform Distribution
• Outcomes can only occur between some
lower limit a and some upper limit b
• P(X < a) = P(X > b) = 0
• P(x1 ≤ X ≤ x2) = (x2 – x1) / (b – a)
Normal Distribution
• Completely described by mean µ and variance
σ2, stated as X ~ N(µ,σ2)
• Skewness = 0, that is P(X ≤ µ) = P(µ ≤ X) = 0.5
• Kurtosis = 3
• A linear combination of normally distributed
random variables is also normally distributed
• Probabilities of outcomes farther above or below
the mean get smaller but never reach zero
Normal Distribution
Multivariate Distributions
• Specifies the probabilities associated with a
group of random variables
• Meaningful only if the behavior of one random
variable depends on the behavior of the others
• A multivariate normal distribution is completely
defined by three sets of parameters: the means,
the variances, and the pairwise correlations
Confidence Interval
• A confidence interval is a range of values
around the expected outcome within which
we expect the actual outcome to fall some
specified percentage of the time
Standard Normal Distribution
• A standard normal distribution is a normal
distribution that has been standardized so that
it has mean zero and standard deviation 1
• Standardization is the process of converting an
observed value for a random variable to its
z-value: z = (X – µ) / σ
Using the z-table
• Consider an EPS distributed with µ = 6, σ = 2.
What is the probability that EPS ≥ 9.70?
• The z-value of EPS = 9.70 is
z = (9.70 – 6) / 2 = 1.85
• From the z-table, F(1.85) = 0.9678
• P(EPS ≥ 9.70) = 1 – 0.9678 = 0.0322
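The same calculation checked with SciPy (assumed to be available); the numbers are the ones on the slide:

```python
from scipy.stats import norm

mu, sigma = 6.0, 2.0
z = (9.70 - mu) / sigma            # 1.85
p_below = norm.cdf(z)              # F(1.85) ≈ 0.9678
p_above = 1 - p_below              # P(EPS ≥ 9.70) ≈ 0.0322

# Equivalently, without standardizing by hand:
print(norm.sf(9.70, loc=mu, scale=sigma))   # ≈ 0.0322
print(z, p_below, p_above)
```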
Shortfall Risk and Safety-First Ratio
• Shortfall risk is probability that return will fall
below a target value over a given time period
• Roy’s safety-first criterion states that the optimal
portfolio minimizes the probability that the return
falls below a minimum threshold level RL, i.e., it
minimizes P(RP < RL)
• If returns are normally distributed, the optimal
portfolio is the one with the highest safety-first
ratio, shown below
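In standard form, for a threshold (target) return R_L:

```latex
\text{SFRatio} = \frac{E(R_P) - R_L}{\sigma_P}
```

Under normality, the portfolio with the largest SFRatio has the smallest probability of a return below R_L, so maximizing the ratio and minimizing shortfall risk select the same portfolio.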


Lognormal Distributions
• The lognormal distribution is generated by
the function e^X, where X is normally distributed
Monte Carlo Simulations
• A technique based on repeatedly generating random
values for one or more risk factors that affect security values
• A probability distribution is assigned to each
factor
• Each set of randomly generated factors is used
with a pricing model to value the security
• Limitations: fairly complex; results no better
than the assumptions on factor distributions
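A minimal sketch of the idea, assuming a single risk factor (a normally distributed one-period return with hypothetical parameters) and a deliberately trivial “pricing model” that just compounds a hypothetical starting price:

```python
import numpy as np

rng = np.random.default_rng(seed=42)

# Assumed distribution for the single risk factor (one-period return).
mu, sigma = 0.08, 0.20            # hypothetical mean and volatility
s0 = 100.0                        # hypothetical current price
n_trials = 100_000

# Step 1: repeatedly generate the risk factor.
simulated_returns = rng.normal(mu, sigma, n_trials)

# Step 2: run each draw through the pricing model.
simulated_prices = s0 * (1 + simulated_returns)

print(simulated_prices.mean())            # ≈ 108, the expected end-of-period value
print(np.percentile(simulated_prices, 5)) # a downside (5th percentile) outcome
```

The results are only as good as the assumed distribution of the risk factor, which is the limitation noted on the slide.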
Historical Simulation
• Based on actual changes in risk factors over
some time period
• Advantage of using the actual distribution of
risk factors and not estimations
• Limitation compared to Monte Carlo
simulation is that it cannot answer “what if”
questions
Outline

• Probability Concepts
• Common Probability Distributions
• Sampling and Estimation
• Hypothesis Testing
Sampling
• Simple random sampling: selecting a sample such
that each item has the same chance of being selected
• Systematic sampling: selecting every mth item
• Sampling error: difference between sample statistic
(mean, variance) and its population parameter
• Sampling distribution of a sample statistic is a
probability distribution of all possible sample statistics
computed from a set of equal-sized samples
Stratified Random Sampling
• Population is separated into smaller groups
based on one or more distinguishing
characteristics
• From each group, or stratum, a random
sample is taken and results are pooled
• Size of sample from each stratum depends on
size of stratum relative to population
• Often used in bond indexing
Time-Series and Cross-Sectional Data

• Time-series data: observations taken over a
period of time at specific and equally spaced
time intervals
• Cross-sectional data: observations taken at a
single point in time
Central Limit Theorem
• For simple random samples of size n from a
population with mean µ and variance σ2, the
sampling distribution of the sample mean
approaches N(µ, σ2/n)
• Extremely useful: the normal distribution is easy to
apply when testing hypotheses and constructing
confidence intervals
• Inference can be made irrespective of the
population distribution if n ≥ 30
Standard Error of Sample Mean
• It is the standard deviation of the distribution
of the sample means, equal to σ/√n (or s/√n when
the population standard deviation must be estimated)
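A simulation sketch of both points: sample means drawn from a clearly non-normal (exponential) population come out approximately normal around µ, with standard deviation σ/√n. The population and sample size here are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Exponential population with mean 1 and standard deviation 1 (not normal).
n = 50                        # sample size (> 30)
n_samples = 20_000            # number of independent samples drawn

sample_means = rng.exponential(scale=1.0, size=(n_samples, n)).mean(axis=1)

print(sample_means.mean())        # ≈ 1.0, the population mean
print(sample_means.std(ddof=1))   # ≈ 0.141, the standard error
print(1 / np.sqrt(n))             # σ/√n = 1/√50 ≈ 0.141
```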
Point Estimates and Confidence Interval
• Point estimates are single values used to estimate
population parameters; for example, the sample
mean x̄ is a point estimate of the population mean µ
• Confidence interval estimates result in a range
of values within which the actual parameter will
lie with probability 1 – α. Here, α is called the
level of significance and 1 – α is called the degree
of confidence
Construction of Confidence Intervals
point estimate ± (reliability factor × standard error)
where
Point estimate = value of a sample statistic
Reliability factor = a number based on the assumed
sampling distribution of the point estimate and on the
probability 1 – α that the estimate falls within the
confidence interval
Standard error = standard error of the point estimate
Desirable Properties of an Estimate
• Unbiased: Expected value of the estimator is
equal to the parameter you are trying to
estimate
• Efficient: Variance of its sampling distribution is
smaller than that of all other unbiased estimators
• Consistent: Accuracy of parameter estimate
increases as the sample size increases
Student’s t-distribution
• Bell-shaped distribution, symmetric about mean
• Appropriate distribution to use when
constructing confidence intervals based on small
samples (n<30) from populations with unknown
variance and a normal (or approx.) distribution
• Also appropriate when variance is unknown and
sample size is large enough that CLT will ensure
that sampling distribution is approx. normal
Properties of Student’s t-distribution

• Symmetrical about its mean
• Defined by a single parameter, the degrees of
freedom, equal to n – 1
• More probability in the tails than the normal
distribution
• As the degrees of freedom get larger, it
approaches the standard normal distribution
• Fatter tails mean larger critical values, making it
more difficult to reject the null hypothesis
Distributions for Different Degrees of
Freedom
Calculating Confidence Interval
• Population is normally distributed with known
variance
Calculating Confidence Interval

• Population is normally distributed with
unknown variance
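In standard form, the (1 – α) confidence intervals for the population mean on these two slides are:

```latex
\text{Known variance:}\quad \bar{x} \;\pm\; z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}
\qquad\qquad
\text{Unknown variance:}\quad \bar{x} \;\pm\; t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}}
```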
Constructing Confidence Intervals
• Unlike the standard normal distribution, the
reliability factor for the t-distribution depends on
sample size, so we have to look up reliability
factors in a t-table
• Owing to the fatter tails of the t-distribution,
confidence intervals based on it are wider, i.e.,
more conservative
Selecting the Appropriate Test Statistic
Biases in Estimation Based on Samples
• Data-Mining Bias: Repeatedly searching a data set
until an apparently significant pattern is found
• Sample Selection Bias: Some data is systematically
excluded because of lack of availability
• Survivorship Bias: Most common bias, only surviving
data is considered
• Look-Ahead Bias: A relationship is tested based on
data not available on test date
• Time-Period Bias: Period over which data is gathered is
too short or too long
Outline

• Probability Concepts
• Common Probability Distributions
• Sampling and Estimation
• Hypothesis Testing
Hypothesis Testing Procedure
Null Hypothesis and Alternative Hypothesis

• The null hypothesis H0 is the hypothesis that
the researcher wants to reject, e.g., H0: µ = µ0
• The alternative hypothesis Ha is what is
concluded if there is sufficient evidence to
reject the null hypothesis
Two-Tailed Tests of Hypotheses
• A two-tailed test for the population mean is
H0: µ = µ0 versus Ha: µ ≠ µ0
• The general decision rule for a two-tailed test is to
reject H0 if the test statistic is greater than the upper
critical value or less than the lower critical value
Two-Tailed Test Using Standard Normal(z)
Distribution
• The decision rule for a two-tailed z-test at α = 0.05 is:
reject H0 if z < –1.96 or z > +1.96
Example of Two-Tailed Test
• Data gathered on daily returns for 250 days;
Mean daily return=0.1%, Sample Std. Dev.
=0.25%. Belief: Mean Daily Return ≠ 0
• Null Hypothesis: H0: mean daily return = 0
• Decision Rule: reject H0 if z < –1.96 or z > +1.96
• Standard error = 0.0025/√250
• Test Statistic = (0.001 – 0) / (0.0025/√250) ≈ 6.32
• Since 6.32 > 1.96, reject the null hypothesis
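The same test reproduced as a short Python sketch (SciPy is assumed only for the p-value line):

```python
import math
from scipy.stats import norm

n = 250
mean_return = 0.001         # 0.1% sample mean daily return
sample_std = 0.0025         # 0.25% sample standard deviation
mu_0 = 0.0                  # hypothesized mean under H0

standard_error = sample_std / math.sqrt(n)
z_stat = (mean_return - mu_0) / standard_error

print(z_stat)                      # ≈ 6.32
print(abs(z_stat) > 1.96)          # True, so reject H0 at the 5% level
print(2 * norm.sf(abs(z_stat)))    # two-tailed p-value, effectively zero
```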
One-Tailed Hypothesis Test
• The null and alternative hypotheses are either
H0: µ ≤ µ0 versus Ha: µ > µ0, or H0: µ ≥ µ0 versus Ha: µ < µ0
• Using the standard normal (z) distribution at α = 0.05,
reject H0 if z > 1.645 (upper tail) or z < –1.645 (lower tail)
Test Statistic

• Standard error when the population σ is known: σ/√n
• Standard error when the population σ is not known: s/√n,
where s is the sample standard deviation
Critical Value and Decision Rule
• A test statistic is a random variable that may
follow one of several distributions: the t-
distribution, the z-distribution, the chi-square
distribution and the F-distribution
• The critical value for the appropriate test
statistic depends on its distribution
• The decision rule: if the test statistic is more extreme
than the critical value(s), reject the null hypothesis
Type I and Type II Errors
• Type I error: Rejection of H0 when it is actually true
• Type II error: Failure to reject H0 when it is false
• The significance level is the probability of making a
Type I error and is designated by α
• Power of a test is the probability of correctly
rejecting H0 when it is false:
• Therefore, Power of a test = 1 – P(Type II error)
Type I and Type II Errors
Relation Between Confidence Intervals and
Hypothesis Tests
• A confidence interval is determined as shown below
• The same expression can be rewritten in terms of the
test statistic, giving the region in which H0 is not rejected
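In the standard z-based form, the two expressions are:

```latex
\bar{x} - z_{\alpha/2}\, s_{\bar{x}} \;\le\; \mu_0 \;\le\; \bar{x} + z_{\alpha/2}\, s_{\bar{x}}
\qquad\Longleftrightarrow\qquad
-z_{\alpha/2} \;\le\; \frac{\bar{x} - \mu_0}{s_{\bar{x}}} \;\le\; +z_{\alpha/2}
```

Failing to reject H0 at significance level α is therefore equivalent to the hypothesized value µ0 falling inside the (1 – α) confidence interval.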


p-value and Hypothesis Testing
• The p-value is the smallest level of significance
for which H0 can be rejected, computed assuming
H0 is true
• For example: significance level = 5% (i.e., 95%
confidence), test statistic = 2.3 (see the sketch below)
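A sketch of the example, assuming the 2.3 is a two-tailed z-test statistic (the slide does not say which test it comes from):

```python
from scipy.stats import norm

test_statistic = 2.3
alpha = 0.05                                 # 5% significance level

p_value = 2 * norm.sf(abs(test_statistic))   # two-tailed p-value
print(p_value)                               # ≈ 0.0214
print(p_value < alpha)                       # True, so H0 can be rejected
```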
Population mean: when to use t-test?
• Use t-test when the population variance is
unknown and either n≥30 or n<30 but
distribution is approx. normal
• A t-statistic with n – 1 degrees of freedom is
computed as t = (x̄ – µ0) / (s/√n)
Population mean: when to use z-test?
• Use z-test when population is normally
distributed and variance is known
• A z-statistic is computed as z = (x̄ – µ0) / (σ/√n)
Population mean: when to use z-test?
• Also, use z-test when sample size is large and
variance is unknown
• The corresponding z-statistic is computed as z = (x̄ – µ0) / (s/√n)
Critical z-Values
Equality Test of Two Population Means
• Three possible hypothesis structures, with null
hypotheses H0: µ1 – µ2 = 0, H0: µ1 – µ2 ≤ 0 and
H0: µ1 – µ2 ≥ 0

• Two possible t-statistics


• Unknown variances but assumed equal
• Unknown variances assumed unequal
Unknown Variances Assumed Equal
Unknown Variances Assumed Unequal
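For the two preceding slides, the standard test statistics for the difference in means of two independent samples are:

```latex
\text{Variances assumed equal:}\quad
t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}
         {\sqrt{\dfrac{s_p^2}{n_1} + \dfrac{s_p^2}{n_2}}},
\qquad
s_p^2 = \frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2},
\qquad \text{df} = n_1 + n_2 - 2

\text{Variances assumed unequal:}\quad
t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}
         {\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}
```

In the unequal-variance case the degrees of freedom are approximated from s1²/n1 and s2²/n2 rather than pooled.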
Paired Comparisons Test
• Tests whether the mean of the differences between
paired observations from two (dependent) samples
differs from a hypothesized value µd0
• Two-sided test: H0: µd = µd0 versus Ha: µd ≠ µd0
• One-sided test: H0: µd ≤ µd0 versus Ha: µd > µd0
(or the reverse)
t-statistic for Paired Comparisons Test
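In standard form, with d̄ the average of the paired differences, s_d their sample standard deviation, and µ_d0 the hypothesized mean difference:

```latex
t = \frac{\bar{d} - \mu_{d0}}{s_{\bar{d}}},
\qquad
s_{\bar{d}} = \frac{s_d}{\sqrt{n}},
\qquad \text{df} = n - 1
```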
Test Variance of Normally-Distributed
Population
• Chi-square test
– Two-tailed test: H0: σ² = σ0² versus Ha: σ² ≠ σ0²
– One-tailed test: H0: σ² ≤ σ0² versus Ha: σ² > σ0²
(or the reverse)
Decision Rule for a Two-Tailed Chi-
Square Test
Chi-Square Test Statistic
• The chi-square test statistic, with n – 1
degrees of freedom, is computed as shown below
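In standard form, with s² the sample variance and σ0² the hypothesized population variance:

```latex
\chi^2_{n-1} = \frac{(n - 1)\, s^2}{\sigma_0^2}
```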
Equality of Variance of Two Normally
Distributed Populations
• F-distributed test statistic
– Two-tailed test: H0: σ1² = σ2² versus Ha: σ1² ≠ σ2²
– One-tailed test: H0: σ1² ≤ σ2² versus Ha: σ1² > σ2²
(or the reverse)
• The F-statistic is computed as shown below
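In standard form, with the larger sample variance placed in the numerator (as the next slide notes):

```latex
F = \frac{s_1^2}{s_2^2}, \qquad s_1^2 \ge s_2^2,
\qquad \text{df} = (n_1 - 1,\; n_2 - 1)
```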
F-test
• Always put larger variance in numerator
• n1 – 1 and n2 – 1 are degrees of freedom used
to identify critical value from the F-table
Parametric and Nonparametric Tests

• Parametric Tests: Rely on assumptions regarding the
distribution of the population and are specific to
population parameters
• Nonparametric Tests: Either do not consider a
particular population parameter or have few
assumptions about the population that is sampled
