
Level 2 Quantitative Methods Courseware

Algorithmic trading allows for trading strategies to be implemented quickly and consistently based on quantitative models and analysis of market data. Some key benefits include: - Speed - Algorithms can react to market news and price changes much faster than humans. This allows traders to capitalize on very short-term opportunities. - 24/7 Trading - Algorithms can place trades around the clock including overnight and on weekends when markets are closed. This allows them to benefit from global market hours. - Disciplined Process - Algorithms follow predefined rules and are not influenced by emotions like fear and greed. This ensures a disciplined, unbiased approach. - Diversification - Algorithms can implement and adjust complex trading strategies across many markets

Uploaded by

Ran XU
Copyright
© All Rights Reserved

Quantitative Methods

Level 2 -- 2019
Instructor: Feng

1
Brief Introduction
Topic weights:
Study Session 1-2 Ethics & Professional Standards 10 - 15%
Study Session 3 Quantitative Methods 5 - 10%
Study Session 4 Economics 5 - 10%
Study Session 5-6 Financial Reporting and Analysis 10 - 15%
Study Session 7-8 Corporate Finance 5 - 10%
Study Session 9-11 Equity Investment 10 - 15%
Study Session 12-13 Fixed Income 10 - 15%
Study Session 14 Derivatives 5 - 15%
Study Session 15 Alternative Investments 5 - 10%
Study Session 16-17 Portfolio Management 5 - 10%
Weights: 100%
2
Brief Introduction
Contents:
➢ Study session 3: Quantitative Methods for Valuation
✓ Reading 6: Fintech in Investment Management (★)
✓ Reading 7: Correlation and Regression (★★)
✓ Reading 8: Multiple Regression and Issues in Regression Analysis (★★★)
✓ Reading 9: Time-Series Analysis (★★★)
✓ Reading 10: Excerpt from “Probabilistic approaches: scenario analysis, decision trees, and simulations” (★)
(Margin notes: Reading 4 Introduction to Linear Regression; Reading 5 Multiple Regression; Reading 7 Machine Learning; Reading 8 Big Data Project)
3
Brief Introduction
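Curriculum comparison:
➢ Compared with 2018, the 2019 curriculum has the following changes:
✓ Reading 6, Fintech in Investment Management, is new.
✓ Three learning outcome statements on “Machine Learning” were added to Reading 8.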

4
Brief Introduction
Level I vs. Level II:
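➢ Level I mainly covers descriptive statistics and the estimation and hypothesis-testing parts of inferential statistics. Level II mainly covers regression, which is the prediction part of inferential statistics.
➢ Level II makes frequent use of the hypothesis testing learned in Level I, so it is worth reviewing that material before starting the Level II content.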

5
Brief Introduction
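Course characteristics and study advice:

➢ Quantitative content, tested in a conceptual way;

➢ The main curriculum topics have seen essentially no major changes;

➢ The course builds strongly from one step to the next, so master each point before moving on;

➢ Combine lectures with practice questions, but grinding through large volumes of questions is not recommended;

➢ Most important of all: follow the lectures carefully and attentively.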
6
Happiness is having someone to love, something to do, something to believe in, and something to look forward to!

7
Fintech in Investment Management

Basics of Fintech
Tasks:
➢ Describe “fintech”;
➢ Describe big data, artificial intelligence, and machine
learning.

8
Basics of Fintech
Definition of Fintech
➢ Fintech refers to technological innovation in the design and
delivery of financial services and products.
✓Drivers underlying fintech development include extremely
rapid growth in data and technological advances that
enable the capture and extraction of information from
them.

9
Basics of Fintech
Areas of Fintech development
➢ Analysis of large datasets.
➢ Analytical tools.
✓E.g., artificial intelligence (AI), machine learning (ML).
➢ Automated trading.
➢ Automated advice.
✓E.g., Robo-advisers.
➢ Financial record keeping.
✓E.g., Distributed ledger technology (DLT).
10
Basics of Fintech
Big data
➢ Big data typically refers to datasets having the following
characteristics:
✓Volume: many millions, or even billions, of data points.
✓Velocity: real-time or near-real-time data have become the
norm in many areas.
✓Variety: include structured data (e.g. SQL tables or CSV
files), semi-structured data (e.g. HTML code), and
unstructured data (e.g. video messages).
11
Basics of Fintech
Big data (Cont.)
➢ Traditional data: stock exchanges, financial statements,
economic indicators.
➢ Non-traditional data: electronic devices, social media,
sensor networks.
✓ Data are obtained from smart phones, cameras, microphones,
radio-frequency identification (RFID) readers, wireless
sensors, and satellites.
✓ Also named alternative data.
12
Basics of Fintech
Artificial intelligence (AI)
➢ Artificial intelligence technologies enable the development of
computer systems that exhibit cognitive and decision-
making ability comparable or superior to that of human
beings.
➢ By the late 1990s, AI had been deployed in logistics, data
mining, financial analysis, medical diagnosis, and other
areas.

13
Basics of Fintech
Machine learning (ML)
➢ ML algorithms are computer programs that are able to
“learn” how to complete tasks, improving their performance
over time with experience.
➢ ML involves splitting the dataset into a training dataset and
a validation dataset.
✓The training dataset allows the algorithm to identify relationships between
inputs and outputs based on historical data.
✓The validation dataset is used to test those relationships.
14
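The split described above can be sketched in a few lines of Python. This is a minimal illustration, not the method of any particular ML library: the function name `split_dataset`, the 30% validation fraction, and the toy data are all assumptions made for the example.

```python
import random


def split_dataset(rows, validation_fraction=0.3, seed=0):
    """Shuffle rows and split them into (training, validation) lists.

    The fraction and the seed are illustrative assumptions.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded, so the split is reproducible
    n_validation = round(len(shuffled) * validation_fraction)
    return shuffled[n_validation:], shuffled[:n_validation]


# Hypothetical (input, output) pairs where output = 2 * input.
data = [(x, 2 * x) for x in range(10)]
training, validation = split_dataset(data)
```

The model would be fitted on `training` only; `validation` is held back to check whether the fitted relationship carries over to data the model has not seen.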
Basics of Fintech
Machine learning (Cont.)
➢ ML still requires human judgement in understanding data
and selecting the appropriate techniques for data analysis.
✓Before they can be used, the data must be clean and free
of biases and spurious data.
➢ Errors from overfitting may lead to prediction errors and
incorrect output forecasts.
✓Overfitting occurs when the ML model learns the input
and target dataset too precisely, and treats noise in the
data as true parameters.
15
Basics of Fintech
Types of machine learning
➢ Supervised learning: computers learn to model
relationships based on labeled training data.
✓Inputs and outputs are labeled, or identified, for the
algorithm.
(Note: with labels it is supervised; without labels it is unsupervised, e.g., clustering unlabeled data.)
➢ Unsupervised learning: computers are given only data from
which the algorithm seeks to describe the data and their
structure.

16
Basics of Fintech
Types of machine learning (Cont.)
➢ Deep learning: computers use neural networks, often with
many hidden layers, to perform multistage, non-linear data
processing to identify patterns.
✓Deep learning may use supervised or unsupervised
machine learning approaches.

17
Summary
➢ Importance: ☆
➢ Content:
✓ Definition of Fintech;
✓ Definitions of big data, AI, ML;
✓ Types of ML.
➢ Exam tips:
✓ Not a key exam topic.

18
Fintech in Investment Management

Fintech Application
Tasks:
➢ Describe fintech applications to investment
management;
➢ Describe financial applications of distributed ledger
technology.

19
Fintech Application
Fintech application
➢ Text analytics and natural language processing
➢ Robo-advisory services
➢ Risk analysis
➢ Algorithmic trading
➢ Distributed ledger technology (DLT)

20
Fintech Application
Text analytics and natural language processing (NLP)
➢ Text analytics involves the use of computer programs to
analyze and derive meaning typically from large,
unstructured text- or voice-based datasets.
✓May be used to help identify indicators of future
performance, such as consumer sentiment.
➢ Natural language processing focuses on developing
computer programs to analyze and interpret human
language. (Note: e.g., speech recognition.)
✓A field of research at the intersection of computer science,
artificial intelligence, and linguistics.
21
Fintech Application
Robo-advisory services
➢ Robo-advisory services provide investment solutions
through online platforms, reducing the need for direct
interaction with financial advisers.
✓Currently, the services include automated asset allocation,
trade execution, portfolio optimization, tax-loss harvesting,
and rebalancing for investor portfolios.

22
Fintech Application
Robo-advisory services (Cont.)
➢ Most Robo-advisers follow a passive investment approach,
and two types of wealth management services dominate
the robo-advice sector:
✓Fully automated digital wealth managers
✓Adviser-assisted digital wealth managers

23
Fintech Application
Risk analysis
➢ Big data may provide insights into real-time and changing
market circumstances to help identify weakening market
conditions and adverse trends in advance.
➢ Machine learning can help validate data quality by identifying
questionable data, potential errors, and data outliers before
integration with traditional data for use in risk models and
in risk management applications.
➢ Advanced AI-based techniques can be used for scenario
analysis and back-testing simulation which are often
computationally intensive.
24
Fintech Application
Algorithmic trading (Note: fast)
➢ Algorithmic trading is the computerized buying and selling
of financial instruments, in accordance with pre-specified
rules and guidelines.
✓Benefits include speed of execution, anonymity, and
lower transaction costs.
✓May also determine the best way to price the order (e.g.,
limit or market order) and the most appropriate trading
venue (e.g., exchange or dark pool) to route for execution.
25
Fintech Application
Algorithmic trading (Cont.)
➢ High-frequency trading (HFT) is a form of algorithmic
trading that makes use of vast quantities of granular
financial data to automatically place trades when certain
conditions are met.
✓HFT algorithms decide what to buy or sell and where to
execute on the basis of real-time prices and market
conditions, seeking to earn a profit from intraday market
mispricing.
26
Fintech Application
Distributed ledger technology (DLT)
➢ A distributed ledger is a type of database that may be
shared among entities in a network.
✓Entries are recorded, stored, and distributed across a
network of participants so that each participant has a
matching copy of the digital database.
✓Basic elements of a DLT network include a digital ledger, a
consensus mechanism used to confirm new entries, and a
participant network.
(Note: consensus mechanism, majority rule.)
27
Fintech Application
Distributed ledger technology (Cont.)
➢ Blockchain is a type of digital ledger in which information,
such as changes in ownership, is recorded sequentially
within blocks that are then linked or “chained” together and
secured using cryptographic methods.
✓Each block contains a grouping of transactions (or entries)
and a secure link (known as a hash) to the previous block.

28
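The chaining idea on the slide above (each block stores a hash of the previous block) can be illustrated with a toy ledger. This is only a sketch under simplifying assumptions: the block layout, the helper names `block_hash`, `add_block`, and `chain_is_valid`, and the sample transactions are all hypothetical, and there is no consensus mechanism or participant network here.

```python
import hashlib
import json


def block_hash(block):
    """SHA-256 hash of a block's contents (serialized deterministically)."""
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def add_block(chain, transactions):
    """Append a block that stores the hash of the previous block."""
    previous_hash = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({"transactions": transactions, "previous_hash": previous_hash})


def chain_is_valid(chain):
    """Check that every block's stored link matches the previous block's hash."""
    return all(
        chain[i]["previous_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )


# Hypothetical transactions, for illustration only.
ledger = []
add_block(ledger, ["A pays B 5"])
add_block(ledger, ["B pays C 2"])
add_block(ledger, ["C pays A 1"])
```

Editing the transactions in an earlier block changes that block's hash, so the link stored in the next block no longer matches and `chain_is_valid(ledger)` turns false. That is the sense in which the blocks are "chained".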
Fintech Application
Distributed ledger technology (Cont.)
➢ Permissionless networks: open to any user who wishes to
make a transaction, and all users within the network can see
all transactions that exist on the blockchain.
✓Network participants can perform all network functions.
(Note: anyone can participate.)
➢ Permissioned networks: network members may be
restricted from participating in certain network activities.
✓Permissions, or controls, may be used to allow varying
levels of access to the ledger.
(Note: members have different permissions.)
29
Fintech Application
Applications of DLT to investment management
➢ Cryptocurrency (digital currency) operates as electronic
currency and allows near-real-time transactions between
parties without the need for an intermediary (e.g. bank).
➢ Initial coin offering (ICO) is an unregulated process (a form of fundraising) whereby
companies sell their crypto tokens (digital tokens) to investors in exchange
for fiat money or for another agreed-upon cryptocurrency.
✓Typically structured to issue digital tokens to investors that
can be used to purchase future products or services being
developed by the issuer.
30
Fintech Application
Applications of DLT to investment management (Cont.)
➢ Tokenization: the process of representing ownership rights
to physical assets on a blockchain or distributed ledger.
✓Through tokenization, DLT has the potential to streamline
the transactions process involving physical assets by
creating a single, digital record of ownership with which to
verify ownership title and authenticity, including all
historical activity.

31
Fintech Application
Applications of DLT to investment management (Cont.)
➢ Post-trading clearing and settlement
✓DLT provides near-real-time trade verification,
reconciliation, and settlement, thereby reducing the
complexity, time, and costs associated with post-trade
processes.

32
Fintech Application
Applications of DLT to investment management (Cont.)
➢ Compliance
✓Allow regulators and firms to maintain near-real-time
review over transactions and other compliance-related
processes.
✓Could help uncover fraudulent activity and reduce
compliance costs associated with know-your-customer
and anti-money-laundering regulations, which entail
verifying the identity of clients and business partners.
(Note: records on the ledger are hard to falsify.)
33
Summary
➢ Importance: ☆
➢ Content:
✓ Fintech application: text analytics and natural language
processing, robo-advisory services, risk analysis,
algorithmic trading;
✓ DLT application: cryptocurrency, tokenization, post-
trading clearing and settlement, compliance.
➢ Exam tips:
✓ Not a key exam topic.
34
Correlation and Regression

Correlation Analysis
Tasks:
➢ Calculate and interpret a sample covariance and a
sample correlation coefficient;
➢ Formulate a hypothesis test of population correlation
coefficient;
➢ Describe limitations to correlation analysis.
(Note: in mathematics, distinguish a correlation relationship from a functional relationship.)
35
Correlation Analysis
Scatter plots
➢ A graph that shows the relationship between the
observations for two data series in two dimensions.
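[Figure: scatter plot with labeled points for South Korea, Australia, U.K., U.S., Switzerland, and Japan]
(Note: for quantitative analysis, use the covariance.)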
36
Correlation Analysis
Sample covariance
➢ A statistical measure of the degree to which two
variables move together; it captures the linear
relationship between two variables.
$Cov(X,Y) = \frac{\sum_{i=1}^{n}(X_i-\bar{X})(Y_i-\bar{Y})}{n-1}$
➢ Ranges of Cov(X,Y): -∞ < Cov(X,Y) < +∞.
✓ Cov(X,Y) > 0: the two variables tend to move together;
✓ Cov(X,Y) < 0: the two variables tend to move in
opposite directions.
37
Correlation Analysis
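Sample correlation coefficient (Note: standardizes the covariance)
➢ A measure of the direction and extent of linear
association between two variables.
$r_{XY} = \frac{Cov(X,Y)}{s_X s_Y}$
where $s_X$ and $s_Y$ are the sample standard deviations of X and Y.
✓ Ranges of $r_{XY}$: $-1 \le r_{XY} \le +1$.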

38
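As a rough check on the sample covariance and sample correlation formulas, the calculation can be written out directly. The function names and the five data points below are made up for illustration; only the formulas come from the slides.

```python
import math


def sample_covariance(xs, ys):
    """Sum of cross-deviations from the means, divided by n - 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def sample_correlation(xs, ys):
    """Covariance divided by the product of the sample standard deviations."""
    s_x = math.sqrt(sample_covariance(xs, xs))  # Cov(X, X) is the sample variance of X
    s_y = math.sqrt(sample_covariance(ys, ys))
    return sample_covariance(xs, ys) / (s_x * s_y)


# Hypothetical observations.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

cov_xy = sample_covariance(xs, ys)  # 6 / 4 = 1.5
r_xy = sample_correlation(xs, ys)   # 1.5 / sqrt(2.5 * 1.5), about 0.775
```

Replacing `ys` with an exact linear function of `xs`, such as `[2 * x + 1 for x in xs]`, gives a correlation of +1 whatever the slope, which shows that r measures the strength of the linear relationship and not its steepness.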
Correlation Analysis
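Sample correlation coefficient (Cont.)
[Figure: two scatter plots. Left: r = +1 (perfect positive linear correlation). Right: r = -1 (perfect negative linear correlation).]
(Note: r is not the slope. Any perfectly upward-sloping line gives r = +1, and any perfectly downward-sloping line gives r = -1.)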

39
Correlation Analysis
Sample correlation coefficient (Cont.)
[Figure: two scatter plots. Left: 0 < r < 1 (positive linear correlation). Right: -1 < r < 0 (negative linear correlation).]

40
Correlation Analysis
Sample correlation coefficient (Cont.)
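[Figure: scatter plot with r = 0 (no linear correlation).]
(Note: two possibilities: the variables are unrelated, or they are related but not linearly.)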

41
Correlation Analysis
(Note: how large must r be for the linear relationship to count as strong? Answering that takes a hypothesis test.)
Steps of hypothesis testing (Review of Level 1)


➢ Step 1: stating the hypotheses: relation to be tested;
➢ Step 2: identifying the appropriate test statistic and its
probability distribution;
➢ Step 3: specifying the significance level;
➢ Step 4: stating the decision rule;
➢ Step 5: collecting the data and calculating the test
statistic;
➢ Step 6: making the statistical decision;
➢ Step 7: making the economic or investment decision.
42
Correlation Analysis
Example:
➢ Test the hypothesis that a fund’s mean return is equal to
1% per month at the 5% significance level; the population is
normally distributed. The data provided are:
✓ Sample mean: 1.5%.
✓ Sample size: 45.
✓ Standard deviation of population: 1.4%.

43
Correlation Analysis
Answer:
➢ Step 1: H0: µ = 0.01 and Ha: µ ≠ 0.01.
➢ Step 2: with known population variance (standard
deviation), use a two-tailed z-test. (Note: the null hypothesis is an equality, not > or <, so the test is two-tailed.)
➢ Step 3: The critical z-values for 5% significance level (95%
confidence interval) are +/- 1.96.
➢ Step 4: decision rule: if the z-statistic is outside the range
of critical values (-1.96 to +1.96), reject H0.
44
Correlation Analysis
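Answer:
➢ Step 5: calculate the test statistic.
$z = \frac{\bar{X}-\mu_0}{\sigma/\sqrt{n}} = \frac{0.015-0.01}{0.014/\sqrt{45}} = 2.396$
(Note: test statistic = (sample value - hypothesized value) / standard error, where standard error = standard deviation / √n.)
➢ Step 6: reject H0 (mean return = 1%), because the z-
statistic (2.396) is outside the range of critical values
(-1.96 to +1.96).
[Figure: standard normal distribution with the central 95% region between -1.96 and +1.96; z-stat = 2.396 falls in the right-hand "Reject H0" region.]
45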
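The arithmetic in Step 5 can be reproduced with a short script. The function name `z_statistic` is an illustrative choice; the inputs (sample mean 1.5%, hypothesized mean 1%, population standard deviation 1.4%, n = 45) and the critical value 1.96 are taken from the example.

```python
import math


def z_statistic(sample_mean, hypothesized_mean, population_sd, n):
    """(sample mean - hypothesized mean) / standard error of the mean."""
    standard_error = population_sd / math.sqrt(n)
    return (sample_mean - hypothesized_mean) / standard_error


z = z_statistic(0.015, 0.01, 0.014, 45)
reject_h0 = abs(z) > 1.96  # two-tailed test at the 5% significance level
```

Here `z` comes out near 2.396, so `reject_h0` is true, matching Step 6.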
Correlation Analysis
Hypothesis testing of correlation
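➢ Test whether the correlation coefficient between two variables is
equal to zero.
✓ H0: ρ=0, Ha: ρ≠0;
✓ t-test: $t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}}$, df = n-2;
(Note: degrees of freedom are usually n-1, but here df = n-2. The quantity $\sqrt{(1-r^2)/(n-2)}$ is the standard error of the correlation coefficient.)
✓ Two-tailed test;
✓ Decision rule: reject H0 if t > + tcritical , or t < - tcritical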
46
Correlation Analysis
Example:
An analyst wants to test the correlation between variable X
and variable Y. The sample size is 20, and the
covariance between X and Y is 16. The standard deviation of
X is 4 and the standard deviation of Y is 8. At the 5%
significance level, test the significance of the correlation
coefficient between X and Y.

47
Correlation Analysis
Answer:
➢ H0: ρ=0, Ha: ρ≠0;
➢ Sample correlation coefficient r = 16/(4×8) = 0.5;
➢ t-statistic (substituting into the formula): $t = \frac{0.5\sqrt{20-2}}{\sqrt{1-0.25}} = 2.45$
➢ The critical value of the two-tailed t-test with df=18 and
a significance level of 5% is 2.101 (from the t-table);
➢ Since 2.45 is larger than 2.101, the null hypothesis can be
rejected, and we can say the correlation coefficient
between X and Y is significantly different from zero. (Note: a correlation exists.)
48
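The same example can be checked numerically. The helper name `correlation_t_statistic` is hypothetical; the covariance, standard deviations, sample size, and the critical value 2.101 are the ones given in the example, so no t-table lookup is coded here.

```python
import math


def correlation_t_statistic(r, n):
    """t = r * sqrt(n - 2) / sqrt(1 - r^2), with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


r = 16 / (4 * 8)                    # covariance / (s_X * s_Y) = 0.5
t = correlation_t_statistic(r, 20)  # about 2.449
reject_h0 = abs(t) > 2.101          # critical value quoted for df = 18, 5% two-tailed
```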
Correlation Analysis
Limitation to correlation analysis
➢ Outlier: may result in false statistical significance of
linear relationship. (Note: extreme values are present.)

49
Correlation Analysis
Limitation to correlation analysis (Cont.)
➢ Spurious correlation: statistically significant correlation
exists when in fact there is no relation (no economic
explanation).

50
Correlation Analysis
Limitation to correlation analysis (Cont.)
➢ Nonlinear relationships: two variables can have a strong
nonlinear relation and still have a very low correlation.

51
Summary
➢ Importance: ☆☆
➢ Content:
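✓ Covariance and correlation coefficient; (Note: be able to compute the covariance and the hypothesis-test statistic, and to judge whether a linear relationship exists.)
✓ Hypothesis testing of correlation coefficient;
✓ Limitation of correlation analysis.
➢ Exam tips:
✓ This part is the foundation for the material that follows; it offers many testable points, and question formats are flexible.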

52
Correlation and Regression

Simple Linear Regression


Tasks:
➢ Describe the assumptions underlying linear regression;
➢ Calculate and interpret the predicted value and
confidence interval for the dependent variable;
➢ Interpret regression coefficients, formulate
hypothesis tests for them, and calculate and interpret their
confidence intervals.
53
Simple Linear Regression
Dependent variable (Y)
➢ The variable that you are seeking to explain;
➢ Also referred to as explained variable or predicted
variable.

Independent variable (X)
➢ The variable(s) that you are using to explain changes in
the dependent variable.
➢ Also referred to as explanatory variable or predicting
variable.
54
Simple Linear Regression
Linear regression
➢ Use linear regression model to explain the dependent
variable using the independent variable(s).

55
Simple Linear Regression
Simple linear regression model
➢ $Y_i = b_0 + b_1 X_i + \varepsilon_i$, i = 1, ..., n
where:
Yi = ith observation of the dependent variable, Y;
Xi = ith observation of the independent variable, X;
b0 = intercept;
b1 = slope coefficient;
εi = error term for the ith observation (also referred to as
residual or disturbance term).
56
Simple Linear Regression
Assumptions of simple linear regression model
➢ The relationship between the dependent variable (Y) and
the independent variable (X) is linear;
➢ The independent variable (X) is not random;
➢ The expected value (mean) of the error term is 0: E(ε)=0;
➢ The variance of the error term is the same for all
observations (homoscedasticity): $E(\varepsilon_i^2)=\sigma_\varepsilon^2$, i = 1, ..., n;
➢ The error term is uncorrelated (independent) across
observations: E(εiεj)=0 for all i ≠ j;
➢ The error term (ε) is normally distributed.
57


Simple Linear Regression
The regression line (the line of best fit)
➢ Ordinary least squares (OLS) regression: chooses values
for the intercept (estimated intercept coefficient, $\hat{b}_0$)
and slope (estimated slope coefficient, $\hat{b}_1$), to minimize
the sum of squared errors (SSE).
✓ Sum of squared errors (SSE): sum of squared vertical
distances between the observations and the regression
line.
➢ Equation of regression line: $\hat{Y}_i = \hat{b}_0 + \hat{b}_1 X_i$ (Note: $\hat{Y}_i$ is the estimated value.)
58
Simple Linear Regression
The regression line
➢ Estimated slope coefficient ($\hat{b}_1$) (Note: the calculation is not required; the interpretation is.)
✓ Calculation: $\hat{b}_1 = \frac{Cov(X,Y)}{\sigma_X^2}$
✓ Interpretation: the sensitivity of Y to a change in X.
• The change of Y for 1-unit change of X.
➢ Estimated intercept coefficient ($\hat{b}_0$)
✓ Calculation: $\hat{b}_0 = \bar{Y} - \hat{b}_1\bar{X}$ (Note: E(ε)=0.)
✓ Interpretation: the value of Y when X is equal to zero.
59
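A small sketch of the two calculations above: the slope as covariance over the variance of X, and the intercept as the mean of Y minus the slope times the mean of X. The function name `ols_fit` and the data are invented for the example, and sample (n - 1) versions of the covariance and variance are assumed.

```python
def ols_fit(xs, ys):
    """Return (intercept, slope) of the least-squares line through the data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    var_x = sum((x - mean_x) ** 2 for x in xs) / (n - 1)
    slope = cov_xy / var_x                 # b1-hat = Cov(X, Y) / Var(X)
    intercept = mean_y - slope * mean_x    # b0-hat = Y-bar - b1-hat * X-bar
    return intercept, slope


# Hypothetical observations.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]
intercept, slope = ols_fit(xs, ys)
predicted_at_6 = intercept + slope * 6.0  # predicted Y for a forecast X of 6
```

With these numbers the slope is 0.6 and the intercept is 2.2, so a 1-unit increase in X is associated with a 0.6 increase in the predicted Y.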


Simple Linear Regression
Predicted value of dependent variable
➢ The values that are predicted by the regression equation,
given an estimate of the independent variable.
$\hat{Y} = \hat{b}_0 + \hat{b}_1 X_p$
where:
$\hat{Y}$: predicted value of the dependent variable;
$X_p$: forecasted value of the independent variable.

60
Simple Linear Regression
Predicted value of dependent variable (Cont.)
➢ The confidence interval for a predicted value of
dependent variable is:
$\hat{Y} \pm (t_c \times s_f)$, or $\hat{Y} - (t_c \times s_f) \le Y \le \hat{Y} + (t_c \times s_f)$
(Note: add and subtract critical value × standard error.)
where:
$t_c$: two-tailed critical t-value with df=n-2;
$s_f$: standard error of the prediction.

61
Simple Linear Regression
Significance test for a regression coefficient
➢ H0: b1= hypothesized value; Ha: b1≠ hypothesized value;
✓ Typically, H0: b1= 0; Ha: b1≠ 0, which means to test
whether an independent variable explains the variation
in the dependent variable.
➢ Test statistic: $t = \frac{\hat{b}_1 - b_1}{s_{\hat{b}_1}}$, df=n-2;
➢ Decision rule: reject H0 if t > + tcritical , or t < - tcritical ;


62
Simple Linear Regression
Significance test for a regression coefficient (Cont.)
➢ Rejection of null hypothesis means the regression
coefficient is significantly different from the
hypothesized value.

63
Simple Linear Regression
Confidence interval for a regression coefficient (Note: point estimate vs. interval estimate)
➢ The confidence interval for a regression coefficient is:
$\hat{b}_1 \pm (t_c \times s_{\hat{b}_1})$, or $\hat{b}_1 - (t_c \times s_{\hat{b}_1}) \le b_1 \le \hat{b}_1 + (t_c \times s_{\hat{b}_1})$
where:
$t_c$: two-tailed critical t-value with df=n-2;
$s_{\hat{b}_1}$: standard error of the regression coefficient.

64
Simple Linear Regression
Confidence interval for a regression coefficient (Cont.)
➢ Confidence interval can be applied to significance test for
a regression coefficient. (Note: check whether the hypothesized value lies inside the confidence interval.)
✓ If the confidence interval does not include zero, the null
hypothesis (H0: b1=0) is rejected, and the coefficient is
said to be statistically significantly different from zero.

65
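The t-test and the confidence interval for a slope coefficient lead to the same decision, which a short sketch can make concrete. Every number here is hypothetical: an estimated slope of 0.6, a standard error of 0.2, and a critical t-value of 2.101 (the two-tailed 5% value for df = 18 quoted in the earlier correlation example). The function names are also just illustrative.

```python
def coefficient_t_statistic(estimate, hypothesized, standard_error):
    """(estimated coefficient - hypothesized value) / standard error."""
    return (estimate - hypothesized) / standard_error


def confidence_interval(estimate, standard_error, t_critical):
    """estimate +/- t_critical * standard_error, returned as (lower, upper)."""
    margin = t_critical * standard_error
    return estimate - margin, estimate + margin


# Hypothetical regression output.
b1_hat, s_b1, t_critical = 0.6, 0.2, 2.101

t = coefficient_t_statistic(b1_hat, 0.0, s_b1)               # H0: b1 = 0
lower, upper = confidence_interval(b1_hat, s_b1, t_critical)

reject_by_t_test = abs(t) > t_critical
reject_by_interval = not (lower <= 0.0 <= upper)             # zero outside the interval
```

Both flags come out true in this example: the t-statistic of 3.0 exceeds 2.101, and the interval of roughly 0.18 to 1.02 does not contain zero.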
Practice
There are a number of assumptions for linear regression.
Which of the following is NOT an assumption?
A. The independent variables are not correlated with the
error term.
B. There is at least some correlation between the error
terms from one observation to the next. (Note: the residuals must not be serially correlated.)
C. The variance of the error terms each period remains
constant. (Note: homoscedasticity.)

Answer: B
66
Summary
➢ Importance: ☆☆☆
➢ Content:
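✓ Underlying assumptions of linear regression;
✓ Prediction of dependent variable; (Note: substitute X into the equation and compute.)
✓ Interpretation of hypothesis testing for regression
coefficient.
➢ Exam tips:
✓ Frequently tested point 1: underlying assumptions;
✓ Frequently tested point 2: hypothesis testing of regression coefficients.
67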
Correlation and Regression

ANOVA Analysis (1)

Tasks:
➢ Describe and interpret ANOVA;
➢ Calculate and interpret SEE, R2, and F-statistics;
➢ Describe limitations of regression analysis.

68
ANOVA Analysis (1)
Analysis of variance (ANOVA)
➢ A statistical procedure for dividing the total variability of
a variable into components that can be attributed to
different sources.
✓ Total variation = explained variation (the part explained by the regression) + unexplained
variation
• Total sum of squares (SST) = Regression sum of
squares (RSS) + Sum of squared errors (SSE)
69
ANOVA Analysis (1)
Analysis of variance (Cont.)
➢ A graphic explanation of the components of total
variation:
[Figure: components of total variation, with the mean of Y marked]

70
ANOVA Analysis (1)
Analysis of variance (Cont.)
➢ Total sum of squares (SST): measures the total variation
in the dependent variable.
$SST = \sum_{i=1}^{n}(Y_i-\bar{Y})^2$
➢ Regression sum of squares (RSS): measures the variation
in the dependent variable that is explained by the
independent variable.
$RSS = \sum_{i=1}^{n}(\hat{Y}_i-\bar{Y})^2$ (Note: $\hat{Y}_i$ is the point on the regression line.)
71
ANOVA Analysis (1)
Analysis of variance (Cont.)
➢ Sum of squared errors (SSE): measures the unexplained
variation in the dependent variable.
$SSE = \sum_{i=1}^{n}(Y_i-\hat{Y}_i)^2$
✓ Also known as the sum of squared residuals or the
residual sum of squares.

72
ANOVA Analysis (1)
Analysis of variance (Cont.)
➢ ANOVA table
                          df     Sum of Squares (SS)    Mean Sum of Squares (MS)
Regression (explained)    1      RSS                    MSR = RSS/1
Error (unexplained)       n-2    SSE                    MSE = SSE/(n-2)
Total                     n-1    SST                    -
(Note: MS = SS / df. In SST, RSS, and SSE, T stands for Total, R for Regression, and E for Error.)
✓ MSR: mean regression sum of squares;
✓ MSE: mean squared error.
73
ANOVA Analysis (1)
Standard error of estimate (SEE)
➢ The standard deviation of error terms in the regression.
$SEE = \sqrt{\frac{SSE}{n-2}} = \sqrt{MSE}$
(Note: MSE is like a variance; its square root is a standard deviation.)
➢ Measures the degree of variability of the actual Y-values
relative to the estimated Y-values from a regression
equation;
✓ Gauges the "fit" of the regression line. The smaller the
SEE, the better the fit.
74
ANOVA Analysis (1)
Coefficient of determination (R²)
➢ The percentage of the total variation that is explained by
the regression.
$R^2 = \frac{\text{Explained variation}}{\text{Total variation}} = \frac{RSS}{SST} = \frac{SST-SSE}{SST}$
✓ For simple linear regression, R² is equal to the squared
correlation coefficient: R² = r².

75
ANOVA Analysis (1)
F-statistic (Note: multiple regression)
➢ An F-statistic assesses how well the independent
variables, as a group, explain the variation in the
dependent variable; it is also used to test whether at least one
independent variable explains a significant portion of the
variation of the dependent variable.
$F = \frac{MSR}{MSE} = \frac{RSS/k}{SSE/(n-k-1)}$, where k is the number of independent variables.
✓ Note: this is always a one-tailed test.
76
ANOVA Analysis (1)
F-statistic (Cont.)
➢ For simple linear regression, the F-test duplicates the t-
test for the significance of the slope coefficient.
✓ H0: b1= 0; Ha: b1≠ 0;
✓ $F = \frac{MSR}{MSE} = \frac{RSS/1}{SSE/(n-2)}$, with df numerator = 1 and df denominator = n-2;
✓ Decision rule: reject H0 if F > Fc.

77
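To see how the ANOVA quantities for a simple regression fit together, the sketch below computes SST, RSS, SSE, SEE, R² and F from raw data. The function name `simple_regression_anova` and the five observations are assumptions for illustration; the formulas are the ones on the preceding slides with k = 1.

```python
import math


def simple_regression_anova(xs, ys):
    """Fit Y on X by least squares and return the ANOVA quantities."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    fitted = [intercept + slope * x for x in xs]

    sst = sum((y - mean_y) ** 2 for y in ys)               # total variation
    rss = sum((f - mean_y) ** 2 for f in fitted)           # explained variation
    sse = sum((y - f) ** 2 for y, f in zip(ys, fitted))    # unexplained variation
    mse = sse / (n - 2)
    return {
        "SST": sst,
        "RSS": rss,
        "SSE": sse,
        "SEE": math.sqrt(mse),
        "R2": rss / sst,
        "F": (rss / 1) / mse,
    }


# Hypothetical observations.
result = simple_regression_anova([1.0, 2.0, 3.0, 4.0, 5.0],
                                 [2.0, 4.0, 5.0, 4.0, 5.0])
```

For this data SST = 6.0 splits into RSS = 3.6 and SSE = 2.4, so R² = 0.6 and F = 3.6 / 0.8 = 4.5.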
ANOVA Analysis (1)
Limitations of regression analysis
➢ Regression relations can change over time (parameter
instability). (Note: b itself may change; the relationship over the past two years may differ from that of the five years before.)
➢ In investment contexts, public knowledge of regression
relationships may negate their future usefulness. (Note: once everyone knows a relationship, it stops helping.)
➢ If the regression assumptions are violated, hypothesis
tests and predictions based on linear regression will not
be valid.
78
Practice
An analyst performs two simple regressions. The first
regression analysis has an R-squared of 0.40 and a beta
coefficient of 1.2. The second regression analysis has an
R-squared of 0.77 and a beta coefficient of 1.75. Which
one of the following statements is most accurate?

79
Practice
A. The first regression equation has more explanatory
power than the second regression equation. (Note: compare R².)
B. The R-squared of the first regression indicates that
there is a 0.40 correlation between the independent and
the dependent variables. (Note: the correlation coefficient is the square root of 0.40.)
C. The second regression equation has more explanatory
power than the first regression equation.

Answer: C
80
Summary
➢ Importance: ☆☆☆
➢ Content:
✓ ANOVA;
✓ SEE, R2, and F-statistic.
➢ Exam tips:
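✓ Frequently tested point 1: given an ANOVA table, calculate a missing cell;
✓ Frequently tested point 2: calculation and interpretation of R2; both calculation and conceptual questions may appear.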

81
Multiple Regression and Issues in Regression Analysis

Multiple Regression
Tasks:
➢ Formulate a multiple regression and explain the
assumptions of a multiple regression model;
➢ Interpret estimated regression coefficients, formulate
hypothesis tests for them and interpret the results.
➢ Calculate and interpret the predicted value for the
dependent variable.
82
Multiple Regression
Multiple regression
➢ Regression analysis with more than one independent
variable.
✓ Multiple linear regression model
$Y_i = b_0 + b_1 X_{1i} + b_2 X_{2i} + \dots + b_k X_{ki} + \varepsilon_i$

where:
Yi = the ith observation of the dependent variable Y
Xji = the ith observation of the jth independent variable Xj
bj = slope coefficient of the jth independent variable
83
Multiple Regression
Assumptions of multiple linear regression
➢ The relationship between the dependent variable and
the independent variables is linear;
➢ The independent variables are not random. Also, no
exact linear relation exists between two or more of the
independent variables;
➢ The expected value of the error term, conditioned on the
independent variables, is 0: E(ε | X1, X2, …, Xk) = 0;
84
Multiple Regression
Assumptions of multiple linear regression (Cont.)
➢ The variance of the error term is the same for all
observations (homoscedasticity): $E(\varepsilon_i^2)=\sigma_\varepsilon^2$;
➢ The error term is uncorrelated across observations:
E(εiεj)=0 for all i≠j;
➢ The error term is normally distributed.

85
Multiple Regression
Intercept term (b0)
➢ The value of the dependent variable when the
independent variables are all equal to zero.

Slope coefficient (bj) (Note: sensitivity)
➢ The expected increase in the dependent variable for a 1-
unit increase in that independent variable, holding the
other independent variables constant.
✓ Also called partial slope coefficients.


86
Multiple Regression
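Hypothesis testing of regression coefficients
➢ Hypothesis (here $b_j$ on the right-hand side denotes the hypothesized value):
✓ $H_0: \hat{b}_j = b_j$; $H_a: \hat{b}_j \ne b_j$
✓ $H_0: \hat{b}_j \le b_j$; $H_a: \hat{b}_j > b_j$
✓ $H_0: \hat{b}_j \ge b_j$; $H_a: \hat{b}_j < b_j$
➢ Test statistic: $t = \frac{\hat{b}_j - b_j}{s_{\hat{b}_j}}$ (Note: (estimate - hypothesized value) / standard error.)
✓ df = n-k-1, k = number of independent variables (Note: in simple regression, k = 1.)
➢ Decision rule: reject H0 if
✓ t > + tc , or t < - tc ;
✓ p-value < significance level (α).
87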
Multiple Regression
Statistical significance of independent variable (Note: does it significantly affect the dependent variable?)
➢ Hypothesis: $H_0: \hat{b}_j = 0$; $H_a: \hat{b}_j \ne 0$
➢ Test statistic: $t = \frac{\hat{b}_j}{s_{\hat{b}_j}}$
✓ df = n-k-1, k = number of independent variables
➢ Decision rule: reject H0 if
✓ t > + tcritical , or t < - tcritical ;
✓ p-value < significance level (α).
88
Multiple Regression
Interpret the testing results
➢ Rejection of null hypothesis means the regression
coefficient is different from/greater than/less than the
hypothesized value given a level of significance (α).
➢ For significance testing, rejection of null hypothesis
means the regression coefficient is different from zero,
or the independent variable explains some variation of
the dependent variable.
89
Multiple Regression
Confidence interval for a regression coefficient (Note: an interval estimate; the standard error will be given.)
➢ The confidence interval for a regression coefficient is:
$\hat{b}_j \pm (t_c \times s_{\hat{b}_j})$, or $\hat{b}_j - (t_c \times s_{\hat{b}_j}) \le b_j \le \hat{b}_j + (t_c \times s_{\hat{b}_j})$
where:
$t_c$: two-tailed critical t-value with df=n-k-1;
$s_{\hat{b}_j}$: standard error of the regression coefficient.

90
Multiple Regression
Confidence interval for a regression coefficient (Cont.)
➢ Confidence interval can be applied to significance test for
a regression coefficient.
✓ If the confidence interval does not include zero, the null
hypothesis (H0: bj=0) is rejected, and the coefficient is
said to be statistically significantly different from zero.

91
Multiple Regression
Predicting the dependent variable
➢ The regression equation can be used to predict the value
of the dependent variable based on assumed values of
the independent variables.
$\hat{Y}_i = \hat{b}_0 + \hat{b}_1 X_{1i} + \hat{b}_2 X_{2i} + \dots + \hat{b}_k X_{ki}$

where:
$\hat{Y}_i$ = predicted value of the dependent variable
$\hat{b}_j$ = estimated slope coefficient for jth independent
variable
92
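Prediction with a multiple regression is just the estimated intercept plus each estimated slope times the assumed value of its variable. A minimal sketch follows; the function name `predict`, the coefficients, and the input values are all hypothetical.

```python
def predict(intercept, slopes, x_values):
    """Predicted Y = intercept + sum of slope_j * X_j."""
    if len(slopes) != len(x_values):
        raise ValueError("need exactly one X value per slope coefficient")
    return intercept + sum(b * x for b, x in zip(slopes, x_values))


# Hypothetical estimates: b0-hat = 1.0, b1-hat = 0.5, b2-hat = -2.0,
# with assumed values X1 = 4.0 and X2 = 0.25.
y_hat = predict(1.0, [0.5, -2.0], [4.0, 0.25])  # 1.0 + 2.0 - 0.5
```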
Summary
➢ Importance: ☆☆
➢ Content:
✓ Assumptions of multiple linear regression;
✓ Interpretation and hypothesis testing of regression
coefficients;
✓ Prediction of dependent variable.
➢ Exam tips:
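✓ Frequently tested point: hypothesis testing of regression coefficients; questions are flexible and may cover calculating the test statistic as well as judging and interpreting the test result.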
93
Multiple Regression and Issues in Regression Analysis

ANOVA Analysis (2)


Note: analysis of variance for multiple regression.
Tasks:
➢ Describe and interpret ANOVA table;
➢ Calculate and interpret F-statistic, and describe how it
is used in regression analysis;
➢ Distinguish between and interpret R² and adjusted R².
ANOVA Analysis (2)
ANOVA table of multiple regression
Source        df        SS     MSS (mean sum of squares)
Regression    k         RSS    MSR = RSS/k
Error         n-k-1     SSE    MSE = SSE/(n-k-1)
Total         n-1       SST    -

➢ R² = RSS/SST
➢ F = MSR/MSE with df of k and n-k-1
➢ SEE = $\sqrt{MSE}$
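The ANOVA quantities can be computed directly from the sums of squares. The sketch below is illustrative only: `anova_summary` is a hypothetical helper, and the input values are made up to keep the arithmetic easy to follow.

```python
import math


def anova_summary(rss, sse, n, k):
    """Compute R^2, F and SEE from the regression and error sums of squares."""
    sst = rss + sse                 # total variation
    msr = rss / k                   # mean regression sum of squares
    mse = sse / (n - k - 1)         # mean squared error
    return {
        "R2": rss / sst,
        "F": msr / mse,             # df = (k, n - k - 1)
        "SEE": math.sqrt(mse),      # standard error of estimate
    }


# Hypothetical ANOVA inputs: RSS = 80, SSE = 20, n = 25 observations, k = 4.
summary = anova_summary(rss=80.0, sse=20.0, n=25, k=4)
print(summary)
```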
ANOVA Analysis (2)
F-statistics
➢ Test whether all regression coefficients are
simultaneously equal to zero; or test whether the
independent variables, as a group, help explain the
dependent variable; or assess the effectiveness of the
model, as a whole, in explaining the dependent variable.
Note: the slope coefficients are treated as a group, and the question is whether together they explain the dependent variable.
✓ H0: b1= b2= … = bk=0; Ha: at least one bj≠0 (j = 1 to k)
✓ $F = \dfrac{MSR}{MSE} = \dfrac{RSS/k}{SSE/(n-k-1)}$
ANOVA Analysis (2)
F-statistics (Cont.)
➢ Decision rule: reject H0 if F-statistic > F-critical value.
✓ The F-test here is always a one-tailed test.
➢ Rejection of H0 means at least one regression
coefficient is significantly different from zero, thus at
least one independent variable makes a significant
contribution to the explanation of the dependent variable.
Note: the model as a whole is meaningful.
ANOVA Analysis (2)
R² (Coefficient of determination)
➢ $R^2 = \dfrac{\text{Explained variation}}{\text{Total variation}} = \dfrac{RSS}{SST} = \dfrac{SST - SSE}{SST}$
➢ Test the overall effectiveness (goodness of fit) of the
entire set of independent variables (regression model) in
explaining the dependent variable.
✓ For example, an R² of 0.7 indicates that the model, as a
whole, explains 70% of the variation in the dependent
variable.
ANOVA Analysis (2)
R² (Cont.)
➢ For multiple regression, however, R² will increase simply
by adding independent variables that explain even a
slight amount of the previously unexplained variation.
✓ Even if the added independent variable is not
statistically significant, R² will increase.
Note: R² rises whenever an independent variable is added, which is not a property we want in a measure of fit.
ANOVA Analysis (2)
Adjusted R²
➢ $\text{adjusted } R^2 = 1 - \left[\dfrac{n-1}{n-k-1} \times (1 - R^2)\right]$
where: n = number of observations
k = number of independent variables
➢ Adjusted R² ≤ R², and may be less than zero if R² is low
enough.
➢ Adding a new independent variable will increase R², but
may either increase or decrease the adjusted R².
✓ If the new variable has only a small effect on R², the
value of adjusted R² may decrease.
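The effect of adding a weak variable can be checked numerically. In this sketch the function name and all R² values are hypothetical; it only applies the adjusted R² formula given on this slide.

```python
def adjusted_r_squared(r2, n, k):
    """Adjusted R^2 = 1 - (n - 1) / (n - k - 1) * (1 - R^2)."""
    return 1.0 - (n - 1) / (n - k - 1) * (1.0 - r2)


# Hypothetical case: a fourth variable lifts R^2 only from 0.600 to 0.605.
before = adjusted_r_squared(r2=0.600, n=30, k=3)
after = adjusted_r_squared(r2=0.605, n=30, k=4)
print(before > after)

# With a very low R^2 the adjusted value can fall below zero.
print(adjusted_r_squared(r2=0.05, n=10, k=3) < 0)
```

In the assumed example R² rises but adjusted R² falls, which is the case the slide warns about.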
ANOVA Analysis (2)
Interpretation of regression model
➢ Interpretation generally focuses on the regression
coefficients;
➢ It is possible to identify a relationship that has statistical
significance without any economic significance.
Note: a statistically significant relationship is not the same as an economically significant one.
Summary
➢ Importance: ☆☆☆
➢ Content:
✓ ANOVA table;
✓ Calculation and interpretation of F-statistics;
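✓ R² and adjusted R².
➢ Exam tips:
✓ Frequently tested point 1: interpretation of the F-statistic (conceptual questions);
✓ Frequently tested point 2: comparison of R² and adjusted R² (conceptual questions).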
Multiple Regression and Issues in Regression Analysis

Violations of Assumptions
Tasks:
Note: heteroskedasticity means unequal error variance.
➢ Explain the types of heteroskedasticity and how
heteroskedasticity and serial correlation affect
statistical inference;
➢ Describe multicollinearity and explain its causes and
effects in regression analysis.
Heteroskedasticity
Definition of heteroskedasticity
➢ The variance of the errors differs across observations
(i.e., the error terms are not homoskedastic).
Note: the question is whether the residual variance is the same across observations.
✓ Unconditional heteroskedasticity: heteroskedasticity
of the error variance is not correlated with the
independent variables.
• Creates no major problems for statistical inference.
Note: what matters is whether the error variance is related to the independent variables; unconditional heteroskedasticity is not a concern.
Heteroskedasticity
Definition of heteroskedasticity (Cont.)
✓ Conditional heteroskedasticity: heteroskedasticity of
the error variance is correlated with (conditional on)
the values of the independent variables.
• Does create significant problems for statistical
inference.

Heteroskedasticity
Effects of heteroskedasticity
➢ The coefficient estimates ($\hat{b}_j$) aren't affected.
➢ The standard errors of the coefficients ($s_{\hat{b}_j}$) are usually
unreliable.
✓ With financial data, the standard errors are most likely
underestimated, so the t-statistics ($t = \dfrac{\hat{b}_j - b_j}{s_{\hat{b}_j}}$) will be
inflated, and tests tend to find significant relationships
where none actually exist (type I error).
Note: the standard errors are too small and the test statistics too large, so a true null hypothesis is rejected.
➢ The F-test is also unreliable.
Heteroskedasticity
Testing for heteroskedasticity
➢ Examining scatter plots of the residuals;
[Figure: residual scatter plot in which the residual variance is small on the left and large on the right.]
Heteroskedasticity
Testing for heteroskedasticity (Cont.)
➢ Breusch-Pagan χ² test.
✓ H0: no heteroskedasticity;
Note: the exam tests the points above.
✓ BP χ² = $n \times R^2_{resid}$ with df = k (the number of
independent variables) and one-tailed test;
• n = the number of observations;
• $R^2_{resid}$ = R² of a second regression of the squared
residuals from the first regression on the
independent variables.
Note: in the second regression the dependent variable becomes the squared residual.
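The two-regression procedure behind the BP statistic can be sketched for the single-regressor case. Everything here is an illustrative assumption: the helper names, the synthetic data (built so that the error size grows with x), and the use of one regressor so that df = 1. The value 3.841 is the standard 5% chi-square critical value for df = 1.

```python
def simple_ols(x, y):
    """Fit y = a + b*x by least squares; return (residuals, r_squared)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
             / sum((xi - mean_x) ** 2 for xi in x))
    intercept = mean_y - slope * mean_x
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    sse = sum(e ** 2 for e in residuals)
    sst = sum((yi - mean_y) ** 2 for yi in y)
    return residuals, 1.0 - sse / sst


def breusch_pagan(x, y):
    """BP = n * R^2 from regressing squared residuals on x."""
    residuals, _ = simple_ols(x, y)                 # first regression
    squared = [e ** 2 for e in residuals]
    _, r2_resid = simple_ols(x, squared)            # second regression
    return len(x) * r2_resid


# Synthetic data: the disturbance alternates in sign and grows with x.
x = list(range(1, 21))
y = [2.0 + 3.0 * xi + 0.5 * xi * (-1) ** xi for xi in x]

bp = breusch_pagan(x, y)
CHI2_CRITICAL_5PCT_DF1 = 3.841      # one-tailed, df = k = 1
reject_h0 = bp > CHI2_CRITICAL_5PCT_DF1
print(reject_h0)
```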
Heteroskedasticity
Correcting for heteroskedasticity
➢ Use robust standard errors to recalculate the t-statistics;
✓ Also called White-corrected standard errors.
➢ Use generalized least squares, rather than ordinary least
squares, to build the regression model.
Serial Correlation (autocorrelation)
Note: the question is whether the residuals are correlated with one another.
Definition of serial correlation (autocorrelation)
➢ The residuals (error terms) are correlated with one
another; this typically arises in time-series regressions.
✓ Positive serial correlation: a positive/negative error for
one observation increases the chance of a
positive/negative error for another observation.
Note: if one residual is positive, the probability that another is positive increases.
✓ Negative serial correlation: a positive/negative error
for one observation increases the chance of a
negative/positive error for another observation.
Serial Correlation
Effects of serial correlation
➢ The coefficient estimates aren't affected.
➢ The standard errors of coefficient are usually unreliable.
✓ Positive serial correlation: standard errors
underestimated and t-statistics inflated, suggesting
significance when there is none (type I error);
✓ Negative serial correlation: vice versa (type II error).
➢ The F-test is also unreliable.
Serial Correlation
Testing for serial correlation
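➢ Residual scatter plots
[Figure: two plots of residuals against time T. Left panel: positive serial correlation, where a positive residual tends to be followed by further positive residuals. Right panel: negative serial correlation.]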
Serial Correlation
Testing for serial correlation (Cont.)
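➢ The Durbin-Watson test
✓ H0: No serial correlation
✓ DW ≈ 2×(1−r), if sample size is very large
• r = correlation coefficient between residuals from one
period and those from the previous period.
Note: as r runs from −1 to +1, substituting into the formula gives DW from 4 down to 0.
✓ Decision rule:
DW range          Conclusion
0 to dL           Reject H0: positive serial correlation (DW = 0 when r = 1)
dL to dU          Inconclusive (no conclusion can be drawn)
dU to 4−dU        Fail to reject H0
4−dU to 4−dL      Inconclusive
4−dL to 4         Reject H0: negative serial correlation (DW = 4 when r = −1)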
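The decision rule can be written as a small Python function. This is a sketch under stated assumptions: the slides give only the large-sample approximation DW ≈ 2×(1−r), so the exact formula used below (sum of squared changes in the residuals divided by the sum of squared residuals) is the standard textbook definition rather than something taken from this deck, and the bounds dL = 1.2 and dU = 1.4 are hypothetical table values.

```python
def durbin_watson(residuals):
    """DW = sum((e_t - e_{t-1})^2) / sum(e_t^2)."""
    numerator = sum((residuals[t] - residuals[t - 1]) ** 2
                    for t in range(1, len(residuals)))
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


def dw_decision(dw, d_lower, d_upper):
    """Map a DW statistic to the five regions of the decision rule."""
    if dw < d_lower:
        return "positive serial correlation"
    if dw < d_upper:
        return "inconclusive"
    if dw <= 4 - d_upper:
        return "fail to reject H0"
    if dw <= 4 - d_lower:
        return "inconclusive"
    return "negative serial correlation"


# Residuals that flip sign every period suggest negative serial correlation.
residuals = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
dw = durbin_watson(residuals)
print(dw_decision(dw, d_lower=1.2, d_upper=1.4))
```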
Serial Correlation
Correcting for serial correlation
➢ Adjust the coefficient standard errors (recommended);
✓ E.g., Hansen method, which also corrects for conditional
heteroskedasticity;
✓ Adjusted standard errors, or Hansen-White standard
errors.
➢ Modify the regression equation itself.

Multicollinearity
Definition of multicollinearity
Note: the independent variables are correlated with one another.
➢ Two or more independent variables (or combinations of
independent variables) are highly (but not perfectly)
correlated with each other.
Multicollinearity
Effects of multicollinearity
➢ Estimates of regression coefficients become extremely
imprecise and unreliable;
➢ Standard errors of regression coefficients will be inflated,
so t-tests on the coefficients will have little power
(more type II errors).
✓ Greater probability we will incorrectly conclude that a
variable is not statistically significant.
Note: a type II error is failing to reject a null hypothesis that is false.
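Multicollinearity
Testing for multicollinearity
Note: suppose y = 5x1 and x1 = 2x2. Then y can also be written as y = 4x1 + 2x2. Splitting further in this way can make each coefficient very small, so a hypothesis test would conclude that it equals zero.
➢ The t-tests indicate that none of the regression
coefficients is significant, while R² is high and F-test
indicates overall significance;
Note: no single independent variable is significant, yet the model as a whole is meaningful.
➢ The absolute value of the sample correlation between
any two independent variables is greater than 0.7 (not
recommended).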
Multicollinearity
Correcting for multicollinearity
➢ Excluding one or more of the correlated independent
variables.

Summary of Assumption Violations
Violation                         Effects          Testing
Conditional heteroskedasticity    Type I error     Residual scatter plots; Breusch-Pagan χ²-test, BP = n×R²
Positive serial correlation       Type I error     Residual scatter plots; Durbin-Watson test, DW ≈ 2×(1−r)
Negative serial correlation       Type II error    Residual scatter plots; Durbin-Watson test, DW ≈ 2×(1−r)
Multicollinearity                 Type II error    t-tests indicate no significance when F-test indicates overall significance and R² is high
Practice
Feng, CFA, runs a regression of portfolio returns on three
independent variables. Feng discovers that the p-values
for each independent variable are relatively high, but the
F-test has a very small p-value. Feng is puzzled and tries
to figure out the reasons. What violation of regression
analysis has occurred?
A. conditional heteroskedasticity.
B. serial correlation.
C. multicollinearity.
Answer: C
Note: the high p-values mean the null hypothesis bj = 0 cannot be rejected, so no individual variable appears to have explanatory power, yet the F-test p-value for the model as a whole is small.
Summary
➢ Importance: ☆☆☆ (very important)
➢ Content:
✓ Definition, effects, testing, and correcting of
heteroskedasticity, serial correlation, and
multicollinearity.
➢ Exam tips:
✓ Frequently tested: effects and testing of heteroskedasticity and serial
correlation (conceptual questions).
Multiple Regression and Issues in Regression Analysis

Other Issues in Regression Analysis


Tasks:
➢ Formulate a multiple regression with dummy
variables and interpret the coefficients;
➢ Describe effects of model misspecification and
avoidance of its common forms;
➢ Describe models with qualitative dependent variables.
Dummy Variable
Dummy variable
➢ Qualitative variables may be used as independent
variables in a regression.
Note: a qualitative variable takes only a few possible values.
✓ A dummy variable is one type of qualitative variable, and
takes on a value of “0” or “1”.
➢ If we want to distinguish among n categories, we need
n−1 dummy variables.
Dummy Variable
Example
➢ Yi = b0 + b1X1i + b2X2i + b3X3i + ɛi
where: Yi = quarterly value of EPS of a stock
Y                            X1   X2   X3
Q1 EPS                       1    0    0
Q2 EPS                       0    1    0
Q3 EPS                       0    0    1
Q4 EPS (omitted category)    0    0    0
Dummy Variable
Interpretation of coefficient
➢ Intercept coefficient (b0): the average value of the
dependent variable for the omitted category.
➢ Regression coefficient (bj): the difference in the dependent
variable (on average) between the category represented
by the dummy variable and the omitted category.
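The interpretation can be illustrated with made-up quarterly EPS figures. The sketch relies on a known property of a regression that contains only an intercept and category dummies: the OLS intercept equals the mean of the omitted category, and each slope equals the difference between that category's mean and the omitted category's mean. The function names and data are hypothetical.

```python
from statistics import mean


def quarter_dummies(quarter):
    """Encode a quarter (1-4) as (X1, X2, X3); Q4 is the omitted category."""
    return [1 if quarter == q else 0 for q in (1, 2, 3)]


def dummy_coefficients(eps_by_quarter):
    """Return (b0, [b1, b2, b3]) implied by the group means."""
    b0 = mean(eps_by_quarter[4])                       # omitted category mean
    slopes = [mean(eps_by_quarter[q]) - b0 for q in (1, 2, 3)]
    return b0, slopes


# Hypothetical EPS observations grouped by quarter.
eps = {1: [1.0, 1.2], 2: [1.5, 1.7], 3: [0.9, 1.1], 4: [2.0, 2.2]}
b0, slopes = dummy_coefficients(eps)
print(quarter_dummies(2), round(b0, 2), [round(b, 2) for b in slopes])
```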
Model Misspecification
Definition of model misspecification
➢ Model specification refers to the set of variables included in the regression and the
regression equation’s functional form; a model is misspecified when either is chosen incorrectly.
Model Misspecification
Categories of model misspecification
➢ Misspecified functional form
✓ Important variables omitted
✓ Variables need to be transformed
✓ Data pooled incorrectly
➢ Independent variables correlated with the error term
✓ Lagged dependent variables as independent variables
✓ Incorrect dating of variables
✓ Independent variables are measured with error
➢ Other types of time-series misspecification
Model Misspecification
Effects of model misspecification
➢ Regression coefficients are often biased and inconsistent,
leading to unreliable hypothesis testing and inaccurate
predictions.

Model Misspecification
Avoiding model misspecification
➢ The model should be grounded in cogent economic
reasoning;
➢ The functional form chosen for the variables should be
appropriate given the nature of the variables;
➢ The model should be parsimonious;
➢ The model should be examined for violations of
regression assumptions before being accepted;
➢ The model should be tested and be found useful out of
sample before being accepted.
Note: for example, split the data 70/30, using 70% to estimate the model and 30% to validate it.
Qualitative Dependent Variable
Qualitative dependent variable
➢ Dummy variables used as dependent variables instead
of as independent variables.
✓ Probit and logit models
✓ Discriminant models (Note: these produce a score.)
Practice
Consider the following model of earnings (EPS) regressed
against dummy variables for the quarters:
EPSt = α + β1Q1t + β2Q2t + β3Q3t
where:
EPSt: quarterly observation of EPS;
Q1t : 1 for the second quarter, 0 otherwise;
Q2t : 1 for the third quarter, 0 otherwise;
Q3t : 1 for the fourth quarter, 0 otherwise.
Which of the following statements regarding this model
is most accurate? The:
A. coefficient on each dummy tells us about the
difference in earnings per share between the respective
quarter and the one left out (first quarter in this case)
B. EPS for the first quarter is represented by the residual.
C. significance of the coefficients cannot be interpreted
in the case of dummy variables.

Answer: A

Summary
➢ Importance: ☆
➢ Content:
✓ Dummy variable;
✓ Model misspecification;
✓ Qualitative dependent variable.
➢ Exam tips:
✓ Not a key exam topic.
Multiple Regression and Issues in Regression Analysis

Machine Learning
Tasks:
➢ Distinguish between supervised and unsupervised
machine learning;
➢ Describe machine learning algorithms used in
prediction, classification, clustering, and dimension
reduction;
➢ Describe the steps in model training.
Machine Learning
Machine learning (ML)
➢ Machine learning comprises diverse approaches by
which computers are programmed to improve
performance in specified tasks with experience.
Note: in other words, the computer does the learning.
✓Types of machine learning
✓Machine learning algorithms
✓Steps in model training
Types of Machine Learning
Supervised learning vs. unsupervised learning
➢ Supervised learning: makes use of labeled training
data.
✓More formally, supervised learning is the process of
training an algorithm to take a set of inputs X and
find a model that best relates them to the output Y.
✓E.g., credit card transactions labeled “fraudulent” or
“non-fraudulent” are used to train a model to
predict fraud more accurately in new credit card
transactions.
Types of Machine Learning
Supervised learning vs. unsupervised learning
➢ Unsupervised learning does not make use of labeled
training data.
✓More formally, in unsupervised learning, we have
inputs X that are used for analysis without any
targets Y being supplied, the ML program has to
discover structure within the data themselves.
✓E.g., based on financial statement data, ML program
clusters firms into groups based on their attributes.
Machine Learning Algorithms
Machine learning algorithms
➢ Supervised learning: penalized regression, CART, random forests, neural networks
➢ Unsupervised learning: clustering algorithms, dimension reduction
Machine Learning Algorithms
Supervised learning algorithms
➢ Supervised learning can be divided into two categories:
regression and classification. The distinction is
determined by the nature of the target variable (Y).
✓If the target variable is continuous, then the task is
one of regression.
• Regression includes linear and non-linear models.
Machine Learning Algorithms
Supervised learning algorithms (Cont.)
✓If the target variable is categorical or ordinal (e.g., a
firm’s rating), then it is a classification problem.
Note: a discrete target means classification.
• Classification methods include classification and regression
trees (CART), random forests, and neural networks,
each described briefly below.
Machine Learning Algorithms
Supervised learning algorithms (Cont.)
➢ Penalized regression: choose the regression coefficients
to minimize the sum of squared residuals plus a penalty
term.
✓The penalty term increases in size with the number of
included variables with non-zero regression coefficients.
Note: the more independent variables, the larger the penalty, much like adjusted R².
✓Works well in prediction because it is less subject to
overfitting.
• Overfitting: including variables that are “noise” in a
specific dataset and will not be present in future data.
Note: something that is really noise gets treated as a parameter.
Machine Learning Algorithms
Supervised learning algorithms (Cont.)
➢ Classification and regression trees (CART) can be
applied to predict either a categorical target variable
(a classification problem), producing a classification
tree, or a continuous outcome (a regression problem),
producing a regression tree.
✓CART is most commonly applied to binary targets, and is
well suited to classification problems with
significant non-linear relationships among variables.
Machine Learning Algorithms
Supervised learning algorithms (Cont.)
➢ A random forest is a collection of classification trees.
Note: each tree is built from a sample drawn with replacement; the forest consists of many such trees.
✓Rather than use just one classification tree, we build
several, based on random selection of features
(variables). Each tree is slightly different from the others.
✓For any new observation, all the classifier trees (the
“random forest”) undertake classification by majority
vote, implementing a machine learning version of the
“wisdom of crowds.”
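The majority-vote step can be shown in isolation. This sketch does not train anything: the three "trees" are hand-written rules standing in for fitted classification trees, and the feature names (`amount`, `foreign`, `night`) are invented for the example.

```python
from collections import Counter


def majority_vote(tree_predictions):
    """Return the label predicted by the largest number of trees."""
    return Counter(tree_predictions).most_common(1)[0][0]


# Hypothetical stand-ins for three fitted trees, each using different features.
trees = [
    lambda obs: "fraudulent" if obs["amount"] > 1000 else "non-fraudulent",
    lambda obs: "fraudulent" if obs["foreign"] else "non-fraudulent",
    lambda obs: ("fraudulent" if obs["amount"] > 5000 and obs["night"]
                 else "non-fraudulent"),
]

observation = {"amount": 1500, "foreign": True, "night": False}
label = majority_vote([tree(observation) for tree in trees])
print(label)
```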
Machine Learning Algorithms
Supervised learning algorithms (Cont.)
➢ Neural networks can be applied to a variety of tasks
characterized by non-linearities and interactions
among variables.
✓Neural networks consist of nodes connected by links,
and have three types of layers: an input layer, hidden
layers, and an output layer.

Machine Learning Algorithms
Supervised learning algorithms (Cont.)

[Figure: neural network diagram; the links carry weights and the nodes are neurons.]
Machine Learning Algorithms
Unsupervised learning algorithms
➢ Clustering algorithms group data solely
on the basis of information found in the data.
✓In classification, data are assigned to classes
determined by the researcher (fraudulent or
non-fraudulent). In clustering, the groups are determined
by the data themselves.
Note: classification, covered earlier, works towards a given target; clustering does not.
Machine Learning Algorithms
Unsupervised learning algorithms (Cont.)
➢ Dimension reduction focuses on reducing the number
of independent variables while retaining variation
across observations (to preserve the information
contained in that variation).
Note: variables are combined in order to find the important elements.
Steps in Model Training
Supervised machine learning: training
➢ The process of training ML models involves 5 steps:
✓Specify the ML technique/algorithm.
✓Specify the associated hyperparameters.
✓Divide data into a training sample and a validation
sample.
✓Evaluate learning with a performance measure and adjust
the hyperparameters.
✓Repeat the training cycle the specified number of times
or until the required level of accuracy is obtained.
Summary
➢ Importance: ☆☆
➢ Content:
✓ Supervised vs. unsupervised learning;
✓ Learning algorithms;
✓ Model training.
➢ Exam tips:
✓ Make sure the few key concepts are clearly understood.
Time-Series Analysis

Trend Models
Note: in a trend model, time is the independent variable.

Tasks:
➢ Calculate and evaluate the predicted trend value for a
time series;
➢ Describe factors to determine trend model selection;
➢ Evaluate limitations of trend models.

Trend Models
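Linear trend models
➢ Work well in fitting time series that change by a constant
amount each period.
$y_t = b_0 + b_1 t + \varepsilon_t$, with fitted trend $\hat{y}_t = \hat{b}_0 + \hat{b}_1 t$
[Figure: y_t plotted against t with a straight trend line.]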
Trend Models
Log-linear trend models
➢ Work well in fitting time series that have a constant growth
rate over time (exponential growth).
$y_t = e^{b_0 + b_1 t + \varepsilon_t}$, equivalently $\ln(y_t) = b_0 + b_1 t + \varepsilon_t$
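A log-linear trend can be fitted by regressing ln(y) on t and exponentiating the forecast. The sketch below assumes a hypothetical series that grows exactly 5% per period; `fit_trend` and `forecast_log_linear` are illustrative names, not functions from the course material.

```python
import math


def fit_trend(values):
    """OLS of values on t = 1..T; returns (b0, b1)."""
    T = len(values)
    t = list(range(1, T + 1))
    mean_t, mean_v = sum(t) / T, sum(values) / T
    b1 = (sum((ti - mean_t) * (vi - mean_v) for ti, vi in zip(t, values))
          / sum((ti - mean_t) ** 2 for ti in t))
    b0 = mean_v - b1 * mean_t
    return b0, b1


def forecast_log_linear(b0, b1, t):
    """Convert the fitted log trend back to the level of the series."""
    return math.exp(b0 + b1 * t)


# Hypothetical sales growing 5% per period for 10 periods.
sales = [100.0 * 1.05 ** t for t in range(1, 11)]
b0, b1 = fit_trend([math.log(v) for v in sales])
next_value = forecast_log_linear(b0, b1, t=11)
print(round(math.exp(b1) - 1, 4))   # implied growth rate per period
```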
Trend Models
Linear trend model vs. Log-linear trend model
➢ If data plots with a linear shape (constant change
amount), a linear trend model may be appropriate.
➢ If data plots with a non-linear (curved) shape (constant
growth rate), a log-linear model may be more suitable.

Trend Models
Limitations of trend models
➢ The trend model is not appropriate for time series when
the data exhibit serial correlation.
✓ Use the Durbin-Watson statistic to detect serial
correlation.
Summary
➢ Importance: ☆
➢ Content:
✓ Linear trend model & log-linear trend model;
✓ Limitation of trend models.
➢ Exam tips:
✓ Not a key exam topic.
Time-Series Analysis

Autoregressive Models (AR)

Note: there is no separate independent variable, so one is created from the series' own previous value: $x_{t+1} = f(x_t)$.
Tasks:
➢ Describe the structure of an AR model, explain the
testing of autocorrelations of the residuals;
➢ Calculate one- and two-period-ahead forecasts given
the estimated coefficients of an AR model;
➢ Explain mean reversion and calculate a mean-
reverting level.
Autoregressive Model (AR)
Covariance stationary
➢ Covariance stationarity is a key assumption for an AR time-series model
estimated by ordinary least squares (OLS) to be valid.
➢ A covariance stationary series must satisfy three
principal requirements:
✓ Constant and finite expected value in all periods;
✓ Constant and finite variance in all periods;
✓ Constant and finite covariance with itself for a fixed
number of periods in the past or future in all periods.
Autoregressive Model
Autoregressive model
➢ Uses past values of the dependent variable as independent
variables.
✓ AR(1): First-order autoregressive model
$x_t = b_0 + b_1 x_{t-1} + \varepsilon_t$
✓ AR(p): pth-order autoregressive model
$x_t = b_0 + b_1 x_{t-1} + b_2 x_{t-2} + \dots + b_p x_{t-p} + \varepsilon_t$
• Where p indicates the number of lagged values that
the autoregressive model will include as independent
variables.
Autoregressive Model
Chain rule of forecasting
➢ A one-period-ahead forecast for an AR(1) model:
$\hat{x}_{t+1} = \hat{b}_0 + \hat{b}_1 x_t$
➢ A two-period-ahead forecast for an AR(1) model:
$\hat{x}_{t+2} = \hat{b}_0 + \hat{b}_1 \hat{x}_{t+1}$
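The chain rule amounts to feeding each forecast back in as the next input. A minimal sketch, with hypothetical coefficient estimates and a hypothetical current value:

```python
def ar1_forecasts(b0, b1, x_t, steps):
    """Return [x_hat(t+1), ..., x_hat(t+steps)] for an AR(1) model."""
    forecasts = []
    previous = x_t
    for _ in range(steps):
        previous = b0 + b1 * previous   # each forecast feeds the next one
        forecasts.append(previous)
    return forecasts


# Hypothetical estimates b0 = 1.0, b1 = 0.5 and current value x_t = 4.0.
one_ahead, two_ahead = ar1_forecasts(b0=1.0, b1=0.5, x_t=4.0, steps=2)
print(one_ahead, two_ahead)
```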
Autoregressive Model
Detecting autocorrelation
Note: the DW test applies only to regressions with genuine independent variables, so it is not used for AR models.
➢ Step 1: Estimate the AR(1) model using linear regression;
✓ $x_t = b_0 + b_1 x_{t-1} + \varepsilon_t$
➢ Step 2: Compute the autocorrelations ($\rho_{\varepsilon_t, \varepsilon_{t-k}}$) of the
residuals;
✓ Autocorrelation: the correlations of a time series with
its own past values;
✓ The order of the correlation is given by k, where k
represents the number of periods lagged.
Autoregressive Model
Detecting autocorrelation (Cont.)
➢ Step 3: Test if the autocorrelations are significantly
different from zero.
✓ $t = \dfrac{\rho_{\varepsilon_t, \varepsilon_{t-k}}}{1/\sqrt{T}}$
Note: the DW test can no longer be used here.
• T: the number of observations in the time series;
• Degrees of freedom: T-2.
✓ If the residual autocorrelations differ significantly from
0, the model is not correctly specified and needs to be
modified.
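Steps 2 and 3 can be sketched as follows. The autocorrelation estimator used here (lagged cross-products of demeaned residuals divided by the total sum of squares) is one common sample definition and is an assumption, as are the function names and the example numbers; the t-statistic uses the standard error 1/√T.

```python
import math


def residual_autocorrelation(residuals, k):
    """Sample autocorrelation of the residuals at lag k."""
    mean_e = sum(residuals) / len(residuals)
    dev = [e - mean_e for e in residuals]
    numerator = sum(dev[t] * dev[t - k] for t in range(k, len(dev)))
    denominator = sum(d * d for d in dev)
    return numerator / denominator


def autocorrelation_t_stat(rho, T):
    """t = rho / (1 / sqrt(T)), with T observations."""
    return rho / (1.0 / math.sqrt(T))


# Hypothetical case: lag-1 residual autocorrelation of 0.25 with T = 100.
t_stat = autocorrelation_t_stat(rho=0.25, T=100)
print(round(t_stat, 2))
```

If the resulting t-statistic exceeds the critical value for T-2 degrees of freedom, the residuals are autocorrelated and the model needs more lags.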
Autoregressive Model
Seasonality
Note: seasonality has a one-year cycle; the same pattern recurs every year.
➢ A time series that shows regular patterns of movement
within the year.
➢ Testing for seasonality: test whether the seasonal
autocorrelation of the residuals differs significantly
from 0.
✓ The 4th autocorrelation in case of quarterly data;
✓ The 12th autocorrelation in case of monthly data.
Autoregressive Model
Seasonality (Cont.)
➢ Correcting for seasonality: include a seasonal lag in the AR
model:
✓ Quarterly data: $x_t = b_0 + b_1 x_{t-1} + b_2 x_{t-4} + \varepsilon_t$
✓ Monthly data: $x_t = b_0 + b_1 x_{t-1} + b_2 x_{t-12} + \varepsilon_t$
➢ Forecasting using an AR model with a seasonal lag:
✓ Quarterly data: $\hat{x}_t = \hat{b}_0 + \hat{b}_1 x_{t-1} + \hat{b}_2 x_{t-4}$
✓ Monthly data: $\hat{x}_t = \hat{b}_0 + \hat{b}_1 x_{t-1} + \hat{b}_2 x_{t-12}$
Autoregressive Model
Mean reversion
➢ A time series shows mean reversion if it has a tendency
to move towards its mean.
✓ Tends to fall when it is above its mean and rise when it
is below its mean.
➢ Mean-reverting level for an AR(1) model:
$x_t = \dfrac{b_0}{1 - b_1}$
✓ Covariance stationary → finite mean-reverting level;
✓ |b1| < 1 in AR(1) model → finite mean-reverting level.
Note: if the absolute value of b1 is less than 1, the mean-reverting level is finite.
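The mean-reverting level is a one-line calculation. In this sketch the coefficients are hypothetical, and raising an error when |b1| ≥ 1 is a design choice of the example rather than something the slides prescribe.

```python
def mean_reverting_level(b0, b1):
    """Return b0 / (1 - b1) for an AR(1) model with |b1| < 1."""
    if abs(b1) >= 1:
        raise ValueError("no finite mean-reverting level when |b1| >= 1")
    return b0 / (1.0 - b1)


# Hypothetical estimates: b0 = 1.0, b1 = 0.5.
level = mean_reverting_level(b0=1.0, b1=0.5)
print(level)

# A value above the level is forecast to fall towards it.
current = 4.0
next_forecast = 1.0 + 0.5 * current
print(level < next_forecast < current)
```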
Practice
An analyst wants to model quarterly sales data using an
autoregressive model. She has found that an AR(1) model
with a seasonal lag has significant slope coefficients. She
also finds that when a second and third seasonal lag are
added to the model, all slope coefficients are significant
too. Based on this, the best model to use would most
likely be an:
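A. AR(1) model with no seasonal lags.
B. ARCH(1).
C. AR(1) model with 3 seasonal lags.
Answer: C
Note: with the three lags added, all slope coefficients are still significantly different from zero, so all three lags should be included.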
Summary
➢ Importance: ☆☆
➢ Content:
✓ Covariance stationary and AR model;
✓ Auto-correlation and seasonality;
✓ Mean reversion.
➢ Exam tips:
✓ Frequently tested: calculation of the mean-reverting level.
Time-Series Analysis

Random Walk

Note: asset prices follow a random walk.
Tasks:
➢ Describe characteristics of random walk processes;
➢ Describe unit roots for time-series analysis and the
steps of the unit root test for nonstationarity;
➢ Demonstrate how a random walk can be transformed
to be stationary.
Random Walk
Random walk (simple random walk)
➢ A time series in which the value of the series in one
period is the value of the series in the previous period
plus an unpredictable random error.
$x_t = x_{t-1} + \varepsilon_t$
✓ A special AR(1) model with b0=0 and b1=1;
✓ The best forecast of xt is xt-1.
Note: the best forecast is the previous value. For example, if a friend is drunk on a sports field, the best place to start looking for him is where he was when you last saw him.
Random Walk
Random walk with a drift
➢ A random walk with an intercept term that is not equal to
zero (b0 ≠ 0).
$x_t = b_0 + x_{t-1} + \varepsilon_t$
✓ Expected to increase or decrease by a constant amount (b0) in each
period.
Random Walk
Random walk vs. Covariance stationary
➢ A random walk is not covariance stationary.
✓ A time series must have a finite mean-reverting level to
be covariance stationary;
✓ A random walk has an undefined mean-reverting level:
$x_t = \dfrac{b_0}{1 - b_1} = \dfrac{0}{0}$
Note: because b1 = 1 and b0 = 0, a finite mean cannot be determined.
➢ The least squares regression method doesn’t work to
estimate an AR(1) model on a time series that is actually
a random walk.
Note: the series is not covariance stationary, so OLS cannot be used to fit a first-order autoregressive model.
Random Walk
Unit root
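➢ The time series is said to have a unit root if the lag
coefficient is equal to one (b1=1); it then follows a
random walk process.
✓ Testing for a unit root can be used to test for
nonstationarity since a random walk is not covariance
stationary;
Note: the test can only establish nonstationarity; it does not by itself establish stationarity.
• But a t-test of the hypothesis that b1=1 in an AR model is
invalid for testing the unit root;
Note: this is circular. If b1 = 1 the series is nonstationary and the autoregression cannot be validly estimated, and without a valid autoregression a t-test of b1 = 1 cannot be carried out.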
Random Walk
Unit root (Cont.)
✓ Testing of AR model can determine if a time series is
covariance stationary.
• If autocorrelations at all lags are statistically
indistinguishable from zero, the time series is
stationary.

Random Walk
Dickey-Fuller test for unit root
➢ Step 1: start with an AR(1) model: $x_t = b_0 + b_1 x_{t-1} + \varepsilon_t$;
➢ Step 2: subtract $x_{t-1}$ from both sides:
$x_t - x_{t-1} = b_0 + (b_1 - 1) x_{t-1} + \varepsilon_t$;
✓ Or: $x_t - x_{t-1} = b_0 + g_1 x_{t-1} + \varepsilon_t$, where $g_1 = b_1 - 1$
➢ Step 3: test if g1=0.
✓ H0: g1=0; Ha: g1<0;
✓ Calculate the t-statistic and use revised critical values;
Note: the revised critical values are larger in absolute value than those from the usual t-table.
✓ If we fail to reject H0, there is a unit root and the time
series is non-stationary.
Random Walk
First differencing
➢ A random walk (i.e., has a unit root) can be transformed
to a covariance stationary time series by first
differencing.
✓ Subtract $x_{t-1}$ from both sides of the random walk model:
$x_t - x_{t-1} = x_{t-1} - x_{t-1} + \varepsilon_t = \varepsilon_t$
✓ Define $y_t = x_t - x_{t-1}$, so $y_t = \varepsilon_t$;
Or $y_t = b_0 + b_1 y_{t-1} + \varepsilon_t$, where $b_0 = b_1 = 0$
✓ Then, yt is a covariance stationary variable with a finite
mean-reverting level of 0/(1-0)=0.
Note: for example, first differencing 1, 2, 3, 4, 5, … gives 1, 1, 1, 1, …, which is stationary. Second differencing is also possible: 1, 2, 4, 7, 11 becomes 1, 2, 3, 4 and then 1, 1, 1.
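First differencing is straightforward to code. The helper name `first_difference` is an assumption; the two series are the ones used in the annotation on this slide.

```python
def first_difference(series):
    """Return y_t = x_t - x_{t-1} for t = 1..T-1."""
    return [series[t] - series[t - 1] for t in range(1, len(series))]


# A series with a constant increment becomes flat after one difference.
print(first_difference([1, 2, 3, 4, 5]))

# A series with growing increments needs a second difference.
once = first_difference([1, 2, 4, 7, 11])
twice = first_difference(once)
print(once, twice)
```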
Practice
One choice a researcher can use to test for
nonstationarity is to use a:
A. Dickey-Fuller test, which uses a modified t-statistic.
B. Breusch-Pagan test, which uses a modified t-statistic.
C. Dickey-Fuller test, which uses a modified χ² statistic.
Answer: A
Note: the Dickey-Fuller test checks for nonstationarity with a modified t-test; the Breusch-Pagan test is for heteroskedasticity.
Summary
➢ Importance: ☆☆☆
➢ Content:
✓ Random walk;
✓ Testing of unit roots;
✓ First differencing.
➢ Exam tips:
✓ Frequently tested: how to test for unit roots, how to interpret the test result, and how to transform a random
walk into a stationary series (first differencing).
Time-Series Analysis

Model Evaluation

Tasks:
➢ Contrast in-sample and out-of-sample forecasts;
➢ Explain ARCH model;
➢ Determine and justify an appropriate time-series
model.

Model Evaluation
Comparing forecasting model performance
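➢ In-sample forecast errors: the residuals from the sample
period used to estimate the model;
➢ Out-of-sample forecast errors: the forecast errors for observations outside
the sample period used to estimate the model.
Note: for example, with 10,000 observations split 7:3, the 3,000 held-out observations are used for backtesting.
➢ Root mean squared error (RMSE) criterion: the model
with the smallest RMSE for the out-of-sample data is
typically judged most accurate.
Note: RMSE is analogous to the SEE; the smaller the better.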
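The RMSE criterion can be applied as below. The actual values and the two sets of out-of-sample forecasts are invented for illustration, and `rmse` is a hypothetical helper.

```python
import math


def rmse(actual, forecast):
    """Root mean squared error of a set of forecasts."""
    squared_errors = [(a - f) ** 2 for a, f in zip(actual, forecast)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical out-of-sample observations and forecasts from two models.
actual = [10.0, 12.0, 11.0, 13.0]
forecasts = {
    "model_a": [9.0, 12.0, 12.0, 13.0],
    "model_b": [8.0, 14.0, 11.0, 15.0],
}

scores = {name: rmse(actual, f) for name, f in forecasts.items()}
best = min(scores, key=scores.get)
print(best)
```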
Model Evaluation
Instability of regression coefficients
➢ Financial and economic relationships are inherently
dynamic, so the estimates of regression coefficients of
the time-series model can change substantially across
different sample periods.
➢ There is a tradeoff between reliability and stability.
✓ Models estimated with shorter time series are usually
more stable but less reliable.
Note: with less data the estimates are more stable but not reliable enough.
Model Evaluation
Autoregressive Conditional Heteroskedasticity (ARCH)
➢ Review of conditional heteroskedasticity:
heteroskedasticity of the error variance is correlated with
(conditional on) the values of the independent variables.
➢ ARCH: conditional heteroskedasticity in AR models.
✓ When ARCH exists, the standard errors of the
regression coefficients in AR models are incorrect, and
the hypothesis tests of these coefficients are invalid.
Model Evaluation
ARCH(1) model
➢ The variance of the error in a particular time-series model in
one period depends on the variance of the error in
previous periods.
$\hat{\varepsilon}_t^2 = a_0 + a_1 \hat{\varepsilon}_{t-1}^2 + u_t$, where $u_t$ is the error term.
✓ If the coefficient a1 is statistically significantly different
from 0, the time series is ARCH(1).
✓ If a time series model has ARCH(1) errors, generalized
least squares must be used to develop a predictive
model.
Model Evaluation
Predicting variance with ARCH models
➢ If a time-series model has ARCH(1) errors, the ARCH
model can be used to predict the variance of the
residuals in future periods.
✓ $\hat{\sigma}_{t+1}^2 = \hat{a}_0 + \hat{a}_1 \hat{\varepsilon}_t^2$
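The variance forecast is a direct substitution into the fitted ARCH(1) equation. The coefficient estimates and the latest residual below are hypothetical, and `arch1_variance_forecast` is an illustrative name.

```python
def arch1_variance_forecast(a0, a1, last_residual):
    """Predicted next-period error variance: a0 + a1 * (last residual)^2."""
    return a0 + a1 * last_residual ** 2


# Hypothetical estimates a0 = 0.5, a1 = 0.25 and latest residual 2.0.
variance_next = arch1_variance_forecast(a0=0.5, a1=0.25, last_residual=2.0)
print(variance_next)
```

A larger residual today implies a larger predicted variance tomorrow whenever a1 is positive.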
Model Evaluation
Steps in time series forecasting
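➢ Does the series have a trend? (check by plotting)
✓ Yes: fit a linear trend or an exponential (log-linear) trend, then run the DW test for serial correlation:
• No serial correlation: use a trend model;
• Serial correlation: use an AR model.
✓ No: use an AR model.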
Model Evaluation
Steps in time series forecasting (Cont.)
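➢ Is the series covariance stationary?
✓ No: first difference the series, then continue;
✓ Yes: estimate an AR(1) model.
➢ Do the residuals show serial correlation?
✓ Yes: add lags and re-estimate;
✓ No: test for ARCH.
➢ Is ARCH present?
✓ Yes: use generalized least squares;
✓ No: use the time-series model as estimated.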
Regression With Two Time Series
Regression with two time series
➢ When running a regression with two time series, either or
both could be subject to nonstationarity.
➢ Dickey-Fuller tests can be used to detect a unit root:
✓ If neither time series has a unit root, linear
regression can be safely used;
✓ If only one time series has a unit root, linear regression
cannot be used;
Regression With Two Time Series
Regression with two time series (Cont.)
✓ If both time series have a unit root:
• If the two series are cointegrated, linear regression
can be used;
• If the two series are not cointegrated, linear
regression cannot be used.
✓ Cointegration: two time series have a long-term financial
or economic relationship so that they do not diverge
from each other without bound in the long run.
Summary
➢ Importance: ☆
➢ Content:
✓ In-sample and out-of-sample forecasting;
Note: look at the out-of-sample RMSE; the smaller the better.
✓ ARCH model;
✓ Regression with two time series.
➢ Exam tips:
✓ Not a key exam topic.
Excerpt from “Probabilistic approaches: scenario
analysis, decision trees, and simulations”

Simulation
Tasks:
➢ Describe steps of simulation and treatment of
correlation;
➢ Describe advantage, constraints, and issues of
simulation;
➢ Compare scenario analysis, decision trees, and
simulations.
Simulation
Steps in running a simulation
➢ Determine “probabilistic” variables;
➢ Define probability distributions for these variables;
➢ Check for correlation across variables;
➢ Run the simulation.
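The four steps can be sketched as a small Monte Carlo run. Every modelling choice here is an assumption made for illustration: the two probabilistic variables (revenue growth and profit margin), their normal distributions, the way correlation is built in by making margin depend on growth, and the simple "value" formula.

```python
import random
import statistics


def simulate_values(n_trials, seed=0):
    """Draw correlated inputs and return the simulated value for each trial."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_trials):
        # Steps 1-2: probabilistic variables and their assumed distributions.
        growth = rng.gauss(0.05, 0.02)
        # Step 3: build the correlation in explicitly (margin rises with growth).
        margin = 0.10 + 0.5 * (growth - 0.05) + rng.gauss(0.0, 0.01)
        # Step 4: compute the outcome for this trial.
        values.append(100.0 * (1.0 + growth) * margin)
    return values


values = simulate_values(10_000, seed=42)
print(round(statistics.mean(values), 2), round(statistics.stdev(values), 2))
```

The output is a whole distribution of values rather than a single point estimate, which is the main advantage of simulation noted later in this section.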

Simulation
Define probability distributions for variables
➢ Historical data
➢ Cross-sectional data

➢ Statistical distribution and parameters

Simulation
Treatment of correlation across variables
➢ When there is strong correlation, positive or negative,
across inputs, we have two choices:
✓ Pick only one that has the bigger impact on value;
✓ Build the correlation explicitly into the simulation.

Simulation
Advantages of using simulations
➢ Better input estimation;
➢ Yields a distribution for expected value rather than a
point estimate.
Simulation
Constraints on simulations
➢ Book value constraints;
➢ Earnings and cash flow constraints;
➢ Market value constraints.

Simulation
Issues in using simulations in risk assessment
➢ Garbage in, garbage out;
➢ Real data may not fit distributions;
➢ Non-stationary distributions;
➢ Changing correlation across inputs.

Simulation
Comparing probabilistic approaches
➢ How to choose among probabilistic approaches: scenario
analysis, decision trees, and simulation:
✓ Selective vs. full risk analysis;
✓ Type of risk;
• Discrete vs. continuous.
✓ Correlations across risks (risk factors);
✓ Quality of information.
Summary
➢ Importance: ☆
➢ Content:
✓ Steps of simulation and ways to define probability
distribution;
✓ Advantages, constraints, and issues of simulation;
✓ Comparison of scenario analysis, decision tree and
simulation.
➢ Exam tips:
✓ Not a key exam topic.
