Course Content
Module I
Assumptions of linear regression analysis. Violations of the assumptions of classical linear regression analysis. Time series analysis: preparing data for analysis, univariate time series analysis, autocorrelation function (ACF), partial autocorrelation function (PACF), moving average (MA) processes, autoregressive (AR) processes, ARMA processes.
Module II: Univariate time series modelling and forecasting
Box-Jenkins approach: building ARMA and ARIMA models in EVIEWS, GRETL, and R; forecasting ARMA models using EVIEWS; exponential smoothing models; ARIMA models; applications in financial decision making.
Module III: Multivariate models
Multi-equation modelling: simultaneous equation modelling; vector autoregression (VAR); VAR with exogenous variables; VAR estimation in EVIEWS, GRETL, and R; impulse response and variance decomposition.
Module IV: Modelling long-run relationships
Stationarity and unit root testing; cointegration; equilibrium (error correction) models; testing for cointegration using the Johansen technique.
Module V: Modelling volatility in time series
Modelling time series volatility: historical volatility and implied volatility models; ARCH processes; GARCH processes; estimation of ARCH and GARCH models in EVIEWS; extensions of GARCH models; multivariate GARCH models.
MODULE 1
The variance inflation factor (VIF) for the j-th regressor is
VIF_j = 1 / (1 - R_j^2)
where R_j^2 is the R^2 value obtained by regressing X_j on the other independent variables.
o If VIF > 10, multicollinearity is a problem.
Solution:
o Remove one of the correlated variables.
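A minimal Python sketch of the VIF computation using statsmodels; the DataFrame df and column names x1, x2, x3 are hypothetical placeholders.
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor
# df is a hypothetical DataFrame of independent variables
X = sm.add_constant(df[['x1', 'x2', 'x3']])
# VIF for each regressor (index 0 is the constant, so it is skipped)
for i, name in enumerate(X.columns):
    if name != 'const':
        print(name, variance_inflation_factor(X.values, i))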
2. Heteroscedasticity
Heteroscedasticity means that the variance of the residuals (ϵ_t) is not constant across all values of X.
Mathematical Representation:
Var(ϵ_t) = σ_t^2, which varies with t, rather than the constant σ^2 assumed by OLS.
Solution:
o Use Weighted Least Squares (WLS) instead of OLS.
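A minimal Python sketch of detecting heteroscedasticity with the Breusch-Pagan test and refitting by WLS; the arrays y and X are hypothetical, and the inverse-squared-residual weights are one common but not unique choice.
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan
# y and X are hypothetical data; a constant is added to X
X = sm.add_constant(X)
ols_fit = sm.OLS(y, X).fit()
# Breusch-Pagan test: a small p-value suggests heteroscedasticity
lm_stat, lm_pvalue, f_stat, f_pvalue = het_breuschpagan(ols_fit.resid, X)
print('Breusch-Pagan p-value:', lm_pvalue)
# WLS with weights inversely proportional to the estimated error variance
wls_fit = sm.WLS(y, X, weights=1.0 / ols_fit.resid**2).fit()
print(wls_fit.summary())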
3. Autocorrelation
Autocorrelation occurs when residuals (ϵ_t) are correlated over time, violating the assumption of independent errors.
Detection:
o Durbin-Watson (DW) Test:
DW = Σ(ϵ_t - ϵ_{t-1})^2 / Σ ϵ_t^2
A DW statistic near 2 indicates no autocorrelation; values toward 0 suggest positive autocorrelation, and values toward 4 suggest negative autocorrelation.
Solution:
o Use ARMA (AutoRegressive Moving Average) models instead
of OLS.
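A minimal Python sketch of the DW statistic, assuming model_fit is any fitted statsmodels regression result.
from statsmodels.stats.stattools import durbin_watson
# model_fit is a hypothetical fitted statsmodels OLS result
dw = durbin_watson(model_fit.resid)
print('Durbin-Watson statistic:', dw)  # values near 2 imply no autocorrelation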
4. Endogeneity
Endogeneity occurs when an independent variable (X) is correlated with the error term (ϵ), leading to biased and inconsistent OLS estimates.
Solution:
o Use Instrumental Variable (IV) Regression, where another
variable (instrument) is used to replace the endogenous X.
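To make the idea concrete, here is a minimal two-stage least squares (2SLS) sketch in Python using plain OLS for both stages; the arrays y, x (endogenous regressor), and z (instrument) are hypothetical, and the second-stage standard errors from this manual version are not corrected, so a dedicated IV routine would be preferred in practice.
import statsmodels.api as sm
# Stage 1: regress the endogenous X on the instrument Z
stage1 = sm.OLS(x, sm.add_constant(z)).fit()
x_hat = stage1.fittedvalues
# Stage 2: regress Y on the fitted values of X
stage2 = sm.OLS(y, sm.add_constant(x_hat)).fit()
print(stage2.params)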
5. Non-Linearity
If the relationship between X and Y is non-linear, a simple linear regression
model is inappropriate.
Solution:
o Use Polynomial Regression (e.g., add X^2, X^3 terms).
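A minimal Python sketch of polynomial regression by adding power terms to the design matrix; x and y are hypothetical 1-D arrays.
import numpy as np
import statsmodels.api as sm
# Add squared and cubed terms of x to the regression
X_poly = sm.add_constant(np.column_stack([x, x**2, x**3]))
poly_fit = sm.OLS(y, X_poly).fit()
print(poly_fit.summary())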
For an MA(q) process, the ACF cuts off after lag q, while the PACF decays gradually.
For an AR(p) process, the PACF cuts off after lag p, while the ACF decays gradually.
ARMA Process (Auto-Regressive Moving Average)
A combination of AR(p) and MA(q) models.
Used when data exhibits both autoregressive (AR) and moving average
(MA) properties.
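In this notation, the standard ARMA(p, q) specification combines both components:
Y_t = c + φ_1 Y_{t-1} + … + φ_p Y_{t-p} + ϵ_t + θ_1 ϵ_{t-1} + … + θ_q ϵ_{t-q}
where the φ_i are the AR coefficients, the θ_j are the MA coefficients, and ϵ_t is white noise.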
3. Independence of Errors:
o No autocorrelation in residuals.
4. Normality of Errors:
o Residuals follow a normal distribution.
5. No Multicollinearity:
o Independent variables should not be highly correlated.
1. Identification
The first step is to determine the appropriate model for the time series data. This
involves analyzing the stationarity, trend, seasonality, and autocorrelation
structure.
Key Activities in Identification:
1. Stationarity Check
o A stationary time series has a constant mean, variance, and
autocovariance over time.
o The Augmented Dickey-Fuller (ADF) test is used to check stationarity.
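A minimal Python sketch of the ADF test with statsmodels; data['Price'] follows the hypothetical column name used in the estimation example below.
from statsmodels.tsa.stattools import adfuller
# Null hypothesis: the series has a unit root (non-stationary)
adf_stat, p_value, *_ = adfuller(data['Price'])
print('ADF statistic:', adf_stat, 'p-value:', p_value)
# A p-value below 0.05 rejects the unit root, suggesting stationarity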
2. Estimation
Once the model is selected, the next step is estimating the parameters of the
model.
Mathematical Representation of the ARIMA(p, d, q) Model:
(1 - φ_1 L - … - φ_p L^p)(1 - L)^d Y_t = c + (1 + θ_1 L + … + θ_q L^q) ϵ_t
where:
p = order of the AutoRegressive (AR) component.
d = order of differencing required to make the series stationary.
q = order of the Moving Average (MA) component.
L = the lag operator.
ϵ_t = white noise (error term).
Estimating ARIMA Parameters in Python:
from statsmodels.tsa.arima.model import ARIMA
# Replace (1, 1, 1) with the orders p, d, q chosen at the identification stage
model = ARIMA(data['Price'], order=(1, 1, 1))
model_fit = model.fit()
print(model_fit.summary())
The Maximum Likelihood Estimation (MLE) method is used for
parameter estimation.
The model outputs coefficients, AIC (Akaike Information Criterion), and BIC
(Bayesian Information Criterion) to determine model fit.
3. Diagnostic Checking
After estimating the model, we need to check its validity by analyzing residuals.
Key Diagnostic Tests:
1. Residual Analysis
o The residuals should be normally distributed with zero mean
and no autocorrelation.
o A Q-Q plot can be used to check normality, as sketched below.
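A minimal Python sketch of the Q-Q plot, assuming model_fit from the estimation step above.
import statsmodels.api as sm
# Q-Q plot of the ARIMA residuals against the normal distribution
residuals = model_fit.resid
sm.qqplot(residuals, line='45')  # points near the 45-degree line suggest normality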
4. Forecasting
After model validation, it is used for future predictions.
Forecasting with ARIMA:
forecast = model_fit.forecast(steps=10)
print(forecast)
Evaluating Forecast Accuracy:
1. Mean Absolute Error (MAE): MAE = (1/n) Σ |y_t - ŷ_t|
2. Root Mean Squared Error (RMSE): RMSE = sqrt((1/n) Σ (y_t - ŷ_t)^2)
Python Implementation:
from sklearn.metrics import mean_squared_error
import numpy as np
# actual_values and forecasted_values are the observed and predicted series
rmse = np.sqrt(mean_squared_error(actual_values, forecasted_values))
print('RMSE:', rmse)
Conclusion
The Four Steps in Time Series Analysis:
1. Identification: Choose AR, MA, or ARIMA based on stationarity and
ACF/PACF plots.
2. Estimation: Estimate model parameters using Maximum Likelihood
Estimation (MLE).
3. Diagnostic Checking: Verify residuals are normal and uncorrelated.
4. Forecasting: Use the model to predict future values and evaluate
accuracy.
MODULE 2
Module 2: Univariate Time Series Modelling and Forecasting
This module covers the Box-Jenkins approach to building ARMA and ARIMA models, including equations, model-building steps, and forecasting techniques.
The Box-Jenkins Approach
1. Identify
o Determine the orders p, d, and q from stationarity tests and ACF/PACF plots.
2. Estimate
o Fit an ARMA(p, q) or ARIMA(p, d, q) model using maximum
likelihood estimation (MLE).
3. Diagnose
o Check residuals for autocorrelation using the Ljung-Box test (see the sketch after this list).
o Ensure residuals are normally distributed.
4. Forecast
o Use the estimated model to predict future values.
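A minimal Python sketch of the Ljung-Box check, assuming a fitted statsmodels result model_fit; the lag choice of 10 is illustrative.
from statsmodels.stats.diagnostic import acorr_ljungbox
# Large p-values fail to reject the null of no residual autocorrelation
print(acorr_ljungbox(model_fit.resid, lags=[10]))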
The ARIMA(p, d, q) model in lag-operator form is:
(1 - φ_1 L - … - φ_p L^p)(1 - L)^d Y_t = (1 + θ_1 L + … + θ_q L^q) ϵ_t
Where:
p = order of Auto-Regression (AR)
d = degree of Differencing
q = order of Moving Average (MA)
L = Lag Operator
With p = 1, d = 1, and q = 1, the model is ARIMA(1, 1, 1), meaning the series is differenced once and an ARMA(1, 1) model is fitted to the differenced series.
R (Statistical Computing)
# Load libraries
library(forecast)
library(tseries)
# Import data and check stationarity
adf.test(time_series_data)
# Fit ARIMA model
model <- auto.arima(time_series_data)
# Forecast future values
forecast(model, h=10)
plot(forecast(model, h=10))
MODULE 3
Module 3: Multivariate Time Series Models
In contrast to univariate models, where only one dependent variable is
analyzed, multivariate time series models examine multiple
interdependent variables simultaneously. These models help in
understanding causality, interrelationships, and forecasting when
multiple time-dependent variables interact over time.
A VAR(p) model can be written as:
Y_t = c + A_1 Y_{t-1} + A_2 Y_{t-2} + … + A_p Y_{t-p} + ϵ_t
Where:
Y_t is a vector of endogenous variables.
c is a vector of intercepts.
A_i are coefficient matrices.
ϵ_t is a white noise error term.
Example of a VAR(1) Model (Two Variables)
y_{1,t} = c_1 + a_{11} y_{1,t-1} + a_{12} y_{2,t-1} + ϵ_{1,t}
y_{2,t} = c_2 + a_{21} y_{1,t-1} + a_{22} y_{2,t-1} + ϵ_{2,t}
VAR with Exogenous Variables (VARX)
Y_t = c + A_1 Y_{t-1} + … + A_p Y_{t-p} + B X_t + ϵ_t
Where:
X_t represents exogenous variables (e.g., policy shocks, macroeconomic indicators).
B is the coefficient matrix capturing their impact.
VAR Estimation in R
library(vars)
# Load dataset
data <- read.csv("timeseries.csv")
# Choose the lag order and estimate the VAR (p=2 is illustrative)
VARselect(data, lag.max=8, type="const")
var_model <- VAR(data, p=2, type="const")
summary(var_model)
# Forecasting
forecast <- predict(var_model, n.ahead=10)
plot(forecast)
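Since impulse response and variance decomposition are discussed below, here is a minimal Python sketch using the VAR implementation in statsmodels; df is a hypothetical DataFrame of stationary endogenous series, and the lag order 2 is illustrative.
from statsmodels.tsa.api import VAR
# df is a hypothetical DataFrame of stationary endogenous variables
var_res = VAR(df).fit(2)
# Impulse responses: effect of a one-unit shock traced over 10 periods
var_res.irf(10).plot()
# Variance decomposition: share of forecast error variance due to each shock
var_res.fevd(10).plot()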
Conclusion
Multivariate models help analyze the interactions between
multiple time series.
VAR models are powerful tools for economic forecasting and
policy analysis.
Impulse Response & Variance Decomposition provide valuable
insights into shock transmission.
EVIEWS, GRETL, and R are widely used for estimating and
forecasting VAR models.
MODULE 4
Module 4: Modelling Long-Run Relationships in Time Series
Long-run relationships in time series data are crucial for
macroeconomic forecasting, financial decision-making, and policy
formulation. Traditional time series models like ARMA and VAR are
suitable for stationary data, but real-world financial and economic data
often exhibit non-stationary behavior. This module covers:
Stationarity and Unit Root Testing
Cointegration and Long-Run Equilibrium
Error Correction Models (ECM)
Testing for Cointegration Using Johansen’s Technique
A simple error correction model (ECM) takes the form:
ΔY_t = β_0 + β_1 ΔX_t + γ ECT_{t-1} + ϵ_t
Where:
ECT_{t-1} is the error correction term: the lagged residual from the cointegrating (long-run) regression, capturing deviations from equilibrium.
γ is the speed-of-adjustment coefficient. If γ is negative and significant, the system returns to equilibrium after a shock.
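For contrast with the Johansen approach below, a minimal Python sketch of the two-step Engle-Granger procedure; y and x are hypothetical 1-D NumPy arrays of the two series in levels.
import numpy as np
import statsmodels.api as sm
from statsmodels.tsa.stattools import coint
# Step 0: Engle-Granger cointegration test on the levels
t_stat, p_value, _ = coint(y, x)
print('Cointegration test p-value:', p_value)
# Step 1: cointegrating regression in levels; the residuals form the ECT
long_run = sm.OLS(y, sm.add_constant(x)).fit()
ect = long_run.resid
# Step 2: short-run dynamics with the lagged ECT
dy, dx = np.diff(y), np.diff(x)
X2 = sm.add_constant(np.column_stack([dx, ect[:-1]]))
ecm = sm.OLS(dy, X2).fit()
print(ecm.params)  # the last coefficient is gamma, the speed of adjustment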
Implementation in R
library(urca)
# Load data
data <- read.csv("financial_data.csv")
Y <- ts(data[,2:3]) # Select variables
# Johansen test
johansen_test <- ca.jo(Y, type="trace", ecdet="const", K=2)
summary(johansen_test)
Conclusion
Unit root tests ensure data stationarity.
Cointegration tests check for long-run relationships.
Error Correction Models adjust short-run deviations while
maintaining equilibrium.
Johansen’s technique is a powerful tool for multivariate
cointegration testing.
MODULE 5
Module 5: Modelling Volatility in Time Series
Volatility modelling is crucial in financial markets, risk management,
and derivative pricing. Unlike standard time series models, which
assume constant variance, financial time series often exhibit time-varying
volatility, commonly seen in stock returns, exchange rates, and
interest rates. This module covers:
Historical Volatility and Implied Volatility
Autoregressive Conditional Heteroskedasticity (ARCH) Models
Generalized ARCH (GARCH) Models
Extensions of GARCH Models
Multivariate GARCH Models
Estimation of ARCH/GARCH Models in EVIEWS
Types of Volatility
1. Historical Volatility (HV)
o Measures past fluctuations in asset prices.
2. Implied Volatility (IV)
o Derived from observed option prices; reflects the market's expectation of future volatility.
ARCH Processes
The ARCH(q) model specifies the conditional variance as:
σ_t^2 = α_0 + α_1 ϵ_{t-1}^2 + … + α_q ϵ_{t-q}^2
where:
σ_t^2 is the conditional variance.
ϵ_{t-i}^2 represents past squared shocks.
α_i ≥ 0 are parameters to be estimated.
Estimation in R (fGarch package):
library(fGarch)
# ARCH(1) corresponds to garch(1, 0) in fGarch notation
arch_model <- garchFit(~garch(1,0), data=returns, trace=FALSE)
summary(arch_model)
Limitation: ARCH(q) requires a large number of lags (q) to capture
volatility, making it inefficient.
The GARCH(p, q) model adds lagged conditional variances to the ARCH specification:
σ_t^2 = α_0 + Σ_{i=1}^{q} α_i ϵ_{t-i}^2 + Σ_{j=1}^{p} β_j σ_{t-j}^2
where:
α_i captures the impact of past shocks.
β_j represents past volatility’s impact on the current variance.
p and q are selected using AIC/BIC.
Example: GARCH(1,1) Model
σ_t^2 = α_0 + α_1 ϵ_{t-1}^2 + β_1 σ_{t-1}^2
3. Interpret Results
o Alpha (α_1): impact of past shocks on current volatility.
o Beta (β_1): persistence of past volatility; α_1 + β_1 close to 1 indicates highly persistent volatility.
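A minimal Python sketch of fitting a GARCH(1,1) with the third-party arch package, as an alternative to the EVIEWS and R workflows above; returns is a hypothetical series of percentage asset returns.
from arch import arch_model
# returns is a hypothetical series of (percentage) asset returns
garch = arch_model(returns, vol='Garch', p=1, q=1)
res = garch.fit(disp='off')
print(res.summary())  # reports omega, alpha[1], beta[1]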
Conclusion
ARCH models capture conditional variance.
GARCH models improve efficiency by adding lagged variance.
EGARCH, TGARCH, and GARCH-M enhance modeling capabilities.
Multivariate GARCH is essential for portfolio risk management.
EVIEWS, R, and Python help estimate these models efficiently.