
Time series

Time series analysis is a specific way of analyzing a sequence of data points collected over an interval of time. In time series analysis, analysts record data points at consistent intervals over a set period of time rather than just recording the data points intermittently or randomly.
• Python Time Series Analysis: Analyze Google Trends Data | DataCamp

• 8.1 Stationarity and differencing | Forecasting: Principles and Practice (2nd ed) (otexts.com)

• https://www.investopedia.com/terms/b/box-jenkins-model.asp
Google Trends provides insights into what people are searching for on Google
• Search Interest: Google Trends shows the search interest for various topics
and events in the United States over the past 24 hours. You can explore
what’s trending right now and dive deeper into specific issues and events.
• Year in Search: Explore the year through the lens of Google Trends data. It
gives you a snapshot of the most searched topics, events, and trends
throughout the year.
• Academy Awards 2024: Check out how people are searching for the 96th
Academy Awards.
• Super Tuesday: Republican Primary 2024: See how America is searching as
the Republican Party selects a presidential candidate.
• State of the Union 2024: The State of the Union address will be on
Thursday, March 7th. Discover search trends related to this event.
• Time series analysis in Python is a fascinating field that allows us to
uncover patterns, trends, and seasonality in sequential data.

• Google Trends Data Analysis:


• You can learn how to analyze Google Trends data using Python. The tutorial focuses on keywords like ‘diet’ and ‘gym’ to observe how their popularity varies over time. It emphasizes visual exploration of the dataset rather than heavy mathematics.
• You’ll start by importing necessary packages such as numpy, pandas,
matplotlib, and seaborn. These tools will help you manipulate and visualize
the data.
• The dataset contains monthly search trends for the specified keywords. For
instance, here’s a snippet of the data:
Month      diet (Worldwide)   gym (Worldwide)   finance (Worldwide)
2004-01    100                31                48
2004-02    75                 26                49
• The session covers topics like identifying trends, seasonality, and correlation analysis. It’s a great way to gain practical experience with time series techniques; a short code sketch in that spirit follows this list.
• Remember, time series analysis opens up a world of possibilities for
understanding patterns and making informed decisions.
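• The sketch below mirrors the kind of visual exploration described above. It is a minimal, illustrative example: the file name multiTimeline.csv and the column ordering are assumptions standing in for whatever Google Trends export the tutorial uses, not details taken from the tutorial itself.

```python
# A minimal sketch of exploring Google Trends data with pandas/matplotlib.
# The file name and column layout are assumed: a CSV with a "Month" column
# and one column per keyword, as in the snippet shown above.
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("multiTimeline.csv", parse_dates=["Month"], index_col="Month")
df.columns = ["diet", "gym", "finance"]  # shorter names for plotting

# Raw series plus a 12-month rolling mean to expose the long-term trend.
df.plot(figsize=(10, 4), title="Google Trends search interest")
df.rolling(window=12).mean().plot(figsize=(10, 4), title="12-month rolling mean")
plt.show()

# First differences remove the trend and highlight seasonality and correlation.
print(df.diff().corr())
```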
Stationary
• Stationary time series data:
• Signifies that the statistical properties (mean, variance, covariance) remain constant over time.
• Simplifies data for analysis, modeling, and forecasting.
• Observations are not dependent on time.
• No trend or seasonal effects.
• Consistent summary statistics (mean, variance).
• Variance and covariance are both mathematical terms used in statistics and probability theory:
• Variance measures how much a data set varies from its mean value.
• Covariance measures the directional relationship between two random variables.

• Chapter 2 Modelling Time Series | Time Series for Beginners (bookdown.org)
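• As a quick illustration of the constant mean/variance idea above, the sketch below checks a series for stationarity using rolling statistics and the Augmented Dickey-Fuller test from statsmodels. It assumes `series` is a pandas Series indexed by time (for example, one of the Google Trends columns); the function name is just illustrative.

```python
# A small sketch of a stationarity check, assuming `series` is a pandas Series.
import pandas as pd
from statsmodels.tsa.stattools import adfuller

def check_stationarity(series: pd.Series, window: int = 12) -> None:
    # Rough visual check: a stationary series has a roughly constant
    # rolling mean and rolling variance over time.
    print(series.rolling(window).mean().describe())
    print(series.rolling(window).var().describe())

    # Augmented Dickey-Fuller test: a small p-value (e.g. < 0.05) is evidence
    # against a unit root, i.e. in favour of stationarity.
    adf_stat, p_value, *_ = adfuller(series.dropna())
    print(f"ADF statistic = {adf_stat:.3f}, p-value = {p_value:.3f}")
```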


• Lag features are target values from previous periods. For example, if you would like to forecast the sales of a retail outlet in period t, you can use the sales of the previous month, t − 1, as a feature. That would be a lag of 1, and you could say it models some kind of momentum.

• Lag is essentially delay. Just as correlation shows how much two time series are similar, autocorrelation describes how similar a time series is to itself. For lag 1, you compare your time series with a lagged copy of itself; in other words, you shift the time series by 1 before comparing it with itself.

• A lag plot is a special type of scatter plot with the two variables (X, Y) “lagged.” A “lag” is a fixed amount of passing time; one set of observations in a time series is plotted (lagged) against a second, later set of data. The kth lag is the time period that happened k time points before time i. (A small pandas sketch follows.)
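• The sketch below puts these three ideas together (a lag feature, lag-1 autocorrelation, and a lag plot); the `sales` numbers are made up for illustration.

```python
# Lag features, autocorrelation, and a lag plot on a tiny made-up series.
import pandas as pd
import matplotlib.pyplot as plt
from pandas.plotting import lag_plot

sales = pd.Series([112, 118, 132, 129, 121, 135, 148, 148, 136, 119], name="sales")

# Lag feature: the value from the previous period (a lag of 1).
features = pd.DataFrame({"sales": sales, "sales_lag1": sales.shift(1)})
print(features.head())

# Autocorrelation at lag 1: correlation of the series with itself shifted by 1.
print("lag-1 autocorrelation:", sales.autocorr(lag=1))

# Lag plot: scatter of the series against its value one period later.
lag_plot(sales, lag=1)
plt.show()
```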


AR and MA
• Two of the most common models in time series are the
Autoregressive (AR) models and the Moving Average (MA)
models.
• Autoregressive Model: AR(p)
• The autoregressive model uses observations from previous time steps as input to a regression equation to predict the value at the next step. The AR model takes one argument, p, which determines how many previous time steps are used as inputs.
• The order, p, of the autoregressive model can be determined by looking at the partial autocorrelation function (PACF). The PACF gives the partial correlation of a stationary time series with its own lagged values, regressed on the values of the time series at all shorter lags.
• Let’s take a look at the PACF plot for the global temperature time series using the pacf() function in R (a Python counterpart is sketched below).
• pacf.plot <- pacf(temp.ts)  # partial autocorrelations of the temperature series
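• A rough Python counterpart of the R call above, using statsmodels. Because the global temperature series is not reproduced here, a synthetic AR(2) series is simulated so the example runs on its own; the coefficients 0.6 and −0.3 are arbitrary.

```python
# Python analogue of pacf(temp.ts): plot the PACF and read off the order p.
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_pacf

rng = np.random.default_rng(0)
noise = rng.normal(size=500)
y = np.zeros(500)
for t in range(2, 500):
    y[t] = 0.6 * y[t - 1] - 0.3 * y[t - 2] + noise[t]  # a synthetic AR(2) process

plot_pacf(y, lags=24)
plt.show()
# The PACF should be clearly non-zero at lags 1 and 2 and then cut off,
# suggesting p = 2 for an AR(p) model.
```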
What is Partial Autocorrelation?
• Partial correlation is a statistical method used to measure how strongly two variables are related while considering and adjusting for the influence of one or more additional variables. In more straightforward terms, it assesses the connection between two variables by factoring in the impact of other relevant variables, providing a clearer understanding of their relationship.
• In mathematical terms, the partial correlation coefficient, which assesses the relationship between variables X and Y while controlling for the influence of variable Z, is calculated using the formula:
• r_{XY \cdot Z} = \dfrac{r_{XY} - r_{XZ}\, r_{YZ}}{\sqrt{(1 - r_{XZ}^2)(1 - r_{YZ}^2)}}
• Here:
• r_{XY} is the correlation coefficient between X and Y.
• r_{XZ} is the correlation coefficient between X and Z.
• r_{YZ} is the correlation coefficient between Y and Z.
• The numerator represents the correlation between X and Y after accounting for their relationships with Z. The denominator normalizes the correlation by removing the effects of Z.
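• A small numeric check of this formula, using made-up data in which X and Y are related mainly through a common variable Z.

```python
# Verify the partial-correlation formula on simulated data.
import numpy as np

rng = np.random.default_rng(1)
z = rng.normal(size=1000)
x = 0.8 * z + rng.normal(size=1000)
y = 0.8 * z + rng.normal(size=1000)

r_xy = np.corrcoef(x, y)[0, 1]
r_xz = np.corrcoef(x, z)[0, 1]
r_yz = np.corrcoef(y, z)[0, 1]

partial = (r_xy - r_xz * r_yz) / np.sqrt((1 - r_xz**2) * (1 - r_yz**2))
print(f"correlation(X, Y) = {r_xy:.2f}, partial correlation given Z = {partial:.2f}")
# The plain correlation is sizeable, but the partial correlation is close to 0
# once the common influence of Z has been removed.
```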
MA MODEL
• Moving Average Models (MA Models) are essential tools used in
econometrics to forecast trends and understand patterns in time
series data. Here’s what you need to know:
• Definition:
• In an MA model, the present value of a time series depends on a
linear combination of the past white noise error terms of the same
time series.
• The order of the moving average model is denoted by the letter “q”,
which represents how many past error terms influence the current
value.
• The moving average model of order q can be represented as:
• Y_t = c + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2} + \ldots + \theta_q \varepsilon_{t-q}
• Here:
• Y_t is the value of the time series at time t.
• c is a constant (the mean of the time series).
• \varepsilon_t, \varepsilon_{t-1}, \ldots, \varepsilon_{t-q} are the white noise error terms associated with the time series at those time points.
• \theta_1, \theta_2, \ldots, \theta_q are the moving average coefficients.
• Interpretation:
• The MA model has a finite impulse response: the current noise value directly affects the present value of the model as well as the next q values, but nothing beyond that.
• Unlike autoregressive (AR) models, where past noise only indirectly influences the present value, MA models incorporate past noise terms directly.
• AR models act as infinite impulse response models, since the current noise affects an infinite number of future model values. (A short simulation of an MA(2) process follows.)
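• The sketch below simulates an MA(2) process of the form given above and recovers its coefficients with statsmodels; the values θ1 = 0.6, θ2 = 0.3 and c = 5 are chosen purely for illustration.

```python
# Simulate an MA(2) process and fit a pure MA model to recover c, θ1 and θ2.
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(42)
eps = rng.normal(size=1000)       # white noise terms ε_t
theta1, theta2, c = 0.6, 0.3, 5.0

y = c + eps.copy()
y[1:] += theta1 * eps[:-1]        # θ1 · ε_{t-1}
y[2:] += theta2 * eps[:-2]        # θ2 · ε_{t-2}

# order = (p, d, q) = (0, 0, 2): no AR terms, no differencing, two MA terms.
fit = ARIMA(y, order=(0, 0, 2)).fit()
print(fit.params)  # constant and MA coefficients should be close to 5, 0.6, 0.3
```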

BOX JENKINS
• The Box-Jenkins Model forecasts data using three principles:
autoregression, differencing, and moving average. These three
principles are known as p, d, and q, respectively. Each principle is
used in the Box-Jenkins analysis; together, they are collectively shown
as ARIMA (p, d, q).
• The autoregression (p) process tests the data for its level of
stationarity. If the data being used is stationary, it can simplify the
forecasting process. If the data being used is non-stationary it will
need to be differenced (d). The data is also tested for its moving
average fit (which is done in part q of the analysis process). Overall,
initial analysis of the data prepares it for forecasting by determining
the parameters (p, d, and q), which are then applied to develop a
forecast.
• Box-Jenkins is a type of autoregressive integrated moving average
(ARIMA) model that gauges the strength of one dependent variable
relative to other changing variables. The model's goal is to predict
future securities or financial market moves by examining the
differences between values in the series instead of through actual
values.
• An ARIMA model can be understood by outlining each of its components as follows:

• Autoregression (AR): refers to a model that shows a changing variable that regresses on its own lagged, or prior, values.
• Integrated (I): represents the differencing of raw observations to allow for
the time series to become stationary, i.e., data values are replaced by the
difference between the data values and the previous values.
• Moving average (MA): incorporates the dependency between an
observation and a residual error from a moving average model applied to
lagged observations.
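• The Integrated (I) step is just repeated differencing; here is a short pandas sketch on a made-up price series.

```python
# Differencing: replace each value with the change from the previous value.
import pandas as pd

prices = pd.Series([100, 103, 108, 115, 124, 135], name="price")

d1 = prices.diff()          # first difference, d = 1: y_t - y_{t-1}
d2 = prices.diff().diff()   # second difference, d = 2
print(pd.DataFrame({"price": prices, "d=1": d1, "d=2": d2}))
# Here the first difference still trends upward while the second difference is
# roughly constant, so d = 2 would be enough to make this toy series stationary.
```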
• Forecasting Stock Prices
• One use for Box-Jenkins Model analysis is to forecast stock prices. This
analysis is typically built out and coded through R software. The analysis
results in a logarithmic outcome, which can be applied to the data set to
generate the forecasted prices for a specified period of time in the future.

• ARIMA models are based on the assumption that past values have some
residual effect on current or future values. For example, an investor using
an ARIMA model to forecast stock prices would assume that new buyers
and sellers of that stock are influenced by recent market transactions when
deciding how much to offer or accept for the security.
• In the run-up to the 2008 financial crisis, for example, an investor using an autoregressive model to predict the performance of U.S. financial stocks would have had good reason to predict an ongoing trend of stable or rising stock prices in that sector. However, once it became public knowledge that many financial institutions were at risk of imminent collapse, investors suddenly became less concerned with these stocks' recent prices and far more concerned with their underlying risk exposure.

• Therefore, the market rapidly revalued financial stocks to a much lower level, a move that would have utterly confounded an autoregressive model.
ARIMA Parameters

• Each component in ARIMA functions as a parameter with a standard notation. For ARIMA
models, a standard notation would be ARIMA with p, d, and q, where integer values
substitute for the parameters to indicate the type of ARIMA model used. The parameters
can be defined as:

• p: the number of lag observations in the model, also known as the lag order.
• d: the number of times the raw observations are differenced; also known as the degree
of differencing.
• q: the size of the moving average window, also known as the order of the moving
average.
• For example, a linear regression model includes the number and type of terms. A value
of zero (0), which can be used as a parameter, would mean that particular component
should not be used in the model. This way, the ARIMA model can be constructed to
perform the function of an ARMA model, or even simple AR, I, or MA models.
• Because ARIMA models are complicated and work best on very large
data sets, computer algorithms and machine learning techniques are
used to compute them.
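• A minimal sketch of fitting an ARIMA(p, d, q) model with statsmodels; the order (1, 1, 1) and the drifting random-walk data are purely illustrative, not a recommendation for any real series.

```python
# Fit an ARIMA(1, 1, 1) model to a simulated drifting random walk and forecast.
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(7)
y = np.cumsum(rng.normal(loc=0.1, scale=1.0, size=300))  # non-stationary series

model = ARIMA(y, order=(1, 1, 1))   # p = 1 lag, d = 1 difference, q = 1 MA term
result = model.fit()
print(result.summary())

print(result.forecast(steps=12))    # forecast the next 12 periods
```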

• https://www.investopedia.com/terms/a/autoregressive-integrated-moving-average-arima.asp
Autoregressive Integrated Moving Average (ARIMA)
• KEY TAKEAWAYS
• Autocorrelation represents the degree of similarity between a given time series and a lagged version of itself over successive time intervals.
• Autocorrelation measures the relationship between a variable's current value and its past values.
• An autocorrelation of +1 represents a perfect positive correlation, while an autocorrelation of -1 represents a perfect negative correlation.
• Technical analysts can use autocorrelation to measure how much influence past prices for a security have on its future price.
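• A quick sketch of measuring autocorrelation at a few lags on a made-up price series:

```python
# Autocorrelation: correlation between the series and its own past values.
import pandas as pd

prices = pd.Series([10.0, 10.2, 10.1, 10.5, 10.4, 10.8, 10.9, 11.2])

for lag in (1, 2, 3):
    print(f"lag {lag}: autocorrelation = {prices.autocorr(lag=lag):+.2f}")
# Values near +1 indicate strong persistence in past prices; values near -1
# indicate a strong tendency to reverse.
```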
