Linear Machine Learning and Probabilistic Approaches For Time Series Analysis

This paper analyzes different approaches for time series predictive analysis, including linear models, ARIMA, XGBoost machine learning, copulas for probabilistic modeling, and Bayesian inference. It uses sales time series data from Rossmann stores to compare models and evaluate accuracy. The author finds that stacking ARIMA and XGBoost models improves forecasting accuracy and that probabilistic copula and Bayesian models can obtain distributions for risk analysis of sales dynamics.

Uploaded by

Sushma V

Introduction to Scientific Writing

Linear, Machine Learning and Probabilistic Approaches for Time Series Analysis

1. REFERENCE
B. M. Pavlyshenko, "Linear, machine learning and probabilistic approaches for time series
analysis," 2016 IEEE First International Conference on Data Stream Mining & Processing
(DSMP), Lviv, 2016, pp. 377-381, doi: 10.1109/DSMP.2016.7583582.

2. ABSTRACT
In this paper we study different approaches for time series modelling. The forecasting
approaches using linear models, ARIMA algorithm, XGBoost machine learning algorithm are
described. Results of different model combinations are shown. For probabilistic modelling
the approaches using copulas and Bayesian inference are considered.

Keywords: Time series, forecasting, machine learning, predictive analysis.

3. SUMMARY
• The paper analyses different approaches to predictive analysis of time series.
• The author uses the sales time series of Rossmann stores for the analysis.
• The forecast must cover not only the most probable sales values but also their
distribution, which is needed to assess the various risks in sales dynamics.
• To compare forecasting approaches, the author holds out two months of the
historical data as validation data and scores accuracy with root mean squared error
(RMSE).
• ARIMA, linear regression with LASSO regularization, and XGBoost (Extreme
Gradient Boosting) models are compared.
• The author also shows that stacking, with ARIMA as the first step and XGBoost as
the second step, improves the accuracy of time series forecasting.
• Two classes of approach are used: the first is based on the time series approach,
and the second on independent and identically distributed variables.
• A copula is a multivariate probability distribution in which the marginal
probability distribution of each variable is uniform.
• Copulas describe the dependency between random variables.
• Sklar's Theorem asserts that any multivariate joint distribution can be written in
terms of univariate marginal distribution functions and a copula, which describes
the dependence structure between the variables.
• The copula comprises all information on the dependence structure between the
variables, whereas the marginal cumulative distribution functions include all
information related to the marginal distributions.
• Multivariate dependencies among more than two variables can be analysed using vine
copulas, which allow a complex multivariate copula to be constructed from bivariate ones.
• The copula can be used to model stochastic dependencies between various factors
of sales time series separately from their marginal distributions.
• For experimenting with time series analysis using Bayesian inference, the author
uses a Markov Chain Monte Carlo (MCMC) algorithm.
• The plots show a stationary process, which indicates good convergence and an
adequate burn-in period for the MCMC algorithm.
• The author models mean sales with a Gaussian distribution, sales with a Student's
t-distribution, and promo with a Bernoulli distribution, in the natural logarithmic
scale.
• The Bayesian approach models stochastic dependencies between various factors of
sales time series and obtains the distributions for model parameters. This approach
can be useful for estimating multiple risks related to sales dynamics.
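
The MCMC points above can be illustrated with a minimal sketch. It is not the author's model: it assumes a random-walk Metropolis sampler (one basic MCMC algorithm), a Gaussian likelihood with known standard deviation, a flat prior, and synthetic log-scale data. The names `metropolis_mean`, `burn_in`, `step` and all numeric values are hypothetical choices made for illustration.

```python
import math
import random


def log_likelihood(mu, data, sigma):
    """Gaussian log-likelihood of the data for a candidate mean (constants dropped)."""
    return -sum((x - mu) ** 2 for x in data) / (2.0 * sigma ** 2)


def metropolis_mean(data, sigma=0.3, n_iter=6000, burn_in=1000, step=0.1, seed=0):
    """Random-walk Metropolis sampler for the mean, assuming a flat prior."""
    rng = random.Random(seed)
    mu = 0.0  # deliberately poor starting point, so the burn-in matters
    current_ll = log_likelihood(mu, data, sigma)
    chain = []
    for _ in range(n_iter):
        proposal = mu + rng.gauss(0.0, step)
        proposal_ll = log_likelihood(proposal, data, sigma)
        # Accept with probability min(1, likelihood ratio); 1 - random() lies in (0, 1].
        if math.log(1.0 - rng.random()) < proposal_ll - current_ll:
            mu, current_ll = proposal, proposal_ll
        chain.append(mu)
    return chain[burn_in:]  # discard the burn-in samples


# Synthetic "log sales" values, purely illustrative (not the Rossmann data).
data_rng = random.Random(42)
log_sales = [data_rng.gauss(8.5, 0.3) for _ in range(200)]

samples = metropolis_mean(log_sales)
posterior_mean = sum(samples) / len(samples)
sample_mean = sum(log_sales) / len(log_sales)
print(f"posterior mean {posterior_mean:.3f}, sample mean {sample_mean:.3f}")
```

Under these assumptions the posterior mean should sit close to the sample mean. The retained samples approximate the full posterior distribution of the parameter, which is the kind of output the summary describes as useful for risk estimation.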

4. NOTES
• A case study of different approaches to time series modelling has been conducted.
• Results are presented graphically in plots.
• For the time series linear regression case, Bayesian inference was applied.
• For probabilistic modelling, copulas were used.
• The probabilistic approach is used for risk assessment problems.
• For machine learning modelling, the XGBoost model was used.
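
The use of copulas for probabilistic modelling can be sketched as follows. This is an illustration under stated assumptions, not the paper's model: it draws samples from a bivariate Gaussian copula with a hypothetical correlation `rho`, then maps the uniform marginals to two arbitrarily chosen marginal distributions through their inverse CDFs. That split between dependence structure and marginals is what Sklar's Theorem allows. The marginals, the parameter values and the helper names are all hypothetical.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for the copula itself


def clip(u, eps=1e-12):
    """Keep u strictly inside (0, 1) so inverse CDFs stay finite."""
    return min(max(u, eps), 1.0 - eps)


def gaussian_copula_samples(rho, n, seed=0):
    """Draw n pairs (u, v) with uniform marginals and Gaussian-copula dependence."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        # Build z2 so that corr(z1, z2) = rho.
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        # The normal CDF turns each coordinate into a Uniform(0, 1) marginal.
        pairs.append((clip(STD_NORMAL.cdf(z1)), clip(STD_NORMAL.cdf(z2))))
    return pairs


def exponential_inv_cdf(u, mean):
    """Inverse CDF of an exponential distribution with the given mean."""
    return -mean * math.log(1.0 - u)


# Hypothetical marginals: factor A ~ Normal(8.5, 0.3), factor B ~ Exponential(mean 600).
factor_a_marginal = NormalDist(mu=8.5, sigma=0.3)
uv = gaussian_copula_samples(rho=0.7, n=5000)
joint = [(factor_a_marginal.inv_cdf(u), exponential_inv_cdf(v, 600.0)) for u, v in uv]
```

Changing `rho` alters only the dependence between the two factors, while swapping the inverse CDFs alters only the marginals. A risk analysis would then read quantiles or tail probabilities off the simulated `joint` sample.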

5. DEFICITS
• The paper discusses the modelling of time series using different approaches, but does
not give a conclusion or a comparison for each of the experiments conducted.
