Matteo Sangiorgio · Fabio Dercole · Giorgio Guariso

Deep Learning in Multi-step Prediction of Chaotic Dynamics: From Deterministic Models to Real-World Systems
SpringerBriefs in Applied Sciences and Technology
PoliMI SpringerBriefs
Editorial Board
Barbara Pernici, Politecnico di Milano, Milano, Italy
Stefano Della Torre, Politecnico di Milano, Milano, Italy
Bianca M. Colosimo, Politecnico di Milano, Milano, Italy
Tiziano Faravelli, Politecnico di Milano, Milano, Italy
Roberto Paolucci, Politecnico di Milano, Milano, Italy
Silvia Piardi, Politecnico di Milano, Milano, Italy
More information about this subseries at https://link.springer.com/bookseries/11159
http://www.polimi.it
Matteo Sangiorgio, Dipartimento di Elettronica, Informazione e Bioingegneria (DEIB), Politecnico di Milano, Milan, Italy
Fabio Dercole, Dipartimento di Elettronica, Informazione e Bioingegneria (DEIB), Politecnico di Milano, Milan, Italy
Giorgio Guariso, Dipartimento di Elettronica, Informazione e Bioingegneria (DEIB), Politecnico di Milano, Milan, Italy
This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Non ti sia grave il fermarti alcuna volta a
vedere nelle macchie de’ muri, o nella cenere
del fuoco, o nuvoli o fanghi, od altri simili
luoghi, ne’ quali, se ben saranno da te
considerati, tu troverai invenzioni
mirabilissime… perchè nelle cose confuse
l’ingegno si desta.
(It would not be too much of an effort to
pause sometimes to look into these stains on
walls, the ashes from the fire, the clouds, the
mud, or other similar places. If these are well
contemplated, you will find fantastic
inventions… These will do you well because
they will awaken genius.)
Leonardo da Vinci, Trattato della pittura, ca.
1540.
Preface
In the present data-rich era, we know that time series of many variables can hardly
be interpreted as regular movements plus some stochastic noise. For half a century,
we have also known that even apparently simple sets of nonlinear equations can
produce extremely complex movements that remain within a limited portion of the
variables space without being periodic. Such movements have been named “chaotic”
(“deterministic chaos” when the equations include no stochasticity).
Immediately after chaotic dynamics were discovered, Lorenz and other researchers were troubled by the problem of predictability. How far into the future can we reliably forecast
the output of such systems? For many years, the answer to such a question remained
limited to very few steps. Today, however, powerful computer tools are available
and have been successfully used to accomplish complex tasks. Can we extend our
predictive ability using such tools? How far? Can we predict not just a single value,
but also an entire sequence of outputs?
This book tries to answer these questions by using deep artificial neural networks
as the forecasting tools and analyzing the performances of different architectures of
such networks. In particular, we compare the classical feed-forward (FF) architecture
with the more recent long short-term memory (LSTM) structure. For the latter, we explore the possibility of using or not using the traditional training approach known as “teacher forcing”.
Before presenting these methods in detail, we review the basic elements and tools of chaos theory and nonlinear time-series analysis in Chap. 2. We take a practical
approach, looking at how chaoticity can be quantified by simulating mathematical
models, as well as from real-world measurements.
Chapter 3 presents the cases on which we test our deep neural predictors. We
consider four well-known dynamical systems (in discrete time) showing a chaotic
attractor in their variables space: the logistic and the Hénon maps, which are the
prototypes of chaos in non-reversible and reversible systems, respectively, and two
generalized Hénon maps, to include cases of low- and high-dimensional hyperchaos.
These systems easily produce arbitrarily long synthetic datasets, in the form of deterministic chaotic time series, to which we also add synthetic noise to better mimic real situations. Finally, we consider two real-world time series: solar irradiance and ozone concentration, measured at two stations in Northern Italy. These time series are
shown to contain a chaotic movement by means of the tools of nonlinear time-series
analysis.
Chapter 4 illustrates the structures of the neural networks that we use and intro-
duces their performance metrics. It deals, in particular, with the issue of multi-
step forecasting that is commonly performed by recursively nesting one-step-ahead
predictions (recursive predictor). An alternative explored here consists of training
the model to directly compute multiple outputs (multi-output predictor), each repre-
senting the prediction at a specific time step in the future. Multi-output prediction
can be implemented in static networks (using FF nets), as well as in a dynamic way
(adopting recurrent architectures, such as the LSTM nets). The standard training procedures for both architectures are reviewed, with particular attention to the issues related to the “teacher forcing” training approach for LSTM networks.
Chapter 5 illustrates the key results of the study on both the synthetic and the
real-world time series and explores the effect of different sources of noise. In partic-
ular, we consider a stochastic environment that mimics the observation noise, and
the presence of non-stationary dynamics that can be seen as a structural distur-
bance. Additionally, in the context of chaotic systems’ forecasting, we introduce the
concept of the generalization capability of the neural predictors in terms of “domain
adaptation”.
Chapter 6 offers some remarks on the choice of the experimental settings adopted in this work. It also presents additional aspects of the neural predictors, analyzing their training method, their neural architecture, and their long-term performances.
The main result we can draw from our journey in the vast area of deep neural network architectures is that LSTM networks (which are dynamical systems themselves) trained without teacher forcing are the best approach to predict complex oscillatory time series. They systematically outperform the competitors and also prove able to adapt to other domains with similar features without a relevant loss of accuracy. Overall, they represent a significant improvement in the forecasting of chaotic time series and can be used as components
of advanced control techniques, such as model predictive control, to optimize the
management of complex real-world systems. These conclusions and some consid-
erations about the expected future research in the field are summarized in the last
chapter.
The book will be useful for researchers and Ph.D. students in the area of neural networks and chaotic systems, as well as for practitioners interested in applying modern deep learning techniques to the forecasting of complex real-world time series, particularly those related to environmental systems. The book will also be of interest to scholars who would like to bridge the gap between the classical theory of nonlinear systems and recent developments in machine learning techniques.
The book largely draws from the Ph.D. thesis in Information Technology by
Matteo Sangiorgio; supervisor: Giorgio Guariso; tutor: Fabio Dercole, Politecnico
di Milano, 2021. The main results presented in the text appeared in the following
publications:
Chapter 1
Introduction to Chaotic Dynamics’ Forecasting
Abstract Chaotic dynamics are the paradigm of complex and unpredictable evolu-
tion due to their built-in feature of amplifying arbitrarily small perturbations. The
forecasting of these dynamics has attracted the attention of many scientists since
the discovery of chaos by Lorenz in the 1960s. In the last decades, machine learn-
ing techniques have shown a greater predictive accuracy than traditional tools from
nonlinear time-series analysis. In particular, artificial neural networks have become
the state of the art in chaotic time series forecasting. However, how to select their
structure and the training algorithm is still an open issue in the scientific commu-
nity, especially when considering a multi-step forecasting horizon. We implement
feed-forward and recurrent architectures, considering different training methods and
forecasting strategies. The predictors are evaluated on a wide range of problems,
from low-dimensional deterministic cases to real-world time series.
Since Edward Lorenz’s discovery of deterministic chaos in the sixties [63], many attempts to forecast the future evolution of chaotic systems and to discover how far they can be predicted have been carried out, adopting a wide range of models. Some early attempts were performed in the 80s and 90s [7, 18, 31, 50, 75, 82–84, 91, 97], but the topic has been debated more and more in recent years (see Fig. 1.1) due to the development of many machine learning techniques in the field
of time-series analysis and prediction. Attempts include widely used models such as support vector machines [41, 73, 114] and polynomial functions [92, 93], but the most widely used architectures are artificial neural networks (ANNs).
The main distinction within ANNs is between feed-forward (FF) structures—static
maps which approximate the relationship between inputs and outputs—and recurrent
neural nets (RNNs)—dynamical models whose outputs also depend on an internal
state characterizing each neuron.
Examples of the first category are the traditional multi-layer perceptrons [4, 9,
24, 25, 28, 42, 52, 56, 61, 103, 110], radial basis function networks [26, 38, 57, 71,
95, 96, 104], fuzzy neural networks [22, 33, 53, 69, 108, 117], deep belief neural
nets [54], extreme learning machines [30, 72], and convolutional neural networks
[8, 81]. RNNs are particularly suited to be used as predictors of nonlinear time series
Fig. 1.1 Number of records by publication year (1986–2020) from a Web of Science search for the topic “chaos prediction” (y-axis: Web of Science records per year, 0–350; x-axis: 1990–2020)
[15, 16, 19, 20, 43, 61, 66, 67, 113, 116], though their training is computationally heavy because of the backpropagation through time of the internal memory [36]. Recently, two particular RNN classes, reservoir computers and long short-term memory (LSTM) networks, have been demonstrated to be extremely efficient in the prediction of chaotic dynamics [3, 10, 13, 14, 29, 40, 44, 49, 51, 64, 65, 68, 74, 79, 80, 87, 98–102, 111, 115, 118]. Reservoir computers make use of many recurrent neurons with random connections, and training is only performed on the output layer. Conversely, recurrent neurons are trained in LSTM nets. These nets avoid the typical issue of vanishing or exploding optimization gradients by modulating the information flow through so-called gates, whose behaviors are learned as well.
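As an illustration of the reservoir-computing idea just described, the following minimal sketch (ours, not taken from the works cited above; all hyperparameter values are illustrative assumptions) trains an echo state network to predict a chaotic series one step ahead. The recurrent layer is random and fixed; only the linear readout is fitted, here by ridge regression.

```python
import numpy as np

rng = np.random.default_rng(0)

# a chaotic series to play with (logistic map, r = 4)
y = np.empty(3000)
y[0] = 0.3
for t in range(2999):
    y[t + 1] = 4.0 * y[t] * (1.0 - y[t])

# reservoir: large random recurrent layer, fixed after initialization
n_res = 300
W_in = rng.uniform(-0.5, 0.5, n_res)
W = rng.normal(0.0, 1.0, (n_res, n_res))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))   # rescale spectral radius to 0.9

def run_reservoir(u):
    """Drive the reservoir with the input sequence u and collect its states."""
    states = np.zeros((len(u), n_res))
    s = np.zeros(n_res)
    for t, ut in enumerate(u):
        s = np.tanh(W @ s + W_in * ut)
        states[t] = s
    return states

washout, n_train = 100, 2000
S = run_reservoir(y[:-1])                         # state at t has seen y(0..t)
X, target = S[washout:n_train], y[washout + 1 : n_train + 1]

# training touches only the linear readout (ridge regression)
W_out = np.linalg.solve(X.T @ X + 1e-6 * np.eye(n_res), X.T @ target)

pred = S[n_train:] @ W_out                        # one-step-ahead predictions
print("test RMSE:", np.sqrt(np.mean((pred - y[n_train + 1 :]) ** 2)))
```

The absence of backpropagation through time is precisely what makes training a reservoir cheap compared to a fully trained recurrent network.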
The comparison between feed-forward and recurrent nets is not trivial. Despite the initial optimism, RNNs turned out to achieve performances similar to traditional FF architectures on many forecasting tasks [34, 67]. Recurrent neurons have been demonstrated to be efficient when they are used as basic blocks to build up sequence-to-sequence architectures. This kind of structure represents the state-of-the-art approach in many natural language processing tasks (e.g., machine translation, question answering, automatic summarization, text-to-speech), though it has recently lost ground to advanced FF architectures. Other works in the field of
chaotic dynamics’ forecasting focused on wavelet networks [17], swarm-optimized
neural nets [62], Chebyshev [2], and residual [67] nets. Some authors developed
hybrid forecasting schemes that integrate machine learning tools and knowledge-
based models [27, 55, 58, 78, 98] or combine multiple predictive techniques [23,
47, 76, 90].
Forecasting chaotic dynamics one or a few time steps ahead is usually an easy task, as demonstrated by the high accuracy obtained in many systems, in both continuous and discrete time [1, 4, 5, 59, 68, 69, 103, 107, 109, 116]. The situation
noise. To quantify the effect of the observation noise, we artificially add a stochas-
tic disturbance with different magnitudes to the deterministic time series generated
by the traditional chaotic systems. We also assess the sensitivity to structural noise, obtained by introducing a slowly varying dynamic for the parameter defining the growth rate in the traditional logistic map. The resulting non-stationary process
has concurrent slow and fast dynamics that represent a challenging forecasting task.
These numerical experiments lie somewhat in between the deterministic systems, mainly theoretical, and the practical applications. Finally, we apply the proposed
methodologies to two real-world time series that exhibit a chaotic behavior: solar
irradiance and ozone concentration.
The results obtained show that, in general, LSTM nets trained without teacher forcing (TF) are the highest performing in predicting chaotic dynamics. The improvement with respect to the three competitors is not uniform and strongly task-dependent. LSTM-no-TF nets are also shown to be more robust when a redundant (or insufficient) number of time lags is included in the input, a feature that represents a remarkable advantage given the well-known problem of estimating the actual embedding dimension from a time series [11].
Another remarkable feature of the forecasting models is their generalization capability, often referred to as “domain adaptation” (a sub-field of the so-called “transfer learning”) in the neural network literature [35]. It indicates the possibility of storing
the knowledge gained while solving a given task and applying it to different, though
similar, datasets [112]. Transfer learning became a hot topic in machine learning in the last decade, but it has only recently received attention from researchers in the field of nonlinear science and chaotic systems [37, 39, 48, 86, 106]. To test this feature,
the neural networks developed to forecast the solar irradiance in a specific location
(source domain) have been used, without retraining, on other sites (target domains)
with quite different geographical conditions. The neural networks developed in our
study have proved able to forecast solar radiation in other stations with a minimal
loss of precision.
The rest of the book is structured as follows. Chapter 2 introduces some basic
concepts related to chaos theory. Chapter 3 presents the chaotic systems and time
series used to test the predictors. Chapter 4 describes different ways to use feed-
forward and recurrent neural networks in a multi-step-ahead prediction task. The other sections of that chapter focus on some technical issues: how to set up a supervised learning task starting from a time series, the performance metrics, and the training procedure. Chapter 5 reports the forecasting performance obtained
in both the artificial systems and the real-world time series considered. In Chap. 6,
we discuss some technical aspects of the training procedure and of the behavior of
the different neural predictors. We also present an overview of other alternative state-
of-the-art architectures which can be adopted for numerical time series forecasting.
Chapter 7 provides the concluding remarks of the book and presents some possible
future research directions.
References
1. Abdulkadir, S. J., Alhussian, H., & Alzahrani, A. I. (2018). Analysis of recurrent neural networks for Hénon simulated time-series forecasting. Journal of Telecommunication, Electronic and Computer Engineering (JTEC), 10.1-8, 155–159.
2. Akritas, P., Antoniou, I., & Ivanov, V. V. (2000). Identification and prediction of discrete
chaotic maps applying a Chebyshev neural network. Chaos, Solitons and Fractals, 11.1-3,
337–344.
3. Antonik, P., et al. (2018). Using a reservoir computer to learn chaotic attractors, with appli-
cations to chaos synchronization and cryptography. Physical Review E, 98.1, 012215.
4. Atsalakis, G., Skiadas, C., & Nezis, D. (2008). Forecasting Chaotic time series by a Neural
Network. In Proceedings of the 8th International Conference on Applied Stochastic Models
and Data Analysis, Vilnius, Lithuania. (Vol. 30, p. 7782).
5. Atsalakis, G., & Tsakalaki, K. (2012). Simulating annealing and neural networks for chaotic
time series forecasting. Chaotic Model. Simul., 1, 81–90.
6. Bakker, R., et al. (2000). Learning chaotic attractors by neural networks. Neural Computation,
12.10, 2355–2383.
7. Bollt, E. M. (2000). Model selection, confidence and scaling in predicting chaotic time-series.
International Journal of Bifurcation and Chaos, 10.06, 1407–1422.
8. Bompas, S., Georgeot, B., & Guéry-Odelin, D. (2020). Accuracy of neural networks for the
simulation of chaotic dynamics: Precision of training data versus precision of the algorithm.
arXiv:2008.04222.
9. Bonnet, D., Labouisse, V., & Grumbach, A. (1997). δ-NARMA neural networks: A new
approach to signal prediction. IEEE Transactions on Signal Processing, 45.11, 2799–2810.
10. Borra, F., Vulpiani, A., & Cencini, M. (2020). Effective models and predictability of chaotic
multiscale systems via machine learning. Physical Review E, 102.5, 052203.
11. Bradley, E., & Kantz, H. (2015). Nonlinear time-series analysis revisited. Chaos: An Inter-
disciplinary Journal of Nonlinear Science, 25.9, 097610.
12. Brajard, J. et al. (2020). Combining data assimilation and machine learning to emulate a
dynamical model from sparse and noisy observations: A case study with the Lorenz 96 model.
Journal of Computational Science, 44, 101171.
13. Butcher, J. B., et al. (2013). Reservoir computing and extreme learning machines for non-
linear time-series data analysis. Neural Networks, 38, 76–89.
14. Canaday, D., Griffith, A., & Gauthier, D. J. (2018). Rapid time series prediction with a
hardware-based reservoir computer. Chaos: An Interdisciplinary Journal of Nonlinear Sci-
ence, 28.12, 123119.
15. Cannas, B., & Cincotti, S. (2002). Neural reconstruction of Lorenz attractors by an observable.
Chaos, Solitons and Fractals, 14.1, 81–86.
16. Cannas, B., et al. (2001). Learning of Chua’s circuit attractors by locally recurrent neural
networks. Chaos, Solitons and Fractals, 12.11, 2109–2115.
17. Cao, L., et al. (1995). Predicting chaotic time series with wavelet networks. Physica D:
Nonlinear Phenomena, 85.1-2, 225–238.
18. Casdagli, M. (1989). Nonlinear prediction of chaotic time series. Physica D: Nonlinear Phe-
nomena, 35.3, 335–356.
19. Cechin, A. L., Pechmann, D. R., & de Oliveira, L. P. (2008). Optimizing Markovian modeling
of chaotic systems with recurrent neural networks. Chaos, Solitons and Fractals, 37.5, pp.
1317–1327.
20. Chandra, R., & Zhang, M. (2012). Cooperative coevolution of Elman recurrent neural net-
works for chaotic time series prediction. Neurocomputing, 86, 116–123.
21. Chen, P., et al. (2020). Autoreservoir computing for multistep ahead prediction based on the
spatiotemporal information transformation. Nature Communications, 11.1, 1–15.
22. Chen, Z. (2010). A chaotic time series prediction method based on fuzzy neural network and
its application. In International Workshop on Chaos-Fractal Theories and Applications (pp.
355–359). IEEE.
23. Cheng, W., et al. (2021). High-efficiency chaotic time series prediction based on time convo-
lution neural network. Chaos, Solitons and Fractals, 152, 111304.
24. Covas, E., & Benetos, E. (2019). Optimal neural network feature selection for spatial-temporal
forecasting. Chaos: An Interdisciplinary Journal of Nonlinear Science, 29.6, 063111.
25. Dercole, F., Sangiorgio, M., & Schmirander, Y. (2020). An empirical assessment of the uni-
versality of ANNs to predict oscillatory time series. IFAC-PapersOnLine, 53.2, 1255–1260.
26. Ding, H.-L., et al. (2009). Prediction of chaotic time series using L-GEM based RBFNN.
In 2009 International Conference on Machine Learning and Cybernetics (Vol. 2, pp. 1172–
1177). IEEE.
27. Doan, N. A. K., Polifke, W., & Magri, L. (2019). Physics-informed echo state networks
for chaotic systems forecasting. In International Conference on Computational Science (pp.
192–198). Springer.
28. Dudul, S. V. (2005). Prediction of a Lorenz chaotic attractor using two-layer perceptron neural
network. Applied Soft Computing, 5.4, pp. 333–355.
29. Fan, H., et al. (2020). Long-term prediction of chaotic systems with machine learning. Physical
Review Research, 2.1, 012080.
30. Faqih, A., Kamanditya, B., & Kusumoputro, B. (2018). Multi-step ahead prediction of
Lorenz’s Chaotic system using SOM ELM- RBFNN. In 2018 International Conference on
Computer, Information and Telecommunication Systems (CITS) (pp. 1–5). IEEE.
31. Farmer, J. D., & Sidorowich, J. J. (1987). Predicting chaotic time series. Physical Review
Letters, 59.8, 845.
32. Galván, I. M., & Isasi, P. (2001). Multi-step learning rule for recurrent neural models: An application to time series forecasting. Neural Processing Letters, 13.2, 115–133.
33. Gao, Y., & Joo Er, M. (2005). NARMAX time series model prediction: Feedforward and
recurrent fuzzy neural network approaches. Fuzzy Sets and Systems, 150.2, 331–350.
34. Gers, F. A., Eck, D., & Schmidhuber, J. (2002). Applying LSTM to time series predictable
through time-window approaches. In Neural Nets WIRN Vietri-01 (pp. 193–200). Springer.
35. Glorot, X., Bordes, A., & Bengio, Y. (2011). Domain adaptation for large-scale sentiment
classification: A deep learning approach. Proceedings of the 28 th International Conference
on Machine Learning, Bellevue, WA, USA.
36. Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep Learning. MIT Press.
37. Guariso, G., Nunnari, G., & Sangiorgio, M. (2020). Multi-step solar irradiance forecasting
and domain adaptation of deep neural networks. Energies, 13.15,3987.
38. Guerra, F. A., & dos Coelho, L. S. (2008). Multi-step ahead nonlinear identification of Lorenz’s
chaotic system using radial basis neural network with learning by clustering and particle swarm
optimization. Chaos, Solitons and Fractals, 35.5, 967–979.
39. Guo, Y., et al. (2020). Transfer learning of chaotic systems. arXiv:2011.09970.
40. Haluszczynski, A., & Räth, C. (2019). Good and bad predictions: Assessing and improving the
replication of chaotic attractors by means of reservoir computing. Chaos: An Interdisciplinary
Journal of Nonlinear Science, 29.10, 103143.
41. Han, L., Ding, L., & Qi, L. (2005). Chaotic Time series nonlinear prediction based on support
vector machines. Systems Engineering - Theory and Practice, 9.
42. Han, M., & Wang, Y. (2009). Analysis and modeling of multivariate chaotic time series based
on neural network. Expert Systems with Applications, 36.2, 1280–1290.
43. Han, M. et al. (2004). Prediction of chaotic time series based on the recurrent predictor neural
network. IEEE Transactions on Signal Processing, 52.12, 3409–3416.
44. Hassanzadeh, P., et al. (2019). Data-driven prediction of a multi-scale Lorenz 96 chaotic
system using a hierarchy of deep learning methods: Reservoir computing, ANN, and RNN-
LSTM. In Bulletin of the American Physical Society, C17-009.
45. He, T., et al. (2019). Quantifying exposure bias for neural language generation.
arXiv:1905.10617
46. Hussein, S., Chandra, R., & Sharma, A. (2016). Multi-step- ahead chaotic time series pre-
diction using coevolutionary recurrent neural networks. In IEEE Congress on Evolutionary
Computation (CEC) (pp. 3084–3091). IEEE.
47. Inoue, H., Fukunaga, Y., & Narihisa, H. (2001). Efficient hybrid neural network for chaotic
time series prediction. In International Conference on Artificial Neural Networks (pp. 712–
718). Springer.
48. Inubushi, M., & Goto, S. (2020). Transfer learning for nonlinear dynamics and its application
to fluid turbulence. Physical Review E, 102.4, 043301.
49. Jiang, J., & Lai, Y.-C. (2019). Model-free prediction of spatiotemporal dynamical systems
with recurrent neural networks: Role of network spectral radius. Physical Review Research,
1.3, 033056.
50. Jones, R. D., et al. (1990). Function approximation and time series prediction with neural
networks. In 1990 IJCNN International Joint Conference on Neural Networks (pp. 649–665).
IEEE.
51. Jüngling, T. (2019). Reconstruction of complex dynamical systems from time series using
reservoir computing. In IEEE International Symposium on Circuits and Systems (ISCAS) (pp.
1–5). IEEE.
52. Karunasinghe, D. S. K., & Liong, S.-Y. (2006). Chaotic time series prediction with a global
model: Artificial neural network. Journal of Hydrology, 323.1-4, 92–105.
53. Kuremoto, T., et al. (2003). Predicting chaotic time series by reinforcement learning. In
Proceedings of the 2nd International Conferences on Computational Intelligence, Robotics
and Autonomous Systems (CIRAS 2003).
54. Kuremoto, T. (2014). Forecast chaotic time series data by DBNs. In 7th International Congress
on Image and Signal Processing (pp. 1130–1135). IEEE.
55. Lei, Y., Hu, J., & Ding, J. (2020). A hybrid model based on deep LSTM for predicting
high-dimensional chaotic systems. arXiv:2002.00799.
56. Lellep, M., et al. (2020). Using machine learning to predict extreme events in the Hénon map.
In Chaos: An Interdisciplinary Journal of Nonlinear Science, 30.1, 013113.
57. Leung, H., Lo, T., & Wang, S. (2001). Prediction of noisy chaotic time series using an optimal
radial basis function neural network. IEEE Transactions on Neural Networks, 12.5, 1163–
1172.
58. Levine, M. E., & Stuart, A. M. (2021). A framework for machine learning of model error in
dynamical systems. arXiv:2107.06658.
59. Li, Q., & Lin, R.-C. (2016). A new approach for chaotic time series prediction using recurrent
neural network. Mathematical Problems in Engineering, 3542898.
60. Lim, T. P., & Puthusserypady, S. (2006). Error criteria for cross validation in the context
of chaotic time series prediction. Chaos: An Interdisciplinary Journal of Nonlinear Science,
16.1, 013106.
61. Lin, T.-N., et al. (1997). A delay damage model selection algorithm for NARX neural net-
works. IEEE Transactions on Signal Processing, 45.11, 2719–2730.
62. López-Caraballo, C. H., et al. (2016). Mackey-Glass noisy chaotic time series prediction
by a swarm-optimized neural network. Journal of Physics: Conference Series, 720, 1. IOP
Publishing.
63. Lorenz, E. N. (1963). Deterministic nonperiodic flow. Journal of the Atmospheric Sciences,
20.2, 130–141.
64. Lu, Z., Hunt, B. R., & Ott, E. (2018). Attractor reconstruction by machine learning. Chaos:
An Interdisciplinary Journal of Nonlinear Science, 28.6, 061104.
65. Lu, Z., et al. (2017). Reservoir observers: Model-free inference of unmeasured variables in
chaotic systems. Chaos: An Interdisciplinary Journal of Nonlinear Science, 27.4, 041102.
66. Ma, Q.-L. (2007). Chaotic time series prediction based on evolving recurrent neural networks.
In International Conference on Machine Learning and Cybernetics (Vol. 6, pp. 3496–3500).
IEEE.
67. Maathuis, H. et al. (2017). Predicting chaotic time series using machine learning techniques.
In Preproceedings of the 29th Benelux Conference on Artificial Intelligence (BNAIC 2017)
(pp. 326–340).
68. Madondo, M., & Gibbons, T. (2018). Learning and modeling chaos using LSTM recurrent neural networks. MICS 2018 Proceedings, Paper 26.
69. Maguire, L. P., et al. (1998). Predicting a chaotic time series using a fuzzy neural network.
Information Sciences, 112.1-4, 125–136.
70. Mariet, Z., & Kuznetsov, V. (2019). Foundations of sequence-to-sequence modeling for time
series. In The 22nd International Conference on Artificial Intelligence and Statistics, 408–417.
71. Masnadi-Shirazi, M., & Subramaniam, S. (2020). Attractor Ranked Radial Basis function
network: A nonparametric forecasting Approach for chaotic Dynamic Systems. Scientific
Reports, 10.1, 1–10.
72. Wang, X.-Y., & Han, M. (2012). Multivariate chaotic time series prediction based on extreme learning machine. Acta Physica Sinica, 8.
73. Mukherjee, S., Osuna, E., & Girosi, F. (1997). Nonlinear prediction of chaotic time series
using support vector machines. In Neural Networks for Signal Processing VII. Proceedings
of the 1997 IEEE Signal Processing Society Workshop (pp. 511–520). IEEE.
74. Nakai, K., & Saiki, Y. (2019). Machine-learning construction of a model for a macroscopic
fluid variable using the delay-coordinate of a scalar observable. arXiv:1903.05770.
75. Navone, H. D., & Ceccatto, H. A. (1995). Learning chaotic dynamics by neural networks.
Chaos, Solitons and Fractals, 6, 383–387.
76. Okuno, S., Aihara, K., & Hirata, Y. (2019). Combining multiple forecasts for multivariate
time series via state-dependent weighting. Chaos: An Interdisciplinary Journal of Nonlinear
Science, 29.3, 033128.
77. Patel, D., et al. (2021). Using machine learning to predict statistical properties of non-
stationary dynamical processes: System climate, regime transitions, and the effect of stochas-
ticity. Chaos: An Interdisciplinary Journal of Nonlinear Science, 31.3, 033149.
78. Pathak, J., et al. (2018). Hybrid forecasting of chaotic processes: Using machine learning in
conjunction with a knowledge-based model. Chaos: An Interdisciplinary Journal of Nonlinear
Science, 28.4, 041101.
79. Pathak, J., et al. (2018). Model-free prediction of large spatiotemporally chaotic systems from
data: A reservoir computing approach. Physical Review Letters, 120.2, 024102.
80. Pathak, J., et al. (2017). Using machine learning to replicate chaotic attractors and calculate
Lyapunov exponents from data. Chaos: An Interdisciplinary Journal of Nonlinear Science,
27.12, 121102.
81. Penkovsky, B., et al. (2019). Coupled nonlinear delay systems as deep convolutional neural
networks. Physical Review Letters, 123.5, 054101.
82. Principe, J. C., & Kuo, J.-M. (1995). Dynamic modelling of chaotic time series with neural
networks. Proceedings of the 7th International Conference on Neural Information Processing
Systems, 311–318.
83. Principe, J. C., Rathie, A., Kuo, J.-M. (1992). Prediction of chaotic time series with neural
networks and the issue of dynamic modeling. International Journal of Bifurcation and Chaos,
2.04, 989–996.
84. Principe, J. C., Wang, L., & Kuo, J.-M. (1998). Non-linear dynamic modelling with neural
networks. In Signal Analysis and Prediction (pp. 275–290). Springer.
85. Ranzato, M., et al. (2015). Sequence level training with recurrent neural networks.
arXiv:1511.06732.
86. Sangiorgio, M. (2021). Deep learning in multi-step forecasting of chaotic dynamics. Ph.D.
thesis. Department of Electronics, Information and Bioengineering, Politecnico di Milano.
87. Sangiorgio, M., & Dercole, F. (2020). Robustness of LSTM neural networks for multi-step
forecasting of chaotic time series. Chaos, Solitons and Fractals, 139, 110045.
88. Sangiorgio, M., Dercole, F., & Guariso, G. (2021). Forecasting of noisy chaotic systems with
deep neural networks. Chaos, Solitons & Fractals, 153, 111570.
89. Shi, X., et al. (2017). Chaos time-series prediction based on an improved recursive Levenberg-
Marquardt algorithm. Chaos, Solitons and Fractals, 100, 57–61.
90. Shi, Z., & Han, M. (2007). Support vector echo-state machine for chaotic time-series predic-
tion. IEEE Transactions on Neural Networks, 18.2, 359–372.
91. Shukla, J. (1998). Predictability in the midst of chaos: A scientific basis for climate forecasting.
Science, 282.5389, 728–731.
92. Su, L., & Li, C. (2015). Local prediction of chaotic time series based on polynomial coefficient
autoregressive model. Mathematical Problems in Engineering, 901807.
93. Su, L.-Y. (2010). Prediction of multivariate chaotic time series with local polynomial fitting.
Computers and Mathematics with Applications, 59.2, 737–744.
94. Teng, Q., & Zhang, L. (2019). Data driven nonlinear dynamical systems identification using
multi-step CLDNN. AIP Advances, 9.8, p. 085311.
95. Todorov, Y., Koprinkova-Hristova, P., & Terziyska, M. (2017). Intuitionistic fuzzy radial basis
functions network for modeling of nonlinear dynamics. In 2017 21st International Conference
on Process Control (PC) (pp. 410–415). IEEE.
96. Van Truc, N., & Anh, D. T. (2018). Chaotic time series prediction using radial basis function
networks. 2018 4th International Conference on Green Technology and Sustainable Devel-
opment (GTSD) (pp. 753–758). IEEE.
97. Verdes, P. F., et al. (1998). Forecasting chaotic time series: Global versus local methods. Novel
Intelligent Automation and Control Systems, 1, 129–145.
98. Vlachas, P. R., et al. (2018). Data-driven forecasting of high-dimensional chaotic systems
with long short-term memory networks. In Proceedings of the Royal Society A: Mathematical,
Physical and Engineering Sciences (Vol. 474.2213, p. 20170844).
99. Vlachas, P. R. et al. (2020). Backpropagation algorithms and Reservoir Computing in Recur-
rent Neural Networks for the forecasting of complex spatiotemporal dynamics. Neural Net-
works, 126, 191–217.
100. Wan, Z. Y., et al. (2018). Data-assisted reduced-order modeling of extreme events in complex
dynamical systems. PLoS One, 13.5, e0197704.
101. Wang, R., Kalnay, E., & Balachandran, B. (2019). Neural machine-based forecasting of
chaotic dynamics. Nonlinear Dynamics, 98.4, 2903–2917.
102. Weng, T., et al. (2019). Synchronization of chaotic systems and their machine-learning models.
Physical Review E, 99.4, 042203.
103. Woolley, J. W., Agarwal, P. K., & Baker, J. (2010). Modeling and prediction of chaotic systems
with artificial neural networks. International Journal for Numerical Methods in Fluids, 63.8,
989–1004.
104. Wu, K. J., & Wang, T. J. (2013). Prediction of chaotic time series based on RBF neural network
optimization. Computer Engineering, 39.10, 208–216.
105. Wu, X., et al. (2014). Multi-step prediction of time series with random missing data. Applied
Mathematical Modelling, 38.14, 3512–3522.
106. Xin, B., & Peng, W. (2020). Prediction for chaotic time series-based AE-CNN and transfer
learning. Complexity, 2680480.
107. Yanan, G., Xiaoqun, C., & Kecheng, P. (2020). Chaotic system prediction using data assimi-
lation and machine learning. In E3S Web of Conferences (Vol. 185, p. 02025).
108. Yang, H. Y. et al. (2006). Fuzzy neural very-short-term load forecasting based on chaotic
dynamics reconstruction. Chaos, Solitons and Fractals, 29.2, 462–469.
109. Yang, F.-P., & Lee, S.-J. (2008). Applying soft computing for forecasting chaotic time series.
In 2008 IEEE International Conference on Granular Computing (pp. 718–723), IEEE.
110. Yeh, J.-P. (2007). Identifying chaotic systems using a fuzzy model coupled with a linear plant.
Chaos, Solitons and Fractals, 32.3, 1178–1187.
111. Yeo, K. (2019). Data-driven reconstruction of nonlinear dynamics from sparse observation.
Journal of Computational Physics, 395, 671–689.
112. Yosinski, J. et al. (2014). How transferable are features in deep neural networks? In Proceed-
ings of the 28th Conference on Neural Information Processing Systems, 27, 3320–3328.
113. Yu, R., Zheng, S., & Liu, Y. (2017). Learning chaotic dynamics using tensor recurrent neural
networks. Proceedings of the ICML. In ICML 17 Workshop on Deep Structured Prediction.
114. Yuxia, H., & Hongtao, Z. (2012). Chaos optimization method of SVM parameters selection
for chaotic time series forecasting. Physics Procedia, 25, 588–594.
115. Zhang, C., et al. (2020). Predicting phase and sensing phase coherence in chaotic systems with
machine learning. Chaos: An Interdisciplinary Journal of Nonlinear Science, 30.8, 083114.
116. Zhang, J.-S., & Xiao, X.-C. (2000). Predicting chaotic time series using recurrent neural
network. Chinese Physics Letters, 17.2, 88.
117. Zhang, J., Shu-Hung Chung, H., & Lo, W.-L. (2008). Chaotic time series prediction using a
neuro-fuzzy system with time-delay coordinates. IEEE Transactions on Knowledge and Data
Engineering, 20.7, 956–964.
118. Zhu, Q., Ma, H., & Lin, W. (2019). Detecting unstable periodic orbits based only on time
series: When adaptive delayed feedback control meets reservoir computing. Chaos: An Inter-
disciplinary Journal of Nonlinear Science, 29.9, 093125.
Chapter 2
Basic Concepts of Chaos Theory and Nonlinear Time-Series Analysis
Abstract We introduce the basic concepts and methods to formalize and analyze
deterministic chaos, with links to fractal geometry. A chaotic dynamic is produced by
several kinds of deterministic nonlinear systems. We introduce the class of discrete-
time autonomous systems so that an output time series can directly represent data
measurements in a real system. The two basic concepts defining chaos are that of
attractor—a bounded subset of the state space attracting trajectories that originate in
a larger region—and that of sensitivity to initial conditions—the exponential diver-
gence of two nearby trajectories within the attractor. The latter is what makes chaotic
dynamics unpredictable beyond a characteristic time scale. This is quantified by the
well-known Lyapunov exponents, which measure the average exponential rates of
divergence (if positive) or convergence (if negative) of a perturbation of a reference
trajectory along independent directions. When a model is not available, an attrac-
tor can be estimated in the space of delayed outputs, that is, using a finite moving
window on the data time series as state vector along the trajectory.
These two features produce the aperiodic alternation between expansion and contraction phases of the distance between the system’s trajectories [13]. These are the two characterizing forces of chaotic dynamics, which take the names of “stretching” and “folding”.
In the following section (Sect. 2.1), we introduce the class of dynamical systems
that we use as models for real systems and we formalize the first of the above key
properties, i.e., the boundedness of the system’s trajectories on a so-called attractor.
We opt for a description in discrete time only, because it is more directly applicable
to data in real systems. Analogous concepts for continuous-time models can be found
in standard textbooks, such as [1, 10]. Then, in Sect. 2.2, we formalize the second key
property of chaos, the sensitivity to initial conditions, in terms of the well-known
Lyapunov exponents. Based on the two concepts of system’s attractors and their
Lyapunov exponents, in Sect. 2.3 we finally define a chaotic attractor and we discuss
its typical fractal geometry. Indeed, chaotic attractors typically present self-similar
complex structures and have a non-integer dimension. For this reason, they are also
called strange attractors. Section 2.4 extends the two key concepts of attractor and
Lyapunov exponents to the case in which a data time series is available, instead of a
mathematical model of the system.
2.1 Dynamical Systems and Their Attractors

Let us consider a generic discrete-time system (also called map), whose evolution in one time step is defined by the difference equation

$$x(t+1) = f(x(t)), \qquad (2.1)$$

where $x(t)$ is the $n$-dimensional state vector $[x_1(t), x_2(t), \ldots, x_n(t)]^\top$, and $f(\cdot): \mathbb{R}^n \to \mathbb{R}^n$ is the vector-valued function (or map) whose components $f_1(\cdot), f_2(\cdot), \ldots, f_n(\cdot)$ set the right-hand sides of the state equations in (2.1).
Given an initial condition x0 at time zero, i.e., x(0) = x0 , the sequence of state
points x(t), t = 0, 1, . . . , T is the arc of the system’s trajectory, forward in time (T >
0), that originates from x0 . If function f (·) is invertible everywhere, the trajectory is
uniquely defined also backward in time and the system is said to be reversible.
Note that system (2.1) is autonomous and time-independent, i.e., there is no exter-
nal input and no time-dependence explicitly affecting function f (·). Such dependen-
cies can be taken into account by letting some of the parameters defining function
f (·) be externally driven, as done in Chap. 3.
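As a concrete illustration, the following sketch (helper names and settings are ours) iterates a map of the form (2.1), using the delayed form of the Hénon map, anticipated in Chap. 3, as $f(\cdot)$:

```python
import numpy as np

def henon(x, a=1.4, b=0.3):
    """One step of the Hénon map (delayed form), an instance of f(.) in (2.1)."""
    return np.array([1.0 - a * x[0] ** 2 + b * x[1], x[0]])

def trajectory(f, x0, T):
    """Arc of trajectory x(0), ..., x(T) obtained by iterating the map f."""
    xs = np.empty((T + 1, len(x0)))
    xs[0] = x0
    for t in range(T):
        xs[t + 1] = f(xs[t])
    return xs

xs = trajectory(henon, np.array([0.0, 0.0]), 10_000)
print(xs[-3:])   # late points, settled on the chaotic attractor
```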
In the long term, besides the possibility of diverging to infinity, which should
never be the case in a realistic model of a real system, the trajectories of system (2.1)
typically converge toward a set of state values, which takes the name of attractor.
Formally speaking, an attractor is a closed subset A of the state space characterized
by the following three properties:
A complete treatment of the stability of equilibria, cycles, tori, and strange objects is out of the scope of this chapter, and the reader is referred to the above-mentioned standard textbooks [1, 10]. In particular, we will only consider systems with a unique attractor, or limit the analysis to the initial conditions in the basin of the attractor that is relevant for the case under study. Because our aim is to design neural predictors able to perform in the most complex conditions, we will consider the case of chaotic attractors, which are strange attractors characterized by the property of being sensitive to initial conditions. As already anticipated, this property is quantified by the so-called Lyapunov exponents of the attractor, which are the topic of the next section.
2.2 Lyapunov Exponents

The formal way to quantify the extent to which a given trajectory is sensitive to the perturbation of its initial conditions is by its so-called Lyapunov exponents (LEs). Intuitively, in nonlinear systems, the LEs play the same role as the eigenvalues in linear systems.

To compute the LEs associated with a trajectory $x(t)$, $t \ge 0$, taken as reference, it is necessary to define a perturbed trajectory $x(t) + \delta x(t)$ and to linearize the evolution of the perturbation vector $\delta x(t)$ along the reference trajectory $x(t)$. The result is the so-called variational equation

$$\delta x(t+1) = J(x(t))\,\delta x(t), \qquad (2.2)$$

where $J(x)$ denotes the Jacobian matrix of $f(\cdot)$ evaluated at $x$.
The solution of (2.2) is

$$\delta x(t) = M(t)\,\delta x(0), \qquad (2.5)$$

where $M(t) = J(x(t-1)) \cdots J(x(1))\,J(x(0))$ is the product of the Jacobians evaluated along the reference trajectory. In other words, the $i$-th column of $M(t)$ is the solution of (2.2) at time $t$ starting from the natural basis vector $\delta x(0) = v^{(i)}$, with $v^{(i)}_j = 0$ for $j \neq i$ and $v^{(i)}_i = 1$, $i = 1, \ldots, n$.
The matrix $M(t)$ is generically nonsingular (it is certainly so for reversible systems) and defines the ellipsoid at time $t$. Indeed, from (2.5), the sphere of initial perturbations of radius $\varepsilon$ is mapped at time $t$ onto the ellipsoid of points $r$ satisfying $r^\top E(t)\,r = \varepsilon^2$, with $E(t) = \big(M(t)\,M(t)^\top\big)^{-1}$.

The ellipsoid’s symmetry axes are the eigenvectors of matrix $E(t)$. Geometrically, a symmetry axis is a vector $r$ from the ellipsoid center that is aligned with the gradient $E(t)\,r$ of the quadratic form $\frac{1}{2}\,r^\top E(t)\,r$. Denoting by $r^{(i)}(t)$ the symmetry axis associated with the eigenvalue $\lambda_i(t)$ of $E(t)$, the alignment translates into $E(t)\,r^{(i)}(t) = \lambda_i(t)\,r^{(i)}(t)$, where $1/\sqrt{\lambda_i(t)} = \sigma_i(t)$ are the so-called singular values of $M(t)$. Since the tip of each symmetry axis lies on the ellipsoid,

$$r^{(i)}(t)^\top E(t)\,r^{(i)}(t) = \varepsilon^2. \qquad (2.9)$$
Substituting $r^{(i)}(t) = (1/\sigma_i^2(t))\,E(t)^{-1} r^{(i)}(t)$ (the eigenvector equation) in lieu of the second occurrence of $r^{(i)}$ into Eq. (2.9) yields

$$r^{(i)}(t)^\top r^{(i)}(t) = \varepsilon^2 \sigma_i^2(t), \quad \text{i.e.,} \quad r_i(t) = \|r^{(i)}(t)\| = \varepsilon\,\sigma_i(t). \qquad (2.10)$$

Note that the initial perturbations $r^{(i)}(0)$, $i = 1, \ldots, n$, that are mapped by the variational equation (2.2) into the ellipsoid’s symmetry axes at time $t$ (see Fig. 2.1), are the eigenvectors of $M(t)^\top M(t)$ associated with the same singular value $\sigma_i(t)$ of $M(t)$. Indeed, by left-multiplying both sides of $r^{(i)}(0) = M(t)^{-1} r^{(i)}(t)$ by $(M(t)^{-1})^\top$, exploiting at the right-hand side the eigen-property of $r^{(i)}(t)$ and Eq. (2.5) to get $(M(t)^{-1})^\top r^{(i)}(0) = (1/\sigma_i^2(t))\,M(t)\,r^{(i)}(0)$, and left-multiplying both sides by $M(t)^\top$, we get

$$r^{(i)}(0) = (1/\sigma_i^2(t))\,M(t)^\top M(t)\,r^{(i)}(0). \qquad (2.11)$$
The LEs are then defined as the average exponential rates of growth of the ellipsoid’s symmetry axes:

$$L_i = \lim_{t \to \infty} \frac{1}{t} \log \frac{r_i(t)}{\varepsilon}, \quad i = 1, \ldots, n. \qquad (2.12)$$

Indeed, $r_i(t)/\varepsilon = \prod_{k=0}^{t-1} r_i(k+1)/r_i(k)$ is the product of all the $t$ rates of growth of the $i$-th axis along the reference trajectory, and the $(1/t)$-power takes the geometric average (e.g., the average between the rate 2, doubling in one unit of time, and the rate 1/2, halving in one unit of time, is $(2 \times 0.5)^{1/2} = 1$). The limits in (2.12) exist and are finite provided the reference solution $x(t)$ does exist. Note that, by definition, $L_1 \ge L_2 \ge \cdots \ge L_n$, though one does not know a priori the length ordering among the ellipsoid’s symmetry axes. Actually, the ordering might change along the trajectory $x(t)$ (especially in the initial phase), so that, in principle, one first computes the limits following the ellipsoid’s symmetry axes in arbitrary order and then sorts the results (see [15] for a review of the algorithms to numerically compute the LEs). In the following, we will refer to the first symmetry axis $r^{(1)}(t)$ as the longest, to $r^{(2)}(t)$ as the second-longest, etc., i.e., $r_1(t) \ge r_2(t) \ge \cdots \ge r_n(t)$, for sufficiently large $t$.
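For concreteness, the following sketch implements one standard member of the algorithm family reviewed in [15], for a map with a known Jacobian: an orthonormal basis of perturbations is propagated by the variational equation and re-orthonormalized at every step via a QR factorization, and the logs of the local growth rates (the diagonal of $R$) are averaged as in (2.12). The Hénon map and all numerical settings are our illustrative choices.

```python
import numpy as np

def henon(x, a=1.4, b=0.3):
    return np.array([1.0 - a * x[0] ** 2 + b * x[1], x[0]])

def henon_jac(x, a=1.4, b=0.3):
    return np.array([[-2.0 * a * x[0], b],
                     [1.0, 0.0]])

def lyapunov_spectrum(f, jac, x0, T, discard=1000):
    """LEs via QR re-orthonormalization of a propagated perturbation basis."""
    x = x0
    for _ in range(discard):                # reach the attractor first
        x = f(x)
    n = len(x0)
    Q, sums = np.eye(n), np.zeros(n)
    for _ in range(T):
        Q, R = np.linalg.qr(jac(x) @ Q)     # propagate and re-orthonormalize
        sums += np.log(np.abs(np.diag(R)))  # accumulate local growth rates
        x = f(x)
    return sums / T                         # time averages, cf. (2.12)

les = lyapunov_spectrum(henon, henon_jac, np.array([0.0, 0.0]), 100_000)
print(les)   # roughly [0.42, -1.62]; their sum equals log(0.3) = log|det J|
```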
The LEs hence measure the average exponential rate of divergence (if positive) or convergence (if negative) of perturbed trajectories with respect to the reference one (the one starting from $x(0)$). Generically, the perturbation $\delta x(t)$ will have a nonzero component along the longest symmetry axis $r^{(1)}(t)$, so that $L_1$, the largest Lyapunov exponent (LLE), is the dominant rate (average, exponential) of divergence/convergence of the approximate perturbed trajectory (approximated by the variational equation (2.2)). That is, the size of the perturbation grows/decays (on average) as $\varepsilon \exp(L_1 t)$. This is approximately true also for the true perturbed trajectory (the solution of the nonlinear system (2.1) from the perturbed initial condition $x(0) + \delta x(0)$) only if the size $\varepsilon$ of the initial perturbation is so small that the perturbed trajectory remains close to the reference one up to time $t$. However, if $L_1 > 0$, the perturbed trajectory sooner or later leaves the reference one, so that the linearization taken to define the LEs no longer represents the behavior of the perturbation.
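This saturation is easy to observe by simulating two nearby trajectories of the nonlinear system directly; a toy check on the logistic map (for which $L_1 = \ln 2$ at growth rate 4) might look as follows.

```python
import numpy as np

def logistic(x, r=4.0):
    return r * x * (1.0 - x)

eps = 1e-12
x, xp = 0.3, 0.3 + eps           # reference and perturbed initial conditions
for t in range(1, 41):
    x, xp = logistic(x), logistic(xp)
    if t % 10 == 0:
        # actual separation vs. the linearized prediction eps * exp(L1 * t)
        print(t, abs(xp - x), eps * np.exp(np.log(2.0) * t))
```

The actual separation matches $\varepsilon \exp(L_1 t)$ only on average and only while it remains small; once it approaches the attractor size, the linearized prediction keeps growing while the true separation stays bounded.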
The interpretation of the remaining exponents is more delicate. $L_2$ is the greatest rate chosen from all directions orthogonal to $r^{(1)}(t)$. Similarly, $L_3$ is the greatest rate chosen from all directions orthogonal to both $r^{(1)}(t)$ and $r^{(2)}(t)$, and so on. However, only those initial perturbations with no component along $r^{(1)}(0)$ (the initial perturbation mapped into $r^{(1)}(t)$ by the variational equation (2.2)) generate (approximate) perturbations at time $t$ orthogonal to $r^{(1)}(t)$. The problem is that considering a different time $t$ modifies the initial perturbation $r^{(1)}(0)$ (it is an eigenvector of the time-dependent matrix $M(t)^\top M(t)$), so that the (approximate) perturbation $\delta x(t)$ still grows/decays (on average) as $\varepsilon \exp(L_1 t)$.

The correct interpretation of the non-leading exponents is in terms of partial sums. The sum $L_1 + L_2$ is the dominant rate (average, exponential) for the area of two-dimensional sets of perturbations, as the components along $r^{(3)}(t), \ldots, r^{(n)}(t)$ are dominated (for sufficiently large $t$) by those along $r^{(1)}(t)$ and $r^{(2)}(t)$ for all the perturbations in the set. Similarly, $\sum_{i=1}^{k} L_i$ is the dominant rate (average, exponential) for the $k$-dimensional measure (or hypervolume) of $k$-dimensional sets of perturbations.
Finally, if the trajectory $x(t)$ converges to an attractor $A$, the resulting limits are generically independent of the initial condition $x(0)$ in the basin of attraction of $A$. This is certainly true for attracting equilibria, cycles, and tori, whereas it is only generic (i.e., true for almost all initial conditions in the basin of attraction) for strange attractors. In principle, starting from an initial condition on the stable manifold of one of the saddle objects composing the attractor’s backbone, the resulting LEs are those characterizing the reached saddle, though unavoidable numerical errors make the trajectory miss the saddle and generically visit the whole attractor.

Exponentially attracting equilibria and cycles have negative LEs, whereas some LEs are null if the speed of attraction is less than exponential (e.g., a time power law); $d$-dimensional tori have $d$ null exponents, because the internal perturbations are neither expanded nor contracted (on average), the remaining ones being negative in case of exponential attraction; strange attractors typically have at least one positive exponent, responsible for the divergence of the unstable objects in their backbone (this exponent is null if the speed of divergence is weaker than exponential; these
attractors are called weakly chaotic). The sum of the attractor’s exponents is in any case negative, at least in reversible systems, because $n$-dimensional volumes of initial conditions in the basin of attraction collapse, forward in time, onto a lower-dimensional object. As will be recalled in Sect. 2.3, a chaotic attractor is formally defined as an attractor characterized by $L_1 > 0$.
The LEs can also be defined locally, over a single time step: the local Lyapunov exponents (loc-LEs) at a state $x$ are

$$L_i^{(\mathrm{loc})}(x) = \log \left.\frac{r_i(1)}{\varepsilon}\right|_{x(0)=x}. \qquad (2.13)$$
From the ellipsoid analysis developed in Subsect. 2.2.1, it follows that the symmetry axes at time 1 are the eigenvectors of $J(x(t))\,J(x(t))^\top$. The loc-LEs are therefore the logs of the singular values of the system’s Jacobian at $x(t)$.
Note that the (average) LEs are not the (arithmetic) time average of the loc-LEs along the reference trajectory. For example, $L_1^{(\mathrm{loc})}(x(t))$ is, at any $t$, the log of the largest local rate along $r^{(\mathrm{loc}\text{-}1)}(x(t))$. In general, however, the longest symmetry axis $r^{(1)}(t)$ of the ellipsoid originated at $x(0)$ is not aligned, at time $t$, with $r^{(\mathrm{loc}\text{-}1)}(x(t))$, so that only its component along $r^{(\mathrm{loc}\text{-}1)}(x(t))$ is subject to the largest rate. Consequently, the time average of $L_1^{(\mathrm{loc})}(x(t))$ is generically larger than $L_1$. Similarly, the symmetry axis $r^{(i)}(t)$, $i \ge 2$, is not aligned, at time $t$, with $r^{(\mathrm{loc}\text{-}i)}(x(t))$ and generically has a nonzero component along $r^{(\mathrm{loc}\text{-}j)}(x(t))$, $j < i$, so that there is not even a clear sign relation between the time average of $L_i^{(\mathrm{loc})}(x(t))$ and $L_i$.
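In code, the loc-LEs of (2.13) reduce to the logs of the singular values of the one-step Jacobian; a sketch for the Hénon map (our example choice) is:

```python
import numpy as np

def henon_jac(x, a=1.4, b=0.3):
    return np.array([[-2.0 * a * x[0], b],
                     [1.0, 0.0]])

def local_les(jac, x):
    """loc-LEs at state x: logs of the singular values of the Jacobian, cf. (2.13)."""
    return np.log(np.linalg.svd(jac(x), compute_uv=False))

print(local_les(henon_jac, np.array([0.6, 0.2])))
```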
Fig. 2.3 Example of a chaotic (blue) and a random (red) time series. Although they seem to behave similarly (a), the blue dynamic is deterministic, as demonstrated by the mapping between $y(t-1)$ and $y(t)$ (b)
The divergence of nearby trajectories can be slower than exponential and therefore yield a zero limit in (2.12) for $L_1$. Hence, chaotic attractors are also strange, while strange attractors can be non-chaotic. Strange non-chaotic attractors are also called “weakly chaotic”.
In chaotic regime, the generic system’s trajectory shows some peculiar properties:
• Despite its random appearance, a chaotic process is deterministic [2] (see Fig. 2.3),
i.e., with the exact same initial conditions, it will always evolve over time in
the same way. However, numerical issues (e.g., round-off errors) can generate
dynamics that seem to suggest the presence of some source of randomness [12,
21]. For instance, two numerical simulations of the same system, starting from the same initial condition, performed on different hardware architectures, can result in two distinct trajectories visiting the attractor in apparently unrelated ways (a minimal return-map check of determinism is sketched after this list).
• The trajectories never return to a state already visited (i.e., non-periodicity), but
pass arbitrarily close to it. This happens provided that one waits for a sufficient
amount of time, because the attractor contains a dense trajectory. In principle,
a chaotic system may have sequences of values that exactly repeat themselves
(periodic behavior). However, as described in Sect. 2.1, such periodic sequences
are repelling rather than attracting, meaning that if the variable is outside the
sequence, it will never enter the sequence and, in fact, will diverge from it. In these terms, for almost all initial conditions, the variable evolves chaotically with non-periodic behavior. The non-periodicity typical of chaotic dynamics must not be confused with that of quasi-periodic signals. The difference between the two is usually investigated in the frequency domain. The spectrum of chaotic signals is typically distributed over a range of values, while that of quasi-periodic signals is concentrated on a finite group of (few) specific values.
• The trajectories display complex paths, usually building geometries with a fractal
structure: the strange attractors presented in Sect. 2.1.
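A minimal numerical version of the determinism check of Fig. 2.3b (our own sketch, with illustrative choices for the series and the binning): on the return map $(y(t-1), y(t))$, the conditional spread of $y(t)$ given $y(t-1)$ is small for a deterministic series, because the points lie on a thin curve, and large for noise.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000

# chaotic series (logistic map, r = 4) vs. i.i.d. uniform noise
y_chaos = np.empty(n)
y_chaos[0] = 0.3
for t in range(n - 1):
    y_chaos[t + 1] = 4.0 * y_chaos[t] * (1.0 - y_chaos[t])
y_noise = rng.uniform(0.0, 1.0, n)

def return_map_spread(y, bins=20):
    """Average conditional spread of y(t) given y(t-1), binned on [0, 1]."""
    idx = np.digitize(y[:-1], np.linspace(0.0, 1.0, bins + 1))
    spreads = [y[1:][idx == b].std() for b in range(1, bins + 1)
               if np.sum(idx == b) > 1]
    return float(np.mean(spreads))

print("chaos:", return_map_spread(y_chaos))  # small: y(t) is a function of y(t-1)
print("noise:", return_map_spread(y_noise))  # ~0.29, the std of a uniform
```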
In the context of time series forecasting, the presence of stretching is the critical factor, even if one has the perfect predictor available. Indeed, the exponential sensitivity to initial conditions soon amplifies the tiniest measurement error or undesired noise in the data feeding the predictor. This is why we often say that “chaotic dynamics are unpredictable”. The focus of this book is actually to investigate how far we can go.
Starting from $L_1$, it is possible to compute the Lyapunov time (LT) of the system, which represents the characteristic time scale on which the dynamics are chaotic. By convention, the LT is defined as the inverse of $L_1$, i.e., the time needed for the distance between nearby trajectories of the system to increase by a factor of $e$. The concept of LT is crucial in forecasting tasks, because it mirrors the limits of predictability of a chaotic system and allows a fair comparison of the predictive accuracy across different systems [5, 14, 17, 18, 22].
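For instance, for the logistic map with growth rate 4, $L_1 = \ln 2 \approx 0.693$ per step, so LT $\approx 1.44$ steps: an initial error of $10^{-12}$ grows to the order of $10^{-3}$ in about 30 steps, since $10^{-12} \cdot e^{0.693 \cdot 30} \approx 10^{-3}$, however accurate the predictor is.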
In accordance with what we have done in Sect. 2.2, it is also possible to define the
local Lyapunov time (loc-LT).
A trajectory arc $x(0), x(1), \ldots, x(T)$ visiting the attractor $A$ can be used to estimate the attractor’s dimension through the correlation function

$$C(r, T) = \frac{2}{T(T+1)} \sum_{t_1 < t_2} \Theta\big(r - \|x(t_1) - x(t_2)\|\big), \qquad (2.14)$$

where $\Theta$ is the Heaviside step function (so that the sum counts the pairs of trajectory points closer than $r$) and $T(T+1)/2$ is the number of point pairs in the trajectory arc. The correlation function increases from 0 to 1 as $r$ increases from 0 to the maximum distance between the points in the trajectory. Let $d$, possibly non-integer, be the attractor’s dimension to be determined. For small $r$, the number of points $x(t)$ that are $r$-close to a given point $x \in A$ increases linearly with the length $T$ of the trajectory arc and proportionally to $r^d$. Think, for example, of the case $d = 1$ (a one-dimensional closed curve densely filled by a quasi-periodic trajectory). The segment of the curve contained in the $r$-ball centered at $x$ has a length proportional to $r^d$ and is filled by the trajectory as $T$ goes to infinity. For large $T$ and small $r$, the numerator of the correlation function is hence proportional to $T^2 r^d$ (there are $T$ points, each with a number of $r$-close neighbors proportional to $T r^d$), while the denominator has order $T^2$ (see (2.14)). Solving for $d$ gives the correlation dimension

$$d_{\mathrm{corr}}(A) = \lim_{r \to 0} \lim_{T \to \infty} \frac{\log C(r, T)}{\log r}. \qquad (2.15)$$
In practice, we work with finite r and T . Starting from an initial condition x(0) in
the basin of attraction of the target attractor, we discard the initial trajectory transient
(say, the first t0 points) and then take T + 1 points x(t0 ), x(t0 + 1), . . . , x(t0 + T ), for
sufficiently large t0 and T . We then compute the correlation function for a sequence
of small increasing values of the radius r . Plotting log C(r ) versus log r , there will be
an interval of r in which the obtained points will approximately align. The correlation
dimension dcorr (A) is given by the slope of this linear part of the graph (see, Fig. 2.4).
To well identify the linear part of the graph, it is worth to discuss how the graph gets
distorted for too small and too large values of the radius r . The behavior for small r
is due to the finite resolution with which our finite arc of trajectory fills the attractor.
When r is smaller than such a resolution, the correlation function underestimates
the density of the attractor points within distance r and scaling this underestimate
with r d necessarily results in a d larger than the actual attractor’s dimension. This is
evident in the expression (2.15) by considering smaller and smaller r at constant T :
C(r, T ) vanishes below a resolution threshold on r , so that the numerator diverges
to minus infinity, while the denominator is still finite. Vice versa, for too large values
of r, the boundedness of the attractor makes the correlation function saturate at
1 (independently of how large T is) and the slope of the graph correspondingly
vanishes (see again Fig. 2.4).
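A minimal sketch of this procedure, assuming the trajectory arc is already available as an array and leaving the choice of the radii grid to the analyst:

```python
import numpy as np

def correlation_function(traj, r):
    """Fraction of the T(T+1)/2 point pairs of the arc closer than r (Eq. (2.14))."""
    dists = np.linalg.norm(traj[:, None, :] - traj[None, :, :], axis=-1)
    iu = np.triu_indices(len(traj), k=1)   # distinct pairs only
    return np.mean(dists[iu] <= r)

def correlation_dimension(traj, radii):
    """Slope of log C(r) versus log r over the supplied (assumed linear) radii."""
    C = np.array([correlation_function(traj, r) for r in radii])
    mask = C > 0                           # radii below the resolution give C = 0
    slope, _ = np.polyfit(np.log(radii[mask]), np.log(C[mask]), 1)
    return slope
```

In practice, the fit should be restricted to the radii over which the points of the log-log graph visually align, for the reasons just discussed.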
When the full spectrum of the attractor’s LEs is available, a simple way of esti-
mating the attractor’s dimension is the well-known Kaplan-Yorke formula [9], which
approximates the concept of Lyapunov dimension. The geometric idea behind the
concept is related to the interpretation given in Subsect. 2.2.1 of the partial sums
of the attractor’s LEs. According to that interpretation, the Lyapunov dimension
dLyap (A) is the dimension of a small set of initial conditions, in the basin of attraction,
with dLyap (A)-dimensional measure (hypervolume) that neither grows nor decays (on
average), as the corresponding trajectories converge to the attractor. Because the
trajectories densely visit the attractor, this occurs when the set of initial conditions has
the same dimension as the attractor.
The dimension could, however, be non-integer, so that the best one can do is to
look for the index k for which \sum_{i=1}^{k} L_i \ge 0 and \sum_{i=1}^{k+1} L_i < 0. The hypervolume
of a k-dimensional set of initial conditions grows (on average), while that of a
(k + 1)-dimensional set vanishes while converging to the attractor, so that all we
can say is that the dimension d is between k and k + 1. As illustrated in Fig. 2.5,
the Kaplan-Yorke formula simply considers a continuous piecewise-linear interpolation
of the partial sums of the attractor's LEs. The Lyapunov dimension d_{Lyap}(A) is
therefore approximated by the value at which the piecewise-linear interpolation crosses
the horizontal axis, the latter representing a zero rate of expansion/contraction.
The result is the following formula:

d_{KY}(A) = k + \frac{\sum_{i=1}^{k} L_i}{|L_{k+1}|}.   (2.16)
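As a sanity check of (2.16), a short sketch (it uses the LE spectrum of the 3D generalized Hénon map reported in Chap. 3 as an example, and assumes the index k defined above exists):

```python
import numpy as np

def kaplan_yorke_dimension(les):
    """Kaplan-Yorke dimension (2.16) from the LEs sorted in decreasing order."""
    les = np.sort(np.asarray(les))[::-1]
    partial = np.cumsum(les)                           # partial sums of the LEs
    k = int(np.max(np.nonzero(partial >= 0)[0])) + 1   # largest k with sum >= 0
    return k + partial[k - 1] / abs(les[k])

print(kaplan_yorke_dimension([0.276, 0.257, -4.040]))  # ~2.13
```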
The concepts presented in the first part of this chapter require knowledge of the
state-space representation of the dynamical system. In practical applications, the equations
describing the evolution of the state variables (and also the whole state vector itself)
are usually unknown. Most of the time, only a time series of observed values of the
system output, y(t) = g(x(t)), is available.
Fig. 2.6 1- and 2-dimensional embeddings of a noise-free dataset obtained simulating the Hénon
map (presented in Sect. 3.1), with y(t) = x1 (t). The points A and B are true neighbors, while B
and C are false neighbors
Repeating this check for all the points in the dataset, we will conclude that, in this
specific case, a 2-dimensional space is sufficient to unfold the attractor.
Practically speaking, two points are considered as neighbors when their Euclidean
distance is lower than a certain threshold, which has to be properly selected depending
on the specific application. The algorithm checks the neighbors in increasing
embedding dimensions until it finds only a negligible number of false neighbors
(Fig. 2.7). The lowest such embedding dimension is the one chosen, since it is presumed
to give a reconstruction without self-intersections between the trajectories.
If the data are noise-free, the percentage of false neighbors will drop to zero when
the proper dimension is reached. In the case of noisy time series, we expect that a
certain (hopefully small) number of false neighbors remain even for large embedding
dimensions, due to the presence of noise.
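A minimal sketch of one possible formalization of this procedure (the separation threshold thresh, the tolerated fraction tol of false neighbors, and the maximum dimension m_max are illustrative assumptions, not values prescribed by the text):

```python
import numpy as np

def delay_embed(y, m):
    """Delayed vectors Y(t) = [y(t), y(t-1), ..., y(t-m+1)] from a scalar series."""
    return np.column_stack([y[m - 1 - k : len(y) - k] for k in range(m)])

def false_neighbor_fraction(y, m, thresh):
    """Fraction of m-dim nearest neighbors that separate beyond thresh in m+1 dims."""
    Ym, Ym1 = delay_embed(y, m), delay_embed(y, m + 1)
    Ym = Ym[-len(Ym1):]                    # align the two embeddings in time
    false = 0
    for i in range(len(Ym1)):
        d = np.linalg.norm(Ym - Ym[i], axis=1)
        d[i] = np.inf                      # exclude the point itself
        j = int(np.argmin(d))              # nearest neighbor in m dimensions
        false += np.linalg.norm(Ym1[i] - Ym1[j]) > thresh
    return false / len(Ym1)

def embedding_dimension(y, thresh=0.1, tol=0.01, m_max=10):
    """Smallest m for which the fraction of false neighbors becomes negligible."""
    for m in range(1, m_max + 1):
        if false_neighbor_fraction(y, m, thresh) < tol:
            return m
    return m_max
```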
Takens’ theorem also states that, generically, m has to be only a little larger (more
precisely, the first integer larger) than twice the dimension of the attractor. To
develop a geometric intuition about this fact, we can consider the following simple
example. Two 1-dimensional curves can have intersections in either R2 , R3 or any
higher-dimensional space. The difference is that a small perturbation will remove
the intersection in R3 , while in R2 it will only move the intersection somewhere
else. We can imagine that the two curves correspond to two invariant strands of the
attractor, e.g., two arcs belonging to the reconstruction of a 1-dimensional torus with
a quasi-periodic motion on it. A slight perturbation in the system dynamics would
generically cause the self-intersection in R3 to disappear.
In other words, although it is certainly possible for two generic curves placed at
random to have intersections in Rk for k ≥ 3, we should consider them as exceptional
situations and expect them to occur with essentially zero probability. This is the reason
why we used the term “generically” above: if m > 2d, self-crossings should
in principle be rare, but they could still occur. When this is the case, it is necessary
to further increase m, even if it is already greater than 2d.
As reported in Sect. 2.2, the traditional algorithm for the computation of the LEs
makes use of the Jacobian matrix and thus requires knowledge of the state equations
of the system. However, when the model of the system is not available, it is necessary
to make use of statistical methods for detecting chaos [7].
Once an appropriate embedding dimension m of the dataset has been estimated,
the dynamics of the system can be analyzed in the delayed phase space
Y(t) = [y(t), y(t − 1), . . . , y(t − m + 1)], which shares the same topological properties
(and thus also L_1) as the original phase space, as stated by Takens’ theorem
[20]. After that, the following procedure [23] can be performed to estimate L_1:
1. select N p pairs of nearby points in the m-dimensional delayed space, Y (ti ) and
Y (t j );
2. for each pair, compute the Euclidean distance between the two points, δ_p(0) =
‖Y(t_i) − Y(t_j)‖;
3. recompute the distance between the two points of each pair after t_e time steps,
δ_p(t_e) = ‖Y(t_i + t_e) − Y(t_j + t_e)‖. On average, for small values of t_e, we expect
that δ_p(t_e) evolves following the exponential law

\delta_p(t_e) \approx \delta_p(0) \, e^{L_1 t_e};   (2.17)
4. calculate Q(t_e), namely the average logarithm of the divergence rate, as

Q(t_e) = \frac{1}{N_p} \sum_{p=1}^{N_p} \log \frac{\delta_p(t_e)}{\delta_p(0)}.   (2.18)
Repeating the procedure for increasing values of the expansion step te , we can plot
Q(te ) as a function of te . In a chaotic system, the initial part of this curve is charac-
terized by a linear trend (see Fig. 2.8). This is due to the fact that two points initially
close together will diverge exponentially, and that this exponential expansion pro-
duces a linear trend because of the logarithm. After that, the average divergence tends
to a constant value because the trajectories lie inside the chaotic attractor. Finally,
L_1 can be easily computed as the slope of the function Q(t_e) in its initial part.
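A compact sketch of steps 1–4 (the random sampling of nearby pairs and the neighborhood size eps are illustrative choices; with too small an eps, the search may fail to find enough pairs):

```python
import numpy as np

def divergence_curve(Y, n_pairs, te_max, eps, rng=np.random.default_rng(0)):
    """Average log divergence Q(te), Eq. (2.18), of initially nearby pairs of Y."""
    n = len(Y) - te_max
    pairs = []
    while len(pairs) < n_pairs:            # step 1: sample nearby pairs
        i, j = rng.integers(0, n, size=2)
        if i != j and np.linalg.norm(Y[i] - Y[j]) < eps:
            pairs.append((i, j))
    Q = np.empty(te_max + 1)
    for te in range(te_max + 1):           # steps 2-4 for each expansion step
        Q[te] = np.mean([np.log(np.linalg.norm(Y[i + te] - Y[j + te]) /
                                np.linalg.norm(Y[i] - Y[j])) for i, j in pairs])
    return Q

# L1 is then the slope of the initial, linear part of the curve, e.g.:
# L1, _ = np.polyfit(np.arange(te0), Q[:te0], 1)
```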
The algorithm presented in this subsection only provides an estimate of L_1.
Alternative procedures for computing the largest exponent are available [6, 16, 24].
In principle, one could also make use of more complex procedures which allow one to
numerically compute the positive part of the Lyapunov spectrum, i.e., only the positive
exponents [23], or the whole spectrum of LEs [3, 4, 19]. However, even in this last case, only the
estimates of the first k exponents make sense. Going further would produce arbitrary
values since the data are typically sampled when the trajectories are already in the
attractor, meaning that they are representative of what happens inside it, while the
LEs beyond the k-th (k + 1, k + 2, …) describe the transient of convergence toward
the attractor. In any case, the dimension of the original state space remains unknown,
thus we do not even know how many LEs should be computed.
In this book, we limit the analysis to L_1 because the computation of the full LE
spectrum presents computational issues which strongly affect the numerical stability
of the estimation [23]. Note, however, that in the context of time series prediction, we
are interested in deriving the Lyapunov time LT, which sets the limit of predictability
of the chaotic system. To this end, it is sufficient to know L_1, since it defines the
system's dominant dynamics.
References
12. Nepomuceno, E. G., et al. (2019). Soft computing simulations of chaotic systems. International Journal of Bifurcation and Chaos, 29(8), 1950112.
13. Ott, E. (2002). Chaos in dynamical systems. Cambridge University Press.
14. Pathak, J., et al. (2018). Model-free prediction of large spatiotemporally chaotic systems from data: A reservoir computing approach. Physical Review Letters, 120(2), 024102.
15. Ramasubramanian, K., & Sriram, M. S. (2000). A comparative study of computation of Lyapunov spectra with different algorithms. Physica D: Nonlinear Phenomena, 139(1–2), 72–86.
16. Rosenstein, M. T., Collins, J. J., & De Luca, C. J. (1993). A practical method for calculating largest Lyapunov exponents from small data sets. Physica D: Nonlinear Phenomena, 65(1–2), 117–134.
17. Sangiorgio, M. (2021). Deep learning in multi-step forecasting of chaotic dynamics. Ph.D. thesis, Department of Electronics, Information and Bioengineering, Politecnico di Milano.
18. Sangiorgio, M., & Dercole, F. (2020). Robustness of LSTM neural networks for multi-step forecasting of chaotic time series. Chaos, Solitons and Fractals, 139, 110045.
19. Sano, M., & Sawada, Y. (1985). Measurement of the Lyapunov spectrum from a chaotic time series. Physical Review Letters, 55(10), 1082.
20. Takens, F. (1981). Detecting strange attractors in turbulence. In Dynamical systems and turbulence, Warwick 1980 (pp. 366–381). Springer.
21. Ushio, T., & Hsu, C. (1987). Chaotic rounding error in digital control systems. IEEE Transactions on Circuits and Systems, 34(2), 133–139.
22. Vlachas, P. R., et al. (2020). Backpropagation algorithms and Reservoir Computing in Recurrent Neural Networks for the forecasting of complex spatiotemporal dynamics. Neural Networks, 126, 191–217.
23. Wolf, A., et al. (1985). Determining Lyapunov exponents from a time series. Physica D: Nonlinear Phenomena, 16(3), 285–317.
24. Wright, J. (1984). Method for calculating a Lyapunov exponent. Physical Review A, 29(5), 2924.
Chapter 3
Artificial and Real-World Chaotic Oscillators
Abstract Four archetypal chaotic maps are used to generate the noise-free syn-
thetic datasets for the forecasting task: the logistic and the Hénon maps, which are
the prototypes of chaos in non-reversible and reversible systems, respectively, and
two generalized Hénon maps, which represent cases of low- and high-dimensional
hyperchaos. We also present a modified version of the traditional logistic map, introducing
a slow periodic dynamics of the growth rate parameter, which includes ranges
for which the map is chaotic. The resulting system exhibits concurrent slow and fast
dynamics and its forecasting represents a challenging task. Lastly, we consider two
real-world time series of solar irradiance and ozone concentration, measured at two
stations in Northern Italy. These dynamics are shown to be chaotic by
means of the tools of nonlinear time-series analysis.
Over the years, many nonlinear systems exhibiting chaotic behavior have been discovered
in several fields of science, from ecology [20] to finance [17, 18], from
population dynamics [10] to atmospheric processes [13].
In this book, we consider four discrete-time dynamical systems universally known to
be chaotic in order to test the predictive power of the neural predictors: the logistic map,
the Hénon map, and two generalized Hénon maps. The logistic and Hénon maps
are the classical prototypes of chaos in non-reversible and reversible discrete-time
systems, respectively, whereas the generalized Hénon map is considered in order to include
low- and high-dimensional hyperchaos. We limit the analysis to a single output
variable for each system, so that the corresponding time series is univariate. For the
reader's convenience, the systems are presented below.
Fig. 3.1 Two examples of the logistic map behavior (r = 3.7) when starting from different initial
conditions selected at random
x(t + 1) = r \, x(t) \, (1 - x(t)),   (3.1)

where r > 1 is the growth rate at low density. As is well known, the logistic map
exhibits chaotic behavior for most of the values of r in the range 3.6–4. Two examples
of how the logistic map (r = 3.7) variable evolves in time are reported in Fig. 3.1.
The equation describing the map is a simple quadratic polynomial, which assumes
the shape represented in the first panel of Fig. 3.2. Reproducing this function with
a neural net is an easy task. Things start to become more and more complex when
this simple map is iterated (other panels in Fig. 3.2). This is a general property of
chaotic systems that can immediately be seen in the logistic map. The effect of
the increasing complexity that occurs when the map is iterated is that small errors
in the input (on the horizontal axis in Fig. 3.2) give larger and larger errors in the
output (vertical axis). Mid- to long-term prediction of chaotic systems is, therefore,
intrinsically problematic, even if one knows the true model generating the data
[2, 3, 15].
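A few lines of Python suffice to observe this error amplification numerically (the initial conditions are arbitrary illustrative choices):

```python
import numpy as np

def logistic(x, r=3.7):
    """One iteration of the logistic map (3.1)."""
    return r * x * (1.0 - x)

def iterate(x0, n, r=3.7):
    """Trajectory of n iterations of the map starting from x0."""
    traj = np.empty(n + 1)
    traj[0] = x0
    for t in range(n):
        traj[t + 1] = logistic(traj[t], r)
    return traj

# Two trajectories starting 1e-6 apart become macroscopically different:
a, b = iterate(0.400000, 30), iterate(0.400001, 30)
print(np.abs(a - b))   # the error grows by orders of magnitude in a few steps
```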
Fig. 3.2 Graph of the logistic map (r = 3.7, first panel on the left) and of the 2-to-6 iterated maps.
Grey areas show how a small error at time t propagates when the map is iterated
x_1(t + 1) = 1 + x_2(t) - a \, x_1(t)^2
x_2(t + 1) = b \, x_1(t),   (3.2)

where a and b are model parameters, usually taking the values 1.4 and 0.3, respectively.
Considering the first state variable as output, y(t) = x_1(t), the system is equivalent
to the following two-dimensional (m = 2) nonlinear regression:

y(t + 1) = 1 + b \, y(t - 1) - a \, y(t)^2.   (3.3)

The Hénon attractor and the evolution in time of the output variable y(t) are represented
in Fig. 3.3.
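A short sketch simulating the map and checking the equivalence of the two forms on the generated series (the initial state is an arbitrary choice within the basin of attraction):

```python
import numpy as np

def henon_output(n, a=1.4, b=0.3, x=(0.1, 0.1)):
    """Simulate the Henon map (3.2) and return the output series y(t) = x1(t)."""
    x1, x2 = x
    y = np.empty(n)
    for t in range(n):
        y[t] = x1
        x1, x2 = 1.0 + x2 - a * x1**2, b * x1
    return y

y = henon_output(1000)
# Check the regression form (3.3) on the simulated series:
print(np.allclose(y[2:], 1.0 + 0.3 * y[:-2] - 1.4 * y[1:-1]**2))   # True
```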
The generalized Hénon map is an extension of the Hénon map introduced with the
purpose of generating hyperchaos (a chaotic attractor characterized by at least two
Fig. 3.3 Chaotic attractor produced by the Hénon map (a), and evolution in time of the system
output for two random initial conditions (b)
positive Lyapunov exponents, i.e., at least two directions of divergence within the
attractor) [1, 11]. The state equations of the n-dimensional case are:
x_1(t + 1) = a - x_{n-1}(t)^2 - b \, x_n(t)
x_j(t + 1) = x_{j-1}(t), \quad j = 2, \ldots, n.   (3.4)
(for n = 2, using x_1(t)/a and (-b/a) \, x_2(t) as new coordinates and -b as a new
parameter gives the traditional formulation (3.2)). Hyperchaotic behavior is observed
for a = 1.9 and b = 0.03. The system can be rewritten as a nonlinear regression on
the output, y(t) = x_1(t), obtaining:

y(t + 1) = a - y(t - n + 2)^2 - b \, y(t - n + 1).   (3.5)
We consider the 3D and the 10D generalized Hénon maps. The first is characterized by
n = m = 3. The corresponding Lyapunov exponents are L_1 = 0.276, L_2 = 0.257,
and L_3 = −4.040. The attractor's fractal dimension, computed with the Kaplan-
Yorke formula [4], is 2.13. Figure 3.4 shows the chaotic attractor and the evolution
in time of the system’s output starting from two different initial states. The second
(n = m = 10) has 9 positive Lyapunov exponents and its attractor’s fractal dimension
is equal to 9.13.
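A sketch of the simulation for generic n, following (3.4) (the initial state is an arbitrary small perturbation of the origin; a diverging trajectory would indicate an initial condition outside the basin of attraction):

```python
import numpy as np

def generalized_henon(n_steps, n=3, a=1.9, b=0.03, rng=np.random.default_rng(1)):
    """Simulate the n-dim generalized Henon map (3.4); returns y(t) = x1(t)."""
    x = 0.1 * rng.standard_normal(n)     # arbitrary initial state near the origin
    y = np.empty(n_steps)
    for t in range(n_steps):
        y[t] = x[0]
        new_x1 = a - x[n - 2] ** 2 - b * x[n - 1]
        x = np.concatenate(([new_x1], x[:-1]))   # x_j(t+1) = x_{j-1}(t)
    return y

y = generalized_henon(5000)              # discard a transient before any analysis
```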
As a further test case, we consider a non-stationary system with both slow and fast
dynamics. These kinds of processes characterize many natural phenomena and for
this reason attract the attention of many researchers in the field of dynamical systems
theory (see, for instance, [12, 19]).
Fig. 3.4 Chaotic attractor produced by the 3D generalized Hénon map (a), and evolution in time
of the system output for different initial conditions selected at random (b)
As anticipated at the beginning of this chapter, chaos theory does not only consider
analytical systems like those presented above; a remarkable variety of natural phe-
nomena are thought to be chaotic, from meteorology to the orbits of celestial bodies,
from chemical reactions to electronic circuits [21].