On The Characterization
The need for the characterization of real-world signals in terms of their linear, nonlinear,
deterministic and stochastic nature is highlighted and a novel framework for signal
modality characterization is presented. A comprehensive analysis of signal nonlinearity
characterization methods is provided, and based upon local predictability in phase space,
a new criterion for qualitative performance assessment in machine learning is introduced.
This is achieved based on a simultaneous assessment of nonlinearity and uncertainty
within a real-world signal. Next, for a given embedding dimension, based on the target
variance of delay vectors, a novel framework for heterogeneous data fusion is introduced.
The proposed signal modality characterization framework is verified by comprehensive
simulations and comparison against other established methods. Case studies covering a
range of machine learning applications support the analysis.
Keywords: signal nonlinearity; embedding dimension; delay vector variance;
machine learning; data fusion
1. Introduction
Figure 1. Classes of real-world signals spanned by the properties 'nonlinearity' and 'stochasticity'. The plane is spanned by the axes linearity–nonlinearity and determinism–stochasticity, with regions (a) ARMA, (b) NARMA and (c) chaos. Areas where methodologies for the analysis are readily available are highlighted, such as 'chaos' and 'ARMA'.
Theiler et al. (1992) have introduced the concept of ‘surrogate data’, which has
been extensively used in the context of statistical nonlinearity testing. The
surrogate data method tests for a statistical difference between a test statistic
computed for the original time series and for an ensemble of test statistics
computed on linearized versions of the data, the so-called surrogate data, or
‘surrogates’ for short. In other words, a time series is nonlinear if the test statistic
for the original data is not drawn from the same distribution as the test statistics
computed for the surrogates.

²An AR model of order m is a linear stochastic model of the form x_k = Σ_{i=1}^{m} a_i x_{k−i} + n; the current
value of the process, x_k, is expressed as a finite, linear combination of the previous values of the
process and a random shock, n.
³The analysis of the nonlinearity of a signal can often provide insights into the nature of the
underlying signal production system. However, care should be taken in the interpretation of the
results, since the assessment of nonlinearity within a signal does not necessarily imply that
the underlying signal generation system is nonlinear: the input signal and system (transfer
function) nonlinearities are confounded.
Figure 2. Surrogate realization for the Lorenz series. (a) FT-based without endpoint matching,
(b) FT-based with endpoint matching, (c) amplitude-adjusted Fourier transform-based with
endpoint matching, and (d) iAAFT with endpoint matching.
back into the time domain. This way, the so-called ‘FT-based’ surrogates are
designed to have the same amplitude spectrum and hence the linear properties
(mean, variance and autocorrelation function) identical to those of the original
time series, but are otherwise random.
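The phase-randomization step can be sketched in a few lines (a minimal sketch using NumPy; keeping the DC and Nyquist bins real is our own choice, made so that the inverse transform yields a real-valued series):

```python
import numpy as np

def ft_surrogate(x, rng=None):
    """FT-based surrogate: keep the amplitude spectrum of x,
    randomize the phases, transform back to the time domain."""
    rng = np.random.default_rng() if rng is None else rng
    X = np.fft.rfft(x)
    phases = rng.uniform(0.0, 2.0 * np.pi, X.shape[0])
    phases[0] = 0.0                 # keep the DC bin real
    if len(x) % 2 == 0:
        phases[-1] = 0.0            # keep the Nyquist bin real (even length)
    return np.fft.irfft(np.abs(X) * np.exp(1j * phases), n=len(x))
```

By construction the surrogate reproduces the amplitude spectrum of the original exactly, and hence its mean, variance and autocorrelation function.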
Since the FFT assumes the time series to be periodic over a time segment, a
mismatch between the start- and endpoint results in a periodic discontinuity that
introduces high-frequency artefacts (Theiler et al. 1994; windowing can be
applied to compensate for this spectral leakage). Examples of FT-based
surrogates for the chaotic Lorenz series, without and with endpoint matching,
are given in figure 2a,b. As with the AR-based method (§3a), signal distributions
of the surrogates do not necessarily resemble those of the original time series,
which can lead to false rejections of the null hypothesis. In order to exclude such
‘false’ rejections, Theiler et al. (1992) proposed an amplitude transform of the
original time series that makes the distribution Gaussian, prior to the application
of the FT method, which is transformed back to the original distribution
afterwards (amplitude-adjusted Fourier transform method, AAFT). Rather than
fitting the observation function, h(·), with a parametric model, they employed a
rank-ordering procedure, i.e. the time series is sorted⁶ and the sample with rank k
is set to the same value as the kth sample in a sorted Gaussian series of the same
length as the original time series. An example of the AAFT surrogate for the
chaotic Lorenz series, including the endpoint matching procedure, is shown
in figure 2c.
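The rank-ordering procedure can be sketched as follows (a self-contained sketch assuming NumPy, with the FT-based phase-randomization step inlined):

```python
import numpy as np

def aaft_surrogate(x, rng=None):
    """AAFT surrogate: Gaussianize x by rank ordering, phase-randomize,
    then map the result back onto the original signal amplitudes."""
    rng = np.random.default_rng() if rng is None else rng
    n = len(x)
    ranks = np.argsort(np.argsort(x))
    # (i) rank-remap x onto a sorted Gaussian series of the same length
    y = np.sort(rng.standard_normal(n))[ranks]
    # (ii) FT-based step: keep amplitudes, randomize phases
    Y = np.fft.rfft(y)
    ph = rng.uniform(0.0, 2.0 * np.pi, Y.shape[0])
    ph[0] = 0.0
    if n % 2 == 0:
        ph[-1] = 0.0                # keep the Nyquist bin real
    ys = np.fft.irfft(np.abs(Y) * np.exp(1j * ph), n=n)
    # (iii) rank-remap back onto the sorted original values
    return np.sort(x)[np.argsort(np.argsort(ys))]
```

The final rank remap guarantees that the surrogate has exactly the same signal distribution (the same multiset of amplitudes) as the original.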
The iterative amplitude-adjusted Fourier transform (iAAFT) method. The
drawback of the AAFT method is that it forces the amplitude spectrum of
surrogates to be flatter than that of the original time series, which, again, can lead
to false rejections of the null hypothesis. To that end, Schreiber & Schmitz (1996)
⁶Sorting a time series refers to sorting the signal amplitudes in increasing order.
proposed the iAAFT method, which produces surrogates with an identical ('correct')
signal distribution and an approximately identical amplitude spectrum to those of the
original time series. For {|S_k|}, the Fourier amplitude spectrum of the original
time series, s, and {c_k}, the sorted version of the original time series, at every
iteration j, two series are generated: (i) r^(j), which has the same signal distribution
as the original and (ii) s^(j), with the same amplitude spectrum as the original.
Starting with r^(0), a random permutation of the original time series, the iAAFT
method alternates between these two steps.
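The alternation between r^(j) and s^(j) can be sketched as follows (assuming NumPy; the cap on the number of iterations is our own safeguard):

```python
import numpy as np

def iaaft_surrogate(x, max_iter=100, rng=None):
    """iAAFT surrogate: alternately impose the original amplitude
    spectrum and the original signal distribution (rank ordering)."""
    rng = np.random.default_rng() if rng is None else rng
    n = len(x)
    amp = np.abs(np.fft.rfft(x))     # {|S_k|}: target amplitude spectrum
    c = np.sort(x)                   # {c_k}: sorted original values
    r = rng.permutation(x)           # r^(0): random permutation of x
    prev = np.inf
    for _ in range(max_iter):
        # s^(j): same amplitude spectrum as the original
        s = np.fft.irfft(amp * np.exp(1j * np.angle(np.fft.rfft(r))), n=n)
        # r^(j): same signal distribution as the original
        r = c[np.argsort(np.argsort(s))]
        err = np.sqrt(np.mean((np.abs(np.fft.rfft(r)) - amp) ** 2))
        if err >= prev:              # stop when the mismatch stops decreasing
            break
        prev = err
    return r
```

Because the rank-ordering step is applied last, the returned surrogate has exactly the original signal distribution, while its amplitude spectrum matches the original only approximately.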
The iteration stops when the difference between {|S_k|} and the amplitude
spectrum of r^(j) stops decreasing (Schreiber & Schmitz 2000). An iAAFT
surrogate for the Lorenz series, using endpoint matching, is shown in
figure 2d. The iAAFT method for generating surrogate time series has become a
standard, giving more stable results than the other available methods
(Kugiumtzis 1999; Schreiber & Schmitz 2000).
(c ) Hypothesis testing
To describe the fundamental property of signal nonlinearity, a null hypothesis
is asserted that the time series is linear; it is rejected if the associated test statistic
does not conform with that of a linear signal. Since the analytical form of the
probability distribution functions of the metrics ('test statistics') is not known, a
non-parametric rank-based test may be used (Theiler & Prichard 1996). For
every original time series, N_s = 99 surrogates are generated; the test statistics for
the original, t_o, and for the surrogates, t_{s,i} (i = 1, …, N_s), are computed, the series
{t_o, t_{s,i}} is sorted and the position index (rank) r of t_o is determined. A right-
tailed (left-tailed) test rejects the null hypothesis if the rank r of the original exceeds 90
(r ≤ 10), and a two-tailed test is judged 'rejected' if r > 95 (or r ≤ 5). It is
convenient to define the symmetric rank r_symm as
r_symm [%] =
    r / (N_s + 1),                       for right-tailed tests;
    (N_s + 2 − r) / (N_s + 1),           for left-tailed tests;        (3.2)
    |(N_s + 1)/2 − r| / ((N_s + 1)/2),   for two-tailed tests.
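The rank computation reduces to a few lines (a sketch; ties between t_o and surrogate values are broken here by counting only strictly smaller values, which is our own convention):

```python
import numpy as np

def symmetric_rank(t_o, t_s, tail="two"):
    """Rank of the original test statistic t_o among surrogate statistics
    t_s, mapped to the symmetric rank r_symm of (3.2), in percent."""
    t_s = np.asarray(t_s)
    ns = len(t_s)
    r = 1 + int(np.sum(t_s < t_o))   # 1-based rank of t_o in the sorted series
    if tail == "right":
        rs = r / (ns + 1)
    elif tail == "left":
        rs = (ns + 2 - r) / (ns + 1)
    else:                            # two-tailed
        rs = abs((ns + 1) / 2 - r) / ((ns + 1) / 2)
    return r, 100.0 * rs
```

With N_s = 99 surrogates, an original statistic larger (smaller) than all surrogate statistics yields r_symm = 100% for a right-tailed (left-tailed) test.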
Figure 3. (a) DVS and (b) cumulative d–e plots of Mackey–Glass chaotic time series.
(a ) DVS plots
The idea underpinning the method introduced by Casdagli (1991), the DVS
plots, is to construct piecewise linear approximations of the unknown prediction
function that maps the DVs onto their corresponding targets, using a variable
number n of neighbouring DVs for generating these approximations. The DVS
method examines the (robust) average prediction error E(n) for local linear
models as a function of the number of data points, n, within the local linear
model. The prediction error as a function of the locality of the model conveys
information regarding the nonlinearity of the signal. Indeed, a small value of n
corresponds to a deterministic model (Farmer & Sidorowich 1987), large values
of n correspond to fitting a stochastic linear AR model, whereas intermediate
values of n correspond to fitting nonlinear stochastic models. A DVS plot for the Mackey–
Glass chaotic (nonlinear deterministic) time series for m = 2 is shown in figure 3a.
The minimum of the E(n) curve is at the left-hand side of the
plot, correctly indicating the nonlinear and deterministic nature of this time series.
— The pairwise (Euclidean) distances between the DVs x(i) and x(j) are
computed and denoted by d_{i,j}. The distance between the corresponding targets
(using the L2 norm) is denoted by e_{i,j}.
— The e values are averaged, conditional on d, i.e. ē(r) = ⟨e_{j,k}⟩ for r ≤ d_{j,k} < r + Δr,
where Δr denotes the width of the 'bins' used for averaging e_{j,k}.
⁷For simplicity, the time lag τ is set to unity in all simulations.
— The smallest value for ē(r) is denoted by E = lim_{r→0} ē(r) and is a measure of
the predictability of the time series.
The 'cumulative' version of ē(r) avoids the need for setting a bin width Δr:
e_c(r) = ⟨e_{j,k}⟩ with d_{j,k} < r,    (4.1)
where ⟨e_{j,k}⟩ is, as before, the mean pairwise distance between targets. Figure 3b
shows the cumulative plot for the Mackey–Glass time series.
The heuristic for determining the parameter E is to take the Y-intercept of the linear
regression through the N_d (d, e) pairs with the smallest d. In the example shown in
figure 3b, this yields E = 0.0138 and indicates a deterministic nature. This value
can be used as a test statistic for a left-tailed nonlinearity test using surrogate
data. In our simulations, N_d = 500.
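A sketch of the E heuristic for a scalar series (assuming NumPy, time lag 1, and the Y-intercept taken from an ordinary least-squares fit through the N_d closest pairs):

```python
import numpy as np

def de_intercept(x, m=2, n_d=500):
    """Heuristic E: Y-intercept of a linear fit through the n_d
    (d, e) pairs with the smallest delay-vector distances d."""
    # delay vectors of dimension m (time lag 1) and their targets
    dv = np.stack([x[i:len(x) - m + i] for i in range(m)], axis=1)
    tgt = x[m:]
    i, j = np.triu_indices(len(dv), k=1)
    d = np.linalg.norm(dv[i] - dv[j], axis=1)   # pairwise DV distances
    e = np.abs(tgt[i] - tgt[j])                 # distances between targets
    keep = np.argsort(d)[:n_d]                  # N_d pairs with smallest d
    slope, intercept = np.polyfit(d[keep], e[keep], 1)
    return intercept
```

For a smooth deterministic series, nearby delay vectors have nearby targets, so the intercept is close to zero; for white noise it stays large.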
(d ) Correlation exponent
This approach to nonlinearity detection is described by Grassberger & Procaccia
(1983) and yields an indication of the local structure of a strange attractor.
The correlation integral is computed as
C(l) = lim_{N→∞} (1/N²) {number of pairs (i, j) for which ‖x(i) − x(j)‖ < l},    (4.4)
⁸For comparison, the time lag τ is set to unity in all simulations.
Figure 4. Nonlinearity analysis for the Mackey–Glass time series. (a) C3 method and (b) REV
method. The thick lines represent the test statistics for the original time series (t_o), and the thin
lines represent those for 24 surrogates (t_{s,i}).
where l is varied and N is the number of DVs available for the analysis.
Grassberger & Procaccia (1983) established that the correlation exponent, i.e. the
slope of the (ln C(l), ln l)-curve, can be taken as a measure of the local
structure of a strange attractor. Several methods exist for determining the range
over which the slope is to be computed (‘scaling region’; see Theiler & Lookman
1993; Hegger et al. 1999). The correlation integral curve is examined in similar
regions for both original and surrogate data, and a difference in the slope indicates
a difference in the local structure of the attractor yielding two-tailed tests.
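A direct finite-sample estimate of (4.4) can be sketched as follows (assuming NumPy, time lag 1, and a deliberately crude choice of scaling region, namely all scales where C(l) > 0):

```python
import numpy as np

def correlation_integral(x, m=2, n_scales=20):
    """Estimate C(l) = (1/N^2) #{(i, j): ||x(i) - x(j)|| < l} and the
    correlation exponent as the log-log slope over scales with C(l) > 0."""
    dv = np.stack([x[i:len(x) - m + 1 + i] for i in range(m)], axis=1)
    n = len(dv)
    iu, ju = np.triu_indices(n, k=1)
    d = np.linalg.norm(dv[iu] - dv[ju], axis=1)   # pairwise DV distances
    ls = np.logspace(np.log10(d[d > 0].min()), np.log10(d.max()), n_scales)
    # each unordered pair counts twice in the 1/N^2 normalization
    c = np.array([2.0 * np.sum(d < l) / n**2 for l in ls])
    mask = c > 0
    slope = np.polyfit(np.log(ls[mask]), np.log(c[mask]), 1)[0]
    return ls, c, slope
```

In practice the scaling region must be chosen with care (see Theiler & Lookman 1993; Hegger et al. 1999); fitting over all scales, as here, is only an illustration.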
The DVV method is somewhat related to the d–e method and the DVS plots
(Casdagli 1991).
For a given embedding dimension m, the DVV approach can be summarized as
follows.
— The mean, μ_d, and s.d., σ_d, are computed over all pairwise Euclidean distances
between the DVs, ‖x(i) − x(j)‖ (i ≠ j).
— The sets U_k(r_d) are generated such that U_k(r_d) = {x(i) | ‖x(k) − x(i)‖ ≤ r_d}, i.e.
sets that consist of all the DVs that lie closer to x(k) than a certain distance r_d,
taken from the interval [max{0, μ_d − n_d σ_d}, μ_d + n_d σ_d], e.g. N_tv uniformly
spaced distances, where n_d is a parameter controlling the span over which to
perform the DVV analysis.
— For every set U_k(r_d), the variance of the corresponding targets, σ_k²(r_d), is
computed. The average over all sets U_k(r_d), normalized by the variance of the
time series, σ_x², yields the 'target variance', σ*²(r_d):
σ*²(r_d) = [(1/N) Σ_{k=1}^{N} σ_k²(r_d)] / σ_x².    (5.1)
We only consider a variance measurement valid if the set U_k(r_d) contains at
least N_o = 30 DVs, since having too few points for computing a sample
variance yields unreliable estimates. Here, σ_{s,i}²(r_d) is the target variance at
span r_d for the ith surrogate, and the average is taken over all spans r_d that are
valid for both the original and the
⁹Note that we use the term 'standardized' in the statistical sense, namely as having zero mean and
unit variance.
¹⁰Note that while computing this average, as well as when computing the r.m.s.e., only the valid
measurements are taken into account.
Figure 5. Nonlinear and deterministic nature of signals. DVV plots for (a) AR(4) signal and (b)
Narendra model 3. The plots are obtained by plotting the target variance as a function of
standardized distance. DVV scatter diagrams for (c) AR(4) model and (d ) Narendra model 3. The
diagrams are obtained by plotting the target variance of the original signal against the mean of the
target variances of the surrogate data.
surrogates. The DVV plots represent a single test statistic, allowing for
traditional (right-tailed) surrogate testing (the deviation from the average is
computed for the original and surrogate time series).
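The target-variance computation of (5.1) can be sketched directly (assuming NumPy, time lag 1, and the N_tv, n_d and N_o parameters described above):

```python
import numpy as np

def dvv(x, m=2, nd=3.0, ntv=25, n0=30):
    """Target variance sigma*^2(r_d), as in (5.1), over N_tv spans r_d.
    Spans whose sets hold fewer than N_o DVs are returned as NaN."""
    dv = np.stack([x[i:len(x) - m + i] for i in range(m)], axis=1)
    tgt = x[m:]
    n = len(dv)
    iu, ju = np.triu_indices(n, k=1)
    d = np.linalg.norm(dv[iu] - dv[ju], axis=1)   # pairwise DV distances
    mu, sd = d.mean(), d.std()
    spans = np.linspace(max(0.0, mu - nd * sd), mu + nd * sd, ntv)
    dist = np.linalg.norm(dv[None, :, :] - dv[:, None, :], axis=2)  # n x n
    tv = np.full(ntv, np.nan)
    for t, rd in enumerate(spans):
        vs = [tgt[dist[k] <= rd].var()       # variance of targets in U_k(r_d)
              for k in range(n) if np.sum(dist[k] <= rd) >= n0]
        if vs:
            tv[t] = np.mean(vs) / x.var()    # normalize by signal variance
    return spans, tv
```

For large spans every set U_k(r_d) approaches the full set of DVs, so the target variance converges to unity at the extreme right of the DVV plot, as described in the text.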
For generality, consider a unit-variance deterministic signal (sum of three sine
waves, scaled to unit variance) contaminated with uniformly distributed white
noise with s.d. σ_n. After standardizing to unit variance, the resulting signal, n_k, is
passed through a nonlinear system
x_k = arctan(g_nl C^T x(k)) + n_k / 2,    (6.1)
where g_nl controls the degree of nonlinearity, C = [0.2, −0.5]^T, and x(k) are the
DVs of embedding dimension m = 2. This is a benchmark nonlinear system
referred to as model 2 (Narendra & Parthasarathy 1990). The predictability
is influenced by σ_n, whereas the degree of nonlinearity is controlled by g_nl.
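A sketch of generating such a benchmark series (the three sine frequencies and the ordering of C against [x_{k−1}, x_{k−2}] are our own assumptions, as they are not specified here):

```python
import numpy as np

def benchmark_signal(n=1000, g_nl=1.0, sigma_n=0.5, seed=0):
    """Deterministic sum of three sines plus uniform white noise of s.d.
    sigma_n, standardized, then fed through the arctan system (6.1)."""
    rng = np.random.default_rng(seed)
    t = np.arange(n + 2)
    s = np.sin(0.1 * t) + np.sin(0.23 * t) + np.sin(0.43 * t)  # assumed frequencies
    s /= s.std()                                 # unit-variance deterministic part
    w = rng.uniform(-np.sqrt(3.0) * sigma_n, np.sqrt(3.0) * sigma_n, n + 2)
    nu = (s + w) / (s + w).std()                 # standardized input n_k
    C = np.array([0.2, -0.5])
    x = np.zeros(n + 2)
    for k in range(2, n + 2):
        dv = np.array([x[k - 1], x[k - 2]])      # assumed DV ordering
        x[k] = np.arctan(g_nl * C @ dv) + nu[k] / 2.0
    return x[2:]
```

Setting g_nl = 0 reduces the system to x_k = n_k/2, i.e. a purely linear stochastic signal, while increasing g_nl strengthens the nonlinearity, matching the roles of σ_n and g_nl described above.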
Table 1. Results of the rank tests for the tile set. (Significant rejections of the null hypothesis at the
level of 0.1 are indicated by boxes. C3, third-order cumulant; COR, correlation exponent; DVV,
delay vector variance; REV, reversal metric.)

g_nl   σ_n   C3    REV   d–e   COR   DVV
0.0    0.0   31     38    99     3    22
1.5    0.0   45     59     6    88   100
2.0    0.0   54     73     2     1   100
2.5    0.0   65     52     2     1   100
0.0    0.5   36     81    56    83    52
1.5    0.5   52     82    73    54    98
2.0    0.5   54    100    94     1   100
2.5    0.5   43     87    95     1   100
0.0    1.0   34     82    52   100    28
1.5    1.0   57     89    52    10    82
2.0    1.0   71     24    11     1   100
2.5    1.0   38     41    76     1   100
To illustrate the potential of the DVV method, a 'tile set' of twelve time series is
generated, defined by σ_n ∈ {0, 0.5, 1.0} and g_nl ∈ {0, 1.5, 2.0, 2.5}, which spans the
signal space from figure 1.
For the tile set, we used an embedding dimension of m = 2, and the maximal
span, n_d, was determined by visual inspection so that the DVV plots converged
to unity at the extreme right, yielding n_d = 3. The results of the rank tests for the
tile set (6.1) are shown in table 1 (significant rejections at the level of 0.1 are
shown in boxes). The DVS method is not included, since it does not allow for a
quantitative analysis. From table 1, in the absence of noise (σ_n = 0), only the d–e,
the COR and DVV methods detected nonlinearities for slopes g_nl ≥ 2.0. When
noise was added, the time-reversal (REV) metric was able to detect the nonlinear nature for high
slopes g_nl. The third-order cumulant (C3) metric was unable to detect
nonlinearities in this type of signal, whereas the correlation exponent (COR)
analysis wrongly detected nonlinearity even for the linear case with
g_nl = 0. The d–e method failed to detect nonlinearities within the signals when
noise was present, since it is based on the deterministic properties of a time
series. Only the DVV method consistently detected nonlinear behaviour for
g_nl ≥ 2, for all noise levels (rightmost column in table 1).
The results for the DVS and DVV analyses are illustrated in figures 6 and 7,
respectively. The degree of nonlinearity, g_nl, increases from left to right, and the
noise level, σ_n, increases from top to bottom. The DVS plots in figure 6 show
that, as g_nl increases, the error discrepancy between the best local linear model
and the global linear model becomes larger, indicating a higher degree of
nonlinearity. In the DVV scatter diagrams (figure 7), the effect of an increasing
degree of nonlinearity corresponds to a stronger deviation from the bisector line
(dashed line). The effect of increasing σ_n in the DVS plots is a higher error value
at the optimal degree of locality. For instance, in the first column of the tile
figures, the lowest detected level of uncertainty increased from top to bottom
Figure 6. DVS plots for the tile set. The degree of nonlinearity increases from left to right and the
noise level from top to bottom.
(figure 6). Conversely, for increasing degrees of nonlinearity (first row in figures),
the minimum of the DVV curve becomes more pronounced in figure 6, and the
deviation from the bisector line becomes more emphasized in figure 7.
This section illustrates the applications of the DVV method for a range of
machine learning problems.
Figure 7. DVV scatter diagrams for the tile set. The degree of nonlinearity increases from left to right
and the noise level from top to bottom. The error bars indicate the s.d. from the mean of σ*².
On the other hand, iterative a posteriori (or data-reusing) techniques naturally
employ prior knowledge, by reiterating the above update for a fixed tap input
vector x(k), whereby the data-reusing update is governed by the refined
estimation error e_i(k) for every a posteriori update i = 1, …, L.
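The data-reusing idea can be sketched for the simplest case, an LMS-trained FIR filter (assuming NumPy; the step size μ, filter order and number of reuses are illustrative choices):

```python
import numpy as np

def lms_data_reusing(d, x, mu=0.01, order=4, reuses=3):
    """LMS with a posteriori data reusing: for each fixed tap input
    vector x(k), the weight update is repeated with the refined error
    e_i(k), i = 1, ..., L (here L = reuses)."""
    w = np.zeros(order)
    y = np.zeros(len(d))
    for k in range(order, len(d)):
        xk = x[k - order:k][::-1]          # tap input vector x(k)
        y[k] = w @ xk                      # filter output before adaptation
        for _ in range(1 + reuses):        # standard update plus L reuses
            e = d[k] - w @ xk              # refined estimation error e_i(k)
            w = w + mu * e * xk
    return w, y
```

Reiterating the update on a fixed x(k) acts much like an increased effective step size, which is consistent with the faster improvement in performance reported below as the order of data reusing grows.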
Figure 8 illustrates the quantitative performance (the prediction gains R_p)
and the qualitative performance, by means of the DVV scatter diagrams, for
three different learning algorithms and various orders of data-reusing iterations.
The least mean square (LMS) algorithm was used to train a linear finite impulse
response (FIR) filter, while the nonlinear gradient descent (NGD) and real-time
recurrent learning (RTRL) algorithms were used to train a nonlinear dynamical
perceptron and a recurrent perceptron, respectively. This was achieved for
the prediction of a nonlinear benchmark signal (5.3). The three columns in
figure 8a–c illustrate different orders of data reusing (0, 3 and 9 times,
from left to right), and the three rows in figure 8(i)–(iii) represent the three
different algorithms (LMS, NGD and RTRL, from top to bottom). In terms
of the prediction gain R_p and nature preservation, there is a tendency along each
row that, with an increase in the order of data reusing, the quantitative
performance index R_p increased, and the DVV scatter diagrams for the
filter output approached those for the original signal (dotted line, figure 8).
(b ) Functional magnetic resonance imaging applications
The general linear model is still widely used in neuroscience, despite exhibiting
suboptimal performance, which also depends on the recording method (Vanduffel
et al. 2001). Our aim was to show that the different recording methods convey
Figure 8. DVV analysis of the qualitative performance of recursive and iterative algorithms for the
benchmark nonlinear signal (5.3). (a) Standard algorithm, (b) three iterations of DR, and (c) nine
iterations of DR. (i) FIR filter (LMS), (ii) feedforward perceptron, and (iii) recurrent perceptron.
The dotted lines represent the original signal, while the solid lines represent DVV scatter diagrams
for the one-step-ahead prediction.
Figure 9. DVV scatter diagrams for the fMRI time series (a) B1, (b) B2, (c) B3, and (d ) B4
(Gautama et al. 2003).
(Mandic et al. 2005a,b). Sleep stage scoring (Anderson & Horne 2005; Morrell
et al. 2007) based on the combination of data obtained from multichannel EEG
and other body sensors is one such application. To illustrate the usefulness of the
DVV method in the fusion of heterogeneous data, a set of publicly available
(http://ida.first.fraunhofer.de/wjek/publications.html) physiological recordings
of five healthy humans during three consecutive afternoon naps were used. Three
physiological signals for every patient and each nap were considered: the EEG,
electrooculogram (EOG) and respiratory trace (RES).
Manual scoring by medical experts divides these signals into six classes:
(i) awake, eye open (W1), (ii) awake, eye closed (W2), (iii) sleep stage I (S1),
(iv) sleep stage II (S2), (v) no assessment (NA), and (vi) artefact in EOG signal
(AR). In order to score sleep stages, the DVV features from EEG, EOG and RES
were concatenated to give a fused feature vector, and then the classification was
performed using a simple neural network classifier.
Figure 10a,b illustrates the labels assigned by the medical expert and a simple
perceptron, respectively, using a classifier based on the DVV features, for the
first nap of patient 1. From the figure, it can be observed that these two labels
are a close match and that a simple concatenation of the DVV features provides a
rich information source in heterogeneous data fusion. Fusion of linear (power
spectrum) and nonlinear (DVV) features has also found application in the
modelling of awareness of car drivers, namely the detection of microsleep events
(Golz & Sommer 2007; Sommer et al. 2007).
Figure 10. Labels for sleep stage scoring. (a) Labels assigned by a medical expert. (b) Labels
calculated using the DVV methodology.
Figure 11. Judging the complex nature of an Ikeda map. (a) The original signal. (b) Behaviour in
the presence of noise.
part. A statistical test¹¹ for judging whether the underlying signal is complex
or bivariate is based upon the complex DVV method (Schreiber & Schmitz
2000). Figure 11b shows the result of the statistical test for an Ikeda map
contaminated with complex white Gaussian noise (CWGN) over a range of
power levels. The complex Ikeda map and CWGN had equal variances, and the
noisy signal Ikeda_noisy was generated according to Ikeda_noisy = Ikeda_original +
g_n × CWGN, where g_n denotes the ratio between the s.d. of the complex Ikeda
map and that of the additive white Gaussian noise. For low levels of noise, the
time series was correctly judged complex.
Table 2 illustrates the results of a statistical test on the different regions of
wind data recorded over 24 hours.¹² The table shows that the more
intermittent and unpredictable the wind dynamics ('high'), the greater the
advantage obtained by the complex-valued representation of wind (Goh et al.
2006), while for a relatively slowly changing wind this is not the case. Also, there are
stronger indications of a complex-valued nature when the wind is averaged over
shorter intervals, as represented by the respective percentage values of the rate
of rejection of the null hypothesis of a real bivariate nature.
¹¹We consider the underlying complex-valued time series bivariate if the null hypothesis is rejected
in the statistical test.
¹²The wind data with velocity v and direction θ are modelled as a single quantity in a complex
representation space, namely v(t)e^{jθ(t)}.
Table 2. The rejection rate (the higher the rejection rate, the greater the benefit of the complex-valued
representation) for the wind signal.

                       rejection rate (%)
averaged over 1 s         96        71
averaged over 60 s        83        58
8. Summary
References
Anderson, C. & Horne, J. A. 2005 Electroencephalographic activities during wakefulness and sleep
in the frontal cortex of healthy older people: links with “thinking”. Sleep 26, 968–972.
Casdagli, M. 1991 Chaos and deterministic versus stochastic non-linear modeling. J. R. Stat. Soc.
B 54, 303–328.
Diks, C., Van Houwelingen, J., Takens, F. & DeGoede, J. 1995 Reversibility as a criterion for
discriminating time series. Phys. Lett. A 201, 221–228. (doi:10.1016/0375-9601(95)00239-Y)
Farmer, J. & Sidorowich, J. 1987 Predicting chaotic time series. Phys. Rev. Lett. 59, 845–848.
(doi:10.1103/PhysRevLett.59.845)
Friston, K., Mechelli, A., Turner, R. & Price, C. 2000 Nonlinear responses in fMRI: the balloon
model, Volterra kernels, and other hemodynamics. NeuroImage 12, 466–477. (doi:10.1006/nimg.
2000.0630)
Gautama, T., Mandic, D. & Van Hulle, M. 2003 Signal nonlinearity in fMRI: a comparison
between BOLD and MION. IEEE Trans. Med. Imaging 22, 636–644. (doi:10.1109/TMI.2003.
812248)
Gautama, T., Mandic, D. & Van Hulle, M. 2004 A non-parametric test for detecting the complex-
valued nature of time series. Int. J. Know. Based Intell. Eng. Syst. 8, 99–106.
Goh, S. L., Chen, M., Popovic, D. H., Aihara, K., Obradovic, D. & Mandic, D. P. 2006 Complex-
valued forecasting of wind profile. Renew. Energy 31, 1733–1750. (doi:10.1016/j.renene.2005.
07.006)
Golz, M. & Sommer, D. 2007 Automatic knowledge extraction: fusion of human expert ratings and
biosignal features for fatigue monitoring applications. In Signal processing techniques for
knowledge extraction and information fusion (eds D. P. Mandic, M. Golz, A. Kuh, D. Obradovic
& T. Tanaka), pp. 299–316. Berlin, Germany: Springer.
Grassberger, P. & Procaccia, I. 1983 Measuring the strangeness of strange attractors. Physica D
9, 189–208. (doi:10.1016/0167-2789(83)90298-1)
Hegger, R., Kantz, H. & Schreiber, T. 1999 Practical implementation of nonlinear time series
methods: the TISEAN package. Chaos 9, 413–435. (doi:10.1063/1.166424)
Ho, K., Moody, G., Peng, C., Mietus, J., Larson, M., Levy, D. & Goldberger, A. 1997 Predicting
survival in heart failure case and control subjects by use of fully automated methods for deriving
nonlinear and conventional indices of heart rate dynamics. Circulation 96, 842–848.
Kaplan, D. 1994 Exceptional events as evidence for determinism. Physica D 73, 38–48. (doi:10.
1016/0167-2789(94)90224-0)
Kaplan, D. 1997 Nonlinearity and nonstationarity: the use of surrogate data in interpreting
fluctuations. In Frontiers of blood pressure and heart rate analysis (eds M. Di Rienzo, G. Mancia,
G. Parati, A. Pedotti & A. Zanchetti), pp. 15–28. Amsterdam, The Netherlands: IOS Press.
Kugiumtzis, D. 1999 Test your surrogate data before you test for nonlinearity. Phys. Rev. E 60,
2808–2816. (doi:10.1103/PhysRevE.60.2808)
Mandic, D. & Chambers, J. 2001 Recurrent neural networks for prediction: learning algorithms,
architectures and stability. Chichester, UK: Wiley.
Mandic, D., Goh, S. L., Aihara, K. 2005a Sequential data fusion via vector spaces: complex
modular neural network approach. In Proc. IEEE Workshop on Machine Learning for Signal
Processing, pp. 147–151.
Mandic, D. et al. 2005b Data fusion for modern engineering applications: an overview. In Proc. Int.
Conf. on Artificial Neural Networks, ICANN ’05, Warsaw, Poland, vol. 2, pp. 715–721.
Morrell, M. J., Meadows, G. E., Hastings, P., Vazir, A., Kostikas, K., Simonds, A. & Corfield,
D. R. 2007 The effects of adaptive servo ventilation on cerebral vascular reactivity in patients
with congestive heart failure and sleep-disordered breathing. Sleep 30, 648–653.
Narendra, K. S. & Parthasarathy, K. 1990 Identification and control of dynamical systems using
neural networks. IEEE Trans. Neural Netw. 1, 4–27. (doi:10.1109/72.80202)
Poon, C. S. & Merrill, C. 1997 Decrease of cardiac chaos in congestive heart failure. Nature 389,
492–495. (doi:10.1038/39043)
Rissanen, J. 1978 Modelling by shortest data description. Automatica 14, 465–471. (doi:10.1016/
0005-1098(78)90005-5)
Schreiber, T. 1999 Interdisciplinary application of nonlinear time series methods. Phys. Rep. 308,
1–64. (doi:10.1016/S0370-1573(98)00035-0)
Schreiber, T. & Schmitz, A. 1996 Improved surrogate data for nonlinearity tests. Phys. Rev. Lett.
77, 635–638. (doi:10.1103/PhysRevLett.77.635)
Schreiber, T. & Schmitz, A. 1997 On the discrimination power of measures for nonlinearity in a
time series. Phys. Rev. E 55, 5443–5447. (doi:10.1103/PhysRevE.55.5443)
Schreiber, T. & Schmitz, A. 2000 Surrogate time series. Physica D 142, 346–382. (doi:10.1016/
S0167-2789(00)00043-9)
Sommer, D., Chen, M., Golz, M., Trutschel, U. & Mandic, D. 2007 Feature fusion for the detection
of microsleep events. Int. J. VLSI Signal Process. Syst. 49, 329–342. (doi:10.1007/s11265-007-
0083-4)
Theiler, J. & Lookman, T. 1993 Statistical error in a chord estimator of correlation dimension: the
“rule of five”. Int. J. Bifurcat. Chaos 3, 765–771. (doi:10.1142/S0218127493000672)