CRM Data2DifEquations
Equations
Jim Ramsay
McGill University
$Dx(t) = f[x(t), t]$
The themes
Differential equations are powerful
tools for modeling data.
There are new methods for
estimating differential equations
directly from data.
Some examples are offered, drawn
from chemical engineering and
medicine.
Differential Equations as
Models
DIFE's make
explicit the relation
between one or
more derivatives
and the function
itself.
An example is the
harmonic motion
equation:
$D^2 x(t) = -\omega^2 x(t)$
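As a quick numerical check of the harmonic motion equation, the sketch below assumes the equation reads D^2 x(t) = -omega^2 x(t) and that x(t) = sin(omega t) is one of its solutions. The value of `omega` and the step `h` are arbitrary choices made for illustration.

```python
import math

omega = 2.0   # assumed angular frequency (illustrative value)
h = 1e-3      # step size for the central-difference approximation

def x(t):
    """One solution of the harmonic equation: x(t) = sin(omega * t)."""
    return math.sin(omega * t)

def d2x(t):
    """Central-difference approximation to the second derivative D^2 x(t)."""
    return (x(t + h) - 2.0 * x(t) + x(t - h)) / h ** 2

# Largest violation of D^2 x(t) + omega^2 x(t) = 0 over a grid of time points.
max_residual = max(abs(d2x(0.1 * k) + omega ** 2 * x(0.1 * k)) for k in range(100))
print(max_residual)
```

The residual is not exactly zero because the second derivative is only approximated; it shrinks as `h` is reduced.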
Why Differential Equations?
The behavior of a derivative is often of
more interest than the function itself,
especially over short and medium time
periods.
What often counts is how rapidly a system
responds rather than its level of response.
Velocity and acceleration can reflect
energy exchange within a system.
Recall equations like $f = ma$ and $e = mc^2$.
Natural scientists often provide theory
to biologists and engineers in the form
of DIFE’s.
Many fields such as pharmacokinetics
and industrial process control
routinely use DIFE’s as models,
especially for input/output systems.
DIFE’s are especially useful when
feedback systems must be developed
to control the behavior of systems.
The solution to an mth order linear
DIFE is an m-dimensional function
space, and thus the equation can
model variation over replications as
well as average behavior.
$Dx(t) = -y(t) - z(t)$
$Dy(t) = x(t) + a y(t)$
$Dz(t) = b + (x(t) - c) z(t)$
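To see what a nonlinear system of this kind does, here is a minimal sketch that integrates it with a fourth-order Runge-Kutta step. It assumes the three equations read Dx = -y - z, Dy = x + a y, Dz = b + (x - c) z, which is the form of the Rössler system. The parameter values, starting point, and step size are illustrative assumptions, not values from the slides.

```python
def rhs(state, a, b, c):
    """Right-hand side of the assumed system: Dx = -y - z, Dy = x + a*y, Dz = b + (x - c)*z."""
    x, y, z = state
    return (-y - z, x + a * y, b + (x - c) * z)

def rk4_step(state, dt, a, b, c):
    """Advance the state by one classical Runge-Kutta step of length dt."""
    def shifted(s, k, scale):
        return tuple(si + scale * ki for si, ki in zip(s, k))
    k1 = rhs(state, a, b, c)
    k2 = rhs(shifted(state, k1, dt / 2), a, b, c)
    k3 = rhs(shifted(state, k2, dt / 2), a, b, c)
    k4 = rhs(shifted(state, k3, dt), a, b, c)
    return tuple(s + dt / 6 * (p + 2 * q + 2 * r + w)
                 for s, p, q, r, w in zip(state, k1, k2, k3, k4))

a, b, c = 0.2, 0.2, 5.7      # assumed parameter values (illustrative)
state = (1.0, 1.0, 1.0)      # assumed starting point
dt = 0.01
path = [state]
for _ in range(5000):        # integrate from t = 0 to t = 50
    state = rk4_step(state, dt, a, b, c)
    path.append(state)

print(path[-1])
```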
Stochastic DIFE’s
constraints.
Stochastic time.
Differential equations and time
scales
DIFE’s are important where there are
events at different time scales.
The order of the equation plus one
corresponds to the number of time
scales.
A first-order equation can model events
on two time scales: long-term, modeled
by x(t), and short-term, modeled by
Dx(t).
Handwriting has four time
scales
Average spatial position needs only x(t);
the time scale is many seconds.
Overall left-to-right trend requires Dx(t),
with a time scale of a second or less.
Cusps, loops, and strokes require $D^2x(t)$, with
a time scale of 100 msec or so.
Transient effects, such as those from the pen contacting
paper, require $D^3x(t)$, with a time scale of 10
msec.
If we can model data on functions or
functional input/output systems, we
will have a modeling tool that greatly
extends the power and scope of
existing nonparametric curve-fitting
techniques.
These models will be dynamic in the
sense of also modeling the rate of
change in the system.
We may also get better estimates of
functional parameters and their
derivatives.
A simple input/output
system
We begin by looking at a first-order
DIFE for a single output function x(t)
and a single input function u(t): the
single-input/single-output (SISO) case.
But our goal is the linking of multiple
inputs to multiple outputs (MIMO) by
linear or nonlinear systems of
arbitrary order m.
$Dx(t) = -\beta(t) x(t) + \alpha(t) u(t)$
•u(t) is often called the forcing function.
•α(t) and β(t) are the coefficient functions
that define the DIFE.
•The system is linear in these coefficient
functions, and in the input u(t) and output
x(t).
In this simple case, an analytic solution is
possible:
$x(t) = h(t) \left[ x(0) + \int_0^t \alpha(s) u(s) / h(s) \, ds \right]$
where
$h(t) = \exp\left[ -\int_0^t \beta(s) \, ds \right]$
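The analytic solution can be checked numerically. The sketch below assumes the first-order DIFE reads Dx(t) = -beta(t) x(t) + alpha(t) u(t), with h(t) = exp(-integral of beta), and compares the formula (evaluated with the trapezoid rule) against a direct Runge-Kutta integration. The particular `alpha`, `beta`, `u`, and `x0` are made-up illustrations.

```python
import math

def beta(t):
    return 1.0 + 0.5 * math.sin(t)      # assumed coefficient function

def alpha(t):
    return 2.0                          # assumed (constant) coefficient function

def u(t):
    return 1.0 + math.cos(2.0 * t)      # assumed smooth forcing function

n, t_end, x0 = 2000, 5.0, 0.5
dt = t_end / n
ts = [k * dt for k in range(n + 1)]

def cumulative_trapezoid(values):
    """Running trapezoid-rule integral of values sampled on ts, starting at 0."""
    total = [0.0]
    for k in range(n):
        total.append(total[-1] + 0.5 * dt * (values[k] + values[k + 1]))
    return total

# h(t) = exp(-integral_0^t beta(s) ds)
h = [math.exp(-b) for b in cumulative_trapezoid([beta(t) for t in ts])]
# x(t) = h(t) [x(0) + integral_0^t alpha(s) u(s) / h(s) ds]
inner = cumulative_trapezoid([alpha(t) * u(t) / hk for t, hk in zip(ts, h)])
x_formula = [hk * (x0 + ik) for hk, ik in zip(h, inner)]

def rhs(t, x):
    """Right-hand side of the DIFE: Dx = -beta x + alpha u."""
    return -beta(t) * x + alpha(t) * u(t)

# Direct integration with classical Runge-Kutta steps, for comparison.
x_rk4 = [x0]
for k in range(n):
    t, x = ts[k], x_rk4[-1]
    k1 = rhs(t, x)
    k2 = rhs(t + dt / 2, x + dt / 2 * k1)
    k3 = rhs(t + dt / 2, x + dt / 2 * k2)
    k4 = rhs(t + dt, x + dt * k3)
    x_rk4.append(x + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4))

max_gap = max(abs(p - q) for p, q in zip(x_formula, x_rk4))
print(max_gap)
```

The two curves agree up to quadrature error, which is what the formula predicts for any choice of coefficient and forcing functions.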
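For constant α and β, x(0) = 0, and an input u(t) that steps from 0 to 1 at
time t1:
$x(t) = (\alpha/\beta) \left[ 1 - e^{-\beta (t - t_1)} \right], \quad t \ge t_1$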
α/β is the gain in the system.
Constant β controls the responsivity of the
system to a change in input.
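The roles of the gain and of β can be illustrated with a few lines of code. This sketch assumes constant coefficients, a unit step in u(t) at time t1, and x(t) = 0 before the step, so that x(t) = (alpha/beta)[1 - exp(-beta (t - t1))] afterwards. The numerical values are illustrative only.

```python
import math

alpha, beta, t1 = 2.0, 0.5, 1.0   # assumed constants (illustrative)

def step_response(t):
    """x(t) = (alpha/beta) [1 - exp(-beta (t - t1))] for t >= t1, and 0 before the step."""
    if t < t1:
        return 0.0
    return (alpha / beta) * (1.0 - math.exp(-beta * (t - t1)))

gain = alpha / beta               # level the output settles to after a unit step
time_constant = 1.0 / beta        # larger beta means a faster response

# Fraction of the final level reached one time constant after the step: 1 - 1/e.
fraction_after_one_tc = step_response(t1 + time_constant) / gain
# Long after the step, the output is indistinguishable from the gain.
late_value = step_response(t1 + 50.0 / beta)

print(gain, time_constant, fraction_after_one_tc, late_value)
```

Doubling β halves the time needed to reach any given fraction of the final level, and also halves the gain unless α is doubled too.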
How can we estimate a
DIFE from noisy data?
The DIFE as a linear differential
operator
We can express the first order DIFE as a
linear differential operator:
$L_{\alpha\beta} x(t) = Dx(t) + \beta(t) x(t) - \alpha(t) u(t) = 0$
Smoothing data with the
operator L
If we know the differential equation, then the
operator Lαβ can define a data smoother. The
penalized least squares fitting criterion is:
$\mathrm{PENSSE} = \sum_{i=1}^{N} [y_i - x(t_i)]^2 + \lambda \int [L_{\alpha\beta} x(t)]^2 \, dt$
y , Z [ Z ' Z R , ] 1 Z '[ y s , ]
R , L L ', s , L u
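A penalized least squares smoother of this kind can be sketched without a basis expansion by putting x(t) on an even grid and replacing the operator by a finite-difference residual. This is a swapped-in discretization chosen for brevity, not the basis-function computation on the slide. It assumes the first-order operator L x = Dx + beta x - alpha u with constant, known alpha and beta; the function name `dife_smooth` and all numerical values are hypothetical.

```python
import math
import random

def dife_smooth(y, u_mid, h, alpha, beta, lam):
    """Penalized least squares smooth for Dx = -beta x + alpha u on an even grid.

    Minimizes sum (y_k - x_k)^2 + lam * h * sum r_k^2, with the midpoint residual
    r_k = (x[k+1] - x[k]) / h + beta * (x[k+1] + x[k]) / 2 - alpha * u_mid[k].
    The normal equations are tridiagonal and are solved by the Thomas algorithm.
    """
    n = len(y)
    p, q, w = 1.0 / h + beta / 2.0, -1.0 / h + beta / 2.0, lam * h
    diag, rhs, off = [1.0] * n, list(y), w * p * q
    for k in range(n - 1):                # residual r_k couples x_k and x_{k+1}
        g = alpha * u_mid[k]
        diag[k] += w * q * q
        diag[k + 1] += w * p * p
        rhs[k] += w * q * g
        rhs[k + 1] += w * p * g
    for k in range(1, n):                 # forward elimination
        m = off / diag[k - 1]
        diag[k] -= m * off
        rhs[k] -= m * rhs[k - 1]
    x = [0.0] * n
    x[-1] = rhs[-1] / diag[-1]
    for k in range(n - 2, -1, -1):        # back substitution
        x[k] = (rhs[k] - off * x[k + 1]) / diag[k]
    return x

def rmse(a, b):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)) / len(a))

# Simulated step-response data (all values are illustrative assumptions).
random.seed(0)
alpha, beta, t1, sigma = 2.0, 0.5, 2.0, 0.2
n, h = 201, 0.05
ts = [k * h for k in range(n)]
u_mid = [1.0 if t + h / 2.0 >= t1 else 0.0 for t in ts[:-1]]   # input at interval midpoints
truth = [(alpha / beta) * (1.0 - math.exp(-beta * (t - t1))) if t >= t1 else 0.0 for t in ts]
y = [x + random.gauss(0.0, sigma) for x in truth]

x_hat = dife_smooth(y, u_mid, h, alpha, beta, lam=1e3)
print(rmse(y, truth), rmse(x_hat, truth))
```

Because the penalty measures departure from the differential equation rather than roughness in general, a large λ pulls the fit toward a solution of the equation instead of toward a straight line.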
How to estimate L
Lαβ is a function of weight coefficients α(t)
and β(t).
If α(t) and β(t) are functions of parameter
vectors a and b, respectively, then we can
optimize the profiled error sum of squares
N 2
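Profiling can be sketched as two nested steps: for each candidate parameter value, smooth the data with the operator that value defines, then score the candidate by the error sum of squares of that smooth. The sketch below assumes constant α and β, a finite-difference version of the smoother (a swapped-in discretization, not the slides' basis-function computation), a fixed λ, and a coarse grid search in place of a proper optimizer. The names `dife_smooth` and `profiled_sse` and all numerical values are hypothetical.

```python
import math
import random

def dife_smooth(y, u_mid, h, alpha, beta, lam):
    """Minimize sum (y_k - x_k)^2 + lam * h * sum r_k^2, where
    r_k = (x[k+1] - x[k]) / h + beta * (x[k+1] + x[k]) / 2 - alpha * u_mid[k]
    discretizes L x = Dx + beta x - alpha u. Tridiagonal solve (Thomas algorithm)."""
    n = len(y)
    p, q, w = 1.0 / h + beta / 2.0, -1.0 / h + beta / 2.0, lam * h
    diag, rhs, off = [1.0] * n, list(y), w * p * q
    for k in range(n - 1):
        g = alpha * u_mid[k]
        diag[k] += w * q * q
        diag[k + 1] += w * p * p
        rhs[k] += w * q * g
        rhs[k + 1] += w * p * g
    for k in range(1, n):                 # forward elimination
        m = off / diag[k - 1]
        diag[k] -= m * off
        rhs[k] -= m * rhs[k - 1]
    x = [0.0] * n
    x[-1] = rhs[-1] / diag[-1]
    for k in range(n - 2, -1, -1):        # back substitution
        x[k] = (rhs[k] - off * x[k + 1]) / diag[k]
    return x

# Simulated data from known parameters (illustrative assumptions).
random.seed(0)
true_alpha, true_beta, t1, sigma = 2.0, 0.5, 2.0, 0.1
n, h = 201, 0.05
ts = [k * h for k in range(n)]
u_mid = [1.0 if t + h / 2.0 >= t1 else 0.0 for t in ts[:-1]]
truth = [(true_alpha / true_beta) * (1.0 - math.exp(-true_beta * (t - t1)))
         if t >= t1 else 0.0 for t in ts]
y = [x + random.gauss(0.0, sigma) for x in truth]

def profiled_sse(alpha, beta, lam=1e3):
    """Inner step: smooth with the candidate operator; outer criterion: SSE of that smooth."""
    x_hat = dife_smooth(y, u_mid, h, alpha, beta, lam)
    return sum((yk - xk) ** 2 for yk, xk in zip(y, x_hat))

# Outer step: coarse grid search over candidate (alpha, beta) pairs.
grid = [(a, b) for a in (1.0, 1.5, 2.0, 2.5, 3.0) for b in (0.3, 0.4, 0.5, 0.6, 0.7)]
alpha_hat, beta_hat = min(grid, key=lambda ab: profiled_sse(*ab))
print(alpha_hat, beta_hat)
```

In practice the grid search would be replaced by a gradient-based optimizer over the parameter vectors a and b; the grid keeps the sketch dependency-free.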
to be zero or a constant.
Force others to be smooth,
any order.
Multiple inputs $u_j(t)$ and outputs $x_i(t)$.
Replicated functional data.
Nonlinear DIFE’s and operators.
What about choosing λ?
Choosing the smoothing parameter λ is
always a delicate matter.
The right value of λ will be rather large if
the data can be well-modeled by a low-
order DIFE, but not so large as to fail to
smooth observational noise and small
additional functional variation.
Generalized cross-validation seems to
work.
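Generalized cross-validation can be sketched for a smoother of this type. The code below uses the standard criterion GCV(λ) = n SSE / (n - df)^2, where df is the trace of the matrix that maps the data to the smooth. It assumes a finite-difference version of the first-order smoother with known constant α and β, which is a swapped-in discretization rather than the slides' computation; the names `dife_smooth` and `gcv`, the λ grid, and the data are all hypothetical.

```python
import math
import random

def dife_smooth(y, u_mid, h, alpha, beta, lam):
    """Minimize sum (y_k - x_k)^2 + lam * h * sum r_k^2, where
    r_k = (x[k+1] - x[k]) / h + beta * (x[k+1] + x[k]) / 2 - alpha * u_mid[k]
    discretizes L x = Dx + beta x - alpha u. Tridiagonal solve (Thomas algorithm)."""
    n = len(y)
    p, q, w = 1.0 / h + beta / 2.0, -1.0 / h + beta / 2.0, lam * h
    diag, rhs, off = [1.0] * n, list(y), w * p * q
    for k in range(n - 1):
        g = alpha * u_mid[k]
        diag[k] += w * q * q
        diag[k + 1] += w * p * p
        rhs[k] += w * q * g
        rhs[k + 1] += w * p * g
    for k in range(1, n):                 # forward elimination
        m = off / diag[k - 1]
        diag[k] -= m * off
        rhs[k] -= m * rhs[k - 1]
    x = [0.0] * n
    x[-1] = rhs[-1] / diag[-1]
    for k in range(n - 2, -1, -1):        # back substitution
        x[k] = (rhs[k] - off * x[k + 1]) / diag[k]
    return x

# Simulated step-response data (illustrative assumptions).
random.seed(0)
alpha, beta, t1, sigma = 2.0, 0.5, 2.0, 0.2
n, h = 101, 0.1
ts = [k * h for k in range(n)]
u_mid = [1.0 if t + h / 2.0 >= t1 else 0.0 for t in ts[:-1]]
zero_u = [0.0] * (n - 1)
truth = [(alpha / beta) * (1.0 - math.exp(-beta * (t - t1))) if t >= t1 else 0.0 for t in ts]
y = [x + random.gauss(0.0, sigma) for x in truth]

def gcv(lam):
    """Return (GCV score, degrees of freedom) for smoothing parameter lam."""
    x_hat = dife_smooth(y, u_mid, h, alpha, beta, lam)
    sse = sum((yk - xk) ** 2 for yk, xk in zip(y, x_hat))
    # df = trace of the smoothing matrix: smoothing the k-th unit vector with
    # no forcing gives column k of that matrix, and entry k is its diagonal element.
    df = 0.0
    for k in range(n):
        e_k = [0.0] * n
        e_k[k] = 1.0
        df += dife_smooth(e_k, zero_u, h, 0.0, beta, lam)[k]
    return n * sse / (n - df) ** 2, df

lams = [1e-2, 1.0, 1e2, 1e4]
results = {lam: gcv(lam) for lam in lams}
best_lam = min(lams, key=lambda lam: results[lam][0])
dfs = [results[lam][1] for lam in lams]
print(best_lam, dfs)
```

As λ grows, the degrees of freedom fall toward the dimension of the equation's solution space, which is 1 for a first-order equation.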
Some Simulations
Let’s see how well this method works
where we know what we’re
estimating.
A simple harmonic example
For i=1,…,N and j=1,…,n, let
$y_{ij} = c_{i1} + c_{i2} t_j + c_{i3} \sin(6\pi t_j) + c_{i4} \cos(6\pi t_j) + \varepsilon_{ij}$
where the $c_{ik}$'s and the $\varepsilon_{ij}$'s are N(0,1), and t = 0(0.01)1.
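The simulated data can be generated in a few lines. This sketch assumes the model is y_ij = c_i1 + c_i2 t_j + c_i3 sin(6 pi t_j) + c_i4 cos(6 pi t_j) + e_ij with standard normal coefficients and errors. The number of curves `N` is a hypothetical choice. The second function checks an observation added here rather than taken from the slides: the noise-free part of every curve is annihilated by the fourth-order operator D^4 + (6 pi)^2 D^2.

```python
import math
import random

random.seed(1)
N = 20                                    # number of curves (hypothetical choice)
t = [0.01 * j for j in range(101)]        # t = 0, 0.01, ..., 1
omega = 6.0 * math.pi

def curve(c, tj):
    """Noise-free curve: c1 + c2 t + c3 sin(6 pi t) + c4 cos(6 pi t)."""
    return c[0] + c[1] * tj + c[2] * math.sin(omega * tj) + c[3] * math.cos(omega * tj)

def annihilator_residual(c, tj):
    """D^4 x + omega^2 D^2 x for the noise-free curve, using exact derivatives."""
    s = c[2] * math.sin(omega * tj) + c[3] * math.cos(omega * tj)
    d2 = -omega ** 2 * s                  # constant and linear terms vanish under D^2
    d4 = omega ** 4 * s
    return d4 + omega ** 2 * d2

coefs = [[random.gauss(0.0, 1.0) for _ in range(4)] for _ in range(N)]
Y = [[curve(c, tj) + random.gauss(0.0, 1.0) for tj in t] for c in coefs]

max_resid = max(abs(annihilator_residual(c, tj)) for c in coefs for tj in t)
print(len(Y), len(Y[0]), max_resid)
```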
[Figure: data, x(t), and u(t) plotted against t]
Results from 100 samples using minimum
generalized cross-validation to choose λ:
DIFE’s.
A smooth strictly
monotone function
can be expressed
as the second
order DIFE
We can monotonically smooth data by
estimating the second order DIFE directly.
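The equation itself did not survive on this slide. The sketch below assumes the standard form from the monotone smoothing literature, D^2 x(t) = w(t) Dx(t), whose solutions are x(t) = C0 + C1 * integral of exp(integral of w). Any unconstrained w(t) then yields a strictly increasing x(t) when C1 > 0. The function `w`, the constants C0 = 0 and C1 = 1, and the grid are illustrative assumptions.

```python
import math

def w(t):
    return 3.0 * math.cos(5.0 * t)        # assumed unconstrained coefficient function

n, t_end = 1000, 1.0
dt = t_end / n
ts = [k * dt for k in range(n + 1)]

# W(t) = integral_0^t w(s) ds, accumulated with the trapezoid rule.
W = [0.0]
for k in range(n):
    W.append(W[-1] + 0.5 * dt * (w(ts[k]) + w(ts[k + 1])))

# Dx(t) = C1 * exp(W(t)) is positive for C1 > 0; here C1 = 1.
dx = [math.exp(v) for v in W]

# x(t) = C0 + integral_0^t Dx(s) ds; here C0 = 0.
x = [0.0]
for k in range(n):
    x.append(x[-1] + 0.5 * dt * (dx[k] + dx[k + 1]))

# Check D^2 x = w Dx, using central differences of Dx for the second derivative.
max_resid = max(abs((dx[k + 1] - dx[k - 1]) / (2.0 * dt) - w(ts[k]) * dx[k])
                for k in range(1, n))
print(x[-1], max_resid)
```

Estimating w(t) from data, rather than fixing it as here, is what turns this construction into a monotone smoother.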
[Figure: x(t) against t, showing the data, the estimate, and the true function]
[Figure: Dx(t) against t, showing the data, the estimate, and the true function]
A Simulated Chemical
Reactor
Here is a textbook model for the
input and output concentrations in
a non-isothermal continuously-
stirred tank reactor.
Input measurements are (1) input
concentration $C_{in}$, (2) flow rate F, and
(3) temperature T.
Output is concentration $C_{out}$.
The Differential Equation
$DC_{out}(t) = -\beta(t) C_{out}(t) + F(t) C_{in}(t)$
where
$\beta(t) = e^{\ln k_0 - 1000 \tau / T}$
The two parameters to be estimated
are:
$k_0$ and τ
Process control experiments
Engineers studying systems like
these like to carry out experiments in
which inputs are stepped up or down
at random times.
They infer the dynamics of the
process from the impacts of these
steps on the output(s).
[Figure: two stepped functions plotted against t, the lower one being the input u(t)]
We solved this differential equation
for known values of the two unknown
parameters,
and then added zero mean Gaussian
error with a standard deviation of
0.01.
Our estimate of k0 was 8.11 as opposed
to the data-generating value of 8.33.
[Figure: x(t) plotted against t]
A Real-Data Example
Flow in an oil refinery
distillation column
The single input is “reflux flow” and
the output is “tray 47” level.
There were 194 sampling points.
30 B-spline basis functions were used
to fit the output, and a step function
was used to model the input.
Results for the refinery data
After some experimentation with first and
second order models, and with constant
and varying coefficient models, the clear
conclusion seems to be the constant
coefficient model:
$Dx(t) = -0.02 x(t) - 0.19 u(t)$
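The two fitted coefficients can be read against the first-order form $Dx(t) = -\beta x(t) + \alpha u(t)$. Taking only the magnitudes 0.02 and 0.19 as given, and assuming 0.02 plays the role of β and 0.19 the role of |α|, the implied time constant and gain magnitude are:

```latex
\[
\frac{1}{\beta} = \frac{1}{0.02} = 50,
\qquad
\left| \frac{\alpha}{\beta} \right| = \frac{0.19}{0.02} = 9.5
\]
```

The time constant is in the time units of the sampling, so after a step in reflux flow the tray level would cover about 63% of its eventual change in roughly 50 time units, and a unit step in the input would eventually move the output by about 9.5 units.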
Summary
We can estimate differential equations
directly from noisy data with little bias and
good precision.
This gives us a lot more modeling power,
especially for fitting input/output functional
data.
Estimates of derivatives can be much
better, relative to smoothing methods.
Special functions, such as monotone, can be
fit by estimating the DIFE that defines them.