1.3.2a Continuous Data Analysis With Analog Computers Using Statistical and Regression Techniques 1964

This paper discusses the continuous analysis of data using analog computers and statistical techniques, focusing on estimating fundamental parameters such as mean and variance. It highlights the advantages of using analog methods for real-time data analysis in industrial processes, enabling continuous monitoring and control. The paper also presents circuit designs for calculating the mean and variance of continuous signals, emphasizing the importance of weighted averages for accurate data representation.

GENERAL SECTION COMPUTING TECHNIQUES: 1.3.2a

CONTINUOUS DATA ANALYSIS WITH ANALOG COMPUTERS USING
STATISTICAL AND REGRESSION TECHNIQUES

ABSTRACT: This paper shows how certain fundamental statistical parameters of a given population--the mean, the variance, the autocorrelation function, the cross correlation function, the Fourier transform, or the power spectrum--can be estimated continuously through calculations employing simple analog techniques. In addition, analog methods are developed for continuous regression analysis wherein two or more populations or variables are compared statistically to find significant relationships.

© Electronic Associates, Inc. 1964
All Rights Reserved
Printed in U.S.A. 26 4 Bulletin No. ALAe 62023
GENERAL

The need for statistical data analysis in the process industries is well known. Measurements of process variables or parameters are subject to random disturbances such as the presence of impurities in varying amounts, environmental changes, weather, etc. Very often it becomes necessary to obtain the "best estimate" of a variable over some prior time interval for purposes of control. It is this concept of "estimate" that introduces statistics.

With the availability of small, rugged and reliable analog computing components especially designed for plant environments, it becomes feasible both economically and technically to apply statistical techniques to the analysis of continuous data for either measurement or control. Of special importance is the fact that an analog device can do a simple or complex calculation task while remaining a small package in terms of physical dimensions and cost. Other computational approaches almost always imply the purchase of a relatively large "minimum" amount of hardware. Thus, one is encouraged to explore the "simple" applications--situations which pay their own way while providing experience in the use and testing of the analog approach.

Such computations can be performed Off-Line, On-Line Open Loop, or On-Line Closed Loop on continuous-signal inputs; digitizing of the analog signal is unnecessary. Noisy signals, unavoidable in many pilot or plant operations, whose deviation traces serve as the basis for subsequent calculations can be "reduced" to more meaningful form by relatively simple and economical computer circuits. The mean and standard deviation of a noisy signal can be recorded continuously and on-line, so that all subsequent problems in interpreting the data can be reduced markedly. More complex but still relatively inexpensive circuits can be used to record continuously, either on-line or off-line, the Fourier Series Coefficients of a Signal. Or, in the dynamic testing of systems, the transformation from impulse response to frequency response can be accomplished, thus permitting determination of the best combination of simple input and easily-interpreted output.

THE MEAN

One of the fundamental statistical estimates is that of the "mean" or the "arithmetic average" of a variable or a parameter. When dealing with discrete information, the mean is defined by the summation

$$\bar{f} = \frac{1}{N} \sum_{i=1}^{N} f_i \tag{1}$$

which is recognized easily as the familiar arithmetic average. For data analysis, this statistical property is important for two reasons: 1) it is fundamental to the definition of other statistical parameters, and 2) it applies equally well to normal population distributions and to those that are not distributed normally.
It would appear desirable to utilize this statistical property in the analysis of continuous data or for the measurement of continuous process variables for purposes of control. In order to do so, it becomes necessary to obtain the "estimate" of the mean as a continuous and changing function of time. Specifically, one must be able to define and compute the average or mean value, $\bar{f}$, of a signal, f(t), varying with time over the interval $T_1 \le t \le T_2$.

As an example, assume that a steel mill is producing a continuous metal strip which, ideally, should be uniform but actually is fluctuating in thickness (as shown in Figure 1) because of inevitable random disturbances in the process. Over any time interval of reasonable length, the mean thickness should be equal to the nominal value although small deviations are allowable. A sensing device is monitoring the thickness as the strip emerges from the mill, and a transducer is generating a signal, f(t), which is proportional to the instantaneous thickness. We would like to compute the average value of this signal so that it can be compared with the desired or nominal value in order to see if the process is under control.

Figure 1. Thickness of Steel Plate from a Rolling Mill as a Function of Time (vertical axis: thickness, fluctuating about the desired or nominal value; horizontal axis: length, approximately proportional to time)

The most obvious definition of the mean or average value for f(t) over the interval $T_1 \le t \le T_2$ is

$$\bar{f}(t) = \frac{1}{T_2 - T_1} \int_{T_1}^{T_2} f(t)\,dt \tag{2}$$

This value can be computed with the simple circuit of Figure 2. At time $T_1$ the integrator is placed in the COMPUTE mode, and at time $T_2$ its output is observed. The integrator then can be reset and another average taken.

Figure 2. Analog Circuit for Calculation of Estimate of the Mean for a Fixed Time Interval

A refinement to this circuit would be to generate $\bar{f}(t)$ as a continuously varying function of time as shown in Figure 3. The time interval then must be considered as a variable so that the average is computed continuously from time $T_1$. In Figure 3, $T_1$ is considered to be zero computer time and $T_2$ has been replaced by t since the upper limit of the integral is a variable.

Figure 3. Analog Circuit for Calculation of a Continuous Estimate of the Mean for a Fixed Time Interval (the circuit forms $\bar{f}(t) = \frac{1}{t}\int_0^t f(t)\,dt$)

The circuit of Figure 3, although theoretically an improvement over that of Figure 2, has two rather obvious limitations: 1) the uncertainty of the division when t = 0, and 2) the need to select maximum running time in advance, since the integrator will eventually overload. This latter difficulty also occurs with the circuit of Figure 2. In each circuit, the integration can continue only over a certain range of time, and the circuit then must be reset. However, the past values of f(t) are lost in the resetting; the average computed during the second "run" is independent of the values of f(t) obtained during the first run. If a succession of runs of length T are made and the circuits are reset each time, it is clear that the last computed average depends only on the behavior of f(t) in the last T units of time. In other words, from the point of view of the most recent average, information older than T units of time is obsolete.

The resetting necessary with the previous two circuits can be avoided, and a much simpler circuit obtained without worry about overloads and division circuits. The clue to the method is the fact that past values of f(t) become obsolete. Since the
basic signal, f(t), is continuous, it seems advantageous to let past information become obsolete gradually rather than abruptly. This means that $\bar{f}(t)$ has to be defined in such a way that recent values count much more heavily than earlier values and the behavior of f(t) in the remote past has very little effect. This suggests that a weighted average be used.

The weighted average $\bar{f}(t)$ of a function, f(t), over $T_1 \le t \le T_2$ with weight function $\phi(t)$ is defined by (1)* where $\phi(t) \ge 0$ in the interval $T_1 \le t \le T_2$. Thus

$$\bar{f}(t) = \frac{\int_{T_1}^{T_2} f(t)\,\phi(t)\,dt}{\int_{T_1}^{T_2} \phi(t)\,dt} \tag{3}$$

The integral in the denominator serves to "normalize" the expression. The function $\phi(t)$ can be chosen arbitrarily to emphasize or de-emphasize various parts of the interval from $T_1$ to $T_2$.

Remembering the requirement that the recent past must be emphasized and the remote past de-emphasized, it follows that we should choose a weighting function, $\phi(t)$, which is increasing and such that $\lim_{t \to -\infty} \phi(t) = 0$. Many functions have this property but the exponential function is a natural one and leads to a simple computer circuit. Picking an exponential weighting function, $e^{at}$ (a > 0), Equation 3 becomes

$$\bar{f}(t) = \frac{\int_{T_1}^{T_2} e^{at} f(t)\,dt}{\int_{T_1}^{T_2} e^{at}\,dt} \tag{4}$$

$$\bar{f}(t) = \frac{a \int_{T_1}^{T_2} e^{at} f(t)\,dt}{e^{aT_2} - e^{aT_1}} \tag{5}$$

This can be simplified by letting $T_1 \to -\infty$, or

$$\bar{f}(T_2) = a\,e^{-aT_2} \int_{-\infty}^{T_2} e^{at} f(t)\,dt \tag{6}$$

The minus infinity in the lower limit serves to indicate that the average has been generated for such a long time that the effect of what happened before $T_1$ is negligible. In other words, since the exponential weighting function, $e^{at}$, approaches zero as $t \to -\infty$, the importance of events prior to $T_1$ is negligible if $T_1$ is suitably chosen.

Dropping the subscripts, Equation 6 can be written as

$$\bar{f}(T) = a\,e^{-aT} \int_{-\infty}^{T} f(t)\,e^{at}\,dt \tag{7}$$

Rearranging

$$\bar{f}(T) = a \int_{-\infty}^{T} f(t)\,e^{-a(T - t)}\,dt \tag{8}$$

Otterman (2) defines this to be the "Exponentially Mapped Past" or EMP of f(t) over a time interval defined by a.

Implementation of the analog circuit for solving this equation is reasonably straightforward. Differentiating Equation 7 with respect to machine time, T, (t is a dummy variable) gives

$$\frac{d\bar{f}(T)}{dT} = a\left\{\left(-a\,e^{-aT}\right) \int_{-\infty}^{T} e^{at} f(t)\,dt + e^{-aT}\left[e^{aT} f(T)\right]\right\} \tag{9}$$

$$\frac{d\bar{f}(T)}{dT} = a\left[-\bar{f}(T) + f(T)\right] = a f(T) - a \bar{f}(T) \tag{10}$$

†Those familiar with linear analysis and, in particular, convolution integrals, will recognize Equation 8 as the output of a filter whose impulse response is $a e^{-at}$; that is, a first-order filter with time constant 1/a.

*Numbers in parentheses in the body of the text refer to references listed in APPENDIX I.
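For readers who want to try Equation 10 on sampled data, the following is a minimal Python sketch, not part of the original analog design. It assumes uniformly spaced samples that are held constant over each interval `dt`, so the first-order lag can be stepped exactly. The function name `emp_mean` and the parameter names are illustrative choices.

```python
import math


def emp_mean(samples, a, dt, initial=0.0):
    """Running EMP mean of uniformly sampled data (Equation 10).

    Assumes the input holds each sample value for one interval dt, so the
    first-order lag can be stepped exactly instead of with Euler integration.
    """
    gain = 1.0 - math.exp(-a * dt)  # fraction of the gap closed in one step
    fbar = initial                  # plays the role of the integrator initial condition
    history = []
    for f in samples:
        fbar += gain * (f - fbar)   # discrete form of d(fbar)/dT = a * (f - fbar)
        history.append(fbar)
    return history


# Unit step input observed for three time constants (3/a seconds).
a, dt = 2.0, 0.001
n_steps = round(3.0 / (a * dt))
response = emp_mean([1.0] * n_steps, a, dt)
print(f"fraction of step reached: {response[-1]:.4f}")
```

With a unit step as input, the output after three time constants is 1 - e^{-3}, or about 95% of the step, which is the behavior expected of a first-order lag with time constant 1/a.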
Equation 10 is implemented by the simple circuit of Figure 4, which is recognized easily as the circuit for a simple filter or first order lag. Note that the input and output signals have been written in terms of the more familiar notation for time, t, which is not to be confused with the dummy variable of Equation 7.

Figure 4. Analog Circuit for Obtaining the EMP Estimate of the Mean

The value of the constant, a, determines how fast past information becomes obsolete. It is chosen arbitrarily: small enough (that is, with a time constant, 1/a, long enough) to filter out nonessential random fluctuations, and large enough not to obscure long-term trends. A useful rule of thumb can be developed by examining the response of the circuit of Figure 4. If f(t) changes abruptly (step input), $\bar{f}(t)$ will follow gradually, making 95% of the change in 3 time constants or a time interval of 3/a. In other words, as shown in Figure 5, after three time constants, the integrator has forgotten 95% of the information it had before the step change. Consequently, the EMP average defined by Equation 8 is an estimate* of the mean over a time interval approximately equal to 3/a.

*If a 99% criterion were used, the time interval would be approximately 5/a.

Figure 5. The EMP Mean of a Continuous Variable Provides a Measure of the Average of the Variable for a Continuously Updated Fixed Time Interval. Note the 95% decrease in the value of the weighting function over a period of length 3/a. This means that the weighted average at time, t, is virtually independent of values that occurred prior to time t - 3/a. (Plot of the function to be averaged against time, t.)

In the circuit of Figure 4 it is obvious that an initial condition applied to the integrator will improve the computed average at the beginning. This value should represent a good guess as to the nominal or expected mean value of f(t). One normally would have such an estimate available. If it is a good estimate, the computed average will be reasonable from the start; if it is a bad one, it will not make any difference after about three to five time constants.

THE VARIANCE

A second important statistical parameter is the variance which is used to give a basic measure of the distribution of a population. It is defined as the square of the standard deviation and is equal to the mean-squared deviation of the variable from its mean. For discrete data, an estimate of the variance is obtained with the summation

$$\sigma^2 = \frac{1}{N - 1} \sum_{i=1}^{N} \left(f_i - \bar{f}\right)^2 \tag{11}$$

The term N - 1 corresponds to the number of degrees of freedom involved in the calculation of the estimates of the variance (3); practical considerations dictate that the number of samples, N, will be larger than one.

In a manner similar to the definition of the mean, Otterman (2) defines the EMP variance as

$$\sigma^2(T) = a \int_{-\infty}^{T} \left[f(t) - \bar{f}(t)\right]^2 e^{-a(T - t)}\,dt \tag{12}$$

which, based on the preceding development of the EMP mean, will be recognized as the weighted average of the square of the deviation of the variable from its mean.

The computer circuit for calculating an estimate of the EMP variance is developed easily without recourse to mathematical manipulations. From the definition, and remembering that averaging is accomplished by the first-order filter circuit, the following operations are required:

1) form the mean with the first-order filter circuit.
2) subtract the mean from the current value of f(t).

3) square the difference of the mean from the current value of f(t).

4) average the square with a second filter circuit.

From these requirements the circuit of Figure 6 is derived easily.

Figure 6. Analog Circuit for Calculation of the EMP Estimate of the Variance

Example: The LD Steel Process can be used as an example of the use of the EMP mean and variance for the control of a process. This is the oxygen steel making process wherein it is possible to control bath temperatures without an external fuel supply by charging the vessel with materials that are thermally balanced. The charge materials consist of hot metal (iron), scrap, and lime.

The hot metal temperature can range from 2200°F to 2600°F, and, hence, it is necessary to measure the temperature of the iron to obtain a correct thermal balance.

A two-color radiation pyrometer method is used to measure the iron temperature while it is being poured into the vessel. A typical trace is shown in Figure 7. (For further details of the process the reader is referred to reference 4.)

Figure 7. Plot of Hot Metal Temperature vs Time for the LD Steel Process (vertical axis: temperature, 2300 to 2600 °F; horizontal axis: time, starting at $t_0$; annotation: READING TAKEN)

The initial variations in temperature, $t_0 < t < t_1$, are due to the presence of smoke and the formation of voids in the pour. The operator disregards these readings until the smoke has been blown away (by a fan) and smooth pouring is established. Note that even after these conditions have been attained, the temperature measurement, $t_1 < t < t_2$, is subject to fluctuations.

At present, the "temperature" reading is inserted manually into the charge balance computer (the mean value of temperature is "guesstimated" by the operator). This could be automated easily with an EMP mean value circuit since the transducer signal is a continuous electrical signal. The condition that the reading of the mean value circuit should not be used at the beginning of the time history, $t < t_1$, for reasons mentioned previously, can be automated by using the standard deviation (variance) as a control criterion, i.e., when $\sigma^2(T)$ is greater than a reference value, do not use $\bar{f}(T)$; when $\sigma^2(T)$ is less than a reference value, use $\bar{f}(T)$. The reference value chosen will depend on the maximum variance expected during smooth pour conditions. This can be mechanized readily on the analog computer by means of a comparator.

With this simple technique a better estimate of the mean temperature could be inserted into the charge computer automatically and economically.

AUTOCORRELATION

The autocorrelation function, defined as an integral between fixed limits, is converted easily to a continuous EMP autocorrelation function, $\phi(\tau)$, by the definition

$$\phi(\tau) = a \int_{-\infty}^{T} f(t)\,f(t - \tau)\,e^{-a(T - t)}\,dt \tag{13}$$

Cross correlation also could be accomplished by the substitution of a second function, g(t), into the time delay box, $\tau$, shown in Figure 8, so that the output of the delay box is $g(t - \tau)$ and the output of the multiplier becomes $-[f(t)][g(t - \tau)]$.

Figure 8. Circuit for Obtaining the Continuous EMP Autocorrelation Function for Time Delay $\tau$
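Looking back at the four-step variance circuit and the comparator rule of the LD steel example, the same logic can be sketched digitally. This is a hedged illustration only: the sampled-data stepping, the function names, the numerical limits, and the synthetic temperature trace are all assumptions made here, not values from the paper.

```python
import math


def emp_mean_and_variance(samples, a, dt, mean0=0.0, var0=0.0):
    """Yield (mean, variance) pairs following the four steps of the variance circuit."""
    gain = 1.0 - math.exp(-a * dt)   # per-sample gain of a first-order lag
    mean, var = mean0, var0
    for f in samples:
        mean += gain * (f - mean)        # 1) first filter forms the mean
        deviation = f - mean             # 2) subtract the mean from the current value
        squared = deviation * deviation  # 3) square the difference
        var += gain * (squared - var)    # 4) second filter averages the square
        yield mean, var


def gated_mean(samples, a, dt, var_limit, mean0=0.0, var0=0.0):
    """Return the latest mean seen while the variance was below var_limit, else None."""
    accepted = None
    for mean, var in emp_mean_and_variance(samples, a, dt, mean0, var0):
        if var < var_limit:  # comparator: trust the mean only when the signal is quiet
            accepted = mean
    return accepted


# Synthetic trace (invented for illustration): a rough start, then a smooth pour.
rough = [2200.0 if k % 2 == 0 else 2600.0 for k in range(500)]
smooth = [2445.0 if k % 2 == 0 else 2455.0 for k in range(1500)]
reading = gated_mean(rough + smooth, a=2.0, dt=0.01, var_limit=100.0,
                     mean0=2400.0, var0=1.0e4)
print(reading)
```

Starting the variance filter above the limit (`var0`) keeps the comparator from accepting a reading before the filters have seen any data; that choice is a design assumption of this sketch.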
Reasonable time delays are obtained easily by assembling linear analog computing components. Figure 9 shows a fourth-order Padé circuit for generating a time delay, $\tau$. This circuit is accurate to within 1 degree of phase shift for input frequencies in f(t) such that the product of the maximum useful signal frequency, $\omega_m$, with the time delay, $\tau$, shall not exceed 6.5 radians, i.e.,

$$\tau\,\omega_m \le 6.5 \text{ radians} \tag{14}$$

Figure 9. Circuit for Fourth-Order Padé Approximation for Ideal Time Delay of Magnitude $\tau$ (5). NOTE: $7/(15\tau) = 0.4667\,(1/\tau)$

FOURIER AND POWER SPECTRUM ANALYSIS

The EMP Fourier transform of f(t) is defined as

$$F(\omega) = a \int_{-\infty}^{T} f(t)\,e^{-a(T - t)}\,e^{-j\omega t}\,dt \tag{15}$$

or

$$F(\omega) = a\,e^{-j\omega T} \int_{-\infty}^{T} f(t)\,e^{-a(T - t)}\,e^{j\omega(T - t)}\,dt \tag{16}$$

The EMP power spectrum is defined as

$$P(\omega) = |F(\omega)|^2 = a^2 \left[\int_{-\infty}^{T} f(t)\,e^{-a(T - t)} \cos \omega(T - t)\,dt\right]^2 + a^2 \left[\int_{-\infty}^{T} f(t)\,e^{-a(T - t)} \sin \omega(T - t)\,dt\right]^2 \tag{17}$$

Figure 10 shows the analog circuit for obtaining the power spectrum, $P(\omega)$, of the Fourier transform, $F(\omega)$. The real component, $E_1$, is formed from the transfer function

$$\frac{E_1}{f(t)} = \frac{(p + a)\,a}{(p + a)^2 + \omega^2} \tag{18}$$

and the imaginary component, $E_2$, from

$$\frac{E_2}{f(t)} = \frac{a\,\omega}{(p + a)^2 + \omega^2} \tag{19}$$

Note that this circuit gives $P(\omega)$ at one value of $\omega$. The parameter, $\omega$, can be changed merely by changing the two potentiometers labelled "$\omega$".

Figure 10. Circuit for Calculation of EMP Fourier Transform and Power Spectrum

An alternative is to build similar circuits in parallel, all having the same input, f(t), and differing only in the setting of $\omega$. This will allow many points of the power spectrum to be obtained simultaneously.

REGRESSION ANALYSIS

Until now, only those statistical parameters that describe a single population--mean, variance, power spectrum, etc.--have been discussed. Of interest, also, is the method of statistics whereby relationships between two or more populations, representing different variables, are found. This method is called "regression analysis".

There are many types of regression--linear, quadratic, high order, multivariate, etc. These terms refer to the type of expression used to relate the variables. For example,

$$Y = mX + b \quad \text{(linear regression)} \tag{20}$$

$$Y = aX^2 + bX + c \quad \text{(quadratic regression)} \tag{21}$$
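Returning briefly to the EMP power spectrum of Equations 17 to 19: the two transfer functions share one complex pole, so a digital sketch can carry the real and imaginary components in a single complex state. The code below is an illustrative assumption-laden sketch (sampled input held constant over each step `dt`, zero initial conditions, hypothetical function name), not the circuit of Figure 10.

```python
import cmath
import math


def emp_power_spectrum(samples, a, omega, dt):
    """Return P(omega) = E1**2 + E2**2 after all samples have been processed.

    The complex state z = E1 + j*E2 obeys dz/dT = a*f + (-a + j*omega)*z,
    which reproduces the transfer functions of Equations 18 and 19.
    """
    lam = complex(-a, omega)
    decay = cmath.exp(lam * dt)       # exact state decay over one step
    drive = a * (decay - 1.0) / lam   # exact response to an input held for dt
    z = 0j
    for f in samples:
        z = decay * z + drive * f
    return z.real ** 2 + z.imag ** 2


# A bank of "parallel circuits" fed by the same test signal, cos(20 t).
dt, a, omega0 = 0.001, 1.0, 20.0
signal = [math.cos(omega0 * k * dt) for k in range(10000)]
bank = {w: emp_power_spectrum(signal, a, w, dt) for w in (10.0, 20.0, 40.0)}
print(bank)
```

The dictionary comprehension mirrors the idea of running similar circuits in parallel that differ only in the setting of the frequency; the entry at the signal frequency should dominate the others.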
Regression consists, essentially, of finding the best "fit" to a set of data using a least-squares criterion. While several authors have hinted at obtaining a "least squares" fit by analog techniques for special cases, none have shown a straightforward solution to the regression problem as defined above for continuous variables.

As an illustration of how least squares fitting would be performed on the analog computer, consider the linear case defined by Equation 20. For this general equation it can be shown (6) that the following two equations will define the unknown parameters m and b, the slope and intercept of the line, respectively.

$$\sum_{i=1}^{N} Y_i = m \sum_{i=1}^{N} X_i + N b \tag{22}$$

$$\sum_{i=1}^{N} X_i Y_i = m \sum_{i=1}^{N} X_i^2 + b \sum_{i=1}^{N} X_i \tag{23}$$

Since X and Y will be continuous functions of time, the discrete summation from 1 to N can be replaced by a time integral where the total time, t, is proportional to N. Therefore, Equations 22 and 23 become

$$\int_0^t Y\,dt = m \int_0^t X\,dt + b\,t \tag{24}$$

$$\int_0^t XY\,dt = m \int_0^t X^2\,dt + b \int_0^t X\,dt \tag{25}$$

These two equations can be solved simultaneously to yield m and b as follows:

$$b = \frac{\int_0^t Y\,dt - m \int_0^t X\,dt}{t} \tag{26}$$

$$m = \frac{\int_0^t XY\,dt - b \int_0^t X\,dt}{\int_0^t X^2\,dt} \tag{27}$$

Figure 11 shows the analog computer circuit for calculating the "least squares" parameters m and b using the definite integral. One should note that in calculating m and b from the definite integral we form an "algebraic" loop. This brings up the question of circuit stability. It can be shown* that once the computation is under way, (t > 0), the loop will be stable unless X is constant. However, if X is constant it cannot be used as an independent variable in a correlation study. Overloads at the very beginning of the computation (a region of no interest) can be taken care of with feedback limiters on the division amplifiers, or by using a "steepest descent" division circuit (7).

*See Reference (8)

Figure 11. Circuit for Obtaining the "Least Squares" Regression Parameters m and b

It should be observed that m and b are defined at every instant of time. For small values of t (corresponding to small sample size), the estimates of m and b will be relatively insignificant and, hence, will be changing rapidly. As the time interval increases, however, the values of m and b become more significant and actually should reach "steady state" or non-changing values.

The technique for linear regression can be extended to quadratic or higher order regressions, it then being necessary to define a new set of equations--such as Equations 22 and 23--for determining the unknown parameters of the regression system. Once the equations are defined, they can be converted to continuous integrals and instrumented by standard analog techniques.

Again we are dealing with continuous signals, which means that there must be a limit of the integration interval if circuits such as that of Figure 11 are to be used. Just as before, the need for resetting of integrators can be eliminated by converting the equations for m and b to EMP equations and, thereby, obtaining truly continuous estimates of the regression parameters.
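As a rough digital counterpart to the definite-integral approach of Equations 24 and 25, the sketch below accumulates the four running integrals with rectangular sums and solves the two equations in closed form at every step, rather than through an algebraic loop. The function name, the tolerance, and the test data are illustrative assumptions.

```python
import math


def running_regression(xs, ys, dt):
    """Slope m(t) and intercept b(t) from the running integrals of Equations 24 and 25."""
    t = sx = sy = sxy = sxx = 0.0
    estimates = []
    for x, y in zip(xs, ys):
        t += dt
        sx += x * dt        # integral of X
        sy += y * dt        # integral of Y
        sxy += x * y * dt   # integral of XY
        sxx += x * x * dt   # integral of X squared
        det = t * sxx - sx * sx
        if det <= 1e-9 * t * sxx:
            # X has been (nearly) constant so far: m and b are not yet defined.
            estimates.append((math.nan, math.nan))
            continue
        m = (t * sxy - sx * sy) / det   # eliminate b between Equations 24 and 25
        b = (sy - m * sx) / t           # Equation 26
        estimates.append((m, b))
    return estimates


# Exactly linear test data, Y = 3X + 2, with X increasing steadily.
xs = [0.01 * k for k in range(200)]
ys = [3.0 * x + 2.0 for x in xs]
estimates = running_regression(xs, ys, dt=0.01)
m, b = estimates[-1]
print(m, b)
```

The undefined estimate at the first sample reflects the same difficulty noted for the circuit at the very beginning of the computation, and the guard on `det` corresponds to the case in which X is constant.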

Equations 22 and 23 are rewritten by dividing through by N. This yields

$$\frac{\sum_{i=1}^{N} Y_i}{N} = m\,\frac{\sum_{i=1}^{N} X_i}{N} + b \tag{28}$$

$$\frac{\sum_{i=1}^{N} X_i Y_i}{N} = m\,\frac{\sum_{i=1}^{N} X_i^2}{N} + b\,\frac{\sum_{i=1}^{N} X_i}{N} \tag{29}$$

Recalling the correspondence between EMP variables and discrete summations, one can transform Equations 28 and 29 immediately into continuous EMP notation which gives

$$\bar{Y} = m \bar{X} + b \tag{30}$$

$$\overline{XY} = m\,\overline{X^2} + b \bar{X} \tag{31}$$

where m and b are now the EMP estimates of the regression parameters. The analog circuit required is shown in Figure 12. The statements made with regard to the stability of the circuit shown in Figure 11 apply also to the algebraic loop found in the Figure 12 circuit.

Figure 12. Unscaled Analog Circuit for Calculation of Continuous EMP Values of Regression Parameters m and b

It should be observed that time has, in effect, been taken out of the problem by the conversion to EMP variables. There is no longer any need to "reset" the integrators since they are now serving as convolution circuits rather than pure accumulators. It follows that circuits similar to those shown can be instrumented for continuous higher order and continuous multi-variable regressions. All that is required is more analog computing equipment.

CONCLUSIONS

The conversion of statistical parameters to EMP variables enables data analysis to be performed continuously through the use of relatively simple analog circuits. Limits on the size of the integration interval normally encountered with continuous signals have been eliminated; the need to "reset" the integrators is no longer required since they are now serving as convolution circuits rather than pure accumulators. A continuous estimate of statistical parameters can be calculated readily.

The concept of replacing discrete summations with the EMP mean can be a valuable one. In addition to obvious uses for instrumentation and control, for both on-line and off-line systems, this technique also can be used in analog simulation studies. For example, it is sometimes desirable to calculate the rms value of a computed variable. This is accomplished quite simply by 1) squaring the instantaneous value of the variable, 2) taking the mean of the square of the variable with an EMP circuit, and 3) taking the square root of the mean. Other uses arise in simulation work where Gaussian noise is used to disturb a particular parameter.
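The EMP regression of Equations 30 and 31 and the three-step rms recipe can both be sketched with the same first-order filter update. The following Python is a minimal illustration under stated assumptions: uniformly sampled inputs, every filter started from the first sample (a stand-in for the integrator initial condition), Equations 30 and 31 solved in closed form instead of with an algebraic loop, and hypothetical function names.

```python
import math


def emp_regression(xs, ys, a, dt):
    """Final EMP estimates (m, b) from Equations 30 and 31, or None if X is constant."""
    gain = 1.0 - math.exp(-a * dt)
    pairs = iter(zip(xs, ys))
    x0, y0 = next(pairs)
    # Assumption: every EMP filter starts from the first sample.
    x_bar, y_bar, xy_bar, xx_bar = x0, y0, x0 * y0, x0 * x0
    for x, y in pairs:
        x_bar += gain * (x - x_bar)
        y_bar += gain * (y - y_bar)
        xy_bar += gain * (x * y - xy_bar)
        xx_bar += gain * (x * x - xx_bar)
    spread = xx_bar - x_bar * x_bar   # EMP variance of X
    if spread <= 1e-12 * xx_bar:
        return None                   # X constant: no regression possible
    m = (xy_bar - x_bar * y_bar) / spread
    b = y_bar - m * x_bar             # Equation 30 rearranged
    return m, b


def emp_rms(samples, a, dt):
    """1) square the variable, 2) take the EMP mean of the square, 3) take the root."""
    gain = 1.0 - math.exp(-a * dt)
    values = iter(samples)
    mean_square = next(values) ** 2
    for v in values:
        mean_square += gain * (v * v - mean_square)
    return math.sqrt(mean_square)


xs = [math.sin(0.01 * k) for k in range(2000)]
ys = [3.0 * x + 2.0 for x in xs]
m, b = emp_regression(xs, ys, a=1.0, dt=0.01)
print(m, b, emp_rms([-4.0] * 50, a=1.0, dt=0.1))
```

Because all four filters use the same gain and the same starting point, the weights applied to past samples are identical in each average, which is what allows the two EMP equations to be solved together.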
APPENDIX I: REFERENCES

(1) Davenport, W.B., Jr., and W.L. Root: "An Introduction to the Theory of Random Signals
and Noise", McGraw-Hill Book Company, Inc., New York, 1958.

(2) Otterman, Joseph: "The Properties and Methods for Computation of Exponentially
Mapped Past Statistical Variables", IRE TRANS. on Automatic Control, Volume AC-5
Number 1, January 1960, pp. 11-17.

(3) Volk, William: "Applied Statistics for Engineers", McGraw-Hill Book Company, Inc.,
New York, 1958, p. 136.

(4) Slatosky, W.J.: "End Point Temperature Control in LD Steel Making", Journal of
Metals, Volume 12, March 1960, pp. 226-230.

(5) Brenner, M.M., and J.D. Kennedy: "Dead Time Simulation for Electronic Analog Com-
puters", National Simulation Council, December 11, 1957.

(6) Widder, D.V.: "Advanced Calculus", Prentice-Hall, 1947, p. 108.

(7) Favreau, R.R.: "Dividing Circuit Obtained by Applying Method of Steepest Ascent",
Princeton Computation Center Report 132, Electronic Associates, Inc., Princeton,
New Jersey.

(8) Hannauer, George: "Algebraic Loops - Some Stability Considerations", Education and
Training Memo #22, Electronic Associates, Inc., Princeton, New Jersey.
EAI ELECTRONIC ASSOCIATES, INC. Long Branch, New Jersey
ADVANCED SYSTEMS ANALYSIS AND COMPUTATION SERVICES/ANALOG COMPUTERS/HYBRID ANALOG-DIGITAL COMPUTATION EQUIPMENT/SIMULATION SYSTEMS/
SCIENTIFIC AND LABORATORY INSTRUMENTS/INDUSTRIAL PROCESS CONTROL SYSTEMS/PHOTOGRAMMETRIC EQUIPMENT/RANGE INSTRUMENTATION SYSTEMS/TEST
AND CHECK-OUT SYSTEMS/MILITARY AND INDUSTRIAL RESEARCH AND DEVELOPMENT SERVICES/FIELD ENGINEERING AND EQUIPMENT MAINTENANCE SERVICES.
