What NARX Networks Can Compute
1 Introduction
1.1 Background
Much of the work on the computational capabilities of recurrent neural networks
has focused on synthesis: how neuron-like elements are capable of constructing fi-
nite state machines (FSMs) [1, 11, 15, 16, 23]. All of these results assume that the
nonlinearity used in the network is a hard-limiting threshold function. However,
when recurrent networks are used adaptively, continuous-valued, differentiable
nonlinearities are almost always used. Thus, an interesting question is how the
computational complexity changes for these types of functions. For example, [18]
has shown that finite state machines can be stably mapped into second order
recurrent neural networks with sigmoid activation functions. More recently, re-
current networks have been shown to be at least as powerful as Turing machines,
and in some cases can have super-Turing capabilities [12, 21, 22].
1.2 Summary of Results
This work extends the ideas discussed above to an important class of discrete-time
nonlinear systems called the Nonlinear AutoRegressive with exogenous inputs
(NARX) model [14]:

y(t) = f(u(t - n_u), \ldots, u(t - 1), u(t), y(t - n_y), \ldots, y(t - 1)),   (1)
where u(t) and y(t) represent the input and output of the network at time t, n_u and
n_y are the input and output orders, and f is a nonlinear function.
It has been demonstrated that this particular model is well suited for modeling
nonlinear systems [3, 5, 17, 19, 24]. When the function f can be approximated
by a Multilayer Perceptron (MLP), the resulting system is called a NARX network [3,
17]. Other work [10] has shown that for the problems of grammatical inference
and nonlinear system identification, gradient descent learning is more effective
in NARX networks than in recurrent neural network architectures that have
"hidden states." For these studies, the NARX neural net usually converges much
faster and generalizes better than the other types of recurrent networks.
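For concreteness, the following minimal sketch (Python/NumPy) iterates the tapped-delay recurrence of equation (1). The particular choice of f, the orders n_u = n_y = 2, and all names are illustrative assumptions rather than part of the model.

```python
import numpy as np
from collections import deque

n_u, n_y = 2, 2          # input and output orders (illustrative values)

def f(u_taps, y_taps):
    """Placeholder nonlinear map f(u(t-n_u),...,u(t), y(t-n_y),...,y(t-1)).
    Any bounded nonlinear function of the taps could stand here."""
    return float(np.tanh(0.5 * sum(u_taps) - 0.3 * sum(y_taps)))

def run_narx(u_seq):
    """Iterate eq. (1): the only feedback is the delayed output itself."""
    u_taps = deque([0.0] * (n_u + 1), maxlen=n_u + 1)  # u(t-n_u), ..., u(t)
    y_taps = deque([0.0] * n_y, maxlen=n_y)            # y(t-n_y), ..., y(t-1)
    outputs = []
    for u in u_seq:
        u_taps.append(u)                 # shift in the current input u(t)
        y = f(list(u_taps), list(y_taps))
        y_taps.append(y)                 # y(t) becomes y(t-1) at the next step
        outputs.append(y)
    return outputs

print(run_narx([1.0, 0.0, 1.0, 1.0, 0.0]))
```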
This work proves that NARX networks are computationally at least as strong
as fully connected networks within a linear slowdown. This implies that NARX
networks with a finite number of nodes and taps are at least as powerful as Turing
machines, and thus are universal computation devices, a somewhat unexpected
result given the limited nature of feedback in these networks.
These results should be contrasted with the mapping theorems of [6], which
imply that NARX networks are capable of representing arbitrary systems expressible
in the form of equation (1), but which give no bound on the number of nodes
required to achieve a good approximation. Furthermore, how such systems relate
to conventional models of computation is not clear.
Finally we provide some related results concerning NARX networks with
hard-limiting nonlinearities. Even though these networks are only capable of
implementing a subclass of finite state machines called finite memory machines
(FMMs) in real time, if given more time (a sublinear slowdown) they can simulate
arbitrary FSMs.
For our purposes we need consider only fully connected and NARX recurrent
neural networks. These will be single-input, single-output systems, though the
results easily extend to the multi-variable case.
We shall adopt the notation that x corresponds to a state variable, u to an
input variable, y to an output variable, and z to a node activation value. In
each of these networks we shall let N correspond to the dimension of the state
space. When necessary to distinguish between variables of the two networks,
those associated with the NARX network will be marked with a tilde.
The state variables of a recurrent network are defined to be the memory elements,
i.e. the set of time delay operators. In a fully connected network there is a one-to-
one correspondence between node activations and state variables of the network,
since each node value is stored at every time step. Specifically, the values of the
N state variables at the next time step are given by x_i(t + 1) = z_i(t). Each
node weights and sums the external inputs to the network and the states of the
network. Specifically, the activation function for each node is defined by
z_i(t) = \sigma\Big( \sum_{j=1}^{N} a_{i,j} x_j(t) + b_i u(t) + c_i \Big),   (2)

where a_{i,j}, b_i, and c_i are fixed real-valued weights, and σ is a nonlinear function
which will be discussed below. The output is assigned arbitrarily to be the value
of the first node in the network, y(t) = z_1(t).
The network is said to be fully connected because there is a weight between
every pair of nodes. However, when weight a_{i,j} = 0, there is effectively no con-
nection between nodes i and j. Thus, a fully connected network is very general,
and can be used to represent many different kinds of architectures, including
those in which only a subset of the possible connections between nodes are used.
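As an illustration of the update rule (2), the following sketch performs one step of such a fully connected network. The weight values, the choice of tanh as the nonlinearity, and N = 3 are assumptions made only for the example.

```python
import numpy as np

N = 3                               # number of nodes / state variables
rng = np.random.default_rng(0)
A = rng.normal(size=(N, N))         # a_{i,j}: a weight between every pair of nodes
b = rng.normal(size=N)              # b_i: input weights
c = rng.normal(size=N)              # c_i: biases
sigma = np.tanh                     # stand-in for the nonlinearity

def step(x, u):
    """x_i(t+1) = z_i(t) = sigma(sum_j a_{i,j} x_j(t) + b_i u(t) + c_i)."""
    return sigma(A @ x + b * u + c)

x = np.zeros(N)
for u in [1.0, 0.0, 1.0]:
    x = step(x, u)
    y = x[0]                        # the output is the value of the first node
print(y)
```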
The MLP consists of a set of nodes organized into two layers. There are H nodes
in the first layer which perform the function

\tilde z_i(t) = \sigma\Big( \sum_{j=0}^{n_u} \tilde b_{i,j}\, u(t-j) + \sum_{j=1}^{n_y} \tilde a_{i,j}\, \tilde y(t-j) + \tilde c_i \Big),   i = 1, \ldots, H.

The output layer consists of a single linear node, \tilde y(t) = \sum_{j=1}^{H} w_j \tilde z_j(t) + \theta.
The nonlinearity σ is taken to saturate to zero on the left; for example,

\sigma(x) = \begin{cases} 0, & x \le c, \\ \dfrac{1}{1 + e^{-x}}, & x > c, \end{cases}

where c ∈ R.
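A minimal sketch of this two-layer structure follows. The number of hidden nodes H, the orders, the weights, and the threshold c are illustrative assumptions, and σ follows the one-side saturated form given above.

```python
import numpy as np

H, n_u, n_y, c = 4, 2, 2, -3.0               # sizes, orders, threshold (assumed)
rng = np.random.default_rng(1)
W_in = rng.normal(size=(H, n_u + 1 + n_y))   # first-layer weights
bias = rng.normal(size=H)                    # first-layer biases
w_out = rng.normal(size=H)                   # linear output weights
theta = 0.1                                  # output bias

def sigma(x):
    """One-side saturated nonlinearity: zero for x <= c, sigmoid otherwise."""
    return np.where(x <= c, 0.0, 1.0 / (1.0 + np.exp(-x)))

def narx_mlp(u_taps, y_taps):
    """f realised by an MLP: H sigma-nodes followed by a single linear node."""
    v = np.concatenate([u_taps, y_taps])     # u(t-n_u),...,u(t), y(t-n_y),...,y(t-1)
    z = sigma(W_in @ v + bias)               # first layer
    return w_out @ z + theta                 # linear output node

print(narx_mlp(np.array([0.2, 0.7, 1.0]), np.array([0.1, 0.4])))
```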
Here we present a sketch of the proof of the theorem. The interested reader
is referred to [20] for more details.
\tilde z_i(k + 1) = \sigma(\cdots),

where the constant β_i is chosen large enough to make the input to σ less than s when
\tilde z_{2N-i+1}(k) \ne \mu, so that the whole function is zero. A similar argument is used
to ensure that the final node implements the sequencing signal properly. Since
only one of the hidden-layer nodes is non-zero, the output node of the NARX network
is simply a linear combination, so that the output of the network is equal to the value
of the currently active hidden-layer node. The resulting network simulates the
fully connected network with a linear slowdown.
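The flavour of the construction can be conveyed by a small simulation: if the node values of the fully connected network are emitted one per sub-step through a single scalar output with taps of depth 2N - 1, the full trajectory can be recovered with an N-fold (linear) slowdown. The sketch below is not the construction of [20]; in particular, the phase bookkeeping that the proof delegates to the sequencing-signal node and the constants β_i is done explicitly in the loop, and all names and values are assumptions made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 4                                # nodes in the fully connected network
A = rng.normal(size=(N, N))          # a_{i,j}
b = rng.normal(size=N)               # b_i
c = rng.normal(size=N)               # c_i
sigma = np.tanh                      # stand-in nonlinearity

def fully_connected_step(x, u):
    """One step of eq. (2)."""
    return sigma(A @ x + b * u + c)

def multiplexed_simulation(u_seq, x0):
    """Recover the same trajectory from a single scalar feedback stream.
    Node values are emitted one per sub-step; while computing node i, the
    previous macro-step value of node j sits at tap (lag) i + N - j, so
    taps up to depth 2N - 1 suffice.  Phase bookkeeping is explicit here."""
    past = list(x0)                  # the scalar output stream so far
    trajectory = []
    for u in u_seq:
        for i in range(N):           # sub-step i produces the new node i
            x_old = np.array([past[-(i + N - j)] for j in range(N)])
            past.append(float(sigma(A[i] @ x_old + b[i] * u + c[i])))
        trajectory.append(np.array(past[-N:]))   # one macro step finished
    return np.array(trajectory)

u_seq = rng.normal(size=10)
x0 = rng.normal(size=N)

x, direct = x0.copy(), []
for u in u_seq:
    x = fully_connected_step(x, u)
    direct.append(x)

print(np.allclose(np.array(direct), multiplexed_simulation(u_seq, x0)))  # True
```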
It has been shown that fully connected networks with a fixed, finite number
of nodes with saturated-linear activation functions are universal computation devices [22].
As a result, it is possible to simulate a Turing machine with a NARX network
such that the slowdown is constant regardless of problem size. Thus, we conclude
that NARX networks with a finite number of nodes and taps are universal
computation devices.
Here we look at variants of NARX networks in which the output function
is not a linear combiner but rather a hard-limiting nonlinearity.
If the inputs are binary, then recurrent neural networks are only capable
of implementing Finite State Machines (FSMs), and in real time NARX net-
works are only capable of implementing a subset of FSMs called finite memory
machines (FMMs) [13].
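As a small illustration of what a hard-limiting NARX network can compute in real time, the following sketch realises one particular finite memory machine, the running parity of a binary input stream, in which the next output is a threshold function of only the current input and the previous output (memory of order one). The architecture and weights are chosen by hand for this example and are not taken from the constructions cited above.

```python
def step(x: float) -> int:
    """Hard-limiting threshold nonlinearity."""
    return 1 if x > 0 else 0

def parity_narx(u_seq):
    """Running parity y(t) = y(t-1) XOR u(t), built from threshold units.
    Two hidden hard-limiters (an AND and an OR of the taps) feed one
    hard-limiting output node; the only feedback is the delayed output."""
    y_prev, outputs = 0, []
    for u in u_seq:
        h_and = step(u + y_prev - 1.5)       # fires iff both taps are 1
        h_or = step(u + y_prev - 0.5)        # fires iff at least one tap is 1
        y = step(h_or - h_and - 0.5)         # OR and not AND  ==  XOR
        outputs.append(y)
        y_prev = y                           # output fed back through one tap
    return outputs

print(parity_narx([1, 0, 1, 1, 0, 1]))       # [1, 1, 0, 1, 1, 0]
```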
Intuitively, the reason for this limitation is that only a limited amount of
information can be represented by feeding back the outputs alone.
If more information could be inserted into the feedback loop, then it should be
possible to simulate arbitrary FSMs in structures like NARX networks. In fact,
we next show that this is the case. We will show that NARX networks with
hard-limiting nonlinearities are capable of simulating fully connected networks
with a slowdown proportional to the number of nodes. As a result, the NARX
network will be able to simulate arbitrary FSMs.
In [11] it was shown that any n-state FSM can be implemented by a four-layer
recurrent neural network with O(√n) hard-limiting nodes. It is trivial
to show that a fully connected recurrent neural network can simulate an L-layer
recurrent network with a slowdown of L. Based on the fact that a NARX
network with hard-limiting output nodes is only capable of implementing FMMs
in real time, we conclude that such networks can nevertheless simulate arbitrary
FSMs if a sublinear slowdown is allowed.
5 Conclusion
We proved that NARX neural networks are capable of simulating fully connected
networks within a linear slowdown, and as a result are universal dynamical
systems. This theorem is somewhat surprising since the nature of feedback in
this type of network is so limited, i.e. the feedback comes only from the output
neuron.
The Turing equivalence of NARX neural networks implies that they are ca-
pable of representing solutions to just about any computational problem. Thus,
in theory NARX networks can be used instead of fully recurrent neural nets
without losing any computational power.
But Turing equivalence implies that the space of possible solutions is ex-
tremely large. Searching such a large space with gradient descent learning al-
gorithms could be quite difficult. Our experience indicates that it is difficult to
learn even small finite state machines (FSMs) from example strings in either of
these types of networks unless particular caution is taken in the construction
of the machines [9, 4]. Often, a solution is found that classifies the training set
perfectly, but the network in fact learns some complicated dynamical system
which cannot necessarily be equated with any finite state machine.
NARX networks with hard-limiting nonlinearities can be shown to be capable
of only implementing a subclass of finite state machines called finite memory
machines. But, they can implement arbitrary finite state machines if a sublinear
slowdown is allowed.
These results open several questions. What is the simplest feedback or recur-
rence necessary for any network to be Turing universal? Do these results have
implications about the computational power of other types of architectures such
as recurrent networks with local feedback [2, 7, 8]?
Acknowledgements
We would like to thank Peter Tiňo and Hava Siegelmann for many helpful comments.
References
1. N. Alon, A.K. Dewdney, and T.J. Ott. Efficient simulation of finite automata by
neural nets. JACM, 38(2):495-514, 1991.
2. A.D. Back and A.C. Tsoi. FIR and IIR synapses, a new neural network architecture
for time series modeling. Neural Computation, 3(3):375-385, 1991.
3. S. Chen, S.A. Billings, and P.M. Grant. Non-linear system identification using
neural networks. Int. J. Control, 51(6):1191-1214, 1990.
4. D.S. Clouse, C.L. Giles, B.G. Horne, and G.W. Cottrell. Learning large de Bruijn
automata with feed-forward neural networks. Technical Report CS94-398, CSE
Dept., UCSD, La Jolla, CA, 1994.
5. J. Connor, L.E. Atlas, and D.R. Martin. Recurrent networks and NARMA mod-
eling. In NIPS 4, pages 301-308, 1992.
6. G. Cybenko. Approximation by superpositions of a sigmoidal function. Math. of
Control, Signals, and Sys., 2(4):303-314, 1989.
7. B. de Vries and J.C. Principe. The gamma model -- A new neural model for
temporal processing. Neural Networks, 5:565-576, 1992.
8. P. Frasconi, M. Gori, and G. Soda. Local feedback multilayered networks. Neural
Computation, 4:120-130, 1992.
9. C.L. Giles, B.G. Horne, and T. Lin. Learning a class of large finite state machines
with a recurrent neural network. Neural Networks, 1995. In press.
10. B.G. Horne and C.L. Giles. An experimental comparison of recurrent neural net-
works. In NIPS 7, 1995. To appear.
11. B.G. Horne and D.R. Hush. Bounds on the complexity of recurrent neural network
implementations of finite state machines. In NIPS 6, pages 359-366, 1994.
12. J. Kilian and H.T. Siegelmann. On the power of sigmoid neural networks. In Proc.
6th ACM Work. on Comp. Learning Theory, pages 137-143, 1993.
13. Z. Kohavi. Switching and finite automata theory. McGraw-Hill, New York, NY,
2nd edition, 1978.
14. I.J. Leontaritis and S.A. Billings. Input-output parametric models for non-linear
systems: Part I: deterministic non-linear systems. Int. J. Control, 41(2):303-328,
1985.
15. W.S. McCulloch and W.H. Pitts. A logical calculus of the ideas immanent in
nervous activity. Bull. Math. Biophysics, 5:115-133, 1943.
16. M.L. Minsky. Computation: Finite and infinite machines. Prentice-Hall, Engle-
wood Cliffs, 1967.
17. K.S. Narendra and K. Parthasarathy. Identification and control of dynamical sys-
tems using neural networks. IEEE Trans. on Neural Networks, 1:4-27, March
1990.
18. C.W. Omlin and C.L. Giles. Stable encoding of large finite-state automata in
recurrent neural networks with sigmoid discriminants. Neural Computation, 1996.
accepted for publication.
19. S.-Z. Qin, H.-T. Su, and T.J. McAvoy. Comparison of four neural net learning
methods for dynamic system identification. IEEE Trans. on Neural Networks,
3(1):122-130, 1992.
20. H.T. Siegelmann, B.G. Horne, and C.L. Giles. Computational capabilities of
NARX neural networks. Technical Report UMIACS-TR-95-12 and CS-TR-3408,
Institute for Advanced Computer Studies, University of Maryland, 1995.
21. H.T. Siegelmann and E.D. Sontag. Analog computation via neural networks. The-
oretical Computer Science, 131:331-360, 1994.
22. H.T. Siegelmann and E.D. Sontag. On the computational power of neural net-
works. J. Comp. and Sys. Science, 50(1):132-150, 1995.
23. H.T. Siegelmann, E.D. Sontag, and C.L. Giles. The complexity of language recog-
nition by neural networks. In Algorithms, Software, Architecture (Proc. of IFIP
12th World Computer Congress), pages 329-335. North-Holland, 1992.
24. H.-T. Su, T.J. McAvoy, and P. Werbos. Long-term predictions of chemical pro-
cesses using recurrent neural networks: A parallel training approach. Ind. Eng.
Chem. Res., 31:1338-1352, 1992.