
Distributed Algorithms For Stochastic Source Seeking With Mobile Robot Networks


Nikolay A. Atanasov¹
Department of Electrical and Systems Engineering,
University of Pennsylvania,
Philadelphia, PA 19104
e-mail: [email protected]

Jerome Le Ny
Department of Electrical Engineering and GERAD,
Ecole Polytechnique de Montreal,
Montreal, QC H3T-1J4, Canada
e-mail: [email protected]

George J. Pappas
Department of Electrical and Systems Engineering,
University of Pennsylvania,
Philadelphia, PA 19104
e-mail: [email protected]

Autonomous robot networks are an effective tool for monitoring large-scale environmental fields. This paper proposes distributed control strategies for localizing the source of a noisy signal, which could represent a physical quantity of interest such as magnetic force, heat, radio signal, or chemical concentration. We develop algorithms specific to two scenarios: one in which the sensors have a precise model of the signal formation process and one in which a signal model is not available. In the model-free scenario, a team of sensors is used to follow a stochastic gradient of the signal field. Our approach is distributed, robust to deformations in the group geometry, does not necessitate global localization, and is guaranteed to lead the sensors to a neighborhood of a local maximum of the field. In the model-based scenario, the sensors follow a stochastic gradient of the mutual information (MI) between their expected measurements and the expected source location in a distributed manner. The performance is demonstrated in simulation using a robot sensor network to localize the source of a wireless radio signal.
[DOI: 10.1115/1.4027892]

¹Corresponding author.
Contributed by the Dynamic Systems Division of ASME for publication in the JOURNAL OF DYNAMIC SYSTEMS, MEASUREMENT, AND CONTROL. Manuscript received January 31, 2014; final manuscript received June 10, 2014; published online October 21, 2014. Assoc. Editor: Dejan Milutinovic.
Journal of Dynamic Systems, Measurement, and Control, MARCH 2015, Vol. 137 / 031004. Copyright © 2015 by ASME.

1 Introduction

The ability to detect the source of a signal is a fundamental problem in nature. At a microscopic level, some bacteria are able to find chemical, light, and magnetic sources [1,2]. At a macroscopic level, similar behavior can be observed in predators who seek a food source using their sense of smell. Reproducing this behavior in mobile robots can be used to perform complex missions such as environmental monitoring [3,4], intelligence, surveillance, and reconnaissance [5], and search and rescue operations [6].

This paper discusses how to control a team of mobile robotic sensors to locate the source of a noisy signal, which represents a physical quantity such as magnetic force, heat, radio signal, or chemical concentration. We distinguish between two cases: model-free and model-based. The first scenario supposes that the sensors receive measurements without knowledge of the signal formation process. This is relevant when the signal is difficult to model or the environment is unknown a priori. Online modeling of the signal requires time and computational resources and might not be feasible, especially on small platforms and in time-critical missions. In contrast, the second scenario supposes the sensors have an accurate signal model which can be exploited to localize the source, potentially faster and with better accuracy.

1.1 Related Work. Our model-free source-seeking approach consists in climbing the gradient of the signal field by using a stochastic approximation (SA) technique to deal with the underlying noise. Our strategy is robust to deformations in the geometry of the sensor network and can be applied to sensors with limited computational resources and no global localization capabilities. Recent model-free source-seeking work which uses a sensor formation to ascend the gradient of the signal field includes Refs. [3] and [7–9]. Ogren et al. [3] use artificial potentials to decouple the formation stabilization from the gradient ascent. Centralized least squares are used to estimate the signal gradient. A distributed approach for exploring a scalar field using a cooperative Kalman filter is presented in Ref. [10]. The authors design control laws to achieve a formation, which minimizes the estimation error. Similarly, in Ref. [9], a circular formation is used to estimate the signal gradient in a distributed manner based on a Newton–Raphson consensus method. A drawback of these works is the assumption that the sensor formation is maintained perfectly throughout the execution of the algorithm, which is hardly possible in a real environment. In this paper, imperfect formations are explicitly handled by recomputing the correct weights necessary to combine the sensor observations at every measurement location. Choi et al. [11,12] present a general distributed learning and control approach for sensor networks and apply it to source seeking. The sensed signal is modeled by a network of radial basis functions (RBFs) and recursive least squares are used to obtain the model parameters. Instead of a sensor network, a single vehicle may travel to several sensing locations in order to collect the same measurements [13–17]. While costly maneuvers are required to climb the gradient effectively, in our previous work [18], we discussed the parameter choices which enable good performance.

In the model-based scenario, we choose the next configuration for the sensing team by maximizing the MI between the source location estimate and the expected measurements. Even if all pose and measurement information are available at a central location, evaluating the MI utility function is computationally demanding. Charrow et al. [19] focus on approximating MI when the sensed signal is Gaussian and the sensors use a particle filter to estimate the source location. Hoffman and Tomlin [20] compute the expectation over the measurements only for pairs of sensors, thus decreasing the dimension of the required integration. Instead of MI, in this work, we approximate the MI gradient. A related work which uses the MI gradient is Ref. [21], in which the computational complexity is reduced by integrating over binary sensor measurements and only for sensors whose fields of view overlap. A fully distributed approach based on belief consensus is proposed in Ref. [22]. This paper is also related to consensus control, which seeks agreement in the states of multi-agent dynamical systems. Recent results [23,24] address switching topologies, nontrivial delays, and asynchronous estimation but with the main difference that the sensors agree on their own states, while in this work they need to agree on the exogenous state of the source.

1.2 Contributions. We develop a distributed approach for stochastic source seeking using a mobile sensor network, which does not rely on a model of the signal field or global localization.
Our method uses a finite-difference (FD) scheme to estimate the signal gradient correctly, even when the sensor formation is not maintained well. In the model-based case, we show that a SA to the MI gradient using only a few predicted signal measurements is enough to provide good control performance. This is in contrast with existing work which insists on improving the quality of the gradient estimate as much as possible.

1.3 Paper Organization. In Sec. 2, we describe the considered source-seeking scenarios precisely. Our model-free and model-based approaches are discussed in detail in Secs. 3 and 4, respectively, assuming all-to-all communication among the sensors. Distributed versions are presented and analyzed in Sec. 5. Finally, in Sec. 6, we present an application to wireless radio source localization and compare the performance of the two methods.

2 Problem Formulation

Consider a team of $n$ sensing robots with states $\{x_{1,t}, \dots, x_{n,t}\} \subset \mathcal{X} \cong \mathbb{R}^{d_x}$ at time $t$. The states typically comprise pose and velocity information but might include other operational parameters too. At a high-level planning stage, we suppose that the vehicles have discrete single-integrator dynamics $x_{i,t+1} = x_{i,t} + u_{i,t}$, where $u_{i,t} \in \mathcal{U}$ is the control input to sensor $i$. The task is to localize a static signal source, whose unknown state is $y \in \mathcal{Y} \cong \mathbb{R}^{d_y}$. The state captures the source position and other observable properties of interest. At time $t$, each sensor $i$ has access to a noisy measurement $z_{i,t} \in \mathcal{Z} \cong \mathbb{R}^{d_z}$ of the signal generated by $y$

$$ z_{i,t} = h(x_{i,t}, y) + v_{i,t} \tag{1} $$

where $v_{i,t}$ is the measurement noise, whose values are independent at any pair of times and among sensors. The noise depends on the states of the sensor and the source, i.e., $v_{i,t}(x_{i,t}, y)$, but to simplify notation we do not make it explicit. We assume that the noise is zero mean and has a finite second moment, i.e., $\mathbb{E} v_{i,t} = 0, \; \forall i, t, x_{i,t}$ and $\mathrm{tr}(\mathbb{E}[v_{i,t} v_{i,t}^T]) < \infty$. In the remainder, we use the notation $x_t := (x_{1,t}^T, \dots, x_{n,t}^T)^T$, $u_t := (u_{1,t}^T, \dots, u_{n,t}^T)^T$, $z_t := (z_{1,t}^T, \dots, z_{n,t}^T)^T$, and $v_t := (v_{1,t}^T, \dots, v_{n,t}^T)^T$.

In the model-free scenario, the sensors simply receive measurements without knowing the signal model $h(\cdot, \cdot)$. We suppose that the team adopts some arbitrary formation, with center of mass $m_t := \sum_{i=1}^{n} x_{i,t}/n$ at time $t$, which can be enforced using potential fields [3] or convex optimization [25]. The sensors use the centroid $m_t$ as the estimate of the source state $y$ at time $t$ and try to lead it toward the true source location based on the received measurements. Let $f : \mathcal{X} \to \mathcal{Y}$ be a known transformation, which maps the team centroid to a source estimate. For example, if the robot state space captures both position and orientation, e.g., $\mathcal{X} = SE(2)$, but we are interested only in position estimates for the source, e.g., $\mathcal{Y} = \mathbb{R}^2$, then $f$ will be the projection which extracts the position components from the centroid $m_t \in \mathcal{X}$. We consider the following problem.

Problem 2.1 (Model-Free Source Seeking). Assume that the measurement signal in Eq. (1) is scalar² and its expectation is maximized at the true state $y$ of the source

$$ f^{-1}(y) \in \arg\max_{x \in \mathcal{X}} h(x, y) \tag{2} $$

Generate a sequence of control inputs $u_0, u_1, \dots$ for the team of sensors in order to drive its centroid $m_t$ toward a maximum of the signal field $h(\cdot, y)$.

²The assumption is made only to simplify the presentation of the gradient ascent approach in the model-free case. The approach generalizes to signals of higher dimension.

In the model-based case, the sensors have accurate knowledge of $h(\cdot, \cdot)$ which can be exploited to maximize the information that future measurements provide about the signal source. We choose MI as a measure of informativeness and formulate the following problem.

Problem 2.2 (Model-Based Source Seeking). Given the sensor poses $x_{t-1} \in \mathcal{X}^n$ and a prior distribution of the source state $y$ at time $t-1$, choose the control input $u_t \in \mathcal{U}^n$, which optimizes the following:

$$ \begin{aligned} \max_{u_{1,t}, \dots, u_{n,t}} \quad & I(y; z_t \mid x_t) \\ \text{s.t.} \quad & x_{i,t} = x_{i,t-1} + u_{i,t}, \quad i = 1, \dots, n \\ & z_{i,t} = h(x_{i,t}, y) + v_{i,t}, \quad i = 1, \dots, n \end{aligned} \tag{3} $$

We resort to SA methods in both scenarios and emphasize their usefulness in simplifying the algorithms, while providing theoretical guarantees about the performance.

3 Model-Free Source Seeking

3.1 Model-Free Algorithm. Our model-free approach is to design an iterative optimization scheme, which causes the centroid $m_t$ of the robot formation to ascend the gradient $g(x, y) := \nabla_x h(x, y)$ of the measurement signal. The gradient ascent leads $m_t$ to an (often local) maximum of the signal field, which is appropriate in view of Assumption (2). In detail, the desired dynamics for the centroid are

$$ m_{t+1} = m_t + \gamma_t \, g(m_t, y) \tag{4} $$

A complication arises because the sensors do not have access to $g(\cdot, y)$ and can only measure a noisy version of $h(\cdot, y)$ at their current positions. Supposing noise-free measurements for now, the sensors can approximate the signal gradient at the formation centroid via a FD scheme

$$ g(m_t, y) = \nabla_x h(m_t, y) = W(x_t) \begin{pmatrix} h(x_{1,t}, y) \\ \vdots \\ h(x_{n,t}, y) \end{pmatrix} - b_t \tag{5} $$

where $W(x_t) \in \mathbb{R}^{d_x \times n}$ is a matrix of FD weights, which depends on the sensor states $x_t$, and $b_t \in \mathbb{R}^{d_x}$ captures the error in the approximation. The most natural way to obtain the FD weights is to require that the approximation is exact for a set of test functions $\psi_i$, $i = 1, \dots, n$, commonly polynomials, which can represent the shape of $g(\cdot, y)$. In particular, the following relation needs to hold:

$$ \begin{bmatrix} \psi_1(x_{1,t}) & \cdots & \psi_1(x_{n,t}) \\ \vdots & & \vdots \\ \psi_n(x_{1,t}) & \cdots & \psi_n(x_{n,t}) \end{bmatrix} W(x_t)^T = \begin{bmatrix} \frac{\partial}{\partial x} \psi_1(m_t) \\ \vdots \\ \frac{\partial}{\partial x} \psi_n(m_t) \end{bmatrix} \tag{6} $$

where $(\partial/\partial x)\psi_i(x)$ is a row vector of partial derivatives. When $x_{i,t} \in \mathbb{R}$, the most common set of test functions are the monomials $\psi_i(x) = x^{i-1}$, in which case (6) becomes a Vandermonde system. The standard (monomial) FD approach is problematic when the states $x_{i,t}$ are high-dimensional and not in a lattice configuration because the system in Eq. (6) becomes ill-conditioned. These difficulties are alleviated by using RBFs $\psi_i(x) := \phi(\| x - x_{i,t} \|)$ as test functions. In particular, using Gaussian RBFs, $\phi(d) := e^{-(\delta d)^2}$, with shape parameter $\delta > 0$, guarantees that Eq. (6) is nonsingular [26]. Then, the FD weights obtained from Eq. (6) as a function of $x_t$ are
$$ W(x_t) = R(x_t)^T \Phi(x_t)^{-T} \tag{7} $$

where, for $x \in \mathcal{X}^n$, we let $\Phi_{ij}(x) := e^{-\delta^2 \| x_j - x_i \|_2^2}$ and

$$ R(x) := \begin{bmatrix} 2\delta^2 e^{-\delta^2 \left\| x_1 - \sum_{i=1}^{n} x_i/n \right\|_2^2} \left( x_1 - \sum_{i=1}^{n} x_i/n \right)^T \\ \vdots \\ 2\delta^2 e^{-\delta^2 \left\| x_n - \sum_{i=1}^{n} x_i/n \right\|_2^2} \left( x_n - \sum_{i=1}^{n} x_i/n \right)^T \end{bmatrix} \tag{8} $$

Since the measurements are noisy, sensor $i$ can observe only $z_{i,t}$ rather than $h(x_{i,t}, y)$. As a result, the gradient ascent (4) can be implemented only approximately via $g(m_t, y) \approx W(x_t) z_t$ instead of Eq. (5) and with the additional complication that the measurement noise makes the iterates $m_t$ random. Our stochastic model-free source-seeking algorithm is

$$ m_{t+1} = m_t + \gamma_t W(x_t) z_t \tag{9} $$

The convergence of similar source-seeking schemes is often studied in a deterministic framework [3] by assuming that the noise can be neglected, which is difficult to justify. In Sec. 3.2, we show that the center of mass $m_t$, following the dynamics (9) with appropriately chosen step-sizes $\gamma_t$, converges to a neighborhood of a local maximum of $h(\cdot, y)$. Assuming all-to-all communication or a centralized location, which receives all state and measurement information from the sensors, the stochastic gradient ascent (9) can be implemented as is. It requires that the sensors are localized relative to one another, i.e., in the inertial frame of one sensor, but not globally, in the world frame. Notably, it is also not important to maintain a rigid sensor formation as the correct FD weights necessary to combine the observations are recomputed at every measurement location. Section 3.2 shows that the only requirement is that the sensor team is not contained in a subspace of $\mathbb{R}^{d_x}$ when measuring (e.g., at least three noncollinear sensors are needed for $d_x = 2$).

3.2 Convergence Analysis. To carry out the convergence analysis of the stochastic gradient ascent in Eq. (9), we resort to the theory of SAs [27,28]. It is sufficient to consider the following SA algorithm:

$$ m_{t+1} = m_t + \gamma_t \left( g(m_t) + b_t + D_t \right) \tag{10} $$

where $b_t$ is a bias term, $D_t$ is a random zero-mean perturbation, $\gamma_t$ is a small step-size, and $m_t$ is a random sequence whose asymptotic behavior is of interest. The main result is that the iterates $m_t$ in Eq. (10) asymptotically follow the integral curves of the ordinary differential equation (ODE) $\dot{m} = g(m)$. Since in our case with a fixed source state $y$, $g(m) := \nabla_x h(m, y)$, the ODE method (Ref. [28] Chap. 2 and Ref. [29]) shows that the iterates $\{m_t\}$ almost surely (a.s.) converge to the set $\{x \mid \nabla_x h(x, y) = 0\}$ of critical points of $h(\cdot, y)$ under the following Assumptions:³

(A1) The map $g$ is Lipschitz continuous;⁴
(A2) Step-sizes $\{\gamma_t\}$ are positive scalars satisfying $\sum_{t=0}^{\infty} \gamma_t = \infty$ and $\sum_{t=0}^{\infty} \gamma_t^2 < \infty$;
(A3) $\{D_t\}$ is a martingale difference sequence with respect to the family of $\sigma$-algebras $\mathcal{F}_t := \sigma(m_0, D_s, 0 \le s \le t)$, i.e., $D_t$ is measurable with respect to $\mathcal{F}_t$, $\mathbb{E}[\| D_t \|] < \infty$, and $\mathbb{E}[D_{t+1} \mid \mathcal{F}_t] = 0$ a.s. for all $t \ge 0$. Also, $D_t$ is square integrable with $\mathbb{E}[\| D_{t+1} \|^2 \mid \mathcal{F}_t] \le K(1 + \| m_t \|^2)$ a.s. for $t \ge 0$ and some constant $K > 0$;
(A4) $\{m_t\}$ is bounded, i.e., $\sup_t \| m_t \| < \infty$ a.s.;
(A5) $\{b_t\}$ is bounded and $b_t \to 0$ a.s. as $t \to \infty$.

³While Assumptions (A1)–(A5) are sufficient to prove the convergence in our application, they are by no means the weakest possible. If necessary some can be relaxed using the results in stochastic approximation [27,28].
⁴Given two metric spaces $(\mathcal{X}, d_x)$ and $(\mathcal{G}, d_g)$, a function $g : \mathcal{X} \to \mathcal{G}$ is Lipschitz continuous if there exists a real constant $0 \le L < \infty$ such that: $d_g(g(x_1), g(x_2)) \le L \, d_x(x_1, x_2), \; \forall x_1, x_2 \in \mathcal{X}$.

The proposed source-seeking algorithm (9) can be converted to the SA form (10) as follows:

$$ \begin{aligned} m_{t+1} &= m_t + \gamma_t W(x_t) z_t = m_t + \gamma_t W(x_t) \begin{pmatrix} h(x_{1,t}, y) + v_{1,t} \\ \vdots \\ h(x_{n,t}, y) + v_{n,t} \end{pmatrix} \\ &= m_t + \gamma_t \left( g(m_t, y) + b_t + W(x_t) v_t \right) \end{aligned} $$

where the second equality follows from Eq. (5). Assumption (A1) ensures that $\dot{m} = g(m, y)$ has a unique solution for any initial condition and any fixed source state $y$. Assumption (A2) can be satisfied by an appropriate choice of the step-size, e.g., $\gamma_t = 1/(t+1)$. The selection of proper step-sizes is an important practical issue that is not emphasized in this paper but is discussed at length in Refs. [18], [27], and [30]. We can satisfy (A4) by requiring that the environment $\mathcal{X}$ of the sensors is bounded and if necessary use a projected version of the gradient ascent [28, Chap. 5.4]. This also ensures that the FD weights are bounded and in turn (A3) is satisfied with $D_t = W(x_t) v_t$:

$$ \begin{aligned} \left( \mathbb{E} \| D_t \|_2 \right)^2 &\le \mathbb{E} \| D_t \|_2^2 = \mathbb{E}\left[ \| D_t \|_2^2 \mid \mathcal{F}_{t-1} \right] = \mathbb{E}\left[ \| W(x_t) v_t \|_2^2 \right] \\ &\le \| W(x_t) \|_2^2 \, \mathbb{E} \| v_t \|_2^2 = \| W(x_t) \|_2^2 \sum_{i=1}^{n} \mathrm{tr}\left( \mathbb{E}[v_{i,t} v_{i,t}^T] \right) < \infty \\ \mathbb{E}[D_t \mid \mathcal{F}_{t-1}] &= \mathbb{E}[W(x_t) v_t] = W(x_t) \mathbb{E} v_t = 0 \end{aligned} $$

since the measurement noise in Eq. (1) is uncorrelated in time and has zero mean and a finite second moment. Note that the error term in Eq. (5) violates (A5) because it does not converge to 0. However, if we ensure that the sensor formation is not contained in a subspace of $\mathbb{R}^{d_x}$, then $b_t$ remains bounded by some $\epsilon_0 > 0$, i.e., $\sup_t \| b_t \| \le \epsilon_0$. Then, the argument in Ref. [28, Chap. 5, Theorem 6] shows that the iterates $m_t$ converge a.s. to a small neighborhood of a local maximum, whose size depends on $\epsilon_0$. The result is summarized below:

THEOREM 1. Suppose that the gradient $g(x, y) = \nabla_x h(x, y)$ of the measurement signal is Lipschitz continuous⁴ in $x$, the step-sizes $\gamma_t$ in Eq. (9) satisfy (A2), the sensor state space $\mathcal{X}$ is bounded, and the sensor formation is not contained in a subspace of $\mathbb{R}^{d_x}$ at the measurement locations. Then, algorithm (9) converges to a small neighborhood around a local maximum of the signal field $h(\cdot, y)$.

4 Model-Based Source Seeking

4.1 Model-Based Algorithm. In this section, we address Problem 2.2 assuming all-to-all communication. The sensors can follow the gradient of the cost function in Eq. (3) to reach a local maximum:

$$ x_{t+1} = x_t + \gamma_t \nabla_x I(y; z_t \mid x) \big|_{x = x_t} \tag{11} $$

where $\gamma_t$ is the step-size at time $t$. Let $p(z \mid y, x)$ denote the probability density function (pdf) of the measurement signal in Eq. (1).
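
Both the model-free update (9) and the model-based update (11) are stochastic ascents of the SA form (10). As a concrete instance, the model-free update of Sec. 3 can be sketched in a few lines. This is only an illustrative sketch under stated assumptions, not the authors' implementation: it uses Gaussian RBF test functions to solve Eq. (6) for the FD weights, takes the step-size 1/(t + 1) mentioned under (A2), and assumes that the whole formation translates rigidly by the centroid step. All function names (`solve`, `fd_weights_T`, `gradient_estimate`, `seek`), the source location, and the toy signal field are hypothetical and chosen only for illustration.

```python
import math
import random


def solve(A, B):
    """Solve A X = B (A: n x n, B: n x d) by Gauss-Jordan elimination
    with partial pivoting; returns X as a list of n rows."""
    n, d = len(A), len(B[0])
    M = [list(A[i]) + list(B[i]) for i in range(n)]  # augmented matrix [A | B]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))  # pivot row
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return [[M[i][n + j] / M[i][i] for j in range(d)] for i in range(n)]


def sqdist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def centroid(x):
    """Center of mass of the sensor positions."""
    return [sum(p[k] for p in x) / len(x) for k in range(len(x[0]))]


def fd_weights_T(x, delta):
    """Return W(x)^T (n rows, d_x columns) by solving Eq. (6) with Gaussian RBFs."""
    m = centroid(x)
    # Phi[i][j]: test function i evaluated at sensor j
    Phi = [[math.exp(-delta**2 * sqdist(xi, xj)) for xj in x] for xi in x]
    # Row i of R: gradient of test function i at the centroid, cf. Eq. (8)
    R = [[2 * delta**2 * math.exp(-delta**2 * sqdist(xi, m)) * (xi[k] - m[k])
          for k in range(len(m))] for xi in x]
    return solve(Phi, R)


def gradient_estimate(x, z, delta):
    """FD estimate W(x) z of the signal gradient at the centroid."""
    Wt = fd_weights_T(x, delta)
    return [sum(Wt[i][k] * z[i] for i in range(len(x))) for k in range(len(x[0]))]


def seek(h, x, steps, delta, noise_std, rng):
    """Run the stochastic ascent of Eq. (9); returns the final centroid."""
    x = [list(p) for p in x]  # work on a copy of the formation
    for t in range(steps):
        z = [h(p) + rng.gauss(0.0, noise_std) for p in x]  # noisy readings, Eq. (1)
        g = gradient_estimate(x, z, delta)
        gamma = 1.0 / (t + 1)  # step-size satisfying (A2)
        for p in x:  # assumption: the formation translates rigidly with the centroid
            for k in range(len(p)):
                p[k] += gamma * g[k]
    return centroid(x)


if __name__ == "__main__":
    source = (3.0, 2.0)  # hypothetical source location
    field = lambda p: math.exp(-sqdist(p, source) / 8.0)  # hypothetical signal field
    team = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    print(seek(field, team, 200, 1.0, 0.01, random.Random(0)))
```

By construction, the weights reproduce the gradient exactly for any signal that is a linear combination of the chosen test functions; for other fields the estimate carries the bias term of Eq. (5), whose size depends on the formation geometry and on the shape parameter `delta`. A rigid translation keeps the relative geometry fixed, so the weights would not change between steps in this sketch; they are recomputed at every step only to mirror the text, where the formation may deform.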



Let $p_t(y)$ be the pdf used by the sensors at time $t$ to estimate the state of the source, which is assumed independent of $x_t$. The following theorem gives an expression for the MI gradient provided that $p(z \mid y, x)$ is differentiable with respect to the sensor configurations.

THEOREM 2 (Ref. [31]). Let random vectors $Y$ and $Z$ be jointly distributed with pdf $p(y, z \mid x)$, which is differentiable with respect to the parameter $x \in \mathcal{X}$. Suppose that the support of $p(y, z \mid x)$ does not depend on $x$. Then, the gradient with respect to $x$ of the MI between $Y$ and $Z$ is

$$ \nabla_x I(Y; Z \mid x) = \iint \left( \nabla_x p(y, z \mid x) \right) \log \frac{p(z \mid y, x)}{p(z \mid x)} \, dy \, dz $$

where $p(z \mid y, x)$ and $p(z \mid x)$ are the conditional and the marginal pdfs of $Z$, respectively.

Obtaining the MI gradient is computationally very demanding for two reasons. First, an approximate representation is needed for the continuous pdfs in the integral. Second, at time $t$, the integration is over the collection of all sensor measurements $z_t = (z_{1,t}^T, \dots, z_{n,t}^T)^T$, which can have a very high dimension in practice. As mentioned in Sec. 1, most existing work has focused on accurate approximations. However, Theorem 2 allows us to make a key observation

$$ \nabla_x I(y; z_t \mid x_t) = \mathbb{E}\left[ \pi_t(z_t, x_t) \mid x_t \right] = \int \pi_t(z_t, x_t) \, p_t(z_t \mid x_t) \, dz_t $$

where

$$ \begin{aligned} \pi_t(z, x) &:= \int \frac{\nabla_x p(z \mid y, x)}{p_t(z \mid x)} \, p_t(y) \log \frac{p(z \mid y, x)}{p_t(z \mid x)} \, dy \\ p_t(z \mid x) &:= \int p(z \mid y, x) \, p_t(y) \, dy \end{aligned} \tag{12} $$

where the independence between $y$ and $x_t$ is used for the decomposition: $p_t(y, z \mid x) = p(z \mid y, x) \, p_t(y)$. Relying on the signal model, the sensors can simulate realizations of the random variable $z_t$, iid with pdf $p_t(z_t \mid x_t)$. Instead of computing the integral in Eq. (12) needed for the gradient ascent (11), we propose the following stochastic algorithm:

$$ x_{t+1} = x_t + \gamma_t \, \pi_t(z_t, x_t) \tag{13} $$

It can be written in the SA form (10) as follows:

$$ \begin{aligned} x_{t+1} &= x_t + \gamma_t \, \mathbb{E}_{z_t}\left[ \pi_t(z_t, x_t) \mid x_t \right] + \gamma_t D_t \\ &= x_t + \gamma_t \left( \nabla_x I(y; z_t \mid x_t) + D_t \right) \end{aligned} $$

where $D_t := \pi_t(z_t, x_t) - \mathbb{E}[\pi_t(z_t, x_t) \mid x_t]$. To evaluate the convergence, we consider Assumptions (A1)–(A5) again. As before, satisfaction of (A2) is achieved by a proper step-size choice, while (A4) holds due to the bounded workspace $\mathcal{X}$. Assumption (A5) is satisfied because in this case the bias term is zero. To verify (A3), note that $D_t$ is measurable with respect to $\mathcal{F}_t = \sigma(x_0, D_s, 0 \le s \le t)$ and for $t \ge 1$

$$ \begin{aligned} \mathbb{E}[D_t \mid \mathcal{F}_{t-1}] &= \mathbb{E}\left[ \pi_t(z_t, x_t) - \mathbb{E}[\pi_t(z_t, x_t) \mid x_t] \mid \mathcal{F}_{t-1} \right] \\ &= \mathbb{E}\left[ \pi_t(z_t, x_t) \mid \mathcal{F}_{t-1} \right] - \mathbb{E}\left[ \pi_t(z_t, x_t) \mid x_t \right] = 0 \end{aligned} $$

Finally, if $p(z \mid y, x)$ and its gradient $\nabla_x p(z \mid y, x)$ are sufficiently regular (e.g., the former is bounded away from zero and the latter is Lipschitz continuous and bounded), the square integrability condition on $D_t$ is satisfied.

This analysis demonstrates that even if a single $z_t$ sample is used to approximate the MI gradient (instead of the integration in Eq. (12)), the stochastic gradient ascent (13) will converge to a local maximum of the MI between the source state and the future sensor measurements.

4.2 Implementation Details. To implement the stochastic gradient ascent in Eq. (13), the sensors need to propagate $p_t(y)$ over time and sample from $p_t(z_t \mid x_t)$. We achieve the first requirement by a particle filter [32, Chap. 4], which approximates $p_t(y)$ by a set of weighted samples $\{w_t^m, y_t^m\}_{m=1}^{N_p}$ as follows: $p_t(y) \approx \sum_{m=1}^{N_p} w_t^m \, \delta(y - y_t^m)$, where $\delta(\cdot)$ is a Dirac delta function. Using the particle set, we can write $\pi_t$ and the measurement pdf as follows:

$$ \begin{aligned} \pi_t(z, x) &\approx \sum_{m=1}^{N_p} w_t^m \, \frac{\nabla_x p(z \mid y_t^m, x)}{p_t(z \mid x)} \log \frac{p(z \mid y_t^m, x)}{p_t(z \mid x)} \\ p_t(z \mid x) &\approx \sum_{m=1}^{N_p} w_t^m \, p(z \mid y_t^m, x) \end{aligned} $$

where $p(z \mid y, x)$ and its gradient can be decomposed further

$$ \begin{aligned} p(z \mid y, x) &= \prod_{j=1}^{n} p(z_j \mid y, x_j) \\ \frac{\partial p(z \mid y, x)}{\partial x_k} &= \frac{\partial p(z_k \mid y, x_k)}{\partial x_k} \prod_{j \ne k} p(z_j \mid y, x_j) \end{aligned} $$

due to the independence of the observations in Eq. (1). In practice, there is a trade-off between moving the sensors and spending time approximating the gradient of the MI (12). The SA in Eq. (13) uses a single sample from $p_t(\cdot \mid x_t)$ but if sampling is fast compared to the time needed to relocate the sensors, more samples can be used to get a better estimate of the gradient. We use Monte Carlo integration, which proceeds as follows:

(1) Sample $m(l)$ from the discrete distribution $w_t^1, \dots, w_t^{N_p}$.
(2) Sample $z(l)$ from the pdf $p(\cdot \mid y_t^{m(l)}, x_t)$.
(3) Repeat steps (1) and (2) to obtain $N_z$ samples $\{z(l)\}_{l=1}^{N_z}$.
(4) Approximate: $\nabla_x I(y; z_t \mid x_t) \approx \frac{1}{N_z} \sum_{l=1}^{N_z} \pi_t(z(l), x_t)$.

Note that the advantage of improving the gradient estimate is not clear and should not necessarily be prioritized over the sensor motion. The SA techniques show that even an approximation with a single sample is sufficient to make progress. In contrast, the related approaches mentioned in the Introduction insist on improving the quality of the gradient estimate as much as possible. Depending on the application, this can slow down the robot motion and possibly make the algorithms impractical. Our more flexible approach adds an extra degree of freedom by allowing a trade-off between the gradient estimation quality and the motion speed of the sensors.

5 Distributed Algorithms

In many scenarios, all-to-all communication is either infeasible or prone to failures. In this section, we present distributed versions of the model-free and the model-based algorithms. Since the model-free algorithm should be applicable to light-weight platforms with no global localization capabilities, the sensors use noisy relative measurements of their neighbors' locations to estimate the collective formation state. In the model-based case, the sensors may spread around the environment and we are forced to assume that each agent is capable of estimating its own state $x_{i,t}$. We begin with preliminaries on distributed estimation.

5.1 Preliminaries on Distributed Estimation. Let the communication network of the $n$ sensors be represented by an undirected graph $G = (\{1, \dots, n\}, E)$. Suppose that the sensors need to estimate an unknown static parameter $\theta^* \in \Theta$, where $\Theta \subset \mathbb{R}^{d_\theta}$ is a convex space. At discrete times $k \in \mathbb{N}$, each sensor $i$ receives a random sequence of iid signals $s_i(k) \in \mathbb{R}^{d_i}$, which are drawn from
a distribution with conditional pdf li ðjhÞ and are independent of parameter h* is stationary, the particle positions hmi;k will remain
those received by the other sensors. These individual signals, the same across the sensors for all time. The update equation of
although potentially informative, do not reveal the parameter the distributed filter in Eq. (14), specialized to particle distribu-
completely, i.e., each sensor faces a local identification problem. tions, only needs to propagate the particle importance weights wm i;k
We assume, however, that h* is identifiable if one has access to and is summarized in Algorithm 1.
the signals received by all sensors. Thus, individual sensors need
to supplement their local observations with information communi- Algorithm 1: Distributed particle filter at sensor i
cated with their neighbors. In particular, each sensor i propagates m
a pdf pi;k : H ! R 0 over the parameter space 1: Input: Particle sets fwm j;k ; hj;0 g for m ¼ 1,…, Np and

Y j 2 Ni [ fig, private signal si(k þ 1), and pdf li ðjÞ


pi;kþ1 ðhÞ ¼ gi;k li ðsi ðk þ 1ÞjhÞ ðpj;k ðhÞÞaij 2: Output: Particle weights fwm
i;kþ1 g for m ¼ 1,…, Np
j2Ni [fig P
m
3: Average priors: w i;k exp m
j2Ni [fig aij logðwj;k Þ
h^i ðkÞ 2 arg max pi;k ðhÞ (14) m
h2H 4: Update: wm
i;kþ1 m
w i;k li ðsi ðk þ 1Þjhi;0 Þ for m ¼ 1,…, Np
5: Normalize the weights
where gi,k is a normalization constant ensuring that pi,kþ1 is a 6: Return fwm i;kþ1 g for m ¼ 1,…, Np
proper pdf, Ni is the set of nodes
P (neighbors) connected to sensor
i, and aij are weights such that j2Ni [fig aij ¼ 1. The update is the
same as the standard Bayes rule with the exception that sensor i 5.2 Distributed Model-Free Algorithm. To distribute the
does not just use its own prior but a geometric average of its model-free algorithm (9), the sensor formation needs to estimate
neighbors’ priors. Given that G is connected, Rad and Tahbaz- its configuration xt, the centroid mt, and the SA to the signal gradi-
Salehi [33] show that the distributed estimator (14) is weakly con- ent W(xt)zt at each measurement location (i.e., at each time t)
sistent5 under broad assumptions on the signal models li ðjhÞ. The using only local information. We introduce a fast time-scale k ¼ 0,
results in Refs. [34] and [35] suggest that this algorithm is even 1,…, which will be used for the estimation procedure at each time
applicable to a time-varying graph topology with asynchronous t. During this, the sensors remain stationary and we drop the index
communication. t to simplify the notation. As mentioned earlier, we suppose that
each sensor i receives a relative measurement of the state of each
5.1.1 Specialization to Gaussian Distributions. We now spe- of its neighbors j 2 Ni
cialize the general scheme of Ref. [33] to Gaussian distributions.
sij ðkÞ ¼ xj  xi þ eij ðkÞ; eij ðkÞ N ð0; Eij Þ (17)
To our knowledge, this specialization is new and the theorem
obtained below (Theorem 3) shows that the resulting distributed
linear Gaussian estimator is mean square consistent,6 which is where eij(k) is the measurement noise, which is independent at any
stronger than the weak consistency⁵ shown in Ref. [33, Theorem 1]. Suppose that the agents' measurement signals are linear in the parameter $\theta^*$ and perturbed by Gaussian noise

\[ s_i(k) = H_i \theta^* + \epsilon_i(k), \quad \epsilon_i(k) \sim \mathcal{N}(0, E_i), \quad \forall i \tag{15} \]

Let $\mathcal{G}(\omega, \Omega)$ denote a Gaussian distribution (in information space) with mean $\Omega^{-1}\omega$ and covariance matrix $\Omega^{-1}$. Since the private observations (15) are linear Gaussian, without loss of generality, the pdf $p_{i,k}$ of agent $i$ is the pdf of a Gaussian $\mathcal{G}(\omega_{i,k}, \Omega_{i,k})$. Exploiting that the parameter $\theta^*$ is static, the update equation of the distributed filter in Eq. (14), specialized to Gaussian distributions, is

\[
\begin{aligned}
\omega_{i,k+1} &= \sum_{j \in \mathcal{N}_i \cup \{i\}} a_{ij}\, \omega_{j,k} + H_i^T E_i^{-1} s_i(k) \\
\Omega_{i,k+1} &= \sum_{j \in \mathcal{N}_i \cup \{i\}} a_{ij}\, \Omega_{j,k} + H_i^T E_i^{-1} H_i \\
\hat{\theta}_i(k) &:= \Omega_{i,k}^{-1}\, \omega_{i,k}
\end{aligned}
\tag{16}
\]

In this linear Gaussian case, we prove (in Ref. [36]) a strong result about the quality of the estimates in Eq. (16).

THEOREM 3. Suppose that the communication graph G is connected and the matrix $[H_1^T \ \cdots \ H_n^T]^T$ has rank $d_\theta$. Then, the estimates (16) of all agents converge in mean square to $\theta^*$, i.e.,

\[ \lim_{k \to \infty} E\left[ \|\hat{\theta}_i(k) - \theta^*\|^2 \right] = 0, \quad \forall i \]

5.1.2 Specialization to Particle Distributions. Suppose that the pdf $p_{i,k}$ is represented by a set of particles $\{w^m_{i,k}, \theta^m_{i,k}\}_{m=1}^{N_p}$, which are identical for all sensors initially (at $k = 0$). Since the

pair of times on the fast time-scale and across sensor pairs. If each sensor manages to estimate the states of the whole sensor formation using the measurements $\{s_{ij}(k)\}$, then each can compute the FD weights in Eq. (7) on its own.

The distributed linear Gaussian estimator (16) can be employed to estimate the sensor states $x$. Notice that it is sufficient to estimate $x$ in a local frame because neither the FD computation (7) nor the gradient ascent (9) requires global state information. Assume that all sensors know that sensor 1 is the origin at every measurement location. Let $x^* := [0^T \ (x_2 - x_1)^T \ \cdots \ (x_n - x_1)^T]^T$ denote the true sensor states in the frame of sensor 1. Let $\hat{x}^i(k)$ denote the estimate that sensor $i$ has of $x^*$ at time $k$ on the fast time-scale. The vector form of the measurement equations (17) is

\[ s(k) = (B \otimes I_{d_x})^T x^* + \epsilon(k) \tag{18} \]

where $B$ is the incidence matrix of the communication graph G. The measurements (18) fit the linear Gaussian model in Eq. (15). Since the first element of $x^*$ is always 0, only $(n-1)d_x$ components need to be estimated. As the rank of $B \otimes I_{d_x}$ is also $(n-1)d_x$, Theorem 3 allows us to use the distributed estimator (16) to update $\hat{x}^i(k)$.

Concurrently with the state estimation, sensor $i$ would be obtaining observations $z_{i,t}(k)$ of the signal field for $k = 0, 1, \ldots$.⁷ In the centralized case (Sec. 3), each sensor uses the following gradient approximation:

\[ g(m_t, y) \approx W(x_t) z_t = \sum_{i=1}^{n} \mathrm{col}_i(W(x_t))\, z_{i,t} \tag{19} \]

where $\mathrm{col}_i(W(x_t))$ denotes the $i$th column of the FD-weight matrix. Since $x_t$ and $z_t$ are not available in the distributed setting, each sensor can use its local measurements $z_{i,t}(k)$ and its estimate

5 Weak consistency means that the estimates $\hat{\theta}_i(k)$ converge in probability to $\theta^*$, i.e., $\lim_{k \to \infty} P(\|\hat{\theta}_i(k) - \theta^*\| \ge \epsilon) = 0$ for any $\epsilon > 0$ and all $i$.
6 Mean-square consistency means that the estimates $\hat{\theta}_i(k)$ converge in mean square to $\theta^*$, i.e., $\lim_{k \to \infty} E[\|\hat{\theta}_i(k) - \theta^*\|^2] = 0$ for all $i$.
7 The time-scales of the relative state measurements and the signal measurements might be different but for simplicity we keep them the same.
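The information-form recursion of Eq. (16) can be sketched numerically as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function name `distributed_gaussian_filter`, the four-agent ring, the uniform weights of 1/3 over each agent and its two neighbors, the scalar observation matrices, the noise variance, and the tiny prior information that keeps each information matrix invertible are all hypothetical choices.

```python
import numpy as np


def distributed_gaussian_filter(A, H, E, theta_true, steps, rng):
    """Run the recursion of Eq. (16) for all agents at once.

    A: (n, n) row-stochastic weights, nonzero only for neighbors and self (assumed).
    H: list of (m, d) observation matrices H_i.
    E: list of (m, m) noise covariances E_i.
    Returns an (n, d) array with one estimate of theta per agent.
    """
    n, d = len(H), theta_true.size
    omega = np.zeros((n, d))                       # information vectors
    Omega = np.tile(1e-6 * np.eye(d), (n, 1, 1))   # assumed tiny prior information
    E_inv = [np.linalg.inv(E_i) for E_i in E]
    info_mat = np.stack([H_i.T @ Ei @ H_i for H_i, Ei in zip(H, E_inv)])
    for _ in range(steps):
        # Private measurements s_i(k) = H_i theta* + eps_i(k), as in Eq. (15).
        s = [H_i @ theta_true + rng.multivariate_normal(np.zeros(len(E_i)), E_i)
             for H_i, E_i in zip(H, E)]
        info_vec = np.stack([H_i.T @ Ei @ s_i for H_i, Ei, s_i in zip(H, E_inv, s)])
        omega = A @ omega + info_vec                              # first line of Eq. (16)
        Omega = np.einsum("ij,jab->iab", A, Omega) + info_mat     # second line of Eq. (16)
    # Third line of Eq. (16): theta_hat_i = Omega_i^{-1} omega_i for every agent.
    return np.linalg.solve(Omega, omega[..., None])[..., 0]


# Hypothetical example: ring of 4 agents, each observing one coordinate of theta.
n = 4
A = np.zeros((n, n))
for i in range(n):
    for j in (i - 1, i, i + 1):
        A[i, j % n] = 1.0 / 3.0
H = [np.array([[1.0, 0.0]]) if i % 2 == 0 else np.array([[0.0, 1.0]]) for i in range(n)]
E = [np.array([[0.25]]) for _ in range(n)]
theta_true = np.array([2.0, -1.0])
estimates = distributed_gaussian_filter(A, H, E, theta_true, steps=400,
                                        rng=np.random.default_rng(0))
```

No single agent in this example can observe the full parameter, yet the stacked observation matrices have full column rank, which is the situation addressed by Theorem 3.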

Journal of Dynamic Systems, Measurement, and Control MARCH 2015, Vol. 137 / 031004-5



$\hat{x}^i_t(k)$ of the sensor states to form its own local estimate of the signal gradient

\[ \hat{g}_{i,t}(k) := \mathrm{col}_i(W(\hat{x}^i_t(k)))\, \frac{1}{k+1} \sum_{\tau=0}^{k} z_{i,t}(\tau) \tag{20} \]

In order to obtain an approximation to $g(m_t, y)$ as in Eq. (19) in a distributed manner, we use a high-pass dynamic consensus filter [37] to have the sensors agree on the value of the sum

\[ \hat{g}_t(k) := n \left( \frac{1}{n} \sum_{i=1}^{n} \hat{g}_{i,t}(k) \right) \]

Each node maintains a state $q_{i,k}$, receives an input $\mu_{i,k}$, and provides an output $r_{i,k}$ with the following dynamics:

\[
\begin{aligned}
q_{i,k+1} &= q_{i,k} + \beta \sum_{j \in \mathcal{N}_i} (q_{j,k} - q_{i,k}) + \beta \sum_{j \in \mathcal{N}_i} (\mu_{j,k} - \mu_{i,k}) \\
r_{i,k} &= q_{i,k} + \mu_{i,k}
\end{aligned}
\tag{21}
\]

where $\beta > 0$ is a step-size. For a connected network, Ref. [37, Theorem 1] guarantees that $r_{i,k}$ converges to $\frac{1}{n} \sum_i \mu_{i,k}$ as $k \to \infty$. The following result can be shown by letting $\mu_{i,k} := \hat{g}_{i,t}(k)$ and is proved in the Appendix.

THEOREM 4. Suppose that the communication graph G is strongly connected. If the sensor nodes estimate their states $x^*$ from the relative measurements (18) using algorithm (16), compute the FD weights (7) using the state estimates, and run the dynamic consensus filter (21) with input $\mu_{i,k} := \hat{g}_{i,t}(k)$, which was defined in Eq. (20), then the output $r_{i,k}$ of the consensus filter satisfies

\[ n \lim_{k \to \infty} E[r_{i,k}] = g(m^*, y) + b, \quad \forall i \in \{1, \ldots, n\} \]

where $g(m^*, y)$ is the true signal gradient at $m^* := \sum_{i=1}^{n} x_i / n$ and $b$ is the error in the FD approximation (5).

After this procedure, the agents agree on a centroid for the formation and a gradient estimate, which can be used to compute the next formation centroid according to Eq. (9). Since the FD weights are recomputed at every $t$, the formation need not be maintained accurately. This allows the sensors to avoid obstacles and takes care of the motion uncertainty.

5.3 Distributed Model-Based Algorithm. In this section, we aim to distribute the model-based source-seeking algorithm (13). We assume that sensors which are sufficiently far from each other receive independent information. This is justified because when the sensing footprints of two sensors do not overlap, their sensed signals (if any) will not be coming from the same source. As a result, computing the MI gradient in Eq. (12) with respect to $x_i$ is decoupled from the states of the distant sensors.

THEOREM 5. Let $\mathcal{V}_i$ denote the set of sensors (excluding $i$) whose sensing footprints overlap with that of sensor $i$. Let $\bar{\mathcal{V}}_i$ denote the rest of the sensors. Suppose that sensor $i$'s measurements, $z_i$, are independent (not conditionally on $y$, as before) of the measurements, $z_{\bar{\mathcal{V}}_i}$, obtained by the sensors $\bar{\mathcal{V}}_i$, i.e., $p_t(z_i, z_{\bar{\mathcal{V}}_i} \mid x_i, x_{\bar{\mathcal{V}}_i}) = p_t(z_i \mid x_i)\, p_t(z_{\bar{\mathcal{V}}_i} \mid x_{\bar{\mathcal{V}}_i})$. Then,

\[ \frac{\partial}{\partial x_i} I(y; z_i, z_{\mathcal{V}_i}, z_{\bar{\mathcal{V}}_i} \mid x_i, x_{\mathcal{V}_i}, x_{\bar{\mathcal{V}}_i}) = \frac{\partial}{\partial x_i} I(y; z_i, z_{\mathcal{V}_i} \mid x_i, x_{\mathcal{V}_i}) \]

Proof. By the chain rule of MI and then the independence of $z_i$ and $z_{\bar{\mathcal{V}}_i}$

\[
\begin{aligned}
I(y; z_i, z_{\mathcal{V}_i}, z_{\bar{\mathcal{V}}_i} \mid x_i, x_{\mathcal{V}_i}, x_{\bar{\mathcal{V}}_i})
&= I(y; z_i, z_{\mathcal{V}_i} \mid x_i, x_{\mathcal{V}_i}) + I(y; z_{\bar{\mathcal{V}}_i} \mid z_i, z_{\mathcal{V}_i}, x_i, x_{\mathcal{V}_i}, x_{\bar{\mathcal{V}}_i}) \\
&= I(y; z_i, z_{\mathcal{V}_i} \mid x_i, x_{\mathcal{V}_i}) + I(y; z_{\bar{\mathcal{V}}_i} \mid z_{\mathcal{V}_i}, x_{\mathcal{V}_i}, x_{\bar{\mathcal{V}}_i})
\end{aligned}
\]

The second term above is constant with respect to $x_i$. ∎

As a result of Theorem 5 and the SA algorithm in Eq. (13), sensor $i$ updates its pose as follows:

\[ x_{i,t+1} = x_{i,t} + \gamma_t\, p_t(z_{\{i\} \cup \mathcal{V}_i, t},\, x_{\{i\} \cup \mathcal{V}_i, t}) \tag{22} \]

This update is still not completely distributed as it requires knowledge of $x_{\mathcal{V}_i,t}$ and the pdf $p_t$.⁸ We propose to distribute the computation of $p_t$ via the distributed particle filter (Algorithm 1). Then, each sensor maintains its own estimate of the source pdf, $p_{i,t}$, represented by a particle set $\{w^m_{i,t}, y^m_{i,t}\}$. Given a new measurement, $z_{i,t+1}$, sensor $i$ averages its prior, $p_{i,t}$, with the priors of its neighbors and updates it using Bayes rule. Finally, to obtain $x_{\mathcal{V}_i,t}$ we use a flooding algorithm (Algorithm 2). The convergence analysis of the gradient ascent scheme in the distributed case (22) remains the same as in Sec. 4 because each sensor $i$ computes the complete MI gradient. This is possible because, by Theorem 5, the states and measurements of distant sensors are not needed, while Algorithm 2 provides the information from the nearby sensors.

Algorithm 2: State exchange algorithm at sensor i
1: Input: Communication radius r_c, sensing radius r_s, state x_i
2: Output: Array a_i with a_i[j] = x_j if j ∈ V_i ∪ {i} and a_i[j] = empty else
3: a_i[i] ← x_i; a_i[j] ← empty, j ≠ i    ▷ Holds the required sensor states
4: b ← min{ceil(2 r_s / r_c), n}    ▷ Number of rounds needed
5: for k = 1…b do
6:   Send a_i to neighbors N_i, receive {a_j} from j ∈ N_i
7:   for j ∈ N_i do
8:     for l = 1…n do
9:       if (a_i[l] = empty) && (a_j[l] ≠ empty) then a_i[l] ← a_j[l]

6 Applications
The performance of the source-seeking algorithms is demonstrated in simulation using a team of ten sensors to localize the source of a wireless radio signal. A radio signal is suitable for comparing the two algorithms because it is very noisy and difficult to model and yet most approaches for wireless source seeking are model-based. We begin by modeling the received signal strength (RSS), which is needed for the model-based algorithm.

6.1 RSS Model. Let the positions of a wireless source and receiver in 2D be $y$ and $x$, respectively. The RSS (dBm) at $x$ is modeled as

\[ P_{rx}(x, y) = P_{tx} + G_{tx} - L_{tx} + G_{rx} - L_{rx} - L_{fs}(x, y) - L_m(x, y) - R(x, y) \]

where $P_{tx}$ is the transmitter output power (18 dBm in our experiments), $G_{tx}$ is the transmitter antenna gain (1.5 dBi), $L_{tx}$ is the transmitter loss (0 dB), $G_{rx}$ is the receiver antenna gain (1.5 dBi), $L_{rx}$ is the receiver loss (0 dB), $L_{fs}$ is the free space loss (dB), $L_m$ is the multipath loss (dB), and $R$ is the noise. The free space loss is modeled as

\[ L_{fs}(x, y) = -27.55 + 20 \log_{10}(\nu) + 20 \log_{10}(\|x - y\|_2) \]

where $\nu$ is the frequency (2400 MHz). The model from Ref. [38] is used for the multipath loss

\[ L_m(x, y) = \begin{cases} a + b\, k(x, y), & \text{if } k(x, y) > 0 \\ 0, & \text{else} \end{cases} \]

8 Since all sensors have the same observation model $h(\cdot, \cdot)$, each sensor can simulate measurements $z_{\mathcal{V}_i,t}$ as long as it knows the configurations $x_{\mathcal{V}_i,t}$.
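The line-of-sight part of the RSS model in Sec. 6.1 can be evaluated with a few lines of Python. This is a minimal sketch: it uses the transmitter and receiver constants quoted in the text, sets the multipath loss to zero, and omits the random fading term. The function names are hypothetical, and measuring distance in meters is an assumption tied to the usual form of the free-space formula with frequency in MHz.

```python
import math

# Constants quoted in Sec. 6.1 (variable names are illustrative only).
P_TX_DBM = 18.0    # transmitter output power
G_TX_DBI = 1.5     # transmitter antenna gain
L_TX_DB = 0.0      # transmitter loss
G_RX_DBI = 1.5     # receiver antenna gain
L_RX_DB = 0.0      # receiver loss
FREQ_MHZ = 2400.0  # carrier frequency


def free_space_loss_db(x, y, freq_mhz=FREQ_MHZ):
    """Free space loss between receiver x and source y (distance assumed in meters)."""
    distance = math.dist(x, y)  # Euclidean norm, requires Python 3.8+
    return -27.55 + 20.0 * math.log10(freq_mhz) + 20.0 * math.log10(distance)


def mean_rss_los_dbm(x, y):
    """Noise-free RSS for a line-of-sight link (multipath loss 0, fading omitted)."""
    return (P_TX_DBM + G_TX_DBI - L_TX_DB + G_RX_DBI - L_RX_DB
            - free_space_loss_db(x, y))


loss_at_1m = free_space_loss_db((0.0, 0.0), (1.0, 0.0))
rss_at_10m = mean_rss_los_dbm((0.0, 0.0), (10.0, 0.0))
```

Each tenfold increase in distance adds 20 dB of free space loss, so the mean RSS falls by 20 dB per decade of range.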



where $a$ is a multiwall constant (30 dB), $b$ is a wall attenuation factor (15 dB/m), and $k(x, y)$ denotes the distance traveled by the ray from $y$ to $x$ through occupied cells in the environment (represented as an occupancy grid). Finally, if the measurement is line-of-sight, i.e., $k(x, y) = 0$, the fading $R(x, y)$ is Rician $(\mu, \sigma)$; otherwise it is Rayleigh $(\sigma)$. We used $\mu = 4$ dB and $\sigma = 20$ dB in the simulations.

6.2 Simulation Results. The first experiment aims at verifying the conclusions of Theorem 4 when the sensor formation is not maintained well, namely that the distributed relative pose estimation and the consensus on the local FD gradient estimates converge asymptotically to an unbiased (up to the error in the FD approximation) gradient estimate. Ten sensors were arranged in a distorted "circular" formation (see Fig. 1) and were held stationary during the estimation procedure (on the fast time-scale). Initially, the sensors assumed that they were in a perfect circular formation of radius 1.75 m. Relative measurements (17) with noise covariance $E_{ij} = 0.4 I_2$ were exchanged to estimate the sensor states. At each time $k$, sensor $i$ used its estimate $\hat{x}^i(k)$ to compute the FD weights via Eq. (7). Wireless signal measurements obtained according to the RSS model were combined with the FD weights to form the local gradient estimates (20), which were used to update the state of the consensus filter according to Eq. (21). Figure 1 shows that the errors in the pose and the gradient estimates tend to zero after 80 iterations on the fast time-scale.

Next, we demonstrate the ability of our algorithms to localize the source of a wireless signal obtained according to the RSS model of Sec. 6.1. The performance of the model-free algorithm is illustrated in Fig. 2. A circular formation with radius of 1.75 m consisting of ten sensors was maintained. The communication radius was 6 m, while the sensing radius was infinite. The sensors did not coordinate to maintain the formation. They were kept together by the agreement on the centroid and the signal gradient,

Fig. 1 Joint position and gradient estimation at a single measurement location (on the fast time-scale). The first plot shows the true sensor positions (red circles), initial position estimates (blue circles), and the true gradient of the signal field (red arrow). The second plot shows the position estimates after 40 iterations (blue circles) and the gradient estimate of sensor 1 (blue arrow). The third column shows the root mean squared error (RMSE) of the position (top) and centroid (bottom) estimates of all sensors averaged over 50 independent repetitions. The fourth column shows the RMSE of the gradient magnitude and orientation estimates.

Fig. 2 The paths followed by the sensors after 30 iterations of the model-free source-seeking algorithm in an obstacle-free environment. The white circles indicate sensor 1's estimates of the source position over time. The plots on the right show the average error of the source position estimates and its standard deviation averaged over 50 independent repetitions.

Fig. 3 The paths followed by the sensors after 30 iterations of the model-based source-seeking algorithm in an environment without obstacles (left) and with obstacles (right). The white circles indicate sensor 1's estimates of the source position over time. The plots show the average error of the source position estimates and its standard deviation averaged over 50 independent repetitions. The evolution of sensor 1's distributed particle filter is shown in each scenario (bottom row).
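The dynamic consensus filter of Eq. (21), which the simulations use to agree on the gradient estimate, can be sketched as below. This is a hedged illustration rather than the code behind the reported experiments: the function name, the five-node ring, the step size 0.2, the constant inputs, and the zero initial filter state are all assumptions. The zero initial state matters because the outputs only track the input average when the initial states sum to zero.

```python
import numpy as np


def dynamic_consensus(adj, mu_seq, beta):
    """Simulate the filter of Eq. (21) on an undirected graph.

    adj: (n, n) symmetric 0/1 adjacency matrix.
    mu_seq: (K, n, d) array, inputs of the n nodes at each of K steps.
    Returns a (K, n, d) array of node outputs (state plus input).
    """
    laplacian = np.diag(adj.sum(axis=1)) - adj
    q = np.zeros_like(mu_seq[0])           # assumed zero initial state
    outputs = []
    for mu in mu_seq:
        outputs.append(q + mu)             # output equation of Eq. (21)
        # The sum over neighbors of (v_j - v_i) equals -(laplacian @ v)_i.
        q = q - beta * (laplacian @ q) - beta * (laplacian @ mu)
    return np.stack(outputs)


# Hypothetical example: ring of 5 nodes holding constant 2D inputs.
n = 5
adj = np.zeros((n, n))
for i in range(n):
    adj[i, (i + 1) % n] = adj[(i + 1) % n, i] = 1.0
mu0 = np.arange(10.0).reshape(n, 2)
mu_seq = np.tile(mu0, (200, 1, 1))
r = dynamic_consensus(adj, mu_seq, beta=0.2)
```

With constant inputs, every node's output approaches the average of the inputs, provided the step size is small enough for the graph (here, below 2 divided by the largest Laplacian eigenvalue).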



achieved via the distributed state estimation and the consensus filter. At time $t$, each sensor $i$ applied the control $u_{i,t} = \gamma_t\, \hat{g}_{i,t}(K_{max})$, where $\hat{g}_{i,t}(K_{max})$ is the gradient estimate after $K_{max} = 50$ iterations on the fast time-scale and $\gamma_t$ is the step-size. Unlike the persistent measurements in Fig. 1, the sensors measured their relative states and the wireless signal only 10 times and stopped updating their local gradient estimates to enable faster convergence of the consensus filter. The initial distance between the signal source and the centroid of the sensor formation was 44.2 m. Averaged over 50 independent repetitions, the sensors managed to estimate the source location within 4.62 m in 30 iterations.

The same initial source and sensor positions were used to set up the model-based experiments. Figure 3 shows the performance in environments with and without obstacles. The communication radius was 10 m, while the sensing radius was infinite. The sensors maintained distributed particle filters with 4000 particles and used five signal measurements to update the filters before moving (unlike the ten used in the model-free case). A stochastic MI gradient was obtained via ten simulated signal measurements only. Averaged over 50 independent repetitions, the sensors managed to estimate the source location within 2.96 m in the obstacle-free case and 1.86 m in the obstacle case after 30 iterations of the algorithm. It is interesting to note that the performance of the model-based algorithm is better when obstacles are present in the environment. When the model is good and the environment is known, the wall attenuation of the signal helps discount many hypothetical source locations, which would not be possible in the obstacle-free case (see the filter evolution in Fig. 3). As a result, the model-based algorithm outperforms the model-free one. However, we expect that as the quality of the model degrades so would the performance of the model-based approach and the model-free algorithm would become more attractive.

7 Conclusion
Distributed model-free and model-based approaches for source seeking with a mobile sensor network were developed. The stochastic gradient ascent approach in the model-free case does not need global localization and is robust to deformations in the geometry of the sensor team. The SA simplifies the algorithm and provides convergence guarantees. The model-based method has the sensors follow a stochastic gradient of the MI between their expected measurements and source estimates. In this case, the SA enables a key trade-off between time spent moving the sensors and time spent planning the most informative move. The experiments show that the model-based algorithm outperforms the model-free one when an accurate model of the signal is available. Its drawbacks are that it relies on knowledge of the environment, global localization, and a flooding algorithm to exchange the sensor states, which can be demanding for the network. If computation is limited, the environment is unknown, the signal is difficult to model, or global localization is not available, the model-free algorithm would be the natural choice. In future work, we plan to apply the algorithms to other signals, compare their performance to other approaches in the literature, and carry out real-world experiments.

Acknowledgment
This work was supported by ONR-HUNT Grant No. N00014-08-1-0696 and by TerraSwarm, one of six centers of STARnet, a Semiconductor Research Corporation program sponsored by MARCO and DARPA.

Appendix: Proof of Theorem 4
From Theorem 3, $\hat{x}^i(k) \xrightarrow{L^2} x^*,\ \forall i$, which implies convergence in $L^1$ and in probability. Convergence in $L^1$ implies that the sequence $\{\hat{x}^i(k)\}$ is uniformly integrable (UI) for all $i$ [39, Theorem 5.5.2]. We claim that this implies that the sequence of FD weights $W(\hat{x}^i(k))$ computed in Eq. (7) is UI for each $i$. The matrix $\Phi$ in Eq. (7) is a bounded continuous function of $\hat{x}^i(k)$, which means that there exists a constant $K_i^\Phi < \infty$ for each $i$ such that $\|\Phi(\hat{x}^i(k))^T\|_1 \le K_i^\Phi$. Define $a_i(k) := \hat{x}^i_i(k) - \sum_{j=1}^{n} \hat{x}^i_j(k)/n$. From Eq. (8),

\[
\begin{aligned}
\|W(\hat{x}^i(k))\|_1
&\le \left\| \begin{bmatrix} 2\delta^2 e^{-\delta^2 \|a_1(k)\|_2^2}\, a_1^T(k) \\ \vdots \\ 2\delta^2 e^{-\delta^2 \|a_n(k)\|_2^2}\, a_n^T(k) \end{bmatrix}^T \right\|_1 \left\| \Phi(\hat{x}^i(k))^T \right\|_1 \\
&\le 2\delta^2 K_i^\Phi \sum_{j=1}^{n} e^{-\delta^2 \|a_j(k)\|_2^2}\, \|a_j(k)\|_1 \\
&\le 2\delta^2 K_i^\Phi \sum_{j=1}^{n} \left\| \hat{x}^i_j(k) - \frac{1}{n} \sum_{l=1}^{n} \hat{x}^i_l(k) \right\|_1 \\
&\le 4\delta^2 K_i^\Phi \sum_{j=1}^{n} \|\hat{x}^i_j(k)\|_1 = 4\delta^2 K_i^\Phi\, \|\hat{x}^i(k)\|_1
\end{aligned}
\]

By UI of $\{\hat{x}^i(k)\}$, for any $\epsilon > 0$, there exist $K_i \in [0, \infty)$ such that $E\left[ \|\hat{x}^i(k)\|_1\, 1_{\{\|\hat{x}^i(k)\|_1 \ge K_i\}} \right] \le \epsilon$ for all $k$. Then for all $i$, $k$

\[
\begin{aligned}
E\left[ \|W(\hat{x}^i(k))\|_1\, 1_{\{\|W(\hat{x}^i(k))\|_1 \ge 4\delta^2 K_i^\Phi K_i\}} \right]
&\le 4\delta^2 K_i^\Phi\, E\left[ \|\hat{x}^i(k)\|_1\, 1_{\{4\delta^2 K_i^\Phi \|\hat{x}^i(k)\|_1 \ge 4\delta^2 K_i^\Phi K_i\}} \right] \\
&\le 4\delta^2 K_i^\Phi\, \epsilon
\end{aligned}
\]

Since $W(\hat{x}^i(k))$ is a continuous function of $\hat{x}^i(k)$, by the continuous mapping theorem, $W(\hat{x}^i(k)) \xrightarrow{p} W(x^*),\ \forall i$. This, coupled with the UI of $\{W(\hat{x}^i(k))\}$ for all $i$, implies that $W(\hat{x}^i(k)) \xrightarrow{L^1} W(x^*),\ \forall i$. The signal measurements $z_i(\tau)$ in Eq. (20) are independent of the estimates $W(\hat{x}^i(k))$ because the latter are based on the relative measurements in Eq. (17). Therefore,

\[
\begin{aligned}
E\,\hat{g}_i(k) &= E[\mathrm{col}_i(W(\hat{x}^i(k)))]\, \frac{1}{k+1} \sum_{\tau=0}^{k} E\, z_i(\tau) \\
&= E[\mathrm{col}_i(W(\hat{x}^i(k)))]\, h(x_i, y) \to \mathrm{col}_i(W(x^*))\, h(x_i, y)
\end{aligned}
\tag{23}
\]

Now, consider the behavior of the consensus filter in Eq. (21) with $\mu_{i,k} = \hat{g}_i(k)$. Eliminating the state $q_{i,k}$ and writing the equations in matrix form gives

\[ r_{k+1} = (I_{n d_x} - \beta (L \otimes I_{d_x}))\, r_k + (\mu_{k+1} - \mu_k) \]

where $L$ is the Laplacian of the communication graph G. Taking expectations above results in a deterministic linear time-invariant system, which was analyzed in Ref. [37]. In light of Eq. (23), Proposition 1 in Ref. [37] shows that for all $i$

\[ \lim_{k \to \infty} \left( E[r_{i,k}] - \frac{1}{n} \sum_{j=1}^{n} \mathrm{col}_j(W(x^*))\, h(x_j, y) \right) = 0 \]

Finally, the FD approximation in Eq. (5) shows that

\[ \lim_{k \to \infty} E\, r_{i,k} = \frac{1}{n} W(x^*) \begin{pmatrix} h(x_1, y) \\ \vdots \\ h(x_n, y) \end{pmatrix} = \frac{1}{n} \left( g(m^*, y) + b \right), \quad \forall i \]

References
[1] Lux, R., and Shi, W., 2004, "Chemotaxis-Guided Movements in Bacteria," Crit. Rev. Oral. Biol. Med., 15(4), pp. 207–220.



[2] Frankel, R., Bazylinski, D., Johnson, M., and Taylor, B., 1997, "Magneto-Aerotaxis in Marine Coccoid Bacteria," Biophys. J., 73(2), pp. 994–1000.
[3] Ogren, P., Fiorelli, E., and Leonard, N., 2004, "Cooperative Control of Mobile Sensor Networks," IEEE Trans. Autom. Control, 49(8), pp. 1292–1302.
[4] Sukhatme, G., Dhariwal, A., Zhang, B., Oberg, C., Stauffer, B., and Caron, D., 2007, "Design and Development of a Wireless Robotic Networked Aquatic Microbial Observing System," Environ. Eng. Sci., 24(2), pp. 205–215.
[5] Rybski, P., Stoeter, S., Erickson, M., Gini, M., Hougen, D., and Papanikolopoulos, N., 2000, "A Team of Robotic Agents for Surveillance," International Conference on Autonomous Agents, Barcelona, Spain, ACM, New York, pp. 9–16.
[6] Kumar, V., Rus, D., and Singh, S., 2004, "Robot and Sensor Networks for First Responders," IEEE Pervasive Comput., 3(4), pp. 24–33.
[7] Wu, W., and Zhang, F., 2011, "Experimental Validation of Source Seeking With a Switching Strategy," IEEE International Conference on Robotics and Automation (ICRA), pp. 3835–3840.
[8] Li, S., and Guo, Y., 2012, "Distributed Source Seeking by Cooperative Robots: All-to-All and Limited Communications," IEEE International Conference on Robotics and Automation (ICRA), pp. 1107–1112.
[9] Brinon-Arranz, L., and Schenato, L., 2013, "Consensus-Based Source-Seeking With a Circular Formation of Agents," European Control Conference, pp. 2831–2836.
[10] Zhang, F., and Leonard, N., 2010, "Cooperative Filters and Control for Cooperative Exploration," IEEE Trans. Autom. Control, 55(3), pp. 650–663.
[11] Choi, J., Oh, S., and Horowitz, R., 2009, "Distributed Learning and Cooperative Control for Multi-Agent Systems," Automatica, 45(12), pp. 2802–2814.
[12] Jadaliha, M., Lee, J., and Choi, J., 2012, "Adaptive Control of Multiagent Systems for Finding Peaks of Uncertain Static Fields," ASME J. Dyn. Syst. Meas. Control, 134(5), p. 051007.
[13] Azuma, S., Sakar, M., and Pappas, G., 2012, "Stochastic Source Seeking by Mobile Robots," IEEE Trans. Autom. Control, 57(9), pp. 2308–2321.
[14] Zhang, C., Arnold, D., Ghods, N., Siranosian, A., and Krstić, M., 2007, "Source Seeking With Non-Holonomic Unicycle Without Position Measurement and With Tuning of Forward Velocity," Syst. Control Lett., 56(3), pp. 245–252.
[15] Liu, S., and Krstić, M., 2010, "Stochastic Source Seeking for Nonholonomic Unicycle," Automatica, 46(9), pp. 1443–1453.
[16] Stanković, M., and Stipanović, D., 2010, "Extremum Seeking Under Stochastic Noise and Applications to Mobile Sensors," Automatica, 46(8), pp. 1243–1251.
[17] Ghods, N., and Krstić, M., 2011, "Source Seeking With Very Slow or Drifting Sensors," ASME J. Dyn. Syst. Meas. Control, 133(4), p. 044504.
[18] Atanasov, N., Le Ny, J., Michael, N., and Pappas, G., 2012, "Stochastic Source Seeking in Complex Environments," IEEE International Conference on Robotics and Automation (ICRA), pp. 3013–3018.
[19] Charrow, B., Kumar, V., and Michael, N., 2013, "Approximate Representations for Multi-Robot Control Policies That Maximize Mutual Information," Robotics: Science and Systems (RSS), Berlin, Germany.
[20] Hoffmann, G., and Tomlin, C., 2010, "Mobile Sensor Network Control Using Mutual Information Methods and Particle Filters," IEEE Trans. Autom. Control, 55(1), pp. 32–47.
[21] Dames, P., Schwager, M., Kumar, V., and Rus, D., 2012, "A Decentralized Control Policy for Adaptive Information Gathering in Hazardous Environments," IEEE Conference on Decision and Control (CDC), Dec., pp. 2807–2813.
[22] Julian, B., Angermann, M., Schwager, M., and Rus, D., 2012, "Distributed Robotic Sensor Networks: An Information-Theoretic Approach," Int. J. Rob. Res., 31(10), pp. 1134–1154.
[23] Yin, G., Yuan, Q., and Wang, L., 2013, "Asynchronous Stochastic Approximation Algorithms for Networked Systems: Regime-Switching Topologies and Multiscale Structure," SIAM Multiscale Model. Simul., 11(3), pp. 813–839.
[24] Yu, W., Zheng, W., Chen, G., Ren, W., and Cao, J., 2011, "Second-Order Consensus in Multi-Agent Dynamical Systems With Sampled Position Data," Automatica, 47(7), pp. 1496–1503.
[25] Derenick, J., and Spletzer, J., 2007, "Convex Optimization Strategies for Coordinating Large-Scale Robot Formations," IEEE Trans. Rob., 23(6), pp. 1252–1259.
[26] Fornberg, B., Lehto, E., and Powell, C., 2013, "Stable Calculation of Gaussian-Based RBF-FD Stencils," Comput. Math. Appl., 65(4), pp. 627–637.
[27] Kushner, H., and Yin, G., 2003, Stochastic Approximation and Recursive Algorithms and Applications, 2 ed., Springer-Verlag, New York.
[28] Borkar, V., 2008, Stochastic Approximation: A Dynamical Systems Viewpoint, Cambridge University Press, Cambridge, UK.
[29] Ljung, L., 1977, "Analysis of Recursive Stochastic Algorithms," IEEE Trans. Autom. Control, 22(4), pp. 551–575.
[30] Spall, J., 2003, Introduction to Stochastic Search and Optimization, John Wiley & Sons, Hoboken, NJ.
[31] Schwager, M., Dames, P., Rus, D., and Kumar, V., 2011, "A Multi-Robot Control Policy for Information Gathering in the Presence of Unknown Hazards," Proceedings of International Symposium on Robotics Research, Aug.
[32] Thrun, S., Burgard, W., and Fox, D., 2005, Probabilistic Robotics, MIT, Cambridge, MA.
[33] Rad, K., and Tahbaz-Salehi, A., 2010, "Distributed Parameter Estimation in Networks," IEEE Conference on Decision and Control (CDC), pp. 5050–5055.
[34] Shahrampour, S., and Jadbabaie, A., 2013, "Exponentially Fast Parameter Estimation in Networks Using Distributed Dual Averaging," IEEE Conference on Decision and Control (CDC), pp. 6196–6201.
[35] Tahbaz-Salehi, A., and Jadbabaie, A., 2010, "Consensus Over Ergodic Stationary Graph Processes," IEEE Trans. Autom. Control, 55(1), pp. 225–230.
[36] Atanasov, N., Le Ny, J., and Pappas, G., 2014, "Distributed Algorithms for Stochastic Source Seeking With Mobile Robot Networks: Technical Report," preprint arXiv: 1402.0051.
[37] Spanos, D., Olfati-Saber, R., and Murray, R., 2005, "Dynamic Consensus on Mobile Networks," 16th IFAC World Congress, International Federation of Automatic Control, Prague, Czech Republic.
[38] Capulli, F., Monti, C., Vari, M., and Mazzenga, F., 2006, "Path Loss Models for IEEE 802.11a Wireless Local Area Networks," 3rd International Symposium on Wireless Communication Systems, pp. 621–624.
[39] Durrett, R., 2010, Probability: Theory and Examples, Vol. 4, Cambridge University, Cambridge University Press, New York.
