
sensors

Article
Scalable Gas Sensing, Mapping, and Path Planning
via Decentralized Hilbert Maps
Pingping Zhu 1, * , Silvia Ferrari 1 , Julian Morelli 1 , Richard Linares 2 and Bryce Doerr 3
1 Sibley School of Mechanical and Aerospace Engineering, Cornell University, Ithaca, NY 14853, USA;
[email protected] (S.F.); [email protected] (J.M.)
2 Department of Aeronautics and Astronautics, Massachusetts Institute of Technology,
Cambridge, MA 02139, USA; [email protected]
3 Department of Aerospace Engineering and Mechanics, University of Minnesota,
Minneapolis, MN 55455, USA; [email protected]
* Correspondence: [email protected]

Received: 18 December 2018; Accepted: 24 March 2019; Published: 28 March 2019

Abstract: This paper develops a decentralized approach to gas distribution mapping (GDM) and
information-driven path planning for large-scale distributed sensing systems. Gas mapping is
performed using a probabilistic representation known as a Hilbert map, which formulates the
mapping problem as a multi-class classification task and uses kernel logistic regression to train a
discriminative classifier online. A novel Hilbert map information fusion method is presented for
rapidly merging the information from individual robot maps using limited data communication.
A communication strategy that implements data fusion among many robots is also presented for
the decentralized computation of GDMs. New entropy-based information-driven path-planning
methods are developed and compared to existing approaches, such as particle swarm optimization
(PSO) and random walks (RW). Numerical experiments conducted in simulated indoor and outdoor
environments show that the information-driven approaches proposed in this paper far outperform
other approaches, and avoid mutual collisions in real time.

Keywords: gas sensing; multi-agent systems; very large-scale robotic systems; information theory

1. Introduction
Fugitive emissions and the dispersion of pollutants in the atmosphere are significant concerns
affecting public health as well as climate change. The accidental release of hazardous gases from both
urban and industrial sources is responsible for a variety of respiratory illnesses and environmental
concerns [1,2]. Many sensors are currently fabricated and deployed both indoors and outdoors for
air quality data collection and communication. However, obtaining a spatial representation of a gas
distribution is a challenging problem because existing mapping and fusion algorithms do not scale
to networks of potentially hundreds of mobile and stationary sensors. The decentralized mapping
and path-planning methods presented in this paper are applicable to very large-scale sensor systems,
and as such can potentially be used to assess air quality, classify safe and hazardous areas based on the
concentration of the harmful gases, and even localize fugitive emissions [3].
Due to the fundamental mechanisms of atmospheric gas dispersion, auxiliary tools are required
to ensure early detection and to respond with appropriate counter measures by planning future
measurements efficiently in both space and time. Methods for obtaining gas distribution maps (GDM)
have recently been developed along two lines of research [4]. In the first line of research, a stationary
sensor network is used to collect and fuse measurements to estimate the position of a source, typically
requiring expensive and time-consuming calibration and data-recovery operations [5]. The use of
stationary sensors, however, is not typically effective or sufficiently expedient for gas sensing in
response to high source rates of critical emissions. The second line of research, pursued in this paper,
involves the use of mobile sensors, such as terrestrial robots equipped with gas sensors that can be
controlled to rapidly and efficiently collect and fuse measurements over time. The latter approach
provides for flexible sensor configurations that can respond to measurements online and thus can be
applied to mapping gas distributions in unknown and complex environments.
Mobile gas sensing can be performed by a single robot [2,3,6–9] or by networks of robots also
referred to as multi-agent systems (MAS) [10–14]. Compared to single robots, MAS present several
advantages, including increased probability of success and improved overall operational efficiency [14],
but also present additional technical challenges. In addition to requiring the solution of the GDM problem
as a dynamic optimization problem, multi-agent path planning, coordination, communication, and
fusion can become intractable as the number of robots increases [4,15–20].
MAS path planning and coordination can be achieved via centralized or decentralized methods,
depending on the underlying communication infrastructure [21]. Centralized methods require
persistent communication between a central station and every agent in the network, such that the
central station can process and fuse all sensor measurements and use them to plan, or re-plan, the robot
paths in a coordinated and collaborative fashion. Decentralized methods allow each robot to process
its own measurements individually and then to communicate and fuse them with the measurements of a
subset of collaborative robots, such as its nearest neighbors, with established connectivity. As a result,
the performance of centralized methods depends entirely on the reliability of communication protocols
under the operating conditions. Because fragile communication links are common in many hazardous
scenarios, reliance on persistent communications with the central station and the associated power
consumption can hinder or even prevent the applicability of MAS over large regions of interest that
require repeated long-distance data transmission. Nevertheless, most of the existing GDM methods to
date rely on centralized methods and algorithms [10,12,13].
One of the main challenges in developing decentralized GDM methods lies in solving the data
fusion problem for neighboring robots such that each robot can build its own representation of the
GDM based on local measurements, but also update it incrementally as new information is obtained
from neighboring robots. Considering the limited communication bandwidth and computing resources
of most autonomous robots, it is also impractical to expect each sensor to share all the raw measurement
data with its neighbors. Therefore, the decentralized approach to gas source localization presented
in [11] shares only the largest gas concentration and corresponding source position with its neighbors.
However, this decentralized approach cannot be extended to high-performance GDM representations,
such as Gaussian process (GP) mixture models [8] or kernel-based models [3,22], and fusing GDMs
obtained by different robots would result in redundant and expensive computations that potentially
operate on the same raw data repeatedly.
The Hilbert map is an alternate GDM that represents a probability map learned from local
concentration measurements [7,23]. The novel GDM representation developed in this paper uses
kernel logistic regression (KLR) to express the probability that the gas concentration at a certain
position belongs to a predefined range as a Hilbert map function. By this approach, new decentralized
fusion algorithms can be developed that present several advantages, including decentralized fusion,
agent-level complete GDM representation, and update for decentralized decision making. Additionally,
information fusion operations are implemented via simple summations and only the local Hilbert
map needs to be shared among neighboring robots at every measurement update. As a result, at the
end time of the task, information about the gas concentration distribution over the entire region of
interest can be delivered to the client or operator even if only a few robots complete the mission.
Two GDM-based path-planning methods, the entropy-based artificial potential field (EAPF) and the
entropy-based particle swarm optimization (EPSO) algorithms, are presented in this paper, and the
simulation results show that they significantly outperform existing algorithms.

2. Problem Formulation
Consider the problem of optimally planning the trajectory of a distributed sensing system comprised
of N cooperative robots engaged in GDM through a large region of interest (ROI), denoted by W ⊂ R2 .
In general, the gas concentration at a position x ∈ W , at time t, is modeled as a random variable C(x, t),
thus the whole gas concentration is a random field. The gas distribution is considered constant over a
time interval for simplicity and thus is modeled as a spatial function c(x) defined over W . The approach
can be extended to time-varying concentrations by augmenting the dimensionality of the GDM.
Let U ⊂ Rm denote the space of admissible actions or controls. The dynamics of each robot are
governed by stochastic differential equations (SDEs),

ẋn (t) = f[xn (t), un (t), t] + Gw(t), xn ( T0 ) = xn0 and n = 1, · · · , N (1)

where xn (t) ∈ W denotes the nth robot’s position and the velocity at time t, un (t) ∈ U denotes the nth
robot action or control, and xn0 denotes the robot initial conditions at initial time T0 . The robot dynamics
in (1) are characterized by additive Gaussian white noise, denoted by w(t) ∈ R2 , and G ∈ R2×2 is a
known constant matrix. In this paper, we assume that the position of the nth robot is obtained by an
on-board GPS and is denoted by x̂n (t), where (ˆ·) denotes the estimated variable’s value.
All N robots are equipped with identical metal oxide sensors in order to cooperatively map a
time-constant gas distribution, where the quantity and position of gas plumes are unknown a priori.
Because the reaction surface of a single gas sensor is very small (≈1 cm2 ), a single measurement
from a gas sensor can only provide information about a very small area. Therefore, to increase the
resolution of the GDM, a small metal oxide sensor array is used instead of a single oxide sensor
for each robot. The area that is covered by this small metal oxide sensor array is called the field of
view (FOV) of the robot, denoted by S(xn ) ⊂ W . Because the structure of the metal oxide sensor
array is fixed, the positions of each metal oxide sensor can be approximated by the robot position
measurements, denoted by x̂n,m , m = 1 : M, where M is the number of oxide sensors on-board the nth
robot. For example, for a 5 × 5 metal oxide sensor array, the number of sensors is M = 25.
The gas concentration measurement obtained by the mth sensor can be modeled as the sum of the
actual concentration and measurement noise,

ĉn,m (xn ) = c(xn,m ) + v̂(xn,m ) (2)

where v̂(xn,m ) is a realization of the random measurement noise, V (xn,m ). Furthermore, the concentration
measurement, ĉn,m (xn ), can be treated as a sample of the random variable C(xn,m ), defined as

C (xn,m ) , c(xn,m ) + V (xn,m ) (3)

Then, the concentration measurement ĉn,m (xn ) can be considered as an observation of the random
field C (·) at position xn,m .
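As a brief illustration of the measurement model in (2) and (3), the following Python sketch simulates one array reading; the true concentration field c_true and the noise level used here are placeholders, not quantities specified by the model.

```python
import numpy as np

def measure_array(sensor_positions, c_true, noise_std=0.5, rng=np.random.default_rng()):
    """Simulate the readings of one on-board sensor array, Eqs. (2)-(3):
    each sensor observes the true concentration at its position plus Gaussian noise.
    c_true is a hypothetical callable returning c(x); noise_std is an assumed value."""
    c = np.array([c_true(x) for x in sensor_positions])   # c(x_{n,m})
    return c + rng.normal(0.0, noise_std, size=c.shape)   # c_hat(x_{n,m}) = c + v_hat
```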
Because the actual gas concentration, c(x), and the distribution of the measurement noise, V (x),
are unknown, our objective is to approximate the random field C (·), and then use this approximated
random field to estimate the gas concentration map. The cost function of the nth robot over a fixed time
interval [t0 , t f ] can then be expressed as an integral of the future measurement values, conditioned on
past measurements, and the robot control usage, i.e.,
J_n = \int_{W} H[C_n(x \,|\, M_n(T_f))]\, dx + \int_{t_0}^{t_f} u_n(t)^T R\, u_n(t)\, dt \qquad (4)

where H [Cn (x|Mn (t))] is the entropy of the random variable Cn (x|Mn (t)) approximated by the nth
robot given all past measurements, Mn . R is a positive definite matrix that weighs the importance
of the elements of the control input un (t), and the superscript “T” denotes the matrix transpose.

The optimal planning problem is to find the optimal time history of the robot state xn (t) and control
un (t), for all n = 1, · · · , N, so that the cost function Jn in (4) is minimized over [t0 , t f ], and subject
to (1).
Let the time interval [t0 , t f ] be discretized into T f equal time steps ∆t = (t f − t0 )/T f and let
tk = t0 + k∆t denote the discrete-time index with k = 1, · · · , T f . Then the objective function, Jn , can be
rewritten as,

J_n = \int_{W} H[C_n(x \,|\, M_n^{(T_f)})]\, dx + \sum_{k=1}^{T_f} u_{n,k}^T R\, u_{n,k} \qquad (5)

where the subscript k indicates the kth time step and the superscript "(T_f)" indicates the time until
the T_f th step. The term M_n^{(T_f)} = \cup_{k=1}^{T_f} M_{n,k} indicates the measurement history until the T_f th step,
and M_{n,k} denotes the measurements obtained by the nth robot during the kth time step.

3. Representation of GDM
To approximate the probability distribution of the continuous random variable C_n(x \,|\, M_n^{(T)}),
there are several methods, including kernel density estimation (KDE) [24,25] and Gaussian process
regression (GPR) [26]. In the KDE method, the learned probability density function (PDF) is assumed to
be a weighted summation of many parameterized kernel functions, where the weight coefficients and
the kernel parameters are critical and learned from the raw measurement data set. In the GPR method,
the approximated PDF is assumed to be a Gaussian distribution, where a large matrix, the Gram matrix,
is critical and learned from the raw measurement data. For both methods, the fusion of two
different measurement data sets and the update of the approximated PDF are computationally demanding,
because all computations must be carried out on a massive amount of raw data. It is difficult to
obtain the updated parameters and coefficients directly from the previous ones.
The method in this paper overcomes this hurdle through an efficient fusion approach developed
by approximating the continuous probability distribution by a discrete probability distribution. Denote
the range of the concentration in the whole ROI by R = [ Lc , Hc ], and then divide the range into L
concentration intervals denoted by R1 = [ L1 , H1 ), · · · , Rl = [ Ll , Hl ), · · · , RL = [ LL , HL ], where Hl = Ll+1
for l = 1, · · · , L − 1, and L1 = Lc and HL = Hc . The cutoff coefficients, {(L_l, H_l)}_{l=1}^{L}, specify the
concentration intervals of interest. Instead of approximating the PDF, this paper models the probabilities
{p_l(x)}_{l=1}^{L} that the gas concentration at every position x ∈ W belongs to the lth concentration interval, or,

p_l(x) \triangleq P(C(x) \in R_l), \quad l = 1, \cdots, L \qquad (6)

where P(·) denotes the probability operator. Let f_c(x) denote the PDF of the concentration C(x) and
∆L_l = H_l − L_l denote the length of the lth concentration interval. The relationship between p_l(x) and
f_c(x) can be expressed as,

p_l(x) \triangleq P(C(x) \in R_l) = \int_{x \in R_l} f_c(x)\, dx \qquad (7)

where

f_c(L_l) = \lim_{\Delta L_l \to 0} \frac{p_l(x)}{\Delta L_l} \qquad (8)
Therefore, the discrete distribution { pl (x)}lL=1 can be used to describe the probability distribution
of the continuous random variable C (x) as L → ∞. More importantly, it can be used for
decision making, where the concentration intervals correspond to different levels of hazardous gas
concentration. The Hilbert map method, developed by Fabio Ramos and Lionel Ott in [27] to model
obstacle occupancy, is modified here to approximate the discrete distribution { pl (x)}lL=1 over the
entire ROI.

3.1. Mapping with KLR


A Hilbert map is a continuous probability map developed by formulating the mapping problem
as a binary classification task. Let x ∈ W be any point in W and Y ∈ {0, 1} be defined as a categorical
random variable such that
Y = \begin{cases} 1, & \text{if } C(x) \in R \\ 0, & \text{if } C(x) \notin R \end{cases} \qquad (9)

where the realization is denoted by y. The Hilbert map describes the probabilities P(Y = 1|x) = p(x)
and P(Y = 0|x) = 1 − p(x) at the position x.
Consider the training measurement data set, M = \{(x_m, y_m)\}_{m=1}^{M}, where x_m ∈ W indicates
the measurement position, the Boolean variable y_m ∈ \{1, 0\} indicates whether the concentration
measurement, ĉ(x_m), belongs to the range R (where 1 = Yes and 0 = No), and M is the number of
measurements. The probability P(Y | x) is defined as,

p(x) = P(Y = 1 \,|\, x) = 1 - \frac{1}{1 + \exp[f(x)]} = \frac{e^{f(x)}}{1 + e^{f(x)}} \qquad (10)

where,

1 - p(x) = P(Y = 0 \,|\, x) = \frac{1}{1 + e^{f(x)}} \qquad (11)
and f ∈ H is a Hilbert function defined on W . H is a reproducing kernel Hilbert space (RKHS)
associated with a kernel k(·, ·). The kernel mapping is denoted by k(x, ·) = ϕ(x), ϕ(x) ∈ H, and is an
injective map from W to H [28–32]. According to the kernel trick [28–32], the evaluation of the Hilbert
function can be expressed in the form of inner product,

f (x) = h f , ϕ(x)iH (12)

where h·, ·iH indicates the inner product in the RKHS. To learn the Hilbert map, the loss function JH is
defined as,
J_H = \sum_{m=1}^{M} \ell_m + \frac{\lambda}{2}\|f\|_{H}^2 = \sum_{m=1}^{M}\left\{\ln\left[1 + e^{f(x_m)}\right] - y_m f(x_m)\right\} + \frac{\lambda}{2}\|f\|_{H}^2 \qquad (13)

where ℓ_m = − ln[P(Y = y_m | x_m)] is the negative log-likelihood (NLL) of the data (x_m, y_m). Here,
λ is the regularization coefficient, a small user-defined positive scalar, λ ≪ 1. Then the
gradient of the loss function with respect to f is expressed as,
g = \frac{\partial J_H}{\partial f} = \sum_{m=1}^{M}\left[\frac{e^{f(x_m)}}{1 + e^{f(x_m)}}\varphi(x_m) - y_m\varphi(x_m)\right] + \lambda f = \sum_{m=1}^{M}\left[P(Y = 1\,|\,x_m) - y_m\right]\varphi(x_m) + \lambda f = \Phi(p - y) + \lambda f \qquad (14)

where Φ = [ϕ(x_1), · · · , ϕ(x_M)], p = [p(x_1), · · · , p(x_M)]^T = \left[\frac{e^{f(x_1)}}{1 + e^{f(x_1)}}, \cdots, \frac{e^{f(x_M)}}{1 + e^{f(x_M)}}\right]^T,
and y = [y_1, · · · , y_M]^T. In addition, the Hessian operator is expressed as,
H = \frac{\partial g}{\partial f} = \frac{\partial}{\partial f}\left[\frac{e^{f(x_1)}}{1 + e^{f(x_1)}}, \cdots, \frac{e^{f(x_M)}}{1 + e^{f(x_M)}}\right]\Phi^T + \lambda I = \left[\frac{e^{f(x_1)}}{1 + e^{f(x_1)}}\frac{1}{1 + e^{f(x_1)}}\varphi(x_1), \cdots, \frac{e^{f(x_M)}}{1 + e^{f(x_M)}}\frac{1}{1 + e^{f(x_M)}}\varphi(x_M)\right]\Phi^T + \lambda I = \Phi\Lambda\Phi^T + \lambda I \qquad (15)

where I is an identity operator (or matrix) defined on H × H, so that Ih = h for any function h ∈ H.
Here, Λ is an M × M diagonal matrix whose mth diagonal entry is p(x_m)[1 − p(x_m)], i.e.,

\Lambda = \mathrm{diag}\left[p \circ (\mathbf{1}_M - p)\right] \qquad (16)

where ∘ denotes the element-wise product and 1_M = [1, · · · , 1]^T is an M × 1 vector with all elements equal to one.


Using the Newton-Raphson method, the Hilbert function is updated iteratively,

f_{i+1} = f_i - H_i^{-1} g_i = f_i - \left(\Phi\Lambda_i\Phi^T + \lambda I\right)^{-1} g_i
        = f_i - \left(\Phi\Lambda_i\Phi^T + \lambda I\right)^{-1}\left[\Phi(p_i - y) + \lambda f_i\right]
        = f_i - \left(\Phi\Lambda_i\Phi^T + \lambda I\right)^{-1}\Phi(p_i - y) - \lambda\left(\Phi\Lambda_i\Phi^T + \lambda I\right)^{-1} f_i \qquad (17)
        \approx f_i - \left(\Phi\Lambda_i\Phi^T + \lambda I\right)^{-1}\Phi(p_i - y)
        = f_i - \Phi R_i(p_i - y)

where R_i = \left(\Lambda_i\Phi^T\Phi + \lambda I_M\right)^{-1}, I_M is an M × M identity matrix, and the subscript "i" indicates the
ith iteration for learning the function f. The Hilbert function, f(x), is evaluated iteratively as follows,

f_{i+1}(x) = \langle f_{i+1}, \varphi(x)\rangle_{H} = f_i(x) - \varphi^T(x)\Phi R_i(p_i - y) = f_i(x) - k^T R_i(p_i - y) \qquad (18)

where k = \Phi^T\varphi(x) = [k(x_1, x), \cdots, k(x_M, x)]^T.


It can be seen from (18) that the evaluation of f_{i+1}(x) only depends on the evaluations of f_i(x) and
k(x_m, x), m = 1, · · · , M. Therefore, the evaluation of f_i(x) at each iteration is not needed if x ∉ {x_1, · · · , x_M}.
Instead, f_i(x) can be calculated at the last iteration directly from the evaluations of f_i(x_m), such that

f_i(x) = f_0(x) - k^T \sum_{i'=1}^{i} R_{i'}(p_{i'} - y) = f_0(x) - k^T S_i \qquad (19)

where f_0(x) is the initial function evaluation at x, prior to learning the function from M.
The following matrix

S_i \triangleq \sum_{i'=1}^{i} R_{i'}(p_{i'} - y) \qquad (20)

only depends on the measurement data, M, and can be updated iteratively.


Furthermore, consider Q collocation points, X^c = \{x_q^c \in W\}_{q=1}^{Q}, in the ROI, characterized by the
same spatial interval and labeled by the superscript "c". Let \Phi^c = [\varphi(x_1^c), \cdots, \varphi(x_Q^c)] denote the feature
matrix of the collocation points. The evaluations of the function f(x_q^c) at all collocation points can be
updated by,

\mathbf{f}_{i+1} = \mathbf{f}_i - \Phi^{cT}\Phi R_i(p_i - y) \qquad (21)

where \mathbf{f}_i = \Phi^{cT} f_i = [f_i(x_1^c), \cdots, f_i(x_Q^c)]^T. According to (19), the evaluations, \mathbf{f}_i = \Phi^{cT} f_i, can be
updated directly by,

\mathbf{f}_i = \mathbf{f}_0 - \Phi^{cT}\Phi S_i \qquad (22)

where \mathbf{f}_0 comprises the initial function evaluations at the collocation points prior to learning the
function from the measurement data M.

In summary, instead of learning the function f or its coefficients as in traditional KLR or GPR [26],
we update the evaluations of f(x) at the collocation points, X^c. Given the Hilbert function f,
the evaluations at the collocation points provide the Hilbert map, \mathbf{f}, defined as

\mathbf{f} \triangleq \Phi^{cT} f = [f(x_1^c), \cdots, f(x_Q^c)]^T \qquad (23)
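To make Section 3.1 concrete, the following Python sketch learns a Hilbert map from one batch of binary-labeled measurements using a Gaussian kernel and the Newton-Raphson update in (17)-(22). It is an illustrative simplification, not the authors' implementation: the function and parameter names (klr_hilbert_map_update, n_iter, lam) are assumptions, and the prior function evaluations at the new measurement positions are taken to be zero.

```python
import numpy as np

def gaussian_kernel(A, B, sigma=2.0):
    """Gram matrix k(a, b) = exp(-||a - b||^2 / (2 sigma^2)) between rows of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma**2))

def klr_hilbert_map_update(X, y, Xc, f_c0=None, sigma=2.0, lam=1e-2, n_iter=5):
    """
    One batch of the kernel-logistic-regression Hilbert map update, Eqs. (17)-(22):
    Newton-Raphson on the measurement Gram matrix, accumulating S_i and pushing
    the correction onto the collocation-point evaluations (the Hilbert map f).
    Returns the updated map (length Q) and the accumulated correction S_i (length M).
    """
    M, Q = X.shape[0], Xc.shape[0]
    K = gaussian_kernel(X, X, sigma)        # Phi^T Phi, M x M
    K_cm = gaussian_kernel(Xc, X, sigma)    # Phi_c^T Phi, Q x M
    f_c = np.zeros(Q) if f_c0 is None else f_c0.copy()
    f_m = np.zeros(M)                       # f at the measurement positions (assumed zero prior)
    S = np.zeros(M)                         # S_i in Eq. (20)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-f_m))      # p_i, Eq. (10)
        Lam = np.diag(p * (1.0 - p))        # Lambda_i, Eq. (16)
        R = np.linalg.inv(Lam @ K + lam * np.eye(M))   # R_i in Eq. (17)
        step = R @ (p - y)
        f_m -= K @ step                     # Eq. (18) evaluated at the measurement points
        S += step                           # Eq. (20)
    f_c -= K_cm @ S                         # Eq. (22): Hilbert map at the collocation points
    return f_c, S
```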

3.2. Temporal Update of Hilbert Map


The previous subsection develops a method for learning the Hilbert map from the measurement
data set M, obtained by the sensors during one time interval. This subsection considers a Hilbert
map updated according to the next data set, M_k = \{(x_{k,m}, y_{k,m})\}_{m=1}^{M_k}, obtained during the kth time
interval. Here, M_k is the number of measurements in the kth time interval, and X_k = \{x_{k,m}\}_{m=1}^{M_k}
and Y_k = \{y_{k,m}\}_{m=1}^{M_k} are the measurement positions and the corresponding classification estimates,
respectively. To learn the next Hilbert map, the loss function over a period T can be expressed as,
J_T = \sum_{k=1}^{T}\gamma(T - k)\left[\sum_{m=1}^{M_k}\ell_{k,m}\right] + \frac{\lambda}{2}\|f\|_{H}^2 \qquad (24)

where λ is the regularization coefficient as in (13), γ(T − k) is the "forgetting factor",
and ℓ_{k,m} = − ln[P(Y = y_{k,m} | x_{k,m})] is the NLL of the data (x_{k,m}, y_{k,m}). Similarly to (14), the gradient is
expressed as,
g_T = \frac{\partial J_T}{\partial f} = \sum_{k=1}^{T}\gamma(T - k)\sum_{m=1}^{M_k}\frac{\partial \ell_{k,m}}{\partial f} + \lambda f
    = [\Phi_1, \cdots, \Phi_T]\,\Gamma_T\left[(p_1 - y_1)^T, \cdots, (p_T - y_T)^T\right]^T + \lambda f \qquad (25)
    = \tilde{\Phi}_T\,\Gamma_T\left[(p_1 - y_1)^T, \cdots, (p_T - y_T)^T\right]^T + \lambda f

where Φ_k = [ϕ(x_{k,1}), · · · , ϕ(x_{k,M_k})], p_k = [p(x_{k,1}), · · · , p(x_{k,M_k})]^T = \left[\frac{e^{f(x_{k,1})}}{1 + e^{f(x_{k,1})}}, \cdots, \frac{e^{f(x_{k,M_k})}}{1 + e^{f(x_{k,M_k})}}\right]^T,
y_k = [y_{k,1}, · · · , y_{k,M_k}]^T for k = 1, · · · , T, and Φ̃_T = [Φ_1, · · · , Φ_T]. Furthermore, Γ_T is a diagonal matrix
obtained by placing the vector [\gamma(T-1)\mathbf{1}_{M_1}^T, \cdots, \gamma(T-k)\mathbf{1}_{M_k}^T, \cdots, \gamma(0)\mathbf{1}_{M_T}^T]^T on the diagonal
of a zero matrix. In addition, as with (15), the Hessian operator can be expressed as,

H_T = \frac{\partial g_T}{\partial f} = [\Phi_1, \cdots, \Phi_T]\,\Gamma_T\,\tilde{\Lambda}_T\,[\Phi_1, \cdots, \Phi_T]^T + \lambda I = \tilde{\Phi}_T\,\Gamma_T\,\tilde{\Lambda}_T\,\tilde{\Phi}_T^T + \lambda I \qquad (26)

where,

\tilde{\Lambda}_T = \begin{bmatrix} \Lambda_1 & \cdots & 0 & \cdots & 0 \\ \vdots & \ddots & & & \vdots \\ 0 & & \Lambda_k & & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & \cdots & 0 & \cdots & \Lambda_T \end{bmatrix} \qquad (27)

and Λ_k = \mathrm{diag}[p_k \circ (\mathbf{1}_{M_k} - p_k)].
Then, the update rule for learning the Hilbert function, f , at the ith iteration is given by,

f_{T,i+1} = f_{T,i} - H_{T,i}^{-1} g_{T,i}
         \approx f_{T,i} - \left(\tilde{\Phi}_T\Gamma_T\tilde{\Lambda}_{T,i}\tilde{\Phi}_T^T + \lambda I\right)^{-1}\tilde{\Phi}_T\Gamma_T\left[(p_{1,i} - y_1)^T, \cdots, (p_{T,i} - y_T)^T\right]^T \qquad (28)
         = f_{T,i} - \tilde{\Phi}_T\tilde{R}_i\Gamma_T\left[(p_{1,i} - y_1)^T, \cdots, (p_{T,i} - y_T)^T\right]^T

where \tilde{R}_i = \left(\Gamma_T\tilde{\Lambda}_{T,i}\tilde{\Phi}_T^T\tilde{\Phi}_T + \lambda I_{M_{tol}}\right)^{-1} and M_{tol} = \sum_{k=1}^{T} M_k. According to the above equation,
the function, f, is expressed by using the set of all the measurement positions, \cup_{k=1}^{T} X_k, and the
corresponding coefficients.
To reduce the computational load, the forgetting factor, γ(T − k), is modeled by,

\gamma(T - k) = \begin{cases} 1 & \text{if } T - k < \tau \\ 0 & \text{otherwise} \end{cases} \qquad (29)

such that each robot stores only the measurements obtained during the past τ time steps. By setting
τ = 1, the update rule (28) for learning the function f at the ith iteration is expressed as,

f_{T,i+1} = f_{T,i} - \left(\Phi_T\Lambda_{T,i}\Phi_T^T + \lambda I\right)^{-1}\Phi_T(p_{T,i} - y_T) = f_{T,i} - \Phi_T R_{T,i}(p_{T,i} - y_T) \qquad (30)

where R_{T,i} = \left(\Lambda_{T,i}\Phi_T^T\Phi_T + \lambda I_{M_T}\right)^{-1}.
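A minimal sketch of the temporal update with the forgetting factor in (29) and τ = 1 is shown below; it simply folds each new measurement batch into the stored Hilbert map and discards older raw data, reusing the hypothetical klr_hilbert_map_update function sketched in Section 3.1 (batch format and parameter names are assumptions).

```python
import numpy as np

def temporal_hilbert_map(batches, Xc, sigma=2.0, lam=1e-2, n_iter=5):
    """batches: list of (X_k, y_k) measurement batches collected at successive time steps."""
    f_c = np.zeros(Xc.shape[0])              # Hilbert map at the collocation points
    for X_k, y_k in batches:                 # with tau = 1, only the newest batch is processed
        f_c, _ = klr_hilbert_map_update(X_k, y_k, Xc, f_c0=f_c,
                                        sigma=sigma, lam=lam, n_iter=n_iter)
    return f_c
```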

3.3. Approximation of GDM


The previous subsections present a method for learning the Hilbert function, f, from the available
measurement data, comprising the measurement position data set X_k = \{x_{k,m}\}_{m=1}^{M_k} and corresponding
classification estimates Y_k = \{y_{k,m}\}_{m=1}^{M_k} for k = 1, · · · , T that indicate whether the concentration
measurement at the measurement position belongs to the range R defined in (9). If L classification
estimates are obtained from the same concentration measurements, indicating whether the concentration
measurement belongs to each of the L concentration ranges, R_1, · · · , R_L, respectively, then using the
learning method described in Section 3.2, one can approximate L Hilbert functions, f_1, · · · , f_L.
Furthermore, the probabilities in (6) can be evaluated at all the collocation points, X^c = \{x_q^c\}_{q=1}^{Q},
such that

p_l(x_q^c) = 1 - \frac{1}{1 + e^{f_l(x_q^c)}}, \quad q = 1, \cdots, Q \text{ and } l = 1, \cdots, L \qquad (31)

Now, consider the Hilbert functions, \{f_l\}_{l=1}^{L}, approximated locally by each robot. Although all the
concentration intervals are mutually exclusive and complete, i.e.,

R_i \cap R_j = \emptyset \qquad (32)

\cup_{i=1}^{L} R_i = [L_c, H_c], \quad i, j = 1, \cdots, L \text{ and } i \ne j \qquad (33)

it is not guaranteed that the sum of all the learned probabilities, \{p_l(x_q^c)\}_{l=1}^{L}, is equal to one,
or \sum_{l=1}^{L} p_l(x_q^c) = 1. Therefore, the learned probabilities must be normalized to obtain the
discrete distribution, π(x_q^c) = [π_1(x_q^c), · · · , π_L(x_q^c)]. Each component of the discrete distribution
is calculated by,

\pi_l(x_q^c) = \frac{p_l(x_q^c)\prod_{1 \le l' \ne l \le L}\left[1 - p_{l'}(x_q^c)\right]}{p_0(x_q^c)} \propto \frac{p_l(x_q^c)}{1 - p_l(x_q^c)} \qquad (34)

where p_0(x_q^c) is a normalization term. Let c̄ = [c̄_1, · · · , c̄_L] denote the midpoints of these intervals,
where c̄_l = (H_l + L_l)/2. Then, the expectation of the random variable C(x_q^c) can be expressed as,

E[C(x_q^c)] = \pi(x_q^c)^T \bar{c} \qquad (35)

where E(·) is the expectation operator.
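The following Python sketch, under the same assumptions as the earlier sketches, evaluates (31), (34), and (35) to turn the L Hilbert maps of one robot into an estimated concentration map; array shapes and names are illustrative.

```python
import numpy as np

def estimate_gdm(f_maps, interval_midpoints):
    """
    Expected gas concentration at every collocation point, Eqs. (31), (34), (35).
    f_maps: (L, Q) array of Hilbert maps, one per concentration interval.
    interval_midpoints: length-L vector c_bar of interval midpoints.
    """
    p = 1.0 / (1.0 + np.exp(-f_maps))            # Eq. (31): p_l(x_q^c), shape (L, Q)
    odds = p / (1.0 - p)                         # Eq. (34): pi_l is proportional to p_l / (1 - p_l)
    pi = odds / odds.sum(axis=0, keepdims=True)  # normalize over the L intervals
    c_bar = np.asarray(interval_midpoints)
    return c_bar @ pi                            # Eq. (35): expected concentration, shape (Q,)

# Example with the indoor intervals R1 = [0, 30), R2 = [30, 70), R3 = [70, 100]:
# c_hat = estimate_gdm(f_maps, [15.0, 50.0, 85.0])
```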


Sensors 2019, 19, 1524 9 of 21

4. Information Fusion
In this section, the problem of information fusion between neighboring robots is considered,
where each robot builds its own Hilbert function locally using a decentralized approach. Assume
that all the robots use the same collocation points, X c . To limit the amount of communicated data,
the evaluations of the Hilbert functions at the collocation points are shared among the robots, instead
of the coefficients and parameters of the Hilbert functions.

4.1. Hilbert Map Fusion


Consider two robots that have each learned their respective Hilbert functions, f 1 and f 2 , from two
different sets of measurement data, M1 and M2 , respectively. Then, the fused Hilbert function, f F ,
can be obtained based on the following theorem.

Theorem 1. Let f_1(x) and f_2(x) be two Hilbert functions defined on a workspace W, and approximated by two
robots based on their own measurement data sets, M_1 and M_2, respectively. These two Hilbert functions can be
applied to calculate the conditional probability that the concentration is in the range R given the corresponding
measurement data sets, as follows:

p_1(x \,|\, M_1) = 1 - \frac{1}{1 + e^{f_1(x)}} \qquad (36)

p_2(x \,|\, M_2) = 1 - \frac{1}{1 + e^{f_2(x)}} \qquad (37)

Then, the fused conditional probability p_F(x \,|\, M_1, M_2) can be expressed as

p_F(x \,|\, M_1, M_2) = 1 - \frac{1}{1 + e^{f_F(x)}} \qquad (38)

where f_F(x) is the fused Hilbert function. In addition, the fused Hilbert function, f_F(x), can be calculated from
the Hilbert functions, f_1(x) and f_2(x), as follows,

f_F(x) = f_1(x) + f_2(x) - \ln\varepsilon \qquad (39)

where \varepsilon = \frac{p(x)}{1 - p(x)} is the ratio between the prior probabilities, P(C(x) ∈ R) = p(x) and
P(C(x) ∉ R) = 1 − p(x), at x ∈ W.

The proof of Theorem 1 is provided in the Appendix A.1. According to Theorem 1, the following
corollary can be obtained.

Corollary 1. Assume that the prior probability is uniform, i.e., p(x) = 1/2. Then, the ratio between the prior
probabilities is

\varepsilon = \frac{p(x)}{1 - p(x)} = 1 \qquad (40)

and the fusion function can be rewritten as

f_F(x) = f_1(x) + f_2(x) \qquad (41)

Because the prior concentration distribution is unknown, the Hilbert functions are fused according
to (41) in Corollary 1, unless otherwise stated. Furthermore, assume that all the robots have the same
collocation points, X c . The information fusion can be implemented by fusing the Hilbert maps among
neighboring robots,
f H,F = f H,1 + f H,2 (42)

where f_{H,1}, f_{H,2}, and f_{H,F} are the Hilbert maps associated with the Hilbert functions f_1(x), f_2(x),
and f_F(x), respectively.
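Because the maps are fused by simple addition, the decentralized fusion in (41) and (42) reduces to a one-line operation; the short sketch below also includes a numerical sanity check of Theorem 1 for a uniform prior (ε = 1), with the probability values chosen arbitrarily for illustration.

```python
import numpy as np

def fuse_hilbert_maps(f_h1, f_h2):
    """Pairwise fusion of two Hilbert maps, Eq. (42), assuming a uniform prior (Corollary 1)."""
    return f_h1 + f_h2

# Sanity check of Theorem 1 at a single point (epsilon = 1), with arbitrary example values:
# p1, p2 = 0.8, 0.7
# f1, f2 = np.log(p1 / (1 - p1)), np.log(p2 / (1 - p2))      # f = logit(p)
# p_fused = 1.0 / (1.0 + np.exp(-(f1 + f2)))
# Direct Bayes fusion gives the same value: p1*p2 / (p1*p2 + (1 - p1)*(1 - p2))
```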

4.2. Communication Strategy


Any pair of robots in the network can share and update their Hilbert maps efficiently according
to (42). To implement data fusion for a very large number of robots, however, a communication
strategy is also required to determine how to share data between robots characterized by active
communication links. In this paper, gas measurement data are communicated every Tc time steps.
The communication protocol requires four steps at every communication time k, as follows. In the first step, the N robots
are partitioned into smaller communication networks according to their positions and communication
range rc , and Gı denotes the index set of the robots in the ıth communication network. Then, for any
robot, an , with n ∈ Gı , there exists another robot, an′ , such that

d_{n,n'} = \|x_n - x_{n'}\| \le r_c, \quad n, n' \in G_\imath \qquad (43)

where d_{n,n'} is the distance between robots a_n and a_{n'}, and ‖·‖ indicates the Euclidean norm.
As a second step, one robot in every communication network is selected as a temporary data center
(TDC) denoted by anı∗ , nı∗ ∈ Gı . The other robots in Gı send the Hilbert map change ∆fn,k to robot anı∗ .
The Hilbert map change, ∆fn,k , is defined as the change between the nth robot’s Hilbert map at the
current time step k and its Hilbert map at the previous communication time step, k − Tc , such that

∆fn,k = fn,k − fn,k−Tc , n ∈ Gı (44)

where fn,k and fn,k−Tc denote the Hilbert maps of the nth robot at time steps k and k − Tc , respectively.
In the third step, the sum of the Hilbert map changes obtained from all robots in Gı , defined as,

∆fGı ,k , ∑ ∆fn,k (45)


n∈Gı

is computed by the TDC robot, anı∗ , and communicated back to the other robots in Gı . Finally, in the fourth step, all the
robots update their own Hilbert maps by adding the received total Hilbert map changes, ∆fGı ,k , to the
current Hilbert maps, such that
fn,k+ = fn,k + ∆fGı ,k , n ∈ Gı (46)

where fn,k+ represents the Hilbert map after the data fusion process.
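The sketch below outlines one round of the four-step protocol. It assumes synchronized communication rounds and takes each communication network to be a connected component of the rc-disk graph, since the partitioning rule is not further specified here; the TDC selection is left implicit because only the sum of the changes enters (45) and (46). All names and shapes are illustrative.

```python
import numpy as np

def communication_round(positions, maps, prev_maps, r_c):
    """
    One round of the four-step protocol in Section 4.2 (simplified sketch).
    positions: (N, 2) robot positions; maps, prev_maps: (N, Q) Hilbert maps at the
    current step and at the previous communication step.
    """
    N = positions.shape[0]
    # Step 1: partition robots into communication networks (connected components).
    adj = np.linalg.norm(positions[:, None] - positions[None, :], axis=-1) <= r_c
    labels = np.full(N, -1)
    for seed in range(N):
        if labels[seed] >= 0:
            continue
        stack, labels[seed] = [seed], seed
        while stack:
            i = stack.pop()
            for j in np.where(adj[i] & (labels < 0))[0]:
                labels[j] = seed
                stack.append(j)
    fused = maps.copy()
    delta = maps - prev_maps                      # Eq. (44): Hilbert map changes
    for g in np.unique(labels):
        members = np.where(labels == g)[0]
        # Steps 2-3: members send their changes to a TDC, which sums them, Eq. (45).
        total_change = delta[members].sum(axis=0)
        # Step 4: every member adds the total change to its own map, Eq. (46).
        fused[members] += total_change
    return fused
```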

5. Path-Planning Algorithms
In the previous sections, the Hilbert maps \mathbf{f}_{l,n,k}, l = 1, · · · , L, n = 1, · · · , N and k = 1, · · · , T_f,
are approximated and updated by the robots. The corresponding binary probabilities p_{l,n,k} can be
calculated efficiently as follows

p_{l,n,k} = \left[p_{l,n,k}(x_1^c), \cdots, p_{l,n,k}(x_Q^c)\right]^T = \left[\frac{e^{f_{l,n,k}(x_1^c)}}{1 + e^{f_{l,n,k}(x_1^c)}}, \cdots, \frac{e^{f_{l,n,k}(x_Q^c)}}{1 + e^{f_{l,n,k}(x_Q^c)}}\right]^T \qquad (47)

and the entropy at each collocation point x_q^c ∈ X^c, q = 1, · · · , Q, is obtained by

hl,n,k (xcq ) = − pl,n,k (xcq ) log[ pl,n,k (xcq )] − [1 − pl,n,k (xcq )] log[1 − pl,n,k (xcq )] (48)

Then, an entropy map, h_{n,k}, is obtained from the vector,

h_{n,k} = \left[h_{n,k}(x_1^c), \cdots, h_{n,k}(x_Q^c)\right]^T \qquad (49)

where h_{n,k}(x_q^c) = \sum_{l=1}^{L} h_{l,n,k}(x_q^c) denotes the sum of the entropies at the collocation point x_q^c.
According to the cost function in (5), the objective of the nth robot is to minimize the uncertainty
of the concentration distribution, which can be implemented by minimizing the entropy at all the
collocation points, such that

\tilde{J}_n(k) = \sum_{q=1}^{Q} h_{n,k}(x_q^c) \qquad (50)

Therefore, J̃_n(T_f) can be treated as an approximation of the term \int_{W} H[C_n(x \,|\, M_n^{(T_f)})]\, dx in (5).
In the rest of the paper, J̃_n(T_f) is applied as the approximation of the final cost function in (5), unless
otherwise stated. In other words, the nth robot should visit the areas around the collocation points x_q^c
with higher values of h_{n,k}(x_q^c) at the kth time step. Based on this idea, two entropy-based
path-planning algorithms are proposed in the following subsections to control the robots such that the
value of the concentration measurements obtained by the robots in the ROI is optimized over time.
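For reference, a short sketch of the entropy map and approximate cost computation in (47)-(50) is given below; the base-2 logarithm is an assumption consistent with the cost being reported in bits in Section 6.

```python
import numpy as np

def entropy_map(f_maps):
    """
    Entropy map and approximate cost, Eqs. (47)-(50).
    f_maps: (L, Q) Hilbert maps of one robot at the current time step.
    Returns h (length Q, summed over the L intervals) and the cost J_tilde.
    """
    p = 1.0 / (1.0 + np.exp(-f_maps))                               # Eq. (47)
    eps = 1e-12                                                     # guard against log(0)
    h_l = -p * np.log2(p + eps) - (1 - p) * np.log2(1 - p + eps)    # Eq. (48), in bits
    h = h_l.sum(axis=0)                                             # Eq. (49) entries
    return h, h.sum()                                               # Eq. (50)
```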

5.1. Entropy-Based Artificial Potential Field


An information-driven approach is developed by planning the path of the robots such that they
move towards collocation points with higher entropy. The collocation points are treated as goal
positions characterized by attractive artificial potential fields defined as,

U^{att}(x_n, x_q^c) = -\frac{h_{n,k}(x_q^c)}{\|x_n - x_q^c\|^2}, \quad q = 1, \cdots, Q \text{ and } n = 1, \cdots, N \qquad (51)

where the superscript "att" indicates the attractive field. The corresponding attractive gradient is
expressed as

g^{att}(x_n, x_q^c) = \frac{\partial U_q^{att}(x_n)}{\partial x_n} = \frac{U_q^{att}(x_n)}{\|x_n - x_q^c\|^4}(x_n - x_q^c), \quad q = 1, \cdots, Q \qquad (52)
Similarly to classic artificial potential field methods, the repulsive potential functions U^{rep}
generated by the other robots are also considered, such that

U^{rep}(x_n, x_{n'}) = \begin{cases} \frac{1}{2}\left(\frac{1}{\|x_n - x_{n'}\|} - \frac{1}{r_{rep}}\right)^2, & \|x_n - x_{n'}\| \le r_{rep} \\ 0, & \|x_n - x_{n'}\| > r_{rep} \end{cases}, \quad 1 \le n \ne n' \le N \qquad (53)

where r_{rep} is the distance threshold that creates a repulsion effect on the robot. The repulsive gradient is
expressed as

g^{rep}(x_n, x_{n'}) = \begin{cases} -\left(\frac{1}{\|x_n - x_{n'}\|} - \frac{1}{r_{rep}}\right)\frac{x_n - x_{n'}}{\|x_n - x_{n'}\|^3}, & \|x_n - x_{n'}\| \le r_{rep} \\ 0, & \|x_n - x_{n'}\| > r_{rep} \end{cases}, \quad 1 \le n \ne n' \le N \qquad (54)

Using (52) and (54), the total potential gradient for the nth robot is expressed as,

g^{tol}(x_n) = \xi\sum_{q=1}^{Q} g^{att}(x_n, x_q^c) + \eta\sum_{1 \le n' \ne n \le N} g^{rep}(x_n, x_{n'}) \qquad (55)

where ξ and η are user-defined coefficients.


Based on the total gradient, g^{tol}(x_n), the nth robot can be controlled to visit the collocation points
with higher uncertainty while avoiding collisions with other robots. Because it is driven by
the entropy attraction, this algorithm is named the EAPF algorithm.
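A simplified sketch of one EAPF control step is shown below. It follows the intent of (51)-(55), stepping toward high-entropy collocation points and away from robots closer than r_rep, but the gains, the repulsion radius, the speed limit, and the sign convention for descending the potential are illustrative assumptions rather than the exact published controller.

```python
import numpy as np

def eapf_velocity(x_n, others, Xc, h, xi=1.0, eta=1.0, r_rep=5.0, v_max=2.0):
    """
    Simplified EAPF step (Section 5.1): attraction toward high-entropy collocation
    points, motivated by Eq. (51), and repulsion from robots closer than r_rep,
    from Eqs. (53)-(54). h is the entropy map (length Q); gains are assumptions.
    """
    # Attraction: entropy-weighted pull toward every collocation point.
    d_goal = Xc - x_n                                    # (Q, 2)
    dist2 = (d_goal**2).sum(axis=1) + 1e-9
    attract = (h[:, None] * d_goal / dist2[:, None]**2).sum(axis=0)
    # Repulsion from neighbors within r_rep, following the gradient of Eq. (53).
    repel = np.zeros(2)
    for x_o in others:
        d = x_n - x_o
        dist = np.linalg.norm(d) + 1e-9
        if dist <= r_rep:
            repel += (1.0 / dist - 1.0 / r_rep) * d / dist**3
    v = xi * attract + eta * repel
    speed = np.linalg.norm(v)
    return v if speed <= v_max else v * (v_max / speed)  # respect the speed limit
```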

5.2. Entropy-Based Particle Swarm Optimization


The particle swarm optimization (PSO) algorithm proposed by Clerc and Kennedy in [33] and its
variants use a “constriction coefficient” to prevent the “explosion behavior” of the particles, and have
been successfully applied to GDM and gas source localization problems [12,13]. In the original PSO
algorithm and its variants, the concentration measurements are used to update the robot controls.
Considering the different objective function in (5), an entropy-based PSO (EPSO) is proposed.
At the kth time step, the update of the nth robot position, x_{n,k}, can be described as

\nu_{n,k} = \chi\left[\nu_{n,k-1} + \psi_1\, g^{att}(x_{n,k}, b_{n,k}^{neig}) + \psi_2\, g^{att}(x_{n,k}, b_{n,k}^{glob}) + \eta\sum_{1 \le n' \ne n \le N} g^{rep}(x_{n,k}, x_{n',k})\right]
x_{n,k+1} = x_{n,k} + \nu_{n,k} \qquad (56)

where ν_{n,k} represents the velocity of the nth robot at the kth time step, and b_{n,k}^{neig} and b_{n,k}^{glob} are the best
neighboring and global collocation points, respectively. The best neighboring collocation point, b_{n,k}^{neig},
is determined by,

b_{n,k}^{neig} = \underset{x \in X^c,\ \|x_{n,k} - x\| \le r_{neig}}{\arg\max}\, h_{n,k}(x) \qquad (57)

where r_{neig} is a coefficient that specifies the neighborhood area around x_{n,k}. The best global collocation point,
b_{n,k}^{glob}, is determined by

b_{n,k}^{glob} = \underset{x \in X^c}{\arg\max}\, h_{n,k}(x) \qquad (58)

The learning coefficients, ψ_1 ∈ [0, ψ̄_1] and ψ_2 ∈ [0, ψ̄_2], are two uniform random variables. The constant
parameter χ > 0 prevents the explosion behavior. For efficient performance and prevention of the
explosion behavior in (56), the parameter settings of the learning coefficients proposed in [12,13,33] are
applied. The constriction parameter χ > 0 is calculated by (refer to [33]),

\chi = \begin{cases} \dfrac{2\kappa}{\psi - 2 + \sqrt{\psi^2 - 4\psi}}, & \text{if } \psi > 4 \\ \kappa, & \text{otherwise} \end{cases} \qquad (59)

where ψ = ψ̄_1 + ψ̄_2 and κ ∈ [0, 1].
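The following sketch implements one EPSO update for a single robot under stated assumptions: the constriction factor follows (59), the best points follow (57) and (58), and the attraction toward the best points is approximated by unit vectors rather than the exact attractive gradient of (52); parameter values such as ψ̄_1 = ψ̄_2 = 2.05 are illustrative.

```python
import numpy as np

def epso_step(x, v, h, Xc, others, psi1_bar=2.05, psi2_bar=2.05,
              kappa=1.0, eta=1.0, r_neig=5.0, r_rep=5.0, v_max=2.0):
    """
    One EPSO update for a single robot, Eqs. (56)-(59). The constriction factor
    follows Clerc and Kennedy [33]; gains and the unit-vector attraction toward
    the best points are illustrative assumptions.
    """
    psi = psi1_bar + psi2_bar
    chi = 2 * kappa / (psi - 2 + np.sqrt(psi**2 - 4 * psi)) if psi > 4 else kappa  # Eq. (59)
    dist = np.linalg.norm(Xc - x, axis=1)
    near = dist <= r_neig
    b_glob = Xc[np.argmax(h)]                                           # Eq. (58)
    b_neig = Xc[near][np.argmax(h[near])] if near.any() else b_glob     # Eq. (57)
    psi1, psi2 = np.random.uniform(0, psi1_bar), np.random.uniform(0, psi2_bar)
    pull = lambda b: (b - x) / (np.linalg.norm(b - x) + 1e-9)           # toward a best point
    repel = np.zeros(2)
    for x_o in others:                                                  # repulsion, as in Eq. (54)
        d = x - x_o
        r = np.linalg.norm(d) + 1e-9
        if r <= r_rep:
            repel += (1.0 / r - 1.0 / r_rep) * d / r**3
    v_new = chi * (v + psi1 * pull(b_neig) + psi2 * pull(b_glob) + eta * repel)  # Eq. (56)
    speed = np.linalg.norm(v_new)
    if speed > v_max:
        v_new *= v_max / speed
    return x + v_new, v_new
```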

6. Simulations and Results


The performance of the decentralized GDM methods presented in this paper is demonstrated on a
gas sensing application, where information about gas concentration obtained by a large network
of robots is fused and used for information-driven path planning in a decentralized approach.
Two indoor and outdoor environments are simulated and used to test the proposed methods. The new
entropy-based EAPF and EPSO path-planning algorithms are compared to the existing algorithms
known as random walk (RW) and classical particle swarm optimization (CPSO) [12,13,33].

6.1. Indoor GDM Sensing


The decentralized sensing system consists of a network of N = 100 robots characterized by single
integrator dynamics,
ẋi = ui , i = 1, . . . , N (60)

where xi = [ x, y] T is the robot state vector, x, y are the inertial coordinates, and ui = [u1 , u2 ] T is
the robot control vector comprised of the x- and y-velocity components. The robot state/position is
assumed to be observable by a built-in GPS with zero-mean measurement noise w, where w is Gaussian
white noise N (0, Σ) with Σ = 0.05 × I2 . The above distributed sensing system is tasked with mapping a
gas distribution in an ROI W = [0, L x ] × [0, Ly ] where L x = 200 m and Ly = 160 m, with an unknown
gas distribution shown in Figure 1, and over a fixed time interval [0, T f ], where T f = 500 min.
The normalized gas concentration range R = [0, 100] (Figure 1) is divided into L = 3 intervals:
R1 = [0, 30), R2 = [30, 70), and R3 = [70, 100], representing low-hazard, medium-hazard, and
high-hazard concentrations, respectively.

Figure 1. Gas concentration distribution in an indoor environment and initial robot deployment in ROI
(plotted by red points).

The robots are initially deployed at the four corners of the ROI by sampling from a given Gaussian
mixture model with 4 components, where µ1 = [20, 20]^T, µ2 = [20, 140]^T, µ3 = [180, 20]^T, and
µ4 = [180, 140]^T, and identical covariance matrices, Σ = diag(10, 10). The initial robot positions
are shown as red points in Figure 1. Each robot is equipped with a small metal oxide sensor array
comprised of M = 5 × 5 = 25 gas sensors, where the spatial intervals between two adjacent sensors are
all 10 cm. The FOV of each sensor array covers an area of 40 × 40 cm2 in the ROI, which is very
small relative to the whole ROI. The measurement noise of the gas concentration, V(x), is assumed to
also be Gaussian white noise, N (0, Σc (x)), with the covariance matrix Σc (x) = 0.5 × I2
everywhere in W. All robots communicate with each other simultaneously with a communication
period Tc = 10 min. The communication range is rc = 30 m. In addition, Q = 100 × 100 virtual
collocation points are evenly deployed in the ROI to generate the Hilbert maps, where the Gaussian
kernel is used for Hilbert function learning with a kernel size of σ = 2 m, and the parameter τ is set
to 3.
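For completeness, a minimal sketch of the initial deployment described above is given below; the equal component weights, the random seed, and the clipping of samples to the ROI are assumptions.

```python
import numpy as np

# Illustrative sketch of the initial deployment in Section 6.1: N = 100 robots are
# sampled from a 4-component Gaussian mixture with means at the four ROI corners
# and covariance diag(10, 10); equal component weights are an assumption.
rng = np.random.default_rng(0)
N = 100
means = np.array([[20.0, 20.0], [20.0, 140.0], [180.0, 20.0], [180.0, 140.0]])
cov = np.diag([10.0, 10.0])
components = rng.integers(0, 4, size=N)                  # pick a mixture component per robot
positions = np.array([rng.multivariate_normal(means[c], cov) for c in components])
positions = np.clip(positions, [0, 0], [200.0, 160.0])   # keep robots inside the ROI (assumption)
```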
The EPSO and the CPSO neighbor range coefficient, rneig = 5 m, is applied to determine the best
neighboring collocation points. The maximum velocity of each robot is 2 m/min for all simulations.
The four path-planning algorithms, referred to as EAPF, EPSO, CPSO, and RW, are tested for mapping
the gas distribution. The approximated cost function J̃n (k) defined in (50) is calculated by each robot at
every time step as shown in Figure 2, where the solid line represents the mean of the approximated cost
values over all of the robots at the kth time step, and the dashed line indicates one standard deviation
above and below the mean. These cost histories reflect how effective the robots are at mapping the
gas distribution. A lower value of J̃n (k) indicates that more information about the gas distribution is
obtained by the robot at the kth step. It can be observed that the EAPF and EPSO algorithms achieve
significantly better performance than the others. They can map the gas distribution more rapidly
and completely, while the CPSO and RW algorithms perform poorly because they do not use prior
measurement data for planning.
The means and standard deviations of the approximated cost function values of the all robots
at the final time k = T f = 500 for all the algorithms are tabulated in Table 1. It can be seen that the
approximated cost function obtained by the EPSO algorithm outperforms the other algorithms in the
indoor environment. Let n∗ denote the index of the robot that achieves the lowest value of J̃n ( T f ) at the
final step. Then, the value of J̃n∗ (k) represents the best performance among all the robots. The estimates
of the gas concentration in the ROI are calculated based on these learned Hilbert maps according to (35),
where c̄ = [15, 50, 85] is calculated from the cutoff coefficients. For each path-planning algorithm,
three snapshots of the estimated GDMs generated by the n∗ th robot in each simulation are presented
in Figure 3, where the snapshots are taken at the k = 20th, k = 100th, k = 500th time steps, respectively.
It can be seen that the robots controlled by the EAPF and EPSO algorithms obtain clean GDMs at the
final time step, while the robots controlled by the other two existing algorithms cannot complete the
mapping task in the given time period.
Figure 2. Approximated cost functions for different path-planning algorithms in the indoor environment,
where the solid lines represent the mean of the approximated cost values over all the robots at the kth
time step and the dashed lines indicate one standard deviation above and below the solid lines.

Table 1. Statistical results of the approximated cost function at the final time step in the indoor environment.

Algorithm Mean of J̃n ( T f ) Std. of J̃n ( T f )


EAPF 72.01 66.64
EPSO 28.003 36.56
CPSO 19650.11 459.56
RW 15442.17 555.88

The normalized mean square error (NMSE) between the estimated gas distribution and the
actual gas distribution is calculated for each robot. The means and the corresponding standard
deviations of the NMSE over all robots for the different planning algorithms are reported in
Table 2, which shows that the EPSO algorithm outperforms the other algorithms in estimating
the GDMs.

Table 2. Statistical results of the NMSE between the estimated GDMs and the actual GDMs at the final
time step in the indoor environment.

Algorithm Mean of NMSE Std. of NMSE


EAPF 0.17521 0.00934
EPSO 0.17022 0.00432
CPSO 1.72230 0.03225
RW 1.20700 0.12516
Figure 3. Evolution of the estimated gas distribution map in the indoor environment generated by the
n∗ th robot in each simulation at three instants in time, where the red point indicates the position of the
n∗ th robot and white points indicate the other robots, where the sub-figures in the first row (a–c) show
the evolution of the estimated gas distribution obtained by the EAPF algorithm; the sub-figures in the
second row (d–f) show the evolution of the estimated gas distribution obtained by the EPSO algorithm;
the sub-figures in the third row (g–i) show the evolution of the estimated gas distribution obtained by
the CPSO algorithm; and the sub-figures in the fourth row (j–l) show the evolution of the estimated
gas distribution obtained by the RW algorithm.

6.2. Outdoor GDM


To verify the robustness and versatility of the proposed approaches, a GDM shown in Figure 4,
originally presented in [34], is used to represent the gas distribution in an outdoor environment.
The gas concentrations are normalized to the range R = [0, 100] for comparison with the indoor
simulations. In this case, the intervals are chosen as R1 = [0, 60), R2 = [60, 80), and R3 = [80, 100],
to represent the plume shapes. All other parameters, including the initial robot positions, are the same
as those used in the previous subsection.

The approximated cost function J̃n (k) obtained by all four algorithms is plotted in Figure 5.
The approximated cost function values at the final step are also tabulated in Table 3. Similarly to the indoor
simulations, the gas concentration is estimated according to (35), where c̄ = [30, 70, 90] is calculated
based on the cutoff coefficients. The evolution of the estimated GDM obtained by different algorithms
is presented in Figure 6. Furthermore, the statistical results of the NMSE between the estimated GDM
and the actual GDM are reported in Table 4.

Figure 4. Gas concentration distribution in an outdoor environment and initial robot deployment in
the ROI, where the red points represent the robots.
Figure 5. Approximated cost functions for different path-planning algorithms in the outdoor
environment, where the solid lines represent the mean of the approximated cost values over all
the robots at the kth time step and the dashed lines indicate one standard deviation above and below
the solid lines.

Figure 6. Evolution of the estimated gas distribution map in the outdoor environment generated by
the n∗ th robot in each simulation at three instants in time, where the red point indicates the position of
the n∗ th robot and white points indicate the other robots, where the sub-figures in the first row (a–c)
show the evolution of the estimated gas distribution obtained by the EAPF algorithm; the sub-figures
in the second row (d–f) show the evolution of the estimated gas distribution obtained by the EPSO
algorithm; the sub-figures in the third row (g–i) show the evolution of the estimated gas distribution
obtained by the CPSO algorithm; and the sub-figures in the fourth row (j–l) show the evolution of the
estimated gas distribution obtained by the RW algorithm.

Table 3. Statistical results of the approximated cost function at the final time step in the outdoor environment.

Algorithm Mean of J̃n ( T f ) Std. of J̃n ( T f )


EAPF 243.9842 263.3285
EPSO 19.7781 55.7657
CPSO 20232.4784 147.2573
RW 16510.9489 380.798

Table 4. Statistical results of the NMSE of the estimated gas distribution maps at the final time step in the outdoor environment.

Algorithm Mean of NMSE Std. of NMSE


EAPF 0.05368 0.00506
EPSO 0.04272 0.00123
CPSO 0.51741 0.00402
RW 0.41734 0.01094

As expected, the results presented in Figures 5 and 6 and Tables 3 and 4 all show that the proposed
EAPF and EPSO algorithms work well in the outdoor GDM problem, while the CPSO and RW algorithms
cannot map the gas concentration in the entire workspace in the given time period. In addition, the
EPSO algorithm significantly outperforms the other algorithms in all simulations.

7. Conclusions
This paper presents a decentralized framework for GDM and information-driven path planning
in distributed sensing systems. GDM is performed using a probabilistic representation known as a
Hilbert map and a novel Hilbert map fusion method is presented that quickly and efficiently combines
information from many neighboring robots. In addition, two entropy-based path-planning algorithms,
namely the EAPF and EPSO algorithms, are proposed to efficiently control all the robots to obtain
the gas concentration measurements in the ROI. The proposed approaches are demonstrated on a
system with hundreds of robots that must map a gas distribution collaboratively over a large ROI
using on-board metal oxide sensor arrays and no prior information. The results show that through fusion
and decentralized processing, the entropy of the gas map decreases over time, the robot paths remain
safe (avoiding mutual collisions), and the entropy-based methods far outperform both traditional and
random approaches.

Author Contributions: Conceptualization, P.Z., J.M. and S.F.; Formal analysis, P.Z. and J.M.; Funding
acquisition, S.F.; Methodology, P.Z.; Software, P.Z.; Project administration, S.F.; Validation, B.D., R.L. and S.F.;
Writing—original draft, J.M. and P.Z.; Writing—review and editing, all authors.
Funding: This research was partially funded by National Science Foundation grant ECCS-1556900 and the Office
of Naval Research, Code 321.
Conflicts of Interest: The authors declare no conflict of interest and the funders had no role in the design of the
study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to
publish the results.

Appendix A

Appendix A.1. Proof of Theorem 1


Given two Hilbert functions f 1 (x) and f 2 (x), the conditional probabilities that the concentration
at the position x belongs to the range R are expressed as

p1 (x|M1 ) = P(C (x) ∈ R|M1 ) (A1)

p2 (x|M2 ) = P(C (x) ∈ R|M2 ) (A2)

For convenience, the event C(x) ∈ R and its complement event C(x) ∉ R are denoted by e and eC ,
respectively. Assuming that the measurement data sets M1 and M2 are conditionally independent given
the event e, the fused conditional probability, p F (x|M1 , M2 ), can be expressed as,
\begin{aligned}
p_F(x \,|\, M_1, M_2) &= P(e \,|\, M_1, M_2) = \frac{P(e, M_1, M_2)}{P(M_1, M_2)} \\
&= \frac{P(e)P(M_1\,|\,e)P(M_2\,|\,e)}{P(e)P(M_1\,|\,e)P(M_2\,|\,e) + P(e^C)P(M_1\,|\,e^C)P(M_2\,|\,e^C)} \\
&= \frac{P(e)P(M_1\,|\,e)\,P(e)P(M_2\,|\,e)}{P(e)^2 P(M_1\,|\,e)P(M_2\,|\,e) + \varepsilon P(e^C)^2 P(M_1\,|\,e^C)P(M_2\,|\,e^C)} \\
&= \frac{P(e, M_1)\,P(e, M_2)}{P(M_1)P(e\,|\,M_1)P(M_2)P(e\,|\,M_2) + \varepsilon P(M_1)P(e^C\,|\,M_1)P(M_2)P(e^C\,|\,M_2)} \\
&= \frac{P(e\,|\,M_1)P(e\,|\,M_2)}{P(e\,|\,M_1)P(e\,|\,M_2) + \varepsilon P(e^C\,|\,M_1)P(e^C\,|\,M_2)} \\
&= \frac{p_1(x\,|\,M_1)\,p_2(x\,|\,M_2)}{p_1(x\,|\,M_1)\,p_2(x\,|\,M_2) + \varepsilon\left[1 - p_1(x\,|\,M_1)\right]\left[1 - p_2(x\,|\,M_2)\right]} \qquad (A3) \\
&= \frac{\frac{e^{f_1(x)}}{1 + e^{f_1(x)}}\frac{e^{f_2(x)}}{1 + e^{f_2(x)}}}{\frac{e^{f_1(x)}}{1 + e^{f_1(x)}}\frac{e^{f_2(x)}}{1 + e^{f_2(x)}} + \varepsilon\frac{1}{1 + e^{f_1(x)}}\frac{1}{1 + e^{f_2(x)}}} \\
&= \frac{e^{f_1(x) + f_2(x)}}{e^{f_1(x) + f_2(x)} + \varepsilon} = \frac{e^{f_1(x) + f_2(x) - \ln\varepsilon}}{e^{f_1(x) + f_2(x) - \ln\varepsilon} + 1} = \frac{e^{f_F(x)}}{e^{f_F(x)} + 1}
\end{aligned}

where,

f_F(x) = f_1(x) + f_2(x) - \ln\varepsilon \qquad (A4)

References
1. Neumann, P.P.; Asadi, S.; Lilienthal, A.J.; Bartholmai, M.; Schiller, J.H. Autonomous gas-sensitive microdrone:
Wind vector estimation and gas distribution mapping. IEEE Robot. Autom. Mag. 2012, 19, 50–61. [CrossRef]
2. Rossi, M.; Brunelli, D. Autonomous gas detection and mapping with unmanned aerial vehicles. IEEE Trans.
Instrum. Meas. 2016, 65, 765–775. [CrossRef]
3. Lilienthal, A.; Loutfi, A.; Blanco, J.L.; Galindo, C.; Gonzalez, J. Integrating SLAM into gas distribution mapping.
In Proceedings of the ICRA Workshop on Robotic Olfaction–Towards Real Applications, Freiburg, Germany,
19–21 September 2007.
4. Bayat, B.; Crasta, N.; Crespi, A.; Pascoal, A.M.; Ijspeert, A. Environmental monitoring using autonomous
vehicles: A survey of recent searching techniques. Curr. Opin. Biotechnol. 2017, 45, 76–84. [CrossRef]
[PubMed]
5. Jelicic, V.; Magno, M.; Brunelli, D.; Paci, G.; Benini, L. Context-adaptive multimodal wireless sensor network
for energy-efficient gas monitoring. IEEE Sens. J. 2013, 13, 328–338. [CrossRef]
6. Ishida, H.; Nakamoto, T.; Moriizumi, T. Remote sensing of gas/odor source location and concentration
distribution using mobile system. Sens. Actuators B Chem. 1998, 49, 52–57. [CrossRef]
7. Lilienthal, A.; Duckett, T. Building gas concentration gridmaps with a mobile robot. Robot. Auton. Syst. 2004,
48, 3–16. [CrossRef]
8. Stachniss, C.; Plagemann, C.; Lilienthal, A.J.; Burgard, W. Gas distribution modeling using sparse
Gaussian process mixture models. In Proceedings of the Robotics: Science and Systems Conference 2008,
Zürich, Switzerland, 25–28 June 2008; pp. 310–317.
9. Albertson, J.D.; Harvey, T.; Foderaro, G.; Zhu, P.; Zhou, X.; Ferrari, S.; Amin, M.S.; Modrak, M.; Brantley, H.;
Thoma, E.D. A mobile sensing approach for regional surveillance of fugitive methane emissions in oil and
gas production. Environ. Sci. Technol. 2016, 50, 2487–2497. [CrossRef]
10. Hayes, A.T.; Martinoli, A.; Goodman, R.M. Distributed odor source localization. IEEE Sens. J. 2002, 2, 260–271.
[CrossRef]
11. Jatmiko, W.; Ikemoto, Y.; Matsuno, T.; Fukuda, T.; Sekiyama, K. Distributed odor source localization in dynamic
environment. In Proceedings of the 2005 IEEE Sensors, Irvine, CA, USA, 30 October–3 November 2005.
[CrossRef]

12. Akat, S.B.; Gazi, V.; Marques, L. Asynchronous particle swarm optimization-based search with a multi-robot
system: Simulation and implementation on a real robotic system. Turk. J. Electr. Eng. Comput. Sci. 2010,
18, 749–764.
13. Turduev, M.; Cabrita, G.; Kırtay, M.; Gazi, V.; Marques, L. Experimental studies on chemical concentration
map building by a multi-robot system using bio-inspired algorithms. Auton. Agents Multi-Agent Syst. 2014,
28, 72–100. [CrossRef]
14. Sinha, A.; Kaur, R.; Kumar, R.; Bhondekar, A.P. Cooperative control of multi-agent systems to locate source
of an odor. arXiv 2017, arXiv:1711.03819.
15. Rudd, K.; Foderaro, G.; Ferrari, S. A generalized reduced gradient method for the optimal control of
multiscale dynamical systems. In Proceedings of the 52nd IEEE Conference on Decision and Control,
Florence, Italy, 10–13 December 2013; pp. 3857–3863.
16. Foderaro, G.; Ferrari, S.; Wettergren, T.A. Distributed optimal control for multi-agent trajectory optimization.
Automatica 2014, 50, 149–154. [CrossRef]
17. Ferrari, S.; Foderaro, G.; Zhu, P.; Wettergren, T.A. Distributed optimal control of multiscale dynamical
systems: A tutorial. IEEE Control Syst. Mag. 2016, 36, 102–116.
18. Rudd, K.; Foderaro, G.; Zhu, P.; Ferrari, S. A Generalized Reduced Gradient Method for the Optimal Control
of Very-Large-Scale Robotic Systems. IEEE Trans. Robot. 2017, 33, 1226–1232. [CrossRef]
19. Foderaro, G.; Zhu, P.; Wei, H.; Wettergren, T.A.; Ferrari, S. Distributed optimal control of sensor networks
for dynamic target tracking. IEEE Trans. Control Netw. Syst. 2018, 5, 142–153. [CrossRef]
20. Doerr, B.; Linares, R.; Zhu, P.; Ferrari, S. Random Finite Set Theory and Optimal Control for Large Spacecraft
Swarms. arXiv 2018, arXiv:1810.00696.
21. Jiménez, A.; García-Díaz, V.; Bolaños, S. A decentralized framework for multi-agent robotic systems. Sensors
2018, 18, 417. [CrossRef]
22. Lilienthal, A.; Duckett, T. Creating gas concentration gridmaps with a mobile robot. In Proceedings of the
2003 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2003) (Cat. No. 03CH37453),
Las Vegas, NV, USA, 27–31 October 2003; Volume 1, pp. 118–123.
23. Moravec, H.; Elfes, A. High resolution maps from wide angle sonar. In Proceedings of the 1985 IEEE
International Conference on Robotics and Automation, St. Louis, MO, USA, 25–28 March 1985; Volume 2,
pp. 116–121.
24. Rosenblatt, M. Remarks on some nonparametric estimates of a density function. Ann. Math. Stat. 1956, 27,
832–837. [CrossRef]
25. Parzen, E. On estimation of a probability density function and mode. Ann. Math. Stat. 1962, 33, 1065–1076.
[CrossRef]
26. Williams, C.K.; Rasmussen, C.E. Gaussian Processes for Machine Learning; MIT Press: Cambridge, MA, USA,
2006; Volume 2.
27. Ramos, F.; Ott, L. Hilbert maps: Scalable continuous occupancy mapping with stochastic gradient descent.
Int. J. Robot. Res. 2015, 35, 1717–1730. [CrossRef]
28. Zhu, P.; Chen, B.; Príncipe, J.C. Extended Kalman filter using a kernel recursive least squares observer.
In Proceedings of the 2011 International Joint Conference on Neural Networks, San Jose, CA, USA,
31 July–5 August 2011; pp. 1402–1408.
29. Zhu, P.; Chen, B.; Príncipe, J.C. A novel extended kernel recursive least squares algorithm. Neural Netw.
2012, 32, 349–357. [CrossRef] [PubMed]
30. Zhu, P. Kalman Filtering in Reproducing Kernel Hilbert Spaces. Ph.D. Thesis, University of Florida,
Gainesville, FL, USA, 2013.
31. Zhu, P.; Chen, B.; Príncipe, J. Learning nonlinear generative models of time series with a Kalman filter in
RKHS. IEEE Trans. Signal Process. 2014, 62, 141–155. [CrossRef]
32. Zhu, P.; Wei, H.; Lu, W.; Ferrari, S. Multi-kernel probability distribution regressions. In Proceedings of
the 2015 International Joint Conference on Neural Networks (IJCNN), Killarney, Ireland, 12–17 July 2015;
pp. 1–7.

33. Clerc, M.; Kennedy, J. The particle swarm-explosion, stability, and convergence in a multidimensional
complex space. IEEE Trans. Evolut. Comput. 2002, 6, 58–73. [CrossRef]
34. Gemerek, J.R.; Ferrari, S.; Albertson, J.D. Fugitive gas emission rate estimation using multiple heterogeneous
mobile sensors. In Proceedings of the 2017 ISOCS/IEEE International Symposium on Olfaction and
Electronic Nose (ISOEN), Montreal, QC, Canada, 28–31 May 2017; pp. 1–3.

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution
(CC BY) license (http://creativecommons.org/licenses/by/4.0/).
