
Digital Signal Processing 22 (2012) 828–840


Cost minimization of measurement devices under estimation accuracy constraints in the presence of Gaussian noise ✩

B. Dulek ∗, S. Gezici ∗
Department of Electrical and Electronics Engineering, Bilkent University, Bilkent, Ankara TR-06800, Turkey

Article history:
Available online 17 April 2012

Keywords:
Measurement cost
Cramer–Rao bound (CRB)
Parameter estimation
Gaussian noise

Abstract

Novel convex measurement cost minimization problems are proposed based on various estimation accuracy constraints for a linear system subject to additive Gaussian noise. Closed form solutions are obtained in the case of an invertible system matrix. In addition, the effects of system matrix uncertainty are studied both from a generic perspective and by employing a specific uncertainty model. The results are extended to the Bayesian estimation framework by treating the unknown parameters as Gaussian distributed random variables. Numerical examples are presented to discuss the theoretical results in detail.

© 2012 Elsevier Inc. All rights reserved.

✩ Part of this work was presented at the IEEE International Workshop on Signal Processing Advances for Wireless Communications (SPAWC), June 2012.
∗ Corresponding authors. Fax: +90 312 266 4192.
E-mail addresses: [email protected] (B. Dulek), [email protected] (S. Gezici).

1051-2004/$ – see front matter © 2012 Elsevier Inc. All rights reserved.
http://dx.doi.org/10.1016/j.dsp.2012.04.009

1. Introduction

In this paper, we propose measurement cost minimization problems under various constraints on estimation accuracy for a system characterized by a linear input–output relationship subject to Gaussian noise. For the measurement cost, we employ the recently proposed measurement device model in [1], and present a detailed treatment of the proposed measurement cost minimization problems. Although the statistical estimation problem in the presence of Gaussian noise is by far the most widely known and well-studied subject of estimation theory [2], approaches that consider the estimation performance jointly with system-resource constraints have become popular in recent years. Distributed detection and estimation problems took the first step by incorporating bandwidth and energy constraints due to data processing at the sensor nodes, and data transmission from sensor nodes to a fusion node in the context of wireless sensor networks (WSNs) [3–7]. Since then, the majority of the related studies have addressed the costs arising from similar system-level limitations, with a relatively weak emphasis on the measurement costs due to amplitude resolution and dynamic range of the sensing apparatus. To begin with, we summarize the main aspects of the research that has been carried out in recent years to unfold the relationship between estimation capabilities and the aforementioned costs of the sensing devices.

In [3], detection problems are examined under a constraint on the expected cost resulting from measurement and transmission stages. It is found that optimal detection performance can be achieved by a randomized on–off transmission scheme of the acquired measurements at a suitable rate. The distributed mean-location parameter estimation problem is considered in [4] for WSNs based on quantized observations. It is shown that when the dynamic range of the estimated parameter is small or comparable with the noise variance, a class of maximum likelihood (ML) estimators exists with performance close to that of the sample mean estimator under a stringent bandwidth constraint of one bit per sensor. When the dynamic range of the estimated parameter is comparable to or larger than the noise variance, an optimum value for the quantization step results in the highest estimation accuracy possible for a given bandwidth constraint. In [5], a power scheduling strategy that minimizes the total energy consumption subject to a constraint on the worst mean-squared-error (MSE) distortion is derived for decentralized estimation in a heterogeneous sensing environment. Assuming an uncoded quadrature amplitude modulation (QAM) transmission scheme and uniform randomized quantization at the sensor nodes, it is stated that, depending on the corresponding channel quality, a sensor is either on or off completely. When a sensor is active, the optimal values of transmission power and quantization level for the sensor can be determined analytically in terms of the channel path losses and local observation noise levels.

In [6], distributed estimation of an unknown parameter is discussed for the case of independent additive observation noises with possibly different variances at the sensors and over nonideal fading wireless channels between the sensors and the fusion center. The concepts of estimation outage and estimation diversity are introduced. It is proven that the MSE distortion can be minimized under sum power constraints by adaptively turning off sensors transmitting over bad channels without degrading the diversity gain. In addition, a performance decrease is reported when

individual power constraints are also imposed at each sensor. In [7], the distributed estimation of a deterministic parameter immersed in uncorrelated noise in a WSN is targeted under a total bit rate constraint. The number of active sensors is determined together with the quantization bit rate of each active sensor in order to minimize the MSE. The problem of estimating a spatially distributed, time-varying random field from noisy measurements collected by a WSN is investigated under bandwidth and energy constraints on the sensors in [8]. Using graph-theoretic techniques, it is shown that the energy consumption can be reduced by constructing reduced order Kalman–Bucy filters from only a subset of the sensors. In order to prevent degradation in the root-mean-squared (RMS) estimation error performance, efficient methods employing a Pareto optimality criterion between the communication costs and the RMS estimation error are presented. A power allocation problem for distributed parameter estimation is investigated under a total network power constraint for various topologies in [9]. It is shown that for the basic star topology, the optimal solution assumes either of the sensor selection, water-filling, or channel inversion forms depending on the measurement noise variance, and the corresponding analytical expressions are obtained. Asymptotically optimal power allocation strategies are derived for more complex branch, tree, and linear topologies assuming amplify-and-forward and estimate-and-forward transmission protocols. The decentralized WSN estimation framework is extended to incorporate the effects of imperfect data transmission from sensors to the fusion center under stringent bandwidth constraints in [10].

Important results are also obtained for the sensor selection problem under various constraints on the system cost and estimation accuracy. The problem of choosing a set of k sensor measurements from a set of m available measurements so that the estimation error is minimized is addressed in [11] under a Gaussian assumption. It is shown that the combinatorial complexity of the solution can be reduced significantly, without sacrificing much estimation accuracy, by employing a heuristic based on convex optimization. In [12], a similar sensor selection problem is analyzed in a target detection framework when several classes of binary sensors with different discrimination performance and costs are available. Based on the conditional distributions of the observations at the fusion center, the performance of the corresponding optimal hypothesis tests is assessed using the symmetric Kullback–Leibler divergence. The solution of the resulting constrained maximization problem indicates that the sensor class with the best performance-to-cost ratio should be selected.

As outlined above, not much work has been performed, to the best of our knowledge, in the context of jointly designing the measurement stage from a cost-oriented perspective while performing estimation up to a predetermined level of accuracy. In other words, the trade-offs between measurement-associated costs and estimation errors remain, to a large extent, unexplored in the literature. On the other hand, if adopted, such an approach will inevitably require a general and reliable method of assessing the cost of measurements applicable to any real-world phenomenon under consideration, as well as an appropriate means of evaluating the best achievable estimation performance without reference to any specific estimator structure. For the fulfillment of the first requirement, a novel measurement device model is suggested in [1], where the cost of each measurement is determined by the number of amplitude levels that can reliably be distinguished. As a consequence, higher resolution (less noisy) measurements demand higher costs, in accordance with the usual practice. Although the proposed model may fall short of capturing the exact relationship between the cost and the inner workings of any specific measurement hardware, it retains sufficient generality to remain useful under a multitude of circumstances. Based on this measurement model, an optimization problem is formulated in [13] in order to calculate the optimal costs of measurement devices that maximize the average Fisher information for a scalar parameter estimation problem.

Although the optimal cost allocation problem is studied for the single parameter estimation case in [13], and signal recovery based on linear minimum mean-squared-error (LMMSE) estimators is discussed under cost-constrained measurements using a linear system model in [1], no studies have analyzed the implications of the proposed measurement device model in a more general setting by considering both random and nonrandom parameter estimation under various estimation accuracy constraints and uncertainty in the linear system model. The main contributions of our study in this paper extend far beyond a multivariate analysis of the discussion in [13], and can be summarized as follows:

• Formulated new convex optimization problems for the minimization of the total measurement cost by employing constraints on various estimation accuracy criteria (i.e., different functionals of the eigenvalues of the Fisher information matrix (FIM)) assuming a linear system model1 in the presence of Gaussian noise.
• Studied system matrix uncertainty both from a general perspective and by employing a specific uncertainty model.
• Obtained closed form solutions for two of the proposed convex optimization problems in the case of an invertible system matrix.
• Extended the results to the Bayesian estimation framework by treating the unknown estimated parameters as Gaussian distributed random variables.

1 Such linear models have a multitude of application areas, a few examples of which are channel equalization, wave propagation, compressed sensing, and Wiener filtering problems [14,15].

In addition to the items listed above, simulation results are presented to discuss the theoretical results. Namely, we compare the performance of various estimation quality metrics through numerical examples using optimal and suboptimal cost allocation schemes, and simulate the effects of system matrix uncertainty. We also examine the behavior of the optimal solutions returned by the various estimation accuracy criteria under scaling of the system noise variances, and identify the criterion most robust to variations in the average system noise power via numerical examples. The relationship between the number of effective measurements and the quality of estimation is also investigated under scaling of the system noise variances.

The rest of this paper is organized as follows: In Section 2, we pose the optimal cost allocation problem as a convex optimization problem under various information criteria for nonrandom parameter vector estimation. In Section 3, we modify the proposed optimization problems to handle the worst-case scenarios under system matrix uncertainty. Next, we take a specific but nevertheless practical uncertainty model, and discuss how the optimization problems are altered while preserving convexity. In Section 4, we focus on two optimization problems proposed in Section 2, and simplify them to obtain closed form solutions in the case of an invertible system matrix. In Section 5, we provide several numerical examples to illustrate the results presented in this paper. Extensions to Bayesian estimation with Gaussian priors are discussed in Section 6, and we conclude in Section 7.

2. Optimal cost allocation under estimation accuracy constraints

Consider a discrete-time system model as in Fig. 1, in which noisy measurements are obtained at the output of a linear system, and then the measurements are processed to estimate the value of a nonrandom parameter vector θ. The observation vector x at the

Fig. 1. Measurement and estimation systems model block diagram for a linear system with additive noise.
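As a quick illustration of the block diagram in Fig. 1, the sketch below simulates the observation and measurement models of Section 2 (x = H^T θ + n and y = x + m), evaluates the total measurement cost of Eq. (1), and checks numerically that the matrix form of the CRB in Eq. (5) agrees with the sum-of-outer-products form in Eq. (9). All numerical values (H, θ, and the noise variances) are arbitrary examples chosen here for illustration, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

L, K = 2, 4                            # parameters / observations, K >= L
H = rng.standard_normal((L, K))        # example system matrix (full row rank a.s.)
theta = np.array([1.0, -2.0])          # example nonrandom parameter vector

sn2 = np.array([0.5, 1.0, 0.8, 1.2])   # system noise variances sigma_{n_i}^2
sm2 = np.array([0.1, 0.2, 0.1, 0.3])   # measurement noise variances sigma_{m_i}^2

# x = H^T theta + n (observation), y = x + m (noisy measurement)
x = H.T @ theta + rng.normal(0.0, np.sqrt(sn2))
y = x + rng.normal(0.0, np.sqrt(sm2))

# Total measurement cost, Eq. (1): sum_i 0.5 * log2(1 + sn2_i / sm2_i)
C = 0.5 * np.log2(1.0 + sn2 / sm2).sum()

# CRB, Eq. (5): (H Cov^{-1}(n + m) H^T)^{-1}, with Cov(n + m) = D_n + D_m diagonal
D = np.diag(1.0 / (sn2 + sm2))
crb = np.linalg.inv(H @ D @ H.T)

# Equivalent rank-one-sum form, Eq. (9), with h_i the ith column of H
crb_sum = np.linalg.inv(
    sum(np.outer(H[:, i], H[:, i]) / (sn2[i] + sm2[i]) for i in range(K))
)
assert np.allclose(crb, crb_sum)
print("total cost (bits):", round(C, 2))
```

Note that the cost uses the system noise variances directly, since σ²_{x_i} = σ²_{n_i} when θ is a deterministic parameter vector.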

output of the linear system can be represented by x = H^T θ + n, where θ ∈ R^L denotes a vector of parameters to estimate, n ∈ R^K is the inherent random system noise, and x ∈ R^K is the observation vector at the output of the linear system. The system noise n is assumed to be a Gaussian distributed random vector with zero mean and independent but not necessarily identical components, i.e., n ∼ N(0, D_n), where D_n = diag{σ²_{n_1}, σ²_{n_2}, ..., σ²_{n_K}} is a diagonal covariance matrix, and 0 denotes the all-zeros vector of length K. We also assume that the number of observations is at least equal to the number of estimated parameters (i.e., K ≥ L), and that the system matrix H is an L × K matrix with full row rank L so that the columns of H span R^L.

Noisy measurements of the observation vector x are made by K measurement devices at the output of the linear system, and then the measured values in vector y ∈ R^K are processed to estimate the parameter vector θ. It is assumed that each measurement device is capable of sensing the value of a scalar physical quantity with some resolution in amplitude according to the measurement model y_i = x_i + m_i, where m_i denotes the measurement noise associated with the ith measurement device. In other words, the measurement devices are modeled to introduce additive random measurement noise, which can be expressed as y = x + m. It is also reasonable to assume that the measurement noise vector m is independent of the inherent system noise n. In addition, the noise components introduced by the measurement devices (the elements of m) are assumed to be zero-mean independent Gaussian random variables with possibly distinct variances,2 i.e., m ∼ N(0, D_m), where D_m is a diagonal covariance matrix given by D_m = diag{σ²_{m_1}, σ²_{m_2}, ..., σ²_{m_K}}. Based on the outputs of the measurement devices, the unknown parameter vector θ is estimated.

2 Since the Gaussian distribution maximizes the differential entropy over all distributions with the same variance, the assumption that the errors introduced by the measurement devices are Gaussian distributed handles the worst-case scenario.

In practical scenarios, a major issue is the cost of performing measurements. The cost of a measurement device is primarily assessed through its resolution, more specifically through the number of amplitude levels that the device can reliably discriminate. Intuitively, as the accuracy of a measurement device increases, so does its cost. Therefore, it may not always be possible to make high resolution measurements with a limited budget. In a recent work [1], a novel measurement device model is proposed where the cost of each device is expressed quantitatively in terms of the number of amplitude levels that can be resolved reliably. In this model, the amplitude resolution of the measurement devices solely determines the cost of each measurement. The dynamic range or scaling of the input to the measurement device is assumed to have no effect on the cost as long as the number of resolvable levels stays the same. More explicitly, in [1], the cost associated with measuring the ith component of the observation vector x is given by C_i = 0.5 log_2(1 + σ²_{x_i}/σ²_{m_i}), where σ²_{x_i} denotes the variance of the ith component of the observation vector x (i.e., the variance of the input to the ith measurement device), and σ²_{m_i} is the variance of the ith component of m (i.e., the variance of the noise introduced by the ith measurement device).3 Notice that σ²_{x_i} = σ²_{n_i} for all i ∈ {1, 2, ..., K}, since θ is a deterministic parameter vector. Then, the overall cost of measuring all the components of the observation vector x is expressed as

    C = \sum_{i=1}^{K} C_i = \sum_{i=1}^{K} \frac{1}{2} \log_2\!\left(1 + \frac{\sigma_{n_i}^2}{\sigma_{m_i}^2}\right).    (1)

3 For an in-depth discussion of the plausibility of this measurement device model and its relation to the number of distinguishable amplitude levels, we refer the reader to [1].

A closer look at (1) reveals that it is a nonnegative, monotonically decreasing, convex function of σ²_{m_i} for all σ²_{n_i} > 0 and σ²_{m_i} > 0. It is also noted that a measurement device has a higher cost if it can perform measurements with a lower measurement variance (i.e., with higher accuracy). Such an approach brings great flexibility by enabling operation with variable precision over the acquired measurements. After formulating the measurement device model as outlined above, our objective is to minimize the total cost of the measurement devices under a constraint on estimation accuracy. In other words, we are allowed to design the noise levels of the measurement devices such that the overall cost is minimized under a constraint on the minimum acceptable estimation performance.

In nonrandom parameter estimation problems, the Cramer–Rao bound (CRB) provides a lower bound on the mean-squared errors (MSEs) of unbiased estimators under some regularity conditions [16]. Specifically, the CRB on the estimation error for an arbitrary unbiased estimator θ̂(y) is expressed as

    E\{(\hat{\theta} - \theta)(\hat{\theta} - \theta)^T\} \succeq J^{-1}(y, \theta) \triangleq \mathrm{CRB},    (2)

where J(y, θ) is the Fisher information matrix (FIM) of the measurement y relative to the parameter vector θ, which is defined as

    J(y, \theta) \triangleq \int \frac{1}{p_y^{\theta}(y)} \left(\frac{\partial p_y^{\theta}(y)}{\partial \theta}\right) \left(\frac{\partial p_y^{\theta}(y)}{\partial \theta}\right)^{T} dy,    (3)

where ∂/∂θ denotes the gradient (i.e., a column vector of partial derivatives) with respect to the parameters θ_1, ..., θ_L. Equivalently, the elements of the FIM can be calculated from [16]

    J_{ij} = -E_{y|\theta}\left\{\frac{\partial^2 \log p_y^{\theta}(y)}{\partial \theta_i \, \partial \theta_j}\right\}.    (4)

The symbol ⪰ between nonnegative definite matrices in (2) represents the inequality with respect to the positive semidefinite matrix cone. Specifically, it indicates that the difference matrix obtained by subtracting the right-hand side of the inequality from the left-hand side is nonnegative definite. Assuming independent Gaussian distributions for n and m, it can be shown that the CRB is given as follows [17]:

    \mathrm{CRB} = J^{-1}(y, \theta) = \left(H \, \mathrm{Cov}^{-1}(n + m) \, H^T\right)^{-1},    (5)

where Cov(·) denotes the covariance matrix of the random vector n + m, and Cov(n + m) = D_n + D_m = diag{σ²_{n_1} + σ²_{m_1}, σ²_{n_2} + σ²_{m_2}, ..., σ²_{n_K} + σ²_{m_K}} due to independence. Then, D ≜ Cov^{-1}(n + m) = diag{1/(σ²_{n_1} + σ²_{m_1}), 1/(σ²_{n_2} + σ²_{m_2}), ..., 1/(σ²_{n_K} + σ²_{m_K})}, where Cov^{-1}(·) represents the inverse of the covariance matrix. Notice that the CRB can actually be attained in this case by employing


the maximum likelihood (ML) estimator (also the best linear unbiased estimator (BLUE) in this case), θ̂(y) = (H D H^T)^{-1} H D y, where the efficiency of the estimator follows from the linearity of the system and the assumption of Gaussian distributions [16]. Specifically, the covariance matrix of the estimator equals the inverse of the FIM, i.e., Cov(θ̂(y)) = (H D H^T)^{-1}.

Remark. When non-Gaussian distributions are assumed, we can utilize the preceding observation to obtain an upper bound on the CRB. To see this, a few preliminaries are needed. First, the FIM of a random vector z with respect to a translation parameter is defined as follows [17]:

    J(z) \triangleq J(\theta + z, \theta) = \int \frac{1}{p_z(z)} \left(\frac{\partial p_z(z)}{\partial z}\right) \left(\frac{\partial p_z(z)}{\partial z}\right)^{T} dz,    (6)

where p_z(z) is the probability density function of z, which is independent of θ. A well-known property of the FIM under translation is J(z) ⪰ Cov^{-1}(z), with equality if and only if z is Gaussian [17]. Based on these preliminaries, for linear models in the form of Fig. 1 but with arbitrary probability distributions for n and m, it can be shown that J(y, θ) = H J(n + m) H^T, where J(n + m) indicates the FIM under a translation parameter of the random vector n + m [17]. In order to upper bound the CRB, it is first observed that J(n + m) ⪰ Cov^{-1}(n + m). Using the properties of nonnegative definite matrices, we have

    \mathrm{CRB} = J^{-1}(y, \theta) = \left(H J(n + m) H^T\right)^{-1} \preceq \left(H \, \mathrm{Cov}^{-1}(n + m) \, H^T\right)^{-1},    (7)

which naturally indicates that the difference matrix obtained by subtracting the CRB from the covariance matrix of the linear estimator θ̂(y) must be nonnegative definite. Correspondingly, it is also possible to lower bound the CRB for independent random vectors n and m. To that aim, we can revert to the Fisher information inequality (FII) [18]. The FII states that J^{-1}(n + m) ⪰ J^{-1}(n) + J^{-1}(m), with equality if and only if n and m are Gaussian. Therefore,

    \mathrm{CRB} = J^{-1}(y, \theta) \succeq \left(H \left(J^{-1}(n) + J^{-1}(m)\right)^{-1} H^T\right)^{-1}.    (8)

As a result, a lower bound on the CRB can also be obtained in terms of the FIMs under translation parameters (6) of random vectors n and m with arbitrary probability distributions.

Returning to our case of independent Gaussian system noise and measurement noise, the CRB is equal to the covariance matrix (i.e., the estimation error covariance) of the ML estimator θ̂(y) = (H D H^T)^{-1} H D y, as mentioned in the paragraph following (5). Furthermore, when the system and measurement noise distributions are not restricted to be Gaussian, the covariance matrix of the linear estimator θ̂(y) can also be used as an upper bound on the CRB, as shown in (7). For this reason, in the following analysis we employ several performance metrics based on the CRB given in (5) in order to assess the quality of estimation. In other words, we propose measurement cost minimization formulations under various estimation accuracy constraints based on the CRB expression in (5). However, before that analysis, we first express the CRB in a form more familiar in the optimization theoretic sense:

    \mathrm{CRB} = J^{-1}(y, \theta) = \left(\sum_{i=1}^{K} \frac{1}{\sigma_{n_i}^2 + \sigma_{m_i}^2} \, h_i h_i^T\right)^{-1},    (9)

and the corresponding ML estimator that achieves this bound becomes

    \hat{\theta}(y) = \left(H D H^T\right)^{-1} H D y = \left(\sum_{i=1}^{K} \frac{h_i h_i^T}{\sigma_{n_i}^2 + \sigma_{m_i}^2}\right)^{-1} \sum_{i=1}^{K} \frac{y_i \, h_i}{\sigma_{n_i}^2 + \sigma_{m_i}^2}.    (10)

2.1. Average mean-squared error

The diagonal components of the CRB provide a lower bound on the MSE in estimating the components of the parameter θ. Specifically,

    E_{y|\theta}\left\{\left\|\hat{\theta}(y) - \theta\right\|_2^2\right\} \geq \mathrm{tr}\left\{J^{-1}(y, \theta)\right\},

where tr{·} denotes the trace operator [16]. In other words, the harmonic average of the eigenvalues of the FIM is taken as the performance metric. Based on this metric, the following measurement cost minimization problem is proposed:

    \min_{\{\sigma_{m_i}^2\}_{i=1}^{K}} \ \sum_{i=1}^{K} \frac{1}{2} \log_2\!\left(1 + \frac{\sigma_{n_i}^2}{\sigma_{m_i}^2}\right)
    \text{subject to} \ \mathrm{tr}\left\{\left(\sum_{i=1}^{K} \frac{1}{\sigma_{n_i}^2 + \sigma_{m_i}^2} h_i h_i^T\right)^{-1}\right\} \leq E,    (11)

where E denotes a constraint on the maximum allowable average estimation error. Due to the inevitable intrinsic system noise, the design criterion E must satisfy E > tr{(H D_n^{-1} H^T)^{-1}} = tr{(Σ_{i=1}^{K} h_i h_i^T / σ²_{n_i})^{-1}}. Substituting μ_i = 1/(σ²_{n_i} + σ²_{m_i}), (11) becomes

    \max_{\{\mu_i\}_{i=1}^{K}} \ \sum_{i=1}^{K} \frac{1}{2} \log_2\!\left(1 - \sigma_{n_i}^2 \mu_i\right)
    \text{subject to} \ \mathrm{tr}\left\{\left(\sum_{i=1}^{K} \mu_i h_i h_i^T\right)^{-1}\right\} \leq E.    (12)

It is noted that the objective function is smooth and concave for all μ_i ∈ [0, 1/σ²_{n_i}). Since the constraint is also a convex function of the μ_i's for all μ_i ≥ 0, this is a convex optimization problem [19, Section 7.5.2]. Consequently, it can be solved efficiently in polynomial time using interior point methods, and numerical convergence is assured. It is also possible to express this optimization problem using linear matrix inequalities (LMIs) as follows:

    \max_{\{z_j\}_{j=1}^{L}, \{\mu_i\}_{i=1}^{K}} \ \sum_{i=1}^{K} \frac{1}{2} \log_2\!\left(1 - \sigma_{n_i}^2 \mu_i\right)
    \text{subject to} \ \begin{bmatrix} \sum_{i=1}^{K} \mu_i h_i h_i^T & e_j \\ e_j^T & z_j \end{bmatrix} \succeq 0, \quad j = 1, \ldots, L, \qquad \sum_{j=1}^{L} z_j \leq E,    (13)

where e_j denotes the column vector of length L with a 1 in the jth coordinate and 0's elsewhere. Or, equivalently,

    \max_{Z \in S^L, \{\mu_i\}_{i=1}^{K}} \ \sum_{i=1}^{K} \frac{1}{2} \log_2\!\left(1 - \sigma_{n_i}^2 \mu_i\right)
    \text{subject to} \ \begin{bmatrix} Z & I \\ I & \sum_{i=1}^{K} \mu_i h_i h_i^T \end{bmatrix} \succeq 0, \qquad \mathrm{tr}(Z) \leq E,    (14)

where S^L denotes the set of symmetric L × L matrices.

2.2. Shannon information

An alternative measure of the estimation accuracy considers the Shannon (mutual) information content between the unknown parameter vector θ and the measurement vector y. More explicitly, the interest is in placing a constraint on the log volume of the η-confidence ellipsoid, which is defined as the minimum ellipsoid that contains the estimation error with probability η [19, Section 7.5.2]. As shown in [11], the η-confidence ellipsoid is given by

    \varepsilon_{\alpha} = \left\{ z \mid z^T J(y, \theta) \, z \leq \alpha \right\},    (15)

where α = F_{χ²_K}^{-1}(η) is obtained from the cumulative distribution function of a chi-squared random variable with K degrees of freedom. Then, the log volume of the η-confidence ellipsoid is obtained as4

    \log \mathrm{vol}(\varepsilon_{\alpha}) = \beta - \frac{1}{2} \log \det\left(\sum_{i=1}^{K} \frac{1}{\sigma_{n_i}^2 + \sigma_{m_i}^2} h_i h_i^T\right), \quad \text{where } \beta = \frac{n}{2} \log(\alpha \pi) - \log \Gamma\!\left(\frac{n}{2} + 1\right),    (16)

with Γ denoting the Gamma function. Notice that this design criterion is related to the geometric mean of the eigenvalues of the FIM. Based on this metric, the following measurement cost optimization problem can be obtained:

4 We use 'log' without a subscript to denote the natural logarithm.

    \max_{\{\mu_i\}_{i=1}^{K}} \ \sum_{i=1}^{K} \frac{1}{2} \log_2\!\left(1 - \sigma_{n_i}^2 \mu_i\right)
    \text{subject to} \ \log \det\left(\sum_{i=1}^{K} \mu_i h_i h_i^T\right) \geq 2(\beta - S),    (17)

where μ_i is as defined in (12) and S is a constraint on the log volume of the η-confidence ellipsoid satisfying S > β − 0.5 log det(H D_n^{-1} H^T) = β − 0.5 log det(Σ_{i=1}^{K} h_i h_i^T / σ²_{n_i}). Since log det(Σ_{i=1}^{K} μ_i h_i h_i^T) is a smooth concave function of the μ_i for μ_i ≥ 0, the resulting optimization problem is convex [19, Section 3.1.5]. The smoothness of the problem is also very helpful for obtaining the solution via numerical methods.

By introducing a lower triangular nonsingular matrix L and utilizing the Cholesky decomposition of positive definite matrices, it is possible to rewrite the constraint in terms of a lower bound. To that aim, let Σ_{i=1}^{K} μ_i h_i h_i^T ⪰ L L^T. Then, the optimization problem can be expressed equivalently as

    \max_{L \in U^L, \{\mu_i\}_{i=1}^{K}} \ \sum_{i=1}^{K} \frac{1}{2} \log_2\!\left(1 - \sigma_{n_i}^2 \mu_i\right)
    \text{subject to} \ \begin{bmatrix} I & L^T \\ L & \sum_{i=1}^{K} \mu_i h_i h_i^T \end{bmatrix} \succeq 0, \qquad \sum_{i=1}^{L} \log L_{i,i} \geq (\beta - S),    (18)

where U^L denotes the set of lower triangular nonsingular L × L square matrices, L_{i,i} represents the ith diagonal coefficient of L, and L is the dimension of L.

2.3. Worst-case error variance

When the primary concern shifts from accuracy requirements towards robust behavior, it may be more desirable to place a constraint on the worst-case variance of the estimation error, which is associated with the maximum (minimum) eigenvalue of the CRB (FIM) [11,20–22]. The corresponding optimization problem is stated as follows:

    \max_{\{\mu_i\}_{i=1}^{K}} \ \sum_{i=1}^{K} \frac{1}{2} \log_2\!\left(1 - \sigma_{n_i}^2 \mu_i\right)
    \text{subject to} \ \lambda_{\min}\left\{\sum_{i=1}^{K} \mu_i h_i h_i^T\right\} \geq \Lambda,    (19)

where λ_min{·} represents the minimum eigenvalue of its argument, and Λ is a predetermined lower bound on the minimum eigenvalue of the FIM satisfying Λ < λ_min{H D_n^{-1} H^T} = λ_min{Σ_{i=1}^{K} h_i h_i^T / σ²_{n_i}}. Since the constraint can be represented in the form of an LMI, this problem can equivalently be expressed as

    \max_{\{\mu_i\}_{i=1}^{K}} \ \sum_{i=1}^{K} \frac{1}{2} \log_2\!\left(1 - \sigma_{n_i}^2 \mu_i\right)
    \text{subject to} \ \sum_{i=1}^{K} \mu_i h_i h_i^T \succeq \Lambda I,    (20)

where I is the L × L identity matrix. The resulting problem is also convex [19, Section 7.5.2].

2.4. Worst-case coordinate error variance

Another variation of the worst-case error criteria can be obtained by placing a constraint on the maximum error variance among all the individual estimator components, i.e., by restricting the largest diagonal entry of the CRB. Using this performance criterion, we have the following optimization problem:

    \max_{\{\mu_i\}_{i=1}^{K}} \ \sum_{i=1}^{K} \frac{1}{2} \log_2\!\left(1 - \sigma_{n_i}^2 \mu_i\right)
    \text{subject to} \ \max_{j=1,\ldots,L} \left[\left(\sum_{i=1}^{K} \mu_i h_i h_i^T\right)^{-1}\right]_{j,j} \leq \rho,    (21)

where ρ is a constraint on the maximum allowable diagonal entry of the CRB (estimation error covariance matrix) satisfying ρ > max_{j=1,...,L} ((H D_n^{-1} H^T)^{-1})_{j,j} = max_{j=1,...,L} ((Σ_{i=1}^{K} h_i h_i^T / σ²_{n_i})^{-1})_{j,j}. This problem can equivalently be expressed as

    \max_{\{\mu_i\}_{i=1}^{K}} \ \sum_{i=1}^{K} \frac{1}{2} \log_2\!\left(1 - \sigma_{n_i}^2 \mu_i\right)
    \text{subject to} \ \begin{bmatrix} \rho & e_j^T \\ e_j & \sum_{i=1}^{K} \mu_i h_i h_i^T \end{bmatrix} \succeq 0, \quad j = 1, \ldots, L,    (22)

where e_j denotes the column vector of length L with a 1 in the jth coordinate and 0's elsewhere. This is also a convex optimization problem [19, Section 7.5.2].

3. Extensions to cases with system matrix uncertainty – robust measurement

It may also be the case that there exists some uncertainty concerning the elements of the system matrix H [11]. Suppose that the system matrix H can take values from a given finite set H. In the robust measurement problem, we consider the optimization over the worst-case scenario. Specifically, we choose the matrix

from the family of system matrices H resulting in the worst esti- H is in general not finite, and the solutions of the above opti-
mation accuracy constraint, and perform the optimization accord- mization problems require general techniques from semi-infinite
ingly. Recalling that the infimum (supremum) preserves concavity convex optimization such as those explained in [23,24]. In the fol-
(convexity), it is possible to restate the measurement cost opti- lowing, a specific uncertainty model is considered where it is pos-
mization problems given in Section 2, and still maintain convex sible to further simplify the optimization problems given in (26)
optimization problems. Then, the resulting optimization problems and (27) by expressing the constraints as LMIs. To that aim, let
with respect to each criterion are expressed as follows. H ∈ H = {H̄ + :  T 2   }, where  · 2 denotes the spectral
norm (i.e., the square root of the largest eigenvalue of the positive
3.1. Average mean-squared error semidefinite matrix  T ). It is possible to express this constraint
as an LMI,  T   2 I. Suppose also that μ is defined as the fol-
1 lowing diagonal matrix μ  diag{μ1 , μ2 , . . . , μ K }, and W  LL T is
K


max log2 1 − σn2i μi a symmetric positive definite matrix. Then, the constraint in (26)
{μi }iK=1 2
i =1 can be expressed in terms of H̄ and  as
  −1

K
subject to sup tr μ T
 E, W  H̄μH̄ T + H̄μ T + μH̄ T + μ T ,
i hi hi (23)
H∈H
i =1 for all  T   2 I. (29)
or equivalently, Similarly, the constraint in (27) is given by

1 
K


max log2 1 − σn2i μi ΛI  H̄μH̄T + H̄μT + μH̄T + μT ,
Z∈S L , {μi }iK=1 2
i =1 for all  T   2 I. (30)
 
Z I
subject to K T  0 for all H ∈ H, In [25, Theorem 3.3], a necessary and sufficient condition is de-
I μ
i =1 i h i h i rived for quadratic matrix inequalities in the form of (29) and (30)
tr(Z)  E. (24) to be true. In the light of this theorem, (29) holds if and only if
there exists t  0 such that
3.2. Shannon information  
H̄μH̄ T − W − tI H̄μ
 0, (31)
1
K

2
μH̄T μ + t2 I
max log2 1 − σ ni μi
{μi }iK=1 2 and (30) holds if and only if there exists t  0 such that
i =1
   

K
H̄μH̄ T − (Λ + t )I H̄μ
T
subject to inf log det μ i hi hi  2(β − S), (25)  0. (32)
H∈H
i =1
μH̄T μ + t2 I
or equivalently, Notice that (31) and (32) are both linear in μ, W and t. Hence,
under this specific uncertainty model, we can express the opti-
1 
K

mization problem in (26) as
max log2 1 − σn2i μi
L∈U L , {μi }iK=1 2
1
K
i =1
 

I L T max log2 1 − σn2i μi
subject to K  0 for all H ∈ H, t ,W∈s++ , {μi }iK=1
L 2
T i =1
L i =1 μ i h i h i  
H̄μH̄ T − W − tI H̄μ

L
subject to  0,
log L i ,i  (β − S). (26) μH̄T μ + t2 I
i =1 log det(W)  2(β − S), t  0, (33)
3.3. Worst-case error variance where s++ denotes symmetric positive-definite L × L matrices.
L

Similarly, it is possible to write the optimization problem in (27)


1
K

as
max log2 1 − σn2i μi
{μi }iK=1 2
1
K
i =1

max log2 1 − σn2i μi

K
t ,{μi }iK=1 2
subject to μi hi hiT  ΛI for all H ∈ H. (27) i =1
 
i =1 H̄μH̄ T − (Λ + t )I H̄μ
subject to  0, t  0. (34)
μH̄T μ + t2 I
3.4. Worst-case coordinate error variance

1 4. Special case – invertible system matrix H


K


max log2 1 − σn2i μi
{μi }iK=1 2 When the system matrix H is a K × K invertible matrix mean-
i =1
  −1 ing that the number of unknown parameters is equal to the num-

K
ber of observations, it is possible to obtain closed-form solutions
subject to sup max μi hi hiT  . (28)
H∈H j =1,..., K of the optimization problems stated in (11) and (17). Moreover, for
i =1 j, j
the solution of (11), it is not necessary to assume that the compo-
When the set H is finite, the problem can be solved using nents of the system noise n are independent; it is sufficient to have
standard arguments from convex optimization. However, the set n as a Gaussian distributed random vector with zero-mean and
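Under the block structure reconstructed in (31)–(32) above, the sufficiency direction of the certificate can be exercised numerically. The sketch below uses hypothetical dimensions and a fixed (not optimized) diagonal μ, and assumes NumPy is available: it grid-searches for a multiplier t that renders the block matrix in (32) positive semidefinite, and then confirms on sampled perturbations with ‖Δ‖_2 = δ that the robust constraint (30) indeed holds.

```python
import numpy as np

rng = np.random.default_rng(1)
L_dim, K = 2, 6
Hbar = rng.uniform(-1.0, 1.0, (L_dim, K))   # nominal system matrix (hypothetical)
M = np.diag(np.full(K, 0.5))                # a fixed, not optimized, diagonal mu
delta = 0.05                                # spectral-norm bound on Delta
# conservative accuracy target Lambda, well below the nominal smallest eigenvalue
Lam = 0.25 * np.linalg.eigvalsh(Hbar @ M @ Hbar.T).min()

def lmi_block(t):
    # block matrix of the LMI in (32) for a candidate multiplier t >= 0
    top = np.hstack([Hbar @ M @ Hbar.T - (Lam + t) * np.eye(L_dim), Hbar @ M])
    bot = np.hstack([M @ Hbar.T, M + (t / delta**2) * np.eye(K)])
    return np.vstack([top, bot])

# search a grid for a certificate t making the block matrix PSD
t_cert = next((t for t in np.linspace(0.0, 5.0, 2001)
               if np.linalg.eigvalsh(lmi_block(t)).min() >= 0.0), None)
assert t_cert is not None

# the certificate implies the robust constraint (30); check it on sampled
# perturbations whose spectral norm equals delta exactly
for _ in range(200):
    G = rng.standard_normal((L_dim, K))
    Delta = delta * G / np.linalg.norm(G, 2)
    Hp = Hbar + Delta
    assert np.linalg.eigvalsh(Hp @ M @ Hp.T).min() >= Lam - 1e-9
```

Solving (33) and (34) for the cost-optimal μ additionally requires an SDP solver (e.g., the CVX software used in Section 5); the sketch above only validates the certificate logic.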
4. Special case – invertible system matrix H

When the system matrix H is a K × K invertible matrix, meaning that the number of unknown parameters is equal to the number of observations, it is possible to obtain closed-form solutions of the optimization problems stated in (11) and (17). Moreover, for the solution of (11), it is not necessary to assume that the components of the system noise n are independent; it is sufficient for n to be a zero-mean Gaussian random vector with an arbitrary (possibly colored) covariance matrix, i.e., n ∼ N(0, Σ_n), with {σ_n1^2, σ_n2^2, ..., σ_nK^2} constituting the diagonal components of Σ_n, and 0 denoting the all-zeros vector of length K as before. To that aim, assuming independent Gaussian distributions for n and m, and a square full-rank (invertible) H, it is observed that

CRB = J^{-1}(y, θ) = (H Cov^{-1}(n + m) H^T)^{-1} = (H^{-1})^T Cov(n + m) H^{-1} = (H^{-1})^T Σ_n H^{-1} + (H^{-1})^T D_m H^{-1},   (35)

where the first part of the CRB, (H^{-1})^T Σ_n H^{-1}, is a known quantity, and the second part, (H^{-1})^T D_m H^{-1}, will be subject to design while assessing the quality of the estimation. Similar to the previous discussion, the CRB can be achieved in this case by employing the corresponding linear unbiased estimator, which turns out simply to be a multiplication of the measurement vector by the inverse of the system matrix, i.e., θ̂(y) = (H^{-1})^T y. Returning to two commonly used performance metrics introduced in Section 2, we next examine the closed-form solutions of the corresponding cost minimization problems.

4.1. Average mean-squared error

Due to the CRB, the average MSE in estimating the components of the parameter θ is bounded from below as

E_{y|θ}{‖θ̂(y) − θ‖_2^2} ≥ tr{J^{-1}(y, θ)} = tr{(H^{-1})^T Σ_n H^{-1}} + tr{(H^{-1})^T D_m H^{-1}},

where the last equality follows from the linearity of the trace operator and the invertibility of H. Since (H^{-1})^T Σ_n H^{-1} is known, let t = tr{(H^{-1})^T Σ_n H^{-1}}. When the aim is to minimize the measurement cost subject to a constraint on the lower bound for the average MSE (achievable in the case of Gaussian distributions), the optimization problem can be expressed similarly to (11) as follows:

min_{{σ_mi^2}_{i=1}^K}  (1/2) ∑_{i=1}^K log2(1 + σ_ni^2/σ_mi^2)
subject to  tr{(H^{-1})^T D_m H^{-1}} ≤ E − t,   (36)

where E denotes a constraint on the overall average estimation error suggested by the CRB (achievable in this case), and t represents the unavoidable estimation error due to the intrinsic system noise n. Notice that for consistency, the design parameter E should be selected as E > t.

From the independence of the measurement noise components, D_m = diag{σ_m1^2, σ_m2^2, ..., σ_mK^2} is a diagonal covariance matrix with σ_mi^2 > 0 for all i ∈ {1, 2, ..., K}. In view of this observation, it is possible to simplify the constraint further by defining F ≜ (H^{-1})^T = [f_1 f_2 ... f_K], where f_i represents the ith row of the inverse of the system matrix H. Let f_i ≜ ‖f_i‖_2^2 denote the square of the Euclidean norm of the vector f_i, that is, the sum of squares of the elements in f_i. It is noted that f_i is always positive for invertible H, and is constant for fixed H. Then, the optimization problem in (36) can be expressed as follows:

min_{{σ_mi^2}_{i=1}^K}  (1/2) ∑_{i=1}^K log2(1 + σ_ni^2/σ_mi^2)
subject to  ∑_{i=1}^K f_i σ_mi^2 ≤ E − t,   σ_mi^2 ≥ 0, ∀i ∈ {1, 2, ..., K}.   (37)

From (37), it is noted that the constraint function is linear in the σ_mi^2's, the objective function is convex, and both functions are continuously differentiable, which altogether indicates that Slater's condition holds. Therefore, the Karush–Kuhn–Tucker (KKT) conditions are necessary and sufficient for optimality. Then, the optimal measurement noise variances can be calculated from

σ_mi^2 = −σ_ni^2/2 + sqrt(σ_ni^4/4 + γ σ_ni^2/f_i),   (38)

where γ > 0 is obtained by substituting (38) into the average MSE constraint, that is, ∑_{i=1}^K f_i σ_mi^2 = E − t.

Special case: When the inverse of the system matrix has normalized rows, i.e., f_i = 1, and the components of the system noise are independent zero-mean Gaussian random variables, the optimal measurement noise variances should satisfy ∑_{i=1}^K σ_mi^2 = E − ∑_{i=1}^K σ_ni^2. If identical system noise components are assumed as well, i.e., σ_ni^2 = σ_n^2, i = 1, ..., K, then the optimal solution results in σ_mi^2 = σ_m^2, i = 1, ..., K, where σ_m^2 = E/K − σ_n^2 is obtained from the average MSE constraint. The corresponding optimal cost is given by (K/2) log2(E/(E − K σ_n^2)). This is an increasing function of K for fixed E. Furthermore, the derivatives of all orders with respect to K exist and are positive for K < E/σ_n^2. Therefore, estimating more parameters under an average error constraint based on the CRB requires even more accurate measurement devices with higher costs as long as K < E/σ_n^2 is satisfied.

4.2. Shannon information

Another measure of estimation accuracy that results in a closed-form solution in the case of an invertible system matrix H is the Shannon information criterion. Using this metric as the constraint function, we are effectively restricting the log volume of the η-confidence ellipsoid to stay below a predetermined value S. Using similar arguments to Section 2.2 and the invertibility of H,

log det(H Cov^{-1}(n + m) H^T) = log(det H · det Cov^{-1}(n + m) · det H^T) = 2 log |det H| − ∑_{i=1}^K log(σ_ni^2 + σ_mi^2),   (39)

where the second equality follows from the properties of the determinant and the logarithm, i.e., det H = det H^T, det(Cov^{-1}(n + m)) = 1/det(Cov(n + m)), and Cov(n + m) = D_n + D_m = diag{σ_n1^2 + σ_m1^2, σ_n2^2 + σ_m2^2, ..., σ_nK^2 + σ_mK^2} due to the Gaussian distributed independent system and measurement noises with independent components. Since the system matrix H is known, let α ≜ log |det H|. Under these conditions, the optimization problem in (17) can be stated as

min_{{σ_mi^2}_{i=1}^K}  (1/2) ∑_{i=1}^K log2(1 + σ_ni^2/σ_mi^2)
subject to  ∑_{i=1}^K log(σ_ni^2 + σ_mi^2) ≤ 2(S + α − β),   (40)

where S and β are as defined in (17).
Notice that although the objective in (40) is a convex function of the σ_mi^2's, the constraint set is not convex. In fact, the constraint set is what is left after the convex set

C = {σ_m^2 ⪰ 0 : ∑_{i=1}^K log(σ_ni^2 + σ_mi^2) > 2(S + α − β)}

is subtracted from {σ_m^2 ⪰ 0}. Since the global minimum of the unconstrained objective function is achieved for σ_m^2 = ∞, which is contained in the set C, and the objective function is convex, it is concluded that the minimum of the objective function has to occur at the boundary, i.e., ∑_{i=1}^K log(σ_ni^2 + σ_mi^2) = 2(S + α − β) must be satisfied [26]. Therefore, we can take the constraint as an equality in (40). This is a standard optimization problem that can be solved using Lagrange multipliers. Hence, by defining ξ ≜ 2(S + α − β), we can write the Lagrangian as

J(σ_m1^2, ..., σ_mK^2) = (1/2) ∑_{i=1}^K log2(1 + σ_ni^2/σ_mi^2) + λ(∑_{i=1}^K log(σ_ni^2 + σ_mi^2) − ξ),   (41)

and differentiating with respect to σ_mi^2, we obtain the following assignment of the noise variances to the measurement devices:

σ_mi^2 = (γ^{1/K} − 1) σ_ni^2,  where  γ = e^ξ / ∏_{j=1}^K σ_nj^2.   (42)

For consistency, the design parameter S should be selected such that ξ = 2(S + α − β) > ∑_{i=1}^K log(σ_ni^2), since the intrinsic system noise puts a lower bound on the minimum attainable volume of the confidence ellipsoid. Some properties of the obtained solution can be summarized as follows:

• For given ξ, K and σ_ni^2's, the minimum achievable cost is (K/2) log2(γ^{1/K}/(γ^{1/K} − 1)), where γ is computed as in (42).
• For a fixed value of K (the available number of observations), relaxing the constraint on the volume of the η-confidence ellipsoid (increasing the value of ξ) results in smaller measurement device costs, with a limiting value of 0, as expected.
• If the observation variances are equal, that is, σ_ni^2 = σ_n^2, i = 1, ..., K, employing identical measurement devices for all the observations, that is, σ_mi^2 = σ_m^2, i = 1, ..., K, is the optimal strategy. From (42), the optimal value of the measurement noise variances is calculated as σ_m,opt^2 = e^{ξ/K} − σ_n^2, and the corresponding minimum total measurement cost is given as ξ/(2 log 2) − (K/2) log2(e^{ξ/K} − σ_n^2), which is an increasing function of K for ξ > K log σ_n^2. Intuitively, this result also indicates that estimating more parameters under a fixed constraint on the volume of the ellipsoid containing the estimation errors requires a higher total measurement device cost.

5. Numerical results

In this section, we present an example that illustrates several theoretical results developed in the previous sections. To that aim, a discrete-time linear system as depicted in Fig. 1 is considered:

y = H^T θ + n + m,   (43)

where θ is a length-20 vector containing the unknown parameters to be estimated, H is a 20 × 100 system matrix with full row rank, and the intrinsic system noise n and the measurement noise m are length-100 Gaussian distributed random vectors with independent components. The entries of the system matrix H are generated from a process of i.i.d. uniform random variables on the interval [−0.1, 0.1]. Also, the components of the system noise vector n are independently Gaussian distributed with zero mean, and it is assumed that their variances come from a uniform distribution defined on the interval [0.05, 1]. The implication of this assumption is that the observations at the output of the linear system possess uniformly varying degrees of accuracy. In other words, it is assured that observations corrupted by weak, moderate and strong levels of Gaussian noise are available in similar proportions at the estimation stage. In the following, we look into the problem of optimally assigning costs to measurement devices under various estimation accuracy constraints when the variances of the intrinsic system noise components are uniformly distributed as explained above. Note that our results obtained in the previous section are still valid for Gaussian system noise processes with arbitrary diagonal covariance matrices (i.e., the nonzero components of the diagonal covariance matrix need not be uniformly distributed as in this example). In obtaining the optimal solutions for the convex optimization problems stated above, the fmincon method from MATLAB's Optimization Toolbox and the CVX software [27] are used.

5.1. Performance of various estimation quality metrics under perfect system state information

First, we investigate the cost assignment problem under perfect information on the system matrix and intrinsic noise variances. Recall that four different performance constraints are proposed for that purpose in Section 2. In the following four experiments, we analyze the behavior of the total measurement cost while each constraint metric is varied between its extreme values. The total cost is measured in bits by taking logarithms with respect to base 2. The constraint metric is expressed as the ratio of its current value to the value it attains in the limiting case when zero measurement noise variances are assumed. As an example, for the average mean-squared-error criterion, the total measurement cost C will be tabulated versus E/tr{(H D_n^{-1} H^T)^{-1}}.

In addition to the optimal cost allocation scheme proposed in this paper, we also consider two suboptimal cost allocation strategies:

• Equal cost to all measurement devices: In this strategy, it is assumed that a single set of measurement devices with identical costs is employed for all observations, so that C_i = C, i = 1, 2, ..., K. This, in turn, implies that the ratio of the measurement noise variance to the intrinsic system noise variance, x ≜ σ_mi^2/σ_ni^2, is constant for all measurement devices. Then, the total cost can be expressed in terms of x as C = 0.5 K log2(1 + 1/x), and similarly the FIM becomes J(y, θ) = (H D_n^{-1} H^T)/(x + 1) = (1/(x + 1)) ∑_{i=1}^K h_i h_i^T/σ_ni^2. Using this observation, the constraint functions provided for the different performance metrics in the optimization problems (11), (17), (19), and (21) can be algebraically solved for equality to determine the value of x without applying any convex optimization techniques, and the corresponding measurement variances and cost assignments can be obtained.
• Equal measurement noise variances: In this case, the measurement devices are assumed to introduce random errors with equal noise variances, that is, σ_mi^2 = σ_m^2, i = 1, 2, ..., K. In other words, all observations are assumed to be corrupted by identical noise processes, and the best measurement noise variance value that minimizes the overall measurement cost while satisfying the estimation accuracy constraint is selected. Accordingly, the objective function in the proposed optimization problems simplifies to C = 0.5 ∑_{i=1}^K log2(1 + σ_ni^2/σ_m^2), and the FIM employed in the constraint functions takes the form J(y, θ) = ∑_{i=1}^K h_i h_i^T/(σ_ni^2 + σ_m^2). By substituting these expressions into the various optimization approaches provided in Section 2, these problems can be solved rapidly over a single parameter σ_m^2 using the tools of convex analysis, and the optimal cost allocations can be obtained for the case of equal measurement noise variances.
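Before proceeding to the experiments, the closed-form Shannon allocation in (42) from Section 4.2 can be sanity-checked directly. The sketch below uses hypothetical values and assumes NumPy; ξ stands for the constraint level 2(S + α − β), and the check confirms both that the solution sits on the constraint boundary and that other boundary points cannot yield a lower cost.

```python
import numpy as np

rng = np.random.default_rng(3)
K = 6
sigma_n2 = rng.uniform(0.05, 1.0, K)        # intrinsic system-noise variances
xi = np.sum(np.log(sigma_n2)) + 3.0         # constraint level, chosen above its floor
gamma = np.exp(xi) / np.prod(sigma_n2)
sm2_opt = (gamma**(1.0 / K) - 1.0) * sigma_n2   # closed form (42)

def cost(sm2):
    # total measurement cost in bits
    return 0.5 * np.sum(np.log2(1.0 + sigma_n2 / sm2))

# the optimum lies on the boundary of the constraint set (equality in (40))
assert abs(np.sum(np.log(sigma_n2 + sm2_opt)) - xi) < 1e-9

# any other allocation on the same boundary should cost at least as much
v_opt = sigma_n2 + sm2_opt
for _ in range(300):
    eps = rng.uniform(-0.1, 0.1, K)
    v = v_opt * np.exp(eps - eps.mean())    # perturbed point, still on the boundary
    assert cost(v - sigma_n2) >= cost(sm2_opt) - 1e-9
print("optimal cost (bits):", round(cost(sm2_opt), 3))
```

Note that (42) assigns measurement noise variances proportional to the intrinsic noise variances, so the ratio σ_mi^2/σ_ni^2 is the same for every device in this special case.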
Fig. 2. Total cost versus normalized average MSE constraint.

Fig. 3. Total cost versus normalized Shannon information constraint.

5.1.1. Average mean-squared-error criterion

In this experiment, we study the effects of the average MSE constraint on the total measurement device cost. Starting from the minimum achievable value of the average MSE due to the intrinsic system noise (i.e., tr{(H D_n^{-1} H^T)^{-1}}), we increase the constraint up to 100 times this minimal value, as depicted in Fig. 2. Three curves are presented, corresponding to the optimal cost allocation strategy and two suboptimal strategies, one employing equal cost and the other employing equal noise variance among the measurement devices. It is noted that the optimal strategy results in the minimum cost for all values of the MSE constraint, as expected. Its performance is followed by the equal cost assignment scheme, and the worst performing strategy is the one that assigns equal measurement noise variances to all the devices. When the average MSE criterion is stringent (for smaller values of E), all the strategies require increasingly more accurate measurements (hence higher costs) to satisfy the constraint. As the MSE constraint is relaxed (i.e., for larger values of E), the measurement costs of the three different strategies start to drop down to zero but become less responsive as they move along.
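As an aside, the equal-cost baseline used in this comparison admits a simple algebraic solution for the average MSE criterion, as noted in the strategy description above. A minimal sketch with hypothetical dimensions and a normalized constraint E = 3 · tr{(H D_n^{-1} H^T)^{-1}}, assuming NumPy:

```python
import numpy as np

rng = np.random.default_rng(4)
L_dim, K = 4, 20
H = rng.uniform(-0.5, 0.5, (L_dim, K))      # toy stand-in for the 20 x 100 system
sigma_n2 = rng.uniform(0.05, 1.0, K)        # intrinsic noise variances
J0 = (H / sigma_n2) @ H.T                   # H D_n^{-1} H^T
t0 = np.trace(np.linalg.inv(J0))            # MSE floor tr{(H D_n^{-1} H^T)^{-1}}
E = 3.0 * t0                                # normalized constraint E / t0 = 3

# equal-cost strategy: sigma_mi^2 = x * sigma_ni^2 gives J = J0 / (x + 1),
# so tr(J^{-1}) = (x + 1) * t0 <= E is solved for equality:
x = E / t0 - 1.0
cost_equal = 0.5 * K * np.log2(1.0 + 1.0 / x)
assert abs((x + 1.0) * t0 - E) < 1e-9       # constraint met with equality
print("equal-cost total (bits):", round(cost_equal, 3))
```

Here x = 2 for the chosen normalization, independently of H, which illustrates why the equal-cost curve in Fig. 2 depends on the constraint only through the normalized ratio E/tr{(H D_n^{-1} H^T)^{-1}}.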
Fig. 4. Total cost versus normalized worst-case error variance constraint.

5.1.2. Shannon information criterion

This experiment aims to discover the relationship between the Shannon information constraint and the total measurement device cost. Since the constraint is expressed as a 'greater than' inequality, we begin with the maximum attainable value of log det(H D_n^{-1} H^T) and loosen the constraint by decreasing it towards the negative multiples of this quantity, as shown in Fig. 3. When the constraint is very restrictive (corresponding to high values of 2(β − S)), the differences among the performances of the optimal and suboptimal strategies disappear. As the constraint is relaxed away from the maximum attainable value, it is observed that the decrease in the total cost is less responsive than for the average MSE. However, as the relaxation continues, the drop in the total cost for the Shannon information criterion maintains its pace for a longer time, while the drop for the average MSE criterion seems to saturate. Again, similar to the previous case, the performance of the optimal strategy is superior to the equal measurement device cost strategy, and the worst performance belongs to the equal measurement variance scheme.

5.1.3. Worst-case error variance

In this experiment, we investigate the effects of the worst-case error variance criterion on the total measurement device cost under different cost allocation strategies. Similar conclusions to the previous experiments can be drawn by examining Fig. 4.

5.1.4. Worst-case coordinate error variance

This experiment focuses on the relationship between the constraint on the largest diagonal entry of the CRB and the total measurement device costs achievable via the different cost allocation strategies. The results are illustrated in Fig. 5. It is noted that the plots depicted in Fig. 2 bear a large degree of resemblance to those given in Fig. 5. This similarity is anticipated and can be attributed to the fact that the former criterion puts a constraint on the average of the diagonal entries of the CRB whereas the latter places a similar constraint on their maximum.

Finally, we can stress a few more points. It is necessary that the intrinsic system noise variances and the system matrix be jointly evaluated to compute the optimal measurement noise variances and the corresponding cost allocations. In other words, in order to assign more cost to a specific observation, it is not sufficient to know only that the particular observation is reliable (i.e., has a smaller variance); we also need to know its intrinsic combinations with the other observations due to the linear system matrix. Furthermore, the performance figures are quite useful in the sense that they
provide the minimum cost necessary to obtain a desired level of estimation accuracy.

Fig. 5. Total cost versus normalized worst-case coordinate error variance constraint.

Fig. 6. The performance of various optimal cost allocation strategies under scaling of the system noise variances. All costs are equal for c = 0.5.

5.2. Performance comparison of estimation quality metrics under scaling of the system noise variances

In this section, we devise a new experiment in order to jointly assess the performance of the proposed optimal cost assignment strategies under the different estimation quality metrics. Using the same set of system noise variances employed in the previous experiments, we scale them by a factor c that varies inside the interval [0.1, 1] with 0.01 increments. Specifically, σ̂_ni^2 = c σ_ni^2, i = 1, ..., K, where c ∈ {0.1 : 0.01 : 1}. For such a comparison to make sense, the constraints on the estimation quality metrics are selected so that the optimal total measurement costs returned by the various approaches are equal for a certain value of the scale parameter c. Then, using the same value as the constraint, we evaluate the performance of each optimal cost allocation strategy for the rest of the scale parameter values.

Two examples are constructed for this case. In the first one, the performances of the optimal schemes under the four different performance metrics are equated for c = 0.5, producing an optimal total cost of 40.11. The corresponding constraint function values are E = 23.1371 for the average MSE criterion, 2(β − S) = 1.9389 for the Shannon information criterion, Λ = 0.4364 for the worst-case error variance criterion, and Γ = 1.3646 for the worst-case coordinate error variance criterion. The results are illustrated in Fig. 6. Intuitively, as the intrinsic system noise variances are increased, more reliable measurements (higher costs) are required to satisfy the same level of accuracy. Comparing the performances in Fig. 6, where all the costs are equated for c = 0.5, we observe that the average MSE criterion results in the least (i.e., the best) optimal cost score for increasing values of the scale parameter c. Its performance is followed by the Shannon information criterion, next by the worst-case coordinate error variance criterion, and finally by the worst-case error variance criterion. In other words, the effects of increasing system noise variances are much more pronounced for the worst-case error variance criterion, which operates by setting a constraint on the minimum eigenvalue of the FIM, than for the remaining criteria. If the noise scale parameter c is decreased below 0.5, it is observed that the Shannon information criterion produces the lowest measurement cost, followed by the worst-case coordinate error variance criterion, the worst-case error variance criterion, and finally the average MSE criterion, in order of increasing costs. It is noted that, except for the average MSE criterion, the performance of the remaining three metrics stays in the same order for values of c above and below 0.5. Another important observation is that, among the four estimation quality metrics, the performance of the MSE criterion is the one least susceptible to changes in the system noise variance. That is, as c is increased beyond 0.5 or decreased below 0.5, the least varying performance metric corresponds to the average MSE criterion. Therefore, in applications where the level of the system noise variance is likely to fluctuate around a nominal value and a predetermined value of the estimation accuracy has to be satisfied, the average MSE criterion provides the most robust alternative in terms of measurement device selection. However, even in this case, a small change of order 0.01 in the value of the scale parameter disturbs the total cost by more than 1 bit for the average MSE metric.

In the second example, the performances of the estimation quality metrics are equated for c = 1, resulting in a total cost score of 320.8. We employ the same constraint value (E = 23.1371) for the average MSE criterion, and the adjustments are applied to the remaining metrics. The corresponding constraint function values are calculated as 2(β − S) = 0.66 for the Shannon information criterion, Λ = 0.3664 for the worst-case error variance criterion, and Γ = 1.5519 for the worst-case coordinate error variance criterion. The results are illustrated in Fig. 7. In accordance with the observations for high values of c in the previous example, the worst-case error variance metric quickly responds to the drop in the level of the system noise variance values. Hence, the lowest cost is provided by the worst-case error variance criterion for c < 1. On the other hand, the optimal cost value for the average MSE criterion exhibits the slowest descent for decreasing values of c. Also noted from the figure is that the performance curve for the Shannon information criterion down-crosses the curve corresponding to the worst-case coordinate error variance criterion at around c = 0.21.

5.3. The relationship between the number of effective measurements and the quality of estimation under scaling of the system noise variances

In this experiment, we discuss the relationship between the number of effective measurements K_eff and the various estimation quality metrics under scaling of the system noise variances. A measurement is assessed as effective whenever its cost exceeds a certain fraction of the optimal value of the
total measurement cost. More specifically, we require that C_i > p(C/K), where K represents the total number of measurements. With this construction, it is assured that the total cost of the effective measurements is greater than (1 − p)C, from which a suitable value for p can be determined [1]. For small values of p, we can safely assume that the remaining measurements neither cause a significant change in the total cost nor provide any significant contribution to the estimation accuracy. Similar to the study in [1], p = 0.125 is selected. The same constraint values as in Fig. 7 are employed for the estimation accuracy metrics. Since the performances of all four estimation accuracy criteria are fixed to a high cost score of 320.8 for c = 1, it is noted from Fig. 8 that most of the observations are utilized at this value of the scale parameter in order to satisfy the strict constraints. As the average system noise power is reduced by assigning smaller values to the system noise variance multiplier c, the number of effective measurements decreases for all four cases, in accordance with the decreasing measurement costs. In other words, lower noise variances result in looser constraints, which can be met using a smaller number of high-resolution (costly) measurements. For small values of c, the worst-case error variance criterion requires the largest number of measurements, followed by the average MSE criterion, the worst-case coordinate error variance criterion, and finally the Shannon information criterion. For higher values of c, the situation is reversed, apart from the average MSE criterion, which requires the largest number of effective measurements. When c ≤ 0.56, a relatively small number of accurate measurements is sufficient to conduct a reliable estimation using the Shannon information criterion with respect to the remaining criteria.

Fig. 7. The performance of various optimal cost allocation strategies under scaling of the system noise variances. All costs are equal for c = 1.

Fig. 8. Number of effective measurements under the scaling of the system noise variances for various estimation accuracy metrics.

5.4. Effects of system matrix uncertainty

So far, we have assumed that the system matrix is known perfectly at the measurement stage. In this experiment, we consider the case in which the measurement system has only partial knowledge of the system matrix according to the specific uncertainty model introduced in Section 3. That is, the system matrix is represented as the sum of a known matrix and a random disturbance matrix, H ∈ H = {H̄ + Δ : ‖Δ‖_2 ≤ δ}, where the degree of uncertainty is controlled through the spectral norm of the disturbance matrix Δ. Below, we present the results concerning the effects of system uncertainty on the optimal cost allocation problem for the Shannon information and the worst-case error variance criteria in Fig. 9 and Fig. 10, respectively. For both cases, it is observed that the total cost increases as the amount of uncertainty in the system matrix increases for a given value of the constraint. The increase in the system matrix uncertainty also leads to smaller values of the maximum attainable estimation accuracy measures (the asymptotes where the total cost increases unboundedly).

Fig. 9. Effects of system matrix uncertainty on the total measurement cost for the Shannon information criterion.

6. Extension to Bayesian framework

In Section 2, the parameter θ is modeled as a deterministic unknown. Whenever prior information about the distribution of the unknown parameter is available, this additional information can be utilized at the estimation stage. As a result, a more refined metric for assessing the quality of the estimator performance is employed, commonly known as the Bayesian CRB (BCRB) and expressed as follows:

E{(θ̂ − θ)(θ̂ − θ)^T} ⪰ (J_D + J_P)^{-1} ≜ BCRB,   (44)

where J_D represents the data information matrix and J_P represents the prior information matrix, whose elements are [16]
[J_D]_{ij} = −E_{y,θ}{∂^2 log p_{y|θ}(y)/(∂θ_i ∂θ_j)}, so that J_D = E_θ{J(y, θ)},  and
[J_P]_{ij} = −E_θ{∂^2 log w(θ)/(∂θ_i ∂θ_j)},   (45)

where J(y, θ) is the standard Fisher information matrix defined in (3).

Fig. 10. Effects of system matrix uncertainty on the total measurement cost for the worst-case error variance criterion.

When the prior distribution of the parameter is Gaussian with θ ∼ N(0, Σ_θ), under the same assumptions regarding the independence of n ∼ N(0, D_n) and m ∼ N(0, D_m), the BCRB for the linear system given in Fig. 1 can be obtained as

BCRB = (∑_{i=1}^K h_i h_i^T/(σ_ni^2 + σ_mi^2) + Σ_θ^{-1})^{-1}.   (46)

Correspondingly, the total cost function should be restated to incorporate the change in the variance of the input to each measurement device as follows:

C = ∑_{i=1}^K C_i = ∑_{i=1}^K (1/2) log2(1 + σ_xi^2/σ_mi^2),   (47)

where σ_xi^2 is the ith diagonal entry of the observation covariance matrix Cov(x) = H^T Σ_θ H + D_n.

Based on these expressions, all the proposed cost minimization formulations in Section 2 can be modified accordingly to obtain the optimal cost assignment strategies in the presence of prior information. Specifically, the CRB is replaced with the BCRB, and the cost function stated in (47) is substituted as the objective function inside the optimization problems given in (14), (18), (20), and (22). However, the modified optimization problems are not necessarily convex. It is also noted that the problem formulation constructed by employing the LMMSE estimator in [1] is equivalent to the dual

estimation accuracy constraints. Uncertainty in the system matrix has been modeled both in general terms and by employing a specific uncertainty model. It has been indicated that the convexity properties of the proposed optimization problems are preserved under uncertainty. When the system matrix is invertible, closed-form expressions have been presented for two different estimation accuracy metrics, which enable a quick assessment of the corresponding cost allocation strategies analytically or via simpler numerical techniques. It has been shown that prior information can be incorporated into the optimization problems, but the resulting problems need no longer be convex. Through numerical examples, the relationships among the various criteria have been analyzed in depth.

References

[1] A. Ozcelikkale, H.M. Ozaktas, E. Arikan, Signal recovery with cost-constrained measurements, IEEE Trans. Signal Process. 58 (2010) 3607–3617.
[2] H.V. Poor, An Introduction to Signal Detection and Estimation, Springer-Verlag, New York, 1994.
[3] S. Appadwedula, V.V. Veeravalli, D.L. Jones, Energy-efficient detection in sensor networks, IEEE J. Sel. Areas Commun. 23 (2005) 693–702.
[4] A. Ribeiro, G.B. Giannakis, Bandwidth-constrained distributed estimation for wireless sensor networks—Part I: Gaussian case, IEEE Trans. Signal Process. 54 (2006) 1131–1143.
[5] J.-J. Xiao, S. Cui, Z.-Q. Luo, A.J. Goldsmith, Power scheduling of universal decentralized estimation in sensor networks, IEEE Trans. Signal Process. 54 (2006) 413–422.
[6] S. Cui, J.-J. Xiao, A.J. Goldsmith, Z.-Q. Luo, H.V. Poor, Estimation diversity and energy efficiency in distributed sensing, IEEE Trans. Signal Process. 55 (2007) 4683–4695.
[7] J. Li, G. AlRegib, Rate-constrained distributed estimation in wireless sensor networks, IEEE Trans. Signal Process. 55 (2007) 1634–1643.
[8] H. Zhang, J. Moura, B. Krogh, Dynamic field estimation using wireless sensor networks: Tradeoffs between estimation error and communication cost, IEEE Trans. Signal Process. 57 (2009) 2383–2395.
[9] G. Thatte, U. Mitra, Sensor selection and power allocation for distributed estimation in sensor networks: Beyond the star topology, IEEE Trans. Signal Process. 56 (2008) 2649–2661.
[10] T.C. Aysal, K.E. Barner, Constrained decentralized estimation over noisy channels for sensor networks, IEEE Trans. Signal Process. 56 (2008) 1398–1410.
[11] S. Joshi, S. Boyd, Sensor selection via convex optimization, IEEE Trans. Signal Process. 57 (2009) 451–462.
[12] M. Lazaro, M. Sanchez-Fernandez, A. Artes-Rodriguez, Optimal sensor selection in binary heterogeneous sensor networks, IEEE Trans. Signal Process. 57 (2009) 1577–1587.
[13] B. Dulek, S. Gezici, Average Fisher information maximisation in presence of cost-constrained measurements, Electron. Lett. 47 (2011) 654–656.
[14] S.M. Kay, Fundamentals of Statistical Signal Processing: Estimation Theory, Prentice Hall, Upper Saddle River, NJ, 1993.
[15] M. Hayes, Statistical Digital Signal Processing and Modeling, John Wiley & Sons, 1996.
[16] H.L.V. Trees, Detection, Estimation, and Modulation Theory: Part I, 2nd ed., John Wiley & Sons, New York, NY, 2001.
[17] R. Zamir, A proof of the Fisher information inequality via a data processing argument, IEEE Trans. Inform. Theory 44 (1998) 1246–1250.
[18] A. Dembo, T.M. Cover, J.A. Thomas, Information theoretic inequalities, IEEE Trans. Inform. Theory 37 (1991) 1501–1518.
[19] S. Boyd, L. Vandenberghe, Convex Optimization, Cambridge University Press, Cambridge, UK, 2004.
[20] Y. Eldar, A. Ben-Tal, A. Nemirovski, Robust mean-squared error estimation in the presence of model uncertainties, IEEE Trans. Signal Process. 53 (2005) 168–181.
[21] Z. Ben-Haim, Y.C. Eldar, Maximum set estimators with bounded estimation error, IEEE Trans. Signal Process. 53 (2005) 3172–3182.
of the Bayesian estimation case under the average MSE criterion [22] A. Das, D. Kempe, Sensor selection for minimizing worst-case prediction error,
given in (11) when Gaussian priors are assumed. in: Int. Conf. Inform. Process. Sensor Networks (IPSN’08), pp. 97–108.
[23] R. Hettich, K. Kortanek, Semi-infinite programming: Theory, methods, and ap-
plications, SIAM Rev. 35 (1993) 380–429.
7. Conclusion [24] A. Mutapcic, S. Boyd, Cutting-set methods for robust convex optimization with
pessimizing oracles, Optim. Methods Softw. 24 (2009) 381–406.
In this paper, we have studied the measurement cost mini- [25] Z. Quan Luo, J.F. Sturm, S. Zhang, Multivariate nonnegative quadratic mappings,
mization problem for a linear system in the presence of Gaussian SIAM J. Optim. 14 (2002) 1140–1162.
[26] R.T. Rockafellar, Convex Analysis, Princeton University Press, Princeton, NJ,
noise based on the measurement device model introduced in [1]. 1968.
By considering the nonrandom parameter estimation case, novel [27] M. Grant, S. Boyd, CVX: Matlab software for disciplined convex programming,
convex optimization problems have been obtained under various version 1.21, http://cvxr.com/cvx, 2011.
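For a quick numerical sense of how the Bayesian expressions in (46) and (47) trade off against each other, consider the scalar-parameter special case, where each $\mathbf{h}_i$ reduces to a scalar $h_i$ and $\boldsymbol{\Sigma}_{\theta}$ to a single prior variance $\sigma_{\theta}^2$. The Python sketch below uses made-up values for $h_i$, $\sigma_{n_i}^2$, $\sigma_{m_i}^2$, and $\sigma_{\theta}^2$ (they are illustrative assumptions, not values from the paper) and only checks the qualitative behavior: lowering the measurement noise variances raises the total cost in (47) while tightening the bound in (46).

```python
import math

def bcrb_scalar(h, var_n, var_m, var_theta):
    # Scalar-parameter case of (46): invert sum_i h_i^2/(s_n_i^2 + s_m_i^2) + 1/s_theta^2.
    info = sum(hi ** 2 / (vn + vm) for hi, vn, vm in zip(h, var_n, var_m))
    return 1.0 / (info + 1.0 / var_theta)

def total_cost(h, var_n, var_m, var_theta):
    # Cost (47) with s_x_i^2 = h_i^2 * s_theta^2 + s_n_i^2 (scalar case of H^T Sigma_theta H + D_n).
    return sum(0.5 * math.log2(1.0 + (hi ** 2 * var_theta + vn) / vm)
               for hi, vn, vm in zip(h, var_n, var_m))

# Hypothetical system: K = 3 sensors observing a scalar theta with prior variance 1.
h = [1.0, 0.5, 2.0]
var_n = [0.1, 0.2, 0.05]
var_theta = 1.0

cheap = [1.0, 1.0, 1.0]        # noisy, low-cost measurement devices
accurate = [0.01, 0.01, 0.01]  # precise, high-cost measurement devices

# Better devices cost more bits under (47) but yield a smaller Bayesian bound (46).
assert total_cost(h, var_n, accurate, var_theta) > total_cost(h, var_n, cheap, var_theta)
assert bcrb_scalar(h, var_n, accurate, var_theta) < bcrb_scalar(h, var_n, cheap, var_theta)
```

In the optimization problems themselves, the $\sigma_{m_i}^2$ values play the role of design variables: one would minimize (47) subject to a constraint built from (46) (or its trace in the vector-parameter case), mirroring the nonrandom-parameter formulations of Section 2.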
Berkan Dulek received the B.S. and M.S. degrees with high honors in electrical engineering from Bilkent University, Turkey, in 2003 and 2006, respectively. He is currently studying toward the Ph.D. degree at Bilkent University. His research interests are in statistical signal processing and communications with emphasis on stochastic signaling, randomized detection and estimation under cost constraints.

Sinan Gezici received the B.S. degree from Bilkent University, Turkey in 2001, and the Ph.D. degree in electrical engineering from Princeton University in 2006. From 2006 to 2007, he worked at Mitsubishi Electric Research Laboratories, Cambridge, MA. Since February 2007, he has been an Assistant Professor in the Department of Electrical and Electronics Engineering at Bilkent University. Dr. Gezici's research interests are in the areas of detection and estimation theory, wireless communications, and localization systems. Among his publications in these areas is the book Ultra-wideband Positioning Systems: Theoretical Limits, Ranging Algorithms, and Protocols (Cambridge University Press, 2008).