
Optimal Joint Detection and Estimation in Linear Models

Jianshu Chen, Yue Zhao, Andrea Goldsmith, and H. Vincent Poor

Abstract— The problem of optimal joint detection and estimation in linear models with Gaussian noise is studied. A simple closed-form expression for the joint posterior distribution of the (multiple) hypotheses and the states is derived. The expression crystallizes the dependence of the optimal detector on the state estimates. The joint posterior distribution characterizes the beliefs ("soft information") about the hypotheses and the values of the states. Furthermore, it is a sufficient statistic for jointly detecting multiple hypotheses and estimating the states. The developed expressions give us a unified framework for joint detection and estimation under all performance criteria.

This research was supported in part by the DTRA under Grant HDTRA1-08-1-0010, in part by the Air Force Office of Scientific Research under MURI Grant FA9550-09-1-0643, and in part by the Office of Naval Research under Grant N00014-12-1-0767. J. Chen is with the Dept. of Electrical Engineering, University of California, Los Angeles, CA 90095 USA (e-mail: [email protected]). Y. Zhao is with the Dept. of Electrical Engineering, Princeton University, Princeton, NJ 08544 USA, and with the Dept. of Electrical Engineering, Stanford University, Stanford, CA 94305 USA (e-mail: [email protected]). A. Goldsmith is with the Dept. of Electrical Engineering, Stanford University, Stanford, CA 94305 USA (e-mail: [email protected]). H. V. Poor is with the Dept. of Electrical Engineering, Princeton University, Princeton, NJ 08544 USA (e-mail: [email protected]).

I. INTRODUCTION

Detection and estimation problems appear simultaneously and are naturally coupled in many engineering systems. Several prominent examples are as follows. To achieve situational awareness in power grids, it is essential to have timely detection of outages as well as estimation of system states [1], [2]. Radar systems detect the existence of targets and also estimate their positions and velocities [3]. Wireless communication systems often need to decode messages and estimate channel states at the same time [4]. In different engineering systems, the problem settings of joint detection and estimation can vary greatly, and many application-specific solutions have been developed in practice.

A classic approach that addresses the detection problem in the presence of unknown states/parameters is composite hypothesis testing [3]. Accordingly, a straightforward approach for joint hypothesis testing and state/parameter estimation is to perform composite hypothesis testing first, followed by state/parameter estimation based on the hard decision made from hypothesis testing. However, such an approach cannot provide optimality guarantees under general performance criteria that depend jointly on detection and estimation results.

In the literature, several studies have addressed such joint performance criteria. The structure of the jointly optimal Bayes detector and estimator with discrete-time data was developed in [5] and [6], and was extended to the continuous-time data case in [7]. There, the detector structure was expressed in terms of some generalized forms of likelihood ratios. The structure of the optimal Bayesian estimator under any given constraints on the false alarm probability and the probability of missed detection has also been developed for the binary hypothesis case [8].

In this paper, we study the problem of optimal joint detection and estimation for a general class of observation models, namely, linear models with Gaussian noise. Linear models appear in a wide range of engineering applications, including power systems [1], channel estimation [9], [10], adaptive array processing [11]–[13], and spectrum estimation [14]. In these applications, not only is state estimation of primary interest, but the observation matrix can also often change over time, and it is essential to detect which observation matrix among many possibilities is currently effective. We formulate these problems as joint multiple hypothesis testing and state estimation problems. Instead of focusing on a particular form of performance criterion and developing the corresponding optimal joint detector and estimator, we develop a unified Bayesian approach that can be applied to any given criterion. Specifically, employing a conjugate prior, we provide closed-form expressions for the joint posterior of the hypotheses and the system states given all measurement samples. The developed expressions reveal the exact dependence of the optimal detectors on the state estimates. Because the joint posterior is a sufficient statistic for joint hypothesis testing and state estimation, the derived explicit forms of such soft information (as opposed to hard decisions) can be applied to all performance criteria with optimality guarantees.

The remainder of the paper is organized as follows. In Section II, we describe the system model and formulate the joint detection and estimation problem. In Section III, we provide a factorization of the likelihood function and derive a simple closed-form expression for the joint posterior distribution. Finally, we conclude the paper and remark on future directions in Section IV.

Notation. We use boldface letters to denote random quantities and regular letters to denote realizations or deterministic quantities.

II. PROBLEM FORMULATION

We consider the following observation model, which entails a joint detection and estimation problem. Given each of the K + 1 hypotheses H_0, H_1, ..., H_K, the M × 1 sensor measurement vector x_t at time t is obtained according to the following linear model:

    H_k : x_t = H_k θ + v_t,   k = 0, 1, ..., K,    (1)

where H_k is the M × N observation matrix under hypothesis H_k, θ is the N × 1 unknown state vector to estimate, and v_t ∼ N(0, R_v) is the M × 1 measurement noise, which is independent and identically distributed (i.i.d.) over time. (In addition to states, θ can also include parameters in some applications [12], [13]. For the sake of brevity, we refer to θ as states from now on.)
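To make the formulation concrete, the following minimal Python sketch simulates measurements from the linear model (1). All concrete values here (the dimensions M, N, K, the candidate matrices H_k, the noise covariance R_v, and the variable names) are illustrative assumptions for this sketch, not quantities from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (assumptions for this sketch, not from the paper):
M, N, K = 4, 2, 2                                         # 4 sensors, 2 states, K + 1 = 3 hypotheses
H = [rng.standard_normal((M, N)) for _ in range(K + 1)]   # candidate observation matrices H_k
R_v = 0.5 * np.eye(M)                                     # measurement noise covariance R_v

def measure(k, theta, T):
    """Draw T measurements x_t = H_k θ + v_t with v_t ~ N(0, R_v), as in model (1)."""
    L = np.linalg.cholesky(R_v)                 # R_v = L L^T
    V = rng.standard_normal((T, M)) @ L.T       # rows are i.i.d. N(0, R_v) noise vectors
    return theta @ H[k].T + V                   # X[t - 1] holds the measurement x_t

theta_true = np.array([1.0, -2.0])              # unknown state vector θ (assumed)
k_true = 1                                      # index of the true hypothesis (assumed)
X = measure(k_true, theta_true, T=50)
```

Both the hypothesis index k_true and the state theta_true are treated as unknown by the inference procedures sketched in Section III below.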
From the measurement data {x_t}, we want to jointly infer a) the true underlying linear model H_k, and b) the true underlying states θ. Note that neither of them is known beforehand, and we need to solve a problem of jointly detecting H_k and estimating θ. Such problems arise in many applications. We provide in the following an example that arises commonly in power grid monitoring. An outage in a power grid will change the grid topology, and the system operator wants to detect which outage among a candidate set {H_1, ..., H_K} occurs, or whether no outage occurs (H_0). With a given set of sensors in the grid, the k-th outage scenario gives rise to a unique observation matrix H_k, and the sensors measure the states of the grid θ via (1). Consequently, state estimation depends on knowledge of the true outage, and outage detection depends on knowledge of the true states [2]. Clearly, solving a joint detection and estimation problem is essential for monitoring the health of the power grid in real time.

For these purposes, this paper provides the joint posterior distribution p(θ, H_k | x^i) (see (12)–(13) below), where x^i denotes the collection of measurements {x_1, ..., x_i}. This posterior gives us the beliefs about both θ and H_k. It is also a sufficient statistic for θ and H_k given the data x^i, i.e., it provides full information from the measured data x^i about the hypothesis H_k and the state vector θ. Therefore, instead of being optimal functions of x^i, the optimal decision rule and estimator need only be optimal functionals of p(θ, H_k | x^i). Deriving expressions for the joint posterior distribution gives us a unified framework for joint detection and estimation under all performance criteria (e.g., minimum-risk, minimum-probability-of-error, or maximum a posteriori probability (MAP) detection, and MAP or minimum mean-square error (MMSE) estimation).

III. JOINT POSTERIOR OF HYPOTHESES AND STATES

We now derive the joint posterior distribution of the hypothesis H_k and the unknown states θ. Specifically, we will use p(θ, H_k | x^i) as a hybrid probability measure to denote the joint posterior distribution of θ and H_k:

    p(θ, H_k | x^i) = p(H_k | x^i) p(θ | H_k, x^i)    (2)

where p(H_k | x^i) denotes the posterior probability mass function (PMF) of H_k and p(θ | H_k, x^i) denotes the posterior probability density function (PDF) of θ given H_k.

A. The Likelihood Function

We begin with a factorization of the likelihood function p(x^i | θ, H_k), which will be useful in finding sufficient statistics for jointly detecting H_k and estimating θ, and in computing the joint posterior distribution.

Lemma 1 (Factorization): According to the linear model in (1), we can express the conditional distribution (the likelihood function) p(x^i | θ, H_k) in the following form:

    p(x^i | θ, H_k) = p(x^i | θ̂_{k,ML}, H_k) · exp( -(1/2) ‖θ - θ̂_{k,ML}‖^2_{I(θ̂_{k,ML})} )    (3)

where the notation ‖x‖^2_Σ denotes x^T Σ x for a positive definite weighting matrix Σ, θ̂_{k,ML} is the maximum likelihood (ML) estimate of θ given that hypothesis H_k is true, and I(θ̂_{k,ML}) is the corresponding Fisher information matrix:

    θ̂_{k,ML} = (H_k^T R_v^{-1} H_k)^{-1} H_k^T R_v^{-1} x̄_i    (4)
    I(θ̂_{k,ML}) = i · (H_k^T R_v^{-1} H_k)    (5)
    x̄_i = (1/i) Σ_{t=1}^{i} x_t.    (6)

Proof: See Appendix I.

In [15], an asymptotic expression similar to (3) was derived for general likelihood functions satisfying certain regularity conditions for large i. In comparison, our expression (3) holds for all i ≥ 1 due to the properties of the linear model with Gaussian noise that we have assumed. Furthermore, the linear model (1) also allows us to evaluate the expression for p(x^i | θ̂_{k,ML}, H_k), which is given by the following lemma.

Lemma 2 (Expression for p(x^i | θ̂_{k,ML}, H_k)): The conditional probability p(x^i | θ̂_{k,ML}, H_k) can be expressed as

    p(x^i | θ̂_{k,ML}, H_k) = [ (2π)^{M/2} det(R_v)^{1/2} ]^{-i} · exp( -(1/2) Σ_{t=1}^{i} ‖x_t‖^2_{R_v^{-1}} ) · exp( (1/2) ‖θ̂_{k,ML}‖^2_{I(θ̂_{k,ML})} )    (7)

where θ̂_{k,ML} and I(θ̂_{k,ML}) are given by (4)–(6).

Proof: See Appendix II.

Substituting (7) into (3), we obtain the following factorization of the likelihood function:

    p(x^i | θ, H_k) = [ (2π)^{M/2} det(R_v)^{1/2} ]^{-i} · exp( -(1/2) Σ_{t=1}^{i} ‖x_t‖^2_{R_v^{-1}} )
                      · exp( (1/2) ‖θ̂_{k,ML}‖^2_{I(θ̂_{k,ML})} ) · exp( -(1/2) ‖θ - θ̂_{k,ML}‖^2_{I(θ̂_{k,ML})} ).    (8)

Note that the first two terms in (8) are independent of the hypothesis index k and the state vector θ, while the other two terms depend on {θ̂_{k,ML}}, which, by (4), is in turn determined by x̄_i. Therefore, by the Neyman–Fisher factorization theorem [3], [15], [16], x̄_i is a sufficient statistic for jointly detecting H_k and estimating θ. This fact will also be reflected further ahead in the joint posterior expressions (12)–(16), where x̄_i is the only statistic we need to track over time, via, e.g., the recursion

    x̄_i = x̄_{i-1} + (1/i)(x_i - x̄_{i-1}).    (9)
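Continuing the toy sketch from Section II (again an illustration, not the authors' code), the quantities (4)–(6), the recursion (9), and the factorization identity (3) can be checked numerically:

```python
# Running mean of the measurements: the sufficient statistic x̄_i, updated as in (9).
x_bar = np.zeros(M)
for i, x in enumerate(X, start=1):
    x_bar += (x - x_bar) / i        # x̄_i = x̄_{i-1} + (x_i - x̄_{i-1}) / i

Rv_inv = np.linalg.inv(R_v)

def ml_and_fisher(k):
    """ML estimate (4) and Fisher information (5) under hypothesis H_k."""
    G = H[k].T @ Rv_inv @ H[k]      # H_k^T R_v^{-1} H_k
    theta_ml = np.linalg.solve(G, H[k].T @ Rv_inv @ x_bar)
    return theta_ml, i * G          # θ̂_{k,ML} and I(θ̂_{k,ML})

def loglik(k, theta):
    """log p(x^i | θ, H_k), up to the θ-independent constant in (8)."""
    resid = X - theta @ H[k].T
    return -0.5 * np.einsum('tm,mn,tn->', resid, Rv_inv, resid)

# Check of the factorization (3): the log-likelihood drops from its maximum by
# exactly (1/2) ||θ - θ̂_{k,ML}||^2_{I(θ̂_{k,ML})} at any other θ.
theta_ml, I_ml = ml_and_fisher(k_true)
d = theta_true - theta_ml
assert np.isclose(loglik(k_true, theta_true),
                  loglik(k_true, theta_ml) - 0.5 * d @ I_ml @ d)
```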
 
B. Conjugate Prior

For a given likelihood function, if a prior distribution produces a posterior distribution in the same family, then such a prior distribution is called a conjugate prior. With a conjugate prior, we need only maintain recursions for the parameters that describe the distribution family of the prior and the posterior. We will use this kind of prior in our joint detection and estimation problem.

At the beginning (before any measurement data are available), we assume that the prior distribution of θ and H_k is given by

    p(θ, H_k) = p(H_k) p(θ | H_k)    (10)

where p(H_k) is the prior PMF of the hypothesis H_k and p(θ | H_k) is the prior PDF of the state vector θ given hypothesis H_k. Throughout the paper, we assume that, given H_k, θ has a Gaussian prior:

    p(θ | H_k) = (2π)^{-N/2} det(C_{k,0})^{-1/2} · exp( -(1/2) ‖θ - θ_{k,0}‖^2_{C_{k,0}^{-1}} )    (11)

where θ_{k,0} and C_{k,0} are the corresponding prior mean and covariance matrix given hypothesis H_k, respectively. We will show in the next subsection that this prior is indeed a conjugate prior. Furthermore, we will also show that even with an "uninformative prior" about θ (i.e., C_{k,0} → ∞ in some way), the resulting posterior takes the same form as the joint prior distribution in (11). Therefore, alternatively, we can think of the conjugate prior in (11) as the intermediate knowledge that we have learned from earlier data.

C. Main Results

Theorem 1 (Optimal joint inference): Suppose the prior distribution is given by (10)–(11). Then, the posterior distribution p(θ, H_k | x^i) = p(H_k | x^i) p(θ | x^i, H_k) is given by

    p(H_k | x^i) = (1 / f(x^i)) · [ det(C_{k,MMSE}) / det(C_{k,0}) ]^{1/2} · p(H_k)
                   · exp( (1/2) ‖θ̂_{k,MMSE}‖^2_{C_{k,MMSE}^{-1}} ) / exp( (1/2) ‖θ_{k,0}‖^2_{C_{k,0}^{-1}} ),    (12)

    p(θ | H_k, x^i) = (2π)^{-N/2} det(C_{k,MMSE})^{-1/2} · exp( -(1/2) ‖θ - θ̂_{k,MMSE}‖^2_{C_{k,MMSE}^{-1}} ),    (13)

where

    f(x^i) ≜ Σ_{q=0}^{K} p(H_q) · [ det(C_{q,MMSE}) / det(C_{q,0}) ]^{1/2}
             · exp( (1/2) ‖θ̂_{q,MMSE}‖^2_{C_{q,MMSE}^{-1}} ) / exp( (1/2) ‖θ_{q,0}‖^2_{C_{q,0}^{-1}} ),    (14)

    θ̂_{k,MMSE} ≜ ( C_{k,0}^{-1} + I(θ̂_{k,ML}) )^{-1} ( I(θ̂_{k,ML}) θ̂_{k,ML} + C_{k,0}^{-1} θ_{k,0} ),    (15)

and

    C_{k,MMSE} ≜ ( C_{k,0}^{-1} + I(θ̂_{k,ML}) )^{-1}.    (16)

Proof: See Appendix III.

Note that θ̂_{k,MMSE} is the classical MMSE estimate of θ given that H_k is true, and C_{k,MMSE} is the corresponding error covariance matrix. In the posterior expression (12):

• f(x^i) is a normalization factor.
• p(H_k) captures the prior PMF.
• The intuition of the factor [ det(C_{k,MMSE}) / det(C_{k,0}) ]^{1/2} is to penalize the model complexity of H_k. This term reduces to det( I(θ̂_{k,ML}) )^{-1/2} in the case of an uninformative prior (see (18) below), for which a discussion of its meaning can be found in [15].
• The last term in (12) characterizes the similarity between the data (adjusted by the prior PDF) and the hypothesis H_k.

Moreover, the exact dependence of the optimal detector on the state estimator can be seen from expression (12).

Accordingly, the posterior marginal distribution p(θ | x^i) of the state vector θ can be expressed as

    p(θ | x^i) = Σ_{k=0}^{K} p(H_k | x^i) · (2π)^{-N/2} det(C_{k,MMSE})^{-1/2}
                 · exp( -(1/2) ‖θ - θ̂_{k,MMSE}‖^2_{C_{k,MMSE}^{-1}} ),    (17)

which is a Gaussian mixture density with K + 1 components.

We observe from (12)–(13) that the posterior distribution is in the same family as the prior distribution (10)–(11). Therefore, the prior we have chosen is indeed a conjugate prior. As a result, we need only maintain recursions for the parameters of the posterior distribution. Specifically, we need only maintain recursions for C_{k,MMSE} and θ̂_{k,MMSE}, where θ̂_{k,MMSE} is the only term that depends on the data, via θ̂_{k,ML}. By (4), θ̂_{k,ML} is a linear function of x̄_i, which, as we pointed out in Section III-A, is a sufficient statistic and can be updated recursively by (9). Therefore, as new data stream in, the optimal inference over the joint posterior p(θ, H_k | x^i) can be implemented recursively using finite memory. This fact is also reflected in the marginal distribution p(θ | x^i), where the number of Gaussian mixture components remains at K + 1 over time.
D. The Case without Prior Knowledge of States

When the inference algorithm has just started, we may not have any prior information about the states θ. In this case, we may assume that we are equally "uninformed" about the states under the different hypotheses. In such a setup, we let the covariance matrices of the prior p(θ | H_k) be the same for all H_k, i.e., C_{k,0} = C_0 for some invertible matrix, and let C_0 → ∞. Applying this procedure to (15)–(16), the MMSE estimate θ̂_{k,MMSE} given H_k becomes the maximum likelihood estimate θ̂_{k,ML}, and the error covariance C_{k,MMSE} given H_k becomes the inverse Fisher information matrix I(θ̂_{k,ML})^{-1}. As a result, the optimal joint inference is given by the following corollary.

Corollary 1 ("Uninformative Prior"): Without any prior information about the states, the joint posterior distribution is given by

    p(H_k | x^i) = [ p(H_k) · exp( (1/2) ‖θ̂_{k,ML}‖^2_{I(θ̂_{k,ML})} ) · det( I(θ̂_{k,ML}) )^{-1/2} ]
                   / [ Σ_{q=0}^{K} p(H_q) · exp( (1/2) ‖θ̂_{q,ML}‖^2_{I(θ̂_{q,ML})} ) · det( I(θ̂_{q,ML}) )^{-1/2} ],    (18)

    p(θ | H_k, x^i) = (2π)^{-N/2} det( I(θ̂_{k,ML}) )^{1/2} · exp( -(1/2) ‖θ - θ̂_{k,ML}‖^2_{I(θ̂_{k,ML})} ).    (19)

As a consequence, the posterior marginal distribution p(θ | x^i) of the state vector θ can be expressed as

    p(θ | x^i) = Σ_{k=0}^{K} p(H_k | x^i) · (2π)^{-N/2} det( I(θ̂_{k,ML}) )^{1/2}
                 · exp( -(1/2) ‖θ - θ̂_{k,ML}‖^2_{I(θ̂_{k,ML})} ).    (20)

Note from (18)–(19) that the joint posterior distribution p(θ, H_k | x^i) takes the same form as the joint posterior in (12)–(13) even when we start without any prior knowledge about θ. Therefore, the form of the joint prior distribution in (10)–(11) can also be viewed as the knowledge we have learned from earlier data, since it takes the same form as the posterior in (18)–(19).
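In practice, (18) is best evaluated in the log domain to avoid overflow in the exponentials. A hedged sketch, continuing the toy example (not the authors' implementation):

```python
# Detector weights from (18), computed in the log domain on the toy example.
log_w = np.empty(K + 1)
for k in range(K + 1):
    th_ml, I_ml = ml_and_fisher(k)
    log_w[k] = (np.log(p_H[k])
                + 0.5 * th_ml @ I_ml @ th_ml          # (1/2) ||θ̂_{k,ML}||^2_{I(θ̂_{k,ML})}
                - 0.5 * np.linalg.slogdet(I_ml)[1])   # log of det(I(θ̂_{k,ML}))^{-1/2}
post_H = np.exp(log_w - log_w.max())
post_H /= post_H.sum()                                # p(H_k | x^i) as in (18)
# Under hypothesis H_k, the state estimate is θ̂_{k,ML},
# with covariance I(θ̂_{k,ML})^{-1}, per (19).
```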
IV. CONCLUSIONS AND FUTURE WORK

In this paper, we have studied the optimal joint detection and estimation problem in linear models with Gaussian noise. We have proved a factorization lemma for the likelihood function and shown that the averaged measurement vector x̄_i is a sufficient statistic. We have then derived a simple closed-form expression for the joint posterior distribution of the hypotheses and the states. This expression reveals the exact dependence of the optimal detector on the state estimates. The joint posterior can then be used to develop optimal joint detector and estimator structures under any given performance criterion.

We have studied the case in which the states are static over time. It would be interesting to generalize to the case in which the states evolve according to certain dynamics. In particular, we would like to investigate whether the joint posterior follows similar forms that depend only on the state estimates over time. For this, it is essential to examine the evolution of the joint posterior from time i − 1 to time i, and we expect that similar techniques will apply. Furthermore, we have focused on the optimal fixed-sample-size approach. Developing an optimal sequential approach for joint detection and estimation from the joint posterior distribution remains an interesting direction for future work.

APPENDIX I
PROOF OF LEMMA 1

From the definition of the Gaussian linear model (1) and the i.i.d. assumption on the noise, we can write the joint likelihood function as

    p(x^i | θ, H_k)
    = Π_{t=1}^{i} p(x_t | θ, H_k)
    = Π_{t=1}^{i} (2π)^{-M/2} det(R_v)^{-1/2} · exp( -(1/2) ‖x_t - H_k θ‖^2_{R_v^{-1}} )
    = [ (2π)^{M/2} det(R_v)^{1/2} ]^{-i} · exp( -(1/2) Σ_{t=1}^{i} ‖x_t - H_k θ‖^2_{R_v^{-1}} )
    = [ (2π)^{M/2} det(R_v)^{1/2} ]^{-i} · exp( -(1/2) Σ_{t=1}^{i} ‖(x_t - H_k θ̂_{k,ML}) + H_k (θ̂_{k,ML} - θ)‖^2_{R_v^{-1}} ).

Expanding the square yields

    p(x^i | θ, H_k)
    = [ (2π)^{M/2} det(R_v)^{1/2} ]^{-i}
      · exp( -(1/2) Σ_{t=1}^{i} (x_t - H_k θ̂_{k,ML})^T R_v^{-1} (x_t - H_k θ̂_{k,ML}) )
      · exp( -(1/2) Σ_{t=1}^{i} (θ̂_{k,ML} - θ)^T (H_k^T R_v^{-1} H_k) (θ̂_{k,ML} - θ) )
      · exp( - Σ_{t=1}^{i} (x_t - H_k θ̂_{k,ML})^T R_v^{-1} H_k (θ̂_{k,ML} - θ) ).

Using (6), the last (cross) term can be written as

    exp( - i (x̄_i - H_k θ̂_{k,ML})^T R_v^{-1} H_k (θ̂_{k,ML} - θ) )
    = exp( - i [ x̄_i^T R_v^{-1} H_k (θ̂_{k,ML} - θ) - θ̂_{k,ML}^T H_k^T R_v^{-1} H_k (θ̂_{k,ML} - θ) ] ).

By (4), x̄_i^T R_v^{-1} H_k (H_k^T R_v^{-1} H_k)^{-1} = θ̂_{k,ML}^T, so that

    x̄_i^T R_v^{-1} H_k (θ̂_{k,ML} - θ)
    = x̄_i^T R_v^{-1} H_k (H_k^T R_v^{-1} H_k)^{-1} · H_k^T R_v^{-1} H_k (θ̂_{k,ML} - θ)
    = θ̂_{k,ML}^T H_k^T R_v^{-1} H_k (θ̂_{k,ML} - θ),

and the cross term equals one. Therefore,

    p(x^i | θ, H_k)
    = [ (2π)^{M/2} det(R_v)^{1/2} ]^{-i}
      · exp( -(1/2) Σ_{t=1}^{i} (x_t - H_k θ̂_{k,ML})^T R_v^{-1} (x_t - H_k θ̂_{k,ML}) )
      · exp( -(1/2) (θ̂_{k,ML} - θ)^T (i · H_k^T R_v^{-1} H_k) (θ̂_{k,ML} - θ) )
    = p(x^i | θ̂_{k,ML}, H_k) · exp( -(1/2) (θ̂_{k,ML} - θ)^T I(θ̂_{k,ML}) (θ̂_{k,ML} - θ) ),    (21)

which, together with (5), establishes (3).

APPENDIX II
PROOF OF LEMMA 2

By (21), the expression for p(x^i | θ̂_{k,ML}, H_k) is given by

    p(x^i | θ̂_{k,ML}, H_k) = [ (2π)^{M/2} det(R_v)^{1/2} ]^{-i}
                              · exp( -(1/2) Σ_{t=1}^{i} (x_t - H_k θ̂_{k,ML})^T R_v^{-1} (x_t - H_k θ̂_{k,ML}) ).    (22)

For the summation in the exponent, we have

    Σ_{t=1}^{i} (x_t - H_k θ̂_{k,ML})^T R_v^{-1} (x_t - H_k θ̂_{k,ML})
    = Σ_{t=1}^{i} ( x_t^T R_v^{-1} x_t - 2 x_t^T R_v^{-1} H_k θ̂_{k,ML} + θ̂_{k,ML}^T H_k^T R_v^{-1} H_k θ̂_{k,ML} )
    = Σ_{t=1}^{i} x_t^T R_v^{-1} x_t - 2 i x̄_i^T R_v^{-1} H_k θ̂_{k,ML} + i · θ̂_{k,ML}^T H_k^T R_v^{-1} H_k θ̂_{k,ML}
    = Σ_{t=1}^{i} x_t^T R_v^{-1} x_t - 2 θ̂_{k,ML}^T I(θ̂_{k,ML}) θ̂_{k,ML} + θ̂_{k,ML}^T I(θ̂_{k,ML}) θ̂_{k,ML}
    = Σ_{t=1}^{i} x_t^T R_v^{-1} x_t - θ̂_{k,ML}^T I(θ̂_{k,ML}) θ̂_{k,ML},    (23)

where the third equality uses the fact that, by (4)–(5), i x̄_i^T R_v^{-1} H_k θ̂_{k,ML} = θ̂_{k,ML}^T I(θ̂_{k,ML}) θ̂_{k,ML}. Finally, substituting (23) into (22), we establish Lemma 2.

APPENDIX III
PROOF OF THEOREM 1

By Bayes' formula, the joint posterior distribution can be expressed as

    p(θ, H_k | x^i) = p(θ, H_k) p(x^i | θ, H_k) / p(x^i).    (24)

To compute the above posterior distribution, we need p(x^i), given by

    p(x^i) = Σ_{k=0}^{K} p(H_k) · ∫_{θ∈Θ} p(θ | H_k) p(x^i | θ, H_k) dθ.    (25)

To proceed, we first introduce the following lemma, which gives an integral result that is useful in deriving both the optimal detection and estimation procedures.

Lemma 3 (A useful integral): Suppose we are given the Gaussian prior distribution (11). Then, the following result holds:

    ∫_{θ∈Θ} p(θ | H_k) exp( -(1/2) ‖θ - θ̂_{k,ML}‖^2_{I(θ̂_{k,ML})} ) dθ    (26)
    = exp( (1/2) ‖ I(θ̂_{k,ML}) θ̂_{k,ML} + C_{k,0}^{-1} θ_{k,0} ‖^2_{( C_{k,0}^{-1} + I(θ̂_{k,ML}) )^{-1}} )
      · exp( -(1/2) [ ‖θ_{k,0}‖^2_{C_{k,0}^{-1}} + ‖θ̂_{k,ML}‖^2_{I(θ̂_{k,ML})} ] )
      · [ det(C_{k,0})^{1/2} · det( C_{k,0}^{-1} + I(θ̂_{k,ML}) )^{1/2} ]^{-1}    (27)

where θ̂_{k,ML} and I(θ̂_{k,ML}) are defined in (4)–(6).

Proof: See Appendix IV.

Second, we compute the following integral, and then p(x^i), using the above lemma together with the factorization (8):

    ∫_{θ∈Θ} p(θ | H_k) p(x^i | θ, H_k) dθ
    = [ (2π)^{M/2} det(R_v)^{1/2} ]^{-i} · exp( -(1/2) Σ_{t=1}^{i} ‖x_t‖^2_{R_v^{-1}} ) · exp( (1/2) ‖θ̂_{k,ML}‖^2_{I(θ̂_{k,ML})} )
      · ∫_{θ∈Θ} p(θ | H_k) exp( -(1/2) ‖θ - θ̂_{k,ML}‖^2_{I(θ̂_{k,ML})} ) dθ
    = [ (2π)^{M/2} det(R_v)^{1/2} ]^{-i} · exp( -(1/2) Σ_{t=1}^{i} ‖x_t‖^2_{R_v^{-1}} ) · exp( (1/2) ‖θ̂_{k,ML}‖^2_{I(θ̂_{k,ML})} )
      · exp( (1/2) ‖ I(θ̂_{k,ML}) θ̂_{k,ML} + C_{k,0}^{-1} θ_{k,0} ‖^2_{( C_{k,0}^{-1} + I(θ̂_{k,ML}) )^{-1}} )
      · exp( -(1/2) [ ‖θ_{k,0}‖^2_{C_{k,0}^{-1}} + ‖θ̂_{k,ML}‖^2_{I(θ̂_{k,ML})} ] )
      · det(C_{k,0})^{-1/2} · det( C_{k,0}^{-1} + I(θ̂_{k,ML}) )^{-1/2}.    (28)
Accordingly, by (25) and (28),

    p(x^i) = Σ_{k=0}^{K} p(H_k) · ∫_{θ∈Θ} p(θ | H_k) p(x^i | θ, H_k) dθ
    = Σ_{k=0}^{K} p(H_k) · [ (2π)^{M/2} det(R_v)^{1/2} ]^{-i} · exp( -(1/2) Σ_{t=1}^{i} ‖x_t‖^2_{R_v^{-1}} )
      · exp( (1/2) ‖θ̂_{k,ML}‖^2_{I(θ̂_{k,ML})} )
      · exp( (1/2) ‖ I(θ̂_{k,ML}) θ̂_{k,ML} + C_{k,0}^{-1} θ_{k,0} ‖^2_{( C_{k,0}^{-1} + I(θ̂_{k,ML}) )^{-1}} )
      · exp( -(1/2) [ ‖θ_{k,0}‖^2_{C_{k,0}^{-1}} + ‖θ̂_{k,ML}‖^2_{I(θ̂_{k,ML})} ] )
      · det(C_{k,0})^{-1/2} · det( C_{k,0}^{-1} + I(θ̂_{k,ML}) )^{-1/2}.    (29)

Substituting the above expressions and (8) into (24), we obtain (12)–(16) after some simple algebra.

APPENDIX IV
PROOF OF LEMMA 3

It follows that

    ∫_{θ∈Θ} p(θ | H_k) exp( -(1/2) ‖θ - θ̂_{k,ML}‖^2_{I(θ̂_{k,ML})} ) dθ
    = (2π)^{-N/2} det(C_{k,0})^{-1/2} ∫_{θ∈Θ} exp( -(1/2) ‖θ - θ_{k,0}‖^2_{C_{k,0}^{-1}} ) · exp( -(1/2) ‖θ - θ̂_{k,ML}‖^2_{I(θ̂_{k,ML})} ) dθ.

Completing the square in θ in the combined exponent gives

    = exp( (1/2) ‖ I(θ̂_{k,ML}) θ̂_{k,ML} + C_{k,0}^{-1} θ_{k,0} ‖^2_{( C_{k,0}^{-1} + I(θ̂_{k,ML}) )^{-1}} )
      · exp( -(1/2) [ ‖θ_{k,0}‖^2_{C_{k,0}^{-1}} + ‖θ̂_{k,ML}‖^2_{I(θ̂_{k,ML})} ] )
      · [ det(C_{k,0})^{1/2} · det( C_{k,0}^{-1} + I(θ̂_{k,ML}) )^{1/2} ]^{-1}
      · (2π)^{-N/2} det( ( C_{k,0}^{-1} + I(θ̂_{k,ML}) )^{-1} )^{-1/2}
      · ∫_{θ∈Θ} exp( -(1/2) ‖ θ - ( C_{k,0}^{-1} + I(θ̂_{k,ML}) )^{-1} ( I(θ̂_{k,ML}) θ̂_{k,ML} + C_{k,0}^{-1} θ_{k,0} ) ‖^2_{ C_{k,0}^{-1} + I(θ̂_{k,ML}) } ) dθ
    = exp( (1/2) ‖ I(θ̂_{k,ML}) θ̂_{k,ML} + C_{k,0}^{-1} θ_{k,0} ‖^2_{( C_{k,0}^{-1} + I(θ̂_{k,ML}) )^{-1}} )
      · exp( -(1/2) [ ‖θ_{k,0}‖^2_{C_{k,0}^{-1}} + ‖θ̂_{k,ML}‖^2_{I(θ̂_{k,ML})} ] )
      · [ det(C_{k,0})^{1/2} · det( C_{k,0}^{-1} + I(θ̂_{k,ML}) )^{1/2} ]^{-1},    (30)

where the notation ‖x‖^2_Σ = x^T Σ x, and in the last step we used the fact that the integral of a Gaussian density over the entire space equals one. This establishes Lemma 3.
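As an independent numerical sanity check of Lemma 3 (not part of the paper), note that the integral (26) is the expectation of exp( -(1/2) ‖θ - θ̂_{k,ML}‖^2_{I(θ̂_{k,ML})} ) under the prior (11), so it can be estimated by Monte Carlo and compared with the closed form (27). A sketch on the running toy example:

```python
# Monte Carlo check of Lemma 3 on the toy example from Sections II-III.
k = k_true
th_ml, I_ml = ml_and_fisher(k)
m0, C0 = theta_0[k], C_0[k]
C0_inv = np.linalg.inv(C0)

samples = rng.multivariate_normal(m0, C0, size=500_000)   # θ ~ p(θ | H_k), eq. (11)
diff = samples - th_ml
mc = np.mean(np.exp(-0.5 * np.einsum('sn,nm,sm->s', diff, I_ml, diff)))

# Closed form (27), with A = C_{k,0}^{-1} + I(θ̂_{k,ML}) and b = I θ̂_{k,ML} + C_{k,0}^{-1} θ_{k,0}.
A = C0_inv + I_ml
b = I_ml @ th_ml + C0_inv @ m0
closed = (np.exp(0.5 * b @ np.linalg.solve(A, b)
                 - 0.5 * (m0 @ C0_inv @ m0 + th_ml @ I_ml @ th_ml))
          / np.sqrt(np.linalg.det(C0) * np.linalg.det(A)))

print(f"Monte Carlo: {mc:.4e}   closed form (27): {closed:.4e}")  # should agree up to MC error
```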
REFERENCES

[1] A. Abur and A. Gomez-Exposito, Power System State Estimation: Theory and Implementation, Marcel Dekker, New York, 2004.
[2] Y. Zhao, A. Goldsmith, and H. V. Poor, "On PMU location selection for line outage detection in wide-area transmission networks," in Proc. IEEE Power and Energy Society General Meeting, San Diego, CA, July 2012, pp. 1–8.
[3] H. V. Poor, An Introduction to Signal Detection and Estimation, Springer-Verlag, New York, 1994.
[4] A. Goldsmith, Wireless Communications, Cambridge University Press, Cambridge, UK, 2005.
[5] D. Middleton and R. Esposito, "Simultaneous optimum detection and estimation of signals in noise," IEEE Trans. Inf. Theory, vol. 14, no. 3, pp. 434–444, May 1968.
[6] A. Fredriksen, D. Middleton, and V. VandeLinde, "Simultaneous signal detection and estimation under multiple hypotheses," IEEE Trans. Inf. Theory, vol. 18, no. 5, pp. 607–614, Sep. 1972.
[7] D. G. Lainiotis, "Joint detection, estimation and system identification," Information and Control, vol. 19, no. 1, pp. 75–92, 1971.
[8] G. V. Moustakides, G. H. Jajamovich, A. Tajer, and X. Wang, "Joint detection and estimation: Optimum tests and applications," IEEE Trans. Inf. Theory, vol. 58, no. 7, pp. 4215–4229, 2012.
[9] O. Edfors, M. Sandell, J.-J. van de Beek, S. K. Wilson, and P. O. Borjesson, "OFDM channel estimation by singular value decomposition," IEEE Trans. Commun., vol. 46, no. 7, pp. 931–939, Jul. 1998.
[10] Y. Li, L. J. Cimini Jr., and N. R. Sollenberger, "Robust channel estimation for OFDM systems with rapid dispersive fading channels," IEEE Trans. Commun., vol. 46, no. 7, pp. 902–915, Jul. 1998.
[11] H. L. Van Trees, Optimum Array Processing, John Wiley & Sons, New York, 2002.
[12] R. Schmidt, "Multiple emitter location and signal parameter estimation," IEEE Trans. Antennas Propag., vol. 34, no. 3, pp. 276–280, Mar. 1986.
[13] R. Roy and T. Kailath, "ESPRIT-estimation of signal parameters via rotational invariance techniques," IEEE Trans. Acoust., Speech, Signal Process., vol. 37, no. 7, pp. 984–995, Jul. 1989.
[14] S. M. Kay, Modern Spectral Estimation: Theory and Application, Prentice Hall, Englewood Cliffs, NJ, 1988.
[15] S. M. Kay, Fundamentals of Statistical Signal Processing, Volume II: Detection Theory, Prentice Hall, Upper Saddle River, NJ, 1998.
[16] S. M. Kay, Fundamentals of Statistical Signal Processing, Volume I: Estimation Theory, Prentice Hall, Upper Saddle River, NJ, 1993.
