Structure of optimal strategies for remote estimation over Gilbert-Elliott channel with feedback
Abstract
We investigate remote estimation over a Gilbert-Elliott channel with feedback. We assume that the channel state is observed by the receiver and fed back to the transmitter with a one-unit delay. In addition, the transmitter gets ACK/NACK feedback for successful/unsuccessful transmission. Using ideas from team theory, we establish the structure of optimal transmission and estimation strategies and identify a dynamic program to determine optimal strategies with that structure. We then consider first-order autoregressive sources where the noise process has a unimodal and symmetric distribution. Using ideas from majorization theory, we show that the optimal transmission strategy has a threshold structure and the optimal estimation strategy is Kalman-like.
1 Introduction
1.1 Motivation and literature overview
We consider a remote estimation system in which a sensor/transmitter observes a first-order Markov process and causally decides which observations to transmit to a remotely located receiver/estimator. Communication is expensive and takes place over a Gilbert-Elliott channel (which is used to model channels with burst erasures). The channel has two states: off state and on state. When the channel is in the off state, a packet transmitted from the sensor to the receiver is dropped. When the channel is in the on state, a packet transmitted from the sensor to the receiver is received without error. We assume that the channel state is causally observed at the receiver and is fed back to the transmitter with a one-unit delay. Whenever there is a successful reception, the receiver sends an acknowledgment to the transmitter. The feedback is assumed to be noiseless.
At the time instances when the receiver does not receive a packet (either because the sensor did not transmit or because the transmitted packet was dropped), the receiver needs to estimate the state of the source process. There is a fundamental tradeoff between communication cost and estimation accuracy. Transmitting all the time minimizes the estimation error but incurs a high communication cost; not transmitting at all minimizes the communication cost but incurs a high estimation error.
The motivation for remote estimation comes from networked control systems. The earliest instance of the problem was perhaps considered by Marschak [1] in the context of information gathering in organizations. In recent years, several variations of remote estimation have been considered. These include models that consider idealized channels without packet drops [2, 3, 4, 5, 6, 7, 8, 9] and models that consider channels with i.i.d. packet drops [10, 11].
The salient features of remote estimation are as follows:

(F1) The decisions are made sequentially.

(F2) The reconstruction/estimation at the receiver must be done with zero delay.

(F3) When a packet does get through, it is received without noise.
Remote estimation problems may be viewed as a special case of real-time communication [12, 13, 14, 15]. As in real-time communication, the key conceptual difficulty is that the data available at the transmitter and the receiver increases with time. Thus, the domains of the transmission and estimation functions grow with time.
To circumvent this difficulty, one needs to identify sufficient statistics for the data at the transmitter and the data at the receiver. In the real-time communication literature, dynamic team theory (or decentralized stochastic control theory) is used to identify such sufficient statistics as well as to identify a dynamic program to determine the optimal transmission and estimation strategies. Similar ideas are also used in the remote-estimation literature. In addition, feature (F3) allows one to further simplify the structure of optimal transmission and estimation strategies. In particular, when the source is a first-order autoregressive process, majorization theory is used to show that the optimal transmission strategy is characterized by a threshold [5, 6, 7, 11, 10]. Specifically, it is optimal to transmit when the instantaneous distortion due to not transmitting is greater than a threshold. The optimal thresholds can be computed either using dynamic programming [5, 6] or using renewal relationships [16, 10].
All of the existing literature on remote estimation considers either channels with no packet drops or channels with i.i.d. packet drops. In this paper, we consider packet-drop channels with Markovian memory. We identify sufficient statistics at the transmitter and the receiver. When the source is a first-order autoregressive process, we show that threshold-based strategies are optimal but the threshold depends on the previous state of the channel.
1.2 The communication system
Source model
The source is a first-order time-homogeneous Markov process , . For ease of exposition, in the first part of the paper we assume that is a finite set. We will later argue that a similar argument works when is a general measurable space. The transition probability matrix of the source is denoted by , i.e., for any ,
Channel model
The channel is a Gilbert-Elliott channel [17, 18]. The channel state is a binary-valued first-order time-homogeneous Markov process. We use the convention that denotes that the channel is in the off state and denotes that the channel is in the on state. The transition probability matrix of the channel state is denoted by , i.e., for ,
The input alphabet of the channel is , where denotes the event that there is no transmission. The channel output alphabet is , where the symbols and are explained below. At time , the channel input is denoted by and the channel output is denoted by .
The channel is a channel with state. In particular, for any realization of , we have that
(1) 
and
(2) 
Note that the channel output is a deterministic function of the input and the state . In particular, for any and , the channel output is given as follows:
This means that if there is a transmission (i.e., ) and the channel is on (i.e., ), then the receiver observes . However, if there is no transmission (i.e., ) and the channel is on (i.e., ), then the receiver observes ; if the channel is off, then the receiver observes .
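The channel model above can be sketched in a few lines of Python. This is only an illustrative sketch: the labels `NO_TX`, `OFF`, and `ON_IDLE`, the function names, and the encoding of the off and on states as 0 and 1 are assumptions introduced here, not notation from the paper.

```python
import random

# Hypothetical labels (not the paper's notation).
NO_TX = "no_tx"      # channel input when the transmitter stays silent
OFF = "off"          # output seen by the receiver when the channel is off
ON_IDLE = "on_idle"  # output seen when the channel is on but nothing was sent


def channel_output(channel_input, state):
    """Deterministic output map; state 0 is assumed to mean off, 1 on."""
    if state == 0:
        return OFF                # any packet is dropped when the channel is off
    if channel_input == NO_TX:
        return ON_IDLE            # channel is on, but there was no transmission
    return channel_input          # channel is on: symbol is received without error


def next_state(state, q, rng):
    """Sample the next channel state; q[r][s] = P(next = s | current = r)."""
    return 1 if rng.random() < q[state][1] else 0


def simulate_channel(inputs, q, initial_state, seed=0):
    """Pass a sequence of channel inputs through the two-state Markov channel."""
    rng = random.Random(seed)
    state = initial_state
    outputs = []
    for channel_input in inputs:
        outputs.append(channel_output(channel_input, state))
        state = next_state(state, q, rng)
    return outputs
```

With a transition matrix whose second column is all ones, the channel moves to the on state after the first step and stays there, which makes the behaviour easy to check by hand.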
The transmitter
There is no need for channel coding in a remote-estimation setup. Instead, the role of the transmitter is to determine which source realizations need to be transmitted. Let denote the transmitter’s decision. We use the convention that denotes that there is no transmission (i.e., ) and denotes that there is transmission (i.e., ).
Transmission is costly. Each time the transmitter transmits (i.e., ), it incurs a cost of .
The receiver
At time , the receiver generates an estimate of . The quality of the estimate is determined by a distortion function .
1.3 Information structure and problem formulation
It is assumed that the receiver observes the channel state causally. Thus, the information available at the receiver
The estimate is chosen according to
(3) 
where is called the estimation rule at time . The collection for all time is called the estimation strategy.
It is assumed that there is one-step delayed feedback from the receiver to the transmitter.
The transmission decision is chosen according to
(4) 
where is called the transmission rule at time . The collection for all time is called the transmission strategy.
The collection is called a communication strategy. The performance of any communication strategy is given by
(5) 
where the expectation is taken with respect to the joint measure on all system variables induced by the choice of .
We are interested in the following optimization problem.
Problem
In the model described above, identify a communication strategy that minimizes the cost defined in (5).
2 Main results
2.1 Structure of optimal communication strategies
Two types of structural results are established in the real-time communication literature: (i) establishing that part of the data at the transmitter is irrelevant and can be dropped without any loss of optimality; (ii) establishing that the common information between the transmitter and the receiver can be “compressed” using a belief state. Results of the first type were first established by Witsenhausen [12], while results of the second type were first established by Walrand and Varaiya [13].
We establish both types of structural results for remote estimation. First, we show that is irrelevant at the transmitter (Lemma 2.1); then, we use the common information approach of [19] and establish a belief state for the common information between the transmitter and the receiver (Theorem 2.1).
Lemma
For any estimation strategy of the form (3), there is no loss of optimality in restricting attention to transmission strategies of the form
(6) 
The proof idea is similar to [14]. We show that is a controlled Markov process controlled by . See Section 3 for proof.
Now, following [19], for any transmission strategy of the form (6) and any realization of , define as
Furthermore, define conditional probability measures and on as follows: for any ,
We call the pre-transmission belief and the post-transmission belief. Note that when are random variables, then and are also random variables which we denote by and .
For ease of notation, for any and , define the following:

.

For any probability distribution on and any subset of , denotes .

For any probability distribution on , means that .
Lemma
Given any transmission strategy of the form (6):

there exists a function such that
(7) 
there exists a function such that
(8) In particular,
(9)
Note that in (7), we are treating as a row vector and in (9), denotes a Dirac measure centered at . The update equations (7) and (8) are standard nonlinear filtering equations. See Section 3 for proof.
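For a finite source alphabet, the two belief updates can be illustrated with a small sketch. It assumes that the pre-transmission update is a row vector times the source transition matrix (as the note on (7) suggests) and that the post-transmission update follows from the channel model: a Dirac measure when a symbol is received, the prior restricted to the set of non-transmitted symbols when the channel is on but silent, and the unchanged prior when the channel is off. The function names, the labels `OFF` and `ON_IDLE`, and the argument `silent_set` are hypothetical, and the case breakdown is an assumption rather than a transcription of (9).

```python
# Hypothetical output labels (not the paper's notation).
OFF = "off"          # channel was off
ON_IDLE = "on_idle"  # channel was on, transmitter was silent


def pre_transmission_update(xi, P):
    """Row vector times matrix: pi_next[y] = sum_x xi[x] * P[x][y]."""
    n = len(P)
    return [sum(xi[x] * P[x][y] for x in range(n)) for y in range(n)]


def post_transmission_update(pi, silent_set, y):
    """Condition the pre-transmission belief pi on the channel output y.

    silent_set: indices of source symbols for which the current
    prescription says "do not transmit" (an assumed representation).
    y: a source symbol index, ON_IDLE, or OFF.
    """
    n = len(pi)
    if y == OFF:
        # Channel off: the output reveals nothing about the source.
        return list(pi)
    if y == ON_IDLE:
        # Channel on but silent: the source must lie in silent_set.
        mass = sum(pi[x] for x in silent_set)
        if mass == 0:
            raise ValueError("observation has zero probability under pi")
        return [pi[x] / mass if x in silent_set else 0.0 for x in range(n)]
    # A symbol was received: the belief collapses to a point mass.
    return [1.0 if x == y else 0.0 for x in range(n)]
```

For example, with `pi = [0.25, 0.25, 0.5]` and `silent_set = {0, 1}`, observing `ON_IDLE` renormalizes the belief to `[0.5, 0.5, 0.0]`.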
Theorem
In Problem 1.3, we have that:

Structure of optimal strategies: There is no loss of optimality in restricting attention to optimal transmission and estimation strategies of the form:
(10) (11) 
Dynamic program: Let denote the space of probability distributions on . Define value functions and as follows.
(12) and for (13) (14) where,
The proof idea is as follows. Once we restrict attention to transmission strategies of the form (6), the information structure is partial history sharing [19]. Thus, one can use the common information approach of [19] and obtain the structure of optimal strategies. See Section 3 for proof.
Remark 1
The first term in (13) is the expected communication cost, the second term is the expected cost-to-go when the transmitter does not transmit, and the third term is the expected cost-to-go when the transmitter transmits. The first term in (14) is the expected distortion and the second term is the expected cost-to-go.
Remark 2
Although the above model and result are stated for sources with finite alphabets, they extend naturally to general state spaces (including Euclidean spaces) under standard technical assumptions. See [20] for details.
2.2 Optimality of threshold-based strategies for autoregressive source
In this section, we consider a first-order autoregressive source , , where the initial state and for , we have that
(16) 
where and is distributed according to a symmetric and unimodal distribution with probability density function . Furthermore, the per-step distortion is given by , where is an even function that is increasing on . The rest of the model is the same as before.
For the above model, we can further simplify the result of Theorem 2.1. See Section 4 for the proof.
Theorem
For a first-order autoregressive source with symmetric and unimodal disturbance,

Structure of optimal estimation strategy: The optimal estimation strategy is given as follows: , and for ,
(17) 
Structure of optimal transmission strategy: There exist threshold functions such that the following transmission strategy is optimal:
(18)
Remark 3
As long as the receiver can distinguish between the events (i.e., ) and (i.e., and ), the structure of the optimal estimator does not depend on the channel state information at the receiver.
Remark 4
It can be shown that under the optimal strategy, is symmetric and unimodal around and, therefore, is symmetric and unimodal around . Thus, the transmission and estimation strategies in Theorem 2.2 depend on the pre- and post-transmission beliefs only through their means.
Remark 5
Recall that the distortion function is even and increasing. Therefore, the condition can be written as . Thus, the optimal strategy is to transmit if the per-step distortion due to not transmitting is greater than a threshold.
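The threshold structure described in this section can be illustrated with a short simulation. The sketch below is not the paper's algorithm for computing optimal thresholds. It assumes Gaussian noise (one example of a symmetric and unimodal distribution), squared-error distortion, time-invariant thresholds indexed by the previous channel state, a source and estimate that both start at zero, and the encoding of the off and on states as 0 and 1. All names and parameters are hypothetical.

```python
import random


def simulate_threshold_strategy(a, thresholds, q, comm_cost, horizon,
                                initial_state=1, noise_std=1.0, seed=0):
    """Simulate a threshold transmitter with a Kalman-like estimator.

    thresholds[s] is the threshold used when the previous channel state
    was s; q[r][s] = P(next state = s | current state = r).
    Returns (total cost, number of transmissions, number of deliveries).
    """
    rng = random.Random(seed)
    x = 0.0          # source state (assumed to start at zero)
    x_hat = 0.0      # receiver's estimate (assumed to start at zero)
    s_prev = initial_state
    total_cost = 0.0
    transmissions = 0
    deliveries = 0
    for _ in range(horizon):
        x = a * x + rng.gauss(0.0, noise_std)   # autoregressive source update
        prediction = a * x_hat                  # estimate if nothing is received
        # Transmit when the error that silence would cause reaches the
        # threshold selected by the previous channel state.
        transmit = abs(x - prediction) >= thresholds[s_prev]
        state = 1 if rng.random() < q[s_prev][1] else 0
        delivered = transmit and state == 1
        x_hat = x if delivered else prediction
        total_cost += comm_cost * transmit + (x - x_hat) ** 2
        transmissions += transmit
        deliveries += delivered
        s_prev = state
    return total_cost, transmissions, deliveries
```

Setting both thresholds to zero over an always-on channel gives zero distortion and a cost equal to the communication cost times the horizon, while infinite thresholds give no transmissions at all. Intermediate thresholds trade one cost against the other.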
3 Proof of the structural results
3.1 Proof of Lemma 2.1
Arbitrarily fix the estimation strategy and consider the best response strategy at the transmitter. We will show that is an information state at the transmitter.
Given any realization of the system variables , define and . Now, for any , we use the shorthand to denote . Then,
(19) 
where we have added in the conditioning in because is a deterministic function of and follows from the source and the channel models. By marginalizing (19), we get that for any , we have
(20) 
Now, let denote the perstep cost. Recall that . Thus, by (20), we get that
(21) 
3.2 Proof of Lemma 2.2
Consider
(22) 
which is the expression for .
For , we consider the three cases separately. For , we have
(23) 
For , we have
(24) 
3.3 Proof of Theorem 2.1
Once we restrict attention to transmission strategies of the form (6), the information structure is partial history sharing [19]. Thus, one can use the common information approach of [19] and obtain the structure of optimal strategies.
Following [19], we split the information available at each agent into “common information” and “local information”. Common information is the information available to all decision makers in the future; the remaining data at the decision maker is the local information. Thus, at the transmitter, the common information is and the local information is . Similarly, at the receiver, the common information is and the local information is . When the transmitter makes a decision, the state (sufficient for input-output mapping) of the system is ; when the receiver makes a decision, the state of the system is . By [19, Proposition 1], we get that the sufficient statistic for the common information at the transmitter is
and the sufficient statistic for the common information at the receiver is
Note that is equivalent to and is equivalent to . Therefore, by [19, Theorem 2], there is no loss of optimality in restricting attention to transmission strategies of the form (10) and estimation strategies of the form
(29) 
Furthermore, the dynamic program of Theorem 2.1 follows from [19, Theorem 3].
4 Proof of optimality of threshold-based strategies for autoregressive source
4.1 A change of variables
Define a process as follows: and for ,
Note that is a function of . Next, define processes , , and as follows:
The processes and are related as follows: , , and for
and  
Since , we have that .
It turns out that it is easier to work with the processes , , and rather than and .
Next, redefine the pre- and post-transmission beliefs in terms of the error process. With a slight abuse of notation, we still denote the probability densities of the pre- and post-transmission beliefs by and . In particular, is the conditional pdf of given and is the conditional pdf of given .
Let indicate whether the transmission was successful. In particular,
We use to denote the realization of . Note that is a deterministic function of and .
The time evolution of and is similar to that in Lemma 2.2. In particular, we have
Lemma
Given any transmission strategy of the form (4):

there exists a function such that
(30) In particular,
(31) where given by is the conditional probability density of , is the probability density function of and is the convolution operation.

there exists a function such that
(32) In particular,
(33)
The key difference between Lemmas 2.2 and 4.1 (and the reason that we work with the error process rather than ) is that the function in (32) depends on rather than . Consequently, the dynamic program of Theorem 2.1 is now given by
(34)  
and for  
(35)  
(36) 
where,
Again, note that due to the change of variables, the expression for does not depend on the transmitted symbol. Consequently, the expression for is simpler than that in Theorem 2.1.
4.2 Symmetric unimodal distributions and their properties
A probability density function on the reals is said to be symmetric and unimodal () around if for any , and is nondecreasing in the interval and nonincreasing in the interval .
Given , a prescription is called threshold-based around if there exists such that
Let denote the family of all threshold-based prescriptions around .
Now, we state some properties of symmetric and unimodal distributions.
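The two definitions above can be made concrete with a small numerical sketch. It assumes that a density is represented by samples on a uniform grid with an odd number of points, centred on the middle sample, and that a threshold-based prescription transmits when the distance from the centre reaches the threshold. The function names, the tolerance, and the choice of a non-strict inequality are assumptions made for illustration.

```python
def is_symmetric_unimodal(values, tol=1e-12):
    """Check symmetry and unimodality of density samples around the middle index.

    values: samples on a uniform grid with an odd number of points, so that
    the candidate centre is the middle sample (an assumed representation).
    """
    n = len(values)
    if n % 2 == 0:
        raise ValueError("expected an odd number of samples")
    c = n // 2
    symmetric = all(abs(values[c - i] - values[c + i]) <= tol
                    for i in range(c + 1))
    nondecreasing_left = all(values[i] <= values[i + 1] + tol
                             for i in range(c))
    nonincreasing_right = all(values[i] >= values[i + 1] - tol
                              for i in range(c, n - 1))
    return symmetric and nondecreasing_left and nonincreasing_right


def threshold_prescription(center, k):
    """Return a prescription that maps e to 1 (transmit) iff |e - center| >= k."""
    return lambda e: 1 if abs(e - center) >= k else 0
```

For instance, the samples `[1, 2, 3, 2, 1]` pass the check, while `[1, 3, 2, 3, 1]` are symmetric but not unimodal and therefore fail.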
Property
If is , then
For , the above property is a special case of [5, Lemma 12]. The result for general follows from a change of variables.
Property
If is and , then for any , is .
Proof:
We prove the result for each separately. Recall the update of given by (33). For , and hence is . For ,