Structure of optimal strategies for remote estimation over Gilbert-Elliott channel with feedback
We investigate remote estimation over a Gilbert-Elliott channel with feedback. We assume that the channel state is observed by the receiver and fed back to the transmitter with one unit delay. In addition, the transmitter gets ack/nack feedback for successful/unsuccessful transmission. Using ideas from team theory, we establish the structure of optimal transmission and estimation strategies and identify a dynamic program to determine optimal strategies with that structure. We then consider first-order autoregressive sources where the noise process has unimodal and symmetric distribution. Using ideas from majorization theory, we show that the optimal transmission strategy has a threshold structure and the optimal estimation strategy is Kalman-like.
1.1 Motivation and literature overview
We consider a remote estimation system in which a sensor/transmitter observes a first-order Markov process and causally decides which observations to transmit to a remotely located receiver/estimator. Communication is expensive and takes place over a Gilbert-Elliott channel (which is used to model channels with burst erasures). The channel has two states: off state and on state. When the channel is in the off state, a packet transmitted from the sensor to the receiver is dropped. When the channel is in the on state, a packet transmitted from the sensor to the receiver is received without error. We assume that the channel state is causally observed at the receiver and is fed back to the transmitter with one-unit delay. Whenever there is a successful reception, the receiver sends an acknowledgment to the transmitter. The feedback is assumed to be noiseless.
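As a concrete illustration, the channel state above is a two-state Markov chain and can be simulated as follows. The transition probabilities used here are placeholder values for the sketch, not parameters from the paper.

```python
import random

# State 0 = off (packet dropped), state 1 = on (packet delivered).
# The transition probabilities are illustrative, not from the paper.

def simulate_channel_states(T, p_off_to_on=0.3, p_on_to_off=0.1, s0=1, rng=None):
    """Return a length-T trajectory of the two-state Markov channel."""
    rng = rng or random.Random(0)
    states = [s0]
    for _ in range(T - 1):
        s = states[-1]
        if s == 0:
            s_next = 1 if rng.random() < p_off_to_on else 0
        else:
            s_next = 0 if rng.random() < p_on_to_off else 1
        states.append(s_next)
    return states

states = simulate_channel_states(20)
print(states)
```

Because the off-to-on and on-to-off probabilities are small, off states tend to occur in runs, which is exactly the burst-erasure behavior the Gilbert-Elliott model captures.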
At the time instances when the receiver does not receive a packet (either because the sensor did not transmit or because the transmitted packet was dropped), the receiver needs to estimate the state of the source process. There is a fundamental trade-off between communication cost and estimation accuracy. Transmitting all the time minimizes the estimation error but incurs a high communication cost; not transmitting at all minimizes the communication cost but incurs a high estimation error.
The motivation for remote estimation comes from networked control systems. The earliest instance of the problem was perhaps considered by Marschak in the context of information gathering in organizations. In recent years, several variations of remote estimation have been considered. These include models that consider idealized channels without packet drops [2, 3, 4, 5, 6, 7, 8, 9] and models that consider channels with i.i.d. packet drops [10, 11].
The salient features of remote estimation are as follows:
(F1) The decisions are made sequentially.
(F2) The reconstruction/estimation at the receiver must be done with zero delay.
(F3) When a packet does get through, it is received without noise.
Remote estimation problems may be viewed as a special case of real-time communication [12, 13, 14, 15]. As in real-time communication, the key conceptual difficulty is that the data available at the transmitter and the receiver increase with time. Thus, the domains of the transmission and estimation functions grow with time.
To circumvent this difficulty, one needs to identify sufficient statistics for the data at the transmitter and the data at the receiver. In the real-time communication literature, dynamic team theory (or decentralized stochastic control theory) is used to identify such sufficient statistics as well as a dynamic program to determine the optimal transmission and estimation strategies. Similar ideas are also used in the remote-estimation literature. In addition, feature (F3) allows one to further simplify the structure of optimal transmission and estimation strategies. In particular, when the source is a first-order autoregressive process, majorization theory is used to show that the optimal transmission strategy is characterized by a threshold [5, 6, 7, 11, 10]: it is optimal to transmit when the instantaneous distortion due to not transmitting is greater than a threshold. The optimal thresholds can be computed either using dynamic programming [5, 6] or using renewal relationships [16, 10].
All of the existing literature on remote-estimation considers either channels with no packet drops or channels with i.i.d. packet drops. In this paper, we consider packet drop channels with Markovian memory. We identify sufficient statistics at the transmitter and the receiver. When the source is a first-order autoregressive process, we show that threshold-based strategies are optimal but the threshold depends on the previous state of the channel.
1.2 The communication system
The source is a first-order time-homogeneous Markov process , . For ease of exposition, in the first part of the paper we assume that is a finite set. We will later argue that similar results hold when is a general measurable space. The transition probability matrix of the source is denoted by , i.e., for any ,
The channel is a Gilbert-Elliott channel [17, 18]. The channel state is a binary-valued first-order time-homogeneous Markov process. We use the convention that denotes that the channel is in the off state and denotes that the channel is in the on state. The transition probability matrix of the channel state is denoted by , i.e., for ,
The input alphabet of the channel is , where denotes the event that there is no transmission. The channel output alphabet is , where the symbols and are explained below. At time , the channel input is denoted by and the channel output is denoted by .
The channel is a channel with state. In particular, for any realization of , we have that
Note that the channel output is a deterministic function of the input and the state . In particular, for any and , the channel output is given as follows:
This means that if there is a transmission (i.e., ) and the channel is on (i.e., ), then the receiver observes . However, if there is no transmission (i.e., ) and the channel is on (i.e., ), then the receiver observes ; if the channel is off, then the receiver observes .
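The deterministic output rule can be sketched as a small function. Here `None` stands in for the blank (no-transmission) input symbol, and the strings 'E0' and 'E1' are illustrative placeholders for the two no-reception output symbols, not the paper's notation.

```python
# Hedged sketch of the channel output rule: the output is a deterministic
# function of the input x and the channel state s.
# None  = no transmission (the blank input symbol)
# 'E0'  = placeholder output for "nothing sent, channel on"
# 'E1'  = placeholder output for "channel off"

def channel_output(x, s):
    """Deterministic channel output as a function of input x and state s."""
    if s == 0:          # channel off: any packet is dropped
        return 'E1'
    if x is None:       # channel on but no transmission
        return 'E0'
    return x            # channel on and transmission: received noiselessly

assert channel_output(3.7, 1) == 3.7
assert channel_output(3.7, 0) == 'E1'
assert channel_output(None, 1) == 'E0'
```

Note that the two no-reception symbols carry different information: 'E0' tells the receiver that the transmitter chose not to transmit, while 'E1' only reveals that the channel was off.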
There is no need for channel coding in a remote-estimation setup. Instead, the role of the transmitter is to determine which source realizations need to be transmitted. Let denote the transmitter’s decision. We use the convention that denotes that there is no transmission (i.e., ) and denotes that there is transmission (i.e., ).
Transmission is costly. Each time the transmitter transmits (i.e., ), it incurs a cost of .
At time , the receiver generates an estimate of . The quality of the estimate is determined by a distortion function .
1.3 Information structure and problem formulation
It is assumed that the receiver observes the channel state causally. Thus, the information available at the receiver is
The estimate is chosen according to
where is called the estimation rule at time . The collection for all time is called the estimation strategy.
It is assumed that there is one-step delayed feedback from the receiver to the transmitter.
The transmission decision is chosen according to
where is called the transmission rule at time . The collection for all time is called the transmission strategy.
The collection is called a communication strategy. The performance of any communication strategy is given by
where the expectation is taken with respect to the joint measure on all system variables induced by the choice of .
We are interested in the following optimization problem.
In the model described above, identify a communication strategy that minimizes the cost defined in (5).
2 Main results
2.1 Structure of optimal communication strategies
Two types of structural results have been established in the real-time communication literature: (i) establishing that part of the data at the transmitter is irrelevant and can be dropped without any loss of optimality; (ii) establishing that the common information between the transmitter and the receiver can be “compressed” using a belief state. The first type of result was first established by Witsenhausen, while the second was first established by Walrand and Varaiya.
We establish both types of structural results for remote estimation. First, we show that is irrelevant at the transmitter (Lemma 2.1); then, we use the common information approach of  and establish a belief-state for the common information between the transmitter and the receiver (Theorem 2.1).
For any estimation strategy of the form (3), there is no loss of optimality in restricting attention to transmission strategies of the form
Furthermore, define conditional probability measures and on as follows: for any ,
We call the pre-transmission belief and the post-transmission belief. Note that when are random variables, then and are also random variables which we denote by and .
For the ease of notation, for any and , define the following:
For any probability distribution on and any subset of , denotes .
For any probability distribution on , means that .
Given any transmission strategy of the form (6):
there exists a function such that
there exists a function such that
Note that in (7), we are treating as a row-vector and in (9), denotes a Dirac measure centered at . The update equations (7) and (8) are standard non-linear filtering equations. See Section 3 for proof.
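For a finite-alphabet source, the filtering updates can be sketched numerically as follows. The prescription f, the placeholder output symbols 'E0' (nothing sent, channel on) and 'E1' (channel off), and the handling of each output are illustrative assumptions consistent with the channel model above, not the paper's notation.

```python
import numpy as np

# Minimal sketch of the non-linear filtering updates for a finite-alphabet
# source. P is the source transition matrix; f maps each source symbol to
# transmit (1) or not (0). Beliefs are row vectors over the source alphabet.

def pre_transmission_update(post_belief, P):
    """Propagate the belief through the source dynamics (row vector times P)."""
    return post_belief @ P

def post_transmission_update(pre_belief, f, y):
    """Condition the belief on the channel output y."""
    n = len(pre_belief)
    if isinstance(y, int):              # successful reception: Dirac measure at y
        xi = np.zeros(n)
        xi[y] = 1.0
        return xi
    if y == 'E1':                       # channel off: no information about the source
        return pre_belief.copy()
    # y == 'E0': receiver learns f(x) = 0; condition and renormalize
    # (assumes the conditioning event has positive probability)
    mask = np.array([1.0 if f(i) == 0 else 0.0 for i in range(n)])
    xi = pre_belief * mask
    return xi / xi.sum()
```

For example, with a three-symbol source and a prescription that transmits only symbol 2, observing 'E0' removes all mass from symbol 2 and renormalizes the remaining mass.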
In Problem 1.3, we have that:
Structure of optimal strategies: There is no loss of optimality in restricting attention to optimal transmission and estimation strategies of the form:
Dynamic program: Let denote the space of probability distributions on . Define value functions and as follows.
Equation (12) specifies the boundary condition and, for each earlier time step, (13) and (14) give the recursions for the pre- and post-transmission value functions, respectively.
The proof idea is as follows. Once we restrict attention to transmission strategies of the form (6), the information structure is partial history sharing . Thus, one can use the common information approach of  and obtain the structure of optimal strategies. See Section 3 for proof.
The first term in (13) is the expected communication cost, the second term is the expected cost-to-go when the transmitter does not transmit, and the third term is the expected cost-to-go when the transmitter transmits. The first term in (14) is the expected distortion and the second term is the expected cost-to-go.
Although the above model and result are stated for sources with finite alphabets, they extend naturally to general state spaces (including Euclidean spaces) under standard technical assumptions. See  for details.
2.2 Optimality of threshold-based strategies for autoregressive source
In this section, we consider a first-order autoregressive source , , where the initial state and for , we have that
where and is distributed according to a symmetric and unimodal distribution with probability density function . Furthermore, the per-step distortion is given by , where is an even function that is increasing on . The rest of the model is the same as before.
For a first-order autoregressive source with symmetric and unimodal disturbance,
Structure of optimal estimation strategy: The optimal estimation strategy is given as follows: , and for ,
Structure of optimal transmission strategy: There exist threshold functions such that the following transmission strategy is optimal:
As long as the receiver can distinguish between the events (i.e., ) and (i.e., and ), the structure of the optimal estimator does not depend on the channel state information at the receiver.
It can be shown that under the optimal strategy, is symmetric and unimodal around and, therefore, is symmetric and unimodal around . Thus, the transmission and estimation strategies in Theorem 2.2 depend on the pre- and post-transmission beliefs only through their means.
Recall that the distortion function is even and increasing. Therefore, the condition can be written as . Thus, the optimal strategy is to transmit if the per-step distortion due to not transmitting is greater than a threshold.
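Putting Theorem 2.2 together, a minimal simulation of the threshold strategy and the Kalman-like estimator might look as follows. The source coefficient, Gaussian noise, channel transition probabilities, and the two thresholds (one per previous channel state) are made-up values for this sketch; the thresholds are not the optimal ones.

```python
import random

# Illustrative simulation: AR(1) source x_{t+1} = a*x_t + w_t with Gaussian
# noise, a two-state Markov channel, a threshold transmission rule that
# depends on the previous channel state, and a Kalman-like estimator.

def next_channel_state(s_prev, rng, p_off_to_on=0.3, p_on_to_off=0.1):
    """One step of the two-state Markov channel (0 = off, 1 = on)."""
    if s_prev == 0:
        return 1 if rng.random() < p_off_to_on else 0
    return 0 if rng.random() < p_on_to_off else 1

def simulate(T=2000, a=0.9, thresholds=(1.0, 1.0), seed=0):
    """thresholds[s] is the threshold used when the previous channel state was s."""
    rng = random.Random(seed)
    x, xhat, s_prev = 0.0, 0.0, 1
    total_sq_error, transmissions = 0.0, 0
    for _ in range(T):
        x = a * x + rng.gauss(0.0, 1.0)
        s = next_channel_state(s_prev, rng)
        # transmit iff the innovation exceeds the threshold for s_prev
        transmit = abs(x - a * xhat) >= thresholds[s_prev]
        transmissions += transmit
        if transmit and s == 1:      # packet delivered without error
            xhat = x
        else:                        # Kalman-like prediction step
            xhat = a * xhat
        total_sq_error += (x - xhat) ** 2
        s_prev = s
    return total_sq_error / T, transmissions / T
```

Lowering the thresholds trades a higher transmission rate for lower distortion; comparing an always-transmit policy against a never-transmit policy makes the trade-off visible.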
3 Proof of the structural results
3.1 Proof of Lemma 2.1
Arbitrarily fix the estimation strategy and consider the best response strategy at the transmitter. We will show that is an information state at the transmitter.
Given any realization of the system variables , define and . Now, for any , we use the shorthand to denote . Then,
where we have added in the conditioning in because is a deterministic function of and follows from the source and the channel models. By marginalizing (19), we get that for any , we have
Now, let denote the per-step cost. Recall that . Thus, by (20), we get that
3.2 Proof of Lemma 2.1
which is the expression for .
For , we consider the three cases separately. For , we have
For , we have
Now, when , we have that
3.3 Proof of Theorem 2.1
Once we restrict attention to transmission strategies of the form (6), the information structure is partial history sharing . Thus, one can use the common information approach of  and obtain the structure of optimal strategies.
Following , we split the information available at each agent into a “common information” and “local information”. Common information is the information available to all decision makers in the future; the remaining data at the decision maker is the local information. Thus, at the transmitter, the common information is and the local information is . Similarly, at the receiver, the common information is and the local information is . When the transmitter makes a decision, the state (sufficient for input-output mapping) of the system is ; when the receiver makes a decision, the state of the system is . By [19, Proposition 1], we get that the sufficient statistic for the common information at the transmitter is
and the sufficient statistic for the common information at the receiver is
Note that is equivalent to and is equivalent to . Therefore, by [19, Theorem 2], there is no loss of optimality in restricting attention to transmission strategies of the form (10) and estimation strategies of the form
4 Proof of optimality of threshold-based strategies for autoregressive source
4.1 A change of variables
Define a process as follows: and for ,
Note that is a function of . Next, define processes , , and as follows:
The processes and are related as follows: , , and for
Since , we have that .
It turns out that it is easier to work with the processes , , and rather than and .
Next, redefine the pre- and post-transmission beliefs in terms of the error process. With a slight abuse of notation, we still denote the probability densities of the pre- and post-transmission beliefs by and . In particular, is the conditional pdf of given and is the conditional pdf of given .
Let denote whether or not the transmission was successful. In particular,
We use to denote the realization of . Note that is a deterministic function of and .
The time evolutions of and are similar to those in Lemma 2.1. In particular, we have
Given any transmission strategy of the form (4):
there exists a function such that
where given by is the conditional probability density of , is the probability density function of and is the convolution operation.
there exists a function such that
The key difference between Lemmas 2.1 and 4.1 (and the reason that we work with the error process rather than ) is that the function in (32) depends on rather than . Consequently, the dynamic program of Theorem 2.1 is now given by
Again, note that due to the change of variables, the expression for does not depend on the transmitted symbol. Consequently, the expression for is simpler than that in Theorem 2.1.
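The density update of Lemma 4.1 can be sketched numerically: propagate the post-transmission density through the source gain by a change of variables, then convolve with the noise density. The uniform grid, the Gaussian noise, and all parameter values below are illustrative assumptions.

```python
import numpy as np

# Numerical sketch of the pre-transmission density update for the error
# process: compute the density of a*E + W from the density xi of E and the
# noise density, via rescaling followed by discrete convolution on a grid.

def propagate_density(xi, grid, a, noise_pdf):
    """Density of a*E + W, where E has density xi on `grid` and W ~ noise_pdf."""
    dx = grid[1] - grid[0]
    # density of a*E via change of variables: (1/|a|) * xi(e/a)
    scaled = np.interp(grid / a, grid, xi, left=0.0, right=0.0) / abs(a)
    # discrete convolution with the noise density (grid assumed symmetric
    # about 0 with an odd number of points, so mode='same' stays centered)
    out = np.convolve(scaled, noise_pdf(grid), mode='same') * dx
    return out / (out.sum() * dx)    # renormalize to absorb grid truncation
```

As a sanity check, pushing a narrow density around zero through this update with standard Gaussian noise should return a density with mean near zero and variance near the noise variance.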
4.2 Symmetric unimodal distributions and their properties
A probability density function on the reals is said to be symmetric and unimodal () around if for any , and is non-decreasing in the interval and non-increasing in the interval .
Given , a prescription is called threshold-based around if there exists such that
Let denote the family of all threshold-based prescriptions around .
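A threshold-based prescription can be written as a small helper function. The names c (the center) and k (the threshold) are illustrative; whether the boundary point itself transmits is a convention chosen for this sketch.

```python
# A threshold-based prescription around center c with threshold k:
# transmit (1) exactly when the error is more than k away from c.
# Treating the boundary as "do not transmit" is a convention of this sketch.

def threshold_prescription(c, k):
    return lambda e: 0 if abs(e - c) <= k else 1

f = threshold_prescription(0.0, 2.0)
assert f(1.5) == 0
assert f(-2.5) == 1
```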
Now, we state some properties of symmetric and unimodal distributions.
If is , then
For , the above property is a special case of [5, Lemma 12]. The result for general follows from a change of variables.
If is and , then for any , is .
We prove the result for each separately. Recall the update of given by (33). For , and hence is . For ,