Optimal Remote Estimation Over Use-Dependent Packet-Drop Channels - Extended Version
Consider a discrete-time remote estimation system formed by an encoder, a transmission policy, a channel, and a remote estimator. The encoder observes a random process that the remote estimator seeks to estimate based on information sent to it by the encoder via the channel. The channel is affected by Bernoulli drops. The instantaneous probability of a drop is governed by a finite state machine (FSM). The state of the FSM is referred to as the channel state. At each time step, the encoder decides whether to attempt a transmission through the packet-drop link. The sequence of encoder decisions is the input to the FSM. This paper seeks to design an encoder, transmission policy and remote estimator that minimize a finite-horizon mean squared error cost. We present two structural results. In the first result, we assume that the process to be estimated is white and Gaussian, and we show that there is an optimal transmission policy governed by a threshold on the estimation error. The second result characterizes optimal symmetric transmission policies for the case when the measured process is the state of a scalar linear time-invariant plant driven by white Gaussian noise. Use-dependent packet-drop channels can be used to quantify the effect of transmission on channel quality when the encoder is powered by energy harvesting. An application to a mixed initiative system in which a human operator performs visual search tasks is also presented.
David Ward, Nuno C. Martins
Department of Electrical and Computer Engineering and the Institute for Systems Research, University of Maryland, College Park, MD 20742, USA (e-mail: dward2@umd.edu, nmartins@umd.edu).
Encoders often select varying channel modes to enhance transmission performance in the presence of power and energy constraints. For example, in battery-operated wireless communication systems with energy harvesting, the decision of whether to attempt transmission must be made at each time step. The charge level of the battery induces memory in the channel, which must be monitored for use by the transmission policy. We define a class of use-dependent packet-drop channels to model the effect of attempted transmissions on current and future performance, which in our case is quantified by the probability that an attempted transmission is dropped. The memory in use-dependent packet-drop channels is modeled by a finite state machine (FSM). The state of the FSM, or channel state, determines the instantaneous probability of drop. In our formulation, the only relevant input to the FSM is the time sequence of decisions of whether to attempt a transmission.
We consider a system formed by a remote estimator, a transmission policy, a use-dependent packet-drop channel and an encoder. The estimator produces an estimate of the state of a linear time-invariant plant that is accessible to the encoder. The estimate is based on information transmitted from the encoder to the estimator via the channel. The encoder and transmission policy also have access to past transmission decisions and channel feedback on the realization of current and past drops. The encoder determines what to transmit over the channel and the transmission policy determines when to attempt a transmission. The main goal of this paper is to investigate encoders, transmission policies and remote estimators that jointly minimize the mean squared state estimation error over a finite time horizon. The problem formulation is given below.
The following are our two main results characterizing the structure of optimal transmission policies for our problem.
In the first result, we assume that the process to be estimated is white and Gaussian. We show that the optimal transmission policy is of the threshold type, meaning that the encoder chooses to attempt transmission when the process takes values outside a certain interval. The characteristics of the use-dependent packet-drop channel determine the endpoints of this interval. In general, the interval may not be symmetric about the origin, even when the process is zero-mean.
In the second result, the process to be estimated is the state of a scalar linear time-invariant plant driven by white Gaussian noise, for which we seek to obtain an optimal symmetric transmission policy. We show that if the channel performs satisfactorily in all channel states, then there exists at least one symmetric threshold that, when applied to the estimation error, leads to a transmission policy that is optimal among all symmetric strategies. We present a numerical example that illustrates, for specific classes of use-dependent channels, that threshold policies are optimal among all symmetric strategies, even when there are no restrictions on the performance of the channel.
The remainder of the paper is organized as follows. We first give the formal definition of use-dependent packet-drop channels and formulate the problem. We then present the technical results and outline two engineering applications of our formulation. The Appendix presents basic concepts on quasi-convex functions.
In Lipsa and Martins (2009) and Lipsa and Martins (2011), an estimation problem over a packet-drop channel with communication costs is considered. In contrast to Lipsa and Martins (2009) and Lipsa and Martins (2011), here we introduce a channel state and do not consider explicit communication costs. In our formulation, the channel state, which depends on current and past transmission decisions, and its impact on performance create an implicit communication cost. For example, in the energy harvesting application explained later in the paper, there is no explicit cost for attempting a transmission. However, attempting a transmission reduces the energy available for future transmissions, which causes performance degradation that can be viewed as an implicit cost for attempting a transmission.
Considering costly measurements (or transmissions) in estimation and control problems has a long history and has been modeled in many ways. In Athans (1972), one of several possible measurements with different observation costs is selected to minimize a combination of error and observation cost. In Shamaiah et al. (2010), a subset of the measurements is selected in order to minimize the log-determinant of the error covariance. In Sinopoli et al. (2004), the arrival of observations is a random process and the convergence of the error covariance is studied. In Hajek et al. (2008), the task is to locate a mobile agent and the observation cost is the expected number of observations that must be made to do so.
In Weissman (2010), the capacity of channels with action-dependent states is studied. Although our problem formulation is similar to that of Weissman (2010) in motivation, it differs in several respects. In contrast to Weissman (2010), we consider finite time horizons, a mean squared error cost and a new class of packet-drop channels.
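We use calligraphic font ($\mathcal{F}$) to denote deterministic functions, capital letters ($X$) to represent random variables and lower case letters ($x$) to represent realizations of the random variables. Let $\mathcal{N}(0, \sigma^2)$ denote the Gaussian distribution with zero mean and variance $\sigma^2$. We use $X_{0:k}$ to denote the finite sequence $(X_0, \ldots, X_k)$. The real line is denoted with $\mathbb{R}$ and a subset of $\mathbb{R}$ is denoted with double barred font, such as $\mathbb{A}$. The indicator function of a set $\mathbb{A}$ is defined as
$$\mathbb{1}_{\mathbb{A}}(x) = \begin{cases} 1 & \text{if } x \in \mathbb{A}, \\ 0 & \text{otherwise.} \end{cases}$$
The expectation operator is denoted with $E[\cdot]$. By $\mathcal{F}(x^{+})$ we mean the limit of $\mathcal{F}$ at $x$ from the right.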
Consider the following scalar linear time-invariant system
$$X_{k+1} = a X_k + W_k,$$
where $X_k$ is the state, $a$ is a real constant, and $W_k$ is independent and identically distributed Gaussian noise with zero mean and variance $\sigma_W^2$. The initial state $X_0$ is known.
Observations are made by the encoder and transmitted to the remote estimator over a use-dependent packet-drop channel, which is defined below.
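Definition 1 (Use-dependent packet-drop channels). Let a finite set $\mathbb{Q}$ and functions $\mathcal{F}: \mathbb{Q} \times \{0,1\} \to \mathbb{Q}$ and $\mathcal{P}: \mathbb{Q} \to [0,1]$ be given, where $\mathbb{Q}$ represents the set of possible states of a finite state machine (FSM). The channel inputs are $V_k$ and $R_k$, which take values in $\mathbb{R}$ and $\{0,1\}$, respectively. In this model $V_k$ represents the information to be transmitted, while the decision to attempt a transmission (or not) is represented by $R_k = 1$ ($R_k = 0$). The channel output $Y_k$ takes values in $\mathbb{R} \cup \{\mathfrak{E}, \varnothing\}$ and is determined as follows
$$Y_k = \begin{cases} V_k & \text{if } R_k = 1 \text{ and } L_k = 0, \\ \mathfrak{E} & \text{if } R_k = 1 \text{ and } L_k = 1, \\ \varnothing & \text{if } R_k = 0, \end{cases}$$
where $\mathfrak{E}$ indicates a dropped transmission and $\varnothing$ indicates that no transmission was attempted. Here, $L_k$ is a Bernoulli process characterized by $P(L_k = 1 \mid Q_k = q) = \mathcal{P}(q)$, where $Q_k$ is the state of the FSM updated by
$$Q_{k+1} = \mathcal{F}(Q_k, R_k).$$
The FSM's initial state $Q_0$ is known. Here, $\mathcal{F}$ and $\mathcal{P}$ model the effect of the input on the transitions among channel states and the probability of drop as a function of the channel state, respectively.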
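As an illustration of Definition 1, the following minimal Python sketch simulates one use of such a channel. The class name `UseDependentChannel`, the output markers `DROPPED` and `NO_ATTEMPT`, and the argument names are hypothetical choices made for this sketch and do not come from the paper.

```python
import random
from dataclasses import dataclass, field
from typing import Callable, Hashable, Union

# Hypothetical markers for the two non-numeric channel outputs.
DROPPED = "dropped"        # a transmission was attempted but dropped
NO_ATTEMPT = "no_attempt"  # no transmission was attempted


@dataclass
class UseDependentChannel:
    """Packet-drop channel whose drop probability depends on an FSM state."""
    transition: Callable[[Hashable, int], Hashable]  # (state, decision) -> next state
    drop_prob: Callable[[Hashable], float]           # state -> probability of drop
    state: Hashable                                  # known initial channel state
    rng: random.Random = field(default_factory=random.Random)

    def use(self, value: float, attempt: int) -> Union[float, str]:
        """Apply one channel use and advance the FSM.

        `attempt` is 1 when a transmission is attempted and 0 otherwise.
        """
        if attempt == 1:
            # The drop is Bernoulli with a state-dependent probability.
            dropped = self.rng.random() < self.drop_prob(self.state)
            output = DROPPED if dropped else value
        else:
            output = NO_ATTEMPT
        # Only the decision sequence drives the channel state.
        self.state = self.transition(self.state, attempt)
        return output
```

The receiver can tell a dropped attempt from no attempt because the two markers differ, which is the feature the structural results rely on.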
In figure 1, the dotted box represents the use-dependent packet-drop channel. Two applications of use-dependent packet-drop channels are discussed later in the paper.
At each time step, the transmission policy determines whether a transmission is attempted, based on the plant history and drop history. The remote estimator produces the state estimate, based on the channel output history and the transmission history. The encoder determines what is transmitted, based on the plant history and drop history.
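We seek to solve the following problem. For a finite time horizon, find the encoder, transmission policy and remote estimator that jointly minimize the expected sum of squared estimation errors over the horizon.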
For any encoder and transmission policy, the optimal remote estimator is the conditional mean of the current state given the information available to the estimator. Also, an optimal encoder policy transmits only the current state. This is evident from the Markov nature of the plant state and the information already available to the remote estimator. The channel drops can be calculated from the channel output; thus, the only new information to send the remote estimator is the current state.
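Problem 2 (Main Problem). For a finite time horizon, find the transmission policy that minimizes the expected sum of squared estimation errors over the horizon, where the optimal encoder and the optimal remote estimator are used.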
In this section, we present our technical results. We begin by defining threshold transmission policies.
Estimation error is denoted as .
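A function $\mathcal{T}: \mathbb{R} \to \{0,1\}$ is a threshold function if there are constants $\tau_1$ and $\tau_2$, with $\tau_1 \leq \tau_2$, such that:
$$\mathcal{T}(e) = \begin{cases} 0 & \text{if } \tau_1 \leq e \leq \tau_2, \\ 1 & \text{otherwise.} \end{cases}$$
A function $\mathcal{T}: \mathbb{R} \to \{0,1\}$ is a symmetric threshold function if there is a constant $\tau \geq 0$, such that:
$$\mathcal{T}(e) = \begin{cases} 0 & \text{if } |e| \leq \tau, \\ 1 & \text{otherwise.} \end{cases}$$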
A transmission policy is a threshold policy if the decision to transmit depends only on the current error and channel state, through a threshold function of the error whose two constants may depend on the channel state and on the time step.
Notice that the current channel state and error are a function of the history and of the previous policies.
A transmission policy is a symmetric threshold policy if the decision to transmit depends only on the current error and channel state, through a symmetric threshold function of the error whose constant may depend on the channel state and on the time step.
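The two policy classes defined above can be written as small decision rules. The sketch below is illustrative only: the function names, the dictionary of per-state thresholds, and the convention that the interval endpoints belong to the no-transmission region are assumptions of this example.

```python
from typing import Dict, Hashable, Tuple


def threshold_decision(error: float, lower: float, upper: float) -> int:
    """Return 1 (attempt) when the error lies outside [lower, upper], else 0."""
    if lower > upper:
        raise ValueError("lower threshold must not exceed upper threshold")
    return 0 if lower <= error <= upper else 1


def symmetric_threshold_decision(error: float, tau: float) -> int:
    """Return 1 (attempt) when the error magnitude exceeds tau, else 0."""
    return threshold_decision(error, -tau, tau)


def threshold_policy(error: float,
                     channel_state: Hashable,
                     thresholds: Dict[Hashable, Tuple[float, float]]) -> int:
    """Threshold policy: the interval used depends on the channel state."""
    lower, upper = thresholds[channel_state]
    return threshold_decision(error, lower, upper)
```

A threshold policy with an interval such as `(-1.0, 2.0)` is not symmetric in the error, which is the kind of policy the first structural result allows.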
To investigate the structure of solutions to Problem 2, we start with the case in which the plant has no memory, that is, the constant multiplying the state in the plant dynamics is zero. The system state is then equal to the driving noise, so the process to be estimated is white and Gaussian.
Since the estimation error is independent at each step, there are optimal transmission policies that only depend on the channel state and current error.
In this case, we reformulate Problem 2 as a dynamic program to show that there are optimal transmission policies of the threshold type, which may not be symmetric. An optimal transmission policy that is not symmetric in the estimation error is surprising since the cost function is symmetric in the error and the random process is zero-mean and symmetric.
We utilize the results in Vasconcelos and Martins (2013). In Vasconcelos and Martins (2013), a single stage estimation problem over a collision channel with two transmitters is studied. If both transmit then the remote estimator receives a collision symbol and if neither transmits a no-transmission symbol is received. The result in Vasconcelos and Martins (2013) states that the optimal policy for each transmitter is of the threshold type.
In Problem 2, when a transmission is attempted but is dropped, the remote estimator receives a drop symbol. This is distinguishable from the no-transmission symbol received when no transmission is attempted. In Vasconcelos and Martins (2013), because the remote estimator can distinguish between a collision and a no-transmission, the optimal policies are of the threshold type and may not be symmetric. Similarly for Problem 2, the remote estimator's ability to distinguish a failed transmission from no transmission leads to optimal policies that are of the threshold type and may not be symmetric.
Problem 2 is a sequential problem, which distinguishes it from the static problem in Vasconcelos and Martins (2013). Notice that our problem cannot be converted into a sequence of static problems because the transmission policies depend on the channel memory.
Following Vasconcelos and Martins (2013), the stage cost at time can be written as
where , and .
Proposition 1. The stage cost at each time step is a function of only the current channel state and the transmission policy at that time step.
Proof. From the stage cost expression above, note that is a deterministic function of the channel state , the probability that and the distribution . This distribution can be written as
where and . Thus, the stage cost is a function of and the probability mass function .
The transmission policy determines the distribution . Therefore, the stage cost is a function of only and .
With , Problem 2 can be written as a Markov chain with as the input, as the noise, as the state, and as the stage cost. Note the input is not , the decision to transmit, as may have been expected. The transmission policy is taken as the input because the distribution depends on the entire policy : not just the specific decision .
Using Proposition 1 and the independence of the system states over time, without loss of performance, we need to consider only transmission policies that are functions of the current system state and channel state, . Consequently, the Markov decision process can be simplified with as the input, as the noise, as the state, and as the stage cost. The associated dynamic programming recursion is shown in (2) and (3).
Theorem 2. Let the process to be estimated be independent and identically distributed Gaussian. Then the optimal transmission policy for Problem 2 is of the threshold type.
Proof. For an arbitrary transmission policy , we seek a policy that outperforms it and is a threshold policy. Note, all quantities associated with the policy have a superscript . Also, all quantities associated with policy have a superscript .
We expand our search for a policy to include randomized transmission policies. For and , let be the probability of transmitting, . Also, .
For a specific , consider a policy that matches the policy ’s probability of transmitting,
Also, let policy be such that it produces estimates that match those of policy ,
Since , we have . All the quantities in (3) are the same for both policies with the exception of , for . We will choose to reduce , for .
In Vasconcelos and Martins (2013), minimizing for subject to the constraints (6), (7) and (8) was cleverly rewritten as a constrained moment matching problem. It was shown that the optimal was a threshold function of . Using this result, we have constructed a threshold policy that outperforms .
Thus, for every and , we can construct a threshold policy that outperforms . This threshold policy outperforms .
We now investigate the structure of the best symmetric transmission policies. We seek conditions under which the optimal symmetric transmission policy is a symmetric threshold policy. This is the case if the probability of drop is sufficiently small for all channel states. Even if the drop probabilities are not sufficiently small, symmetric threshold policies may still be optimal. This is highlighted by a numerical example, which suggests that there are classes of channel dynamics for which symmetric threshold policies are the best symmetric transmission policies. This is the topic of future research.
Restricting to symmetric transmission policies, Problem 2 can be written as a dynamic program. We first show that the cost-to-go functions are quasi-convex. In order to accomplish this, we write the evolution of the error in a convenient manner. Definitions for quasi-convexity and supporting results are presented in the appendix.
If is a symmetric transmission policy, then the error evolves according to
Proof. This is in principle equivalent to (Lipsa and Martins, 2009, Proposition 3.1). The difference is that here is a symmetric policy; not a symmetric threshold policy as in (Lipsa and Martins, 2009, Proposition 3.1). However, the proof only relies on the symmetric nature of the policy.
The convenient form of the error evolution in (9) is possible due to the symmetry assumption. For symmetric policies, when no measurement is received, the optimal estimate is the same whether a transmission was attempted or not. The remote estimator's belief depends on whether a transmission was attempted; however, its mean, which is the optimal estimate, does not.
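The error evolution under a symmetric threshold policy can be explored by simulation. The following sketch is a hedged illustration, not the paper's code: it assumes that the error is measured before the transmission decision, that a received packet resets the error to zero, and that otherwise the estimator propagates its previous estimate. The function name and all parameter names are hypothetical.

```python
import random
from typing import Callable, Dict, Hashable, Tuple


def simulate_symmetric_threshold(a: float,
                                 sigma: float,
                                 taus: Dict[Hashable, float],
                                 transition: Callable[[Hashable, int], Hashable],
                                 drop_prob: Callable[[Hashable], float],
                                 q0: Hashable,
                                 steps: int,
                                 seed: int = 0) -> Tuple[float, int]:
    """Return (average squared error, number of attempts) over `steps` steps."""
    rng = random.Random(seed)
    q = q0
    error = rng.gauss(0.0, sigma)  # error before the first decision
    total = 0.0
    attempts = 0
    for _ in range(steps):
        # Symmetric threshold rule: attempt when |error| exceeds tau(q).
        attempt = 1 if abs(error) > taus[q] else 0
        received = attempt == 1 and rng.random() >= drop_prob(q)
        if received:
            error = 0.0  # the estimator learns the state exactly
        total += error ** 2
        attempts += attempt
        # Without reception the estimate is the same whether or not a
        # transmission was attempted, so the error propagates through the plant.
        error = a * error + rng.gauss(0.0, sigma)
        q = transition(q, attempt)
    return total / steps, attempts
```

With a perfect channel and zero thresholds the average cost is zero, while infinite thresholds reproduce open-loop estimation; intermediate thresholds trade attempts against error.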
The problem can be considered a Markov decision process with state , input , and noise . The cost to be minimized is
The associated dynamic programming recursion is given by
and , , and distributed .
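A recursion of this kind can be approximated numerically on a grid of error values. The sketch below is one possible approach and not the authors' implementation: it assumes that a received packet resets the error, that otherwise the squared error is incurred and the error propagates through the plant, and it uses Gauss-Hermite quadrature with linear interpolation. The function name `solve_symmetric_dp` and all parameters are hypothetical.

```python
import numpy as np
from numpy.polynomial.hermite import hermgauss


def solve_symmetric_dp(a, sigma, horizon, states, transition, drop_prob,
                       e_max=8.0, n_grid=401, n_quad=21):
    """Backward induction over (error, channel state) on a symmetric grid.

    Returns the grid, the first-stage cost-to-go per channel state, and a
    list of per-stage policies (1 = attempt) indexed by channel state.
    """
    grid = np.linspace(-e_max, e_max, n_grid)
    nodes, weights = hermgauss(n_quad)
    noise = np.sqrt(2.0) * sigma * nodes   # quadrature points for N(0, sigma^2)
    probs = weights / np.sqrt(np.pi)       # matching probability weights

    def expect(values, centers):
        # Approximates E[J(center + W)]; values outside the grid are clamped.
        pts = centers[:, None] + noise[None, :]
        interp = np.interp(pts.ravel(), grid, values).reshape(pts.shape)
        return interp @ probs

    cost_to_go = {q: np.zeros(n_grid) for q in states}  # terminal cost is zero
    policies = []
    for _ in range(horizon):
        new_cost, policy = {}, {}
        for q in states:
            q_idle, q_tx = transition(q, 0), transition(q, 1)
            p = drop_prob(q)
            # No attempt: pay the squared error, error propagates.
            idle = grid ** 2 + expect(cost_to_go[q_idle], a * grid)
            # Attempt and drop: same error, but the channel state moved.
            lost = grid ** 2 + expect(cost_to_go[q_tx], a * grid)
            # Attempt and success: error resets to zero.
            reset = expect(cost_to_go[q_tx], np.zeros(1))[0]
            attempt = p * lost + (1.0 - p) * reset
            policy[q] = (attempt < idle).astype(int)
            new_cost[q] = np.minimum(idle, attempt)
        cost_to_go = new_cost
        policies.insert(0, policy)
    return grid, cost_to_go, policies
```

Inspecting `policies[k][q]` along the grid shows whether the computed decision region is a single interval around zero, which is how a threshold structure would appear in such an approximation.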
Lemma 4. For every time step and channel state, the cost-to-go functions are quasi-convex and symmetric in the estimation error. The minimum is attained at zero estimation error.
Proof. We show that is a symmetric and non-decreasing function in . This implies is quasi-convex by Lemma 8. The proof is by induction. The claim holds for the initial case, . Assume is symmetric and non-decreasing in . is the minimum between and . By Lemma 10, is symmetric and non-decreasing in for . and are symmetric and non-decreasing in because they are the sum of two such functions. Thus by Lemma 9, is symmetric and non-decreasing in .
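Theorem 5. There exists a positive constant such that, if the probability of drop is smaller than this constant for all channel states, then the optimal symmetric transmission policy is a threshold policy.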
Several lemmata will be presented to aid in the proof of this theorem. Let . Also define,
Lemma 6. For and , if
then the optimal symmetric transmission policy for stage is a threshold policy.
Proof. We show that if (11) holds, any non-threshold, symmetric policy is not the optimal symmetric transmission policy.
For a non-threshold, symmetric policy there exists a and such that but for small . Since from (10) we have . Also, since we have . By subtracting these equations we have . By rearranging terms this becomes
Dividing by and taking the limit yields
Contradicting the assumption. Thus, the optimal policy is a threshold policy.
The condition in Lemma 6 guarantees that increases more than at every estimation error . Clearly, this is a condition that leads to threshold transmission policies.
Lemma 7. For all and ,
Proof. We show inductively that for all and , there exists a such that , for .
This property holds for , since and . Thus, .
Assume the property holds for with . We will show the property holds for . For a specific and , there are two cases or , see (10). We prove the statement for the case when . The other case yields the same result and is analogous.
Equation (4) is obtained for the case using and using the bound
The right hand side of (4) is comprised of two terms. The first term is upper bounded by a .
Next, we take the expectation of (5) with respect to and then the limit with respect to . Using the inductive hypothesis to bound the second term by , this yields
Thus, with the induction is complete. We see that for all , is an adequate bound.
since by Lemma 4. Rearranging and using the bound on gives
Contradicting the assumption. Thus, the optimal policy is a threshold policy.
A model of a wireless communication channel with energy harvesting capabilities is presented in this section. This channel is modeled with a use-dependent packet-drop channel. Many different problem formulations addressing remote estimation over a battery powered channel have been considered: see Ulukus et al. (2015), Ozel et al. (2011) and Nayyar et al. (2012) and the references therein.
Consider a battery operated channel with a capacity of 4 energy units. Assume energy is harvested deterministically, as in Ozel et al. (2011), at energy unit per time step. Transmitting requires units of energy and no energy is harvested during transmission. At each time step, the decision of whether to transmit is made.
To model the battery dynamics, the FSM shown in figure 2 is used. The channel states are . Channel state denotes that the battery has energy units. If a transmission is attempted , then the battery level is reduced by energy units. Thus the channel state transitions to state . If a transmission is not attempted , then the battery level increases by as long as the battery is not already at capacity. Thus, the channel transitions from state to state . In states and transmitting is not allowed due to insufficient energy.
The probability of drop for each state capable of transmitting is . Transmission is not possible in states and but we assign a drop probability of for consistency.
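The battery dynamics described above can be written as a small transition function. The sketch below is illustrative: the capacity of 4 units is taken from the text, while the harvest rate of 1 unit per step, the transmission cost of 2 units, and the drop probability of 1 in the states that cannot transmit are assumptions of this example, since those values are not legible in the text. All names are hypothetical.

```python
CAPACITY = 4  # battery capacity in energy units (stated in the text)
HARVEST = 1   # ASSUMED energy harvested per idle time step
TX_COST = 2   # ASSUMED energy consumed by one transmission attempt


def can_transmit(level: int) -> bool:
    """A transmission is allowed only when enough energy is stored."""
    return level >= TX_COST


def battery_transition(level: int, attempt: int) -> int:
    """Next battery level given the current level and the decision."""
    if attempt == 1:
        if not can_transmit(level):
            raise ValueError("insufficient energy to attempt a transmission")
        # No energy is harvested during a transmission.
        return level - TX_COST
    # Idle step: harvest energy, saturating at the battery capacity.
    return min(level + HARVEST, CAPACITY)


def battery_drop_prob(level: int, p: float) -> float:
    """Drop probability p in transmitting states; ASSUMED 1.0 otherwise."""
    return p if can_transmit(level) else 1.0
```

Under these assumed values the two lowest battery levels are the states in which no transmission can be attempted, matching the two excluded states mentioned in the text.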
This energy harvesting channel is clearly a use-dependent packet-drop channel. We assume that the encoder receives acknowledgements of the transmissions and that the remote estimator can distinguish between a drop and no transmission attempt. Interestingly, from Theorem 2 we have that the optimal transmission policy may not be symmetric in the estimation error even though the cost is symmetric in the estimation error and the noise is zero-mean and symmetric.
We numerically calculated the optimal symmetric transmission policies for this example when , and . The optimal symmetric transmission policy for channel state is shown in figure 6.
Notice that the optimal symmetric transmission policy is a threshold policy, even though the conditions of Theorem 5 are not satisfied. In fact, every that we tested has an optimal symmetric transmission policy that is a threshold policy. This suggests that for these channel dynamics, threshold transmission policies are optimal among all symmetric strategies.
In Theorem 5, no assumptions were made about the size of the channel state space or the channel state dynamics. For specific channel dynamics or classes of channel dynamics, weakening the condition in Theorem 5 may be possible.
In this section, we seek to optimize a decision support system for human operators tracking a dynamic target.
Consider a human operator managing multiple UAVs. Tracking a dynamic target is one of the operator's many tasks. A video feed is presented to the operator (see figure 4 for an example of the video feed). The white region is drawn on the video feed by the decision support system. The operator's task is to indicate if the target is inside this region. If the target is outside the region, the operator is requested to log the target's current location; however, the operator is allowed to not log the target's location if other tasks seem more vital. Schulte and Donath (2011a) perform experiments in a similar setting.
We seek to dynamically optimize the white regions in order to help the operator manage their time appropriately. If the regions are large, the target’s location is not well known. If the regions are small, then the target’s location is frequently requested. This increases the operator’s workload and the likelihood the operator will ignore the request. The channel state is used to model operator workload. The optimal transmission policies define the optimal white regions and manage the tradeoff between accuracy and workload.
Yerkes-Dodson’s law quantifies the tradeoff between operator performance and workload, see Yerkes and Dodson (1908). Yerkes-Dodson’s law states that the operator performs poorly if the workload is very high or very low. Optimizing operator decision support systems using Yerkes-Dodson’s law as an operator model is also investigated in Savla and Frazzoli (2012) and Srivastava et al. (2012). In Savla and Frazzoli (2012), the workload impacts the time to complete tasks such that under high workload situations the operator completes tasks slowly. The authors find optimal policies specifying when to present the operator with tasks in order to maximize throughput. In Srivastava et al. (2012), not all tasks must be completed and the questions of which tasks to assign, for how long, and with how much rest in-between are addressed.
In contrast to Savla and Frazzoli (2012) and Srivastava et al. (2012) and motivated by Schulte and Donath (2011b), we assume that the operator workload impacts the likelihood that the operator will ignore a request for information.
We consider the operator’s workload a function of the average number of requests over the last time steps,
If the average is high, the operator is prone to shed tasks. This workload model has memory and can be envisioned as the finite state machine in figure 5. State represents requests occurring in the last steps.
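A workload model with this kind of memory can be sketched as a sliding window over past requests. The code below is an illustrative assumption rather than the paper's model: the window is stored explicitly as a tuple, its length is left as a free parameter because the value is not legible in the text, and all function names are hypothetical.

```python
from typing import Tuple

Window = Tuple[int, ...]  # most recent request indicator is the last entry


def workload_transition(window: Window, request: int) -> Window:
    """Slide the window: forget the oldest step and append the newest."""
    if request not in (0, 1):
        raise ValueError("request must be 0 or 1")
    return window[1:] + (request,)


def request_count(window: Window) -> int:
    """Number of requests that occurred within the window."""
    return sum(window)


def workload(window: Window) -> float:
    """Average number of requests per step over the window."""
    return sum(window) / len(window)
```

A drop probability that increases with `request_count(window)` would then capture an operator who is more likely to ignore a request when the recent request rate is high.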
To formulate this as a use-dependent packet-drop channel we take the target’s location to be the system state, . The target being outside the white region represents an attempted transmission . The transmission policy defines the white region.
We have modeled this application as a use-dependent packet-drop channel. By Theorem 5, if the operator is unlikely to ignore requests, then the optimal symmetric white regions are defined by threshold policies. This is desirable since non-threshold policies represent white regions that are not connected and may mislead operators.
The numerical example below suggests that threshold policies are the best symmetric policies even if the operator is likely to ignore requests. We believe this is due to the simple structure of the channel dynamics.
Note that in this example the target's location is two-dimensional; however, in our formulation the state is scalar. Under suitable independence assumptions, the results are applicable to higher dimensions.
We numerically find optimal symmetric transmission policies for this example when , and . The channel dynamics and drop probabilities are shown in figure 5. The optimal symmetric transmission policies are calculated by approximating the value functions in (10). In figure 6, the optimal policies for channel states and are shown. It can be seen that the policies are symmetric. In fact, for all drop probabilities that were simulated, the optimal transmission policies were threshold policies.
We investigated optimal transmission policies for a remote estimation problem over a use-dependent packet-drop channel. We presented structural results for the optimal transmission policies under two different assumptions. We also discussed two example applications: one to energy harvesting channels and one to mixed initiative teams with human operators performing visual search tasks.
- Athans (1972) Athans, M. (1972). On the determination of optimal costly measurement strategies for linear stochastic systems. Automatica, 8(4), 397 – 412.
- Hajek et al. (2008) Hajek, B., Mitzel, K., and Yang, S. (2008). Paging and registration in cellular networks: Jointly optimal policies and an iterative algorithm. Information Theory, IEEE Transactions on, 54(2), 608–622.
- Lipsa and Martins (2009) Lipsa, G.M. and Martins, N.C. (2009). Optimal state estimation in the presence of communication costs and packet drops. In Proceedings of the 47th Annual Allerton Conference on Communication, Control, and Computing, Allerton’09, 160–169.
- Lipsa and Martins (2011) Lipsa, G. and Martins, N. (2011). Remote state estimation with communication costs for first-order LTI systems. Automatic Control, IEEE Transactions on, 56(9), 2013–2025.
- Nayyar et al. (2012) Nayyar, A., Basar, T., Teneketzis, D., and Veeravalli, V.V. (2012). Optimal strategies for communication and remote estimation with an energy harvesting sensor. CoRR, abs/1205.6018.
- Ozel et al. (2011) Ozel, O., Yang, J., and Ulukus, S. (2011). Optimal scheduling over fading broadcast channels with an energy harvesting transmitter. In Computational Advances in Multi-Sensor Adaptive Processing (CAMSAP), 2011 4th IEEE International Workshop on, 193–196.
- Savla and Frazzoli (2012) Savla, K. and Frazzoli, E. (2012). A dynamical queue approach to intelligent task management for human operators. Proceedings of the IEEE, 100(3), 672–686.
- Schulte and Donath (2011a) Schulte, A. and Donath, D. (2011a). Measuring self-adaptive UAV operators’ load-shedding strategies under high workload. In Proceedings of the 9th International Conference on Engineering Psychology and Cognitive Ergonomics, EPCE’11, 342–351.
- Schulte and Donath (2011b) Schulte, A. and Donath, D. (2011b). Measuring self-adaptive UAV operators’ load-shedding strategies under high workload. In HCI (21)’11, 342–351.
- Shamaiah et al. (2010) Shamaiah, M., Banerjee, S., and Vikalo, H. (2010). Greedy sensor selection: Leveraging submodularity. In Decision and Control (CDC), 2010 49th IEEE Conference on, 2572–2577.
- Sinopoli et al. (2004) Sinopoli, B., Schenato, L., Franceschetti, M., Poolla, K., Jordan, M., and Sastry, S. (2004). Kalman filtering with intermittent observations. Automatic Control, IEEE Transactions on, 49(9), 1453–1464.
- Srivastava et al. (2012) Srivastava, V., Surana, A., and Bullo, F. (2012). Adaptive attention allocation in human-robot systems. In American Control Conference (ACC), 2012, 2767–2774.
- Ulukus et al. (2015) Ulukus, S., Yener, A., Erkip, E., Simeone, O., Zorzi, M., Grover, P., and Huang, K. (2015). Energy harvesting wireless communications: A review of recent advances. Selected Areas in Communications, IEEE Journal on, 33(3), 360–381.
- Vasconcelos and Martins (2013) Vasconcelos, M. and Martins, N. (2013). Estimation over the collision channel: Structural results. In Communication, Control, and Computing (Allerton), 2013 51st Annual Allerton Conference on, 1114–1119.
- Weissman (2010) Weissman, T. (2010). Capacity of channels with action-dependent states. Information Theory, IEEE Transactions on, 56(11), 5396–5411.
- Yerkes and Dodson (1908) Yerkes, R.M. and Dodson, J.D. (1908). The relation of strength of stimulus to rapidity of habit-formation. Journal of comparative neurology and psychology, 18(5), 459–482.
In this appendix, definitions and results related to quasi-convex functions are presented.
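A function $f: \mathbb{R} \to \mathbb{R}$ is quasi-convex if for $x, y \in \mathbb{R}$ and $\lambda \in [0,1]$
$$f(\lambda x + (1-\lambda) y) \leq \max\{f(x), f(y)\}.$$
A function $f: \mathbb{R} \to \mathbb{R}$ is symmetric and non-decreasing in $|x|$ if for $|x| \leq |y|$,
$$f(x) \leq f(y).$$
Lemma 8. If $f$ is symmetric and non-decreasing in $|x|$ then $f$ is quasi-convex.
Proof. For $x, y \in \mathbb{R}$, without loss of generality let $|x| \leq |y|$. Note $\max\{f(x), f(y)\} = f(y)$. For $\lambda \in [0,1]$, since $|\lambda x + (1-\lambda) y| \leq |y|$, we have $f(\lambda x + (1-\lambda) y) \leq f(y)$.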
Lemma 9. Let $f$ and $g$ be symmetric and non-decreasing in $|x|$. The function $h(x) = \min\{f(x), g(x)\}$ is symmetric and non-decreasing in $|x|$.
Proof. First, we show is symmetric. For ,
We now show is non-decreasing. For ,
Lemma 10. Let $f$ be symmetric and non-decreasing in $|x|$, $W$ a random variable distributed $\mathcal{N}(0, \sigma^2)$ and $a \in \mathbb{R}$. The function $g(x) = E[f(ax + W)]$ is symmetric and non-decreasing in $|x|$.
Proof. First, we show is symmetric. For ,
where . The second equality holds by change of variables .
We now show is non-decreasing. Let . Using the symmetry of , with , can be written,
There exists a