Optimal Energy Allocation for Kalman Filtering over Packet Dropping Links with Imperfect Acknowledgments and Energy Harvesting Constraints

A preliminary version of this paper was presented at the 4th IFAC NecSys workshop, Koblenz, Germany, Sep. 2013.
This paper presents a design methodology for optimal transmission energy allocation at a sensor equipped with energy harvesting technology for remote state estimation of linear stochastic dynamical systems. In this framework, the sensor measurements as noisy versions of the system states are sent to the receiver over a packet dropping communication channel. The packet dropout probabilities of the channel depend on both the sensor’s transmission energies and time varying wireless fading channel gains. The sensor has access to an energy harvesting source which is an everlasting but unreliable energy source compared to conventional batteries with fixed energy storages. The receiver performs optimal state estimation with random packet dropouts to minimize the estimation error covariances based on received measurements. The receiver also sends packet receipt acknowledgments to the sensor via an erroneous feedback communication channel which is itself packet dropping.
The objective is to design optimal transmission energy allocation at the energy harvesting sensor to minimize either a finite-time horizon sum or a long term average (infinite-time horizon) of the trace of the expected estimation error covariance of the receiver’s Kalman filter. These problems are formulated as Markov decision processes with imperfect state information. The optimal transmission energy allocation policies are obtained using dynamic programming techniques. Using the concept of submodularity, the structure of the optimal transmission energy policies is studied. Suboptimal solutions, which are far less computationally intensive than the optimal solutions, are also discussed. Numerical simulation results illustrate the performance of the energy allocation algorithms.
Wireless sensor network (WSN) technologies arise in a wide range of applications such as environmental data gathering [1, 2], mobile robots and autonomous vehicles [3, 4], and monitoring of smart electricity grids [5, 6], among many others. In these applications one of the important challenges is to improve system performance and reliability under resource (e.g., energy/power, computation and communication) constraints.
A considerable amount of research has recently been devoted to the concept of energy harvesting  (see also [8, 9, 10, 11, 12, 13] among other papers). This is motivated by energy limited WSN applications where sensors may need to operate continuously for years on a single battery. In the energy harvesting paradigm the sensors can recharge their batteries by collecting energy from the environment, e.g. solar, wind, water, thermal or mechanical vibrations. However, the amount of energy harvested is random as most renewable energy sources are unreliable. In this work we will consider the remote Kalman filtering problem with random packet dropouts and imperfect receipt acknowledgments when the sensors are equipped with energy harvesting technology, and as a result, are subject to energy harvesting constraints.
Since the seminal work of , the problem of state estimation or Kalman filtering over packet dropping communication channels has been studied extensively (see, for example, [15, 16, 17, 18, 19, 20, 21] among others). The reader is also referred to the comprehensive survey  for some of the research in the area of control and estimation over lossy networks up to 2007. In these problems sensor measurements (or state estimates in the case of ) are grouped into packets which are transmitted over a packet dropping link such that the entire packet is either received or lost in a random manner. The focus in these works is on deriving conditions on the packet arrival rate that guarantee the stability of the Kalman filter.
There are other works which are concerned with estimation performance (e.g. minimizing the expected estimation error covariance) rather than just stability. For instance, power allocation techniques (without energy harvesting constraints) have been applied to the Kalman filtering problem in [23, 24, 25] in order to improve the estimation performance. (We measure energy on a per channel use basis and we will refer to energy and power interchangeably.) In these works energy allocation can be used to improve system performance and reliability.
In conventional wireless communication systems, the sensors either have access to a fixed energy supply or have batteries that are easily rechargeable/replaceable. Therefore, a sum energy/power constraint is used to model the energy limitations of battery-powered devices (see ). However, in the context of WSNs the use of energy harvesting is more practical, e.g., in remote locations with restricted access to an energy supply, and even essential where it is dangerous or impossible to change the batteries [26, 11]. In these situations it is possible to have communication devices with on-board energy harvesting capability which may recharge their batteries by collecting energy from the environment, including solar, thermal or mechanical vibration sources.
Typically, the harvested energy is stored in an energy storage such as a rechargeable battery which then is used for communications or other processing. Even though the energy harvesters provide an everlasting energy source for the communication devices, the amount of energy expenditure at every time slot is constrained by the amount of stored energy currently available. This is unlike the conventional communication devices that are subject only to a sum energy constraint. Therefore, a causality constraint is imposed on the use of the harvested energy . Communication schemes for optimizing throughput for transmitters with energy harvesting capability have been studied in [10, 11], while a remote estimation problem with an energy harvesting sensor was considered in  which minimized a cost consisting of both the distortion and number of sensor transmissions.
In this paper we study the problem of optimal transmission energy allocation at an energy harvesting sensor for remote state estimation of linear stochastic dynamical systems. In this model, the sensor’s measurements, which are noisy versions of the system’s states, are sent to the receiver over a packet dropping communication channel. Similar to the channel models in , the packet dropout probabilities depend on both the sensor’s transmission energies and time varying wireless fading channel gains. The sensor has access to an energy harvesting source, which is an everlasting but unreliable energy source compared to conventional batteries with fixed energy storage. The receiver performs optimal state estimation via Kalman filtering with random packet dropouts to minimize the estimation error covariances based on received measurements. Knowledge at the sensor of whether its transmissions have been received at the receiver is usually obtained via some feedback mechanism. Here, in contrast to the models in [24, 27], the feedback channel from receiver to sensor is also a packet dropping erroneous channel, leading to a more realistic formulation. The energy consumed in transmission of a packet is assumed to be much larger than that for sensing or processing at the sensor, and thus the energy consumed in sensing and processing is not taken into account in our formulation.
The objective of this work is to design optimal transmission energy allocation (per packet) at the energy harvesting sensor to minimize either a finite-time horizon sum or a long term average (infinite-time horizon) of the trace of the expected estimation error covariance of the receiver’s Kalman filter. The important issue in this problem formulation is to address the trade-off between the use of available stored energy to improve the current transmission reliability and thus state estimation accuracy, or storing of energy for future transmissions which may be affected by higher packet loss probabilities due to severe fading.
These optimization problems are formulated as Markov decision processes with imperfect state information. The optimal transmission energy allocation policies are obtained using dynamic programming techniques. Using the concept of submodularity , the structure of the optimal transmission energy policies is studied. Suboptimal solutions which are far less computationally intensive than the optimal solutions are also discussed. Numerical simulation results illustrate the performance of the energy allocation algorithms.
A previous presentation of the model considered in this paper appears in , which investigates the case with perfect acknowledgments at the sensor. Here, we address the more difficult problem where the feedback channel from receiver to sensor is imperfect and is modelled as an erasure channel with errors.
In summary, the main contributions of this paper are as follows:
Unlike a large number of papers focusing on the stability of Kalman filtering with packet loss, e.g. [15, 16, 17, 18, 19, 20, 21], we focus on the somewhat neglected issue of estimation error performance (noting that stability only guarantees bounded estimation error) in the presence of packet loss and how to optimize it via power/energy allocation at the sensor transmitter. Note that it is quite common to study optimal power allocation in the context of random stationary source estimation in fading wireless sensor networks , but this issue has received much less attention in the context of Kalman filtering over packet dropping links which are randomly time-varying. In particular, we consider minimization of a long-term average of the error covariance of the Kalman filter by optimally allocating energy for individual packet transmissions over packet dropping links with randomly varying packet loss probability due to fading. While a version of this problem was considered in our earlier conference paper , we extend the problem setting and the analysis along multiple directions as described below.
Unlike , we consider an energy harvesting sensor that is not constrained by a fixed initial battery energy, but rather by the randomness of the harvested energy pattern. Energy harvesting is a promising solution to the important problem of energy management in wireless sensor networks. Furthermore, recent advances in hardware have made energy harvesting technology a practical reality .
We provide a new sufficient stability condition for bounded long term average estimation error, which depends on the packet loss probability (which is a function of the channel gain, harvested energy and the maximum battery storage capacity) and the statistics of the channel gain and harvested energy process. Although this condition is difficult to verify in general, we provide simpler forms of it when the channel gains and harvested energy processes follow familiar statistical models such as independent and identically distributed processes or finite state Markov chains.
We consider the case of imperfect feedback acknowledgements, which is more realistic but more difficult to study than the case of perfect feedback acknowledgements. We model the feedback channel by a general erasure channel with errors.
It is well known that the optimal solution obtained by a stationary control policy minimizing the infinite horizon control cost is computationally prohibitive. Thus motivated, we provide structural results on the optimal energy allocation policy which lead to threshold policies which are optimal and yet very simple to implement in some practical cases, e.g. when the sensor is equipped with binary transmission energy levels. Note that most sensors usually have a finite number of transmission energy/power levels and for simplicity, sensors can be programmed to only have two levels.
Finally, also motivated by the computational burden for the optimal control solution in the general case of imperfect acknowledgments, we provide a sub-optimal solution based on an estimate of the error covariance at the receiver. Numerical results are presented to illustrate the performance gaps between the optimal and sub-optimal solutions.
The organization of the paper is as follows. The system model is given in Section II. The optimal energy allocation problems subject to energy harvesting constraints are formulated in Section III. In Section IV the optimal transmission energy allocation policies are derived by the use of dynamic programming techniques. Section V presents suboptimal policies which are less computationally demanding. The structure of the optimal transmission energy allocation policies is studied in Section VI. Section VII presents the numerical simulation results. Finally, concluding remarks are stated in Section VIII.
II System Model
A diagram of the system architecture is shown in Fig. 1. The description of each part of the system is given in detail below.
II-A Process Dynamics and Sensor Measurements
We consider a linear time-invariant stochastic dynamical process

$$x_{k+1} = A x_k + w_k,$$

where $x_k \in \mathbb{R}^n$ is the process state at time $k \geq 0$, $A \in \mathbb{R}^{n \times n}$, and $\{w_k\}$ is a sequence of independent and identically distributed (i.i.d.) Gaussian noises with zero mean and positive definite covariance matrix $Q$. The initial state of the process $x_0$ is a Gaussian random vector, independent of the process noise sequence $\{w_k\}$, with mean $\bar{x}_0$ and covariance matrix $P_{x_0}$.

The sensor measurements are obtained in the form

$$y_k = C x_k + v_k,$$

where $y_k \in \mathbb{R}^m$ is the observation at time $k \geq 0$, $C \in \mathbb{R}^{m \times n}$, and $\{v_k\}$ is a sequence of i.i.d. Gaussian noises, independent of both the initial state $x_0$ and the process noise sequence $\{w_k\}$, with zero mean and a positive semi-definite covariance matrix $R$.
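As a minimal illustration of the process and measurement model above, the following Python sketch simulates a state trajectory together with its noisy observations. The notation ($x_{k+1} = A x_k + w_k$, $y_k = C x_k + v_k$, with noise covariances $Q$ and $R$), the function name `simulate_process` and all numerical values are assumptions made for this example only.

```python
import numpy as np

def simulate_process(A, C, Q, R, x0_mean, x0_cov, steps, seed=0):
    """Simulate x_{k+1} = A x_k + w_k and y_k = C x_k + v_k (assumed notation)."""
    rng = np.random.default_rng(seed)
    n, m = A.shape[0], C.shape[0]
    # Gaussian initial state, independent of the noise sequences.
    x = rng.multivariate_normal(x0_mean, x0_cov)
    states, observations = [], []
    for _ in range(steps):
        v = rng.multivariate_normal(np.zeros(m), R)   # measurement noise
        states.append(x)
        observations.append(C @ x + v)
        w = rng.multivariate_normal(np.zeros(n), Q)   # process noise
        x = A @ x + w
    return np.array(states), np.array(observations)

if __name__ == "__main__":
    A = np.array([[1.2]])          # hypothetical unstable scalar system
    C = np.array([[1.0]])
    xs, ys = simulate_process(A, C, np.eye(1), np.eye(1),
                              np.zeros(1), np.eye(1), steps=5)
    print(xs.ravel(), ys.ravel())
```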
We enunciate the following assumption:
(A1) We assume that $(A, Q^{1/2})$ is stabilizable and $(A, C)$ is detectable.
II-B Forward Communication Channel
The measurement $y_k$ is then sent to a receiver over a packet dropping communication channel such that $y_k$ (considered as a packet) is either exactly received or the packet gets lost due to corrupted data or substantial delay. The packet dropping channel is modelled by

$$z_k = \gamma_k y_k,$$

where $z_k$ is the observation obtained by the receiver at time $k$, and $\gamma_k = 1$ denotes that the measurement packet $y_k$ is received, while $\gamma_k = 0$ denotes that the packet containing the measurement $y_k$ is lost.
Similar to , we adopt a model for the packet loss process that is governed by the time-varying wireless fading channel gains and sensor transmission energy allocation (per packet) over this channel. In this model, the conditional packet reception probabilities are given by
where is a monotonically increasing continuous function. The form of will depend on the particular digital modulation scheme being used .
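To make the role of the reception probability concrete, the sketch below draws packet arrivals from a monotonically increasing function of the channel gain and the transmission energy. The exponential form of `reception_prob`, the assumption that the probability depends on the product of gain and energy, and both function names are illustrative choices; the actual form depends on the modulation scheme in use.

```python
import math
import random

def reception_prob(g, u):
    """Illustrative h(g*u) = 1 - exp(-g*u); an assumed stand-in, not the paper's h."""
    return 1.0 - math.exp(-g * u)

def draw_gamma(g, u, rng):
    """Return 1 if the packet is received and 0 if it is dropped."""
    return 1 if rng.random() < reception_prob(g, u) else 0

if __name__ == "__main__":
    rng = random.Random(1)
    for energy in (0.0, 0.5, 1.0, 2.0):
        arrivals = sum(draw_gamma(1.5, energy, rng) for _ in range(10_000))
        print(energy, round(reception_prob(1.5, energy), 3), arrivals / 10_000)
```

With zero transmission energy the packet is always lost in this sketch, and the empirical arrival rate approaches one as the energy grows.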
We consider the case where the set of fading channel gains is a first-order stationary and homogeneous Markov fading process (see ) where the channel remains constant over a fading block (representing the coherence time of the channel ). Note that the stationary first-order Markovian modelling includes the case of independent and identically distributed (i.i.d.) processes as a special case.
We assume that channel state information is available at the transmitter such that it knows the values of the channel gains at time . In practice, this can be achieved by channel reciprocity between the sensor-to-receiver and receiver-to-sensor channels (such as in typical time-division-duplex (TDD) based transmissions). In this scenario, the sensor can estimate the channel gain based on pilot signals transmitted from the remote receiver at the beginning of each fading block. Another possibility (if channel reciprocity does not hold) is to estimate the channel at the receiver based on pilot transmissions from the sensor and send it back to the sensor by channel state feedback. However, transmitting pilot signals consumes energy which should then be taken into account. To conform with our problem formulation, we therefore assume that channel reciprocity holds.
II-C Energy Harvester and Battery Dynamics
Let the unpredictable energy harvesting process be denoted by which is also modelled as a stationary first-order homogeneous Markov process, and which is independent of the fading process . This modelling for the harvested energy process is justified by empirical measurements in the case of solar energy .
We assume that the dynamics of the stored battery energy is given by the following first-order Markov model
with given , where is the maximum stored energy in the battery.
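The battery evolution can be sketched in a few lines. The example assumes the common update $B_{k+1} = \min\{B_k - u_k + H_{k+1}, B_{\max}\}$, in which the energy spent at time $k$ cannot exceed the stored energy and any harvested energy above the capacity is lost; the symbols and the name `battery_step` are assumptions for illustration.

```python
def battery_step(B, u, H_next, B_max):
    """Assumed battery update: B_next = min(B - u + H_next, B_max).

    B      : energy stored at the current time
    u      : transmission energy spent now (must satisfy 0 <= u <= B)
    H_next : energy harvested before the next transmission
    B_max  : maximum energy the battery can store
    """
    if not 0.0 <= u <= B:
        raise ValueError("transmission energy must satisfy 0 <= u <= B")
    return min(B - u + H_next, B_max)

if __name__ == "__main__":
    B = 2.0
    for u, H in [(1.0, 0.5), (1.5, 3.0), (0.0, 4.0)]:
        B = battery_step(B, u, H, B_max=3.5)
        print(B)
```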
II-D Kalman Filter at Receiver
The receiver performs the optimal state estimation by the use of Kalman filtering based on the history which is the σ-field generated by the available information at the receiver up to time . We use the convention .
The optimal Kalman filtering and prediction estimates of the process state are given by and , respectively. The corresponding Kalman filter error covariances are defined as
The Kalman recursion equations for and are given in . In this paper we focus on the estimation error covariance which satisfies the random Riccati equation
for where (see ). Note that appears as a random coefficient in the Riccati equation (3). Since (i) the derivation in  allows for time-varying packet reception probabilities, and (ii) in the model of this paper the energy allocation only affects the probability of packet reception via (1) and not the system state that is being estimated, the estimation error covariance recursion is of the form (3) as given in . This is in contrast to the work  where the control signal can affect the states at future times which leads to a dual effect.
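The following sketch shows one step of a random Riccati recursion of the kind described above, assuming the standard form for Kalman filtering with intermittent observations, $P_{k+1} = A P_k A^T + Q - \gamma_k A P_k C^T (C P_k C^T + R)^{-1} C P_k A^T$. The symbols and the name `riccati_step` are assumptions; the arrival indicator enters only as a random coefficient, as noted in the text.

```python
import numpy as np

def riccati_step(P, gamma, A, C, Q, R):
    """One step of the assumed random Riccati recursion.

    gamma = 1: the packet arrived, so the measurement update is applied.
    gamma = 0: the packet was lost, so only the prediction step is applied.
    """
    predicted = A @ P @ A.T + Q
    if gamma == 0:
        return predicted
    innovation_cov = C @ P @ C.T + R
    gain = A @ P @ C.T @ np.linalg.inv(innovation_cov)
    return predicted - gain @ C @ P @ A.T

if __name__ == "__main__":
    A, C = np.array([[1.2]]), np.array([[1.0]])
    Q, R, P = np.eye(1), np.eye(1), np.eye(1)
    for gamma in (1, 0, 0, 1):      # hypothetical arrival sequence
        P = riccati_step(P, gamma, A, C, Q, R)
        print(gamma, P.item())
```

In this scalar example the covariance grows after each lost packet and contracts again once a packet is received.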
II-E Erroneous Feedback Communication Channel
In the case of unreliable acknowledgments, the packet loss process is not known to the sensor, instead, the sensor receives an imperfect acknowledgment process from the receiver. It is assumed that after the transmission of and before transmitting the sensor has access to the ternary process where
with given dropout probability for the binary process , i.e., for all . In case (i.e., ), no signal is received on the feedback link and this results in an erasure. In case , a transmission error may occur, independent of all other random processes, with probability . This transmission error results in the reception of when , and when . We may write the transition probability matrix of the erroneous feedback channel as a homogeneous Markov process with a transition probability matrix
where for and . This channel model refers to a generalized erasure channel, namely, a binary erasure channel with errors (see Exercise 7.13 in ). This model is general in the sense that if we let then the ternary acknowledgement process reduces to a binary process with the possibility of only transmission errors, and a standard erasure channel when we set . Finally, the case of perfect packet receipt acknowledgments studied in  is a special case when and above are both set to zero.
The present situation encompasses, as special cases, situations where no acknowledgments are available (UDP-case) and also cases where acknowledgments are always available (TCP-case), see also for a discussion in the context of closed loop control with packet dropouts.
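The erasure channel with errors described in this subsection can be sketched as a 2-by-3 transition matrix from the true arrival indicator to the ternary acknowledgment. The encoding of an erasure as the value 2, the parameter names `p_erase` (feedback dropout probability) and `eta` (error probability), and both function names are assumptions for illustration.

```python
import numpy as np

def feedback_matrix(p_erase, eta):
    """Rows: true arrival gamma in {0, 1}. Columns: ack in {0, 1, 2 (erasure)}."""
    keep = 1.0 - p_erase
    return np.array([
        [keep * (1.0 - eta), keep * eta, p_erase],
        [keep * eta, keep * (1.0 - eta), p_erase],
    ])

def draw_ack(gamma, p_erase, eta, rng):
    """Sample the acknowledgment seen by the sensor for a given gamma."""
    return int(rng.choice(3, p=feedback_matrix(p_erase, eta)[gamma]))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    print(feedback_matrix(0.2, 0.1))
    print([draw_ack(1, 0.2, 0.1, rng) for _ in range(10)])
```

Setting `p_erase = 0` leaves a binary channel with transmission errors only, `eta = 0` gives a standard erasure channel, and setting both to zero recovers perfect acknowledgments, in line with the special cases listed above.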
III Optimal Transmission Energy Allocation Problems Subject to Energy Harvesting Constraints
In this section we formulate optimal transmission energy allocation problems in order to minimize the trace of the receiver’s expected estimation error covariances (3) subject to energy harvesting constraints. Unlike the problem formulation in , in the model of this paper the optimal energy policies are computed at the sensor which has perfect information about the energy harvesting and instantaneous battery levels but has imperfect state information about the packet receipt acknowledgments.
We consider the realistic causal information scenario, where the unpredictable future wireless fading channel gains and energy harvesting information are not known a priori to the transmitter. More precisely, the information available at the sensor at any time is given by
where is the initial condition.
The information is used at the sensor to decide the amount of transmission energy for the packet loss process. A policy for is feasible if the energy harvesting constraint is satisfied. The admissible control set is then given by
The optimization problems are now formulated as Markov decision processes with imperfect state information for the following two cases:
(i) Finite-time horizon:
and (ii) Long term average (infinite-time horizon):
where is the stored battery energy available at time which satisfies the battery dynamics (2). It is evident that the transmission energy at time , , affects the amount of stored energy available at time which in turn affects the transmission energy since by (2). In the special case of perfect packet receipt acknowledgments from receiver to sensor, the reader is referred to  for a similar long term average cost formulation under an average transmission power constraint which is a soft constraint unlike the energy harvesting constraint considered here, which is a hard constraint in an almost sure sense.
We note that the expectations in (4) and (5) are computed over random variables , and for given initial condition . Since these expectations are conditioned on the transmission success process of the feedback channel instead of the packet loss process of the forward channel , these formulations fall within the general framework of stochastic control problems with imperfect state information.
It is known that Kalman filtering with packet losses may have unbounded expected estimation error covariances in certain situations (see ). We now aim to provide sufficient conditions under which the infinite horizon stochastic control problem (5) is well-posed in the sense that an exponential boundedness condition for the expected estimation error covariance is satisfied. The reader is referred to  for the problem of determining the minimum average energy required for guaranteeing the stability of the Kalman filtering with the packet reception probabilities (1) subject to an average sum energy constraint.
Let and be the time-invariant probability transition laws of the Markovian channel fading process and the Markovian harvested energy process , respectively.
We introduce the following assumption:
(A2) The channel fading process , harvested energy process and the maximum battery storage satisfy the following:
for some .
for some . We now consider a suboptimal solution scheme to the stochastic optimal control problem (5) where the full amount of energy harvested at each time step is used, i.e., and for . Then (6) will be a sufficient condition in terms of the channel fading process, harvested energy process and the maximum battery storage. Therefore, Assumption (A2) provides a sufficient condition for the exponential boundedness (7) of the expected estimation error covariance.
The condition (6) given by Assumption (A2) may not be easy to verify for all values of , and . If we assume that the channel fading and harvested energy processes are stationary, then it is not necessary to verify the condition for all . Furthermore, in the two most commonly used models of i.i.d. processes and finite state Markov chains, the condition can be simplified as follows:
(i) If and are i.i.d., (6) yields
(ii) If and are stationary finite state Markov chains with and states respectively, (6) yields
IV Solutions to the Optimal Transmission Energy Allocation Problems Via Dynamic Programming
The stochastic control problems (4) and (5) can be regarded as Markov Decision Process (MDP)  problems with imperfect state information [37, 38]. In these formulations the energy harvesting sensor does not have perfect knowledge about whether its transmissions have been received at the receiver or not due to the existence of an imperfect feedback communication channel. Hence, at time the sensor has only “imperfect state information” about via the acknowledgment process . In this section we reduce the stochastic control problems with imperfect state information (4) and (5) to ones with perfect state information by using the notion of information-state .
IV-A Information-State Dynamics
as all observations about the receiver’s Kalman filtering state estimation error covariance at the sensor after the transmission of and before transmitting . We set . The so-called information-state is defined by
which is the conditional probability of estimation error covariance given , and . The following lemma shows how can be determined from together with , and .
The information-state satisfies the following dynamics
with where is the Dirac delta function.
Proof: See the Appendix.
It is important to note that the information-state dynamics (9) depends on the fading channel gains and sensor transmission energy allocation policies via the packet reception probabilities (1). Hence, we may write (9) as
for . Note that in (10) depends on the entire function and not just its value at any particular .
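A discretized version of an information-state update can be sketched with Bayes' rule. The example below is only one plausible reading of the recursion: it represents the conditional distribution of the error covariance as a finite set of point masses, assumes the arrival indicator is independent of the covariance given the channel gain and transmission energy, and uses the erasure channel with errors as the observation model. The names `belief_update`, `p_erase`, `eta`, the erasure code 2 and the scalar covariance map `f` are hypothetical.

```python
def belief_update(belief, ack, h, p_erase, eta, f):
    """Bayes update of a discrete belief over scalar error covariances.

    belief : dict mapping covariance value -> probability
    ack    : acknowledgment in {0, 1, 2}, where 2 denotes an erasure (assumed)
    h      : packet reception probability for the current gain and energy
    f      : f(P, gamma), the covariance update for arrival indicator gamma
    """
    keep = 1.0 - p_erase
    likelihood = {  # P(ack | gamma)
        0: {0: keep * (1.0 - eta), 1: keep * eta, 2: p_erase},
        1: {0: keep * eta, 1: keep * (1.0 - eta), 2: p_erase},
    }
    w0 = (1.0 - h) * likelihood[0][ack]   # joint weight of (gamma = 0, ack)
    w1 = h * likelihood[1][ack]           # joint weight of (gamma = 1, ack)
    total = w0 + w1
    if total == 0.0:
        raise ValueError("acknowledgment has zero probability under the model")
    updated = {}
    for P, mass in belief.items():
        for gamma, w in ((0, w0), (1, w1)):
            if w > 0.0:
                key = f(P, gamma)
                updated[key] = updated.get(key, 0.0) + mass * w / total
    return updated

if __name__ == "__main__":
    # Hypothetical scalar system with a = 1.2 and c = q = r = 1.
    f = lambda P, gamma: 1.44 * P + 1.0 - gamma * 1.44 * P * P / (P + 1.0)
    print(belief_update({1.0: 1.0}, ack=2, h=0.25, p_erase=0.3, eta=0.1, f=f))
```

With perfect acknowledgments (`p_erase = eta = 0`) the belief stays a single point mass, which matches the remark that the information-state reduces to a Dirac delta function in that case.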
In the following sections the stochastic control problems with imperfect state information (4) and (5) are reduced to problems with perfect state information where the state is given by the information-state . The resulting stochastic problems with perfect information are approached via the dynamic programming principle.
We establish some notation. Let the binary random variable be defined akin to in (3), then for a given denote
as the random Riccati equation operator. Let be the set of all nonnegative definite matrices. Then, we denote the space of all probability density functions on as where for any . Let the ternary random variable be defined akin to in Section II-E. Then, based on the information-state recursion (10) denote
for given , fading channel gain and sensor transmission energy allocation .
In the special case of perfect packet receipt acknowledgments, where and in Section II-E are set to zero, the problems (4) and (5) become stochastic control problems with perfect state information. In this case the probability density functions and in the information-state recursion (12) become Dirac delta functions.
IV-B Dynamic Programming Principle
In this section, the transmission energy allocation policy is computed offline from the Bellman dynamic programming equations given below.
Some notation is now presented. Given the fading channel gain and the harvested energy at time we denote the corresponding fading channel gain and the harvested energy at time by and , respectively. We recall that both fading channel gains and harvested energies are modelled as first-order homogeneous Markov processes (see Section II).
IV-B1 Finite-Time Horizon Bellman Equation
The imperfect state information stochastic control problem (4) is solved in the following Theorem.
For given initial condition the value of the finite-time horizon minimization problem (4) is given by which can be computed recursively from the backward Bellman dynamic programming equation
where . The terminal condition is given as
where all available energy is used for transmission in the final time .
Proof: The proof follows from the dynamic programming principle for stochastic control problems with imperfect state information (see Theorem 7.1 in ).
Based on Remark IV.1, it is important to note that in the special case of perfect packet receipt acknowledgments, where and in Section II-E are set to zero, the Bellman equation (13) is written with respect to Dirac delta functions in space , i.e., (see Section 4 in ).
The solution to the imperfect state information stochastic control problem (4) is then given by
with , where is the solution to the Bellman equation (13).
For computational purposes, we now simplify the terms in (13). First, we have
with the constraint that . Since the mutually independent processes and are independent of other processes and random variables, we may write
where , and and are the probability transition laws of the Markovian processes and , respectively. But,
where the function is defined in (12).
(ii) If and are finite state Markov chains with and states respectively, then the right term in (15) becomes
where , , , and are the probability transition matrices for and , respectively, and denotes the i-th component of the vector .
Note that the solution to the dynamic programming equation can only be obtained numerically; there is no closed form solution. In fact, even for a horizon 2 problem with causal information and perfect feedback acknowledgment, it can be shown that the optimal solution cannot be obtained in closed form. It can be observed, however, that for a fixed battery level the energy allocation generally increases with the channel gain, and when the channel gain is above some threshold all of the available battery energy is used for transmission. Similarly, when the channel gain is kept fixed, the energy allocation initially equals the available energy and increases with the battery energy level. Beyond some point, however, the energy allocated for transmission becomes less than the available energy and some energy is saved for future transmissions.
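The numerical nature of the solution can be illustrated on a toy instance. The sketch below solves a short finite-horizon problem by backward recursion in the special case of perfect acknowledgments, where the information-state is a point mass and the state reduces to the covariance, the channel gain and the battery level. The scalar system, the i.i.d. two-point gain and harvesting distributions, the integer energy levels, the reception probability `h` and the assumed battery update are all hypothetical choices for the example.

```python
import math
from functools import lru_cache

A, C, Q, R = 1.2, 1.0, 1.0, 1.0            # hypothetical scalar system
GAINS = ((0.5, 0.5), (2.0, 0.5))           # (gain, probability), i.i.d.
HARVEST = ((0, 0.5), (2, 0.5))             # (energy units, probability), i.i.d.
B_MAX, T = 4, 3                            # battery capacity and horizon

def h(x):
    """Assumed reception probability, increasing in gain times energy."""
    return 1.0 - math.exp(-x)

def f(P, gamma):
    """Scalar covariance update; gamma = 1 means the packet arrived."""
    return A * A * P + Q - gamma * (A * C * P) ** 2 / (C * C * P + R)

@lru_cache(maxsize=None)
def value(k, P, g, B):
    """Return (expected cost-to-go, minimizing energy) at stage k."""
    best = (math.inf, 0)
    # At the final stage all available energy is spent.
    choices = [B] if k == T - 1 else range(B + 1)
    for u in choices:
        p_rx = h(g * u)
        cost = 0.0
        for gamma, prob in ((1, p_rx), (0, 1.0 - p_rx)):
            P_next = f(P, gamma)
            future = 0.0
            if k < T - 1:
                for g2, pg in GAINS:
                    for H2, ph in HARVEST:
                        B2 = min(B - u + H2, B_MAX)
                        future += pg * ph * value(k + 1, P_next, g2, B2)[0]
            cost += prob * (P_next + future)
        if cost < best[0]:
            best = (cost, u)
    return best

if __name__ == "__main__":
    for g, _ in GAINS:
        for B in range(B_MAX + 1):
            print(g, B, value(0, 1.0, g, B))
```

Printing the minimizing energy over the gain and battery grid gives a direct way to inspect how the allocation varies with the channel gain and the stored energy in this toy setting.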
IV-B2 Long Term Average (Infinite-Time Horizon) Bellman Equation
We present the solution to the imperfect state information stochastic control problem (5) in the following Theorem.
Independent of the initial condition , the value of the infinite-time horizon minimization problem (5) is given by which is the solution of the average-cost optimality (Bellman) equation
where , and is called the relative value function.
Proof: See the Appendix.
The stationary solution to the imperfect state information stochastic control problem (5) is then given by
where is the solution to the average cost Bellman equation (16).
Equation (16) together with the control policy defined in (17) is known as the average cost optimality equations. If a control , a measurable function , and a constant exist which solve equations (16)-(17), then the control is optimal, and is the optimal cost in the sense that
and for any other control policy such that , a.s., we have
The reader is referred to for a proof of the average cost optimality equations and related results.
We note that discretized versions of the Bellman equations (13) and (16), which in particular include a discretization of the space of probability density functions , are used for the numerical computation of suboptimal solutions to the stochastic control problems (4) and (5). As the number of discretization levels increases, it is expected that these discretized (suboptimal) solutions converge to the optimal solutions . We solve the Bellman equations (13) and (16) by the use of value iteration and relative value iteration algorithms, respectively (see Chapter 7 in ).
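For readers unfamiliar with relative value iteration, the following sketch applies it to a generic finite average-cost MDP. It is not the discretized information-state problem of this paper: the two-state example, the function name `relative_value_iteration` and the reference-state normalization are illustrative assumptions, and convergence is assumed to hold (for instance, for unichain and aperiodic models).

```python
import numpy as np

def relative_value_iteration(P, c, ref=0, tol=1e-9, max_iter=10_000):
    """Relative value iteration for a finite average-cost MDP.

    P[a] : S x S transition matrix under action a
    c    : S x A array of stage costs
    Returns (average cost, relative value function, greedy policy).
    """
    n_states, n_actions = c.shape
    rel = np.zeros(n_states)
    for _ in range(max_iter):
        q = c + np.stack([P[a] @ rel for a in range(n_actions)], axis=1)
        backup = q.min(axis=1)
        rho = backup[ref]               # average-cost estimate
        new_rel = backup - rho          # normalize at the reference state
        if np.max(np.abs(new_rel - rel)) < tol:
            return rho, new_rel, q.argmin(axis=1)
        rel = new_rel
    return rho, rel, q.argmin(axis=1)

if __name__ == "__main__":
    P = np.array([[[0.5, 0.5], [0.5, 0.5]],     # action 0
                  [[0.9, 0.1], [0.1, 0.9]]])    # action 1
    c = np.array([[1.0, 2.0],
                  [0.0, 3.0]])
    print(relative_value_iteration(P, c))
```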
The causal information pattern is clearly the one most relevant to practical scenarios. However, it is also instructive to consider the non-causal information scenario where the sensor has a priori information about the energy harvesting process and the fading channel gains for all time periods including the future ones. This may be feasible in a known environment where the wireless channel fading gains and the harvested energies are predictable . More importantly, the performance of the non-causal information case can serve as a benchmark (a lower bound) for the causal case. Indeed, we present a performance comparison between the causal and the non-causal cases in the Numerical Examples section. Note that the energy allocation problems for the non-causal case can be solved using similar techniques to Section IV-B, and the details are omitted for brevity.
V Suboptimal Transmission Energy Allocation Problems and Their Solutions
The optimal solutions presented in Section IV require us to compute the solution of Bellman equations in the space of probability densities . In this section we consider the design of suboptimal policies which are computationally much less intensive than these optimal solutions.
Here, we only present suboptimal solutions to the finite-time horizon stochastic control problem (4). Following the same arguments one can design similar suboptimal solutions to the infinite-time horizon problem.
In this case we formulate the problem of minimizing the expected estimation error covariance as
where is an estimate of computed by the sensor based on the following recursive equations (with ):
(i) In the case we have
(ii) in the case we have
(iii) In the case we have
The reason that the solution to the stochastic control problem (18) is called suboptimal is that the true estimation error covariance matrix in (3) is replaced by its estimate . The intuition behind these recursive equations can be explained as follows. Note that in the case of perfect feedback acknowledgements, the error covariance is updated as in case , and in case . In our imperfect acknowledgement model, even when it is received, errors can occur such that is received when , and is received when . Thus the recursions given in (i) and (ii) are the weighted (by the corresponding error event probabilities) combinations of the error covariance recursions in the case of perfect feedback acknowledgements. In the case where an erasure occurs, taking the average of the error covariances in the cases and is intuitively a reasonable thing to do, which motivates the recursion in (iii).
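The intuition described above can be sketched as a single weighted Riccati update. Because the assumed Riccati recursion is linear in the arrival indicator, a weighted combination of the "received" and "lost" updates equals one update in which the indicator is replaced by a weight in $[0, 1]$. The particular weights used here ($1 - \eta$ when an acknowledgment of reception is received, $\eta$ when an acknowledgment of loss is received, and a user-supplied `w_erasure` for erasures) are one possible reading of the text, and the function names and the erasure code 2 are hypothetical.

```python
import numpy as np

def weighted_riccati(P, weight, A, C, Q, R):
    """A P A^T + Q - weight * A P C^T (C P C^T + R)^{-1} C P A^T."""
    gain = A @ P @ C.T @ np.linalg.inv(C @ P @ C.T + R)
    return A @ P @ A.T + Q - weight * gain @ C @ P @ A.T

def estimate_step(P_hat, ack, eta, w_erasure, A, C, Q, R):
    """Update the sensor's covariance estimate from an imperfect acknowledgment.

    ack = 1: reception reported, trusted with weight 1 - eta (assumed)
    ack = 0: loss reported, reception still possible with weight eta (assumed)
    ack = 2: erasure, use w_erasure, e.g. the reception probability (assumed)
    """
    weight = {1: 1.0 - eta, 0: eta, 2: w_erasure}[ack]
    return weighted_riccati(P_hat, weight, A, C, Q, R)

if __name__ == "__main__":
    A, C = np.array([[1.2]]), np.array([[1.0]])
    Q, R, P_hat = np.eye(1), np.eye(1), np.eye(1)
    for ack in (1, 2, 0):           # hypothetical acknowledgment sequence
        P_hat = estimate_step(P_hat, ack, eta=0.1, w_erasure=0.5,
                              A=A, C=C, Q=Q, R=R)
        print(ack, P_hat.item())
```

With `eta = 0` and no erasures, the estimate coincides with the receiver's error covariance, which is consistent with the perfect-acknowledgment case.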
Note that where the conditional probabilities are given in Section II-E. This together with the recursive equations of yields
Since the expression is of the same form as when is replaced by