Stability of Control Systems with Feedback from Energy Harvesting Sensors


Nicholas J. Watkins, Konstantinos Gatsis, Cameron Nowzari, and George J. Pappas. N.J. Watkins, K. Gatsis, and G.J. Pappas are with the Department of Electrical and Systems Engineering, University of Pennsylvania, Philadelphia, PA 19104, USA, {nwatk,kgatsis,pappasg}@upenn.edu; C. Nowzari is with the Department of Electrical and Computer Engineering, George Mason University, Fairfax, VA 22030, USA, cnowzari@gmu.edu.
Abstract

In this paper, we study the problem of certifying the stability of a closed-loop system which receives feedback from an energy harvesting sensor. This is important, as energy harvesting sensors are recharged stochastically, and may only be able to provide feedback intermittently. Thus, stabilizing plants with feedback provided by energy harvesting sensors is challenging in that the feedback signal is only available stochastically, complicating the analysis of the closed-loop system. As the main contribution of the paper, we show that for a broad class of energy harvesting processes and transmission policies, the plant state process can be modeled as a Markov jump linear system (MJLS), which thereby enables a rigorous stability analysis. We discuss in detail the types of transmission policies and energy harvesting processes which can be accommodated, demonstrating the generality of the results.


I Introduction

Energy harvesting technology - which allows for a device’s battery to be recharged online by interacting with the environment - will play a significant role in the development of future smart technologies. Indeed, principled use of energy harvesting technologies will allow for the safe use of sensors in remote locations without the need for explicit, periodic maintenance or replacement. In recent years, much progress has been made in understanding how to use energy harvesting devices in networking and communications applications [1, 2, 3]. However, there is relatively little literature detailing how energy harvesting sensors can be used in control applications, where the closed-loop system’s dynamical behavior is of significant importance. A key desirable property of many control systems is provable closed-loop stability, which often serves as a formal means for guaranteeing safe system operation.

Ensuring the stability of a plant which receives its feedback signal from an energy harvesting sensor is a challenging problem. Since the sensor’s energy is restored by interactions with the environment (e.g. by leveraging vibrations in a mechanical process [4], differences in temperature between a surface and the environment [5], or the presence of ambient solar light [6, 7]), it will typically be the case that feedback can only be provided intermittently. Indeed, in this context, a sensor may only provide a feedback signal with positive probability when it has sufficient energy to transmit the signal. Since the process by which the sensor’s battery is restored (i.e., the energy harvesting process) will often have a significant stochastic component, analysis of the plant’s state is difficult. In particular, correlations between the energy harvesting process and the plant state evolution make the closed-loop dynamics of the system complicated.

Currently, most works which have considered the interface of dynamical systems and energy harvesting sensors have either not explicitly addressed closed-loop stability of the process, or have done so under conservative assumptions. Indeed, the earliest known work on sensing of dynamical systems with energy harvesting sensors [8] explicitly considers the problem of minimizing the expected value of the state estimation error - it does not explicitly address system stability. Similarly, [9] and [10] find conservative conditions under which the estimation error of Kalman filters running on energy harvesting sensors will remain bounded. Works considering the closed-loop stability of the system are limited.

In particular, [11, 12, 13] have studied controllers which guarantee closed-loop stability under conservative assumptions. While these do not directly assume that the open-loop system is stable, they indirectly assume that at every time increment in the process, enough energy arrives so that the sensor can communicate reliably enough with the plant to guarantee uniform decay in a norm of the system’s plant state. Since this assumption directly implies that a positive amount of energy is harvested by the sensor at all times with a positive probability, these results are restrictive. Indeed, it seems that in many practical settings, the energy harvesting process will not supply energy to the sensor for long periods of time. This is indeed the case for solar cells, and also when the sensor is deterministically recharged according to a fixed schedule. As such, it is clear that a more general stability analysis is needed; we perform such an analysis in this paper. In particular, we consider the problem of certifying the closed-loop stability of a plant when supplied with feedback in accordance with a fixed, memoryless, energy-causal transmission policy, where the sensor is recharged by a process modeled as a function of a Markov process.

The primary contribution of this paper is an efficiently computable stability certification method for systems which receive feedback information from an energy harvesting sensor following a known transmission strategy, restored with energy from a known stochastic energy harvesting process. To accomplish this, we show that for a large class of transmission policies and energy harvesting processes, such systems can be modeled as a Markov jump linear system (MJLS) with a mode transition process defined on a state space whose size grows mildly with the size of the energy harvesting process’s transition matrix. We then adapt stability results from the MJLS literature to our setting.

In order to demonstrate the generality of the proposed stability certification method, we discuss in detail the types of transmission policies and energy harvesting processes which can be accommodated. We show that any memoryless transmission policy can be accommodated into our framework. This is important, as this is a sufficiently broad classification so as to be useful for many systems. Indeed, we demonstrate that memoryless policies are all that is required to stabilize the system when the plant is scalar, and intelligently designed memoryless policies often suffice to stabilize nonscalar plants. Likewise, we show that any energy harvesting model which can be posed as a function of a finite-state Markov chain can be accommodated into our framework. This is important, as many common types of energy sources can be modeled as such, as we demonstrate in Section V. The work presented here differs from our preliminary conference paper [14] in that it extends the technical results from undisturbed scalar plants to arbitrary linear plants subject to stochastic disturbances, and provides an extended discussion on modeling different types of energy harvesting sources within the considered framework.

Organization

The paper is organized as follows. The architecture of the system we study is presented in Section II, along with a formal problem statement. The main results of our paper are contained in Section III, in which we propose a test for certifying the stability of an energy harvesting system under a fixed transmission policy. Section IV provides principles regarding the design of transmission policies for energy harvesting control systems. Section V contains examples of energy harvesting processes, with each serving to demonstrate the proposed method’s applicability to a different potential application. Section VI concludes the paper.

Notation

We denote by $\mathbb{N}$ the set of non-negative integers, and for each $k \in \mathbb{N}$ we denote by $[k]$ the set of non-negative integers $\{0, 1, \dots, k\}$. We denote by $\mathrm{proj}_{[a,b]}(x)$ the projection of $x$ into the interval $[a,b]$. Let $\mathbb{P}$ be a probability measure, $E$ an event which is measurable with respect to $\mathbb{P}$, and $X$ a random variable. We use the notation $\mathbb{P}_E(X)$ for the conditional probability of $X$ given $E$ when writing the explicit expression is too cumbersome.

II Problem Statement

A visual representation of the system architecture we study is shown in Figure 1. This models a setting in which an energy harvesting sensor communicates over a stochastic communication channel to stabilize the evolution of a plant. The sensor stores energy in its battery, and restores its charge via a stochastic energy harvesting process. The control designer’s role in this system’s evolution is in designing transmission policies which determine when and how energy should be used in order to affect the evolution of the plant’s state vector. The principal question we address in this text is that of determining if a chosen transmission policy stabilizes the closed-loop evolution of the plant’s state. We now detail mathematical models for each component of the system, and provide a formal problem statement.

Fig. 1: Energy is supplied to an energy harvesting sensor via an external energy harvesting process which is then either immediately used for providing a feedback signal to a plant via a wireless channel or stored for later use in a finite-capacity battery.

II-A Plant Dynamics

We consider plants modeled as a switched linear system

$x_{t+1} = A_{\gamma_t} x_t + w_t, \qquad \gamma_t \in \{c, o\}$   (1)

where the random variable $\gamma_t$ indicates whether or not the plant has received a feedback signal at time $t$, $A_c$ is a real $n \times n$ matrix which describes the nominal evolution of the state when the system operates in closed loop (i.e. has successfully received a feedback signal), $A_o$ is an $n \times n$ real matrix which gives the nominal evolution of the state variable in open loop (i.e. when the system has not received a feedback signal), and $\{w_t\}$ is a sequence of independent, identically distributed (i.i.d.) random variables from a mean-zero distribution with finite, positive definite second moment matrix $W$. Intuitively, we can think of the system operating in closed loop as applying an a priori designed simple linear feedback to the plant.

Note that in the case that $A_o$ is stable, the trivial transmission policy of never transmitting feedback stabilizes the evolution of this system. As such, we expect that in most interesting instances of this problem, $A_o$ will be unstable, though our analysis does not explicitly require this to be the case. Indeed, it may well be the case that when $A_c$ and $A_o$ are both stable, a switching policy can still be used in order to optimize system performance, in which case our certification method will be useful. Indeed, it is well known that it is possible to switch between two stable linear systems in such a way as to induce instability (see, e.g., [15, Example 3.17]), and so closed-loop stability must be certified explicitly.

Note also that while the disturbance process $\{w_t\}$ is considered to be i.i.d. and mean-zero, results similar to those which we demonstrate here hold in the case where $\{w_t\}$ is neither i.i.d. nor mean-zero, but consideration of such cases significantly complicates the underlying analysis, and is thus left for formal discussion in future work. The assumption that $\{w_t\}$ is square integrable is essential, as without such an assumption, the second moment of the plant state process becomes undefined after only one time step. We do not expect these to be severe limitations in practice, as many common disturbance models (e.g. i.i.d. Gaussian) satisfy these assumptions.
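The switched plant model above can be sketched in a few lines of code. The matrices, dimensions, and loop-closure probability below are illustrative placeholders, not values from the paper; the sketch only shows the mechanics of switching between the closed-loop and open-loop matrices with an i.i.d. mean-zero disturbance.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative 2-D plant: A_o (open loop) unstable, A_c (closed loop) stable.
A_c = np.array([[0.5, 0.1], [0.0, 0.4]])   # dynamics when feedback arrives
A_o = np.array([[1.2, 0.3], [0.0, 1.1]])   # dynamics in open loop
W = 0.01 * np.eye(2)                        # disturbance second-moment matrix

def step(x, closed_loop, rng):
    """One step of the switched dynamics x_{t+1} = A_{gamma_t} x_t + w_t."""
    A = A_c if closed_loop else A_o
    w = rng.multivariate_normal(np.zeros(2), W)  # i.i.d. mean-zero disturbance
    return A @ x + w

x = np.ones(2)
for t in range(50):
    # Here the loop closes with a fixed probability; in the paper this event
    # is driven by the channel and the transmission policy.
    x = step(x, closed_loop=(rng.random() < 0.7), rng=rng)
```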

II-B Communication Channel

In order to model channel imperfections and the decision process involved in determining when to transmit a signal, we model the distribution of $\gamma_t$ as itself being a function of the amount of energy committed by the sensor to transmitting the feedback signal at time $t$. We interact with the behavior of $\gamma_t$ by selecting the sensor transmission energy $u_t$ at each time $t$, where the selection may in general be stochastic, in which case we design its distribution. The probability that the plant successfully receives the communication conditioned on a particular transmission energy is given by

$\mathbb{P}(\gamma_t = c \mid u_t = u) = \begin{cases} \lambda, & u \geq \bar{u}, \\ 0, & u < \bar{u}, \end{cases}$   (2)

where $\bar{u}$ is an energy threshold at or above which the transmission is successful with probability $\lambda$, and below which all transmissions are unsuccessful. This model well approximates stochastic channels, such as the sigmoid models often considered in practice [16, 17, 18].
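A minimal sketch of this threshold channel follows. The threshold `u_bar` and success probability `lam` are illustrative placeholders standing in for the paper's $\bar{u}$ and $\lambda$.

```python
import numpy as np

def packet_received(u, u_bar=2.0, lam=0.8, rng=None):
    """Sample the loop-closure indicator for transmission energy u.

    Success probability is lam when u >= u_bar, and 0 otherwise.
    """
    rng = np.random.default_rng() if rng is None else rng
    if u < u_bar:
        return False           # below-threshold transmissions never succeed
    return rng.random() < lam  # at-or-above threshold: Bernoulli(lam)

rng = np.random.default_rng(1)
# Below the threshold no transmission ever succeeds.
assert not any(packet_received(1.0, rng=rng) for _ in range(100))
# At or above the threshold, the empirical success rate approaches lam.
rate = np.mean([packet_received(2.5, rng=rng) for _ in range(10000)])
```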

II-C Harvesting Processes

We assume the energy harvesting process $\{h_t\}$, i.e. the process which details how much energy is received by the sensor from the environment at each time, takes values in some finite set of integers and is a known, deterministic function of an observed discrete-time, discrete-space Markov process $\{q_t\}$ which is independent of the other stochastic processes in the system model. Formally, we can decompose $\{h_t\}$ into three fundamental components: a discrete, finite set $\mathcal{Q}$ of latent process states, a $|\mathcal{Q}|$-dimensional probability transition matrix $P$, and a deterministic function $f$ which maps an element from the latent space of $\{q_t\}$ to the range space of $\{h_t\}$, so that $h_t = f(q_t)$. Since we may label the latent states by integers without loss of generality, we characterize energy harvesting processes in the remainder of the paper by specifying the pair $(P, f)$.

Fig. 2: A plot of a sample Monte Carlo simulation of the harvesting process of solar intensity studied in Section V-C. The confidence interval is given in blue shade.

We make no explicit assumptions about the ergodicity or time-invariance of $\{q_t\}$, and as such our model incorporates as special cases models which range from simple (e.g., with each $h_t$ taking the value of some fixed constant) to complicated (e.g., a function of a periodic Markov process). This level of abstraction allows us to incorporate models for a wide variety of sources into the same framework. For instance, we can think of systems subjected to a regular charging cycle as being modeled by a harvesting process which is essentially deterministic (such as the inductive charging used for in vivo biomedical sensors [19]), as well as systems subjected to highly stochastic, time-varying charging (such as the evolution of solar intensity, as in Figure 2), with each having an energy source well-modeled by a function of a Markov process. We discuss this matter in more depth in Section V.
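The pair $(P, f)$ is easy to simulate. The two-state "cloudy/sunny" chain below, its transition matrix, and the emission map are illustrative assumptions chosen only to show how a Markov-modulated harvesting process is generated.

```python
import numpy as np

# Toy two-state latent chain (state 0 = "cloudy", state 1 = "sunny").
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])          # P[i, j] = Pr(q_{t+1} = j | q_t = i)
f = np.array([0, 3])                # harvested energy units in each state

def simulate_harvest(T, q0=0, rng=None):
    """Simulate h_t = f(q_t) for a finite-state Markov latent process q_t."""
    rng = np.random.default_rng() if rng is None else rng
    q, h = q0, []
    for _ in range(T):
        h.append(f[q])
        q = rng.choice(len(P), p=P[q])   # one latent-state transition
    return np.array(h)

h = simulate_harvest(1000, rng=np.random.default_rng(2))
```

Note that long runs of zero harvested energy occur naturally whenever the chain dwells in the "cloudy" state, which is exactly the behavior the conservative assumptions discussed in the introduction rule out.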

II-D Battery Dynamics

We assume the battery storage process $\{b_t\}$ to be bounded above by a finite battery capacity constant $B$, which is taken to be a given constant in this paper, but can be designed efficiently if the underlying plant model is sufficiently simple (see [14] for further discussion on this matter). Any energy available at time $t$ that is not used to transmit a feedback signal and is above the battery capacity is lost due to overflow. This model guarantees that each element $b_t$ is in the set $\{0, 1, \dots, B\}$, and that the process obeys the nonlinear, stochastic dynamics

$b_{t+1} = \min\{b_t + h_t - u_t,\; B\}$   (3)

where $u_t$ is the selected sensor transmission energy (see Section II-B), and $h_t$ is the amount of energy harvested at time $t$ (see Section II-C). This battery evolution model is common in the energy harvesting literature (see, e.g., [9]). To ensure the sensor only uses energy available to it at each time, we assume that the energy usage process is subject to the energy availability constraint

$u_t \leq b_t + h_t$   (4)

which guarantees that the system never uses energy in excess of the currently stored energy and the amount of energy which has been harvested in the current time increment. Note that this constraint implies directly that $u_t$ takes values in a finite set at all times.
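The battery recursion and the availability constraint can be sketched as follows; the numerical values in the checks are illustrative.

```python
def battery_step(b, h, u, B):
    """One step of the battery recursion b_{t+1} = min(b_t + h_t - u_t, B).

    The energy-availability constraint u <= b + h must hold; harvested
    energy pushing the level past the capacity B is lost to overflow.
    """
    assert 0 <= u <= b + h, "policy may only spend currently available energy"
    return min(b + h - u, B)

# Overflow: a full battery cannot absorb further harvested energy.
assert battery_step(b=5, h=3, u=0, B=5) == 5
# Spending down to empty is allowed when enough energy is available.
assert battery_step(b=2, h=1, u=3, B=5) == 0
```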

II-E Transmission History

We study sensors which keep an indicator in memory, storing whether or not the sensor has attempted to transmit feedback in the previous time slot. We denote this value at time $t$ by $\eta_t$, which evolves as

$\eta_{t+1} = \mathbb{1}\{u_t \geq \bar{u}\}$   (5)

In Section IV, we show that $\eta_t$ is not necessary for designing stabilizing transmission policies for systems with scalar plant dynamics, but that it helps significantly in designing policies for systems with higher-dimensional plants.

II-F Transmission Policies

In this paper, the evolution of the plant is controlled by a given transmission policy, i.e. a stochastic decision rule that the sensor uses to determine when and how to dedicate energy to providing feedback to the plant. We can think of transmission policies as conditional probability distributions, and we restrict them to be conditioned only on the current battery level, latent harvesting state, and transmission history value. Notationally, we have that our policy is defined as

$\pi(u \mid b_t, q_t, \eta_t) = \mathbb{P}(u_t = u \mid b_t, q_t, \eta_t)$   (6)

Note that our assumption that the implemented transmission policy is conditioned as such is somewhat restrictive. Indeed, such an assumption in effect restricts our consideration to the case in which transmission policies are memoryless, as the triple $(b_t, q_t, \eta_t)$ can be thought of as information which summarizes the current state of the sensor/energy harvesting process pair. However, we show in Section IV-A that insofar as stabilizability is concerned, this assumption is not restrictive for systems with scalar plant dynamics, as a simple greedy memoryless policy suffices to stabilize all such plants. Moreover, we show in Section IV-B that such memoryless policies are general enough to encode predictive dwell time policies, which have features which help to mitigate the complexities associated with controlling systems with general linear dynamics. Finally, the assumption plays a critical role in our analysis, and it should not be expected that a substantially more complex case can be handled in full generality, for reasons discussed in more detail in Section III.

II-G Problem Statement

The main objective of this work is to develop an efficient stability test for the plant state process $\{x_t\}$ under a fixed, memoryless transmission policy $\pi$. As such, we must formally define a notion of stability for use in this paper.

Definition (Mean-Square Exponential Ultimately Bounded)

The process $\{x_t\}$ is mean-square exponentially ultimately bounded if and only if there exist finite constants $c \geq 0$, $\alpha \in [0, 1)$, and $\beta \geq 0$ such that

$\mathbb{E}[\|x_t\|^2] \leq c\,\alpha^t\,\mathbb{E}[\|x_0\|^2] + \beta$   (7)

holds for all $t$ and any square integrable $x_0$.

Note that we use this definition of stability in order to emphasize the role that disturbances play in the evolution of the system. In the case where the disturbances are significant, the trace of $W$ will be significant as well, and the system’s ultimate bound may be large. However, when the disturbances tend towards zero, so too does the system’s ultimate bound. Moreover, it will be shown in Section III that the system under consideration may be modeled as a MJLS, for which it is known that mean-square exponential stability and mean-square asymptotic stability are equivalent [15, Theorem 3.9]. As such, the main results of our text are largely invariant to the notion of stability considered, and we will simply refer to $\{x_t\}$ as stable whenever it satisfies Definition II-G.

To make our technical statements concise, we often make reference to an energy harvesting control system, defined as:

Definition (Energy Harvesting Control System)

An energy harvesting control system (EHCS) is formally defined as the $6$-tuple $(A_c, A_o, (P, f), \lambda, \bar{u}, B)$, which encodes the closed-loop dynamics, open-loop dynamics, energy harvesting process, packet reception probability, transmission energy, and battery capacity of the system, respectively.

From the preceding discussion, we see that this tuple contains all parameters of the proposed model, except for the disturbance process. This is due to the fact that, under our assumptions on $\{w_t\}$, the stability of an energy harvesting control system is unaffected by the presence of stochastic disturbances, which we show formally in Proposition III (see Section III).

The problem we consider in this paper is developing a tractable means for verifying the closed-loop stability of the plant state process of an EHCS under a fixed transmission policy which is defined in the sense of Section II-F. This problem is important insofar as it allows a system designer to formally certify that under a fixed transmission policy, the evolution of a given plant will remain safe. We present the solution to this problem in Section III, in which we show that the evolution of an EHCS can be embedded into a MJLS, and adapt a relevant stability result from the MJLS literature to our setting. We demonstrate that the problem considered admits transmission policies and energy harvesting process models which are sufficiently general for useful applications, with detailed discussion in Section IV and Section V, respectively.

III Stability Certification for Energy Harvesting Control Systems

We now develop an efficient test for determining the stability of an EHCS under a given transmission policy $\pi$, where $\pi$ satisfies the definition given in Section II-F. We develop the test by embedding the evolution of the plant state process into the dynamics of a MJLS, and then adapt a stability result from the MJLS literature to show that the standard MJLS stability test can be used in our setting. The essence of the embedding we develop is captured by noting that for a fixed policy $\pi$, the joint process $z_t = (b_t, q_t, \gamma_t, u_t, \eta_t)$ is Markovian, and contains everything necessary to model the dynamics of $\{x_t\}$ as a MJLS.

The process $\{z_t\}$ evolves on a finite state space $\mathcal{Z}$. We demonstrate that $\{z_t\}$ is Markovian by verifying that the Markov property holds, i.e. that $z_{t+1}$ is independent of $z_0, \dots, z_{t-1}$ when conditioned on $z_t$. To do so, we note that the transition probabilities

are constants determined by the EHCS’s specification. To make this point more concrete, define $b_z$, $q_z$, $\gamma_z$, $u_z$, and $\eta_z$ to be the battery level, latent harvesting process state, loop closure state, energy usage, and transmission history value of the system at state $z$, and define the corresponding quantities likewise for a state $z'$. We now show that the probability of transitioning to some state $z'$ from a particular state $z$ depends on the probabilities in the EHCS specification, and whether the battery and history dynamics are respected.

Since we have assumed $\{q_t\}$ to be Markovian and independent of the packet drop process, one may check that for all transitions from a state $z$ to a state $z'$ that occur with positive probability, we have transition probabilities given by

(8)

where $P_{q_z q_{z'}}$ denotes the probability of transition from latent harvesting state $q_z$ to state $q_{z'}$ (see Section II-C), and the remaining factor is notational shorthand for the probability that the amount of energy used at state $z$ is greater than or equal to $\bar{u}$. Intuitively, the right hand side of (8) partitions the set of possible process transitions with non-zero transition probabilities into events corresponding to the evolution of the states of the battery and loop closure processes. For all transitions which occur with zero probability due to not adhering to the battery dynamics (3) or the history dynamics (5), the transition probability is zero. Note that each transition probability is uniquely defined: for each state $z$, the transmission policy specifies exactly one probability distribution for $u_z$, which may then be used to evaluate (8) to a particular constant. As such, the policy fully specifies the transition probabilities of $\{z_t\}$, demonstrating that $\{z_t\}$ is a Markov chain, as claimed. We now embed the evolution of $\{x_t\}$ in a MJLS using this fact.

By noting that the value of $\gamma_t$ is embedded in $z_t$, we may define $\gamma(z)$ to be the state of the loop closure variable at $z$, and write the plant state dynamics (1) as

$x_{t+1} = \hat{A}_{z_t} x_t + w_t$   (9)

where we have defined

$\hat{A}_z = \begin{cases} A_c, & \gamma(z) = c, \\ A_o, & \text{otherwise}, \end{cases}$   (10)

with $\hat{A}_z$ the mode-dependent plant matrix for each $z \in \mathcal{Z}$. From this, it follows that for any fixed, memoryless transmission policy $\pi$, the process $\{x_t\}$ is a MJLS with dynamics (9), mode process $\{z_t\}$, and disturbance process $\{w_t\}$. We now state the stability test we have established in the following theorem, which shows that the stability of an EHCS under a memoryless transmission policy may be determined by solving a semidefinite program.
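The embedding can be illustrated with a small simulation. The helper below takes an arbitrary list of mode matrices playing the role of $\hat{A}_z$ (equal to the closed-loop matrix in modes where the loop closes, and the open-loop matrix otherwise) together with a mode transition matrix for $z_t$; all numerical values are illustrative placeholders.

```python
import numpy as np

def simulate_mjls(A_modes, P_mode, T, x0, z0=0, rng=None):
    """Simulate x_{t+1} = A_{z_t} x_t + w_t with a Markovian mode process z_t."""
    rng = np.random.default_rng() if rng is None else rng
    x, z, traj = np.array(x0, float), z0, []
    for _ in range(T):
        traj.append(x.copy())
        w = 0.01 * rng.standard_normal(x.shape)      # i.i.d. disturbance
        x = A_modes[z] @ x + w                        # mode-dependent update
        z = rng.choice(len(P_mode), p=P_mode[z])      # mode transition
    return np.array(traj)

# Two scalar modes: closed loop (stable) and open loop (unstable).
A_modes = [np.array([[0.5]]), np.array([[1.1]])]
P_mode = np.array([[0.6, 0.4], [0.7, 0.3]])
traj = simulate_mjls(A_modes, P_mode, T=200, x0=[1.0],
                     rng=np.random.default_rng(3))
```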

Theorem (EHCS Mean-square Stability Test)

Let $\mathcal{Z}$ be the state space of the mode transition process generated by an EHCS, and fix some positive constant $\epsilon$. The EHCS is stable under the transmission policy $\pi$ if and only if the optimal value of the semidefinite program

(11)

is equal to its certifying value, where the mode transition probabilities are defined by (8).

Proof

See Appendix -B.

Theorem III provides a simple, efficient test for assessing the stability of an EHCS under a fixed transmission policy $\pi$. Note that the role of the positive constant $\epsilon$ in (11) is simply in bounding the decision variables away from zero, which is only important insofar as it theoretically guarantees that standard semidefinite programming algorithms will terminate with a solution in time which is polynomial with respect to the dimensionality of the plant and the cardinality of $\mathcal{Z}$ (see, e.g., [20] for a more detailed discussion on the complexity of semidefinite programming).

Intuitively speaking, this method functions by computing a mode-dependent quadratic Lyapunov function which serves to certify the stability of the system. The system of linear matrix inequalities in the constraints of the program ensures that the value of the Lyapunov function evaluated on the plant state process of the EHCS is a strict supermartingale, i.e. it enforces that

holds for all times $t$. That this inequality implies exponential mean-square stability for unperturbed Markov jump linear systems is known (see, e.g., [15]); however, the embedding we have constructed to take the EHCS model to a MJLS is novel. Moreover, the particular notion of stability considered here is not covered by standard MJLS stability results, due to the lack of ergodicity of $\{q_t\}$. Hence, we must demonstrate that testing for stability of the unperturbed MJLS is equivalent to testing for stability of the perturbed MJLS. As we need this equivalence result again later in the text (see Section IV), we state it here in the following proposition.

Proposition (Equivalence of EMSUB and EMSS)

Consider the dynamical system

$\tilde{x}_{t+1} = A_{\gamma_t} \tilde{x}_t$   (12)

where the random variable $\gamma_t$ indicates whether or not the plant has received a feedback signal at time $t$, as in the description of (1). Let $\{\tilde{x}_t\}$ be the stochastic process generated by applying the transmission policy $\pi$ to (12), and define $\{x_t\}$ similarly for the perturbed dynamics (1). The process $\{x_t\}$ is exponentially mean-square ultimately bounded if and only if the process $\{\tilde{x}_t\}$ is exponentially mean-square stable.

Moreover, if $\{\tilde{x}_t\}$ is exponentially mean-square stable with constants $c$ and $\alpha$, i.e.

$\mathbb{E}[\|\tilde{x}_t\|^2] \leq c\,\alpha^t\,\mathbb{E}[\|\tilde{x}_0\|^2]$   (13)

holds for all time $t$, it then follows that we may choose the same constants $c$ and $\alpha$ to verify the exponential mean-square ultimate boundedness of $\{x_t\}$.

Proof

See Appendix -A.

In light of Proposition III, the formal proof of Theorem III is straightforward. In particular, we may adapt known results for the exponential mean-square stability of unperturbed MJLS to our framework, and verify that the solution of the particular semidefinite program given is indeed the certifying value when the system is stable. For purposes of completeness, this argument is given in detail in Appendix -B.

Remark (Stability Certification with Fixed Lyapunov Function)

All of the results of this section are written from the perspective that the transmission policy is fixed, and the Lyapunov function used to certify stability is to be computed. This is because there are systems for which one can design good heuristic transmission policies. In fact, we show in Section IV that for scalar systems, one need only check the stability of a particular greedy transmission policy. However, it is worth noting that this perspective is inessential to the fundamental theory.

Computation of a stability certificate is also tractable when the matrices which define the Lyapunov function are fixed, but the transmission policy is left to be determined. To do so, one may use a program similar to (11), but in which the variable matrices are taken to be program data, and the constants used to define the transmission policy are made to be optimization variables, subject to the constraint that $\pi$ is a transmission policy in the sense of Section II-F. Leaving both the transmission policy and the Lyapunov function as unknowns makes the constraints of the optimization problem a set of bilinear matrix inequalities, which are nonconvex and in general difficult to solve.

Remark (Stability Certification for General Policies)

As noted in Section II-F, we have restricted our attention here to a particular subset of memoryless transmission policies. While in principle the stability certification test given by Theorem III can be extended in a straightforward manner to any memoryless policy, we will see in Section IV that the class of transmission policies studied is sufficiently general to stabilize many interesting systems. However, this class of policies is technically restrictive in the sense that strictly more general policies can be defined in practice. For example, one may wish to make the transmission policy time-varying.

We have not explicitly considered more general policies because providing an efficient test for certifying the stability of the EHCS under general transmission policies is technically challenging. In particular, recent results from the MJLS literature [21] show that determining the mean-square stability of a MJLS under a time-varying mode transition process is NP-hard. As this is the problem which would be faced if we allowed for time-varying transmission policies, stability certification would be hard for such a problem.

Remark (Alternate Stability Tests)

It is well known in the MJLS literature that one may check the mean-square stability of the system by determining whether the spectral radius of a linear operator induced by the considered MJLS is less than one. This test is equivalent to that given by Theorem III, but may be more computationally efficient to check in some circumstances. We have fully detailed the semidefinite programming stability test here because it allows for a more accessible construction, and highlights the importance of considering fixed policies (see Remark III). A reader interested in further details on the spectral radius test can consult standard MJLS references, such as [15, Chapter 3].
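A sketch of this spectral radius test follows, using the standard second-moment operator for a MJLS $x_{t+1} = A_{z_t} x_t$ with mode transition matrix $P$ (see, e.g., [15]); the scalar mode matrices in the check are illustrative.

```python
import numpy as np

def mjls_ms_stable(A_modes, P):
    """Mean-square stability test for x_{t+1} = A_{z_t} x_t.

    Builds the operator T = (P^T kron I) * blockdiag(A_i kron A_i)
    governing the stacked second moments E[x x^T 1{z = i}]; the MJLS
    is mean-square stable iff the spectral radius of T is below one.
    """
    n = A_modes[0].shape[0]
    N = len(A_modes)
    D = np.zeros((N * n * n, N * n * n))
    for i, A in enumerate(A_modes):
        D[i*n*n:(i+1)*n*n, i*n*n:(i+1)*n*n] = np.kron(A, A)
    T = np.kron(P.T, np.eye(n * n)) @ D
    return np.max(np.abs(np.linalg.eigvals(T))) < 1.0

# Illustrative check: scalar modes {0.5, 1.1}; spending most time in the
# stable mode yields mean-square stability.
A_modes = [np.array([[0.5]]), np.array([[1.1]])]
P = np.array([[0.9, 0.1], [0.9, 0.1]])
stable = mjls_ms_stable(A_modes, P)
```

Reversing the visit frequencies (e.g. replacing each row of `P` with `[0.1, 0.9]`) makes the same mode set mean-square unstable, which the test detects.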

IV Transmission Policy Design

In this section, we develop some principles for designing good transmission policies for EHCS. Since co-designing a transmission policy along with the Lyapunov function required to verify the system’s closed-loop stability is a nonconvex optimization problem (see Remark III), being able to find good transmission policies is essential in designing controllers which certifiably stabilize the evolution of the system. Formally, we decompose our results into two classifications: those for systems with scalar plants, and those for systems with nonscalar plants.

For scalar systems, we see that one need only check the stability of the system under a particular greedy transmission policy, and that this policy alone certifies the stabilizability of the system, i.e. whether or not a stabilizing causal transmission policy exists. This is important insofar as it allows system designers to choose components of a scalar EHCS (e.g. the battery capacity $B$) so as to guarantee the existence of a stabilizing policy; we discuss this topic in greater detail in the preliminary work [14]. For general nonscalar systems, we show how one can use the concept of dwell time to identify stabilizing policies for systems which the greedy policy fails to stabilize.

IV-A A Stabilizing Policy for Scalar Systems

In this subsection, we show that a simple, greedy transmission policy is sufficient for stabilizing an EHCS with scalar plant dynamics. The greedy transmission policy in question, denoted $\pi_g$, is described mathematically by the conditional probability distribution defined as

$u_t = \begin{cases} \bar{u}, & b_t + h_t \geq \bar{u}, \\ 0, & \text{otherwise}. \end{cases}$   (14)

The transmission policy $\pi_g$ applies exactly $\bar{u}$ units of energy to transmitting a feedback signal at precisely those moments at which the sensor has enough energy available to do so. Despite its simplicity - note that it is a deterministic function of only the battery and harvesting states at each time, and not the transmission history - we prove that it is the only policy which needs to be investigated to establish the stabilizability of an EHCS. That is, if any causal transmission policy exists which stabilizes a particular EHCS, the greedy transmission policy does so as well. We formalize this as follows:
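The greedy rule (14) is a one-line function of the battery and harvesting states; the threshold value in the check below is an illustrative placeholder for $\bar{u}$.

```python
def greedy_policy(b, h, u_bar):
    """Greedy transmission rule: spend exactly u_bar units whenever the
    currently available energy b + h suffices, else spend nothing.

    Deterministic in the battery and harvesting states only; the
    transmission history is ignored.
    """
    return u_bar if b + h >= u_bar else 0

# The policy never violates the energy-availability constraint u <= b + h.
for b in range(6):
    for h in range(4):
        u = greedy_policy(b, h, u_bar=3)
        assert u <= b + h
```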

Theorem (Existence of Stabilizing Policy for Scalar EHCS)

Consider an EHCS with There exists a causal transmission policy

which stabilizes the if and only if the process is stable. That is, a given EHCS is stabilizable if and only if it is stable under the greedy transmission policy.

We detail next the essential features of the argument supporting Theorem IV-A. Interestingly, most of the weight of the proof can be shifted onto proving a pathwise stochastic dominance inequality between the greedy policy and any other causal policy. To demonstrate this, we need to formally define a sample space for the process. For the remainder of the paper, we define the sample space as

where is a vector function containing the evolution of in its first component and numbers for the randomization required to determine particular actions from a stochastic transmission policy in its second component, and is a function indicating whether or not the ’th feedback attempt reaches the plant successfully. Note that we have defined the sample space to consist of pairs of functions so as to be able to index time and the number of communication attempts made by the system separately; this technical detail is important.

Intuitively, the function contains all of the randomization needed to model the processes which are indexed naturally with respect to time: from it, we may fully determine the evolution of the harvesting process as described in Section II-C, and the randomization required to implement a stochastic transmission policy as described in Section II-F. The function contains the randomization needed to model the communication channel according to Section II-B: from it, we can determine whether the ’th time the transmission energy process exceeds - that is, the ’th communication attempt - results in a successful loop closure.

The only subtle point required in verifying that is a proper sample space for the process is in confirming that all process variables are fully determined by the selection of a particular sample path Since all of the process variables at a particular time can either be determined directly from or we briefly discuss how one may compute the channel energy at every time from a selected After observing any sequence of events through time and under any fixed policy we may partition into a collection of disjoint intervals such that the Lebesgue measure of is equal to the probability that . By associating to the probability measure of a sequence of i.i.d. uniform random variables on we may take for whichever satisfies and have that follows the correct distribution.

Note that - unlike in many types of analysis one may perform on models with stochastic control policies - the sample space and probability measure are both unaffected by the particular choice of transmission policy This is accomplished by taking the probability measure on the sample space to be the product of the probability measures of three independent processes. In particular, we have where is the measure induced by the latent-state process of is the measure of a sequence of i.i.d. uniform random variables on the unit interval, and is the measure of a sequence of i.i.d. Bernoulli random variables with success probability This ability to decouple the choice of strategy from the choice of probability measure is important in that it allows us to compare the performance of different control policies on a sample-by-sample basis. More precisely, if we let be the number of successful loop closures attained by a transmission policy through time on sample path we can show the following:
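The decoupling described above can be mirrored in simulation: the sketch below fixes one sample path (a harvest sequence and a sequence of per-attempt channel draws) and evaluates any transmission policy on it, indexing the channel draws by attempt number rather than by time, precisely so that different policies can be compared on the same sample. The simplified battery model and all names are our own assumptions.

```python
def simulate_closures(transmit, harvest_seq, channel_draws, capacity, cost, p_success):
    """Count successful loop closures for one policy on one fixed sample
    path. Because the harvest sequence and channel draws are shared,
    two policies can be compared pathwise (common random numbers)."""
    battery, closures, attempt = 0, 0, 0
    for h in harvest_seq:
        battery = min(capacity, battery + h)
        if battery >= cost and transmit(battery):
            battery -= cost
            # the attempt index, not the time index, selects the channel draw
            if channel_draws[attempt] < p_success:
                closures += 1
            attempt += 1
    return closures
```

Running a greedy policy and a never-transmit policy on the same draws illustrates the pathwise dominance the lemma below formalizes.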

Lemma (Pathwise Dominance of Greedy Policy)

For all times and all samples it holds that

(15)

i.e. the greedy transmission policy dominates every other causal transmission policy in terms of successful loop closures at every time along every sample path.

Proof

See Appendix -C.

A direct consequence of Proposition III is that proving Theorem IV-A only explicitly requires analyzing the evolution of the undisturbed process Considering the consequences of Proposition IV-A in detail demonstrates that for scalar systems, the greedy transmission policy outperforms all others with respect to the state process as stated next.

Corollary (Stochastic Dominance Inequalities)

For all times and all samples it holds that

(16)

and hence it also holds that

(17)

where the expectation is taken with respect to the probability measure

Since Corollary IV-A is proven immediately by noting that is a monotone decreasing function of and integrating, formal proof is not given here. As a direct consequence of (17), it also holds that is exponentially mean-square stable only if is exponentially mean-square stable. Since stability of implies stabilizability of the EHCS and Proposition III implies that is stable if is exponentially mean-square stable, Theorem IV-A is proven as well.

Note that, as stated, the result in Theorem IV-A is conservative. Stability of the greedy policy suffices as a stabilizability test for a larger class of systems than those with scalar plants. Showing that this is the case for simultaneously diagonalizable systems involves only basic algebraic arguments, and so its proof is left out of this manuscript. One can also show that greedy policies suffice for stabilizing systems in which the plant matrices and commute, though the argument is more involved. In particular, the stochastic dominance inequality (16) no longer holds for all time. Indeed, the greedy algorithm can be suboptimal for commuting systems, but the sub-optimality only grows as a polynomial with respect to time, and as such the existence of an exponentially stabilizing policy still implies that the greedy policy exponentially stabilizes the system. However, formally proving this result requires a long, technical argument with many algebraic details, and discussing it further would take us too far from the main path of the results we wish to present in this text. As such, we discuss it no further here, and leave it as future work.

IV-B Predictive Dwell Time Policies

An essential feature of nonscalar linear systems which separates them from scalar linear systems is the lack of multiplicative commutativity. Without the ability to commute the matrix products that describe the system’s dynamics, it is difficult to find a tractable representation of the system’s evolution. In particular, while for scalar systems we have that is a function of only the number of attained loop closures through time and the initial condition, for general systems this is not the case. For emphasis, we write

(18)

where is the number of mode changes experienced by the plant on the interval is the mode the system is operating in after the ’st mode change, and is the number of time slots in which the system remains in the ’th mode. Equation (18) demonstrates that the sequence of dwell times is central in defining the map which takes to As such, the dwell times play a central role in our considerations in this section.
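As a sketch of the decomposition (18), the fragment below evaluates the undisturbed state as a product of matrix powers, one factor per dwell segment, with the earliest segment applied first. A two-mode system (an open-loop and a closed-loop matrix) and all argument names are our own illustrative assumptions.

```python
import numpy as np

def state_after_dwells(x0, A_open, A_closed, modes, dwells):
    """Evaluate the undisturbed state as in (18): a product of powers of
    the two mode matrices, one factor per dwell segment. `modes` lists
    the mode of each segment (0 = open loop, 1 = closed loop) and
    `dwells` the number of slots spent in it, earliest first."""
    mats = (np.asarray(A_open, float), np.asarray(A_closed, float))
    x = np.asarray(x0, float)
    for mode, k in zip(modes, dwells):
        x = np.linalg.matrix_power(mats[mode], k) @ x
    return x
```

For noncommuting matrices, the order of the factors matters, which is exactly why the dwell-time sequence, and not just the closure count, determines the state.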

We now define a class of transmission policies informed by the decomposition (18), insofar as it emphasizes the importance of keeping the system in the closed-loop mode for significant periods of time without interruption. Essentially, dwelling in the stable mode for a long period gives the exponential decay induced by the stability of the closed-loop matrix enough time to dominate the polynomial growth introduced by the switching. We refer to the policies we study as predictive dwell time transmission policies, defined as follows:

Definition (Predictive Dwell Time Policy)

Choose some desired dwell time and some probability We define the predictive dwell time transmission policy with parameters and to be the policy which uses exactly units of energy at the first moment at which the sensor detects that it can attempt consecutive loop closures with probability at least and at each moment thereafter until it no longer has sufficient energy to do so.

That is, we define the predictive dwell time policy with parameters and as the memoryless transmission policy

(19)

where we have defined the symbol for the probability that the process will have sufficient energy available to attempt consecutive loop closures, given an initial process state and that the system will use exactly units of energy at each time in order to do so, i.e.

(20)

where represents the battery charge state under the policy units of time into the future, i.e.

This policy, in particular, is of interest in that it includes the greedy transmission policy developed in Section IV-A in the special case More importantly, it is not obvious from inspection of Definition IV-B that predictive dwell time policies can be computed efficiently for arbitrary problems. Specifically, a naïve method of computing the dwell time probabilities would enumerate all possible sample paths of the embedded state-space process over the interval and compute the required probability by summing over the set of samples which have the sensor transmitting feedback information on consecutive time slots. The complexity of such enumeration is exponential in and would therefore be inefficient. It is thus important to verify that such policies can be computed efficiently if we are to think of them as useful. We now show how dynamic programming can be used to do so.

In this spirit, define the process with transition probabilities given by

(21)

and define the set as the subset of states of such that the system has enough energy to transmit feedback, i.e.

(22)

Defining to be the event that the process is in at time i.e. it suffices to demonstrate that we can compute the probability tractably with respect to the planning horizon By applying the chain rule of conditional probability (see, e.g., [22, Chapter 2]), we have

(23)

Supposing that we have the value stored in a dynamic programming table, we need only demonstrate that can be computed efficiently. By decomposing the event we get

(24)

holds. By applying Bayes’ rule to we have the identity

(25)

in which we note that is the only term which has not yet been explicitly computed and stored. We address this by applying the chain rule a final time, to arrive at the identity

(26)

which depends only explicitly on the transition matrix and the terms which we have explicitly computed in the calculations for Algorithm 1 summarizes computing by the method just described. By inspecting our argument, we have that for each fixed value we have computations. Hence, the total complexity of computing by Algorithm 1 is

Initialization:

1:Define the process as in (21);
2:Define the set
3:Define the event
4:Compute
5:Compute
6:Compute

For

1:Compute by (26) and (25).
2:Compute by (24).
3:Compute by (23).
Algorithm 1 Set Membership Probability Computation
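As a sanity check, the following is a minimal Python sketch of the quantity Algorithm 1 computes. Rather than materializing the conditional probabilities (23)-(26) one by one, it propagates forward the equivalent joint probability of remaining in the transmit-capable set, which the chain rule shows yields the same value. The column-stochastic convention and interface names are our own assumptions, not the authors' implementation.

```python
import numpy as np

def consecutive_set_probability(P, init_dist, in_set, T):
    """Probability that a finite Markov chain remains in the set
    `in_set` for T consecutive time slots, via forward dynamic
    programming in O(T n^2) time instead of enumerating the
    exponentially many sample paths. P is column stochastic:
    P[i, j] = Pr(next state = i | current state = j)."""
    mask = np.asarray(in_set, bool)
    v = np.asarray(init_dist, float) * mask      # in the set at slot 0
    for _ in range(T - 1):
        v = (np.asarray(P, float) @ v) * mask    # take a step, stay in the set
    return float(v.sum())
```

Each iteration costs one matrix-vector product, matching the per-step complexity claimed for Algorithm 1.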

Since for each we need to compute a value of and the size of grows linearly with the size of each component of the process we have that for a fixed and we may efficiently compute the predictive dwell time transmission policy. We record this in the following theorem.

Theorem (Dwell Time Probability Computation Complexity)

The worst-case computational complexity of computing the dwell time probabilities required to compute the predictive dwell time transmission policy given in Definition IV-B for specified parameters and is and can be attained by implementing Algorithm 1.

We close this section by constructing a minimal example demonstrating that the greedy policy outlined in Section IV-A does not suffice to stabilize all possible non-scalar energy harvesting control systems. Choose and let

take the harvesting process to have the latent space process with transition matrix

and energy function and let the packet reception probability be The reader may verify that while is stable, its operator norm is strictly greater than one, which means there are vectors which grow in Euclidean norm when left-multiplied by As such, we may expect that we need to link together more than one loop closure in order to induce decay, and as such there may be energy sources for which a predictive dwell time policy may stabilize the system, where the greedy policy does not. A comparison of the evolution of the system under the greedy transmission policy and the predictive dwell time policy with parameters and is given in Figure 3.

Despite outperforming the predictive dwell time policy in terms of the total number of attained loop closures in the simulated time interval, the greedy policy under-performs the dwell time policy in terms of stability. This example summarizes the essential difficulty in finding stabilizing policies for EHCS with nonscalar plants: optimizing the number of loop closures is not enough to guarantee stability if the plant’s dynamics are nonscalar. However, accounting for the interplay between the modes of the system in the policy design can help mitigate the difficulties encountered in creating a good transmission policy.

(a) Plant state evolution with greedy policy.
(b) Plant state evolution with the predictive dwell time policy.
Fig. 3: A plot of a sample Monte Carlo simulation of the example EHCS, comparing the evolution of the plant state of the process with the dwell time transmission policy against the evolution of the process with the greedy transmission policy. The mean trajectory is given by a solid line, the confidence interval in dark shade, and the confidence interval in lighter shade. It is clear that the dwell time policy stabilizes the system, whereas the greedy policy does not.
Remark (Searching for Stabilizing Policies)

As noted in Remark III, searching for a stabilizing memoryless transmission policy without fixing a particular Lyapunov function is a nonconvex optimization problem, and so in general may be difficult to solve. However, one can efficiently search over the set of all predictive dwell time policies, up to a fixed maximum desired dwell time That is, the number of distinct predictive dwell time policies grows polynomially with respect to the number of states in and and so each such policy can be tested individually to determine if any such policy stabilizes the system. Formal proof of this fact follows from noting that for any fixed the particular transmission policy is fully determined by which subset of states begins a transmission sequence. This partitions the unit interval into at most disjoint subintervals, where for all in a particular subinterval, the induced policy is the same. Since, in general, there is no guarantee that this class of policies contains a stabilizing policy (as is the case where ), we do not dedicate more space to formalizing this concept here.

V Example Energy Harvesting Source Models

We now detail how energy harvesting sources can be modeled within the mathematical framework defined in Section II-C. While we have in general placed no assumptions on the transition matrix other than it be column stochastic, the statistical models we study in this section all have the following block structure:

(27)

where for each in we have that is an column stochastic matrix. By considering (27) in detail, one may note that the transitions of the encoded Markov process are such that states through transition exclusively to states through and so forth. This naturally encodes a time-varying process which is -periodic: any particular state may only be revisited by the process after some multiple of time slots have passed since its last visit. We see in the following subsections how this structure allows a user to encode sources which are time-varying.
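A small helper, under our own illustrative names, assembles a matrix with the block structure of (27) from a list of column-stochastic blocks (square blocks assumed for simplicity), making the encoded p-periodicity explicit.

```python
import numpy as np

def block_cyclic(blocks):
    """Assemble a column-stochastic matrix with the block-cyclic
    structure of (27): block k sends the states of phase k to the
    states of phase k + 1 (mod p), so any state can recur only after
    a multiple of p steps."""
    p, n = len(blocks), blocks[0].shape[0]
    Q = np.zeros((p * n, p * n))
    for k, B in enumerate(blocks):
        r = (k + 1) % p
        Q[r * n:(r + 1) * n, k * n:(k + 1) * n] = B
    return Q
```

Since each column of the assembled matrix contains exactly one block column, column stochasticity of the blocks carries over to the whole matrix.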

V-a Deterministic Energy Harvesting Sources

In this subsection, we construct a general model for a situation in which energy arrives at the sensor according to a deterministic, periodic schedule. This model is appropriate in the case where the end user re-charges the device according to a fixed schedule. This is the case in some interesting applications, such as in vivo biological sensors, which can be recharged by an end-user via an inductive source [19].

Consider a periodic source with period length defined by the periodic sequence We may define the sequence by specifying its first elements, and then taking for whichever is the unique integer less than or equal to which satisfies the equality for some natural number The latent space of the energy harvesting process is used for the purpose of indexing time, hence

(28)

which we note to be a special case of (27), where for all in Note that by design, this particular choice of is a permutation matrix, which induces the latent state variable process to follow the dynamics

(29)

which serves to increment around the ring By defining we then have that the energy harvesting process with exactly models a source with fixed, deterministic, periodic increments
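A minimal sketch of this construction, under our own illustrative names: the permutation matrix of (28) increments a one-hot latent state around the ring as in (29), and the energy function reads off the deterministic periodic increments.

```python
import numpy as np

def ring_permutation(p):
    """The permutation matrix of (28): column k has its single one in
    row (k + 1) mod p, so it increments a one-hot latent state around
    the ring, as in (29)."""
    Q = np.zeros((p, p))
    for k in range(p):
        Q[(k + 1) % p, k] = 1.0
    return Q

def harvest_sequence(energy, q0, Q, T):
    """Walk the latent ring for T slots and read off the deterministic
    harvest increments; `energy` maps latent states to harvested units."""
    q, out = np.asarray(q0, float), []
    for _ in range(T):
        out.append(float(np.asarray(energy, float) @ q))
        q = Q @ q
    return out
```

With a single nonzero increment per period, this reproduces a once-per-period recharge schedule.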

As a particular example, we consider a source in which for all and for all This models the situation in which an end user deterministically recharges the sensor once every day, and endows it with units of energy. Using the algorithmic techniques developed in [14] with and we find that the critical battery capacity - that is, the minimum battery capacity required to guarantee the existence of a stabilizing transmission policy - is

The results of a simulation of the closed-loop system under the greedy transmission policy are given in Figure 5, where the disturbance process is a sequence of uniform random variables on Note that the periodicity of the energy harvesting source is apparent in the statistics of the state process. After each recharging event occurs, the norm of the state decays exponentially quickly. Between recharging events, the norm increases steadily. In the case where the battery is larger than the critical battery capacity, the plant state process is stable; in the other, it is not. Note that the periodicity in the process’s evolution may not yield optimal performance, which suggests that an interesting line of future research may be designing transmission policies which are guaranteed to be stable, but also optimize system performance.

(a) Battery Capacity
(b) Battery Capacity
Fig. 4: A plot of a sample Monte Carlo simulation of the example EHCS, with the mean trajectory given by a solid line, the confidence interval in dark shade, and the confidence interval in lighter shade.
Fig. 5: A simulated EHCS with a deterministic recharging source, as detailed in Section V-A. Note that the simulations display periodicity in the statistics of the process, which are due to the periodicity of the source bleeding onto the plant state dynamics through the greedy transmission policy.

V-B Ergodic Energy Harvesting Sources

In this subsection, we show how the proposed model for energy harvesting sources can be used to model ergodic energy sources. These can be useful in several application areas. For example, several works propose ergodic Markov chains as a good model for the dynamics of the intensity of wind speed [23, 24, 25]. As such, if the sensor is supplied with energy by a small-scale wind turbine, we may expect an ergodic Markov chain to be a good statistical model for the energy harvesting process.

Note that any finite-state ergodic Markov process can be represented as a Markov chain with a finite, irreducible transition matrix (see, e.g., [26]). As a particular example, we consider the EHCS with as a process of i.i.d. uniform random variables on and as a process with latent Markov process with transition matrix

and energy function Note that this particular choice of corresponds to the case in which as we may simply take As constituted, is a skip-free random walk on and can be interpreted intuitively as a stochastic model for wind speed. We determine the critical battery capacity threshold to be by using the techniques developed in [14].
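A toy version of such a skip-free random walk can be assembled as below; the up/down probabilities are our own illustrative choices, not the values used in the paper's example.

```python
import numpy as np

def skip_free_walk(n, up, down):
    """Column-stochastic transition matrix of a skip-free random walk
    on {0, ..., n-1}: from state k the chain moves up one step with
    probability `up`, down one step with probability `down`, and holds
    otherwise; at the boundaries the blocked move folds into the hold
    probability. A stand-in for a wind-speed intensity chain."""
    P = np.zeros((n, n))
    for k in range(n):
        if k + 1 < n:
            P[k + 1, k] = up
        if k - 1 >= 0:
            P[k - 1, k] = down
        P[k, k] = 1.0 - P[:, k].sum()
    return P
```

The self-loops make the chain aperiodic and the single-step moves make it irreducible, so it is ergodic, as the source model requires.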

The evolution of with initial condition and varying from to is given in Figure 6. By inspection, one can see that for is unstable, as the sample expectation of the norm grows without bound, whereas for the system is stable, with the expectation remaining below the bound Unlike the case of a periodic source, the statistics appear to converge to a limiting distribution after an initial period of transience. This is due to the ergodicity of the stochastic source model, which induces a stationary distribution in the energy arrival process, and hence in the state space process.

(a) Battery Capacity
(b) Battery Capacity
Fig. 6: A plot of a sample Monte Carlo simulation of the example EHCS of the system detailed in Section V-B. The mean trajectory for each plot is given by a solid line, the confidence interval in dark shade, and the confidence interval in lighter shade. Note that, unlike in the case of periodic sources, there is no apparent periodicity in the plant’s evolution here.

V-C Periodic Stochastic Energy Harvesting Sources

In this subsection, we detail how our proposal of using processes of the type detailed in Section II-C for modeling stochastic energy harvesting sources can be applied to sources with macroscopic, stochastic periodic fluctuations. This is the most abstract level of generality encapsulated by models with transition matrices structured as (27). Moreover, these are features common to applications which are beholden to daily use or availability cycles. As a concrete example, we may consider the construction of a model of solar light intensity, wherein between sunset and sunrise there is insufficient light available to harvest significant energy.

In this setting, we may assume that the intensity incident to the energy harvesting device’s solar cell decomposes into two effects: the intensity which would be experienced by the solar cell on an ideal, cloudless day, and the dampening effect of clouds. In light of this, we define to be the solar intensity experienced by the sensor at time on an ideal, cloudless day. To model the effect of cloud coverage, we assume that the dampening effect of clouds evolves as an ergodic Markov chain with transition matrix and decreases the intensity of the incident sunlight additively with respect to the ideal value where in the case that at a particular time the cloud loss is more than no energy is received.

By structuring the latent space transition matrix with the block structure given by (27) with for all in we see that we have a periodic stochastic process with periodicity and possible states at each time. Note that, as before, the block structure given by (27) allows us to implicitly keep track of time, by way of noting in which latent state the process currently resides. As such, we define to be the time with respect to the period of the process associated to the latent state Letting represent the fraction of maximum cloud coverage dampening associated to the latent state we have that the total amount of energy harvested by the sensor at a latent state is given by

(30)

where we implicitly define as a function which maps the latent state to the element of the period associated to to be the maximum intensity of sunlight on a cloudless day, and to be the maximum dampening effect placed on the solar intensity due to clouds.

As a particular example, we may take as

(31)

as the ergodic Markov chain taking values on with transition matrix

(32)

loss function and transmission energy Note that the choice of as a trigonometric function of time is well supported by the literature [27]; however, the function used usually depends explicitly on the coordinates of the device on the Earth, its angle with respect to the surface, and the time of year. We have chosen a sinusoid here for simplicity; other models can be incorporated just as easily.
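A minimal sketch of the harvest model (30), with a generic half-rectified sinusoid standing in for the ideal cloudless intensity of (31); the specific sinusoid and all parameter names are our own assumptions.

```python
import math

def solar_harvest(t, period, peak, cloud_loss):
    """Energy harvested per (30): the ideal cloudless intensity (here a
    half-rectified sinusoid over the day, standing in for (31)) minus
    the additive cloud loss, floored at zero since the loss can exceed
    the available intensity."""
    ideal = peak * max(0.0, math.sin(2.0 * math.pi * t / period))
    return max(0.0, ideal - cloud_loss)
```

The half-rectification captures the night hours, during which no energy arrives regardless of cloud cover.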

The simulated behavior of this model is given in Figure 2, where we have and as a sequence of i.i.d. uniform random variables on We see the macroscopic periodic effects we would expect of a solar charging process. Namely, over each period of hours, there are approximately hours of sunlight of varying intensity, and hours of darkness, in which no energy is received by the sensor. We plot the results of a simulated EHCS under the greedy transmission policy in Figure 7, in which we see performance degrade during the periods in which the system receives no energy, and improve when energy becomes available again. However, as predicted, the state process remains bounded for all times when a sufficiently large battery is used, and becomes unbounded otherwise, where the required battery size may be calculated by the techniques in [14]. These observations support the theory presented earlier in the paper, and point to an area of future work, wherein stabilizing policies which also optimize performance are investigated.

(a) Battery Capacity
(b) Battery Capacity
Fig. 7: A plot of a sample Monte Carlo simulation of an example EHCS with a periodic stochastic source, as detailed in Section V-C. Note that the periodicity present in the source process is inherited by the plant state process, by way of passing through the greedy transmission policy. Note also that for this system, the critical battery capacity was found to be and which is confirmed by these simulations.

Vi Conclusions and Future Work

In this paper, we established a computationally efficient means of certifying the stability of the evolution of a plant supplied with feedback signals by an energy harvesting sensor over a wireless communication channel under a fixed transmission policy. We have shown that the developed test applies to any memoryless transmission policy. As we have also proven that memoryless policies are all that are needed to stabilize scalar plants, and can be used to encode more complicated predictive policies capable of stabilizing more complicated plants, we believe the test to be broad enough in this regard to be useful in practice.

Moreover, we have shown that the developed test applies to any system with an energy harvesting process which can be modeled as a function of a finite-state Markov chain, and that such processes can be used as models for several interesting sources including deterministic recharging, wind harvesting, and solar harvesting. As such, the certification test we have developed is quite general in this regard as well, and we believe will be of use in future applications. Future work can come in many directions, including considering a situation in which multiple sensors communicate to multiple plants, and generalizing the control model at the plant beyond simple linear feedback.

Acknowledgments

This work was supported by the TerraSwarm Research Center, one of six centers supported by the STARnet phase of the Focus Center Research Program (FCRP), a Semiconductor Research Corporation program sponsored by MARCO and DARPA.

References

  • [1] S. Ulukus, A. Yener, E. Erkip, O. Simeone, M. Zorzi, P. Grover, and K. Huang, “Energy Harvesting Wireless Communications: A Review of Recent Advances,” IEEE Journal on Selected Areas in Communications, vol. 33, no. 3, pp. 360–381, 2015.
  • [2] S. Sudevalayam and P. Kulkarni, “Energy Harvesting Sensor Nodes: Survey and implications,” IEEE Communications Surveys and Tutorials, vol. 13, no. 3, pp. 443–461, 2011.
  • [3] S. Basagni, M. Naderi, C. Petrioli, and D. Spenza, Wireless Sensor Networks With Energy Harvesting. IEEE, second ed., 2013.
  • [4] S. P. Beeby, R. N. Torah, M. J. Tudor, P. Glynne-Jones, T. O’Donnell, C. R. Saha, and S. Roy, “A micro electromagnetic generator for vibration energy harvesting,” Journal of Micromechanics and Microengineering, vol. 17, no. 7, pp. 1257–1265, 2007.
  • [5] Y. K. Tan and S. K. Panda, “Energy harvesting from hybrid indoor ambient light and thermal energy sources for enhanced performance of wireless sensor nodes,” IEEE Transactions on Industrial Electronics, vol. 58, no. 9, pp. 4424–4435, 2011.
  • [6] V. Raghunathan, A. Kansal, J. Hsu, J. Friedman, and M. Srivastava, “Design Considerations for Solar Energy Harvesting Wireless Embedded Systems,” Proceedings of the 4th international symposium on Information processing in sensor networks, vol. 00, no. C, p. 511, 2005.
  • [7] C. Alippi and C. Galperti, “An Adaptive System for Optimal Solar Energy Harvesting in Wireless Sensor Network Nodes,” IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 55, no. 6, pp. 1742–1750, 2008.
  • [8] A. Nayyar, T. Başar, and D. Teneketzis, “Optimal Strategies for Communication and Remote Estimation With an Energy Harvesting Sensor,” vol. 58, no. 9, pp. 2246–2260, 2013.
  • [9] M. Nourian, A. S. Leong, and S. Dey, “Optimal Energy Allocation for Kalman Filtering Over Packet Dropping Links With Imperfect Acknowledgments and Energy Harvesting Constraints,” IEEE Transactions on Automatic Control, vol. 59, no. 8, pp. 2128–2143, 2014.
  • [10] O. Ozel and V. Anantharam, “State Estimation in Energy Harvesting Systems,” in Information Theory and Applications Workshop, (La Jolla, CA, USA), pp. 1–9, IEEE, 2016.
  • [11] S. Knorn and S. Dey, “Optimal energy allocation for linear control over a packet-dropping link with energy harvesting constraints,” Automatica, vol. 77, pp. 259–267, 2017.
  • [12] Y. Li, F. Zhang, D. E. Quevedo, V. K. N. Lau, S. Dey, and L. Shi, “Power Control of an Energy Harvesting Sensor for Remote State Estimation,” IEEE Transactions on Automatic Control, vol. 9286, no. c, pp. 1–1, 2016.
  • [13] M. Calvo-Fullana, C. Antón-Haro, J. Matamoros, and A. Ribeiro, “Random Access Policies for Wireless Networked Control Systems with Energy Harvesting Sensors,” in Proceedings of the American Control Conference, pp. 3042–3047, 2017.
  • [14] N. J. Watkins, K. Gatsis, C. Nowzari, and G. J. Pappas, “Battery Management for Control Systems with Energy Harvesting Sensors,” in Proceedings of the IEEE Conference on Decision and Control, (Melbourne, Australia), p. (to appear), IEEE, 2017.
  • [15] O. Costa, M. Fragoso, and R. Marques, Discrete-Time Markov Jump Linear Systems. 2005.
  • [16] K. Gatsis, A. Ribeiro, and G. J. Pappas, “Optimal Power Management in Wireless Control Systems,” IEEE Transactions on Automatic Control, vol. 59, no. 6, pp. 1495–1510, 2014.
  • [17] J. Zhang, K. Tan, J. Zhao, H. Wu, and Y. Zhang, “A practical SNR-guided rate adaptation,” Proceedings - IEEE INFOCOM, pp. 146–150, 2008.
  • [18] N. J. Ploplys, P. A. Kawaka, and A. G. Alleyne, “Closed-Loop Control over Wireless Networks: Developing a novel timing scheme for real-time control systems,” IEEE Control Systems Magazine, no. June, pp. 58–71, 2004.
  • [19] F. Zhang, X. Liu, S. A. Hackworth, R. J. Sclabassi, and M. Sun, “In Vitro and In Vivo Studies on Wireless Powering of Medical Sensors and Implantable Devices,” in 2009 IEEE/NIH Life Science Systems and Applications Workshop, LiSSA 2009, pp. 84–87, 2009.
  • [20] L. Vandenberghe and S. Boyd, “Semidefinite Programming,” SIAM Review, vol. 38, no. 1, pp. 49–95, 1996.
  • [21] Y. Z. Lun, A. D’Innocenzo, and M. D. Di Benedetto, “On stability of time-inhomogeneous Markov jump linear systems,” in 2016 IEEE 55th Conference on Decision and Control (CDC), (Las Vegas), pp. 5527–5532, 2016.
  • [22] S. S. Venkatesh, The Theory of Probability: Explorations and Applications. first ed., 2013.
  • [23] A. Carpinone, M. Giorgio, R. Langella, and A. Testa, “Markov chain modeling for very-short-term wind power forecasting,” Electric Power Systems Research, vol. 122, pp. 152–158, 2015.
  • [24] J. Tang, A. Brouste, and K. L. Tsui, “Some improvements of wind speed Markov chain modeling,” Renewable Energy, vol. 81, pp. 52–56, 2015.
  • [25] K. Xie, Q. Liao, H.-M. Tai, and B. Hu, “Non-Homogeneous Markov Wind Speed Time Series Model Considering Daily and Seasonal Variation Characteristics,” IEEE Transactions on Sustainable Energy, vol. 3029, no. c, pp. 1–1, 2017.
  • [26] D. W. Stroock, An Introduction to Markov Processes. 2005.
  • [27] M. Iqbal, An Introduction to Solar Radiation. Toronto: Academic Press, 1983.

-A Proof of Proposition III

By definition, exponential mean-square stability of implies that for every initial condition there exists some positive constant and some constant in the open unit interval such that holds for all times By expanding the dynamics of appropriately, we see that

(33)

holds as well. By definition, we have that the expectation of evolves as