A Proofs

Moving horizon estimation for discrete-time linear systems with binary sensors: algorithms and stability results

Abstract

The paper addresses state estimation for linear discrete-time systems with binary (threshold) measurements. A Moving Horizon Estimation (MHE) approach is followed and different estimators, characterized by two different choices of the cost function to be minimized and/or by the possible inclusion of constraints, are proposed. Specifically, the cost function is either quadratic, when only the information pertaining to the threshold-crossing instants is exploited, or piece-wise quadratic, when all the available binary measurements are taken into account. Stability results are provided for the proposed MHE algorithms in the presence of unknown but bounded disturbances and measurement noises. Performance of the proposed techniques is also assessed by means of a simulation example.

March 20, 2018

Keywords: State estimation; moving-horizon estimation; binary measurements; stability analysis.

1 Introduction

Binary (threshold) sensors whose output can take two possible values according to whether the sensed variable exceed or not a given threshold, are nowadays commonly exploited for monitoring/control aims in a wide range of application domains. A non-exhaustive list of existing binary sensors includes: industrial sensors for brushless dc motors, liquid levels, pressure switches; chemical process sensors for vacuum, pressure, gas concentration and power levels; switching sensors for exhaust gas oxygen (EGO or lambda sensors), ABS, shift-by-wire in automotive applications; gas content sensors (, , , etc.) for gas & oil industry; traffic condition indicators for asynchronous transmission mode (ATM) networks; medical sensors/analyses with dichotomous outcomes. In some applications, binary sensors represent the only viable solution for real-time monitoring. In any case, they provide a remarkably more cost-effective alternative to traditional (continuous-valued) sensors at the price of an accuracy deterioration which can, however, be compensated by using many binary sensors (for different variables and/or thresholds) in place of a single one or few traditional sensors. Moreover, binary (threshold) measurements arise naturally in the context of networked state estimation when, in order to save bandwidth and reduce the energy consumption due to data transmission, the measurements collected by each remote sensor are compared locally with a (possibly time-varying) threshold and only information pertaining to the threshold-crossing instants is transmitted to the fusion center. This latter setting falls within the framework of event-based or event-triggered state estimation [1, 2, 3], and it is more challenging as compared to the usually addressed settings due to the minimal information exchange.

The above arguments, as well as the difficulties due to the very limited information provided by binary measurements, has motivated the work on the exploitation of binary measurements for estimation purposes. In particular, [4, 5] investigated observability and observer design for linear time-invariant (LTI) continuous-time systems under binary-valued output observations. The work in [6, 7] addressed system identification using binary sensors. Specific attention was also devoted to state estimation of hybrid nonlinear systems with binary/quantized sensors [8] and to target tracking with binary sensor networks [9]. A possible solution for coping with the high nonlinearity associated with binary measurements within a stochastic framework is particle filtering [10, 11]. However such techniques, while effective in many contexts, suffer from the so-called curse of dimensionality (i.e., the exponential growth of the computational complexity as the state dimension increases) and from the lack of guaranteed stability and performance (being based on Monte Carlo integration).

The present paper addresses state estimation for linear discrete-time systems with binary (threshold) output measurements by following a moving horizon estimation (MHE) approach. MHE techniques were originally introduced to deal with uncertainties in the system knowledge [12] and, in recent years, have gathered an increasing interest thanks to their capability of taking explicitly into account constraints on state and disturbances in the filter design [13], and on the possibility of having guaranteed stability and performance even in the nonlinear case [14, 15, 16]. In fact, MHE has been successfully applied in many different contexts, ranging from switching and large-scale systems [17, 18, 19, 20, 21] to networked systems [22, 23, 24].

In this paper, the state estimation problem with binary measurements is cast in a deterministic framework, in the sense that no probabilistic description of the plant disturbance and noises is supposed to be available. The estimates are computed by minimizing suitable cost functions defined over a given time-horizon (advancing in time) of finite length, possibly subject to linear inequality constraints accounting for the threshold measurements. Specifically, two different approaches are proposed and analyzed. In the first approach, only the threshold-crossing instants are taken into account in the definition of the cost function, by penalizing the distance of the expected continuous outputs (based on the state estimates) from the threshold at those instants. The main advantage of this solution is that the resulting cost function is quadratic. The second approach, instead, exploits all the available information by defining a piece-wise quadratic cost function which accounts for all the available binary measurements, but requires the solution of a convex optimization problem at each time instant. Both unconstrained and constrained MH state estimators will be presented for the two different choices of the cost function and stability results will be proved, assuming unknown but bounded disturbances. Some of the results of this paper have been preliminarily presented, without proof, in [25].

The rest of the paper is structured as follows. Section 2 formulates the estimation problem of interest and provides considerations concerning observability from binary measurements. Section 3 discusses how to solve the problem by means of the MHE approach, with different variants depending on the choice of the cost function as well as on the inclusion or not of constraints. Section 4 deals with the stability analysis of the proposed MH estimators. In section 5, some numerical examples are presented in order to evaluate and compare the proposed estimators. Finally, section 6 ends the paper with concluding remarks and perspectives for future work.

2 Problem formulation and preliminary considerations

State estimation of dynamical systems with binary sensors reveals a deep connection with observability properties, since the sensing outcomes provide data-sets of measurements derived by sampling the system output on a set of non-periodic and irregularly-spaced time instants. In this context, state observability may be lost and traditional estimation methodologies for linear systems may fail.

The following notation will be used throughout the paper: denotes the matrix obtained by stacking its arguments one on top of the other; denotes the diagonal matrix whose diagonal elements are the scalars ; further, given a matrix , denotes the linear transformation which converts the matrix into a column vector and . Finally, denotes the Kronecker product.

Let us consider the problem of recursively estimating the state of the discrete-time linear dynamical system

(1)

from binary (threshold) measurements

(2)

In (1)-(2): is the state to be estimated; is a known input; ; is the threshold of the th binary sensor; are matrices of compatible dimensions; and are the process and, respectively, measurement noises assumed unknown but bounded. Notice from (1)-(2) that sensor provides a binary measurement according to whether the noisy linear function of the state falls below or above the threshold . The problem (1)-(2) clearly includes, as a special instance, the case of quantized sensors, but, since each binary measurement is possibly related to a different physical variable, the considered setting is actually more general.

Before addressing the estimation problem, some preliminary considerations on the information provided by binary multisensor observations are useful. With this respect, it has been pointed out in [5] that, in the continuous-time case, the information provided by a binary sensor of the form (2) is strictly related to the threshold-crossing instants. In fact, in this case, at every instant corresponding to a discontinuity of the binary signal , it is known that the signal is equal to the threshold value , implying that the linear measurement is available. Hence, observability with binary sensors for continuous-time linear systems can be analyzed within the more general framework of observability for irregularly sampled systems [5]. In particular, observability can be ensured when the number of threshold-crossing instants (which corresponds to the number of available irregularly sampled linear measurements) is sufficiently large.

The situation is, however, different for discrete-time systems. To see this, consider a generic time instant in which the binary signal changes sign, i.e., . Then, it is not possible to state, as in the continuous-time case, that coincides with the threshold . Conversely, it can be simply concluded that there exists such that

(3)

the exact value of being clearly unknown and unobservable from the binary measurements. In view of (3), such discrete time instants at which the output of some binary sensor changes value will be more appropriately referred to as output switching or simply switching instants, instead of threshold-crossing instants like in the continuous-time case considered in [5]. It is easy to see that (3) corresponds to an uncertain linear measurement

(4)

where is the uncertainty and the measurement noise

(5)

As a consequence, even in the presence of bounded disturbances, the uncertainty associated with the measurement (3) depends on and . Recalling that, in general in the context of state estimation for uncertain systems, boundedness of the state trajectories is a prerequisite for the boundedness of the estimation error - see, for instance, the discussion in Section 2.1 of [26] - our attention will be restricted to the case of bounded state and input trajectories by making the following assumption.

  1. At any time , the vectors , , , belong to the compact sets , , , and , respectively.

In practice, the compact sets , , , need not be known by the estimator; they will only be used for stability analysis purposes.

Remark 1

When the discrete-time system is obtained by sampling a continuous-time one with system matrices (), then the amplitude of the uncertainty can be related to the sampling interval . In fact, it turns out that, since in this case and , vanishes as goes to zero and, in addition, when is small .

3 Moving horizon estimation for binary sensors

In order to estimate the state of the linear system given the binary measurements (1)-(2), a MHE approach is adopted. Then, by considering a sliding window , the goal is to find estimates of the state vectors on the basis of the information available in and of the state prediction at the beginning of . Let us denote by the estimates of , respectively, to be obtained at any stage .

Following the discussion at the end of the previous section, a first natural approach for constructing a MH estimator would amount to considering the information provided by the switching instants inside the sliding window , in order to define the cost-function to be minimized. Accordingly, for any time instant and for any sensor index , let us define the set of switching instants as

(6)

Then, the following least-squares cost function can be defined

(7)

where the positive definite matrices , and the positive scalars , are design parameters to be suitably chosen. The first term, weighted by the matrix , penalizes the distance of the state estimate at the beginning of the sliding window from the prediction . The second contribution, weighted by the matrix , takes into account the evolution of the state in terms of the state equation (1). Finally, for each sensor the third term weighted by the scalar penalizes the distances of the expected output (based on the state estimates) from the threshold at the switching instants.

Thus, at each time , the estimates in the window can be obtained by solving the following optimization problem.

Problem : Given the prediction , the input sequence , and the sets , find the optimal estimates that minimize the cost function (7).

Concerning the propagation of the estimation procedure from Problem to Problem , different prediction strategies may be adopted. For instance, a first possibility consists of assigning to the value of the estimate of made at time instant , i.e., . As an alternative, following [15], the state equation of the noise-free system can be applied to the estimate . In this case, the predictions are recursively obtained by

(8)

Such a recursion is initialized with some a priori prediction of the initial state vector. Hereby, this latter possibility will be adopted as it will facilitate the derivation of the stability results (see Section IV).

The main positive feature of Problem is that it admits a closed-form solution since the cost function (7) depends quadratically on the estimates (for the readers’ convenience an explicit expression for the solution is reported in the Appendix). On the other hand, such a cost takes into account only the information pertaining to the switching instants, which however is intrinsically uncertain as discussed in the previous section.

In order to overcome such a limitation, a different cost function can be considered by taking into account all the time instants in the sliding window . To this end, for any sensor , let us define the functions

(9)

Suppose now that at time the sensor provides a measurement . Then, the information provided by such a measurement is that the linear measurement is above the threshold , i.e., belongs to the semi-interval . Such information can be included in the cost function by means of a term of the form which penalizes the distance of the expected output from . Analogous considerations can be made for the case . Summing up, the inclusion of such terms gives rise to a cost function of the following form

(10)

While a closed-form expression for the global minimum of (10) need not exist, since is piece-wise quadratic, it is easy to see that the cost enjoys some nice properties. In fact, it is continuously differentiable with respect to the estimates and also strictly convex (since and ). Hence, standard optimization routines can be used in order to find its global minimum. Clearly, since an optimization has to be performed, it is also reasonable to include constraints accounting for the available information on the state trajectory so that the solver can work on a bounded solution set. In particular, in order to preserve convexity, it is advisable to consider a convex set containing (if is convex, one can simply set ; in general, choosing as a convex polyhedron is preferable so that only linear constraints come into play). Then, at any stage , the following optimization problem has to be solved.

Problem : Given the prediction , the input sequence , the measurement sequences , find the optimal estimates that minimize the cost function (10) under the constraints for .

Also in this case, the predictions are supposed to be recursively obtained via equation (8) starting from a prior prediction . Of course, if no information on the set is available or if it is preferable to resort to an unconstrained optimization routine, one can simply let .

As a final remark, it is worth pointing out that for the two previously presented optimization problems there is a trade-off between estimation accuracy and computational cost. In fact, the cost in Problem is quadratic but accounts only for part of the information provided by the sensors, while Problem accounts for all the available information but requires a convex optimization program to be solved.

3.1 Accounting for additional constraints

Provided that some information on the bounds of the process disturbance and measurement noises is available, additional constraints can be considered in the determination of the state estimates. For instance, considering a convex (usually polyhedral) set containing , one can impose the constraints

(11)

in the solution of the optimization problem. Moreover, assuming the knowledge of upper bounds on the amplitudes , of the measurement noises, for each and each , the constraints

(12)

can be imposed. With this respect, it is an easy matter to see that the constraints in (12) define a polyhedron in the state space as summarized in the following proposition (the proof is reported in the Appendix).

Proposition 1

Given the vector of the estimates in the observation window, the constraints in (12), for and , can be written in compact form as

(13)

where

(14)

While the inclusion of the constraints (11) and (13) in the convex optimization problem is natural, in some circumstances it may be interesting to combine them also with the quadratic cost . For example, minimizing under the linear constraints (13) can be a way to account for the information concerning the non switching instants without the necessity of considering the piece-wise quadratic cost. In fact, this would result in a quadratic programming problem (being the cost quadratic and the constraints linear) for which many efficient solvers are available. It is worth to point out that what is, among the above mentioned options, the best choice clearly depends on the situation under consideration and, in particular, on the available computational resources and on the available information (the bounds may be unknown). Nevertheless, in the next Section it will be shown that both costs and imply some nice stability properties of the resulting MH estimator.

4 Stability analysis

The focus of this section is on the analysis of the stability properties of the state estimators obtained by solving, at each time instant, either Problem or . Specifically, a complete analysis is first provided in the more involved case of Problem . This will be followed by a short discussion on the main differences in the analysis with respect to Problem . Notice that the analysis carried out in [15] for the nonlinear case cannot be directly applied in the present context, since the binary sensors do not satisfy the observability requirement of [15]. The proofs of all results reported in this section can be found in the Appendix.

For each sensor and for each time instant , let us denote by the observability matrix concerning the set of the switching instants in the observation window , i.e,

(15)

Then, the observability matrix related to the switchings in of all binary sensors is

(16)

The following uniform observability assumption is needed in order to ensure that enough information is provided by the binary sensors in each window .

  1. For any , , with .

Remark 2

The above uniform observability assumption is made in accordance with the observation that each output switching can be associated with a linear (albeit uncertain) measurement of the form (3). Hence, each switching instant can be thought of as a sampling instant for the linear output . Generally speaking, such an assumption is satisfied whenever the size of the observation window is sufficiently large and contains a sufficient number of output switchings. Conditions relating the rank of an observability matrix under irregular sampling to the number of samples and to the eigenvalues of the state transition matrix can be found in [5]. With this respect notice that, while such conditions are stated in the continuous-time case, they can be carried over to the discrete-time setting whenever the discrete-time system is obtained by sampling a continuous-time one. Specifically, for an observable sampled-data system (1)-(2) it can be found that the observability matrix defined in (15)-(16) has full rank if the number of switchings in the window , of length , is such that

(17)

where , being the spectrum (set of eigenvalues) of and the argument of . Notice that can be interpreted as the bandwidth of the system (in radians). Hence it turns out that, asymptotically for large , the condition (17) amounts to requiring that the density of switchings be greater than or equal to , which represents the system-bandwidth to Nyquist-bandwidth ratio. It is worth to point out that (17) is only a sufficient (but not necessary) condition for observability. Clearly, a possible way for enforcing (17) amounts to making each threshold oscillate in the range of variability of the corresponding continuous output with a sufficiently high frequency. This latter turns out to be particularly convenient in case is a measurement collected by a remote sensor and a time-varying threshold is used for transmission scheduling.

Before stating the main stability results, some preliminary definitions are needed. Given a symmetric matrix , let us denote by and the minimum and maximum eigenvalues of , respectively. Further, given a matrix , let us denote by its norm. Given a generic subset of an Euclidean space, let us define . Given a generic quantity related to the th binary sensor, let us define and . Finally, let us define the uniform observability index associated to the matrices as

Notice that, under assumption A2, it can be stated that . The following result can now be stated.

Theorem 1

Let assumptions A1 and A2 hold. For each , let the estimate be generated by solving Problem , with recursively obtained via equation (8), and consider the estimation error . Then, the weighted norm of the estimation error can be recursively bounded as

(18)

where

(19)

and , , , , , are suitable constants (given in the proof). In addition, if the weights and , , are selected such that , the norm of the estimation error turns out to be asymptotically bounded in that

Notice that even in the noise-free case, i.e., when the process disturbance and the measurement noise are zero and hence , the asymptotic bound on the estimation error does not go to zero due to the presence of the term in . Indeed, such a term accounts for the intrinsic uncertainty associated with the threshold-crossing instants in discrete-time as discussed at the end of Section 2 (see equation (3) and the subsequent discussion). With this respect, it is worth recalling that, when the discrete-time system under consideration is obtained by sampling a continuous-time system, the quantities and vanish as the sampling interval goes to zero. This means that the smaller is the sampling interval, the smaller turns out to be the asymptotic bound on the estimation error since the information concerning the threshold-crossing instants becomes more precise.

Another important issue concerns the solvability of the stability condition . In particular, the following result can be readily proved.

Proposition 2

Let assumption A2 hold. Then, and, hence, it is always possible to select the weights , and , , so that . In particular, in order to ensure the stability properties of the estimator, the following relation must hold:

(20)

where and are suitable positive constants (given in the proof).

Remark 3

In the statement of Theorem 1 the estimates are generated by solving Problem , in which the constraints for are present. Such an assumption can be removed in the stability analysis just by considering that, in general, the lower bound of each term (see the Appendix) is

and , where

(21)

It can be readily observed that the matrices , and are proportional to , and . Indeed, , and .

Consider now the case in which, for each , the estimate is generated by solving Problem , with recursively obtained via equation (8). Notice that, in this case, no constraint is imposed on the estimates which can be readily obtained as the unique global minimum of the strictly convex, quadratic function . A close inspection of the proof of theorem 1 shows that the same line of reasoning can be applied also for Problem . The main difference is that, when deriving the lower bound for the optimal cost, each term can be simply replaced with the quantity in accordance with the definition of cost . Then an inequality analogous to (18) can be derived, with the important difference that, in the definition of the novel , can be replaced by (which is consistent with the fact that the constraint set is not used in the solution of Problem ).

Remark 4

While the foregoing analysis does not account for the possible presence of the additional constraints discussed in Section III-A, analogous results could be easily obtained also when the constraints (11) and/or (13) are imposed in the determination of the state estimates. In this case, the bound on the estimation error turns out to be smaller thanks to the additional information provided by such constraints.

Remark 5

As a final remark, it is pointed out that the extension of the stability results reported here to the case in which the binary measurements are obtained by thresholding nonlinear output maps and/or the system dynamics is nonlinear does not entail particular conceptual difficulties, by combining the analysis of Theorem 1 with that of [15, 16]. On the other hand, in this case, establishing a link between the observability properties and the number of threshold crossing instants (see Remark 2) appears more challenging. Further, for nonlinear output maps, the resulting cost functions need not be convex.

5 A numerical example

In this section, a numerical example is given in order to show the effectiveness of MHE algorithms in the presence of binary measurements. In particular, let us consider a double oscillator, defined by the following unforced continuous-time linear dynamical system:

(22)

that models a spring - mass system (Fig. 1), in which and are, respectively, the constant factor characteristics (stiffnesses) of the springs and and the corresponding masses. The state of the system is represented by the vector , where and are, respectively, the displacements of the masses and from their static equilibrium position. The continuous-time model is discretized with sampling interval equal to [s]. The values of the model parameters are taken equal to [Kg], [N/m]. In all the simulations, the initial state is chosen so as to impose the harmonic motion condition, i.e., making the two masses oscillate with the same frequency but different amplitudes; the initial phase of the oscillations is a uniformly distributed random variable. The process disturbance is taken equal to zero, while the measurement noise is a white sequence with uniform distribution in the interval .

Figure 1: Double oscillator, given by a springs - masses system.

For the sake of brevity, only results concerning the least-squares MHE (LSMHE) algorithm, corresponding to the solution of the unconstrained Problem , are presented since the other algorithms exhibit similar behaviors. For a comparison of the different solutions the interested reader is referred to [25]. The weight matrices in the MHE cost are chosen equal to , and , the size of the sliding window is . Fig. 2 shows the time behavior of the state variables and of the corresponding estimates in one random simulation with one binary sensor with threshold and noise level . As it can be seen, despite the fact that the estimator is initialized far from the true initial state and the limited amount of information exploited in the cost function, the two trajectories converge after a few seconds.

Figure 2: True value (dashed blue line) and estimate (solid red line) of the four components of the state vector as a function of time.

Further, in order to evaluate the performance of the LSMHE algorithm with different initializations and in different settings, Monte Carlo simulations have been performed by randomly varying the measurement noise realization, the phase of the oscillations for the true state trajectories, and the a priori prediction , which is randomly generated with uniform distribution in . The root mean square error (RMSE) has been considered as performance index

(23)

where is the state estimation error at time in the th simulation run, averaged over Monte Carlo trials. Fig. 3, which shows the time behavior of the RMSE with one binary sensor with threshold and noise level , confirms the effectiveness of the LSMHE algorithm (notice that the oscillations in the plot are due to the high-nonlinearity of the sensor). The dependence of the performance on the threshold and on the noise level is analyzed in Figs. 4 and 5, respectively, where the asymptotic RMSE (ARMSE) is reported (i.e., the average of the RMSE after the transient computed in the time interval [s]). Fig. 4 shows that, when the threshold is close to zero, performance undergoes a substantial decay. Such a behavior is due to the fact that as goes to zero the observability index associated with the output switching instants become small. Finally, it can be seen from Fig. 5 that the RMSE decreases almost linearly as the noise level decreases. However, even when the noise is zero, the ARMSE does not go to zero due to the intrinsic uncertainty associated with binary measurements which was discussed at the beginning of Section 2.

Figure 3: Plot of the RMSE as a function of time for the LSMHE algorithm.
Figure 4: Plot of the ARMSE as a function of the threshold . The measurement noise level is .
Figure 5: Plot of the ARMSE as a function of the measurement noise level . The threshold is .

6 Conclusions

The paper has faced state estimation for linear discrete-time systems, in which the available information is provided by binary multi-sensor observations, in the presence of unknown but bounded noises affecting both the system and the measurement equations. Two novel moving-horizon estimators have been introduced, resulting from the minimization of a least-square and a piece-wise quadratic cost function, respectively, with the possible inclusion of constraints. The stability of the estimation error dynamics for the proposed filters has been analyzed and related with the degree of observability associated with the time instants in which the binary outputs switch value. A simulation example concerning an oscillating mechanical system has also been provided in order to show the applicability of the approach.

Appendix A Proofs

Closed-form solution of Problem :
Let us consider the cost function (7). Under the assumption , (7) can be written as the following quadratic form:

(24)

where , and the matrices , are defined as

with

and

Necessary condition for the minimum of the cost function (24) is

(25)

for any . Solving (25) as a function of , we obtain the optimal estimates , that minimize the cost function (7), namely

(26)

Choosing the weighting matrices and as positive semi-definite matrices and , the solution (26) corresponds to a global minimum, since the Hessian matrix of the cost function is strictly positive definite.

Proof of Proposition 1:
For each , we initially introduce the constraints for the th measurement equation, :

(27)

The system (27) is equivalent to the inequality

(28)

Observing that , , we obtain

If we define , , and , then we can write

since . Moreover, introducing the matrices , and as in (14), the constraints (27) can be written in matrix form, namely

(29)

where . Observing that , it can be noted that (29) is equal to (13), so that the proposition is proved.

Proof of Theorem 1:
Some preliminary definitions are needed. Let us denote by , , the th Lipschitz constant of the function with respect to on for every . Further, consider for each sensor and each sliding window , the vector . Then, we can write

where

and and are suitable matrices. Let be defined as . Clearly, is finite since can assume only a finite number of configurations in the estimation window.

Let us now consider the estimation error as ; the aim is to find a lower and an upper bound for the optimal cost

(30)

to derive a bounding sequence on the norm of the estimation error.

– Upper bound on the optimal cost :

For the optimality of the cost function , we have and hence

(31)

The discontinuous function