Offline state estimation for hybrid systems
via nonsmooth variable projection
Abstract
A hybrid dynamical system switches between dynamic regimes at time or statetriggered events. We propose an offline algorithm that simultaneously estimates discrete and continuous components of a hybrid system’s state. We formulate state estimation as a continuous optimization problem by relaxing the discrete component and use a robust loss function to accommodate large changes in the continuous component during switching events. Subsequently, we develop a novel nonsmooth variable projection algorithm with GaussNewton updates to solve the state estimation problem and prove the algorithm’s global convergence to stationary points. We demonstrate the effectiveness of our approach on simple piecewiselinear and nonlinear mechanical systems undergoing intermittent impact.
AM]Jize Zhang, ECE]Andrew M. Pace, ECE]Samuel A. Burden, AM]Aleksandr Aravkin
Department of Applied Mathematics, University of Washington, Seattle
Department of Electrical & Computer Engineering, University of Washington, Seattle
Key words: State Estimation, Hybrid Systems, Nonlinear Systems, Mechanical Systems, Optimization
1 Introduction
This paper considers the problem of using noisy measurements from a piecewisecontinuous trajectory to estimate a hybrid system’s state. The state estimation problem has been extensively studied in classical dynamical systems whose states evolve according to one (possibly time–varying) smooth model. This problem is fundamentally more challenging for hybrid systems since the set of discrete state\@footnotemark\@footnotetextWe refer to the discrete component of the hybrid system state as the discrete state, and similarly refer to the continuous state, although the state of the hybrid system is specified by both the discrete and continuous components. sequences generally grows combinatorially in time.
When the discrete state sequence and switching times are known a priori or directly measured, only the continuous state needs to be estimated, yielding a classical state estimation problem; this approach has been applied to piecewiselinear systems [30, Chap. 4.5] and to nonlinear mechanical systems undergoing impacts [25]. When the discrete state is not known or measured, estimating both the discrete and continuous states simultaneously improves estimation performance. One approach uses a bank of filters, each tuned to one discrete state, and selects the discrete states as the filter with the lowest residual [10, §4.1]. This filter bank method has been applied to hybrid systems with linear dynamics [8, §4.1] [20], nonlinear dynamics [11], and jumps in the continuous state when the discrete state changes [9]. Likewise, particle filter methods for hybrid systems [14, 17, 29] use a collection of filters, identified as particles, and are applicable to more general nonlinear process dynamics. Particle filters and filter banks are effective when the number of discrete states and dimension of continuous state spaces are small.
Another approach formulates a movinghorizon estimator over both the continuous and discrete states, resulting in a mixedinteger optimization problem [13]. The inherently discrete nature of the problem formulation enables estimation of the exact sample when the discrete state switches, at the expensive of combinatorial growth of the set of discrete decision variables as the horizon increases. Multiple methods have been developed to mitigate the challenge posed by this combinatorial complexity. One approach entails summarizing past measurements and state estimates with a penalty term in the the objective function [19]. Another approach, applicable to systems with bounded noise, entails restricting the set of possible discrete state sequences using a priori knowledge of the system [1, 2].
An alternative approach to circumventing the combinatorial challenge entailed by exactly estimating the discrete state sequence involves relaxing the discrete state estimate to take on continuous values as in [7, 22]. The latter reference uses a sparsitypromoting convex program whose objective incorporates a nonsmooth penalty across all possible discrete state sequences, and guarantees the estimate converges to the true continuous and discrete states. Both approaches are formulated for piecewiselinear systems whose continuous states do not jump when switching between subsystems; in the language of hybrid systems, the continuous states are reset using the identity function.
Our approach and contributions
We propose an offline algorithm for estimating the state of hybrid systems with nonlinear dynamics, non–identity resets, and noisy process and observation models. Our starting point is the optimization perspective on generalized and robust state estimation [3, 4]. To formulate state estimation as a continuous optimization problem, we relax the discrete state to take on continuous values as in prior work. Unlike prior work, we model process noise using the Student’s distribution, which allows large innovations and makes the method applicable to systems with nonidentity resets.
In combination, these elements yield a nonsmooth nonconvex continuous optimization formulation (Sec. 2). We develop a GaussNewton type algorithm to solve this problem and prove the algorithm globally converges to stationary points (Sec. 3). The problem formulation and algorithm is evaluated on piecewiselinear and nonlinear hybrid system models (Sec. 4).
2 Problem formulation
2.1 Process and observation models
We consider a class of discretetime switched systems
(1)  
where indexes the continuouslydifferentiable process model , is the continuouslydifferentiable measurement model that generates observations of the hidden continuous state , are process and measurement noises, and is a onehot vector\@footnotemark\@footnotetext is onehot if for all and ; denotes the set of onehot vectors.that indicates which process model is active at time . Note that the observations do not depend explicitly on the active model , which must be inferred from measurements of the continuous state .
The model that is active during each time step may be determined by an exogenous signal, prescribed as a function of time or state, or some combination thereof. Thus, the equation in (2.1) can represent the process and observation models of a wide variety of hybrid systems. We are motivated theoretically and experimentally to focus on cases where the active model is constant for many time steps, only occasionally switching to a new model. Such cases arise, for instance, when sampling trajectories of a hybrid dynamical system with a fixed timestep (so long as the timestep is much smaller then the average dwell time [21]).
When the discrete state changes in a hybrid system, the continuous state may change abruptly according to a reset map. As an example, the velocity of a rigid mass changes abruptly when it impacts a rigid surface [24]. Empirically, these discrete reset dynamics are much more poorly characterized than their continuous counterparts. For instance, whereas the ballistic trajectory of a rigid mass is wellapproximated by Newton’s laws, the abrupt change in velocity that occurs at impact is not consistent with any established impact law [18]. Including such a reset in the system model (2.1) will introduce bias into the state estimate because the model will generate erroneous predictions at resets, diminishing the accuracy of estimated states at nearby times. This observation motivates us in the next section to account for the effect of unknown resets as part of the process noise.
2.2 Process noise and observation noise models
Instead of incorporating continuous state resets explicitly into the model (2.1), we introduce a distributional assumption on the process noise that accepts large instantaneous changes in the continuous state estimate. Specifically, we assume that process noise follows a Student’s distribution. Compared with the commonlyused Gaussian distribution, the heavytailed Student’s is tolerant to large deviations in the estimate of the hidden continuous state [6]. Hence, the Student’s error model allows an instantaneous change in the state that is consistent with (2.1) before and after the change. The negative loglikelihood of the Student’s (as a function of ) is given by
where is the degrees of freedom parameter of the Student’s , and is the covariance matrix.
If the continuous state was known, then any residual between the predicted observations and actual measurements at time is due to measurement noise; in particular, the residual does not exhibit large deviations due to continuous state resets at switching times. Thus, we assume the measurement noise follows the usual Gaussian distribution, with negative loglikelihood
where is the covariance matrix. The plots below provide a comparison between the probability density (left) and the negative loglikelihood (right) for the scalar Gaussian (solid blue) and Student’s distributions (dashed red; degreeoffreedom ). probability density negative loglikelihood
2.3 State estimation problem formulation
The maximum a posteriori (MAP) likelihood for estimating states of (2.1) is given by
(2)  
Problem (2.3) is a nonlinear mixedinteger program with respect to both the continuous () and discrete () decision variables. We can significantly simplify the structure by establishing the following lemma.
Lemma 1 (Formulation Equivalence)
Given , any vectors , models , and any penalty functional , we have
and  
Proof:
Since for both problems, there are only possible values for both objective functions, i.e.
Hence, the minimum objective value for both problems will be and every minimizer is a onehot vector that selects a minimum value.
Therefore, an equivalent formulation to (2.3) is given by
(3)  
Although still a mixedinteger program, this reformulation exhibits linear coupling between the discrete variables and continuous variables . We will leverage this linear coupling when we develop our estimation algorithm based on the relaxed problem formulation introduced in the next section.
2.4 Relaxed state estimation problem formulation
Ultimately, the discrete state estimate will be specified as a onehot vector, . To formulate a continuous optimization problem that approximates the mixedinteger problem formulated in the previous section, we relax the decision variable to take values in the convex hull of .\@footnotemark\@footnotetext We use to denote the simplex in . The optimal relaxed will generally lie on the interior of the simplex, so we project the result from our relaxed optimization problem to return the onehot discrete state estimate. Since this relaxationoptimizationprojection process tends to induce frequent changes in the discrete state estimate, we introduce a smoothing term on ,
yielding the continuous relaxation of (3) given by
(4)  
where is the concatenated variable containing all , is the concatenated variable containing all , and convex indicator function is defined by
The optimal relaxed discrete state estimate is projected onto by choosing the (unique) onehot vector whose component is equal to 1.
3 State estimation algorithm
We develop an algorithm to solve the relaxed state estimation problem formulated in (4) using two key ideas:

nonsmooth variable projection;

GaussNewton descent with Student’s penalties.
These two ideas are explained in the next two subsections, followed by a convergence analysis in the third subsection.
3.1 Nonsmooth variable projection
The first idea is to pass to the value function, projecting out (partially minimizing over) the variables. Define
(5) 
with as in (4). The objective is convex in , but not strictly convex. To guarantee differentiability of , we add a smoothing term and consider
(6) 
where is usually taken to be a very small number (e.g. ) so that the added term has minimal effect on the original value function. (The minimizer of is different from that of .) The function is a Moreau envelope [27, Def 1.22] of the true value function ; we refer the interested reader to [5] for details and examples concerning the Moreau envelope specifically (and nonsmooth variable projection more broadly). The unique minimizer can be found quickly and accurately since the minimization problem with respect to is strongly convex: projected gradient descent converges linearly and can be accelerated using the Fast Iterative ShrinkageThresholding Algorithm (FISTA) [12] approach. With the minimizer , the gradient of is readily computed as
(7) 
3.2 GaussNewton descent with Student’s penalties
In this section we write the objective function (8) as a convex composite function and specify the line search we use to update . The particular line search method was first proposed in [15] for a general class of algorithms for convex composite problems; the GaussNewton update derived here is a special case of this general line search scheme.
To cast the objective into a convex composite function, let , where
with
and
At each iteration, we choose a search direction that
(9)  
where the equivalence is obtained by dropping terms independent of . In general can be any positive semidefinite matrix that varies continuously with respect to , but for our particular objective function involving Student’s t penalty, is chosen to be a Hessian approximation of the Student’s term in . Therefore the update can be interpreted as a GaussNewton style update. This approximation, proposed in [6, (5.5), (5.6)], is employed here because of its significant computational advantage; it is of the form
(10) 
with
for , and
We can rewrite as
where
and
Clearly is positive semidefinite; we show in Lemma 3 that is actually positive definite, so problem (9) reduces to the block triadiagonal linear system
Given , the new is of the form
where is a step size selected using the Armijotype [26, Sec. 3.1] line search criterion.
(11)  
with
When , we have \@footnotemark\@footnotetextWe overload here to match the notation in [15, 6]; should not be confused with , which is used to denote the simplex containing relaxed state estimates., and since we choose the minimizing
we have . Further,
by [15, Thm. 3.6]. In other words, stationarity is achieved when . When , we are guaranteed to have descent
since is positive semidefinite. This condition ensures that the line search step (11) is welldefined [15, Lemma 2.3].
3.3 Convergence of state estimation algorithm
The convergence of Algorithm 1 to a stationary point for a general class of convex composite objective functions is established in [15] and [6]. In particular [6, Theorem 5.1] establishes the possible outcomes when applying this type of algorithm; informally, either the algorithm converges or the search direction diverges. In the remainder of this section we provide two technical results needed to formalize this intuition:
Lemma 2
Let . If is bounded and is positive definite for all , then the hypotheses in [6, Theorem 5.1] are satisfied and the sequence of search directions is bounded.
Proof: The hypotheses in [6, Theorem 5.1] require that to be bounded and uniformly continuous on the set where stands for the closed convex hull. is continuous on since exists and is continuous by property of Moreau envelope and proximal operator, and is continuous trivially. Further, given that is closed by definition and bounded by assumption, it is compact. Hence is bounded and uniformly continuous on .
Now we need to show that the sequence of search direction is bounded. At any iteration, the search direction we choose satisfies
where the first inequality relies on and on the positive semidefinite property of ; the second inequality comes from ; the third inequality results from the line search condition that creates a decreasing sequence .
Since is finite, for all iterations. Because is closed by closedness of and is continuous, is also closed. Along with its boundedness by assumption, is compact. Since is continuous, its image is bounded, hence given that is positive definite there exists some for all . Therefore , which implies that cannot be unbounded.
Lemma 3
is bounded for problem (8) and is positive definite for all .
Proof: First note that is bounded by the coercivity of . This implies that for an unbounded sequence , we still have and .
If , then we can find some and a subsequence such that . By the definition of and , , which further implies that . Iteratively this means that for all , in particular for the given starting point , but that is not possible.
To show that in (10) is positive definite, recall that we can rewrite as
If there exists some such that , then
since , and
are positive semidefinite. However because each , there has to be some for each . Therefore must be positive definite for all .
4 Experiments
To evaluate the proposed approach to state estimation for hybrid systems, we apply our algorithm to linear and nonlinear impact oscillators. In addition to being wellstudied ([16, §1.2], [28]), these mechanical systems were chosen since they are among the simplest physicallyrelevant models that satisfy our assumptions, including non–identity reset maps. The parameter and trajectory regime considered in what follows is representative of a jumping robot constructed from one limb of a commerciallyavailable quadrupedal robot [23] and controlled with an eventtriggered stiffness adjustment; Figure 0(a) contains a photograph of the limb. The jumping robot’s hip and foot are constrained to move vertically in a gravitational field, so the rigid pantograph mechanism depicted in Figure 0(b) has two mechanical degreesoffreedom (DOF) coupled through nonlinear pinjoint constraints. These two DOF are preserved, but their nonlinear coupling is neglected, in the piecewiselinear model illustrated in Figure 0(c). The hybrid dynamics of these linear and nonlinear impact oscillators are specified in Section 4.1
We perform two sets of experiments. The first set of experiments in Sec. 4.2 concern the piecewiselinear model depicted in Figure 0(c) and explore the consequences of our modelling assumptions and the efficacy of our proposed algorithm:

Sec. 4.2.1 demonstrates the advantage of employing a Student’s distribution for process noise as compared to a Gaussian distribution;

Sec. 4.2.2 demonstrates the superior convergence rate yielded by GaussNewton descent directions as compared to gradient (steepest) descent;

Sec. 4.2.3 demonstrates the advantage of smoothing the relaxed discrete state estimate; and

Sec. 4.2.4 demonstrates the algorithm’s performance when onboard measurements are used instead of offboard measurements.
The second set of experiments in Sec. 4.3 evaluate our proposed approach using the nonlinear model depicted in Figure 0(b).
Since this section is devoted to comparing estimated states to ground truth simulation results, and since our approach entails the determination of a relaxed discrete state estimate en route to obtaining the discrete state estimate, we now introduce notation that distinguishes these quantities:

denotes the ground truth discrete state;

denotes the relaxed discrete state estimate;

denotes the discrete state estimate.
This notational distinction was not introduced previously in the interest of readability since there was no ambiguity entailed by overloading notation in the problem formulation and algorithm specification.
4.1 Impact oscillator hybrid system models
The continuous state for the jumping robot hybrid system model consists of the twodimensional configuration vector and corresponding velocity , where and denote the vertical height of the hip and foot, respectively. The foot is not permitted to penetrate the ground, , so the first part of the discrete state indicates whether this constraint is active: A (air) if , G (ground) if . To compensate for energy losses at impact, an eventtriggered controller stiffens or softens a spring based on which direction the hip is traveling, so the second part of the discrete state indicates the direction of travel for : if up, if down. With denoting the acceleration of the hip and foot in discrete state , formulae for this acceleration are given in Table 1. At the moment of impact (when the discrete state changes from to ) the foot velocity is instantaneously reset to , corresponding to perfectly plastic impact. An example of the jump in continuous state when transitioning from to on the foot velocity is shown in Figure 2 near time 17.5s.
Discrete state  Icon  

4.2 Piecewiselinear impact oscillator experiment
In this subsection, we employ the linear spring laws
with parameter values .
In our first demonstration the observed states are and , position of the hip and foot, leaving the velocities unobserved:
(12) 
State estimation results for this system are shown in Figure 5.
In the remainder of this subsection, we demonstrate the effects of the choices we made in our problem formulation (Sec. 2) and algorithm derivation (Sec. 3) using the piecewiselinear model as a running example. We also consider a variation where the measurements correspond to the leg length and velocity, which are more representative of the onboard measurements available to an autonomous robot operating outside of the laboratory.
4.2.1 Student’s versus Gaussian process noise
Figure 2 compares the estimation of foot velocity using Student’s with versus using Gaussian for the process noise distribution; in both cases the true discrete state is given. The estimated trajectory for both distributions match the true simulated trajectory away from jumps, while near jumps, such as around times s and s, using the Student’s distribution enables closer tracking of the instantaneous change in the true foot velocity than when using a Gaussian distribution.
4.2.2 GaussNewton versus gradient (steepest) descent
We empirically compared convergence rates for continuous state updates obtained using GaussNewton and gradient (steepest) descent directions (Algorithm 1, line 3). Figure 3 shows the log loss versus algorithm iteration for the two methods; the actual discrete state was taken as given to perform this comparison. As expected, the objective value decreases significantly faster when the search direction is determined by the GaussNewton scheme as compared to the direction of steepest descent, reaching the stopping criterion in ten times fewer iterations in our tests.
4.2.3 Smoothing the relaxed discrete state versus not
If the continuous states are given, the discrete state estimate returned by our algorithm (skipping lines 2 and 3 of Algorithm 1) is very close to the true discrete state regardless of whether a smoothing term is included in the relaxed problem formulation. When simultaneously estimating both the continuous and discrete states, the smoothing term becomes crucial, as illustrated by comparing the discrete state estimates () in Figure 4 (without smoothing) and Figure 5 (with smoothing). In particular, the estimated discrete state switches rapidly without smoothing, whereas with smoothing the discrete state tends to remain constant for many samples and change mostly near groundtruth switching times.
4.2.4 Onboard versus offboard measurements
In the laboratory, the positions of the robot hip and foot can be directly measured offboard, e.g. with an external camera system. Outside of the laboratory, only the relative position of the hip and foot can be directly measured onboard our robot. Thus, we are motivated by this practical consideration to evaluate our algorithm’s performance in the case where only the relative position and velocity of the hip and foot are measured,
(13) 
Although the full hybrid system state is formally unobservable with these relative measurements, our algorithm nevertheless yields good estimates of the discrete state as shown in Figure 6; due to large errors in the estimate of (unobservable) continuous states, we omit those results from the figure.
4.3 Piecewisenonlinear impact oscillator experiment
To test Algorithm 1 on a nonlinear model, we included the kinematic constraints depicted in Figure 0(b), resulting in a nonlinear spring force. In this model we set the two spring laws to be the same , decreasing the number of discrete states from four to two: when and when . State estimation results compare favorably with the analogous results from the piecewiselinear system when using either absolute position measurements (12) (compare Figure 7 with Figure 5) or relative measurements (13) (compare Figure 8 with Figure 6).
In Figure 7 we see that the model can estimate continuous and discrete states in the nonlinear setting. However, we do notice that the estimated trajectories are not as close to ground truth as in the linear case. In particular, when has a value only slightly greater than 0 (e.g. between times 3s and 4s), the algorithm fails to detect the transition between and .
5 Conclusion
We proposed a new state estimation algorithm for hybrid systems, analyzed its convergence properties, and evaluated its performance on piecewiselinear and nonlinear hybrid systems with nonidentity resets. The algorithm leverages a relaxed state estimation problem formulation where the decision variables corresponding to the discrete state are allowed to take on continuous values. This relaxation yields a continuous optimization problem that can be solved using recentlydeveloped nonsmooth variable projection techniques. The effectiveness of the approach was demonstrated on hybrid system models of mechanical systems undergoing impact.
Acknowledgements
References
 [1] A. Alessandri, M. Baglietto, and G. Battistelli. Recedinghorizon estimation for switching discretetime linear systems. IEEE Transactions on Automatic Control, 50(11):1736–1748, November 2005.
 [2] Angelo Alessandri, Marco Baglietto, and Giorgio Battistelli. MinimumDistance RecedingHorizon State Estimation for Switching DiscreteTime Linear Systems. In Rolf Findeisen, Frank Allgöwer, and Lorenz T. Biegler, editors, Assessment and Future Directions of Nonlinear Model Predictive Control, number 358 in Lecture Notes in Control and Information Sciences, pages 348–366. SpringerVerlag Berlin Heidelberg, 2007.
 [3] A. Y. Aravkin, J. V. Burke, L. Ljung, A. Lozano, and G. Pillonetto. Generalized Kalman Smoothing: Modeling and Algorithms. arXiv:1609.06369 [math, stat], September 2016.
 [4] Aleksandr Aravkin, James V. Burke, and Gianluigi Pillonetto. Robust and Trendfollowing Kalman Smoothers using Student’s t. IFAC Proceedings Volumes, 45(16):1215–1220, July 2012.
 [5] Aleksandr Aravkin, Dmitriy Drusvyatskiy, and Tristan van Leeuwen. Variable projection without smoothness. arXiv preprint arXiv:1601.05011, 2016.
 [6] Aleksandr Y Aravkin, James V Burke, and Gianluigi Pillonetto. Robust and trendfollowing student’s t kalman smoothers. SIAM Journal on Control and Optimization, 52(5):2891–2916, 2014.
 [7] Laurent Bako and Stéphane Lecoeuche. A sparse optimization approach to state observer design for switched linear systems. Systems & Control Letters, 62(2):143–151, February 2013.
 [8] A. Balluchi, L. Benvenuti, M.D. Di Benedetto, and A.L. SangiovanniVincentelli. Observability for hybrid systems. In 42nd IEEE International Conference on Decision and Control (IEEE Cat. No.03CH37475), volume 2, pages 1159–1164. IEEE, 2003.
 [9] Andrea Balluchi, Luca Benvenuti, Maria D. Di Benedetto, and Alberto SangiovanniVincentelli. The design of dynamical observers for hybrid systems: Theory and application to an automotive control problem. Automatica, 49(4):915–925, 2013.
 [10] Andrea Balluchi, Luca Benvenuti, Maria D. Di Benedetto, and Alberto L. SangiovanniVincentelli. Design of Observers for Hybrid Systems. In Gerhard Goos, Juris Hartmanis, Jan van Leeuwen, Claire J. Tomlin, and Mark R. Greenstreet, editors, Hybrid Systems: Computation and Control, volume 2289, pages 76–89. Springer Berlin Heidelberg, Berlin, Heidelberg, 2002.
 [11] N. Barhoumi, F. Msahli, M. Djemaï, and K. Busawon. Observer design for some classes of uniformly observable nonlinear hybrid systems. Nonlinear Analysis: Hybrid Systems, 6(4):917–929, 2012.
 [12] Amir Beck and Marc Teboulle. A fast iterative shrinkagethresholding algorithm for linear inverse problems. SIAM Journal on Imaging Sciences, 2(1):183–202, 2009.
 [13] A. Bemporad, D. Mignone, and M. Morari. Moving horizon estimation for hybrid systems and fault detection. In Proceedings of the 1999 American Control Conference (Cat. No. 99CH36251), volume 4, pages 2471–2475 vol.4, June 1999.
 [14] H. A. P. Blom and E. A. Bloem. Exact Bayesian and particle filtering of stochastic hybrid systems. IEEE Transactions on Aerospace and Electronic Systems, 43(1):55–70, 2007.
 [15] James V Burke. Descent methods for composite nondifferentiable optimization problems. Mathematical Programming, 33(3):260–279, 1985.
 [16] M. Di Bernardo, editor. PiecewiseSmooth Dynamical Systems: Theory and Applications. Number 163 in Applied Mathematical Sciences. Springer, London, 2008. OCLC: ocn144515657.
 [17] A. Doucet, N. J. Gordon, and V. Krishnamurthy. Particle filters for state estimation of jump Markov linear systems. IEEE Transactions on Signal Processing, 49(3):613–624, 2001.
 [18] Nima Fazeli, Samuel Zapolsky, Evan Drumwright, and Alberto Rodriguez. Learning DataEfficient RigidBody Contact Models: Case Study of Planar Impact. In Proceedings of the 1st Annual Conference on Robot Learning, page 10, 2017.
 [19] G. FerrariTrecate, D. Mignone, and M. Morari. Moving horizon estimation for hybrid systems. IEEE Transactions on Automatic Control, 47(10):1663–1676, October 2002.
 [20] D. GómezGutiérrez, S. Čelikovský, A. RamírezTreviño, J. RuizLéon, and S. Di Gennaro. Sliding mode observer for Switched Linear Systems. In 2011 IEEE International Conference on Automation Science and Engineering, pages 725–730, 2011.
 [21] J. P. Hespanha and A. S. Morse. Stability of switched systems with average dwelltime. In Proceedings of the 38th IEEE Conference on Decision and Control (Cat. No.99CH36304), volume 3, pages 2655–2660 vol.3, December 1999.
 [22] Scott C Johnson. Observability and Observer Design for Switched Linear Systems. PhD thesis, Purdue University, December 2016.
 [23] Gavin Kenneally, Avik De, and D. E. Koditschek. Design Principles for a Family of DirectDrive Legged Robots. IEEE Robotics and Automation Letters, 1(2):900–907, July 2016.
 [24] Per Lötstedt. Mechanical Systems of Rigid Bodies Subject to Unilateral Constraints. SIAM Journal on Applied Mathematics, 42(2):281–296, 1982.
 [25] L. Menini and A. Tornambe. Asymptotic tracking of periodic trajectories for a simple mechanical system subject to nonsmooth impacts. IEEE Transactions on Automatic Control, 46(7):1122–1126, July 2001.
 [26] Jorge Nocedal and Stephen J Wright. Nonlinear Equations. Springer, 2006.
 [27] R. Tyrrell Rockafellar and Roger J.B. Wets. Variational Analysis. Springer Verlag, Heidelberg, Berlin, New York, 1998.
 [28] M Schatzman. Uniqueness and continuous dependence on data for one–dimensional impact problems. Mathematical and computer modelling, 28(4—8):1–18, 1998.
 [29] C E Seah and I Hwang. State estimation for stochastic linear hybrid systems with ContinuousStateDependent transitions: An IMM approach. IEEE transactions on aerospace and electronic systems, 45(1):376–392, January 2009.
 [30] Robert R. Stengel. Optimal Control and Estimation. Dover, 1994.