Finite Horizon Backward Reachability Analysis and Control Synthesis for Uncertain Nonlinear Systems

# Finite Horizon Backward Reachability Analysis and Control Synthesis for Uncertain Nonlinear Systems

He Yin Department of Mechanical Engineering, University of California, Berkeley Andrew Packard Department of Mechanical Engineering, University of California, Berkeley Murat Arcak Department of Electrical Engineering and Computer Sciences, University of California, Berkeley Peter Seiler Department of Aerospace Engineering and Mechanics at the University of Minnesota
###### Abstract

We present a method for synthesizing controllers to steer trajectories from an initial set to a target set on a finite time horizon. The proposed control synthesis problem is decomposed into two steps. The first step under-approximates the backward reachable set (BRS) from the target set, using level sets of storage functions. The storage function is constructed with an iterative algorithm to maximize the volume of the under-approximated BRS. The second step obtains a control law by solving a pointwise min-norm optimization problem using the pre-computed storage function. A closed-form solution of this min-norm optimization can be computed through the KKT conditions. This control synthesis framework is then extended to uncertain nonlinear systems with parametric uncertainties and disturbances. The computation algorithm for all cases is derived using sum-of-squares (SOS) programming and the S-procedure. The proposed method is applied to several robotics and aircraft examples.

## 1 Introduction

Control synthesis for nonlinear systems suffers from the lack of adequate computational tools. Several recent results leverage sum-of-squares (SOS) and semidefinite programming to construct Lyapunov functions for closed-loop stability and reachability. In [19] and [13], a method for designing controllers that maximize backward reachable sets based on occupation measures is proposed, and the synthesis problem is posed as an infinite dimensional linear program, the finite dimensional approximation of which yields a polynomial control policy and an outer approximation of the largest achievable backward reachable set (BRS). However, several Lagrange multipliers are omitted in the optimization formulation, which are of critical importance for achieving tight bounds. Consequently, the outer-approximation of the BRS loses tightness as the state dimension of systems grows, and the control policy is not guaranteed to bring the system to the given target set.

The approach proposed in [10] aims to synthesize control policies to expand the infinite time horizon region of attraction. In [11], reference tracking controllers are designed to maximize the size of the set of states that are driven to a pre-defined target set. The approach in [12] is to compute a reference tracking controller by minimizing the size of the invariant funnel for the tracking error. An essential advantage of these papers is that, since control laws and storage functions are searched for at the same time, input saturation can be taken into account by adding additional multipliers in the constraints. On the other hand, since the dependence on decision variables (control policies, storage functions and multipliers) is bilinear, computational algorithms for these methods might be complicated and involve three sub-steps of searching over decision variables.

The method presented in [21] expands the region of attraction certified by a local Control Lyapunov Function (CLF), and control laws are given by variants of the Sontag formula [20] [8] [3]. The method in [23] searches for global CLFs whose level sets have similar shapes to those of CLFs obtained from the LQR problem for linearized systems and obtains near optimal performance. The framework is extended to ensure robustness against bounded parameter uncertainties and disturbances. Other approaches to computational nonlinear control synthesis include: Hamilton-Jacobi methods for reachability computations [14], control barrier functions [1], a Lyapunov-based approach utilizing state dependent linear representation of nonlinear systems [15], and a dual to the Lyapunov-based method [16].

This paper addresses the finite time horizon control problem for nonlinear systems that are affine in control, with uncertain parameters and disturbances. The objective is to maximize the volume of the under-approximated BRSs and to minimize the norm of control inputs that drive trajectories to the target sets. Dissipation inequalities and level sets of storage functions are used to characterize the under-approximated BRSs. The S-procedure [4] and SOS for polynomial non-negativity are used to derive the optimization problem of computing storage functions. A computational algorithm is proposed to decompose the optimization problem into convex and quasiconvex subproblems. Since this method does not explicitly search for a control law, the algorithm only involves a two-way search between storage functions and multipliers. Min-norm control laws are given as closed form solutions to quadratic programs based on the computed storage function, and are not restricted to be polynomial functions.

This paper is a continuation of the stability and reachability analysis methods for nonlinear systems in [25] [24] [22], which compute finite and infinite horizon reachable sets and regions of attraction for nonlinear systems with given control laws. In contrast, this paper aims to design controllers as well as under-approximate the BRS.

## 2 Notation

and denote the set of -by- real matrices and -by- real, symmetric matrices. A single superscript index denotes vectors; for example, is the set of vectors whose elements are in . is the set of differentiable functions whose derivative is continuous. is the space of -valued measureable functions , with . Define Associated with is the extended space , consisting of functions whose truncation for ; for , is in for all For , represents the set of polynomials in with real coefficients. The subset of is the set of SOS polynomials in . For , and continuous , For , and continuous , .

## 3 Storage Function Synthesis

Consider a single-input, time-varying, nonlinear system with affine dependence on the control input :

 ˙x=f(t,x)+g(t,x)u, (1)

with , , , and . Proposition 1 provides conditions on a storage function and control input to reach a desired target set.

###### Proposition 1

Given system (1), initial time , terminal time , and a target set , if there exists a storage function so that

 infu∈R{∂V∂t+∂V∂x(f(t,x)+g(t,x)u)}≤0,∀(t,x)∈[t0,T]×Rn, (A.1) ΩVT,γ⊆ΩrT0, (A.2)

then there exists a control law , such that any trajectory with initial condition evolves to , i.e. the final state is in the target set.

The set is an under-approximation of the backward reachable set for the given target set and initial time. Proposition 1 follows from a simple dissipation argument. Integrating constraint (A.1) from to yields . Thus it follows from that . Assumption (A.2) then implies that reaches the target set at time .

If for some , then (A.1) is satisfied for of proper sign and sufficiently large magnitude. On the other hand, if for some then is required to satisfy (A.1). Based on this discussion, there exists a control input such that (A.1) is feasible for a given storage function , if and only if the following set containment constraint holds for all :

 {x∈Rn∣∣∣∂V∂t+∂V∂xf(t,x)≤0}⊇{x∈Rn∣∣∣∂V∂xg(t,x)=0}. (A.3)

### 3.1 Local Analysis

If constraint (A.3) fails to hold for some points , then we look for a “local” region that excludes those points. Here we use , the level set of storage function at time , to quantify the local region, and we have the following local version of Proposition 1.

###### Theorem 1

Given system (1), initial time , terminal time , and a target set , if there exists a storage function , such that , and for all ,

 {x∈Rn∣∣∣∂V∂t+∂V∂xf(t,x)≤0}⊇{x∈Rn∣∣∣∂V∂xg(t,x)=0,V(t,x)≤γ}, (B.1)

then there exists a control law , such that , for all . Thus is the under-approximation of the backward reachable set for the given target set and initial time.

To find such a storage function using sum-of-squares programming, we restrict and to polynomial functions. It is often possible to represent nonlinear system equations with polynomials upon changes of variables, Taylor’s theorem and least squares regression [23]. To formulate the set containment constraints, we define the polynomial function , which is nonnegative when . Since a less conservative under-approximation is preferable, we want to find a storage function with the volume of being maximized. Utilizing the S-procedure to obtain sufficient conditions for the set containment constraints in Theorem 1, and SOS relaxation for polynomial nonnegativity, we obtain the following optimization problem, with bilinear SOS constraints and a non-convex objective functions.

###### Optimization problem 1
 maxV,l,s volume(ΩVt0,γ) s.t. s2(t,x),s3(t,x)∈Σ[t,x], s4(x)−ϵ∈Σ[x],ϵ>0,l(t,x)∈R[t,x], (C.1) −(∂V∂t+∂V∂xf(t,x))−s2(t,x)h(t)+l(t,x)∂V∂xg(t,x)+s3(t,x)(V(t,x)−γ)∈Σ[t,x], (C.2) −s4(x)rT(x)+(V(T,x)−γ)∈Σ[x], (C.3)

where the positive number ensures that can’t take the value of zero.

For bilinear SOS constraints (C.1) to (C.3), and , and are two pairs of bilinear decision variables. To tackle this non-convex optimization problem, we decompose it into two subproblems to iteratively search between storage function and multipliers .

###### Algorithm 1

Iterative method
Inputs: A storage function satisfying constraints (C.2) and (C.3).
Outputs: with its volume maximized.

1. step: maximization problem

 maxγ,l,s2,s3,s4γ s.t. s2(t,x),s3(t,x)∈Σ[t,x],l(t,x)∈R[t,x] s4(x)−ϵ∈Σ[t,x],ϵ>0, −(∂V0∂t+∂V0∂xf(t,x))−s2(t,x)h(t)+l(t,x)∂V0∂xg(t,x)+s3(t,x)(V0(t,x)−γ)∈Σ[t,x], −s4(x)rT(x)+(V0(T,x)−γ)∈Σ[x].
2. step: feasibility problem over decision variables

 s1(x)∈Σ[x],s2(t,x)∈Σ[t,x], s4(x)−ϵ∈Σ[t,x],ϵ>0, −(V(t0,x)−γ∗)+s1(x)(V0(t0,x)−γ∗), −(∂V∂t+∂V∂xf(t,x))−s2(t,x)h(t)+¯l(t,x)∂V∂xg(t,x)+¯s3(t,x)(V(t,x)−γ∗)∈Σ[t,x], −s4(x)rT(x)+(V(T,x)−γ∗)∈Σ[x].
###### Remark 1

For the step, is the storage function computed from the step of the previous iteration. Since and enter bilinearly, and is the objective function, then the step is a generalized SOS problem, which is proven in [18] to be quasiconvex. Thus, the global optimal solution can be computed by bisecting .

###### Remark 2

, and in the step are obtained from the step. Similar to the algorithm proposed in [9] to find the region of attraction, this algorithm makes use of from the previous iteration as a shape function for enlarging the volume of , rather than using a preset shape function. After the step, constraints of the step are active for . In the step, a new feasible is computed, which is the analytic center of the LMI constraints. Thus the step feasibility problem pushes away from the constraints, which give the next step more freedom to increase . The step is a SOS problem, which is convex. Note that although global optima for the subproblems in the and steps at each iteration can be achieved, the ultimate solution of this iterative algorithm is not necessarily the global optimal solution for optimization problem 1.

###### Remark 3

Since in many cases, we want to bring the system close to an equilibrium point, the target region is set as a neighborhood around it. Therefore, LQR controllers designed for linearization of dynamics about equilibrium points can be used to compute storage functions, which can be used to initialize .

### 3.2 Multi-input Case

In this section, the framework in the previous sections is extended to multi-input systems. Assume that there are inputs , and accordingly . Denote , where is the column of ; denote and write the multi-input system as

 ˙x=f(t,x)+m∑i=1gi(t,x)ui. (2)

The constraint (B.1) is modified to be, for all ,

 {x∈Rn∣∣∣∂V∂t+∂V∂xf(t,x)≤0}⊇{x∈Rn∣∣∣∂V∂xg1(t,x)=0,...,∂V∂xgm(t,x)=0,V(t,x)≤γ}. (D.1)

Applying the S-procedure to (D.1), we have its corresponding SOS constraint

 −(∂V∂t+∂V∂xf(t,x))−s2(t,x)h(t)+m∑i=1{li(t,x)∂V∂xgi(t,x)}+s3(t,x)(V(t,x)−γ)∈Σ[t,x]. (D.2)

By replacing (C.2) with (D.2), and keeping other constraints to be the same, we obtain an optimization problem for multi-input systems. Instead of only searching over , we now search over polynomials .

## 4 Min-norm Control Synthesis

With the storage function computed from optimization problem 1, we want to find a control law , such that the dissipation inequality in constraint (A.1) holds for all and . Also, to avoid excessive control magnitudes, we want the norm of to be minimized. Similar to the idea in [8], the control input is determined by solving the following quadratic program (QP),

 minu∈RmuTu (3) s.t.   ∂V(t,x)∂t+∂V(t,x)∂x(f(t,x)+g(t,x)u)≤0.

Since the QP (3) satisfies Slater’s condition, its closed form solution can be obtained by solving the KKT condition, which yields the optimal control law

 k(t,x)=u∗={0,b(t,x)≤0−b(t,x)a(t,x)a(t,x)Ta(t,x)T,b(t,x)>0, (4)

where

 a(t,x):=∂V(t,x)∂xg(t,x), (5) b(t,x):=∂V(t,x)∂t+∂V(t,x)∂xf(t,x).
###### Remark 4

The constraint (B.1): implies , ensures that if , we have . Therefore, there is no singularity in the control law due to division by . However, discontinuity in the control law might be possible at the points , where and . To deal with discontinuity, we can use a strict version of constraint (B.1): implies , for all , for all and implies for all , where can be the origin or some equilibrium point for the system. Then discontinuity can only happen at the point , and continuity of the control law at can be established using an analog of the small control property from [20].

## 5 Modifications for Systems with Bounded Uncertainties

For brevity of notation, we still consider single-input systems, but with uncertain parameters ,

 ˙x=F(t,x,δ)+g(t,x)u, (6)

with , , , , , and assume that lies in a known set

 δ∈Δ:={δ∈Rnδ|N(δ)≥0}.

Slightly modifying constraint (B.1), we have the dissipation inequality constraint for the uncertain system: for all ,

 {x∈Rn∣∣∣∂V∂t+∂V∂xF(t,x,δ)≤0}⊇{x∈Rn∣∣∣∂V∂xg(t,x)=0,V(t,x)≤γ}. (E.1)

Assume that is affine in , and denote , with . To simplify the analysis, assume also that the set is a bounded polytope, and define the set of vertices of , , where is the number of vertices. Since enters the system linearly, and it lies in a bounded polytope, if we impose constraint (E.1) to hold on , then it holds everywhere on . Then constraint (E.1) can be transformed into a number of constraints, for all ,

 {x∈Rn∣∣∣∂V∂t+∂V∂xf(t,x)+∂V∂xgδ(t,x)δ[i]≤0}⊇ {x∈Rn∣∣∣∂V∂xg(t,x)=0,V(t,x)≤γ},∀i=1,...,Nvertex. (E.2)

Note that constraint (E.2) doesn’t introduce as a new variable, which helps to reduce computation time.

### 5.1 Control Synthesis for Systems with Bounded Uncertainties

Similar to the QP (3), we have the min-norm QP for the uncertain system

 minu∈RmuTu (7) s.t. ∂V(t,x)∂t+∂V(t,x)∂x(f(t,x)+g(t,x)u+gδ(t,x)δ[i])≤0, ∀i=1,...,Nvertex.

Define

 a(t,x):=∂V∂xg(t,x), (8) bi(t,x):=∂V∂t+∂V∂xf(t,x)+∂V∂xgδ(t,x)δ[i], bmax(t,x):=max{b1(t,x),...,bNvertex(t,x)}.

Then the QP (7) can be rewritten as

 minu∈RmuTu s.t. a(t,x)u+bi(t,x)≤0, ∀i=1,...,Nvertex,

which is equivalent to

 minu∈RmuTu (9) s.t. a(t,x)u+bmax(t,x)≤0.

The control law is given by

 k(t,x)=u∗={0,bmax(t,x)≤0−bmax(t,x)a(t,x)a(t,x)Ta(t,x)T,bmax(t,x)>0. (10)

## 6 Modifications for Systems with L2 Disturbances

Consider a disturbed system with disturbances entering linearly

 ˙x=f(t,x)+g(t,x)u+gw(t,x)w, (11)

with , , , , , and .

###### Theorem 2

Given system (11), initial time , terminal time , a target set , and disturbances satisfying , where the non-decreasing polynomial function satisfies , , if there exists a storage function , satisfying , and for all , such that

 {x∈Rn∣∣∣∂V∂t+∂V∂xf(t,x)+∂V∂xgw(t,x)w≤wTw}⊇ {x∈Rn∣∣∣∂V∂xg(t,x)=0,V(t,x)≤γ+R2q(t)}, (F.1)

then there exists a control law , such that , for all .

In Theorem 2, the function describes how fast the energy of disturbances releases. If is not known beforehand, we need to relax constraint (F.1) to be: for all ,

 {x∈Rn∣∣∣∂V∂t+∂V∂xf(t,x)+∂V∂xgw(t,x)w≤wTw}⊇ {x∈Rn∣∣∣∂V∂xg(t,x)=0,V(t,x)≤γ+R2}, (G.1)

which can be more restrictive for the storage function, since the dissipation inequality is required to hold on a larger space in .

If, in addition to the bound above, we have a constraint for , for all . Constraint (F.1) in Theorem 2 is modified to hold for all . By modifying the SOS constraint (C.2), the SOS constraint for (F.1) can be written as

 −(∂V∂t+∂V∂xf(t,x)+∂V∂xgw(t,x)w−wTw)+l(t,x,w)∂V∂xg(t,x)−sa(t,x,w)h(t)+ sb(t,x,w)(V(t,x)−γ−R2q(t))+sc(t,x,w)(wTw−α)∈Σ[t,x,w], (12)

where , .

### 6.1 Control Synthesis for Disturbed Systems

Similar to the QP (3), the following QP gives a min-norm control input for the disturbed system, assuming the value of is not accessible

 minu∈RmuTu (13) s.t. ∂V(t,x)∂t+∂V(t,x)∂x(f(t,x)+g(t,x)u+gw(t,x)w)≤wTw,∀w∈{w∈Rw|wTw≤α}.

For brevity of notation, define , . The constraint in QP (13) can then be restated as

 maxw∈{w∈Rw|wTw≤α}(−wTw+cTw+d)≤0. (14)

Solving it with the KKT condition, we have

 w∗=⎧⎨⎩√α√cTcc,cTc≥4α,12c,cTc<4α.

Substituting into optimization problem (13), we get two QPs for two cases. The formula of control law for disturbed systems is the solution to QPs, and it is the same as equation (4), whereas and are

 a(t,x):=∂V(t,x)∂xg(t,x), (15) b(t,x):= ⎧⎨⎩∂V∂t+∂V∂xf(t,x)+√αcTc−α,cTc≥4α,∂V∂t+∂V∂xf(t,x)+cTc4,cTc<4α.

## 7 Control Synthesis for Disturbed Systems with Bounded Uncertainties

Consider a system with both parametric uncertainties and disturbances

 ˙x(t)=f(t,x)+g(t,x)u+gw(t,x)w+gδ(t,x)δ. (16)

Again, we assume that lies in the bounded polytope , and slightly modifying constraint (F.1), we get the dissipation inequality for system (16), for all ,

 {x∈Rn∣∣∣∂V∂t+∂V∂xf(t,x)+∂V∂xgw(t,x)w+∂V∂xgδ(t,x)δ[i]≤wTw}⊇{x∈Rn∣∣∣∂V∂xg(t,x)=0, V(t,x)≤γ+R2q(t)},∀i=1,...,Nvertex. (H.1)

After a storage function is obtained, the control input is computed through the following QP

 minu∈RmuTu (17) s.t. ∂V(t,x)∂t+∂V(t,x)∂x(f(t,x)+g(t,x)u+gw(t,x)w +gδ(t,x)δ[i])≤wTw,∀w∈{w∈Rw|wTw≤α},∀i=1,...,Nvertex.

Define and . The constraint in QP (17) can be restated as

 maxw∈{w∈Rw|wTw≤α}(−wTw+cTw+d+ei)≤0,∀i=1,...,Nvertex,

which is equivalent to

 maxw∈{w∈Rw|wTw≤α}(−wTw+cTw+d+emax)≤0. (18)

Notice that constraints (14) and (18) has the same optimal solution . Substituting back into constraint (18), we have two QPs. The formula of control law is the solution to QPs, and it is the same as equation (4), whereas and are

 a(t,x):=∂V(t,x)∂xg(t,x), (19) b(t,x):= ⎧⎨⎩∂V∂t+∂V∂xf(t,x)+√αcTc−α+emax,cTc≥4α,∂V∂t+∂V∂xf(t,x)+cTc4+emax,cTc<4α.

## 8 Examples

A workstation with four 2.7 [GHz] Intel Core i5 64 bit processors and 8[GB] of RAM was used for performing all computations in the following examples. The SOS optimization problem is formulated and translated into SDP using the sum-of-square module in SOSOPT [17] on MATLAB, and solved by the SDP solver Mosek [2]. Table 1 shows the degree of polynomials we chose, and the computation time it took for each example.

### 8.1 Uncertain Two-State Example

Consider the following uncertain two-state dynamics from [10], where a parametric uncertainty enters the system linearly

 ˙x1=u, ˙x2=−x1+16x31δ−u, (20)

with the prior knowledge that .

The time horizon is chosen to be , and the target set is given to be , which is shown as the blue circle in Figure 1. for the uncertain system is shown with the green curve in Figure 1, and the brown curve is for the system with set to be , i.e. for the system without uncertainty. The three trajectories are simulations of system (20), with the control law defined by equations (8)(10) and uncertain parameter drawn from the uniform distribution on at each time step,

### 8.2 Dubin’s Car

Consider Dubin’s car [7], a multi-input system

 ˙a=vcos(θ), ˙b=vsin(θ), ˙θ=ω,

with states : position, : position, : yaw angle and control inputs : turning rate, : forward speed. By a change of coordinates, it can be transformed into polynomial dynamics [6]

 ˙x1=u1, ˙x2=u2, ˙x3=x2u1−x1u2,

with , , , and , . Assume time horizon , and target set . Figure 2 show the slices of sets with , , and , respectively.

### 8.3 Cart-pole Example

The polynomial dynamics for cart-pole is from [23]

 ⎡⎢ ⎢ ⎢⎣˙x1˙x2˙x3˙x4⎤⎥ ⎥ ⎥⎦=⎡⎢ ⎢ ⎢ ⎢⎣x3x4f3(x2,x4)f4(x2,x4)⎤⎥ ⎥ ⎥ ⎥⎦+⎡⎢ ⎢ ⎢ ⎢⎣00g3(x2)g4(x2)⎤⎥ ⎥ ⎥ ⎥⎦u

with

 f3(x2,x4) =0.11707x52+0.03591x32x24−1.6032x32−0.17201x2x24+3.0313x2, f4(x2,x4) =0.24902x52+0.13049x32x