# Lagrangian Approximations for Stochastic Reachability of a Target Tube

## Abstract

In this paper we examine how Lagrangian techniques can be used to compute underapproximations and overapproximations of the finite-time horizon, stochastic reach-avoid level sets for discrete-time, nonlinear systems. This approach is applicable to generic nonlinear systems without any convexity assumptions on the safe and target sets. We examine and apply our methods on the reachability of a target tube problem, a generalization of the finite-time horizon reach-avoid problem. Because these methods utilize a Lagrangian (set theoretic) approach, we eliminate the need to grid the state, input, and disturbance spaces, allowing for increased scalability and faster computation. The methods' scalability is currently limited by the computational requirements of the set operations performed by current computational geometry tools. The primary trade-off for this improved extensibility is a conservative approximation of the actual stochastic reach set. We demonstrate these methods on several examples, including the standard double integrator, a chain of integrators, and a 4-dimensional space vehicle rendezvous-docking problem.

## 1 Introduction

Reach-avoid analysis is an established verification tool that provides formal guarantees of both safety (by avoiding unsafe regions) and performance (by reaching a target set). Because of these guarantees, it is often used in systems that are safety-critical or expensive, such as space systems [1], avionics [2, 3], biomedical systems [4], and other applications [5, 6, 7]. The reach-avoid set is the set of states for which there exists a control that enables the state trajectory to reach a target at some finite time horizon $N$, while remaining within a safe set (avoiding an unsafe set) for all instants in the time horizon. In a probabilistic system, satisfaction of the reach-avoid objective is accomplished stochastically. A dynamic programming-based solution characterizes the optimal value function, a function that assigns to each initial state the optimal probability of achieving the reach-avoid objective [8]. An appropriate level set of this value function provides the stochastic reach-avoid level set, the set of states for which probabilistic success of the reach-avoid objective is assured with at least the desired likelihood.

The theoretical framework for the probabilistic reach-avoid calculation uses dynamic programming [7, 8] and, hence, is computationally infeasible for even moderate-sized systems due to the gridding of not only the state space, but also the input and disturbance spaces [9]. Alternatives to dynamic programming include approximate dynamic programming [6, 10, 11], Gaussian mixtures [11], particle filters [1, 6], and convex chance-constrained optimization [1, 5]. These methods have been applied to systems that are at most 10-dimensional, at high memory and computational costs [6]. Further, since an analytical expression of the value function is not available, stochastic reach-avoid level sets can be computed only up to the accuracy of the gridding. Recently, a Fourier transform-based approach has provided greater scalability for verifying LTI systems of moderately high dimension [12, 13]. In [13], the researchers established a set of sufficient conditions under which the stochastic reach-avoid set is convex and compact, enabling scalable polytopic underapproximation. However, this approach relies on numerical quadrature and is restricted to verifying the existence of open-loop controllers.

For deterministic systems (that is, systems without a disturbance input but with a control input) or systems with disturbances that come from a bounded set, Lagrangian methods for computing reachable sets are popular because they do not rely on a grid and can be computed efficiently for high-dimensional systems [4, 14, 15]. Rather than gridding the state space, Lagrangian methods compute reachable sets through operations on sets, e.g. intersections, Minkowski summation, unions, etc. Thus, Lagrangian methods rely on computational geometry, whose scalability depends on the set representation and the computational difficulty of the operations used [14]. For example, sets that are represented as either vertex or facet polyhedra are typically limited by the need to solve the vertex-facet enumeration problem. Common set representations and relevant toolboxes for their implementation are: polyhedra (MPT [16]), ellipsoids (ET [17]), zonotopes [18] (CORA [19]), star representations (HyLAA [20]), and support functions [21].

In this paper, we describe recursive techniques to obtain an under- and an overapproximation of the probabilistic reach-avoid level set using Lagrangian methods. The underapproximation can be theoretically posed as the solution to the reachability of a target tube problem [22, 23, 24], originally framed to compute reachable sets of discrete-time controlled systems with bounded disturbance sets. Motivated by the scalability of the Lagrangian method proposed in [4, 25] for viability analysis in deterministic systems, we seek a similar approach to compute the underapproximation via tractable set theoretic operations. We originally demonstrated these methods in [26], unifying these approaches for the terminal-time reach-avoid problem. The reachability of a target tube problem is, however, a more general framework than the terminal-time reach-avoid problem. Hence, we extend [26] to the reachability of a target tube problem and additionally describe a recursive method for computing an overapproximation of the stochastic reach-avoid level set. Borrowing notation and terminology from [27], we call these under- and overapproximations disturbance minimal and disturbance maximal reach sets, respectively.

The disturbance minimal reach set (underapproximation) is the set of initial states for which there exists a control policy that keeps the state within the target tube despite the worst-case choice of a disturbance drawn from a bounded set $\mathcal{E}$, i.e., for all disturbances in $\mathcal{E}$. The disturbance maximal reach set (overapproximation) is the set of states for which there exists an input that keeps the state within the target tube given the best choice of a disturbance in a bounded set $\mathcal{O}$, i.e., there exists a disturbance in $\mathcal{O}$. The original formulation [26] described the underapproximation for a single bounded disturbance set $\mathcal{E}$. Here we will also demonstrate that we can use several different bounded disturbance sets $\mathcal{E}_1, \mathcal{E}_2, \ldots$ to help reduce the conservativeness of the approximations. Using multiple bounded disturbance sets requires repeated computations of the disturbance minimal and maximal reach sets. However, because of the reduced computational time afforded by Lagrangian methods, we can use multiple disturbance sets and still provide substantially faster computation.

The remainder of the paper is as follows: Section 2 provides the necessary preliminaries and describes the problem formulation. In Section 3, we establish sufficient conditions on the bounded disturbance sets such that the disturbance minimal and maximal reach sets are a subset and superset of the stochastic reach set, respectively. Section 4 details the recursions for computing the disturbance minimal and disturbance maximal reach sets given a single sufficient bounded disturbance set. Section 5 describes how these approximations can be improved by using multiple bounded disturbance sets. We examine numerical methods to obtain these sufficient bounded disturbance sets in Section 6, and examine the numerical implementation challenges for computing the disturbance minimal and maximal reach sets in Section 7. Finally, we demonstrate our algorithm on selected examples in Section 8 and provide conclusions and directions for future work in Section 9.

## 2 Preliminaries

The following notation will be used throughout the paper. We denote the set of natural numbers, including zero, as $\mathbb{N}$, and discrete-time intervals with $\mathbb{N}_{[a,b]} = \{a, a+1, \ldots, b\}$ for $a, b \in \mathbb{N}$, $a \le b$. We will primarily use $k$ as our discrete-time index but will also use $t$ as necessary to provide clarity. The transpose of a vector $x$ is $x^\top$, and the concatenation of a discrete-time series of vectors is noted with a bar above the variable and a subscript with the indices, i.e. $\bar{x}_{[a,b]} = [x_a^\top, x_{a+1}^\top, \ldots, x_b^\top]^\top$ for $a \le b$. The $n$-dimensional identity matrix is noted as $I_n$.

The Minkowski summation of two sets $\mathcal{A}, \mathcal{B} \subseteq \mathbb{R}^n$ is $\mathcal{A} \oplus \mathcal{B} = \{a + b : a \in \mathcal{A},\ b \in \mathcal{B}\}$; the Minkowski difference (or Pontryagin difference) of two sets is $\mathcal{A} \ominus \mathcal{B} = \{c \in \mathbb{R}^n : c + b \in \mathcal{A}\ \forall b \in \mathcal{B}\}$. For $x \in \mathbb{R}^n$, the indicator function corresponding to a set $\mathcal{S}$ is $1_{\mathcal{S}}(x)$, where $1_{\mathcal{S}}(x) = 1$ if $x \in \mathcal{S}$ and is zero otherwise; the Cartesian product of the set $\mathcal{S}$ with itself $m$ times is $\mathcal{S}^m$.
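These two set operations can be illustrated concretely on axis-aligned boxes, for which both reduce to per-axis interval arithmetic. The sketch below uses hypothetical boxes `A` and `B` and is illustrative only; general polytopes require computational geometry tools such as MPT.

```python
import numpy as np

# Axis-aligned boxes represented as (lower, upper) corner arrays. For boxes,
# the Minkowski sum adds the per-axis intervals, and the Minkowski
# (Pontryagin) difference shrinks A by the extent of B on each axis.
def minkowski_sum(A, B):
    (al, au), (bl, bu) = A, B
    return (al + bl, au + bu)

def minkowski_diff(A, B):
    # A ⊖ B = {c : c + b ∈ A for all b ∈ B}; empty if B is "wider" than A.
    (al, au), (bl, bu) = A, B
    lo, hi = al - bl, au - bu
    return (lo, hi) if np.all(lo <= hi) else None

A = (np.array([0.0, 0.0]), np.array([4.0, 4.0]))   # box [0, 4]^2
B = (np.array([-1.0, -1.0]), np.array([1.0, 1.0])) # box [-1, 1]^2

S = minkowski_sum(A, B)   # box [-1, 5]^2
D = minkowski_diff(A, B)  # box [1, 3]^2
```

Note that $(\mathcal{A} \ominus \mathcal{B}) \oplus \mathcal{B} \subseteq \mathcal{A}$ in general; the two operations are not inverses of each other.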

### 2.1 Lower semi-continuity

Lower semi-continuous (l.s.c.) functions are functions whose sub-level sets are closed [28, Definition 7.13]. Lower semi-continuous functions have very useful properties with respect to optimization.

• Negative indicator function of a closed set is l.s.c.: Given a closed set $\mathcal{S} \subseteq \mathcal{X}$, the function $-1_{\mathcal{S}}(\cdot)$ is l.s.c.

• Addition and l.s.c.: Given two l.s.c. functions $f, g: \mathcal{X} \to \mathbb{R} \cup \{+\infty\}$, the function $f + g$ is l.s.c. over $\mathcal{X}$ [29, Ex. 1.39].

• Semicontinuity under composition: Given an l.s.c. function $f$ and a continuous function $g$, the composition $f \circ g$ is l.s.c. over $\mathcal{X}$ [29, Ex. 1.40].

• Supremum of l.s.c. functions: Given $f: \mathcal{X} \times \mathcal{Y} \to \mathbb{R}$ such that $f(\cdot, y)$ is l.s.c. for every $y \in \mathcal{Y}$, the function $\sup_{y \in \mathcal{Y}} f(\cdot, y)$ is l.s.c. [28, Prop. 7.32(b)].

• Infimum of an l.s.c. function: Given $f: \mathcal{X} \times \mathcal{U} \to \mathbb{R}$ such that $f$ is l.s.c. and $\mathcal{U}$ is compact, the function $f^*(x) = \inf_{u \in \mathcal{U}} f(x, u)$ is l.s.c. Additionally, there exists a Borel-measurable $\phi: \mathcal{X} \to \mathcal{U}$ such that $f(x, \phi(x)) = f^*(x)$ for all $x \in \mathcal{X}$ [28, Prop. 7.33].

### 2.2 System description

Consider a discrete-time, nonlinear, time-varying dynamical system with an affine disturbance,

 $x_{k+1} = f_k(x_k, u_k) + w_k$ (1)

with state $x_k \in \mathcal{X} \subseteq \mathbb{R}^n$, input $u_k \in \mathcal{U} \subset \mathbb{R}^m$, disturbance $w_k \in \mathcal{W} \subseteq \mathbb{R}^n$, and a function $f_k: \mathcal{X} \times \mathcal{U} \to \mathcal{X}$. We denote the origin of $\mathbb{R}^n$ as $\bar{0}$ and assume $\bar{0} \in \mathcal{W}$ without loss of generality due to the affine nature of the disturbance. We also consider the discrete-time LTV system

 $x_{k+1} = A_k x_k + B_k u_k + w_k$ (2)

with $A_k \in \mathbb{R}^{n \times n}$ and $B_k \in \mathbb{R}^{n \times m}$. We assume $A_k$ is non-singular, which holds, in particular, for discrete-time systems that arise from sampling continuous-time systems. We will consider the cases in which $w_k$ is uncertain (a non-stochastic disturbance drawn from a bounded set) and stochastic (a random vector drawn from a known probability density function). The discrete horizon length for the reachability of a target tube problem is denoted $N \in \mathbb{N}$.
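As a concrete instance of the LTV special case (2), the sketch below builds a sampled double integrator; the sampling period `T`, the noise scale, and the input value are hypothetical choices for illustration, not values taken from the paper. Note that $\det A_k = 1$, so the non-singularity assumption holds.

```python
import numpy as np

# Double integrator sampled with period T: an LTI instance of (2).
# A is non-singular (det A = 1), consistent with the assumption above.
T = 0.25
A = np.array([[1.0, T],
              [0.0, 1.0]])
B = np.array([[T**2 / 2],
              [T]])

rng = np.random.default_rng(0)

def step(x, u, w):
    # One step of x_{k+1} = A x_k + B u_k + w_k
    return A @ x + B @ np.atleast_1d(u) + w

x = np.array([1.0, 0.0])              # position 1, velocity 0
u = -0.5                              # hypothetical input value
w = 0.01 * rng.standard_normal(2)     # a stochastic disturbance sample
x_next = step(x, u, w)
```

The same `step` function covers the uncertain (non-stochastic) case by drawing `w` from a bounded set instead of a distribution.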

### 2.3 Reachability of a target tube

As in [22, 30], we define a target tube $\mathcal{T}_N = \{\mathcal{T}_0, \mathcal{T}_1, \ldots, \mathcal{T}_N\}$ as an indexed collection of subsets of the state space, $\mathcal{T}_k \subseteq \mathcal{X}$ for all $k \in \mathbb{N}_{[0,N]}$. We assign attributes of the tube that are typically given to sets, e.g. closed, bounded, compact, convex, etc., if and only if every set in the target tube has those properties. For example, we say that the target tube is closed if and only if $\mathcal{T}_k$ is closed for all $k \in \mathbb{N}_{[0,N]}$.

We will denote an admissible state-feedback law by $\nu: \mathcal{X} \to \mathcal{U}$, the set of admissible state-feedback laws by $\mathcal{F}$, and the set of admissible control policies, i.e. sequences of admissible state-feedback laws, by $\mathcal{P}$.

The reachability of a target tube problem is concerned with determining the set of states at time $k$ for which there exists a control policy such that, for the system (1) with no disturbance, i.e. $w_t = \bar{0}$, the trajectory will remain in the target tube. Formally,

 $\mathrm{Reach}_k(\mathcal{T}_N) = \{x_k \in \mathcal{T}_k :\ \forall t \in \mathbb{N}_{[k,N-1]},\ \exists \nu_t \in \mathcal{F},\ x_{t+1} \in \mathcal{T}_{t+1}\}$ (3)

This set can be seen graphically in Figure LABEL:fig:uncertain-reach-sets.

The motivation for using target tubes is to allow for more versatility in the problem definition. The standard terminal-time reach-avoid problem, in which we desire to reach a target set $\mathcal{T}$ at time $N$ while remaining in a safe set $\mathcal{K}$ for $k \in \mathbb{N}_{[0,N-1]}$, is subsumed by the reachability of a target tube with $\mathcal{T}_k = \mathcal{K}$ for $k \in \mathbb{N}_{[0,N-1]}$ and $\mathcal{T}_N = \mathcal{T}$. The viability problem can equivalently be subsumed with $\mathcal{T}_k = \mathcal{K}$ for all $k \in \mathbb{N}_{[0,N]}$. However, many interesting problems fall outside the standard terminal-time reach-avoid and viability problems.

### 2.4 Stochastic reachability of a target tube

The basic reachability of a target tube problem (3) is described with no disturbance. Here we consider (1) with a stochastic disturbance $w_k$. We assume $w_k$ is an $n$-dimensional random vector defined in the probability space $(\mathcal{W}, \sigma(\mathcal{W}), \mathbb{P}_w)$, where $\sigma(\mathcal{W})$ denotes the minimal $\sigma$-algebra associated with the random vector $w_k$. We assume the disturbance is absolutely continuous with a known probability density function and that the disturbance process is an independent and identically distributed (i.i.d.) random process.

We will denote an admissible universally-measurable state-feedback law as $\mu: \mathcal{X} \to \mathcal{U}$ and the set of Markov control policies as $\mathcal{M}$. Since no measurability restrictions were imposed on the admissible feedback laws in Section 2.3, $\mathcal{M} \subseteq \mathcal{P}$. Given a Markov policy $\pi \in \mathcal{M}$ and initial state $\bar{x} = x_0$, the concatenated state vector $\bar{x}_{[1,N]}$ for the system (1) is a random vector defined in the probability space $(\mathcal{X}^N, \sigma(\mathcal{X}^N), \mathbb{P}^{N,\pi}_{\bar{x}})$. The probability measure $\mathbb{P}^{N,\pi}_{\bar{x}}$ is induced from the sequence of random variables $w_0, \ldots, w_{N-1}$ defined on the probability space $(\mathcal{W}^N, \sigma(\mathcal{W}^N), \mathbb{P}_{\bar{w}})$, as seen in [7, Sec. 2]. For $k \in \mathbb{N}_{[0,N-1]}$, we denote the probability space associated with the random vector $\bar{x}_{[k+1,N]}$ as $(\mathcal{X}^{N-k}, \sigma(\mathcal{X}^{N-k}), \mathbb{P}^{x_k,\pi}_{\bar{x}_{[k+1,N]}})$.

For stochastic reachability analysis, we are interested in the maximum likelihood that the system (1), starting at an initial state $\bar{x} = x_0$, will stay within the target tube using a Markov policy. The maximum likelihood and the optimal Markov policy can be determined as the solution to the optimization problem [7, Sec. 4]

 $\sup_{\pi \in \mathcal{M}} \mathbb{E}^{N,\pi}_{\bar{x}} \left[ \prod_{i=0}^{N} 1_{\mathcal{T}_i}(x_i) \right].$ (4)

Let the optimal solution to problem (4) be $\pi^*$, the maximal Markov policy in the terminal sense [7, Def. 10]. A dynamic programming approach was presented in [7] to solve problem (4), along with sufficient conditions for the existence of a maximal Markov policy. This approach computes value functions $V^*_k: \mathcal{X} \to [0,1]$ for $k \in \mathbb{N}_{[0,N]}$,

 $V^*_N(x) = 1_{\mathcal{T}_N}(x)$ (5a)
 $V^*_k(x) = \sup_{u \in \mathcal{U}} 1_{\mathcal{T}_k}(x) \int_{\mathcal{X}} V^*_{k+1}(y)\, Q(dy \mid x, u) = 1_{\mathcal{T}_k}(x)\, \mathbb{P}^{x_k,\pi^*}_{\bar{x}_{[k+1,N]}} \left( \bigcap_{t=k+1}^{N} \{x_t \in \mathcal{T}_t\} \,\Big|\, x_k = x \right)$ (5b)

where $Q(\cdot \mid x, u)$ is the one-step transition kernel [7]. For notational convenience, we will often simplify the condition $x_k = x$ in probability statements and simply write it as $x$.

By definition, the optimal value function $V^*_k$ provides the maximum likelihood of ensuring that the system (1) stays within the target tube when initialized to the initial state $x_k = x$. Note that the problem discussed in [7] was specifically a reach-avoid problem, but it may be easily extended to the more general case discussed here, yielding the recursion (5).

For $\alpha \in [0,1]$ and $k \in \mathbb{N}_{[0,N]}$, the stochastic $\alpha$-level reach set,

 $\mathcal{L}_k(\mathcal{T}_N, \alpha) = \left\{ y \in \mathcal{T}_k :\ \sup_{\pi \in \mathcal{M}} \mathbb{P}^{x_k,\pi}_{\bar{x}_{[k+1,N]}} \left( \bigcap_{t=k+1}^{N} \{x_t \in \mathcal{T}_t\} \,\Big|\, y \right) \ge \alpha \right\} = \{ y \in \mathcal{T}_k : V^*_k(y) \ge \alpha \}$ (6)

is the set of states that achieve the reachability of a target tube objective with a minimum probability $\alpha$. This set is shown graphically in Figure LABEL:fig:stoch-reach-set. For brevity we will often refer to the stochastic $\alpha$-level reach set simply as the stochastic reach set.
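For intuition, the supremum in (6) can be lower-bounded by simulating any one fixed policy: the success probability of a fixed feedback law never exceeds $V^*_k$. The sketch below estimates such a lower bound by Monte Carlo for a hypothetical double integrator with zero input, a box target tube, and a Gaussian disturbance; all dynamics and numbers are illustrative assumptions, not values from the paper.

```python
import numpy as np

# Monte Carlo estimate of the target-tube success probability for ONE
# fixed feedback law (here, u = 0). Since (4) takes a supremum over
# policies, any fixed policy's success probability lower-bounds V_0*(x).
rng = np.random.default_rng(1)
T = 0.25
A = np.array([[1.0, T], [0.0, 1.0]])
N = 5
box = 1.0      # hypothetical target tube: T_k = [-1, 1]^2 for all k
sigma = 0.05   # hypothetical disturbance w_k ~ N(0, sigma^2 I)

def success(x0, n_samples=2000):
    hits = 0
    for _ in range(n_samples):
        x, ok = x0.copy(), True
        for _ in range(N):
            x = A @ x + sigma * rng.standard_normal(2)  # u = 0 policy
            if np.any(np.abs(x) > box):
                ok = False  # trajectory left the tube
                break
        hits += ok
    return hits / n_samples

p_origin = success(np.array([0.0, 0.0]))  # deep inside the tube
p_edge = success(np.array([0.95, 0.4]))   # drifts out almost surely
```

States near the tube boundary with outward velocity yield a much lower estimate than states near the origin, matching the level-set picture above.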

### 2.5 Disturbance minimal and maximal reachability of a target tube

Now, consider the nonlinear system (1) with an uncertain disturbance drawn from a bounded disturbance set. Two types of phenomena are of interest: the disturbance minimal reachability of a target tube and the disturbance maximal reachability of a target tube. In this subsection, we will treat the disturbance as a non-stochastic uncertainty. For these problems we again consider admissible state-feedback laws $\nu_t \in \mathcal{F}$.

For the disturbance minimal reachability problem, we are interested in the existence of a feedback controller for which the system (1) will remain in a target tube $\mathcal{T}_N$ despite the worst possible choice of the disturbance from a bounded set $\mathcal{E} \subseteq \mathcal{W}$. The set of states from which the system may be driven by such a control policy is the $k$-time disturbance minimal reach set

 $\mathrm{Reach}^\flat_k(\mathcal{T}_N, \mathcal{E}) = \{x_k \in \mathcal{T}_k :\ \forall t \in \mathbb{N}_{[k,N-1]},\ \exists \nu_t \in \mathcal{F},\ \forall w_t \in \mathcal{E},\ x_{t+1} \in \mathcal{T}_{t+1}\}$ (7)

The problem of disturbance maximal reachability of the target tube concerns the system (1) with a disturbance drawn from a bounded set $\mathcal{O} \subseteq \mathcal{W}$. Here, we seek a feedback controller that ensures that the system stays within the target tube under the best possible choice of the disturbance in $\mathcal{O}$. The set of states from which the system may be driven by such a control policy is the $k$-time disturbance maximal reach set,

 $\mathrm{Reach}^\sharp_k(\mathcal{T}_N, \mathcal{O}) = \{x_k \in \mathcal{T}_k :\ \forall t \in \mathbb{N}_{[k,N-1]},\ \exists \nu_t \in \mathcal{F},\ \exists w_t \in \mathcal{O},\ x_{t+1} \in \mathcal{T}_{t+1}\}.$ (8)

The disturbance minimal and maximal reach sets, (7) and (8), can be seen graphically in Figure LABEL:fig:uncertain-reach-sets.

For brevity, when the discrete-time index is apparent, we will refer to the $k$-time disturbance minimal and maximal reach sets more succinctly as the disturbance minimal and disturbance maximal reach sets. Note that the disturbance minimal reach set is equivalent to the set studied in the well-known reachability of a target tube problem [22, 30], and is also known as the robust controllable set in the model predictive control community [15].

### 2.6 Problem statements

This paper utilizes disturbance minimal and maximal reach sets to efficiently and scalably approximate the stochastic $\alpha$-level reach set. We achieve this by addressing the following problems:

###### Problem 1.

Given a value $\alpha \in [0,1]$, characterize bounded disturbance sets $\mathcal{E}$ and $\mathcal{O}$ whose corresponding disturbance minimal and maximal reach sets respectively under- and overapproximate the stochastic $\alpha$-level reach set (6), i.e., find $\mathcal{E}$ and $\mathcal{O}$ such that $\mathrm{Reach}^\flat_k(\mathcal{T}_N, \mathcal{E}) \subseteq \mathcal{L}_k(\mathcal{T}_N, \alpha) \subseteq \mathrm{Reach}^\sharp_k(\mathcal{T}_N, \mathcal{O})$.

###### Problem 1.a.

Characterize the sufficient conditions for which the optimal control policy corresponding to the disturbance minimal and maximal reach sets is a Markov control policy for the system (1).

Next, we discuss convex optimization-based techniques to construct $\mathcal{E}$ and $\mathcal{O}$.

###### Problem 2.

Propose computationally efficient methods to compute sets $\mathcal{E}$ and $\mathcal{O}$ that satisfy Problem 1.

We will also discuss the min-max and min-min problems that may be used to obtain these sets [30, Sec. 4.6.2].

###### Problem 3.

Construct Lagrangian-based recursions for the exact computation of the $k$-time disturbance minimal and disturbance maximal reach sets for the system (1).

###### Problem 4.

Improve the approximations obtained via Problem 1 by using multiple disturbance subsets.

## 3 Lagrangian approximations for the stochastic effective level sets

In this section we detail how disturbance minimal and maximal reach sets are used to approximate stochastic reach sets. We first detail the conditions on $\mathcal{E}$ that allow the disturbance minimal reach set to be a conservative underapproximation, and then detail the conditions on $\mathcal{O}$ that ensure that the disturbance maximal reach set provides an overapproximation. The theory in this section requires the conditions stated in the following assumption.

###### Assumption 1.

The target tube $\mathcal{T}_N$ is closed, the input space $\mathcal{U}$ is compact, and $f_k(x, u)$ is continuous in $u$.

### 3.1 Underapproximation of the stochastic α-level reach set

We will first address Problem 1.a via Theorem 1.

###### Theorem 1.

Under Assumption 1, there exists an optimal Markov policy $\pi^*$ associated with the disturbance minimal reach set $\mathrm{Reach}^\flat_k(\mathcal{T}_N, \mathcal{E})$.

The proof for Theorem 1 is deferred to Section 4.4. The guarantee of the existence of an optimal Markov policy for the disturbance minimal reach set allows for the definition of the probability measure $\mathbb{P}^{x_k,\pi^*}_{\bar{x}_{[k+1,N]}}$, which is essential for establishing the conditions on $\mathcal{E}$ that make the disturbance minimal reach set a conservative approximation of the stochastic reach set, as shown in the following Proposition and Theorem.

###### Proposition 1.

Under Assumption 1, for every $y \in \mathcal{X}$, if $y \in \mathrm{Reach}^\flat_k(\mathcal{T}_N, \mathcal{E})$, then

 $\mathbb{P}^{x_k,\pi^*}_{\bar{x}_{[k+1,N]}} \left( \bigcap_{t=k+1}^{N} \{x_t \in \mathcal{T}_t\} \,\Big|\, y,\ \bar{w}_{[k,N-1]} \in \mathcal{E}^{N-k} \right) = 1.$ (9)
###### Proof.

Theorem 1 ensures that the probability measure on the left-hand side of (9) exists. The equality is thus ensured by the definition of the disturbance minimal reach set, (7). ∎

###### Theorem 2.

Under Assumption 1, for some $\alpha \in [0,1]$, $k \in \mathbb{N}_{[0,N-1]}$, and bounded set $\mathcal{E} \subseteq \mathcal{W}$ such that for all $t \in \mathbb{N}_{[k,N-1]}$, $(\mathbb{P}_w(w_t \in \mathcal{E}))^{N-k} = \alpha$, then $\mathrm{Reach}^\flat_k(\mathcal{T}_N, \mathcal{E}) \subseteq \mathcal{L}_k(\mathcal{T}_N, \alpha)$.

###### Proof.

We are interested in underapproximating $V^*_k$ as defined in (6). If $x \in \mathrm{Reach}^\flat_k(\mathcal{T}_N, \mathcal{E})$, then, by Theorem 1, (5b), and the law of total probability,

 $V^*_k(x) = \mathbb{P}^{x_k,\pi^*}_{\bar{x}_{[k+1,N]}} \left( \bigcap_{t=k+1}^{N} \{x_t \in \mathcal{T}_t\} \,\Big|\, x,\ \bar{w}_{[k,N-1]} \in \mathcal{E}^{N-k} \right) \mathbb{P}_{\bar{w}_{[k,N-1]}}\left(\bar{w}_{[k,N-1]} \in \mathcal{E}^{N-k}\right) + \mathbb{P}^{x_k,\pi^*}_{\bar{x}_{[k+1,N]}} \left( \bigcap_{t=k+1}^{N} \{x_t \in \mathcal{T}_t\} \,\Big|\, x,\ \bar{w}_{[k,N-1]} \notin \mathcal{E}^{N-k} \right) \mathbb{P}_{\bar{w}_{[k,N-1]}}\left(\bar{w}_{[k,N-1]} \notin \mathcal{E}^{N-k}\right)$ (10)

where the definition of the disturbance minimal reach set (7) implies that $x \in \mathcal{T}_k$. Ignoring the second term of (10), which is non-negative,

 $V^*_k(x) \ge \mathbb{P}^{x_k,\pi^*}_{\bar{x}_{[k+1,N]}} \left( \bigcap_{t=k+1}^{N} \{x_t \in \mathcal{T}_t\} \,\Big|\, x,\ \bar{w}_{[k,N-1]} \in \mathcal{E}^{N-k} \right) \mathbb{P}_{\bar{w}_{[k,N-1]}}\left(\bar{w}_{[k,N-1]} \in \mathcal{E}^{N-k}\right).$ (11)

Simplifying (11) using Proposition 1 and the i.i.d. assumption on the disturbance process, we obtain

 $V^*_k(x) \ge \mathbb{P}_{\bar{w}_{[k,N-1]}}\left(\bar{w}_{[k,N-1]} \in \mathcal{E}^{N-k}\right) = \left(\mathbb{P}_w(w_t \in \mathcal{E})\right)^{N-k} = \alpha.$ (12)

Thus, if $x \in \mathrm{Reach}^\flat_k(\mathcal{T}_N, \mathcal{E})$ then $x \in \mathcal{L}_k(\mathcal{T}_N, \alpha)$ by (6), implying $\mathrm{Reach}^\flat_k(\mathcal{T}_N, \mathcal{E}) \subseteq \mathcal{L}_k(\mathcal{T}_N, \alpha)$. ∎

Theorem 2 characterizes conditions on $\mathcal{E}$ such that $\mathrm{Reach}^\flat_k(\mathcal{T}_N, \mathcal{E})$ will be a conservative underapproximation of $\mathcal{L}_k(\mathcal{T}_N, \alpha)$. This allows fast underapproximations of $\mathcal{L}_k(\mathcal{T}_N, \alpha)$ to be determined through Lagrangian computation of the disturbance minimal reach set. These methods will be further detailed in Section 4.

### 3.2 Overapproximation of the stochastic α-level reach set

We now move to the less common analysis: the overapproximation of the stochastic $\alpha$-level reach set.

###### Theorem 3.

Under Assumption 1, for some $\alpha \in [0,1]$, $k \in \mathbb{N}_{[0,N-1]}$, and bounded set $\mathcal{O} \subseteq \mathcal{W}$ such that for all $t \in \mathbb{N}_{[k,N-1]}$, $(\mathbb{P}_w(w_t \in \mathcal{O}))^{N-k} = 1 - \alpha$, then $\mathcal{L}_k(\mathcal{T}_N, \alpha) \subseteq \mathrm{Reach}^\sharp_k(\mathcal{T}_N, \mathcal{O})$.

###### Proof.

After characterizing $\mathcal{X} \setminus \mathrm{Reach}^\sharp_k(\mathcal{T}_N, \mathcal{O})$ and $\mathcal{X} \setminus \mathcal{L}_k(\mathcal{T}_N, \alpha)$, we will show that $\mathcal{X} \setminus \mathrm{Reach}^\sharp_k(\mathcal{T}_N, \mathcal{O}) \subseteq \mathcal{X} \setminus \mathcal{L}_k(\mathcal{T}_N, \alpha)$ for the given $\alpha$. This implies that $\mathcal{L}_k(\mathcal{T}_N, \alpha) \subseteq \mathrm{Reach}^\sharp_k(\mathcal{T}_N, \mathcal{O})$, as desired.

Using DeMorgan's law, we can write $\mathcal{X} \setminus \mathcal{L}_k(\mathcal{T}_N, \alpha)$ as

 $\mathcal{X} \setminus \mathcal{L}_k(\mathcal{T}_N, \alpha) = \left\{ y \in \mathcal{X} :\ \inf_{\pi \in \mathcal{M}} \mathbb{P}^{x_k,\pi}_{\bar{x}_{[k+1,N]}} \left( \bigcup_{t=k+1}^{N} \{x_t \notin \mathcal{T}_t\} \,\Big|\, y \right) > 1 - \alpha \right\}.$ (17)

From the definition of $\mathrm{Reach}^\sharp_k(\mathcal{T}_N, \mathcal{O})$ (8),

 $\mathcal{X} \setminus \mathrm{Reach}^\sharp_k(\mathcal{T}_N, \mathcal{O}) = \{x_k \in \mathcal{X} :\ \exists t \in \mathbb{N}_{[k,N-1]},\ \forall \nu_t \in \mathcal{F},\ \forall w_t \in \mathcal{O},\ x_{t+1} \notin \mathcal{T}_{t+1}\}.$ (13)

Hence, given $y \in \mathcal{X} \setminus \mathrm{Reach}^\sharp_k(\mathcal{T}_N, \mathcal{O})$, for any admissible control policy, and specifically any Markov control policy $\pi \in \mathcal{M}$ since $\mathcal{M} \subseteq \mathcal{P}$,

 $\mathbb{P}^{x_k,\pi}_{\bar{x}_{[k+1,N]}} \left( \bigcup_{t=k+1}^{N} \{x_t \notin \mathcal{T}_t\} \,\Big|\, y,\ \bar{w}_{[k,N-1]} \in \mathcal{O}^{N-k} \right) = 1$ (14)

By the law of total probability,

 $\mathbb{P}^{x_k,\pi}_{\bar{x}_{[k+1,N]}} \left( \bigcup_{t=k+1}^{N} \{x_t \notin \mathcal{T}_t\} \,\Big|\, y \right) \ge \mathbb{P}^{x_k,\pi}_{\bar{x}_{[k+1,N]}} \left( \bigcup_{t=k+1}^{N} \{x_t \notin \mathcal{T}_t\} \,\Big|\, y,\ \bar{w}_{[k,N-1]} \in \mathcal{O}^{N-k} \right) \mathbb{P}_{\bar{w}_{[k,N-1]}}\left(\bar{w}_{[k,N-1]} \in \mathcal{O}^{N-k}\right).$ (15)

Since $(\mathbb{P}_w(w_t \in \mathcal{O}))^{N-k} = 1 - \alpha$ and the disturbance is i.i.d.,

 $\mathbb{P}_{\bar{w}_{[k,N-1]}}\left(\bar{w}_{[k,N-1]} \in \mathcal{O}^{N-k}\right) = 1 - \alpha$ (16)

From (14), (15), and (16), we conclude that for every $y \in \mathcal{X} \setminus \mathrm{Reach}^\sharp_k(\mathcal{T}_N, \mathcal{O})$ and every $\pi \in \mathcal{M}$, $\mathbb{P}^{x_k,\pi}_{\bar{x}_{[k+1,N]}} ( \bigcup_{t=k+1}^{N} \{x_t \notin \mathcal{T}_t\} \mid y ) \ge 1 - \alpha$, implying $y \notin \mathcal{L}_k(\mathcal{T}_N, \alpha)$ by (17). ∎

Note that because of the infimum over all Markov policies in (17), we can be assured that the measure in (14) is well defined for any $\pi \in \mathcal{M}$. This contrasts with Theorem 2, for which Theorem 1 was necessary to ensure that the probability measure in (9) is well defined.

We summarize the sufficient conditions put forward in Theorems 2 and 3 in Theorem 4, which addresses Problem 1, i.e., it characterizes the conditions on $\mathcal{E}$ and $\mathcal{O}$ that ensure that the disturbance minimal and maximal reach sets will under- and overapproximate $\mathcal{L}_k(\mathcal{T}_N, \alpha)$, respectively.

###### Theorem 4.

Under Assumption 1, given $\alpha \in [0,1]$ and bounded sets $\mathcal{E}, \mathcal{O} \subseteq \mathcal{W}$ such that for all $t \in \mathbb{N}_{[k,N-1]}$, $(\mathbb{P}_w(w_t \in \mathcal{E}))^{N-k} = \alpha$ and $(\mathbb{P}_w(w_t \in \mathcal{O}))^{N-k} = 1 - \alpha$, then $\mathrm{Reach}^\flat_k(\mathcal{T}_N, \mathcal{E}) \subseteq \mathcal{L}_k(\mathcal{T}_N, \alpha) \subseteq \mathrm{Reach}^\sharp_k(\mathcal{T}_N, \mathcal{O})$ for $k \in \mathbb{N}_{[0,N-1]}$.

###### Remark 1.

The bounded disturbance sets $\mathcal{E}$ and $\mathcal{O}$ which satisfy Theorem 4 are not unique.

We will describe later, in Section 6, methodologies for computing these bounded sets. As noted in Remark 1, these sets are not unique. However, the algorithms that will be presented provide efficient means for generating $\mathcal{E}$ and $\mathcal{O}$ for Gaussian and non-Gaussian disturbances alike.
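To make the conditions of Theorem 4 concrete: at each step the set $\mathcal{E}$ must carry probability mass $\alpha^{1/(N-k)}$ and $\mathcal{O}$ mass $(1-\alpha)^{1/(N-k)}$. For a standard Gaussian disturbance $w \sim \mathcal{N}(0, I_n)$, one convenient (but, per Remark 1, not unique) choice is a Euclidean ball whose radius follows from a chi-squared quantile. The sketch below assumes this Gaussian setting with hypothetical values of $\alpha$, $N$, $k$, and $n$.

```python
import numpy as np
from scipy.stats import chi2

# Per-step probability levels from Theorem 4: the bounded sets must carry
# mass alpha^(1/(N-k)) (for E) and (1-alpha)^(1/(N-k)) (for O) at each
# step. For w ~ N(0, I_n), a Euclidean ball of radius r carries mass
# chi2.cdf(r**2, df=n), so the radius follows from the chi2 quantile.
def ball_radius(mass, n):
    return np.sqrt(chi2.ppf(mass, df=n))

alpha, N, k, n = 0.8, 5, 0, 2              # hypothetical problem data
beta_E = alpha ** (1.0 / (N - k))          # per-step mass for E
beta_O = (1.0 - alpha) ** (1.0 / (N - k))  # per-step mass for O

r_E = ball_radius(beta_E, n)  # ball for the underapproximation set E
r_O = ball_radius(beta_O, n)  # ball for the overapproximation set O
```

Since $\beta_{\mathcal{E}} > \beta_{\mathcal{O}}$ here, the set $\mathcal{E}$ is the larger ball: the underapproximation must be robust to more of the disturbance's mass.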

## 4 Lagrangian methods for computation of disturbance minimal and maximal reach sets

In this section we presume that we have bounded sets $\mathcal{E}$ and $\mathcal{O}$ which satisfy the conditions given in Theorem 4. We now demonstrate convenient backward recursions to compute (7) and (8) using set operations.

For these recursions we will need to define, as in [4, 26], the unperturbed, one-step backward reach set from a set $\mathcal{S} \subseteq \mathcal{X}$, denoted $\mathcal{R}_{1,k}(\mathcal{S})$. Formally, for the nonlinear system (1),

 $\mathcal{R}_{1,k}(\mathcal{S}) \triangleq \{x^- \in \mathcal{X} :\ \exists u \in \mathcal{U},\ \exists y \in \mathcal{S},\ y = f_k(x^-, u)\} = \{x^- \in \mathcal{X} : \mathcal{F}_{1,k}(x^-) \cap \mathcal{S} \neq \emptyset\}$ (18)

where $\mathcal{F}_{1,k}(x^-) = \{f_k(x^-, u) : u \in \mathcal{U}\}$ is the unperturbed, one-step forward reach set from $x^-$. For an LTV system (2), these sets can be written as

 $\mathcal{R}_{1,k}(\mathcal{S}) = A_k^{-1}\left(\mathcal{S} \oplus (-B_k\, \mathcal{U})\right).$ (19)
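For an LTI instance of (2) with a box target set and an interval input set, (19) can be sketched in vertex representation: the candidate vertices of $\mathcal{S} \oplus (-B\,\mathcal{U})$ are the box corners shifted by the input extremes, mapped through $A^{-1}$ (a linear bijection maps vertices to vertices). The matrices and bounds below are hypothetical; a production implementation would use a polytope library (e.g. MPT) and a convex-hull step.

```python
import numpy as np
from itertools import product

# One-step backward reach set (19) in vertex representation:
# R = A^{-1}(S ⊕ (-B U)), for a box S and an interval input set U.
T = 0.25
A = np.array([[1.0, T], [0.0, 1.0]])
B = np.array([[T**2 / 2], [T]])
Ainv = np.linalg.inv(A)

S_lo, S_hi = np.array([-1.0, -1.0]), np.array([1.0, 1.0])  # target box S
u_max = 1.0                                                # U = [-1, 1]

# Candidate vertices of S ⊕ (-B U): the 4 box corners shifted by ±B u_max.
corners = [np.array(c) for c in product(*zip(S_lo, S_hi))]
shifts = [-B[:, 0] * u_max, B[:, 0] * u_max]
verts = np.array([Ainv @ (c + s) for c in corners for s in shifts])
# `verts` spans R_{1,k}(S); a convex-hull step would drop interior points.
```

Every vertex maps forward into $\mathcal{S} \oplus (-B\,\mathcal{U})$ under $A$, which is exactly the membership test implied by (19).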

### 4.1 Recursion for the disturbance minimal reach set $\mathrm{Reach}^\flat_k(\mathcal{T}_N, \mathcal{E})$

The following theorem details the backward recursion for the disturbance minimal reach set.

###### Theorem 5.

For the system given in (1), the $k$-time disturbance minimal reach set can be computed using the recursion, for $k \in \mathbb{N}_{[0,N-1]}$:

 $\mathrm{Reach}^\flat_N(\mathcal{T}_N, \mathcal{E}) = \mathcal{T}_N$ (20)
 $\mathrm{Reach}^\flat_k(\mathcal{T}_N, \mathcal{E}) = \mathcal{T}_k \cap \mathcal{R}_{1,k}\left(\mathrm{Reach}^\flat_{k+1}(\mathcal{T}_N, \mathcal{E}) \ominus \mathcal{E}\right)$ (21)
###### Proof.

We prove this by induction. Starting with the base case, $k = N-1$,

 $\mathcal{T}_{N-1} \cap \mathcal{R}_{1,N-1}\left(\mathrm{Reach}^\flat_N(\mathcal{T}_N, \mathcal{E}) \ominus \mathcal{E}\right) = \mathcal{T}_{N-1} \cap \mathcal{R}_{1,N-1}(\mathcal{T}_N \ominus \mathcal{E})$ (22)
 $= \mathcal{T}_{N-1} \cap \{x^- \in \mathcal{X} :\ \exists \nu \in \mathcal{F},\ \forall w_{N-1} \in \mathcal{E},\ f_{N-1}(x^-, \nu(x^-)) + w_{N-1} \in \mathcal{T}_N\}$ (23)
 $= \{x_{N-1} \in \mathcal{T}_{N-1} :\ \exists \nu_{N-1} \in \mathcal{F},\ \forall w_{N-1} \in \mathcal{E},\ f_{N-1}(x_{N-1}, \nu_{N-1}(x_{N-1})) + w_{N-1} \in \mathcal{T}_N\}$ (24)
 $= \{x_{N-1} \in \mathcal{T}_{N-1} :\ \exists \nu_{N-1} \in \mathcal{F},\ \forall w_{N-1} \in \mathcal{E},\ x_N \in \mathcal{T}_N\}$ (25)
 $= \mathrm{Reach}^\flat_{N-1}(\mathcal{T}_N, \mathcal{E}).$ (26)

For any $k \in \mathbb{N}_{[0,N-2]}$, assuming the recursion holds at $k+1$,

 $\mathcal{T}_k \cap \mathcal{R}_{1,k}\left(\mathrm{Reach}^\flat_{k+1}(\mathcal{T}_N, \mathcal{E}) \ominus \mathcal{E}\right) = \{x_k \in \mathcal{T}_k :\ \exists \nu_k \in \mathcal{F},\ \forall w_k \in \mathcal{E},\ f_k(x_k, \nu_k(x_k)) + w_k \in \mathrm{Reach}^\flat_{k+1}(\mathcal{T}_N, \mathcal{E})\}$ (27)
 $= \{x_k \in \mathcal{T}_k :\ \exists \nu_k \in \mathcal{F},\ \forall w_k \in \mathcal{E},\ x_{k+1} \in \mathrm{Reach}^\flat_{k+1}(\mathcal{T}_N, \mathcal{E})\}.$ (28)

By expanding $\mathrm{Reach}^\flat_{k+1}(\mathcal{T}_N, \mathcal{E})$ with its definition (7),

 $= \{x_k \in \mathcal{T}_k :\ \forall t \in \mathbb{N}_{[k,N-1]},\ \exists \nu_t \in \mathcal{F},\ \forall w_t \in \mathcal{E},\ x_{t+1} \in \mathcal{T}_{t+1}\} = \mathrm{Reach}^\flat_k(\mathcal{T}_N, \mathcal{E})$ (29)

which completes the proof. ∎

In systems for which the disturbance is not affine, i.e.

 $x_{k+1} = f_k(x_k, u_k, w_k)$

the recursion in Theorem 5 can be altered by replacing the Minkowski difference and one-step backward reach set in (21) with the appropriate predecessor set operation [24]. Since many systems can be written in the form (1), i.e. with affine disturbances, we do not formally prove the recursion using predecessor sets.
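For a scalar LTI system every set appearing in Theorem 5's recursion stays an interval, so (20)-(21) can be sketched in a few lines of interval arithmetic. The system parameters, input and disturbance intervals, and target tube below are hypothetical choices for illustration.

```python
import numpy as np

# Theorem 5's recursion (20)-(21) for a scalar LTI system
# x_{k+1} = a x_k + b u_k + w_k, where the Minkowski difference (⊖ E),
# Minkowski sum (⊕ -b U), and A^{-1} image all stay intervals.
a, b = 1.1, 0.5
U = (-1.0, 1.0)                  # input interval
E = (-0.1, 0.1)                  # hypothetical bounded disturbance set E
N = 5
tube = [(-1.0, 1.0)] * (N + 1)   # target tube T_0, ..., T_N

def back_recursion(tube, E):
    reach = tube[-1]                          # (20): Reach_N = T_N
    for k in range(len(tube) - 2, -1, -1):
        lo, hi = reach
        lo, hi = lo - E[0], hi - E[1]         # ⊖ E (interval Pontryagin diff)
        if lo > hi:
            return None                       # set became empty
        lo, hi = lo - b * U[1], hi - b * U[0]  # ⊕ (-b U)
        lo, hi = lo / a, hi / a                # A^{-1} image (a > 0)
        tl, th = tube[k]
        reach = (max(lo, tl), min(hi, th))     # ∩ T_k, completing (21)
        if reach[0] > reach[1]:
            return None
    return reach

reach0 = back_recursion(tube, E)   # disturbance minimal reach set at k = 0
```

For these numbers the input authority dominates the disturbance and the recursion reaches a fixed point at the full tube cross-section; shrinking `U` or growing `E` shrinks `reach0`, reflecting the conservativeness trade-off discussed in Section 3.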

### 4.2 Recursion for the disturbance maximal reach set $\mathrm{Reach}^\sharp_k(\mathcal{T}_N, \mathcal{O})$

We follow a methodology similar to that of the previous section to establish a recursion for computing the disturbance maximal reach set.

###### Theorem 6.

For the system given in (1), the $k$-time disturbance maximal reach set can be computed using the recursion, for $k \in \mathbb{N}_{[0,N-1]}$:

 $\mathrm{Reach}^\sharp_N(\mathcal{T}_N, \mathcal{O}) = \mathcal{T}_N$ (30)
 $\mathrm{Reach}^\sharp_k(\mathcal{T}_N, \mathcal{O}) = \mathcal{T}_k \cap \mathcal{R}_{1,k}\left(\mathrm{Reach}^\sharp_{k+1}(\mathcal{T}_N, \mathcal{O}) \oplus (-\mathcal{O})\right)$ (31)
###### Proof.

Again, we prove this by induction. Starting with the base case, $k = N-1$,

 $\mathcal{T}_{N-1} \cap \mathcal{R}_{1,N-1}\left(\mathrm{Reach}^\sharp_N(\mathcal{T}_N, \mathcal{O}) \oplus (-\mathcal{O})\right) = \mathcal{T}_{N-1} \cap \mathcal{R}_{1,N-1}(\mathcal{T}_N \oplus (-\mathcal{O}))$ (32)
 $= \mathcal{T}_{N-1} \cap \{x^- \in \mathcal{X} :\ \exists \nu \in \mathcal{F},\ \exists w_{N-1} \in \mathcal{O},\ f_{N-1}(x^-, \nu(x^-)) + w_{N-1} \in \mathcal{T}_N\}$ (33)
 $= \{x_{N-1} \in \mathcal{T}_{N-1} :\ \exists \nu_{N-1} \in \mathcal{F},$