Hamilton-Jacobi formulation for Reach-Avoid Differential Games
Abstract.
A new framework for formulating reachability problems with competing inputs, nonlinear dynamics and state constraints as optimal control problems is developed. Such reach-avoid problems arise in, among others, the study of safety problems in hybrid systems. Earlier approaches to reach-avoid computations are either restricted to linear systems, or face numerical difficulties due to possible discontinuities in the Hamiltonian of the optimal control problem. The main advantage of the approach proposed in this paper is that it can be applied to a general class of target-hitting continuous dynamic games with nonlinear dynamics, and that it is well suited to numerical solution, since the value function and the Hamiltonian of the system are both continuous. The performance of the proposed method is demonstrated by applying it to a two-aircraft collision avoidance scenario under target window constraints and in the presence of wind disturbance. Target Windows are a novel concept in air traffic management, and represent spatial and temporal constraints that the aircraft have to respect to meet their schedule.
1. Introduction
Reachability for continuous and hybrid systems has been an important topic of research in the dynamics and control literature. Numerous problems regarding safety of air traffic management systems [1], [2], flight control [3], [4], [5], ground transportation systems [6], [7], etc. have been formulated in the framework of reachability theory. In most of these applications the main aim was to design suitable controllers to steer or keep the state of the system in a "safe" part of the state space. The synthesis of such safe controllers for hybrid systems relies on the ability to solve target problems for the case where state constraints are also present. The sets that represent the solution to those problems are known as capture basins [8]. One direct way of computing these sets was proposed in [9], [10], and was formulated in the context of viability theory [8]. Following the same approach, the authors of [11], [12] formulated viability, invariance and pursuit-evasion gaming problems for hybrid systems and used nonsmooth analysis tools to characterize their solutions. Computational tools to support this approach have already been developed by [13].
An alternative, indirect way of characterizing such problems is through the level sets of the value function of an appropriate optimal control problem. By using dynamic programming, for reachability/invariance/viability problems without state constraints, the value function can be characterized as the viscosity solution to a first order partial differential equation in the standard Hamilton-Jacobi form [14], [15], [16]. Numerical algorithms based on level set methods have been developed by [17], [18], have been coded in efficient computational tools by [16], [19], and can be directly applied to reachability computations.
In the case where state constraints are also present, this target-hitting problem is the solution to a reach-avoid problem in the sense of [1]. The authors of [1], [20] developed a reach-avoid computation, whose value function was characterized as a solution to a pair of coupled variational inequalities. In [19], [21], [22] the authors proposed another characterization, which involved only one Hamilton-Jacobi type partial differential equation together with an inequality constraint. From a numerical point of view, these methods are hampered by the fact that the Hamiltonian of the system is in general discontinuous [20].
In [23], a scheme based on ellipsoidal techniques was proposed to compute reachable sets for control systems with state constraints. This approach was restricted to the class of linear systems. In [24], this approach was extended to a list of interesting target problems with state constraints. The calculation of a solution to the equations proposed in [23], [24] is in general not easy apart from the case of linear systems, where duality techniques of convex analysis can be used.
In this paper we propose a new framework for characterizing reach-avoid sets of nonlinear control systems as the solution to an optimal control problem. We consider the case where we have competing inputs and hence adopt the gaming formulation proposed in [15]. We first restrict our attention to a specific reach-avoid scenario, where the objective of the control input is to make the states of the system hit the target at the end of the time horizon without violating the state constraints, while the disturbance input tries to steer the trajectories of the system away from the target. We then generalize our approach to the case where the controller aims to steer the system towards the target not necessarily at the terminal time, but at some time within the specified time horizon. Both problems can be treated as pursuit-evasion games, and for a worst-case setting we define a value function similar to [24] and prove that it is the unique continuous viscosity solution to a quasi-variational inequality of a form similar to [25], [26]. The advantage of this approach is that the properties of the value function and the Hamiltonian (both of them are continuous) enable us to use existing tools to compute the solution of the problem numerically.
To illustrate our approach, we consider a reach-avoid problem that arises in the area of air traffic management, in particular the problem of collision avoidance in the presence of 4D constraints, called Target Windows. Target Windows (TW) are spatial and temporal constraints and form the basis of the CATS research project [27], whose aim is to increase punctuality and predictability during the flight. In [28] a reachability approach for encoding TW constraints was proposed. We adopt this framework and consider a multi-agent setting, where each aircraft should respect its TW constraints while avoiding conflict with other aircraft in the presence of wind. Since both control and disturbance inputs (in our case the wind) are present, this problem can be treated as a pursuit-evasion differential game with state constraints, which are determined dynamically by performing conflict detection.
In Section 2 we pose two reach-avoid problems for continuous systems with competing inputs and state constraints, and formulate them in the optimal control framework. Section 3 provides the characterization of the value functions of these problems as the viscosity solutions to two variational inequalities. In Section 4 we present an application of this approach to a two-aircraft collision avoidance scenario with realistic data. Finally, in Section 5 we provide some concluding remarks and directions for future work.
2. Differential games and Reach-Avoid problems
2.1. Differential game problem formulation
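Consider the continuous time control system $\dot{x} = f(x, u, v)$, and an arbitrary time horizon $T \geq 0$,
with $x \in \mathbb{R}^n$, $u \in U \subseteq \mathbb{R}^m$, $v \in V \subseteq \mathbb{R}^p$, and $f : \mathbb{R}^n \times U \times V \to \mathbb{R}^n$.
Let $\mathcal{U}_{[t,t']}$, $\mathcal{V}_{[t,t']}$ denote the sets of Lebesgue measurable functions from the interval $[t,t']$ to $U$ and $V$ respectively. Consider also two functions $l : \mathbb{R}^n \to \mathbb{R}$, $h : \mathbb{R}^n \to \mathbb{R}$, to be used to encode the target and state constraints respectively.
Assumption 1. $U$ and $V$ are compact. $f$, $l$ and $h$ are bounded and Lipschitz continuous in $x$ and continuous in $u$ and $v$.
Under Assumption 1 the system admits a unique solution $x(\cdot) : [t,T] \to \mathbb{R}^n$ for all $t \in [0,T]$, $x \in \mathbb{R}^n$, $u(\cdot) \in \mathcal{U}_{[t,T]}$ and $v(\cdot) \in \mathcal{V}_{[t,T]}$. For $\tau \in [t,T]$ this solution will be denoted as
(1) $$\phi(\tau, t, x, u(\cdot), v(\cdot)) = x(\tau).$$
Let $C_f > 0$ be a bound such that for all $x, \hat{x} \in \mathbb{R}^n$ and for all $u \in U$ and $v \in V$,
$$|f(x,u,v)| \leq C_f \quad \text{and} \quad |f(x,u,v) - f(\hat{x},u,v)| \leq C_f |x - \hat{x}|.$$
Let also $C_l > 0$ and $C_h > 0$ be such that
$$|l(x)| \leq C_l, \quad |l(x) - l(\hat{x})| \leq C_l |x - \hat{x}|, \quad |h(x)| \leq C_h, \quad |h(x) - h(\hat{x})| \leq C_h |x - \hat{x}|.$$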
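In a game setting it is essential to define the information patterns that the two players use. Following [29], [15] we restrict the first player to play nonanticipative strategies. A nonanticipative strategy is a function $\gamma : \mathcal{V}_{[t,T]} \to \mathcal{U}_{[t,T]}$ such that for all $s \in [t,T]$ and for all $v(\cdot), \hat{v}(\cdot) \in \mathcal{V}_{[t,T]}$, if $v(\tau) = \hat{v}(\tau)$ for almost every $\tau \in [t,s]$, then $\gamma[v](\tau) = \gamma[\hat{v}](\tau)$ for almost every $\tau \in [t,s]$. We then use $\Gamma_{[t,T]}$ to denote the class of nonanticipative strategies.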
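The nonanticipativity condition can be illustrated with a minimal sketch in discrete time. This is an assumption made for illustration only: the paper works with measurable signals in continuous time, whereas here signals are finite lists of samples, and the names `respects_causality`, `cancel_current` and `cancel_next` are hypothetical. The check below is a necessary condition evaluated on one pair of disturbance sequences, not a proof that a strategy is nonanticipative.

```python
def respects_causality(strategy, v1, v2):
    """Check the nonanticipativity condition on one pair of disturbance sequences.

    Whenever v1 and v2 agree on samples 0..k, the controls produced by
    `strategy` must also agree on samples 0..k.
    """
    u1, u2 = strategy(v1), strategy(v2)
    for k in range(len(v1)):
        if v1[:k + 1] == v2[:k + 1] and u1[:k + 1] != u2[:k + 1]:
            return False
    return True


def cancel_current(v):
    # Causal: the control at step k depends only on v[k].
    return [-vk for vk in v]


def cancel_next(v):
    # Anticipative: the control at step k peeks at v[k + 1].
    return [-v[k + 1] if k + 1 < len(v) else 0.0 for k in range(len(v))]


v_a = [0.5, 0.5, -0.5]
v_b = [0.5, -0.5, -0.5]   # agrees with v_a only on the first sample

causal_ok = respects_causality(cancel_current, v_a, v_b)
peeking_ok = respects_causality(cancel_next, v_a, v_b)
```

In this example `cancel_current` passes the check, while `cancel_next` fails it because its first control sample already depends on the second disturbance sample.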
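Consider the sets $R$, $A$ related to the level sets of the two bounded, Lipschitz continuous functions $l$ and $h$ respectively. For technical purposes assume that $R$ is closed whereas $A$ is open. Then $R$ and $A$ can be characterized as
$$R = \{x \in \mathbb{R}^n \mid l(x) \leq 0\}, \qquad A = \{x \in \mathbb{R}^n \mid h(x) > 0\}.$$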
2.2. Reach-Avoid at the terminal time
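Consider now a closed set $R$ that we would like to reach while avoiding an open set $A$. One would like to characterize the set of initial states from which trajectories can start and reach the set $R$ at the terminal time $T$ without passing through the set $A$ over the time horizon $[t,T]$. To answer this question one needs to determine whether there exists a choice of $\gamma(\cdot) \in \Gamma_{[t,T]}$ such that for all $v(\cdot) \in \mathcal{V}_{[t,T]}$, the trajectory $\phi(\cdot, t, x, \gamma(\cdot), v(\cdot))$ satisfies $\phi(T, t, x, \gamma(\cdot), v(\cdot)) \in R$ and $\phi(\tau, t, x, \gamma(\cdot), v(\cdot)) \notin A$ for all $\tau \in [t,T]$.
The set of initial conditions that have this property is then
(2) $$RA(t,R,A) = \{x \in \mathbb{R}^n \mid \exists \gamma(\cdot) \in \Gamma_{[t,T]},\ \forall v(\cdot) \in \mathcal{V}_{[t,T]},\ (\phi(T,t,x,\gamma(\cdot),v(\cdot)) \in R) \wedge (\forall \tau \in [t,T],\ \phi(\tau,t,x,\gamma(\cdot),v(\cdot)) \notin A)\}.$$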
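Now introduce the value function $V : \mathbb{R}^n \times [0,T] \to \mathbb{R}$,
(3) $$V(x,t) = \inf_{\gamma(\cdot) \in \Gamma_{[t,T]}} \sup_{v(\cdot) \in \mathcal{V}_{[t,T]}} \max\Big\{ l(\phi(T,t,x,\gamma(\cdot),v(\cdot))),\ \max_{\tau \in [t,T]} h(\phi(\tau,t,x,\gamma(\cdot),v(\cdot))) \Big\}.$$
$V$ can be thought of as the value function of a differential game, where $u$ is trying to minimize, whereas $v$ is trying to maximize the maximum between the value attained by $l$ at the end of the time horizon and the maximum value attained by $h$ along the state trajectory over the horizon $[t,T]$. Based on [14], [15] and [25], we will show that the value function defined by (3) is the unique viscosity solution of the following quasi-variational inequality.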
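(4) $$\max\Big\{ h(x) - V(x,t),\ \frac{\partial V}{\partial t}(x,t) + \sup_{v \in V} \inf_{u \in U} \frac{\partial V}{\partial x}(x,t) f(x,u,v) \Big\} = 0,$$
with terminal condition $V(x,T) = \max\{l(x), h(x)\}$.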
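A rough feel for the structure of this problem can be obtained from a discrete-time dynamic programming recursion on a one-dimensional grid. The sketch below is an illustration under stated assumptions, not the numerical scheme used in the paper or in the level set tools cited above: time is discretized with forward Euler, the input sets are replaced by finite samples, the backup takes the maximum over the disturbance first and the minimum over the control second (an assumed discrete counterpart of the control player using nonanticipative strategies), and the function name, grid sizes, dynamics and the functions `l`, `h` are all hypothetical.

```python
def solve_reach_avoid_1d(l, h, f, U, V_set, x_min, x_max, n, T, steps):
    """Backward recursion V_k(x) = max(h(x), max_v min_u V_{k+1}(x + dt*f(x,u,v)))."""
    dx = (x_max - x_min) / (n - 1)
    dt = T / steps
    xs = [x_min + i * dx for i in range(n)]

    def interp(values, x):
        # Piecewise-linear interpolation, clamped to the grid.
        s = min(max((x - x_min) / dx, 0.0), float(n - 1))
        i = min(int(s), n - 2)
        w = s - i
        return (1.0 - w) * values[i] + w * values[i + 1]

    # Terminal condition: the larger of target cost and constraint cost.
    values = [max(l(x), h(x)) for x in xs]
    for _ in range(steps):
        values = [
            max(h(x),
                max(min(interp(values, x + dt * f(x, u, v)) for u in U)
                    for v in V_set))
            for x in xs
        ]
    return lambda x: interp(values, x)


# Hypothetical example: target [-0.5, 0.5], avoid set (1.2, 1.8), x' = u + v.
l = lambda x: abs(x) - 0.5
h = lambda x: 0.3 - abs(x - 1.5)
f = lambda x, u, v: u + v

V0 = solve_reach_avoid_1d(
    l, h, f,
    U=(-1.0, -0.5, 0.0, 0.5, 1.0), V_set=(-0.5, 0.0, 0.5),
    x_min=-3.0, x_max=3.0, n=121, T=3.0, steps=30,
)
```

In this example the control can always cancel the disturbance, so a state that starts inside the target can stay there and `V0(0.0)` is negative. A state at `x = 2.5` cannot reach the target within the horizon against the worst-case disturbance, so `V0(2.5)` is positive.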
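It is then easy to link the set $RA(t,R,A)$ of (2) to the level set of the value function $V$ defined in (3).
Proposition 1. $RA(t,R,A) = \{x \in \mathbb{R}^n \mid V(x,t) \leq 0\}$.
Proof.
$x \in \{x \in \mathbb{R}^n \mid V(x,t) \leq 0\}$ if and only if $\inf_{\gamma(\cdot) \in \Gamma_{[t,T]}} \sup_{v(\cdot) \in \mathcal{V}_{[t,T]}} \max\{ l(\phi(T,t,x,\gamma(\cdot),v(\cdot))), \max_{\tau \in [t,T]} h(\phi(\tau,t,x,\gamma(\cdot),v(\cdot))) \} \leq 0$. Equivalently, there exists a strategy $\gamma(\cdot) \in \Gamma_{[t,T]}$ such that for all $v(\cdot) \in \mathcal{V}_{[t,T]}$, $\max\{ l(\phi(T,t,x,\gamma(\cdot),v(\cdot))), \max_{\tau \in [t,T]} h(\phi(\tau,t,x,\gamma(\cdot),v(\cdot))) \} \leq 0$. The last statement is equivalent to there exists a $\gamma(\cdot) \in \Gamma_{[t,T]}$ such that for all $v(\cdot) \in \mathcal{V}_{[t,T]}$, $l(\phi(T,t,x,\gamma(\cdot),v(\cdot))) \leq 0$ and $\max_{\tau \in [t,T]} h(\phi(\tau,t,x,\gamma(\cdot),v(\cdot))) \leq 0$. Or in other words, there exists a $\gamma(\cdot) \in \Gamma_{[t,T]}$ such that for all $v(\cdot) \in \mathcal{V}_{[t,T]}$, $\phi(T,t,x,\gamma(\cdot),v(\cdot)) \in R$ and $\phi(\tau,t,x,\gamma(\cdot),v(\cdot)) \notin A$ for all $\tau \in [t,T]$. ∎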
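The sign test used in the proof can be tried on a single sampled trajectory. The sketch below is a minimal illustration under assumptions: the trajectory is a finite list of states rather than a continuous curve, the target is taken as the set where a function `l` is non-positive and the avoid set as the set where a function `h` is positive, and the function name and example data are hypothetical. It evaluates only the inner payoff for one fixed trajectory and does not perform the optimization over strategies and disturbances.

```python
def reach_avoid_payoff(trajectory, l, h):
    """Return max{ l(x_T), max_k h(x_k) } along a sampled trajectory."""
    terminal_cost = l(trajectory[-1])
    worst_constraint = max(h(x) for x in trajectory)
    return max(terminal_cost, worst_constraint)


# Hypothetical 1D data: target {x : |x - 1| <= 0.2}, avoid set {x : x > 1.5}.
l = lambda x: abs(x - 1.0) - 0.2
h = lambda x: x - 1.5

safe_path = [0.0, 0.5, 0.9, 1.0]      # ends in the target, never enters the avoid set
unsafe_path = [0.0, 1.0, 1.6, 1.0]    # ends in the target but passes through the avoid set

safe_value = reach_avoid_payoff(safe_path, l, h)
unsafe_value = reach_avoid_payoff(unsafe_path, l, h)
```

A non-positive payoff means the sampled path ends in the target and stays out of the avoid set at every sample, which is the trajectory-level condition that the proposition lifts to the value function.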
2.3. Reach-Avoid at any time
Another related problem that one might need to characterize is the set of initial states from which trajectories can start, and for any disturbance input can reach the set not at the terminal, but at some time within the time horizon , and without passing through the set until they hit . In other words, we would like to determine the set
(5)  
Based on [30], define the augmented input as and consider the dynamics
(6) 
Let denote the solution of the augmented system, and define , and similarly to the previous case. Following [30] for every the pseudotime variable is given by
(7) 
Consider to be almost an inverse of in the sense that . In [30], was defined as the limit of a convergent sequence of functions, and it was shown that
(8) 
for any . Based on the analysis of [30], equation (8) implies that the trajectory of the augmented system visits only a subset of the states visited by the trajectory of the original system in the time interval .
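The idea that the augmented trajectory visits only states already visited by the original one can be illustrated with a small simulation. The sketch below rests on an assumption that is not confirmed by the text above: the extra input is modeled as a scalar in [0, 1] that multiplies the vector field, so that setting it to zero freezes the state. The function name `simulate`, the forward-Euler discretization and the example data are hypothetical.

```python
def simulate(f, x0, u_seq, v_seq, dt, scale_seq=None):
    """Forward-Euler trajectory of x' = scale * f(x, u, v).

    scale_seq holds the assumed extra input in [0, 1]; None means
    scale = 1 at every step, which recovers the original system.
    """
    if scale_seq is None:
        scale_seq = [1.0] * len(u_seq)
    xs = [x0]
    for u, v, s in zip(u_seq, v_seq, scale_seq):
        xs.append(xs[-1] + dt * s * f(xs[-1], u, v))
    return xs


f = lambda x, u, v: u + v
u_seq = [1.0] * 10
v_seq = [0.0] * 10

original = simulate(f, 0.0, u_seq, v_seq, dt=0.1)
# Run at full speed for five steps, then freeze the state for the rest.
frozen = simulate(f, 0.0, u_seq, v_seq, dt=0.1, scale_seq=[1.0] * 5 + [0.0] * 5)
```

With this choice the frozen trajectory stops at the state the original trajectory reaches after five steps, so reaching the target at the terminal time in the augmented system corresponds to reaching it at some earlier time in the original one.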
Define now the value function
One can then show that is related to the set .
Proposition 2. For , .
The proof of this proposition is given in Appendix A.
3. Characterization of the value function
3.1. Basic properties of V
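We first establish the consequences of the principle of optimality for $V$.
Lemma 1. For all $(x,t) \in \mathbb{R}^n \times [0,T]$ and all $\alpha \in [0, T-t]$:
(9) $$V(x,t) = \inf_{\gamma(\cdot) \in \Gamma_{[t,t+\alpha]}} \sup_{v(\cdot) \in \mathcal{V}_{[t,t+\alpha]}} \max\Big\{ \max_{\tau \in [t,t+\alpha]} h(\phi(\tau,t,x,\gamma(\cdot),v(\cdot))),\ V(\phi(t+\alpha,t,x,\gamma(\cdot),v(\cdot)), t+\alpha) \Big\}.$$
Moreover, for all $(x,t) \in \mathbb{R}^n \times [0,T]$, $V(x,t) \geq h(x)$.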
The proof for the second part is straightforward and follows from the definition of . The proof for the first part is given in Appendix B.
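We now show that $V$ is a bounded, Lipschitz continuous function.
Lemma 2. There exists a constant $C > 0$ such that for all $(x,t), (\hat{x},\hat{t}) \in \mathbb{R}^n \times [0,T]$:
$$|V(x,t)| \leq C \quad \text{and} \quad |V(x,t) - V(\hat{x},\hat{t})| \leq C\,(|x - \hat{x}| + |t - \hat{t}|).$$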
The proof of this Lemma is given in Appendix B.
3.2. Variational inequality for V
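We now introduce the Hamiltonian $H : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$, defined by
$$H(p,x) = \sup_{v \in V} \inf_{u \in U} p^T f(x,u,v).$$
Lemma 3. There exists a constant $C > 0$ such that for all $p, q \in \mathbb{R}^n$, and all $x, \hat{x} \in \mathbb{R}^n$:
$$|H(p,x) - H(q,x)| \leq C\,|p - q| \quad \text{and} \quad |H(p,x) - H(p,\hat{x})| \leq C\,|p|\,|x - \hat{x}|.$$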
The proof of this fact is straightforward (see [14], [15] for details). We are now in a position to state and prove the following Theorem, which is the main result of this section.
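Theorem 1. $V$ is the unique viscosity solution over $(x,t) \in \mathbb{R}^n \times [0,T]$ of the variational inequality
$$\max\Big\{ h(x) - V(x,t),\ \frac{\partial V}{\partial t}(x,t) + \sup_{v \in V} \inf_{u \in U} \frac{\partial V}{\partial x}(x,t) f(x,u,v) \Big\} = 0,$$
with terminal condition $V(x,T) = \max\{l(x), h(x)\}$.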
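The Hamiltonian entering the variational inequality can be evaluated numerically once the input sets are sampled. The sketch below is an illustration under assumptions: the input sets are replaced by finite samples, the Hamiltonian is taken as a supremum over the disturbance of an infimum over the control of the inner product between the costate and the vector field, and the function name and the example dynamics are hypothetical.

```python
def hamiltonian(p, x, f, U, V_set):
    """Approximate sup over v of inf over u of p . f(x, u, v) on finite samples."""
    return max(
        min(sum(pi * fi for pi, fi in zip(p, f(x, u, v))) for u in U)
        for v in V_set
    )


# Hypothetical 1D dynamics x' = u + v, returned as a 1-tuple.
f = lambda x, u, v: (u + v,)
U = (-1.0, 0.0, 1.0)
V_set = (-0.5, 0.0, 0.5)

H_pos = hamiltonian((2.0,), (0.0,), f, U, V_set)
H_neg = hamiltonian((-2.0,), (0.0,), f, U, V_set)
```

For this example the control authority exceeds that of the disturbance, and both evaluations return -1.0, consistent with the closed form -0.5|p| for these sampled sets, which is continuous in p.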
Proof.
Uniqueness follows from Lemma 2, Lemma 3 and [25]. Note also that by definition of the value function we have . Therefore it suffices to show that

For all and for all smooth : , if attains a local maximum at , then

For all and for all smooth : , if attains a local minimum at , then
The case is automatically captured by [31].
Part 1.
Consider an arbitrary and a smooth such that has a local maximum at . Then, there exists such that for all with
We would like to show that
Since by Lemma 1 , either or, . For the former the claim holds, whereas for the latter it suffices to show that there exists such that for all
For the sake of contradiction assume that for all there exists such that for some
Since is smooth and is continuous, then based on [15] we have that
for all and some , where denotes a ball centered at with radius . Because V is compact there exist finitely many distinct points , and such that and for
Define by setting for , if . Then
Since is smooth and is continuous, there exists such that for all with
Finally, define by for all . It is easy to see that is now nonanticipative and hence . So for all and all such that ,
By continuity, there exists such that for all . Therefore, for all
Let be such that
Case 1.1: If , then for we have
(10) 
Then by the dynamic programming argument of Lemma 1 we have:
We can choose such that
and set . Since for all we have that
Hence
Since holds for all , it will also hold for , and hence the last argument establishes a contradiction.
Case 1.2: If then for we have that for all
Since by Lemma 1
then if
we can choose such that
which establishes a contradiction.
If
then we can choose such that
or equivalently , since .
Based on our initial hypothesis that , there exists a such that . If we take we establish a contradiction.
Part 2.
Consider an arbitrary and a smooth such that has a local minimum at . Then, there exists such that for all with
We would like to show that
Since it suffices to show that . This implies that for all there exists a such that
For the sake of contradiction assume that there exists such that for all there exists such that
Since is smooth, there exists such that for all with
Hence, following [15], for and any
By continuity, there exists such that for all . Therefore, for all
But by the dynamic programming argument of Lemma 1 we can choose a such that