Optimal Sampling-Based Motion Planning under Differential Constraints: the Drift Case with Linear Affine Dynamics

Edward Schmerling, Lucas Janson, Marco Pavone

Edward Schmerling is with the Institute for Computational & Mathematical Engineering, Stanford University, Stanford, CA 94305, schmrlng@stanford.edu. Lucas Janson is with the Department of Statistics, Stanford University, Stanford, CA 94305, ljanson@stanford.edu. Marco Pavone is with the Department of Aeronautics and Astronautics, Stanford University, Stanford, CA 94305, pavone@stanford.edu. This work was supported by an Early Career Faculty grant from NASA’s Space Technology Research Grants Program (Grant NNX12AQ43G).
Abstract

In this paper we provide a thorough, rigorous theoretical framework to assess optimality guarantees of sampling-based algorithms for drift control systems: systems that, loosely speaking, cannot stop instantaneously due to momentum. We exploit this framework to design and analyze a sampling-based algorithm (the Differential Fast Marching Tree algorithm) that is asymptotically optimal, that is, it is guaranteed to converge, as the number of samples increases, to an optimal solution. In addition, our approach allows us to provide concrete bounds on the rate of this convergence. The focus of this paper is on mixed time/control energy cost functions and on linear affine dynamical systems, which encompass a range of models of interest to applications (e.g., double-integrators) and represent a necessary step to design, via successive linearization, sampling-based and provably-correct algorithms for non-linear drift control systems. Our analysis relies on an original perturbation analysis for two-point boundary value problems, which could be of independent interest.

I Introduction

A key problem in robotics is how to compute an obstacle-free and dynamically-feasible trajectory that a robot can execute [1]. The problem, in the simplest setting where the robot does not have kinematic/dynamical (in short, differential) constraints on its motion and the problem becomes one of finding an obstacle-free “geometric” path, is reasonably well-understood and sound algorithms exist for most practical scenarios. However, robotic systems do have differential constraints (e.g., momentum), which most often cannot be neglected. Despite the long history of robotic motion planning, the inclusion of differential constraints in the planning process is currently considered an open challenge [2], in particular with respect to guarantees on the quality of the obtained solution and class of dynamical systems that can be addressed. Arguably, the most common approach in this regard is a decoupling approach, whereby the problem is decomposed in steps of computing a collision-free path (neglecting the differential constraints), smoothing the path to satisfy the motion constraints, and finally reparameterizing the trajectory so that the robot can execute it [2]. This approach, while oftentimes fairly computationally efficient, presents a number of disadvantages, including computation of trajectories whose cost (e.g., length or control effort) is far from the theoretical optimum or even failure in finding any solution trajectory due to the decoupling scheme itself [2]. For these reasons, it has been advocated that there is a need for planning algorithms that solve the differentially-constrained motion planning problem (henceforth referred to as the DMP problem) in one shot, i.e., without decoupling.

Broadly speaking, the DMP problem can be divided into two categories: (i) DMP for driftless systems, and (ii) DMP for drift systems. Intuitively, systems with drift constraints are systems where from some states it is impossible to stop instantaneously (this is typically due to momentum). More rigorously, a system $\dot{x} = f(x, u)$ is a drift system if for some state $x$ there does not exist any admissible control $u$ such that $f(x, u) = 0$ [1]. For example, the basic, yet representative, double integrator system (modeling the motion of a point mass under controlled acceleration) is a drift system. From a planning perspective, DMP for drift systems is more challenging than its driftless counterpart, due, for example, to the inherent lack of symmetries in the dynamics and the presence of regions of inevitable collision (that is, sets of states from which obstacle collision will eventually occur, regardless of applied controls) [1].

To date, the state of the art for one-shot solutions to the DMP problem (both for driftless and drift systems) is represented by sampling-based techniques, whereby an explicit construction of the configuration space is avoided and the configuration space is probabilistically “probed” with a sampling scheme [1]. Arguably, the most successful algorithm for DMP to date is the rapidly-exploring random tree algorithm (RRT) [3], which incrementally builds a tree of trajectories by randomly sampling points in the configuration space. However, the RRT algorithm lacks optimality guarantees, in the sense that one can prove that the cost of the solution returned by RRT converges to a suboptimal cost as the number of sampled points goes to infinity, almost surely [4]. An asymptotically-optimal version of RRT for the geometric (i.e., without differential constraints) case has been recently presented in [4]. This version, named RRT*, essentially adds a rewiring stage to the RRT algorithm to counteract its greediness in exploring the configuration space. Prompted by this result, a number of works have proposed extensions of RRT* to the DMP problem [5, 6, 7, 8, 9], with the goal of retaining the asymptotic optimality property of RRT*. Care must be taken in arguing optimality for drift systems in particular, as the control asymmetry requires a consideration of both forward-reachable and backward-reachable trajectory approximations. Even in the driftless case, the matter of assessing optimality is quite subtle, and hinges upon a careful characterization of a system’s locally reachable sets in order to ensure that a planning algorithm examines “enough volume” in its operation, and thus enough sample points, to ensure asymptotic optimality [10]. Another approach to asymptotically optimal DMP planning is given by STABLE SPARSE RRT, which achieves optimality through random control propagation instead of connecting sampled points using a steering subroutine [11].
This paper, like the RRT* variations, is based on a steering function, although it may be considered less general, as it is our view that leveraging as much knowledge as possible of the differential constraints while planning is necessary for the goal of planning in real-time. In our related work [10] we provide a theoretical framework to study optimality guarantees of sampling-based algorithms for the DMP problem by focusing on driftless control-affine dynamical systems of the form $\dot{x}(t) = \sum_{i=1}^{m} g_i(x(t))\, u_i(t)$. While this model is representative for a large class of robotic systems (e.g., mobile robots with wheels that roll without slipping and multi-fingered robotic hands), it is of limited applicability in problems where momentum (i.e., drift) is a key feature of the problem setup (e.g., for a spacecraft or a helicopter).

Statement of Contributions: The objective of this paper is to provide a theoretical framework to study optimality guarantees of sampling-based algorithms for the DMP problem with drift. Specifically, as in [9], we focus on linear affine systems of the form

$$\dot{x}(t) = A x(t) + B u(t) + c, \qquad x(t) \in \mathcal{X}, \; u(t) \in \mathcal{U},$$

where $\mathcal{X}$ and $\mathcal{U}$ are the configuration and control spaces, respectively, and it is of interest to find an obstacle-free trajectory that minimizes the mixed time/energy criterion

$$\int_0^T \left( 1 + u(t)^\top R\, u(t) \right) dt,$$

where $R$ is a positive definite matrix that weights control energy expenditure versus traversal time. Henceforth, we will refer to a DMP problem involving linear affine dynamics and a mixed time/energy cost criterion as Linear Quadratic DMP (LQDMP). The LQDMP problem is relevant to applications for two main reasons: (i) it models the “essential” features of a number of robotic systems (e.g., spacecraft in deep space, helicopters, or even ground vehicles), and (ii) its theoretical study forms the backbone for sampling-based approaches that rely on linearization of more complex underlying dynamics. From a theoretical and algorithmic standpoint, the LQDMP problem presents two challenging features: (i) dynamics are not symmetric [1], which makes forward and backward reachable sets different and requires a more sophisticated analysis of sampling volumes to prove asymptotic optimality, and (ii) not all directions of motion are equivalent, in the sense that some motions incur dramatically higher cost than others due to the algebraic structure of the constraints. Indeed, these are the very same challenges that make the DMP problem with drift difficult in the first place, and they make approximation arguments (e.g., those needed to prove asymptotic optimality) more involved. Fortunately, for LQDMP an explicit characterization for the optimal trajectory connecting two sampled points in the absence of obstacles is available, which provides a foothold to begin the analysis. Specifically, the contribution of this paper is threefold. First, we show that any trajectory in an LQDMP problem may be “traced” arbitrarily well, with high probability, by connecting randomly distributed points from a sufficiently large sample set covering the configuration space. We will refer to this property as probabilistic exhaustivity, as opposed to probabilistic completeness [1], where the requirement is that at least one trajectory is traced with a sufficiently large sample set.
Second, we introduce a sampling-based algorithm for solving the LQDMP problem, namely the Differential Fast Marching Tree algorithm (DFMT), whose design is enabled by our analysis of the notion of probabilistic exhaustivity. In particular, we are able to give a precise characterization of neighborhood radius, an important parameter for many asymptotically optimal motion planners, in contrast with previous work on LQDMP [9]. Third, by leveraging probabilistic exhaustivity, we show that DFMT is asymptotically optimal. This analysis framework builds upon [10], and elements of our approach are inspired by [9]. We note that in [9], the authors present an excellent extension of RRT* that successfully solves the LQDMP problem in simulations, even when extended to linearized systems. The asymptotic optimality claim, however, relies only on a near-neighbor set size argument: we aim to put the analysis of the LQDMP problem on more rigorous theoretical footing.

Organization: This paper is structured as follows. In Section II we formally define the DMP problem we wish to solve. In Section III we review known results about the problem of optimally connecting fixed initial and terminal states under linear affine dynamics with a quadratic cost function. Furthermore, we provide a simple, yet novel (to the best of our knowledge) asymptotic characterization of the spectrum of the weighted controllability Gramian, which is instrumental to our analysis. In Section IV we prove the aforementioned probabilistic exhaustivity property for drift systems with linear affine dynamics. In Section V we present the DFMT algorithm, and in Section VI we discuss its asymptotic optimality (together with a convergence rate characterization). Section VII contains proof-of-concept simulations. Finally, in Section VIII we discuss several features of our analysis, we draw some conclusions, and we discuss directions for future work.

II Problem Formulation

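Let $\mathcal{X} \subset \mathbb{R}^n$ and $\mathcal{U} \subset \mathbb{R}^m$ be the configuration space and control space, respectively, of a robotic system. Within this space let us assume the dynamics of the robot are given by the linear affine system

$$\dot{x}(t) = A x(t) + B u(t) + c, \qquad x(t) \in \mathcal{X}, \; u(t) \in \mathcal{U}, \tag{1}$$

where $A \in \mathbb{R}^{n \times n}$, $B \in \mathbb{R}^{n \times m}$, and $c \in \mathbb{R}^n$ are constants.

A tuple $\pi = (x(\cdot), u(\cdot), T)$ defines a dynamically feasible trajectory, alternatively path, if the state evolution $x : [0, T] \to \mathcal{X}$ and control input $u : [0, T] \to \mathcal{U}$ satisfy equation (1) for all $t \in [0, T]$. We define the cost of a trajectory $\pi$ by the function

$$c[\pi] = \int_0^T \left( 1 + u(t)^\top R\, u(t) \right) dt, \tag{2}$$

where $R \in \mathbb{R}^{m \times m}$ is symmetric positive definite, constant, and given. We may rewrite this cost function as $c[\pi] = T + c_u[\pi]$, where $c_u[\pi] = \int_0^T u(t)^\top R\, u(t)\, dt$, with the interpretation that this cost function penalizes both trajectory duration $T$ and control effort $c_u[\pi]$. The matrix $R$ determines the relative costs of the control inputs, as well as their costs relative to the duration of the trajectory. We denote this linear affine dynamical system with cost by $\Sigma = (A, B, c, R)$.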
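As an illustration of how the dynamics (1) and the cost (2) interact, the following is a minimal sketch (not the authors' implementation) that rolls out a control signal with forward Euler and accumulates the running cost $1 + u^\top R u$. The function name `rollout_cost`, the step count, and the double-integrator numbers are assumptions chosen for the example.

```python
import numpy as np

def rollout_cost(A, B, c, R, x0, u_of_t, T, steps=2000):
    """Forward-Euler rollout of x' = A x + B u + c with running cost 1 + u^T R u."""
    dt = T / steps
    x = np.asarray(x0, dtype=float)
    cost = 0.0
    for k in range(steps):
        u = np.atleast_1d(u_of_t(k * dt))
        cost += dt * (1.0 + u @ R @ u)      # time term plus control-energy term
        x = x + dt * (A @ x + B @ u + c)    # explicit Euler step of the dynamics
    return x, cost

# Hypothetical example: 1-D double integrator, state = (position, velocity),
# control = acceleration, no constant drift term.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
c = np.zeros(2)
R = np.array([[0.5]])

x_T, total = rollout_cost(A, B, c, R, x0=[0.0, 0.0], u_of_t=lambda t: 1.0, T=2.0)
print("final state:", x_T, "cost:", total)
```

With a constant unit acceleration over $T = 2$, the cost splits into a duration term $T = 2$ and a control-effort term $0.5 \cdot 1^2 \cdot 2 = 1$, which is the decomposition $c[\pi] = T + c_u[\pi]$ described above.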

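Let $\mathcal{X}_{obs} \subset \mathcal{X}$ be the obstacle region within the configuration space and consider the closed obstacle-free space $\mathcal{X}_{free} = \mathrm{cl}(\mathcal{X} \setminus \mathcal{X}_{obs})$. The starting configuration $x_{init}$ is an element of $\mathcal{X}_{free}$, and the goal region $\mathcal{X}_{goal}$ is an open subset of $\mathcal{X}_{free}$. The trajectory planning problem is denoted by the tuple $(\Sigma, \mathcal{X}_{free}, x_{init}, \mathcal{X}_{goal})$. A dynamically feasible trajectory $\pi = (x, u, T)$ is collision-free if $x(t) \in \mathcal{X}_{free}$ for all $t \in [0, T]$. A trajectory is said to be feasible for the trajectory planning problem if it is dynamically feasible, collision-free, $x(0) = x_{init}$, and $x(T) \in \mathcal{X}_{goal}$.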

Let $\Pi$ be the set of all feasible paths. The objective is to find the feasible path with minimum associated cost. We define the optimal trajectory planning problem as follows:

LQDMP problem: Given a trajectory planning problem $(\Sigma, \mathcal{X}_{free}, x_{init}, \mathcal{X}_{goal})$ with cost function given by equation (2), find a feasible path $\pi^*$ such that $c[\pi^*] = \min \{ c[\pi] : \pi \text{ is feasible} \}$. If no such path exists, report failure.

Our analysis will rely on two key sets of assumptions, relating, respectively, to the system $\Sigma$ and the problem-specific parameters $\mathcal{X}_{free}$, $x_{init}$, $\mathcal{X}_{goal}$.

Assumptions on system: We assume that the system $\Sigma$ is controllable (i.e., the pair $(A, B)$ is controllable) [12] so that even disregarding obstacles there exist dynamically feasible trajectories between states. (This system controllability assumption is why we do not fold the constant drift term $c$ into the state $x$.) Also, we assume that the control space is unconstrained, i.e., $\mathcal{U} = \mathbb{R}^m$, and that the cost weight matrix $R$ is symmetric positive definite, so that every control direction has positive cost. These assumptions will be collectively referred to as $\mathcal{A}_\Sigma$.

Assumptions on problem parameters: We require that the configuration space $\mathcal{X}$ is a compact subset of $\mathbb{R}^n$ so that we may sample from it. Furthermore, we require that the goal region $\mathcal{X}_{goal}$ has regular boundary, that is there exists $\xi > 0$ such that for almost all $y \in \partial \mathcal{X}_{goal}$, there exists $z \in \mathcal{X}_{goal}$ with $B(z; \xi) \subseteq \mathcal{X}_{goal}$ and $y \in \partial B(z; \xi)$, where $B$ denotes the Euclidean 2-norm ball. This requirement that the boundary of the goal region has bounded curvature almost everywhere ensures that a sampling procedure may expect to select points in the goal region near any point on the region’s boundary. We make requirements on the clearance of the optimal trajectory, i.e., its “distance” from $\mathcal{X}_{obs}$ [10]. For a given $\delta > 0$, the $\delta$-interior of $\mathcal{X}_{free}$ is the set of all states that are at least a Euclidean distance $\delta$ away from any point in $\mathcal{X}_{obs}$. A collision-free path $\pi$ is said to have strong $\delta$-clearance if its state trajectory $x$ lies entirely inside the $\delta$-interior of $\mathcal{X}_{free}$. A collision-free path $\pi$ is said to have weak $\delta$-clearance if there exists a path $\pi'$ that has strong $\delta$-clearance and there exists a homotopy $\psi(\alpha)$, $\alpha \in [0, 1]$, with $\psi(0) = \pi$ and $\psi(1) = \pi'$ that satisfies the following three properties: (a) $\psi(\alpha)$ is a dynamically feasible trajectory for all $\alpha \in (0, 1]$, (b) $\lim_{\alpha \to 0} c[\psi(\alpha)] = c[\pi]$, and (c) for all $\alpha \in (0, 1]$ there exists $\delta_\alpha > 0$ such that $\psi(\alpha)$ has strong $\delta_\alpha$-clearance. Properties (a) and (b) are required since pathological obstacle sets may be constructed that squeeze all optimum-approximating homotopies into undesirable motion. In practice, however, as long as $\mathcal{X}_{free}$ does not contain any passages of infinitesimal width, the fact that $\Sigma$ is controllable will allow every trajectory to be weak $\delta$-clear. We claim that these assumptions about the problem parameters are mild, and can be regarded as “minimum” regularity assumptions.

All trajectories discussed in this paper are dynamically feasible unless otherwise noted. The symbol $\|\cdot\|$ denotes the 2-norm, induced or otherwise. The asymptotic notations $O$, $\Omega$, $\Theta$, $o$ mean bounded above, bounded below, bounded both above and below, and asymptotically dominated, respectively.

III Optimal Control in the Absence of Obstacles

The goal of this section is twofold: to review results about two-point boundary value problems for linear affine systems, and to present a simple, yet novel asymptotic characterization of the spectrum of the controllability Gramian. Both results will be instrumental to our analysis of LQDMP.

III-A Two Point Boundary Value Problem

The material in this section is standard; we provide it to make the paper self-contained. Our presentation follows the treatment in [13, 9]. Specifically, this section is concerned with local steering between states in the absence of environment boundaries and obstacles. Given a start state $x_0 \in \mathcal{X}$ and an end state $x_1 \in \mathcal{X}$, the two point boundary value problem (2BVP) is to find a trajectory $\pi$ between $x_0$ and $x_1$ that satisfies the system $\Sigma$ and minimizes its cost function (2). Denote this trajectory and its cost as $\pi^*[x_0, x_1]$ and $c^*[x_0, x_1]$ respectively:

$$\pi^*[x_0, x_1] = \arg\min_{\pi \,:\, x(0) = x_0,\; x(T) = x_1} c[\pi], \qquad c^*[x_0, x_1] = \min_{\pi \,:\, x(0) = x_0,\; x(T) = x_1} c[\pi].$$

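Let us define the weighted controllability Gramian $G(t)$ as the solution of the Lyapunov equation

$$\dot{G}(t) = A\, G(t) + G(t)\, A^\top + B R^{-1} B^\top, \qquad G(0) = 0,$$

which has the closed form expression

$$G(t) = \int_0^t \exp(A s)\, B R^{-1} B^\top \exp(A^\top s)\, ds. \tag{3}$$

Under the assumptions $\mathcal{A}_\Sigma$ (in particular, system (1) is controllable), we have that $G(t)$ is symmetric positive definite for all $t > 0$. This fact allows us to define the weighted norm $\|\cdot\|_{G(t)^{-1}}$ for $x \in \mathbb{R}^n$:

$$\|x\|_{G(t)^{-1}} = \sqrt{x^\top G(t)^{-1} x}.$$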
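To make the two descriptions of the weighted controllability Gramian concrete, here is a minimal sketch that integrates the Lyapunov equation $\dot{G} = A G + G A^\top + B R^{-1} B^\top$, $G(0) = 0$, with a classical Runge-Kutta scheme and compares the result against the integral form evaluated by hand for the double integrator. The helper name `gramian_lyapunov`, the integrator choice, and the numerical values are assumptions made for illustration only.

```python
import numpy as np

def gramian_lyapunov(A, B, R, t, steps=2000):
    """Integrate G' = A G + G A^T + B R^{-1} B^T, G(0) = 0, with classical RK4."""
    M = B @ np.linalg.solve(R, B.T)          # B R^{-1} B^T

    def f(G):
        return A @ G + G @ A.T + M

    G = np.zeros_like(A, dtype=float)
    h = t / steps
    for _ in range(steps):
        k1 = f(G)
        k2 = f(G + 0.5 * h * k1)
        k3 = f(G + 0.5 * h * k2)
        k4 = f(G + h * k3)
        G = G + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return G

# Hypothetical example: 1-D double integrator with scalar control weight R.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
R = np.array([[2.0]])
t = 1.5

G_num = gramian_lyapunov(A, B, R, t)
# For this A and B, exp(A s) B = (s, 1)^T, so the integral form evaluates to:
G_closed = np.array([[t**3 / 3, t**2 / 2], [t**2 / 2, t]]) / R[0, 0]
print(np.max(np.abs(G_num - G_closed)))
print(np.linalg.eigvalsh(G_num))             # both eigenvalues should be positive
```

The positive eigenvalues reflect the positive definiteness that controllability provides, which is what makes the weighted norm well defined.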

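Let $\bar{x}(t)$ be the zero input response of system (1), that is the solution of the differential equation

$$\dot{\bar{x}}(t) = A \bar{x}(t) + c, \qquad \bar{x}(0) = x_0,$$

which has the closed form expression

$$\bar{x}(t) = \exp(A t)\, x_0 + \int_0^t \exp(A s)\, c\, ds. \tag{4}$$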

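Then for a fixed arrival time $\tau$ the optimal control policy for the fixed-time 2BVP is given by [13]:

$$u(t) = R^{-1} B^\top \exp\!\left(A^\top (\tau - t)\right) G(\tau)^{-1} \left( x_1 - \bar{x}(\tau) \right), \tag{5}$$

which corresponds to the minimal cost (as a function of travel time $\tau$)

$$c(\tau) = \tau + \left\| x_1 - \bar{x}(\tau) \right\|^2_{G(\tau)^{-1}}. \tag{6}$$

The optimal connection time $\tau^*$ may be computed by minimizing (6) over $\tau$. The state trajectory $x(t)$ that evolves from this control policy may be computed explicitly as:

$$x(t) = \bar{x}(t) + G(t) \exp\!\left(A^\top (\tau - t)\right) G(\tau)^{-1} \left( x_1 - \bar{x}(\tau) \right), \qquad t \in [0, \tau]. \tag{7}$$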

Let $\pi^*[x_0, \dots, x_K]$ denote the concatenation of the trajectories $\pi^*[x_k, x_{k+1}]$ between successive states $x_0, \dots, x_K$.
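The fixed-time steering construction of this subsection can be sketched numerically as follows: compute the Gramian and the zero input response by quadrature, evaluate the cost as a function of arrival time $\tau$, and pick the best $\tau$ on a grid. This is a hedged illustration rather than the authors' code; the helper names (`expm`, `quad_nodes`, `gramian`, `xbar`, `cost`, `optimal_state`), the truncated Taylor exponential, the grid search, and the rest-to-rest double-integrator example are all assumptions.

```python
import numpy as np

def expm(M, terms=12):
    """Truncated Taylor series for exp(M). Exact for the nilpotent A used below;
    only a rough approximation for general matrices."""
    out = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ M / k
        out = out + term
    return out

def quad_nodes(t, order=8):
    """Gauss-Legendre nodes and weights mapped from [-1, 1] to [0, t]."""
    z, w = np.polynomial.legendre.leggauss(order)
    return 0.5 * t * (z + 1.0), 0.5 * t * w

def gramian(A, B, R, t):
    """Weighted Gramian: integral of exp(As) B R^{-1} B^T exp(A^T s) over [0, t]."""
    M = B @ np.linalg.solve(R, B.T)
    s, w = quad_nodes(t)
    return sum(wi * expm(A * si) @ M @ expm(A.T * si) for si, wi in zip(s, w))

def xbar(A, c, x0, t):
    """Zero input response: exp(At) x0 + integral of exp(As) c over [0, t]."""
    s, w = quad_nodes(t)
    return expm(A * t) @ x0 + sum(wi * expm(A * si) @ c for si, wi in zip(s, w))

def cost(A, B, c, R, x0, x1, tau):
    """Fixed-time optimal cost: tau + ||x1 - xbar(tau)||^2 in the G(tau)^{-1} norm."""
    d = x1 - xbar(A, c, x0, tau)
    return tau + d @ np.linalg.solve(gramian(A, B, R, tau), d)

def optimal_state(A, B, c, R, x0, x1, tau, t):
    """State along the fixed-time optimal trajectory at time t in [0, tau]."""
    lam = np.linalg.solve(gramian(A, B, R, tau), x1 - xbar(A, c, x0, tau))
    return xbar(A, c, x0, t) + gramian(A, B, R, t) @ expm(A.T * (tau - t)) @ lam

# Hypothetical example: move a double integrator one unit, starting and ending at rest.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
c = np.zeros(2)
R = np.array([[1.0]])
x0 = np.array([0.0, 0.0])
x1 = np.array([1.0, 0.0])

taus = np.linspace(0.05, 10.0, 400)          # coarse grid over arrival times
costs = [cost(A, B, c, R, x0, x1, tau) for tau in taus]
tau_star = taus[int(np.argmin(costs))]
c_star = min(costs)
x_end = optimal_state(A, B, c, R, x0, x1, tau_star, tau_star)
print(tau_star, c_star, x_end)
```

For this particular example the cost reduces by hand to $\tau + 12/\tau^3$, which is minimized at $\tau = \sqrt{6}$, so the grid search should land close to that value. A grid search is used only for simplicity; the cost as a function of $\tau$ need not be unimodal in general, so a local optimizer would need care.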

III-B Small-Time Characterization of the Spectrum of the Controllability Gramian

We begin by briefly reviewing the concept of controllability indices (see [14, p. 431] or [12, p. 150] for a more detailed treatment) for a controllable system . Let denote the th column of . Consider searching the columns of the controllability matrix from left to right for a set of linearly independent vectors. This process is well-defined for a controllable pair since . The resulting set defines the controllability indices where and is called the controllability index of . The give a fundamental notion of how difficult a system is to control in various directions; indeed these indices are a property of the system invariant with respect to similarity transformation, e.g. permuting the columns of . We may also label the vectors of as in the order that they come up in . That is, and iff (note: ). Let be an orthogonalization of the ’s so that where and have the ’s and ’s as columns respectively, and is upper triangular.
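The left-to-right column search described above is the standard textbook procedure and can be sketched in a few lines. The version below scans the blocks $B, AB, A^2B, \dots$ of the controllability matrix, keeps each column that is linearly independent of those already kept, and counts how many kept columns originate from each input. The function name `controllability_indices`, the rank tolerance, and the two example systems are assumptions for illustration.

```python
import numpy as np

def controllability_indices(A, B, tol=1e-9):
    """Scan [B, AB, A^2 B, ...] left to right, keeping linearly independent columns.
    Returns one count per input column of B; their maximum is the controllability index."""
    n, m = B.shape
    kept = np.zeros((n, 0))
    counts = [0] * m
    block = B.astype(float)                  # current block A^k B
    for _ in range(n):
        for i in range(m):
            candidate = np.hstack([kept, block[:, [i]]])
            if np.linalg.matrix_rank(candidate, tol=tol) > kept.shape[1]:
                kept = candidate             # column A^k b_i is new: keep it
                counts[i] += 1
        block = A @ block
    if kept.shape[1] < n:
        raise ValueError("the pair (A, B) is not controllable")
    return counts

# Hypothetical example 1: double integrator with a single input.
A1 = np.array([[0.0, 1.0], [0.0, 0.0]])
B1 = np.array([[0.0], [1.0]])
idx1 = controllability_indices(A1, B1)

# Hypothetical example 2: three states, two inputs; the first input drives a chain
# of two integrators, the second input drives the remaining state directly.
A2 = np.array([[0.0, 1.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]])
B2 = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
idx2 = controllability_indices(A2, B2)
print(idx1, idx2)
```

In both examples the per-input counts sum to the state dimension, as they must for a controllable pair, and the largest count gives the controllability index of the pair.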

Lemma III.1 (Small-Time Gramian Asymptotics).

Let the eigenvalues of be . Then as for .

Proof.

We apply the Courant-Fischer Theorem:

(8)

where , denotes a linear subspace of , and . Note that

because for all , by construction. Then

Making the identification in Equation (8) implies that ; to see that we note that any subspace of dimension cannot satisfy , as . ∎

Lemma III.1 has three immediate corollaries. The first upper bounds which bounds the local cost of motion in any direction. The second relates to the Euclidean norm through a norm-equivalence inequality. The third is a lower bound for the determinant of , a result that will prove useful for estimating the volumes of reachable sets.

Lemma III.2 (Small-time minimum eigenvalue of controllability Gramian).

Suppose that the pair has controllability index , then as , or, equivalently, .

Lemma III.3 (Norm Equivalence).

Suppose that the pair has controllability index , and consider the Cholesky factorization . Then for , and

where and .

Lemma III.4 (Small-time determinant of controllability Gramian).

Suppose that the pair has controllability indices , then as where .
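As a numerical sanity check of the kind of small-time scaling these lemmas describe, one can look at the double integrator, whose weighted Gramian (with unit control weight) is available in closed form: $G(t) = \begin{pmatrix} t^3/3 & t^2/2 \\ t^2/2 & t \end{pmatrix}$, with determinant $t^4/12$. The sketch below estimates log-log slopes of its eigenvalues and determinant as $t \to 0$. The helper names and the two sample times are assumptions, and the example is specific to this one system rather than a restatement of the general results.

```python
import numpy as np

def di_gramian(t, r=1.0):
    """Closed-form weighted Gramian of the 1-D double integrator with R = r."""
    return np.array([[t**3 / 3, t**2 / 2], [t**2 / 2, t]]) / r

def loglog_slope(f, t_hi=1e-2, t_lo=1e-3):
    """Estimate the exponent p in f(t) ~ t^p from two small values of t."""
    return np.log(f(t_hi) / f(t_lo)) / np.log(t_hi / t_lo)

slope_min = loglog_slope(lambda t: np.linalg.eigvalsh(di_gramian(t))[0])
slope_max = loglog_slope(lambda t: np.linalg.eigvalsh(di_gramian(t))[-1])
slope_det = loglog_slope(lambda t: np.linalg.det(di_gramian(t)))
print(slope_min, slope_max, slope_det)
```

The smallest eigenvalue decays like $t^3$ and the largest like $t$, so directions that require two integrations to reach (position, in this example) are far more expensive to perturb over short horizons than directions reached directly by the input (velocity).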

IV Probabilistic Exhaustivity

In this section we prove a key result characterizing random sampling schemes for the LQDMP problem: any feasible trajectory through the configuration space is “traced” arbitrarily well by connecting randomly distributed points from a sufficiently large sample set covering the configuration space. We will refer to this property as probabilistic exhaustivity. The same notion of probabilistic exhaustivity (clearly much stronger than the usual notion of probabilistic completeness) was introduced in the related paper [10] in the context of DMP for driftless systems. The result proven in that work does not carry over to the drift case as it relies on the metric inequality to bound the cost of approximate paths; the drift case lacks the control symmetry to make such estimates. Thus in order to prove probabilistic exhaustivity in the case of linear affine systems, we first provide a result analogous to the metric inequality characterizing the effect that perturbations of the endpoints of a path have on its cost and state trajectory. The idea, then, is that tracing waypoints may be selected as small perturbations of points along the trajectory to be approximated, provided the sample density is high enough.

Lemma IV.1 (Fixed-Time Local Trajectory Approximation).

Let , , , and denote . Consider bounded start and end state perturbations such that . Let be the optimal trajectory between the perturbed endpoints. Then for such that is sufficiently small, we have the cost bound

Additionally we may bound the geometric extent of :

for .

Proof.

Since is the optimum, regardless of the value of we have the upper bound

which expands as

where in the second line we have applied the (weighted) Cauchy-Schwarz inequality, and, in the last line, the fact that .

To bound , we first apply Cauchy-Schwarz:

Then integrating the system dynamics (1) yields

making use of a norm equivalence bound (Lemma III.3) in line two. We note that the asymptotic constants depend only on the fixed system dynamics . ∎

Motivated by Lemma IV.1, we define the perturbation ball

This set represents perturbations of with limited effects on both incoming and outgoing trajectories (depending on whether a point is viewed as an end state or start state perturbation respectively). We note that since is decreasing as increases, we have

(9)

To understand how often sample points of a planning algorithm will lie within , we lower bound its volume.

Remark IV.2 (Bounding Perturbation Ball Volume).

The inequality defines an ellipse with volume

where denotes the volume of the unit ball in . Given our asymptotic characterization of in Lemma III.4, there is a threshold and constant such that

for all .
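The volume computation in this remark rests on a general fact: for a symmetric positive definite matrix $G$, the ellipsoid $\{x : x^\top G^{-1} x \le r^2\}$ in $\mathbb{R}^n$ has volume $\zeta_n\, r^n \sqrt{\det G}$, where $\zeta_n$ is the volume of the unit ball. The sketch below evaluates this formula and cross-checks it by Monte Carlo sampling. The function names, the radius, the sample count, and the choice of the double-integrator Gramian at $t = 1$ are assumptions for illustration; the radius expression actually used in the remark is not reproduced here.

```python
import math
import numpy as np

def unit_ball_volume(n):
    """Volume of the Euclidean unit ball in R^n."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)

def ellipsoid_volume(G, radius):
    """Volume of {x : x^T G^{-1} x <= radius^2} for symmetric positive definite G."""
    n = G.shape[0]
    return unit_ball_volume(n) * radius ** n * math.sqrt(np.linalg.det(G))

def monte_carlo_volume(G, radius, samples=200_000, seed=0):
    """Estimate the same volume by uniform sampling in a tight bounding box."""
    rng = np.random.default_rng(seed)
    half = radius * np.sqrt(np.diag(G))      # half-widths of the bounding box
    pts = rng.uniform(-half, half, size=(samples, G.shape[0]))
    quad = np.einsum("ij,ij->i", pts, np.linalg.solve(G, pts.T).T)
    return (quad <= radius ** 2).mean() * np.prod(2.0 * half)

# Hypothetical example: double-integrator Gramian at t = 1 with unit control weight.
G = np.array([[1.0 / 3.0, 0.5], [0.5, 1.0]])
v_exact = ellipsoid_volume(G, radius=0.5)
v_mc = monte_carlo_volume(G, radius=0.5)
print(v_exact, v_mc)
```

Because the volume scales with $\sqrt{\det G}$, a small-time lower bound on the determinant of the Gramian translates directly into a lower bound on how much sampling volume such a set occupies.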

To ensure that the term above does not become vanishingly small in application with Lemma IV.1, we also lower bound connection time in terms of connection cost.

Lemma IV.3 (Optimal Cost/Time Breakdown).

Let , , , and denote . For such that is sufficiently small,

Proof.

For fixed , consider the series expansion of the control effort term about . We claim that for some . The fact that follows from the fact that has a nonzero zeroth order term and as a consequence of Lemma III.1. We note that for general , however, it is almost certain that or one of the low-order (in ) terms of will have a component along the maximal eigenvector of , which may result in a series expansion term with up to . Then

is maximized at and we compute the ratio

as . The dominant term in the asymptotics is smallest when (corresponding to ); in particular we have for with sufficiently small. (We note that this bound does not depend on the actual values of and (in particular the constant ), but only their optimal connection cost.)

Lemma IV.1 is a statement about local trajectory approximation. We now define what it means for a series of states to closely approximate a given global trajectory. Let be a dynamically feasible trajectory. Given a set of waypoints , we associate the trajectory . We consider the to -trace the trajectory if: (a) the cost of is bounded as , (b) for all , and (c) the maximum distance from any point of to is no more than , i.e. The combination of these three properties is what makes , if approximating a near-globally-optimal trajectory , amenable to recovery by the path planning algorithms we propose in the next section. In particular, (b) ensures that is the concatenation of uniformly local connections. In Theorem IV.6 we show that suitable waypoints may be found with high probability as a subset of a set of randomly sampled nodes, the proof of which requires the following two technical lemmas lower bounding the probability that a sample set will provide adequate coverage around a trajectory of interest. Let denote a set of points sampled independently and identically from the uniform distribution on .

Lemma IV.4 (Lemma IV.3, [10]).

Fix , , and let be disjoint subsets of with

for each . Let ; then the probability that more than an fraction of the sets contain no point of is bounded as:

Lemma IV.5 (Lemma IV.4, [10]).

Fix and let be subsets of , possibly overlapping, with

for each and some constant . Let ; then the probability that there exists a that does not contain a point of is bounded as:

The proofs of these two lemmas may be found in our related work [10]. As in that work and [15], our approach here for proving probabilistic exhaustivity proceeds by tiling the span of a path to be traced with two sequences of concentric perturbation balls – a sequence of “small” balls and a sequence of “large” balls. With high probability, all but a tiny fraction of the small balls will contain a point from the sample set (Lemma IV.4), and for any small balls that do not contain such a point we ensure that the corresponding large ball does (Lemma IV.5). We take these points as a sequence of waypoints which tightly follows the reference path with few exceptions, and never has a gap over any section of the reference path when it does deviate further.
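The coverage arguments above build on an elementary fact about uniform sampling: if a set occupies a fraction $\mu$ of the sampled region, then $n$ independent uniform samples all miss it with probability $(1 - \mu)^n \le e^{-n\mu}$. The following sketch checks this numerically. It is not a reproduction of the bounds in Lemmas IV.4 and IV.5; the function names and the parameter values are assumptions.

```python
import math
import numpy as np

def miss_probability(volume_fraction, n_samples):
    """Probability that n i.i.d. uniform samples all miss a set of given volume fraction."""
    return (1.0 - volume_fraction) ** n_samples

def simulated_miss_rate(volume_fraction, n_samples, trials=20_000, seed=0):
    """Empirical miss rate. Only the volume fraction matters, so the sampled region
    is modeled as [0, 1] and the target set as [0, volume_fraction)."""
    rng = np.random.default_rng(seed)
    hits = rng.random((trials, n_samples)) < volume_fraction
    return 1.0 - hits.any(axis=1).mean()

mu, n = 0.01, 200
p_exact = miss_probability(mu, n)
p_bound = math.exp(-n * mu)
p_sim = simulated_miss_rate(mu, n)
print(p_exact, p_bound, p_sim)
```

The exponential decay in $n\mu$ is why lower bounds on the volume of each perturbation ball, as a function of the number of samples, control the probability that a ball is left empty.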

Theorem IV.6 (Probabilistic exhaustivity).

Let be a system satisfying the assumptions and suppose is a dynamically feasible trajectory with strong -clearance, . Let , , and consider a set of sample nodes . Define , and consider the event that there exist waypoints which -trace , where

for a free parameter , and for some constant . Then, as , the probability that no such waypoint set exists is asymptotically bounded as

Proof.

Note that in the case we may pick to be the only waypoint and the result is trivial. Therefore assume . Make the identification , and fix sufficiently large so that and also:

(10)

Take to be points spaced along at cost intervals ; more precisely let , and for consider

Let be the first for which the set is empty; take . Note that by construction, we have .

We consider the sets and . In particular the time here is chosen so that, by Lemma IV.3, the optimal connection times between the satisfy . Applying the ball containment property (9) this means that for any such , for or . From Remark IV.2 and our choice of we have the volume bound

(11)

and similarly

(12)

for each . Combining equation (11) and Lemma IV.5, we have that the probability that there exists a that does not contain a sample point (i.e. ) is bounded as:

We note that the are disjoint (as long as ) since implies . Then we may combine equations (10) and (12), which together imply that the satisfy the condition of Lemma IV.4, to see that the probability that more than an fraction of the do not contain a sample point is bounded as:

Now, as long as neither of these possibilities holds (i.e. if every and at least a fraction of the contains a point of ), we will show that the existence of suitable waypoints is guaranteed. In that case then we may union bound the probability of failure:

as (the first term dominates asymptotically), where we have used the fact that

Suppose that every and at least a fraction of the contains a point of . Choose points accordingly: within if possible, and within otherwise. We may apply Lemma IV.1 to verify that these points -trace . For we have:

Since all but a fraction of successive points must both be in sets and obey the above cost bound, and the remaining pairs satisfy the analogous bound for sets (with instead of ), the total cost of is bounded above by . We also have for all . The maximum Euclidean distance from any point of (say, on the segment ) to is bounded above by its distance to , which by Lemma IV.1 is as since achieves some fixed maximum over . ∎

V DFMT Algorithm

The algorithm presented here is based on FMT*, from the recent work of [15], which can be thought of as an accelerated version of PRM* [4]. Briefly, PRM* first samples all the vertices, then constructs a fully locally connected graph, and then performs shortest path search (e.g., Dijkstra’s algorithm) on the graph to obtain a solution. FMT* also samples all vertices first, but instead of a graph, lazily builds a tree via dynamic programming that very closely approximates the shortest-path tree for PRM*, but saves a multiplicative factor of collision-checks by not constructing the full graph. The algorithm given by Algorithm 1, DFMT, is not fundamentally different from the original FMT* algorithm, but mainly changes what “local” means under differential constraints (similar to [10], but now with drift). One more difference of DFMT presented here, even from the algorithm in [10], is that the edges are now directed, reflecting the fundamental asymmetry of differential constraints with drift.
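The need for directed edges can be seen in a small numerical experiment. The sketch below, written for the 1-D double integrator with unit control weight and no constant drift term, evaluates the optimal connection cost in both directions between two states that share the same positive velocity. The closed-form Gramian, the grid over arrival times, and the function names `di_cost` and `di_optimal_cost` are assumptions made for illustration.

```python
import numpy as np

def di_cost(x0, x1, tau, r=1.0):
    """Fixed-time optimal cost tau + ||x1 - xbar(tau)||^2 in the G(tau)^{-1} norm
    for the 1-D double integrator (state = position, velocity)."""
    p0, v0 = x0
    xbar = np.array([p0 + v0 * tau, v0])     # coasting with zero control
    G = np.array([[tau**3 / 3, tau**2 / 2], [tau**2 / 2, tau]]) / r
    d = np.asarray(x1, dtype=float) - xbar
    return tau + d @ np.linalg.solve(G, d)

def di_optimal_cost(x0, x1):
    """Approximate optimal connection cost by a grid search over arrival time."""
    taus = np.linspace(0.02, 20.0, 4000)
    return min(di_cost(x0, x1, tau) for tau in taus)

a = [0.0, 1.0]   # at position 0, moving right at unit speed
b = [1.0, 1.0]   # at position 1, moving right at unit speed

cost_ab = di_optimal_cost(a, b)   # momentum helps: nearly a pure coast
cost_ba = di_optimal_cost(b, a)   # momentum hurts: must reverse, travel back, re-accelerate
print(cost_ab, cost_ba)
```

Going from `a` to `b` costs about one unit (coasting for one time unit is already feasible at zero control effort), while the reverse connection is several times more expensive. A planner that treated `b` as a near neighbor of `a` in both directions would therefore misjudge which connections are local, which is why forward-reachable and backward-reachable neighbor sets must be kept distinct.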

Specifically, define the fixed-time forward-reachable and backwards-reachable sets respectively:

Membership in either reachable set may be checked by minimizing the explicit cost function (6) over travel time. The set of samples to check for membership may be pruned by considering the form of , as suggested in [9]. Let denote the boolean function which returns true if and only if lies within . Given a set of vertices , a state , and a cost threshold , let . Let denote the directed edge corresponding to with edge weight