Optimal Sampling-Based Motion Planning underDifferential Constraints: the Driftless Case

Optimal Sampling-Based Motion Planning under
Differential Constraints: the Driftless Case

Edward Schmerling, Lucas Janson, Marco Pavone Edward Schmerling is with the Institute for Computational & Mathematical  Engineering, Stanford University, Stanford, CA 94305, schmrlng@stanford.edu.Lucas Janson is with the Department of Statistics, Stanford University, Stanford, CA 94305, ljanson@stanford.edu.Marco Pavone is with the Department of Aeronautics and Astronautics, Stanford University, Stanford, CA 94305, pavone@stanford.edu.This work was supported by an Early Career Faculty grant from NASA’s Space Technology Research Grants Program (Grant NNX12AQ43G).

Motion planning under differential constraints is a classic problem in robotics. To date, the state of the art is represented by sampling-based techniques, with the Rapidly-exploring Random Tree algorithm as a leading example. Yet, the problem is still open in many aspects, including guarantees on the quality of the obtained solution. In this paper we provide a thorough theoretical framework to assess optimality guarantees of sampling-based algorithms for planning under differential constraints. We exploit this framework to design and analyze two novel sampling-based algorithms that are guaranteed to converge, as the number of samples increases, to an optimal solution (namely, the Differential Probabilistic RoadMap algorithm and the Differential Fast Marching Tree algorithm). Our focus is on driftless control-affine dynamical models, which accurately model a large class of robotic systems. In this paper we use the notion of convergence in probability (as opposed to convergence almost surely): the extra mathematical flexibility of this approach yields convergence rate bounds — a first in the field of optimal sampling-based motion planning under differential constraints. Numerical experiments corroborating our theoretical results are presented and discussed.

I Introduction

Motion planning is a fundamental problem in robotics. It involves the computation of a sequence of actions that drives a robot from an initial condition to a terminal condition while avoiding obstacles, respecting kinematic/dynamical constraints, and possibly optimizing an objective function [1]. The basic problem, where a robot does not have any constraints on its motion and only an obstacle-free solution is required, is well-understood and solved for a large number of practical scenarios [2]. On the other hand, robots do usually have stringent kinematic/dynamical (in short, differential) constraints on their motion, which in most settings need to be properly taken into account. There are two main approaches [2]: (i) a decoupling approach, in which the problem is decomposed in steps of computing a geometric collision-free path (neglecting the differential constraints), smoothing the path to satisfy the motion constraints, and finally reparameterizing the trajectory so that the robot can execute it, or (ii) a direct approach, in which the differentially-constrained motion planning problem (henceforth referred to as DMP problem) is solved in one shot. The first approach, while fairly common in practice, has several drawbacks, including the computation of very inefficient trajectories, failure in finding a trajectory due to the decoupling scheme itself, and inflated information requirements [2]. This motivates a quest for efficient algorithms that directly solve the DMP problem.

However, directly finding a feasible, let alone optimal, solution to the DMP problem is difficult (note that the basic version without differential constraints is already PSPACE-hard [3, 1], which implies NP-hard). Early work on this topic dates back to more than two decades ago [4], but the problem, especially when optimality is taken into account, is still open in many aspects [5, 2], including algorithms with practical convergence rates, guarantees on the quality of the obtained solution, and class of dynamical systems that can be addressed. To date, the state of the art is represented by sampling-based techniques, where an explicit construction of the configuration space is avoided and the configuration space is probabilistically “probed” with a sampling scheme.

Arguably, the most successful algorithm for DMP is the sampling-based rapidly-exploring tree algorithm (RRT) [6], which incrementally builds a tree of trajectories by randomly sampling points in the configuration space. Lately, several variations of the RRT algorithm, referred to as RRT, stemming from [7] and its kinodynamic extension [8], have been considered to ensure that the cost of the computed trajectory converges to the optimal cost as the number of sampled points goes to infinity [8, 9, 10, 11, 12]. These works, while providing strong experimental validation, only provide proof sketches that do not fully address many of the complications that arise in extending asymptotic optimality arguments from the geometric case to differentially constrained paths. For example, rewiring the RRT tree within a local volume containing (in expectation) a log fraction of previous samples is not sufficient in itself to claim optimality, as in [10, 11]. Additional assumptions on trajectory approximability must be stated and verified for the differential constraints in question. Such requirements are discussed in [8], but it is not clear how assuming the existence of forward-reachable trajectory approximations is sufficient for a “ball-to-ball” proof technique that requires backward approximations as well. A different approach to asymptotically optimal planning has recently been proposed by STABLE SPARSE RRT which achieves optimality through random control propagation instead of connecting existing samples using a steering subroutine [13]. This paper, like the RRT variations, follows a steering approach, although it may be considered less general, as it is our view that leveraging as much knowledge as possible of the differential constraints while planning is necessary in order to have a hope of planning in real-time.

Statement of Contributions: The objective of this paper is to provide a theoretical framework to study optimality guarantees of sampling-based algorithms for the DMP problem, and to exploit this framework to design efficient sampling-based algorithms that are guaranteed to asymptotically converge to an optimal solution. The focus of this paper is on driftless control-affine (DCA) dynamical systems of the form

where the available motions of trajectories are linear combinations given by input control functions and their corresponding actions at each point in space . Our work is written from the perspective of optimizing trajectory arc length, but applies also to cost metrics satisfying similar properties, which we discuss. This model is often the result of nonholonomic constraints that the kinematic variables of the system must satisfy [14]. A large class of robotic systems can be modeled as DCA, including mobile robots with wheels that roll without slipping [1], multi-fingered robotic hands [15], rigid bodies with zero angular momentum undergoing re-orientation [16], and systems with nonholonomic actuators [14]. The DCA model, however, rules out the presence of dynamics with drift, which in many important cases (e.g., spacecraft control) can not be neglected. Specifically, the contribution of this paper is threefold. First, we show that any trajectory of a DCA system may be “traced” arbitrarily well by connecting randomly distributed points from a sufficiently large sample set covering the configuration space. We will refer to this property as probabilistic exhaustivity, as opposed to probabilistic completeness [1], where the requirement is that at least one trajectory is traced with a sufficiently large sample set. Probabilistic exhaustivity is a key tool in proving asymptotic optimality, and is of independent interest. Second, we introduce two novel sampling-based algorithms for the solution to the DMP with DCA systems, namely the Differential Probabilistic RoadMap algorithm (DPRM) and the Differential Fast Marching Trees algorithm (DFMT). Third, by leveraging the property of probabilistic exhaustivity for DCA systems, we rigorously show that both DPRM and DFMT are asymptotically optimal. We note that in this paper we use the notion of convergence in probability (as opposed to convergence almost surely, as in, e.g., [11]): the extra mathematical flexibility of this approach yields convergence rate bounds — a first in the field of optimal sampling-based motion planning under differential constraints. Our approach is inspired by [11] and [17].

Organization: This paper is structured as follows. In Section II we provide a review of some key results in differential geometry that will be extensively used in the paper. In Section III we formally define the problem we wish to solve, while in Section IV we prove the aforementioned probabilistic exhaustivity property for DCA systems. In Section V we present the DPRM and DFMT algorithms, and in Section VI we prove their asymptotic optimality (together with a convergence rate characterization). In Section VII we provide implementation details for the proposed algorithms, and we study them via numerical experiments. Finally, in Section VIII, we draw some conclusions and we discuss directions for future work.

Ii Background Material

In this section we provide some definitions and a brief review of key results in differential geometry, on which we will rely extensively later in the paper. Let be the manifold defining a configuration space. Within this space let us consider driftless control-affine (DCA) dynamical systems of the form


where the available motions of trajectories are linear combinations given by input control functions and their corresponding actions at each point in space . We shall assume in this paper that are smooth vector fields on , and that the control set is closed and bounded. We also assume is symmetric about the origin so that the system is time-reversible and is in the interior of the convex hull of . This last condition ensures that the local possibilities for motion at each point appear as a linear space spanned by the , a fact essential to the forthcoming controllability discussion [18]. We denote the driftless control-affine system of equation (1) as . A function is called a dynamically feasible trajectory, alternatively path, if there exists a corresponding control function with which it satisfies . All trajectories discussed in this paper are dynamically feasible unless otherwise noted.

Ii-a Arc Length and Sub-Riemannian Distance

The arc length of a path is defined as

where is computed using the standard Euclidean inner product on the tangent spaces of . The arc length function induces a sub-Riemannian distance on , defined for as , where the infimum is taken over dynamically feasible trajectories connecting and . Note that for driftless control-affine systems, time-reversibility implies that is symmetric and indeed a metric. The sub-Riemannian ball may be defined in analogy to the standard Euclidean ball (i.e., ) according to . Note that by definition .

Ii-B Controllability and Reachable Sets

We now present a series of results regarding system controllability following the discussion in [19] and [20]. As noted above, the vector fields characterizing the system represent a set of possible motions for a trajectory within . More precisely at each point the vectors span a linear subspace of local directions within the tangent space . For a vector field on let denote its local flow, the function that maps an initial state to the state obtained by following the vector field for time . That is, where is a solution to the initial value problem , . Commutators of flows, akin to control switching in (1), allow for local motions transverse to the to be achieved while satisfying the differential constraints. Given two vector fields , on and a starting point , we have the approximation for small :

where the Lie bracket is a third vector field which may be computed with respect to a coordinate system as , where and are the Jacobian matrices of and respectively. Computing Lie brackets allows us to characterize all directions that are possible from each , in addition to those given by .

Let be the distribution, equivalently the set of local vector subspaces, generated by the vector fields . We define recursively ,

where . Then is the distribution generated by the iterated Lie brackets of the with terms or fewer. The Lie hull of is . Let denote the vector space corresponding to in .

The vector fields are said to be bracket generating if for all . This requirement is also referred to as Chow’s condition, or the linear algebra rank condition, and means that arbitrary local motion may be achieved by composing motions along the control directions . In fact, provided that the are bracket generating, any two points may be connected by a trajectory satisfying , that is . The remainder of this section develops tighter bounds on that will be used in our asymptotic optimality proofs of planning algorithms.

Chow’s condition implies that for all , there exists a smallest integer such that . Indeed, we have Set . The integer list is called the growth vector of at . A point is called a regular point if there exists an open neighborhood of around such that the growth vector is constant; otherwise is said to be a singular point.

We now further assume that every is a regular point, so that the growth vector is constant over the whole configuration manifold. Fix a base point . Using the bracket-generating assumption we select a local orthonormal frame for of vector fields as follows: the set spans near ; spans near ; spans near ; and so on. Define the weights if and . Applying a procedure developed in [19], the coordinate system corresponding to this local frame may be transformed into a privileged coordinate system by a polynomial change of coordinates of the form


where , , denotes a polynomial function that includes only terms of degree and . From the triangular structure of (2), it is clear that the inverse transformation is of the same form. Given privileged coordinates , define the pseudonorm at as

Using this pseudonorm we define the w-weighted box of size at as the point set We use the notation for the corresponding locus of points in given by the coordinates .

Theorem II.1 (Ball-box theorem [19]).

Fix a point and a system of privileged coordinates at . Then there exist positive constants and such that for all with ,


Constructing the coordinate system thus gives structure to how sub-Riemannian distance behaves locally; this structure may be used for steering, e.g. [21]. It can be shown that there exists a continuously varying system of privileged coordinates on so that the inequality (3) holds at all for continuous positive functions and on . Let us assume that the system is sufficiently regular such that there exist bounds and for all . We state a pair of lemmas (whose proofs are provided in the Appendix) concerning how privileged coordinates relate to the Euclidean notions of volume and distance.

Lemma II.2 (Box volume).

Fix . The volume of is given by where .

Lemma II.3 (Distance comparison).

Fix a point and a system of privileged coordinates at . Then there exists a positive constant such that for all with ,

Iii Problem Formulation

Let be the manifold defining a configuration space of a robotic system. Let be the obstacle region, such that is an open set, and denote the obstacle-free space as . The starting configuration is an element of , and the goal region is an open subset of . The trajectory planning problem is denoted by the tuple , where , as discussed in Section II, denotes a driftless control-affine system. A dynamically feasible trajectory is collision-free if for all . A trajectory is said to be feasible for the trajectory planning problem if it is dynamically feasible, collision-free, , and .

Let be the set of all feasible paths. A cost function for is a function from the set of paths to the nonnegative real numbers; in this paper we consider the cost function defined as the arc length of with respect to the Euclidean metric in . The objective is to find the feasible path with minimum associated cost; this minimum is achieved as long as is closed and bounded [1]. The optimal trajectory planning problem is then defined as follows:

Optimal motion planning for driftless systems: Given a trajectory planning problem and an arc length function , find a feasible path such that . If no such path exists, report failure.

Our analysis will rely on two key sets of assumptions, relating, respectively, to the underlying system and the problem-specific parameters .

Iii-1 Assumptions on system

As in Section II, we require from that a) the vector fields are bracket generating, b) the configuration space contains only regular points, c) there exist constants such that Theorem II.1 holds with these values at all , and d) there exists a bounding constant such that for all , where is as defined in Lemma II.3. Assumption (a) is a basic requirement for the system to be controllable, let alone optimally controlled, while (b), (c), and (d) ensure that there are no extreme regions of the configuration space which would require an unbounded sample density to capture their geometry. These assumptions will be collectively referred to as .

Iii-2 Assumptions on problem parameters

We require that the goal region has regular boundary, that is there exists such that , there exists with and . This requirement that the boundary of the goal region has bounded curvature ensures that a point sampling procedure may expect to select points in the goal region near any point on its boundary.

We also make requirements on the clearance of a trajectory, i.e., its “distance” from , standard for sampling-based methods [22]. For a given , the -interior of is defined as the set of all states that are at least a Euclidean distance away from any point in . A collision-free path is said to have strong -clearance if it lies entirely inside the -interior of . A collision-free path is said to have weak -clearance if there exists a path that has strong -clearance and there exists a homotopy , with and that satisfies the following three properties: (a) is a dynamically feasible path for all , (b) , and (c) for all , has strong -clearance for some . Properties (a) and (b) are required since pathological obstacle sets may be constructed that squeeze all optimum-approximating homotopies into undesirable motion. In practice, however, as long as does not contain any passages of infinitesimal width, the fact that is bracket-generating will allow every trajectory to be weak -clear.

Iv Probabilistic Exhaustivity of Sampling Schemes under Driftless Constraints

In this section we prove a key result characterizing random sampling schemes for motion planning under driftless differential constraints: any feasible trajectory through the configuration space is “traced” arbitrarily well by connecting randomly distributed points from a sufficiently large sample set covering the configuration space. We will refer to this property as probabilistic exhaustivity. Note that this notion is much stronger than the usual notion of probabilistic completeness in motion planning, where the requirement is that at least one feasible trajectory is traced. The notion of probabilistic exhaustivity, besides being a result of independent interest, is a strong tool in proving asymptotic optimality of sampling-based motion planning algorithms, as will be shown in Section V (specifically, we focus on differential variants of PRM [22] and FMT [17]).

Let satisfy the system (1). Given a set of waypoints , we associate a dynamically feasible trajectory that connects the nodes in order so that each connection is locally optimal, i.e. each path segment connecting to has length . We consider the waypoints to -trace the trajectory if: a) for all , b) the cost of is bounded as , and c) the distance from any point of to is no more than , i.e. for all . In the context of sampling-based motion planning, we may expect to find closely tracing as a subset of the sampled points, provided the sample size is large. We formalize this notion in Theorem IV.5, the proof of which requires three technical lemmas (the proofs of these lemmas are provided in the Appendix).

Lemma IV.1.

Let be a dynamically feasible trajectory and consider a partition of the time interval . Suppose that satisfy a) for all , and b) more than a fraction of the satisfy for a parameter . Then the cost of the trajectory sequentially connecting the nodes is upper bounded as

Remark IV.2.

If we further assume that , then the bound holds.

Let denote a set of points sampled independently and identically from the uniform distribution on .

Lemma IV.3.

Fix , , and let be disjoint subsets of with

for each . Let and define

Then .

Lemma IV.4.

Fix and let be subsets of , possibly overlapping, with

for each and some constant . Let and denote by the event that for each . Then

Before stating the theorem and proof in full, we sketch our approach for proving probabilistic exhaustivity. Given a path to be traced with waypoints from a sample set, we tile the span of the path with two sequences of concentric sub-Riemannian balls – a sequence of “small” balls and a sequence of “large” balls. With high probability, all but a tiny fraction of the small balls will contain a point from the sample set (Lemma IV.3), and for any small balls that don’t contain such a point we ensure that the concentric large ball does (Lemma IV.4). We take these points as a sequence of waypoints which tightly follows the reference path with few exceptions, and never has a gap over any section of the reference path when it deviates. We then use the metric inequality for to bound the total cost of the waypoint trajectory (Lemma IV.1).

Theorem IV.5 (Probabilistic exhaustivity).

Let be a DCA system satisfying the assumptions and suppose is a dynamically feasible trajectory with strong -clearance, . Let , , and for fixed consider the event that there exist which -trace , where

for a parameter . Then, as , the probability that does not occur is asymptotically bounded as .


Note that in the case we may pick to be the only waypoint and the result is trivial. Therefore assume . Fix sufficiently large so that . Take to be points spaced along at sub-Riemannian distances ; more precisely let , and for consider

Let be the first for which the set is empty; take . Note that since the distance is an infimum over all feasible trajectories, one of which is the segment of between and , we have the bound .

We now make the identification anticipating the application of Lemma IV.1. Take and define a sequence of sub-Riemannian balls centered along the trajectory by , where for . Within these balls define a concentric sequence of smaller, disjoint balls for each . Note that since , we have the volume lower bound and similarly Denote . We consider the event that every large ball , as well as at least a fraction of the small balls , contains at least one point of :

We claim that implies the event that there exist -tracing . If holds, then we select waypoints such that for every point, as well as for at least a fraction of the points. In particular let us select .

First note that for each implies

Next, applying Remark IV.2 we have

Finally we check that any point is within distance from a point on . To see this suppose that lies on the shortest path connecting to . Note that

Then we may write

so that . Thus the waypoints -trace .

Now making the identifications and in Lemmas IV.3 and IV.4 respectively, we see that by considering sufficiently large so that (satisfying the volume assumption of Lemma IV.3), we compute the union bound

Now, and together imply that . The second term in the bound dominates as , and . ∎

V Optimal Sampling-based Algorithms for Driftless Control-Affine Systems

In this section we present two algorithms for the motion planning problem with driftless control-affine systems. The first algorithm, named the Differential Probabilistic RoadMap algorithm (DPRM), is a derivation of the PRM algorithm presented in [22], while the second algorithm, named the Differential Fast Marching Tree algorithm (DFMT), is a derivation of the FMT algorithm presented in [17]. This section provides a description of both algorithms, while the next section focuses on their theoretical characterization (chiefly, their asymptotic optimality property).

As in Section IV, let be a function that returns a set of states sampled independently and identically from the uniform distribution on . These sampled states are connected as vertices in a graph from which a solution trajectory will be computed. Given two vertices and , we denote with the edge an optimal cost trajectory from to neglecting obstacle constraints. Let denote the boolean function which returns true if and only if the edge does not intersect an obstacle. Given a set of vertices , a state , and a threshold , let be a function that returns the set of states . Given a graph , where is the vertex set and is the edge set, and a vertex , let be the function that returns the cost of the shortest path in the graph between the vertices and . Let be the function that returns the path achieving that cost.

The DPRM algorithm is given in Algorithm 1, while the DFMT algorithm is given in Algorithm 2. DPRM works by sampling a set of points within and connecting each state to every state in a local neighborhood around it, provided that the connection is collision free. The resulting graph spans , with the local connections combining to yield a global “roadmap” for traveling between any two states, not just and the goal. The least cost path in the graph between and , computed, e.g., using the Dijkstra’s algorithm, is output by DPRM.

The DFMT algorithm essentially implements a streamlined version of DPRM by performing a “lazy” dynamic programming recursion, this time during the state connection phase instead of as a last step, to grow a tree of trajectories which moves steadily outward in cost-to-come space. In the Algorithm 2 outline, the set consists of all of the nodes that have not yet been added into the tree, while is comprised only of nodes that are in the tree. In particular, while keeps track of nodes which have already been added to the tree, nodes are removed from if they are not near enough to the edge of the expanding tree to actually have any new connections made with (see [17] for further details). At each iteration, DFMT examines the neighborhood of a state in and only considers locally-optimal (assuming no obstacles) connections for potential inclusion in the tree (see line (10)). By only checking for collision on the locally-optimal (assuming no obstacles) connection, as opposed to every possible connection (essentially what is done in DPRM), DFMT saves a large (indeed unbounded as the number of vertices increases) number of collision-check computations. A more detailed explanation of the meaning of the dual sets and and of the philosophy behind DFMT can be found in [17].

The main differences of these algorithms with respect to their geometric counterparts (i.e., PRM and FMT) are that (i) near vertices lie within the privileged coordinate box rather than within the Euclidean ball, and (ii) the edges connecting vertices are optimal trajectories rather than straight lines. Indeed, attempting connections within the sub-Riemannian ball would be the best analogy to the geometric planning case, and is preferable for efficiency if possible. The structure of this set is difficult to compute exactly, however, which motivates the choice as a tractable approximation (as also suggested in [11]). Checking connections within this superset of induces an extra run time factor of at most , a bound on the volume ratio between the two sets.

1  ;
2  for each  do
3     for each  do
4        if  then
6        end if
7     end for
8  end for
10  if  then
11      random vertex in
12     return  
13  else
14     return  Failure
15  end if
Algorithm 1 Differential Probabilistic RoadMap (DPRM)
1  ;
2  ;
4  while  do
7     for  do
10        if  then
14        end if
15     end for
17     if  then
18        return  Failure
19     end if
21  end while
22  return  
Algorithm 2 Differential Fast Marching Tree (DFMT)

Vi Asymptotic Optimality of DFMT and DPRM

In this section, we prove the asymptotic optimality of DFMT and DPRM (obtained as a simple corollary). We conclude the section by providing a discussion of the results.

Vi-a Asymptotic Optimality of DFMT and DPRM

The following theorem presents a result comparing the output cost of DFMT to the cost of any feasible trajectory.

Theorem VI.1 (Dfmt cost comparison).

Let be a trajectory planning problem satisfying the assumptions and suppose is a feasible path with strong -clearance, . Assume further that extends into the interior of , i.e. there exists such that . Let denote the cost of the path returned by DFMT with vertices using a radius

for a parameter . Then for fixed


The proof of this theorem is conceptually similar to the one of Theorem 1 in [17]. The details are provided in the Appendix. ∎

We are now in a position to state the optimality result for DFMT (note that the result also provides a convergence rate bound). The optimality of DPRM will follow as a corollary.

Theorem VI.2 (Dfmt asymptotic optimality).

Let be a trajectory planning problem, satisfying the assumptions and with -regular, such that there exists an optimal path with weak -clearance for some . Let denote the arc length of , and let denote the cost of the path returned by DFMT with vertices using the radius

for a parameter . Then for fixed


Note that implies , and DFMT will terminate immediately and optimally. In this case the result is trivial, therefore assume . Let denote the homotopy given by the weak -clearance definition for . Using the fact that , we pick an such that . Denote . We now extend into the interior of so that we may apply Theorem VI.1. Since is optimal, and the -regularity of means there exists with and . Pick and take on the straight line segment between and with . Then . The last two terms in the minimum above ensure, from Lemma II.3, that . Consider the extension of constructed by concatenating with the shortest sub-Riemannian path between and . The cost of is bounded as

The strong -clearance of is established by noting for any along the path between and :

Then by Theorem VI.1, with the approximation factor , we have for that

For the desired result follows from the monotonicity in of the above probability. ∎

Remark VI.3.

There is an implicit dependence on the problem parameters and approximation factor in the convergence rate bound given above in Theorem VI.2; these fixed parameters influence the threshold after which the stated asymptotic bound holds.

We conclude this section with the asymptotic optimality result for DPRM.

Corollary VI.4 (Dprm asymptotic optimality).

Let be a trajectory planning problem, satisfying the assumptions and with -regular, such that there exists an optimal path with weak -clearance for some . Let denote the arc length of , and let denote the cost of the path returned by DPRM with vertices using the radius

for a parameter . Then for fixed


Given the same sample set , the trajectory tree constructed by DFMT is a subgraph of the roadmap graph constructed by DPRM. Hence the cost of the path returned by DPRM is upper bounded by that of DFMT