Safe, Optimal, Real-time Trajectory Planning with a Parallel Constrained Bernstein Algorithm
To move through the world, mobile robots typically use a receding-horizon strategy, wherein they execute an old plan while computing a new plan to incorporate new sensor information. A plan should be dynamically feasible, meaning it obeys constraints like the robot’s dynamics and obstacle avoidance; it should have liveness, meaning the robot does not stop to plan so frequently that it cannot accomplish tasks; and it should be optimal, meaning that the robot tries to satisfy a user-specified cost function such as reaching a goal location as quickly as possible. Reachability-based Trajectory Design (RTD) is a planning method that can generate provably dynamically-feasible plans. However, RTD solves a nonlinear polynomial optimization program at each planning iteration, preventing optimality guarantees; furthermore, RTD can struggle with liveness because the robot must brake to a stop when the solver finds local minima or cannot find a feasible solution. This paper proposes RTD*, which certifiably finds the globally optimal plan (if such a plan exists) at each planning iteration. This method is enabled by a novel Parallelized Constrained Bernstein Algorithm (PCBA), which is a branch-and-bound method for polynomial optimization. The contributions of this paper are: the implementation of PCBA; proofs of bounds on the time and memory usage of PCBA; a comparison of PCBA to state-of-the-art solvers; and the demonstration of PCBA/RTD* on a mobile robot. RTD* outperforms RTD in terms of optimality and liveness for real-time planning in a variety of environments with randomly-placed obstacles.
For mobile robots to operate successfully in unforeseen environments, they must plan their motion as new sensor information becomes available. This receding-horizon strategy requires iteratively generating a plan while simultaneously executing a previous plan. Typically, this requires solving an optimization program in each planning iteration (see, e.g., ).
The present work considers a mobile ground robot tasked with reaching a global goal location in an arbitrary, static environment. To assess receding-horizon planning performance, we consider the following characteristics of plans. First, a plan should be dynamically feasible, meaning that it obeys the dynamic description of the robot and obeys constraints such as actuator limits and obstacle avoidance. Second, a plan should maintain liveness, meaning that it keeps the robot moving through the world without stopping frequently to replan, which can prevent a robot from achieving a task in a user-specified amount of time. Third, a plan should be optimal with respect to a user-specified cost function, such as reaching a goal quickly.
Ensuring that plans have these characteristics is challenging for several reasons. First, robots typically have nonlinear dynamics; this means that creating a dynamically-feasible plan typically requires solving a nonlinear program (NLP) at runtime. However, it is difficult to certify that an NLP can be solved in a finite amount of time, meaning that the robot may have to sacrifice liveness. Furthermore, even if a robot has linear dynamics, the cost function may be nonlinear; in this case, it is challenging to certify optimality due to the presence of local minima.
This paper extends prior work on Reachability-based Trajectory Design (RTD). RTD is able to provably generate dynamically-feasible trajectory plans in real time, but cannot guarantee optimality or liveness (the robot will generate plans in real time, but may brake to a stop often). We address this gap by proposing the Parallel Constrained Bernstein Algorithm (PCBA), which provably finds globally-optimal solutions to Polynomial Optimization Problems (POPs), a special type of NLP. We apply PCBA to RTD to produce an optimal version of RTD, which we call RTD* in the spirit of the well-known RRT* algorithm . We show on hardware that RTD* demonstrates liveness (in comparison to RTD) for trajectory optimization. For the remainder of this section, we discuss related work, then state our contributions.
I-a Related Work
A variety of methods exist that attempt receding-horizon planning while maintaining dynamic feasibility, liveness, and optimality. These methods can be broadly classified by whether they perform sampling or solve an optimization program at each planning iteration. Sampling-based methods typically either attempt to satisfy liveness and dynamic feasibility by choosing samples offline [34, 24], or attempt to satisfy optimality at the potential expense of liveness and dynamic feasibility . Optimization-based methods attempt to find a single optimal trajectory. These methods typically have to sacrifice dynamic feasibility (e.g., by linearizing the dynamics) to ensure liveness , or sacrifice liveness to attempt to satisfy dynamic feasibility [46, 8] (also see [18, §9] and ).
Reachability-based Trajectory Design
Reachability-based Trajectory Design (RTD) is an optimization-based receding horizon planner that requires solving a POP at each planning iteration . RTD specifies plans as parameterized trajectories. Since these trajectories cannot necessarily be perfectly tracked by the robot, RTD begins with an offline computation of a Forward Reachable Set (FRS). The FRS contains every parameterized plan, plus the tracking error that results from the robot not tracking any plan perfectly. At runtime, in each planning iteration, the FRS is intersected with sensed obstacles to identify all parameterized plans that could cause a collision (i.e., be dynamically infeasible). This set of unsafe plans is represented as a (finite) list of polynomial constraints, and the user is allowed to specify an arbitrary (not necessarily convex) polynomial cost function, resulting in a POP. At each planning iteration, either the robot successfully solves the POP to get a new plan, or it continues executing its previously-found plan. While the decision variable is typically only two- or three-dimensional, each POP often has hundreds of constraints, making it challenging to find a feasible solution in real time . Each plan includes a braking maneuver, ensuring that the robot can always come safely to a stop if the POP cannot be solved quickly enough in any planning iteration.
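The fail-safe logic above (use a new plan when the solver succeeds, otherwise keep executing the previous plan, which ends in braking) can be sketched as follows. This is a hedged illustration with a hypothetical `solve_pop` callable, not the authors' implementation:

```python
def receding_horizon_step(solve_pop, current_plan):
    # One planning iteration: try to solve the POP for a new provably-safe plan.
    # On failure (timeout or infeasibility), keep executing the previous plan,
    # which by construction ends in a safe braking maneuver.
    new_plan = solve_pop()
    return new_plan if new_plan is not None else current_plan

# Toy usage: when the solver fails, the robot keeps its previous (braking) plan.
assert receding_horizon_step(lambda: None, "previous-plan") == "previous-plan"
assert receding_horizon_step(lambda: "new-plan", "previous-plan") == "new-plan"
```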
Note, for RTD, optimality means finding the optimal solution to a POP at each planning iteration. The cost function in RTD’s POPs typically encodes behavior such as reaching a waypoint between the robot’s current position and the global goal (e.g., [18, §9.2.1]); RTD attempts to find the best dynamically feasible trajectory to the waypoint. RTD does not attempt to find the best waypoints themselves (best, e.g., with respect to finding the shortest path to the global goal). Such waypoints can be generated quickly by algorithms such as A* or RRT* by ignoring dynamic feasibility [17, 18].
Polynomial Optimization Problems
POPs require minimizing (or maximizing) a polynomial objective function, subject to polynomial equality or inequality constraints. As a fundamental class of problems in non-convex optimization, POPs arise in various applications, including signal processing [35, 43, 27], quantum mechanics [4, 12], control theory [16, 18, 2], and robotics [39, 26]. This paper presents a novel parallelized constrained Bernstein Algorithm for solving POPs.
The difficulty of solving a POP increases with the dimension of the cost and constraints, with the number of constraints, and with the number of optima . Existing methods attempt to solve POPs while minimizing time and memory usage (i.e., complexity). Doing so typically requires placing limitations on one of these axes of difficulty to make solving a POP tractable. These methods broadly fall into the following categories: derivative-based, convex relaxation, and branch-and-bound.
Derivative-based methods use derivatives (and sometimes Hessians) of the cost and constraint functions, along with first- or second-order optimality conditions [33, §12.3, §12.5], to attempt to find optimal, feasible solutions to nonlinear problems such as POPs. These methods can find local minima of POPs rapidly despite high dimension, a large number of constraints, and high degree cost and constraints [33, Chapter 19.8]. However, these methods do not typically converge to global optima without requiring assumptions on the problem and constraint structure (e.g., ).
Convex relaxation methods attempt to find global optima by approximating the original problem with a hierarchy of convex optimization problems. These methods can be scaled to high-dimensional problems (up to 10 dimensions), at the expense of limits on the degree and sparse structure of the cost function; furthermore, they typically struggle to handle large numbers of constraints (e.g., the hundreds of constraints that arise in RTD’s POPs), unless the problem has low-rank or sparse structure . Well-known examples include the lift-and-project linear program procedure , reformulation-linearization technique , and Semi-Definite Program (SDP) relaxations [21, 39]. By assuming structure such as homogeneity of the cost function or convexity of the domain and constraints, one can approximate solutions to a POP in polynomial-time, with convergence to global optima in the limit [5, 22, 23, 42, 15]. Convergence within a finite number of convex hierarchy relaxations is possible under certain assumptions (e.g., a limited number of equality constraints [32, 20]).
Branch-and-bound methods perform an exhaustive search over the feasible region. These methods are typically limited to up to four dimensions, but can handle large numbers of constraints and high degree cost and constraints. Examples include interval analysis techniques [14, 45] and the Bernstein Algorithm (BA) [10, 31, 40]. Interval analysis requires cost and constraint function evaluations in each iteration, and therefore can be computationally slow. BA, on the other hand, does not evaluate the cost and constraint functions; instead, BA represents the coefficients of the cost and constraints in the Bernstein basis, as opposed to the monomial basis. The coefficients in the Bernstein basis provide lower and upper bounds on the polynomial cost and constraints over box-shaped subsets of Euclidean space by using a subdivision procedure [9, 30]. Note, one can use the Bernstein basis to transform a POP into a linear program (LP) on each subdivided portion of the problem domain, which allows one to find tighter solution bounds than those given by the Bernstein coefficients alone . Since subdivision can be parallelized , the time required to solve a POP can be greatly reduced by implementing BA on a Graphics Processing Unit (GPU). However, neither a parallelized implementation of BA with constraints nor bounds on its rate of convergence has yet been shown in the literature. Furthermore, to the best of our knowledge, BA has not been shown as a practical method for solving problems in real-time robotics applications.
I-B Contributions and Paper Organization
In this paper, we make the following contributions. First, we propose a Parallel Constrained Bernstein Algorithm (PCBA) (§III). Second, we prove that PCBA always finds an optimal solution (if one exists), and prove bounds on PCBA’s time and memory usage (§IV). Third, we evaluate PCBA on a suite of well-known POPs in comparison to the Bounded Sums-of-Squares (BSOS)  solver and a generic nonlinear solver (MATLAB’s fmincon) (§V). Fourth, we apply PCBA to RTD to make RTD*, a provably-safe, optimal, and real-time trajectory planning algorithm for mobile robots (§VI), thereby demonstrating dynamic feasibility and liveness. The remainder of the paper is organized as follows. §II defines notation for RTD and POPs. §VII draws conclusions. Appendix B lists benchmark problems for PCBA.
This section introduces notation, RTD, POPs, the Bernstein form, and subdivision.
We follow the notation in . Let be a real variable of dimension . A multi-index is defined as and the corresponding multi-power is defined as . Given another multi-index of the same dimension, an inequality should be understood component-wise. An -variate polynomial in canonical (monomial) form can be written as
with coefficients and some multi-index . The space of polynomials of degree , with variable , is .
We call the multi-degree of a polynomial ; each th element of is the maximum degree of the variable out of all of the monomials of . We call the degree of ; is the maximum sum, over all monomials of , of the powers of the variable . That is, , where is the sum of the elements of a multi-index.
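For concreteness, both notions can be computed from a sparse coefficient representation of a polynomial. The dict encoding below (multi-index tuple to coefficient) is a hypothetical convenience for illustration, not the paper's data structure:

```python
# Example: p(x, y) = 3*x^2*y + x*y^3 - 2, stored as {multi-index: coefficient}.
p = {(2, 1): 3.0, (1, 3): 1.0, (0, 0): -2.0}

def multi_degree(poly):
    # Per-variable maximum exponent over all monomials of the polynomial.
    dim = len(next(iter(poly)))
    return tuple(max(mi[i] for mi in poly) for i in range(dim))

def degree(poly):
    # Maximum total power (sum of exponents) over all monomials.
    return max(sum(mi) for mi in poly)

print(multi_degree(p))  # (2, 3)
print(degree(p))        # 4
```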
Point and Set Notation
Let denote a general -dimensional box in , with for each . Let be the -dimensional unit box. Denote by the maximum width of a box , i.e. . For any point , denote by the Euclidean norm of , and denote by the closed Euclidean ball centered at with radius .
Let denote the time interval of a single plan (in a single receding-horizon iteration). Let denote the robot’s state space, and denote the robot’s control inputs where . The robot is described by dynamics , which we call a high-fidelity model. The parameterized trajectories are described by , where is the space of trajectory parameters. A point parameterizes a trajectory . For any , the robot uses a feedback controller to track the trajectory parameterized by (we say it tracks for short).
Ii-B Polynomial Optimization Problems
We denote a POP as follows:
The decision variable is , where is a compact, box-shaped domain and is the dimension of the program. The cost function is , and the constraints are (). We assume for convenience that is the greatest degree amongst the cost and constraint polynomials; we call the degree of the problem.
Ii-C Reachability-based Trajectory Design
Recall that RTD begins by computing an FRS offline for a robot tracking a parameterized set of trajectories. At runtime, in each receding-horizon planning iteration, the FRS is used to construct a POP as in (P); solving this POP is equivalent to generating a new trajectory plan. We now briefly describe how this POP is constructed (see  for details).
We define the FRS as the set of all states reachable by the robot when tracking any parameterized trajectory:
where is the set of valid initial conditions for the robot in any planning iteration. To implement RTD, we conservatively approximate the FRS using sums-of-squares programming. In particular, we compute a polynomial ( or ) for which the 0-superlevel set contains the FRS:
See [18, Section 3.2] for details on how to compute .
At runtime, we use to identify unsafe trajectory plans. If is a collection of points on obstacles, then we solve the following program in each planning iteration:
where is a user-constructed polynomial cost function (see (28) in §VI as an example). Note that, by (3), the set of safe trajectory plans is open, so we implement the constraints in (4) as , . Critically, any feasible solution to (4) is provably dynamically feasible (and collision-free) [18, Theorem 68]. To understand RTD’s online planning in more detail, see [18, Algorithm 2].
In this work, instead of using a derivative-based method to solve (4), we use our proposed PCBA method, which takes advantage of the polynomial structure of and . Next, we discuss Bernstein polynomials.
Ii-D Bernstein Form
A polynomial in monomial form (1) can be expanded into Bernstein form over an arbitrary -dimensional box as
where is the th multivariate Bernstein polynomial of multi-degree over , and are the corresponding Bernstein coefficients of over . A detailed definition of Bernstein form is available in . Note that the Bernstein form of a polynomial can be determined quickly , by using a matrix multiplication on a polynomial’s monomial coefficients, with the matrix determined by the polynomial degree and dimension. This matrix can be precomputed, and the conversion from monomial to Bernstein basis only needs to happen once for the proposed method (see Algorithm 1 in §III).
For convenience, we collect all such Bernstein coefficients in a multi-dimensional array , which is called a patch. We denote by (resp. ) the minimum (resp. maximum) element in the patch . The range of polynomial over is contained within the interval spanned by the extrema of , formally stated as the following theorem:
Theorem 2 ([31, Lemma 2.2]).
Let be a polynomial defined as in (5) over a box . Then, the following property holds for a patch of Bernstein coefficients
This theorem provides a means to obtain enclosure bounds of a multivariate polynomial over a box by transforming the polynomial to Bernstein form. This range enclosure can be further improved either by degree elevation or by subdivision. This work uses subdivision, discussed next, to refine the bounds.
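A quick numerical check of this enclosure, using the univariate polynomial p(t) = 1 - 3t + 2t^2, whose degree-2 Bernstein patch on [0, 1] is [1, -0.5, 0] (from the standard change of basis):

```python
# The patch of p(t) = 1 - 3t + 2t^2 on [0, 1], assumed precomputed.
b = [1.0, -0.5, 0.0]
p = lambda t: 1 - 3*t + 2*t**2

# Every sampled value of p on [0, 1] lies between min(b) and max(b).
samples = [i / 1000 for i in range(1001)]
lo, hi = min(b), max(b)
assert all(lo <= p(t) <= hi for t in samples)
# The enclosure is conservative: min(b) = -0.5, but the true minimum of p
# on [0, 1] is -0.125 (attained at t = 3/4); subdivision tightens this gap.
```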
Ii-E Subdivision Procedure
Consider an arbitrary box . The range enclosure in Theorem 2 is improved by subdividing into subboxes and computing the Bernstein patches over these subboxes. A subdivision in the th direction () is a bisection of perpendicular to this direction. That is, let
be an arbitrary box over which the Bernstein patch is already computed. By subdividing in the th direction we obtain two subboxes and , defined as
Note that we have subdivided by halving its width in the th direction; we choose as the subdivision parameter in this work, but one can choose a different value in (see [31, Equation (10)]).
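With subdivision parameter 1/2, the univariate patch update reduces to de Casteljau's algorithm; the multivariate case applies the same linear maps along one direction at a time. A hedged Python sketch:

```python
def subdivide(b, lam=0.5):
    # De Casteljau subdivision of a univariate Bernstein patch at parameter lam:
    # returns the patches of p over [0, lam] and [lam, 1] (in local coordinates).
    left, right, level = [b[0]], [b[-1]], list(b)
    while len(level) > 1:
        level = [(1 - lam) * level[i] + lam * level[i + 1]
                 for i in range(len(level) - 1)]
        left.append(level[0])
        right.insert(0, level[-1])
    return left, right

# p(t) = 1 - 3t + 2t^2 has patch [1, -0.5, 0] on [0, 1].
bL, bR = subdivide([1.0, -0.5, 0.0])
# Each half gives a tighter enclosure than the parent bounds [-0.5, 1].
print(bL, min(bL), max(bL))
print(bR, min(bR), max(bR))
```

On the right half, for instance, the lower bound improves from -0.5 to -0.25, closer to the true minimum of -0.125.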
The new Bernstein patches, and , can be computed by a finite number of linear transformations [31, §2.2]:
where and are constant matrices, which can be precomputed, for each (notice that [31, Equation (10)] obtains with linear operations on ). The patches and one iteration of the subdivision procedure are shown in Figure 2.
To reduce wordiness, we say that we subdivide a patch to mean the subdivision of a single subbox into two subboxes and the computation of the corresponding Bernstein patches for the POP cost and constraints.
By repeatedly applying the subdivision procedure and Theorem 2, the bounds on the range of polynomial in a subbox can be improved. In fact, such bounds can be exact in the limiting sense if the subdivision is applied evenly in all directions:
Theorem 4 ([25, Theorem 2]).
Let be a box of maximum width () and let be the corresponding Bernstein patch of a given polynomial , then
where is a non-negative constant that can be given explicitly independent of .
Theorem 4 and Corollary 5 provide shrinking bounds for values of a polynomial over subboxes as the subdivision process continues. By comparing the bounds over all subboxes, one can argue that the minimizers of a polynomial may appear in only a subset of the subboxes. This idea underlies the Bernstein Algorithm for solving POPs [30, 31], discussed next.
Iii Parallel Constrained Bernstein Algorithm
This section proposes the Parallel Constrained Bernstein Algorithm (PCBA, Algorithm 1) for solving a general POP. We extend the approach in . This approach utilizes Bernstein form to obtain upper and lower bounds of both objective and constraint polynomials (Theorem 2), iteratively improves such bounds using subdivision (Theorem 4 and Corollary 5), and removes patches that cannot contain a solution (Theorem 12). We discuss the algorithm, the list used to store patches, tolerances and stopping criteria, subdivision, a cut-off test for eliminating patches, and the advantages and disadvantages of PCBA. The next section, §IV, proves PCBA finds globally optimal solutions to POPs, up to user-specified tolerances.
Before proceeding, we make an assumption for notational convenience, and to make the initial computation of Bernstein patches easier.
Without loss of generality, the domain of the decision variable is the unit box (i.e., ), since any nonempty box in can be mapped affinely onto .
Iii-a Algorithm Summary
We now summarize PCBA, implemented in Algorithm 1. The algorithm is initialized by computing the Bernstein patches of the cost and constraints on the domain (Line 1). Subsequently, PCBA subdivides each patch as in Remark 8 (Line 5 and Algorithm 2). Then, PCBA finds the upper and lower bounds of each new patch (Line 6 and Algorithm 3). These bounds are used to determine which patches are feasible, infeasible, and undecided as in Definition 9 (Line 7; see Algorithm 4 and Theorem 12). Algorithm 4 also determines the current solution estimate (the smallest upper bound over all feasible patches), and marks any patches that are suboptimal as in Definition 11. If every patch is infeasible (Line 8), PCBA returns that the problem is infeasible (Line 13); otherwise, PCBA checks if the current solution estimate meets user-specified tolerances (Line 9). If the tolerances are met, PCBA returns the solution estimate (Line 12). Otherwise, PCBA eliminates all infeasible and suboptimal patches (Line 10 and Algorithm 5), then moves to the next iteration (Line 11). Note that Algorithms 2, 3, and 5 are parallelized.
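This loop can be sketched serially for a one-dimensional POP with inequality constraints. The following is a minimal Python sketch built from standard Bernstein-form formulas, not the authors' parallel CUDA implementation; the name `pcba_1d` is hypothetical, and the cut-off and tolerance logic is simplified:

```python
from math import comb

def to_bernstein(a):
    # Monomial coefficients a (p(t) = sum_j a[j] t^j) to Bernstein
    # coefficients of the same degree on [0, 1] (standard change of basis).
    n = len(a) - 1
    return [sum(comb(k, j) / comb(n, j) * a[j] for j in range(k + 1))
            for k in range(n + 1)]

def subdivide(b):
    # De Casteljau subdivision at the midpoint: patches over the two halves.
    left, right, level = [b[0]], [b[-1]], list(b)
    while len(level) > 1:
        level = [0.5 * (level[i] + level[i + 1]) for i in range(len(level) - 1)]
        left.append(level[0])
        right.insert(0, level[-1])
    return left, right

def pcba_1d(cost, ineqs, eps=1e-6, max_iter=40):
    # Minimize cost(t) over [0, 1] subject to g_i(t) <= 0, all given as
    # monomial coefficients. Items are (box, cost patch, constraint patches).
    items = [((0.0, 1.0), to_bernstein(cost), [to_bernstein(g) for g in ineqs])]
    for _ in range(max_iter):
        new_items = []
        for (lo, hi), bc, bgs in items:  # subdivide every item (parallel on a GPU)
            mid = 0.5 * (lo + hi)
            bcL, bcR = subdivide(bc)
            gs = [subdivide(g) for g in bgs]
            new_items.append(((lo, mid), bcL, [s[0] for s in gs]))
            new_items.append(((mid, hi), bcR, [s[1] for s in gs]))
        # Solution estimate: smallest cost upper bound over certainly-feasible items.
        ub = min((max(bc) for _, bc, bgs in new_items
                  if all(max(g) <= 0 for g in bgs)), default=float('inf'))
        # Cut-off test: drop certainly-infeasible and suboptimal items.
        items = [it for it in new_items
                 if all(min(g) <= 0 for g in it[2]) and min(it[1]) <= ub]
        if not items:
            return None  # the POP is infeasible
        best = min(items, key=lambda it: min(it[1]))
        if max(best[1]) - min(best[1]) < eps:  # optimality tolerance met
            break
    (lo, hi), bc, _ = best
    return 0.5 * (lo + hi), min(bc)

# Minimize (t - 0.7)^2 subject to t - 0.5 <= 0; optimum at t* = 0.5, cost 0.04.
t_star, cost_lb = pcba_1d([0.49, -1.4, 1.0], [[-0.5, 1.0]])
```

On this toy problem the returned point lies within the achieved box width of the constrained minimizer t* = 0.5, and the returned lower bound approaches the optimal cost 0.04.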
Iii-B Items and The List
Denote an item as the tuple , where (resp. ) is shorthand for the set of patches (resp. ). We use the following notation for items. If , then , , , and .
We denote the list , , indexed by . PCBA adds and removes items from by assessing the feasibility and optimality of each item.
Iii-C Tolerances and Stopping Criteria
Recall that, by Theorem 2, Bernstein patches provide upper and lower bounds for polynomials over a box. From Theorem 4 and Corollary 5, as we subdivide into smaller subboxes, the bounds of the Bernstein patches on each subbox more closely approximate the actual bounds of the polynomial. While the bounds will converge to the value of the polynomial in the limit as the maximum width of subboxes goes to zero, to ensure the algorithm terminates, we must set tolerances on optimality and equality constraint satisfaction (the equality constraints may not be satisfied for all points in certain boxes). During optimization one is also usually interested in finding the optimizer of the cost function up to some resolution. In our case, this resolution corresponds to the maximum allowable subbox width which we refer to as the step tolerance.
We denote the optimality tolerance as , the equality constraint tolerance as , and the step tolerance as . We terminate Algorithm 1 either when is empty (the problem is infeasible) or when there exists an item that satisfies all of the following conditions:
for all ,
for all , and
for all .
We discuss feasibility in more detail in §III-E. Note that, to implement the step tolerance , since we subdivide by halving the width of each subbox, we need only ensure that sufficiently many iterations have passed.
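A hedged sketch of these checks for a single item, assuming the three conditions bound the cost patch width by the optimality tolerance, each equality patch within the equality tolerance of zero, and each subbox width by the step tolerance (function and argument names are hypothetical):

```python
def meets_tolerances(bc, bhs, widths, eps_opt, eps_eq, eps_step):
    # bc: Bernstein patch of the cost; bhs: patches of equality constraints;
    # widths: subbox widths in each dimension.
    cost_ok = (max(bc) - min(bc)) <= eps_opt      # cost bounded within eps_opt
    eq_ok = all(-eps_eq <= min(bh) and max(bh) <= eps_eq for bh in bhs)
    step_ok = max(widths) <= eps_step             # subbox is small enough
    return cost_ok and eq_ok and step_ok

# A small item whose bounds satisfy all three tolerances:
assert meets_tolerances([0.50, 0.501], [[-1e-9, 1e-9]], [1e-4], 1e-2, 1e-6, 1e-3)
```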
Iii-D Subdivision
Recall that subdivision is presented in §II-E. We implement subdivision with Algorithm 2. Since the subdivision of one Bernstein patch is computationally independent of another, each subdivision task is assigned to an individual GPU thread, making Algorithm 2 parallel.
Note that the subdivision of Bernstein patches can be done in any direction, leading to the question of how to select the direction in practice. Example rules are available in the literature, such as maximum width [38, §3], derivative-based [47, §3], or a combination of the two [38, §3]. In the context of constrained optimization, the maximum width rule is usually favored over derivative-based rules for two reasons: first, computing the partial derivatives of all constraint polynomials can introduce significant computational burden, especially when the number of constraints is large (see §V-C); second, the precision of Bernstein patches as bounds to the polynomials depends on the maximum width of each subbox (Theorem 4 and Corollary 5), so it is beneficial to subdivide along the direction of maximum width for better convergence results.
In each th iteration of PCBA, we subdivide in each direction , in the order . We halve the width of each subbox each time we subdivide, leading to the following remark.
In the th iteration, the maximum width of any subbox in is .
Iii-E Cut-Off Test
Subdivision would normally occur for every patch in every iteration, leading to exponential memory usage ( patches in iteration ). However, by using a cut-off test, some patches can be deleted, reducing both the time and memory usage of PCBA (see §IV for complexity analysis). To decide which patches are to be eliminated, we require the following definitions.
An item is feasible if both of the following hold:
for all , and
for all .
An item is infeasible if any of the following hold:
for at least one , or
for at least one , or
for at least one .
An item is undecided if it is neither feasible nor infeasible.
Notice, in particular, that a feasible item cannot also be infeasible.
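A hedged sketch of this trichotomy, assuming each patch bounds the range of its constraint on the subbox (Theorem 2); the function and list names are hypothetical, with `bgs` the patches of inequality constraints (required nonpositive) and `bhs` those of equality constraints, up to the equality tolerance:

```python
def classify(bgs, bhs, eps_eq):
    # Feasible: every inequality patch is entirely nonpositive, and every
    # equality patch lies within the equality tolerance of zero.
    if all(max(bg) <= 0 for bg in bgs) and \
       all(-eps_eq <= min(bh) and max(bh) <= eps_eq for bh in bhs):
        return "feasible"
    # Infeasible: some constraint is certainly violated everywhere on the subbox.
    if any(min(bg) > 0 for bg in bgs) or \
       any(min(bh) > eps_eq or max(bh) < -eps_eq for bh in bhs):
        return "infeasible"
    return "undecided"

print(classify([[-0.3, -0.1]], [], 1e-6))  # feasible
print(classify([[0.2, 0.5]], [], 1e-6))    # infeasible
print(classify([[-0.1, 0.2]], [], 1e-6))   # undecided
```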
The solution estimate is the smallest upper bound of the cost over all feasible items in :
where if .
An item is suboptimal if
Note that Definitions 10 and 11 are dependent on ; that is, for the purposes of PCBA, optimality is defined in terms of the elements of . We show in Corollary 13 below how this notion of optimality coincides with optimality of the POP itself.
Feasible, infeasible, undecided, and suboptimal patches are illustrated in Figure 2. Any item that is infeasible or suboptimal can be eliminated from , because the corresponding subboxes cannot contain the solution to the POP (formalized in the following Theorem). We call checking for infeasible and suboptimal items the cut-off test.
Theorem 12 (Cut-Off Test).
Let be an item in . Then the following hold:
if is feasible, then all points in are feasible (up to the tolerance ), and
if is infeasible, then all points in are infeasible (up to the tolerance ), and
if is suboptimal, then all points in are not optimal.
Let be arbitrary. Then it follows from Theorem 2 and the definition of suboptimality that
for all . Since such a point is necessarily feasible (by condition 2), it cannot be a global minimum of the POP.
This result is the contrapositive of Theorem 12.
We implement the cut-off tests as follows. Algorithm 3 (FindBounds) computes the maximum and minimum element of each Bernstein patch; Algorithm 4 (CutOffTest) implements the cut-off tests and marks all subboxes to be eliminated with a list ; and Algorithm 5 (Eliminate) eliminates the marked subboxes from the list . Algorithms 3 and 5 are parallelizable, whereas Algorithm 4 must be computed serially.
Iii-F Advantages and Disadvantages of PCBA
PCBA has several advantages. First, it always finds a global optimum (if one exists), subject to tolerances. PCBA does not require an initial guess, and does not converge to local minima, unlike generic nonlinear solvers (e.g., fmincon ). It also does not require tuning hyperparameters. As we show in §IV, PCBA has bounded time and memory complexity under certain assumptions. Finally, due to parallelization, PCBA is fast enough to enable RTD* for real-time, safe, optimal trajectory planning, which we demonstrate in §VI.
However, PCBA also has several limitations in comparison to traditional approaches to solving POPs. First, to prove the bounds on time and memory usage, we require that, at any global minimum, the active constraints are linearly independent and the Hessian of the cost function is positive definite (see Theorems 14 and 17). Furthermore, because the number of Bernstein patches grows exponentially with the dimension of the decision variable, we have not yet applied PCBA to problems of more than four dimensions.
Iv Complexity Analysis
In this section, we prove that Algorithm 1 terminates by bounding the number of iterations of PCBA for both unconstrained and constrained POPs. We also prove the number of Bernstein patches (i.e., the length of the list in Algorithm 1) is bounded after sufficiently many iterations, under certain assumptions. For convenience, in the remainder of this section we use as a shorthand notation for where . The proofs of Theorems 16, 17, and 18 are in the Appendix.
Iv-a Unconstrained Case
We first consider unconstrained POPs whose optimal solutions are not on the boundary of . Note, we can treat optimal solutions on the boundary of as having active linear constraints; see §IV-B for the corresponding complexity analysis. In the unconstrained case, all points in are feasible, and we are interested in solving
where is an -dimensional multivariate polynomial. Given an optimality tolerance and step tolerance , we bound the number of iterations to solve (16) with PCBA as follows:
Let be the current iteration number. It is sufficient to show that, for all , there exists a subbox of such that the following conditions hold:
for all .
According to (17), the rate of convergence with respect to (WRT) the decision variables is quadratic (1st term); the rate of convergence WRT the objective function is either quadratic (2nd term) or linear (3rd term), depending on which term dominates. However, a tighter bound exists if one of the global minimizers satisfies the second-order sufficient condition for optimality [33, Theorem 2.4], which is shown in the following theorem:
Theorem 15 proves a quadratic rate of convergence WRT the objective function once the number of iterations exceeds a threshold , given that the second-order optimality condition is satisfied at some global minimizer. We now discuss the number of patches remaining after sufficiently many iterations, which gives an estimate of memory usage when Algorithm 1 is applied to solve (16).
Suppose there are global minimizers of (16), and none of them are on the boundary of the unit box . Let the Hessian be positive definite at these minimizers. Then after sufficiently many iterations of Algorithm 1, the number of Bernstein patches remaining (i.e., length of the list in Algorithm 1) is bounded by a constant.
The constant bound in Theorem 16 scales exponentially with the problem dimension, and is a function of the condition number of the cost function’s Hessian at the minimizers.
Iv-B Constrained Case
Suppose that the linear independence constraint qualification (LICQ) [33, Definition 12.4] is satisfied at all global minimizers of the constrained POP (P), and at least one constraint is active (i.e., the active set [33, Definition 12.1] is nonempty) at some minimizer . Then the maximum number of iterations needed to solve (P) up to accuracy , , and equality constraint tolerance is
where , , are constants.
Theorem 17 gives a bound on the number of PCBA iterations needed to solve a POP up to specified tolerances. In particular, (25) shows the rate of convergence is linear in step tolerance (2nd term), equality constraint tolerance (3rd term), and objective function (4th term), once the number of iterations is larger than a constant (1st term). We next prove a bound on the number of items in the list after sufficiently many iterations.
Suppose there are () global minimizers of the constrained problem (P), and none of them are on the boundary of the unit box . Let the critical cone (see [33, (12.53)]) be nonempty for () as in the proof of Theorem 17. Then after sufficiently many iterations of Algorithm 1, the number of Bernstein patches remaining (i.e., length of the list ) is bounded by a constant.
The constant proved in Theorem 18 scales exponentially with respect to the dimension of the problem.
Iv-C Memory Usage Implementation
We now state the amount of GPU memory required to store a single item , given the degree and dimension of the cost and constraint polynomials. Note that, for our implementation, all numbers in an item are represented using 4B of space, as either floats or unsigned integers.
For a multi-index , let , and let for . Let be the multi-degree of the cost . Let be a multi-degree large enough for all inequality constraints , and a multi-degree large enough for all equality constraints . By “large enough” we mean that, if is the multi-degree of any then (and similarly for ). Then, as per [44, §4.1], an item can be stored in memory as an array with the following number of entries:
where the first entries store the upper and lower bounds (in each dimension) of the subbox .
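Under that storage layout, the entry count can be tallied as sketched below. This is a hedged sketch: it assumes one Bernstein coefficient tensor for the cost and one per constraint, with every entry stored in 4 bytes, and it ignores any additional bookkeeping fields that (26) may include.

```python
from math import prod

def item_entries(l, delta, alpha, beta, n_ineq, n_eq):
    """Approximate number of array entries for one PCBA list item.

    l      : problem dimension
    delta  : multi-degree of the cost (length-l tuple)
    alpha  : multi-degree large enough for all inequality constraints
    beta   : multi-degree large enough for all equality constraints
    n_ineq : number of inequality constraints
    n_eq   : number of equality constraints
    """
    def coeffs(d):
        # size of the tensor of Bernstein coefficients for multi-degree d
        return prod(di + 1 for di in d)
    return (2 * l                 # upper/lower bounds of the subbox, per dimension
            + coeffs(delta)       # cost coefficients
            + n_ineq * coeffs(alpha)
            + n_eq * coeffs(beta))

def item_bytes(*args, **kwargs):
    # every entry is stored in 4 bytes, as a float or unsigned integer
    return 4 * item_entries(*args, **kwargs)
```

For instance, a 2-D problem with a degree-(2, 2) cost and one degree-(2, 2) inequality constraint needs 2*2 + 9 + 9 = 22 entries, i.e., 88 bytes per item.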
We have shown that PCBA will find a solution to (P), if one exists, in bounded time. We have also shown that the number of items in the list is bounded after a finite number of iterations, which implies that the memory usage is bounded; and we have provided a way to compute how much memory is required to store the list . Next, we benchmark PCBA on a variety of problems and compare it to two other solvers.
V PCBA Evaluation
In this section, we compare PCBA against a derivative-based solver (MATLAB’s fmincon ) and a convex relaxation method (Lasserre’s BSOS ). First, we test all three solvers on eight Benchmark Evaluation problems with dimension less than or equal to 4. Second, we compare all three solvers on several Increasing Number of Constraints problems, to assess how each solver scales on a variety of difficult objective functions .
All of the solvers/problems in this section are run on a computer with a 3.7GHz processor, 16 GB of RAM, and an Nvidia GTX 1080 Ti GPU. PCBA is implemented with MATLAB R2017b executables and CUDA 10.2. Our code is available on GitHub: https://github.com/ramvasudevan/GlobOptBernstein.
V-A Parameter Selection
To set up a fair comparison, we scale each problem to the box, where is the problem dimension. For PCBA, we use the stopping criteria in §III. To choose , we first compute the patch , then set
We set the maximum number of iterations to . We do not set , which determines the minimum number of iterations; is only needed to prove complexity bounds in §IV.
BSOS [20, §4] requires the user to specify the size of the semidefinite matrix associated with the convex relaxation of the POP. This is done by selecting a pair of parameters, and (note these are different from our use of and ). Though one has to increase and gradually to ensure convergence, larger values of and correspond to larger semidefinite programs, which can be difficult to solve. We chose and separately for the Benchmark Evaluation. We used for the Increasing Number of Constraints.
V-B Benchmark Evaluation
We tested PCBA, BSOS, and fmincon on eight benchmark POPs , listed as P1 through P8; the problems are reported in the Appendix. We ran each solver 50 times on each problem, and report the median solution error and time required to find a solution. Since fmincon may or may not converge to the global optimum depending on its initial guess, we used random initial guesses for each of the 50 attempts.
In terms of solution quality, PCBA always found the solution to within the desired optimality tolerance , except on P1, where PCBA stopped at the maximum allowed number of iterations (28); PCBA always used between 22 and 28 iterations. BSOS always found a lower bound to the solution, as expected. While fmincon converged to the global optimum at least once on every problem, it often converged to local minima, hence the large error values on some problems. In terms of solve time, fmincon solves the fastest (in 10–20 ms), PCBA is about twice as slow as fmincon, and BSOS is one to two orders of magnitude slower than PCBA.
For PCBA, the memory usage (computed with (26)) increases roughly by one order of magnitude with each additional dimension of the decision variable (Table I reports the peak GPU memory used by PCBA on each benchmark problem). Notice that PCBA never uses more than several MB of GPU memory, which is much less than the 11 GB available on the Nvidia GTX 1080 Ti GPU. Figure 3 shows the number of patches and the amount of GPU memory used on P4. We see that the memory usage peaks, then stays below a constant, as predicted by Theorem 18.
[Table I: for each benchmark problem, the median solution error and solve time of PCBA, BSOS, and fmincon, along with PCBA's iteration count and peak GPU memory usage.]
V-C Increasing Constraint Problems
Next, we tested each solver on problems with an increasing number of constraints, to assess each solver's suitability for use with RTD; we find in practice that a robot running RTD must handle between 30 and 300 constraints at each planning iteration.
We first choose an objective function with either many local minima or a nearly flat gradient near the global optimum (the global optimizer is known for each function). In particular, we tested on the ElAttar-Vidyasagar-Dutta, Powell, Wood, Dixon-Price (with ), Beale, Bukin02, and Deckkers-Aarts problems (see the Appendix and ).
For each objective function, we generate 200 random constraints in total, while ensuring at least one global optimizer stays feasible (if there are multiple global optimizers, we choose one at random that will be feasible for all constraints). To generate a single constraint , we first create a polynomial as a sum of the monomials of the decision variable with maximum degree 2, with random coefficients in the range . To ensure is feasible, we evaluate on , then subtract the resulting value from to produce (i.e., ).
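The constraint-generation procedure above can be sketched as follows. The polynomial representation (a list of exponent/coefficient pairs) and the default coefficient range are our assumptions, since the exact range was not recoverable from the text; the key step is the shift by the polynomial's value at the known optimizer, which guarantees the optimizer remains feasible.

```python
import itertools
import random

def evaluate(poly, x):
    """Evaluate a polynomial stored as [(exponent_tuple, coefficient), ...]."""
    total = 0.0
    for exps, c in poly:
        term = c
        for xi, e in zip(x, exps):
            term *= xi ** e
        total += term
    return total

def random_feasible_constraint(x_star, max_degree=2, coeff_range=(-1.0, 1.0), rng=None):
    """Generate one random constraint g(x) <= 0 with g(x_star) = 0, so the
    known global optimizer x_star stays feasible. coeff_range is an assumed
    placeholder for the paper's (unrecovered) coefficient range."""
    rng = rng or random.Random()
    n = len(x_star)
    # all monomials of the decision variable with total degree 1..max_degree
    monomials = [a for a in itertools.product(range(max_degree + 1), repeat=n)
                 if 1 <= sum(a) <= max_degree]
    h = [(a, rng.uniform(*coeff_range)) for a in monomials]
    # subtract h(x_star) so that g(x_star) = 0, i.e., g = h - h(x_star)
    return h + [((0,) * n, -evaluate(h, x_star))]
```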
We run PCBA, BSOS, and fmincon on each objective function for 20 trials, with 10 random constraints in the first trial, and adding 10 constraints in each trial. As before, we run fmincon 50 times for each trial with random initial guesses, since its performance is dependent upon the initial guess.
In terms of solution quality, all three algorithms often converge to the global optimum when the number of constraints is low, but fmincon converges to suboptimal solutions more frequently as the number of constraints increases. PCBA and BSOS find the optimal solution whenever they return one; however, only PCBA is able to find the global optimum regardless of the number of constraints, unlike BSOS (which runs out of memory as the number of constraints grows) or fmincon (which converges to local minima).
All three solvers require an increasing amount of solve time as the number of constraints increases. PCBA is comparable in speed to fmincon on 2-D problems, but is typically slower on higher-dimensional problems. Regardless of the number of constraints, BSOS takes three to four orders of magnitude more time to solve than PCBA or fmincon.
More details on PCBA are presented in Table II. PCBA’s time to find a solution increases roughly by an order of magnitude when the decision variable dimension increases by 1; however, PCBA solves all of the increasing constraint POPs within 0.5 s. The memory usage increases by 1–3 orders of magnitude with each additional dimension; however, PCBA never uses more than 650 MB of GPU memory, well below the 11 GB available. Figure 5 shows PCBA’s GPU memory usage versus the number of constraints for the Powell objective function.
As expected from the complexity bounds in §IV-B, the results in this section indicate that PCBA is practical for quickly solving 2-D POPs with hundreds of constraints. We leverage this next by applying PCBA to solve RTD’s POP (4) for real-time receding-horizon trajectory optimization.
| problem | dim | max time [s] | max items | max memory |
| D-P 2-D | 2 | 0.0259 | 90 | 433 kB |
| D-P 3-D | 3 | 0.0675 | 356 | 3.47 MB |
| D-P 4-D | 4 | 0.402 | 4994 | 193 MB |
VI Hardware Demonstrations
Recall from §II-C that RTD enables real-time, provably collision-free trajectory planning via solving a POP every planning iteration. RTD is provably collision-free regardless of the POP solver used, meaning that we are able to test PCBA safely on hardware; when PCBA is applied to RTD, we call the resulting trajectory planning algorithm RTD*. See Figure 1 and the video (click for link).
In this section, we apply RTD* to a Segway robot navigating a hallway with static obstacles. Recall that RTD always produces dynamically feasible trajectory plans . As proven in Corollary 13 and demonstrated in §V, PCBA always finds optimal solutions. This section shows that PCBA/RTD* improves the liveness of a robot by successfully navigating a variety of scenarios, and outperforming fmincon/RTD.
We ran two demonstrations in a m hallway. In the first demonstration, we filled the hallway with random static obstacles and ran RTD*. In the second demonstration, we constructed two difficult scenarios and ran both PCBA/RTD* and fmincon/RTD on each.
We use a Segway differential-drive robot with a planar Hokuyo UTM-30LX LIDAR for mapping and obstacle detection (see Figure 1). Mapping, localization, and trajectory optimization run onboard, on a 4.0 GHz laptop with an Nvidia GeForce GTX 1080 GPU.
As in §II-C (also see Figure 1), the robot plans by optimizing over a set of parameterized trajectories as in [18, (16)]. The parameters are speed m/s and yaw rate rad/s, so . The trajectories are between 1 and 2 s long, depending on the robot’s initial speed (since each trajectory includes a braking maneuver). The robot must find a new plan (i.e., PCBA must solve a new POP) every 0.5 s, or else it begins executing the braking maneuver associated with its previously-computed plan [18, Remark 70]. In other words, within each planning iteration, we require PCBA to either return a feasible solution or report that the problem is infeasible. PCBA is given a time limit of 0.4 s to find a solution, because the robot requires 0.1 s for other onboard processes.
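The timing logic of this receding-horizon loop can be sketched as follows; `solve_pop` and `brake_plan` are hypothetical stand-ins for the onboard PCBA wrapper and the braking maneuver built into the previous plan, not functions from the paper's implementation.

```python
import time

PLAN_PERIOD = 0.5    # s: a new plan is due every 0.5 s
SOLVER_BUDGET = 0.4  # s: PCBA's time limit (0.1 s reserved for other onboard processes)

def planning_iteration(solve_pop, brake_plan, current_plan):
    """One receding-horizon planning iteration.

    solve_pop(timeout) is assumed to return a new dynamically-feasible plan,
    or None if no feasible solution is found within `timeout` seconds.
    """
    start = time.monotonic()
    new_plan = solve_pop(timeout=SOLVER_BUDGET)
    elapsed = time.monotonic() - start
    if new_plan is not None and elapsed <= PLAN_PERIOD:
        return new_plan  # commit to the new plan: found in time
    # otherwise, execute the braking maneuver of the previously-computed plan
    return brake_plan(current_plan)
```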
At each planning iteration, obstacles are converted into constraints as in [18, Sections 6 and 7]; thi