A bicriteria path planning algorithm for robotics applications.
Zachary Clawson^{2}, Xuchu (Dennis) Ding^{3}, Brendan Englot^{4}, Thomas A. Frewen^{3}, William M. Sisson^{3}, Alexander Vladimirsky^{2}

^{2}Center for Applied Mathematics and Department of Mathematics, Cornell University, Ithaca, NY 14853. Both authors’ work was supported in part by the National Science Foundation Grant DMS-1016150. (Emails: zc227@cornell.edu and vlad@math.cornell.edu.)
^{3}United Technologies Research Center, Hartford, CT 06118. These authors’ work was supported by United Technologies Research Center under the Autonomy Initiative. (Emails: xuchu.ding@gmail.com, frewenta@utrc.utc.com, and sissonw2@utrc.utc.com.)
^{4}Schaefer School of Engineering & Science, Stevens Institute of Technology, Hoboken, NJ 07030. (Email: benglot@stevens.edu.)
Abstract: Realistic path planning applications often require optimizing with respect to several criteria simultaneously. Here we introduce an efficient algorithm for bicriteria path planning on graphs. Our approach is based on augmenting the state space to keep track of the “budget” remaining to satisfy the constraints on secondary cost. The resulting augmented graph is acyclic, and the primary cost can then be minimized by a simple upward sweep through budget levels. The efficiency and accuracy of our algorithm are tested on Probabilistic Roadmap graphs to minimize the distance of travel subject to a constraint on the overall threat exposure of the robot. We also present results from field experiments illustrating the use of this approach on realistic robotic systems.
1. Introduction
The shortest path problem on graphs is among the most studied problems of computer science, with numerous applications ranging from robotics to image registration. Given the node-transition costs, the goal is to find the path minimizing the cumulative cost from a starting position up to a goal state. When there is a single (non-negative) cost, this can be accomplished efficiently by Dijkstra’s algorithm [9] and a variety of related label-setting methods.
However, in many realistic applications paths must be evaluated and optimized according to several criteria simultaneously (time, cumulative risk, fuel efficiency, etc.). Multiple criteria lead to a natural generalization of path optimality: a path is said to be (weakly) Pareto-optimal if it cannot be improved according to all criteria simultaneously. For evaluation purposes, each Pareto-optimal path can be represented by a single point, whose coordinates are cumulative costs with respect to each criterion. A collection of such points is called the Pareto Front (PF), which is often used by decision makers in a posteriori evaluation of optimal trade-offs. A related practical problem (requiring only a portion of PF) is to optimize with respect to one (“primary”) criterion only, but with upper bounds enforced based on the remaining (“secondary”) criteria.
It is natural to approach this problem by reducing it to single-criterion optimization: a new transition cost is defined as a weighted average of transition penalties specified by all criteria, and Dijkstra’s method is then used to recover the “shortest” path based on this new averaged criterion. It is easy to show that the resulting path is always Pareto-optimal for every choice of weights used in the averaging. Unfortunately, this scalarization approach has a significant drawback [8]: it finds the paths corresponding to convex parts of PF only, while the non-convexity of PF is quite important in many robotics applications. The positive and negative algorithmic features of scalarization are further discussed in §4.4.
The same problem in a stochastic setting, in the context of Markov Decision Processes (MDPs), is called the stochastic shortest path problem; it can be seen as a generalization of the deterministic version considered in this paper. In this case multiple costs can be considered, so that a primary cost is optimized while an arbitrary number of additional costs serve as constraints. Such problems can generally be solved as Constrained MDPs (CMDPs) [3]; recent work extends this approach to “chance-constrained” CMDPs [27], which work well for large state spaces [12, 10] in robotics applications. However, by treating the additional costs as constraints, the PF cannot be extracted.
Numerous generalizations of label-setting methods have also been developed to recover all Pareto-optimal paths [15, 34, 37, 25, 26, 31, 23, 24]. However, all of these algorithms were designed for general graphs, and their efficiency advantages are less obvious for highly refined, geometrically embedded graphs, where Pareto-optimal paths are particularly plentiful. (This is precisely the case for meshes or random-sampling-based graphs intended to approximate motion planning in continuous domains.) In this paper we describe a simple method for bicriteria path planning on such graphs, with an efficient approach for approximating the entire PF. The key idea is based on keeping track of the “budget” remaining to satisfy the constraints based on the secondary criterion. This approach was previously used in continuous optimal control by Kumar and Vladimirsky [21] to approximate the discontinuous value function on the augmented state space. More recently, this technique was extended to hybrid systems modeling constraints on reset-renewable resources [7]. In the discrete setting, the same approach was employed as one of the modules in hierarchical multi-objective planners [11]; our current paper extends that earlier conference publication.
The key components of our method are discussed in §2, followed by implementation notes in §3. In §4 we provide the results of numerical tests on Probabilistic RoadMap (PRM) graphs in two-dimensional domains with complex geometry. Our algorithms were also implemented on a realistic heterogeneous robotic system and tested in the field experiments described in §5. Finally, open problems and directions for future work are covered in §6.
2. The augmented state space approach
Consider a directed graph $G = (V, E)$ on the set of nodes $V$ with edges $E$ between nodes. We will choose $s \in V$ to be a special “source” node (the robot’s current position), and our goal will be to find optimal paths starting from $s$. For convenience, we will also use the notation $N(x)$ for the set of all nodes from which there exists a direct transition to $x$. We will assume that the cost $c(y, x)$ of each such transition is positive. We begin with a quick review of the standard methods for the classical shortest path problems.
2.1. The single criterion case.
Using $\mathcal{C}(\pi)$ to denote the cumulative cost (i.e., the sum of transition costs $c$’s) along a path $\pi$, the usual goal is to minimize $\mathcal{C}$ over $\Pi_x$, the set of all paths from $s$ to $x$. The standard dynamic programming approach is to introduce the value function $u(x) = \min_{\pi \in \Pi_x} \mathcal{C}(\pi)$, describing the total cost along any such optimal path. Bellman’s optimality principle yields the usual coupled system of equations:

(1)   $u(x) \;=\; \min_{y \in N(x)} \left\{ u(y) + c(y, x) \right\}, \qquad \forall x \in V \backslash \{s\},$

with $u(s) = 0$. Throughout the paper we will take the minimum over an empty set to be $+\infty$. The vector $U = \left( u(x_1), \ldots, u(x_M) \right)$, with $M = |V|$, can be formally viewed as a fixed point of an operator $T$ defined componentwise by (1). A naive approach to solving this system is to use value iteration: starting with an overestimating initial guess $U^0$, we could repeatedly apply $T$, obtaining the correct solution vector in at most $M$ iterations. This results in $O(M^2)$ computational cost as long as the in-degrees of all nodes are bounded: $|N(x)| \leq \kappa$ for some constant $\kappa$. The process can be further accelerated using the Gauss-Seidel relaxation, but then the number of iterations will strongly depend on the ordering of the nodes within each iteration. For acyclic graphs, a standard topological ordering of nodes can be used to essentially decouple the system of equations. In this case, the Gauss-Seidel relaxation converges after a single iteration, yielding $O(M)$ computational complexity. For a general directed graph, such a causal ordering of nodes is a priori unknown, but can be recovered at runtime by exploiting the monotonicity of values along every optimal path. This is the essential idea behind Dijkstra’s classical algorithm [9], which relies on heap-sort data structures and solves this system in $O(M \log M)$ operations. Alternatively, a class of label-correcting algorithms (e.g., [4, 6, 5]) attempts to approximate the same causal ordering while avoiding the use of heap-sort data structures. These algorithms have a higher worst-case complexity, but in practice are known to be at least as efficient as Dijkstra’s on many types of graphs [4].
When we are interested in optimal directions from $s$ to a specific node $t$ only, Dijkstra’s method can be terminated as soon as $u(t)$ is computed. (Note: Dijkstra’s algorithm typically expands computations outward from $s$.) An A* method [14] can be viewed as a speed-up technique, which further restricts the computational domain (to a neighborhood of the optimal path) using heuristic underestimates of the cost-to-go function. More recent extensions include “Anytime A*” (to ensure early availability of good suboptimal paths) [13, 22, 38] and a number of algorithms for dynamic environments (where some of the transition costs might change before we reach $t$) [32, 33, 19].
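For concreteness, the system (1) and its Dijkstra solution can be sketched as follows. This is a minimal Python sketch under our own graph representation (the names `pred` and `dijkstra` are not from the paper): each node maps to its list of predecessors, mirroring the sets $N(x)$ used in the text.

```python
import heapq

def dijkstra(pred, source):
    """Solve u(x) = min over y in N(x) of { u(y) + c(y, x) }, with u(source) = 0.

    `pred[x]` lists pairs (y, c): a direct transition y -> x of cost c > 0,
    mirroring the predecessor sets N(x) used in the text.
    """
    # Invert the predecessor lists once, since Dijkstra expands along
    # outgoing edges from the source.
    succ = {x: [] for x in pred}
    for x, incoming in pred.items():
        for y, c in incoming:
            succ[y].append((x, c))
    u = {x: float('inf') for x in pred}
    u[source] = 0.0
    heap = [(0.0, source)]
    done = set()
    while heap:
        ux, x = heapq.heappop(heap)
        if x in done:
            continue
        done.add(x)
        for y, c in succ[x]:
            if ux + c < u[y]:
                u[y] = ux + c               # relax the edge x -> y
                heapq.heappush(heap, (u[y], y))
    return u
```

The heap-based node extraction implements exactly the causal ordering discussed above: each node is finalized in order of increasing value $u$.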
2.2. Bicriteria optimal path planning: different value functions and their DP equations.
We will now assume that an alternative criterion for evaluating path quality is defined by specifying “secondary costs” $\hat{c}(y, x) > 0$ for all the transitions in this graph. Similarly, we define $\hat{\mathcal{C}}(\pi)$ to be the cumulative secondary cost (based on $\hat{c}$’s) along $\pi$. In this setting, it is useful to consider several different value functions. The secondary (unconstrained) value function is $\hat{u}(x) = \min_{\pi \in \Pi_x} \hat{\mathcal{C}}(\pi)$. It satisfies a similar system of equations: $\hat{u}(x) = \min_{y \in N(x)} \left\{ \hat{u}(y) + \hat{c}(y, x) \right\}$ and $\hat{u}(s) = 0$.
A natural question to ask is how much cost must be incurred with respect to one criterion along a path selected to optimize the other. We define $\hat{v}(x)$ as $\hat{\mathcal{C}}$ minimized over the set of all primary-optimal paths $\Pi^*_x = \left\{ \pi \in \Pi_x \mid \mathcal{C}(\pi) = u(x) \right\}$. Thus $\hat{v}(x) \geq \hat{u}(x)$, and if we define $N^*(x) = \left\{ y \in N(x) \mid u(y) + c(y, x) = u(x) \right\}$, then $\hat{v}(x) = \min_{y \in N^*(x)} \left\{ \hat{v}(y) + \hat{c}(y, x) \right\}$.

Similarly, we define $v(x)$ as $\mathcal{C}$ minimized over the set of all secondary-optimal paths $\hat{\Pi}^*_x = \left\{ \pi \in \Pi_x \mid \hat{\mathcal{C}}(\pi) = \hat{u}(x) \right\}$. Thus $v(x) \geq u(x)$, and if we define $\hat{N}^*(x) = \left\{ y \in N(x) \mid \hat{u}(y) + \hat{c}(y, x) = \hat{u}(x) \right\}$, then $v(x) = \min_{y \in \hat{N}^*(x)} \left\{ v(y) + c(y, x) \right\}$.
All $u$’s can be efficiently obtained by the standard Dijkstra’s method, with $\hat{v}$’s computed in the process as well. The same is true for $\hat{u}$’s and $v$’s.
Returning to our main goal, we now define the primary-constrained-by-secondary value function $w(x, b)$ as $\mathcal{C}$ (the accumulated primary cost) minimized over $\left\{ \pi \in \Pi_x \mid \hat{\mathcal{C}}(\pi) \leq b \right\}$ (the paths whose cumulative secondary cost does not exceed $b$). We will say that $b$ is the remaining budget to satisfy the secondary cost constraint. This definition and the positivity of secondary costs yield several useful properties of $w$:

• $w(x, b)$ is a monotone non-increasing function of $b$;

• $w(x, b) = +\infty$ whenever $b < \hat{u}(x)$;

• $w(x, \hat{u}(x)) = v(x)$;

• $w(x, b) = u(x)$ for all $b \geq \hat{v}(x)$.
We will use the notation

(2)   $N_b(x) \;=\; \left\{ y \in N(x) \mid \hat{c}(y, x) \leq b \right\}$

to define the set of feasible transitions on budget level $b$. The dynamic programming equations for $w$ are then $w(s, b) = 0$ for all $b \geq 0$ and

(3)   $w(x, b) \;=\; \min_{y \in N_b(x)} \left\{ w\left( y, \, b - \hat{c}(y, x) \right) + c(y, x) \right\}, \qquad \forall x \in V \backslash \{s\}.$
For notational simplicity, it will sometimes be useful to have $w$ defined for non-positive budgets; in such cases we will assume $w(x, b) = +\infty$ whenever $b < 0$ and $x \neq s$. The Pareto Front for each node $x$ is the set of pairs $(b, w(x, b))$ where a new reduction in the primary cost is achieved:

(4)   $PF(x) \;=\; \left\{ \left( b, w(x, b) \right) \;\mid\; w(x, b) < w(x, b') \;\; \forall b' < b \right\}.$

For each fixed $x$, the value function $w(x, \cdot)$ is piecewise-constant on intervals of $b$, and the entries in $PF(x)$ provide the boundaries/values for these intervals.
Given the above properties of $w$, it is only necessary to solve the system (3) for $b \in [0, B]$, where $B$ is the maximal budget level. E.g., for many applications it might be known a priori that a secondary cumulative cost above some $B$ is unacceptable. On the other hand, if the goal is to recover the entire $PF(t)$ for a specific target node $t$, we can choose $B = \hat{v}(t)$, or even $B = \max_{x \in V} \hat{v}(x)$ to accomplish this for all $x \in V$.
2.3. The basic upwardsweep algorithm.
For simplicity, we will first assume that all secondary costs are positive and quantized: each $\hat{c}(y, x)$ is a positive integer multiple of some $\Delta > 0$. This also implies that $w(x, b)$ is only defined for $b \in \{0, \Delta, 2\Delta, \ldots, J\Delta = B\}$. A simple implementation (Algorithm 1, described below) takes advantage of this structure by storing $w$ as an array of values $w(x, j\Delta)$ for each node $x$ and $j = 0, \ldots, J$. For $x = s$ we will need only one value, $w(s, \cdot) = 0$, for all budget levels. For each remaining node $x$ with budget $b = j\Delta$ we can compute the primary-constrained-by-secondary value function using formula (3).

The algorithm takes advantage of the explicit causality present in the system: since every secondary cost is at least $\Delta$, each $w(x, j\Delta)$ depends only on values at strictly lower budget levels. This leads to at most $(M-1)(J+1)$ calls to solve equation (3) and results in an appealing $O(MJ)$ complexity, linear in the number of nodes and the number of discrete budget levels.
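A hypothetical Python rendering of this upward sweep is given below (Algorithm 1 itself is not reproduced here; the graph representation, function names, and integer rounding are our own, and secondary costs are assumed quantized as above):

```python
INF = float('inf')

def upward_sweep(pred, source, J, delta):
    """Solve the budget-augmented system (3) by a single sweep through
    budget levels b = 0, delta, ..., J*delta.

    `pred[x]` lists triples (y, c, c_hat): a transition y -> x with primary
    cost c and secondary cost c_hat, the latter assumed a positive integer
    multiple of delta.  Returns w with w[x][j] approximating w(x, j*delta).
    """
    w = {x: [INF] * (J + 1) for x in pred}
    w[source] = [0.0] * (J + 1)
    for j in range(1, J + 1):                   # budget level b = j*delta
        for x in pred:
            if x == source:
                continue
            best = INF
            for (y, c, c_hat) in pred[x]:
                k = int(round(c_hat / delta))   # budget levels spent on this edge
                # Positive secondary costs make the system causal: level j
                # depends only on strictly lower levels j - k.
                if 1 <= k <= j and w[y][j - k] + c < best:
                    best = w[y][j - k] + c
            w[x][j] = best
    return w
```

Because each level only reads values from lower levels, no iteration or heap is needed; the double loop visits each (node, level) pair exactly once, matching the $O(MJ)$ complexity claimed above.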
2.4. Quantizing secondary costs and approximating $w$
In the general case, the secondary costs need not be quantized, and even if they are, the resulting number of budget levels might be prohibitively high for online path-planning. But a slight generalization of Algorithm 1 is still applicable to produce a conservative approximation of $w$ and $PF$. This approach relies on “quantization by overestimation” of secondary edge weights with a chosen $\Delta > 0$:

(5)   $\hat{c}_{\Delta}(y, x) \;=\; \Delta \left\lceil \hat{c}(y, x) / \Delta \right\rceil.$

Similarly, we can define $\hat{\mathcal{C}}_{\Delta}(\pi)$ to be the cumulative quantized secondary cost along $\pi$.
The definition of $N_b$ in (2) is then naturally modified, with the set of feasible transitions on level $b$ now defined through the quantized costs:

(6)   $N_b(x) \;=\; \left\{ y \in N(x) \mid \hat{c}_{\Delta}(y, x) \leq b \right\}.$

Modulo these changes, the new value function $w_{\Delta}$ is also defined by (3) for all $x \in V \backslash \{s\}$.
It can be similarly computed by Algorithm 1 if the condition on line 7 is replaced by its quantized counterpart (i.e., stated in terms of $\hat{c}_{\Delta}$ instead of $\hat{c}$).
The above modifications ensure that $\hat{\mathcal{C}}(\pi) \leq \hat{\mathcal{C}}_{\Delta}(\pi)$ for every path $\pi$, and $w(x, b) \leq w_{\Delta}(x, b)$ for all $(x, b)$. Moreover, if $w_{\Delta}(x, b)$ is finite, there always exists some “optimal path” $\pi$ such that $\mathcal{C}(\pi) = w_{\Delta}(x, b)$ and $\hat{\mathcal{C}}_{\Delta}(\pi) \leq b$. Of course, there is no guarantee that such a path is truly Pareto-optimal with respect to the real (non-quantized) values, but the obtained $PF_{\Delta}$ converges to the non-quantized Pareto Front as $\Delta \to 0$. In fact, it is not hard to obtain a bound on the quantization error by using an upper bound on the number of transitions in Pareto-optimal paths. Let $\hat{c}_{\min}$ be the smallest secondary transition cost in the graph. If $r$ is the number of transitions on some Pareto-optimal path $\pi$ from $s$ to $x$, then $r \leq \hat{\mathcal{C}}(\pi) / \hat{c}_{\min} \leq B / \hat{c}_{\min}$. Since Algorithm 1 overestimates the secondary cost of each transition by at most $\Delta$, we know that $\hat{\mathcal{C}}_{\Delta}(\pi) - \hat{\mathcal{C}}(\pi) \leq r \Delta$.
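The per-edge overestimation property behind this bound can be checked with a small sketch (illustrative edge weights and function names of our own choosing):

```python
import math

def quantize(c_hat, delta):
    """Quantization by overestimation, formula (5): round each secondary
    edge cost up to the nearest multiple of delta."""
    return delta * math.ceil(c_hat / delta)

def secondary_costs(edges, delta):
    """Return the (true, quantized) cumulative secondary cost of a path,
    given the list of its secondary edge weights."""
    true = sum(edges)
    quantized = sum(quantize(c, delta) for c in edges)
    return true, quantized
```

For any path with $r$ transitions, the quantized cumulative cost exceeds the true one by at most $r\Delta$, in agreement with the bound above.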
In the following sections we show that much more accurate “budget slackness” estimates can also be produced at runtime.
3. Implementation notes
A discrete multi-objective optimization problem is defined by the choice of domain $\Omega$, the graph $G$ encoding allowable (collision-free) transitions in $\Omega$, and the primary and secondary transition cost functions $c$ and $\hat{c}$ defined on the edges of that graph. Our application area is path planning for a ground robot traveling toward a goal waypoint in the presence of enemy threat(s). The Pareto Front consists of pairs of threat-exposure/distance values (each pair corresponding to a different path), where any decrease in threat exposure results in an increase in distance traveled and vice versa.
While recovering the Pareto Front is an important task, deciding how to use it in a realistic path-planning application is equally important. If $\hat{c}$ is based on the threat exposure and the maximal allowable exposure $B$ is specified in advance, the entire $PF$ is not needed. Instead, the algorithm can be used in a fully automatic regime, selecting the path corresponding to $w(t, B)$. Alternatively, choosing $B = \hat{v}(t)$, we can recover the entire Pareto Front, and a human expert can then analyze it to select the best trade-off between $\mathcal{C}$ and $\hat{\mathcal{C}}$. This is the setting used in our field experiments described in Section 5.
3.1. Discretization parameter $\Delta$.
After the designation of $G$, $c$, and $\hat{c}$, the only remaining parameter needed for Algorithm 1 is $\Delta$. For a fixed graph $G$, the choice of $\Delta$ strongly influences the performance:

• The selection of $J$ provides direct control over the runtime of the algorithm on $G$. Because our robot’s sampling-based planning process produces graphs of a predictable size, a fixed number of secondary budget levels ensures a predictable amount of computation time. Since the complexity is linear in $J$, the user can choose the largest affordable $J$ to define

(7)   $\Delta \;=\; B / J,$

where $B$ is based on the non-quantized secondary weights $\hat{c}$.

• The size of $J$ also controls the coarseness of the approximate Pareto Front. After all, $|PF_{\Delta}(x)| \leq J + 1$, and the approximation is guaranteed to be very coarse if the number of Pareto-optimal paths in $PF(x)$ is significantly higher.

• Even if $J$ is large enough to ensure that the number of Pareto-optimal paths is captured correctly, there may still be an error in approximating $PF$ due to the quantization of secondary costs. For each pair $(b, w_{\Delta}(x, b))$ there is a corresponding path $\pi$, whose budget slackness can be defined as

(8)   $\sigma(x, b) \;=\; b - \hat{\mathcal{C}}(\pi),$

which can be computed as a post-processing technique. Alternatively, if the last transition in this path is from $y$ to $x$, this can be used to recursively define the slackness measurement $\sigma(x, b) = \sigma\left( y, \, b - \hat{c}_{\Delta}(y, x) \right) + \left( \hat{c}_{\Delta}(y, x) - \hat{c}(y, x) \right)$ and compute it simultaneously with $w_{\Delta}$ as an error estimate. This approach was introduced in [28].
We point out two possible algorithmic improvements not used in our current implementation:

• As described, the memory footprint of our algorithm is proportional to $MJ$, since all values are stored on every budget level. In principle, for each node $x$ we need to store only the finite values larger than $u(x)$; their number is $|PF_{\Delta}(x)|$. This would entail obvious modifications of the main loop of Algorithm 1, since we would only need to update the “still constrained” nodes on each level $b$: $\left\{ x \in V \mid b < \hat{v}(x) \right\}$. This set can be stored as a linked list and efficiently updated as $b$ increases, especially if all nodes are presorted based on $\hat{v}$’s.

• If the available computational time only permits a coarse budget quantization (i.e., $\Delta$ is large), an initial application of Algorithm 1 may fail to recover a suitable number of paths from a set of evenly-spaced budget levels. Large differences in secondary cost often exist between paths that lie in different homotopy classes, and individual homotopy classes may contain a multitude of paths with small differences in secondary cost. To remedy this, $\Delta$ might be refined adaptively for a specific range of budget values. The slackness measurements can be employed to determine when such a refinement is necessary.
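The budget slackness measurements of formula (8) can be sketched as follows (our own names; a recovered path is represented simply by the list of its true secondary edge costs):

```python
import math

def slackness(budget, c_hat_edges):
    """Post-processed budget slackness (8): the gap between the budget level
    of a Pareto point and the true secondary cost of the recovered path."""
    return budget - sum(c_hat_edges)

def recursive_slackness(c_hat_edges, delta):
    """Equivalent recursive accumulation of the per-edge quantization
    overestimates, suitable for computing alongside the upward sweep."""
    s = 0.0
    for c in c_hat_edges:
        s += delta * math.ceil(c / delta) - c   # overestimate on this edge
    return s
```

When the budget level equals the quantized cumulative cost of the path, the two computations agree, which is what makes the recursive version usable as a running error estimate.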
3.2. Nonmonotone convergence: a case study.
In real-world path planning examples, the true secondary transition costs are typically not quantized, and equation (5) introduces a small amount of error dependent on $\Delta$. These errors vanish as $\Delta \to 0$, but somewhat counterintuitively the convergence is generally not monotone.
This issue can be illustrated even in single-criterion path-planning problems. Consider the simple graph in Fig. (a), where there are two paths to travel from the source to the target. If the edge weights for this graph were quantized with formula (5), an upper bound for the quantized cost along a given path $\pi$ with $r$ transitions would be $\mathcal{C}_{\Delta}(\pi) \leq \mathcal{C}(\pi) + r\Delta$.

Figs. (b) & (c) each show the quantized cost along each path along with this upper bound. This example illustrates several interesting features. First, the convergence of the quantized edge weights is non-monotone, as expected. Further, as $\Delta \to 0$, the minimum-cost path corresponding to the quantized edge weights switches between the two feasible paths in the graph. Fig. 5 shows the graph with quantized weights for specific values of $\Delta$, where the path shown in bold is the optimal quantized path. Finally, as $\Delta$ decreases, a threshold is passed below which the quantized cost along the top (truly optimal) path is always less than the quantized cost along the bottom (suboptimal) path.
Returning to the bicriteria planning, the monotone decrease of errors due to quantization can be ensured if every path satisfying $\hat{\mathcal{C}}_{\Delta}(\pi) \leq b$ for a fixed value of $\Delta$ still satisfies the same inequality as $\Delta$ decreases. If we define a sequence $\Delta_k$, the simplest sufficient condition is to use $\Delta_{k+1} = \Delta_k / 2$. This is the approach employed in all experiments of Section 4. For the rest of the paper we suppress the subscripts on the $\Delta$’s for the sake of notational convenience; the slackness measurements corresponding to each value of $\Delta$ are shown for diagnostic purposes in all experiments of Section 4.
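The non-monotonicity and path switching can be reproduced numerically with a tiny example (the two paths and their edge weights below are illustrative, not taken from the paper's figure):

```python
import math

def quantized_cost(edges, delta):
    """Cumulative cost of a path after quantization-by-overestimation (5)."""
    return sum(delta * math.ceil(c / delta) for c in edges)

# Two hypothetical paths between the same pair of nodes.
top = [1.12, 1.12]     # true cost 2.24 (the truly optimal path)
bottom = [2.3]         # true cost 2.3

def err(path, delta):
    """Quantization error of the cumulative cost of a path."""
    return quantized_cost(path, delta) - sum(path)
```

Here `err(top, 0.3) < err(top, 0.25)` even though $0.25 < 0.3$, so the error is not monotone in $\Delta$; and the quantized-optimal path switches from `bottom` (for $\Delta = 1$) to `top` (for $\Delta = 0.01$), as in the case study above.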
4. Benchmarking on synthetic data
We aim to gain further understanding of the proposed algorithm in robotic path planning applications by focusing most of our attention on a simple scenario. Here, in contrast to Section 5, our goal is to study the convergence properties of the algorithm by refining two accuracy parameters: the number of PRM nodes in the graph and the number of budget levels (determined by $\Delta$). Computations are done offline on a laptop rather than on a robotic system, allowing for more computational time than in a real scenario. We assume that the occupancy map and threats are known a priori, so no effort is expended on learning this information. All tests are conducted on a laptop with an Intel i7-4712HQ CPU (2.3GHz quad-core) and 16GB of memory.
4.1. Case Study Setup
The results presented in this paper require a graph that accurately represents the motion of a robot in an environment with obstacles, where nodes correspond to collision-free configurations and edges correspond to feasible transitions between these configurations. We are particularly interested in graphs that contain a large number of distinct paths between any two nodes. There are a number of algorithms capable of generating such graphs, such as the Rapidly-exploring Random Graphs (RRG) [17] or Probabilistic Roadmap (PRM) [18] algorithms. In this work we chose the PRM algorithm, as implemented in the Open Motion Planning Library (OMPL) [36], as the baseline algorithm to generate the roadmap/graph. In essence, the PRM algorithm generates a connected graph representing the robot’s state (e.g., position, orientation, etc.) in the environment. The graph is generated by randomly sampling points in the state space of the robot (called the configuration space in robotics) and attempting to connect each sampled configuration to existing vertices of the graph.
In the case studies examined, we construct a roadmap as a directed graph $G = (V, E)$, where $V$ is the set of nodes and $E$ is the set of transitions (edges). The nodes of $G$ represent the positions of the robot in 2D, i.e., points in $\mathbb{R}^2$, and an edge $(x, y) \in E$ represents a collision-free line segment from node $x$ to node $y$ in the PRM graph. Each edge of $G$ is assigned a primary and a secondary transition cost by two cost functions $c, \hat{c} : E \to \mathbb{R}_{>0}$, respectively.
The configuration space is sampled by the OMPL planning library [36] with the default uniform (unbiased) sampling and two slight modifications: we increased the number of points along each edge that are collision-checked against obstacles, and we removed self-transitions between nodes. For real-world examples the roadmap is typically generated by running the PRM algorithm for a fixed amount of time; for testing purposes, however, we specify the number of nodes desired in $V$.
Two test cases are presented:

• Sections 4.1–4.6 treat a simple synthetic environment with a single threat, used to study the convergence properties of the algorithm.

• Section 4.7 presents a more complicated scenario in a synthetic environment to illustrate the much broader scope of the algorithm.

Both tests assume the same physical dimensions for the domain, so each pixel of the occupancy map corresponds to a fixed physical size. Each edge is generated by connecting vertices in the graph that are ‘close together’ via straight-line segments and collision-checking the transition (edge) with a sphere of radius 5m, which represents the physical size of the robot.
The first occupancy map considered is shown in Fig. (a); black represents occupied areas (walls), whereas white represents the unoccupied areas (configuration space). The concentric circles represent the contours of a threat level function (to be defined). A PRM-generated roadmap over this occupancy map is shown in Fig. (b).
4.2. Cost Functions
We consider two cost functions $c$ and $\hat{c}$ for the multi-objective planning problem. The cost function $c$ represents the Euclidean distance between two adjacent nodes in $G$, i.e., $c(x, y) = \| x - y \|_2$ for $(x, y) \in E$.
The cost function $\hat{c}$ denotes the threat exposure along an edge in the graph. To define $\hat{c}$ we assume that there are $n$ threats in the environment, each indicated by its position $p_i$, severity $\lambda_i$, minimum radius $r_i$, and visibility radius $R_i$, for $i = 1, \ldots, n$. For each threat, these parameters define a threat exposure function $T_i(x)$

(9) 

for $x \in \Omega$. Fig. (a) shows the contours of $T_i$ for a single threat. For simplicity we will always assume an unlimited visibility radius in all examples of this subsection.
The cumulative threat exposure function is then defined as the integral of the instantaneous threat exposure along the transition (edge), summed over all threats:

(10)   $\hat{c}(x, y) \;=\; \sum_{i=1}^{n} \int_{0}^{\tau} T_i\left( \gamma(t) \right) \, dt,$

where $\gamma(t)$ parameterizes the segment from $x$ to $y$, assuming the robot moves at a constant speed (so that $\tau$ is the traversal time). Note that in (10) we penalize both the proximity to each threat and the duration of exposure. An example threat placement in the environment is shown in Fig. (a), along with the associated contours of the threat exposure function.
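The integral in (10) can be approximated numerically; the sketch below (our own names) assumes an arbitrary callable threat field, since the specific form of $T_i$ from (9) is not reproduced here:

```python
import math

def edge_threat_cost(x, y, threats, speed=1.0, samples=100):
    """Approximate (10): integrate the instantaneous threat exposure along
    the straight segment from x to y, summed over all threats.

    `threats` is a list of callables T(p) returning the instantaneous
    exposure at a point p = (px, py); the robot traverses the edge at a
    constant `speed`, so both proximity and duration of exposure are
    penalized.
    """
    length = math.hypot(y[0] - x[0], y[1] - x[1])
    dt = (length / speed) / samples          # time step of the midpoint rule
    total = 0.0
    for k in range(samples):
        t = (k + 0.5) / samples              # midpoint of the k-th sub-segment
        p = (x[0] + t * (y[0] - x[0]), x[1] + t * (y[1] - x[1]))
        total += sum(T(p) for T in threats) * dt
    return total
```

For instance, a constant exposure of 2.0 along a 3 m edge at unit speed yields a cost of about 6.0, and halving the speed doubles it, reflecting the duration-of-exposure penalty.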
4.3. Pareto Front
We use the proposed multi-objective planning algorithm to identify paths that lie on the Pareto Front of the primary and secondary costs. Our approach enables the reuse of a roadmap for searches under different objectives and constraints.
We rely on the two cost functions $c$ and $\hat{c}$ described in Section 4.2. In every Pareto Front shown, we reserve the horizontal axis for accumulated values of $\hat{c}$ and the vertical axis for accumulated values of $c$. We first point out a Pareto Front in Fig. (a), generated for the environmental setup shown in Fig. 8. The Pareto Front is color-coded (dark blue to magenta) to indicate the cumulative threat exposure along the path (low-to-high); each color-coded Pareto-optimal path within the configuration space is shown in Fig. (b). As explained in Section 3, there is additional error due to the quantization of the secondary costs. As discussed, one method for measuring this error is to calculate the true secondary cost along each path, which is shown by the gray markers in every Pareto Front plot.
4.4. Comparison with scalarization
The scalarization algorithm operates by minimizing a weighted sum of the objectives to recover Pareto-optimal solutions. In the context of bicriteria path-planning on graphs, this means that we select some $\alpha \in [0, 1]$ and define a new (single) cost $c_{\alpha}(y, x) = \alpha \, c(y, x) + (1 - \alpha) \, \hat{c}(y, x)$ for each edge. Dijkstra’s algorithm is then applied to find an optimal path with respect to the new edge weights $c_{\alpha}$’s. Any resulting optimal path is Pareto-optimal with respect to our two original criteria. (Indeed, if some other path were strictly better based on both $c$’s and $\hat{c}$’s, it would also have a lower cumulative cost with respect to $c_{\alpha}$’s.)

The procedure is repeated for different values of $\alpha$ to obtain more points on $PF$. The computational complexity of this algorithm is $O(K M \log M)$, where $K$ is the number of sampled $\alpha$ values.

Even though scalarization recovers paths which are rigorously Pareto-optimal and does not rely on any discretization of secondary costs, it has several significant drawbacks. First, the set of $\alpha$ values to try is a priori unknown. The most straightforward implementation, based on a uniform grid in $\alpha$, turns out to be highly inefficient, since the distribution of corresponding points on the Pareto Front is very non-uniform. This is easy to see geometrically, since $\alpha$ determines the slope of the Pareto Front wherever it is smooth; see Figure 17(A). Indeed, when $PF$ is not smooth, we typically have infinitely many $\alpha$’s corresponding to the same point on $PF$; see Figure 17(B). To improve the efficiency of scalarization, we have implemented a basic adaptive refinement in $\alpha$ to obtain the desired resolution of $PF$. The second drawback is even more significant: when the Pareto Front is not convex, a single value of $\alpha$ might define several optimal paths, corresponding to different points on $PF$; see Figure 17(C). In such cases, the non-convex portions of the Pareto Front remain completely invisible to scalarization. Indeed, this approach can be viewed as approximating $PF$ as an envelope of “support hyperplanes” and thus only recovers its convex hull [8].
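Our adaptive refinement is not reproduced here, but the basic scalarization step can be sketched as follows (Python, with our own graph representation):

```python
import heapq

def scalarized_shortest(succ, source, target, alpha):
    """Dijkstra on the blended edge weight alpha*c + (1 - alpha)*c_hat.

    `succ[x]` lists triples (y, c, c_hat) for transitions x -> y.  Returns
    the (primary, secondary) cumulative costs of one optimal path, i.e. a
    single (necessarily convex-hull) point of the Pareto Front.
    """
    INF = float('inf')
    dist = {x: INF for x in succ}
    costs = {source: (0.0, 0.0)}    # accumulated (primary, secondary) costs
    dist[source] = 0.0
    heap = [(0.0, source)]
    visited = set()
    while heap:
        d, x = heapq.heappop(heap)
        if x in visited:
            continue
        visited.add(x)
        if x == target:
            break
        for (y, c, c_hat) in succ[x]:
            nd = d + alpha * c + (1.0 - alpha) * c_hat
            if nd < dist[y]:
                dist[y] = nd
                costs[y] = (costs[x][0] + c, costs[x][1] + c_hat)
                heapq.heappush(heap, (nd, y))
    return costs[target]
```

Sweeping `alpha` from 0 to 1 traces out points on the convex hull of the Pareto Front, which is exactly why the non-convex portions discussed above stay invisible to this approach.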
We tested our implementation of this adaptive scalarization algorithm on the example already presented in Figures 8 and 13. Not surprisingly, our budget-augmented algorithm produces significantly more (approximate) Pareto-optimal solutions than the scalarization; see Figures (c) and (d). Regardless of the refinement threshold, scalarization completely misses all non-convex portions of $PF$. In this scenario adaptive scalarization produced 35 unique solutions in approximately 1.75 seconds. For comparison, our budget-augmented approach produced 160 (approximate) solutions in just 0.2 seconds.
4.5. Effect of discretization
As discussed in Section 2, the parameter $\Delta$ determines the ‘discretization’ of the secondary budget allowance. For all results we use $\Delta$ as defined in Equation (7) in Section 3.
In Fig. 26 we generate a high-density graph and observe the effect of decreasing $\Delta$ (increasing $J$) on the results. As $\Delta$ decreases, the produced Pareto Front clearly approaches the true secondary cost curve shown in gray. The geometric sequence of $\Delta$ values used to generate the Pareto Fronts in Figs. (a)–(d) is important, as discussed in Section 3.2: it is what guarantees the monotone convergence. We note that even for non-geometric sequences of $\Delta$, non-monotone convergence is difficult to observe visually.
4.6. Swapping primary/secondary costs
Throughout this paper we have assumed that the roles of primary and secondary edge weights are fixed, focusing on realistic planning scenarios that minimize the distance traveled subject to constraints on exposure to enemy threat(s). However, an equivalent solution may also be obtained by switching the costs, with a better approximation of $PF$ sometimes obtained without increasing the number of budget levels [28].
Our experimental evidence indicates that the algorithm does a better job of recovering the portion of the Pareto Front where the slope of the front has smaller magnitude; in other words, where small changes in secondary budget result in small changes in accumulated primary cost (e.g., see Fig. 26). Fig. 35 shows the result of setting the primary cost to threat exposure and the secondary cost to distance. This algorithmic feature can be used to recover a more uniformly dense Pareto Front without significantly increasing the computational cost: consider collating the right portion of Fig. 26(b) with the left part of Fig. 35(b).
4.7. Visibility from enemy observer
Previous results focused on an example with a single enemy observer with a simple model for threat exposure. In this final subsection we illustrate the algorithm on a much broader example including multiple enemies with limited visibility behind obstacles. We assume that the size of this domain is , and suppose there are two enemy observers with parameters:
Enemy 1  m  

Enemy 2  m 
For this problem we take the visibility of enemies into account in the construction of a new threat function $\hat{c}$. We calculate $\hat{c}$ just as in (10), but modify the individual threat equations to account for limited visibility: we define modified threat level functions $\tilde{T}_i$ that coincide with $T_i$ wherever the line of sight from threat $i$ is unobstructed and are reduced to a small constant $\epsilon > 0$ wherever it is blocked by obstacles (we found a small positive $\epsilon$ to work well).
In Fig. (a) we show the environment where, just as before, the black rectangles represent obstacles that define the configuration space. The contours of the summed threat exposure functions $\sum_i \tilde{T}_i$ are plotted. The locations of the two enemies are visually apparent, and the visibility region of each enemy can also be easily seen. In particular, the white regions in the image show the points where visibility is obstructed for both enemies. Fig. (b) shows the roadmap generated for this environment.
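The line-of-sight modification can be sketched with a simple sampling-based visibility check on the occupancy grid. This is a rough sketch with our own names and $\epsilon$-floor, not the collision checker used in the experiments:

```python
def visible(grid, p, q, samples=200):
    """True if the straight segment between grid points p and q (row, col)
    crosses no occupied cell of the occupancy grid."""
    for k in range(samples + 1):
        t = k / samples
        i = round(p[0] + t * (q[0] - p[0]))
        j = round(p[1] + t * (q[1] - p[1]))
        if grid[i][j]:
            return False
    return True

def shadowed_threat(T, threat_pos, grid, eps):
    """Wrap an instantaneous threat field T(p): full exposure where the
    threat has line of sight, a small floor eps where sight is blocked."""
    return lambda p: T(p) if visible(grid, threat_pos, p) else eps
```

The wrapped fields can then be summed and integrated along edges exactly as in (10) to produce the visibility-aware secondary cost.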
Figs. (c) & (d) show the results for this environment. Fig. (c) shows the paths recovered by the algorithm using the roadmap of Fig. (b). The corresponding Pareto Front in Fig. (d) exhibits strong non-convexity on this particular domain and generated graph, due to the large number of obstacles.
Our final test uses this visibility example to study the convergence of the Pareto Front as the number of nodes in the graph increases. We fix the number of budget levels $J$ so as to accurately capture the Pareto Front corresponding to the largest generated graph. We plot the resulting Pareto Fronts in Fig. 41 as the number of nodes varies. There is some Pareto Front PF corresponding to a ‘continuous’ version of this multi-objective problem, and we see that the computed fronts converge to PF as the number of nodes increases.
5. Experimental data
The multiobjective path planning algorithm described in this paper was implemented and demonstrated on a Husky ground vehicle produced by Clearpath Robotics [1]. A picture of the vehicle used in the experiments is shown in Figure (a)a. The vehicle was equipped with a GPS receiver unit, a WiFi interface, a SICK LMS111 Outdoor 2D LIDAR [2], an IMU, wheel encoders, and a mini-ITX single board computing system with a 2.4GHz Intel i5-520M processor and 8GB of RAM.
Although the GPS unit is used for localization, the associated positional uncertainty typically cannot meet requirements needed for path planning. Moreover, a map of the environment is required to generate a roadmap to support path planning as discussed in the previous section. To this end, we use a Simultaneous Localization and Mapping (SLAM) algorithm for both localization and mapping, and in particular, the Hector SLAM open source implementation [20]. Hector SLAM is based on scan matching of range data which is suitable for a 2D LIDAR such as the SICK LMS111. The output of Hector SLAM is an occupancy grid representing the environment and a pose estimate of the vehicle in the map. Details of the algorithm and implementation can be found in reference [20] and the accuracy of localization is addressed in [29]. In order to allow stable and reliable state estimation, we also implemented an extended Kalman filter (EKF). Position estimates were computed by fusing inertial information and SLAM pose estimates through the EKF.
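The fusion step can be illustrated with a deliberately simplified filter. The sketch below is a scalar constant-velocity Kalman filter (for a linear model the EKF reduces to this form) that predicts with IMU acceleration and corrects with SLAM position fixes; the 1-D state and the noise parameters are illustrative assumptions, not the parameters of the fielded EKF:

```python
import numpy as np

class PoseFilter:
    """Minimal 1-D constant-velocity Kalman filter fusing IMU acceleration
    (prediction) with SLAM position fixes (update). The real system runs an
    EKF over the full planar pose; this sketch only shows the fusion pattern."""
    def __init__(self, dt=0.01):
        self.dt = dt
        self.x = np.zeros(2)             # state: [position, velocity]
        self.P = np.eye(2)               # state covariance
        self.F = np.array([[1.0, dt], [0.0, 1.0]])   # constant-velocity model
        self.Q = 0.01 * np.eye(2)        # process noise (tuning assumption)
        self.H = np.array([[1.0, 0.0]])  # SLAM measures position only
        self.R = np.array([[0.05]])      # SLAM measurement noise (assumption)

    def predict(self, accel):
        # Propagate the state using the IMU acceleration as a control input.
        self.x = self.F @ self.x + np.array([0.5 * self.dt**2, self.dt]) * accel
        self.P = self.F @ self.P @ self.F.T + self.Q

    def update(self, slam_position):
        # Standard Kalman correction with the SLAM pose estimate.
        y = slam_position - self.H @ self.x           # innovation
        S = self.H @ self.P @ self.H.T + self.R       # innovation covariance
        K = self.P @ self.H.T @ np.linalg.inv(S)      # Kalman gain
        self.x = self.x + (K @ y).ravel()
        self.P = (np.eye(2) - K @ self.H) @ self.P
```

Interleaving high-rate `predict` calls with lower-rate `update` calls yields a pose estimate that is smoother than raw SLAM output yet anchored to the map.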
A set of 24 “Jersey barriers” placed on flat ground was used to create a complex maze-like environment with multiple interconnecting corridors. Figure (b)b shows the barriers and their positions on the ground. The map generated by Hector SLAM for the obstacle course is shown in Figure (c)c. The map of the environment is known to the vehicle before the start of the multiobjective planning mission. We utilized the Open Motion Planning Library (OMPL) [35], which contains implementations of numerous sampling-based motion planning algorithms. We have prior experience using this library for sampling-based navigation of an obstacle-rich environment [16]. The generality of OMPL facilitated the development and implementation of our field-capable bicriteria path planning algorithm: OMPL leverages efficient low-level data structures and subroutines (such as the Boost Graph Library (BGL) [30]), and several predefined robot types (and associated configuration spaces) are available. For the purposes of Husky path planning we apply the PRM algorithm within an state space.
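For readers without OMPL at hand, the PRM construction can be mimicked for a planar point robot in a few lines. This toy version (uniform sampling, rectangular obstacles, a fixed connection radius) illustrates the roadmap idea only; it is not the OMPL implementation used on the vehicle:

```python
import math, random

def build_prm(n_nodes, bounds, obstacles, radius, seed=0):
    """Toy PRM: sample collision-free configurations uniformly inside
    `bounds` = (xmin, ymin, xmax, ymax), then connect every pair of nodes
    within `radius` whose straight-line edge avoids all rectangular obstacles."""
    rng = random.Random(seed)
    xmin, ymin, xmax, ymax = bounds

    def in_obstacle(p):
        return any(x0 <= p[0] <= x1 and y0 <= p[1] <= y1
                   for (x0, y0, x1, y1) in obstacles)

    def edge_free(p, q, samples=20):
        # Approximate collision check by sampling points along the edge.
        return not any(in_obstacle((p[0] + t / samples * (q[0] - p[0]),
                                    p[1] + t / samples * (q[1] - p[1])))
                       for t in range(samples + 1))

    nodes = []
    while len(nodes) < n_nodes:
        p = (rng.uniform(xmin, xmax), rng.uniform(ymin, ymax))
        if not in_obstacle(p):
            nodes.append(p)

    edges = [(i, j) for i in range(n_nodes) for j in range(i + 1, n_nodes)
             if math.dist(nodes[i], nodes[j]) <= radius
             and edge_free(nodes[i], nodes[j])]
    return nodes, edges
```

Each edge would then be assigned both a primary weight (its length) and a secondary weight (accumulated threat exposure along it), exactly as in the experiments above.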
For the experiments provided in this section, we set the number of budget levels to . The roadmap “grow time” is set to be about seconds, and the planning boundary to be a box of size , resulting in a roadmap with about vertices and edges (e.g., see Fig. (b)b). The time to generate the roadmap graph and assign edge weights is about second, and the time to compute the solution and generate the Pareto Front is about second. Thus, a fresh planning instance, from graph generation to computing a solution, takes about seconds. We also implemented a number of features in the planner intended to add robustness and address contingencies that may arise during the mission, such as when threat values in the environment change. In this case, the execution framework will attempt to replan with updated threat values. Since the graph need not be regenerated, replanning is typically much faster (about second).
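The solution computation is the upward sweep over budget levels summarized in the abstract: the secondary budget is discretized, every (positive) secondary edge cost moves the search to a strictly higher level, so the augmented graph is acyclic and one pass per level suffices. The following is our reconstruction from that description, not the authors' code; `delta` is the budget quantum and all names are ours:

```python
import math

def budget_sweep(n, edges, source, n_levels, delta):
    """Upward sweep over the budget-augmented graph.
    edges: list of undirected (u, v, primary_cost, secondary_cost) tuples.
    Returns V with V[b][v] = min primary cost from `source` to v using
    secondary budget <= b*delta. Each discretized secondary cost is rounded
    up to at least one level, which keeps the augmented graph acyclic."""
    adj = {v: [] for v in range(n)}
    for u, v, c1, c2 in edges:
        k = max(1, math.ceil(c2 / delta))   # secondary cost in budget levels
        adj[v].append((u, c1, k))
        adj[u].append((v, c1, k))

    INF = float("inf")
    V = [[INF] * n for _ in range(n_levels + 1)]
    for b in range(n_levels + 1):
        V[b][source] = 0.0
        for v in range(n):
            best = V[b][v]
            for u, c1, k in adj[v]:
                # Arriving via (u, v) consumes k budget levels,
                # so it only references already-computed lower levels.
                if b - k >= 0 and V[b - k][u] + c1 < best:
                    best = V[b - k][u] + c1
            # A larger budget can never hurt: inherit the level below.
            V[b][v] = min(best, V[b - 1][v] if b > 0 else best)
    return V
```

Reading off the goal node's column of `V` across budget levels yields the approximate Pareto Front: each level gives the shortest feasible distance within that threat budget.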
In the subsequent mission, we assume the Husky is tasked with navigating from its current location to a designated goal waypoint while avoiding exposure to a set of fixed and given threats, i.e., we aim to minimize distance (primary cost) subject to a maximum allowable level of threat exposure (secondary cost). We use the cost functions (distance) and (threat exposure) as given in Section 4.2. For threat exposure, we use a single threat with , and . Once the path is computed, it is sent down to the lower-level path smoothing module. The path smoother is part of the hierarchical planning and execution framework discussed in [11] and is implemented using model predictive control with realistic vehicle dynamics.
Since threat exposures are difficult to quantify absolutely but easy to compare relatively, we designed a Graphical User Interface (GUI) for picking a suitable path from the Pareto front reconstructed by the proposed algorithm, i.e., the user chooses the shortest path, subject to an acceptable amount of risk, from among the paths in the Pareto front. The process used to execute the planning mission is illustrated in Fig. 50 and can be described as follows:

Given an occupancy map of the environment generated by the Hector SLAM algorithm and the known threat location, the user sets the desired goal location for the mission (see Fig. (a)a). The user has some predefined level of tolerance/budget for the amount of threat exposure allowed. Say, for example, a threat tolerance of is permitted in the example shown in Figure 50 (for reference, the shortest distance path has a threat exposure of ).

An initial PRM is generated with edge weights computed by the primary and secondary cost functions. Algorithm 1 is executed and the results are used to compute the Pareto Front (see Fig. (b)b). Next, the path corresponding to each point on the PF is recovered and shown in the GUI, where color indicates the amount of secondary cost (see Fig. (c)c).

From the paths available in Fig. (c)c and the amount of threat exposure tolerated, an expert analyzes the results. Originally the budget was defined to be , which corresponds to a path of length m (primary cost). The expert realizes that if the exposure allowance is slightly increased to , a path of length m becomes available. These two points correspond to the large vertical drop in the Pareto Front in Fig. 50.

From the results and analysis, the user then chooses a path by clicking on a dot in the lower right plot. The picked path is highlighted in bold (see Fig. (c)c). The user then clicks a button in the GUI to verify the path as desired (see Fig. (d)d), and finally the selected path is sent to lower level modules for path smoothing and vehicle control.
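The expert analysis in the steps above (checking whether a small increase in the exposure allowance buys a much shorter path) is easy to automate. A sketch, where the front is a list of (threat, length) pairs and `slack` is a hypothetical extra allowance:

```python
def pick_path(front, tolerance, slack=0.0):
    """front: non-dominated (threat, length) pairs sorted by threat.
    Return the shortest path within `tolerance`, alongside the shortest path
    that becomes available if `slack` extra exposure is permitted, so the
    user can see whether relaxing the budget slightly pays off."""
    feasible = [(t, l) for t, l in front if t <= tolerance]
    relaxed  = [(t, l) for t, l in front if t <= tolerance + slack]
    best = min(feasible, key=lambda p: p[1]) if feasible else None
    best_relaxed = min(relaxed, key=lambda p: p[1]) if relaxed else None
    return best, best_relaxed
```

A large gap between the two returned lengths flags exactly the kind of vertical drop in the Pareto Front that the expert exploits above.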
Fig. 55 shows four representative snapshots from a real mission in progress.
6. Conclusions
We have introduced an efficient and simple algorithm for bicriteria path planning. Our method has been extensively tested both on synthetic data and in the field, as a component of a real robotic system. Unlike prior methods based on scalarization, our approach recovers the entire Pareto Front regardless of its convexity. As an additional bonus, choosing a larger value of boosts the efficiency of the method, resulting in an approximation of PF, whose accuracy can be further assessed in real time.
The computational cost of our method is , corresponding to the number of nodes in the budget-augmented graph. Of course, the total number of nodes with distinct Pareto-optimal cost tuples can be much smaller than . In such cases, the prior label-setting algorithms for multiobjective planning will likely be advantageous, since their asymptotic cost is typically . However, in this paper we are primarily interested in graphs used to approximate path planning in continuous domains, with edge costs correlated with the geometric distances. As we showed in Section 4, for such graphs the number of points on the PF is typically quite large (particularly as PRM graphs are further refined), with new Pareto-optimal paths becoming feasible at most of the budget levels. Coupled with domain restriction techniques and the simplicity of implementation, this makes our approach much more attractive.
Several extensions would obviously greatly expand the applicability of our method. We hope to extend it to a higher number of simultaneous criteria and introduce A*, D*, and “anytime planning” versions. In addition, it would be very useful to develop a priori bounds for the errors introduced in the PF as a function of . Another direction is automating the choice of primary/secondary cost to improve the method’s efficiency.
Acknowledgements. The authors would like to acknowledge and thank the whole Autonomy team at United Technologies Research Center. In particular, the authors would like to thank Suresh Kannan, Alberto Speranzon, and Andrzej Banaszuk for productive discussions of the algorithm and implementations of the simulation software. In addition, the authors would like to thank Amit Surana, Zohaib Mian, and Jerry Ding for the support from the Autonomy Initiative and help during field experiments. The authors are also grateful to Blane Rhoads for his suggestions on testing the slackness of the Pareto Front approximation.
References
 [1] Clearpath Robotics, Husky ground vehicle.
 [2] SICK LMS111 2D LIDAR.
 [3] E. Altman, Constrained markov decision processes, Stochastic Modeling Series, Taylor & Francis, 1999.
 [4] Dimitri P. Bertsekas, A simple and fast label correcting algorithm for shortest paths, Networks 23 (1993), no. 8, 703–709.
 [5] Dimitri P. Bertsekas, Dynamic programming and optimal control, vol. 1, Athena Scientific Belmont, MA, 1995.
 [6] Dimitri P. Bertsekas, Francesca Guerriero, and Roberto Musmanno, Parallel asynchronous label-correcting methods for shortest paths, Journal of Optimization Theory and Applications 88 (1996), no. 2, 297–320.
 [7] W. Chen, Z. Clawson, S. Kirov, R. Takei, and A. Vladimirsky, Optimal control with budget constraints and resets, SIAM Journal on Control and Optimization 53 (2015), 712–744.
 [8] I. Das and J.E. Dennis, A closer look at drawbacks of minimizing weighted sums of objectives for pareto set generation in multicriteria optimization problems, Struct. Optim. 14 (1997), 63–69.
 [9] Edsger W. Dijkstra, A note on two problems in connexion with graphs, Numerische Mathematik 1 (1959), no. 1, 269–271.
 [10] Xu Chu Ding, Alessandro Pinto, and Amit Surana, Strategic planning under uncertainties via constrained Markov decision processes, ICRA, IEEE, 2013, pp. 4568–4575.
 [11] Xuchu Ding, Brendan Englot, Alessandro Pinto, Alberto Speranzon, and Amit Surana, Hierarchical multiobjective planning: From mission specifications to contingency management, Robotics and Automation (ICRA), 2014 IEEE International Conference on, IEEE, 2014, pp. 3735–3742.
 [12] S. Feyzabadi and S. Carpin, HCMDP: A hierarchical solution to constrained Markov decision processes, May 2015, pp. 3971–3978.
 [13] Eric A. Hansen, Shlomo Zilberstein, and Victor A. Danilchenko, Anytime heuristic search: First results, Univ. Massachusetts, Amherst, MA, Tech. Rep. 50 (1997).
 [14] Peter E. Hart, Nils J. Nilsson, and Bertram Raphael, A formal basis for the heuristic determination of minimum cost paths, Systems Science and Cybernetics, IEEE Transactions on 4 (1968), no. 2, 100–107.
 [15] J. M. Jaffe, Algorithms for finding paths with multiple constraints, Networks 14 (1984), 95–116.
 [16] S. K. Kannan, W. M. Sisson, D. A. Ginsberg, J. C. Derenick, X. C. Ding, T. A. Frewen, and H. Sane, Close proximity obstacle avoidance using samplingbased planners, 2013 AHS Specialists’ Meeting on Unmanned Rotorcraft and NetworkCentric Operations, 2013.
 [17] Sertac Karaman and Emilio Frazzoli, Samplingbased algorithms for optimal motion planning, The International Journal of Robotics Research 30 (2011), no. 7, 846–894.
 [18] Lydia E. Kavraki, Petr Svestka, JC Latombe, and Mark H. Overmars, Probabilistic roadmaps for path planning in highdimensional configuration spaces, IEEE Transactions on Robotics and Automation 12 (1996), no. 4, 566–580.
 [19] Sven Koenig and Maxim Likhachev, Fast replanning for navigation in unknown terrain, Robotics, IEEE Transactions on 21 (2005), no. 3, 354–363.
 [20] Stefan Kohlbrecher, Johannes Meyer, Thorsten Graber, Karen Petersen, Uwe Klingauf, and Oskar von Stryk, Hector open source modules for autonomous mapping and navigation with rescue robots, RoboCup 2013: Robot World Cup XVII, Springer, 2014, pp. 624–631.
 [21] A. Kumar and A. Vladimirsky, An efficient method for multiobjective optimal control and optimal control subject to integral constraints, Journal of Computational Mathematics 28 (2010), 517–551.
 [22] Maxim Likhachev, Geoffrey J. Gordon, and Sebastian Thrun, ARA*: Anytime A* with provable bounds on suboptimality, Advances in Neural Information Processing Systems, 2003.
 [23] E. Machuca, L. Mandow, and J. L. Pérez de la Cruz, An evaluation of heuristic functions for bicriterion shortest path problems, New Trends in Artificial Intelligence. Proceedings of the XIV Portuguese Conference on Artificial Intelligence (EPIA 2009), 2009.
 [24] Enrique Machuca, Lorenzo Mandow, José L. Pérez de la Cruz, and Amparo Ruiz-Sepúlveda, An empirical comparison of some multiobjective graph search algorithms, Proceedings of the 33rd Annual German Conference on Advances in Artificial Intelligence (Berlin, Heidelberg), KI’10, Springer-Verlag, 2010, pp. 238–245.
 [25] L. Mandow and J. L. Pérez de la Cruz, A new approach to multiobjective A* search, Proceedings of the 19th International Joint Conference on Artificial Intelligence (San Francisco, CA, USA), Morgan Kaufmann Publishers Inc., 2005, pp. 218–223.
 [26] Lawrence Mandow and José Luis Pérez de la Cruz, Multiobjective A* search with consistent heuristics, J. ACM 57 (2008), 27:1–27:25.
 [27] Masahiro Ono, Marco Pavone, Yoshiaki Kuwata, and J. Balaram, Chanceconstrained dynamic programming with application to riskaware robotic space exploration, Autonomous Robots 39 (2015), no. 4, 555–571.
 [28] Blane Rhoads, Suresh Kannan, Thomas A. Frewen, and Andrzej Banaszuk, Optimal control of autonomous helicopters in obstaclerich environments, United Technologies Research Center, East Hartford, CT (2012).
 [29] João Machado Santos, David Portugal, and Rui P. Rocha, An evaluation of 2d slam techniques available in robot operating system., SSRR, IEEE, 2013, pp. 1–6.
 [30] J. G. Siek, L. Q. Lee, and A. Lumsdaine, The Boost Graph Library: User Guide and Reference Manual, Pearson Education, 2001.
 [31] A. J. V. Skriver and K. A. Andersen, A label correcting approach for solving bicriterion shortest-path problems, Computers & Operations Research 27 (2000), no. 6, 507–524.
 [32] Anthony Stentz, Optimal and efficient path planning for partially-known environments, Robotics and Automation, 1994. Proceedings., 1994 IEEE International Conference on, IEEE, 1994, pp. 3310–3317.
 [33] Anthony Stentz, The Focussed D* algorithm for real-time replanning, IJCAI, vol. 95, 1995, pp. 1652–1659.
 [34] Bradley S. Stewart and Chelsea C. White, III, Multiobjective A*, J. ACM 38 (1991), 775–814.
 [35] Ioan A. Şucan, Mark Moll, and Lydia E. Kavraki, The Open Motion Planning Library, IEEE Robotics & Automation Magazine 19 (2012), no. 4, 72–82, http://ompl.kavrakilab.org.
 [37] Chi Tung Tung and Kim Lin Chew, A multicriteria paretooptimal path algorithm, European Journal of Operational Research 62 (1992), no. 2, 203 – 209.
 [38] Jur Van Den Berg, Rajat Shah, Arthur Huang, and Ken Goldberg, ANA*: Anytime Nonparametric A*, Proceedings of the Twenty-Fifth AAAI Conference on Artificial Intelligence (AAAI-11), 2011.