
# Path Planning in Unknown Environments Using Optimal Transport Theory

Haoyan Zhai    Magnus Egerstedt    Haomin Zhou
###### Abstract

This paper introduces a graph-based, potential-guided method for path planning problems in unknown environments, where obstacles are unknown until the robots are in close proximity to them. Inspired by optimal transport theory, the proposed method generates a graph connecting the initial and target configurations, and then finds a path over the graph using the available environmental information. The graph and path are updated iteratively as newly encountered obstacle information becomes available. The resulting method is a deterministic procedure proven to be complete, i.e., it is guaranteed to find a feasible path, when one exists, in a finite number of iterations. The method is scalable to high-dimensional problems. In addition, our method does not search the entire domain for the path; instead, the algorithm only explores a sub-region that can be described by the evolution of the Fokker-Planck equation. We demonstrate the performance of our algorithm via several numerical examples with different environments and dimensions, including high-dimensional cases.

Keywords— path planning, unknown environment, optimal transport, Fokker-Planck equation.

## 1 Introduction

This paper considers path planning for a robot, or possibly a group of robots, in an unknown environment. In other words, a set of robots in given initial configurations is tasked with finding a feasible path to the target configuration while avoiding collisions with obstacles. We consider scenarios where the number of robots is fixed and where obstacles are detected when they are within the detection range of one of the robots in the group. We assume that the system employs a broadcast strategy, in the sense that obstacle information, once available, is shared among the group immediately. Compared to path planning in known environments, posing the problem in unknown environments brings several significant challenges. First of all, re-planning while moving becomes necessary when a planned path is blocked by newly detected obstacles. This raises livelock concerns, i.e., the robots may end up moving in loops and never reach the target even when feasible paths exist. Secondly, the configuration space may be quite high-dimensional, especially when a large number of robots is present. As a result, grid-like discretizations often lead to intractable computations. In this case, working with graphs is a viable option to reduce the computational burden. However, the cost can still be high if the graph has to span everywhere in the high-dimensional space. Thirdly, there may be narrow pathways between obstacles, which pose significant hurdles for the search process. Finally, for problems with unknown environments, optimality is only meaningful with respect to the currently known environment. Hence, one may have to accept locally optimal, or even simply feasible, solutions in some cases.

There exists an extensive literature on path planning. For example, the well-known Probabilistic Road Map (PRM) method generates a random graph that does not intersect the obstacles and then finds a path over the graph to connect the initial and target configurations [1, 2, 3, 4]. PRM guarantees a connection between the initial and target configurations when the graph is dense enough in the configuration space. Many additional PRM variants have been reported in the past decades; see [5, 6, 7, 8] for details.

The Rapidly-exploring Random Tree (RRT) is another influential method [9]. It creates a tree structure rooted at the initial configuration, ensuring that all vertices are connected to the initial one. At each step, the algorithm picks a configuration in the space randomly and checks whether it can be added to the tree following certain criteria. This continues until the target, or a configuration close enough to the target, is included in the tree. The algorithm has been adapted to path finding in unknown environments [10, 11], and recent improvements, such as RRT* [12, 13], have been reported.

The Artificial Potential Field (APF) method assigns the robot a positive charge and the target configuration a negative one [14]. This drives the robot towards the target. To avoid collisions with the obstacles, APF also gives the obstacles positive charges that repel the robot. Since it is a local gradient method, APF is amenable to high dimensional problems. However, a limitation of the original APF is the creation of unnecessary local minima due to the presence of obstacles, which may cause the algorithm to fail. In recent years, improvements of APF have been reported; see [15, 16, 17, 18] and the references therein.

The family of Bug Algorithms, from the original Bug0, Bug1 and Bug2 [19, 20] to later developments such as TangentBug [21], DistBug [22] and many other variants, adopts two basic modes as its design principle: a motion-to-goal mode and a boundary-following mode. They are powerful tools with theoretical convergence guarantees, especially suitable for path planning in unknown environments with a two-dimensional workspace. Some recent survey and performance comparison studies can be found in [23, 24].

In addition, widely known graph-based methods, like A* [25], D* [26], Focused D* [27] and D* Lite [28], can be used for path planning in both known and unknown environments [29]. When applied to problems in unknown environments, they often require covering the entire region with discrete lattice grids, on which the algorithms are performed to find paths. In the literature, there are other types of algorithms, such as genetic algorithms [30, 31], evolutionary programming [32], fuzzy logic [33, 34], neural networks [35], the network simplex method [36], the method of evolving junctions [37], the fast marching tree [38], a few hybrid approaches that combine different methods [39, 40], and many swarm strategies for multi-agent systems in recent years [41, 42, 43].

In this paper, we present a potential-guided, graph-based pathfinding method inspired by the evolution of the Fokker-Planck Equation (FPE) in optimal transport theory [44]. Optimal transport theory is a branch of mathematics studying how to transport one probability distribution to another with optimal cost. There are different ways to formulate the theory, such as linear programming [45] or partial differential equations (PDEs) [46, 47]. Among the various formulations, the optimal control approach reveals that the FPE is the gradient flow of a free energy, consisting of potential energy and entropy, in the probability space equipped with the so-called Wasserstein distance [48]. Using optimal transport theory, one can show that the FPE can escape the traps of any local minima in a potential field and reach the Gibbs distribution, which concentrates on the global minimizer. Incorporating this property and the advantages of several existing algorithms, we design a novel method for path planning in unknown environments. Our goal is to provide an alternative algorithm that works efficiently, especially for problems with high dimensional configuration spaces, such as multi-agent systems. When designing our algorithm, we introduce a potential field, defined in this paper by the distance to the target configuration. The unique global minimum of the potential field is at the target configuration. Unlike APF, the obstacles do not contribute to the potential field; instead, they define the infeasible regions. We generate a graph that has a tree structure originating from the initial configuration, growing in a deterministic manner guided by the flow direction of the FPE towards the target configuration. Our algorithm has the following features:

1. The algorithm is a graph-based deterministic procedure with guaranteed convergence properties, meaning that the algorithm terminates in a finite number of steps and returns a feasible path if one exists. Thus, the algorithm is complete. We highlight that the convergence of our algorithm is deterministic, in contrast to the asymptotic convergence results shared by many methods that rely on randomness.

2. If the algorithm does not find a feasible path, one concludes that there does not exist a feasible trajectory from the initial to the target configuration for which one can find a tubular region, centered at the trajectory with a small radius, that does not intersect the obstacles. The lower bound of the radius for the tubular region is proportional to the step size used in graph generation.

3. The path found by the algorithm is locally optimal in the known environment up to the current location of the robots.

4. The graph generated by the algorithm has a tree structure growing linearly with respect to the dimension of the configuration space. Together with the dimension reduction techniques proposed to rapidly escape the local traps, the algorithm can efficiently handle high dimensional problems.

5. The algorithm only explores a limited region defined by the solution of a FPE, even when the obstacles are not known a priori.

It should be noted that optimal transport theory has been considered in several recent studies on path planning. For example, swarming robots are modeled by a distribution, and their optimal transport map is calculated by linear programming in [49]. Another paper presents a partial differential equation (PDE) based swarming model for the deployment of a large number of robots using the Kantorovich-Rubinstein relaxation of optimal transport theory [50]. Our method is different in that we directly use the evolution of the FPE to guide the path construction.

In the next section, we present the details of the algorithm with the finite step stopping property. In Section 3, we show some numerical examples to illustrate the performance in both low and high dimensional configuration spaces. Section 4 gives strategies for dimension reduction near local minima to further lower the computational cost. In Section 5 the relationship between the algorithm and optimal transport theory is discussed. The convergence proof is given in Section 6. We end the paper with a brief conclusion in the last section.

## 2 Algorithm

Let the configuration space $\Omega$ be a bounded connected domain in $\mathbb{R}^n$. We assume that the robots can alter configurations freely in $\Omega$ as long as the change does not violate the required constraints. There are two types of constraints we consider in this paper. One is the constraints known in advance; for example, two robots cannot be too close to or too far away from each other in a multi-agent system. We denote those constraints by

$$\phi=(\phi_1,\phi_2,\cdots,\phi_{k_1}):\Omega\longrightarrow\mathbb{R}^{k_1},$$

and a configuration $x$ does not satisfy the constraints when $\phi_i(x)<0$ for some $i\in\{1,\cdots,k_1\}$. The other type of constraints is given by unknown environments, such as unknown obstacles. We represent them by

$$\psi=(\psi_1,\psi_2,\cdots,\psi_{k_2}):\Omega\longrightarrow\mathbb{R}^{k_2},$$

and $\psi_i(x)<0$ for some $i\in\{1,\cdots,k_2\}$ means the constraints are violated. We emphasize that $\psi$ can only be detected if the robots are close enough to the obstacles. This implies that the knowledge of $\psi$ must be updated dynamically while the robots are in motion. For simplicity of discussion, we assume that both $\phi$ and $\psi$ are continuous.

To illustrate the setup, we give a single robot example in Figure 1. The configuration space is a square, and all the gray bars are obstacles that the robot cannot collide with. The light gray bars in the figure are undetected obstacles. As in the second picture in Figure 1, if the robot moves horizontally but not too far away from its initial configuration, no obstacle is detected. As the robot moves, more and more obstacles are recognized. Our goal is to find a path from the initial configuration (red diamond in Figure 1) to the target configuration (red circle in Figure 1). More precisely, we want to find a feasible path

$$\gamma(t):[0,T]\to\Omega,$$

satisfying $\phi_i(\gamma(t))\geq 0$ and $\tilde\psi_i(\gamma(t),t,\gamma)\geq 0$ for all $t\in[0,T]$, such as the red path in Figure 1, while $\tilde\psi$ is updated with newly detected obstacles as $t$ changes.

To describe the dynamical change of the unknown constraints while moving along a path, we mark a configuration $x$ as part of the detected obstacles if $\psi_i(x)<0$ for some $i$ and $x$ is within distance $R$ to the current configuration of the robot. To be precise, we define

$$\tilde\psi(x,t,\gamma)=\begin{cases}\psi(x) & \text{if } \|x-\gamma(\tau)\|\leq R \text{ for some } \tau\leq t,\\ 0 & \text{otherwise,}\end{cases} \qquad (1)$$

as the detected part of the environment along the path.
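The update rule (1) admits a direct implementation. Below is a minimal Python sketch (the function name `psi_tilde` and the list-of-samples representation of the trajectory are our own choices, not prescribed by the paper):

```python
import numpy as np

def psi_tilde(x, trajectory, psi, R):
    """Detected part of the unknown constraints along a trajectory.

    Returns psi(x) if x has come within detection radius R of any
    configuration visited so far (Eq. (1)), and 0 otherwise.
    """
    x = np.asarray(x, dtype=float)
    for g in trajectory:                       # g = gamma(tau) for tau <= t
        if np.linalg.norm(x - np.asarray(g)) <= R:
            return psi(x)
    return 0.0
```

Here `psi` is any scalar constraint function, negative inside an obstacle; a vector-valued $\psi$ would apply the same rule componentwise.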

For the convenience of discussion, we assume there exists at least one feasible path connecting the initial and target configurations, and this feasible path is contained in a tubular obstacle-free region with radius $L$, as shown by the shaded part in the first picture in Figure 1. This assumption is a technical requirement used in the proof of convergence, and it can be rewritten as the following equation

$$\sup_{\gamma\in\Gamma}\inf_{t\in[0,T]}\sup_{r\geq 0}\{r:\ B(\gamma(t),r)\cap O=\emptyset\}=L>0, \qquad (2)$$

where $O$, the set of configurations violating the constraints, is an open set, and $\Omega\setminus O$ is also open. We denote $\partial\Omega$ as the boundary of $\Omega$, and $\partial O$ as the boundaries of $O$ separating the constrained regions from the feasible regions.

Let us define the set of all possible paths from the initial configuration $x_s$ to the target $x_f$ in the full time interval as

$$\Gamma=\{\gamma:\ \gamma(0)=x_s,\ \gamma(t)=x_f\ \forall t\geq T_0 \text{ for some } T_0\leq T,\ \gamma\cap O=\emptyset\}. \qquad (3)$$

Then our dynamical path planning algorithm in the unknown environment is given in Algorithm 1:

In the remaining part of this section, we discuss, assisted with examples, the three major steps (Steps 4, 5, and 6 in Algorithm 1) in detail.

### 2.1 Graph generating

The first step is to generate a graph $G=(V,E)$, where $V$ is the vertex set and $E$ is the edge set, connecting the current configuration $x_c$ ($x_c=x_s$ for the first graph generation) and the target $x_f$ with the currently known environment. The vertices are configurations in $\Omega$, while the edge linking $v_i$ and $v_j$ is the straight line segment between $v_i$ and $v_j$. Meanwhile, we would like the graph to satisfy two properties: 1) the graph does not violate any known constraints; 2) the graph cannot contain too many vertices, due to computational complexity concerns in high dimensional cases. To achieve these goals, we introduce a convex potential function $p(x)$, admitting a unique global minimizer $x_f$, to help us choose the vertices. We select an $n$-dimensional orthonormal basis $N$ (here $n$ is the dimension of $\Omega$) to determine the directions that are used to add new vertices to $G$. For simplicity, we take $p(x)=\|x-x_f\|$, the distance to the target, as the potential, and the standard coordinate axes as the orthonormal basis in this paper.

At the first generating step, we let $V=\{x_c\}$ and $E=\emptyset$. In each step afterward, a vertex $x\in V$ with the lowest potential is chosen. We pick $2n$ new points along the orthonormal basis directions originated at $x$, each with distance $l$ to $x$, and use them as the candidates to expand $G$ (first figure in Figure 2). Before adding those points into the vertex set, we first delete all candidates that violate the currently known constraints ($\phi_i(x)<0$ or $\tilde\psi_i(x,T_0,\gamma)<0$ for some $i$, where $\gamma$ is the previous trajectory of the robots). For example, the robot shown in the left plot in Figure 2 stops at the red diamond position and generates four points around it. Among them, the point in the obstacle is removed (right picture in Figure 2). Next, we delete candidates whose edges violate the constraints, as shown in Figure 3. In this case, there exists a point on the edge that is not in the feasible region. In addition, to avoid repeating vertices, we remove those already included in $V$ from the candidate list, as shown in Figure 4. After these deleting steps, we add all remaining candidates, and their associated edges, to $V$ and $E$ respectively. This process is repeated until the target $x_f$ is within a small neighborhood of a vertex in $V$. For example, the final graph after several iterations is plotted in the right figure of Figure 4.

To summarize, given the previous trajectory of the robots $\gamma$ (empty by default), we let the current constraint set be

$$O_c=\{x\in\Omega:\ \tilde\psi_i(x,T_0,\gamma)<0 \text{ for some } i\in\{1,\cdots,k_2\},\ \text{or } \phi_i(x)<0 \text{ for some } i\in\{1,\cdots,k_1\}\}.$$

The graph generating procedure can be written as the following algorithm (Algorithm 2). We define $x$ as the ancestor of $y$ if, when $x$ is picked as the lowest-potential node to generate new nodes, $y$ is added to the vertex set as a newly generated node.
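To make the procedure concrete, here is a simplified Python sketch of the expansion loop, under our own assumptions: the potential is the distance to the target, candidates are generated along $\pm$ coordinate axes with step `l`, and a user-supplied `feasible` predicate stands in for the constraint checks. For brevity it checks only vertex feasibility, whereas Algorithm 2 also checks the connecting edges:

```python
import heapq
import numpy as np

def generate_graph(x_c, x_f, feasible, l, tol=None, max_iter=10000):
    """Potential-guided tree generation (a sketch in the spirit of Algorithm 2).

    Repeatedly expands the lowest-potential vertex along +/- each
    coordinate axis with step l, keeping only candidates that satisfy
    the currently known constraints (`feasible`) and are not already in
    the tree.  Returns (vertex coordinates, ancestor map) once a vertex
    lands within `tol` of the target x_f, or when the search is exhausted.
    """
    x_c, x_f = np.asarray(x_c, float), np.asarray(x_f, float)
    n = len(x_c)
    tol = l if tol is None else tol
    p = lambda x: np.linalg.norm(x - x_f)           # potential: distance to target
    key = lambda x: tuple(np.round(x / l).astype(int))  # lattice index of a vertex
    ancestor = {key(x_c): None}                     # child -> parent (tree structure)
    coords = {key(x_c): x_c}
    heap = [(p(x_c), key(x_c))]
    for _ in range(max_iter):
        if not heap:
            return coords, ancestor                 # search exhausted
        _, k = heapq.heappop(heap)                  # lowest-potential vertex
        x = coords[k]
        if p(x) <= tol:
            return coords, ancestor                 # target reached
        for i in range(n):
            for s in (+1.0, -1.0):
                y = x.copy(); y[i] += s * l
                ky = key(y)
                if ky in ancestor or not feasible(y):
                    continue                        # repeated or infeasible candidate
                ancestor[ky] = k                    # x is the ancestor of y
                coords[ky] = y
                heapq.heappush(heap, (p(y), ky))
    return coords, ancestor
```

For instance, `generate_graph([0, 0], [1, 0], lambda y: True, 0.5)` grows the tree in an obstacle-free plane until a vertex is within `0.5` of the target.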

###### Remark 1.

The choice of the generating radius $l$ can be arbitrary, although $l$ and $L$ must satisfy an inequality to have the convergence guarantee theoretically (as shown in Section 2.4). A larger $l$ leads to fewer vertices in $G$, while a smaller $l$ gives a finer search in $\Omega$. For simplicity, in our theoretical analysis we treat obstacles whose mutual distance is less than the generating radius as a single obstacle by ignoring the gaps among them. In practice, as shown in our experiments, the graph generation can still create nodes passing through gaps between obstacles that are narrower than this.

### 2.2 Path finding

After generating the graph $G$, the next step is to find a feasible path moving from the current configuration $x_c$ to the target $x_f$ using only the vertices and edges of the graph. Our goal is to minimize the total travel distance. The graph generated by Algorithm 2 has the following property:

###### Proposition 1.

There exists a unique path from the current configuration $x_c$ to the target $x_f$ over the generated graph $G$. If the path is denoted by

$$x_c=x_1\to x_2\to\cdots\to x_q=x_f,$$

then $x_i$ is the ancestor of $x_{i+1}$ for each $i$.

###### Proof.

The existence of a path from the current configuration to the target is provided by the construction of the graph. We note that if two nodes share an edge, one of them is the ancestor of the other, which is also determined by the graph construction algorithm. Clearly, $x_1=x_c$ is the ancestor of $x_2$. By induction, if we assume that $x_{i-1}$ is the ancestor of $x_i$, we claim that $x_i$ is the ancestor of $x_{i+1}$. Otherwise, $x_{i+1}$ must be the ancestor of $x_i$, which would mean that $x_i$ has two ancestors. This contradicts the graph generation strategy: we delete all candidate nodes that were already generated in previous steps.

For the uniqueness of the path, we first notice that the algorithm stops once an edge is linked to the target $x_f$, from which we conclude that $x_f$ has a unique ancestor. By Algorithm 2, every node except $x_c$ has a unique ancestor. If we assume that there are two paths $P_1$ and $P_2$, denoted by $x_c=x_1\to\cdots\to x_q=x_f$ and $x_c=y_1\to\cdots\to y_{q'}=x_f$ respectively, we must have $x_q=y_{q'}=x_f$ and $x_1=y_1=x_c$. By the uniqueness of the ancestor for each node, we must have $x_{q-1}=y_{q'-1}$. By induction, we obtain $q=q'$ and $x_i=y_i$ for all $i$. Thus $P_1=P_2$, and the uniqueness is proven. ∎

By our graph generation algorithm, if there is an edge between two nodes, one of them must be the unique ancestor of the other. This suggests a simple strategy to identify the path: starting from the target configuration, we simply back trace the ancestor of each node in the path until reaching the starting configuration $x_c$.
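Assuming the ancestor relation is stored as a child-to-parent map built during graph generation (our own representation), the back-tracing step is a few lines:

```python
def back_trace(ancestor, target_key):
    """Recover the unique path by following ancestors from the target
    back to the root (whose ancestor is None), then reversing the list."""
    path = [target_key]
    while ancestor[path[-1]] is not None:
        path.append(ancestor[path[-1]])
    return path[::-1]
```

The returned list runs from the starting configuration to the target, matching the order in Proposition 1.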

Other algorithms can be applied to find the path as well. For example, we can define the length of the edge linking vertices $v_i$ and $v_j$ as

$$k_{ij}=\mathrm{len}((i,j))=\|v_i-v_j\|.$$

Then the well-known Dijkstra method, or its improvements, can be used to obtain the path with computational complexity $O(|E|+|V|\log|V|)$, where $|V|$ is the number of vertices and $|E|$ is the number of edges [51].

Another way is to assign each edge distance $1$, which is equivalent to introducing the modified adjacency matrix $K=(k_{ij})$ on the graph $G$, where

$$k_{ij}=\begin{cases}1 & \text{if } (i,j)\in E,\\ \infty & \text{otherwise.}\end{cases}$$

Then Breadth First Search (BFS) can be used to find the path with complexity $O(|V|+|E|)$ [52, 53], which is faster than Dijkstra. Other graph-based path planning algorithms, such as A*, D*, or D* Lite, can be used as well.
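For reference, a standard BFS over an adjacency-list representation of $G$ (a textbook implementation, not specific to this paper) recovers a path with the stated $O(|V|+|E|)$ complexity:

```python
from collections import deque

def bfs_path(adj, start, goal):
    """Breadth-first search on an unweighted graph.

    `adj` maps each vertex to an iterable of its neighbours.  Returns a
    path with the fewest edges from start to goal, or None if the goal
    is unreachable.  Each vertex and edge is visited at most once.
    """
    parent = {start: None}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        if u == goal:                      # reconstruct path via parents
            path = [u]
            while parent[path[-1]] is not None:
                path.append(parent[path[-1]])
            return path[::-1]
        for v in adj.get(u, ()):
            if v not in parent:            # first visit fixes the parent
                parent[v] = u
                queue.append(v)
    return None
```

On the tree-structured graphs produced by Algorithm 2 this returns the same unique path as back-tracing, at a higher cost.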

It is worth mentioning that if the path has $q$ nodes, the suggested back-tracing approach has complexity $O(q)$, while the generated graph has at least $q$ nodes. Hence the complexities of BFS and Dijkstra are no lower than that of our back-tracing strategy.

### 2.3 Environment updating

While the robots move along a path in the configuration space, the knowledge of the constraints is updated at the same time by (1). We let

$$\gamma_i:[0,T_0]\to\Omega$$

be the current path; if a point on the path intersects the boundary of the constrained region, the motion stops at a point before arriving at the intersection.

To be more precise, let us denote the environment update at each time step as

$$O_c^t=O_c\cup\{x\in\Omega:\ \tilde\psi_j(x,t,\gamma_i)<0 \text{ for some } j\in\{1,\cdots,k_2\}\}.$$

If the path is found to activate constraints while moving at time $T_s$, i.e. $\gamma_i([0,T_0])\cap O_c^{T_s}\neq\emptyset$, we define

$$T_b=\inf\{t:\ \gamma_i(t)\in O_c^{T_s}\}$$

as the first intersection time. Then $\gamma_i(T_b)$ must be on the boundary of $O_c^{T_s}$, i.e. $\gamma_i(T_b)\in\partial O_c^{T_s}$. When this happens, we can always pick a stopping time $T_i<T_b$ such that the distance from $\gamma_i(T_i)$ to the nearest obstacle is smaller than the detection radius $R$. Afterwards, we update $O_c$, assign the initial configuration as $\gamma_i(T_i)$, and go back to the graph generating step. A new path is produced each time the current path is blocked. We collect all paths produced in Algorithm 1 as $\{\gamma_i\}$, and their stopping times as $\{T_i\}$. From our choices of stopping times, we can require that there exists a positive constant $q$ such that, for every $i$, $B(\gamma_i(T_i),q)$ has non-empty intersection with $O_c^{T_i}$. Such a selected stopping time set satisfies the following property

$$\sup_i\inf_{\epsilon}\{\epsilon:\ B(\gamma_i(T_i),\epsilon)\cap O_c^{T_i}\neq\emptyset\}=q, \qquad (4)$$

in which the detectable region at configuration $x$, using (1), is defined as the closed set

$$\bar B(x,R)=\{y\in\Omega:\ d(x,y)\leq R\}.$$

We emphasize that $q$ can be selected uniformly. For example, we can simply let the robots stop at a position that has a fixed distance to the obstacles each time the path is blocked; in this setup, $q$ equals that fixed distance. In general, we can select different stop positions. The finite-step convergence property, presented in the next subsection, is guaranteed as long as (4) is satisfied.
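On a discretized path, the stopping rule can be sketched as follows (a simplified Python illustration with our own names; the paper works with continuous-time paths):

```python
import numpy as np

def stop_before_obstacle(path, in_obstacle, stop_dist):
    """Return the index at which to stop on a discretized path.

    Walk along the samples; when a sample first lands in the (newly
    detected) obstacle set, back up to the last earlier sample whose
    distance to that blocked sample exceeds `stop_dist`.  Returns
    len(path) - 1 when the path is never blocked.
    """
    path = [np.asarray(x, float) for x in path]
    for b, x in enumerate(path):
        if in_obstacle(x):                  # first intersection index (T_b)
            j = b - 1                       # back off to keep a safe margin
            while j > 0 and np.linalg.norm(path[j] - x) <= stop_dist:
                j -= 1
            return max(j, 0)                # stopping index (T_i)
    return len(path) - 1
```

The returned index plays the role of the stopping time $T_i$: strictly before the first blocked sample, but still within detection range of the obstacle.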

### 2.4 Convergence and complexity

The proposed algorithms terminate in finite steps with guaranteed convergence, which is stated in the following main theorems.

###### Theorem 2.1.

Assuming that (2) is true and

$$l<\frac{2L}{\sqrt{n}},$$

where $n$ is the dimension of $\Omega$, the graph generation algorithm (Algorithm 2) stops in finite steps. That is, the loop in the algorithm terminates in a finite number of iterations, and the generated graph is connected and has a finite number of vertices. Furthermore, the target $x_f$ is included in $V$ if a feasible path exists.

###### Theorem 2.2.

Let $\{\gamma_i\}$ be the paths produced by Algorithm 1 with $\{T_i\}$ being the stopping time set. If the assumptions in Theorem 2.1 and (4) hold, then the number of paths is finite; that is, the loop in Algorithm 1 terminates in a finite number of iterations.

Theorem 2.1 shows that, given the currently known environment, the graph generating procedure stops in finite steps. Theorem 2.2 tells us that our algorithm breaks the loop in Algorithm 1 in finite steps. Together, the two theorems ensure that Algorithm 1 converges in finite steps and guarantees a feasible path under condition (2). Therefore, the algorithm is complete. We leave the proofs of both theorems to Section 6.

Furthermore, if the configuration space is of dimension $n$, there are at most $2n$ new points generated at each step in Algorithm 2; hence the size of the graph grows by at most $O(n)$ at each iteration. The complexity of the environment updating step relies on the techniques used to detect the environment, so we do not consider it here. Overall, the proposed algorithms are scalable to high dimensional problems, because the growth of the graph is controlled linearly with respect to the dimension and it stops in finite steps. This feature is illustrated by the numerical examples presented in the next section.

## 3 Numerical Examples

We set the working space to be a square in all examples and denote the graph generating radius by $l$. In this section, we show various low-dimensional experiments to give a basic impression of how the algorithm works, followed by several high dimensional cases with different environments. In all examples, the start configurations are marked as red diamonds while the targets are red circles. The potential function is taken as the Euclidean distance from any point to the target, i.e. $p(x)=\|x-x_f\|$, where $x_f$ denotes the target configuration.

### 3.1 Low dimensional cases

The first example is one robot moving in an unknown environment (Figure 5). The configurations are the physical locations of the robot, so this is a two-dimensional problem. We fix the generating radius $l$ in our graph generating algorithm. Initially, the robot is at the top right corner. It only has knowledge of a few nearby obstacles at the beginning, while the other obstacles are unknown. Hence, the graph expands greedily towards the target until reaching the destination, as shown in Figure 5(a). The first path is found by BFS on this graph, shown in Figure 6(a). However, while moving, the robot detects that the path is blocked. It stops before reaching the obstacle boundary and starts a new round of the graph generating, path finding and environment updating steps. During the process, the robot generates several graphs (Figure 5(b,c,d)) and updates the environment while moving along the corresponding paths, as shown in Figure 6(b,c,d), all of which fail to reach the destination. In the end, it generates a graph (Figure 5(e)) and finds a path (Figure 6(e)) to the target. The complete path from the initial configuration to the target is provided in Figure 6(f).

The set-ups for the next example are the same as in the previous one except for the initial configuration. The generated graphs are depicted in Figure 7(a-e) in time order and the corresponding paths are in Figure 8(a-e), while the complete path is shown in Figure 8(f). Despite the difference in the initial configurations, the algorithm gives similar paths (Figures 6(f) and 8(f)). In the next experiment, we keep the settings used in the second example but enlarge the generating radius $l$. By doing so, the robot can no longer move into the central box from the top left corner as it does in the first two examples (Figures 6(e) and 8(e)). Instead, it moves down and finds a different way to the destination. This path reaches a higher-potential area than the previous paths. The graphs for this example are depicted in Figure 9 and the paths in Figure 10 respectively.

To conclude the lower dimensional cases, we display the graphs and paths produced by Algorithm 1 for a different environment in Figures 11 and 12. Similar behaviors can be observed in those pictures.

### 3.2 High dimensional cases

In the next few examples, we calculate the paths for several multi-agent systems. In addition to the constraints imposed by the obstacles, we also enforce that the robots cannot be too close to or too far away from each other. In our examples, we require that any two robots keep their distance within a prescribed range when moving in the unknown environment. Besides, the link between each pair of robots cannot be blocked by obstacles. All examples are accompanied by YouTube videos. In Figure 13, a 2-robot (4-dimensional) system is used. From the pictures, we observe that the robots move up until trapped, because they always choose the fastest potential-decaying direction in the known environment. They then retreat and eventually find the correct way (video at https://youtu.be/6wKe7wnlG58). The next example is a 3-robot system (a 6-dimensional problem) shown in Figure 14. The environment allows a direct path from the initial configuration to the target. The algorithm immediately finds this direct path and avoids taking detours (video at https://youtu.be/q84VhKfYUyo). Finally, a 5-robot system is shown in Figure 15 to demonstrate that the algorithm is capable of solving a 10-dimensional problem with a complicated environment (video at https://youtu.be/H5lfzAYbfRA), in which the robots need to twist so that they can successfully pass through the gaps between obstacles. Another example of a 10-robot system (with a 20-dimensional configuration space) moving in an unknown environment can also be found online (video at https://youtu.be/gVinTsto7pE). We note that it takes about 1 minute to finish the entire computation for this 10-robot system using Matlab on a regular laptop (a MacBook Pro with a 2.9 GHz Intel Core i5 CPU), with no particular effort made to optimize the implementation of the algorithm.

In addition to the displayed paths, we illustrate the performance of the algorithms using several other measurements. Table 1 shows collective information about the number of vertices in the graphs generated during the procedure. In this table, the “Figure” column indicates the corresponding figures of the examples, “num of robots” represents the number of robots, and “$l$” is the generating radius in each experiment. To show the efficiency of our algorithms, we report the average number of nodes in the graphs in the “avg” column, and the maximum number of vertices among all graphs in the “max” column. We can see that as the dimension of the problem (indicated in the “dim” column) increases, the size of the graphs increases, but not as fast as exponential growth with respect to the dimensionality.

Furthermore, we observe that the algorithm generates the graph with the maximum number of vertices when the robots are trapped in a local minimizer (shown in the “trapped” column). In the 6-dimensional example, the robots do not encounter any local minimizer, which results in much fewer vertices; in fact, the sizes of its graphs are smaller than those in the 4-dimensional case. We also observe that the number of graphs generated by the algorithm (the “num of G” column) depends strongly on the environment and the choice of the orthonormal basis, so it is not used as a criterion to judge the efficiency of the algorithm. Overall, our algorithm is relatively efficient, especially when dealing with high dimensional problems. The most costly part is escaping the local traps, and we propose a couple of strategies to improve the performance in the next section.

## 4 Escaping Local Traps Rapidly

From the experiments conducted, we notice that the number of generated vertices increases when the robots are trapped in local minimizers, and the number of nodes at each local trap is proportional to the volume of the trap. This is not a surprise, because a nearly exhaustive search is used to escape local traps. To reduce this cost, we present two different strategies. Before doing so, we need to identify local minimizers and define their trap regions. We say that a node point $x$ is a local minimizer if no lower-potential points around $x$ can be generated by Algorithm 2. Since a local trap can only be created by constraints, owing to the convexity of the potential function, we define the trap region as a set enclosed by the boundary of the local constraints and a level curve (hyper-surface in high dimensional problems) of the potential function in the following way:

$$L^*(x)=\sup_c\{L(c):\ L(a) \text{ is closed for all } a\in[c_0,c]\}, \qquad (5)$$

where $c_0=p(x)$ and $L(c)$ is the closed set containing $x$ with

$$\partial L(c)\subset\big(\partial O_c\cup\{y:\ p(y)=c\}\big).$$

When a local trap is identified, our goal is to find points located on the intersection of the obstacle boundary and the level curve (surface) given by (5) as quickly as possible, and then continue to generate vertices outside of the trap region. Here we introduce two different dimension reduction methods to achieve this goal.

Keep the robot near obstacles: We know that some of the constraints in $O_c$ must be nearly activated around the local minimizer $x$. For ease of presentation, we denote those nearly activated constraints as $\eta_1,\cdots,\eta_m$ for some integer $m$, meaning $|\eta_j(y)|<\epsilon$ for each $j$, where $\epsilon$ is a small positive number and each $\eta_j$ is some $\phi_i$ or $\tilde\psi_i$. We modify the algorithm so that it only generates points satisfying these inequalities, that is, we only add points $y$ with $|\eta_j(y)|<\epsilon$ to $V$, until there is a vertex in

$$E=\{x\in B(y,\sqrt{2}\,l):\ \exists\, z\in\{c\in(x\pm N)\setminus V:\ p(c)<p(x)\}\},$$

where $N$ is the substituted set of orthonormal basis vectors for the subspace, $V$ is the current vertex set, and the $\eta_j$ are the constraints mentioned above. After this point, we go back to Algorithm 2. With the same assumptions as stated in Theorems 2.1 and 2.2, we can show that this method finds the path in finite steps, and the proof of convergence follows the same arguments as provided in Section 6. Since the constraints are continuous locally, the modified search is conducted in a low-dimensional subspace if $\epsilon$ is chosen appropriately.

Fix the shape formed by robots: A different way to get out of the local traps is to introduce a set of new constraints on the robots so that they restrict the graph generation to a low dimensional subspace $S$. For example, one may fix the pairwise distance between robots, so each added constraint $h_j(x)=0$ indicates that the distance between a certain pair of robots is a given value. In a 2-D or 3-D workspace, those restrictions often lead to a fixed shape formed by the robots. Each $h_j$ reduces the search dimension by one because the new vertices added to $V$ must satisfy $h_j(x)=0$. Similar to the previous strategy, we stop this procedure when a vertex outside the known local trap is generated, which indicates that the robots have moved out of the trap. On the other hand, it is possible that after adding the new constraints, there is no feasible way to move out. In this case, no new vertex can be generated in $S$; we then remove one of the added constraints and continue with the graph generating algorithm in a subspace that is one dimension higher than the previous one. The procedure is repeated if necessary. For this method, if we further assume that there is a feasible tube in the low dimensional subspace defined by all constraints, including the added ones $h_j$, we can use the same proof to show convergence in a finite number of steps. In this paper, we implement this dimension reduction strategy in our high dimensional examples.
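As one concrete instance of this fixed-shape strategy (a minimal sketch of our own; the paper's implementation may differ), rigidly translating the whole formation preserves all pairwise distances and shrinks the number of candidate moves from $2md$ to $2d$ for $m$ robots in a $d$-dimensional workspace:

```python
import numpy as np

def rigid_translation_candidates(config, l):
    """Dimension-reduced candidate moves for a multi-robot system.

    `config` is an (m, d) array of m robot positions in a d-dimensional
    workspace.  Fixing all pairwise distances and translating the whole
    formation leaves only 2*d candidate moves instead of 2*m*d, since
    every robot shifts by the same +/- l along each workspace axis.
    """
    config = np.asarray(config, float)
    d = config.shape[1]
    moves = []
    for i in range(d):
        for s in (+1.0, -1.0):
            shift = np.zeros(d); shift[i] = s * l
            moves.append(config + shift)    # pairwise distances unchanged
    return moves
```

Each candidate configuration can then be passed through the usual feasibility checks before being added to the vertex set.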

In the two- and five-robot cases demonstrated in Section 3, we fix the distance between each pair of robots when a local minimizer is encountered. To compare the results, we carry out several new experiments in which all set-ups, including the initial and target configurations, the obstacles, and all parameters, are the same. The final path and how the robots move can be found in videos (two robots with the improved algorithm: https://youtu.be/od5fmuo8cR8; five robots: https://youtu.be/vVHThxmtmf8). The information on the generated graphs is displayed in Table 2, where we can see that the number of vertices decreases significantly. In the 5-robot case, the largest graph is no longer produced at local traps. Instead, the first generated graph contains more nodes because of the long distance from the initial to the target configuration. In our examples, we observe that the number of nodes needed around the local minimizers is reduced substantially, with the reduction depending on the edge length, the size of the local trap (assumed to be a square), and the dimension of the configuration space.

We also observe a common feature in all examples: the environment is not entirely explored, and the generated graphs are greedily expanding towards the target configuration. This special feature is not by accident. In fact, it is determined by the Fokker-Planck equation in optimal transport theory. We give a thorough discussion on their connections in the next section.

## 5 Relation to FPE and Optimal Transport

The design of the graph generating algorithm, Algorithm 2, is inspired by the evolution of the FPE, which determines a region where the search is conducted. In this section, we describe in detail how the region evolves following the solution of the FPE,

$$\begin{cases}\dfrac{\partial \rho}{\partial t}(x,t) = \nabla\cdot\big(\rho(x,t)\,\nabla p(x)\big) + \beta\,\Delta\rho(x,t)\\[4pt] \rho(x,0) = \rho_0(x),\end{cases}\tag{6}$$

where $\rho_0$ is a given distribution and $p$ is the potential function. Based on (6), the region is constructed by an intermittent diffusion process, meaning we take $\beta$ to be $0$ so that the density is transported greedily along the negative gradient direction, while we switch $\beta$ to a positive value to trigger a diffusion process when trapped at a local minimizer. For simplicity, we call the case $\beta = 0$ the gradient part and the case $\beta > 0$ the diffusion part. However, since our graph generating algorithm only chooses new points along the given orthonormal directions, we must replace $\nabla p(x)$ in (6) by its projection onto those directions:

$$u(x) = \sum_{y\in H(x)} P_y\,\nabla p(x),$$

where $P_y$ is the projection operator onto the direction $y$, and $H(x)$ is defined as

$$H(x) = \operatorname*{argmin}_{y\in A}\ \frac{\langle \nabla p(x),\, y\rangle}{\|\nabla p(x)\|}$$

with $A$ denoting the given set of orthonormal directions. The resulting equation is

$$\begin{cases}\dfrac{\partial \rho}{\partial t}(x,t) = \nabla\cdot\big(\rho(x,t)\,u(x)\big) + \beta\,\Delta\rho(x,t)\\[4pt] \rho(x,0) = \rho_0(x).\end{cases}\tag{7}$$
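To make the projection concrete, here is a small sketch (our own, under the assumption that the direction set is a finite list of unit vectors such as $\pm e_i$) of how $H(x)$ and the projected drift $u(x)$ can be computed: $H(x)$ keeps every direction minimizing $\langle\nabla p(x), y\rangle/\|\nabla p(x)\|$, and $u(x)$ sums the projections $P_y\nabla p(x) = \langle\nabla p(x), y\rangle\, y$.

```python
import numpy as np

def projected_drift(grad, A, tol=1e-12):
    """Projection of the drift onto the best-aligned directions, cf. (7).

    `A` is a list of unit vectors.  H collects the directions y minimizing
    <grad, y>/||grad|| (i.e. most aligned with the negative gradient), and
    the drift is the sum of the projections of `grad` onto those directions.
    """
    grad = np.asarray(grad, dtype=float)
    norm = np.linalg.norm(grad)
    if norm < tol:                  # at a critical point there is no drift
        return np.zeros_like(grad)
    scores = [np.dot(grad, y) / norm for y in A]
    best = min(scores)
    u = np.zeros_like(grad)
    for y, s in zip(A, scores):
        if abs(s - best) < tol:     # keep ties, since H(x) is a set
            u += np.dot(grad, y) * y  # P_y grad = <grad, y> y for unit y
    return u
```

With $A = \{\pm e_1, \pm e_2\}$ and a gradient tilted toward $e_1$, only the $\pm e_1$ component of the gradient survives; a diagonal gradient keeps both components because $H(x)$ then contains two tied directions.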

We note that both (6) and (7) can be rewritten as

$$\frac{\partial \rho}{\partial t}(x,t) = \nabla\cdot\Big\{\rho(x,t)\big[v(x) + \nabla\big(\beta\log\rho(x,t)\big)\big]\Big\},$$

where $v(x) = \nabla p(x)$ for (6) and $v(x) = u(x)$ for (7). This expression can be approximated by the following upwind discretization of (6) on a lattice grid $G_L$ (here we assume that the starting point is one of the grid points), with mesh size $\Delta x$ and orientation aligned with the orthonormal basis used in Algorithm 2 [54],

$$\frac{\partial \rho_j}{\partial t} = \left(\sum_{k\in Nb(j)} \big(F_k(\rho,\beta) - F_j(\rho,\beta)\big)_{+}\,\rho_k\, d_{jk} \;-\; \sum_{k\in Nb(j)} \big(F_j(\rho,\beta) - F_k(\rho,\beta)\big)_{+}\,\rho_j\, d_{jk}\right)\frac{1}{(\Delta x)^2},\tag{8}$$

where $(a)_+ = \max(a, 0)$, $Nb(j)$ is the set of all adjacent nodes (neighbors) of node $j$ on the grid $G_L$, and $F_j(\rho,\beta)$ denotes the variation of $F$ with respect to $\rho_j$, in which $F$ is the free energy

$$F(\rho,\beta) = \sum_{j=1}^{N}\big(p(x_j)\,\rho_j + \beta\,\rho_j\log\rho_j\big),\tag{9}$$

and $d_{jk} = 1$ for (6). A similar discretization can be derived for (7). The value of $d_{jk}$ in the discretization of (7) encodes the projection and, assuming a unit mesh size without loss of generality, can be defined as

 djk={1⟨j−k,∇p(j)⟩=min{i∈Nb(j):p(i)

where $j$ and $k$ represent the coordinates of the corresponding nodes and $\nabla p(j)$ is the gradient vector at configuration $j$. If the projection is not involved, as is the case in the diffusion part, we simply let $d_{jk} = 1$ for all $j$ and $k$.

In the rest of this section, we show how to build the region $R$ using (8). The strategy is that we alternate between the gradient ($\beta = 0$) and diffusion ($\beta > 0$) procedures to grow the region. Each time a new part of the region is formed, we simply union it with the existing one. We want to mention that at any point where we change procedures ($\beta$ from $0$ to a nonzero value, or vice versa), we reinitialize the density before evolving (7). We terminate the procedure if the target configuration is included in $R$.

### 5.1 Gradient part of Rf

For the gradient case, the points on the grid expand along the projection of the negative gradient of the potential function onto the admissible directions. We evolve (8) with $\beta = 0$ and the initial condition

$$\rho(x,0) = \delta_{x_i}(x) = \begin{cases}1 & x = x_i\\ 0 & x \neq x_i,\end{cases}$$

where $x_i$ is the starting point of the current gradient procedure. The solution on the grid can be computed until it reaches the steady state. The steady-state solution satisfies the following property.

###### Proposition 2.

The steady-state solution $\rho(x,\infty)$ is supported on a subset of the local minimizers of $p$ on the given grid.

Then we select the points

$$R_1(x_i) = \left\{x\in G_L : \frac{\partial}{\partial t}\rho(x,t) > 0 \text{ for some } t > 0\right\}.\tag{11}$$

Once $R_1(x_i)$ is determined, we merge it into the set $R$ constructed in the previous steps (we take $R$ to be empty at the first step), i.e., $R = R \cup R_1(x_i)$. If the target configuration is not yet included in $R$, we continue to amend the set with the diffusion procedure described in Section 5.2.
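The gradient sweep can be illustrated on a toy 1-D path graph (our own minimal sketch; the potential, grid, step size, and $d_{jk}\equiv 1$ are all illustrative assumptions): with $\beta = 0$ the free energy reduces to the potential, mass released as a delta flows downhill under the upwind scheme (8), and the set of nodes whose density increases at some time plays the role of $R_1(x_i)$ in (11).

```python
import numpy as np

def gradient_region(p, start, dt=0.05, steps=4000, tol=1e-15):
    """Run the beta = 0 upwind scheme (8) on a 1-D path graph and return
    R1 = {nodes whose density strictly increases at some time}, cf. (11)."""
    n = len(p)
    rho = np.zeros(n)
    rho[start] = 1.0                      # delta initial condition
    increased = set()
    for _ in range(steps):
        new = rho.copy()
        for j in range(n):
            for k in (j - 1, j + 1):      # grid neighbors
                if 0 <= k < n:
                    # with beta = 0, F_j reduces to p_j; mass flows downhill
                    new[j] += dt * (max(p[k] - p[j], 0.0) * rho[k]
                                    - max(p[j] - p[k], 0.0) * rho[j])
        increased |= {j for j in range(n) if new[j] > rho[j] + tol}
        rho = new
    return increased, rho

# Quadratic potential with its minimum at node 5; mass released at node 0.
p = [(x - 5) ** 2 for x in range(10)]
R1, rho = gradient_region(p, start=0)
```

In this toy run, mass travels monotonically from node 0 to the minimizer at node 5, so the recovered region consists exactly of the nodes along that descent; nodes past the minimizer are never visited, which mirrors the greedy expansion of the gradient part.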

### 5.2 Diffusion part of Rf

We assume that the previously constructed set is $R_p$. In the diffusion part, since $\beta > 0$, the term $\log\rho$ is involved in the calculation. To avoid blow-up in the computation, we initialize the density for (8) as below:

$$\rho(x,0) = \delta_{R_p} = \begin{cases}\dfrac{1-\epsilon}{|R_p|} & x \in R_p\\[6pt] \dfrac{\epsilon}{|G_L\setminus R_p|} & x \in G_L\setminus R_p,\end{cases}$$

where $\epsilon$ can be an arbitrarily small positive real number. With this initialization, we evolve (8) until reaching the stationary solution $\rho(x,\infty)$. Then, following $\rho(x,\infty)$, we choose points on the grid as:

$$R_2 = \bigcup_{s=0}^{W} R_2^s,\qquad R_2^s = \left\{x = \operatorname*{argmax}_{y\in G_L\setminus\bigcup_{j=0}^{s-1}R_2^j} \rho(y,\infty) \;:\; R_2^{s-1}\cap Nb(x) \neq \emptyset\right\},\qquad R_2^0 = R_p,$$

where $W$ is defined by

$$W = \operatorname*{argmin}_{s}\left\{s : \exists\, z\in R_2^s,\ \exists\, y\in Nb(z)\setminus\Big(\bigcup_{\tau=1}^{s} R_2^{\tau}\Big) \text{ with } p(y) < p(z)\right\}.$$

We union $R_2$ into $R$ by defining $R = R \cup R_2$. In the newly selected $R$, we pick

$$x_i = \operatorname*{argmin}_{y}\{p(y) : y\in Nb(x)\setminus R \text{ for some } x\in R\}.\tag{12}$$

This is the new starting point for the next gradient procedure, and we return to the gradient part as described in Section 5.1.
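The diffusion selection can likewise be sketched on a toy 1-D grid (our own construction and several simplifying assumptions: the stationary solution of (8) for $\beta > 0$ is approximated by the Gibbs distribution $e^{-p/\beta}$, the set is grown greedily by stationary density using adjacency to the accumulated set rather than to the previous layer, and we stop when a newly added node has an outside neighbor of strictly smaller potential, which is our reading of the stopping rule for $W$); the new start is then picked as in (12).

```python
import math

def diffusion_region(p, Rp, beta=0.5):
    """Toy 1-D version of the diffusion part.  The stationary solution of
    (8) with beta > 0 is approximated by the Gibbs weights exp(-p/beta).
    Nodes are added greedily by stationary density until a way out of the
    trap appears (an outside neighbor of strictly smaller potential)."""
    n = len(p)
    gibbs = [math.exp(-pj / beta) for pj in p]
    nb = lambda j: [k for k in (j - 1, j + 1) if 0 <= k < n]
    R = set(Rp)
    while True:
        frontier = [y for y in range(n) if y not in R
                    and any(k in R for k in nb(y))]
        if not frontier:
            break
        x = max(frontier, key=lambda y: gibbs[y])  # argmax of rho(y, inf)
        R.add(x)
        # stop once x has an outside neighbor strictly below it
        if any(p[y] < p[x] for y in nb(x) if y not in R):
            break
    # new starting point for the next gradient sweep, cf. (12)
    outside = {y for x in R for y in nb(x) if y not in R}
    xi = min(outside, key=lambda y: p[y]) if outside else None
    return R, xi

# Local trap at node 2, barrier at node 4, escape toward node 5.
p = [5.0, 2.0, 0.0, 3.0, 4.0, 1.0, 0.5]
R, xi = diffusion_region(p, Rp={2})
```

The region spills over the potential barrier just far enough to expose a downhill exit, and the lowest outside neighbor becomes the next gradient start, matching the alternation described above.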

By alternating the procedures to obtain $R_1$ and $R_2$ until the target configuration is included, we define the final region

$$R_f = \bigcup_{x\in R} \mathrm{Box}(x, l),$$

where $\mathrm{Box}(x,l)$ is the closed box centered at $x$ with edge length $l$. With the constructed $R_f$, we have the following theorem.

###### Theorem 5.1.

Assume that the robots stop only on node points, under the assumptions in Theorem 2.2. Then the complete path generated by the algorithm is contained in $R_f$.

The proofs of Theorem 5.1 and Proposition 2 are given in Section 6.

We show two examples in Figure 16, with initial configurations indicated by red diamonds and targets by red circles. The gray region is calculated by the gradient and diffusion procedures described in this section. The computation is done on a lattice grid. As we can see clearly in Figure 16, starting from the right middle part of the domain, the graph first expands along the x-axis, which is the projected negative gradient direction on the lattice grid, until it hits an obstacle. Then the procedure switches to the diffusion case and produces a region in front of the obstacle following the Gibbs distribution until finding a way out. After that, the procedure changes back to the gradient case and moves toward the target greedily. This time, the projected negative gradient direction coincides with the actual one. Again, the diffusion case kicks in when a local minimizer is encountered. The procedure repeats until the target is reached. Figure 16 also shows another example with a more complicated set-up.

We would like to mention that we use the forward Euler method to discretize (8) in time,

$$\rho_j^{+} = \rho_j + \left(\sum_{k\in Nb(j)} \big(F_k(\rho,\beta) - F_j(\rho,\beta)\big)_{+}\,\rho_k\, d_{jk} \;-\; \sum_{k\in Nb(j)} \big(F_j(\rho,\beta) - F_k(\rho,\beta)\big)_{+}\,\rho_j\, d_{jk}\right)\frac{\Delta t}{(\Delta x)^2}.\tag{13}$$

To make the scheme convergent, we need the following proposition, which can be regarded as the Courant-Friedrichs-Lewy (CFL) condition for numerical PDE schemes.

###### Proposition 3.

Given the lattice grid $G_L$ with grid size $\Delta x$, (13) is stable if

$$\Delta t < \min_{j}\left\{\frac{(\Delta x)^2}{\sum_{k\in Nb(j)}\big(F_j - F_k\big)_{+}\, d_{jk}},\ \frac{(1-\rho_j)\,(\Delta x)^2}{\sum_{k\in Nb(j)}\big(F_k - F_j\big)_{+}\,\rho_k\, d_{jk}}\right\}.\tag{14}$$
###### Proof.

To make the scheme stable, we need

$$\begin{cases}\rho_j + \displaystyle\sum_{k\in Nb(j)}\big(F_k - F_j\big)_{+}\,\rho_k\, d_{jk}\,\frac{\Delta t}{(\Delta x)^2} < 1\\[10pt] \rho_j - \displaystyle\sum_{k\in Nb(j)}\big(F_j - F_k\big)_{+}\,\rho_j\, d_{jk}\,\frac{\Delta t}{(\Delta x)^2} > 0\end{cases}\ \Longrightarrow\ \begin{cases}\Delta t < \dfrac{(\Delta x)^2}{\sum_{k\in Nb(j)}\big(F_j - F_k\big)_{+}\, d_{jk}}\\[10pt] \Delta t < \dfrac{(1-\rho_j)\,(\Delta x)^2}{\sum_{k\in Nb(j)}\big(F_k - F_j\big)_{+}\,\rho_k\, d_{jk}}\end{cases}$$

for all $j$. This leads to (14). In addition, following the proof of Theorem 3 in [55], we can obtain that the density stays bounded away from $0$ and $1$ for a fixed grid. Therefore $(F_j - F_k)_+$ and $(F_k - F_j)_+$ are bounded from above for all edges in $G_L$. This implies that the right-hand side of (14) is bounded from below by a positive real number, so $\Delta t$ can stay strictly positive. ∎
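As a sketch of how (13) and (14) interact (our own toy: a 1-D path graph, $d_{jk}\equiv 1$, $\Delta x = 1$, the $\epsilon$-smoothed initialization of Section 5.2, and $F_j$ read as $p_j + \beta\log\rho_j$, the variation of the free energy (9) up to an additive constant that cancels in differences), the step below recomputes the CFL-type bound every iteration and steps strictly inside it, which keeps the density positive and the total mass conserved.

```python
import math

def euler_fpe(p, beta=0.5, eps=1e-3, iters=500):
    """Forward Euler (13) for the discrete FPE on a 1-D path graph, with the
    step size chosen below the CFL-type bound (14) at every iteration."""
    n = len(p)
    # epsilon-smoothed initialization (Section 5.2) keeps log(rho) finite
    rho = [eps / (n - 1)] * n
    rho[0] = 1.0 - eps
    for _ in range(iters):
        F = [p[j] + beta * math.log(rho[j]) for j in range(n)]
        nbs = lambda j: [k for k in (j - 1, j + 1) if 0 <= k < n]
        # CFL-type bound (14), with Delta x = 1 and d_jk = 1
        dt = 0.5 * min(   # stay strictly below the bound
            min(1.0 / max(sum(max(F[j] - F[k], 0.0) for k in nbs(j)),
                          1e-12) for j in range(n)),
            min((1.0 - rho[j]) /
                max(sum(max(F[k] - F[j], 0.0) * rho[k] for k in nbs(j)),
                    1e-12) for j in range(n)),
        )
        new = rho[:]
        for j in range(n):
            for k in nbs(j):
                new[j] += dt * (max(F[k] - F[j], 0.0) * rho[k]
                                - max(F[j] - F[k], 0.0) * rho[j])
        rho = new
    return rho

# Quadratic potential with its minimum at node 3; mass released near node 0.
rho = euler_fpe(p=[(x - 3) ** 2 for x in range(7)])
```

Because the step never exceeds half the bound in (14), each node keeps at least half its mass per step and never exceeds one, so the iteration remains positive and stable; the density relaxes toward a Gibbs-like profile peaked at the potential minimum.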

###### Remark 2.

For each given $\Delta x$, we can get a region $R_f$. If we let $\Delta x$ go to $0$, which means that the lattice approaches the continuous space, we can define the limiting region. The graph produced by our algorithm must be contained in this limiting region. In fact, it is the smallest bounded region in which the search is conducted.

## 6 Convergence Analysis

### 6.1 Convergence of the algorithm

In this section, we give detailed proofs of Theorems 2.1 and 2.2. To do this, we first prove several lemmas. We begin by showing that the set of all feasible paths is compact in a finite time interval (Lemma 6.1) and that there exists a feasible path within a tubular region that is clear of obstacles (Lemma 6.2).

###### Lemma 6.1.

If there exists a feasible path, the set of feasible paths is non-empty and compact with respect to the metric given by

$$d_\Gamma(\gamma_1,\gamma_2) = \max_{t\in[0,T]}\|\gamma_1(t) - \gamma_2(t)\|,$$

where $\gamma_1$ and $\gamma_2$ are two feasible paths.

###### Proof.

Let us denote a feasible path as

$$\gamma : [0, T_0] \to \Omega.$$

By extending the path to be constant after time $T_0$, we may assume it is defined on $[0,T]$. To prove compactness, we assume there is a sequence of feasible paths $\gamma_i$ such that

$$\lim_{i\to\infty} d_\Gamma(\gamma_i, \gamma) = 0.$$

Since $\Omega$ is compact and the obstacle set is open, $\Omega\setminus O$ is compact. Therefore, for any $t$, as $i\to\infty$, we have $\gamma_i(t)\to\gamma(t)\in\Omega\setminus O$. This includes

$$\gamma(0) = \lim_{i\to\infty}\gamma_i(0) = x_s,\qquad \gamma(T) = \lim_{i\to\infty}\gamma_i(T) = x_t,\qquad \gamma(t)\in\Omega\setminus O,$$

which gives that the limit is itself a feasible path, so the set of feasible paths is closed. In addition, since $\Omega$ is compact, the set is bounded, and we conclude that it is compact. ∎

###### Lemma 6.2.

Assume that (2) holds. Then there must exist a feasible path satisfying

$$\Big(\bigcup_{t\in[0,T]} B(\gamma(t), L)\Big)\cap O = \emptyset,$$

where $O$ is the constrained set.

###### Proof.

Because of (2), there exists a sequence of paths $\gamma_n$ satisfying

$$\lim_{n\to\infty}\ \inf_{t\in[0,T]}\ \sup_{r\in[0,\infty)}\{r : B(\gamma_n(t), r)\cap O = \emptyset\} = L.$$

From Lemma 6.1, we know that the set of feasible paths is compact; therefore the sequence has a limit path $\gamma_0$ that is itself feasible. For an arbitrary $t$, denote

$$L(t) = \sup_{r\in[0,\infty)}\{r : B(\gamma_0(t), r)\cap O = \emptyset\}.$$

As the whole space is compact, the supremum is attained. One has $L(t)\geq L$ for arbitrary time $t$; that is, for the curve $\gamma_0$ defined above, $B(\gamma_0(t), L)\cap O = \emptyset$ for all $t$, and thus

$$\Big(\bigcup_{t\in[0,T]} B(\gamma_0(t), L)\Big)\cap O = \emptyset,$$

which proves the lemma. ∎

In the next few lemmas, we prove several results ensuring that the graph generating algorithm (Algorithm 2) creates new points in the feasible region when the radius $l$ is small enough compared to $L$, and that the process does not stop until reaching a neighborhood of the target configuration.

###### Lemma 6.3.

Given points $x$ and $y$ in an $n$-dimensional Euclidean space with $L = \|y - x\| > 0$. Let $\{e_1,\dots,e_n\}$ be a set containing $n$ orthonormal vectors; then there exists $z\in\{y \pm l\, e_k : k = 1,\dots,n\}$ such that $\|z - x\| < L$ if

$$0 < l < \frac{2L}{\sqrt{n}}.$$
###### Proof.

Without loss of generality, we can assume $x = 0$. Denote $y = (y_1,\dots,y_n)$, so we can write $L = \|y\|$. Since $\sum_{i} y_i^2 = L^2$, there is an index $k$ with $|y_k| \geq L/\sqrt{n}$. Consider the point $z_k = y - \mathrm{sign}(y_k)\, l\, e_k$; then

$$\|z_k - x\|^2 = \sum_{i\neq k} y_i^2 + \big(y_k - \mathrm{sign}(y_k)\, l\big)^2 = L^2 + l^2 - 2\,\mathrm{sign}(y_k)\, y_k\, l = L^2 + l^2 - 2|y_k|\, l \leq L^2 + l^2 - \frac{2L}{\sqrt{n}}\, l.$$

One has $\|z_k - x\| < L$ if $0 < l < \frac{2L}{\sqrt{n}}$. So $z_k$ is the desired point. ∎
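A quick numerical check of this bound (our own sketch): stepping from $y$ by length $l$ along the coordinate with the largest component of $y - x$, toward $x$, strictly decreases the distance whenever $0 < l < 2L/\sqrt{n}$.

```python
import math
import random

def best_step(x, y, l):
    """Move from y one grid step of length l along the coordinate with the
    largest |y_k - x_k|, toward x; cf. the proof of Lemma 6.3."""
    n = len(x)
    k = max(range(n), key=lambda i: abs(y[i] - x[i]))
    z = list(y)
    z[k] -= math.copysign(l, y[k] - x[k])
    return z

# Randomized check of the lemma's bound l < 2L/sqrt(n).
random.seed(0)
for _ in range(1000):
    n = random.randint(1, 10)
    x = [random.uniform(-1, 1) for _ in range(n)]
    y = [random.uniform(-1, 1) for _ in range(n)]
    L = math.dist(x, y)
    if L == 0:
        continue
    l = random.uniform(1e-9, 0.999 * 2 * L / math.sqrt(n))
    assert math.dist(x, best_step(x, y, l)) < L
```

This is exactly the geometric fact the convergence proof relies on: as long as the edge length of Algorithm 2 stays below $2L/\sqrt{n}$, some axis-aligned move always makes strict progress toward any point at distance $L$.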

###### Lemma 6.4.

Given a continuous path and a closed set with , and and , then there exists such that and is closed.

###### Proof.

We use the signed distance function $f$ to the boundary of the closed set, with $f(x) < 0$ when $x$ is in the interior and $f(x) > 0$ when $x$ is outside. It is clear that $f$ is continuous with respect to $x$. Hence,

$$g = f\circ\gamma : [0,T]\to\mathbb{R}$$

is also continuous, and $g$ takes values of opposite signs at the two endpoints, since one endpoint lies inside the set and the other outside. As a result, there is at least one point $t^*$ with $g(t^*) = 0$; hence $\gamma(t^*)$ lies on the boundary. In fact, all points $t$ satisfying $g(t) = 0$ are mapped onto the boundary.

Assume is open, must be open, because is continuous. This implies that is the disjoint union of some open intervals. Take one of the open interval, say , we have and . Due to the continuity of , we have , which means , and this is a contradiction. Therefore, must be closed. ∎

###### Lemma 6.5.

Assume that (2) holds and that the graph $G$ is generated by Algorithm 2 with a sufficiently small edge length. If the target configuration is not yet in the graph, then the graph generating step of Algorithm 2 does not stop, and there exists at least one point in the feasible region that can be added to $G$ by the algorithm.

###### Proof.

Let us assume, to the contrary, that the graph generating algorithm terminates after finitely many steps, returning a connected graph with vertex set $V$. We denote

$$C = \bigcup_{v\in V}\overline{B}(v, L),$$

which is a closed set with . First we want to prove by contradiction. Let us assume . Take the path in Lemma 6.2, it is true that while