Multi-Objective Policy Generation for Multi-Robot Systems Using Riemannian Motion Policies
Abstract
In many applications, multi-robot systems are required to achieve multiple objectives. For these multi-objective tasks, it is often difficult to design a single control policy that fulfills all the objectives simultaneously. In this paper, we focus on multi-objective tasks that can be decomposed into a set of simple subtasks. Controllers for these subtasks are individually designed and then combined into a control policy for the entire team. One significant feature of our work is that the subtask controllers are designed along with their underlying manifolds. When a controller is combined with other controllers, their associated manifolds are also taken into account. This formulation yields a policy generation framework for multi-robot systems that can combine controllers for a variety of objectives while implicitly handling the interaction among robots and subtasks. To describe controllers on manifolds, we adopt Riemannian Motion Policies (RMPs), and propose a collection of RMPs for common multi-robot subtasks. Centralized and decentralized algorithms are designed to combine these RMPs into a final control policy. Theoretical analysis shows that the system under the control policy is stable. Moreover, we prove that many existing multi-robot controllers can be closely approximated by the framework. The proposed algorithms are validated through both simulated tasks and robotic implementations.
1 Introduction
Multi-robot control policies are often designed by performing gradient descent on a potential function that encodes a single team-level objective, e.g. forming a certain shape, covering an area of interest, or meeting at a common location [1, 2, 3]. However, many problems involve a diverse set of objectives that the robotic team needs to fulfill simultaneously. For example, collision avoidance and connectivity maintenance are often required in addition to any primary tasks [4]. One possible solution is to encode the multi-objective problem as a single motion planning problem with various constraints [5, 6, 7, 8]. However, as more objectives and robots are considered, it becomes difficult to directly search for a solution that achieves all of the objectives simultaneously.
An alternative strategy is to design a controller for each individual objective and then combine these controllers into a single control policy. Different schemes for combining controllers have been investigated in the multi-robot systems literature. For example, one standard treatment for inter-robot collision avoidance is to let the collision avoidance controller take over the operation when there is a risk of collision [9]. A fundamental challenge for such constructions is that unexpected interactions between individual controllers can render the overall system unstable [10, 11]. Null-space-based behavioral control [12, 13] forces low-priority controllers not to interfere with high-priority controllers. However, when there are a large number of objectives, the system may not have sufficient degrees of freedom to consider all the objectives simultaneously. Another example is the potential field method, which formulates the overall controller as a weighted sum of controllers for each objective [9, 14]. While the system is guaranteed to be stable when the weights are constant, careful tuning of these constant weights is required to produce desirable behaviors.
Methods based on Control Lyapunov Functions (CLFs) and Control Barrier Functions (CBFs) [4, 10, 11] seek to optimize primary task-level objectives while formulating secondary objectives, such as collision avoidance and connectivity maintenance, as CLF or CBF constraints, and solve the resulting problem via quadratic programming (QP). While this provides a computational framework for general multi-objective multi-robot tasks, solving the QP often requires centralized computation and can be computationally demanding when the number of robots or constraints is large. Although the decentralized safety barrier certificate [11] is a notable exception, it only considers inter-robot collision avoidance, and it has not been demonstrated how the same decentralized construction can be applied to other objectives.
In this paper, we return to the idea of combining controllers and rethink how an objective and its corresponding controller are defined: instead of defining objectives directly on the configuration space, we define them on non-Euclidean manifolds, which can be lower-dimensional than the configuration space. When combining individually designed controllers, we consider both the outputs of the controllers and their underlying manifolds. In particular, we adopt Riemannian Motion Policies (RMPs) [15], a class of manifold-oriented control policies that has been successfully applied to robot manipulators, and RMPflow [16], the computational framework for combining RMPs. This framework, in which each controller is associated with a matrix-valued, state-dependent weight, can be considered an extension of the potential field method. This extension leads to new geometric insight for designing controllers and more freedom in combining them. While the RMPflow algorithm is centralized, we provide a decentralized version and establish the stability analysis for the decentralized framework.
There are several major advantages to defining objectives and controllers on manifolds for multi-robot systems. First, this formulation provides a general recipe for constructing controllers: the key step is to design the manifold for each subtask, as controllers and their desired behaviors can be viewed as a natural outcome of their associated manifolds. For example, obstacle avoidance behavior is closely related to the geodesic flow on a manifold where the obstacles manifest as holes in the space. Second, since we design controllers on the manifolds most relevant to their objectives, these manifolds are usually of lower dimension than the configuration space. When properly combined with other controllers, this provides additional degrees of freedom that help controllers avoid unnecessary conflicts. This is particularly important for multi-robot systems, where a large number of controllers interact with one another in complicated ways. Third, it is shown in [16] that Riemannian metrics on manifolds naturally provide a notion of importance that enables the stable combination of controllers. Finally, RMPflow is coordinate-free [16], which allows the proposed framework to be directly generalized to heterogeneous multi-robot teams.
We present four contributions in this paper. First, we present a centralized solution, based on RMPflow, for combining controllers to solve multi-robot tasks. Second, we design a collection of RMPs for simple and common multi-robot subtasks that can be combined to achieve more complicated tasks. Third, we draw a connection between some of the proposed RMPs and a large group of existing distributed multi-robot controllers. Finally, we introduce a decentralized extension of RMPflow, with its application to multi-robot systems, and establish the stability analysis for this decentralized framework.
2 Riemannian Motion Policies (RMPs)
We briefly review Riemannian Motion Policies (RMPs) [15], a mathematical representation of policies on manifolds, and RMPflow [16], a recursive algorithm for combining RMPs. In this section, we present RMPs and RMPflow for single robots, for which these concepts were initially defined [15, 16], and consider them in the context of multi-robot systems in subsequent sections.
Consider a robot (or a group of robots in later sections) with its configuration space $\mathcal{C}$ being a smooth $d$-dimensional manifold. For the sake of simplicity, we assume that $\mathcal{C}$ admits a global generalized coordinate $q \in \mathbb{R}^d$.^1 (^1 In the case when $\mathcal{C}$ does not admit a global coordinate, a similar construction can be done locally on a subset of the configuration space.) As is the case in [16], we assume that the system can be feedback linearized in such a way that it is controlled directly through the generalized acceleration, $\ddot{q} = u(q, \dot{q})$. We call $u$ a policy or a controller, and $(q, \dot{q})$ the state.
RMPflow [16] assumes that the task is composed of a set of subtasks, for example, avoiding collision with an obstacle, reaching a goal, tracking a trajectory, etc. In this case, the task space, denoted $\mathcal{T}$, becomes a collection of multiple subtask spaces, each corresponding to a subtask. We assume that the task space is related to the configuration space through a smooth task map $\psi: \mathcal{C} \to \mathcal{T}$. The goal of RMPflow [16] is to generate a policy $u$ in the configuration space so that the trajectory exhibits the desired behaviors on the task space $\mathcal{T}$.
2.1 Riemannian Motion Policies
Riemannian Motion Policies (RMPs) [15] represent policies on manifolds. Consider an $m$-dimensional manifold $\mathcal{M}$ with generalized coordinate $x \in \mathbb{R}^m$. An RMP on $\mathcal{M}$ can be represented in two forms, its canonical form and its natural form. The canonical form of an RMP is a pair $(a, \mathbf{M})^{\mathcal{M}}$, where $a$ is the desired acceleration, i.e. the control input, and $\mathbf{M} \in \mathbb{R}^{m \times m}$ is the inertia matrix, which defines the importance of the RMP when it is combined with other RMPs. Given its canonical form, the natural form of an RMP is the pair $[f, \mathbf{M}]^{\mathcal{M}}$, where $f = \mathbf{M} a$ is the desired force. The natural forms of RMPs are introduced mainly for computational convenience.
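The two forms and the conversion between them can be sketched in a few lines. This is a minimal illustration with hypothetical numbers; the Moore-Penrose pseudoinverse is used so that rank-deficient inertia matrices are also handled:

```python
import numpy as np

def to_natural(a, M):
    """Canonical form (a, M) -> natural form [f, M] with f = M a."""
    return M @ a, M

def to_canonical(f, M):
    """Natural form [f, M] -> canonical form (a, M); the pseudoinverse
    handles inertia matrices that are only positive semi-definite."""
    return np.linalg.pinv(M) @ f, M

# A 2-d RMP with an anisotropic inertia matrix: the first coordinate
# carries four times the weight of the second when combined with others.
a = np.array([1.0, -2.0])
M = np.diag([4.0, 1.0])
f, _ = to_natural(a, M)        # f = M a = [4, -2]
a_rec, _ = to_canonical(f, M)  # recovers a
```

The anisotropic inertia is the point of the representation: when this RMP is summed with others in natural form, its first coordinate dominates the combination.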
RMPs on a manifold can be naturally (but not necessarily) generated from a class of systems called Geometric Dynamical Systems (GDSs) [16]. GDSs are a generalization of the widely studied classical Simple Mechanical Systems (SMSs) [17]. In GDSs, the kinetic energy metric $\mathbf{G}$ is a function of both the configuration and the velocity, i.e. $\mathbf{G} = \mathbf{G}(x, \dot{x})$. This allows the kinetic energy to depend on the direction of motion, which can be useful in applications such as obstacle avoidance [15, 16]. The dynamics of GDSs are of the form
\[ \big(\mathbf{G}(x,\dot{x}) + \Xi_{\mathbf{G}}(x,\dot{x})\big)\,\ddot{x} + \xi_{\mathbf{G}}(x,\dot{x}) = -\nabla_x \Phi(x) - \mathbf{B}(x,\dot{x})\,\dot{x}, \tag{1} \]
where we call $\mathbf{B}(x,\dot{x})$ the damping matrix and $\Phi(x)$ the potential function. The curvature terms $\Xi_{\mathbf{G}}$ and $\xi_{\mathbf{G}}$ are induced by the metric $\mathbf{G}$,
\[ \Xi_{\mathbf{G}}(x,\dot{x}) = \tfrac{1}{2}\,\partial_{\dot{x}}\big(\mathbf{G}(x,\dot{x})\,\dot{x}\big), \qquad \xi_{\mathbf{G}}(x,\dot{x}) = \overset{\circ}{\mathbf{G}}(x,\dot{x})\,\dot{x} - \tfrac{1}{2}\,\nabla_x\big(\dot{x}^\top \mathbf{G}(x,\dot{x})\,\dot{x}\big), \tag{2} \]
with $\overset{\circ}{\mathbf{G}}(x,\dot{x}) = [\partial_{x_1}\mathbf{G}\,\dot{x}, \dots, \partial_{x_m}\mathbf{G}\,\dot{x}]$, where $\partial_{x_i}\mathbf{G}\,\dot{x}$ denotes the $i$th column, $x_i$ denotes the $i$th component of $x$, and $[\cdot]$ denotes matrix composition through horizontal concatenation of column vectors. Given a GDS (1), there is an RMP naturally associated with it, given by $\mathbf{M} = \mathbf{G} + \Xi_{\mathbf{G}}$ and $a = \mathbf{M}^{-1}\big({-\nabla_x \Phi} - \mathbf{B}\,\dot{x} - \xi_{\mathbf{G}}\big)$. Therefore, the velocity-dependent metric provides a velocity-dependent importance weight when the RMP is combined with other RMPs.
2.2 RMPflow
RMPflow [16] is an algorithm that generates control policies on the configuration space given the RMPs for all subtasks, for example, collision avoidance with a particular obstacle, reaching a goal, etc. Given the state information of the robot in the configuration space and a set of individually designed controllers (RMPs) for the subtasks, RMPflow produces the control input on the configuration space by combining these controllers.
RMPflow introduces: i) a data structure, the RMP-tree, to describe the structure of the task map $\psi$, and ii) a set of operators, the RMP-algebra, to propagate information across the RMP-tree. An RMP-tree is a directed tree. Each node in the RMP-tree is associated with a state $(x, \dot{x})$ defined over a manifold, together with an RMP $[f, \mathbf{M}]$. Each edge $e$ in the RMP-tree is augmented with a smooth map $\psi_e$ from the manifold of the parent node to that of the child node. An example RMP-tree is shown in Fig. 1. The root node of the RMP-tree is associated with the state $(q, \dot{q})$ of the robot and its control policy on the configuration space $\mathcal{C}$. Each leaf node corresponds to a subtask, with its control policy given by an RMP $[f_l, \mathbf{M}_l]^{\mathcal{T}_l}$, where $\mathcal{T}_l$ is a subtask space.
The RMP-algebra consists of three operators: pushforward, pullback, and resolve. To illustrate how they operate, consider a node $u$ with $K$ child nodes, denoted $\{v_j\}_{j=1}^{K}$. Let $\{e_j\}_{j=1}^{K}$ be the edges from $u$ to the child nodes (Fig. 1). Suppose that $u$ is associated with the manifold $\mathcal{M}$ with coordinate $x$, while each child node $v_j$ is associated with the manifold $\mathcal{N}_j$ with coordinate $y_j$. The RMP-algebra works as follows:

The pushforward operator forward-propagates the state from the parent node $u$ to its child nodes $\{v_j\}_{j=1}^{K}$. Given the state $(x, \dot{x})$ associated with $u$, the pushforward operator at $v_j$ computes its associated state as $(y_j, \dot{y}_j) = (\psi_{e_j}(x), \mathbf{J}_{e_j}(x)\,\dot{x})$, where $\mathbf{J}_{e_j} = \partial_x \psi_{e_j}$ is the Jacobian matrix of the map $\psi_{e_j}$.

The pullback operator combines the RMPs from the child nodes to obtain the RMP associated with the parent node $u$. Given the RMPs $\{[f_j, \mathbf{M}_j]^{\mathcal{N}_j}\}_{j=1}^{K}$ from the child nodes, the RMP $[f, \mathbf{M}]^{\mathcal{M}}$ associated with node $u$ is computed by the pullback operator as
\[ f = \sum_{j=1}^{K} \mathbf{J}_{e_j}^\top \big(f_j - \mathbf{M}_j\, \dot{\mathbf{J}}_{e_j}\, \dot{x}\big), \qquad \mathbf{M} = \sum_{j=1}^{K} \mathbf{J}_{e_j}^\top \mathbf{M}_j\, \mathbf{J}_{e_j}. \]

The resolve operator maps a natural-form RMP to its canonical form. Given the natural-form RMP $[f, \mathbf{M}]^{\mathcal{M}}$, the operator produces $(a, \mathbf{M})^{\mathcal{M}}$ with $a = \mathbf{M}^{\dagger} f$, where $\dagger$ denotes the Moore-Penrose inverse.
With the RMP-tree specified, RMPflow generates the control policy through the following process. First, RMPflow performs a forward pass: it recursively calls pushforward from the root node to the leaf nodes to update the state information associated with each node in the RMP-tree. Second, every leaf node evaluates its natural-form RMP $[f_l, \mathbf{M}_l]^{\mathcal{T}_l}$, possibly given by a GDS. Next, RMPflow performs a backward pass: it recursively calls pullback from the leaf nodes to the root node to back-propagate the RMPs in natural form. After that, RMPflow calls resolve at the root node to transform the RMP into its canonical form $(a_r, \mathbf{M}_r)^{\mathcal{C}}$. Finally, the robot executes the control policy by setting $u = a_r$.
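As a concrete (and deliberately tiny) sketch of this pipeline, the following code runs one RMPflow evaluation on a one-level tree with two hypothetical 1-d goal-attractor leaves. Node structure, gains, and maps are illustrative; the Jacobians are constant, so the $\dot{\mathbf{J}}$ term of pullback vanishes:

```python
import numpy as np

class Leaf:
    """A leaf RMP node: a differentiable map psi from the root coordinate,
    its Jacobian, and a user-designed policy returning a natural form [f, M]."""
    def __init__(self, psi, jacobian, policy):
        self.psi, self.jacobian, self.policy = psi, jacobian, policy

def rmpflow_step(q, qd, leaves):
    """One evaluation of RMPflow on a one-level tree (root -> leaves):
    pushforward states, evaluate leaf RMPs, pullback, then resolve."""
    f_root = np.zeros_like(q)
    M_root = np.zeros((len(q), len(q)))
    for leaf in leaves:
        # pushforward: x = psi(q), xd = J qd
        J = leaf.jacobian(q)
        x, xd = leaf.psi(q), J @ qd
        # evaluate the leaf policy on its own manifold
        f, M = leaf.policy(x, xd)
        # pullback (J is constant here, so the J-dot term vanishes)
        f_root += J.T @ f
        M_root += J.T @ M @ J
    # resolve: a = pinv(M) f
    return np.linalg.pinv(M_root) @ f_root

# Hypothetical example: two PD goal attractors on 1-d projections of R^2.
def make_attractor(axis, goal, k=1.0, b=2.0, m=1.0):
    e = np.eye(2)[axis]
    return Leaf(psi=lambda q: q[axis:axis + 1],
                jacobian=lambda q: e.reshape(1, 2),
                policy=lambda x, xd: (np.array([-k * (x[0] - goal) - b * xd[0]]),
                                      np.array([[m]])))

leaves = [make_attractor(0, 1.0), make_attractor(1, -1.0)]
a = rmpflow_step(np.zeros(2), np.zeros(2), leaves)  # accelerates toward (1, -1)
```

From rest at the origin, the resolved acceleration points toward the combined goal $(1, -1)$; with unequal leaf inertias $m$, the resolve step would trade the two objectives off against each other instead.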
2.3 Stability Properties of RMPflow
To establish the stability results of RMPflow, we assume that every leaf node is associated with a GDS. Before stating the stability theorem, we need to define the metric, damping matrix, and potential function for a node in the RMP-tree. {definition} If a node $u$ is a leaf, its metric, damping matrix, and potential function are defined as in its associated GDS (1). Otherwise, let $\{v_j\}_{j=1}^{K}$ and $\{e_j\}_{j=1}^{K}$ denote the set of all child nodes of $u$ and the associated edges, respectively. Suppose that $\mathbf{G}_{v_j}$, $\mathbf{B}_{v_j}$, and $\Phi_{v_j}$ are the metric, damping matrix, and potential function for the child node $v_j$. Then, the metric $\mathbf{G}_u$, damping matrix $\mathbf{B}_u$, and potential function $\Phi_u$ for the node $u$ are defined as,
\[ \mathbf{G}_u = \sum_{j=1}^{K} \mathbf{J}_{e_j}^\top \big(\mathbf{G}_{v_j} \circ \psi_{e_j}\big)\,\mathbf{J}_{e_j}, \qquad \mathbf{B}_u = \sum_{j=1}^{K} \mathbf{J}_{e_j}^\top \big(\mathbf{B}_{v_j} \circ \psi_{e_j}\big)\,\mathbf{J}_{e_j}, \qquad \Phi_u = \sum_{j=1}^{K} \Phi_{v_j} \circ \psi_{e_j}, \tag{3} \]
where $\circ$ denotes function composition. The stability theorem of RMPflow [16] (referred to as Theorem 2.3 below) then states that if $\mathbf{G}_r, \mathbf{B}_r \succ 0$ at the root node and the inertia matrix $\mathbf{M}_r$ is nonsingular, the system converges to the forward invariant set $\{(q, \dot{q}) : \nabla_q \Phi_r = 0,\ \dot{q} = 0\}$.
3 Centralized Control Policy Generation
We begin by formulating a control policy generation algorithm for multi-robot systems directly based on RMPflow. This algorithm is centralized because it requires a centralized processor to collect the states of all robots and solve for the control inputs of all robots jointly given all the subtasks. In Section 4, we introduce a decentralized algorithm and analyze its stability properties.
Consider a potentially heterogeneous^2 (^2 Please see Appendix D for a discussion of heterogeneous teams.) team of $n$ robots indexed by $\mathcal{I} = \{1, \dots, n\}$. Let $\mathcal{C}_i$ be the configuration space of robot $i$, with $q_i$ a generalized coordinate on $\mathcal{C}_i$. The configuration space is then the product manifold $\mathcal{C} = \mathcal{C}_1 \times \cdots \times \mathcal{C}_n$. As in Section 2, we assume that each robot is feedback linearized, and we model the control policy for each robot as a second-order differential equation $\ddot{q}_i = u_i(q, \dot{q})$. An obvious example is a team of mobile robots with double-integrator dynamics in the plane. Note, however, that the approaches proposed in this paper are not restricted to mobile robots in Euclidean spaces.
Let $\mathcal{S}$ denote the index set of all subtasks. For each subtask $s \in \mathcal{S}$, a controller is individually designed to generate RMPs on the subtask manifold $\mathcal{T}_s$. Here we assume that the subtasks are pre-allocated in the sense that each subtask is defined for a specified subset of robots $\mathcal{I}_s \subseteq \mathcal{I}$. Examples of subtasks include collision avoidance between a pair of robots (a binary subtask), trajectory following for a single robot (a unitary subtask), etc.
The above formulation gives us an alternative view of multi-robot systems that emphasizes their multi-task nature. Rather than encoding the team-level task as a global potential function (as is commonly done in the multi-robot literature), we decompose the task into local subtasks defined for subsets of robots, and design policies for the individual subtasks. The main advantage is modularity: as the task becomes more complex, it becomes increasingly difficult to design a single potential function that renders the desired global behavior, whereas it is often natural to decompose even complex global tasks into local subtasks, since multi-robot tasks often arise from local specifications [3, 4]. Therefore, this formulation provides a straightforward generalization to multi-objective tasks. Moreover, this subtask formulation allows us to reuse existing controllers designed for single-robot tasks, such as collision avoidance, goal reaching, etc.
Recall from Section 2 that RMPflow operates on an RMP-tree, a tree structure describing the task space. The main objective of this section is thus to construct an RMP-tree for general multi-robot problems. Note that, given a set of subtasks, the construction of the RMP-tree is not unique. One way to construct an RMP-tree is to use non-leaf nodes to represent subsets of the team:

The root node corresponds to the joint configuration space $\mathcal{C}$ and its associated control policy.

Any leaf node is augmented with a user-specified policy represented as an RMP on its subtask manifold $\mathcal{T}_s$.

Every non-leaf node is associated with the product of the configuration spaces of a subset of the team.

The parent of the leaf node for subtask $s$ is associated with the joint configuration space $\prod_{i \in \mathcal{I}_s} \mathcal{C}_i$, where $\mathcal{I}_s$ is the set of robots that subtask $s$ is defined on.

Consider two non-leaf nodes $u$ and $v$ such that $v$ is a descendant of $u$ in the RMP-tree. Let $\mathcal{I}_u$ and $\mathcal{I}_v$ be the subsets of robots corresponding to nodes $u$ and $v$, respectively. Then $\mathcal{I}_v \subseteq \mathcal{I}_u$.
Fig. 2 shows an example RMP-tree for a team of three robots. The robots are tasked with forming a certain shape and reaching a goal while avoiding inter-robot collisions. The root of the RMP-tree is associated with the configuration space of the team, which is the product of the configuration spaces of all three robots. On the second level, the nodes represent subsets of robots which, in this case, are pairs of robots. Several leaf nodes, such as the ones corresponding to collision avoidance and distance preservation, are children of these nodes, as they are defined on pairs of robots. One level deeper is the node corresponding to the configuration space of an individual robot; the goal attractor leaf node is its child since the goal-reaching subtask is assigned only to that robot.
Note that branching in the RMP-tree does not necessarily define a partition over robots. Let $u$ and $v$ be children of the same node, and let $\mathcal{I}_u$ and $\mathcal{I}_v$ be their subsets of robots; it is not necessary that $\mathcal{I}_u \cap \mathcal{I}_v = \emptyset$. For example, in Fig. 2, the three nodes on the second level are defined for three pairs of robots, and the intersection of any two of these pairs is nonempty. In fact, if a branching is indeed a partition, then the problem can be split into independent subproblems. For multi-robot systems, this means that the team consists of independent subteams with completely independent tasks, which rarely occurs in practice.
According to Theorem 2.3, if all the leaf-node controllers are designed through GDSs, the controller generated by RMPflow is guaranteed to drive the system to a forward invariant set provided that $\mathbf{G}_r, \mathbf{B}_r \succ 0$ and $\mathbf{M}_r$ is nonsingular. In other words, the resulting system is stable, which is important: unstable behaviors such as high-frequency oscillation are avoided and, more importantly, stability provides formal guarantees on the performance of certain types of subtasks, such as collision avoidance, as discussed in Section 3.1.
To elucidate the process of designing RMPs and to connect to relevant multi-robot tasks, we provide examples of RMPs for multi-robot systems that can produce complex behaviors when combined. In the following examples, we use $q_i$ to denote the coordinate of robot $i$ in its configuration space. An additional map can be composed with the given task maps if robots possess different kinematic structures.
3.1 Pairwise Collision Avoidance
To ensure safe operation of the robotic team, inter-robot collisions should be avoided. We formulate collision avoidance as ensuring a minimum safety distance $d_s$ for every pair of robots. To generate collision-free motions, for any two robots $i$ and $j$, we construct a collision avoidance leaf node for the pair. The subtask space is the 1-d distance space, i.e. $x = \|q_i - q_j\|$. Here, we use $x$ (italic) to denote that it is a scalar on the 1-d space.
To ensure a safety distance between the pair, we use a construction similar to the collision avoidance RMP for static obstacles in [16]. The metric for the pairwise collision avoidance RMP is defined as $g(x, \dot{x}) = w(x)\,u(\dot{x})$, where $w(x)$ is a weight function that decreases rapidly as $x$ increases, and $u(\dot{x}) = \epsilon + \min(0, \dot{x})\,\dot{x}$, with $\epsilon$ a small positive scalar. The metric retains a large value when the robots are close to each other ($x$ is small) and when the robots are moving fast towards each other ($\dot{x} < 0$ and $|\dot{x}|$ is large). Conversely, the metric decreases rapidly as $x$ increases. Recall that the metric is closely related to the inertia matrix, which determines the importance of the RMP when combined with other policies. This means that the collision avoidance RMP dominates when robots are close to each other or moving fast towards each other, while it has almost no effect when the robots are far from each other.
We next design the GDS that generates the collision avoidance RMP. The potential function $\Phi(x)$ is a barrier-type function that approaches infinity as $x$ approaches the safety distance $d_s$, and the damping matrix is a positive multiple of the metric, with positive scalar gains. Due to the stability guarantee of RMPflow, this barrier-type potential ensures that the distance between robots always remains greater than $d_s$. This means that the resulting control policy from RMPflow is always collision-free.
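A minimal numerical sketch of such a leaf is given below. The gain values, the $1/z^4$ weight, and the barrier potential $\alpha/(2z^2)$ are plausible choices in the spirit of [16], not the paper's exact design, and the curvature terms of the velocity-dependent metric are omitted:

```python
import numpy as np

def collision_avoid_rmp(x, xd, d_safe=0.2, eps=1e-2, alpha=1e-3, eta=2.0):
    """One plausible instantiation of the pairwise collision avoidance RMP
    on the 1-d distance space x = ||q_i - q_j||, in natural form [f, m]."""
    z = x - d_safe                       # distance to the safety boundary
    w = 1.0 / z**4                       # grows rapidly as robots approach
    u = eps + min(0.0, xd) * xd          # grows when closing in (xd < 0)
    m = w * u                            # scalar metric: the RMP's "weight"
    grad_phi = -alpha / z**3             # d/dx of barrier Phi = alpha/(2 z^2)
    # Damping taken proportional to the metric; curvature terms omitted.
    f = -grad_phi - eta * m * xd
    return f, m

# Far apart and receding: the RMP is almost inert.
f_far, m_far = collision_avoid_rmp(x=2.0, xd=0.5)
# Close and approaching: large weight, strong repulsive force.
f_near, m_near = collision_avoid_rmp(x=0.3, xd=-0.5)
```

The contrast between the two calls is the design point: the near/approaching case produces a metric several orders of magnitude larger, so this leaf dominates the pullback sum exactly when a collision is imminent.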
3.2 Pairwise Distance Preservation
Another common task for multi-robot systems is to form a specified shape or formation. This can be accomplished by maintaining the inter-robot distances between certain pairs of robots. Therefore, formation control can be induced by a set of leaf nodes that maintain distances. Such an RMP can be defined on the 1-d distance space $x = \|q_i - q_j\|$, where $d_{ij}$ is the desired distance between robot $i$ and robot $j$. For the GDS, we use a constant metric $g = c$. The potential function is defined as $\Phi(x) = \tfrac{\alpha}{2}(x - d_{ij})^2$ and the damping is $b = \eta$, with $c, \alpha, \eta > 0$. We will refer to this RMP as Distance Preservation RMP-a in later sections.
Note that the above RMP is not equivalent to the potential-based formation controller in, e.g. [2, 3]. However, there does exist an RMP that has very similar behavior. It is defined on the product space $\mathcal{C}_i \times \mathcal{C}_j$. The metric for this RMP is also constant, $\mathbf{G} = c\,\mathbf{I}$, where $c > 0$ and $\mathbf{I}$ denotes the identity matrix. The potential function is defined as $\Phi(q_i, q_j) = V(\|q_i - q_j\|)$, where $V$ is differentiable and achieves its minimum at the desired distance $d_{ij}$. Common choices include $V(s) = (s - d_{ij})^2$ and $V(s) = (s^2 - d_{ij}^2)^2$ [2]. The damping matrix is defined as $\mathbf{B} = \eta\,\mathbf{I}$, with $\eta > 0$. This RMP will be referred to as Distance Preservation RMP-b in later sections.
When there are only distance-preserving RMPs in the RMP-tree, the resulting individual-level dynamics are given by
\[ \ddot{q}_i = -\frac{1}{c\,N_i} \sum_{j : (i,j)\in\mathcal{E}} \nabla_{q_i} V\big(\|q_i - q_j\|\big) - \frac{\eta}{c}\,\dot{q}_i, \tag{4} \]
where $\mathcal{E}$ represents the set of edges in the formation graph and $N_i$ is the degree of robot $i$. This is closely related to the gradient descent update rule over the potential function with an additional damping term, normalized by the degree of the robot. We will later prove in Section 3.3 that the degree-normalized potential-based controller and the original potential-based controller behave similarly in the sense that the resulting systems converge to the same invariant set.
The main difference between the two distance-preserving RMPs is the space on which they are defined. The first RMP is defined on the 1-d distance space, while the second is defined on a higher-dimensional space. Therefore, the first RMP is more permissive in the sense that it only specifies desired behaviors on a one-dimensional submanifold of the configuration space. This is illustrated through a simulated formation preservation task in Section 5.
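The 1-d construction (Distance Preservation RMP-a) together with its pullback onto the pair $(q_i, q_j)$ can be sketched as follows. The gains are illustrative, and the $\dot{\mathbf{J}}$ correction of pullback is omitted for brevity:

```python
import numpy as np

def distance_preserve_rmp_a(qi, qj, d_des, c=1.0, alpha=5.0, eta=2.0,
                            qi_dot=None, qj_dot=None):
    """Sketch of Distance Preservation RMP-a: a GDS on the 1-d distance
    space x = ||qi - qj|| with constant metric c, quadratic potential
    around the desired distance d_des, and constant damping eta.
    Returns the natural-form RMP pulled back onto the pair (qi, qj)."""
    qi_dot = np.zeros_like(qi) if qi_dot is None else qi_dot
    qj_dot = np.zeros_like(qj) if qj_dot is None else qj_dot
    diff = qi - qj
    x = np.linalg.norm(diff)
    # Jacobian of the distance map w.r.t. (qi, qj): a length-4 row vector.
    J = np.concatenate([diff, -diff]) / x
    xd = J @ np.concatenate([qi_dot, qj_dot])
    # Leaf RMP on the distance space (natural form, scalars).
    f_leaf = -alpha * (x - d_des) - eta * xd
    m_leaf = c
    # Pullback to the pair space (J-dot term omitted).
    f = J * f_leaf                   # shape (4,): forces on qi then qj
    M = m_leaf * np.outer(J, J)      # rank-1: constrains only the distance
    return f, M

# Two planar robots 1 m apart with desired spacing 2 m: equal and
# opposite forces push them apart along the line joining them.
f, M = distance_preserve_rmp_a(np.array([1.0, 0.0]), np.array([0.0, 0.0]),
                               d_des=2.0)
```

The rank-1 pulled-back metric makes the permissiveness explicit: only motion along the inter-robot line is weighted, leaving the remaining directions free for other RMPs.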
3.3 Potentialbased Controllers from RMPs
Designing controllers based on the gradient descent rule of a potential function is very common in the multi-robot systems literature, e.g. [2, 3, 18]. Usually, the overall potential function is the sum of a set of symmetric, pairwise potential functions $V_{ij}$ between robots $i$ and $j$ that are adjacent in an underlying graph structure, i.e. $\Phi(q) = \sum_{(i,j)\in\mathcal{E}} V_{ij}(q_i, q_j)$. When the robots follow double-integrator dynamics, a damping term is typically introduced to guarantee convergence to an invariant set. Let $q$ be the ensemble-level state of the team. The controller is given by $\ddot{q} = -\nabla_q \Phi(q) - c\,\dot{q}$, where $c$ is a positive scalar. We define a degree-normalized potential-based controller as $\ddot{q} = -\mathbf{D}^{-1}\big(\nabla_q \Phi(q) + c\,\dot{q}\big)$, where $\mathbf{D}$ is a block-diagonal matrix with blocks $N_i \mathbf{I}$ and $N_i$ is the degree of robot $i$ in the graph. {theorem}[] Both the degree-normalized controller and the original potential-based controller converge to the invariant set $\{(q, \dot{q}) : \nabla_q \Phi(q) = 0,\ \dot{q} = 0\}$.
Proof.
For the original controller, consider the Lyapunov function candidate $V(q, \dot{q}) = \frac{1}{2}\|\dot{q}\|^2 + \Phi(q)$. Then $\dot{V} = \dot{q}^\top \ddot{q} + \dot{q}^\top \nabla_q \Phi = -c\,\|\dot{q}\|^2 \le 0$. By LaSalle's invariance principle [19], the system converges to the set $\{(q, \dot{q}) : \nabla_q \Phi = 0,\ \dot{q} = 0\}$. For the degree-normalized controller, consider the Lyapunov function candidate $V_d(q, \dot{q}) = \frac{1}{2}\,\dot{q}^\top \mathbf{D}\,\dot{q} + \Phi(q)$. Then $\dot{V}_d = \dot{q}^\top \mathbf{D}\,\ddot{q} + \dot{q}^\top \nabla_q \Phi = -c\,\|\dot{q}\|^2 \le 0$. The system also converges to the same set by LaSalle's invariance principle [19]. ∎
Therefore, similar to potential-based formation control, one can directly implement the degree-normalized version of these potential-based controllers by RMPs defined on the product space $\mathcal{C}_i \times \mathcal{C}_j$. The potential function for the RMP is the pairwise potential $V_{ij}$. Constant metric and damping can be used, e.g. $\mathbf{G} = c_1 \mathbf{I}$ and $\mathbf{B} = c_2 \mathbf{I}$, where $c_1$ and $c_2$ are positive scalars. Moreover, similar to formation control, one can also define RMPs on the distance space with a potential function of the inter-robot distance. Since such RMPs are defined on a lower-dimensional manifold, this approach may provide additional degrees of freedom when these RMPs are combined with other policies.
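To see the degree normalization emerge from pullback and resolve, the following sketch compares the resolved acceleration against $-\mathbf{D}^{-1}\nabla_q\Phi$ on a three-robot path graph. The positions and the quadratic pairwise potential are illustrative choices:

```python
import numpy as np

# Three planar robots on a path graph; the middle robot has degree 2.
n, dim = 3, 2
edges = [(0, 1), (1, 2)]
q = np.array([[0.0, 0.0], [1.5, 0.0], [3.5, 0.0]])
d_des = 1.0

def grad_pair(qi, qj):
    """Gradient w.r.t. qi of V = 0.5 * (||qi - qj|| - d_des)^2."""
    diff = qi - qj
    r = np.linalg.norm(diff)
    return (r - d_des) * diff / r

# pullback: each edge RMP contributes -grad V as force and identity as
# metric on its pair, so the root metric is block-diagonal with deg(i)*I.
f_root = np.zeros((n, dim))
deg = np.zeros(n)
for i, j in edges:
    f_root[i] -= grad_pair(q[i], q[j])
    f_root[j] -= grad_pair(q[j], q[i])
    deg[i] += 1.0
    deg[j] += 1.0

M_root = np.kron(np.diag(deg), np.eye(dim))
a = np.linalg.pinv(M_root) @ f_root.reshape(-1)   # resolve
a_direct = (f_root / deg[:, None]).reshape(-1)    # -D^{-1} grad(Phi)
```

The resolve step divides each robot's summed force by its degree, recovering the degree-normalized controller exactly when the leaf metrics are identities.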
4 Decentralized Control Policy Generation
Although the centralized RMPflow algorithm can be used to generate control policies for multirobot systems, it can be demanding in both communication and computation. Therefore, we develop a decentralized approximation of RMPflow that only relies on local communication and computation.
Before discussing the algorithm, a few definitions and assumptions are needed. Given the set of all subtasks $\mathcal{S}$, we say two robots $i$ and $j$ are neighbors if and only if there exists a subtask that both robots are involved in. We then say that the algorithm is decentralized if only the state information of a robot's direct neighbors is required to solve for its control input. Note that here we implicitly assume that the robots are equipped with the sensing or communication modality to access the states of their neighbors. We also assume that the task map $\psi_s$ and its Jacobian matrix are known to a robot if the robot is involved in subtask $s$. For example, for the formation control task, the robot should know how to calculate the distance between two robots given their states, and also know the partial derivatives of the distance function.
The major difference between the decentralized algorithm and the centralized RMPflow algorithm is that, in the decentralized algorithm, there is no longer a centralized root node that can generate control policies for all robots. Instead, each robot has its own RMP-tree that generates policies based on the information available locally. Therefore, the decentralized algorithm operates on a forest of RMP-trees, one per robot, called the RMP-forest. An example RMP-forest is shown in Fig. 3. There are three robots performing the same formation preservation task as in Fig. 2; hence, there are three RMP-trees in the RMP-forest. Each RMP-tree contains a leaf node for every subtask relevant to the robot associated with that tree. As a result, there are multiple copies of certain subtasks in the RMP-forest: for example, the collision avoidance node for a pair of robots appears twice, once in the RMP-tree of each robot in the pair. However, these copies do not share information.
We call the decentralized approximation partial RMPflow. In partial RMPflow, every subtask is viewed as a time-varying unitary task. Therefore, following the RMP-tree construction in the previous section, it is natural to consider one-level RMP-trees, where the leaf nodes are direct children of the root nodes.
Notation-wise, let $\mathcal{S}_i$ be the set of subtasks that robot $i$ participates in. Since there are multiple copies of the same subtasks in the RMP-forest, we use $v_s^i$ to denote the node corresponding to the copy of subtask $s$ in the tree of robot $i$, and let $e_s^i$ denote the edge from the root of tree $i$ to the leaf node $v_s^i$. We let $\psi_s$ denote the smooth map from the joint configuration space to the subtask space (which is the same across trees in the RMP-forest) and let $\mathbf{J}_s^i$ be the Jacobian matrix of $\psi_s$ with respect to $q_i$ only, i.e. $\mathbf{J}_s^i = \partial_{q_i} \psi_s$.
To compute the control input for robot $i$, an algorithm similar to RMPflow is applied in RMP-tree $i$:

pushforward: Let $x_s$ denote the coordinates of leaf node $v_s^i$ of RMP-tree $i$. Given the state $(q_i, \dot{q}_i)$ of the root, the leaf state is computed as $x_s = \psi_s(q)$, $\dot{x}_s = \mathbf{J}_s^i\,\dot{q}_i$, where $\mathbf{J}_s^i = \partial_{q_i}\psi_s$. It is worth noting that in general $\dot{x}_s \neq \frac{d}{dt}\,\psi_s(q)$, since the other robots are considered static when computing $\dot{x}_s$.

Evaluate: Let $\mathbf{G}_s$, $\mathbf{B}_s$, and $\Phi_s$ be the user-designed metric, damping matrix, and potential function for subtask $s$, evaluated for notational simplicity at the locally computed leaf state $(x_s, \dot{x}_s)$. At leaf node $v_s^i$, the RMP $[f_s, \mathbf{M}_s]$ is given by the following system (similar to a GDS),
\[ \big(\mathbf{G}_s + \Xi_{\mathbf{G}_s}\big)\,\ddot{x}_s + \xi_{\mathbf{G}_s} = -\nabla_{x_s}\Phi_s - \mathbf{B}_s\,\dot{x}_s, \tag{5} \]
where the curvature terms $\Xi_{\mathbf{G}_s}$ and $\xi_{\mathbf{G}_s}$ are evaluated at the partial state from pushforward. Note that, to provide stability, the RMP is no longer generated by a GDS. In particular, the curvature term compensates for the motion of other robots.

pullback: Given the RMPs from the leaf nodes of tree $i$, $\{[f_s, \mathbf{M}_s]\}_{s \in \mathcal{S}_i}$, the pullback operator calculates the RMP $[f_i, \mathbf{M}_i]$ for the root node of tree $i$,
\[ f_i = \sum_{s \in \mathcal{S}_i} (\mathbf{J}_s^i)^\top \big(f_s - \mathbf{M}_s\, \dot{\mathbf{J}}_s^i\, \dot{q}_i\big), \qquad \mathbf{M}_i = \sum_{s \in \mathcal{S}_i} (\mathbf{J}_s^i)^\top \mathbf{M}_s\, \mathbf{J}_s^i. \tag{6} \]
resolve: The control input is given by $u_i = \mathbf{M}_i^{\dagger} f_i$.
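The per-robot cycle above can be sketched as a one-level tree evaluation. The subtask interface below (psi, a Jacobian with respect to this robot's coordinate only, and a leaf policy) is a hypothetical one, and the $\dot{\mathbf{J}}$ term of pullback is omitted:

```python
import numpy as np

def partial_rmpflow_step(q_all, qd_i, subtasks):
    """One control cycle of partial RMPflow on a robot's one-level RMP-tree.
    q_all holds the locally available states (the robot and its neighbors);
    other robots are treated as static, so leaf velocities use only qd_i.
    Each subtask dict provides psi, its Jacobian w.r.t. this robot's
    coordinate only, and a leaf policy returning a natural form [f, M]."""
    dim = len(qd_i)
    f_root, M_root = np.zeros(dim), np.zeros((dim, dim))
    for task in subtasks:
        # pushforward: x = psi(q); xd uses only this robot's velocity,
        # so in general xd != d/dt psi(q).
        J = task["jacobian_i"](q_all)
        x, xd = task["psi"](q_all), J @ qd_i
        f, M = task["policy"](x, xd)
        # pullback (J assumed locally constant; J-dot term omitted)
        f_root += J.T @ f
        M_root += J.T @ M @ J
    # resolve: this robot's own control input
    return np.linalg.pinv(M_root) @ f_root

# Hypothetical single subtask: a PD attractor on the robot's own position.
attractor = {
    "psi": lambda q: q[0],
    "jacobian_i": lambda q: np.eye(2),
    "policy": lambda x, xd: (-1.0 * x - 0.5 * xd, np.eye(2)),
}
u = partial_rmpflow_step([np.array([1.0, 0.0])], np.array([0.0, 0.0]),
                         [attractor])
```

Each robot runs this loop independently over its own subtask copies; only neighbor states enter through q_all, which is what makes the scheme decentralized.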
Note that when all the metrics are constant diagonal matrices and all the Jacobian matrices are identity matrices, the decentralized partial RMPflow framework has exactly the same behavior as RMPflow. This, in particular, holds for the degree-normalized potential-based controllers discussed in Section 3.3. Therefore, the decentralized partial RMPflow framework can also reconstruct a large number of multi-robot controllers up to degree normalization.
Partial RMPflow has a stability result similar to that of RMPflow, which is stated in the following theorem. {theorem}[] Let $\mathbf{G}_i$, $\mathbf{B}_i$, and $\Phi_i$ be the metric, damping matrix, and potential of tree $i$'s root node. If $\mathbf{G}_i, \mathbf{B}_i \succ 0$ and $\mathbf{M}_i$ is nonsingular for all $i \in \mathcal{I}$, the system converges to a forward invariant set.
Proof.
See Appendix A. ∎
5 Experimental Results
We evaluate the multi-robot RMP framework through both simulation and robotic implementation. The detailed parameter choices for the experiments and additional simulation results can be found in Appendix B and Appendix C.
5.1 Simulation Results
In formation preservation tasks [20], robots must maintain a certain formation while the leader is driven by some external force. These tasks are considered harder than formation control tasks since one needs to carefully balance the external force and the formation controller. However, since translations and rotations preserve shape, the team should be capable of maintaining the formation regardless of the motion of the leader.
We consider a formation preservation task in simulation where a team of five robots is tasked with forming a regular pentagon while the leader has the additional task of reaching a goal. The two distance preservation RMPs introduced in Section 3.2 are compared. Distance preservation RMPs are defined for all edges in the formation graph. To move the formation, an additional goal attractor RMP is defined for the leader robot; the construction of the goal attractor RMP can be found in Appendix B.1 (referred to as Goal Attractor RMP-a) or in [16]. We use a damper RMP, defined by a GDS with only damping on the configuration space of each robot, so that the robots can come to a full stop at the goal. Fig. 4(a) shows the resulting behavior for Distance Preservation RMP-a. The robots manage to preserve the shape while the leader robot reaches the goal, since the subtasks are defined on lower-dimensional manifolds. By contrast, the behavior of Distance Preservation RMP-b (which is equivalent to the degree-normalized potential-based controller) is shown in Fig. 4(b). This distance preservation RMP fails to maintain the formation when the leader robot is attracted to the goal.
5.2 Robotic Implementations
We present several experiments (video: https://youtu.be/VZHr5SN9wXk) conducted on the Robotarium [21], a remotely accessible swarm robotics platform. Since the centralized RMPflow framework and the decentralized partial RMPflow framework have their own features, we design a separate experiment for each framework to show its full capability.
5.2.1 Centralized RMPflow Framework
The main advantage of the centralized RMPflow framework is that the subtask spaces are jointly considered and hence the behaviors of the controllers are combined consistently. To fully exploit this feature, we consider formation preservation with two subteams of robots. The two subteams are tasked with maintaining their formations while moving back and forth between two goal points. The five robots in the first subteam are assigned a regular pentagon formation, and the four robots in the second subteam must form a square. At the beginning of the task, one goal is assigned to the first subteam and the other to the second subteam. The robots negotiate their paths so that their trajectories are collision-free.
A combination of distance preservation RMPs, collision avoidance RMPs, goal attractor RMPs, and damper RMPs is used to achieve this behavior. The construction of the RMP-tree is similar to Fig. 2. A distance preservation RMPa is assigned to every pair of robots that corresponds to an edge in the formation graph, while collision avoidance RMPs are defined for every pair of robots. For each subteam, we define a goal attractor RMP for the leader; the construction of the goal attractor RMP is explained in Appendix B.1. We also use a damper RMP, defined by a GDS on the configuration space of each robot, so that the robots can reach a full stop at the goal. Fig. 5 shows several snapshots from the experiment. We see that the robots are able to maintain their corresponding formations while avoiding collisions. The two subteams rotate around each other to avoid potential collisions, which shows that the full degrees of freedom of the task are exploited.
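The combination step performed at the root of such an RMP-tree can be sketched as a metric-weighted resolve. The snippet below is a simplified illustration (curvature terms arising from velocity-dependent task maps are omitted), not the full RMPflow algorithm of [16]:

```python
import numpy as np

def resolve(leaf_rmps):
    """Combine leaf RMPs (f_k, M_k, J_k) into one root acceleration.

    Metric-weighted resolve in the RMPflow style:
    a = (sum_k J_k^T M_k J_k)^+ (sum_k J_k^T f_k).
    Simplified sketch: curvature corrections for moving task maps
    are omitted.
    """
    dim = leaf_rmps[0][2].shape[1]
    f_root = np.zeros(dim)
    M_root = np.zeros((dim, dim))
    for f, M, J in leaf_rmps:
        f_root += J.T @ f                  # pull back each leaf force
        M_root += J.T @ M @ J              # pull back each leaf metric
    return np.linalg.pinv(M_root) @ f_root # resolve to an acceleration

# Two 1-D leaves acting on a 2-D robot along x and y with different weights
leaves = [(np.array([1.0]), np.array([[2.0]]), np.array([[1.0, 0.0]])),
          (np.array([-1.0]), np.array([[1.0]]), np.array([[0.0, 1.0]]))]
a = resolve(leaves)   # -> array([ 0.5, -1. ])
```

Leaves with a larger metric dominate the resolved acceleration in their directions of importance, which is how, e.g., collision avoidance can override a goal attractor only when robots are close.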
5.2.2 Decentralized Partial RMPflow Framework
For the decentralized partial RMPflow framework, we consider a team of eight robots, divided into two subteams. The task of the first subteam is to achieve a cyclic pursuit behavior around a circle centered at the origin. The second subteam must pass through the circle patrolled by the first. To achieve the cyclic pursuit behavior, each robot in the first subteam follows a point moving along the circle through a goal attractor RMP (defined in Appendix B.1). The RMP-forest for the second subteam follows a structure similar to Fig. 3. For each robot, collision avoidance RMPs are defined with respect to all other robots. Snapshots from the experiment are shown in Fig. 6. The robots from the second subteam manage to pass through the circle under the decentralized framework.
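The moving reference point that each pursuer tracks can be sketched as follows; the function name and the phase-offset scheme are illustrative assumptions:

```python
import numpy as np

def circle_target(t, radius, omega, phase=0.0):
    """Reference point moving along a circle centered at the origin.

    Each robot in the cyclic-pursuit subteam tracks such a point with a
    goal-attractor RMP; omega is the angular velocity, and phase offsets
    spread the robots evenly along the circle. Illustrative sketch.
    """
    theta = omega * t + phase
    return radius * np.array([np.cos(theta), np.sin(theta)])

# Five robots evenly spaced on a unit circle at t = 0
targets = [circle_target(0.0, 1.0, 0.2, phase=2 * np.pi * i / 5)
           for i in range(5)]
```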
6 Conclusions
In this paper, we consider multiobjective tasks for multirobot systems. We argue that it is advantageous to define controllers for individual subtasks on their corresponding manifolds. We propose centralized and decentralized algorithms that generate control policies for multirobot systems by combining the control policies defined for the individual subtasks, and the multirobot system is proved to be stable under the generated control policies. We show that many existing potential-based multirobot controllers can be approximated by the proposed algorithms, and we propose several subtask policies for multirobot systems. The proposed algorithms are validated through simulation and deployment on real robots.
Acknowledgments
This work was supported in part by the grant ARL DCIST CRA W911NF1720181.
References
 [1] Francesco Bullo, Jorge Cortes, and Sonia Martinez. Distributed control of robotic networks: a mathematical approach to motion coordination algorithms, volume 27. Princeton University Press, 2009.
 [2] Mehran Mesbahi and Magnus Egerstedt. Graph theoretic methods in multiagent networks, volume 33. Princeton University Press, 2010.
 [3] Jorge Cortés and Magnus Egerstedt. Coordinated control of multi-robot systems: A survey. SICE Journal of Control, Measurement, and System Integration, 10(6):495–503, 2017.
 [4] Li Wang, Aaron D Ames, and Magnus Egerstedt. Multi-objective compositions for collision-free connectivity maintenance in teams of mobile robots. In IEEE 55th Conference on Decision and Control, pages 2659–2664. IEEE, 2016.
 [5] Vishnu R Desaraju and Jonathan P How. Decentralized path planning for multi-agent teams with complex constraints. Autonomous Robots, 32(4):385–403, 2012.
 [6] Glenn Wagner and Howie Choset. Subdimensional expansion for multirobot path planning. Artificial Intelligence, 219:1–24, 2015.
 [7] Wenhao Luo, Nilanjan Chakraborty, and Katia Sycara. Distributed dynamic priority assignment and motion planning for multiple mobile robots with kinodynamic constraints. In American Control Conference, pages 148–154. IEEE, 2016.
 [8] Siddharth Swaminathan, Mike Phillips, and Maxim Likhachev. Planning for multiagent teams with leader switching. In IEEE International Conference on Robotics and Automation, pages 5403–5410. IEEE, 2015.
 [9] Ronald C Arkin. Behavior-based robotics. MIT Press, 1998.
 [10] Aaron D Ames, Jessy W Grizzle, and Paulo Tabuada. Control barrier function based quadratic programs with application to adaptive cruise control. In IEEE 53rd Annual Conference on Decision and Control, pages 6271–6278. IEEE, 2014.
 [11] Li Wang, Aaron D Ames, and Magnus Egerstedt. Safety barrier certificates for collisions-free multirobot systems. IEEE Transactions on Robotics, 33(3):661–674, 2017.
 [12] Bradley E Bishop. On the use of redundant manipulator techniques for control of platoons of cooperating robotic vehicles. IEEE Transactions on Systems, Man, and Cybernetics-Part A: Systems and Humans, 33(5):608–615, 2003.
 [13] Gianluca Antonelli, Filippo Arrichiello, and Stefano Chiaverini. Flocking for multi-robot systems via the null-space-based behavioral control. Swarm Intelligence, 4(1):37, 2010.
 [14] O. Khatib. Real-time obstacle avoidance for manipulators and mobile robots. In IEEE International Conference on Robotics and Automation, volume 2, pages 500–505, Mar 1985.
 [15] Nathan D Ratliff, Jan Issac, Daniel Kappler, Stan Birchfield, and Dieter Fox. Riemannian motion policies. arXiv preprint arXiv:1801.02854, 2018.
 [16] ChingAn Cheng, Mustafa Mukadam, Jan Issac, Stan Birchfield, Dieter Fox, Byron Boots, and Nathan Ratliff. RMPflow: A computational graph for automatic motion policy generation. In The 13th International Workshop on the Algorithmic Foundations of Robotics, 2018.
 [17] Francesco Bullo and Andrew D Lewis. Geometric control of mechanical systems: modeling, analysis, and design for simple mechanical control systems, volume 49. Springer Science & Business Media, 2004.
 [18] Jorge Cortes, Sonia Martinez, Timur Karatas, and Francesco Bullo. Coverage control for mobile sensing networks. IEEE Transactions on Robotics and Automation, 20(2):243–255, 2004.
 [19] Hassan K Khalil. Nonlinear systems. Prentice-Hall, New Jersey, 1996.
 [20] Brian DO Anderson, Changbin Yu, Baris Fidan, and Julien M Hendrickx. Rigid graph control architectures for autonomous formations. IEEE Control Systems Magazine, 28(6):48–63, 2008.
 [21] Daniel Pickem, Paul Glotfelter, Li Wang, Mark Mote, Aaron Ames, Eric Feron, and Magnus Egerstedt. The Robotarium: A remotely accessible swarm robotics research testbed. In IEEE International Conference on Robotics and Automation, pages 1699–1706. IEEE, 2017.
Appendices
Appendix A Proof of Theorem 3
Proof.
Let be the total potential function for all subtasks, i.e., . Consider the Lyapunov function candidate , where Then, following a derivation similar to [16], we have,
(7) 
By definition of the pullback operator, we have ,
(8) 
Also by definition of pullback, , hence,
(9)  
where we denote and the last equation follows from the fact that for all .
Therefore, for the Lyapunov function candidate , we have,
(10) 
where the last equation follows from . Then by LaSalle’s invariance principle [19], the system converges to a forward invariant set . ∎
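For readers reconstructing the elided equations, the derivation follows the shape of the RMPflow stability proof in [16]. A hedged sketch, where the symbols M, B, and Φ stand for the pulled-back metric, the pulled-back damping, and the total potential Φ = Σ_k Φ_k (illustrative notation, not necessarily the paper's exact symbols):

```latex
% Lyapunov candidate and its derivative, in the style of RMPflow [16].
% M, B, \Phi are the pulled-back metric, damping, and total potential;
% the notation here is illustrative.
V(q,\dot q) \;=\; \tfrac{1}{2}\,\dot q^{\top} M(q,\dot q)\,\dot q \;+\; \Phi(q),
\qquad
\dot V(q,\dot q) \;=\; -\,\dot q^{\top} B(q,\dot q)\,\dot q \;\le\; 0 .
```

With V̇ ≤ 0, LaSalle's invariance principle then yields convergence to the largest invariant set where V̇ = 0.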
Appendix B Details of the Experiments
In this appendix, we describe the construction of the unitary goal attractor RMP, which is used in many of the experiments, and provide the choice of parameters for the simulations and experiments.
B.1 Unitary Goal Attractor RMP
In multirobot scenarios, instead of planning paths for every robot, it is common to plan a path or assign a goal to one robot, called the leader. The other robots may simply follow the leader or maintain a given formation depending on other subtasks assigned to the team. In this case, a goal attractor RMP may be assigned to the leader. A number of controllers for multirobot systems are also based on going to a goal position, such as the cyclic pursuit behavior [3] and Voronoibased coverage controls [3, 18].
There are several design options for goal attractor RMPs. We will discuss two examples. The first goal attractor RMP is introduced in [16]. The attractor RMP for robot is defined on the subtask space , where is the desired configuration for the robot. The metric is designed as . The weight function is defined as , with and for some . The weights and control the importance of the RMP when the robots are far from the goal and close to the goal, respectively. As the robot approaches the goal, the weight will smoothly increase from to . The parameter determines the characteristic length of the metric. The main intuition for the metric is that when the robot is far from the goal, the attractor should be permissive enough for other subtasks such as collision avoidance, distance preservation, etc. However, when the robot is close to the goal, the attractor should have high importance so that the robot can reach the goal. The potential function is designed such that,
(11) 
where , and as . The parameter determines the characteristic length of the potential. The potential function defined in (11) provides a soft-normalization for so that the transition near the origin is smooth. The damping matrix is , where is a positive scalar. We will refer to this goal attractor RMP as Goal Attractor RMPa in subsequent sections. Although more complicated, it produces better results when combined with other RMPs, especially collision avoidance RMPs (see Appendix C).
Another possible goal attractor RMP is based on a PD controller. This RMP is also defined on the subtask space . The metric is a constant multiple of the identity matrix, with some . The potential function is defined as and the damping is , with . This RMP is equivalent to a PD controller with and . This goal attractor will be referred to as Goal Attractor RMPb in subsequent sections.
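A minimal sketch of this PD-style attractor, under the assumption of a constant metric c·I and a quadratic potential; the gains below are illustrative:

```python
import numpy as np

def goal_attractor_rmp_b(x, v, x_goal, c=1.0, kp=4.0, kd=4.0):
    """Sketch of a PD-style goal attractor (Goal Attractor RMPb).

    Constant metric M = c*I; force from a quadratic potential plus
    damping: f = -kp (x - x_goal) - kd v. Once resolved (a = M^{-1} f),
    this is a PD controller with gains kp/c and kd/c. Gains are
    illustrative, not the paper's values.
    """
    M = c * np.eye(len(x))
    f = -kp * (x - x_goal) - kd * v
    return f, M

# Robot at (1, 0), at rest, goal at the origin
f, M = goal_attractor_rmp_b(np.array([1.0, 0.0]), np.zeros(2), np.zeros(2))
a = np.linalg.solve(M, f)   # resolved acceleration points toward the goal
```

The constant metric is what makes RMPb stiff everywhere: unlike RMPa, it cannot yield to other leaves far from the goal, which is consistent with the rougher trajectories reported in Appendix C.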
B.2 Choice of Parameters
B.2.1 Simulated Formation Preservation Task
For each robot, we define a goal attractor RMPa with parameters , , , , , . We use a damper RMP with , . For distance preservation RMPa we set parameters and . For distance preservation RMPb (which was shown to be equivalent to the potential-based controller in the previous simulation), we choose parameters and .
B.2.2 Centralized RMPflow Framework
We use distance preservation RMPa's with , and . The safety distance between robots is . The parameters for the collision avoidance RMPs are set as , , and . For the goal attractors, we use goal attractor RMPa's with , and . The damper RMPs have parameters , .
B.2.3 Decentralized Partial RMPflow Framework
For the cyclic pursuit task, the robots are attracted by points moving along the circle with angular velocity . The parameters for the associated goal attractor RMPa's are , , , , , . For the robots from the second subteam, the distance preservation RMPa's have parameters , . The goal attractors are goal attractor RMPa's with parameters , , , , , . The parameters for the collision avoidance RMPs are , , and , with safety distance .
Appendix C Additional Simulation Results
C.1 RMPs & Potential-based Controllers
As discussed in Section 3, many potential-based multirobot controllers can be reconstructed by the RMP framework up to degree normalization. In this example, we consider a formation control task with five robots. The robots are tasked with forming a regular pentagon of a given circumcircle radius, and are initialized in a regular pentagon formation with a larger circumcircle radius.
We consider a degree-normalized potential field controller from [2],
(12) 
where is the desired distance between robot and robot , is the set of edges in the formation graph, and is the degree of robot in the formation graph. For the RMP implementation, we use the controller given by (4). The potential-based controller (12) is equivalent to the controller generated by the distance preservation RMP given by (4) when choosing . In simulation, we choose for both the RMP controller and the potential-based controller. The trajectories of the robots under the two controllers are displayed in Fig. (a) and Fig. (b), respectively. The results are identical, implying that the two controllers exhibit exactly the same behavior.
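A sketch of a degree-normalized formation controller of the kind in (12), under the assumption that the potential gradient reduces to distance errors along graph edges; the gain `k` and the helper name are illustrative:

```python
import numpy as np

def dn_formation_control(x, edges, d_des, k=1.0):
    """Sketch of a degree-normalized potential-based formation
    controller in the style of Eq. (12): each robot is driven toward
    the desired inter-robot distances, with each robot's summed edge
    terms divided by its degree in the formation graph. Gain k is
    illustrative.
    """
    n = len(x)
    deg = np.zeros(n)
    u = np.zeros_like(x)
    for (i, j) in edges:                       # degrees in the formation graph
        deg[i] += 1
        deg[j] += 1
    for (i, j) in edges:
        diff = x[j] - x[i]
        dist = np.linalg.norm(diff)
        err = dist - d_des[(i, j)]             # signed distance error
        u[i] += k * err * diff / dist          # attract/repel along the edge
        u[j] -= k * err * diff / dist
    return u / deg[:, None]                    # degree normalization

# Two robots 2 m apart with desired distance 1 m move toward each other
x = np.array([[0.0, 0.0], [2.0, 0.0]])
u = dn_formation_control(x, [(0, 1)], {(0, 1): 1.0})
```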
C.2 Goal Attractor RMPs & Collision Avoidance RMPs
An advantage of the multirobot RMP framework is that it can leverage existing single-robot controllers, which may have desirable properties, especially when combined with other controllers. In this simulation task, the performance of the two goal attractor RMPs is compared when they are combined with pairwise collision avoidance RMPs. In the simulation, three robots are tasked with reaching a goal on the other side of the field while avoiding collisions with each other. The parameters for the collision avoidance RMPs are set as , and . Fig. (a) and Fig. (b) show the behavior of the resulting controllers with the two choices of goal attractor RMPs discussed in Section B.1, respectively. For goal attractor RMPa, we use parameters , and . We use , , and for goal attractor RMPb. We observe that goal attractor RMPa generates smoother trajectories than goal attractor RMPb.
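The pairwise collision-avoidance leaves used in these comparisons follow the general pattern below: a leaf on the 1-D distance-to-contact space whose metric grows as the robots approach, so it dominates the resolve only near collision. The particular weight and repulsion shapes here are illustrative assumptions, not the paper's parameterization:

```python
import numpy as np

def collision_avoidance_rmp(x_i, x_j, v_i, v_j, r_safe=0.2, eta=1.0, eps=1e-3):
    """Hedged sketch of a pairwise collision-avoidance RMP.

    Defined on the 1-D space d = ||x_i - x_j|| - r_safe. The metric
    weight blows up as d -> 0, so this leaf overrides permissive leaves
    (e.g., a goal attractor) only when robots are close. The specific
    weight/repulsion shapes are illustrative.
    """
    diff = x_i - x_j
    dist = np.linalg.norm(diff)
    d = max(dist - r_safe, eps)
    J = (diff / dist).reshape(1, -1)     # Jacobian of the distance map
    ddot = float(J @ (v_i - v_j))        # closing speed
    w = 1.0 / d ** 2                     # metric weight, large near contact
    a = eta / d - ddot                   # push apart, damp the approach
    M_task = np.array([[w]])
    f_task = M_task @ np.array([a])
    return (J.T @ f_task).ravel(), J.T @ M_task @ J

# Robot i close to robot j at the origin: the force pushes i away from j
f_i, M_i = collision_avoidance_rmp(np.array([0.3, 0.0]), np.zeros(2),
                                   np.zeros(2), np.zeros(2))
```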
C.3 The Centralized & Decentralized RMP Algorithms
The centralized and decentralized RMP algorithms are also compared through the same simulation of three robots reaching goals. Goal attractor RMPa's with the same parameters are used. For the collision avoidance RMPs, we set , and . The trajectories of the robots under the decentralized algorithm are illustrated in Fig. (c). Compared to the trajectories generated by the centralized algorithm, the robots oscillate slightly when approaching other robots and make more aggressive turns to avoid collisions.
Appendix D On Heterogeneous Robotic Teams
A significant feature of RMPs is that they are intrinsically coordinate-free [16]. Consider two robots with configuration spaces and , respectively, and assume that there exists a smooth map from to . Then the RMP-tree designed for one robot can be directly transferred to the other robot by connecting the tree to its root node through the map . RMPs therefore provide a level of abstraction for heterogeneous robotic teams: the user only needs to design desired behaviors for a homogeneous team with simple dynamics models, for example double-integrator dynamics, and can seamlessly transfer them to the heterogeneous team. This insight could bridge the gap between theoretical results, which are usually derived for homogeneous robotic teams with simple dynamics models, and real robotic applications.
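Mechanically, this transfer is the same pullback used for any other task map. A hedged sketch, where the map (called `psi` here) and its Jacobian `J_psi` are hypothetical, standing in for whatever smooth map connects the heterogeneous robot's configuration space to the simple one:

```python
import numpy as np

def transfer_rmp(f_q, M_q, J_psi):
    """Sketch of transferring an RMP designed on a simple configuration
    space Q (e.g., double-integrator robots) to a robot whose
    configuration maps to Q via a smooth map psi with Jacobian J_psi.
    The pullback reuses the same formula as any other task map.
    """
    f = J_psi.T @ f_q                 # pulled-back force
    M = J_psi.T @ M_q @ J_psi         # pulled-back metric
    return f, M

# Hypothetical example: a robot with a 3-D configuration whose
# "look-ahead point" maps to R^2; J is the 2x3 Jacobian of that map.
J = np.array([[1.0, 0.0, 0.1],
              [0.0, 1.0, 0.2]])
f, M = transfer_rmp(np.array([1.0, 0.0]), np.eye(2), J)
```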