Selective model-predictive control for flocking systems
In this paper we investigate the optimal control of alignment models composed of a large number of agents in the presence of a selective action of a controller, acting in order to enhance consensus. Two types of selective control are presented: a homogeneous control filtered by a selective function, and a distributed control active only on a selective set. As a first step toward reducing the computational cost, we introduce a model predictive control (MPC) approximation by deriving a numerical scheme with a feedback selective constrained dynamics. Next, in order to cope with the numerical solution for a large number of interacting agents, we derive the mean-field limit of the feedback selective constrained dynamics, which is then solved numerically by means of a stochastic algorithm able to simulate the selective constrained dynamics efficiently. Finally, several numerical simulations are reported to show the efficiency of the proposed techniques.
In this paper we focus on multi-agent systems subject to a velocity alignment dynamics [, , , , , ], and influenced by an external control. The control term can be used to study the influence of an external agent on the system dynamics, or to enforce the emergence of desired asymptotic states that would not arise spontaneously. This type of problem has raised great interest in several communities, see for example [, , , , , , ].
We will consider models of Cucker-Smale type [, , ], consisting of a second-order system where an averaging process among binary interactions rules the alignment of the velocities, and where such interactions are weighted by a function of the relative distance between two agents. Thus we consider a system of agents, each described by a position and a velocity, with a given initial datum, which evolves as follows
where the communication function is a general function weighting the alignment towards the mean velocity. The following theorem holds for systems of type (1.1).
Theorem 1.1 (Unconditional flocking). Let us consider the system (1.1), where the communication function is assumed to be decreasing and such that
then the maximal diameter of the positions remains uniformly bounded for all times, and the velocities converge exponentially fast towards the flocking state, such that
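The alignment mechanism behind this statement can be illustrated numerically. The following is a minimal sketch, assuming the standard Cucker-Smale form $\dot x_i = v_i$, $\dot v_i = \frac{1}{N}\sum_j H(|x_i - x_j|)(v_j - v_i)$ and a kernel $H(r) = (1+r^2)^{-\gamma}$; the kernel, the value `gamma=0.25`, the forward Euler step and all function names are illustrative assumptions, not taken from the text.

```python
import numpy as np

def communication(r, gamma=0.25):
    """Assumed communication function H(r) = 1 / (1 + r^2)^gamma."""
    return 1.0 / (1.0 + r**2) ** gamma

def cucker_smale_step(x, v, dt, gamma=0.25):
    """One forward Euler step of the uncontrolled alignment dynamics."""
    diff = x[:, None, :] - x[None, :, :]                     # pairwise x_i - x_j
    H = communication(np.linalg.norm(diff, axis=2), gamma)   # N x N weights
    # (1/N) * sum_j H_ij (v_j - v_i)
    align = (H @ v - H.sum(axis=1, keepdims=True) * v) / len(x)
    return x + dt * v, v + dt * align

def velocity_spread(v):
    """Largest distance of a velocity from the mean velocity."""
    return np.linalg.norm(v - v.mean(axis=0), axis=1).max()

rng = np.random.default_rng(0)
x = rng.normal(size=(50, 2))
v = rng.normal(size=(50, 2))
spread0, mean0 = velocity_spread(v), v.mean(axis=0)
for _ in range(2000):
    x, v = cucker_smale_step(x, v, dt=0.01)
spread1, mean1 = velocity_spread(v), v.mean(axis=0)
print(spread0, spread1)
```

Since the assumed kernel is symmetric, the mean velocity is conserved by the scheme, while the spread around the mean shrinks; this is the discrete counterpart of convergence towards a flocking state.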
Flocking can also be retrieved when the conditions of the theorem do not hold, provided that the initial state of the system satisfies particular properties, see []. In a general setting these conditions are usually not fulfilled; therefore a natural way to reach the flocking condition in systems of type (1.1) requires an intervention on the system, by means of a designed control or an external force. This idea has been studied in several research fields and from different perspectives, in particular for Cucker-Smale-type models [, , , , , ]. Hence a natural framework for such a problem consists in finding a control strategy in the space of admissible controls, solution of the following minimization problem,
where is a desired velocity, a positive convex function, and subject to the dynamics
The type of control problem underlying (1.2)-(1.3) is equivalent to assuming the presence of a policy maker able to exert a control action on every single agent. From a modeling viewpoint, restricting this action can be justified in different ways: for example, with a limited amount of resources a governor cannot act simultaneously on every agent and has to select a portion of the population in order to make its action relevant; alternatively, a control active only on a few agents can be promoted by taking into account an $\ell_1$-minimization in (1.2) [, ]. On the other hand, if we do not impose enough restrictions on the action of the policy maker, we cannot exclude that the solution of the optimal control problem (1.2)-(1.3) requires the ability to control every agent pointwise in a large set.
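As a point of reference, a problem of type (1.2)-(1.3) can be sketched as follows. The display equations are not reproduced in the text above, so the notation used here ($N$ agents, states $(x_i, v_i)$, communication function $H$, controls $u_i$, desired velocity $v_d$, convex penalization $\Psi$, final time $T$, admissible set $\mathcal{U}$) is an assumption based on the standard Cucker-Smale control literature and may differ from the authors' exact formulation.

```latex
\min_{u \in \mathcal{U}} \; J(u) := \int_0^T \frac{1}{N}\sum_{j=1}^N
  \Big( \tfrac12\,|v_j(t) - v_d|^2 + \Psi\big(u_j(t)\big) \Big)\, dt,
\qquad \text{subject to} \qquad
\dot x_i = v_i, \quad
\dot v_i = \frac{1}{N}\sum_{j=1}^N H(x_i, x_j)\,(v_j - v_i) + u_i ,
\quad i = 1, \dots, N .
```

In this form a quadratic $\Psi$ gives a least-squares penalization, whereas a penalization of $\ell_1$ type is the usual way to promote controls acting on few agents.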
In this paper we address the goal of controlling a large ensemble of agents when the action of the controller is performed in a selective way. We give two different interpretations of the concept of selective control.
First, we consider selectivity as an intrinsic property of the system that filters external actions; thus the control influences the flocking dynamics through a selective function, which depends on the state of the system. In particular, we assume the control to be equal for all agents, and introduce a function modeling the propensity to accept the information coming from the control.
Second, we define selectivity according to membership in a particular set, and we consider the control to be active only on the agents belonging to a given set. We assume that such a set can be defined according to some properties of the system and of the domain.
From the mathematical viewpoint, we associate a different control problem with each model: in the first case we assume a setting in terms of optimal control problems [, , ]; in the second case we introduce a type of differential game, making the implicit assumption that agents inside the selective set wish to optimize their own functional [, ]. For the connection between these two approaches we refer to [].
Moreover, since we are also interested in the numerical investigation of the models, we introduce a numerical strategy to reduce the computational cost. Indeed, the numerical solution of control problems of type (1.2)-(1.3) usually requires a tremendous computational effort, due to the nonlinearities of the model and the large number of agents []. A first step towards cost reduction is obtained via model-predictive control (MPC) []: instead of solving the control problem over the whole time horizon, the system is approximated by an iterative solution over a sequence of finite time steps [, ].
A further approximation of the microscopic dynamics consists in the derivation of the so-called mean-field limit of the particle system, which describes the behavior of the system for a large number of agents [, , ]. Such a statistical description of the evolution of the microscopic system has recently been coupled with the optimal control problem (1.2)-(1.3), in order to furnish a novel description in terms of mean-field optimal control problems [, ].
In this direction our paper presents a simple selective approach to obtain greedy solutions to the mean-field optimal control problem, embedding into the mean-field dynamics the instantaneous controls obtained from the MPC procedure.
The paper is organized as follows. In Section 2 we describe our modeling setting, deriving an approximated solution through a model-predictive control (MPC) strategy; in Section 3 we formally derive a mean-field description of the constrained problem. Finally, in Section 4 we present several numerical results showing the efficiency of the proposed technique.
2 Selective control of flocking models
We propose two different models and control settings to promote the alignment of a flocking system: a space-homogeneous control filtered by a selective function measuring its influence on the dynamics, and a heterogeneous control which is activated on specific agents once they belong to the selective set.
Filtered control with selective function. We consider a Cucker-Smale-type model where a system of agents, described by their positions and velocities and with a given initial datum, evolves according to
where the communication function is a general function weighting the alignment towards the mean velocity, depending on the relative distance of the agents.
The control term is included as an external intervention, whose action is multiplied by a real-valued function which tunes the influence of the control on each agent according to its position and velocity. We refer to this function as the selective function and to the resulting term as the filtered control. Thus we define the control as the solution of the following optimal control problem
constrained to the dynamics (2.1), where the functional involves a desired velocity and a regularization parameter, and the minimization is taken over the space of admissible controls. For simplicity, in the formulation (2.2) we consider a least-squares type cost functional, but other choices can be considered.
Observe that the selective function could be a priori unknown to the controller, which would imply a different modeling setting accounting for possible uncertainties. Instead, we assume that the controller has perfect knowledge of the selective function, which may ultimately represent the controller's belief about the effectiveness of the control action.
Pointwise control with selective set. In a different setting we consider the Cucker-Smale-type model (1.3) where each agent is controlled directly; thus we have a system of agents with a given initial datum evolving according to
where each agent has its own strategy. We assume that the strategy of an agent is active only if the agent belongs to the selective set, a condition encoded by an indicator function defined for every agent as follows
where the selective set is defined on the phase space at each time. Therefore only the dynamics of a portion of the agents can be influenced, in order to steer the whole system towards the target velocity.
Differently from the previous model, we set up the problem in a differential game setting. Thus the strategy of each agent is defined through the following set of minimization problems,
for every agent, where each agent wants to minimize its own functional as soon as it belongs to the selective set at the current time. Therefore each agent's strategy at a given time is the solution of an equilibrium problem solved among the agents in the selective set.
We observe that this problem can also be formulated as an optimal control problem, where the aim is to find a vector of controls solving the following minimization problem,
2.1 Model predictive control
We now introduce a numerical technique based on model predictive control (MPC), also called receding horizon strategy in the engineering literature, in order to reduce the computational cost of the optimal control problem [, ]. This procedure is in general only suboptimal with respect to the global optimal solution of problems (2.1)-(2.2) and (2.3)-(2.6); nonetheless, we will show that even in the simplest setting the MPC solution furnishes an instantaneous feedback control, which is a consistent discretization of a first-order approximation of the optimal control dynamics.
Let us consider a sequence of time levels discretizing the time interval with a given time step. Then we assume the control to be constant on every interval between two consecutive time levels, and defined as a piecewise constant function, as follows
where the characteristic function of each time interval appears. In general, model predictive control strategies solve a finite-horizon open-loop optimal control problem, predicting the dynamic behavior over a prediction horizon, with the initial state sampled at the current time, and computing the control on a control horizon.
Since our goal is to derive instantaneous control strategies, in what follows we consider a reduced setting in which both horizons reduce to a single time step, taking into account a first-order discretization of the optimal control problem (2.1)-(2.2).
Instantaneous filtered control
Let us introduce a full discretization of the system (2.1) through a forward Euler scheme, and solve the minimization problem (2.2) via the MPC strategy on every single time interval. Thus the reduced optimal control problem reads
for every agent and every time level. The MPC aims at determining the value of the control by solving, for the known current state, a (reduced) optimization problem on the current time interval in order to obtain the new state. This procedure is repeated until the final time is reached. In this way the complexity of the initial problem (2.1)-(2.2) is reduced to an optimization problem in a single variable, the control on the current interval. We then introduce a compact notation for the states and for the associated Lagrange multipliers of each agent, and define the discrete Lagrangian such that
Computing the gradient of (2.10) with respect to each of its arguments, for every agent, we obtain the following first-order optimality conditions
where the action of the control is substituted by an implicit term representing the relaxation towards the desired velocity. Note that in this implicit formulation the action of the control is lost as the time step vanishes, since the control is proportional to the time step. Thus, in order to rewrite the system as a consistent time discretization of the original control problem, it is necessary to assume a suitable scaling of the regularization parameter with respect to the time step; reverting the system to an explicit form, we obtain
where we have omitted higher-order terms. We leave the details of the derivation of the forward system to Appendix A.
Hence system (2.14) represents a consistent discretization of the following dynamical system,
where the control term is expressed by a steering term acting as an average weighted by the selective function.
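The mechanism leading to this steering term can be sketched in a few lines. The derivation below is a reconstruction under stated assumptions, not the authors' Appendix A: it uses a common control $u^n$, selective weights $S_i^n = S(x_i^n, v_i^n)$, a rectangle-rule discretization of a least-squares functional with regularization parameter $\nu$, and the scaling $\nu = \kappa\,\Delta t$; the symbols $\kappa$, $H_{ij}^n$ and $S_i^n$ are notation introduced here.

```latex
% Forward Euler step of the filtered dynamics
v_i^{n+1} = v_i^n + \frac{\Delta t}{N}\sum_{j=1}^N H_{ij}^n\,(v_j^n - v_i^n) + \Delta t\, S_i^n\, u^n ,
% reduced functional on a single time interval
J_{\Delta t}(u^n) = \Delta t \Big( \frac{1}{2N}\sum_{j=1}^N |v_j^{n+1} - v_d|^2 + \frac{\nu}{2}\,|u^n|^2 \Big).
% Optimality in u^n (implicit form, proportional to \Delta t)
u^n = -\frac{\Delta t}{\nu N}\sum_{j=1}^N S_j^n\,(v_j^{n+1} - v_d).
% Scaling \nu = \kappa \Delta t and substitution of the Euler step (explicit form)
u^n = \frac{1}{\kappa N + \Delta t \sum_{j} (S_j^n)^2}\sum_{j=1}^N S_j^n\,(v_d - v_j^n) + O(\Delta t).
% Formal limit \Delta t \to 0
\dot v_i = \frac{1}{N}\sum_{j=1}^N H(x_i,x_j)\,(v_j - v_i)
         + \frac{S(x_i,v_i)}{\kappa N}\sum_{j=1}^N S(x_j,v_j)\,(v_d - v_j).
```

Without the rescaling of $\nu$, the implicit control vanishes with $\Delta t$, which is why the scaling is needed to obtain a consistent discretization.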
Let us consider a particular choice of the selective function and define the mean velocity of the system; then we have
which admits an explicit solution. Therefore, for large times, the mean velocity relaxes towards the desired velocity. Thus, in this case, the feedback control is able to control only the mean of the system, but not to assure the global flocking state.
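This behavior of the mean velocity can be checked numerically. The sketch below assumes the selective function is identically one, a symmetric kernel $H(r) = 1/(1+r^2)$, and a feedback of the form $\frac{S_i}{\kappa N}\sum_j S_j (v_d - v_j)$; under these assumptions the mean $m(t)$ satisfies $\dot m = (v_d - m)/\kappa$, so that $m(t) = v_d + (m(0) - v_d)\,e^{-t/\kappa}$. The names `filtered_feedback_step`, `kappa` and `ones` are hypothetical.

```python
import numpy as np

def filtered_feedback_step(x, v, v_d, dt, kappa, selective):
    """One forward Euler step with the assumed filtered feedback control."""
    n = len(x)
    dist = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=2)
    H = 1.0 / (1.0 + dist**2)                      # assumed symmetric kernel
    align = (H @ v - H.sum(axis=1, keepdims=True) * v) / n
    S = selective(x, v)                            # selective weights, shape (N,)
    # control shared by all agents: (1 / (kappa N)) * sum_j S_j (v_d - v_j)
    u = (S[:, None] * (v_d - v)).sum(axis=0) / (kappa * n)
    return x + dt * v, v + dt * (align + S[:, None] * u)

rng = np.random.default_rng(1)
x = rng.normal(size=(40, 2))
v = rng.normal(size=(40, 2))
v_d = np.array([1.0, 0.0])
kappa, dt, steps = 1.0, 0.01, 500
m0 = v.mean(axis=0)

ones = lambda x, v: np.ones(len(x))                # S identically one (assumption)
for _ in range(steps):
    x, v = filtered_feedback_step(x, v, v_d, dt, kappa, ones)

m_numeric = v.mean(axis=0)
m_exact = v_d + (m0 - v_d) * np.exp(-steps * dt / kappa)
print(m_numeric, m_exact)
```

With a constant selective function every agent receives the same push, so the control shifts the mean but leaves the velocity differences between agents to the alignment term alone.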
Instantaneous pointwise control
Similarly to the previous section, we introduce a full discretization of the system through a forward Euler scheme for the optimal control problem (2.3)-(2.5) on every single time interval. Then the reduced optimal control problem reads
where the solution is easily retrieved by differentiating with respect to the control of each agent. Thus we have
In order to rewrite the system as a consistent time discretization of the original control problem, we rescale the regularization parameter with the time step and, plugging the control (2.18) into the discretized dynamics, we obtain
where we have omitted higher-order terms, and where the characteristic function is the one defined on the selective set. Hence system (2.19) is a consistent discretization of
Note that, in contrast to the previous case, the control acts pointwise on every single agent as a steering term towards the desired state. In the particular case in which the selective set coincides with the whole phase space, it can easily be shown that the velocities converge to the desired flocking state for large times.
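A minimal numerical sketch of this pointwise feedback follows. It assumes a steering term of the form $\frac{1}{\kappa}(v_d - v_i)\,\chi_\Omega(x_i, v_i)$, a kernel $H(r) = 1/(1+r^2)$, and a selective set given by a ball in space; the helper names `pointwise_feedback_step` and `ball`, and the parameter `kappa`, are hypothetical. Taking an infinite radius mimics the case in which the set is the whole phase space.

```python
import numpy as np

def pointwise_feedback_step(x, v, v_d, dt, kappa, in_set):
    """One forward Euler step with the assumed pointwise selective feedback."""
    n = len(x)
    dist = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=2)
    H = 1.0 / (1.0 + dist**2)                          # assumed kernel
    align = (H @ v - H.sum(axis=1, keepdims=True) * v) / n
    active = in_set(x, v).astype(float)[:, None]       # 1 inside the set, 0 outside
    return x + dt * v, v + dt * (align + active * (v_d - v) / kappa)

def ball(radius):
    """Selective set: agents whose position lies in a ball centered at the origin."""
    return lambda x, v: np.linalg.norm(x, axis=1) <= radius

rng = np.random.default_rng(3)
x = rng.normal(size=(40, 2))
v = rng.normal(size=(40, 2))
v_d = np.array([1.0, 0.0])

for _ in range(1000):
    x, v = pointwise_feedback_step(x, v, v_d, dt=0.01, kappa=1.0, in_set=ball(np.inf))

err = np.linalg.norm(v - v_d, axis=1).max()
print(err)
```

With a finite radius, agents that leave the ball stop being steered, so convergence to the desired velocity is no longer guaranteed by this argument.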
Let us remark that performing the MPC strategy on a single time interval for the optimal control problem (2.6) gives the following discrete functional,
Writing the discrete Lagrangian and computing its variations with respect to each of its components gives the following system
Thus, by reverting to the explicit version of (2.22), we obtain the same feedback control system (2.19) with instantaneous control (2.18). Therefore the suboptimal controls recovered via model predictive control on a single horizon for (2.6) and (2.5), respectively, are equivalent [, ].
3 Mean-field limit for the controlled flocking dynamics
We write the controlled particle system in the general form (3.1), where we introduce the empirical probability measures
representing the particle density in position and velocity at each time. Moreover, we define the general operator as follows
where the operator representing the control term is left general. Thus, for the different types of instantaneous controls derived in the previous sections, we have respectively
where the bracket denotes the integral over the full phase space. Collecting all the terms and integrating by parts, we recover the following weak formulation
Rewriting the main expression we have
and thus the strong form reads
Hence, assuming that the limit of the empirical measures exists as the number of agents grows, and that it is a probability density on the phase space, we obtain the following integro-differential equation of Vlasov type,
as the mean-field limit of system (3.1).
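For orientation, a Vlasov-type equation of this kind can be written out as below. This is a sketch under assumptions: the density is denoted $f(x,v,t)$, the alignment operator $\mathcal{H}[f]$, the control operators $\mathcal{K}_S[f]$ (selective function) and $\mathcal{K}_\Omega[f]$ (selective set), and the feedback forms are the ones obtained from a scaling $\nu = \kappa\,\Delta t$; none of these symbols is fixed by the surviving text.

```latex
\partial_t f + v \cdot \nabla_x f
  = -\nabla_v \cdot \Big( \big( \mathcal{H}[f] + \mathcal{K}[f] \big)\, f \Big),
\qquad
\mathcal{H}[f](x,v,t) = \int_{\mathbb{R}^{2d}} H(x,y)\,(w - v)\, f(y,w,t)\, dy\, dw ,
\\[4pt]
\mathcal{K}_S[f](x,v,t) = \frac{S(x,v)}{\kappa} \int_{\mathbb{R}^{2d}} S(y,w)\,(v_d - w)\, f(y,w,t)\, dy\, dw ,
\qquad
\mathcal{K}_\Omega[f](x,v,t) = \frac{1}{\kappa}\,(v_d - v)\,\chi_{\Omega}(x,v) .
```

Evaluating these operators on an empirical measure $\frac{1}{N}\sum_j \delta(x - x_j)\,\delta(v - v_j)$ returns the corresponding particle sums, which is the consistency check behind the formal derivation.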
In what follows we recall some classical results on the rigorous derivation of the mean-field limit, restricting ourselves to the control expressed in equation (3.4) with a selective function. Finally, we discuss the case of the control with selective set.
Stability result in presence of selective function. Let us consider the control operator defined as in (3.4); for this case we give sufficient conditions to prove the mean-field limit (3.6) (see the hypotheses of Theorem 4.11 in []). To this end, let us first introduce the following definition.
Definition 3.1 (Wasserstein 1-distance). Let two Borel probability measures be given. Then the Wasserstein distance of order 1 between them is defined as
where the infimum is computed over the set of transference plans between the two measures, i.e. among the probability measures on the product space whose marginals are the two given measures.
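To make the definition concrete, the sketch below computes the Wasserstein 1-distance in the simplest special case: two empirical measures on the real line with the same number of equally weighted atoms, for which the optimal transference plan is the monotone (sorted) matching. This one-dimensional setting and the function name `wasserstein1_1d` are illustrative assumptions; the measures in the text live on the full phase space.

```python
import numpy as np

def wasserstein1_1d(a, b):
    """W1 between two 1D empirical measures with equally many, equally weighted atoms."""
    a = np.sort(np.asarray(a, dtype=float))
    b = np.sort(np.asarray(b, dtype=float))
    if a.shape != b.shape:
        raise ValueError("samples must have the same size")
    # In 1D the optimal plan pairs the k-th smallest atoms of the two samples.
    return np.abs(a - b).mean()

samples = np.array([0.0, 1.0, 3.0])
d_shift = wasserstein1_1d(samples, samples + 2.0)   # rigid shift by 2
d_self = wasserstein1_1d(samples, samples[::-1])    # same measure, reordered atoms
print(d_shift, d_self)
```

A rigid translation moves every unit of mass by the same amount, so the distance equals the size of the shift, while reordering the atoms leaves the measure, and hence the distance, unchanged.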
We further consider the subset of probability measures with compact support and finite first moment, which, endowed with the Wasserstein 1-distance, forms a non-complete metric space. Moreover, we introduce the set of functions that are locally Lipschitz with respect to the phase-space variables, uniformly in time. We then consider the operator such that
then we state the following
Lemma 3.1. Let the communication function and the selective function be locally Lipschitz and bounded, and let the measures have supports contained, at every time, in a ball of a given radius. Then for any ball there exists a constant such that
where denotes the Lipschitz constant in the ball .
Proof. Let us first derive the estimate
where for the sake of brevity we omit the dependency on , and where , where is such that , then we have (3.8a). Let us now introduce the optimal transference plan between and , in the sense of Definition 3.1, and having defined , which is again locally Lipschitz thanks to the boundedness of , then for any we have
Thus, taking the absolute value we have
which implies (3.8b). ∎
In this case the results of Lemma 3.1 are sufficient to satisfy the hypotheses of Theorem 4.11 in []; in this way existence, uniqueness and stability of measure solutions for model (3.6) are assured. The remarkable consequence of this theorem is the stability of the solutions in the Wasserstein 1-distance, which gives a rigorous derivation of the kinetic equation (3.6) as the limit, for a large number of agents, of the system of ODEs (3.1) in Dobrushin's sense; for further details see [].
In the case of control with selective set, the previous results are no longer valid, because the control carries the discontinuous characteristic function of the set, which is not locally Lipschitz; therefore more refined estimates are needed to prove a stability result. From the modeling viewpoint, a possible strategy consists in considering a mollified version of the characteristic function in order to gain enough regularity [, , ]. More refined results for the mean-field limit have been shown in the case of discontinuous kernels, and they might be extended to this case [].
4 Numerical simulations
One of the main difficulties in the numerical solution of kinetic models of type (3.6) arises in the approximation of the interaction operators, which usually requires a huge computational effort. In order to reduce the computational complexity we use a fast numerical algorithm based on the approximation of the interaction operator through a Boltzmann-like equation; we leave further details of this procedure to Appendix B and refer to [].
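The source of the cost, and one generic way to reduce it, can be sketched as follows. This is not the Boltzmann-like algorithm of Appendix B, whose details are not given here; it is a plain Monte Carlo subsampling of the interaction sum, in which each particle interacts with `n_samples` randomly chosen partners instead of all $N$, lowering the cost per step from $O(N^2)$ to $O(N \cdot \texttt{n\_samples})$. The kernel $H(r) = 1/(1+r^2)$ and all names are assumptions.

```python
import numpy as np

def subsampled_alignment_step(x, v, dt, n_samples, rng):
    """Forward Euler step with a Monte Carlo estimate of the alignment sum."""
    n = len(x)
    idx = rng.integers(0, n, size=(n, n_samples))       # random partners for each particle
    dx = x[idx] - x[:, None, :]                          # shape (n, n_samples, d)
    H = 1.0 / (1.0 + np.linalg.norm(dx, axis=2) ** 2)    # assumed kernel, values in (0, 1]
    # average of H_ij (v_j - v_i) over the sampled partners j
    align = (H[:, :, None] * (v[idx] - v[:, None, :])).mean(axis=1)
    return x + dt * v, v + dt * align

rng = np.random.default_rng(2)
x = rng.normal(size=(200, 2))
v = rng.normal(size=(200, 2))
vmax0 = np.linalg.norm(v, axis=1).max()
for _ in range(100):
    x, v = subsampled_alignment_step(x, v, dt=0.05, n_samples=10, rng=rng)
vmax1 = np.linalg.norm(v, axis=1).max()
print(vmax0, vmax1)
```

Because each update is a convex combination of current velocities (as long as `dt` times the kernel stays below one), the largest speed cannot grow, which gives a simple sanity check on the stochastic scheme.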
We perform the simulations for , defining initial data normally distributed in space, centered at zero with unit variance, and uniformly distributed in velocity on a circle of radius . Our goal is to enforce alignment with respect to the desired velocity . The evolution of the kinetic equation (3.6) is evaluated up to final time , with for the time discretization, considering sampled particles and scaling parameter .
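An initial sample of this type can be generated as in the sketch below. The space dimension (two), the number of particles and the radius are not recoverable from the text, so the values used here (`n_particles=1000`, `radius=1.0`) and the function name `sample_initial_data` are purely hypothetical placeholders.

```python
import numpy as np

def sample_initial_data(n_particles, radius, rng):
    """Positions ~ N(0, I) in the plane; velocities uniform on a circle of given radius."""
    x = rng.normal(loc=0.0, scale=1.0, size=(n_particles, 2))
    theta = rng.uniform(0.0, 2.0 * np.pi, size=n_particles)   # uniform angle
    v = radius * np.column_stack((np.cos(theta), np.sin(theta)))
    return x, v

rng = np.random.default_rng(4)
x0, v0 = sample_initial_data(n_particles=1000, radius=1.0, rng=rng)   # hypothetical values
speeds = np.linalg.norm(v0, axis=1)
print(x0.shape, speeds.min(), speeds.max())
```

All particles start with the same speed and a uniformly random direction, so the initial mean velocity is close to zero and the velocity field is radially symmetric in distribution.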
Hence we consider the mean-field model (3.6), with the standard communication function,
with ; with this choice of the parameter, the hypotheses of Theorem 1.1 are not satisfied, and consequently unconditional flocking is not guaranteed a priori.
We report in Figure 1 the initial data and the final state reached at time , depicting the spatial density and showing at each point the value of the flux, represented by the following vector quantity . Note that in the right-hand panel the flocking state is not reached, and the density spreads around the domain following the initial radially symmetric distribution of the velocity field.
Starting from this initial example, we want to stabilize the evolution, testing the performance of the different control policies in the presence of a selective function and in the case of a selective set.
4.1 Localized stabilization
We compare the two control approaches in the case of a selective control only capable of acting on a confined ball of the space domain. Hence, we define
Test 1a: filtered control. We study the evolution of the system in the presence of a selective control, where the selective function is , and the filtered control is defined by
Moreover, in order to compare the action of the selective control, we define the running cost and the total cost, respectively, as
The numerical results in Figure 2 show the higher influence of the control for increasing values of , i.e. for a longer influence of the control on the density, and for decreasing values of the penalization parameter .
In Figure 3 we additionally explore the range of parameters , with respect to the following quantities: measuring the alignment at final time for , and the following cost .
Test 1b: pointwise control. We now study the action of a control in the presence of the selective set ; the control is therefore active only on the density contained inside the ball of radius , and is defined as follows