# Performance Evaluation of a Multi-Agent Risk-Sensitive Tracking System

## Abstract

In this paper, we consider a simple linear exponential quadratic Gaussian (LEQG) tracking problem for a multi-agent system. We study the dynamical behaviors of the group as we vary the risk-sensitivity parameter, comparing in particular the risk averse case to the LQG case. Then we consider the evolution of the performance per agent as the number of agents in the system increases. We provide some analytical as well as simulation results. In general, more agents are beneficial only if noisy agent dynamics and/or imperfect measurements are considered. The critical value of the risk sensitivity parameter above which the cost becomes infinite increases with the number of agents. In other words, for a fixed positive value of this parameter, there is a minimum number of agents above which the cost remains finite.

## 1 Introduction

With the rapidly growing interest in sensor networks and distributed control systems, a large body of literature in recent years has focused on new and old engineering problems that these networks pose. Examples include consensus problems in various forms (see [1] and the references therein), control design with pre-specified information passing structure and control under communication constraints (e.g. [2, 3]), designing communication schemes in ad-hoc networks [4], etc.

In some cases, such as in signal processing and control for distributed parameter systems (e.g. [5, 6]), the distributed architecture of the system is imposed by the task in mind. But in a large number of situations, especially involving mobile robotic networks, the designer has a significant freedom in choosing the degree of distributivity of the system, the number and type of agents to use, etc. Yet relatively little work has focused on understanding quantitatively the benefits of multi-agent systems over centralized systems for various tasks, or the impact of the number of agents on the performance of the system, when this choice is available.

For a given task at hand, it is important to study these problems, because they influence most of the system design. For example, scalability is a relative notion. If we find that no significant improvement, or worse, a decrease in performance, is to be expected if we employ thousands of agents instead of tens or hundreds, it is probably better to design algorithms with good performance that are not necessarily the most decentralized. Moreover, it is clear that more agents are not always beneficial. For example, the capacity of a wireless sensor network is expected to decrease as the number of nodes increases [7]. A similar issue arises in mobile robotic networks where conflict-free vehicle routing is necessary and congestion increases with the size of the network [8].

In some recent work, researchers have obtained asymptotic performance scalings for certain multi-agent systems. Let us mention [9, 10] on various dynamic routing problems. One can also find in the computer science literature a large number of papers focusing on the minimum number of agents necessary to perform certain tasks: for instance, the minimum number of pursuers to catch an evader [11], or the minimum number of guards for an art gallery [12].

In this work, we study a simple multi-agent tracking problem using linear quadratic control tools, which are well suited for performance evaluation. We focus on the evolution of the performance per agent, as the number of agents increases. Numerous papers describe multi-agent architectures for tracking and estimation, the simultaneous localization and mapping problem, etc., focusing on the various subroutine designs described earlier (e.g. [13]). Because of the link between risk-sensitive control and differential games, our application is also related to some recent work on multi-agent pursuit-evasion games [14]. The model we consider is simpler, but our focus is on obtaining insight regarding the performance asymptotics of the system, which these references do not usually discuss.

The rest of the paper is organized as follows. Section 2 describes our model and section 3 the general form of its solution. We find in particular that a standard LQG formulation might not provide a satisfying cooperative control law for the group of agents, as the individual controllers decouple, essentially because of the certainty equivalence principle. We can obtain coupled control laws however if we consider risk-sensitive agents. Intuitively, we expect the set of risk-sensitive parameters where tracking is possible, or the robustness to model disturbances of our system, to increase with the number of agents. This is related to the earlier observation that a minimum number of agents can be necessary in certain pursuit-evasion games. Section 4 provides some analysis of this aspect. Increasing the number of agents becomes critical in our scenario only in the case of noisy dynamics and, of course, noisy measurements since more agents obtain a better state estimator. We give elements of analytical performance analysis as well as simulation results in that section as well.

## 2 Basic Model

We will consider the following tracking problem. There is an evader moving randomly in , subject to the linear dynamics

$$\dot{x}_e = A x_e + G w_e, \tag{1}$$

where is a standard -dimensional white Gaussian noise, and we let .

We have $n$ identical pursuers, which are also described by linear systems, with the same matrix $A$ for simplicity:

$$\dot{x}_{p,i} = A x_{p,i} + B u_{p,i} + \sqrt{\epsilon}\, F w_{p,i}, \quad i = 1, \ldots, n,$$

where are standard -dimensional white Gaussian noises, independent between different agents and from . is a parameter that will tend to in parts of the analysis later on. Let . The initial positions of the agents are also independent d-dimensional Gaussian random vectors.

Each pursuer incurs a running cost

$$\frac{1}{2}(x_{p,i} - x_e)'\, Q\, (x_{p,i} - x_e) \tag{2}$$

which is quadratic in the difference between its state and the state of the evader. There is also a running control cost for each agent, where is a positive definite matrix. Alternatively, the mobile agents are tracking the center of an ellipsoid moving randomly in according to (1), whose shape is known and given by a positive semi-definite matrix . A motivation for this model is cooperative soaring and tracking ascending currents for a team of Unmanned Aerial Vehicles.

Each agent has access to a relative measurement:

$$y_{p,i} = C(x_{p,i} - x_e) + H v_{p,i}, \quad i = 1, \ldots, n,$$

with a standard d-dimensional white noise. We let and assume to be positive definite. The measurement noise processes of the different agents are independent, and also independent of the various noises in the dynamics. For example, the agents could obtain noisy measurements of the gradient of the quadratic cost function (2), in which case . We will also consider the perfect measurement case, where , subject to no measurement noise.

We define , and the aggregate vectors , , , , . We will also use the Kronecker product of matrices, which we recall. If and are matrices, then we have by definition:

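$$A \otimes B = \begin{bmatrix} a_{11}B & \dots & a_{1n}B \\ \vdots & \ddots & \vdots \\ a_{m1}B & \dots & a_{mn}B \end{bmatrix},$$

and we will use the property:

$$(A \otimes B)(C \otimes D) = AC \otimes BD.$$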

The eigenvalues of are and the corresponding eigenvectors , where , , , are the eigenvalues and eigenvectors of and respectively. Hence if and are positive definite matrices, so is . Denote by the identity matrix, by the column vector of ones of size , and by the matrix of ones. Let

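$$A_n = I_n \otimes A, \quad B_n = I_n \otimes B, \quad F_n = I_n \otimes F, \quad Z_n = I_n \otimes Z, \quad C_n = I_n \otimes C, \quad H_n = I_n \otimes H, \quad V_n = I_n \otimes V,$$

be the block diagonal matrices describing the dynamics of the group of pursuers. Finally, let and the corresponding spectral density matrix be

$$W_n = E_n \otimes W = G_n G_n' = \begin{bmatrix} W & \cdots & W \\ \vdots & & \vdots \\ W & \cdots & W \end{bmatrix}.$$

Then the evolution of the relative states is described by:

$$\dot{x} = A_n x + B_n u + \sqrt{\epsilon}\, F_n w_p - G_n w_e, \quad x(0) = x_0,$$

with observations

$$y = C_n x + H_n v.$$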

We will assume that is a Gaussian random vector with mean and covariance matrix .
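To make the aggregate notation concrete, the following sketch checks the mixed-product property and the identity $W_n = G_n G_n'$ numerically with NumPy. The group size, the dimension and the random matrices are arbitrary illustration values, and the stacked form of $G_n$ used here (the evader noise gain repeated once per agent) is an assumption chosen to be consistent with the displayed block form of $W_n$.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 3, 2  # hypothetical group size and state dimension

# Mixed-product property: (A ⊗ B)(C ⊗ D) = AC ⊗ BD
A, C = rng.normal(size=(n, n)), rng.normal(size=(n, n))
B, D = rng.normal(size=(d, d)), rng.normal(size=(d, d))
lhs = np.kron(A, B) @ np.kron(C, D)
rhs = np.kron(A @ C, B @ D)
mixed_product_ok = np.allclose(lhs, rhs)

# Aggregate evader noise, assuming G_n stacks G once per agent
G = rng.normal(size=(d, d))
W = G @ G.T                      # spectral density of the evader noise
ones = np.ones((n, 1))           # column vector of ones
E_n = ones @ ones.T              # n x n matrix of ones
G_n = np.kron(ones, G)           # assumed aggregate noise gain
W_n = np.kron(E_n, W)            # every d x d block equals W
spectral_density_ok = np.allclose(G_n @ G_n.T, W_n)

print(mixed_product_ok, spectral_density_ok)
```

The second check is itself an application of the mixed-product property, since stacking $G$ and multiplying by its transpose yields the matrix of ones in the first Kronecker factor and $GG'$ in the second.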

### 2.1 Basic Example: Simple Integrators

In an example used repeatedly in the following, the pursuers are simply integrators and the evader follows a standard Brownian motion:

$$\begin{cases} \dot{x}_e = w_e \\ \dot{x}_{p,i} = u_i + \sqrt{\epsilon}\, w_{p,i}, \quad i = 1, \ldots, n, \end{cases}$$

i.e., we have the dynamics

$$\dot{x} = u + \sqrt{\epsilon}\, w_p - \begin{bmatrix} w_e \\ \vdots \\ w_e \end{bmatrix}.$$

We will also take , i.e.,

 y=c+v.
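The role of the common evader noise in these dynamics can be seen in a short simulation. The sketch below is an Euler-Maruyama discretization of the relative-state equation with the control set to zero; scalar agents, the step size, the horizon and the function name `simulate_relative_states` are all assumptions made for illustration.

```python
import numpy as np

def simulate_relative_states(n, eps, steps=1000, dt=1e-3, seed=0):
    """Euler-Maruyama for dx = sqrt(eps) dw_p - 1_n dw_e (scalar agents, u = 0)."""
    rng = np.random.default_rng(seed)
    x = np.zeros((steps + 1, n))
    for k in range(steps):
        dw_e = rng.normal(scale=np.sqrt(dt))           # one increment shared by all agents
        dw_p = rng.normal(scale=np.sqrt(dt), size=n)   # independent per-agent increments
        x[k + 1] = x[k] + np.sqrt(eps) * dw_p - dw_e
    return x

paths = simulate_relative_states(n=3, eps=0.0)
print(paths[-1])
```

With `eps=0.0` every agent sees exactly the same relative displacement, because the only noise source is the evader; a positive `eps` adds the independent per-agent component.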

## 3 Risk-Sensitive Tracking

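Let $Q_n = I_n \otimes Q$ and $R_n = I_n \otimes R$. We formulate the tracking problem as an infinite horizon linear exponential quadratic Gaussian (LEQG) problem:

$$\begin{aligned} \text{minimize } J &= \lim_{T \to \infty} \frac{2}{\theta T} \ln E\left\{ \exp \frac{\theta}{2} \int_0^T \frac{1}{n} \left( \sum_{i=1}^n (x_{p,i} - x_e)' Q (x_{p,i} - x_e) + \sum_{i=1}^n u_{p,i}' R u_{p,i} \right) dt \right\} \\ &= \lim_{T \to \infty} \frac{2}{\theta T} \ln E\left\{ \exp \frac{\theta}{2} \int_0^T \frac{1}{n} \left( x' Q_n x + u' R_n u \right) dt \right\}. \end{aligned}$$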

The factor $1/n$ is due to the fact that we want to obtain a measure of performance per agent. If $\theta > 0$ we consider risk-averse agents, if $\theta < 0$ the agents are risk-seeking, and in the limit $\theta \to 0$ the agents are risk-neutral and we recover the standard LQG formulation. There is a very large literature on the LEQG problem, as well as on more general risk-sensitive control problems, that we cannot review here. Suffice it to say that the LEQG problem was introduced in the engineering literature and solved in the full information case by Jacobson (who did not use the logarithmic transformation) [15]. Links with differential games are already emphasized in that paper, and the relationship to $H_\infty$ control is considered in [16, 17, 18]. The risk-sensitive state estimator is not the Kalman filter, and robustness properties of this filter have been explored for example in [19]. An important motivation for the early work of Jacobson in the full information case was to obtain a controller which is not independent of the noise in the dynamics, as arises in the LQG case because of the certainty equivalence principle. This motivation translates in our case into obtaining controllers for the pursuers which are not independent of the known characteristics of the random motion of the evader, certainly a desirable feature.

The LEQG output feedback solution over an infinite horizon is described for example in [18, 20]. We start by considering the full state feedback problem or perfect measurement case, that is, . Introduce the quantity

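$$S_n(\theta) = n B_n R_n^{-1} B_n' - \theta (W_n + \epsilon Z_n)$$

and consider the generalized algebraic Riccati equation (GARE)

$$A_n' X + X A_n - X S_n(\theta) X + \frac{1}{n} Q_n = 0. \tag{3}$$

Now define the quantity

$$\theta^*(n) = \sup\{\theta \in \mathbb{R} : \text{the GARE (3) admits a positive definite solution } X_{n,\theta}\}. \tag{4}$$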

Assume that is controllable and is observable. Then we have that is positive, and for all , the LEQG problem with perfect state measurements admits an optimal state-feedback solution

$$u = -\left(n R_n^{-1} B_n' X_{n,\theta}\right) x, \tag{5}$$

with the optimal cost being

$$J^*(\theta, n) = \mathrm{Tr}\left((W_n + \epsilon Z_n) X_{n,\theta}\right). \tag{6}$$

Furthermore the feedback matrix is Hurwitz. In the following we will omit to write the dependence with respect to of the solution of the GARE, and write just .
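As a numerical illustration of (3), (5) and (6), the sketch below solves the GARE for the scalar integrator example. It uses the stable invariant subspace of the associated Hamiltonian matrix, a standard technique substituted here for illustration (the text does not say how the equation was solved). The scalar data $A = 0$, $B = G = F = 1$, the default weights $q = r = 1$ and the function names are assumptions.

```python
import numpy as np

def solve_gare(A, S, Q):
    """Stabilizing solution of A'X + XA - XSX + Q = 0; S may be indefinite."""
    m = A.shape[0]
    H = np.block([[A, -S], [-Q, -A.T]])          # Hamiltonian matrix of the GARE
    eigvals, eigvecs = np.linalg.eig(H)
    stable = eigvecs[:, eigvals.real < 0]        # basis of the stable invariant subspace
    if stable.shape[1] != m:
        raise ValueError("no stabilizing solution found for these data")
    U1, U2 = stable[:m], stable[m:]
    X = np.real(U2 @ np.linalg.inv(U1))
    return (X + X.T) / 2                         # symmetrize against round-off

def scalar_integrator_gare(n, theta, eps, q=1.0, r=1.0):
    """Solve GARE (3) for n scalar integrators (assumed data: A = 0, B = G = F = 1)."""
    I, E = np.eye(n), np.ones((n, n))
    noise = E + eps * I                          # W_n + eps * Z_n
    S = (n / r) * I - theta * noise              # S_n(theta)
    X = solve_gare(np.zeros((n, n)), S, (q / n) * I)
    cost = np.trace(noise @ X)                   # cost per agent, eq. (6)
    return X, S, cost

X1, _, J1 = scalar_integrator_gare(n=1, theta=0.5, eps=0.0)
print(X1, J1)
```

For one agent with these data the equation is scalar, $X^2 (1/r - \theta) = q$, which gives a closed form to compare against. The eigenvector-based solver is adequate for small, well-conditioned problems; a Schur-based method would be the more robust choice in general.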

Now consider the output feedback solution for the problem with imperfect measurements described earlier. Recall that we assumed and let . Introduce the quantity

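$$T_n(\theta) = C_n' V_n^{-1} C_n - \frac{\theta}{n} Q_n,$$

and consider the two GAREs (3) and

$$Y A_n' + A_n Y - Y T_n(\theta) Y + (W_n + \epsilon Z_n) = 0. \tag{7}$$

Define the quantity

$$\theta^*_I(n) = \sup\{\theta \in \mathbb{R} : \text{the GAREs (3) and (7) admit minimal positive definite solutions } X_n \text{ and } Y_n \text{, respectively, and the matrix } I - \theta Y_n X_n \text{ has only positive eigenvalues}\}.$$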

For , introduce the filter

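$$\frac{d\hat{x}}{dt} = \left(A_n + \frac{\theta}{n} Y_n Q_n\right) \hat{x} + B_n u + Y_n C_n' V_n^{-1} (y - C_n \hat{x}), \quad \hat{x}_0 = \bar{x}_0.$$

Let $\tilde{x} = (I - \theta Y_n X_n)^{-1} \hat{x}$. Then one can compute that $\tilde{x}$ is generated by the following differential equation:

$$\frac{d\tilde{x}}{dt} = (A_n - S_n(\theta) X_n)\, \tilde{x} + (I - \theta Y_n X_n)^{-1} B \tilde{u} + (I - \theta Y_n X_n)^{-1} Y_n C_n' V_n^{-1} (y - C_n \tilde{x}),$$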

where .

Now suppose that the pairs and are controllable. For the second condition, in practice we will assume to be controllable. Also, assume that the pairs and are observable, and , then for all , the optimal controller is given by

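$$u^* = -n R_n^{-1} B_n' X_n (I - \theta Y_n X_n)^{-1} \hat{x} = -n R_n^{-1} B_n' X_n \tilde{x}.$$

The optimal cost is

$$J^*_I(\theta, n) = \mathrm{Tr}\left( \frac{1}{n} Y_n Q_n + Y_n C_n' V_n^{-1} C_n Y_n X_n (I - \theta Y_n X_n)^{-1} \right). \tag{8}$$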

### 3.1 Application to the basic example

It is interesting to visualize the trajectories of the agents in a simple case of the basic example. We consider a situation with perfect information, and . The agents are tracking an evader according to the model of random motion described earlier, but in this simulation the evader in fact remains immobile at the origin. The trajectories are shown on Fig. 1, for risk-averse, risk-neutral and risk-seeking agents, and show different qualitative convergence behaviors.

The risk-neutral trajectories are essentially trivial in the sense that the controllers are totally decoupled and the control law for one agent depends only on its separation from the target. In the risk-seeking and risk-averse cases however, the controllers become coupled, and each agent needs information about the position of all the other pursuers as well as the target. In the risk-averse case, two sets of agents starting at different distances from the target try to track it from both sides. Intuitively, the overshoot is due to a pessimistic behavior which leads them to give more importance to the case where the target moves away from them.
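The decoupling at $\theta = 0$ and the coupling otherwise can be read directly off the feedback gain in (5). The sketch below computes that gain for scalar integrators, under assumed data $A = 0$, $B = 1$, $W = Z = 1$, $Q = q$, $R = r$. With $A = 0$ the GARE (3) reduces to $X S_n(\theta) X = (q/n) I$, whose positive definite solution is $X = \sqrt{q/n}\, S_n(\theta)^{-1/2}$ whenever $S_n(\theta)$ is positive definite; this closed form is derived for the illustration and is not taken from the text.

```python
import numpy as np

def integrator_gain(n, theta, eps=0.0, q=1.0, r=1.0):
    """Gain n R_n^{-1} B_n' X_n of eq. (5) for n scalar integrators (A = 0, B = 1)."""
    I, E = np.eye(n), np.ones((n, n))
    S = (n / r) * I - theta * (E + eps * I)            # S_n(theta)
    lam, U = np.linalg.eigh(S)
    if lam.min() <= 0:
        raise ValueError("theta is at or above the critical value")
    X = np.sqrt(q / n) * (U / np.sqrt(lam)) @ U.T      # X = sqrt(q/n) * S^{-1/2}
    return (n / r) * X

for theta in (-0.5, 0.0, 0.5):
    print(theta)
    print(np.round(integrator_gain(3, theta), 3))
```

The off-diagonal entries of the gain vanish for the risk-neutral case and are nonzero for the other two values, which is the coupling described above.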

## 4 Performance Analysis

Equations (6) and (8) give the cost per agent when the group of pursuers consists of $n$ agents. In this section we study how this individual cost evolves as the number of agents increases.

### 4.1 Perfect State Measurements

#### LQG solution

The LQG solution is obtained for $\theta = 0$ (risk neutral). In this case we can rewrite the control GARE (3) as:

$$A_n' X + X A_n - X \left(n B_n R_n^{-1} B_n'\right) X + \frac{1}{n} Q_n = 0. \tag{9}$$

Due to the block diagonal form of the matrices, the equation decouples by agent. With our controllability and observability assumptions, (9) has a unique positive definite solution ([21], corollary 13.8). In particular, if $X_1$ is the positive definite solution for one agent, i.e.,

$$A' X_1 + X_1 A - X_1 B R^{-1} B' X_1 + Q = 0,$$

then by uniqueness the solution for $n$ agents is verified to be

$$X_n = \frac{1}{n} I_n \otimes X_1.$$

The controller for the group is decoupled as well; that is, each agent considers only the state difference between itself and the target, but not the states of the other agents.

The cost per agent in the perfect information case is then

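$$\begin{aligned} J^*(0, n) &= \mathrm{Tr}\left((W_n + \epsilon Z_n) X_n\right) = \frac{1}{n} \mathrm{Tr}\left(E_n \otimes W X_1 + I_n \otimes \epsilon Z X_1\right) \\ &= \mathrm{Tr}\left((W + \epsilon Z) X_1\right) = J^*(0, 1), \end{aligned}$$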

that is, the cost per agent is independent of the number of agents.
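As a quick sanity check, consider scalar integrators with assumed data $A = 0$, $B = 1$, $W = Z = 1$, $Q = q > 0$ and $R = r > 0$ (these values are chosen for illustration only). The single-agent Riccati equation and the resulting cost per agent then read:

$$-\frac{X_1^2}{r} + q = 0 \;\Longrightarrow\; X_1 = \sqrt{qr}, \qquad X_n = \frac{\sqrt{qr}}{n} I_n,$$

$$J^*(0, n) = \mathrm{Tr}\left((E_n + \epsilon I_n) X_n\right) = \frac{\sqrt{qr}}{n} (n + n\epsilon) = (1 + \epsilon) \sqrt{qr},$$

which indeed does not depend on $n$.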

#### Risk-Sensitive Solution with no dynamics noise

When $\theta \neq 0$, some coupling between the controllers of the agents is introduced through the matrix $W_n$. This is a desirable feature, as intuitively we would like the agents to take advantage of the fact that they can cooperate. We will take $\epsilon = 0$ here, which still leads to a nonsingular problem in the perfect information case.

Consider first the case of one agent. We can always write the solution as a perturbation of the LQG solution, i.e., as

$$X_1 = \tilde{X}_1 + \hat{X}_1,$$

where $X_1$ is the solution to (3) for $n = 1$ and $\tilde{X}_1$ is the solution to (9) for $n = 1$.

###### Proposition 4.1.

For $n$ agents and , the LEQG problem with perfect state measurements admits an optimal state-feedback solution (5) with the solution to (3) given by

$$X_n = \frac{1}{n} I_n \otimes \tilde{X}_1 + \frac{1}{n^2} E_n \otimes \hat{X}_1. \tag{10}$$
###### Proof.

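By definition we have that $\tilde{X}_1$ and $\hat{X}_1$ verify:

$$A' \tilde{X}_1 + \tilde{X}_1 A - \tilde{X}_1 B R^{-1} B' \tilde{X}_1 + Q = 0, \tag{11}$$

$$A' (\tilde{X}_1 + \hat{X}_1) + (\tilde{X}_1 + \hat{X}_1) A - (\tilde{X}_1 + \hat{X}_1)(B R^{-1} B' - \theta W)(\tilde{X}_1 + \hat{X}_1) + Q = 0. \tag{12}$$

We try this candidate solution in the GARE (3) for $\epsilon = 0$. Then by a straightforward calculation, using the identity $E_n^2 = n E_n$, we obtain:

$$\begin{aligned} &\frac{1}{n} I_n \otimes \left(A' \tilde{X}_1 + \tilde{X}_1 A - \tilde{X}_1 B R^{-1} B' \tilde{X}_1 + Q\right) \\ &\quad + \frac{1}{n^2} E_n \otimes \Big[ A' (\tilde{X}_1 + \hat{X}_1) + (\tilde{X}_1 + \hat{X}_1) A - (\tilde{X}_1 + \hat{X}_1)(B R^{-1} B' - \theta W)(\tilde{X}_1 + \hat{X}_1) \\ &\qquad\qquad - \left(A' \tilde{X}_1 + \tilde{X}_1 A - \tilde{X}_1 B R^{-1} B' \tilde{X}_1\right) \Big] = 0. \end{aligned}$$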

Indeed, in the first term we have (11) and in the second term we have the difference between (12) and (11).

Moreover, this solution is positive definite and stabilizing. This follows from the fact that the eigenvalues of

$$I_n \otimes M + \frac{E_n}{n} \otimes N,$$

for any diagonalizable matrices $M$ and $N$, are the eigenvalues of $M$ with multiplicities $n - 1$ as well as those of $M + N$. A set of corresponding eigenvectors is , where the are vectors spanning the kernel of (which is symmetric) and the are eigenvectors of , and , for eigenvector of . It is then easy to see that the eigenvalues of are those of and , and those of are the eigenvalues of and . But and are stabilizing and positive definite. ∎

###### Remark.

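In particular, we see that the critical value $\theta^*(n)$ is independent of $n$ in the perfect information case when $\epsilon = 0$.

We can then compute the cost of this solution to the risk-sensitive tracking problem, for $\epsilon = 0$. We get

$$\begin{aligned} J^*(\theta, n) &= \mathrm{Tr}(W_n X_n) = \mathrm{Tr}\left( \frac{1}{n} E_n \otimes W \tilde{X}_1 + \frac{1}{n^2} E_n^2 \otimes W \hat{X}_1 \right) \\ &= \mathrm{Tr}\left(W (\tilde{X}_1 + \hat{X}_1)\right) = J^*(\theta, 1). \end{aligned}$$

Hence, as in the LQG case, the cost per agent in the case $\epsilon = 0$ is independent of the number of agents.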
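Proposition 4.1 and the cost identity can be checked numerically by plugging the candidate (10) into the GARE (3). The sketch below does this for a scalar agent; the numbers `a`, `b`, `q`, `r`, `w`, `theta` and `n` are hypothetical, and the helper `are_scalar` simply returns the positive root of the scalar Riccati equation.

```python
import numpy as np

def are_scalar(a, s, q):
    """Positive root of 2 a X - s X^2 + q = 0 (requires s > 0)."""
    return (a + np.sqrt(a * a + s * q)) / s

# Hypothetical scalar data for one agent
a, b, q, r, w, theta, n = -0.3, 1.0, 1.0, 1.0, 1.0, 0.4, 4

X_tilde = are_scalar(a, b * b / r, q)               # LQG solution, eq. (11)
X_1 = are_scalar(a, b * b / r - theta * w, q)       # risk-sensitive solution, eq. (12)
X_hat = X_1 - X_tilde                               # perturbation term

I, E = np.eye(n), np.ones((n, n))
X_n = X_tilde * I / n + X_hat * E / n**2            # candidate solution (10)
A_n = a * I
S_n = n * (b * b / r) * I - theta * w * E           # S_n(theta) with eps = 0
residual = A_n.T @ X_n + X_n @ A_n - X_n @ S_n @ X_n + (q / n) * I
cost_n = np.trace(w * E @ X_n)                      # Tr(W_n X_n)

print(np.abs(residual).max(), cost_n, w * X_1)
```

The residual of the GARE should vanish up to round-off, and the cost for $n$ agents should coincide with the single-agent cost, in line with the computation above.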

#### Risk-Sensitive Solution with noisy dynamics

The risk-sensitive solution in the perfect measurement case seems to be most relevant when noise is present in the dynamics of the agents. In this case, the cost per agent is no longer independent of the number of agents, and moreover, the critical value $\theta^*(n)$ of the parameter increases with $n$. Another way to say this is that for a fixed value of risk aversion $\theta$, there is a minimum number of agents necessary to obtain a finite cost.

We only provide a numerical illustration of this fact. Fig. 2 shows the cost per agent in the basic example with . We set . In this case, the cost per agent is finite only if agents or more are used. The lower bound is the cost per agent for .
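For scalar integrators the dependence of the critical value on the number of agents can be made explicit. Under assumed data $A = 0$, $B = 1$, $W = Z = 1$ and $R = r$, the GARE (3) reads $X S_n(\theta) X = (q/n) I$, which has a positive definite solution exactly when $S_n(\theta)$ is positive definite. Its smallest eigenvalue is $n/r - \theta(n + \epsilon)$, which suggests $\theta^*(n) = n / (r(n + \epsilon))$. This formula is a derivation made for this sketch, not a result quoted from the text, and the function names are hypothetical.

```python
import numpy as np

def critical_theta(n, eps, r=1.0):
    """Largest theta keeping S_n(theta) positive definite (scalar integrators)."""
    return n / (r * (n + eps))

def min_agents(theta, eps, r=1.0):
    """Smallest n with theta < critical_theta(n); None if no group size suffices."""
    if theta * r >= 1.0:
        return None                      # critical_theta(n) stays below 1/r for eps > 0
    n = 1
    while critical_theta(n, eps, r) <= theta:
        n += 1
    return n

def s_n_is_positive_definite(n, theta, eps, r=1.0):
    """Direct numerical check on S_n(theta) = (n/r) I - theta (E_n + eps I)."""
    S = (n / r) * np.eye(n) - theta * (np.ones((n, n)) + eps * np.eye(n))
    return bool(np.linalg.eigvalsh(S).min() > 0)

print([round(critical_theta(n, eps=0.5), 3) for n in range(1, 8)])
print(min_agents(theta=0.9, eps=0.5))
```

In this setting the critical value increases with $n$ and approaches $1/r$, and it no longer depends on $n$ when `eps` is zero, which matches the qualitative behavior described in this section.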

### 4.2 Imperfect State Measurements

In the case of imperfect measurements, there is one obvious advantage of using more agents. As the number of agents taking measurements with independent noise sources increases, a better state estimate can be constructed. In this section, we consider the imperfect measurement tracking problem, with the simplifying assumption $A = 0$. Note that in this case, the matrix is not stabilizable when $\epsilon = 0$, resulting in an only marginally stable filter. Hence we have to keep a small noise term in the calculation, but in the analysis we will consider the dominant part as $\epsilon \to 0$, for which a clear answer is available.

#### LQG solution

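For $\theta = 0$ (risk neutral), the filter algebraic Riccati equation (7) for $A = 0$ becomes:

$$-Y C_n' V_n^{-1} C_n Y + (W_n + \epsilon Z_n) = 0. \tag{13}$$

Let $Y_1$ be the positive definite solution (unique in the LQG case) of the Riccati equation (13) for one agent, i.e.,

$$-Y_1 C' V^{-1} C Y_1 + (W + \epsilon Z) = 0.$$

Then we have

$$\left(\frac{E_n}{\sqrt{n}} \otimes Y_1\right) \left(I_n \otimes C' V^{-1} C\right) \left(\frac{E_n}{\sqrt{n}} \otimes Y_1\right) = \frac{E_n^2}{n} \otimes (W + \epsilon Z) = E_n \otimes W + \epsilon E_n \otimes Z,$$

and so as $\epsilon \to 0$, the solution to the $n$-agent problem, for which the right-hand side in the previous equation is $E_n \otimes W + \epsilon I_n \otimes Z$, approaches $\frac{E_n}{\sqrt{n}} \otimes Y_1$.

The total cost is

$$J^*_I(0, n) = \mathrm{Tr}\left( \frac{1}{n} Y_n Q_n + (W_n + \epsilon Z_n) X_n \right).$$

We have already seen that if $\epsilon = 0$, the second term of this expression becomes independent of $n$. Hence, as $\epsilon \to 0$, the cost approaches

$$\begin{aligned} J^*_I(0, n) &\approx \frac{1}{n} \mathrm{Tr}\left\{ \left(\frac{E_n}{\sqrt{n}} \otimes Y_1\right) (I_n \otimes Q) \right\} + \mathrm{Tr}(W X_1) \\ &\approx \frac{1}{\sqrt{n}} \mathrm{Tr}\{Y_1 Q\} + \mathrm{Tr}(W X_1). \end{aligned}$$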

In conclusion, as the number of agents increases, the tracking performance per agent converges to the control performance for one agent at rate $1/\sqrt{n}$, due to a better estimation performance only. This is intuitively expected from our understanding of the asymptotic normality of maximum likelihood estimators. However, if we consider the more general diffusion process with and , it is not clear whether this asymptotic rate of convergence still holds.
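The $1/\sqrt{n}$ behavior of the estimation term can be observed numerically for scalar agents. The sketch below assumes $A = 0$, $C = 1$, $W = Z = 1$, $Q = q$ and $V = v$, so that the filter equation (13) becomes $Y^2 / v = W_n + \epsilon Z_n$ and $Y_n$ is a scaled matrix square root; the function name and parameter values are hypothetical.

```python
import numpy as np

def estimation_cost(n, eps, q=1.0, v=1.0):
    """Tr(Y_n Q_n / n) for n scalar agents with A = 0, C = 1, W = Z = 1."""
    noise = np.ones((n, n)) + eps * np.eye(n)          # W_n + eps * Z_n
    lam, U = np.linalg.eigh(noise)
    lam = np.clip(lam, 0.0, None)                      # guard against round-off
    Y = np.sqrt(v) * (U * np.sqrt(lam)) @ U.T          # Y = sqrt(v) * (W_n + eps Z_n)^{1/2}
    return q * np.trace(Y) / n

for n in (1, 4, 16, 64):
    # The second printed column should stay close to Tr(Y_1 Q) = sqrt(v) * q
    print(n, estimation_cost(n, eps=1e-8) * np.sqrt(n))
```

For a fixed small `eps` the rescaled cost stays close to the single-agent value over this range of $n$. The small-noise limit and the large-$n$ limit do not commute in this example, since a residual term of order $\sqrt{\epsilon}$ remains for every $n$.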

#### Risk-Sensitive Solution

Again let us consider the case and . In the limit , the solution to the control GARE is given by (10) and this equation does not influence the critical value of the parameter . The filter GARE is:

$$Y \left( I_n \otimes \left( C' V^{-1} C - \frac{\theta}{n} Q \right) \right) Y = E_n \otimes W + \epsilon I_n \otimes Z. \tag{14}$$

Hence, as in the LQG problem, we see that as $\epsilon \to 0$, the solution to this equation approaches

$$Y_n = \frac{E_n}{\sqrt{n}} \otimes \tilde{Y}_{1,n}, \tag{15}$$

but an essential difference is that now $\tilde{Y}_{1,n}$ is the solution to the equation

$$Y \left( C' V^{-1} C - \frac{\theta}{n} Q \right) Y = W + \epsilon Z, \tag{16}$$

which is not the single-agent equation, but depends on $n$ through the parameter $\theta / n$. Fig. 3 provides some experimental results on the performance of the multi-agent system in various cases considered above.

If the constraint that $I - \theta Y_n X_n$ must have only positive eigenvalues were not present, $\theta^*_I(n)$ from (16) would increase linearly with $n$. Now it is easy to check from (10) and (15) that as $\epsilon \to 0$, $Y_n X_n$ approaches

$$Y_n X_n \approx \frac{E_n}{n^{3/2}} \otimes \tilde{Y}_{1,n} X_1,$$

$E_n$ has eigenvalues $0$ and $n$, so the condition as $\epsilon \to 0$ becomes

$$\rho\left(\theta\, \tilde{Y}_{1,n,\theta} X_{1,\theta}\right) < \sqrt{n},$$

where $\rho$ denotes the spectral radius. We have indicated the dependence of $\tilde{Y}_{1,n}$ and $X_1$ on $\theta$ in this condition. Fig. 4 shows the evolution of the critical value with the number of agents, computed for the basic example. We obtained similar experimental results for . However if it seems that we can obtain different behaviors.

Note that in the various cases considered above, when an analytical solution could be obtained it always implied that for solving the n agent problem, we only need to solve a Riccati equation of the same size as for one agent. Hence significant computational reduction can be achieved by taking advantage of the symmetries present when considering homogeneous agents.

## 5 Conclusion

In this work we have considered a basic tracking task to be performed cooperatively by a team of mobile sensors. We have given some elements of analysis concerning the influence of the size of the group on the individual performance. Intuitively, there seems to be much gain to be obtained by multi-agent systems in terms of robustness and in a risk-sensitive context, and we believe that more work in this direction is needed. In any case, it is clear that there is a need for a better understanding of the beneficial role of distributed architectures, when this aspect is under the control of the system designer.

### References

1. V. D. Blondel, J. M. Hendrickx, A. Olshevsky, and J. N. Tsitsiklis, “Convergence in multiagent coordination, consensus, and flocking,” in Proceedings of the Joint 44th IEEE Conference on Decision and Control and European Control Conference, Seville, Spain, December 2005.
2. S. Tatikonda, “Control under communication constraints,” Ph.D. dissertation, Massachusetts Institute of Technology, 2000.
3. C. Langbort, R. Chandra, and R. D’Andrea, “Distributed control design for systems interconnected over an arbitrary graph,” IEEE Transactions on Automatic Control, vol. 49, no. 9, pp. 1502–1519, 2004.
4. I. Akyildiz, S. Weilian, Y. Sankarasubramaniam, and E. Cayirci, “A survey on sensor networks,” IEEE Communications Magazine, vol. 40, no. 8, pp. 102–114, 2002.
5. A. Willsky, “Multiresolution Markov models for signal and image processing,” Proceedings of the IEEE, vol. 90, no. 8, pp. 1396–1458, 2002.
6. B. Bamieh, F. Paganini, and M. Dahleh, “Distributed control of spatially invariant systems,” IEEE Trans. Automatic Control, vol. 47, pp. 1091–1107, 2002.
7. P. Gupta and P. Kumar, “The capacity of wireless networks,” IEEE Transactions on Information Theory, vol. 46, no. 2, pp. 388–404, March 2000.
8. M. Savchenko and E. Frazzoli, “On the time complexity of conflict-free vehicle routing,” in Proceedings of the American Control Conference, 2005.
9. J. Enright, E. Frazzoli, K. Savla, and F. Bullo, “On multiple UAV routing with stochastic targets: Performance bounds and algorithms,” in AIAA Guidance, Navigation, and Control Conference and Exhibit, San Francisco, CA, August 2005.
10. H. Waisanen, D. Shah, and M. Dahleh, “Optimal delay in networks with controlled mobility,” in 17th International Symposium on Mathematical Theory of Networks and Systems, 2006.
11. V. Isler, S. Kannan, and S. Khanna, “Randomized pursuit-evasion with local visibility,” SIAM Journal on Discrete Mathematics, vol. 20, no. 1, pp. 26–41, 2006.
12. T. Shermer, “Recent results in art galleries,” Proceedings of the IEEE, vol. 80, no. 9, pp. 1384–1399, September 1992.
13. S. Martínez and F. Bullo, “Optimal sensor placement and motion coordination for target tracking,” Automatica, vol. 42, no. 4, pp. 661–668, 2006.
14. J. Hespanha, M. Prandini, and S. Sastry, “Probabilistic pursuit-evasion games: A one-step Nash approach,” in Proceedings of the IEEE Conference on Decision and Control, 2000.
15. D. Jacobson, “Optimal stochastic linear systems with exponential performance criteria and their relation to deterministic differential games,” IEEE Transactions on Automatic Control, vol. 18, no. 2, pp. 124–131, 1973.
16. K. Glover and J. Doyle, “State-space formulae for all stabilizing controllers that satisfy an $H_\infty$ norm bound and relations to risk sensitivity,” Systems and Control Letters, vol. 11, no. 3, pp. 167–172, 1988.
17. T. Başar and P. Bernhard, $H_\infty$-Optimal Control and Related Minimax Design Problems: A Dynamic Game Approach, 2nd ed. Birkhäuser, 1995.
18. P. Whittle, Risk-Sensitive Optimal Control. Wiley, 1990.
19. R. Boel, M. James, and I. Petersen, “Robustness and risk-sensitive filtering,” IEEE Transactions on Automatic Control, vol. 47, pp. 451–461, 2002.
20. Z. Pan and T. Başar, “Model simplification and optimal control of stochastic singularly perturbed systems under exponentiated quadratic cost,” SIAM Journal on Control and Optimization, vol. 34, no. 5, pp. 1734–1766, 1996.
21. K. Zhou, J. Doyle, and K. Glover, Robust and Optimal Control. Prentice Hall, 1996.