Tight estimates for convergence of some non-stationary consensus algorithms

Abstract

The present paper is devoted to estimating the speed of convergence towards consensus for a general class of discrete-time multi-agent systems. In the systems considered here, both the topology of the interconnection graph and the weight of the arcs are allowed to vary as a function of time. Under the hypothesis that some spanning tree structure is preserved along time, and that some nonzero minimal weight of the information transfer along this tree is guaranteed, an estimate of the contraction rate is given. The latter is expressed explicitly as the spectral radius of some matrix depending upon the tree depth and the lower bounds on the weights.

Keywords: multiagent systems; distributed consensus; convergence rate; linear time-varying systems; uncertain systems; stochastic matrices; Perron-Frobenius theory; mixing rates.

1 Introduction

Having emerged in the areas of communication networks, control theory and parallel computation, the analytical study of ways of reaching consensus in a population of agents is a problem of broad interest in many fields of science and technology; see [2] for references. Of particular interest is the question of estimating how quickly consensus is reached on the basis of a small amount of qualitative (mainly topological) information, together with basic quantitative information on the network (mainly the strength of reciprocal influences).

Originally, this problem was considered in the context of stationary networks. For Markov chains that are homogeneous (that is, stationary in the vocabulary of dynamical systems), it amounts to quantifying the speed at which the steady-state probability distribution is reached, and is therefore directly related to finding an a priori estimate of the second largest eigenvalue of a stochastic matrix. Classical works on this subject are due to Cheeger and Diaconis [6, 8]; see also [11] for improved bounds, as well as [19, 14] and [18] for a survey. These works concern reversible Markov chains, for example when the transition matrix is symmetric; see e.g. [9] for the non-reversible case.

Among the classical contributions that instead deal with time-varying interactions, we refer to the work of Cohn [7], where asymptotic convergence is proved without addressing the relation between topology and guaranteed convergence rates. Tsitsiklis et al. also provided important qualitative contributions to this subject [20, 21, 3], as did Moreau [15]. See also [1] for further nonlinear results. In particular, the role played by connectivity of the communication graph and by spanning trees in the convergence to consensus has been recognised and finely analysed [15, 5, 16].

More recently, important contributions to the characterization of convergence to consensus in a time-varying set-up were made by several authors; see for instance [3, 15]. See also [5, 4] for more specific cases.

In a previous paper [2], several criteria were provided to estimate quantitatively the contraction rate of a set of agents towards consensus, in a discrete-time framework. The approach there consisted in following the spread of information over the agent population, along one or more spanning trees. Imposing a lower bound on the matrix entries associated with the agents already reached by the information flow along the spanning tree, rather than on all nonzero contributions as is done classically, made it possible to obtain tighter estimates under weaker assumptions. By distinguishing between two sub-populations, the agents already reached by the spanning tree and those not yet reached, and by using lower bounds on the influence of the former, one is able to establish rather precise convergence estimates.

As a matter of fact, rapid consensus can be obtained in two quite different ways: either by dense and isotropic communications (based, say, on a complete graph), or by highly asymmetric and sparse relations (with a star-shaped graph with a leading root). In the first case many spanning trees cover the graph, while in the second configuration a unique one does the job.

The present article is a continuation of [2]. Emphasis is put on the propagation of information along a unique spanning tree and on the resulting consequences in terms of convergence speed. It is demonstrated that, in the particular case where such a spanning tree structure is guaranteed to exist at any time, ensuring a minimal weight for the transmission of information along the tree (from the root to the leaves) indeed enforces some minimal convergence rate, whose expression is particularly simple. A worst-case estimate is provided, expressed as the spectral radius of a certain matrix whose size equals the depth of the tree and whose coefficients depend in a simple way on the assumed minimal weights. This results in an appreciable improvement over existing evaluations.

The paper is organized as follows. Section 2 contains the problem formulation and a presentation of the main result, together with the minimal set of technical tools needed to understand it. A comparison system is then introduced in Section 3; its study is central to establishing the convergence estimate. An original method of analysis of this system is used in Section 4 to obtain the convergence rate estimate (the main result of the paper, Theorem 3, is stated there), and some properties of the latter are studied. This result is discussed in Section 5, followed by some concluding remarks. The proofs are deferred to the Appendix.

Notations

The -th vector of the canonical basis in the space () is denoted ; the vector with all components equal to 1 in is written . When the context is clear, we omit the exponent and just write , resp.  to facilitate reading. We also use brackets to select components of vectors. All this notation is standard, and for a vector , the -th component is written alternatively , , or .

The systems considered here will be composed of agents: accordingly, we let .

As usual, identity and zero square matrices of dimension are denoted and respectively. We denote the matrix with ones on the sub-diagonal and zeros otherwise: . Here, and later in the text, denotes the Kronecker symbol, equal to 1 (resp. 0) when the condition written in the subscript is fulfilled (resp. is not). For completeness, recall that a real square matrix is said to be stochastic (row-stochastic) if it is nonnegative and each of its rows sums to 1.

The spectral radius of a square matrix is denoted . Last, we use the notion of nonnegative matrices, meaning real matrices which are componentwise nonnegative. Accordingly, the order relations and envisioned for matrices are meant componentwise.

2 Problem formulation and presentation of the main result

Our aim is to estimate the speed of convergence towards consensus for the following class of time-varying linear systems:

(1)

where is a sequence of stochastic matrices (in particular, ; this is exactly the dual of what happens in the case of non-homogeneous Markov chains, where the probability distribution, written as a row vector , verifies rather a relation like ).
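As a minimal sketch of this iteration, assume that (1) has the usual form x(t+1) = A(t) x(t) with row-stochastic matrices A(t). The function names and the two example matrices below are illustrative assumptions, not taken from the paper.

```python
def is_row_stochastic(A, tol=1e-12):
    """True if A is nonnegative and every row sums to 1 (up to tol)."""
    return all(
        all(a >= 0.0 for a in row) and abs(sum(row) - 1.0) <= tol
        for row in A
    )


def step(A, x):
    """One update x(t+1) = A(t) x(t): each agent takes a convex
    combination of the current agent values."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


# Two hypothetical interaction patterns on 3 agents, used alternately.
# Agent 0 acts as a root that keeps its own value.
A_EVEN = [[1.0, 0.0, 0.0],
          [0.5, 0.5, 0.0],
          [0.0, 0.0, 1.0]]
A_ODD = [[1.0, 0.0, 0.0],
         [0.0, 1.0, 0.0],
         [0.0, 0.5, 0.5]]


def simulate(x0, steps):
    """Iterate the time-varying system, alternating the two matrices."""
    x = list(x0)
    for t in range(steps):
        x = step(A_EVEN if t % 2 == 0 else A_ODD, x)
    return x
```

Neither matrix alone connects the root to agent 2, yet alternating them lets the root's value propagate to every agent, which is the kind of time-varying situation the paper considers.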

Let us first introduce some technical vocabulary in order to present in the simplest terms the main result of the paper, which is stated formally in Section 4. The definition of the quantity we intend to estimate is as follows.

Definition 1 (Contraction rate).

We call contraction rate of system (1) the number defined as:

where the supremum is taken on those for which the denominator is nonzero.

The contraction rate is thus related to the speed of convergence to zero of the agent set diameter. In what follows, the latter plays the role of a Lyapunov function to study convergence to agreement. For stationary systems, as is well known, the number is indeed the second largest eigenvalue of the matrix . More generally, it corresponds to the second largest Lyapunov exponent of the considered sequence of matrices .
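To make the role of the diameter concrete, the sketch below tracks max_i x_i - min_i x_i along one trajectory of a hypothetical stationary two-agent system with a leader, A = [[1, 0], [alpha, 1 - alpha]]. Its eigenvalues are 1 and 1 - alpha, and the diameter shrinks by exactly the factor 1 - alpha at each step. The names and the value of alpha are assumptions made for illustration. Since the contraction rate is a supremum over initial conditions, a single trajectory can only give a lower estimate of it.

```python
def diameter(x):
    """Diameter of the agent set: largest minus smallest component."""
    return max(x) - min(x)


def step(A, x):
    """One update x(t+1) = A x(t)."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def empirical_rate(A, x0, steps):
    """Geometric-mean per-step contraction of the diameter along the
    trajectory started at x0 (stationary matrix A)."""
    d0 = diameter(x0)
    if d0 == 0.0:
        raise ValueError("initial diameter must be nonzero")
    x = list(x0)
    for _ in range(steps):
        x = step(A, x)
    return (diameter(x) / d0) ** (1.0 / steps)


ALPHA = 0.3  # hypothetical weight given by agent 1 to the leader
A_LEADER = [[1.0, 0.0],
            [ALPHA, 1.0 - ALPHA]]
```

Calling `empirical_rate(A_LEADER, [0.0, 1.0], 20)` returns a value numerically equal to 1 - ALPHA.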

Definition 2 (Communication graph).

We call communication graph of system (1) at time the directed graph defined by the ordered pairs such that .
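A sketch of how the arcs could be read off a matrix A(t) follows. It assumes the convention that an arc (j, i) is present when agent i gives positive weight to agent j, that is when A[i][j] > 0, with 0-based indices. The orientation of the pairs and the function name are assumptions; the paper's own convention may be the reverse.

```python
def communication_arcs(A, tol=0.0):
    """Return the set of ordered pairs (j, i) with A[i][j] > tol,
    read as 'information flows from agent j to agent i'."""
    n = len(A)
    return {(j, i) for i in range(n) for j in range(n) if A[i][j] > tol}


# Hypothetical chain 0 -> 1 -> 2 with self-loops on every agent.
A_CHAIN = [[1.0, 0.0, 0.0],
           [0.5, 0.5, 0.0],
           [0.0, 0.5, 0.5]]
```

For `A_CHAIN`, the result contains the three self-loops together with the arcs (0, 1) and (1, 2).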

In the present context, we use the terms “node” and “agent” interchangeably.

We now introduce assumptions on the existence of a constant hierarchical structure embedded in the communication graph, and on minimal weights attached to the corresponding links.

Assumption 1.

For a given positive integer , called the depth of the communication graph, assume the existence of nested sets such that

  • is a singleton (whose element is called the root);

  • ;

  • .

Assume in addition, for given nonnegative real numbers , that, for all and all

(2a)
(2b)
(2c)

As an example, the sets may be induced by some fixed spanning tree embedded in the communication graph: the existence of a distinguished agent, the root, is presupposed and, although the matrices and the underlying communication graphs are allowed some variations, information progresses from this root along a (time-varying) tree to reach all the agents. The number bounds from above the minimal time for the information to reach the agents most distant from the root. Likewise, we call the depth of agent . The set indeed consists of all the agents whose depth is guaranteed by Assumption 1 to be at most equal to . An example of a (fixed) communication graph and the associated nested sets is shown in Figure 1.
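For a fixed graph, nested sets of this kind can be generated by a breadth-first search from the root: the k-th set collects the agents reachable from the root in at most k hops. The sketch below assumes 0-based agent indices, arcs oriented from sender to receiver, and a numbering of the sets that starts at 0. These conventions and all names are illustrative assumptions.

```python
def nested_sets(arcs, root, n):
    """Return [S_0, S_1, ...] where S_0 = {root} and S_k is the set of
    agents reachable from the root in at most k hops. The list stops as
    soon as one more hop adds no new agent."""
    successors = {i: set() for i in range(n)}
    for j, i in arcs:
        if i != j:  # self-loops do not spread information further
            successors[j].add(i)
    sets = [{root}]
    while True:
        current = sets[-1]
        grown = current | {i for j in current for i in successors[j]}
        if grown == current:
            break
        sets.append(grown)
    return sets


# Hypothetical tree: root 0 feeds agents 1 and 2, and agent 1 feeds agent 3.
TREE_ARCS = {(0, 1), (0, 2), (1, 3)}
```

Here `nested_sets(TREE_ARCS, 0, 4)` gives `[{0}, {0, 1, 2}, {0, 1, 2, 3}]`. The last set contains every agent, so the graph has a spanning tree rooted at agent 0, of depth 2 under this numbering.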

Figure 1: The nested sets and the spanning tree

In addition to the spanning tree structure, Assumption 1 imposes minimal weights on the information transmitted downstream along this structure (this is the role played by ), and also on the information used between agents located at the same depth. Concerning the latter, expressed by condition (2b), remark that it is fulfilled by self-loops, that is when

(because by definition, for ); but it is indeed weaker: it allows just as well communications between agents whose depths are equal. The constraint on the self-loops of the root agent, measured by , differs from the one imposed on the other agents (); this is done on purpose, and makes it possible to treat simultaneously the case of leaderless coordination and that of ‘pure’ coordination with a leader (case corresponding to ).

Last, notice that, the matrices being stochastic, one should have:

for Assumption 1 to be fulfilled.

We are now in a position to present the contents of Theorem 3. The latter states that, under the conditions stated above, the rate of convergence of system (1) is at most equal to the spectral radius of the matrix defined by

A major characteristic of this estimate is that it is independent of the number of agents: it only depends upon and the depth .
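Since the size of this matrix is governed by the depth rather than by the number of agents, its spectral radius is cheap to evaluate numerically. As a hedged sketch, the routine below applies power iteration to a generic entrywise positive matrix and returns the Collatz-Wielandt bounds min_i (Mx)_i / x_i <= rho(M) <= max_i (Mx)_i / x_i, which hold for any positive vector x. The example matrix is a placeholder with known eigenvalues 0.6 and 0.2; it is not the matrix of the theorem.

```python
def spectral_radius_bounds(M, iterations=200):
    """Power iteration on an entrywise positive square matrix M.
    Returns (lo, hi), the Collatz-Wielandt lower and upper bounds on the
    spectral radius computed from the last positive iterate."""
    n = len(M)
    x = [1.0] * n
    lo, hi = 0.0, float("inf")
    for _ in range(iterations):
        y = [sum(m * xj for m, xj in zip(row, x)) for row in M]
        if min(y) <= 0.0:
            raise ValueError("iterate lost positivity; M must be positive")
        ratios = [yi / xi for yi, xi in zip(y, x)]
        lo, hi = min(ratios), max(ratios)
        total = sum(y)
        x = [yi / total for yi in y]  # renormalise to avoid under/overflow
    return lo, hi


# Placeholder positive matrix: trace 0.8, determinant 0.12.
M_PLACEHOLDER = [[0.5, 0.3],
                 [0.1, 0.3]]
```

The gap between the two returned values gives a direct measure of how far the iteration is from convergence.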

We introduce, in the rest of the present Section, a general example where Assumption 1 is naturally fulfilled.

Definition 3 (-tree matrix).

For every nonnegative numbers , we call -tree matrix any matrix defined by the recursion formula

where, for all , the vector is a vector of the canonical basis in .

Notice that the agents have implicitly been numbered by the tree matrix representation: the pertinent information propagates from smaller to larger indices. A central case where Assumption 1 holds is given by the following result.

Proposition 1.

Let be stochastic matrices. Assume the existence of a sequence of -tree matrices such that for all

Then, after some finite time, system (1) fulfills Assumption 1 with

and recursively defined as

We emphasize the fact that the lower bound may vary over time.

By construction, in Proposition 1 verifies: , and does not depend upon the ordering of the matrices . Moreover, it may be proved directly that in the particular case of constant , is the depth of the associated graph (defined in Definition 2); generally speaking, however, the depth of a tree matrix sequence is at least equal to the supremum of the depths of the individual matrices . To prove both properties, it is sufficient to remark that, in the case of constant , the previous formula indeed computes the depth of the associated graph. Figure 2 presents the case of two matrices for which the supremum of depths is equal to 2, that is strictly less than the depth of the sequence of matrices obtained by alternately taking each of them, which is here equal to 3 (and also strictly less, in this case, than ). One can take the numbers defined in Assumption 1 equal to the corresponding numbers given below, a quite natural choice which yields in the present case:

Proof of Proposition 1.

Let the family of sets be defined as in the statement. Clearly is a singleton, and also for all as desired. Moreover, . In particular then, for all . Since for all and all , it is straightforward to verify that

as desired, for Assumption 1 to hold. Finally, let , viz. . This yields for some for infinitely many times. Actually, more is true due to the special structure of tree matrices, namely for infinitely many s. We claim that for all larger than some finite time there exists such that . Indeed, let denote the father of the -th node in the tree matrix , that is the unique index such that ; clearly, for all . Indeed, for all sufficiently large s, , (otherwise infinitely many times and, therefore, we would have , which is a contradiction). Let . For all subsequent times, we have:

This concludes the proof of the Proposition. ∎

Figure 2: Trees of depth inducing a nested structure of depth .

3 A comparison system for the diameters evolution

We now build an auxiliary time-varying system, with a simpler structure than (1), and with the property that the asymptotic contraction rate of the original system can be bounded from above by carrying out suitable computations on this newly introduced system. Our main result for the present section is a statement relating the convergence of (1) towards consensus to that of the comparison system introduced below.

Theorem 2.

Assume system (1) fulfills Assumption 1, for given nonnegative numbers (such that ). Let be defined by

Then, satisfies the following inequality:

(3)

where

(4)
(5)

Recall that inequality (3) is meant componentwise. A complete proof of Theorem 2 is provided in Section A.1.
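A componentwise inequality of this kind can be iterated because multiplication by a nonnegative matrix preserves the componentwise order. As a small numerical illustration, assume that (3) bounds a nonnegative vector delta(t+1) by a nonnegative matrix applied to delta(t). Any sequence satisfying delta(t+1) <= M delta(t) then stays below the solution of the comparison recursion d(t+1) = M d(t) started from d(0) >= delta(0). The matrix, the sequence and all names below are placeholders, not the quantities of Theorem 2.

```python
def matvec(M, x):
    """Matrix-vector product with plain lists."""
    return [sum(m * xj for m, xj in zip(row, x)) for row in M]


def leq(u, v, tol=1e-12):
    """Componentwise comparison u <= v (up to tol)."""
    return all(a <= b + tol for a, b in zip(u, v))


def comparison_bound(M, d0, steps):
    """Return [d(0), ..., d(steps)] for the recursion d(t+1) = M d(t)."""
    traj = [list(d0)]
    for _ in range(steps):
        traj.append(matvec(M, traj[-1]))
    return traj


def slack_sequence(M, delta0, steps, factor=0.9):
    """A hypothetical nonnegative sequence with
    delta(t+1) = factor * M delta(t), hence delta(t+1) <= M delta(t)."""
    traj = [list(delta0)]
    for _ in range(steps):
        traj.append([factor * v for v in matvec(M, traj[-1])])
    return traj


# Placeholder nonnegative matrix.
M_PLACEHOLDER = [[0.6, 0.0],
                 [0.4, 0.5]]
```

Comparing `slack_sequence(M_PLACEHOLDER, [1.0, 1.0], 10)` with `comparison_bound(M_PLACEHOLDER, [1.0, 1.0], 10)` term by term using `leq` confirms the domination at every step.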

Remark 1.

Two special cases of interest for the application of Theorem 2 are obtained for the following values of the parameters:

  1. : viz. the communication graph admits a leader; under such premises, the expressions for and simplify as follows:

  2. , viz. the root agent does not differ from any other member of the group in terms of the confidence it places in its own position in the formation of consensus:

4 Convergence rate estimate and properties

Based on Theorem 2, we now provide Theorem 3, which states precisely the property announced at the beginning of the paper.

Theorem 3.

Consider the linear time-varying dynamical system (1), with stochastic. Assume Assumption 1 is fulfilled. Then, the contraction rate towards consensus can be bounded according to the following formula:

(6)

with given in (4) and (5).

The proof of Theorem 3 is given in Section A.2. Recall that stochasticity of implies that the nonnegative scalars satisfy: , .

Theorem 3 provides a tight estimate for the contraction rate of (1) on the basis of the parameters , and , and of the depth of the sequence of tree matrices. We emphasize the fact that the result holds for time-varying systems. Indeed, Theorem 3 is an inherently robust result, as Assumption 1 allows for much uncertainty in the definition of system (1). This robustness is meant with respect to variations of the communication graph (provided these variations do not violate the set conditions of Assumption 1), and with respect to variations of the coefficients of the matrix (provided they respect the quantitative constraints in Assumption 1).

A central fact is that the value in (6) does not depend upon the number of agents involved in the network: rather the depth of the graph is involved, which is quite natural.

Some properties of the estimate are now given. They are useful for understanding the asymptotic behaviour of the contraction estimate, as well as its monotonicity properties; the latter are in agreement with the increase or decrease of the information made available by varying the parameters and .

Theorem 4.

Let . Then for any , has the following properties.

  • is the largest real root of the polynomial equation

    (7)
  • For any , .

  • For any , .

  • if and only if .

  • when , and more precisely

Theorem 4 is proved in Section A.3.

The following result, proved in Section A.4, studies the variation of as a function of . When considering as a function of these quantities, we write , meaning for defined as in (5).

Theorem 5.

For any ,

  • the function is nonincreasing on the set ;

  • the function is nonincreasing on the set ;

  • if , then when .

Moreover, for any ,

  • if and only if or .

  • if and only if .

Notice that the estimates given in the last two points of Theorem 5 are tight: they are reached for the following stationary systems:

Case :
Case :
Case :

5 Discussion and interpretation of the results

It is interesting to compare our results with the classical estimate which is obtained by assuming a lower-bound on the diagonal entries as well as on the non-zero entries of . In our set-up this is obtained by letting and . In order to have an idea on the quality of the two estimates, we plot the ratio of the spectral gaps,

for in Fig. 3. As can be seen, the new estimates are consistently tighter than the classical ones; in the best case, viz. for , the ratio of spectral gaps approaches . So, the quality of the estimates actually improves with respect to the classical bound as the horizon increases.

Figure 3: Ratios between spectral gaps

When additional information is available, for instance when the coefficients as given in (2) are known, the contraction rate estimates become much tighter than their classical counterparts, which are not able to discriminate between self-loops of the root node and self-loops of the other agents, nor to account for the strength of inter-agent communication links. In order to carry out a comparison, notice that under the assumption of a prescribed tree matrix bounding from below , we may assume for the classical estimate the following value of which indeed is always smaller than . Hence, the corresponding spectral gaps satisfy:

so that we may compare the classical estimate with the new one by considering the following ratios:

We plotted the function at the right-hand side of the previous inequality in a scale as a function of and . In general the ratio depends critically on the tree depth , hence we only plot it for relatively small tree depths. In particular the results shown in Fig. 4 were obtained.

Figure 4: Ratios of spectral gap: (a) , (b) , (c) . The vertical axis is graduated in a scale.

Notice that the relative quality of the estimates again increases with , and already for a significant portion of parameters space lies in the area in which estimates differ by a factor. The dependence of upon and is shown in Fig. 5 for . This also clearly shows the different monotonicity properties highlighted in the previous Section.

Figure 5: The function for (from bottom to top).

6 Conclusion

We provide a novel and tight estimate of the contraction rate of infinite products of stochastic matrices, under the assumption of prescribed lower bounds on the influence between different sets of agents, which arise naturally by following the spread of information along the interaction graph. This improves previously known bounds and, when additional information is assumed, exploits the additional structure to tighten the previously available estimates by several orders of magnitude. The other crucial factor in determining the overall convergence rate is the time needed for the information to propagate from some root node (which may or may not play the role of a leader) to the other nodes. The bound can be computed as the Perron-Frobenius eigenvalue (the spectral radius) of a positive -dimensional matrix, whose entries depend in a relatively simple way on the parameters characterizing the assumed lower bounds. Some monotonicity and asymptotic properties of the bound are also proved.

Appendix A Proofs

A.1 Proof of Theorem 2

For each given solution of (1) we define two vectors of size as follows:

(8)

in such a way that

Notice that defined in Theorem 2 equals , and that, for any , .

Our first aim is to compute an upper-bound of on the basis of . Let us consider the following estimates. First, for the unique index ,

(9)

using the fact that and . Also, for (that is ):

(10)

where the last inequality follows considering that and , , by Assumption 1.

We now proceed to compute suitable estimates for the vector defined in (8). In particular, for any , we may derive, exploiting (9) and (10):

One has . Hence, the previous inequality implies:

and thus

that is:

(11)

with the nomenclature adopted in (5).

A symmetric argument can be carried out for the minima defined in (8). In analogy to the formulas (9), (10) obtained in the previous paragraphs, we get:

and, for ,

Similarly to (11), we get

Putting now together (11) and (A.1) leads to:

(12)

On the other hand, one also has

Gathering these inequalities yields (3) and proves Theorem 2.

Remark 2.

Notice that alternatively to (11), the following estimate is also valid:

This yields the result of Theorem 2 with , instead of (5), but the sequel demonstrates that the corresponding estimates are less precise (see Remark 3 below).

A.2 Proof of Theorem 3

All the factors in (3) being nonnegative, the order relation is compatible with multiplication. One then obtains, for all ,