Network Inference from Consensus Dynamics
We consider the problem of identifying the topology of a weighted, undirected network from observing snapshots of multiple independent consensus dynamics. Specifically, we observe the opinion profiles of a group of agents for a set of independent topics and our goal is to recover the precise relationships between the agents, as specified by the unknown network . In order to overcome the under-determinacy of the problem at hand, we leverage concepts from spectral graph theory and convex optimization to unveil the underlying network structure. More precisely, we formulate the network inference problem as a convex optimization that seeks to endow the network with certain desired properties – such as sparsity – while being consistent with the spectral information extracted from the observed opinions. This is complemented with theoretical results proving consistency as the number of topics grows large. We further illustrate our method by numerical experiments, which showcase the effectiveness of the technique in recovering synthetic and real-world networks.
The study of networks and multi-agent systems has attracted enormous interest over the last years. Network-based problem formulations abound in diverse application domains, ranging from socio-economical to biological settings, and from physical to engineering systems [1, 2, 3]. Often, these systems display a rich dynamical behavior that emerges from an interplay of the non-trivial connection topology of the network. In this context, consensus has been one of the most popular and well-studied dynamics on networks [4, 5, 6]. This is due to both its analytic tractability as well as its simplicity in approximating several fundamental behaviors. For instance, in socio-economic domains consensus provides a model for opinion formation in a society of individuals. For engineering systems, it has been considered as a basic building block for an efficient distributed computation of global functions in networks of sensors, robots, or other agents.
However, especially in the biological and social domains, the true couplings between the agents are usually unknown, and have to be inferred from data [7, 8, 9]. Network inference, though often not discussed explicitly, is thus a fundamental constituent of network analysis. Different notions of network inference have been considered in the literature, with examples ranging from the estimation of ‘functional’ couplings based on pairwise statistical association measures (correlation, mutual information), to causal inference . In this paper, we are interested in what is often called structural or topological inference [8, 9]: given a system of dynamical units, we want to infer their direct ‘physical’ interactions, e.g., for a distributed system on a network, we want to infer the exact weighted adjacency relationships between the nodes.
Optimization-based strategies for such inference tasks have been proposed in the literature[11, 12]. In relation to the inference of networks from consensus-like dynamics,  and  consider inferring the network based on observing the (cross-)power spectral density of a consensus process driven by noise, and a node knockout strategy. Related approaches have also been pursued in [15, 16]. Further methods combining notions from spectral identification with optimization techniques are considered in [17, 18, 19]. Moreover,  proposes an interesting approach for the identification of nonlinear systems based on the observation of a few nodes only, however the recovery is limited to spectral properties of the underlying network structure, rather than the full topology. In our previous work [21, 22], we studied how independent steady-state observations of the same linear network process can be used to extract information about the eigenvectors of the unknown underlying network.
Departing from the existing literature, in this paper we make very few assumptions on the unknown structure and the samples obtained from observing the consensus process. In particular, we neither impose a specific sampling scheme in which samples are drawn at known time instances, nor do we assume that we have access to a complete time-series of observations. Instead, we are given a statistical characterization of the initial inputs, and observe a set of independent snapshots of the responses to such inputs after some unknown time intervals. Surprisingly, even under these mild assumptions it is possible to (approximately) infer the network structure, as we demonstrate in the sequel.
After a brief review of preliminary concepts in Section II, the recovery problem is described rigorously in Section III. Section IV discusses how several spectral properties of the unknown network can be inferred from the observation of a set of output signals (Sections IV-A and IV-B) and then proposes a convex optimization formulation for the recovery problem that takes into account this spectral information (Section IV-C). We illustrate the effectiveness of this approach through a series of numerical experiments on synthetic as well as real-world networks in Section V. A brief outlook in Section VI identifying extensions and future lines of work wraps-up the paper.
Notation: The entries of matrix and (column) vector are denoted by and , respectively; to avoid confusion, the alternative notation and will be used occasionally, when dealing with indexed families of matrices (vectors). The notation , , and stands for transpose, expected value, and probability, respectively; , , and refer to the all-zero vector, the all-one vector, and the identity matrix, where the sizes are clear from context. For a vector , is a diagonal matrix whose th diagonal entry is .
Networks. A weighted and undirected network consists of a node set of known cardinality , an edge set of unordered pairs of elements in , and non-negative edge weights such that for all . The neighborhood of is defined as the set of nodes connected to . The edge weights can be conveniently collected as entries of the symmetric adjacency matrix . Defining the diagonal degree matrix , the (combinatorial) Laplacian matrix associated with network is given by and can be shown to be positive semi-definite . Since is a symmetric and real matrix, it is diagonalized by a unitary matrix , i.e. , where the diagonal matrix contains the eigenvalues of . Throughout the paper, we are going to assume that all eigenvalues of , , are unique (non-degenerate). This assumption is not fundamental from a technical viewpoint, but simplifies the presentation of our results. In particular, this implies that is connected. The Laplacian matrix can be used to model local dynamics on the associated network , as discussed next.
Discrete-time consensus dynamics. Define the state vector where the value corresponds to the opinion of agent in the network. We consider a discrete-time linear consensus dynamics [6, 5, 4] in which evolves locally, i.e., the value of at a given instant depends exclusively on the previous values of at node and its neighborhood :
where indicates discrete time instants. According to (1), agent updates its state as a linear combination of its own state in the previous time step and the weighted discrepancy with its neighbors in the previous time step. In this context, indicates the relative weight that node gives to this discrepancy in the respective update (the ‘learning rate’ of the nodes at time ). From the definition of it readily follows that (1) can be expressed in vector form as , or more compactly:
We refer to the above dynamics as discrete-time consensus because, under mild conditions on , the state of all agents coincides asymptotically, i.e. for , where is a constant.
Throughout the paper, we will use to refer to the initial signal, i.e. , and we denote by the observation of the dynamics at a specific time of interest, i.e., . It follows then that is related to via
Moreover, given two sub-exponential random variables and with corresponding parameters and , the sum is also sub-exponential with parameters .
Iii Problem Formulation
To motivate our problem setup, let us consider the context of social networks. Assume that we observe, at a specific instant in time, the opinion profile of all agents in a network regarding independent topics, each of which evolved according to a consensus dynamics like the one in (3). The discussion about each of the topics, which we index via , may have started at a different point in time – corresponding to unknown durations for each topic. Furthermore, the interactions between the agents may have been heterogeneous across time and topics – associated with unknown learning rates . Our goal is now to recover the underlying social network from the observation of opinion profiles at a given time instant.
Formally, consider different consensus dynamics evolving on a single network encoded by the Laplacian . Each dynamics corresponds to a distinct input and diffusion rates as in (3). Our goal is to recover from a single snapshot of the state vectors of the consensus dynamics. More precisely, we have access to one observation for each dynamics, where is given by
Notice that, while is the unknown of interest, we do not assume that we know , i.e. how long each consensus dynamics has been running, nor do we assume the knowledge of the diffusion rates . We assume merely that we have a statistical characterization of the unknown inputs , which we model here as independent realizations of a zero-mean multivariate normal random variable , of unknown power .
The above problem is markedly under-determined, thus requiring us to consider a novel recovery scheme for that combines spectral information with a regularized convex optimization approach (see Section IV). To ensure that there is some information about contained in the snapshots , we assume that the two following mild technical conditions hold. First, for all , meaning that the dynamics has not reached consensus yet, in which case all information about the network structure would be completely lost. Second, we assume that the deterministic rates are small enough to ensure asymptotic convergence of (6). Specifically, for all .
Iv Topology Inference From Consensus
The goal is to design a recovery scheme that is able to find , despite the scarce information about at our disposal. Our discussion is divided into three parts. In Sections IV-A and IV-B, we highlight how information about the eigenvectors and eigenvalues of can be inferred from the observation of , and prove some asymptotic consistency results for the limit of large sample size . The guiding idea is the following: since the inputs in (6) are realizations of white Gaussian noise, the color of the outputs must contain information about the unknown .
In Section IV-C, we use these insights to design an inference method for , based on a regularized convex optimization problem. More precisely, we propose to recover the unknown by solving an optimization problem that searches for a valid Laplacian consistent with the spectral information extracted from the observations , while promoting a desirable sparse structure.
Iv-a Inferring the eigenvectors
Define the linear operators so that for all . Leveraging the eigendecomposition we can write
where we have implicitly defined the diagonal matrices for all . Notice that the diagonal entries of are bounded between . Consider the sample covariance matrix
and notice that for the vectors and are independent, but not identically distributed. More precisely,
where we used that is symmetric and deterministic, and that is white. Note that, in general, will not converge to the covariance of any specific for increasing . Hence the procedure based on spectral templates developed in [21, 22] is not applicable here. However, as we demonstrate in the sequel, can still be used to recover the eigenbasis of the unknown Laplacian .
Let and be independent zero-mean scalar Gaussian random variables. Then, the random variable is sub-exponential with parameters .
Proof (sketch): A direct computation of the moment-generating function of yields
It is then not hard to verify that
for all such that , and by combining (10) and (11) the statement of the lemma follows. In order to show inequality (11), one can begin by showing that is valid for , then apply the square root to the inequality and finally substitute . \QED
We will now show that in (8) and are simultaneously diagonalizable, provided that is sufficiently large. For notational purposes, we define the matrix .
For , the eigenbasis diagonalizes , i.e., for all it holds that
where we have defined . Since is a multivariate Gaussian random variable, it follows that . For each entry in (13) we thus have
where and are independent. Lemma 1 now implies that is a sub-exponential random variable with parameters . Consequently, is equal to the average of independent sub-exponential random variables, each of them having zero mean. Define then the coefficients
The sub-exponential tail bound in (5) yields
for . Recall that every is upper bounded by , hence . Consequently, for small enough :
which proves the proposition. \QED
Proposition 1 guarantees that the eigenbasis can be recovered by performing an eigendecomposition of for large enough . While in most practical instances of network inference will not be unbounded, Proposition 1 nevertheless can be used as the basis for an inference algorithm even if only a finite number of observations are available; see Section IV-C.
We remark that the validity of (12) does not imply that exists. Indeed, our weak assumptions on and – which translate into mild conditions on – could result in an that does not converge for increasing . However, even if does not converge to a specific matrix, we may interpret (12) at stating that converges to the set of matrices diagonalized by .
Iv-B Inferring the eigenvalues
As will be shown in Proposition 2, also provides insights about the eigenvalues of the unknown Laplacian . In proving this, we make use of the following lemma, whose proof is omitted for being analogous to that of Lemma 1.
Let , then the random variable is sub-exponential with parameters and .
For every there exists a large enough number of observations such that, for all ,
with probability at least for every .
Proof : Beginning from (13), it follows that
where . Invoking Lemma 2, we have that is a sub-exponential random variable with parameters . Thus, is equal to the average of independent sub-exponential random variables, each of which has mean (Lemma 2). Define the global parameters and for each as
and the expected value
Then, we can again leverage the sub-exponential tail bounds in (5) to obtain
for . The last inequality follows from the fact that is upper bounded by for all , which results in the bound . A direct application of the union bound on (22) yields
for , from which it immediately follows that
where we fixed a desired probability level at . Our goal now is to choose small enough to ensure that (18) is satisfied and then solve for the corresponding number of observations in (24) using such an . To do this, first recall that the eigenvalues of the Laplacian in satisfy the ordering . Hence, we know that the diagonal entries of will be inversely sorted [cf. (7)], i.e., . We further assume that when for some , where does not depend on . It then follows from (21) that for . Consider a deviation from the mean where is large enough to ensure that . By specializing (24) to and solving for as a function of , we have that for all such that
every random variable is at most at a distance from its mean with probability at least . Since by definition for , this means that the variables are sorted in the same order as their means with high probability, i.e., for with probability at least , as we wanted to show. \QED
In Section IV-A we discussed that need not converge for large . Nevertheless, (18) is stating that, even in the diverging case, the diagonal elements of follow a specific order with high probability. This observation, in combination with Proposition 1, is leveraged in Section IV-C to develop a network topology inference algorithm for finite .
Iv-C Recovering the optimal Laplacian matrix
As discussed earlier, selecting a Laplacian that is consistent with the observations is in general an under-determined problem. Even when fixing the eigenbasis and the ordering of the eigenvalues, there is freedom in choosing the exact eigenvalues as long as the order is preserved. Consequently, we seek to recover an optimal among all those consistent with the observed data. Our notion of optimality is based on sparsity, but other features might be selected as well.
Denoting by the eigendecomposition of where the eigenvalues are sorted in increasing order, our inferred Laplacian can be found as the solution of the following convex optimization problem.
Since the elementwise norm is simply a convex relaxation of the (pseudo-)norm, the objective (26a) promotes sparsity in , i.e., the optimal choice for . Alternatively, the norm can be replaced by its iterative reweighted counterpart , which has shown to perform better in practice. The constraints in (26b) force the output to be a valid Laplacian, namely, must have non-positive off-diagonal elements and each row must sum up to zero. Notice that these two requirements enforce diagonal dominance of which, in turn, ensures positive semi-definiteness. The constraints in (26c) impose that must be close to being diagonalized by . It was shown in Section IV-A that coincides with for arbitrarily large number of observations . However, for all practical implementations, is finite and thus, we do not require to be diagonalized by directly. Rather, we require our output to be close (as measured by the Frobenius norm) to another matrix (the optimal ) that is diagonalized by . Note that in practise the matrix variable does not need to be constructed, but can be directedly substituted into the norm. We remark further that, we could replace the Froebenius norm here by the maximum, thereby reducing the problem to a linear program. Lastly, (26d) incorporates the fact that the eigenvalues of and the true Laplacian are inversely ordered by forcing this inverse order to . Here is a positive integer that we can choose. Notice that when we impose a strict order on the eigenvalues whereas for we impose a laxer order for the cases in which is not large enough. Finally, the constant can be chosen freely, since it will only vary the scale of the recovered Laplacian . Notice that this scale ambiguity is unsurmountable given that in (6) a common factor across all (unknown) can be absorbed into the unknown .
In a nutshell, given a series of observations following model (6), we propose to recover (a scaled version of) by first constructing as in (8) to extract its eigenbasis , and then solving problem (26). In the next section we assess the practical performance of this approach.
V Numerical Experiments
We present two test cases, where our goal is to recover synthetic and real-world networks, respectively.
Erdős-Rényi networks. To validate our method, we test its performance when the eigenbasis is perfectly known. Intuitively, this situation will arise when an infinite number of observations is available (cf. Proposition 1). Clearly, a high recovery rate under this setting is a necessary condition for acceptable recovery in the finite regime. Hence, we consider Erdős-Rényi (ER) random graphs  of varying sizes and edge probabilities . For each combination of and we generate networks, compute their associated Laplacian , and then try to recover it by solving (26) for , , and . Recall that the choice of only affects the scale of the recovered Laplacian and, thus, is inconsequential to the recovery performance. For every network generated, we consider the recovery to be successful if the error is less than . Fig. 0(a) portrays the recovery rates (averaged across the realizations) as a function of and .
We first observe that the recovery rates are high. The overall recovery rate is and, if we increase the threshold for success recovery to , this rate becomes . Secondly, we note that as increases, recovery becomes almost certain. The reason for this is that, after assuming perfect knowledge of , having two sparse Laplacians that share identical sets of eigenvectors becomes less probable for larger . Finally, we see a decay in performance for increasing , which can be attributed to the fact that we are imposing sparsity on the recovered Laplacian even for relatively large values of .
Real-world social networks. Consider four social networks defined on a common set of nodes, which represent students from the University of Ljubljana111Access to the data and additional details are available at http://vladowiki.fmf.uni-lj.si/doku.php?id=pajek:data:pajek:students. Edges in each of the networks represent different types of interactions among the students, and were constructed by asking each student to select a group of preferred college mates for different situations, e.g., to discuss a personal issue or to invite to a birthday party. The considered networks are unweighted and symmetric, and the edge between nodes and exists if either student picked in the questionnaire or vice versa.
For each of the four networks, our goal is to recover the true Laplacian from the observation of synthetic consensus dynamics by solving (26), where we vary from to ; see Fig. 0(b). We consider the same metric for recovery error as in the previous experiment, averaged over realizations of a synthetic dynamics. This synthetic dynamics was generated by drawing the input from a standard multivariate Gaussian distribution, selecting uniformly at random in and each rate uniformly at random in . In solving (26) we set equal to the smallest possible value (found via iterative search) that guarantees feasibility of (26) and , although recovery was robust to the specific value chosen for .
As displayed in Fig. 0(b), we observe a monotonous decrease of the recovery error with increasing for all networks. This is not surprising since we know that for larger , the sample eigenbasis becomes closer to the real eigenbasis and thereby facilitates recovery. In Fig. 0(c) we show specific instances of corresponding to Network 1 for different number of observations . To facilitate interpretation, the matrices shown in the figure are split along the diagonal: the upper triangular portion of each matrix (excluding the diagonal) corresponds to the absolute values of the entries of the recovered , whereas the lower triangular portion and the diagonal correspond to the difference with the true Laplacian, i.e. entries of . As expected, the discrepancy with the real Laplacian (lower triangle) decreases with increasing . Moreover, note that we here consider a weighted network recovery. If we are only interested in recovering the support of the graph, then we can do better even with only samples by post-processing our results. More precisely, if we only keep the (true number of edges in ) strongest links in , of them coincide with the edges present in . This overlap increases to and for and , respectively. Notice that even for this last case the error in Fig. 0(b) is not , due to small differences in the actual weights of the recovered edges.
We have proposed a novel technique for the identification of a network based on observing snapshots of a number of independent consensus processes of unknown duration. To achieve this, we formulated a convex optimization problem that outputs a sparse, valid Laplacian which is provably consistent with the spectral information obtained from the consensus observations.
Our results pave the way for several interesting avenues of future work including: (i) investigation of the trade-off between specific network topologies and the required sample size to achieve a desired level of recovery performance; (ii) consideration of a richer class of dynamical models, including non-deterministic processes such as switched systems; and (iii) extensions of the proposed algorithm that incorporate generative random network models as priors in the network inference problem.
-  S. H. Strogatz, “Exploring complex networks,” Nature, vol. 410, no. 6825, pp. 268–276, Mar. 2001.
-  S. Boccaletti, V. Latora, Y. Moreno, M. Chavez, and D.-U. Hwang, “Complex networks: Structure and dynamics,” Physics Reports, vol. 424, no. 4-5, pp. 175–308, 2006.
-  M. E. J. Newman, Networks: An Introduction. Oxford University Press, USA, Mar. 2010.
-  A. Jadbabaie, J. Lin, and A. S. Morse, “Coordination of groups of mobile autonomous agents using nearest neighbor rules,” IEEE Transactions on Automatic Control, vol. 48, no. 6, pp. 988–1001, 2003.
-  R. Olfati-Saber, J. Fax, and R. Murray, “Consensus and Cooperation in Networked Multi-Agent Systems,” Proceedings of the IEEE, vol. 95, no. 1, pp. 215–233, Jan. 2007.
-  W. Ren and R. W. Beard, “Consensus seeking in multiagent systems under dynamically changing interaction topologies,” IEEE Transactions on Automatic Control, vol. 50, no. 5, pp. 655–661, 2005.
-  P. D’Haeseleer, S. Liang, and R. Somogyi, “Genetic network inference: From co-expression clustering to reverse engineering,” Bioinformatics, vol. 16, no. 8, pp. 707–726, 2000.
-  M. Timme and J. Casadiego, “Revealing networks from dynamics: An introduction,” Journal of Physics A: Mathematical and Theoretical, vol. 47, no. 34, p. 343001, 2014.
-  I. Brugere, B. Gallagher, and T. Y. Berger-Wolf, “Network structure inference, a survey: Motivations, methods, and applications,” arXiv preprint arXiv:1610.00782, 2016.
-  P. Spirtes and K. Zhang, “Causal discovery and inference: Concepts and recent methodological advances,” in Applied informatics, vol. 3, no. 1. Springer Berlin Heidelberg, 2016, p. 3.
-  A. Julius, M. Zavlanos, S. Boyd, and G. J. Pappas, “Genetic network identification using convex programming,” IET Systems Biology, vol. 3, no. 3, pp. 155–166, 2009.
-  E. J. Candes, M. B. Wakin, and S. P. Boyd, “Enhancing sparsity by reweighted minimization,” Journal of Fourier analysis and applications, vol. 14, no. 5-6, pp. 877–905, 2008.
-  S. Shahrampour and V. M. Preciado, “Reconstruction of directed networks from consensus dynamics,” in American Control Conference (ACC), 2013. IEEE, 2013, pp. 1685–1690.
-  ——, “Topology identification of directed dynamical networks via power spectral analysis,” IEEE Transactions on Automatic Control, vol. 60, no. 8, pp. 2260–2265, 2015.
-  D. Materassi and M. V. Salapaka, “On the problem of reconstructing an unknown topology via locality properties of the Wiener filter,” IEEE Transactions on Automatic Control, vol. 57, no. 7, pp. 1765–1777, 2012.
-  M. Nabi-Abdolyousefi and M. Mesbahi, “Sieve method for consensus-type network tomography,” IET Control Theory & Applications, vol. 6, no. 12, pp. 1926–1932, 2012.
-  D. Hayden, Y. Yuan, and J. Goncalves, “Network identifiability from intrinsic noise,” IEEE Transactions on Automatic Control, vol. PP, no. 99, pp. 1–1, 2016.
-  D. Hayden, Y. H. Chang, J. Goncalves, and C. J. Tomlin, “Sparse network identifiability via compressed sensing,” Automatica, vol. 68, pp. 9 – 17, 2016.
-  Y. Yuan, G.-B. Stan, S. Warnick, and J. Goncalves, “Robust dynamical network structure reconstruction,” Automatica, vol. 47, no. 6, pp. 1230–1235, 2011.
-  A. Mauroy and J. Hendrickx, “Spectral identification of networks using sparse measurements,” SIAM Journal on Applied Dynamical Systems, vol. 16, no. 1, pp. 479–513, 2017.
-  S. Segarra, A. G. Marques, G. Mateos, and A. Ribeiro, “Network topology inference from spectral templates,” IEEE Transactions on Signal and Information Processing over Networks, vol. 3, no. 3, pp. 467–483, Sept 2017.
-  ——, “Network topology identification from spectral templates,” in 2016 IEEE Statistical Signal Processing Workshop (SSP), 2016, pp. 1–5.
-  R. Merris, “Laplacian matrices of graphs: A survey,” Linear Algebra and its Applications, vol. 197, pp. 143–176, 1994.
-  P. Rigollet, “High dimensional statistics,” Lecture Notes, Cambridge, MA, USA: MIT OpenCourseWare, 2015.
-  S. Boucheron, G. Lugosi, and P. Massart, Concentration Inequalities: A Nonasymptotic Theory of Independence. Oxford university press, 2013.
-  B. Bollobás, Random Graphs. Cambridge University Press, 2001.