Consensus ranking for multi-objective interventions in multiplex networks
High-centrality nodes have disproportionate influence on the behavior of a network; therefore controlling such nodes can efficiently steer the system to a desired state. Existing multiplex centrality measures typically rank nodes assuming the layers are qualitatively similar. Many real systems, however, are composed of networks that are heterogeneous in nature; for example, social networks may have both agonistic and affiliative layers. Here, we use rank aggregation methods to identify intervention targets in multiplex networks when the structure, the dynamics, and our intervention goals are qualitatively different for each layer. Our approach is to rank the nodes separately in each layer considering their different function and desired outcome, and then to use Borda count or Kemeny aggregation to identify a consensus ranking: top nodes in the consensus ranking are expected to effectively balance the competing goals simultaneously among all layers. To demonstrate the effectiveness of consensus ranking, we apply our method to a degree-based node removal procedure in which we aim to destroy the largest component in some layers, while maintaining large-scale connectivity in others. For any multi-objective intervention, optimal targets only exist in the Pareto sense; we therefore use a weighted generalization of consensus ranking to investigate the trade-off between the competing objectives. We use a collection of model and real networks to systematically investigate how this trade-off is affected by multiplex network structure. We use the copula representation of the multiplex centrality distributions to generate model multiplex networks with given rank correlations. This allows us to separately manipulate the marginal centrality distribution of each layer and the interdependence between the layers, and to independently investigate the role of the two using both analytical and numerical methods.
In complex networks, a small subset of nodes often has disproportionate influence on the behavior of the system Freeman (1978); Page et al. (1999); Albert et al. (2000); Newman (2010), and controlling such nodes allows us to steer the network to desired states Motter and Lai (2002); Liu et al. (2011); Fiedler et al. (2013). For example, vaccinating a small fraction of carefully selected nodes can suppress large-scale disease outbreaks Pastor-Satorras and Vespignani (2002), or in a social network information is often disseminated by a small set of influencers Kitsak et al. (2010). Network centrality measures – such as degree, PageRank, or betweenness centrality – rank nodes based on their importance with respect to some process of interest. Therefore, high-centrality nodes provide effective targets to influence the behavior of a system: a node with high eigenvector centrality is an ideal target for vaccination Masuda (2009), or a node in a social network with high closeness centrality is an effective influencer Freeman (1978). Nodes, however, typically participate in multiple networks simultaneously forming a multilayer or multiplex network Kivelä et al. (2014); Boccaletti et al. (2014); Bianconi (2018). For example, a person typically participates in a number of friendship and professional social networks, and these layers are often qualitatively different in nature, including affiliative and competitive interactions Harary et al. (1953); Wasserman and Faust (1994). Therefore, it is desirable to identify targets in a multiplex network that maximize our influence in certain layers, yet minimize any unwanted impact on others. More generally, we consider the scenario where the structure, the dynamics, and our intervention goals are qualitatively different for each layer of a multiplex network, and we aim to find nodes that effectively balance the trade-offs between the multiple objectives.
Recently, a number of centrality measures were introduced to rank nodes in multilayer or multiplex networks Halu et al. (2013); Estrada and Gómez-Gardenes (2014); Solé-Ribalta et al. (2016); Rahmede et al. (2017). These centralities typically assume qualitatively similar layers with similar dynamics; and therefore they may provide useful intervention targets if the structure, the dynamics, and our intervention goals are all similar for each layer. For example, the family of “versatility centralities” extend common single-layer centralities to multilayer networks; as such, PageRank versatility is useful to identify influential scientists in a multiplex co-authorship network; or airports with high betweenness versatility are prone to congestion in the multiplex air traffic network De Domenico et al. (2015). Such multiplex centralities, however, do not address the more complex, and therefore less studied, scenario where the dynamics on the layers or our objectives are qualitatively different. For example, social and political multiplex networks are often comprised of cooperative (friendship, collaboration, alliance) and competitive (hostility, fighting, distrust) layers Harary et al. (1953); Wasserman and Faust (1994); Maoz (2010); possible interventions might aim to reduce hostility while maintaining cooperation. In fact, this is a practical issue that caretakers of captive primate groups face: sometimes unusually aggressive animals are removed with the goal of reducing overall levels of conflict, and it is desirable that this removal does not negatively affect the cohesion of the group Judge et al. (1994); McCowan et al. (2018); Pósfai and D’Souza (2018). For another example, consider the recent work modeling brain activity as a two-layer multiplex network with synchronization and transport dynamics. A possible strategy to affect the behavior of the system is to target nodes that are influential in both layers Nicosia et al. (2017). 
Or another recent work studied a two-layer SIS model where one layer represents the spread of a disease and the other the spread of awareness of that disease Granell et al. (2013). Although both layers are characterized by SIS dynamics, a possible intervention would have different goals for the two layers: blocking the propagation of the disease and promoting awareness; and the best spreaders and the best blockers are known to have different properties Radicchi and Castellano (2017). In these example multiplex systems, the function of each layer and/or our corresponding intervention goals are qualitatively different and ranking nodes in such networks has so far remained an uninvestigated problem.
In this paper, we explore the use of rank aggregation methods to identify target nodes for multi-objective interventions in multiplex networks. Instead of identifying influential nodes based on a single integrated multiplex centrality measure, we rank the nodes in each layer separately based on a centrality that is relevant to the function of the particular layer and our objectives. We then use rank aggregation methods to find a consensus ranking; high-ranking nodes are expected to balance all the objectives simultaneously. An expanding number of rank aggregation methods exist, and although these were originally studied by economists in the context of social choice theory Arrow (2012), many also found applications in other disciplines, from computer science to biology Dwork et al. (2001); Lin (2010). In this paper, we focus on two widely-used methods, Borda count and Kemeny aggregation Kemeny and Snell (1962); de Borda (1781). In Sec. II, we introduce consensus ranking in multiplex networks, and we demonstrate several of its properties using model and real networks. As an example application, in Sec. III, we study degree-based node removal of multiplex networks, such that we aim to destroy the largest connected component of some layers, while maintaining connectivity in the rest. In case of multiple objectives, optimality only exists in the Pareto-sense; therefore we rely on a generalized version of the Borda count algorithm that allows us to assign varying weight to the different objectives. Using the weighted Borda count, we systematically study the trade-off between the competing goals, using analytically solvable model networks and a set of real networks.
II Consensus ranking
A multiplex network consists of $M$ layers, where each layer is a network $G_\alpha$ ($\alpha = 1, 2, \ldots, M$) with the same set of $N$ nodes. These layers represent distinct types of interactions or relationships between the nodes, and there is no restriction on the structure of the individual layers, e.g., some layers might be directed, while others undirected, or they can be weighted or unweighted. To quantify the importance of a node $i$ in an individual layer $\alpha$, we calculate the node’s centrality $c_i^\alpha$ as if layer $\alpha$ were in isolation. Nodes with high centrality are important for the functioning of that given layer, and targeting such nodes is an effective strategy to influence or monitor the behavior of that particular layer in isolation. The type of centrality measure that works best depends on the function of the layer; for example, betweenness centrality is used for layers representing a transportation network, and eigenvector centrality is useful for layers governed by diffusion-like dynamics. However, nodes in a multiplex network do not perform a single task; they simultaneously participate in all layers, prompting the question: How do we identify nodes that are important in all or most layers? This is particularly challenging if layers perform qualitatively different functions or our criteria for ranking nodes are different for each layer. To overcome these difficulties, instead of directly combining the centralities of each layer, we first determine the node ranks in each layer and then we use rank aggregation methods to identify a consensus ranking. An expanding number of rank aggregation methods exist Arrow (2012); Dwork et al. (2001); Lin (2010); here we focus on two widely used and intuitive methods: the Borda count (BC) and the Kemeny aggregation (KA) de Borda (1781); Kemeny and Snell (1962).
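We define the rank of node $i$ in layer $\alpha$ as $r_i^\alpha = N - n_i^\alpha$, where
$$n_i^\alpha = \sum_{j \neq i} \Theta\left(c_i^\alpha - c_j^\alpha\right) \qquad (1)$$
is the number of nodes that have smaller centrality than node $i$ (here $c_i^\alpha$ is the centrality of node $i$ in layer $\alpha$, and $\Theta(x) = 1$ if $x > 0$ and $0$ otherwise). We denote the list of node ranks in layer $\alpha$ as $\mathbf{r}^\alpha$. If there are no ties in the rankings (i.e., if $c_i^\alpha \neq c_j^\alpha$ for all $i \neq j$), $\mathbf{r}^\alpha$ provides an ordinal ranking, the top rank being $r_i^\alpha = 1$. If ties exist, the ranks of tied nodes are set to be equal and assigned the worst value, e.g., if nodes $i$ and $j$ have equal centralities and are top ranked, we assign $r_i^\alpha = r_j^\alpha = 2$ to both nodes.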
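The rank definition above, including its treatment of ties, can be illustrated with a minimal sketch. The function name `layer_ranks` and the list-based input are assumptions made for this example; only the rule "rank = number of nodes minus the number of nodes with strictly smaller centrality" is taken from the text.

```python
from bisect import bisect_left


def layer_ranks(centrality):
    """Return the rank of every node in one layer.

    rank_i = N - (number of nodes with strictly smaller centrality),
    so tied nodes share the worst rank of their tie group.
    `layer_ranks` is a hypothetical helper name, not from the paper.
    """
    n = len(centrality)
    ordered = sorted(centrality)
    # bisect_left counts the entries strictly smaller than c
    return [n - bisect_left(ordered, c) for c in centrality]


# Two nodes tied at the top both receive rank 2, as described in the text.
print(layer_ranks([5, 5, 3, 1]))
```

Sorting once and counting with a binary search keeps the cost at the order of N log N per layer, instead of comparing every pair of nodes.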
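The BC algorithm is perhaps the most straightforward rank aggregation method de Borda (1781); it works by assigning a score $b_i$ to each node $i$ equal to the average number of nodes that $i$ outranks in the layers of the multiplex:
$$b_i = \frac{1}{M} \sum_{\alpha=1}^{M} n_i^\alpha, \qquad (2)$$
where $n_i^\alpha$, the number of nodes with smaller centrality than node $i$ in layer $\alpha$, is defined in Eq. (1). The BC ranking, $\mathbf{r}^{\mathrm{BC}}$, is simply a ranking of the nodes based on the scores $b_i$. The algorithm requires sorting the nodes in each of the $M$ layers; therefore its computational complexity is $O(MN \log N)$.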
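A minimal sketch of the Borda count as described above: each node is scored by the average number of nodes it outranks across the layers, and nodes are then sorted by score. The function names and the convention of breaking score ties by node index are assumptions of this example, not part of the paper.

```python
from bisect import bisect_left


def outranked_counts(centrality):
    """Number of nodes with strictly smaller centrality, for every node."""
    ordered = sorted(centrality)
    return [bisect_left(ordered, c) for c in centrality]


def borda_scores(layers):
    """Average number of outranked nodes over all layers.

    `layers` is a list of per-layer centrality lists that share node order.
    """
    counts = [outranked_counts(layer) for layer in layers]
    n_layers = len(layers)
    n_nodes = len(layers[0])
    return [sum(c[i] for c in counts) / n_layers for i in range(n_nodes)]


def borda_ranking(layers):
    """Node indices ordered from best to worst consensus rank.

    Score ties are broken by node index (an arbitrary choice for this sketch).
    """
    scores = borda_scores(layers)
    return sorted(range(len(scores)), key=lambda i: (-scores[i], i))


layers = [[3, 2, 1, 0], [1, 3, 2, 0]]
print(borda_scores(layers))
print(borda_ranking(layers))
```

Because only the counts from each layer enter the score, the layers may use different centrality measures without any rescaling.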
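The KA algorithm belongs to a larger class of rank aggregation methods that aim to identify a consensus ranking $\mathbf{r}^{*}$ such that $\mathbf{r}^{*}$ is at the minimum average distance from the rankings $\mathbf{r}^\alpha$ Irurozki (2014); Ceberio et al. (2015). Formally, it is defined as
$$\mathbf{r}^{*} = \arg\min_{\mathbf{r}} \frac{1}{M} \sum_{\alpha=1}^{M} d\left(\mathbf{r}, \mathbf{r}^\alpha\right), \qquad (3)$$
where $d(\mathbf{r}, \mathbf{s})$ is some measure of distance between two rankings $\mathbf{r}$ and $\mathbf{s}$, such as Hamming distance, Cayley distance, etc. The KA algorithm uses the Kemeny distance Kemeny and Snell (1962), which naturally takes into account ties:
$$d_K(\mathbf{r}, \mathbf{s}) = \frac{1}{2N(N-1)} \sum_{i \neq j} \left| a_{ij} - b_{ij} \right|,$$
where $a_{ij} = 1$ if $r_i < r_j$, $a_{ij} = -1$ if $r_i > r_j$, and $a_{ij} = 0$ if $r_i = r_j$; and $b_{ij}$ is defined similarly for $\mathbf{s}$. Kemeny distance is normalized such that if $\mathbf{r}$ and $\mathbf{s}$ are the same, $d_K(\mathbf{r}, \mathbf{s}) = 0$; and if $\mathbf{r}$ and $\mathbf{s}$ rank nodes in opposite order, $d_K(\mathbf{r}, \mathbf{s}) = 1$. If there are no ties in the rankings, the Kemeny distance is equivalent to Kendall’s $\tau$ distance. (Note that if ties are present, Kemeny distance is not a distance in the mathematical sense, as it does not satisfy the triangle inequality.)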
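The normalization properties stated above (distance 0 for identical rankings, 1 for reversed rankings) can be checked with a short sketch of the standard Kemeny-Snell pairwise-comparison distance. The exact form of the paper's displayed equation is not available here, so the formula in the code is an assumption chosen to match those stated properties; `pair_sign` and `kemeny_distance` are hypothetical names.

```python
def pair_sign(x, y):
    """+1 if rank x is better (smaller) than rank y, -1 if worse, 0 if tied."""
    return (x < y) - (x > y)


def kemeny_distance(r, s):
    """Normalized Kemeny distance between two rank lists of equal length.

    Sums |a_ij - b_ij| over all ordered node pairs and divides by the
    value reached when s is the exact reverse of a tie-free r.
    """
    n = len(r)
    total = 0
    for i in range(n):
        for j in range(n):
            if i != j:
                total += abs(pair_sign(r[i], r[j]) - pair_sign(s[i], s[j]))
    return total / (2 * n * (n - 1))


print(kemeny_distance([1, 2, 3], [1, 2, 3]))  # identical rankings
print(kemeny_distance([1, 2, 3], [3, 2, 1]))  # opposite rankings
print(kemeny_distance([2, 2, 3], [1, 2, 3]))  # one tie in the first ranking
```

A tied pair contributes half as much as a fully reversed pair, which is how this distance accounts for ties.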
Unfortunately, identifying the Kemeny consensus is an NP-complete problem for Bartholdi et al. (1989); therefore approximate methods have to be used. Over the years, a high number of such algorithms were proposed: a recent survey compared 104 algorithms and combinations of algorithms Ali and Meilă (2012). In fact, the previously introduced BC algorithm can be considered as a simple heuristic approximation of the KA problem. Here, we implemented a local search algorithm, which was identified as providing optimal trade-off between accuracy and run-time Ali and Meilă (2012). The algorithm starts from and outputs a new consensus ranking by finding a local minimum of Eq. (3) using a restricted set of transformations. The computational complexity of our implementation is ; we provide the details of the algorithm in Appendix A.
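As an illustration of the local-search idea, the sketch below starts from a given ordering (for instance the Borda ordering) and accepts adjacent swaps as long as they lower the average Kemeny distance to the layer rankings. This is not the implementation of Appendix A, which is not reproduced here: restricting the moves to adjacent swaps, working with strict orderings only, and recomputing the cost from scratch after each swap are all simplifying assumptions of this example.

```python
def pair_sign(x, y):
    return (x < y) - (x > y)


def kemeny_distance(r, s):
    n = len(r)
    total = sum(
        abs(pair_sign(r[i], r[j]) - pair_sign(s[i], s[j]))
        for i in range(n) for j in range(n) if i != j
    )
    return total / (2 * n * (n - 1))


def mean_distance(ranks, layer_ranks):
    """Average Kemeny distance of one ranking from all layer rankings."""
    return sum(kemeny_distance(ranks, s) for s in layer_ranks) / len(layer_ranks)


def order_to_ranks(order):
    """Convert an ordering (best node first) into a rank list indexed by node."""
    ranks = [0] * len(order)
    for position, node in enumerate(order):
        ranks[node] = position + 1
    return ranks


def local_search(start_order, layer_ranks):
    """Greedy adjacent-swap descent; stops at a local minimum of the cost."""
    order = list(start_order)
    best = mean_distance(order_to_ranks(order), layer_ranks)
    improved = True
    while improved:
        improved = False
        for p in range(len(order) - 1):
            order[p], order[p + 1] = order[p + 1], order[p]
            cost = mean_distance(order_to_ranks(order), layer_ranks)
            if cost < best - 1e-12:
                best, improved = cost, True
            else:
                # undo a swap that does not strictly lower the cost
                order[p], order[p + 1] = order[p + 1], order[p]
    return order, best


layer_ranks = [[1, 2, 3], [1, 2, 3], [2, 1, 3]]
print(local_search([2, 1, 0], layer_ranks))
```

Each accepted swap strictly lowers the cost, so the loop terminates; a practical implementation would update the cost incrementally rather than recomputing all pairwise comparisons.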
II.1 Model networks
We now compare the two algorithms by applying them to a multiplex network model with tunable pairwise Kemeny distance between its layers. To generate these multiplex networks, we first independently create $M$ layers, and we determine the rank of each node in each layer with respect to some centrality. We then generate a node label sequence to ensure that the Kemeny distance between each pair of layers is $d_0$. Here, we use the Erdős-Rényi model (ER) or the scale-free static model (SF) to generate each layer, and we use degree centrality to rank the nodes. We describe the procedure in detail in Appendix B.
By choosing the pairwise distance parameter $d_0$ of the model network, we set the strength of consensus: if $d_0 = 0.5$, the layers are independent and there is no intrinsic consensus; if $d_0 = 0$, the ranking in each layer is the same and consensus is perfect. We compare the two algorithms by calculating the average distance of the layers from consensus:
$$\bar{d}_X = \frac{1}{M} \sum_{\alpha=1}^{M} d_K\left(\mathbf{r}^{X}, \mathbf{r}^\alpha\right),$$
where $\mathbf{r}^{X}$ is the consensus ranking found by the $X = \mathrm{BC}$ or $X = \mathrm{KA}$ algorithm, $\mathbf{r}^\alpha$ is the ranking in layer $\alpha$, and $d_K$ is the Kemeny distance. Note that $\bar{d}$ corresponds to the cost function of the KA algorithm provided by Eq. (3), and we chose $\bar{d}$ for comparison because BC is sometimes considered as an approximation of KA (Ali and Meilă, 2012). The results, however, have to be interpreted with care: finding a ranking that corresponds to lower $\bar{d}$ doesn’t necessarily mean that it is better, because ultimately BC and KA define the consensus in different ways. Just as no single best centrality measure exists to rank the nodes, the preferred rank aggregation algorithm also depends on our particular purposes.
Figure 1 shows the average distance from consensus, $\bar{d}$, as a function of the pairwise distance between layers, $d_0$. Since the KA algorithm works by improving the solution of the BC algorithm, we always find that $\bar{d}_{\mathrm{KA}} \leq \bar{d}_{\mathrm{BC}}$ (the equality holds for $d_0 = 0$, where both algorithms find the same consensus, corresponding to the global minimum of $\bar{d}$). However, this improvement is marginal in terms of $\bar{d}$. If strong consensus exists ($d_0 \approx 0$), both algorithms closely approximate this consensus. Even if the layers are independent ($d_0 = 0.5$), we are able to find a consensus with $\bar{d} < 0.5$, especially for multiplexes with only a few layers. In the latter case, the consensus does not capture an intrinsic property of the multiplex network; rather, it identifies nodes that by chance have high rank in many layers and therefore potentially provide low-cost targets for influencing multiple layers simultaneously.
Note that Fig. 1 shows results for ER layers; we found that using SF layers produces almost indistinguishable results. In fact, the performance of the algorithms only depends on the rankings $\mathbf{r}^\alpha$, and not on the details of each layer $G_\alpha$. The only property of $\mathbf{r}^\alpha$ that is specific to ER or SF layers is the number of ties in the ranking, and our results indicate that the performance of the BC and KA algorithms in terms of the average distance from consensus is insensitive to this.
II.2 Real networks
The BC and KA algorithms allow us to identify and analyze consensus ranking in real systems represented by multiplex networks. Here, we investigate three examples:
Airline network: a multiplex network with 5 layers representing the United States air traffic network in 2013. The nodes are airports and links indicate direct flights between them, and the layers correspond to the 5 largest US carriers in terms of number of unique flights. The network is constructed based on the publicly available database of the USA Department of Transportation (https://www.transtats.bts.gov/Fields.asp?Table_ID=259); all domestic flights labeled as “scheduled passenger service” are included.
Primate social network: a multiplex network with 4 layers containing interaction data from one week of observations of a captive rhesus macaque troop. The layers represent different interactions between animal pairs, including conflict, signaling of subordination, grooming, and huddling McCowan et al. (2011); Beisner et al. (2015).
Human social network: a multiplex network with 5 layers, where nodes are members of the Department of Computer Science at the Aarhus University and the links indicate various social relationships: Facebook friendship, spending leisure time together, working together, co-authorship, and regularly having lunch together Magnani et al. (2013).
Table 1 summarizes some basic properties of these networks. The airline network has a clear heavy-tailed degree distribution with hubs that have significantly more connections than average nodes. It is not possible to be as definite about the social networks due to their small size; there are, however, nodes that are connected to a significant fraction of the network. For example, in the fifth layer of the human social network the largest hub is connected to almost half of the other nodes. The primate social network is composed of competitive interactions (conflict and subordination) and cooperative interactions (grooming and huddling); the existence of antagonistic and affiliative layers is a general property of social and political networks Harary et al. (1953); Wasserman and Faust (1994); Maoz (2010). For the primate network, we find that the adversarial layers are more heterogeneous than the affiliative layers. Interestingly, a similar pattern was observed for human social networks obtained from an online game that allowed competition and alliances between players Szell et al. (2010).
When ranking nodes in a network, we choose a node centrality depending on the function of the system and our goals. Here, for illustration purposes, we calculate the node rankings for each network based on degree centrality. Figure 2(a-c) shows the pairwise Kemeny distances between layers for each network. Typically, we find distances below 0.5, indicating positively correlated rankings; the correlation is the strongest in the airline network, while rankings in the social networks are less aligned with each other. We observe an interesting pattern in the primate social network: the distance is low between affiliative layers (grooming and huddling), and also low between the interactions involving social hierarchy (conflict and signaling); but the rankings corresponding to competitive layers are independent of the rankings of affiliative layers.
We also identify the consensus ranking for each network using the BC and the KA algorithms. Figure 2d shows the average distance of the layers from consensus for the original networks and their randomized counterparts, where we shuffled the node labels in each layer, eliminating any correlation between the rankings. For all multiplexes, we find stronger consensus in the real instances than in their randomized versions: the largest difference was observed in the airline network, and the smallest in the primate network. As expected, KA always finds a consensus with a lower average distance than BC; the difference, however, is marginal with the exception of the airline network.
Overall, we found for both model and real multiplexes that the BC and KA consensus rankings are similar in terms of the average distance from consensus, while the BC algorithm is faster and extremely simple to implement. Both algorithms effectively identify the consensus if the layers are strongly correlated, and even if the layers of a multiplex are independent, we can identify a consensus ranking that provides low-cost targets for simultaneous intervention on multiple layers. So far, we only compared the consensus rankings in terms of their average distance from the layers; in the next section we demonstrate how consensus ranking can identify targets for a simple degree-based node removal model with multiple objectives.
III Multi-objective degree-targeted attack
In this section, we explore an example of using consensus ranking to identify effective targets for multi-objective interventions in multiplex networks. A classic result of network science is that complex networks with heavy-tailed degree distributions are robust against random node removal, while targeted removal of high-degree hubs rapidly breaks down their large-scale connectivity Albert et al. (2000). Complex networks, however, rarely exist in isolation; nodes typically participate in multiple networks simultaneously. Therefore, removing a node from a multiplex network to reduce connectivity in one layer might also remove connections from other layers that we would otherwise like to preserve. More specifically, here, we consider the problem of removing nodes from a multiplex network with $M$ layers such that for a set of layers $\mathcal{D}$ we aim to reduce the size of the largest component, while we want to keep the remaining layers, $\mathcal{K}$, intact. We rank the nodes in each layer based on degree centrality, but to reflect our different objectives, we reverse the rankings for the layers that we keep intact:
$$r_i^{\alpha} = N - \sum_{j \neq i} \Theta\left(k_i^\alpha - k_j^\alpha\right) \quad \text{for } \alpha \in \mathcal{D},$$
$$r_i^{\alpha} = N - \sum_{j \neq i} \Theta\left(k_j^\alpha - k_i^\alpha\right) \quad \text{for } \alpha \in \mathcal{K},$$
where the first and second expressions are the ranks of node $i$ in layers that we want to destroy and keep, respectively; $k_i^\alpha$ is the degree of node $i$ in layer $\alpha$; and $\Theta(x) = 1$ if $x > 0$ and $0$ otherwise. Given a multiplex network, we calculate these ranks for each layer, and we use the BC and KA algorithms to identify consensus rankings. We then iteratively remove the nodes based on this consensus ranking, starting with the highest ranked nodes.
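The removal procedure described above can be sketched end to end on a toy multiplex: degrees are ranked normally in layers to destroy, reversed in layers to keep, aggregated with the Borda count, and the resulting order is used to remove nodes while tracking the largest component of each layer. All function names, the edge-list representation, and the tie-breaking by node index are assumptions made for this illustration.

```python
from bisect import bisect_left
from collections import deque


def outranked_counts(values):
    ordered = sorted(values)
    return [bisect_left(ordered, v) for v in values]


def degrees(n, edges):
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


def largest_component(n, edges, removed):
    """Size of the largest connected component after deleting `removed` nodes."""
    adj = {i: [] for i in range(n) if i not in removed}
    for u, v in edges:
        if u in adj and v in adj:
            adj[u].append(v)
            adj[v].append(u)
    seen, best = set(), 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue, size = deque([start]), 0
        while queue:
            node = queue.popleft()
            size += 1
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        best = max(best, size)
    return best


def consensus_order(n, destroy_layers, keep_layers):
    """Borda order: high degree scores well in destroy layers, low degree in keep layers."""
    counts = [outranked_counts(degrees(n, e)) for e in destroy_layers]
    counts += [outranked_counts([-k for k in degrees(n, e)]) for e in keep_layers]
    score = [sum(c[i] for c in counts) / len(counts) for i in range(n)]
    return sorted(range(n), key=lambda i: (-score[i], i))


star = [(0, 1), (0, 2), (0, 3), (0, 4)]   # layer to destroy
path = [(1, 2), (2, 3), (3, 4)]           # layer to keep
order = consensus_order(5, [star], [path])
removed = {order[0]}
print(order, largest_component(5, star, removed), largest_component(5, path, removed))
```

In this toy case the hub of the star layer is isolated in the path layer, so removing it shatters one layer and leaves the other untouched; real multiplexes rarely offer such a conflict-free target.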
Figure 3 shows the relative size of the largest connected component, $S$, as a function of the fraction of nodes removed, $f$, for model multiplex networks with ER or SF layers (for details about the model networks see Appendix B). For the ER example, we find that both the BC and KA consensus-based removals reduce $S$ faster for the layers we aim to destroy, and slower for the layers we aim to preserve, than random node removal; but BC performs better at reducing $S$, while KA is better at preserving connectivity. Similarly for the SF example, BC destroys the targeted layers faster than KA, but doesn’t keep the rest of the layers intact more than random removal would. In either case, we are not as effective as if the layers were in isolation.
Comparing the BC and KA rankings in Fig. 3, we find that neither method is objectively better than the other; instead they provide a different trade-off between the two objectives: BC performs better at destroying layers, while KA does better at keeping layers intact. In fact, this is a pattern that we widely observed when varying the parameters of the multiplex model networks. Therefore, from here on in this section, we will focus on the BC ranking and we will study the trade-off by introducing a weighted version of BC. Further reasons for focusing on the BC algorithm are that it is much faster than KA and that its simplicity allows us to solve the model networks analytically.
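The original BC algorithm provides one possible trade-off between the competing objectives of destroying certain layers, while keeping others intact. To explore other possible trade-offs, we introduce a weighted Borda count (wBC) algorithm that allows us to assign varying preference to the different objectives. The weighted Borda score of node $i$ is defined as
$$b_i(w) = \frac{1}{M} \sum_{\alpha=1}^{M} w_\alpha \left(N - r_i^\alpha\right),$$
where $r_i^\alpha$ is the rank of node $i$ in layer $\alpha$ (reversed for the layers we keep intact), so that $N - r_i^\alpha$ is the number of nodes that $i$ outranks in layer $\alpha$; $w_\alpha = w$, if we destroy layer $\alpha$ ($\alpha \in \mathcal{D}$); $w_\alpha = 1 - w$, if we keep layer $\alpha$ intact ($\alpha \in \mathcal{K}$); and $0 \leq w \leq 1$. The choice $w = 1/2$ corresponds to the unweighted Borda score (with a multiplier of 1/2); if $w = 1$, we only care about the layers we aim to destroy; and if $w = 0$, we only care about the layers we aim to keep intact. In the following, we first derive an analytical solution for $S_\alpha$, the relative size of the largest component of layer $\alpha$ under wBC-based removal, and then we systematically investigate how multiplex network structure affects the trade-off between the different objectives using model and real networks.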
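A minimal sketch of a weighted Borda score with a single weight `w` for the layers to destroy and `1 - w` for the layers to keep. The exact normalization used in the paper is not recoverable from this text; the form below is an assumption chosen so that `w = 0.5` reproduces the unweighted score up to a factor of 1/2, as the paragraph above states. The function name and the per-layer count lists are hypothetical.

```python
def weighted_borda_scores(destroy_counts, keep_counts, w):
    """Weighted Borda score for every node.

    destroy_counts / keep_counts: lists of per-layer lists, each giving the
    number of nodes a node outranks in that layer (keep layers already use
    the reversed ranking). w in [0, 1] weights destruction against preservation.
    """
    layers = len(destroy_counts) + len(keep_counts)
    n_nodes = len((destroy_counts or keep_counts)[0])
    scores = []
    for i in range(n_nodes):
        destroy_part = sum(c[i] for c in destroy_counts)
        keep_part = sum(c[i] for c in keep_counts)
        scores.append((w * destroy_part + (1 - w) * keep_part) / layers)
    return scores


destroy_counts = [[4, 0, 0, 0, 0]]
keep_counts = [[4, 2, 0, 0, 2]]
for w in (0.0, 0.5, 1.0):
    print(w, weighted_borda_scores(destroy_counts, keep_counts, w))
```

Sweeping `w` from 0 to 1 and re-running the removal for each value is one way to trace out a trade-off curve between the two objectives.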
III.1 Analytical solution
To analytically solve the size of the giant connected component (GCC) for consensus-based removal, we first calculate the size of the GCC for a general degree-based removal strategy (DBS) on a single-layer network, and then we map the consensus-based process to a degree-based one for each layer. By DBS we mean a node removal process where the probability of removing a node only depends on its degree, i.e., the probability of removing node $i$ with degree $k_i$ is $q_{k_i}$. Let $u$ be the probability that a randomly selected link leads to the GCC. Assuming local tree-like structure and uncorrelated networks, we calculate $u$ using the self-consistent equation
$$1 - u = \sum_{k} \frac{k p_k}{\langle k \rangle} (1 - q_k)(1 - u)^{k-1} + \sum_{k} \frac{k p_k}{\langle k \rangle} q_k, \qquad (9)$$
where $p_k$ is the degree distribution and $\langle k \rangle$ is the average degree of the network. The first term on the right-hand side is the probability that following a random link leads to a node that is not removed and not part of the GCC, and the second term is the probability that the node is removed. Once $u$ is determined, we obtain the relative size of the GCC using the equation
$$S = \sum_{k} p_k (1 - q_k)\left[1 - (1 - u)^{k}\right]. \qquad (10)$$
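The self-consistent calculation described above can be sketched numerically with a fixed-point iteration. The equations in the code follow the standard generating-function result for degree-dependent node removal on uncorrelated, locally tree-like networks, which matches the verbal description of the two terms; the function name, the truncation of the Poisson distribution at degree 60, and the iteration settings are assumptions of this example.

```python
import math


def gcc_size(p, q, tol=1e-12, max_iter=100_000):
    """Relative GCC size for degree distribution p[k] and removal probability q[k].

    x is the probability that a random link does NOT lead to the GCC;
    it is iterated to the fixed point of the self-consistent equation.
    """
    mean_k = sum(k * pk for k, pk in enumerate(p))
    x = 0.0
    for _ in range(max_iter):
        new_x = sum(
            k * pk / mean_k * ((1 - q[k]) * x ** (k - 1) + q[k])
            for k, pk in enumerate(p)
            if k > 0
        )
        converged = abs(new_x - x) < tol
        x = new_x
        if converged:
            break
    # a node is in the GCC if it survives and at least one link leads there
    return sum(pk * (1 - q[k]) * (1 - x ** k) for k, pk in enumerate(p))


c = 2.0
poisson = [math.exp(-c) * c ** k / math.factorial(k) for k in range(61)]
print(gcc_size(poisson, [0.0] * 61))    # no removal
print(gcc_size(poisson, [0.75] * 61))   # random removal of 75% of nodes
```

For the intact Poisson network with mean degree 2 the result agrees with the classic solution of S = 1 - exp(-2S), and removing 75% of the nodes at random pushes the network below its percolation threshold.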
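Several node removal processes can be described as a DBS. For example, the choice $q_k = f$ leads to simple random removal of a fraction $f$ of the nodes. Generally, the best DBS to destroy a single-layer network removes nodes starting with the highest degree. Formally we express this as
$$q_k^{\mathrm{high}} = \begin{cases} 1 & \text{if } k > k_f, \\ \left[f - 1 + P(k_f)\right] / p_{k_f} & \text{if } k = k_f, \\ 0 & \text{if } k < k_f, \end{cases} \qquad (11)$$
meaning that we remove all nodes with degree higher than $k_f$ and a certain fraction of nodes with degree $k_f$, where $k_f$ is the smallest degree satisfying $1 - P(k_f) \leq f$ and $P(k) = \sum_{k' \leq k} p_{k'}$ is the CDF of the degree distribution. Also, the most effective DBS to keep a single-layer network intact is to remove nodes with the lowest degree first:
$$q_k^{\mathrm{low}} = \begin{cases} 1 & \text{if } k < k'_f, \\ \left[f - P(k'_f - 1)\right] / p_{k'_f} & \text{if } k = k'_f, \\ 0 & \text{if } k > k'_f, \end{cases} \qquad (12)$$
where $k'_f$ is the smallest degree satisfying $P(k'_f) > f$.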
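The two extreme degree-based strategies above can be generated programmatically from a degree distribution: whole degree classes are removed from the top (or bottom) until the budget `f` is exhausted, and the last class touched is removed only in part. The function name and the list-based representation of the distribution are assumptions made for this sketch.

```python
def removal_probabilities(p, f, highest_first=True):
    """Per-degree removal probability q[k] when a fraction f of nodes is removed.

    p[k] is the fraction of nodes with degree k. Degree classes are consumed
    from the highest degree down (or the lowest degree up) until the removed
    mass equals f; the boundary class is removed partially.
    """
    q = [0.0] * len(p)
    remaining = f
    order = range(len(p) - 1, -1, -1) if highest_first else range(len(p))
    for k in order:
        if remaining <= 0:
            break
        if p[k] == 0:
            continue
        take = min(p[k], remaining)
        q[k] = take / p[k]
        remaining -= take
    return q


p = [0.1, 0.4, 0.3, 0.2]
print(removal_probabilities(p, 0.3, highest_first=True))
print(removal_probabilities(p, 0.3, highest_first=False))
```

The resulting list can be passed directly as the removal probabilities of a degree-based strategy, for example to compare targeted and protective removal on the same degree distribution.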
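To map the consensus-based node removal to a DBS, we first calculate the Borda score of a node with multiplex degree $\mathbf{k} = (k^1, k^2, \ldots, k^M)$ as
$$b(\mathbf{k}) = \frac{N}{M} \left[ w \sum_{\alpha \in \mathcal{D}} P_\alpha\left(k^\alpha - 1\right) + (1 - w) \sum_{\alpha \in \mathcal{K}} \left(1 - P_\alpha\left(k^\alpha\right)\right) \right], \qquad (13)$$
where $P_\alpha(k)$ is the CDF of the degree distribution of layer $\alpha$, $\mathcal{D}$ and $\mathcal{K}$ are the sets of layers we destroy and keep, respectively, and $w \in [0, 1]$ is the weight of the destruction objective. Next, we calculate the Borda score distribution, i.e., the probability that a randomly selected node of the multiplex has Borda score $b$:
$$p(b) = \sum_{\mathbf{k}} P(\mathbf{k})\, \delta_{b,\, b(\mathbf{k})}, \qquad (14)$$
where $P(\mathbf{k})$ is the multiplex degree distribution. When removing fraction $f$ of nodes based on consensus ranking, the probability of removing a node with Borda score $b$ is
$$q_b = \begin{cases} 1 & \text{if } b > b_f, \\ \left[f - 1 + P_B(b_f)\right] / p(b_f) & \text{if } b = b_f, \\ 0 & \text{if } b < b_f, \end{cases} \qquad (15)$$
where $b_f$ is the smallest score satisfying $1 - P_B(b_f) \leq f$ and $P_B(b)$ is the CDF corresponding to $p(b)$ provided by Eq. (14). For each layer $\alpha$, the consensus-based removal can be formulated as an effective DBS with
$$q^{\alpha}_{k^\alpha} = \frac{1}{p^{\alpha}_{k^\alpha}} \sum_{\mathbf{k} \setminus k^\alpha} P(\mathbf{k})\, q_{b(\mathbf{k})}, \qquad (16)$$
where $P(\mathbf{k})$ is the joint multiplex degree distribution, $p^{\alpha}_{k}$ is the degree distribution of layer $\alpha$, and the summations are over the degrees in all layers, except $\alpha$.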
Substituting Eq. (16) into Eqs. (9) and (10) provides $S_\alpha$, the relative size of the GCC of layer $\alpha$ for consensus-based node removal. We numerically evaluate these equations for a class of multiplex model networks where the layers are either ER or SF networks; for SF networks we use the degree distribution provided in Eq. (23). Figure 4 compares the analytical solution of $S_\alpha$ to simulations, showing excellent agreement. We find that there is a non-trivial connection between $S_\alpha$ and the fraction of removed nodes $f$, as the node removal process switches back and forth between destroying and preserving layers. Note that the methods that we used to numerically evaluate the necessary equations become intractable for an increasing number of layers; therefore for the numerical solutions we restrict ourselves to multiplexes with a small number of ER and SF layers.
III.2 Model networks
The weight parameter $w$ of the wBC algorithm allows us to assign different levels of importance to the two competing objectives: larger $w$ favors breaking down layers, while smaller $w$ focuses on keeping layers intact. To systematically investigate this trade-off, we introduce a destruction coefficient and a preservation coefficient (Eq. (17)), which are computed from the following quantities: $S_\alpha$ is the size of the GCC in layer $\alpha$ after removing fraction $f$ of its nodes using the wBC algorithm; $S_\alpha^{\mathrm{high}}$ ($S_\alpha^{\mathrm{low}}$) is the size of the GCC after removing fraction $f$ of its highest (lowest) degree nodes, i.e., it is the best we could do if layer $\alpha$ were in isolation. The two coefficients measure how well the objectives are achieved relative to the case when we remove nodes from each layer independently. The analytical solution of Eq. (17) is obtained by first calculating $S_\alpha$, $S_\alpha^{\mathrm{high}}$, and $S_\alpha^{\mathrm{low}}$ using Eqs. (11), (12), and (16).
Figure 5 shows the trade-off curve between the destruction and preservation coefficients for model multiplex networks with $M$ layers, where we aim to destroy half of the layers, while keeping the other half intact. If there were no trade-off, the curve would be a single point at which both coefficients equal 1; if all layers were identical, the trade-off curve would be the diagonal line.
Figure 5a shows that increasing the number of layers $M$ affects the trade-off in two distinct ways: (i) As $M$ increases, the number of conflicting objectives also increases, and the trade-off becomes more severe, approaching the diagonal line. (ii) For $M > 2$, we aim to destroy multiple layers simultaneously, which cannot be done as efficiently as if they were in isolation; therefore even if we only care about destroying layers, the destruction coefficient remains less than 1. Increasing $M$ has a similar, but weaker effect on the preservation coefficient, since destroying a network is more difficult than keeping it intact. Changing the pairwise Kemeny distance between the layers (Fig. 5b), we find that strong consensus (small pairwise distance) leads to a strong trade-off between the two coefficients, but allows efficient destruction or preservation of layers when only one of the two objectives is weighted. Figure 5c shows the effect of degree heterogeneity: we find that destroying SF layers and preserving ER layers entails the least amount of trade-off, while destroying ER layers and preserving SF layers is the most difficult. Finally, Fig. 5d shows that the trade-off is the most pronounced at the initial stages of node removal (small removed fraction $f$), and the competing objectives are less restrictive in case of large-scale removals (large $f$).
III.3 Real networks
In the previous section, we used model multiplex networks to understand how basic network properties affect the trade-off between destroying and preserving layers. Real networks, however, have more complex structure. To investigate its effect, we use the three example datasets introduced in Sec. II.2: the human social network, the primate social network, and the airline network. We assign half of the layers in each multiplex to be destroyed and the rest to be kept intact, and we calculate the trade-off after removing a fraction $f$ of the nodes (one value of $f$ for the airline network and the human social network, and another for the primate social network). We then compare the trade-off curves to the following randomized null models:
Full randomization (FR): We randomly rewire all links, in effect replacing each layer with an ER network with the same number of nodes and links.
Node label randomization (NLR): We shuffle the node labels in each layer, removing any correlation or consensus between layers, but leaving the structure of each layer unchanged.
Degree preserved randomization (DPR): We rewire links such that the multiplex degree of each node is unchanged. This randomization, therefore, preserves the pairwise Kemeny distances between the layers, and removes all structure within the layers beyond their degree sequence.
Node label and degree preserved randomization (NLR+DPR): Combining NLR and DPR removes correlations both within layers and between layers, and only preserves the degree distributions of the individual layers.
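Of the null models listed above, node label randomization is the simplest to illustrate: each layer receives its own random relabeling of the nodes, which leaves the layer's internal structure untouched while destroying any alignment between layers. The function name, the edge-list representation, and the optional `seed` argument are assumptions made for this sketch; the other randomizations (link rewiring with or without degree preservation) are not shown.

```python
import random


def node_label_randomization(n, layers, seed=None):
    """Apply an independent random node relabeling to every layer.

    `layers` is a list of edge lists over nodes 0..n-1. Each layer keeps its
    structure (and hence its degree sequence), but the identity of the node
    holding each degree is shuffled independently per layer.
    """
    rng = random.Random(seed)
    shuffled = []
    for edges in layers:
        perm = list(range(n))
        rng.shuffle(perm)
        shuffled.append([(perm[u], perm[v]) for u, v in edges])
    return shuffled


layers = [[(0, 1), (0, 2), (0, 3)], [(1, 2), (2, 3)]]
print(node_label_randomization(4, layers, seed=1))
```

Comparing results on the original multiplex with results on many such shuffled copies separates the effect of inter-layer correlations from the effect of the structure within each layer.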
Figure 6 provides the trade-off curves for the three example networks, each network showing a distinct behavior. For the airline network, we aim to destroy two layers and keep the other three intact. In Sec. II.2, we showed that the airline network has heterogeneous degree distributions and strong consensus between its layers. For model networks, we found that strong consensus leads to a significant trade-off between destroying and preserving layers, and indeed, we find that for the airline network the trade-off curve is close to the diagonal. We find that the DPR trade-off curve is almost indistinguishable from the original, while the trade-off for FR, NLR, and NLR+DPR is significantly weaker, meaning that both the inter-layer correlations and the degree distributions are necessary to explain the strong trade-off. We also find that efficient destruction is attainable for the original and NLR curves, but not for FR; this indicates that the simultaneous destruction of the two layers is aided by the presence of hubs and is unaffected by the inter-layer correlations.
In the case of the primate social network, we aim to destroy the two layers that are related to competition (conflict and signaling) and to preserve the affiliative layers (grooming and huddling). In fact, it is common practice for the caretakers of captive primate groups to remove unusually aggressive individuals with the goal of reducing overall levels of conflict; and it is desirable that this removal doesn’t negatively affect the cohesion of the group Judge et al. (1994); McCowan et al. (2018); Pósfai and D’Souza (2018). Figure 6b shows the trade-off curves for the primate network, and we find a very weak trade-off. This is largely explained by the inter-layer correlations, which, in contrast to the airline network, reduce the trade-off. To understand this, recall Fig. 2b, where we showed that the Kemeny distance is small between the conflict-signaling and the grooming-huddling layer pairs, while there is no or even negative correlation between the competitive and affiliative layers. This particular structure allows us to simultaneously disrupt the competitive layers without affecting the affiliative layers. Furthermore, the competitive layers have more heterogeneous degree distributions than the affiliative layers (Table 1), a property also seen in human social networks Szell et al. (2010). Using model networks, we showed that destroying heterogeneous layers and keeping homogeneous layers intact reduces the trade-off (Fig. 5c). Indeed, comparing the FR and NLR+DPR curves shows that the degree distribution of the layers contributes to the weak trade-off, albeit less than the inter-layer correlations.
Finally, for the human social network, our goal is to reduce connectivity in two layers and keep the other three intact. For both the airline network and the primate social network, we found that the inter-layer correlations significantly affect the trade-off; interestingly, in the case of the human social network, inter-layer correlations have little effect. The original trade-off curve is best approximated by the NLR, which removes correlations between layers but preserves all structure within each layer, and all other randomizations show a stronger trade-off than the original. Therefore, we conclude that the internal structure of the layers beyond the degree distribution, such as community structure, is what reduces the trade-off.
In this paper, we explored the use of consensus rankings to identify effective targets for multi-objective interventions in multiplex networks. Our strategy is to calculate a centrality for each layer that is relevant to its specific function and our specific objectives, rank the nodes based on these centralities, and then use rank aggregation methods to identify a consensus ranking. As an example process, we studied the degree-based node removal process, where we aimed to destroy the largest connected component in some layers, while keeping the other layers intact. We demonstrated that removing the nodes in order of consensus ranking effectively balances these competing objectives.
The advantage of our method is that it is agnostic to the specific properties of the layers and our goals, making it a widely applicable tool. However, the price of this flexibility is that methods designed for a specific system likely outperform our general approach – future work should explore this possible trade-off. In our work, we extensively investigated how inter-layer structural correlations affect the trade-off between different objectives. We have not, however, explored the scenario when the dynamics of the layers are also directly coupled. Centrality measures that take such coupling into account were only developed for multilayer networks where all layers follow qualitatively similar dynamics De Domenico et al. (2015); Solé-Ribalta et al. (2016). It would be interesting to extend rank aggregation techniques to directly consider coupled dynamics.
Acknowledgements. We thank Haochen Wu for the airline network dataset. We gratefully acknowledge support from the US Army Research Office MURI Award No. W911NF-13-1-0340, DARPA Award No. W911NF-17-1-0077, and the National Institutes of Health Award No. R24-OD011136.
Appendix A Local search algorithm
Many methods exist to approximate the Kemeny consensus; a recent survey compared the performance of 104 algorithms and combinations of algorithms Ali and Meilă (2012). It found that so-called local search methods provide an optimal trade-off between accuracy and run-time, meaning that algorithms with significantly longer run-time only marginally decreased the cost function provided by Eq. (3). Local search algorithms start from an initial ranking, which can be either random or an approximation of the Kemeny consensus provided by another algorithm. This ranking is then improved by a series of local transformations that decrease the cost function. Here, we implement a version of local search based on the simple insertion sort algorithm.
As the initial ranking, we use the output of the Borda count algorithm. We order the nodes by this initial ranking, so that the first node is the top ranked node, the second node is ranked second, and so on; if there are ties in the initial ranking, we randomly break them. We then construct a new ranking by starting from an empty ranking and iteratively adding nodes. First, we add the first node. We then add the second node such that its new rank is (i) above the first node, (ii) tied with the first node, or (iii) below the first node. The new rank of the second node is chosen to minimize the cost function. We repeat this step until all nodes are assigned a new rank.
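The insertion procedure above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the implementation used for the results: rankings are dictionaries mapping node to rank (rank 1 is the top), the Borda count is taken to be the sum of ranks over layers, and the cost function is assumed to be the summed pairwise disagreement, where a pair counts 1 if two rankings order it oppositely and 1/2 if exactly one of them ties it. The names `pair_cost`, `kemeny_cost`, `borda`, and `local_search` are hypothetical.

```python
import random
from itertools import combinations

def pair_cost(a, b):
    """Disagreement on one node pair; a and b are rank differences (assumed convention)."""
    sa = (a > 0) - (a < 0)
    sb = (b > 0) - (b < 0)
    return abs(sa - sb) / 2  # 1 if opposite order, 1/2 if tied in exactly one ranking

def kemeny_cost(candidate, layers):
    """Summed disagreement between a (possibly partial) candidate and all layer rankings."""
    nodes = list(candidate)
    return sum(pair_cost(candidate[i] - candidate[j], r[i] - r[j])
               for r in layers for i, j in combinations(nodes, 2))

def borda(layers):
    """Borda count, assumed here to be the sum of ranks; equal sums are tied."""
    score = {v: sum(r[v] for r in layers) for v in layers[0]}
    levels = sorted(set(score.values()))
    return {v: levels.index(score[v]) + 1 for v in score}

def to_ranks(buckets):
    """Convert an ordered list of tie groups into a node -> rank dictionary."""
    return {v: k + 1 for k, group in enumerate(buckets) for v in group}

def local_search(layers, seed=0):
    rng = random.Random(seed)
    start = borda(layers)
    # Process nodes in Borda order, breaking ties at random.
    order = sorted(start, key=lambda v: (start[v], rng.random()))
    buckets = []  # buckets[0] holds the top ranked node(s)
    for v in order:
        # A node can open a new rank at any gap or tie with an existing rank.
        options = [("new", g) for g in range(len(buckets) + 1)]
        options += [("tie", b) for b in range(len(buckets))]
        best = None
        for kind, pos in options:
            trial = [list(group) for group in buckets]
            if kind == "new":
                trial.insert(pos, [v])
            else:
                trial[pos].append(v)
            cost = kemeny_cost(to_ranks(trial), layers)
            if best is None or cost < best[0]:
                best = (cost, trial)
        buckets = best[1]
    return to_ranks(buckets)
```

For example, `local_search([{'a': 1, 'b': 2, 'c': 3}] * 3)` returns the shared ranking unchanged. The sketch recomputes the full cost for every trial placement, which is simple but slow; only the pairs involving the newly inserted node actually change.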
Appendix B Centrality correlated model networks
We introduce a method to generate multiplex networks with $N$ nodes, $L$ layers, and a tunable Kemeny distance between each pair of layers. We start by generating $L$ independent layers using any single-layer network model of choice. We then calculate the rank of each node in each layer based on some centrality. Note that we are not restricted to use the same single-layer network model or the same centrality for all layers. In the following, we describe a procedure to re-label the nodes in each layer to specify the inter-layer dependency between the ranks of nodes.
Let $F(c_1, \ldots, c_L)$ be the CDF of the multiplex centrality distribution, i.e., $F(c_1, \ldots, c_L)$ is the probability that a randomly selected node has centrality at most $c_\alpha$ in layer $\alpha$ for every $\alpha = 1, \ldots, L$; and let $F_\alpha(c_\alpha)$ be the CDF of the marginal centrality distribution of layer $\alpha$. Specifying the multiplex centrality distribution would allow us to control the dependency between layers; we, however, cannot arbitrarily choose $F$, since the marginals $F_\alpha$ are determined by the properties of the individual layers. Yet, according to Sklar’s theorem, we can always write the multiplex centrality distribution as
$$F(c_1, \ldots, c_L) = C\big(F_1(c_1), \ldots, F_L(c_L)\big), \tag{18}$$
where $C$ is an $L$-variate copula Sklar (1959). A copula is the CDF of a random vector with uniform margins Joe (1997); Nelsen (2006). The advantage of this representation is that it separates the marginal distributions of each layer, specified by $F_\alpha$, and the interdependency structure between layers, specified by $C$. If the marginals $F_\alpha$ and $F_\beta$ are continuous, the Kemeny distance between layers $\alpha$ and $\beta$ is the same as Kendall’s distance, and it is completely defined by the copula [Eq. (19)].
To re-label the nodes such that the multiplex centrality distribution follows Eq. (18), we draw a random vector from the $L$-variate copula for each node $i$, and from these draws we obtain a vector of ranks for node $i$, one rank per layer, by ranking the drawn values within each layer.
We re-label each node $i$ such that its rank in layer $\alpha$ becomes equal to the corresponding element of its rank vector. For example, if the rank vector of node $i$ is $(1, 3)$, we re-label nodes such that the highest ranked node in layer 1 and the third highest ranked node in layer 2 are labeled $i$.
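Throughout this paper, we use the Gaussian copula
$$C(u_1, \ldots, u_L) = \Phi_R\big(\Phi^{-1}(u_1), \ldots, \Phi^{-1}(u_L)\big),$$
where $\Phi_R$ is the CDF of the $L$-variate standard normal distribution with correlation matrix $R$ and $\Phi^{-1}$ is the inverse CDF of the standard normal distribution. Substituting the Gaussian copula into Eq. (19), we find a relationship between the Kemeny distance of layers $\alpha$ and $\beta$ and $\rho_{\alpha\beta}$, the corresponding element of the correlation matrix $R$ [Eq. (22)].
Therefore, choosing the correlation matrix allows us to set the Kemeny distance between layers.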
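For reference, the standard copula identities for Kendall's rank correlation $\tau$ (see, e.g., Nelsen (2006)) make the link between the correlation matrix and rank agreement explicit. The symbols $\tau_{\alpha\beta}$, $C_{\alpha\beta}$ (the bivariate copula of layers $\alpha$ and $\beta$), and $\rho_{\alpha\beta}$ are assumptions made for this illustration and may differ from the notation of Eqs. (19) and (22). The last line further assumes that the distance is normalized as the fraction of discordant node pairs, which is only one possible convention.

```latex
\begin{align}
\tau_{\alpha\beta} &= 4\int_0^1\!\!\int_0^1 C_{\alpha\beta}(u,v)\,\mathrm{d}C_{\alpha\beta}(u,v) - 1
  && \text{(any copula, continuous marginals)} \\
\tau_{\alpha\beta} &= \frac{2}{\pi}\arcsin\rho_{\alpha\beta}
  && \text{(Gaussian copula)} \\
\frac{1-\tau_{\alpha\beta}}{2} &= \frac{1}{2}-\frac{1}{\pi}\arcsin\rho_{\alpha\beta}
  && \text{(fraction of discordant pairs)}
\end{align}
```

Under these assumptions, $\rho_{\alpha\beta} = 1$ gives no discordant pairs, $\rho_{\alpha\beta} = 0$ gives one half, and $\rho_{\alpha\beta} = -1$ gives fully reversed rankings.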
Finally, note that the above procedure does not take into account ties in the ranks, i.e., we assume that no two nodes have the same rank within a layer. Therefore, if there are any ties in the centralities (e.g., if degree centrality is used), we randomly break them. Furthermore, Eq. (19) assumes that the marginal centrality distributions are continuous. If this is not the case, the Kemeny distance between layers is no longer exactly provided by Eqs. (19) and (22); through simulations, however, we found that it is still well approximated by them.
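The re-labeling procedure of this appendix can be illustrated with a short, standard-library-only Python sketch. It is a sketch under stated assumptions rather than the code used for the paper: it assumes that the largest drawn value in a layer receives rank 1, and the function names (`cholesky`, `gaussian_copula_ranks`, `relabel`, `kendall_tau`) are hypothetical. Correlated normal variates are generated with a hand-written Cholesky factorization and mapped to uniform margins with the standard normal CDF.

```python
import math
import random

def cholesky(R):
    """Lower-triangular factor Lc with Lc * Lc^T = R (R must be positive definite)."""
    n = len(R)
    Lc = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(Lc[i][k] * Lc[j][k] for k in range(j))
            if i == j:
                Lc[i][j] = math.sqrt(R[i][i] - s)
            else:
                Lc[i][j] = (R[i][j] - s) / Lc[j][j]
    return Lc

def gaussian_copula_ranks(n_nodes, R, seed=0):
    """Draw one Gaussian-copula vector per node and convert each layer to ranks 1..N."""
    rng = random.Random(seed)
    Lc, n_layers = cholesky(R), len(R)
    u = []
    for _ in range(n_nodes):
        g = [rng.gauss(0.0, 1.0) for _ in range(n_layers)]
        z = [sum(Lc[a][b] * g[b] for b in range(a + 1)) for a in range(n_layers)]
        # The standard normal CDF maps each coordinate to a uniform margin.
        u.append([0.5 * (1.0 + math.erf(x / math.sqrt(2.0))) for x in z])
    ranks = [[0] * n_layers for _ in range(n_nodes)]
    for a in range(n_layers):
        # Assumption: the largest drawn value gets rank 1.
        order = sorted(range(n_nodes), key=lambda i: -u[i][a])
        for r, i in enumerate(order, start=1):
            ranks[i][a] = r
    return ranks

def relabel(layer_orders, ranks):
    """For each layer, map old node ids to new labels.

    layer_orders[a] lists the old node ids of layer a from top rank to bottom rank.
    """
    return [{layer_orders[a][ranks[i][a] - 1]: i for i in range(len(ranks))}
            for a in range(len(layer_orders))]

def kendall_tau(x, y):
    """Kendall rank correlation for two tie-free sequences."""
    n, s = len(x), 0
    for i in range(n):
        for j in range(i + 1, n):
            s += 1 if (x[i] - x[j]) * (y[i] - y[j]) > 0 else -1
    return 2.0 * s / (n * (n - 1))
```

As a usage example, calling `gaussian_copula_ranks(500, [[1.0, 0.8], [0.8, 1.0]])` and passing the two rank columns to `kendall_tau` should give a value close to $(2/\pi)\arcsin 0.8 \approx 0.59$, up to sampling fluctuations.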
B.1 Static scale-free model
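To generate the scale-free layers, we use the static model Goh et al. (2001). Starting from $N$ unconnected nodes, we assign a weight $w_i = i^{-\mu}$ to each node $i = 1, \ldots, N$, where $0 \le \mu < 1$. We then randomly select two nodes $i$ and $j$ with probability proportional to $w_i$ and $w_j$, respectively, and if there is no link between nodes $i$ and $j$, we connect them. We repeat this step until $M$ links are added. The resulting network has average degree $\langle k \rangle = 2M/N$ and its degree distribution can be written as a sum of Poisson distributions
$$p_k = \frac{1}{N}\sum_{i=1}^{N} \frac{\lambda_i^{k}}{k!}\, e^{-\lambda_i},$$
where $\lambda_i$ is the expected degree of node $i$. For large $N$, the degree distribution is approximated as
$$p_k \approx \frac{\left[\langle k \rangle (1-\mu)\right]^{1/\mu}}{\mu}\, \frac{\Gamma\big(k - 1/\mu,\ \langle k \rangle (1-\mu)\big)}{\Gamma(k+1)},$$
where $\Gamma(x)$ is the gamma function and $\Gamma(x, y)$ is the upper incomplete gamma function. We refer to this network as scale-free, because the tail of the distribution decays as a power-law, i.e., $p_k \sim k^{-\gamma}$, where $\gamma = 1 + 1/\mu$.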
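The link-placement step of the static model is simple to sketch in Python. The following is an illustrative sketch, not the authors' code: the function name `static_model` and its parameters are hypothetical, node indices run from 0 so that node 0 carries the largest weight, the weight exponent is passed as `mu`, and draws that pick the same node twice are assumed to be discarded.

```python
import random
from itertools import accumulate

def static_model(n_nodes, n_links, mu, seed=0):
    """Return a set of undirected links (i, j) with i < j.

    Assumes n_links does not exceed n_nodes * (n_nodes - 1) / 2.
    """
    rng = random.Random(seed)
    # Weight of the node with index i (0-based) is (i + 1) ** (-mu).
    cum = list(accumulate((i + 1) ** (-mu) for i in range(n_nodes)))
    nodes = range(n_nodes)
    links = set()
    while len(links) < n_links:
        # Select two endpoints with probability proportional to their weights.
        i, j = rng.choices(nodes, cum_weights=cum, k=2)
        if i != j:
            # Storing sorted pairs in a set makes repeated pairs a no-op.
            links.add((min(i, j), max(i, j)))
    return links

def degrees(n_nodes, links):
    """Degree of every node in the generated layer."""
    deg = [0] * n_nodes
    for i, j in links:
        deg[i] += 1
        deg[j] += 1
    return deg
```

For instance, `static_model(1000, 2000, 0.5)` produces a layer with average degree 4 in which the lowest-index nodes act as hubs. Precomputing cumulative weights keeps each draw at logarithmic cost in the number of nodes.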
- Freeman (1978) Linton C Freeman, “Centrality in social networks conceptual clarification,” Social networks 1, 215–239 (1978).
- Page et al. (1999) Lawrence Page, Sergey Brin, Rajeev Motwani, and Terry Winograd, The PageRank citation ranking: Bringing order to the web., Tech. Rep. (Stanford InfoLab, 1999).
- Albert et al. (2000) Réka Albert, Hawoong Jeong, and Albert-László Barabási, “Error and attack tolerance of complex networks,” Nature 406, 378 (2000).
- Newman (2010) Mark E. J. Newman, Networks: an introduction (Oxford University Press, 2010).
- Motter and Lai (2002) Adilson E Motter and Ying-Cheng Lai, “Cascade-based attacks on complex networks,” Physical Review E 66, 065102 (2002).
- Liu et al. (2011) Yang-Yu Liu, Jean-Jacques Slotine, and Albert-László Barabási, “Controllability of complex networks,” Nature 473, 167 (2011).
- Fiedler et al. (2013) Bernold Fiedler, Atsushi Mochizuki, Gen Kurosawa, and Daisuke Saito, “Dynamics and control at feedback vertex sets. i: Informative and determining nodes in regulatory networks,” Journal of Dynamics and Differential Equations 25, 563–604 (2013).
- Pastor-Satorras and Vespignani (2002) Romualdo Pastor-Satorras and Alessandro Vespignani, “Immunization of complex networks,” Physical Review E 65, 036104 (2002).
- Kitsak et al. (2010) Maksim Kitsak, Lazaros K Gallos, Shlomo Havlin, Fredrik Liljeros, Lev Muchnik, H Eugene Stanley, and Hernán A Makse, “Identification of influential spreaders in complex networks,” Nature Physics 6, 888 (2010).
- Masuda (2009) Naoki Masuda, “Immunization of networks with community structure,” New Journal of Physics 11, 123018 (2009).
- Kivelä et al. (2014) Mikko Kivelä, Alex Arenas, Marc Barthelemy, James P Gleeson, Yamir Moreno, and Mason A Porter, “Multilayer networks,” Journal of Complex Networks 2, 203–271 (2014).
- Boccaletti et al. (2014) Stefano Boccaletti, Ginestra Bianconi, Regino Criado, Charo I Del Genio, Jesús Gómez-Gardenes, Miguel Romance, Irene Sendina-Nadal, Zhen Wang, and Massimiliano Zanin, “The structure and dynamics of multilayer networks,” Physics Reports 544, 1–122 (2014).
- Bianconi (2018) Ginestra Bianconi, Multilayer Networks: Structure and Function (Oxford University Press, 2018).
- Harary et al. (1953) Frank Harary et al., “On the notion of balance of a signed graph.” The Michigan Mathematical Journal 2, 143–146 (1953).
- Wasserman and Faust (1994) Stanley Wasserman and Katherine Faust, Social network analysis: Methods and applications, Vol. 8 (Cambridge university press, 1994).
- Halu et al. (2013) Arda Halu, Raúl J Mondragón, Pietro Panzarasa, and Ginestra Bianconi, “Multiplex pagerank,” PLOS ONE 8, e78293 (2013).
- Estrada and Gómez-Gardenes (2014) Ernesto Estrada and Jesús Gómez-Gardenes, “Communicability reveals a transition to coordinated behavior in multiplex networks,” Physical Review E 89, 042819 (2014).
- Solé-Ribalta et al. (2016) Albert Solé-Ribalta, Manlio De Domenico, Sergio Gómez, and Alex Arenas, “Random walk centrality in interconnected multilayer networks,” Physica D: Nonlinear Phenomena 323, 73–79 (2016).
- Rahmede et al. (2017) Christoph Rahmede, Jacopo Iacovacci, Alex Arenas, and Ginestra Bianconi, “Centralities of nodes and influences of layers in large multiplex networks,” Journal of Complex Networks (2017).
- De Domenico et al. (2015) Manlio De Domenico, Albert Solé-Ribalta, Elisa Omodei, Sergio Gómez, and Alex Arenas, “Ranking in interconnected multilayer networks reveals versatile nodes,” Nature Communications 6, 6868 (2015).
- Maoz (2010) Zeev Maoz, Networks of nations: The evolution, structure, and impact of international networks, 1816–2001, Vol. 32 (Cambridge University Press, 2010).
- Judge et al. (1994) Peter G Judge, Frans BM De Waal, Katherine S Paul, and Thomas P Gordon, “Removal of a trauma-inflicting alpha matriline from a group of rhesus macaques to control severe wounding,” Laboratory Animal Science 44, 344–350 (1994).
- McCowan et al. (2018) Brenda McCowan, Brianne Beisner, and Darcy Hannibal, “Social management of laboratory rhesus macaques housed in large groups using a network approach: A review,” Behavioural Processes 156, 77–82 (2018).
- Pósfai and D’Souza (2018) Márton Pósfai and Raissa M D’Souza, “Talent and experience shape competitive social hierarchies,” Physical Review E 98, 020302(R) (2018).
- Nicosia et al. (2017) Vincenzo Nicosia, Per Sebastian Skardal, Alex Arenas, and Vito Latora, “Collective phenomena emerging from the interactions between dynamical processes in multiplex networks,” Physical Review Letters 118, 138302 (2017).
- Granell et al. (2013) Clara Granell, Sergio Gómez, and Alex Arenas, “Dynamical interplay between awareness and epidemic spreading in multiplex networks,” Physical Review Letters 111, 128701 (2013).
- Radicchi and Castellano (2017) Filippo Radicchi and Claudio Castellano, “Fundamental difference between superblockers and superspreaders in networks,” Physical Review E 95, 012318 (2017).
- Arrow (2012) Kenneth J Arrow, Social choice and individual values, Vol. 12 (Yale University Press, 2012).
- Dwork et al. (2001) Cynthia Dwork, Ravi Kumar, Moni Naor, and Dandapani Sivakumar, “Rank aggregation methods for the web,” in Proceedings of the 10th International Conference on World Wide Web (ACM, 2001) pp. 613–622.
- Lin (2010) Shili Lin, “Rank aggregation methods,” Wiley Interdisciplinary Reviews: Computational Statistics 2, 555–570 (2010).
- Kemeny and Snell (1962) J.G. Kemeny and J.L. Snell, Mathematical Models in the Social Sciences (Blaisdell, New York, 1962).
- de Borda (1781) Jean C de Borda, “Mémoire sur les élections au scrutin,” (1781).
- Irurozki (2014) Ekhine Irurozki, Sampling and learning distance-based probability models for permutation spaces, Ph.D. thesis, University of the Basque Country, Donostia-San Sebastián (2014).
- Ceberio et al. (2015) Josu Ceberio, Ekhine Irurozki, Alexander Mendiburu, and Jose A Lozano, “A review of distances for the mallows and generalized mallows estimation of distribution algorithms,” Computational Optimization and Applications 62, 545–564 (2015).
- (35) Note that if ties are present Kemeny distance is not a distance in the mathematical sense, as it does not satisfy the triangle inequality.
- Bartholdi et al. (1989) John J Bartholdi, Craig A Tovey, and Michael A Trick, “The computational difficulty of manipulating an election,” Social Choice and Welfare 6, 227–241 (1989).
- Ali and Meilă (2012) Alnur Ali and Marina Meilă, “Experiments with kemeny ranking: What works when?” Mathematical Social Sciences 64, 28–40 (2012).
- (38) The airline network represents the five largest US carriers in terms of number of unique flights. It is constructed based on the publicly available database of the USA Department of Transportation (https://www.transtats.bts.gov/Fields.asp?Table_ID=259), all domestic flights labeled as “scheduled passenger service” are included.
- McCowan et al. (2011) Brenda McCowan, Brianne A Beisner, John P Capitanio, Megan E Jackson, Ashley N Cameron, Shannon Seil, Edward R Atwill, and Hsieh Fushing, “Network stability is a balancing act of personality, power, and conflict dynamics in rhesus macaque societies,” PLOS ONE 6, e22350 (2011).
- Beisner et al. (2015) Brianne A Beisner, Jian Jin, Hsieh Fushing, and Brenda Mccowan, “Detection of social group instability among captive rhesus macaques using joint network modeling,” Current Zoology 61, 70–84 (2015).
- Magnani et al. (2013) Matteo Magnani, Barbora Micenkova, and Luca Rossi, “Combinatorial analysis of multiple networks,” arXiv preprint arXiv:1303.4986 (2013).
- Szell et al. (2010) Michael Szell, Renaud Lambiotte, and Stefan Thurner, “Multirelational organization of large-scale social networks in an online world,” Proceedings of the National Academy of Sciences 107, 13636–13641 (2010).
- Sklar (1959) M Sklar, “Fonctions de répartition à n dimensions et leurs marges,” Publ. Inst. Statist. Univ. Paris 8, 229–231 (1959).
- Joe (1997) Harry Joe, Multivariate models and multivariate dependence concepts (Chapman and Hall/CRC, 1997).
- Nelsen (2006) Roger B Nelsen, An introduction to copulas (Springer Science & Business Media, 2006).
- Goh et al. (2001) K-I Goh, Byungnam Kahng, and Doochul Kim, “Universal behavior of load distribution in scale-free networks,” Physical Review Letters 87, 278701 (2001).