Detecting global bridges in networks
Abstract
The identification of nodes occupying important
positions in a network structure is crucial for the understanding
of the associated real-world system. Usually, betweenness
centrality is used to evaluate a node's capacity to connect
different graph regions. However, we argue here that this measure
is not suited to that task, as it gives equal weight to “local” centers
(i.e. nodes of high degree central to a single region) and
to “global” bridges, which connect different communities. This
distinction is important as the roles of such nodes are different
in terms of the local and global organisation of the network
structure. In this paper we propose a decomposition of betweenness
centrality into two terms, one highlighting the local contributions and the other the global ones. We call the latter
bridgeness centrality and show that it is able to specifically identify global bridges. In addition, we
introduce an effective algorithmic implementation of this measure
and demonstrate its capability to identify global bridges
in air transportation and scientific collaboration networks.
Centrality Measures, Betweenness Centrality, Bridgeness Centrality
Pablo Jensen et al.
1 Introduction
Although the history of graphs as scientific objects begins with
Euler’s ? famous walk across the Königsberg bridges,
the notion of ’bridge’ has rarely been tackled by network
theorists.
Let us now discuss the second point, the “theory” needed to measure the
bridging force of different edges or nodes.
In all these contexts, it is the very same question that we wish to ask: do nodes or edges reinforce the density of a cluster of nodes (bounding) or do they connect two separated clusters (bridging)? Formulated in this way, the bridging/bounding question seems easy to answer. After having identified the clusters of a network, one should simply observe whether a node connects nodes of the same cluster (bounding) or of different clusters (bridging). However, the intra-cluster/inter-cluster approach is both too dependent on the method used to detect communities and flawed by its inherent circular logic: it uses clustering to define bridging and bounding ties when it is precisely the balance of bridges and bounds that determines clusters. Note that, far from being a mathematical subtlety, this question is a key problem in social theory. Defining internal (Gemeinschaft) and external (Gesellschaft) relations by presupposing the existence and the composition of social groups is absurd, as groups are themselves defined by social relations.
In this paper, we introduce a measure of bridgeness of nodes that is
independent of the community structure and thus escapes this vicious
circle, contrary to other proposals
?; ?. Moreover, since the computation of
bridgeness is straightforwardly related to that of the usual
betweenness, Brandes’ algorithm ? can be used to
compute it efficiently.
Measuring bridgeness
Identifying important nodes in a network structure is crucial for the understanding of the associated real-world system ?; ?; ?, for a review see ?. The most common measure of the centrality of a node for network connections on a global scale is betweenness centrality (BC), which “measures the extent to which a vertex lies on paths between other vertices” ?; ?. We show in the following that, when trying to identify specifically global bridges, BC has some limitations, as it assigns the same importance to paths between the immediate neighbours of a node as to paths between more distant nodes in the network. In other words, BC is built to capture the overall centrality of a node, and is not specific enough to distinguish between two types of centrality: local (center of a community) and global (bridge between communities). Our measure of bridging is more specific, as it gives a higher score to global bridges.
That BC may attribute a higher score to local centers than to global bridges is easy to see in a simple network (Figure 1). The logic is that a “star” node with degree $k$, i.e. a node with no links between any of its first neighbors (clustering coefficient 0), automatically receives a $BC = k(k-1)/2$ arising from the paths of length 2 that connect the node’s first neighbors through the central node. More generally, if there exist nodes with high degree that are connected only locally (to nodes of the same community), their betweenness may be of the order of that measured for more globally connected nodes. Consistent with this observation, it is well known that for many networks BC is highly correlated with degree ?; ?; ?. A recent scientometrics study tried to use betweenness centrality as “an indicator of the interdisciplinarity of journals” but noted that this idea only worked “in local citation environments and after normalization because otherwise the influence of degree centrality dominated the betweenness centrality measure” ?.
To avoid this problem and specifically identify global centers, we decompose BC into a local and a global term, the latter being called ’bridgeness’ centrality. Since we want to distinguish global bridges from local ones, the simplest approach is to discard, from the summation used to compute BC (Eq. 1), the shortest paths that either start or end at one of the node’s first neighbors. This completely removes the paths that connect two unconnected neighbors of ’star nodes’ (see Figure 1) and greatly diminishes the effect of high degrees, while keeping those paths that connect more distant regions of the network.
More formally, in a graph $G = (V, E)$, where $V$ denotes the set of nodes and $E$ the set of links, the betweenness centrality of a node $v \in V$ can be written as:
(1) $BC(v) = Bri(v) + local(v)$
where
(2) $BC(v) = \sum_{s \neq v \neq t} \frac{\sigma_{st}(v)}{\sigma_{st}}, \quad Bri(v) = \sum_{s \notin N(v) \,\wedge\, t \notin N(v)} \frac{\sigma_{st}(v)}{\sigma_{st}}, \quad local(v) = \sum_{s \in N(v) \,\vee\, t \in N(v)} \frac{\sigma_{st}(v)}{\sigma_{st}}$
Here the summation runs over all pairs of distinct nodes $s$ and $t$ (both different from $v$); $\sigma_{st}$ represents the number of shortest paths between $s$ and $t$, while $\sigma_{st}(v)$ is the number of such shortest paths running through $v$. When BC is decomposed into two parts (right-hand side of Eq. 1), the first term defines the global term, bridgeness centrality, for which we consider only shortest paths between nodes that are not in the neighbourhood of $v$ ($N(v)$), while the second, local term considers the shortest paths starting or ending in the neighbourhood of $v$. This definition also shows that the bridgeness centrality of a node is always smaller than or equal to the corresponding BC value, and that the two differ only by the local contribution of the first neighbours. Fig. 1 illustrates the ability of bridgeness to specifically highlight nodes that connect different regions of a graph. Here the BC (Fig. 1a) and bridgeness centrality values (Fig. 1b) calculated for nodes of the same network demonstrate that bridgeness centrality gives the highest score to the node which is globally central (green), while BC does not distinguish between local and global centers, and actually assigns the highest score to nodes with high degrees (red).
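As a quick sanity check of this decomposition, consider two toy graphs of our own (they are not taken from the figures of the paper). We write $Bri$ for the global (bridgeness) term, $local$ for the local term and $N(\cdot)$ for the set of first neighbours, and we assume that each unordered pair of end nodes is counted once. The first graph is a five-node path $a - b - c - d - e$; the second is a star whose center $h$ has $k$ leaves.

```latex
\begin{align*}
\text{Path, node } c:\quad
  BC(c)    &= \bigl|\{a,b\} \times \{d,e\}\bigr| = 4, \\
  Bri(c)   &= 1 \quad \text{(only the pair } (a,e) \text{ avoids } N(c) = \{b,d\}\text{)}, \\
  local(c) &= BC(c) - Bri(c) = 3. \\[4pt]
\text{Star, center } h:\quad
  BC(h)    &= \binom{k}{2} = \frac{k(k-1)}{2}, \\
  Bri(h)   &= 0 \quad \text{(every other node lies in } N(h)\text{)}, \\
  local(h) &= \frac{k(k-1)}{2}.
\end{align*}
```

In both cases the local term accounts for most or all of BC, which is the effect that the bridgeness term is meant to remove.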
In the following, to further explore the differences between these measures, we define an independent reference measure of bridgeness using a known partitioning of the network. This measure provides an independent ranking of the bridging power of nodes, which we correlate with the corresponding rankings obtained from the BC and bridgeness values. In addition, we demonstrate on three example networks that bridgeness centrality is always more specific than BC in identifying global bridges.
Computing global bridges from a community structure
To identify the global bridges independently from their BC or bridgeness score, we use a simple indicator inspired by the well-known Rao-Stirling index ?; ?; ?; ?, as this indicator is known to quantify the ability of nodes to connect different communities. Moreover, it includes the notion of “distance”, which is important for distinguishing local and global connections. Note, however, that this index requires as input a prior partition of the nodes into distinct communities. Our global indicator $G(i)$ for node $i$ (Eq. 3) is defined as:
(3) $G(i) = \sum_{J \neq I} l_{iJ} \, \delta_{IJ}$
where the sum runs over the communities $J$ (different from the community of node $i$, denoted $I$), $l_{iJ}$ being 1 if there is a link between node $i$ and community $J$ and 0 otherwise. Finally, $\delta_{IJ}$ corresponds to the ’distance’ between communities $I$ and $J$, as measured by the inverse of the number of links between them: the more links connect two communities, the closer they are. Nodes that are only linked to nodes of their own community have $G = 0$, while nodes that connect two (or more) communities have a strictly positive indicator. Nodes that bridge distant communities, for example those that carry the only link between two communities, have high $G$ values.
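The global indicator described above can be sketched in a few lines of Python. This is only an illustration that follows the textual description literally (a 0/1 link indicator, and a distance equal to the inverse of the number of links between two communities); the function name `global_indicator`, the adjacency-dictionary input and the toy graph are our own assumptions, and any normalisation applied in the reported scores is not reproduced here.

```python
from collections import Counter


def global_indicator(adj, community):
    """Sketch of the community-based global indicator G.

    adj:       dict mapping each node to the set of its neighbours (undirected).
    community: dict mapping each node to its community label.
    """
    # Count links between each pair of distinct communities.
    # Every undirected link is visited from both ends, hence counted twice.
    twice_links = Counter()
    for u, nbrs in adj.items():
        for v in nbrs:
            if community[u] != community[v]:
                twice_links[frozenset((community[u], community[v]))] += 1

    g = {}
    for i, nbrs in adj.items():
        own = community[i]
        # Communities J != I that node i touches (link indicator equal to 1).
        reached = {community[v] for v in nbrs} - {own}
        # Distance between I and J = 1 / (number of links between I and J).
        g[i] = sum(2.0 / twice_links[frozenset((own, J))] for J in reached)
    return g


# Toy example (hypothetical): communities A = {1, 2, 3}, B = {4, 5}, C = {6}.
adj = {1: {2, 3}, 2: {1, 3, 4}, 3: {1, 2, 4}, 4: {2, 3, 5}, 5: {4, 6}, 6: {5}}
community = {1: "A", 2: "A", 3: "A", 4: "B", 5: "B", 6: "C"}
scores = global_indicator(adj, community)
```

In this toy graph, node 1 has no external link and scores 0, nodes 2 and 3 share the two links between A and B and score 0.5 each, and node 5 carries the only link between B and C and scores 1.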
As a next step, we use this reference measure (i.e. the global indicator) to rank nodes, and compare this ranking to those obtained from the two candidate measures of bridging (BC and bridgeness) in three large networks.
Synthetic network: unbiased LFR
We start with a synthetic network obtained by a method similar to that of Lancichinetti et al. ?. This method leads to the so-called ’LFR’ networks, which have a clear community structure that makes it easy to identify bridges between communities. We have only modified the algorithm to obtain bridges without the degree bias that arises from the original method. Indeed, LFR first creates unconnected communities and then randomly chooses internal links that are reconnected outside the community. This leads to bridges, i.e. nodes connected to multiple communities, whose degree distribution is biased towards high degrees. In our method, we avoid this bias by first choosing nodes at random, and then one of their internal links, which we reconnect outside the community as in LFR. As reference, we use the global indicator defined above. As explained, this indicator depends on the community structure, which is not too problematic here since, by construction, communities are clearly defined in this synthetic network.
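The node-first rewiring step can be sketched as follows. This is a minimal illustration under our own assumptions, not the generator used for the experiments: the function name `rewire_unbiased` and its parameters are hypothetical, and since the text does not specify how the new outside endpoint is chosen, the sketch draws it uniformly among nodes of other communities that are not yet neighbours.

```python
import random


def rewire_unbiased(adj, community, n_bridges, seed=0):
    """Create inter-community links by picking NODES (not links) uniformly.

    adj:       dict node -> set of neighbours (undirected, initially
               with links inside communities only).
    community: dict node -> community label.
    Returns a rewired copy of adj; the number of links is preserved.
    """
    rng = random.Random(seed)
    new_adj = {u: set(nbrs) for u, nbrs in adj.items()}
    nodes = sorted(new_adj)
    done, attempts = 0, 0
    while done < n_bridges and attempts < 100 * n_bridges:
        attempts += 1
        u = rng.choice(nodes)  # uniform over nodes: no bias towards high degree
        internal = sorted(v for v in new_adj[u] if community[v] == community[u])
        outside = [w for w in nodes
                   if community[w] != community[u] and w not in new_adj[u]]
        if not internal or not outside:
            continue
        v = rng.choice(internal)  # internal link (u, v) to remove
        w = rng.choice(outside)   # assumed choice of the new outside endpoint
        new_adj[u].discard(v)
        new_adj[v].discard(u)
        new_adj[u].add(w)
        new_adj[w].add(u)
        done += 1
    return new_adj
```

Picking a link uniformly would select a node with probability proportional to its degree, which is the source of the bias described above; picking the node first gives every node the same chance of becoming a bridge.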
Fig. 3a shows that bridgeness provides a ranking that is closer to that of the global indicator than BC does. Indeed, we observe that the ratio for bridgeness is higher than for BC. This means that ordering nodes by decreasing bridgeness leads to a better ranking of the ’global’ scores, as measured by G, than the corresponding ordering by decreasing BC values. As shown in the simpler example of a 1000-node network (Fig. 2), BC fails because it ranks too highly some nodes that have no external connection but have a high degree. A detailed analysis of the nodes of a cluster is given in the Supplementary Information.
In addition, we directly measured the average relative contribution of the local term to BC for nodes of the same degree (see Fig. 3b). We observe a negative correlation, which means that the local term dominates for low-degree nodes, while high-degree nodes have higher bridgeness values as they have a higher chance of connecting to different communities.
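The degree-resolved average of the relative local term can be computed as in the sketch below. It assumes that BC and bridgeness have already been computed and are passed as dictionaries; the function name and the decision to skip nodes with zero BC (for which the ratio is undefined) are our own choices.

```python
from collections import defaultdict


def mean_local_fraction_by_degree(bc, bri, degree):
    """Average of local/BC = (BC - bridgeness)/BC over nodes of equal degree.

    bc, bri, degree: dicts mapping each node to its betweenness,
    bridgeness and degree. Nodes with BC = 0 are skipped (assumption).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for v, b in bc.items():
        if b <= 0:
            continue
        sums[degree[v]] += (b - bri[v]) / b
        counts[degree[v]] += 1
    return {k: sums[k] / counts[k] for k in sorted(sums)}


# Hypothetical values, for illustration only.
bc = {"a": 10.0, "b": 20.0, "c": 0.0, "d": 8.0}
bri = {"a": 5.0, "b": 5.0, "c": 0.0, "d": 2.0}
degree = {"a": 2, "b": 2, "c": 1, "d": 3}
profile = mean_local_fraction_by_degree(bc, bri, degree)
```

A decreasing profile corresponds to the negative correlation reported above: the local term dominates at low degree and loses weight at high degree.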
Real network 1: airport network
Demonstrating that bridgeness is adequate for identifying global bridges in real networks is more difficult, because communities are generally not unambiguously defined, and therefore neither are global bridges. It is thus difficult to show conclusively that bridgeness is able to specifically identify these nodes. To address this challenge, our strategy is the following:
(i) We use flight itinerary data providing origin-destination pairs between commercial airports in the world (International Air Transport Association). The network comprises 47,161 transportation connections between 7,733 airports. Each airport is assigned to its country.
(ii) We consider each country to be a distinct ’community’ and compute a global indicator based on this partitioning, as it allows for an objective (and arguably relevant) partition, independent from any community detection methods. Then we show that bridgeness offers a better ranking than BC to identify airports that act as global bridges, i.e. that connect countries internationally.
As an example, in Fig. 4 we show the two largest airports of Argentina, Ezeiza (EZE) and Aeroparque (AEP). Both have a similar degree (54 and 45 respectively), but while the first connects Argentina to the rest of the world, Aeroparque mostly handles domestic flights, thus functioning as a local center. This is confirmed by the respective G values: 2327.2 (EZE) and 9.0 (AEP). However, just as in our simple example in Fig. 1, BC gives the same score to both, while bridgeness clearly distinguishes between the local domestic center and the global international bridge by attributing to the global bridge a score 250 times higher (see Fig. 4). This can partly be explained by the fact that AEP is a ’star’ node (low clustering coefficient: 0.072), connected to 12 very small airports for which it is the only link to the whole network. All the paths starting from those small airports are cancelled in the computation of bridgeness (they belong to the ’local’ term in Eq. 1), while BC counts them on an equal footing with any other path.
More generally, Figure 3 shows that, for the airport network as for the synthetic one, bridgeness again provides a ranking that is closer to that of the global indicator. Indeed, ordering nodes by decreasing bridgeness leads to a ranking that is closer to the one obtained from the global score than ordering them by decreasing BC. In addition, we again found a negative correlation between the average relative local term and node degree (see Fig. 3c), which assigns similar roles to low- and high-degree nodes as in the synthetic network.
Real network 2: scientometric network of ENS Lyon
The second example of a real network is a scientometric graph of a scientific institution ?, the “Ecole normale supérieure de Lyon” (ENS, see Figure 5). This network adds authors to the usual co-citation network, as we want to understand which authors connect different subfields and act as global, interdisciplinary bridges. To identify the different communities, we rely on modularity optimization ?, which leads to a relevant community partition because scientific networks are highly structured by disciplinary boundaries. This is confirmed by the high value of modularity generated by this partition (0.89). In Figure 5, the authors of different communities are shown with different colors, and their size corresponds to their betweenness (left) or bridgeness (right) centrality; the two measures clearly highlight different authors as the main global bridges connecting different subfields. We compute the Stirling indicator (Eq. 3) based on the modularity structure to identify the global bridges. As for the previous networks, Fig. 3 shows that the ranking of nodes by bridgeness is closer than the ranking by BC to the ranking provided by the global measure based on the community partition. On the other hand, the average relative local term as a function of degree (see Fig. 3d) suggests a slightly different picture in this case. Here nodes with large but moderate degrees (smaller than ) have high local terms, suggesting that they act as local centres, while nodes with higher degrees have somewhat smaller local terms, which points to their role as global bridges.
Discussion
In this paper we introduced a measure to identify nodes acting as global bridges in complex network structures. Our proposed methodology is based on the decomposition of BC into a local and a global term, where the local term considers shortest paths that start or end at one of the node’s neighbors, while the global term, which we call bridgeness, is more specific in identifying nodes that are globally central. We have shown, on both synthetic and real networks, that the proposed bridgeness measure improves the capacity to specifically identify global bridges, as it is able to distinguish them from local centers. One crucial advantage of our measure of bridgeness over earlier proposals is that it is independent of the definition of communities.
However, the advantage of using bridgeness depends on the precise topology of the network, and mainly on the degree distribution of bridges as compared to that of all the nodes in the network. When bridges are high-degree nodes, BC and bridgeness give an equally good approximation, since the high-degree bias does not play an important role in this case. Instead, when some bridges have low degrees while some high-degree nodes act as local centers of their own community, bridgeness is more effective at identifying bridges, as BC gives an equally high rank to nodes with high degree even if they are not connected to nodes outside of their community. We demonstrated that bridgeness is systematically more specific in identifying global bridges in all the networks we have studied here. Although the improvement was small on average, typically to , even a small improvement of a widely used measure is in itself an interesting result.
We should also note that, except on simple graphs, comparing these two measures is difficult since there is no clear way to identify, independently, the ’real’ global bridges. We have used community structure when communities seem clear-cut, but then we fall into the circularity problems stressed in the introduction. Using metadata on the nodes (e.g. countries for the airports) may solve this problem but raises others, as metadata do not necessarily correspond to structures obtained from the topology of the network, as shown recently on a variety of networks ?. Another possible extension, which provides a direction for future studies, would be to detect overlapping communities, identify global bridges independently as the nodes involved in multiple communities, and correlate them with the present measure. In any case, identifying global bridges remains a difficult problem, as it is tightly linked to another difficult problem, that of community detection. Decomposing BC into a local and a global term helps to improve the solution, but many questions remain open for further inquiry.
Supplementary Information
S1. Modified Brandes algorithm
Bridgeness algorithm, inspired by Brandes’ “faster algorithm” ?
SP[s,t] ← precomputed matrix/dictionary of all shortest-path distances;
CB[v] ← 0, v ∈ V;
for s ∈ V do
    S ← empty stack;
    P[w] ← empty list, w ∈ V;
    σ[t] ← 0, t ∈ V; σ[s] ← 1;
    d[t] ← -1, t ∈ V; d[s] ← 0;
    Q ← empty queue;
    enqueue s → Q;
    while Q not empty do
        dequeue v ← Q;
        push v → S;
        foreach neighbor w of v do
            // w found for the first time?
            if d[w] < 0 then
                enqueue w → Q;
                d[w] ← d[v] + 1;
            end
            // shortest path to w via v?
            if d[w] = d[v] + 1 then
                σ[w] ← σ[w] + σ[v];
                append v → P[w];
            end
        end
    end
    δ[v] ← 0, v ∈ V;
    // S returns vertices in order of non-increasing distance from s
    while S not empty do
        pop w ← S;
        for v ∈ P[w] do δ[v] ← δ[v] + σ[v]/σ[w] ⋅ (1 + δ[w]);
        // skip w when it is the source s or one of its first neighbours
        if SP[w,s] > 1 then CB[w] ← CB[w] + δ[w];
    end
end
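For readers who want to experiment, the following Python sketch computes BC and bridgeness in a single Brandes-style pass over an unweighted, undirected graph. It is our own illustration, not the Gephi plugin. Two choices are assumptions on our part: each unordered pair of end nodes is counted once (hence the final division by 2), and, unlike the listing above, which filters only on the distance to the source, the sketch keeps a second accumulator `far` that drops the unit term of direct successors, so that pairs ending in the neighbourhood of a node are excluded as well, as the definition of the global term requires. The BFS distance `dist` replaces the precomputed matrix SP.

```python
from collections import deque


def betweenness_and_bridgeness(adj):
    """adj: dict node -> set of neighbours (undirected, unweighted).

    Returns (bc, bri): betweenness and bridgeness of every node,
    counting each unordered pair of end nodes once.
    """
    bc = {v: 0.0 for v in adj}
    bri = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s: distances, path counts, predecessors.
        order = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:  # w found for the first time
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:  # shortest path to w via v
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}  # dependency of s on v, all targets
        far = {v: 0.0 for v in adj}    # same, targets not adjacent to v
        for w in reversed(order):      # non-increasing distance from s
            for v in preds[w]:
                share = sigma[v] / sigma[w]
                delta[v] += share * (1.0 + delta[w])
                far[v] += share * delta[w]  # unit term (target w) dropped
            if w != s:
                bc[w] += delta[w]
            if dist[w] > 1:  # source s is neither w nor a neighbour of w
                bri[w] += far[w]
    # Each unordered pair was visited from both of its end nodes.
    return ({v: x / 2.0 for v, x in bc.items()},
            {v: x / 2.0 for v, x in bri.items()})


# Toy graph (hypothetical): two stars centred on 'a' and 'b', joined via 'g'.
two_stars = {
    "a": {"a1", "a2", "a3", "g"}, "a1": {"a"}, "a2": {"a"}, "a3": {"a"},
    "b": {"b1", "b2", "b3", "g"}, "b1": {"b"}, "b2": {"b"}, "b3": {"b"},
    "g": {"a", "b"},
}
bc, bri = betweenness_and_bridgeness(two_stars)
# bc["a"] = 18, bc["g"] = 16: BC ranks the local center first.
# bri["a"] = 0, bri["g"] = 9: bridgeness ranks the global bridge first.
```

On this toy graph the local centers obtain a higher BC than the node that joins the two stars, while their bridgeness vanishes, which mirrors the behaviour described for Figure 1.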
S2. Case study on a synthetic network community
The specificity of bridgeness and the influence of the degree, which prevents BC from correctly identifying the most important bridges, can be illustrated by examining the scores of nodes in cluster 5 of the synthetic network. This cluster is linked to cluster 13 by 5 connections (through nodes 248, 861, 471, 576 and 758) and to cluster 1 by a single connection (through node 232). BC gives roughly the same score to nodes 232 and 248, while bridgeness attributes a score almost 4 times higher to node 232, correctly pointing out the importance of this single bridge between clusters 5 and 1. This is because BC is confused by the high degree of node 248 (41) as compared to the low degree of node 232 (20). Therefore, by counting all the shortest paths, BC attributes too high a bridging score to node 248. A second problem with BC is that it gives a high score to nodes that are not connected to other communities, merely because they are local centers, i.e. they have a high degree. For example, node 515 obtains a higher BC score than node 758 (Table S1), even though node 515 has no connection to other communities (but degree 49), contrary to node 758 (connected to cluster 13, but degree 23). Bridgeness never ranks local centers higher than global bridges: here, it correctly assigns a score 5 times higher to node 758 than to node 515.
Table S1. Nodes of cluster 5 of the synthetic network, sorted by decreasing betweenness.
Id  Stirling (G)  Modularity class  Betweenness (BC)  Bridgeness  Degree
542  0.0222  5  9173.71  2644.62  44 
422  0.0278  5  7714.27  3855.62  35 
232  0.0950  5  7551.22  5846.86  20 
804  0.0285  5  6995.63  2824.64  34 
248  0.0082  5  6588.65  1624.30  48 
734  0.0907  5  6410.31  4373.72  21 
273  0.0322  5  5698.28  2631.59  30 
75  0.0868  5  5349.47  3558.31  22 
962  0.0399  5  4989.66  2951.45  24 
292  0.0399  5  4377.77  1939.06  24 
481  0.0256  5  4305.68  1796.92  25 
781  0.0475  5  4257.93  2200.21  20 
304  0.0434  5  4221.64  2467.65  22 
625  0.0202  5  3964.21  1314.62  32 
861  0.0108  5  3295.01  714.44  36 
132  0.0200  5  2985.45  1157.49  24 
471  0.0154  5  2865.07  1296.38  25 
79  0.0302  5  2256.02  1004.28  21 
205  0.0208  5  1921.65  788.51  23 
515  0.0000  5  1884.07  86.45  49 
758  0.0166  5  1791.80  435.66  23 
608  0.0200  5  1777.54  522.75  24 

Footnotes
 We refer to the common use of the word ’bridge’, and not to its technical meaning in graph theory as ’an edge whose deletion increases its number of connected components’.
 In this paper, we will focus on defining the bridgeness of nodes, but our definition can straightforwardly be extended to edges, just as the betweenness of edges is derived from that of nodes.
 We have written a plugin for Gephi ? that computes this measure on large graphs. See the Supplementary Information for a pseudo-algorithm for both node and edge bridgeness.