Concurrency and reachability in tree-like temporal networks
Network properties govern the rate and extent of various spreading processes, from simple contagions to complex cascades. Recently, the analysis of spreading processes has been extended from static networks to temporal networks, where nodes and links appear and disappear. We focus on the effects of “accessibility”, whether there is a temporally consistent path from one node to another, and “reachability”, the density of the corresponding “accessibility graph” representation of the temporal network. The level of reachability thus inherently limits the possible extent of any spreading process on the temporal network. We study reachability in terms of the overall levels of temporal concurrency between edges and the structural cohesion of the network agglomerating over all edges. We use simulation results and develop heterogeneous mean field model predictions for random networks to better quantify how the properties of the underlying temporal network regulate reachability.
Social networks are woven together by temporal contacts organized according to various structural details which together form the substrate of infectious dynamics, determining the impacts of spreading diseases and viral information flows. Compared to the extensive literature modeling spreading dynamics on static networks, we lack a thorough understanding of the particular effects of the temporal properties of network contacts. Meanwhile, recent studies have found diverse temporal contact features — such as distributions of inter-contact times, temporal correlation in inter-contact times, and birth and death of nodes and links — may have very different impacts on the dynamics of spreading processes Vazquez et al. (2007); Karsai et al. (2011); Min et al. (2011); Masuda et al. (2013); Hiraoka and Jo (2018); Holme and Liljeros (2014); Delvenne et al. (2015); Colman and Charlton (2016).
Concurrency, broadly defined as ‘relationships that overlap in time’ Moody and Benton (2016), is one of the key elements affecting the extent and speed of disease spreading. Concurrency is a longstanding concept in epidemiology and has been considered in diverse contexts, including for understanding the epidemic potential of HIV/AIDS Morris and Kretzschmar (1995); Watts and May (1992); Gurski and Hoffman (2016); Moody and Benton (2016). Some studies have applied the concept as a property proportional to an average contact rate or an average degree in unit time Watts and May (1992); Gurski and Hoffman (2016) or as a link density of a reachable network converted from a pair-to-pair contact patterns Morris and Kretzschmar (1995). In another recent study, concurrency was considered as the number of links of an individual in unit time within a generative temporal activity model Onaga et al. (2017). Whatever the particular definition of concurrency, the general idea in application is that increased concurrency increases the density of the effective network structure over which an infection is transmitted, resulting in a larger number of alternative paths between nodes, thus increasing the potential for greater spread through the population. As such, the general conclusion that higher concurrency increases the potential for epidemic spread seems to be trivial. However, the detailed mechanism of this increase is important to understand and to quantify to assess the impact of concurrency in a particular temporal network setting.
Motivated by previous work Moody and Benton (2016), we consider the reachability of the temporal contact network over which transmission can occur. Reachability is the density of the accessibility graph that includes an edge from node to node if and only if there is a temporally consistent path originating at that can reach in the underlying temporal contact network. That is, reachability quantifies the maximum possible impact of the infectious spreading by quantifying the fraction of node pairs that can be accessible via temporally consistent paths (see, e.g., Grindrod et al. (2011); Lentz et al. (2013); Holme (2015); Moody and Benton (2016)). Reachability is a useful metric not only because it measures the maximal substrate of infectious spreading, but also because it indicates how much temporal continuity can be ignored when one uses an aggregated static network to analyze infection dynamics Lentz et al. (2013).
To separate out the influences of the temporal and structural details, we focus as in Moody and Benton (2016) on two critical properties of the temporal network: temporal concurrency and structural cohesion. Temporal concurrency is defined here as the fraction of pairs of links that overlap in time. Meanwhile, structural cohesion measures the effective connectedness in the underlying topology of the time-aggregated network, ignoring the temporal details of the individual edges. A good measure of structural cohesion should embody the notion that highly cohesive networks should be difficult to separate (i.e., by node or edge removal) into separate components. As such, we employ the definition of structural cohesion as the average number of node-independent paths between two nodes Moody and White (2003), as applied to the network that includes all edges over the total time period studied. We emphasize that structural cohesion is more than a simple function of edge density; rather, it is influenced by the organizational patterns of the connections. In particular, one can observe different amounts of structural cohesion even while keeping the total link density constant.
By deliberately separating the structural cohesion measurement from that for temporal concurrency, we explore the role of each and the interplay between them in affecting reachability. Pairing numerical calculations with an approximate model we develop here, we examine the roles of these temporal network properties, observing in particular how structural cohesion directly affects the description of the use of detours to find temporally-consistent paths between node pairs. Our approximate model focuses on networks that are tree-like in the sense of having low structural cohesion, in an effort to develop and assess the accuracy of model approximations for the level of reachability in random temporal networks. (We refer the interested reader to Melnik et al. (2011) for further discussion of what it might mean for a network to be “tree-like” in this sense.)
We start with detailed definitions of temporal concurrency and structural cohesion in Sec. II.1, continue to describe the methods for constructing our synthetic and sampled empirical networks in Sec. II.2 and Sec. II.3, respectively, and provide specific quantitative details for numerically computing reachability in Sec. II.4. We then develop our model approximation for reachability in Sec. III. In Sec. IV, we compare numerical measurements and the approximation for reachability on synthetic trees, Erdős-Rényi networks, and configuration model realizations with exponential degree distributions, before continuing on to the empirical examples studied previously in Moody and Benton (2016). We conclude with a discussion in Sec. V about the effect of temporal concurrency on the reachability and limits of the presently-developed approximation, along with possible future directions for improvement.
II.1 Structural cohesion and temporal concurrency
The ease with which disease spreads on a network is typically increased in the presence of multiple diverse alternative paths between nodes. In a temporal network with many links overlapping in time, the concurrency increases the number of such paths that are temporally consistent, possibly accelerating the spread and increasing the total outbreak size even without increasing the number of contacts in the network. To study the role of the structural and temporal connectedness, we separately consider the impacts of the structural cohesion and temporal concurrency, following the approach in Moody and Benton (2016) (summarized above and presented in detail below).
We emphasize that throughout this study we distinctively refer to three related network representations describing the pattern of interaction: (1) the full temporal contact network (Fig. 1(a)), which we assume is undirected; (2) the static aggregated network that includes all links that ever appear (Fig. 1(b)); and (3) the directed accessibility graph (Fig. 1(c)) that describes the presence of temporally-consistent paths between ordered pairs of nodes.
To measure the structural cohesion, $\kappa$, we consider only the static aggregated network representation including all edges that are ever present in the specified temporal network. Within the aggregated network, we seek the number of node-independent paths, $\kappa_{ij}$, available between nodes $i$ and $j$. We employ the shortest path approximation of White and Newman (2001) to numerically calculate $\kappa_{ij}$ and then average over all pairs of nodes:
$$\kappa = \frac{1}{N(N-1)} \sum_{i \neq j} \kappa_{ij}, \tag{1}$$
where $N$ is the size of the network.
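As a minimal sketch of this kind of calculation (not the code used for the results reported here), the following pure-Python functions greedily count node-independent paths between a pair by repeatedly taking a shortest path and then forbidding its interior nodes, and average that count over all pairs. The function names and the adjacency-dictionary input format are illustrative assumptions, and the greedy count is only a lower bound on the true number of node-independent paths.

```python
from collections import deque
from itertools import combinations


def shortest_path(adj, src, dst, blocked_nodes, blocked_edges):
    """Breadth-first shortest path from src to dst that avoids blocked items."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            path = []
            while u is not None:
                path.append(u)
                u = parent[u]
            return path[::-1]
        for v in adj[u]:
            if v in parent or v in blocked_nodes:
                continue
            if frozenset((u, v)) in blocked_edges:
                continue
            parent[v] = u
            queue.append(v)
    return None


def count_independent_paths(adj, i, j):
    """Greedy (approximate) number of node-independent paths between i and j."""
    blocked_nodes, blocked_edges = set(), set()
    count = 0
    while True:
        path = shortest_path(adj, i, j, blocked_nodes, blocked_edges)
        if path is None:
            return count
        count += 1
        if len(path) == 2:
            # A direct edge has no interior node, so forbid the edge itself.
            blocked_edges.add(frozenset(path))
        else:
            # Interior nodes may not be reused by later paths.
            blocked_nodes.update(path[1:-1])


def structural_cohesion(adj):
    """Average the pairwise path counts over all node pairs."""
    pairs = list(combinations(list(adj), 2))
    total = sum(count_independent_paths(adj, i, j) for i, j in pairs)
    return total / len(pairs)
```

Because the pairwise count is symmetric, averaging over unordered pairs gives the same value as averaging over ordered pairs. On a tree the function returns exactly 1, and on a simple cycle it returns 2.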
To measure temporal concurrency, $C$, we consider here the single-interval case where the link between nodes $i$ and $j$ (if present) has a single, specified starting time $s_{ij}$ and persists for duration $d_{ij}$. For simplicity, we will assume that start times and durations are each independent and identically distributed (IID) across the edges that ever appear in the aggregated network during the selected total time period, $T$. That is, in particular, the timings of the edges emanating from a given node are independent of each other. As such, the temporal concurrency of edges associated with a given node, measuring the probability that such edges overlap in time, becomes by this IID assumption equivalent to the probability that any randomly selected pair of links overlaps in time. We can thus select two links at random, with start times and durations denoted by $(s_1, d_1)$ and $(s_2, d_2)$. Without loss of generality, let $s_1 \le s_2$. The probability that these two edges overlap in time is then simply the probability that the duration of the first edge is larger than the difference between starting times, $d_1 > s_2 - s_1$. If we let $q(d)$ be the probability density function of durations and $p(s)$ be the probability density function of start times, the concurrency under these simplifying assumptions becomes
$$C = 2 \int_0^{T} \mathrm{d}s_1\, p(s_1) \int_{s_1}^{T} \mathrm{d}s_2\, p(s_2) \int_{s_2 - s_1}^{\infty} \mathrm{d}d\, q(d). \tag{2}$$
II.2 Simulated timings and temporal concurrency
For the simulated temporal network data studied here, we further simplify the above expressions for temporal concurrency by assuming specific probability distribution functions for the start times and durations of edges, inherently nondimensionalizing the time scale of the two distributions so as to work easily within a one-parameter model for modifying temporal concurrency. In so doing, we emphasize that the temporal concurrency and the subsequent calculation of reachability depends only on the orderings of start and end times, not the total amounts of time involved in those overlaps. We take a uniform distribution for edge start times, , , and draw durations from an exponential distribution .
We emphasize that changing the decay rate of the exponential in is unnecessary, since doing so is nondimensionally equivalent to a change in for calculating concurrency and reachability. That is, the range has inherently become a nondimensional ratio of the underlying time scales of the distributions of start times to that of edge durations.
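The complementary cumulative distribution function of edge durations, that is, the probability that a duration is larger than some specified time $t$, which we notate by $F(t)$, then simplifies to $F(t) = e^{-t}$, and the concurrency of the temporal network under these timings (start times uniform on $[0, T]$) can be rewritten as
$$C(T) = \frac{2}{T^{2}} \left( T - 1 + e^{-T} \right). \tag{3}$$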
We note that for . Meanwhile, taking the series expansion of the exponential, we obtain that the temporal concurrency for approaches the value like . For comparison, (3) gives .
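The limiting behavior can be checked numerically. The sketch below assumes start times uniform on $[0, T]$ and unit-rate exponential durations, for which carrying out the overlap integral gives $C(T) = 2(T - 1 + e^{-T})/T^{2}$. This closed form is our own evaluation under those stated assumptions, and the function names are illustrative. The Monte Carlo estimate draws random pairs of links and counts how often the earlier-starting link is still active when the later one starts.

```python
import math
import random


def concurrency_closed_form(T):
    """Overlap probability for uniform starts on [0, T], unit-rate durations."""
    if T < 1e-4:
        # Leading terms of the series expansion avoid cancellation near T = 0.
        return 1.0 - T / 3.0
    return 2.0 * (T - 1.0 + math.exp(-T)) / T**2


def concurrency_monte_carlo(T, n_pairs=20000, seed=0):
    """Estimate the same probability by sampling random pairs of links."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_pairs):
        s1, s2 = sorted((rng.uniform(0, T), rng.uniform(0, T)))
        d1 = rng.expovariate(1.0)  # duration of the earlier-starting link
        hits += d1 > s2 - s1       # overlap if it is still active at s2
    return hits / n_pairs
```

Expanding the exponential for small $T$ gives $C \approx 1 - T/3$, while for large $T$ the expression decays like $2/T$, so a single parameter $T$ sweeps the concurrency across its full range.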
Importantly, our structural cohesion definition depends only on the topology of the aggregated time-independent network, ignoring start and end times and including all edges that ever exist during the time period. In contrast, the temporal concurrency depends only on the distributions of start times and edge durations, independent of the network topology.
II.3 Construction of synthetic temporal networks
To explore the effect of temporal concurrency and its interplay with structural cohesion, we will examine the reachability with a model approximation based on the assumption that the networks are locally tree-like. We thus start by confirming the analysis on balanced and unbalanced tree networks, which have only one node independent path for each node pair. In the balanced tree networks, each node has successors except the leaves that are at distance from the root.
We generate unbalanced tree networks by rewiring the balanced trees, ensuring that they maintain the same numbers of nodes and edges. In our rewiring, we choose a random edge from the set of edges . Removing this edge separates the network into two components: one includes and the other includes . We then choose a random node from the component containing node , and connect to . In so doing, we ensure at each step that we maintain a tree structure without cycles. We continue this process times, where the rewiring fraction used in our work here is ; that is, we rewired 10% of the links.
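The rewiring step can be sketched as follows. This is an illustrative reading of the procedure rather than the authors' code: the function names are hypothetical, and we assume that after removing a random edge $(u, v)$ the new endpoint $w$ is drawn from the component containing $u$ and then attached to $v$, which reconnects the two components without creating a cycle.

```python
import random


def balanced_tree_edges(branching, depth):
    """Edge list of a balanced tree; node 0 is the root."""
    edges, frontier, next_id = [], [0], 1
    for _ in range(depth):
        new_frontier = []
        for parent in frontier:
            for _ in range(branching):
                edges.append((parent, next_id))
                new_frontier.append(next_id)
                next_id += 1
        frontier = new_frontier
    return edges


def component_of(node, edges):
    """Set of nodes connected to `node` through the given edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    seen, stack = {node}, [node]
    while stack:
        x = stack.pop()
        for y in adj.get(x, ()):
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return seen


def rewire_tree(edges, fraction, seed=0):
    """Rewire a fraction of the edges while keeping the graph a tree."""
    rng = random.Random(seed)
    edges = list(edges)
    for _ in range(round(fraction * len(edges))):
        u, v = edges.pop(rng.randrange(len(edges)))   # remove a random edge
        w = rng.choice(sorted(component_of(u, edges)))  # node on u's side
        edges.append((w, v))                            # reattach v's side
    return edges
```

Each step removes one edge and adds one edge between the two resulting components, so the numbers of nodes and edges are unchanged and the result remains connected and acyclic.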
We generated connected components from Erdős-Rényi (ER) networks using the gnp_random_graph function in the NetworkX Python package NetworkX developer team (2014). Generating 100 ER networks initially from nodes and connection probability yielded largest connected components of size and average degree . For , with the same initial size, the largest connected components have an average size of and .
We further compare these results with randomly generated graphs with exponential degree distributions, as described in the Appendix. In particular, we observe that the average structural cohesion for an exponential degree distribution graph is typically smaller than for an ER network with the same mean degree. We then subsequently rewire the ER and exponential degree networks to a desired, matched structural cohesion, to clarify the comparison being considered (see the Appendix).
To connect our results to the previous work of Moody and Benton (2016), we used the same sampled collaboration networks studied there, which were extracted by four-step random walks from collaboration networks Moody (2004). In particular, we consider the same four examples that were highlighted in Fig. 3 and Appendix 2 of Moody and Benton (2016), having similar sizes to one another but different structural cohesion. These selected networks capture low average numbers of partners and skewed degree distributions, both of which are typical in sexual contact patterns, making them useful for testing the impact of temporal concurrency in the context of a spreading infection Moody and Benton (2016).
For each of our four different classes of aggregated network structure (trees, ER, exponential degree distributions, and empirical examples), we randomly generate the temporal information for each edge (i.e., start times and durations). We note that we treat all of our synthetic networks as single-interval temporal networks, where each edge is present for the entirety of the duration after its start time, as drawn from the selected distributions. Given the start time, , and duration, , of an edge, its end time, is of course . As described above, we consider start times drawn from a uniform distribution , with durations following an exponential distribution . As such, is effectively a dimensionless time, which we vary in the range .
II.4 Numerical reachability using the accessibility matrix
Given the specific temporal contact information of every link in the temporal contact network, we directly evaluate the average reachability as the density of the accessibility graph. Direct contacts like in Fig. 1(a) immediately carry over into the accessibility graph, along with additional ordered pairs like and in Fig. 1(c). For example, an infection starting from D at can reach C either by directly infecting B, or by infecting A who then infects B, and then by B infecting C during their (later in time) contact. However, an infection seeded at C cannot reach A or D because of the absence of temporally consistent paths, since the link does not appear until after all of the other edges have ended.
The role of concurrency as a potential enhancer of reachability is immediately apparent in this small toy example if we vary the start time of the edge: if that start time were before , then the ordered pair would also be in the accessibility graph, so an infection seeded at can spread further than with the timings indicated in the figure. Similarly, if that start time were before , the ordered pair would also be accessible.
We describe the unweighted accessibility graph through its adjacency matrix $\mathcal{A}$ with elements $\mathcal{A}_{ij} = 1$ when there is a temporally consistent path from node $i$ to node $j$, otherwise $\mathcal{A}_{ij} = 0$, as shown in Fig. 1(e). To quantify an average accessibility across the whole network, we calculate the density of the accessibility graph (that is, the density of the off-diagonal elements of the accessibility matrix)
$$R = \frac{1}{N(N-1)} \sum_{i \neq j} \mathcal{A}_{ij}, \tag{4}$$
where $N$ is the number of nodes, and we call this quantity the reachability of the temporal network.
To numerically evaluate the reachability from temporal network data, we organize the essential temporal information into layers of contacts corresponding to edge end times that have been sorted in ascending order, as depicted in Fig. 2. The process of generating the temporal layers is as follows:
1. Sort the edges $e_m \in E$ by end time $\tau_m$ in ascending order, where $\tau_m$ is the end time of edge $e_m$, $m$ is the sorted index of edges, and $M$ is the total number of edges, $M = |E|$. For example, $e_1$ is the edge with the earliest end time, whereas $e_M$ is the last edge to end. (Breaking ties between identical end times is unimportant for calculating reachability, except insofar as it can be used to speed up the calculation by indexing the smaller number of distinct end times, under an appropriate change of notation.)
2. Construct the $m$th temporal layer matrix $A_m$ by including edge $e_m$ and all other edges $e_n$ with $n > m$ that are present just before the end time $\tau_m$. That is, $A_m$ includes $e_m$ and all $e_n$ satisfying both $n > m$ and $s_n < \tau_m$, where $s_n$ is the start time of $e_n$.
3. By repeating step 2, the full set of temporal layer matrices $A_1, \dots, A_M$ may be prepared.
4. Multiply the matrix exponentials of each temporal matrix, in order of ascending end time, to obtain $\mathcal{A} = e^{A_1} e^{A_2} \cdots e^{A_M}$.
5. Binarize $\mathcal{A}$: for all values $\mathcal{A}_{ij} > 0$, set $\mathcal{A}_{ij} = 1$.
6. Evaluate the average reachability by Eq. 4.
For example, in Fig. 1, the earliest ending edge ends at time . The edge is the only other edge present in the temporal layer in Fig. 2, satisfying the step 2 conditions above. One can similarly determine the adjacency matrices corresponding to the end time of each edge, and multiply the matrix exponentials to evaluate the accessibility matrix as described in step 4. Once we binarize the accessibility matrix , the reachability is obtained by averaging the off-diagonal elements of . In the example in Figs. 1 and 2, the reachability is .
The matrix exponentials in step 4 above provide a simply-expressed formula indicating the connected components within each individual temporal layer. As such, multiplying the matrix exponentials for any set of consecutive temporal layers yields (after binarizing) the reachable network associated with that combination of layers. But in practice for larger temporal networks, it is significantly more efficient computationally to instead directly calculate the connected components of and replace the matrix exponential with the binary indicator matrix whose elements specify whether the corresponding pair of nodes are together in the same component at that time. Similarly, to save memory overhead, steps 3 and 4 can be trivially combined to consider only one temporal layer at a time. For even larger networks whose adjacency matrices must be represented as sparse matrices to fit in memory, the corresponding accessibility graph could instead be constructed one row at a time, updating the running average of the density to calculate the overall reachability.
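The layered procedure, with connected components in place of matrix exponentials and only one layer held in memory at a time, can be sketched as below. This is an illustrative pure-Python version under our own assumptions: contacts are given as hypothetical `(u, v, start, duration)` tuples, and the accessibility information is stored as one set of reachable nodes per source rather than as a matrix.

```python
def layer_components(layer_edges):
    """Map each node touched by the layer to the set forming its component."""
    adj = {}
    for u, v in layer_edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    comp_of = {}
    for root in adj:
        if root in comp_of:
            continue
        comp, stack = {root}, [root]
        while stack:
            x = stack.pop()
            for y in adj[x]:
                if y not in comp:
                    comp.add(y)
                    stack.append(y)
        for x in comp:
            comp_of[x] = comp
    return comp_of


def accessible_sets(nodes, contacts):
    """For each source node, the set of nodes reachable by time-respecting paths."""
    # Sort edges by end time (start + duration), ascending.
    edges = sorted(((s + d, s, u, v) for u, v, s, d in contacts),
                   key=lambda e: e[0])
    reach = {x: {x} for x in nodes}
    for m, (end_m, _, u_m, v_m) in enumerate(edges):
        # Layer m: edge m plus later-ending edges that started before end_m.
        layer = [(u_m, v_m)]
        layer += [(u, v) for (_, s, u, v) in edges[m + 1:] if s < end_m]
        comp_of = layer_components(layer)
        for source, current in reach.items():
            gained = set()
            for y in current:
                if y in comp_of:
                    gained |= comp_of[y]
            current |= gained
    return reach


def reachability(nodes, contacts):
    """Density of the accessibility graph over ordered off-diagonal pairs."""
    nodes = list(nodes)
    reach = accessible_sets(nodes, contacts)
    n = len(nodes)
    return sum(len(r) - 1 for r in reach.values()) / (n * (n - 1))
```

For two sequential contacts, A-B during $[0, 1]$ and B-C during $[2, 3]$, node A can reach C through B but C cannot reach A, giving a reachability of $5/6$. If the two contacts overlap in time, every ordered pair is accessible and the reachability is 1.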
III Approximate model for reachability
We seek to approximate the reachability in terms of some minimal temporal and structural information necessary to accurately describe the essential relationships. Using simulated timings on random graphs, we immediately observe that the overall density and level of cohesion are insufficient for describing the needed structural effects. Specifically, we consider simulated timings on random networks with exponential degree distributions and ER graphs that have been rewired to target specific cohesion values as described in the Appendix. The results in Figure A.9 show reachability versus concurrency for rewired ER and exponential degree distribution graphs both with and , demonstrating clear differences. As such, we desire to more accurately approximate the reachability versus concurrency relationship using more structural information. Motivated by Fig. A.9 and modeling successes for other network problems (see, e.g., Melnik et al. (2011); Gleeson et al. (2012); Gleeson (2013) and citations therein), we consider approximations developed in terms of the underlying degree distribution.
We develop a heterogeneous mean-field model for reachability specified in terms of the degree distribution , temporal concurrency, and structural cohesion. In so doing, we implicitly assume that the underlying aggregated network is sufficiently locally tree-like (see, e.g., Melnik et al. (2011)). Assuming that the essential network structure is (at least largely) dictated by the degree distribution , we proceed to develop models for the effects of the temporal concurrency of edges.
As above, let $p(s)$ be the probability density of edge starting times, and $F(t)$ be the probability of an edge duration larger than $t$. We seek the probability $Q_n(t)$ that a given chain of $n$ edges is temporally consistent given that the first edge has start time $t$. By definition, $Q_1(t) = 1$, since any edge considered in isolation is temporally consistent with itself. Assuming all edge start times and durations are independent and identically distributed, the recursive equation for $Q_{n+1}(t)$ is developed by considering whether the start time $s$ of the next edge in the chain is before or after $t$:
$$Q_{n+1}(t) = \int_0^{t} p(s)\, F(t - s)\, Q_n(t)\, \mathrm{d}s + \int_t^{T} p(s)\, Q_n(s)\, \mathrm{d}s. \tag{5}$$
The first integral accounts for the possibility that the next edge in the chain has a start time $s$ before the first edge, distributed according to $p(s)$, but duration long enough to be concurrent with the first edge, with probability $F(t - s)$. Importantly, we then assess that this next edge is the first edge in a chain of $n$ edges that are temporally consistent with probability $Q_n(t)$, as opposed to $Q_n(s)$, to account for the requirement that all of the edges on the remaining chain still need to be concurrent or appear after the original edge start time $t$. In contrast, the second integral directly measures the contribution from the next edge starting at time $s > t$, after the start time of the first edge, along with the probability $Q_n(s)$ that an edge starting at time $s$ is part of a chain of $n$ edges that are temporally consistent. The second integral here ends at time $T$ since, by our definition, $p(s) = 0$ for $s > T$.
Given $Q_n(t)$, we can then determine the probability that a randomly selected chain of edges of path length $n$ is temporally consistent, by averaging over the distribution for the start time of the first edge:
$$\rho_n(T) = \int_0^{T} p(t)\, Q_n(t)\, \mathrm{d}t, \tag{6}$$
where we have here explicitly noted the remaining dependence on $T$ for temporal consistency along path length $n$. We note in particular that the initialization $Q_1(t) = 1$ yields $\rho_1(T) = 1$ for all $T$; that is, a path of length $1$ is necessarily temporally consistent.
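A numerical sketch of this recursion is given below for the simulated timings used here, assuming start times uniform on $[0, T]$ and unit-rate exponential durations. The symbols $Q_n(t)$ (chain of $n$ edges consistent given first start time $t$) and $\rho_n$ (its average over $t$), the grid size, and the function names are our own illustrative choices. Under these timings the first integral has the closed form $Q_n(t)\,(1 - e^{-t})/T$, and the second is accumulated from the right with the trapezoid rule.

```python
import math


def trapezoid(values, h):
    """Trapezoid-rule integral of equally spaced samples."""
    return h * (sum(values) - 0.5 * (values[0] + values[-1]))


def path_consistency(T, max_len, n_grid=2001):
    """Return [rho_1, ..., rho_max_len] for a chain of 1..max_len edges."""
    h = T / (n_grid - 1)
    t = [k * h for k in range(n_grid)]
    # Probability that the next edge started before t and is still active at t.
    overlap = [(1.0 - math.exp(-tk)) / T for tk in t]
    Q = [1.0] * n_grid  # a single edge is always consistent with itself
    rhos = []
    for _ in range(max_len):
        rhos.append(trapezoid(Q, h) / T)  # average over the first start time
        # tail[k] = (1/T) * integral of Q(s) for s from t_k to T
        tail = [0.0] * n_grid
        for k in range(n_grid - 2, -1, -1):
            tail[k] = tail[k + 1] + 0.5 * h * (Q[k] + Q[k + 1]) / T
        Q = [overlap[k] * Q[k] + tail[k] for k in range(n_grid)]
    return rhos
```

A useful consistency check is that a two-edge chain is consistent either when the second edge starts after the first (probability $1/2$) or when the two overlap, so that $\rho_2 = (1 + C)/2$ with $C$ the pairwise overlap probability; for these timings $C = 2(T - 1 + e^{-T})/T^{2}$.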
III.1 Consideration of node-independent paths
Motivated by Moody and Benton (2016), and specifically by their observations about the role of structural cohesion as measured by the average number of node-independent shortest paths, the development of our model approximation proceeds by restricting attention to the node-independent shortest paths between a node pair $(i, j)$. This treatment, taking the assumption of locally tree-like structure to an extreme, allows us to treat the probabilities along each node-independent path independently. But in doing so, we recognize that we will undoubtedly fail to take into account all potential detours around temporally-inconsistent parts of these paths. Nevertheless, as we will see below, this approach appears to be relatively accurate for small enough structural cohesion and particularly so at low levels of concurrency, presumably because the probabilities of temporal consistency drop off quickly with increasing path length under such conditions.
As above, we continue to denote the number of node-independent shortest paths between nodes $i$ and $j$ by $\kappa_{ij}$. We index those paths by $m = 1, \dots, \kappa_{ij}$ and identify the length of path $m$ by $\ell_m$. We then seek the probability that path $m$ is temporally consistent, which we write as $\rho_{\ell_m}$, the probability that a chain of $\ell_m$ edges is temporally consistent. By the definition of reachability, an ordered node pair is accessible if there is at least one temporally-consistent path from the one node to the other; so we only need to exclude the case that there are no temporally-consistent paths between the nodes. Continuing to assume that we can reasonably consider only the node-independent paths, the probability that at least one of these node-independent shortest paths under consideration between $i$ and $j$ is temporally consistent follows simply by independence:
$$P_{ij} = 1 - \prod_{m=1}^{\kappa_{ij}} \left( 1 - \rho_{\ell_m} \right). \tag{7}$$
That is, given the computation of White and Newman (2001) that separately identifies the number and length of node-independent shortest paths for each node pair in the aggregated network, equation (7) gives us the probability of accessibility between the pair, as restricted along these node-independent paths. In other words, the corresponding element of the accessibility matrix becomes 1 with probability $P_{ij}$. The expected density of the accessibility graph (the off-diagonal parts of the accessibility matrix) under our approximation thus becomes
$$\tilde{R} = \frac{1}{N(N-1)} \sum_{i \neq j} P_{ij}. \tag{8}$$
By our construction, , though this does not require corresponding similarity in the elements of . We also note that we would ideally consider edge-independent paths, which are by definition at least as numerous as the node-independent ones. But given the observed relationship between structural cohesion and the degrees of node pairs in our random graph results in Fig. A.8 in the Appendix, we expect that the typical numbers of edge-independent shortest paths should not on average be much greater than the node-independent ones in these random cases.
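The pairwise calculation of this subsection can be sketched as follows. The inputs are assumptions for illustration: `rho` is a hypothetical mapping from path length to the probability that a chain of that length is temporally consistent, and `paths_by_pair` maps each ordered node pair to the list of lengths of its node-independent shortest paths (for example, as produced by a White and Newman style computation).

```python
def pair_accessibility(path_lengths, rho):
    """Probability that at least one of the independent paths is consistent."""
    p_none = 1.0
    for length in path_lengths:
        p_none *= 1.0 - rho[length]  # this path fails, independently of others
    return 1.0 - p_none


def approximate_reachability(paths_by_pair, n_nodes, rho):
    """Expected density of the accessibility graph over ordered pairs."""
    total = sum(pair_accessibility(lengths, rho)
                for lengths in paths_by_pair.values())
    return total / (n_nodes * (n_nodes - 1))
```

Pairs with no connecting path contribute zero, and a pair joined by a direct edge is accessible with probability 1 because a single edge is always temporally consistent.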
III.2 Modeling in terms of the distribution of path lengths
The above calculation of requires detailed knowledge of the number and lengths of the node-independent paths between each pair. That is, for all intents and purposes we need the entire structure of the aggregated network upon which to calculate these quantities. However in many network survey conditions, the available information is much more tightly constrained. It can be particularly beneficial under such settings to model outcomes at the level of heterogeneous mean-field theories that use only the degree distribution of the network (see, e.g., Pastor-Satorras and Vespignani (2001); Gleeson (2013)). Since such models are typically derived from “locally tree-like” assumptions (see, e.g., Melnik et al. (2011)), we find it reasonable to consider how we might similarly extend our tree-like assumptions above.
Given a distribution of path lengths, $\pi(\ell)$, to be considered as independent candidate paths between a randomly selected pair of nodes, the joint probability that a selected path has length $\ell$ and is temporally consistent is given by $\pi(\ell)\, \rho_\ell$, where $\rho_\ell$ is the probability that a chain of $\ell$ edges is temporally consistent. Summing over possible path lengths, we compute the probability that a path from this set is temporally consistent:
$$\bar{\rho} = \sum_{\ell = 1}^{\ell_{\max}} \pi(\ell)\, \rho_\ell, \tag{9}$$
where $\ell_{\max}$ is the largest path length in the distribution $\pi(\ell)$. Then the probability that at least one of $\kappa_{ij}$ independent paths between nodes $i$ and $j$ is temporally consistent simplifies to
$$P_{ij} = 1 - \left( 1 - \bar{\rho} \right)^{\kappa_{ij}}. \tag{10}$$
Notably, using the probability $\bar{\rho}$ in this way decouples the considered probabilities along each path from all other possible properties of importance of nodes $i$ and $j$ (e.g., their degrees). And, again, in making this calculation we have made the (rather strong) assumption that we considered only independent paths.
To model , we note that while an exact analytical measure of the structural cohesion appears to be prohibitively difficult, a trivial application of Menger’s theorem Menger (1927) requires the maximum number of node-independent paths between nodes and to be bounded by the minimum degree of the pair, , where and indicate the degrees of the two nodes. We observe that this upper bound yields a good approximation for the average cohesion in our random graphs, as observed in Fig. A.8. That said, we note by way of contrast that the four empirical networks from Moody and Benton (2016) that we study have much lower cohesions (1.61, 1.34, 1.07 and 1.06) than bounded by this relationship to node degrees (3.18, 3.39, 2.10 and 2.01, respectively). Moreover, in a true tree we require for all node pairs by definition.
By assuming and substituting the approximation into Eq. 10, depends only on the node degrees and , along with the path length distribution under consideration. The resulting approximation of reachability, denoted to distinguish it from the calculation of the previous subsection, then becomes
where is the degree distribution and we have not bothered to correct the contribution in corresponding to pairing a node with itself.
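A sketch of the resulting degree-based approximation follows. It reflects our reading of the construction above: the number of independent paths for a pair is replaced by the minimum of the two node degrees, the pair is drawn from the product of the degree distribution with itself (including the uncorrected self-pairing term), and every path shares the same path-averaged consistency probability. The dictionary inputs and function names are hypothetical.

```python
def mean_path_consistency(path_length_dist, rho):
    """Average the per-length consistency over the path length distribution."""
    return sum(prob * rho[length] for length, prob in path_length_dist.items())


def degree_based_reachability(degree_dist, path_length_dist, rho):
    """Reachability estimate from degree and path length distributions only."""
    rho_bar = mean_path_consistency(path_length_dist, rho)
    total = 0.0
    for k1, p1 in degree_dist.items():
        for k2, p2 in degree_dist.items():
            # min(k1, k2) bounds the number of node-independent paths.
            n_paths = min(k1, k2)
            total += p1 * p2 * (1.0 - (1.0 - rho_bar) ** n_paths)
    return total
```

Passing the geodesic or the node-independent path length distribution as `path_length_dist` gives the two variants of the approximation compared in the results.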
We again emphasize that we have assumed the structural and temporal details of our temporal networks are independent of one another. Therefore, for instance, there are no correlations between node degrees and all of the temporal details absorbed into the terms. The only remaining structural contributions in the approximation are from (1) the empirically observed degree distribution, (2) the selected model for as discussed above, and (3) the selected distribution of path lengths to obtain in Eq. 9.
In theory, one could continue by way of approximating in terms of the degree distributions Katzav et al. (2015); Nitzan et al. (2016). In particular, we note that in going in this direction one is more likely to be able to employ some model for the distribution of geodesic shortest path lengths, as opposed to that for node-independent shortest paths. Similarly, if the path length distribution is to be sampled by some manner, it may be more likely to get a reasonable sample of the geodesic paths versus the node-independent ones. To explore the effect of potentially using the geodesic shortest path length distribution instead of the node-independent path length distribution, we below consider both possibilities by direct use of the empirically observed path length distributions in each network, using to represent the approximation obtained using the shortest path length distribution and for the model using the node-independent path length distribution.
We numerically examine the relationship between temporal concurrency and reachability on different families of networks, comparing with our model approximations. The reachability approximation from Sec. III.1 uses the specific structural information of numbers and lengths of node-independent paths between each node pair. In contrast, the and approximations from Sec. III.2 employ path length distributions over the whole network, using the distributions of shortest paths and of node-independent paths, respectively. We confirm the results for trees (balanced and unbalanced). We then test the calculation on Erdős-Rényi networks at two different densities and on four empirical networks highlighted in Moody and Benton (2016).
IV.1 Reachability on synthetic tree networks
We numerically evaluate the reachability and our approximation, varying the temporal concurrency on balanced and randomly unbalanced tree networks with two different sizes (specified by the number of offspring, , and the depth, , of the balanced tree): (i) and , with nodes; and (ii) and , with nodes. The average degrees of these two types of trees are . We numerically evaluate the reachability by the method in Sec. II.4 and compare it with the (Eq. 8) and (Eq. 11) approximations. Since, by definition, a tree provides only a single node-independent path between a node pair, we accordingly set in Eqs. 7 and 10 in calculating and , respectively. Similarly, because the node-independent and geodesic shortest path distributions are thus identical, we note that on a tree.
In Fig. 3, the approximations accurately describe the typical increase in reachability with increasing temporal concurrency for both the balanced and unbalanced trees. We specifically note that the concurrency values plotted here are the expected value given a specified time interval . The results on the unbalanced trees include different network realizations as obtained by the rewiring described in Sec. II.3. We observe a very slight gap between the approximations and for the unbalanced trees, compared to that of the balanced trees, though the predictions are still well within the standard deviation of the data. We hypothesize that the greater heterogeneity in the path length distribution in the unbalanced tree network may be a possible cause of this difference. The result confirms that the approximations accurately estimate the reachability for tree networks using only the path length and degree distributions and , which is of course expected in this setting.
IV.2 Reachability on Erdős-Rényi networks
We next consider the reachability on sparse Erdős-Rényi (ER) networks with and , going in with the assumption that these random graphs will typically have sufficiently locally tree-like structure Newman (2010).
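A sparse ER graph of this kind, restricted to its largest connected component, can be generated along the following lines. This is a plain-Python sketch rather than the code used for the experiments; the network size of 300 nodes, the mean degree of 2, and the seed are hypothetical values chosen for illustration.

```python
import random
from collections import deque


def er_graph(n, mean_degree, rng):
    """G(n, p) random graph with p chosen so the expected degree is mean_degree."""
    p = mean_degree / (n - 1)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def largest_component(adj):
    """Subgraph induced by the largest connected component."""
    seen, best = set(), set()
    for s in adj:
        if s in seen:
            continue
        comp = {s}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in comp:
                    comp.add(v)
                    queue.append(v)
        seen |= comp
        if len(comp) > len(best):
            best = comp
    return {u: adj[u] & best for u in best}


rng = random.Random(42)           # hypothetical seed
er = er_graph(300, 2.0, rng)      # hypothetical size and mean degree
lcc = largest_component(er)
lcc_mean_degree = sum(len(nb) for nb in lcc.values()) / len(lcc)
```

Restricting to the largest component discards isolated nodes and small fragments, so the mean degree measured inside the component is typically somewhat higher than the nominal value passed to the generator.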
As shown in Fig. 4(a), the approximation that includes the specific path length information between node pairs in the network largely underestimates the reachability. If anything, we should not be surprised that underestimates the true value of like this, since the calculation leading to only considers reachability along node-independent paths. As such, the increased error made by in increasing from to is expected, though the size of the resulting error emphasizes the apparent importance of available detours around these paths even for these small mean degrees. We note in particular that the approximation is quite good at very low concurrency, , where the node-independent shortest paths presumably have greater dominance because longer paths along detours become even more unlikely to maintain temporal consistency. However, the limiting behavior of the approximation in the approach as is clearly incorrect. We hypothesize that the behavior in this limit is possibly controlled by temporal inconsistency of key edge-to-edge transitions important along many paths, which is not an effect considered in the approximation.
We also include the approximations and in Fig. 4. We note that is very similar to here, indicating only modest change in the jump in the approximation obtained using full path length information for each node pair () versus a single path length distribution across all node-independent shortest paths (). The additional gap between and is due to replacing the path length distribution empirically obtained over all node-independent shortest paths with the geodesic shortest path distribution, yielding shorter paths which are slightly more likely to be temporally consistent. Thus, slightly overestimates the reachability at very low temporal concurrency.
To further understand the limitations of our approximations, we explored the reachability frequency of node pairs according to their degrees in ER networks with . We directly measure how many node pairs with given degrees are reachable out of the total number of reachable node pairs:
where and represent the sets of nodes having degree and , respectively. We measured across ER networks with for low [ in Fig. 5(a)] and high temporal concurrency [ in Fig. 5(b)]. Perhaps remarkably, we observe only a very small shift in between these two panels of the figure, but the shift that is apparent in the distribution indicates that a larger fraction of the reachable pairs at high concurrency involves the degree-one nodes. That this should be the case makes intuitive sense in that the reachability of the degree-one nodes should be more suppressed at low concurrency than that for higher-degree nodes.
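The measurement described above can be sketched as a simple tally: among all reachable ordered node pairs, count the share whose endpoints have a given pair of degrees. The function name and the toy inputs below are assumptions made for illustration; in particular, the list of reachable pairs would in practice come from the temporal network simulation.

```python
from collections import Counter


def reachability_frequency(degree, reachable_pairs):
    """Share of reachable ordered pairs (i, j) with degrees (k, k').

    degree maps node -> degree in the aggregated static network;
    reachable_pairs is an iterable of ordered pairs (i, j), i != j,
    for which a temporally consistent path from i to j exists.
    """
    counts = Counter((degree[i], degree[j]) for i, j in reachable_pairs)
    total = sum(counts.values())
    return {kk: c / total for kk, c in counts.items()}


# Toy example: a three-node path with a degree-2 centre (node 0).
toy_degree = {0: 2, 1: 1, 2: 1}
toy_reachable = [(0, 1), (1, 0), (0, 2), (2, 0), (1, 2)]
freq = reachability_frequency(toy_degree, toy_reachable)
```

By construction the frequencies sum to one, so a change between low and high concurrency shows up as a redistribution of weight among degree pairs rather than as a change in overall level.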
Noting that most node pairs are reachable at , we make this observation more explicit by also computing the relative reachability frequency between degree pairs, defined as
where is the reachability matrix of the corresponding static network obtained by aggregating the temporal contacts. Because we only consider largest connected components in our numerical experiments, and the sum in the denominator merely counts the number of such pairs given the selected degrees. Fig. 5(c) and (d) show the relative reachability frequency for low () and high () temporal concurrency, respectively. In particular, we confirm in panel (d) that almost all pairs are reachable, with for all degree values. In contrast, for , the low-low-degree node pairs are much less likely to be reachable, as seen in Fig. 5(c). Meanwhile, even at low concurrency, we see that high-high-degree node pairs are already quite likely to be reachable, with nearly of node pairs being reachable in this setting. We note in looking at Fig. 5(c) that there are very few degree- nodes in these networks, so the apparent dropoff in for these cases is due to averaging over a small number of such cases.
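The relative reachability frequency normalizes each degree pair by the number of pairs that are reachable in the aggregated static network. A minimal sketch, with hypothetical function and variable names and a three-node toy example, is:

```python
from collections import Counter


def relative_reachability_frequency(degree, temporal_pairs, static_pairs):
    """For each degree pair (k, k'), the fraction of statically reachable
    ordered node pairs that are also reachable in the temporal network."""
    numer = Counter((degree[i], degree[j]) for i, j in temporal_pairs)
    denom = Counter((degree[i], degree[j]) for i, j in static_pairs)
    return {kk: numer[kk] / denom[kk] for kk in denom}


# Toy example: path 0 - 1 - 2, where node 1 has degree 2.
path_degree = {0: 1, 1: 2, 2: 1}
# In a connected static network every ordered pair is reachable.
all_pairs = [(i, j) for i in path_degree for j in path_degree if i != j]
# Suppose the edge timings leave only the pair (2, 0) unreachable.
temporal = [(0, 1), (0, 2), (1, 0), (1, 2), (2, 1)]
rel = relative_reachability_frequency(path_degree, temporal, all_pairs)
```

In this toy case the only loss is between the two degree-one nodes, which mirrors the qualitative observation that low-degree pairs are the ones suppressed at low concurrency.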
The increasing errors in our model predictions at higher concurrency stem directly from the increasing importance of the neglected detours around the node-independent shortest paths. Recalling the explicit role of the number of such paths between nodes and in our approximations, we ask whether the relationship between reachability and concurrency observed numerically might be captured by assuming some other effective values for . In Fig. 6, we continue to consider reachability on the ER networks. Focusing for this figure only on approximations built from degree distributions and distributions of geodesic shortest paths, we reproduce here our regular approximation using from Fig. 4(b). This approximation overestimates the reachability at low concurrency because geodesic shortest paths are shorter on average than the full set of node-independent shortest paths (the latter used in our approximations). As seen in Fig. 6, this overestimate at low concurrency can also be at least partially corrected for by decreasing the effective cohesion used in the approximation formulae to . (For comparison, the average structural cohesion of the underlying ER networks is .) Of perhaps greater interest, we see in Fig. 6 that the underestimated reachability at large appears to be corrected for at this level of modeling by choosing an effective cohesion value of , yielding a good approximation over the range . We believe that identifying such effective cohesion values as modeled from other network features (as opposed to curve fitting here) may be an interesting direction for future work, as a means of extending the range of validity of our tree-based approximations.
IV.3 Reachability on empirical networks
We examined reachability versus concurrency on four sampled empirical networks that were highlighted in the previous work of Moody and Benton (2016). Example networks (i) and (ii) have low structural cohesion — and , respectively — while example networks (iii) and (iv) have relatively higher structural cohesion — and , respectively.
When the network structure is tree-like in the sense of cohesion being near , all three of our model approximations plotted in Fig. 7 appear to be in relatively good agreement with the numerically calculated reachability. In accord with our other results above, we see that our approximation reasonably captures the low-concurrency limiting behavior in Fig. 7(a,b), and while it necessarily underestimates the level of reachability throughout, the deviation from the true reachability curves at low structural cohesion (panels a and b of the figure) is not as large as at higher cohesion (panels c and d). Moreover, we see that much of this underestimate is effectively corrected in this case by the other modeling steps introduced by the and approximations, again particularly so at lower values of cohesion.
We also note here that the approximation overestimates the reachability in the low concurrency regime in panels a and b, unlike the above-observed behavior for ER graphs. This occurs because the way we constructed the empirical distribution of the node-independent shortest paths for this calculation here counted multiple short paths between nearby nodes. This counting yields on average shorter paths that then overestimate the reachability at small concurrency.
We investigated the overall level of reachability in temporal networks, considering the effects of temporal concurrency and its interplay with network structure, including structural cohesion. We developed a sequence of approximations for reachability based on strong (and potentially inaccurate) assumptions of locally tree-like networks. We then compared our approximations to numerical results for simulated edge timings on a variety of types of networks. In networks that are tree-like in the sense of low structural cohesion, our approximation agrees well with the numerically computed reachability, particularly so for small concurrency. At larger structural cohesion and/or larger concurrency, the importance of having many possible non-independent paths is not captured by our existing approximations.
We further explored the effects in our different model approximations using different levels of detailed network information. Specifically, our model uses the observed numbers and lengths of node-independent shortest paths between each pair of nodes. In contrast, our models employ only the degree distribution with a path length distribution, and we considered differences using distributions of geodesic shortest paths versus node-independent shortest paths.
Whereas our present approximations are more accurate at small temporal concurrency, productive future work might focus on the limiting behavior as . Specifically, our approximation correctly captures at , but the manner of approach as is noticeably incorrect compared to the simulated temporal network measurements, unless we artificially select an increased cohesion value as in Fig. 6. Given the relatively simple shape of the reachability versus concurrency curves, it is perhaps possible that a theory that is only correct in capturing the limiting behavior of reachability might be matched or otherwise combined with our present model to better approximate reachability over the whole interval. Future studies might also explore the possible role of heterogeneity in actor-level concurrency across the network.
We believe the present study, focused on the role of temporal concurrency and structural cohesion in determining reachability, further emphasizes the need to better understand the interplay between the temporal and topological aspects in networks. With a more complete, integrated picture of this interplay, it may be possible in the future to identify different immunization strategies for outbreaks on empirical temporal networks in terms of their estimated structural and temporal properties. For example, such models could then be used to help predict possible benefits obtainable from targeting hub nodes in the underlying contact network versus individual-level or population-level interventions to decrease concurrency.
Research reported in this publication was supported by the Eunice Kennedy Shriver National Institute of Child Health & Human Development of the National Institutes of Health under Award Number R01HD075712 and the James S. McDonnell Foundation 21st Century Science Initiative - Complex Systems Scholar Award grant #220020315. The content is solely the responsibility of the authors and does not necessarily represent the official views of any sponsors.
Appendix A Construction of the exponential degree distribution networks
To complement our tests on ER networks, we additionally consider networks with exponential degree distributions that have been rewired to match the structural cohesion of ER networks having the same mean degree. We construct a network with an exponential degree distribution using the configuration_model function in the NetworkX package in Python, which follows steps described in Newman (2003). A degree sequence for nodes is generated by independent draws from the given distribution , where is the desired mean degree. We used the largest connected component from the generated network. We removed self-loops and multi-edges and only accepted the resulting network if the mean degree was within of the desired . We note in particular that this procedure does not properly sample the space of simple configuration model graphs without self-loops and multi-edges Fosdick et al. (2018). But for our present purposes of using these networks as random examples, we do not rely on obtaining a proper sampling of the space. We have not shown figures here exploring our approximations for these exponential degree distribution networks, since they are qualitatively similar to those discussed for ER networks in the main text, in particular having better accuracy at small .
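The construction steps above can be sketched in plain Python as follows. This is not the NetworkX-based code used for the paper. Several details are assumptions made for illustration: the degree distribution is taken to be geometric on k >= 1 (a discrete analogue of an exponential) with the requested mean, and the network size, mean degree, relative tolerance, and seed are hypothetical values.

```python
import math
import random
from collections import deque


def sample_degrees(n, mean_degree, rng):
    """Independent geometric degrees on k >= 1 (assumed form), mean > 1."""
    q = 1.0 - 1.0 / mean_degree
    degs = [1 + int(math.log(1.0 - rng.random()) / math.log(q)) for _ in range(n)]
    if sum(degs) % 2:          # the number of stubs must be even
        degs[0] += 1
    return degs


def simple_configuration_graph(degs, rng):
    """Pair stubs uniformly at random, then drop self-loops and multi-edges."""
    stubs = [node for node, d in enumerate(degs) for _ in range(d)]
    rng.shuffle(stubs)
    adj = {node: set() for node in range(len(degs))}
    for u, v in zip(stubs[::2], stubs[1::2]):
        if u != v:             # sets silently merge repeated edges
            adj[u].add(v)
            adj[v].add(u)
    return adj


def largest_component(adj):
    seen, best = set(), set()
    for s in adj:
        if s in seen:
            continue
        comp, queue = {s}, deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u] - comp:
                comp.add(v)
                queue.append(v)
        seen |= comp
        if len(comp) > len(best):
            best = comp
    return {u: adj[u] & best for u in best}


def exponential_network(n, mean_degree, rel_tol, rng, max_tries=1000):
    """Regenerate until the component's mean degree is close to the target."""
    for _ in range(max_tries):
        degs = sample_degrees(n, mean_degree, rng)
        adj = largest_component(simple_configuration_graph(degs, rng))
        k = sum(len(nb) for nb in adj.values()) / len(adj)
        if abs(k - mean_degree) <= rel_tol * mean_degree:
            return adj
    raise RuntimeError("no acceptable network found")


net = exponential_network(500, 3.0, 0.1, random.Random(7))
```

As the text notes, discarding self-loops and multi-edges after stub matching does not sample simple graphs uniformly; the sketch inherits that limitation.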
We note that the exponential degree distribution networks as generated to this point of the procedure have natural levels of structural cohesion that are different from ER networks with the same mean degree, as shown in Fig. A.8. Because of the important role of structural cohesion in the present work, we seek to remove this difference between the exponential degree and ER networks. In Fig. A.8(a), we see that the observed structural cohesion in these random graphs is very close to their upper bounds given by averaging over , except at small mean degrees . In Fig. A.8(b), we see that there is very little finite-size effect in the observed structural cohesion values on these graphs. (As an aside, we note that the empirical degree distributions in the largest connected component are generally slightly right-shifted from the imposed degree distribution before restriction to the largest connected component. This shift thereby increases the upper bound for structural cohesion obtained by averaging over .)
To tune networks to a desired structural cohesion (specifically, to make networks with ER and exponential degree distributions but with the same structural cohesion), we rewire the links as follows (see, e.g., Maslov and Sneppen (2002); Jo and Eom (2014)). We randomly choose two links and . If cutting these links does not break the network up into multiple components, we cut these links and then replace them with either and or and . In so doing, we reject new candidate edges that generate multi-edges or self-loops and then select the pair of edges that makes the new structural cohesion closest to the desired value. If neither rewiring option successfully moves the cohesion closer to the target value, the original cut edges are restored. By this method, the degree distribution remains constant while the degree-degree correlation and the structural cohesion change. We repeat this rewiring process until either the target value of structural cohesion is obtained (to within a tolerance here of ) or, if the target is not achieved within rewires, the process is restarted with a new random graph generated from the distribution.
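A single step of the rewiring procedure described above can be sketched as follows. The function names are hypothetical, and the metric is passed in as a callable. To keep the example short and fast, the usage below substitutes a stand-in metric (the mean product of endpoint degrees over edges, a simple degree-correlation proxy) for structural cohesion; it is NOT structural cohesion, and the tolerance check and restart logic are omitted.

```python
import random
from collections import deque


def is_connected(adj):
    start = next(iter(adj))
    seen, queue = {start}, deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u] - seen:
            seen.add(v)
            queue.append(v)
    return len(seen) == len(adj)


def set_edges(adj, pair, present):
    """Add (present=True) or remove (present=False) both edges in `pair`."""
    for u, v in pair:
        if present:
            adj[u].add(v)
            adj[v].add(u)
        else:
            adj[u].discard(v)
            adj[v].discard(u)


def rewire_step(adj, metric, target, rng):
    """Try one degree-preserving swap; keep it only if it moves
    metric(adj) strictly closer to target. Returns True if kept."""
    gap = abs(metric(adj) - target)
    edges = [(u, v) for u in adj for v in adj[u] if u < v]
    (a, b), (c, d) = rng.sample(edges, 2)
    old = ((a, b), (c, d))
    set_edges(adj, old, False)                      # cut the two links
    best = None
    if is_connected(adj):                           # cutting must not split the graph
        for new in (((a, c), (b, d)), ((a, d), (b, c))):
            if any(u == v or v in adj[u] for u, v in new):
                continue                            # self-loop or multi-edge
            set_edges(adj, new, True)
            new_gap = abs(metric(adj) - target)
            set_edges(adj, new, False)
            if new_gap < gap:
                gap, best = new_gap, new
    set_edges(adj, best if best is not None else old, True)
    return best is not None


def mean_degree_product(adj):
    """Stand-in metric for this sketch only (not structural cohesion)."""
    edges = [(u, v) for u in adj for v in adj[u] if u < v]
    return sum(len(adj[u]) * len(adj[v]) for u, v in edges) / len(edges)


# Usage on a 12-node ring with three chords.
n = 12
g = {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}
set_edges(g, ((0, 6), (3, 9)), True)
set_edges(g, ((2, 7),), True)
degrees_before = {u: len(nb) for u, nb in g.items()}
target = mean_degree_product(g) + 0.5
gap_before = abs(mean_degree_product(g) - target)
rng = random.Random(1)
for _ in range(200):
    rewire_step(g, mean_degree_product, target, rng)
gap_after = abs(mean_degree_product(g) - target)
```

Each accepted swap exchanges endpoints between two edges, so every node keeps its degree, and because the connectivity check is made after cutting, the rewired network always remains a single component.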
Figure A.9 demonstrates the reachability of rewired ER and exponential degree distribution networks with the same average degree () and structural cohesion (). Even though the mean degree and structural cohesion of these random graphs are the same, the relationship between reachability and concurrency is noticeably different in the figure. This observation further motivates the development of our approximations in the main text in terms of degree distributions and path lengths.
- A. Vazquez, B. Rácz, A. Lukács, and A.-L. Barabási, Phys. Rev. Lett. 98, 158702 (2007).
- M. Karsai, M. Kivelä, R. K. Pan, K. Kaski, J. Kertész, A.-L. Barabási, and J. Saramäki, Phys. Rev. E 83, 025102 (2011).
- B. Min, K.-I. Goh, and A. Vazquez, Phys. Rev. E 83, 036102 (2011).
- N. Masuda, K. Klemm, and V. M. Eguíluz, Phys. Rev. Lett. 111, 188701 (2013).
- T. Hiraoka and H.-H. Jo, Scientific Reports 8 (2018).
- P. Holme and F. Liljeros, Scientific Reports 4 (2014).
- J.-C. Delvenne, R. Lambiotte, and L. E. C. Rocha, Nature Communications 6, 7366 (2015).
- E. Colman and N. Charlton, Phys. Rev. E 94 (2016).
- J. Moody and R. A. Benton, Annals of Epidemiology 26, 241 (2016).
- M. Morris and M. Kretzschmar, Social Networks 17, 299 (1995).
- C. H. Watts and R. M. May, Mathematical Biosciences 108, 89 (1992).
- K. Gurski and K. Hoffman, Mathematical Biosciences 282, 91 (2016).
- T. Onaga, J. P. Gleeson, and N. Masuda, Phys. Rev. Lett. 119, 108301 (2017).
- P. Grindrod, M. C. Parsons, D. J. Higham, and E. Estrada, Phys. Rev. E 83, 046120 (2011).
- H. H. K. Lentz, T. Selhorst, and I. M. Sokolov, Phys. Rev. Lett. 110, 118701 (2013).
- P. Holme, The European Physical Journal B 88, 234 (2015).
- J. Moody and D. R. White, American Sociological Review 68, 103 (2003).
- S. Melnik, A. Hackett, M. A. Porter, P. J. Mucha, and J. P. Gleeson, Phys. Rev. E 83, 036112 (2011).
- D. White and M. E. J. Newman, SSRN Electronic Journal (2001).
- NetworkX developer team, “Networkx,” (2014).
- J. Moody, American Sociological Review 69, 213 (2004).
- J. P. Gleeson, S. Melnik, J. A. Ward, M. A. Porter, and P. J. Mucha, Phys. Rev. E 85, 026106 (2012).
- J. P. Gleeson, Phys. Rev. X 3, 021004 (2013).
- R. Pastor-Satorras and A. Vespignani, Phys. Rev. Lett. 86, 3200 (2001).
- K. Menger, Fundamenta Mathematicae 10, 96 (1927).
- E. Katzav, M. Nitzan, D. ben Avraham, P. L. Krapivsky, R. Kühn, N. Ross, and O. Biham, EPL (Europhysics Letters) 111, 26006 (2015).
- M. Nitzan, E. Katzav, R. Kühn, and O. Biham, Phys. Rev. E 93, 062309 (2016).
- M. Newman, Networks: An Introduction (Oxford University Press, Inc., New York, NY, USA, 2010).
- M. Newman, Computer Physics Communications 147, 40 (2003).
- B. Fosdick, D. Larremore, J. Nishimura, and J. Ugander, SIAM Review 60, 315 (2018).
- S. Maslov and K. Sneppen, Science 296, 910 (2002).
- H.-H. Jo and Y.-H. Eom, Phys. Rev. E 90, 022809 (2014).