Separating temporal and topological effects in walk-based network centrality
Abstract
The recently introduced concept of dynamic communicability is a valuable tool for ranking the importance of nodes in a temporal network. Two metrics, broadcast score and receive score, were introduced to measure the centrality of a node with respect to a model of contagion based on time-respecting walks. This article examines the temporal and structural factors influencing these metrics by considering a versatile stochastic temporal network model. We analytically derive formulae to accurately predict the expectation of the broadcast and receive scores when one or more columns in a temporal edgelist are shuffled. These methods are then applied to two publicly available datasets and we quantify how much the centrality of each individual depends on structural or temporal influences. From our analysis we highlight two practical contributions: a way to control for temporal variation when computing dynamic communicability, and the conclusion that the broadcast and receive scores can, under a range of circumstances, be replaced by the row and column sums of the matrix exponential of a weighted adjacency matrix given by the data.
1 Introduction
Epidemics, viral marketing, cultural diffusion, the distribution of food in ant colonies, and the flow of information within the human brain, are amongst a growing number of applications of network theory which currently reside at the forefront of modern science [1, 2, 3, 4, 5]. Advances in technology continue to promote the accumulation of data, providing an optimistic light in the quest to understand these hugely complex systems. The task then, for researchers across a range of disciplines, is to find optimal ways to measure, model, analyze, and present the vast information at their disposal.
Network theory has proved to be an invaluable resource to exploit data on a large scale. Its great utility comes partly from its ability to translate problems into a language independent of the particular subject of study. Hence, a “node” can represent entities as diverse as a human, a protein or a word [6, 7, 8]. “Edges” can represent any sort of interaction between the nodes, and concepts such as percolation, diffusion, paths and walks can all serve as models for various processes observed in the real world.
It is remarkable whenever the methods developed for the analysis of one subject matter are applied to seemingly unrelated problems. This occurs frequently when networks are involved. For example, the preferential attachment model can explain the distribution of citations in scientific literature as well as the distribution of popularity in a social network [9, 10], and the PageRank algorithm was developed to rank websites but can also measure the risk of cancer in humans [11]. These universalities motivate us to search for ways to measure networks and classify them by their properties: if we have a good description of the network, then we have potentially described a part of the “real world” which we would like to understand; moreover, we also have the entirety of past research and all the accompanying tools developed to help attack the problem.
1.1 Motivation for “dynamic communicability”
Transmissible disease is possibly the best example to demonstrate the versatility of network analysis. Ultimately the theoretical considerations of network epidemiology involve nodes, edges and some knowledge of the disease itself such as the transmission probability, recovery rate and so on. Transmission could occur from one person to another, from one location to another (e.g. connected by air travel), or between species, but in each case the models employed remain well within the confines of the network framework [12, 13, 14]. This also extends to computer viruses [15], Twitter hashtags and internet memes [16, 17], and possibly even cultural transmission on an archeological timescale [18]. Clearly there is much to be gained from having a grounded understanding of how things spread through a network regardless of what that particular network represents.
The work we present here concerns a scenario where we are given a database containing a set of distinct individuals, a set of pairwise interactions, and the exact time at which each interaction happened (see Fig.a). Additionally it is assumed that some transmissible agent was, or potentially could have been, spreading through the network. A practical question which often arises is: “which node is potentially the most significant when it comes to the spread of a transmissible agent?”.
To find the most influential spreader, given data of past interactions, there are several options to consider: the simplest method would be to find the individual with the highest node degree (this could be defined as either number of interactions that person had, or the number of people with whom they interacted). Alternatively we could use global network properties such as the betweenness centrality or closeness centrality of a node, both of which are defined on temporal networks [19]. The most extensive approach currently being used is to build a computational model of the process, adding as many factors into the model as one sees fit; where uncertainty is present, random variables can be used; and the centrality of an individual can be computed by running the model repeatedly and counting the proportion of simulations in which they are infected [12, 20].
Dynamic communicability, which was introduced in [21] and is described in detail here in Section 2, offers a balance between the approach of modeling an epidemic-like process on a network, and simply measuring the size and shape of a network. Here we determine the influence of a node by counting the number of time-respecting walks that began at the node in question. In essence, we are using a model which assumes that a transmissible agent moves from one node to another at the exact time that an interaction takes place (which is known from the data) and with a given transmission probability. The fact that it is a walk (as opposed to a path) means that the agent can revisit previously infected nodes. Assuming this, and supposing that the pathogen is administered at node $i$, the broadcast score of $i$ tells us how large the expected outbreak will be. Supposing the pathogen is administered to a random unknown node, the receive score of $i$ tells us how likely that pathogen is to reach $i$.
1.2 Separating dependencies
In this paper we interrogate the two dynamic communicability metrics: broadcast score and its opposite, receive score. Through theoretical approaches we will examine how these centrality measures respond to different temporal network structures. Further, we derive methods to deconstruct the dynamic communicability measures into “time dependent” and “structure dependent” components. The formulae we derive achieve the same result as “shuffling” (randomly permuting) either the structural or temporal columns of the temporal edgelist respectively. This is an increasingly common technique used to determine the importance of various relationships within a database [22, 23, 24, 20]. Here we employ this technique to unpick, from the information available, the factors most relevant to determining the outcome of a contagion-like process.
The following section explains in detail the dynamic communicability metrics. In Section 3 we describe a stochastic model which can be tuned to reproduce various properties of the data. The main results from the model are a set of “shortcut formulae” for decomposing the dynamic communicability metrics into time dependent and structure dependent elements in an efficient way. We demonstrate these results on two publicly available data sets, which are described in Section 4, and the results are presented in Section 5. Section 6 summarizes the findings from this work which we consider most significant.


2 Definitions of “Broadcast score” and “Receive score”
Communicability, as introduced in [25], is a measure of centrality based on the concept of “walking” on a network. A walk is any sequence of nodes in which one entry may only follow another if there is an edge in the network which connects them (if the network is directed then consecutive entries must follow the direction of the edge). The extension to temporal networks, in which edges exist only at specified temporal instances, was introduced in [21] and further developed in [26] and [27]. When dealing with temporal edges, we consider node sequences in which consecutive nodes are connected by an edge and, additionally, the time of that edge is later than (or at the same time as) its predecessor. These are referred to as “time-respecting” walks.
Based on this premise it is possible to quantify the relationship between any two nodes: the “dynamic communicability” from node $i$ to node $j$, denoted $Q_{ij}$, is a measure of the relative likelihood that a random walker injected into the network at $i$ will eventually pass through $j$. If we let $w_{ij}^{(k)}$ be the number of time-respecting walks of length $k$ that begin at $i$ and end at $j$, then
(1) $$Q_{ij} = \sum_{k} \alpha^{k} w_{ij}^{(k)}$$
The value $\alpha$ here is analogous to the probability of transmission (across an edge) in an epidemic spreading process. When chosen to be sufficiently small, it ensures that long walks are discounted heavily while short walks contribute more to the dynamic communicability metric. Several alternative approaches have been proposed as ways to downweight walks based on their length. The original communicability defined on a static network discounts walks of length $k$ by dividing their number by $k!$, whereas when the temporal equivalent was introduced the exponential discounting shown in Eq.(1) was used. The measure introduced in [26] combines both. The measure introduced in [27] extends Eq.(1) by additionally discounting walks according to their duration in time.
For the first centrality measure, known as the “Broadcast score” of a node $i$, we compute the sum of all the discounted walks that begin at $i$ ($\sum_{j} Q_{ij}$). Similarly, to compute the second centrality measure, known as the “Receive score” of a node $i$, we sum all of the discounted walks that end at $i$ ($\sum_{j} Q_{ji}$).
3 The model
We use a simple yet versatile stochastic model to generate temporal networks. The parameters of the model can be manipulated to create synthetic data with properties similar to a wide range of temporal networks including those observed in many real interactive systems. Let there be $N$ nodes. The model proceeds over a series of discrete timesteps by the following rule:
At time $t$, with probability $p_{ij}(t)$, a directed edge exists from node $i$ to node $j$.
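The rule above can be sketched as a short generator. This is only an illustration under stated assumptions: the function name `generate_temporal_edgelist`, the callable `edge_prob(i, j, t)` and the exclusion of self-loops are choices made for this sketch and are not taken from the original model description.

```python
import random

def generate_temporal_edgelist(n_nodes, n_steps, edge_prob, seed=None):
    """Return a list of (source, target, time) triples.

    edge_prob(i, j, t) is a hypothetical callable giving the probability
    that a directed edge i -> j exists at time step t.
    """
    rng = random.Random(seed)
    edges = []
    for t in range(n_steps):
        for i in range(n_nodes):
            for j in range(n_nodes):
                # Assumption of this sketch: self-loops are not generated.
                if i != j and rng.random() < edge_prob(i, j, t):
                    edges.append((i, j, t))
    return edges

# Example: a sparse, time-independent network of 5 nodes over 1000 steps.
sparse_edges = generate_temporal_edgelist(5, 1000, lambda i, j, t: 0.001, seed=1)
```

Keeping the probabilities small, as in the example call, makes it unlikely that a single timestep contains more than one edge.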
The adjacency matrix at time $t$, $A(t)$, will have a $1$ in location $(i,j)$ with probability $p_{ij}(t)$ and be $0$ otherwise (it might often be the case that $A(t)$ will be a matrix of zeros). The dynamic communicability matrix, as introduced in [21], over the sample (starting at $t_0$ and ending at $t_1$) is given in general by
(2) $$Q = \prod_{t=t_0}^{t_1} \left(I - \alpha A(t)\right)^{-1}$$
where $I$ denotes the identity matrix. But, as suggested in [28], we do not want to count paths that take multiple moves in a single timestep, so we will instead look at the variant definition
(3) $$Q = \prod_{t=t_0}^{t_1} \left(I + \alpha A(t)\right)$$
Eqns. (2) and (3) are equivalent when $A(t)^2 = 0$ for all $t$ (as this is the only way $(I - \alpha A(t))^{-1} = I + \alpha A(t)$ can be true), i.e. provided that no walks of length 2 ever exist within a single time slice. Under these conditions our analysis also applies to the version of dynamic communicability defined in [26], where $Q = \prod_{t} \exp(\alpha A(t))$, since $\exp(\alpha A(t)) = I + \alpha A(t)$ when $A(t)^2 = 0$.
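As a minimal sketch of the variant definition, the following builds the ordered product of one factor per time slice for a tiny, hypothetical edge list and reads off the scores as row and column sums. The helper names (`identity`, `mat_mul`, `slice_factor`, `communicability`) and the example data are assumptions made for illustration. The last lines check, for one slice containing no walk of length 2, that the factor used in Eq.(3) is the inverse required by Eq.(2).

```python
def identity(n):
    return [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

def slice_factor(n, edges, t, alpha, sign=1.0):
    """Return I + sign * alpha * A(t) for the distinct edges present at time t."""
    m = identity(n)
    for i, j in {(i, j) for i, j, s in edges if s == t}:
        m[i][j] += sign * alpha
    return m

def communicability(n, edges, alpha):
    """Variant definition: product of (I + alpha * A(t)) in order of increasing time."""
    q = identity(n)
    for t in sorted({s for _, _, s in edges}):
        q = mat_mul(q, slice_factor(n, edges, t, alpha))
    return q

edges = [(0, 1, 0), (1, 2, 1)]   # hypothetical (source, target, time) triples
alpha = 0.5
Q = communicability(3, edges, alpha)
broadcast = [sum(row) for row in Q]         # row sums
receive = [sum(col) for col in zip(*Q)]     # column sums

# For a slice with A(t)^2 = 0, (I - alpha A)(I + alpha A) equals the identity.
check = mat_mul(slice_factor(3, edges, 0, alpha, sign=-1.0),
                slice_factor(3, edges, 0, alpha))
```

In this example the walk 0 to 1 to 2 respects time, so it contributes alpha squared to the entry for the pair (0, 2); swapping the two timestamps would remove that contribution.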
The time-dependent matrix
(4) $$P(t) = \left(p_{ij}(t)\right)_{i,j=1}^{N}$$
to a large extent, describes the entire structure of the network and its evolution over time. Our approach to exploring dynamic communicability of networks generated by this model involves considering the various forms that $P(t)$ can take; then examining the expectation of $Q$ as we iteratively increase the number of terms on the right hand side of Eq.(3). The following analysis requires that the values contained in $P(t)$ are small enough that the probability of generating a matrix $A(t)$ containing a path of length $2$ or more is negligible. The model is therefore more applicable to temporally highly resolved datasets as opposed to those in which a relatively small number of temporal instances are recorded.
3.1 Receive score
If we think about constructing iteratively i.e. starting at time with , then multiplying on the right by , then again by the next term, then the next etc., then
(5) 
where indexes the number of times the iteration has been performed. After iterations we have the desired . The effect of one iteration can be seen on a example:
(6) 
In general, provided for all , if the th entry of is then the th column is multiplied by and added to the th column. Since the receive score after iterations is equal to the (row) vector of column sums of ,
(7) 
we can describe its evolution as increases as follows: at each iteration choose and with probability and update by setting
(8) 
In matrix notation this is
(9) 
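The column-sum update described in this section can be sketched directly on a temporal edge list, without forming any matrices. This is an illustrative sketch: the function name `receive_scores` and the (source, target, time) triple layout are assumptions, and scores are initialised to 1 to match the column sums of an identity matrix. A snapshot of the scores is taken before each time slice so that no walk uses two edges within the same slice.

```python
from collections import defaultdict

def receive_scores(n_nodes, edges, alpha):
    """Iterate the update r_j <- r_j + alpha * r_i over edges i -> j, slice by slice."""
    by_time = defaultdict(list)
    for i, j, t in edges:
        by_time[t].append((i, j))
    r = [1.0] * n_nodes
    for t in sorted(by_time):
        previous = r[:]   # scores as they stood before this time slice
        for i, j in by_time[t]:
            r[j] += alpha * previous[i]
    return r
```

For example, `receive_scores(3, [(0, 1, 0), (1, 2, 1)], 0.5)` credits node 2 with the two-step walk from node 0, whereas giving both edges the same timestamp does not.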
3.2 Expectation of receive score
The receive score is dependent on . To examine this dependence, we focus on the expectation of , denoted , which is computed by taking the mean over many networks generated by the described model for some given . For analytical considerations we assume that all of the are well approximated by their mean. A similar approach is found in [29]. The growth of is then described by
(10) 
The right-hand side of this equation sums over all possible changes that can happen to the receive score and their associated probabilities. This is equivalent to replacing the adjacency matrix $A(t)$ in Eq.(9) with its expectation, which happens to be the probability matrix $P(t)$. We have
(11) 
For large timescales, we can say that , giving
(12) 
An almost identical derivation can be performed to find a similar expression for . In this case, instead of starting the iterative process at and multiplying on the right, as in Eq.(5), we start at time with and iterate by multiplying on the left, i.e . Following similar steps we arrive at
(13) 
where is a column vector of the expectation of the broadcast scores. Our theoretical results stem from these two equations, solutions can be found for various forms of , here we mention a few simple cases.
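The discrete-time step behind Eq.(11) can be checked on a toy case by brute force. The sketch below is an illustration under stated assumptions: the two-node, two-step probabilities in `PROBS`, the value of `ALPHA` and all function names are hypothetical. It enumerates every possible realisation of the model, weights each by its probability, and compares the resulting average receive score with the result of replacing each adjacency matrix by its expectation. In this toy setting the slices are drawn independently, so the two are expected to agree up to rounding.

```python
from itertools import product

ALPHA = 0.5
# Hypothetical edge probabilities for each time step, keyed by (source, target).
PROBS = [{(0, 1): 0.3, (1, 0): 0.2}, {(0, 1): 0.5, (1, 0): 0.1}]

def step(r, present_edges, alpha):
    """One update r <- r (I + alpha A) for the edges present in a slice."""
    new_r = r[:]
    for i, j in present_edges:
        new_r[j] += alpha * r[i]
    return new_r

def exact_expected_receive(probs, alpha, n_nodes=2):
    """Average the receive score over every possible realisation."""
    slots = [(t, pair) for t, p in enumerate(probs) for pair in sorted(p)]
    expected = [0.0] * n_nodes
    for outcome in product([0, 1], repeat=len(slots)):
        weight = 1.0
        present = [[] for _ in probs]
        for (t, pair), on in zip(slots, outcome):
            p = probs[t][pair]
            weight *= p if on else 1.0 - p
            if on:
                present[t].append(pair)
        r = [1.0] * n_nodes
        for edges_at_t in present:
            r = step(r, edges_at_t, alpha)
        for k in range(n_nodes):
            expected[k] += weight * r[k]
    return expected

def mean_field_receive(probs, alpha, n_nodes=2):
    """Replace A(t) by its expectation: r <- r (I + alpha P(t))."""
    r = [1.0] * n_nodes
    for p in probs:
        new_r = r[:]
        for (i, j), pij in p.items():
            new_r[j] += alpha * pij * r[i]
        r = new_r
    return r

exact = exact_expected_receive(PROBS, ALPHA)
approx = mean_field_receive(PROBS, ALPHA)
```

Exhaustive enumeration grows exponentially with the number of possible edges and steps, so it is only useful as a sanity check on very small cases.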
3.3 Time-independent matrix
Equivalence to shuffling the time column
Consider a temporal edgelist where the “time” column has been shuffled as shown in part (iii) of Fig.(a). The overall number of interaction events between each pair of nodes is unchanged, however each of these events now occurs at some random point in time. The time series of interaction events from node $i$ to node $j$ can be modeled by a Bernoulli process, i.e. at each discrete timestep there is a fixed probability that an edge from $i$ to $j$ will exist. If we have a sufficiently large amount of data then the matrix of these time-independent probabilities, which happens to be $P(t)$, can be approximated easily as we show in this section. The above result can then be used to predict the dynamic communicability metrics of the time-shuffled edgelist.
We can infer from the data by constructing a weighted adjacency matrix where is the total number of times each edge appears in the temporal edgelist. To infer a timeindependent probability that an edge exists at time (for any ) we normalize by the number of time steps in the sample:
(16) 
Since lies outside the time for which data is sampled, is a zero matrix, giving where is a row vector of length and all entries are . Substituting the matrix associated with Eq.(16) into Eq.(14), and into the equivalent result for , we arrive at the concise formulae for computing the expectation of the broadcast and receive scores of a time-shuffled network,
(17) 
and
(18) 
respectively. A very fast open-source algorithm for computing the matrix exponential of large matrices has recently been developed [30]. Applying this method gives a prediction for the outcome of averaging over a large number of temporal edgelists in which the “time” column has been shuffled. The comparison between the prediction and the actual shuffled data is shown in Fig.(2).
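A minimal sketch of this prediction, following the statement in the abstract that the scores can be replaced by the row and column sums of the matrix exponential of a weighted adjacency matrix. All names here are hypothetical, and the truncated Taylor series in `expm_taylor` is a stand-in chosen to keep the example self-contained; it is not the algorithm of [30] and is only adequate for small matrices with modest entries.

```python
def identity(n):
    return [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

def weighted_adjacency(n_nodes, edges):
    """Count how many times each directed edge appears in the edge list."""
    w = [[0.0] * n_nodes for _ in range(n_nodes)]
    for i, j, _ in edges:
        w[i][j] += 1.0
    return w

def expm_taylor(m, terms=30):
    """Truncated Taylor series for exp(m); a simple stand-in, not the method of [30]."""
    n = len(m)
    result = identity(n)
    term = identity(n)
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, m)]
        result = [[result[r][c] + term[r][c] for c in range(n)] for r in range(n)]
    return result

def time_shuffled_prediction(n_nodes, edges, alpha):
    """Predicted (broadcast, receive) scores after shuffling the time column."""
    w = weighted_adjacency(n_nodes, edges)
    e = expm_taylor([[alpha * x for x in row] for row in w])
    broadcast = [sum(row) for row in e]
    receive = [sum(col) for col in zip(*e)]
    return broadcast, receive

pred_broadcast, pred_receive = time_shuffled_prediction(3, [(0, 1, 0), (1, 2, 1)], 0.5)
```

For this two-edge example only one of the two orderings of the timestamps lets the walk from node 0 reach node 2, and the prediction accordingly assigns half of the alpha-squared contribution to that pair.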
Heterogeneous “send” and “receive” model
Consider a temporal edge list for which all three columns have been shuffled as in part (iv) of Fig.(a). While much of the relational information will be lost, the number of times each node is found in the “Source” column will be unchanged and therefore the outgoing degree of each node is retained; similarly, the incoming degree is unchanged by the shuffling of the “Target” column. This process bears much resemblance to the configuration model of [31] in which each node has a given degree but the pairwise connections are randomized. Related models, which replace the exact degree sequence with a sequence of fitness variables (giving the propensity of each node to attract edges), have been studied [32]; this happens to be a case where Eqs.(17) and (18) can be solved analytically.
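Column shuffling of this kind can be sketched as follows. The function name `shuffle_columns` and the column indexing (0 for source, 1 for target, 2 for time) are assumptions of this sketch. Each selected column is permuted independently, so the number of times a node appears as a source or as a target is preserved while the pairings are randomized.

```python
import random
from collections import Counter

def shuffle_columns(edges, columns, seed=None):
    """Independently permute the chosen columns of a (source, target, time) edge list."""
    if not edges:
        return []
    rng = random.Random(seed)
    cols = [list(c) for c in zip(*edges)]
    for c in columns:
        rng.shuffle(cols[c])
    # Note: shuffling source or target columns can produce self-loops.
    return list(zip(*cols))

original = [("a", "b", 1), ("a", "c", 2), ("b", "c", 3), ("c", "a", 4)]
shuffled = shuffle_columns(original, columns=(0, 1, 2), seed=7)
out_degree_before = Counter(s for s, _, _ in original)
out_degree_after = Counter(s for s, _, _ in shuffled)
```

Passing `columns=(2,)` shuffles only the time column, which leaves every source and target pairing intact.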
Let be the probability that node has an outgoing edge in any given timestep (we have chosen the letter as this represents the ‘sending’ of information), and let be the probability that has an incoming edge in any given timestep ( to represent the ‘receiving’ of information). With the vector notation, and , we have
(19) 
We add the condition that then the expected number of edges per timestep is (meaning that when comparing to data we can treat as the total number of interactions). Under these conditions the solution to Eq.(12) (see Appendix A.1) is
(20) 
Two main conclusions come from this result. First, the receive score of a node is proportional to its propensity to attract incoming edges (for broadcast score it is the outgoing edges, see Appendix A.1). Second, as the sample size increases the score increases exponentially.
3.4 Time-dependent matrix
A general solution to Eq.(12) for any does not exist, we instead incorporate a limited amount of temporal information by expanding the “send” and “receive” model of the previous section. Suppose we have the model from Section 3.3.2 with the modification that the “receive” vector is now a function of time, say , then Eq.(12) reduces to
(21) 
(see Appendix A.2). Eq.(21) allows us to examine special cases where the order in which messages are sent affects the receive score of each node.
Simple time-dependent example
Before we derive a result applicable to real-world data, we introduce a simple example to provide some intuition for the time-dependence. We consider the case where each node is active only once during the duration of the sample. Suppose node receives edges at time . We can write the corresponding vector using the Dirac delta function:
(22) 
The justification for this choice of is that the expected number of messages received by over some interval will be if the time interval includes . For convenience we suppose, without loss of generality, that for all . In Appendix A.3 Eq.(20) is solved with this form of to get
(23) 
This result shows that nodes which interact later in the sample will have, on average, exponentially higher receive scores. In a similar way, it can be shown that a node which acts earlier in the sample has an exponentially higher broadcast score.
Incorporating empirical data
Suppose that for each node we know the time of every received edge but do not know where the edge originated from (this corresponds to the source shuffled network). We can achieve this by choosing
(24) 
where is the set of edges for which is the target and is the time at which edge was present. More important, however, is the function which we define as the number of messages that have been received by between and , and can be expressed as
(25) 
To achieve the correct normalization (for the expectation of the total number of edges to agree with the data) we choose to be the probability that any given edge is sent from , this is inferred using
(26) 
The solution to Eq.(12), which we derive in Appendix A.4, is
(27) 
This formula predicts the average of the receive score over many networks generated by shuffling the Source column in the original data. The analytical prediction and average shuffling results are shown in Fig.(2). In our data analysis we also use an equivalent formula to predict the outcome of shuffling the target column and calculating the broadcast score. The derivation is similar to that of Eq.(27). We get
(28) 
where is the number of messages that have been sent by between and , is the time-independent probability that receives a message in any given timestep, and is the set of edges for which is the source.
4 Data
4.1 Enron
We downloaded the entire Enron email corpus that was made publicly available during an investigation by the Federal Energy Regulatory Commission into the events leading to the company's bankruptcy [33]. The data contains the mailing history of 150 Enron employees between 1999 and 2003. A folder exists for each of the named employees, each of which contains a number of subfolders, and each subfolder contains a number of text files; the text files contain the emails themselves and some metadata. The naming of the folders is not consistent across employees; most sent emails belong to a folder labelled “sent”, “sent email”, or something similar but there are also many exceptions. A consistent format was found across all the text files with the timestamp located on the first line, the “From” field appearing on the second, and the “To” field starting on the third line and often extending over several lines where emails have been sent to multiple recipients.
We crawled every text file within subfolders named “sent”, “sent_items” and “_sent_mail”, reading the specific lines which correspond to the “From” field, the “To” field and the timestamp. Within the “From” and “To” lines we found all substrings which resemble a distinct email address, i.e. those bounded on either side by blank spaces and containing the “@” symbol. From these data we constructed a temporal edge list of the form shown in Fig.(a) where the node IDs are email addresses. Multiple edges were created for emails with multiple recipients. In several cases the email addresses found in the “From” field, across the emails of an individual employee, would not always be identical. Usually this was because of the use of email aliases although on a small number of occasions this was clearly not the case. At our own discretion, we replaced the node ID of all aliases relating to an employee with a single node ID.
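The address-matching rule described above can be sketched as follows. This is not the script used for the study; the function names, the alias dictionary and the example addresses are hypothetical, and stripping trailing separators such as commas from each token is an assumption added for this sketch.

```python
def extract_addresses(field_text):
    """Return whitespace-delimited tokens that contain the '@' symbol."""
    tokens = field_text.split()
    # Assumption: separators such as commas are not part of the address.
    return [tok.strip(",;<>") for tok in tokens if "@" in tok]

def email_to_edges(timestamp, from_field, to_field, aliases=None):
    """Create one (source, target, time) edge per recipient of a single email."""
    aliases = aliases or {}
    senders = extract_addresses(from_field)
    if not senders:
        return []
    source = aliases.get(senders[0], senders[0])
    return [(source, aliases.get(target, target), timestamp)
            for target in extract_addresses(to_field)]

example_edges = email_to_edges(
    "2001-05-14 09:30:00",
    "From: al@example.com",
    "To: bob@example.com, carol@example.com",
    aliases={"al@example.com": "alice@example.com"},
)
```

The alias dictionary plays the role of the manual merging step: every address listed as a key is replaced by a single chosen node ID.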
Many of the emails were sent to addresses outside of the corporation; these were removed from our data. We also found that some employees in the dataset had very little or no activity; we therefore reduced the sample to only those who have both sent and received at least one email to or from other users within the sample. After trimming, the network has nodes and a total of temporal edges.
We also incorporated information regarding the roles of each employee according to enron.org [34]. The following abbreviations have been used for the legend in Fig.(b): EMP: employee, TRA: trader, LAW: lawyer, MAN: manager, DIR: director, VP: vice president, MD: managing director, PRE: president, CEO: chief executive, ???: unknown.
The sample of emails we have chosen to use is by no means complete; however, it is our belief that the methods used to sample this data avoid introducing any biases which would compromise the results we present.
4.2 Sociopatterns hospital ward
We downloaded the Hospital ward dynamic contact network from the Sociopatterns website (refer to [35] for details). The data was collected using proximity sensors attached to each participant. In the original data, every instance (instances are recorded every 20 seconds) in which two participants are “interacting” (i.e. within a given proximity of each other) is presented in a temporal edge list of the form shown in Fig.(a). Consequently, interactions which occur for a prolonged duration appear in the data multiple times, so we performed the following reduction: where the same pair of participants were found to be interacting on multiple consecutive timesteps, all but one of the corresponding rows in the edge list were removed, leaving only the first of such instances. For each remaining row we create two edges in the processed temporal edgelist, one in each direction between the pair of participants interacting; both edges have the same timestamp. Our analysis therefore considers transmission to occur at the first moment an interaction begins and does not depend on its duration. After processing, the network has nodes and a total of temporal edges.
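The reduction described above can be sketched as follows. The function name `first_contacts` and the (time, a, b) row layout are assumptions of this sketch; the 20-second recording interval is taken from the description of the data. A row is kept only if the same pair was not also recorded one interval earlier, and each kept row yields one directed edge in each direction.

```python
def first_contacts(rows, step=20):
    """Keep the first record of each run of consecutive contacts.

    rows: iterable of (time, a, b) undirected proximity records with a != b.
    Returns directed (source, target, time) edges, two per retained contact.
    """
    seen = {(t, frozenset((a, b))) for t, a, b in rows}
    edges = []
    for t, pair in sorted(seen, key=lambda rec: (rec[0], sorted(rec[1]))):
        if (t - step, pair) in seen:
            continue   # the same pair was already interacting one interval earlier
        a, b = sorted(pair)
        edges.append((a, b, t))
        edges.append((b, a, t))
    return edges

rows = [(0, 1, 2), (20, 1, 2), (40, 2, 1), (100, 1, 2), (20, 3, 4)]
contact_edges = first_contacts(rows)
```

In the example, the contact between participants 1 and 2 that runs from time 0 to time 40 is reduced to a single pair of edges at time 0, while their later contact at time 100 is treated as a new interaction.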
4.3 Algorithms
Much of the related literature formulates the problem of computing a dynamic communicability matrix using a series of linear algebra operations [21]. This approach utilizes the adjacency matrix for the network at each time step (see Fig.(a)) and assumes that within each time slice the hypothetical random walker can traverse edges instantaneously, i.e. without requiring that time move forward for them to perform the movement. Consequently, if there is any cycle within a single time slice (including for example an edge from $i$ to $j$ and another from $j$ to $i$) then there will be paths of infinite length, meaning that $\alpha$ must be restricted to a particular range of values to guarantee convergence [36].
In this work we remove the assumption that a walk can traverse more than one edge per time slice (as suggested in [28]). Moreover, we suggest the following recursive approach to computing the dynamic communicability metrics which avoids the need to perform any matrix operations.
Suppose we have a network with each temporal edge denoted by a triple where is the source node, is the target node and is the time. Rewriting Eq.(9) with this notation we have
(29) 
with . Then the receive score for node computed between time and is given by
(30) 
Similarly, for the broadcast score we have
(31) 
with . Then the broadcast score for node computed between time and is given by
(32) 
If we were to first compute the vector for all the nodes , then commit these values to memory, then compute for all , and continue in this fashion, then the addition operations we perform are precisely the same as those performed in the established matrix multiplication method [28]. The advantage of this implementation, however, is that the score for a single node can be computed lazily, that is, without wasting unnecessary time. (It is important, when using this method, to use memoization to avoid repeating a large number of calls to the functions and ). Computationally we can be certain that these algorithms are at least as fast as the current alternatives.
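One possible memoized formulation is sketched below. It is an assumption of this sketch, not a transcription of the recursions above: instead of stepping through every timestep, each score is expressed through the scores of neighbouring nodes at the times of the relevant edges, which gives the same totals when each (source, target, time) triple appears once. The names `make_scores`, `receive` and `broadcast` are hypothetical. Deep chains of contacts may exceed Python's default recursion limit.

```python
from collections import defaultdict
from functools import lru_cache

def make_scores(edges, alpha):
    """Return lazy, memoized receive and broadcast score functions."""
    incoming = defaultdict(list)   # target -> [(source, time)]
    outgoing = defaultdict(list)   # source -> [(target, time)]
    for i, j, t in edges:
        incoming[j].append((i, t))
        outgoing[i].append((j, t))

    @lru_cache(maxsize=None)
    def receive(j, before=float("inf")):
        # Discounted walks ending at j that only use edges strictly before `before`.
        return 1.0 + alpha * sum(receive(i, s) for i, s in incoming[j] if s < before)

    @lru_cache(maxsize=None)
    def broadcast(i, after=float("-inf")):
        # Discounted walks starting at i that only use edges strictly after `after`.
        return 1.0 + alpha * sum(broadcast(j, s) for j, s in outgoing[i] if s > after)

    return receive, broadcast

receive, broadcast = make_scores([(0, 1, 0), (1, 2, 1)], alpha=0.5)
score_of_node_2 = receive(2)   # only the values needed for node 2 are computed
```

The strict inequalities enforce the rule that a walk cannot traverse more than one edge within a single time slice.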
5 Results
5.1 Modeling
In Section 3 we derived formulae which predict the outcome of calculating the broadcast score for a large number of shuffled temporal edgelists. The amount of error in these predictions is illustrated in Fig.(2) where we see that Eq.(18) gives accurate results regarding temporal edge lists with the time column shuffled. The corresponding result, Eq.(27), appears to be less reliable; however, owing to the computational cost of calculating the receive score multiple times, we chose to test only a very small sample. Such a small sample conflicts with the assumptions of the analytical model, particularly the assumption made in Section 3.2 that the score in an individual generation of the probabilistic model is well approximated by its mean, at time , over many generations. It is likely that in a small dataset there is a high variance in the distribution of receive scores and we expect the prediction to improve as the number of interactions increases. The creation of these “shortcut” formulae allowed us to perform data analysis on two large-scale temporal edgelists which would have otherwise taken an inconvenient amount of computation.
5.2 Data analysis
Using the method described in Section 2 we calculated the broadcast score for the Enron email corpus and the receive score for the Sociopatterns hospital ward experiment. We have chosen values of $\alpha$ that produce visually interesting figures; when $\alpha$ is too small the calculation of broadcast and receive scores is dominated by the contribution from walks of length $1$ and the scores therefore become equivalent to the out-degree and in-degree respectively. Conversely, when $\alpha$ is too large, long walks dominate the scores and the edges with early timestamps determine the outcome. To anyone considering using these methods we recommend that a range of values be tested, each one potentially exposing different information about the behavior of the system. For an in-depth analysis of $\alpha$ and its interpretation see [?].
The results are presented first in Fig.(3). In Fig.(4) we compare the result of each individual with their overall activity. We note two observations from Fig.(4): one Enron employee (a director) stands out as having an unusually high broadcast score when compared to a low amount of overall activity (broadcast rank , degree rank ), and that patients in the hospital ward tend to have large receive scores considering their overall activity. The results are also presented in Appendix B.
Fig.(b) shows the expected results of performing various shufflings; we can think of the axis in these plots as a measure of how much the score of each individual depends on temporal properties, and the axis for structural properties. We see that the outlier from the Enron dataset is, remarkably, unremarkable regarding both of these measures and neither alone can explain their high broadcast score (time-shuffled rank and target-shuffled rank , both lower than the actual broadcast rank of ). However, the fact that both shuffled ranks are higher than the degree rank suggests that the individual in question is sending emails very economically, i.e. sending to the most efficient recipients, and choosing the optimal moments in time to send. From this example it appears that the contributions from both factors add to the overall broadcast score.
An alternative interpretation is that the individual in question was feeding information into the network which was consequently being disseminated in a way that inflates their broadcast score (although similar results are not found for the CEOs, whom we would expect to be influential in the same way). The individual in question was a lobbyist for the corporation; after a very brief investigation we did not determine a particular reason why they should be significantly influential.
From Fig.(b) it is apparent that shuffling the time column can cause large changes to the receive rank of a participant whereas the source-shuffling appears to be less effective. This is because the temporal activity of the participants deviates significantly from a Bernoulli process (that is assumed in the time-independent model). More specifically, nodes exist which are inactive towards the beginning of the sampling period but have a lot of activity at later timesteps. The receive score of these nodes is amplified by the exponential increase over time that is indicated by the very simple example in Section 3.4.1. Those which are active early on in the sampling period but have little or no activity at later times will have lower receive scores. When such effects dominate the outcome the effect of time-shuffling is significant.
While we do not discuss here the broadcast scores for the Hospital data, or the receive scores for the Enron email data, the results can be seen in Fig.(6), Fig.(7) and Fig.(b).
6 Discussion
As data-driven industries increasingly find value in targeting the most central, most influential, individuals, it is important to scrutinize the methods and tools that network science is promoting. The idea that there is one magic formula which can produce a meaningful result regardless of the system in question is firstly, wrong, and secondly, a counterproductive way of thinking. Here we have scrutinized the dynamic communicability metrics and found that temporal variation can have a stronger effect in some systems, like the hospital ward, than in others, like Enron. We have found efficient shortcut formulae to quantify the temporal component by randomizing the structural factors and likewise quantify the structural component by randomizing the temporal factors. Those who have data and wish to analyze dynamic communicability should use these methods to add more dimensions, and more depth, to their analysis.
When we look at the simple example of Fig.(a), we can compute the broadcast scores and find that node is ranked number one. We can then ask why is the most influential broadcaster and find that it is not because it was the most active ( was in fact the most active), but because of a complex interplay of temporal and structural factors; was the first to communicate, and importantly, one of those early edges was received by who was subsequently the most active node. Looking at large datasets it is tedious to try to deconstruct every sequence of contacts that caused each individual to achieve its score. Instead, we have introduced meaningful statistics, i.e. the results of shuffling, that provide insight into the interplay of temporal and structural factors.
Several specific phenomena are commonly found in social systems whose effects are nullified by the shuffling process. The structure (or topology) of complex networks has been extensively studied and many elements have repeatedly been found across different systems [38]. Degree heterogeneity is one such topological feature which is not nullified by shuffling. On the other hand, features like community structure, assortativity and clustering are likely to play a significant role in determining the communicability in most applications of this work [39, 40, 41]. Similarly, the non-shuffled data is likely to exhibit certain temporal features. Recent studies of communication data, similar to the Enron dataset, show that activity generally occurs in bursts [42]. Others focus on the effect of circadian cycles which are likely to occur in the hospital ward data [43].
Clustering and burstiness both increase the number of walks which revisit nodes. For most contagion processes these walks would not be permissible: many diseases, for example, can only be contracted once, and similarly a piece of information can only be attained once (this is possibly the reason why bursty networks have been shown to slow the spread of information compared to their temporally shuffled equivalent [24]). This remains a fundamental problem of the dynamic communicability metric which should not be overlooked.
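To see how revisiting walks enter the score, the following sketch computes dynamic communicability for a two-node toy network. It assumes the usual product-of-resolvents form, in which the matrix Q is the product of (I - a A_k)^(-1) over chronologically ordered snapshots A_k, with broadcast and receive scores given by the row and column sums of Q. The parameter name `a`, the helper functions and the toy data are illustrative assumptions.

```python
def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inverse(M):
    """Invert a small square matrix by Gauss-Jordan elimination."""
    n = len(M)
    aug = [list(row) + unit for row, unit in zip(M, identity(n))]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular for this value of a")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def communicability(n, snapshots, a):
    """Q = product over time of (I - a*A_k)^(-1), earliest snapshot first."""
    Q = identity(n)
    for edges in snapshots:
        M = identity(n)
        for source, target in edges:
            M[source][target] -= a  # builds I - a*A_k entry by entry
        Q = matmul(Q, inverse(M))
    return Q


# Node 0 contacts node 1, then node 1 contacts node 0 (toy data).
a = 0.5
Q = communicability(2, [[(0, 1)], [(1, 0)]], a)
broadcast = [sum(row) for row in Q]
receive = [sum(Q[i][j] for i in range(2)) for j in range(2)]
```

In this toy case Q[0][0] equals 1 + a^2 rather than 1. The extra a^2 is the walk from node 0 to node 1 and back to node 0, which returns to its starting node and would be impossible for a contagion that can only be caught once.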
Another issue that ought to be considered when using the dynamic communicability metrics is the effect of a bounded sampling window. Take, for example, the simple case of Fig.(a). Here has the highest broadcast score because it is the first node to create outgoing edges. Had we observed the system just one timestep earlier we might have found one or more edges from to , thus making the highest ranked broadcaster above . This is a general issue: our analytical results tell us that earlier interactions contribute exponentially more than later ones, so the first node involved in the first recorded interaction will, by chance, receive an unduly high broadcast score. In the case of the receive score, interactions that occur late in the sample inflate the score of the involved nodes. The extension of dynamic communicability presented in [28], which assumes that infectiousness decays in the time between interactions, may mitigate these problems to some extent. We conclude this paper by suggesting two possible alternative solutions:
6.1 Control for temporal variation
Eq.(27) gives the expectation of the receive score based on temporal variation. It can therefore serve as a control against which to compare the actual score. Further, we suggest that a normalized version of the receive score would be a more appropriate measure to compare individuals in the same network. The normalized version is the ratio of the actual score, computed using Eq.(30), to its expectation, computed using Eq.(27).
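The normalization described above amounts to an element-wise ratio. The sketch below assumes the actual and expected receive scores have already been computed elsewhere (it does not implement Eq.(30) or Eq.(27)); the function name `normalized_scores` and the numbers are made up for illustration.

```python
def normalized_scores(actual, expected):
    """Element-wise ratio of observed scores to their expectations."""
    if len(actual) != len(expected):
        raise ValueError("score lists must have the same length")
    if any(e <= 0 for e in expected):
        raise ValueError("expected scores must be positive")
    return [a / e for a, e in zip(actual, expected)]


# Made-up numbers for three nodes (not taken from either dataset).
actual = [12.0, 3.0, 8.0]
expected = [6.0, 3.0, 16.0]
normalized = normalized_scores(actual, expected)
```

A ratio above 1 marks a node whose actual score exceeds what temporal variation alone would predict, and a ratio below 1 marks the opposite.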
6.2 Remove temporal variation
Alternatively, we can ignore temporal variation altogether; in many circumstances this is sensible since the temporal variation over the duration of the sample is not usually expected to be the same in the future (unless perhaps it is driven by a cyclic process). Without knowledge of when each future interaction will occur, the Bernoulli process used in the time-independent model is a suitable choice. In such a case, the past data provides an estimate of how active each node will be, but the timing of their interactions remains random. The matrix exponential in Eqs.(17) and (18) can be computed very efficiently to give these approximations to the receive score and broadcast score. Incidentally, the matrix exponential has previously been proposed as a centrality measure [25, 44].
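As a rough sketch of this time-free approximation, the code below builds a weighted adjacency matrix from an edgelist and takes the row and column sums of its matrix exponential. The weighting used here (each observed edge contributes a hypothetical parameter `a`) is an assumption and may differ from the exact scaling in Eqs.(17) and (18). The truncated Taylor series is only meant for small toy matrices; a scaling-and-squaring routine such as `scipy.linalg.expm` would be the natural choice in practice.

```python
def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def weighted_adjacency(edges, n, a):
    """Sum a weight `a` for every (source, target, time) edge; time is ignored."""
    W = [[0.0] * n for _ in range(n)]
    for source, target, _time in edges:
        W[source][target] += a
    return W


def expm_taylor(M, terms=30):
    """Truncated Taylor series for exp(M); adequate only for small norms."""
    n = len(M)
    result = identity(n)
    term = identity(n)
    for k in range(1, terms + 1):
        # term holds M^k / k! after this update.
        term = [[x / k for x in row] for row in matmul(term, M)]
        result = [[r + t for r, t in zip(r_row, t_row)]
                  for r_row, t_row in zip(result, term)]
    return result


# Toy edgelist (illustrative only).
edges = [(0, 1, 0), (1, 2, 1), (0, 1, 2)]
n = 3
E = expm_taylor(weighted_adjacency(edges, n, a=0.5))
broadcast_approx = [sum(row) for row in E]                           # row sums
receive_approx = [sum(E[i][j] for i in range(n)) for j in range(n)]  # column sums
```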
Acknowledgements
We thank Georgios Giasemidis for helpful discussions at early stages of the project. We are grateful to Shweta Bansal for helpful comments regarding the structure and presentation of the manuscript and to Isabel Chen for feedback in the late stages. E.R.C. was funded in part by RCUK Digital Economy programme via EPSRC Grant EP/G065802/1 ‘The Horizon Hub’ and in part by NSF Grant No. as part of the joint NSF-NIH-USDA Ecology and Evolution of Infectious Diseases program.
Appendix A Modeling
A.1 Heterogeneous “send” and “receive” model
The Model:
In any given timestep, the probability that has an outgoing edge is , the probability that it has an incoming edge is .
Making no further assumptions about who communicates with whom, letting and both be column vectors we have the general stochastic model with
(33) 
There are at least two ways to find the expectation of broadcast and receive scores for this model. It is possible to write down an expression for the which can then be substituted into Eq.(15). An alternative method is to solve Eq.(12) directly. First we express Eq.(12) in terms of our new variables:
(34) 
Multiplying both sides on the right by gives
(35) 
which is a differential equation describing the time evolution of , a scalar variable. This has the solution
(36) 
Substituting the result back into Eq.(34) we get
(37) 
which has the solution
(38) 
In a similar way one can show that the expectation of the broadcast score is
(39) 
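The reduction to a scalar equation used in this section can be illustrated generically. The following is a sketch under stated assumptions, not a reproduction of Eqs.(34) to (38): it supposes a matrix differential equation whose coefficient matrix is the rank-one product of two hypothetical column vectors u and v, with s = v^T u nonzero and the identity as initial condition.

```latex
\begin{align}
\frac{dX}{dt} &= X\,u v^{\top}, \qquad X(0) = I, \\
\frac{d}{dt}\,(X u) &= (v^{\top} u)\,(X u) = s\,(X u)
  \;\Longrightarrow\; X(t)\,u = e^{s t}\,u, \\
\frac{dX}{dt} &= e^{s t}\,u v^{\top}
  \;\Longrightarrow\; X(t) = I + \frac{e^{s t} - 1}{s}\,u v^{\top}.
\end{align}
```

Multiplying on the right by u collapses the matrix equation to one governed by the scalar s, and substituting the result back leaves a right-hand side that depends on t only through a scalar exponential, so it can be integrated directly.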
A.2 Time-dependent matrix
The Model:
At time , the probability that has an outgoing edge is , the probability that it has an incoming edge is
Eq.(12) now becomes
(40) 
Multiplying both sides on the right by we get
(41) 
This equation now only includes scalar functions of so we can solve to get
(42) 
Substituting this back into Eq.(40) we have
(43) 
Since .
A.3 Simple time-dependent example
The model:
At time person is on the receiving end of edges. As before, the number of outgoing edges is determined by a time-independent probability .
Clearly, after iterations the process will end so we use and as the initial and final conditions respectively. To find the broadcast score of a node we solve Eq.(43) with
(44) 
where is a scalar and is the Dirac delta. The justification for this version of is that the expected number of messages sent by over some time interval will be if the time interval includes . Without loss of generality we can say meaning that node sends first, then node and so on. First we focus on expressing in a simpler form. Since
(45) 
(This result derives from the fact that the integral of the Dirac delta between and is the Heaviside step function .) we have
(46) 
Substituting this into Eq.(43) then integrating over the whole sample gives
(47) 
The integral is solved by the translation property of the Dirac delta and we have
(48) 
A.4 Incorporating empirical data
The model:
Let be the set of edges for which is the target node, and be the time at which edge was present. As before, is the time-independent probability for to be the source of an edge.
We achieve this by choosing
(49) 
We can choose the set and the corresponding in a way that recreates exactly what is observed in the target and time columns of an empirical temporal edgelist. We introduce , the number of messages sent by between time and time ; this is expressed as
(50) 
giving
(51) 
and therefore Eq.(43) can be expressed
(52) 
Integrating over the entire duration of the sample gives
(53) 
Finally, using the translation property of the Dirac delta function we have
(54) 
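The cumulative count used in this section is the integral of a sum of Dirac deltas, which is a sum of Heaviside steps: it simply counts the recorded events up to a given time. A minimal sketch, assuming the event times attributed to one node are available as a plain list and that an event occurring exactly at the query time is counted; the function name `cumulative_count` is hypothetical.

```python
from bisect import bisect_right


def cumulative_count(event_times, t):
    """Number of events with time <= t, i.e. the sum of steps H(t - t_e)."""
    return bisect_right(sorted(event_times), t)


# Toy event times for one node (illustrative only).
times = [3, 1, 4, 1, 5]
counts = [cumulative_count(times, t) for t in (0, 1, 3.5, 10)]
```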
Appendix B Rankings
B.1 SocioPatterns hospital ward receive rank
Rank  Unshuffled  Time-shuffled  Source-shuffled  Time- and source-shuffled
1  1115 (NUR)  1115 (NUR)  1115 (NUR)  1115 (NUR) 
2  1210 (NUR)  1210 (NUR)  1210 (NUR)  1210 (NUR) 
3  1190 (NUR)  1207 (NUR)  1295 (NUR)  1295 (NUR) 
4  1295 (NUR)  1295 (NUR)  1157 (MED)  1207 (NUR) 
5  1109 (NUR)  1109 (NUR)  1190 (NUR)  1157 (MED) 
6  1629 (NUR)  1164 (NUR)  1629 (NUR)  1164 (NUR) 
7  1149 (NUR)  1193 (NUR)  1149 (NUR)  1193 (NUR) 
8  1157 (MED)  1157 (MED)  1109 (NUR)  1144 (MED) 
9  1205 (NUR)  1658 (ADM)  1205 (NUR)  1109 (NUR) 
10  1658 (ADM)  1190 (NUR)  1098 (ADM)  1149 (NUR) 
11  1193 (NUR)  1098 (ADM)  1144 (MED)  1221 (MED) 
12  1196 (NUR)  1144 (MED)  1193 (NUR)  1098 (ADM) 
13  1098 (ADM)  1114 (NUR)  1196 (NUR)  1159 (MED) 
14  1144 (MED)  1149 (NUR)  1181 (NUR)  1196 (NUR) 
15  1181 (NUR)  1181 (NUR)  1221 (MED)  1181 (NUR) 
16  1625 (NUR)  1221 (MED)  1658 (ADM)  1190 (NUR) 
17  1164 (NUR)  1159 (MED)  1164 (NUR)  1260 (MED) 
18  1221 (MED)  1625 (NUR)  1130 (MED)  1658 (ADM) 
19  1130 (MED)  1365 (PAT)  1625 (NUR)  1205 (NUR) 
20  1365 (PAT)  1196 (NUR)  1260 (MED)  1114 (NUR) 
21  1383 (PAT)  1205 (NUR)  1159 (MED)  1191 (MED) 
22  1114 (NUR)  1245 (NUR)  1114 (NUR)  1625 (NUR) 
23  1260 (MED)  1260 (MED)  1365 (PAT)  1148 (MED) 
24  1547 (PAT)  1191 (MED)  1207 (NUR)  1365 (PAT) 
25  1159 (MED)  1378 (PAT)  1148 (MED)  1245 (NUR) 
26  1702 (PAT)  1629 (NUR)  1660 (MED)  1130 (MED) 
27  1207 (NUR)  1148 (MED)  1383 (PAT)  1202 (NUR) 
28  1378 (PAT)  1179 (ADM)  1671 (ADM)  1179 (ADM) 
29  1660 (MED)  1130 (MED)  1378 (PAT)  1629 (NUR) 
30  1671 (ADM)  1383 (PAT)  1202 (NUR)  1378 (PAT) 
31  1148 (MED)  1352 (PAT)  1352 (PAT)  1352 (PAT) 
32  1401 (PAT)  1202 (NUR)  1702 (PAT)  1383 (PAT) 
33  1352 (PAT)  1391 (PAT)  1401 (PAT)  1391 (PAT) 
34  1307 (PAT)  1702 (PAT)  1142 (NUR)  1105 (NUR) 
35  1362 (PAT)  1362 (PAT)  1547 (PAT)  1108 (NUR) 
36  1391 (PAT)  1307 (PAT)  1391 (PAT)  1362 (PAT) 
37  1232 (ADM)  1374 (PAT)  1485 (NUR)  1142 (NUR) 
38  1469 (PAT)  1393 (PAT)  1307 (PAT)  1660 (MED) 
39  1202 (NUR)  1105 (NUR)  1469 (PAT)  1485 (NUR) 
40  1142 (NUR)  1401 (PAT)  1232 (ADM)  1307 (PAT) 
41  1245 (NUR)  1363 (PAT)  1362 (PAT)  1702 (PAT) 
42  1179 (ADM)  1660 (MED)  1245 (NUR)  1401 (PAT) 
43  1108 (NUR)  1395 (PAT)  1179 (ADM)  1168 (MED) 
44  1701 (PAT)  1142 (NUR)  1108 (NUR)  1100 (NUR) 
45  1460 (PAT)  1168 (MED)  1460 (PAT)  1393 (PAT) 
46  1168 (MED)  1108 (NUR)  1261 (NUR)  1374 (PAT) 
47  1784 (PAT)  1547 (PAT)  1613 (NUR)  1613 (NUR) 
48  1261 (NUR)  1320 (PAT)  1701 (PAT)  1363 (PAT) 
49  1152 (MED)  1100 (NUR)  1168 (MED)  1395 (PAT) 
50  1209 (ADM)  1671 (ADM)  1191 (MED)  1246 (NUR) 
51  1485 (NUR)  1327 (PAT)  1769 (PAT)  1261 (NUR) 
52  1191 (MED)  1701 (PAT)  1784 (PAT)  1671 (ADM) 
53  1769 (PAT)  1232 (ADM)  1152 (MED)  1327 (PAT) 
54  1416 (PAT)  1469 (PAT)  1209 (ADM)  1701 (PAT) 
55  1100 (NUR)  1385 (PAT)  1416 (PAT)  1547 (PAT) 
56  1374 (PAT)  1209 (ADM)  1100 (NUR)  1385 (PAT) 
57  1105 (NUR)  1399 (PAT)  1385 (PAT)  1232 (ADM) 
58  1385 (PAT)  1460 (PAT)  1105 (NUR)  1460 (PAT) 
59  1395 (PAT)  1152 (MED)  1363 (PAT)  1469 (PAT) 
60  1393 (PAT)  1116 (NUR)  1374 (PAT)  1209 (ADM) 
61  1363 (PAT)  1261 (NUR)  1395 (PAT)  1152 (MED) 
62  1613 (NUR)  1769 (PAT)  1393 (PAT)  1320 (PAT) 
63  1535 (ADM)  1377 (PAT)  1327 (PAT)  1238 (NUR) 
64  1327 (PAT)  1485 (NUR)  1320 (PAT)  1769 (PAT) 
65  1320 (PAT)  1323 (PAT)  1373 (PAT)  1116 (NUR) 
66  1373 (PAT)  1416 (PAT)  1535 (ADM)  1416 (PAT) 
67  1525 (ADM)  1613 (NUR)  1525 (ADM)  1377 (PAT) 
68  1246 (NUR)  1305 (PAT)  1246 (NUR)  1399 (PAT) 
69  1238 (NUR)  1246 (NUR)  1238 (NUR)  1305 (PAT) 
70  1116 (NUR)  1784 (PAT)  1377 (PAT)  1784 (PAT) 
71  1399 (PAT)  1373 (PAT)  1116 (NUR)  1323 (PAT) 
72  1377 (PAT)  1238 (NUR)  1399 (PAT)  1373 (PAT) 
73  1305 (PAT)  1535 (ADM)  1305 (PAT)  1535 (ADM) 
74  1323 (PAT)  1332 (PAT)  1323 (PAT)  1332 (PAT) 
75  1332 (PAT)  1525 (ADM)  1332 (PAT)  1525 (ADM) 
B.2 Enron email broadcast rank
Rank  Unshuffled  Time-shuffled  Target-shuffled  Time- and target-shuffled
1  tana.jones (???)  tana.jones (???)  tana.jones (???)  tana.jones (???) 
2  mark.taylor (EMP)  sara.shackleton (???)  sara.shackleton (???)  jeff.dasovich (EMP) 
3  sara.shackleton (???)  mark.taylor (EMP)  jeff.dasovich (EMP)  sara.shackleton (???) 
4  carol.clair (LAW)  carol.clair (LAW)  mark.taylor (EMP)  bill.williams (???) 
5  jeff.dasovich (EMP)  marie.heard (???)  chris.germany (EMP)  mike.grigsby (MAN) 
6  eric.bass (TRA)  jeff.dasovich (EMP)  eric.bass (TRA)  chris.germany (EMP) 
7  steven.kean (VP)  mark.haedicke (MD)  carol.clair (LAW)  mark.taylor (EMP) 
8  mark.haedicke (MD)  d..steffes (VP)  susan.scott (???)  eric.bass (TRA) 
9  elizabeth.sager (EMP)  elizabeth.sager (EMP)  scott.neal (VP)  john.arnold (VP) 
10  mary.hain (LAW)  eric.bass (TRA)  drew.fossum (VP)  scott.neal (VP) 
11  richard.sanders (VP)  steven.kean (VP)  mike.grigsby (MAN)  phillip.love (???) 
12  phillip.allen (???)  louise.kitchen (PRE)  david.delainey (CEO)  phillip.allen (???) 
13  susan.scott (???)  richard.sanders (VP)  phillip.allen (???)  susan.scott (???) 
14  bill.williams (???)  bill.williams (???)  sally.beck (EMP)  debra.perlingiere (???) 
15  chris.germany (EMP)  mike.grigsby (MAN)  debra.perlingiere (???)  kimberly.watson (???) 
16  mike.grigsby (MAN)  mary.hain (LAW)  john.arnold (VP)  steven.kean (VP) 
17  sally.beck (EMP)  kim.ward (???)  bill.williams (???)  louise.kitchen (PRE) 
18  drew.fossum (VP)  phillip.love (???)  elizabeth.sager (EMP)  sally.beck (EMP) 
19  david.delainey (CEO)  chris.germany (EMP)  richard.sanders (VP)  david.delainey (CEO) 
20  matthew.lenhart (EMP)  gerald.nemec (???)  gerald.nemec (???)  carol.clair (LAW) 
21  gerald.nemec (???)  phillip.allen (???)  mark.haedicke (MD)  mary.hain (LAW) 
22  phillip.love (???)  matthew.lenhart (EMP)  matthew.lenhart (EMP)  drew.fossum (VP) 
23  scott.neal (VP)  kay.mann (EMP)  phillip.love (???)  d..steffes (VP) 
24  d..steffes (VP)  sally.beck (EMP)  steven.kean (VP)  gerald.nemec (???) 
25  kay.mann (EMP)  john.arnold (VP)  mary.hain (LAW)  matthew.lenhart (EMP) 
26  debra.perlingiere (???)  david.delainey (CEO)  darron.giron (EMP)  darron.giron (EMP) 
27  john.arnold (VP)  susan.scott (???)  mike.mcconnell (???)  john.lavorato (CEO) 
28  darron.giron (EMP)  debra.perlingiere (???)  kay.mann (EMP)  kay.mann (EMP) 
29  jane.tholt (VP)  scott.neal (VP)  kate.symes (EMP)  richard.sanders (VP) 
30  mike.mcconnell (???)  drew.fossum (VP)  john.lavorato (CEO)  marie.heard (???) 
31  john.lavorato (CEO)  darron.giron (EMP)  dan.hyvl (EMP)  kate.symes (EMP) 
32  kimberly.watson (???)  barry.tycholiz (VP)  jane.tholt (VP)  elizabeth.sager (EMP) 
33  lynn.blair (???)  kimberly.watson (???)  kimberly.watson (???)  lynn.blair (???) 
34  louise.kitchen (PRE)  john.lavorato (CEO)  d..steffes (VP)  mark.haedicke (MD) 
35  dan.hyvl (EMP)  jane.tholt (VP)  jeffrey.shankman (PRE)  errol.mclaughlin (EMP) 
36  kim.ward (???)  dan.hyvl (EMP)  errol.mclaughlin (EMP)  mike.mcconnell (???) 
37  errol.mclaughlin (EMP)  mike.mcconnell (???)  louise.kitchen (PRE)  kevin.presto (VP) 
38  marie.heard (???)  kevin.presto (VP)  hunter.shively (VP)  kim.ward (???) 
39  jeffrey.shankman (PRE)  errol.mclaughlin (EMP)  marie.heard (???)  dan.hyvl (EMP) 
40  kate.symes (EMP)  lynn.blair (???)  lynn.blair (???)  michelle.lokay (EMP) 
41  barry.tycholiz (VP)  michelle.cash (???)  michelle.lokay (EMP)  rod.hayslett (VP) 
42  kevin.presto (VP)  kam.keiser (EMP)  kim.ward (???)  jane.tholt (VP) 
43  tracy.geaccone (EMP)  rod.hayslett (VP)  rob.gay (???)  tracy.geaccone (EMP) 
44  hunter.shively (VP)  stacy.dickson (EMP)  kevin.presto (VP)  barry.tycholiz (VP) 
45  darrell.schoolcraft (???)  michelle.lokay (EMP)  chris.dorland (EMP)  mark.whitt (???) 
46  michelle.lokay (EMP)  kenneth.lay (CEO)  fletcher.sturm (VP)  john.forney (MAN) 
47  rod.hayslett (VP)  tracy.geaccone (EMP)  robin.rodrigue (???)  chris.dorland (EMP) 
48  rob.gay (???)  jeffrey.shankman (PRE)  tracy.geaccone (EMP)  jeffrey.shankman (PRE) 
49  robin.rodrigue (???)  fletcher.sturm (VP)  rod.hayslett (VP)  darrell.schoolcraft (???) 
50  robert.badeer (DIR)  kate.symes (EMP)  andrea.ring (???)  kam.keiser (EMP) 
51  tori.kuykendall (TRA)  susan.bailey (???)  barry.tycholiz (VP)  hunter.shively (VP) 
52  greg.whalley (VP)  mark.whitt (???)  greg.whalley (VP)  kenneth.lay (CEO) 
53  kenneth.lay (CEO)  tori.kuykendall (TRA)  tori.kuykendall (TRA)  bill.rapp (???) 
54  fletcher.sturm (VP)  hunter.shively (VP)  john.forney (MAN)  lindy.donoho (EMP) 
55  chris.dorland (EMP)  martin.cuilla (MAN)  michelle.cash (???)  fletcher.sturm (VP) 
56  peter.keavey (EMP)  james.derrick (LAW)  peter.keavey (EMP)  shelley.corman (VP) 
57  bill.rapp (???)  jeffrey.hodge (MD)  mark.guzman (TRA)  martin.cuilla (MAN) 
58  michelle.cash (???)  jeff.skilling (CEO)  darrell.schoolcraft (???)  tori.kuykendall (TRA) 
59  daren.farmer (MAN)  andy.zipper (VP)  kenneth.lay (CEO)  kevin.hyatt (DIR) 
60  lindy.donoho (EMP)  darrell.schoolcraft (???)  larry.may (DIR)  andrea.ring (???) 
61  mark.whitt (???)  chris.dorland (EMP)  daren.farmer (MAN)  rob.gay (???) 
62  larry.may (DIR)  bill.rapp (???)  martin.cuilla (MAN)  andy.zipper (VP) 
63  benjamin.rogers (???)  greg.whalley (VP)  mark.whitt (???)  greg.whalley (VP) 
64  john.forney (MAN)  dutch.quigley (???)  jeff.skilling (CEO)  dutch.quigley (???) 
65  martin.cuilla (MAN)  lindy.donoho (EMP)  rick.buy (MAN)  jeff.skilling (CEO) 
66  andy.zipper (VP)  shelley.corman (VP)  james.derrick (LAW)  rick.buy (MAN) 
67  shelley.corman (VP)  patrice.mims (???)  patrice.mims (???)  t..lucci (EMP) 
68  jeff.skilling (CEO)  monique.sanchez (???)  dutch.quigley (???)  robin.rodrigue (???) 
69  monique.sanchez (???)  peter.keavey (EMP)  shelley.corman (VP)  james.derrick (LAW) 
70  kam.keiser (EMP)  rick.buy (MAN)  benjamin.rogers (???)  jonathan.mckay (DIR) 
71  dutch.quigley (???)  rob.gay (???)  lindy.donoho (EMP)  jim.schwieger (TRA) 
72  mark.guzman (TRA)  robin.rodrigue (???)  bill.rapp (???)  larry.may (DIR) 
73  rick.buy (MAN)  thomas.martin (VP)  kam.keiser (EMP)  monique.sanchez (???) 
74  kevin.hyatt (DIR)  kevin.hyatt (DIR)  mike.carson (EMP)  michelle.cash (???) 
75  james.derrick (LAW)  larry.may (DIR)  dana.davis (???)  mark.guzman (TRA) 
76  andrea.ring (???)  joe.parks (???)  andy.zipper (VP)  thomas.martin (VP) 
77  stacy.dickson (EMP)  john.forney (MAN)  monique.sanchez (???)  teb.lokey (MAN) 
78  patrice.mims (???)  jim.schwieger (TRA)  kevin.ruscitti (TRA)  patrice.mims (???) 
79  jim.schwieger (TRA)  john.zufferli (EMP)  judy.hernandez (???)  diana.scholtes (TRA) 
80  jonathan.mckay (DIR)  daren.farmer (MAN)  jim.schwieger (TRA)  peter.keavey (EMP) 
81  kevin.ruscitti (TRA)  t..lucci (EMP)  stacy.dickson (EMP)  john.zufferli (EMP) 
82  t..lucci (EMP)  jonathan.mckay (DIR)  larry.campbell (???)  daren.farmer (MAN) 
83  sandra.brawner (DIR)  richard.ring (EMP)  kevin.hyatt (DIR)  stacy.dickson (EMP) 
84  geir.solberg (EMP)  andrea.ring (???)  t..lucci (EMP)  sandra.brawner (DIR) 
85  jeffrey.hodge (MD)  judy.townsend (EMP)  jonathan.mckay (DIR)  matt.smith (???) 
86  geoff.storey (DIR)  robert.badeer (DIR)  thomas.martin (VP)  danny.mccarty (VP) 
87  thomas.martin (VP)  teb.lokey (MAN)  sandra.brawner (DIR)  cara.semperger (EMP) 
88  teb.lokey (MAN)  mark.guzman (TRA)  jeffrey.hodge (MD)  larry.campbell (???) 
89  matt.smith (???)  doug.gilbertsmith (MAN)  judy.townsend (EMP)  dana.davis (???) 
90  john.zufferli (EMP)  diana.scholtes (TRA)  matt.smith (???)  benjamin.rogers (???) 
91  judy.townsend (EMP)  geoff.storey (DIR)  john.zufferli (EMP)  jeffrey.hodge (MD) 
92  danny.mccarty (VP)  danny.mccarty (VP)  jason.williams (???)  ryan.slinger (TRA) 
93  diana.scholtes (TRA)  sandra.brawner (DIR)  diana.scholtes (TRA)  joe.parks (???) 
94  jay.reitmeyer (EMP)  jay.reitmeyer (EMP)  teb.lokey (MAN)  sean.crandall (DIR) 
95  holden.salisbury (EMP)  charles.weldon (???)  sean.crandall (DIR)  jason.williams (???) 
96  frank.ermis (DIR)  matt.smith (???)  paul.thomas (???)  paul.thomas (???) 
97  ryan.slinger (TRA)  benjamin.rogers (???)  charles.weldon (???)  jay.reitmeyer (EMP) 
98  larry.campbell (???)  ryan.slinger (TRA)  danny.mccarty (VP)  geoff.storey (DIR) 
99  joe.parks (???)  cara.semperger (EMP)  ryan.slinger (TRA)  mike.carson (EMP) 
100  dana.davis (???)  geir.solberg (EMP)  geir.solberg (EMP)  geir.solberg (EMP) 
101  sean.crandall (DIR)  sean.crandall (DIR)  geoff.storey (DIR)  kevin.ruscitti (TRA) 
102  cara.semperger (EMP)  kevin.ruscitti (TRA)  susan.pereira (EMP)  judy.hernandez (???) 
103  mike.carson (EMP)  jason.wolfe (???)  frank.ermis (DIR)  charles.weldon (???) 
104  paul.y’barbo (???)  scott.hendrickson (???)  robert.badeer (DIR)  judy.townsend (EMP) 
105  andrew.lewis (DIR)  holden.salisbury (EMP)  joe.parks (???)  holden.salisbury (EMP) 
106  charles.weldon (???)  keith.holst (DIR)  jay.reitmeyer (EMP)  theresa.staab (EMP) 
107  jason.williams (???)  susan.pereira (EMP)  cara.semperger (EMP)  paul.y’barbo (???) 
108  paul.thomas (???)  frank.ermis (DIR)  holden.salisbury (EMP)  vladi.pimenov (???) 
109  jason.wolfe (???)  albert.meyers (EMP)  jeff.king (MAN)  don.baughman (TRA) 
110  susan.pereira (EMP)  dana.davis (???)  paul.y’barbo (???)  jeff.king (MAN) 
111  mike.swerzbin (TRA)  paul.y’barbo (???)  theresa.staab (EMP)  susan.pereira (EMP) 
112  judy.hernandez (???)  larry.campbell (???)  andrew.lewis (DIR)  doug.gilbertsmith (MAN) 
113  theresa.staab (EMP)  mike.swerzbin (TRA)  scott.hendrickson (???)  jason.wolfe (???) 
114  scott.hendrickson (???)  theresa.staab (EMP)  jason.wolfe (???)  harry.arora (VP) 
115  mike.maggi (DIR)  mike.carson (EMP)  vince.kaminski (MAN)  frank.ermis (DIR) 
116  keith.holst (DIR)  don.baughman (TRA)  don.baughman (TRA)  john.griffith (MD) 
117  jeff.king (MAN)  jason.williams (???)  tom.donohoe (???)  eric.saibi (TRA) 
118  vladi.pimenov (???)  john.griffith (MD)  vladi.pimenov (???)  mike.swerzbin (TRA) 
119  don.baughman (TRA)  paul.thomas (???)  mike.maggi (DIR)  scott.hendrickson (???) 
120  richard.shapiro (VP)  vladi.pimenov (???)  mike.swerzbin (TRA)  keith.holst (DIR) 
121  vince.kaminski (MAN)  judy.hernandez (???)  harry.arora (VP)  richard.ring (EMP) 
122  harry.arora (VP)  mike.maggi (DIR)  eric.saibi (TRA)  vince.kaminski (MAN) 
123  susan.bailey (???)  pam.butler (???)  john.griffith (MD)  susan.bailey (???) 
124  doug.gilbertsmith (MAN)  jeff.king (MAN)  keith.holst (DIR)  mike.maggi (DIR) 
125  john.griffith (MD)  andrew.lewis (DIR)  doug.gilbertsmith (MAN)  robert.badeer (DIR) 
126  eric.saibi (TRA)  vince.kaminski (MAN)  susan.bailey (???)  tom.donohoe (???) 
127  richard.ring (EMP)  harry.arora (VP)  cooper.richey (MAN)  albert.meyers (EMP) 
128  tom.donohoe (???)  eric.saibi (TRA)  richard.ring (EMP)  clint.dean (TRA) 
129  clint.dean (TRA)  richard.shapiro (VP)  joe.stepenovitch (VP)  andrew.lewis (DIR) 
130  albert.meyers (EMP)  tom.donohoe (???)  clint.dean (TRA)  cooper.richey (MAN) 
131  cooper.richey (MAN)  clint.dean (TRA)  joe.quenet (TRA)  joe.stepenovitch (VP) 
132  pam.butler (???)  cooper.richey (MAN)  albert.meyers (EMP)  pam.butler (???) 
133  joe.stepenovitch (VP)  joe.stepenovitch (VP)  pam.butler (???)  steven.merris (???) 
134  joe.quenet (TRA)  stephanie.panus (EMP)  richard.shapiro (VP)  richard.shapiro (VP) 
135  stephanie.panus (EMP)  joe.quenet (TRA)  steven.merris (???)  monika.causholli (EMP) 
136  stanley.horton (PRE)  brad.mckay (EMP)  phillip.platter (EMP)  mark.fisher (???) 
137  steven.merris (???)  stanley.horton (PRE)  stanley.horton (PRE)  phillip.platter (EMP) 
138  brad.mckay (EMP)  phillip.platter (EMP)  mark.fisher (???)  stephanie.panus (EMP) 
139  phillip.platter (EMP)  steven.merris (???)  monika.causholli (EMP)  stanley.horton (PRE) 
140  monika.causholli (EMP)  monika.causholli (EMP)  brad.mckay (EMP)  joe.quenet (TRA) 
141  mark.fisher (???)  mark.fisher (???)  stephanie.panus (EMP)  brad.mckay (EMP) 
References
[1] N. Perra and B. Gonçalves, “Modeling and predicting human infectious diseases,” in Social Phenomena, pp. 59–83, Springer International Publishing, 2015.
[2] L. Weng, F. Menczer, and Y.-Y. Ahn, “Virality prediction and community structure in social networks,” Scientific reports, vol. 3, p. 2522, 2013.
[3] S. Ronen, B. Gonçalves, K. Z. Hu, A. Vespignani, S. Pinker, and C. A. Hidalgo, “Links that speak: The global language network and its association with global fame,” Proceedings of the National Academy of Sciences, vol. 111, no. 52, pp. E5616–E5622, 2014.
[4] L. E. Quevillon, E. M. Hanks, S. Bansal, and D. P. Hughes, “Social, spatial, and temporal organization in a complex insect society,” Scientific reports, vol. 5, 2015.
[5] G. Petri, P. Expert, F. Turkheimer, R. Carhart-Harris, D. Nutt, P. Hellyer, and F. Vaccarino, “Homological scaffolds of brain functional networks,” Journal of The Royal Society Interface, vol. 11, no. 101, p. 20140873, 2014.
[6] M. S. Granovetter, “The strength of weak ties,” American journal of sociology, pp. 1360–1380, 1973.
[7] H. Jeong, S. P. Mason, A.-L. Barabási, and Z. N. Oltvai, “Lethality and centrality in protein networks,” Nature, vol. 411, no. 6833, pp. 41–42, 2001.
[8] A. E. Motter, A. P. de Moura, Y.-C. Lai, and P. Dasgupta, “Topology of the conceptual network of language,” Physical Review E, vol. 65, no. 6, p. 065102, 2002.
[9] D. de Solla Price, “A general theory of bibliometric and other cumulative advantage processes,” Journal of the Association for Information Science and Technology, vol. 27, no. 5, pp. 292–306, 1976.
[10] A.-L. Barabási, H. Jeong, Z. Néda, E. Ravasz, A. Schubert, and T. Vicsek, “Evolution of the social network of scientific collaborations,” Physica A: Statistical mechanics and its applications, vol. 311, no. 3, pp. 590–614, 2002.
[11] D. F. Gleich, “PageRank beyond the web,” SIAM Review, vol. 57, no. 3, pp. 321–363, 2015.
[12] J. Stehlé, N. Voirin, A. Barrat, C. Cattuto, V. Colizza, L. Isella, C. Régis, J.-F. Pinton, N. Khanafer, W. Van den Broeck, et al., “Simulation of an SEIR infectious disease model on the dynamic contact network of conference attendees,” BMC medicine, vol. 9, no. 1, p. 87, 2011.
[13] V. Colizza, A. Barrat, M. Barthélemy, and A. Vespignani, “The role of the airline transportation network in the prediction and predictability of global epidemics,” Proceedings of the National Academy of Sciences of the United States of America, vol. 103, no. 7, pp. 2015–2020, 2006.
[14] P. T. Johnson, J. C. De Roode, and A. Fenton, “Why infectious disease research needs community ecology,” Science, vol. 349, no. 6252, p. 1259504, 2015.
[15] R. Pastor-Satorras and A. Vespignani, “Epidemic spreading in scale-free networks,” Physical review letters, vol. 86, no. 14, p. 3200, 2001.
[16] C. Sanli and R. Lambiotte, “Temporal pattern of online communication spike trains in spreading a scientific rumor: How often, who interacts with whom?,” Frontiers in Physics, vol. 3, no. 79, 2015.
[17] S. Wang, T. Yang, Q. Zhang, and K. Zhao, “Concurrent diffusion of information and behaviors in online social networks–a case study of the ice bucket challenge,” arXiv preprint arXiv:1508.04417, 2015.
[18] C. Knappett, T. Evans, and R. Rivers, “Modelling maritime interaction in the Aegean Bronze Age,” Antiquity, vol. 82, no. 318, pp. 1009–1024, 2008.
[19] P. Holme, “Modern temporal network theory: a colloquium,” The European Physical Journal B, vol. 88, no. 9, pp. 1–30, 2015.
[20] T. O. Richardson and T. E. Gorochowski, “Beyond contact-based transmission networks: the role of spatial coincidence,” Journal of The Royal Society Interface, vol. 12, no. 111, p. 20150705, 2015.
[21] P. Grindrod, M. C. Parsons, D. J. Higham, and E. Estrada, “Communicability across evolving networks,” Physical Review E, vol. 83, no. 4, p. 046120, 2011.
[22] A. Masucci and G. Rodgers, “Differences between normal and shuffled texts: structural properties of weighted networks,” Advances in Complex Systems, vol. 12, no. 01, pp. 113–129, 2009.
[23] C. Sanli and R. Lambiotte, “Local variation of hashtag spike trains and popularity in Twitter,” PLOS ONE, vol. 10, no. 7, p. e0131704, 2015.
[24] M. Karsai, M. Kivelä, R. K. Pan, K. Kaski, J. Kertész, A.-L. Barabási, and J. Saramäki, “Small but slow world: How network topology and burstiness slow down spreading,” Physical Review E, vol. 83, no. 2, p. 025102, 2011.
[25] E. Estrada and N. Hatano, “Communicability in complex networks,” Physical Review E, vol. 77, no. 3, p. 036111, 2008.
[26] E. Estrada, “Communicability in temporal networks,” Physical Review E, vol. 88, no. 4, p. 042811, 2013.
[27] P. Grindrod and D. J. Higham, “A dynamical systems view of network centrality,” Proceedings of the Royal Society of London A: Mathematical, Physical and Engineering Sciences, vol. 470, no. 2165, p. 20130835, 2014.
[28] P. Grindrod and D. J. Higham, “A matrix iteration for dynamic network summaries,” SIAM Review, vol. 55, no. 1, pp. 118–128, 2013.
[29] T. Rogers, “Null models for dynamic centrality in temporal networks,” Journal of Complex Networks, p. cnu014, 2014.
[30] A. H. Al-Mohy and N. J. Higham, “A new scaling and squaring algorithm for the matrix exponential,” SIAM Journal on Matrix Analysis and Applications, vol. 31, no. 3, pp. 970–989, 2009.
[31] M. Molloy and B. Reed, “A critical point for random graphs with a given degree sequence,” Random structures & algorithms, vol. 6, no. 2–3, pp. 161–180, 1995.
[32] F. Chung and L. Lu, “Connected components in random graphs with given expected degree sequences,” Annals of combinatorics, vol. 6, no. 2, pp. 125–145, 2002.
[33] https://www.cs.cmu.edu/~./enron/.
[34] http://enrondata.org/assets/edo_enroncustodians data.html.
[35] P. Vanhems, A. Barrat, C. Cattuto, J.-F. Pinton, N. Khanafer, C. Régis, B.-a. Kim, B. Comte, and N. Voirin, “Estimating potential infection transmission routes in hospital wards using wearable proximity sensors,” PLoS ONE, vol. 8, p. e73970, 2013.
[36] D. V. Greetham, Z. Stoyanov, and P. Grindrod, “Centrality and spectral radius in dynamic communication networks,” in Computing and Combinatorics, pp. 791–800, Springer, 2013.
[37] I. Chen, M. Benzi, H. H. Chang, and V. S. Hertzberg, “Dynamic communicability and epidemic spread: a case study on an empirical dynamic contact network,” arXiv preprint arXiv:1601.07586, 2016.
[38] M. E. Newman, “The structure and function of complex networks,” SIAM review, vol. 45, no. 2, pp. 167–256, 2003.
[39] A. Clauset, M. E. Newman, and C. Moore, “Finding community structure in very large networks,” Physical review E, vol. 70, no. 6, p. 066111, 2004.
[40] V. C. Barclay, T. Smieszek, J. He, G. Cao, J. J. Rainey, H. Gao, A. Uzicanin, and M. Salathé, “Positive network assortativity of influenza vaccination at a high school: implications for outbreak risk and herd immunity,” PloS one, vol. 9, no. 2, p. e87042, 2014.
[41] N. Malik and P. J. Mucha, “Role of social environment and social clustering in spread of opinions in coevolving networks,” Chaos: An Interdisciplinary Journal of Nonlinear Science, vol. 23, no. 4, p. 043123, 2013.
[42] K.-I. Goh and A.-L. Barabási, “Burstiness and memory in complex systems,” EPL (Europhysics Letters), vol. 81, no. 4, p. 48002, 2008.
[43] H.-H. Jo, M. Karsai, J. Kertész, and K. Kaski, “Circadian pattern and burstiness in mobile phone communication,” New Journal of Physics, vol. 14, no. 1, p. 013055, 2012.
[44] M. Benzi and C. Klymko, “Total communicability as a centrality measure,” Journal of Complex Networks, vol. 1, no. 2, pp. 124–149, 2013.