Asymptotic theory for the dynamic of networks with heterogenous social capital allocation
Abstract
The structure and dynamics of social networks are largely determined by the heterogeneous interaction activity and social capital allocation of individuals. These features interplay in a nontrivial way in the formation of networks and challenge a rigorous dynamical system theory of network evolution. Here we study seven real networks describing temporal human interactions in three different settings: scientific collaborations, Twitter mentions, and mobile phone calls. We find that a node’s activity and social capital allocation can be described by two general functional forms that can be used to define a simple stochastic model for social network dynamics. This model allows the explicit asymptotic solution of the Master Equation describing the system dynamics, and provides the scaling laws characterizing the time evolution of the social network degree distribution and of each node’s ego network. The analytical predictions accurately reproduce the empirical observations, validating the theoretical approach. Our results provide a rigorous dynamical system framework that can be extended to include other features of network formation and to generate data-driven predictions for the asymptotic behavior of large-scale social networks.
The formation of social networks requires investments in time and energy by each individual actor with the anticipation that collective benefits can arise for individuals and groups. Individuals however invest in developing social interactions heterogeneously and according to very diverse strategies. In the first place, not all individuals are equally active in a given social network. Furthermore, individuals may allocate their social capital in very diverse ways, for instance by favoring the strengthening of a limited number of strong ties (bonding capital) as opposed to favoring the exploration of weak ties that open access to new information and communities (bridging capital) [1, 2, 3, 4, 5, 6, 7, 8]. The origins of such heterogeneities are rooted in the trade-off between competing factors such as the need for close relationships [9], the efforts required to keep social ties [10], and temporal and cognitive constraints [11, 12, 13]. These heterogeneities have long been acknowledged as key elements in the description of social networks’ properties [14, 15, 16], dynamical features [17, 18, 19, 20, 21, 22, 23, 24, 15, 25], and the behavior of processes unfolding in social systems [14, 15, 16, 17, 26, 27, 28, 29, 30, 31, 32]. However, a general dynamical system framework able to relate the emerging connectivity pattern of social networks to the combined action of social actors’ activity and their heterogeneity in allocating social capital is still lacking.
Here we analyze seven time-resolved datasets describing three different types of social interactions: scientific collaborations, Twitter mentions, and mobile phone calls. For all network datasets we define two functions statistically encoding the instantaneous activity of nodes and the allocation of social capital, respectively. The latter function is regulated by two system-dependent parameters that define a simple reinforcement mechanism. In particular, we observe in all datasets that the larger the number of social ties already activated by a node, the smaller the probability of creating a new tie. We provide a thorough statistical characterization of the activity and reinforcement dynamics at play in each network and identify the basic parameters defining the dynamics of tie evolution.
Prompted by this statistical analysis, we propose a dynamic network model that includes the heterogeneous activity of nodes and the tie formation mechanisms. This model allows the definition of a formal Master Equation (ME) describing the evolution of the network connectivity structure that can be solved in the asymptotic regime (large network size and long time evolution). The solution of the ME provides the asymptotic form of the degree distribution and the scaling relations linking degree, activity, and the functions characterizing the social capital allocation. The analytical solutions capture very well the empirical behavior measured in the analyzed datasets, explicitly connecting the evolution of social networks to the parameters regulating the emergence of heterogeneous social ties. The proposed analytical framework is remarkably general and can be solved for statistically different activity patterns. The presented results have the potential to open the path to a general asymptotic theory of the dynamics of social networks by progressively integrating further social capital allocation strategies for the formation of social ties.
1 Results
We analyze seven datasets containing time-stamped information about three different types of social interactions: scientific collaborations, Twitter mentions, and mobile phone calls. While we refer the reader to the Materials and Methods section for the details of each dataset, we represent all datasets as time-varying networks. Each node describes an individual. Each time-resolved link describes a social act. The nature of connections differs according to the specific dataset: links might represent a collaboration resulting in a publication in a scientific journal, a Twitter mention, or a mobile phone call. We considered five scientific collaboration networks obtained from five different journals (PRA, PRB, PRD, PRE, and PRL) of the American Physical Society (APS), one Twitter mentions network (TMN), and one mobile phone network (MPN).
In order to characterize the time-varying properties of such networks we first measure the activity of each node. Formally, the activity of a node is defined as the fraction of interactions in which the node is engaged per unit of time with respect to all the interactions per unit time occurring in the network. This quantity describes the propensity of nodes to be involved in social interactions. Empirical measurements in a wide set of social networks show broad distributions of activity [23, 28, 29, 16, 33]. As shown in Figure 1 [A-D], we confirm these observations in our datasets. In particular, we find that in the APS and MPN datasets the activity is well fitted by a truncated power law, while in the TMN we find a log-normal distribution (see Materials and Methods and Supplementary Online Materials for details).
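The definition of activity above can be illustrated with a short Python sketch. It is only a minimal example under stated assumptions: events are taken to be `(time, source, target)` tuples, only the initiating node is counted as engaged, and the function name `node_activity` and the toy event list are hypothetical.

```python
from collections import Counter

def node_activity(events):
    """Return {node: activity} from an iterable of (time, source, target) events.

    Assumption: a node is "engaged" in an event when it initiates it, so the
    activity is the node's share of all initiated events in the dataset.
    """
    counts = Counter(source for _, source, _ in events)
    total = sum(counts.values())
    return {node: n / total for node, n in counts.items()}

# Toy event list (hypothetical data): node "a" initiates 3 of the 4 events.
events = [(0, "a", "b"), (1, "a", "c"), (2, "b", "a"), (3, "a", "b")]
activity = node_activity(events)
```

By construction the activities sum to one over all nodes that initiated at least one event; nodes that never act do not appear in the result.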
1.1 Social capital allocation
The activity sets the clock for the activation of each node; however, it does not provide any information on how each node invests its social capital in exploring new ties or reinforcing already established ties [34]. In order to measure the formation of new ties, we group nodes in classes with similar activity and final degree, so that each class contains actors with statistically equivalent characteristics (see SI for details). We then measure, for the nodes in a given class that have already contacted a given number of distinct nodes, the probability that the next social act will result in the establishment of a new tie. As shown in Figure 1 [E-H], this probability is in general a decreasing function of the number of ties already established. This observation resonates with previous research and empirical findings suggesting that our social interactions are bounded by cognitive and temporal constraints [10, 11, 12, 13]. Indeed, the larger the number of alters in our social circle, the smaller the probability that the next social act will be towards a new tie.
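The measurement described above can be sketched as a simple counting estimator. This is a simplified illustration, not the procedure used for the figures: it pools all nodes into a single class instead of binning by activity and final degree, it tracks only outgoing contacts, and the names `new_tie_probability` and `events` are hypothetical.

```python
from collections import defaultdict

def new_tie_probability(events):
    """Estimate the probability that the next act opens a new tie, by degree k.

    events: iterable of (time, source, target). For each act made by a node
    that has already contacted k distinct alters, record whether the target
    is new. Simplification: all nodes are pooled into one class.
    """
    contacts = defaultdict(set)   # alters already contacted by each node
    new_acts = defaultdict(int)   # k -> acts that created a new tie
    all_acts = defaultdict(int)   # k -> all acts performed at degree k
    for _, source, target in sorted(events):
        k = len(contacts[source])
        all_acts[k] += 1
        if target not in contacts[source]:
            new_acts[k] += 1
            contacts[source].add(target)
    return {k: new_acts[k] / all_acts[k] for k in sorted(all_acts)}

# Toy sequence (hypothetical data) for a single ego "a".
events = [(0, "a", "b"), (1, "a", "b"), (2, "a", "c"),
          (3, "a", "b"), (4, "a", "c"), (5, "a", "d")]
p = new_tie_probability(events)
```

In this toy sequence the estimated probability decreases with the number of alters already contacted, which is the qualitative behavior reported for the real datasets.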
The above empirical findings suggest that the mechanism governing the allocation of social capital follows a general law that, in its simplest analytical form, can be written as:
(1) 
In this expression, a class-dependent exponent modulates the tendency to explore new connections, while a class-dependent characteristic constant defines the intrinsic limit of the individual to maintain multiple ties. Although one could imagine more complicated analytical forms, we use this parsimonious approach to characterize the different datasets. Interestingly, we find that in the five co-authorship networks and Twitter, the exponent is the same regardless of the class. Furthermore, the values of the characteristic constant are typically peaked around a well-defined value (see SI for details). In more detail, we can rescale the proposed functional form in each class by defining a rescaled variable, yielding
(2) 
In the presence of a single exponent characterizing the system, as shown in Figure 1 [I-K], all empirical curves collapse on a single reference function. The data collapse however does not occur in the case of the MPN dataset. In the latter we find a more heterogeneous scenario in which different node classes are characterized by different values of the exponent and of the characteristic constant, see Figure 1 [L]. In the Supplementary Online Material we provide further evidence for the presence of a single or distributed value of the exponent in the different datasets.
1.2 Stochastic model for the network dynamic
By leveraging the empirical evidence gathered here, it is possible to define a basic generative model of network formation based on two stochastic processes. Given a network containing a fixed number of nodes, at each time step each node is active with a probability drawn from a prescribed activity distribution [23, 28, 29, 16, 33]. Once active, a node that has already contacted a given number of different agents will contact a new, randomly chosen node with the probability given by Eq. (1). Otherwise, with the complementary probability, it will interact with an already contacted node chosen at random. Interactions are considered to last one single time step. For this model it is possible to write explicitly the master equation (ME) describing the evolution of the probability distribution that a node has a given degree at a given time:
(3) 
In the above equation the two sums run over the nodes already contacted and not yet contacted by the considered node, respectively, and the terms within these sums depend on the degree of the node being summed over. The first two terms on the right-hand side of Eq. (3) account for the creation of nodes of a given degree, which occurs when a node with one tie fewer gets active and contacts a new node, or when it is contacted by a previously unconnected node that activates and attaches to it. The third and fourth terms on the r.h.s. account for the conservation of nodes of that degree, i.e. nodes that either get active and contact one of their neighbors (with the probability complementary to that of creating a new tie) or get contacted by one of their neighbors. The last line of Eq. (3) accounts for the case in which no node gets active in the current time step, thus conserving the distribution.
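The generative rules described above can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used for the results: the new-tie probability is taken as `(1 + k/c) ** (-beta)`, a hypothetical decreasing form chosen only for this sketch (the expression of Eq. (1) should be substituted), and the names `simulate`, `beta`, `c` and `seed` are placeholders.

```python
import random

def simulate(activities, steps, beta=1.0, c=1.0, seed=0):
    """Simulate the tie-formation model; return each node's set of neighbors.

    activities[i] is the per-step activation probability of node i.
    Assumption: an active node of degree k opens a new tie with probability
    (1 + k / c) ** (-beta); this form is illustrative only.
    """
    n = len(activities)
    if n < 2:
        raise ValueError("at least two nodes are required")
    rng = random.Random(seed)
    neighbors = [set() for _ in range(n)]
    for _ in range(steps):
        for i in range(n):
            if rng.random() >= activities[i]:
                continue  # node i stays inactive in this time step
            k = len(neighbors[i])
            p_new = (1.0 + k / c) ** (-beta)
            can_explore = k < n - 1  # some node has never been contacted
            if can_explore and (k == 0 or rng.random() < p_new):
                # Explore: uniform choice among never-contacted nodes.
                while True:
                    j = rng.randrange(n)
                    if j != i and j not in neighbors[i]:
                        break
            else:
                # Reinforce: uniform choice among already contacted nodes.
                j = rng.choice(sorted(neighbors[i]))
            # Ties are undirected: both endpoints remember the contact.
            neighbors[i].add(j)
            neighbors[j].add(i)
    return neighbors

neighbors = simulate([0.1] * 100, steps=500, beta=1.0, c=1.0, seed=42)
degrees = [len(s) for s in neighbors]
```

Setting `beta=0.0` removes the reinforcement, so every activation opens a new tie until a node has contacted all the others; larger values of `beta` slow the growth of the degrees.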
1.3 Asymptotic theory for networks with a single exponent
In the case of networks characterized by a single exponent it is possible to consider the ME in the large-time and large-degree limit, so that the degree can be approximated by a continuous variable. By neglecting the subleading terms we can thus write the continuous asymptotic version of Eq. (3) as
(4)  
This equation can be solved explicitly (see SI for details), yielding the asymptotic form:
(5) 
where is a normalization constant, a constant and a multiplicative factor of the term that depends on the activity and of the considered agent. Its implicit expression is given in the SI.
A first general result concerns the evolution in time of the average degree of nodes belonging to a given activity class that follows the scaling laws
(6) 
The growth of the system is thus modulated by the parameter that sets the strength of the reinforcement process ruling the establishment of new social ties. In the limit of vanishing reinforcement the growth would be linear. Indeed, the reinforcement of previously activated ties would be zero and nodes would keep connecting randomly to other vertices, thus indefinitely increasing their social circle. In the opposite limit of infinitely strong reinforcement, each node would invest its social capital in just one single connection, i.e. the first one established. In the six datasets described by a single value of this parameter, the measured values indicate a sublinear growth of the social system. In Figure S12 we find a very good agreement between the analytical prediction of Eq. (6) and the empirical curves, obtaining the first empirical validation of the proposed modeling framework and of its ability to capture the network formation dynamics.
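A common way to check a sublinear scaling law of this kind against data is to estimate the growth exponent as the slope of a least-squares line in log-log coordinates. The sketch below is only illustrative: the function name `loglog_slope` is hypothetical and the data are synthetic, generated with an arbitrary exponent of 0.5 rather than taken from any of the datasets.

```python
import math

def loglog_slope(times, degrees):
    """Least-squares slope of log(degree) versus log(time)."""
    xs = [math.log(t) for t in times]
    ys = [math.log(k) for k in degrees]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

# Synthetic average-degree curve growing as t ** 0.5 (arbitrary choice).
times = [10, 100, 1000, 10000]
degrees = [3.0 * t ** 0.5 for t in times]
slope = loglog_slope(times, degrees)
```

A slope below one corresponds to sublinear growth, while a slope equal to one would correspond to the linear growth expected in the absence of reinforcement.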
Furthermore, Eq. (6) connects, at a given time, the degree and the activity of a given node. Thus, given any specific activity distribution, we can infer the functional form of the degree distribution through the corresponding change of variables, finding:
(7) 
It is important to stress that the analytical framework is not limited to a specific functional form of the activity. Indeed, for an arbitrary functional form of the activity distribution, Eq. (6) gives us the possibility to predict the behavior and parameters of the corresponding degree distribution. In Table 3 we report the degree distributions predicted by Eq. (6) for activities following a common set of heavy-tailed distributions, i.e. power laws, truncated power laws, stretched exponentials, and log-normals, that are usually found in empirical data. In Figure S12 [E-G] we compare the degree distributions predicted by Eq. (S22) with real data. Interestingly, also in this case the functional form obtained from the analytical solution of the model fits the empirical evidence remarkably well. It is important to notice that the degree distribution is also a function of the reinforcement parameter. In other words, the connectivity patterns emerging from social interactions can be inferred knowing the propensity of individuals to be involved in social acts, i.e. the activity, and the strength of the reinforcement towards previously established ties. Finally, it is worth remarking that Eqs. (6, S22) are not affected by the distribution of the other allocation parameter, the characteristic limit on the number of ties. This is an important result as it reduces the number of relevant parameters necessary to define the temporal evolution of the system.
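The substitution step used above is the standard change of variables for probability densities. As a sketch, write the activity-degree relation at fixed time as $k = g(a)$, where $g$ is a generic monotonically increasing function used here only as a placeholder for the relation supplied by Eq. (6), and denote the activity and degree distributions by $F$ and $\rho$ (notation assumed for this illustration):

```latex
\rho(k)\,\mathrm{d}k = F(a)\,\mathrm{d}a
\quad\Longrightarrow\quad
\rho(k) = F\!\left(g^{-1}(k)\right)
          \left|\frac{\mathrm{d}\,g^{-1}(k)}{\mathrm{d}k}\right| .
% Example with a hypothetical power-law relation g(a) = C a^{\gamma}:
% a = (k/C)^{1/\gamma}, hence
\rho(k) = \frac{1}{\gamma C}
          \left(\frac{k}{C}\right)^{\frac{1}{\gamma}-1}
          F\!\left[\left(\frac{k}{C}\right)^{1/\gamma}\right].
```

In the power-law example the constants $C$ and $\gamma$ are hypothetical; the point is only that any heavy-tailed form of $F$ carries over to $\rho$ with rescaled parameters.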
1.4 Asymptotic theory for networks with distributed exponents
As already mentioned, in the MPN dataset we find that the evolution of social ties is described by a distribution of exponents rather than by a single value. This observation points to a more heterogeneous distribution of social attitudes with respect to the other six analyzed datasets. Arguably, such a tendency might be driven by the different functions phone calls serve, enabling us to communicate with relatives and friends as well as with companies, clients, etc. The need to introduce different values of the exponent in the system complicates the model beyond analytical tractability (see SI for details). Nevertheless, we find that the leading term of the evolving average degree can be described by introducing a simplified model, in which the nodes of the system feature different values of the exponent and undergo a simplified dynamics (see SI for further information) that neglects, for every node, the effects of links established by others. In this setting we can solve the ME and show that the minimum value of the exponent rules the leading term of the evolving average degree. In other words, we find that even in this case the average degree evolves as in Eq. (6), but with the exponent replaced by its minimum value. As shown in Figure S12 [D], the analytical predictions of the simplified model are in good agreement with the empirical evidence. It is interesting to notice that the nodes characterized by the minimum exponent are those with the weakest tendency to reinforce already established social ties. They are social explorers [34]. Notably, our results indicate that they lead the growth of the average connectivity of the network.
2 Discussion
The empirical findings presented here clearly show that the “cost” associated with the establishment of a new social tie is not constant but is a function of the number of already activated ties, thus supporting the idea that social capabilities are limited by cognitive, temporal or other forms of constraints [11, 12, 13]. Framing this empirical finding in a simple stochastic model of network formation, we derive a general asymptotic theory of the network dynamics and the general scaling laws for the behavior in time of the node’s degree and of the degree distribution.
The model comes with some shortcomings. First, it does not capture the modular structure or, more generally, correlations beyond the nearest neighborhood that are typical of many social networks [35]. In fact, individuals tend to organize their social circles in tight, often hierarchical, communities. Second, the model does not capture the burstiness typical of social acts [36, 37], as we consider a simplified Poissonian scheme of node activation. A recent extension of the activity-driven framework, without the reinforcement mechanism acting on social ties, has been proposed to account for non-Poissonian node dynamics [38]. This is the natural starting point to generalize our model to bursty activities. Finally, the model does not consider the turnover of social ties [34]. Indeed, in our framework once a social connection has been established it cannot be eliminated in favor of others. Clearly, this feature is of particular importance when considering social systems evolving on longer time scales, such as the scientific collaborations studied here, and might influence the measurement of the parameters describing the evolution of the ego networks.
Notwithstanding these limitations, the modeling framework we propose paves the way to a deeper understanding of the emergence and evolution of social ties. The agreement between the analytical predictions and the observed behaviors in seven real datasets, describing different types of social interactions, is an encouraging step in this direction. Finally, our results are a starting point for the development of predictive tools able to forecast the growth and evolution of social systems based not just on regression models or simplified toy models but on a more rigorous analysis of ego-network dynamics.
3 Materials
3.1 Datasets
We analyzed seven large-scale, time-resolved networks describing three different types of social interactions.

Five networks from the APS dataset describe the co-authorship networks found in the journals of the American Physical Society. Specifically, the PRA dataset covers the period from Jan. 1970 to Dec. 2006 and contains 36,880 papers written by 34,093 authors connected by 100,683 edges. The PRB dataset refers to the Jan. 1970 to Dec. 2007 period and contains 104,047 papers published by 84,367 authors connected by 416,048 links. The PRD dataset covers the same period as the PRB one and is composed of 33,376 papers, 21,202 authors and 60,033 edges. The PRE dataset refers to the Jan. 1993 to Dec. 2006 period, with 24,204 papers published by 28,188 authors connected by 68,029 edges. Finally, the PRL dataset contains all the 66,422 papers published between Jan. 1960 and Dec. 2006, written by 78,763 authors forming 299,017 edges.

One network dataset describing Twitter mentions (TMN), exchanged by users from January to September 2008. The network has 536,210 nodes performing about 160M events and connected by 2.6M edges.

One network dataset describing the mobile phone call network (MPN) of 6,779,063 users of a single operator with about 20% market share in an undisclosed European country from January to July 2008. The dataset contains all the phone calls to and from company users, thus including the calls towards or from 33,160,589 users in the country connected by 92,784,825 edges.
3.2 Asymptotic solution of the ME for distributed exponent values
The solution of Eq. (4) found in Eq. (5) holds if the system features a single value of the exponent. As already discussed, in the MPN dataset we find multiple values of the exponent, ranging from a minimum to a maximum value. To find a prediction of the long-time behavior of such a system, let us propose a simplified model in which we focus on a single agent with its own activity and social capital allocation parameters. In this simplified version the agent can only call other nodes in the network, i.e. we neglect the contribution coming from incoming calls. In this approximation we have to solve a modified version of Eq. (3), obtained by discarding all the terms containing the activity of the other nodes. By repeating the same procedure as above, we get to the continuum limit that reads:
(8) 
whose solution is similar to Eq. (5), the only differences lying in the constants appearing in the solution (see Materials and Methods and the SI for details). Interestingly, even in this case we find an average degree growing according to the agent’s exponent. Now, let us create a reservoir of distinct nodes of equal activity and assign to each of them a different value of the exponent drawn from an arbitrary distribution. Let us also group these nodes in classes, defined so that each class contains all the nodes featuring a similar value of the exponent. If we now let these nodes evolve following the simplified model above, the average degree of each class will grow with its own exponent. Then, in the long-time limit, the minimum value of the exponent will lead the growth of the ensemble’s average degree (see SI for further details), i.e.
(9) 
3.3 Activity and degree distributions from real data
We implement the method found in [39] to determine the most likely functional form of both the activity and degree distributions. The fitting procedure is as follows: for each functional form of the distribution considered (power law, log-normal, truncated power law and stretched exponential) we first determine the lower bound of the range over which the functional form holds. This lower bound is defined as the value that minimizes the Kolmogorov-Smirnov (KS) distance between the analytical cumulative distribution function (CDF) and the CDF of the data. The analytical CDF is obtained, for each candidate lower bound, by computing the optimal parameters of the distribution using the maximum-likelihood estimator (MLE). Then, comparing the CDF of the data with the analytical one, we compute the KS distance as the maximum distance between the two CDFs. Once all the distances are computed, we select as lower bound the value at which the minimum distance is recorded (see SI and [39] for details). Once we compute all the parameters for all the functional forms analyzed, we compare them with the likelihood ratio test combined with the p-value that gives the statistical significance of the test (see SI for details). The result of this procedure gives us the best candidate for the activity distribution of each dataset. We find that a truncated power law is the best candidate for all the APS datasets together with the MPN one. The only exception is the TMN, which displays a log-normal distribution of activity (see Fig. 1 and SI for details). After we estimate the functional form and the parameters of the activity distribution, Eq. (S22) gives us the possibility to predict both the functional form of the degree distribution and the values of the parameters of such a distribution (e.g. the exponent in a power law with cutoff, see Table 3 for details). The degree distribution can then be fitted by optimizing over the non-scale-free parameters for whose values we do not have an analytical or numerical prediction (e.g. the cutoff in a power law with cutoff). Indeed, we are missing the value of the constant prefactor in the growth of the average degree in Eq. (6).
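The lower-bound selection described above can be sketched for the simplest case of a continuous pure power law. This is a minimal illustration of the KS-minimization idea of [39], not the code used for the fits: it covers only one functional form, omits the likelihood ratio test, and the names `fit_power_law`, `best_lower_bound` and `min_tail` are hypothetical.

```python
import math
import random

def fit_power_law(data, lower):
    """MLE exponent and KS distance for a continuous power law on x >= lower."""
    tail = sorted(x for x in data if x >= lower)
    n = len(tail)
    log_sum = sum(math.log(x / lower) for x in tail)
    if log_sum == 0.0:
        return float("inf"), 1.0  # degenerate tail: no usable fit
    alpha = 1.0 + n / log_sum
    ks = 0.0
    for i, x in enumerate(tail):
        model_cdf = 1.0 - (x / lower) ** (1.0 - alpha)
        # The empirical CDF jumps from i/n to (i+1)/n at x.
        ks = max(ks, abs(model_cdf - i / n), abs(model_cdf - (i + 1) / n))
    return alpha, ks

def best_lower_bound(data, min_tail=10):
    """Scan candidate lower bounds and keep the one with the smallest KS distance."""
    best = None
    for lower in sorted(set(data)):
        if sum(1 for x in data if x >= lower) < min_tail:
            break  # too few points left for a meaningful fit
        alpha, ks = fit_power_law(data, lower)
        if best is None or ks < best[2]:
            best = (lower, alpha, ks)
    return best

# Synthetic sample (arbitrary parameters) drawn by inverse-transform sampling.
rng = random.Random(7)
alpha_true, lower_true = 2.5, 1.0
data = [lower_true * (1.0 - rng.random()) ** (-1.0 / (alpha_true - 1.0))
        for _ in range(1000)]
alpha_hat, ks_hat = fit_power_law(data, lower_true)
lower_best, alpha_best, ks_best = best_lower_bound(data)
```

The same scan applies to the other functional forms once their CDF and maximum-likelihood estimators are plugged in; the `min_tail` cutoff is a practical guard against fitting a handful of points.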
Power Law  

Stret. Exp.  
Trunc. PL  
LogNormal 
References
 [1] Mark S Granovetter. The strength of weak ties. American journal of sociology, pages 1360–1380, 1973.
 [2] Noah Friedkin. A test of structural features of granovetter’s strength of weak ties theory. Social Networks, 2(4):411–422, 1980.
 [3] Nan Lin, Walter M Ensel, and John C Vaughn. Social resources and strength of ties: Structural factors in occupational status attainment. American Sociological Review, pages 393–405, 1981.
 [4] Mark Granovetter. The strength of weak ties: A network theory revisited. Sociological Theory, 1(1):201–233, 1983.
 [5] Jacqueline Brown and Peter Reingen. Social ties and word-of-mouth referral behavior. Journal of Consumer Research, 14(3):350–362, 1987.
 [6] Reed E Nelson. The strength of strong ties: Social networks and intergroup conflict in organizations. Academy of Management Journal, 32(2):377–401, 1989.
 [7] Daniel Z Levin and Rob Cross. The strength of weak ties you can trust: The mediating role of trust in effective knowledge transfer. Management Science, 50(11):1477–1490, 2004.
 [8] Pasquale De Meo, Emilio Ferrara, Giacomo Fiumara, and Alessandro Provetti. On facebook, most ties are weak. Commun. ACM, 57(11):78–84, October 2014.
 [9] Julianne Holt-Lunstad, Timothy B Smith, and J Bradley Layton. Social relationships and mortality risk: a meta-analytic review. PLoS medicine, 7(7):e1000316, 2010.
 [10] Robin IM Dunbar. The social brain hypothesis. brain, 9(10):178–190, 1998.
 [11] Giovanna Miritello, Esteban Moro, Rubén Lara, Rocío Martínez-López, John Belchamber, Sam GB Roberts, and Robin IM Dunbar. Time as a limited resource: Communication strategy in mobile phone networks. Social Networks, 35(1):89–95, 2013.
 [12] James Stiller and Robin IM Dunbar. Perspective-taking and memory capacity predict social network size. Social Networks, 29(1):93–104, 2007.
 [13] Joanne Powell, Penelope A Lewis, Neil Roberts, Marta García-Fiñana, and RIM Dunbar. Orbital prefrontal cortex volume predicts social network size: an imaging study of individual differences in humans. Proceedings of the Royal Society B: Biological Sciences, 279(1736):2157–2162, 2012.
 [14] J.-P. Onnela, J. Saramäki, J. Hyvönen, G. Szabó, D. Lazer, K. Kaski, J. Kertész, and A.-L. Barabási. Structure and tie strengths in mobile communication networks. Proceedings of the National Academy of Sciences, 104(18):7332–7336, 2007.
 [15] M. Karsai, M. Kivelä, R. K. Pan, K. Kaski, J. Kertész, A.-L. Barabási, and J. Saramäki. Small but slow world: How network topology and burstiness slow down spreading. Phys. Rev. E, 83:025102, Feb 2011.
 [16] Márton Karsai, Nicola Perra, and Alessandro Vespignani. Time varying networks and the weakness of strong ties. Sci. Rep., 4:4001, 02 2014.
 [17] P. Holme and J. Saramäki. Temporal networks. Physics Reports, 519:97–125, October 2012.
 [18] Petter Holme and Jari Saramäki. Temporal networks. Springer, 2013.
 [19] C. Cattuto, W. Van den Broeck, A. Barrat, V. Colizza, J.-F. Pinton, and A. Vespignani. Dynamics of person-to-person interactions from distributed rfid sensor networks. PloS One, 5(7):e11596, 2010.
 [20] Lorenzo Isella, Juliette Stehlé, Alain Barrat, Ciro Cattuto, Jean-François Pinton, and Wouter Van den Broeck. What’s in a crowd? analysis of face-to-face behavioral networks. J. Theor. Biol, 271:166, 2011.
 [21] Juliette Stehlé, Alain Barrat, and Ginestra Bianconi. Dynamical and bursty interactions in social networks. Phys. Rev. E, 81:035101, Mar 2010.
 [22] José Luis Iribarren and Esteban Moro. Impact of human activity patterns on the dynamics of information diffusion. Physical review letters, 103(3):038702, 2009.
 [23] N. Perra, B. Gonçalves, R. Pastor-Satorras, and A. Vespignani. Activity driven modeling of time varying networks. Sci. Rep., 2, 06 2012.
 [24] Jari Saramäki, E. A. Leicht, Eduardo López, Sam G. B. Roberts, Felix Reed-Tsochas, and Robin I. M. Dunbar. Persistence of social signatures in human communication. Proceedings of the National Academy of Sciences, 111(3):942–947, 2014.
 [25] A. Clauset and N. Eagle. Persistence and periodicity in a dynamic proximity network. In DIMACS Workshop on Computational Methods for Dynamic Interaction Networks, pages 1–5, 2007.
 [26] M. Morris. Telling tails explain the discrepancy in sexual partner reports. Nature, 365:437, 1993.
 [27] Luis EC Rocha, Fredrik Liljeros, and Petter Holme. Information dynamics shape the sexual networks of internet-mediated prostitution. Proceedings of the National Academy of Sciences, 107(13):5706, 2010.
 [28] N. Perra, A. Baronchelli, D. Mocanu, B. Gonçalves, R. Pastor-Satorras, and A. Vespignani. Random Walks and Search in Time-Varying Networks. Physical Review Letters, 109(23):238701, December 2012.
 [29] B Ribeiro, N. Perra, and A. Baronchelli. Quantifying the effect of temporal resolution on time-varying networks. Scientific Reports, 3:3006, 2013.
 [30] R. Pfitzner, I. Scholtes, A. Garas, C.J Tessone, and F. Schweitzer. Betweenness preference: Quantifying correlations in the topological dynamics of temporal networks. Phys. Rev. Lett., 110:19, 2013.
 [31] Michele Starnini, Andrea Baronchelli, Alain Barrat, and Romualdo Pastor-Satorras. Random walks on temporal networks. Phys. Rev. E, 85:056115, May 2012.
 [32] Eytan Bakshy, Itamar Rosenn, Cameron Marlow, and Lada Adamic. The role of social networks in information diffusion. In Proc. ACM Intl. World Wide Web Conf. (WWW), pages 519–528, 2012.
 [33] Mario V Tomasello, Nicola Perra, Claudio J Tessone, Márton Karsai, and Frank Schweitzer. The role of endogenous and exogenous mechanisms in the formation of r&d networks. Scientific reports, 4, 2014.
 [34] Giovanna Miritello, Rubén Lara, Manuel Cebrian, and Esteban Moro. Limited communication capacity unveils strategies for human interaction. Scientific reports, 3, 2013.
 [35] Santo Fortunato. Community detection in graphs. Physics Reports, 486(3):75–174, 2010.
 [36] Albert-Laszlo Barabasi. The origin of bursts and heavy tails in human dynamics. Nature, 435(7039):207–211, 2005.
 [37] Márton Karsai, Kimmo Kaski, Albert-László Barabási, and János Kertész. Universal features of correlated bursty behaviour. Scientific reports, 2, 2012.
 [38] Antoine Moinet, Michele Starnini, and Romualdo PastorSatorras. Burstiness and aging in social temporal networks. Physical review letters, 114(10):108701, 2015.
 [39] Aaron Clauset, Cosma Rohilla Shalizi, and Mark EJ Newman. Power-law distributions in empirical data. SIAM review, 51(4):661–703, 2009.
 [40] Jeff Alstott, Ed Bullmore, and Dietmar Plenz. powerlaw: A python package for analysis of heavy-tailed distributions. PLoS ONE, 9(1):e85777, 01 2014.
 [41] Michele Starnini and Romualdo Pastor-Satorras. Topological properties of a time-integrated activity-driven network. Phys. Rev. E, 87:062807, Jun 2013.
Supplementary Information
S1 Datasets
S1.1 American Physical Society
The APS dataset contains the five coauthorship networks of five journals of the American Physical Society, i.e., Physical Review A, B, D, E and Letters (L).
The various datasets contain the data referring to all the issues of each journal from its first issue up to a certain edition, specifically:

PRA from January 1970 to December 2006;

PRB and PRD from January 1970 to December 2007;

PRE from January 1993 to December 2006;

PRL from February 1960 to December 2006.
Each dataset is composed of several files (one per month). Each file has as many lines as the number of papers published in that month, and each line contains the IDs of the authors of the specific paper. For instance, the typical head of a file is:
    Author_000 Author_001 Author_002    #First Paper with 3 authors
    Author_003 Author_004               #Second Paper with 2 authors
    ...
The data are cleaned so as to exclude the papers with a single author.
When analyzing this dataset we define the user’s activity as the number of engaged collaborations (e.g. an author that publishes two papers, the first with 3 coauthors and the second with a single coauthor, has activity 3 + 1 = 4).
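The counting rule for the co-authorship data can be sketched as follows. This is an illustrative parser under stated assumptions: the `#` annotations are treated as comments to be stripped, each paper contributes one collaboration per coauthor, and the function name `collaboration_counts` and the sample lines are hypothetical.

```python
from collections import Counter

def collaboration_counts(lines):
    """Count engaged collaborations per author from one-paper-per-line records."""
    counts = Counter()
    for line in lines:
        # Drop any trailing "#..." annotation, then split the author IDs.
        authors = line.split("#", 1)[0].split()
        if len(authors) < 2:
            continue  # single-author papers are excluded from the data
        for author in authors:
            counts[author] += len(authors) - 1  # one per coauthor
    return counts

# Hypothetical sample reproducing the example in the text: Author_000 writes
# one paper with 3 coauthors and one paper with a single coauthor.
sample = [
    "Author_000 Author_001 Author_002 Author_003 #Paper with 4 authors",
    "Author_000 Author_004 #Paper with 2 authors",
    "Author_005 #Single-author paper",
]
counts = collaboration_counts(sample)
```

Dividing each count by the sum of all counts would then give the normalized activity used in the analysis.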
S1.2 Twitter Mention Network
The Twitter dataset is composed of daily files covering the period between January and September 2008. The dataset contains the so-called firehose, i.e., all the citations (mentions) made by all the users in the given period. The 536,210 nodes in the network are connected via about 2.6M edges.
Each file contains the daily events with the structure:
    Citer_ID_00 Cited_ID_00    # Event 0
    Citer_ID_01 Cited_ID_01    # Event 1
    Citer_ID_02 Cited_ID_02    # Event 2
    ...
This dataset is not cleaned, as we have all the events that happened on the platform in the selected period.
When analyzing this dataset we define the user’s activity as the number of citations made by the user, i.e. the number of events actually initiated by the node.
S1.3 Mobile Phone Network
The dataset of the Mobile Phone Calls (MPC) is composed by a single file containing the time ordered events with second resolution covering the period between January and July of for users of a single operator with market share in an undisclosed European country.
The dataset contains all the events from and toward users of the company (so that even the calls from noncompany users to company users and viceversa are taken into account). As a result, we have nodes (of which are users of the selected company) that are connected via edges.
We split the full list of events into files (each of them containing roughly the same number of events) for computational convenience. Each file contains events with the structure:
Caller_ID Called_ID Company_Caller Company_Called # Event 0
Caller_ID Called_ID Company_Caller Company_Called # Event 1
Caller_ID Called_ID Company_Caller Company_Called # Event 2
. . .
where Company_Caller and Company_Called identify the provider company of the caller and called nodes, respectively (e.g. the value is set to if the node is a customer of our company, otherwise).
When analyzing this dataset we define the user’s activity as the number of calls made by the node, i.e. the number of calls actually initiated by the node .
S2 Data analysis
S2.1 Activity distribution and the nodes binning
For the datasets presented in Section S1 we first evaluate, for each node , the total number of events engaged by the node itself. For instance is the number of calls made by the node in the MPC dataset or the number of citations made by in the Twitter dataset.
We then define the node activity as the ratio between the th node’s number of events and the total number of events observed in the dataset, i.e. where . Thus, falls in the range with . We then introduce and compute the activity distribution . In Fig. S1 we show the resulting activity distribution for each analyzed dataset, while in Table (1) we show the best candidate functional form for the distribution of each dataset. The latter is estimated using the methods found in [39].
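The activity definition above (a node's number of initiated events divided by the total number of events) can be illustrated with a short sketch. The event format, a list of `(source, target)` pairs, and the function name `activities` are assumptions made for this example.

```python
from collections import Counter

def activities(events):
    """Map each initiating node to its share of all observed events."""
    counts = Counter(src for src, _dst in events)
    total = sum(counts.values())
    return {node: n / total for node, n in counts.items()}

# Hypothetical toy event list: u1 initiates three events, u2 one.
events = [("u1", "u2"), ("u1", "u3"), ("u2", "u1"), ("u1", "u2")]
a = activities(events)
```

Nodes that only receive events (here `u3`) never appear as a source and therefore get no entry, which corresponds to zero activity; by construction the activities sum to one.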
Dataset  Distribution         Parameters
TMN      Lognormal
PRA      Truncated power law
PRB      Truncated power law
PRD      Truncated power law
PRE      Truncated power law
PRL      Truncated power law
MPN      Truncated power law
In particular, we compare the goodness of fit on the distribution of the functional forms found in Table [1] of the main paper, i.e. power law, truncated power law, stretched exponential and lognormal distributions. The procedure for each dataset and each functional form reads as follows:

we fit the taking into account all the nodes featuring , where is the lower bound of the distribution. The fit is performed using the maximum likelihood estimators (MLE) that return the optimal values of the parameters;

once the optimal parameters are found we compute the Kolmogorov-Smirnov distance () between the analytical and experimental complementary cumulative distribution functions (CCDF);

we then apply this procedure for different and set the lower bound value as the one that minimizes the .
We then repeat this procedure for all the functional forms of the and compare them using the likelihood ratio test combined with the value that gives the statistical significance of [39, 40]. The result of this procedure gives us the best candidate for the for each dataset, as shown in Table (1). We find that a truncated power law is the best candidate for all the APS datasets and for the MPN one. For the TMN dataset, on the other hand, the best candidate is a lognormal distribution.
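The three fitting steps listed above can be sketched for the simplest candidate, a pure continuous power law. This is only an illustration under stated assumptions: it uses the standard maximum likelihood estimator for a continuous power law, it does not cover the truncated power law, stretched exponential or lognormal forms, and it omits the likelihood ratio comparison. All function names and the `min_tail` cutoff are hypothetical.

```python
import math
import random

def fit_alpha(tail, x_min):
    """MLE exponent of a continuous power law p(x) ~ x**(-alpha), x >= x_min."""
    return 1.0 + len(tail) / sum(math.log(x / x_min) for x in tail)

def ks_distance(tail, x_min, alpha):
    """Largest gap between the empirical and the fitted CDF on the tail."""
    tail = sorted(tail)
    n = len(tail)
    gap = 0.0
    for i, x in enumerate(tail):
        model_cdf = 1.0 - (x / x_min) ** (1.0 - alpha)
        gap = max(gap, abs(model_cdf - i / n), abs(model_cdf - (i + 1) / n))
    return gap

def scan_lower_bound(data, candidates, min_tail=50):
    """Return (ks, x_min, alpha) for the candidate x_min with the smallest KS."""
    best = None
    for x_min in candidates:
        tail = [x for x in data if x >= x_min]
        if len(tail) < min_tail:
            continue  # too few points for a meaningful fit
        alpha = fit_alpha(tail, x_min)
        ks = ks_distance(tail, x_min, alpha)
        if best is None or ks < best[0]:
            best = (ks, x_min, alpha)
    return best

# Synthetic check: inverse-transform samples with exponent 2.5 and x_min = 1.
rng = random.Random(0)
data = [(1.0 - rng.random()) ** (-1.0 / 1.5) for _ in range(5000)]
ks, x_min, alpha = scan_lower_bound(data, [1.0, 2.0, 4.0])
```

On the synthetic sample the recovered exponent should be close to the value used to generate the data. The same scan can be repeated for other functional forms by swapping in their estimators and CDFs.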
Our datasets provide evidence that nodes within the same activity class (i.e. nodes with similar values of activity ) can feature very different memory behavior. In particular agents with large activity may connect to very few different nodes (strong reinforcement) or establish new links at almost every step (weak reinforcement). For this reason each node of the network is naturally classified according to her activity and her final degree , i.e. the total number of different agents that have been connected to in the considered time window.
We then define a binning procedure that lets us group together similar nodes, i.e. nodes with similar activity and final degree. We divide the nodes in activity classes so that within each activity class the most active node performs at most times the events of the least active node. Then, with the same procedure, we further group the nodes within each activity class according to their final degree, thus defining final degree classes. The nodes are therefore divided in activity-degree classes. From now on, unless stated otherwise, whenever we mention the nodes’ class or bin we will be referring to one of these classes.
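The two-level binning can be sketched with geometric bins, so that within a class the largest value is at most `ratio` times the smallest. The ratio used by the authors is not legible in this text, so `ratio=2.0` below is a purely hypothetical placeholder, as are the function names and the toy data.

```python
import math
from collections import defaultdict

def log_class(value, smallest, ratio):
    """Index of the geometric bin [smallest * ratio**i, smallest * ratio**(i + 1))."""
    return int(math.floor(math.log(value / smallest) / math.log(ratio)))

def activity_degree_classes(nodes, ratio=2.0):
    """nodes maps node -> (n_events, final_degree), both >= 1.

    Returns a dict (activity_class, degree_class) -> list of nodes.
    """
    min_events = min(n for n, _k in nodes.values())
    by_activity = defaultdict(list)
    for node, (n, _k) in nodes.items():
        by_activity[log_class(n, min_events, ratio)].append(node)
    classes = defaultdict(list)
    for a_cls, members in by_activity.items():
        # Degree bins are rebuilt inside each activity class.
        min_degree = min(nodes[m][1] for m in members)
        for m in members:
            k_cls = log_class(nodes[m][1], min_degree, ratio)
            classes[(a_cls, k_cls)].append(m)
    return dict(classes)

# Hypothetical toy input: node -> (number of events, final degree).
nodes = {"a": (10, 2), "b": (15, 3), "c": (15, 9), "d": (50, 5), "e": (45, 50)}
classes = activity_degree_classes(nodes)
```

In this toy input, nodes `a` and `b` share both classes, while `c` has similar activity but a much larger final degree and ends up in a separate class.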
S2.2 The reinforcement process
To measure the reinforcement process of each system, we count all the communication events engaged by every node of the th class when it has degree . In other words, is the total number of events engaged by the nodes of the th class at degree .
Each time an event engaged by a node of the th class results in a degree increase , we increment the counter by . In other words, is the total number of events that the nodes belonging to the th class and featuring degree perform toward a new node. Of course, if a node of the th class with degree increases its degree to because it gets called by a new node, the counter is not incremented.
The best estimate of the probability for a node to establish a new connection at degree then reads:
(S1) 
where and are the event counters as defined above. We can give an estimate of the uncertainty on , by assuming that at a given degree the events are independent (i.e. there are no correlations between users) and by checking that so that the STD of reads:
(S2) 
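The counting procedure behind Eqs. (S1) and (S2) can be sketched as follows. The estimate is taken as the ratio of the two counters, and the uncertainty is assumed here to be the binomial standard error that follows from independent events; this assumption and all names (`reinforcement_counters`, `new_tie_probability`, `node_class`) are illustrative and may differ from the original equations.

```python
import math
from collections import defaultdict

def reinforcement_counters(events, node_class):
    """Count, per (class, degree), all initiated events and those toward new nodes."""
    neighbors = defaultdict(set)
    total = defaultdict(int)  # (class, degree) -> events initiated at that degree
    new = defaultdict(int)    # (class, degree) -> events that reached a new node
    for src, dst in events:
        key = (node_class[src], len(neighbors[src]))
        total[key] += 1
        if dst not in neighbors[src]:
            new[key] += 1
        # Both endpoints gain a neighbor, but only the caller's counters move.
        neighbors[src].add(dst)
        neighbors[dst].add(src)
    return total, new

def new_tie_probability(total, new):
    """Return (class, degree) -> (p, sigma), with an assumed binomial sigma."""
    estimates = {}
    for key, e in total.items():
        p = new[key] / e
        estimates[key] = (p, math.sqrt(p * (1.0 - p) / e))
    return estimates

# Hypothetical toy stream in which every node belongs to class 0.
events = [("a", "b"), ("a", "b"), ("a", "c"), ("b", "a"), ("a", "b")]
node_class = {"a": 0, "b": 0, "c": 0}
total, new = reinforcement_counters(events, node_class)
estimates = new_tie_probability(total, new)
```

In the toy stream, node `b` reaches degree 1 by being called, and its later call back to `a` counts as an event at degree 1 toward an already known node.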
We then fit with the proposed reinforcement function :
(S3) 
where is the social propensity of the th bin, is the cumulative degree and is the reinforcement strength, that will be kept fixed for all the nodes in the system. In particular, for each class and with a fixed , we optimize the parameter , by minimizing the function :
(S4) 
where the index runs over the points of the th bin’s curve and is as defined in Eq. (S2). By repeating this procedure for each value of we find, for each class , a curve.
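The fit at fixed reinforcement strength can be sketched as a one-dimensional minimization. The functional form `(1 + k/c)**(-beta)` is an assumption here, since Eq. (S3) is not legible in this text, and the golden-section search is a stand-in for whatever optimizer the authors used; `fit_c`, `chi2_curve` and the search bounds are hypothetical.

```python
import math

def reinforcement(k, c, beta):
    """Assumed form of the probability of contacting a new node at degree k."""
    return (1.0 + k / c) ** (-beta)

def chi2(points, c, beta):
    """Weighted squared deviation; points are (k, p_measured, sigma) triples."""
    return sum(((p - reinforcement(k, c, beta)) / s) ** 2 for k, p, s in points)

def fit_c(points, beta, lo=1e-3, hi=1e3, iters=100):
    """Golden-section search over log(c) for the c minimizing chi2 at fixed beta."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = math.log(lo), math.log(hi)
    for _ in range(iters):
        x1 = b - inv_phi * (b - a)
        x2 = a + inv_phi * (b - a)
        if chi2(points, math.exp(x1), beta) < chi2(points, math.exp(x2), beta):
            b = x2
        else:
            a = x1
    return math.exp(0.5 * (a + b))

def chi2_curve(points, betas):
    """For each beta, refit c and record the pair (c, chi2)."""
    curve = {}
    for beta in betas:
        c = fit_c(points, beta)
        curve[beta] = (c, chi2(points, c, beta))
    return curve

# Noise-free synthetic curve generated with c = 3 and beta = 1.5.
points = [(k, reinforcement(k, 3.0, 1.5), 0.01) for k in range(1, 51)]
curve = chi2_curve(points, [0.5, 1.0, 1.5, 2.0, 2.5])
best_beta = min(curve, key=lambda b: curve[b][1])
```

The search assumes the objective has a single minimum in `c` within the bounds, which holds for this synthetic curve but would need checking on noisy data.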
In Fig. S2 we show the behavior of . For each class we find a minimum of at a certain (see the horizontal lines in the heatmap-like panels of Fig. S2).
Moreover, Fig. S2 shows that there are two different behaviors. Specifically, in the TMN case (see Fig. S2 (a)), one value of fits most of the curves, except for a few outliers: the value of that maximizes the is practically the same for all the bins. On the contrary, in the MPC case the maximum of the function follows a diagonal path ranging from a larger for bins with lower final degree to a smaller for larger degree bins. In this case a single cannot fit all the curves and we have to consider a multi model where each class features a different optimal value of , .
In Fig. S3 we present the rescaled curves for the PRA, PRD, PRE, PRL, TMN and MPC datasets. In the first five cases we show the rescaled curves obtained by substituting and then plotting . As one can see, the curves nicely collapse on the reference curve . In the MPC case we show instead the original curves, each one fitted with its own . The latter parameter falls in the interval for most of the curves as we also show in Fig. S2.
To quantitatively define the parameter, let us define the total mean square deviation as
(S5) 
where is the total number of curves, i.e. the number of activity-degree bins . Then, for the single exponent case, the function allows us to define as:
(S6) 
In the multi case instead, we compute the different values of the exponent found in the system by grouping the memory classes according to their final degree as shown in Fig. S2. The optimal value of is found to be minimum for the bins featuring a large final degree, i.e. , which, as we will show in Section S3.2.3, is the exponent driving the evolution of the network.
To corroborate the results just outlined, we show in Fig. S4 the box plot of the distribution for different groups of node classes grouped by their final degree. We note that the APS and TMN datasets are well approximated by a single as the distribution of within each subgroup of nodes is compatible with the global optimal value . On the other hand, in the MPN case we see that the large final-degree classes have their distribution centered around a smaller value of . As already anticipated, this value will lead the asymptotic growth of the system as we will show in Section S3.2.3.
As a last remark we present in Fig. S5 the measured distribution of the constant for the MPN and TMN datasets. We show the distribution for all the nodes in the network and for each activity class , i.e. the group of nodes featuring similar activity. The values of these constants are distributed but peaked around an average value. Moreover, the distribution of the parameter within each activity class closely follows the global one. The distribution of the social attitude then appears to be a global, activity-independent feature of the nodes in the system. Finally, in Fig. S5 (c) we show how the average value of the constants, , differs from one dataset to the other, varying from in PRB to in TMN and in the MPN case, respectively.
S3 The model
S3.1 Activity driven networks with no memory
Activity driven networks are an effective framework to describe time varying networks.
The simplest memoryless model is defined as follows: the network consists of nodes
featuring an activity potential , i.e. the probability for a node to get active in a
certain time interval reads .
The evolution rules are: (i) at each time step we start with disconnected
nodes; (ii) each node either gets active with probability or stays
inactive with probability . If a node gets active it calls a randomly
selected node in the network, thus creating an edge ; (iii) at the end
of the time step all the created connections are deleted and we start again from
step (i).
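The evolution rules (i)-(iii) can be sketched as a small simulation that tracks the integrated degree, i.e. the number of distinct partners each node has had so far. The function name, the homogeneous activity value and the system size below are assumptions chosen for illustration, and the activity is used directly as the per-step activation probability.

```python
import random

def simulate_memoryless(activity, steps, rng):
    """Return the integrated degree of each node after the given number of steps."""
    n = len(activity)
    contacts = [set() for _ in range(n)]
    for _ in range(steps):
        # Edges live only within a step; only the contact history is kept.
        for i, a in enumerate(activity):
            if rng.random() < a:
                j = rng.randrange(n - 1)
                if j >= i:
                    j += 1  # uniform choice among the nodes other than i
                contacts[i].add(j)
                contacts[j].add(i)
    return [len(c) for c in contacts]

# Hypothetical setup: 200 nodes, all with activation probability 0.05.
degrees = simulate_memoryless([0.05] * 200, steps=100, rng=random.Random(42))
mean_degree = sum(degrees) / len(degrees)
```

With these values each node makes about 5 calls and receives about 5, so the mean integrated degree should stay close to 10 as long as repeated contacts are rare.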
These evolution rules define the Master Equation (ME) for , i.e. the probability that a node of activity has degree at time , where the degree is the number of nodes that contacted up to time . We also set, without loss of generality, . The discrete time equation for then reads:
(S7)  
(S8) 
The equation is obtained in the approximation where , so that between two consecutive times and only one site can be active. We will assume that the activity of a node is small, i.e. , and we will also consider the approximation i.e. the integrated number of neighbors of a site is much larger than but much smaller than the total number of agents . The first term of the sum represents the probability that the site is active and a new link is added to the system. The second term is the probability that the site is active but this site connects to a site that has been already linked. In the third and fourth terms, the symbol denotes the sum over the sites that are not yet connected to . In particular, the third term represents the probability that one of these sites is active and that it connects to . The fourth term is the probability that one of these sites is active but no link between and is established. The fifth term is the probability that one of the sites already connected to is active (being the sum over the nodes already connected to ); in this case no new link is added to . Finally, the last term represents the probability that at time all the sites are not active. For , the second term can be neglected. After some algebra we obtain the equation:
given that . For , we assume that i.e. the average value of the activity. In the limit of large time and large we can write a continuous equation in and obtaining:
(S9) 
The solution of Eq. (S9) is straightforward:
(S10) 
In the large time limit this solution reduces to a delta function: Therefore, the average degree of the nodes of activity grows as:
(S11) 
as already found in [23, 41]. Moreover the asymptotic degree distribution of a network with activity distribution is:
(S12) 
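As a hedged illustration of the chain from Eq. (S9) to Eq. (S12): if a node of activity $a$ gains a new neighbor at rate $a + \langle a \rangle$ per unit time (its own calls plus the calls received from the rest of the network), a second-order expansion in $k$ of the discrete equation gives a drift-diffusion equation with a Gaussian solution. The symbols $P$, $F$, $\rho$ and $\langle a \rangle$ are assumed notation, and prefactors may differ from the original equations.

```latex
\begin{align}
  \frac{\partial P(a,k,t)}{\partial t}
    &= -\left(a + \langle a \rangle\right) \frac{\partial P}{\partial k}
       + \frac{a + \langle a \rangle}{2}\, \frac{\partial^2 P}{\partial k^2}, \\
  P(a,k,t)
    &= \frac{1}{\sqrt{2\pi \left(a + \langle a \rangle\right) t}}
       \exp\!\left[ -\frac{\bigl(k - (a + \langle a \rangle)\, t\bigr)^2}
                          {2 \left(a + \langle a \rangle\right) t} \right], \\
  \langle k(a,t) \rangle
    &= \left(a + \langle a \rangle\right) t, \\
  \rho(k,t)
    &\simeq \int \! da\, F(a)\,
       \delta\bigl(k - (a + \langle a \rangle)\, t\bigr)
     = \frac{1}{t}\, F\!\left(\frac{k}{t} - \langle a \rangle\right).
\end{align}
```

The relative width of the Gaussian shrinks as $t^{-1/2}$, which is why the distribution can be replaced by a delta function at large times and why the degree distribution inherits the shape of the activity distribution.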
S3.2 Plugging in the reinforcement process
The model presented in Sec. S3.1 is a basic one, as it contains no
correlations in an agent’s history at all.
In particular, the probability for a node to recall an already contacted node is
independent of the node degree. While simple to describe and solve analytically, this
model is not realistic.
Moreover, the probability to call an already contacted node is always small as
(and thus the probability to call a new node remains even at large degree ).
However, as shown in Sec. S2.2, real-world systems feature a strong reinforcement
process, as the probability to call a new node at degree decreases as
the degree increases.
For this reason we introduce an extended version of the model described in [16], which includes a reinforcement function that measures the probability for an active node , that has already contacted different nodes in the network, to call a new node instead of an already contacted one.
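The extended model can be sketched by letting an active node choose between a new partner and an old one. The reinforcement function is assumed here to be `(1 + k/c)**(-beta)`, with the same `c` for all nodes; these choices, the function names and the parameter values are illustrative assumptions rather than the authors' implementation.

```python
import random

def p_new(k, c, beta):
    """Assumed probability that a node of integrated degree k calls a new node."""
    return (1.0 + k / c) ** (-beta)

def simulate_reinforced(activity, steps, c, beta, rng):
    """Return the integrated degree of each node (requires at least 2 nodes)."""
    n = len(activity)
    contacts = [set() for _ in range(n)]
    for _ in range(steps):
        for i, a in enumerate(activity):
            if rng.random() >= a:
                continue  # node i stays inactive in this step
            known = contacts[i]
            explore = rng.random() < p_new(len(known), c, beta)
            if explore and len(known) < n - 1:
                # Draw uniformly among the nodes never contacted before.
                j = rng.randrange(n)
                while j == i or j in known:
                    j = rng.randrange(n)
            else:
                # Reinforce an existing tie, chosen uniformly at random.
                j = rng.choice(sorted(known))
            contacts[i].add(j)
            contacts[j].add(i)
    return [len(s) for s in contacts]

# Hypothetical comparison: beta = 0 switches the reinforcement off.
activity = [0.1] * 100
free = simulate_reinforced(activity, 300, c=1.0, beta=0.0, rng=random.Random(1))
reinforced = simulate_reinforced(activity, 300, c=1.0, beta=2.0, rng=random.Random(1))
mean_free = sum(free) / len(free)
mean_reinforced = sum(reinforced) / len(reinforced)
```

Setting `beta=0` makes every call explore a new node, so comparing the two runs shows how reinforcement slows the growth of the integrated degree.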
S3.2.1 The single case
As already shown in Sec. S2.2 the functional form for the reinforcement process , i.e. the probability of adding a new link for the node of degree , reads:
(S13) 
By plugging Eq. (S13) into Eq. (S8) for node , we get:
(S14)  
where is the number of nodes in the network, is the sum over the nodes not yet connected to and is the sum over all the nodes of the network. Each term of Eq. (S14) corresponds to a particular event that may take place in the system, as already presented in the main paper. For instance, the first term of the l.h.s. of Eq. (S14) takes into account the increment of the node ’s degree from to . This may happen either because node gets active and contacts a new node in the system with probability or because a node never contacted before gets active and calls exactly node with probability , being the degree of . In the same way, the second line takes into account that node does not change degree either because it calls an already contacted node or because the non-contacted nodes call other nodes in the network. The last line of Eq. (S14) considers the possibility that no node in the network gets active.
If we now substitute Eq. (S13) in Eq. (S14), after some algebra we get:
(S15) 
Then, by applying the same approximations of large degree and time we obtain the continuous equation:
(S16)  
where is the probability for a node of activity to
have reinforcement constant .
The long time asymptotic solution of Eq. (S16) is of the form:
(S17) 
Moreover, is a constant depending on the activity and the reinforcement constant that follows the: