Preferential survival in models of complex ad hoc networks

Joseph S. Kong and Vwani P. Roychowdhury ({jskong,vwani}@ee.ucla.edu), Department of Electrical Engineering, UCLA, Los Angeles, CA 90095, USA
Abstract

There has been a rich interplay in recent years between (i) empirical investigations of real world dynamic networks, (ii) analytical modeling of the microscopic mechanisms that drive the emergence of such networks, and (iii) harnessing of these mechanisms to either manipulate existing networks, or engineer new networks for specific tasks. We continue in this vein, and study the deletion phenomenon in the web by following two different sets of web-sites (each comprising more than 150,000 pages) over a one-year period. Empirical data show that there is a significant deletion component in the underlying web networks, but the deletion process is not uniform. This motivates us to introduce a new mechanism of preferential survival (PS), where nodes are removed according to the degree-dependent deletion kernel, $D(k) \propto k^{-\alpha}$, with $\alpha \geq 0$. We use the mean-field rate equation approach to study a general dynamic model driven by Preferential Attachment (PA), Double PA (DPA), and a tunable PS (i.e., with any $\alpha > 0$), where $c$ nodes ($c < 1$) are deleted per node added to the network, and verify our predictions via large-scale simulations. One of our results shows that, unlike in the case of uniform deletion (i.e., where $\alpha = 0$), the PS kernel, when coupled with the standard PA mechanism, can lead to heavy-tailed power law networks even in the presence of extreme turnover in the network. Moreover, a weak DPA mechanism, coupled with PS, can help make the network even more heavy-tailed, especially in the limit when deletion and insertion rates are almost equal, and the overall network growth is minimal. The dynamics reported in this work can be used to design and engineer stable ad hoc networks and explain the stability of the power law exponents observed in real-world networks.

PACS numbers: 89.75.Da, 89.75.Fb, 89.75.Hc

I Introduction

I.1 Motivation and Background

The empirical study of real-world networks such as the World Wide Web, the movie actor collaboration network, and scientific citation network has attracted considerable interest. Such large-scale and complex networks have been treated as physical systems, and stochastic models based on randomized mechanisms or protocols have been developed to model and explain empirically observed network characteristics. Concomitantly, several works have shown that the network dynamic models have applications beyond merely modeling real-world systems: It has been shown that randomized protocols can be used to design and engineer systems, with peer-to-peer networks being the primary example Adamic et al. (2001); Wouhaybi (2004); Sarshar and Roychowdhury (2004); Sarshar et al. (2004); Sarshar and Roychowdhury (2005); Chawathe et al. (2003). One of the motivations of this work is to continue such efforts aimed at discovering new mechanisms that play an important role in organic real-world networks, and that might be useful in designing engineered networks and protocols.

Well-known examples of such data-inspired dynamic models include preferential attachment (PA) and its variants Simon (1955); Barabasi and Albert (1999); Willis and Yule (1922), copying Kleinberg et al. (1999); Vázquez (2003); Krapivsky and Redner (2005), PA with fitness Bernardo A. Huberman (1999); Adamic et al. (2000); Bianconi and Barabási (2001), double preferential attachment of links Dorogovtsev and Mendes (2000) and the rewiring of links Albert et al. (2000). These mechanisms, however, model the dynamics of a growing network, where the effect of node deletion is not considered significant. Yet many real-world networks experience significant rates of node deletion. For example, nodes join and depart from peer-to-peer networks in a random and rapid manner, and movie actors end their careers, effectively removing themselves from collaboration networks. Hence, developing a network dynamic model for the class of ad hoc networks with a significant deletion component is important.

Several recently proposed models have addressed the node deletion process Chung and Lu (2004); Cooper et al. (2004); Moore et al. (2006); Sarshar and Roychowdhury (2004). However, these works take an egalitarian approach in modeling the deletion process as uniform node failure. The uniform deletion model fails to account for the heterogeneity of the nodes’ abilities to compete for survival, or participate for varying periods of time in a network. Interestingly, these uniform deletion models predict that a network’s power law (PL) degree distribution, a signature of several real-world networks such as the Web, will disappear as the deletion rate becomes more significant when the primary mechanism driving the network formation is the PA rule Moore et al. (2006); Sarshar and Roychowdhury (2004); Chung and Lu (2004). In order to retain a heavy-tailed distribution (i.e., networks with PL exponent, $\gamma$, being less than $3$ and closer to $2$), the vanilla PA mechanism has to be augmented with a dominant second mechanism that initiates new edges from the existing nodes, such as a distributed compensation mechanism as introduced in Sarshar and Roychowdhury (2004), or a Double Preferential Attachment (DPA) mechanism (see Section III.2 and Chung and Lu (2004); Cooper et al. (2004)). It is not clear whether organic networks with high deletion rates naturally and inherently possess such compensatory mechanisms to retain their empirically-observed heavy-tailed distributions. Moreover, while in an engineered network one might be able to enforce such stabilizing protocols (in order to retain the advantages accrued from the natural hierarchy present in a heavy-tailed network Adamic et al. (2001); Sarshar et al. (2004)), doing so might be too expensive or too difficult to enforce, and alternative mechanisms that can stabilize the network structure might be needed.

In the absence of any empirical studies on the node removal process of real-world networks, the simple uniform deletion model is a reasonable assumption to work with. We, however, ask: Is it possible to empirically study the node deletion process of an organically grown network and quantify its characteristics? We turn to the Web, which has proved to be a treasure trove for mechanism and modeling sleuths. Recent empirical studies of the Web suggest that the current Web environment is extremely dynamic. For example, Ntoulas et al. found that 20% of the web pages in their large data set are permanently removed in just 1 month and 50% of the web pages are deleted in 9 months Ntoulas et al. (2004). Similar findings on the short lifetime of web pages are reported in Cho and Garcia-Molina (2000); Gomes and Silva (2006). While these works categorically establish that the deletion of nodes is a significant event and should be included in any dynamic modeling of the web, they do not address the nature of the deletion dynamics, in particular whether it is uniform or not.

I.2 Summary of Results

In a competitive network such as the Web, we expect the nodes to compete for survival in addition to competing for links. A webpage’s degree is a good approximation to its ability to compete, since heavily linked Web documents are entitled to numerous benefits, such as being possibly ranked higher in search engine results, and attracting higher traffic and, thus, higher revenue through online advertisements. As a result, we conjecture the mechanism of preferential survival (PS), whereby each node’s chance of survival increases with its degree; in other words, pages with higher degrees would be less likely to be deleted than their counterparts with lower degrees.

In order to verify our conjecture, we made a longitudinal study of Web data, where we followed two different sets of web-sites (each comprising more than 150,000 pages) over a period of one year, as described in Sec. II. (Footnote 1: Since we are following a fixed number of sites or hosts in our crawl data, only links among the pages in these sites are considered for our empirical analysis; links to and from these pages to pages outside of this universe are ignored.) We found that there is indeed a significant rate of node deletion in the crawl data we studied. The deletion rates $c$ (the average number of nodes deleted per node that is added to the network, i.e., the network grows at the rate of $1 - c$) for the sites we tracked are observed to be high (see the Appendix for further details). We next developed a method to quantify the node deletion kernel, and found that the conjectured PS mechanism is indeed in play and that the degree-dependent deletion kernel $D(k)$ (i.e., the probability that a node of degree $k$ is deleted at any time step) behaves as $D(k) \propto k^{-\alpha}$, where $\alpha$ is estimated to be 1.0 for our crawl data. Interestingly, despite the high node deletion rates in our crawl data, we found no sign of the disappearance of the power law degree distribution.

The empirical findings motivated us to study the role of the preferential survival mechanism in the well-studied stochastic PA and DPA models; see Sec. III. That is, at every time step, in addition to adding a new node that initiates $m$ preferential edges, an existing node is chosen according to the PS deletion kernel, $D(k) \propto k^{-\alpha}$, and this node (along with all of its edges) is then deleted with probability $c$. Thus, for $\alpha = 0$ it reduces to the already studied case of uniform deletion where nodes are deleted at the rate of $c$. Otherwise, as $\alpha$ increases, the dynamic shields higher degree nodes against deletion, even though the overall deletion rate remains fixed at $c$. The main predictions of our analysis can be summarized as follows (all analytical results are verified by large-scale simulations):

  1. In the special case of PA and only PS with $\alpha = 1$, our analysis shows that the power law exponent is expected to be $3$ for any turnover rate between 0 and 1. Our large-scale simulation results are in good agreement with our analysis. Thus, the PS mechanism by itself can arrest the divergence of the PL exponent predicted for the uniform deletion case (i.e., $\alpha = 0$). Furthermore, we analytically derive the node lifetime distribution for the preceding case of PS (with $\alpha = 1$) and PA, and find that the probability that a given node survives for a given number of time steps converges to a constant as that number grows. The analytical distribution closely matches the empirical distribution of lifetimes in our crawl, lending further credence to the model.

  2. As a comparison, when we study the case of uniform deletion and DPA, we find that in order for the PL exponent to be stabilized at around for high rates of deletions (i.e., when ), the number of doubly preferential (DP) edges have to be increased significantly, i.e., if at every time step, each incoming node brings in edges on the average, then the existing nodes have to initiate DP edges at every time step, where . Thus, one needs a very strong DPA component to compensate for a uniform deletion case.

  3. In the case of both PS and DPA, we show that the power law exponent actually decreases as the network experiences higher turnover rate. Thus, when used in conjunction with the PS dynamic, even a weak DPA mechanism (i.e. for example for the empirically estimated values of and ) can be critical in driving the power law exponent close to 2 even in the face of extremely high rate of turnovers.

Although the PS mechanism is inspired by empirical Web dynamics, a complete model of the Web should take into account other factors such as the nodes’ varying fitness in attracting links Adamic et al. (2000); Bianconi and Barabási (2001), and such a modeling effort is beyond the scope of this paper. Moreover, while deletion is dominated by the PS mechanism for the two sets of crawls studied in this paper, further work studying a larger web sample is needed to justify a general conclusion that PS is a dominant mechanism for all parts or a majority of the web.

However, as our analysis and simulations indicate, the PS dynamic when incorporated into the PA and DPA models leads to a stable structure and can be used to model and design protocols to engineer large scale complex networks. Potential implications of the PS mechanism for both modeling and designing real-world network application purposes are discussed further in Section IV.

II Empirical Measurements

II.1 The Dataset

Our dataset of the World Wide Web was obtained from the Stanford WebBase project (footnote 2: http://dbpubs.stanford.edu:8091/~testbed/doc2/WebBase/). We randomly selected a set of web hosts, comprising roughly 170 thousand pages, and tracked their evolution monthly for the year 2006. This dataset is denoted as SET1. In order to further validate our results, we sampled another set of web hosts comprising roughly 150 thousand pages and tracked their evolution for the same period. For this dataset, more than 99% of the nodes belong to the giant weakly connected component (i.e., when we look at all edges as undirected) over the examined period. This dataset is denoted as SET2. The WebBase crawler extracts a maximum of 10 thousand pages per host; this limit is not a problem here, since none of the tracked hosts reaches it.

Figure 1: (a) SET1: The power law exponent of the degree distribution of the sampled Web for each month in 2006. The inset figure shows the degree distribution for May, 2006. (b) SET1: the power law exponent of the degree distribution of the set of removed nodes for different months in 2006. The inset figure shows the power law degree distribution for the set of webpages that are removed in March, 2006. (c) SET2: the degree distribution for June, 2006. The power law exponent is . (d) SET2: the power law degree distribution for the set of webpages that are removed in June, 2006. The power law exponent is .

II.2 Evidence of Preferential Survival of Webpages

We mined our Web dataset for direct empirical evidence of the preferential survival mechanism. We regard the web graph as an undirected network and investigate the degree distribution of the set of deleted nodes in a given month (i.e. the set of nodes that are alive in a given month but disappear in the following month). If nodes were to be deleted uniformly randomly, the degree distribution of the set of deleted nodes would be identical to the network’s degree distribution. For SET1, we found that the power law exponent of the degree distribution of the set of deleted nodes to be (Fig. 1(b)), which is significantly different from the power law exponent for the entire network (Fig. 1(a)). We found similar results for SET2: the power law exponent of the degree distribution of the set of removed nodes to be (Fig. 1(d)), which is significantly greater than the power law exponent for the entire network (Fig. 1(c)).

Our findings from both SET1 and SET2 suggest that a node is removed according to the deletion probability kernel $D(k) \propto k^{-\alpha}$, where $\alpha \approx 1$ in our case. We will show in our model (see Sec. III.1) that a deletion kernel with $\alpha = 1$ leads to the stabilization of the power law exponent at $3$, for any turnover rate between 0 and 1.
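The link between the two measured exponents and the kernel exponent can be made explicit with a short worked relation. This is a sketch under stated assumptions: the symbols $\gamma$ (exponent of the network's degree distribution), $\alpha$ (exponent of the deletion kernel) and $P_{\mathrm{del}}$ (degree distribution of the deleted set) are our notation, and the network is assumed to follow a clean power law over the fitted range.

```latex
\begin{align}
P(k) &\propto k^{-\gamma}, \qquad D(k) \propto k^{-\alpha}, \\
P_{\mathrm{del}}(k) &\propto D(k)\,P(k) \;\propto\; k^{-(\gamma + \alpha)} .
\end{align}
```

Under these assumptions the deleted set has an exponent larger than the network's by $\alpha$, so the gap between the two fitted exponents gives a direct estimate of $\alpha$. Setting $\alpha = 0$ recovers the uniform case, in which the two distributions coincide.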

II.3 Resilience of the Power Law Exponent

We tracked the PL exponent, , for our Web dataset with very high turnover rates (see Appendix). The power law exponent does not show any sign of divergence and is highly resilient under high rate of turnover. For SET1, the exponent stays around (Fig. 1(a)); to be self-consistent, only edges linking the tracked pages are considered in estimating the degree distributions. For SET2, the exponent is around over the examined period.

III The Model

In order to study the implications of the preferential survival mechanism, we propose the following dynamic model. At each time step: (i) a node joins the network and makes $m$ links to existing nodes preferentially; (ii) with probability $c$, a node is chosen to be removed, according to the deletion kernel $D(k) \propto k^{-\alpha}$, along with all of its associated links; (iii) $\beta m$ new internal edges link in a double preferential attachment (DPA) manner to existing nodes. The parameter $c$ denotes the turnover rate or the deletion rate, which is defined as the rate of node removal divided by the rate of node addition.

Each node in the network is labeled by its insertion time. Let be the probability that the th node is still in the network at time , where . Note that yields the lifetime distribution of node . We have:

(1)

The initial condition is and , which can be considered as the ”-” moment of the degree distribution at time (see Table 1 for the definition of symbols).

Assuming the th node is still in the network at time , the evolution of its expected degree is described by the following equation:

(2)

where the sum of node degrees at time is described by , with denoting the average node degree at time and is the number of nodes at time .

Var. Definition
expected degree of the th node at time
sum of node degrees at time
size of the network at time
average node degree at time
average degree of a deleted node at time
Const. Definition
number of connections of the joining node
turnover rate or number of nodes deleted in each time step
ratio of the number of internal edges added per time step to the number of connections per joining node
exponent in the deletion kernel
the ”-1” moment of the degree distribution:
Table 1: Table of Definitions
Figure 2: Power law exponent for the degree distribution of networks generated by simulation (, ). At the time the snapshots are taken, the networks reach 20 000 nodes. (a) Preferential survival (squares): the points do not deviate from 3 for (see Eq. (10)). For , the simulation points deviate from slightly due to finite number of time steps in simulations. (b) Preferential survival (circles), (squares), (triangles): simulations results indicate that a greater slows down the increase of the power law exponent.

The initial condition is: . Eq. (2) gives the rate at which the th node gains connections at time . The first term in Eq. (2) describes the attachments of the preferential links as a result of the joining node; the second term denotes the deletion of node ’s neighbors according to the deletion kernel; the third term describes the appearance of new internal edges attaching in a double preferential manner to target nodes. Furthermore, the evolution of is described by:

(3)

where is the average degree of a deleted node at time .

Eq. (3) gives the rate of increase for the sum of node degrees at time ; the first term on the right hand side describes the addition of edges, hence degrees are added to the sum of degrees; the second term describes the loss of edges as a result of the removed node.

Now to calculate the power-law exponent, we note that

(4)

The general model stated above appears to be very difficult to solve analytically.
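Since the general model resists a closed-form solution, it can be explored numerically. The following is a minimal simulation sketch of the three steps of the model, not the authors' simulation code. The function name `simulate_ps_network`, the parameter names `m`, `c`, `alpha`, `beta`, the clique seed graph, the treatment of isolated nodes as degree 1 in the deletion kernel, and the guard that keeps at least `m + 1` nodes alive are all assumptions made for illustration.

```python
import random
from collections import Counter


def simulate_ps_network(steps, m=2, c=0.5, alpha=1.0, beta=0.0, seed=0):
    """Grow a network with PA, optional DPA edges, and degree-dependent deletion.

    Returns an adjacency dict: node id -> set of neighbour ids.
    """
    rng = random.Random(seed)
    # Seed graph (assumption): a clique on m + 1 nodes.
    adj = {i: set(range(m + 1)) - {i} for i in range(m + 1)}
    next_id = m + 1

    def pick_by_degree(count):
        """Pick up to `count` distinct nodes with probability proportional to degree."""
        pool = [v for v in adj if adj[v]]  # isolated nodes cannot be picked
        weights = [len(adj[v]) for v in pool]
        chosen = set()
        while len(chosen) < min(count, len(pool)):
            chosen.add(rng.choices(pool, weights=weights)[0])
        return chosen

    for _ in range(steps):
        # (i) Growth: a new node links preferentially to m existing nodes.
        targets = pick_by_degree(m)
        adj[next_id] = set(targets)
        for t in targets:
            adj[t].add(next_id)
        next_id += 1

        # (ii) Deletion: with probability c, remove one node drawn from the
        #      kernel k^(-alpha); isolated nodes are treated as k = 1 (assumption).
        if rng.random() < c and len(adj) > m + 1:
            nodes = list(adj)
            kernel = [max(len(adj[v]), 1) ** (-alpha) for v in nodes]
            victim = rng.choices(nodes, weights=kernel)[0]
            for neighbour in adj.pop(victim):
                adj[neighbour].discard(victim)

        # (iii) DPA: beta * m internal edges per step (fractional part added
        #       at random); both endpoints are drawn proportionally to degree.
        #       An edge that already exists is left unchanged.
        expected = beta * m
        n_internal = int(expected) + (rng.random() < expected % 1)
        for _ in range(n_internal):
            pair = list(pick_by_degree(2))
            if len(pair) == 2:
                adj[pair[0]].add(pair[1])
                adj[pair[1]].add(pair[0])
    return adj


if __name__ == "__main__":
    net = simulate_ps_network(steps=2000, m=2, c=0.5, alpha=1.0)
    histogram = Counter(len(neighbours) for neighbours in net.values())
    for k in sorted(histogram)[:10]:
        print(k, histogram[k])
```

Setting `alpha=0.0` gives uniform deletion, so the two regimes can be compared by changing a single argument. The per-step cost is linear in the network size, which is adequate for small experiments but would need a faster sampling structure for networks of the size used in the figures.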

III.1 PA with PS ($\alpha = 1$)

We now consider the preferential survival model by setting the parameters $\alpha = 1$ and $\beta = 0$, where the value of $\alpha$ is inspired by empirical measurements of the Web. We first note that

(5)

where , under the assumption that converges rapidly to the stationary value . This assumption has been verified numerically. We thus obtain that . It is simple to show that the evolution of the sum of degrees at time is given as:

(6)

We now obtain: .

Similarly, we invoke the assumption that converges quickly to the stationary constant and verified this assumption numerically. Now, assuming the th node’s neighbors have the average degree , the evolution of the expected degree of the th node at time is described by the following equation after performing some calculations:

(7)

The equation above implies that

(8)

where .

After substituting Eq. (8) into Eq. (1), one can show that the equation is described by the following:

(9)

where . It is interesting to note that has an initial exponential decay and converges to a positive constant as .

We now invoke Eq. (4) to get the stationary degree distribution:

(10)

which has a power law tail with the exponent $\gamma = 3$. This is the same power law exponent obtained for the simple preferential attachment model with no deletions. The analytical result is verified by large-scale simulations (see Fig. 2(a)). Thus, we find that preferential survival acts as a self-stabilization mechanism that nullifies the harmful effect node deletion has on the power law exponent. Consequently, the power law exponent remains at 3 even in the face of node turnover under the preferential survival mechanism.

For the case of non-unity $\alpha$, we resort to simulation studies: for $0 < \alpha < 1$ the divergence observed for the uniform deletion case is checked, i.e., the PL exponent does not go to infinity as $c$ approaches 1, but it can still be much larger than $3$; for $\alpha > 1$ (i.e., the high-degree nodes are now being protected even more), the distribution becomes slightly more heavy tailed (see Fig. 2(b)).

Lifetime Distribution of Webpages. In addition to the degree distribution, we study the lifetime characteristics of webpages. In order to obtain the empirical lifetime distribution of webpages, we gathered and processed additional crawls from the period 2003 to 2005 from WebBase (SET1). Our analytical model predicts that the probability a given node survives for time steps, has an initial exponential decay, followed by a slow convergence to a positive non-zero constant as grows: , where and are positive constants (see Eq. (9)). In other words, a given node has a non-zero probability of achieving a very long, if not an eternal, life.

Figure 3: The figure plots the lifetime distribution of webpages from our sampled Web. The empirical distribution matches well with the analytical function predicted by our model.

Our model thus offers an explanation for the empirical observation that a significant fraction of webpages has short lifetimes, while some webpages persist for a very long time (Fig. 3). An empirical webpage lifetime distribution of similar form was also obtained by another measurement study Ntoulas et al. (2004), but no theoretical explanation was offered.

III.2 PA and DPA with Uniform Deletion ($\alpha = 0$)

We now consider the double preferential attachment (DPA) with uniform deletion model by setting the parameter $\alpha = 0$. Using different methods, the same model has been analyzed in Chung and Lu (2004); Cooper et al. (2004). The evolution of the expected degree of a node at a given time, conditioned on the time step in which it was born, is described by the following equation:

(11)

where , and is described by

(12)

Solving Eq. (12), we get:

(13)

Now, Eq. (11) becomes:

(14)

Solving Eq. (14), we get:

(15)

where .

In addition, we solve the equation and obtain:

(16)

After invoking Eq. (4), the scaling relation for the power law exponent can be derived as:

(17)

The analytical prediction matches closely with large-scale simulation results as shown in Fig. 4(a). From our Web dataset, the parameter is estimated to be quite low at (see Appendix). For such low value of , the power law exponent is expected to diverge rapidly as the turnover rate moves away from zero as shown in Eq. (17). Thus, the emergence of the double preferential attachment edges is not sufficient to explain the observed resilience of the power law under high rate of turnover.

Figure 4: Power law exponent for the degree distribution of networks generated by simulation (). At the time the snapshots are taken, the networks reach 20 000 nodes. Double Preferential Attachment (DPA): Power law exponent for the degree distribution of networks generated with analytical models for uniform deletion (, ): (circles), (squares), (stars). Note that tracking high power law exponent values much above is rather difficult, since the distribution is rapidly decreasing and the power law region is typically exhibited over less than one decade. Hence, some data points for are omitted. As shown, unless is unreasonably high, DPA alone cannot stop the divergence of the PL exponent under heavy turnover. (b) Preferential survival with double preferential attachment edges (): (squares) and (circles). The simulation results are in agreement with the theoretical prediction from Eq. (22).

III.3 PA and DPA with PS

We now investigate the preferential survival with double preferential attachment model by setting the parameters and for general . It is simple to show that the evolution of is described by the following equation:

(18)

As in the previous section, the assumption of the fast convergence of and is numerically verified and used. The evolution of the expected degree of the th node at time is described by the following:

(19)

Solving the above equation, we get:

(20)

with

(21)

where is the average degree in the network. Finally, the power law exponent is given by:

(22)

Note that Eq. (21) is a strictly increasing function for , assuming that stays roughly constant for different value of (this assumption has been numerically verified). Thus, the power law exponent actually decreases as the turnover rate increases. After obtaining the value of numerically, the analytical results are verified by large-scale simulations (see Fig. 4(b)).

In summary, our model predicts the following: for $\alpha = 0$, our model reduces to the case of uniform deletion, where the power law exponent is predicted to diverge for a moderate amount of DPA edges. In the case of PS with $\alpha = 1$, our analysis shows that the power law exponent is expected to be $3$ for any turnover rate between 0 and 1. Thus, the preferential survival mechanism by itself can prevent the divergence of the PL exponent predicted for the uniform failure case. Furthermore, PS aided by a weak DPA dynamic can reinforce and stabilize the network’s degree hierarchy even more as $c$ approaches 1.

IV Discussions

Our work takes an important step in understanding a relatively unexplored class of networks: the class of ad hoc networks with significant rates of addition and deletion of nodes. To the best of our knowledge, we provide the first empirical study on the nature of deletion dynamics in complex networks. Using longitudinal Web crawl data that spanned the period of one year, we discovered the preferential survival mechanism and quantified its parameter. In order to study the implication of the preferential survival dynamic, we analyzed a stochastic model that incorporated the standard preferential attachment mechanism with preferential survival and showed that the power law exponent is preserved even in the face of extremely high rate of node deletion.

As large scale network systems play an increasingly important role in our daily lives, the dynamics identified in this work could shed light on empirical observations of real-world networks, and could make good candidates to be harnessed to engineer network applications. For example, from the perspective of modeling real-world networks, the empirical observation of PS (with $\alpha = 1$) and a weak DPA mechanism in the crawled web data can by itself explain the resilience of the PL exponent observed in the web networks, even though the deletion rates are quite high (see Section III.3 for the analysis results). As noted in the introduction, however, a more complete modeling of the web data should take into consideration the underlying fitness distribution of the pages, and such a comprehensive modeling effort is beyond the scope of this paper.

The models studied in this paper could also find applications in the design of engineered networks. For example, in order to develop scalable search algorithms for large scale peer-to-peer (P2P) networks such as Gnutella, researchers have proposed efficient search protocols that harness the network’s power law degree distribution to deliver search hits at a traffic cost that scales sublinearly with network size Adamic et al. (2001); Sarshar et al. (2004). Given the unreliable and ad hoc nature of the nodes in peer-to-peer networks, it is important to develop distributed and local protocols that will guarantee the maintenance of the network’s power law topology even in the face of extremely high rates of node turnover. One of the solutions proposed in Sarshar and Roychowdhury (2004) is to introduce a compensatory mechanism where existing nodes compensate for lost edges. In this work, however, we showed that the preferential survival (PS) mechanism can stabilize the power law exponent. One potential way to implement PS in a P2P setup would be to enforce an incentive mechanism to encourage high-degree nodes to remain in the network. For example, peers can be rewarded with virtual monetary payment in an incremental manner for extending their availability in the network; in return, the virtual earnings can then be exchanged for services such as priority in downloading files. The exact reward function can be tuned to generate a preferential survival mechanism with a deletion kernel of the form $D(k) \propto k^{-\alpha}$. The precise implementation of the incentive mechanism is beyond the scope of the current work and is left for future investigations.

Appendix: Further Empirical Analysis of Web Dynamics

Estimating the Deletion Rate. From SET1, we found that the deletion rate is quite high (Fig. 5). We further found that as much as more than 10% of the nodes are involved in turnovers (inset Fig. 5). SET2 yields very similar results indicating high rate of turnover with the average turnover rate found to be . However, these figures are overestimates since we are taking measurement from a fixed set of web hosts.

Consider the liberal assumption that the Web grows at a rate of about 35% annually (i.e. 3% monthly). This assumption implies that the Web will double its current enormous size of more than 11 billion nodes Gulli and Signorini (2005) in just a little over two years. This rough estimate is obtained by noting that the number of web hosts has been growing at a rate of 25% for the past few years as measured by the Netcraft server survey (footnote 3: http://news.netcraft.com/archives/web_server_survey.html).

Given that the Web grows at a rate of 3% monthly and the finding that around 10% of the webpages are removed in a month’s time, a set of new nodes with a size equal to 13% of the network size must be inserted to achieve the 3% monthly growth. These figures translate to a deletion rate of $c = 10/13 \approx 0.77$ on the Web. Note that measurements from the literature indicate that up to 20% of webpages are deleted in a month’s time Ntoulas et al. (2004), which would imply an even higher deletion rate. In addition, even if we assume an unlikely annual growth rate of 100%, the deletion rate is still well above 0.5, which is quite significant.
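The arithmetic behind these estimates can be checked with a few lines of code. This is an illustrative sketch: the helper names `monthly_growth` and `deletion_rate` are ours, and monthly growth is obtained by compounding the annual figure, which is an assumption about how the rounded 3% was derived.

```python
def monthly_growth(annual_growth):
    """Monthly growth rate that compounds to the given annual growth rate."""
    return (1.0 + annual_growth) ** (1.0 / 12.0) - 1.0


def deletion_rate(removed_fraction, growth_fraction):
    """Nodes deleted per node added, given monthly removal and net growth fractions."""
    inserted_fraction = removed_fraction + growth_fraction
    return removed_fraction / inserted_fraction


# 10% removed per month with 3% net monthly growth: 0.10 / 0.13, about 0.77.
print(round(deletion_rate(0.10, 0.03), 2))

# 20% removed per month (upper figure from the literature) at the same growth.
print(round(deletion_rate(0.20, 0.03), 2))

# Even with 100% annual growth the rate stays above 0.5.
print(round(deletion_rate(0.10, monthly_growth(1.00)), 2))
```

Compounding 35% annual growth gives roughly 2.5% per month, which the text rounds to 3%; using the rounded figure slightly lowers the estimated deletion rate, so the estimate is on the conservative side.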

Figure 5: The monthly turnover rate of the sampled Web (SET1), comprising more than 170,000 pages, for the year 2006. The dashed horizontal line denotes the time average turnover rate: . The inset figure shows the number of webpages (in thousands) inserted (circles) and removed (triangles) for each month. More than 10% of the webpages are involved in turnovers for most months.
Figure 6: (a) Empirical evidence of the preferential attachment of new webpages introduced in February, 2006. The dotted line has a slope of 1.9 on a log-log scale in a cumulative function plot, which suggests that the preferential attachment kernel is of the form $k^{0.9}$. The exponent of $0.9$ is very close to the exponent of $1$ from a linear preferential attachment kernel (also the attachment kernel obtained by the "copying" mechanism). (b) The figure shows empirical evidence of double preferential attachment (data from April, 2006). Since the cumulative function is plotted, a slope of 2 (dotted line) on a log-log scale corresponds to double preferential attachment.

Weak Mechanism of Double Preferential Attachment Edges. On the Web, existing webpages often make new links to each other, where these new links are found to attach in a double preferential manner (as determined from our empirical dataset discussed in the next subsection). Under the assumption of uniform deletion, we found that even with DPA, the power law exponent behaves only as (derived in Sec. III.2), where is the ratio of the number of DPA edges and preferential attachment (PA) edges (from a joining node) per time step. Thus to get a for , has to be ; however, our empirical data (both SET1 and SET2) indicates that the DPA edges are only a small fraction of the PA edges (i.e., ), and thus DPA by itself cannot explain the low power law exponent we observe under high turnover rate (e.g. for and , the predicted exponent is ). In the case of both preferential survival and DPA, we use the empirically obtained values of and , and found that the power law exponent actually decreases as the network experiences higher turnover rate (see Sec. III.3). Thus, when used in conjunction with the preferential survival dynamic, even a weak DPA mechanism (i.e. ) is critical in driving the power law exponent close to in the face of extremely high rate of turnovers.

Measuring the Preferential Attachment Kernel. Although the preferential attachment (PA) model and the copying model Barabasi and Albert (1999); Kleinberg et al. (1999) are widely accepted as models of the Web, relatively few direct measurement studies Jeong et al. (2003); Newman (2001); Buriol et al. (2006) have been performed to validate the linear preferential attachment kernel generated by these models. When a new node joins the existing network and attaches edges to existing nodes preferentially, we obtain the following attachment kernel: , where denotes the degree of the target node. Using our data set, we performed measurements on the sets of new nodes that appear every month, and confirmed the validity of the PA hypothesis (Fig. 6(a)). Similarly, when a new edge emerges and attaches to two existing nodes preferentially, we obtain the following double preferential attachment (DPA) kernel: , where and denote the degrees of the target nodes. We repeated the same measurement on the set of new edges that attach to existing nodes in a given month, and confirmed the validity of the DPA hypothesis (Fig. 6(b)).
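
The cumulative-function measurement described above can be sketched in Python as follows. This is a minimal illustration, not the authors' measurement code: it assumes a kernel of the form k^alpha, so that the cumulative function grows as k^(alpha+1) and the exponent is the log-log slope minus one. The function name `kernel_exponent`, its inputs, and the synthetic test network are all hypothetical.

```python
import numpy as np

def kernel_exponent(degrees, targets):
    """Estimate alpha in an attachment kernel ~ k^alpha from one time window.

    degrees: integer degree of every existing node at the start of the window.
    targets: indices of the existing nodes that received a new edge.
    """
    degrees = np.asarray(degrees)
    targets = np.asarray(targets)
    k_max = degrees.max()
    # How many nodes have each degree, and how many new edges they received.
    nodes_at_k = np.bincount(degrees, minlength=k_max + 1)
    hits_at_k = np.bincount(degrees[targets], minlength=k_max + 1)
    ks = np.flatnonzero(nodes_at_k[1:]) + 1      # degrees >= 1 that occur
    rate = hits_at_k[ks] / nodes_at_k[ks]        # per-node attachment rate
    cumulative = np.cumsum(rate)                 # cumulative function
    mask = cumulative > 0
    # Least-squares slope of the cumulative function on a log-log scale.
    slope, _ = np.polyfit(np.log(ks[mask]), np.log(cumulative[mask]), 1)
    return slope - 1.0

# Synthetic check (assumed data): 50 nodes at each degree from 1 to 200,
# with new edges attached with probability proportional to degree.
rng = np.random.default_rng(1)
deg = np.repeat(np.arange(1, 201), 50)
new_edge_targets = rng.choice(deg.size, size=200_000, p=deg / deg.sum())
print(kernel_exponent(deg, new_edge_targets))    # expected to be close to 1
```

The cumulative function is used instead of the raw per-degree rate because the rate is noisy at high degrees, where few nodes exist. The simple fit over all degrees is slightly biased at small k; restricting the fit to larger degrees is a common refinement.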

References

  • Adamic et al. (2001) L. A. Adamic, R. M. Lukose, A. R. Puniyani, and B. A. Huberman, Phys. Rev. E 64, 046135 (2001).
  • Wouhaybi (2004) R. H. Wouhaybi and A. Campbell, IEEE INFOCOM 2004 1, 119 (2004).
  • Sarshar and Roychowdhury (2004) N. Sarshar and V. Roychowdhury, Phys. Rev. E 69, 026101 (2004).
  • Sarshar et al. (2004) N. Sarshar, P. O. Boykin, and V. P. Roychowdhury, in P2P 04 (IEEE, 2004), pp. 2–9.
  • Sarshar and Roychowdhury (2005) N. Sarshar and V. Roychowdhury, Phys. Rev. E 72, 026114 (2005).
  • Chawathe et al. (2003) Y. Chawathe, S. Ratnasamy, L. Breslau, N. Lanham, and S. Shenker, in ACM SIGCOMM 2003 (ACM, New York, NY, USA, 2003), pp. 407–418.
  • Simon (1955) H. A. Simon, Biometrika 42, 425 (1955).
  • Barabasi and Albert (1999) A.-L. Barabasi and R. Albert, Science 286, 509 (1999).
  • Willis and Yule (1922) J. C. Willis and G. U. Yule, Nature 109, 177 (1922).
  • Kleinberg et al. (1999) J. M. Kleinberg, R. Kumar, P. Raghavan, S. Rajagopalan, and A. S. Tomkins, Lecture Notes in Computer Science 1627, 1 (1999).
  • Vázquez (2003) A. Vázquez, Phys. Rev. E 67, 056104 (2003).
  • Krapivsky and Redner (2005) P. L. Krapivsky and S. Redner, Phys. Rev. E 71, 036118 (2005).
  • Bernardo A. Huberman (1999) B. A. Huberman and L. A. Adamic, Nature 401, 131 (1999).
  • Adamic et al. (2000) L. A. Adamic, B. A. Huberman, A.-L. Barabasi, R. Albert, H. Jeong, and G. Bianconi, Science 287, 2115a (2000).
  • Bianconi and Barabási (2001) G. Bianconi and A.-L. Barabási, Europhysics Letters 54, 436 (2001).
  • Dorogovtsev and Mendes (2000) S. N. Dorogovtsev and J. F. F. Mendes, Europhysics Letters 52, 33 (2000).
  • Albert et al. (2000) R. Albert, H. Jeong, and A.-L. Barabasi, Phys. Rev. Lett. 85, 5234 (2000).
  • Chung and Lu (2004) F. Chung and L. Lu, Internet Math. 1, 409 (2004).
  • Cooper et al. (2004) C. Cooper, A. Frieze, and J. Vera, Internet Math. 1, 463 (2004).
  • Moore et al. (2006) C. Moore, G. Ghoshal, and M. E. J. Newman, Phys. Rev. E 74, 036121 (2006).
  • Ntoulas et al. (2004) A. Ntoulas, J. Cho, and C. Olston, in Proc. of the 13th Int’l Conf. on WWW (2004), pp. 1–12.
  • Cho and Garcia-Molina (2000) J. Cho and H. Garcia-Molina, in VLDB (Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 2000), pp. 200–209.
  • Gomes and Silva (2006) D. Gomes and M. J. Silva, in ICWE (ACM Press, New York, NY, USA, 2006), pp. 193–200.
  • Gulli and Signorini (2005) A. Gulli and A. Signorini, in WWW ’05: Special interest tracks and posters of the 14th international conference on World Wide Web (ACM Press, New York, NY, USA, 2005), pp. 902–903.
  • Jeong et al. (2003) H. Jeong, Z. Neda, and A. L. Barabasi, Europhysics Letters 61, 567 (2003).
  • Newman (2001) M. E. J. Newman, Phys. Rev. E 64, 025102 (2001).
  • Buriol et al. (2006) L. S. Buriol, C. Castillo, D. Donato, S. Leonardi, and S. Millozzi, Proceedings of International Conference on Web Intelligence 2006 (2006).