The Effect of Network Topology on the Stability of Discrete State Models of Genetic Control

The Effect of Network Topology on the Stability of Discrete State Models of Genetic Control

Andrew Pomerance pomeranc@umd.edu    Edward Ott    Michelle Girvan    Wolfgang Losert Institute for Research in Electronics and Applied Physics
University of Maryland, College Park
College Park, MD, 20752
Abstract

Boolean networks have been proposed as potentially useful models for genetic control. An important aspect of these networks is the stability of their dynamics in response to small perturbations. Previous approaches to stability have assumed uncorrelated random network structure. Real gene networks typically have nontrivial topology significantly different from the random network paradigm. In order to address such situations, we present a general method for determining the stability of large Boolean networks of any specified network topology and predicting their steady-state behavior in response to small perturbations. Additionally, we generalize to the case where individual genes have a distribution of ‘expression biases,’ and we consider non-synchronous update, as well as extension of our method to non-Boolean models in which there are more than two possible gene states. We find that stability is governed by the maximum eigenvalue of a modified adjacency matrix, and we test this result by comparison with numerical simulations. We also discuss the possible application of our work to experimentally inferred gene networks.

I Introduction

Boolean networks have been extensively investigated as a model for genetic control of cells Kauffman1969 (); Kauffman1993 (). In this model, each gene is represented by a node of a network, and each node has one of two states: on – i.e., producing (‘expressing’) its target protein – or off. Directed links between genes indicate that one gene influences the expression of another. This can correspond to the expressed protein directly binding to DNA and modulating the transcription of a gene or to other signaling pathways that modulate DNA transcription. In the standard Boolean network model, the system evolves in discrete timesteps (), and at each step the state of every node is simultaneously updated according to some function of its inputs. This function approximates the action of activators (proteins which act to increase expression of a given gene) or inhibitors (proteins which act to reduce expression). While this model might seem to be an oversimplification considering the complex kinetics involved in all steps of a transcription pathway, experimental evidence suggests that real biological systems are, in some cases, reasonably well-approximated by Boolean networks Albert2003 ().

In 1969, S.A. Kauffman Kauffman1969 () introduced a type of Boolean network known as an network. In this model, there are nodes each having exactly input links, and the nodes from which these input links originate are chosen randomly with uniform probability. We refer to the number of input (output) links to (from) a node as the in-degree (out-degree) of that node. At any given time , the system state can be represented as an -vector whose th component is either zero or one, where . There are possible states. The function determining the time evolution at each node is defined by a random, time-independent, -entry truth table. Since this is a finite, deterministic system, there is always an attractor: eventually, the system must return to a previously visited state (finiteness), after which the subsequent dynamics will be the same as for the previous visit (determinism). These attractors can be fixed points or periodic orbits. Using the Hamming distance between two states (i.e., the number of nodes for which the disagree) as the distance measure, the system exhibits both what is termed a ‘chaotic’ (or unstable) regime, where the distance between typical initially close states on average grows exponentially in time, as well as a stable regime, where the distance decreases exponentially. Between the two there is a ‘critical’ regime. (Here by ‘close’ we mean that the Hamming distance is small compared to .)

As a model of genetic control, these attractors have been postulated to represent a specific pattern of protein expression which defines the cell’s character Kauffman1969 (). In single-celled organisms, these attractors might be taken to correspond to different cell states (growing, dividing, starving, heat- or pH-shocked). In multi-cellular organisms, different cell types (muscle, nerve, liver, etc.) have different expression patterns, and, within each type, a cell could be in a variety of states (resting, ‘activated,’ dividing, etc.) that each correspond to different expression patterns. Boolean network approximations have been successful in predicting the gene expression time sequence of the segment polarity gene network in Drosophilia, a model for embryonic development where individual cells turn specific proteins on and off in patterns that guide the growth of certain organs and structures Albert2003 (). Since the protein expression pattern of the cell is modeled from the state of the corresponding Boolean network, the question of the stability of the network then becomes important: do small perturbations in the expression pattern, due perhaps to chemical fluctuations, die out quickly, returning the cell to its original state, or do they quickly grow, pushing the cell into another state? The purpose of this paper is to examine the stability of network dynamics in the context of discrete state models of gene networks.

One motivation for the consideration of dynamical stability is its possible relevance to cancer. Specifically, we hypothesize that dynamical instability of a gene network might be a causal mechanism contributing to the occurence of some cancers. We emphasize that this hypothesis is distinct from the previous hypothesis of ‘genomic instability’ as a cause of cancer GenomicInstab1 (). In particular, genomic instability has been defined111As defined in the glossary of Nature Genetics as ‘the failure to transmit an accurate copy of the entire genome from one cell to its two daughter cells.’ In contrast, the instability we refer to is that of the dynamics of a given gene network, and we use the term ‘dynamical network instability’ (DNI) to distinguish this condition. We speculate that DNI might arise from mutations and that, once established, as cells divide, DNI could lead to widely varying gene expression patterns from cell to cell. We emphasize that DNI implies that this variation would arise even in the absence of further mutation. That is, similar to the concept of chaos in continuous-state dynamical systems (e.g., OttBook ()), DNI causes exponential sensitivity of typical system trajectories to small changes, which we speculate may lead to many different outcomes in the course of cell division. Recent microdissection results indicate wide variations in gene expression patterns even for nearby cells within the same cancerous tissue w1 (). This variability provides a basis for understanding why cancer can adapt and evade treatment w2 ().

Another motivation for our study is the argument, put forward by Kauffman Kauffman1993 (), that evolution favors gene networks that are on the border between stability and instability Critical1 (); Critical2 (); Critical3 (); Critical4 (). Whether or not our cancer hypothesis or Kauffman’s stability-border hypothesis holds, the question of dynamical stability of such networks is crucial to their understanding and use as models.

While previous works have addressed the question of dynamical network stability in simple, specific types of random networks (e.g. nets), in this paper we address the question of dynamical network stability for general network topology and node attributes. We also consider nonsynchronous update and extend the considerations to non-Boolean models allowing for the possibility of nodes having more than two states. Thus our work provides a potentially enhanced framework for modeling and using the discrete state network paradigm. In particular, we consider how our network stability considerations can be employed on experimentally derived gene networks.

In the original nets as proposed by Kauffman, the truth table output governing node dynamics was randomly chosen with on and off having equal probability. Subsequently, it was shown that if the truth table output was biased such that denotes the probability of randomly assigning an off output, the transition between the stable and chaotic regimes depends on Derrida1986 (). We term the ‘expression bias.’ Additionally, networks with a distribution of in-degrees, but no in-/out-degree correlation, have been considered in Luque1995 (); Luque1997 (); Fox2001 (); Aldana2003 (), and it has been shown that the nodal in-degree average, , suffices to determine the stability. (Here indicates average of a nodal quantity over all nodes.) Specifically, the critical average number of connections, , governing this transition is

(1)

where the network is stable for , unstable for , and critical for . Aldana and Cluzel Aldana2003 () considered the consequences of Eq. (1) in the case of networks with scale-free topology Barabasi1999 (), i.e., the probability distribution (or ) that a randomly chosen node has in-degree (out-degree ) is a power-law: . (Since every out-link for a node is an in-link for some other node, ; thus the result is unchanged whether it is the in- or out-degree that has power-law scaling.)

Recently, some authors have noted, but not numerically tested, a generalization of Eq. (1) that takes into account nodal correlations between the in-degree and out-degree characterized by the joint degree distribution function . In this case, the critical transition occurs at Lee2008 ()

(2)

We emphasize that Eq. (1) was derived in the annealed approximation (see later discussion) for networks with a given in- or out-degree distribution and with the complimentary links completely random, and that Eq. (2) uses only the additional information contained in the nodal in-/out-degree correlation. Furthermore, all nodes (‘genes’) were taken to have the same value. However, gene networks, in common with real networks occurring across a broad range of applications, can be expected to deviate substantially from the above simple network model. Examples of network properties that could make previous analyses of network stability inapplicable are assortativity Newman2002 () (the tendency for highly connected nodes to prefer or avoid linking to other highly connected nodes) and community structure Girvan2004 () (the existence of highly connected, sparsely interconnected subgraphs), two properties that are not captured in the degree distributions. Additionally, these properties may have biological implications. For example, a recent paper Wang2007 () examined gene interaction networks from cancerous tissue and found significant community structure, as well as positive correlation between the in-degree and out-degree of nodes; additionally, protein interaction networks have been shown to exhibit significant disassortativity Newman2002 (); ProteinNet (). Furthermore, for modeling purposes, it might be important to allow the expression bias to vary from node to node (as an extreme example, so-called housekeeping genes Housekeeping () have a predominant tendency to be on, corresponding to low , unlike other genes). In this paper, we derive and test the stability criterion for large networks with arbitrary network topology and heterogeneous expression biases. In particular, our theory evaluates the stability of any given network with its specific topology (i.e., its adjacency matrix defined subsequently), and by its node-specific expression biases. We show that stability is determined by the largest eigenvalue of a modified adjacency matrix, and we numerically test this criterion.

With respect to real gene networks, the synchronous update at integer times () used in the above models represents an additional deviation from the real situation, where chemical kinetics and transport processes can be expected to introduce non-trivial dynamics. As a partial step toward remedying this (and to make Boolean approximations suitable for atmospheric and geophysical processes), Ghil and Mullhaupt Ghil () consider a generalization in which is a continuous variable and depends on , where is a delay time that can be different for each link from to . The original formulation (e.g., in Refs. Kauffman1969 (); Derrida1986 (); Luque1995 (); Luque1997 (); Fox2001 (); Aldana2003 ()) corresponds to for all . We will argue and numerically confirm that the criterion determining the stability/instability border of this generalization of the Boolean network model is the same as that for the synchronous update models.

In addition to nonsynchronous update, another generalization of Boolean networks that we will examine is models in which each node is allowed to have one of possible discrete states (e.g., for , we label the states , and for Boolean networks for all ). This model may be closer to the behavior of actual cells, and models with multiple states can be related to certain piece-wise ODE models of transcription ODE (); Multivalue2 (). The general model using arbitrary, multivalued truth tables has been previously treated in the special case of networks with all nodes having the same number of possible states . In the case where each possible state is equally likely, the critical number of inputs is 222Unpublished results by Sole, RV, Luque, B, and Kauffman, S.

(3)

The applicability of our work to any specific network and set of node-wise expression biases may be of particular interest in situations where experimental data provide the possibility of estimating a gene network and expression biases. Such information could be used as input to our method which could give an indication of the stability of a given experimentally-derived network. The possibility that such analyses may be feasible becomes more and more likely with the rapid technological advances in obtaining new types of high-quality, quantitative data useful for deducing gene networks. For example, such analyses could be used to address the hypothesis that dynamical instability of gene networks is connected with the occurence of cancer.

Ii Model

Deterministic Boolean networks are formally defined by a state vector , where , and a set of update functions , such that

(4)

where denote the indices of the nodes that input to node ; we denote this set of nodes by . The update function is defined at each node by specifying a truth table whose outputs are randomly populated. Previous analytic results assumed a constant expression bias for all nodes; however, we allow that, in the truth table for node , output entries are randomly assigned zero with probability or one with probability . In the case of uniform expression bias, we drop the subscript and use the notation .

We consider the interaction structure of this system as a graph where the nodes represent individual elements of the state vector, and a directed edge is drawn from node to node if . An adjacency matrix is defined in the usual way: a matrix entry is one if there is a directed edge from node to node and zero otherwise.

The stability of a large Boolean network is defined by considering the trajectories resulting from two close initial states, and . To quantify their divergence, the Hamming distance of coding theory is used: . If the network is stable, on average as . In unstable networks, quickly increases to , while a ‘critical’ network is at the border separating stability and chaos.

In order to study the stability of an Boolean network, Derrida and Pomeau Derrida1986 () considered an annealed situation and calculated the probability that, after steps, a node state is the same on two trajectories that originated from initially close conditions. (This calculation was later generalized to variable in-degree Luque1995 (); Luque1997 (); Fox2001 (), and joint degree distribution in Lee2008 ().) In Derrida and Pomeau’s ‘annealed’ situation, at each time step the truth table outputs and the network of connections are randomly chosen. The actual situation of interest, however, is the case of ‘frozen-in’ networks, where the truth table and network of connections are fixed in time. It has been commonly assumed that analytical results obtained for the annealed case are a good approximation to the frozen-in case (e.g., Refs. Derrida1986 (); Luque1995 (); Luque1997 (); Fox2001 ()). We also adopt this view in a modified form, and we will test its predictions with numerical simulations.

The randomization of the network of connections at each time step while keeping the degree distribution fixed carries the implicit assumption that there is no additional dynamically relevant structure in the frozen network other than that contained in the joint degree-distribution . To avoid this assumption, we obtain theoretical results for a different annealing protocol, which we term ‘semi-annealed.’ In this semi-annealed procedure, we keep the network fixed (i.e., the adjacency matrix does not change with time), and we envision randomly assigning the output entries of the truth table of each node at every time according to the time-independent expression bias assigned to node . We then imagine tracking the probability that individual node states and differ over time with an -dimensional difference vector, whose components are , where denotes an average over every possible small initial perturbation. Here by ‘every possible small initial perturbation’ we mean all perturbations for which a small fraction of the states are flipped. Additionally, we define the ‘sensitivity’ as the probability that the output of changes when given two different input strings, similar to the ‘average sensitivity’ of Ref. sensitivity (). In the case of completely random Boolean functions,

(5)

Thus, similar to Ref. Derrida1986 (), we can write the update equation for as

(6)

Equation (6) follows from noting that the probability that and are equal is and thus the probability that all inputs to node are equal is the above product. Note that this equation uses topological information contained in the . However, we have treated , and as if they were probabilities of statistically independent random events. We hypothesize that this semi-annealed protocol might be expected to yield good results for frozen-in cases when the network is large and the fraction of network nodes on short loops is small (the network is ‘locally tree-like’). To see the problem posed by short loops, consider a node with two inputs that themselves have inputs both coming from a common node; in this case, the elements of in Eq. (6) are no longer statistically independent and multiplying the probabilities is no longer correct. See Ref. Ott2008 () for discussion related to the locally tree-like assumption. Our numerical tests of frozen networks indeed yield results that agree very well with our semi-annealed hypothesis on large, locally tree-like networks. We also find our predictions to hold for networks with a large number of feedforward motifs, a nontree-like three-node subgraph that has been found to be prevalent in real gene networks Alon2002 ().

The case where both network states are exactly the same corresponds to , which is a fixed point of Eq. (6). In order to determine the stability of this fixed point, we linearize Eq. (6) around for small perturbations:

(7)

where are the elements of the adjacency matrix . Equation (7) can be written in matrix form as where

(8)

The stability is thus governed by the largest eigenvalue of this matrix:

(9)

Since , the Perron-Frobenius theorem MacCluer2000 () guarantees that is real and positive. We also note that, for any given adjacency matrix and assignment of ’s to nodes, Eq. (6) can be iterated numerically to predict the expected time-asymptotic saturation value of the difference in two initially nearby states when evolved to steady-state. We numerically test this prediction, as well as the stability criterion in Eq. (9) in the next section. 333We note that Eq. (8) and the condition also occurs in the treatment Ott2008 () of site percolation on directed networks where different sites have different removal properties. A similar condition involving also arises in percolation on undirected networks Cohen ().

As a special case of interest, if the the are uniform, , then , where is the maximum eigenvalue of the adjacency matrix. This yields the critical condition,

(10)

Furthermore, for the case of a large network whose links are randomly assigned subject to a joint probability distribution at each node (with no assortativity), the mean field approximation for the largest eigenvalue is Ott2007 ()

(11)

where, since necessarily, we use the notation . Equations (10) and (11) yield the same criterion as in Eq. (2). In the case where and are uncorrelated, and , yielding Eq. (1).

The eigenvalue of random network adjacency matrices with assortativity has been considered in Ref. Ott2007 (), which defines an assortativity measure as

(12)

where denotes an average over all links from node to node . The network is assortative (disassortative) if (). For near one, the largest eigenvalue is approximately given by Ott2007 ()

(13)

Thus by Eqs. (6) and (9), it is predicted that, for uniform , assortativity (disassortativity) decreases (increases) the critical value.

In the case of nonuniform , we have recently generalized Eq. (11) to obtain an analogous mean field approximation to without assortativity or community structure,

(14)

Our derivation of (14) will be published elsewhere. From (14), we see that correlation (anticorrelation) between and decreases (increases) network stability and that, in the absence of correlation, the result is similar to that for a uniform , , with replacing the uniform (Eqs. (9) and (10)).

We now consider the generalization to allow any number of discrete node states. We denote the number of possible states of node by , and we label the possible states . The number of possible inputs to from the set of nodes that influence it is . For each of these possible inputs, the truth-table function in Eq. (4) assigns one of the possible states to node . Similar to the Boolean case, we take the assignment to be random and to have an ‘expression bias’ for each of the node states, where denotes the probability that , for a given set of inputs, assigns the state to node , and for all nodes . As in the Boolean case, we can then introduce the sensitivity giving the probability that two different sets of inputs result in a different updated state of node , which, in the random truth table case,

(15)

With this definition, we see that all our previous reasoning still applies, and Eqs. (6)-(9) hold with this generalized expression for the node sensitivities and with interpreted as the probability of disagreement between and . In the case of uniform number of node states and equal expression biases , among these states, Eq. (15) becomes , which, when combined with Eq. (10) yields the previous result in Eq. (3).

Finally, we note that our criticality criterion, , is unchanged by the presence of delays, as in the models of Refs. Ghil (), and only a slight modification is required of Eq. (6) (i.e., replaces ). The condition implies that the components of in Eq. (6) are time-independent. Thus we predict that the delays do not influence the result, and the criticality condition in Eq. (9) is independent of the synchronous update structure of the most commonly used random Boolean network models. Similarly, the time-asymptotic steady state obtained by repeated iteration of (6) is, by definition, time-independent and thus also does not depend on the (although the will influence the time-dependent approach to the asymptotic steady state; see supplementary material).

Iii Statistical Methods

We numerically test the above predictions on several classes of Boolean networks with uniform sensitivity (i.e., is the same for all nodes):

  1. random networks with ;

  2. random networks with imperfect correlation between and ;

  3. networks with assortativity or disassortativity; and

  4. networks constructed as in (a) but with a substantial number of feedforward loops.

We additionally test our predictions on two classes of networks with nonuniform sensitivities:

  1. networks constructed as in (a) but where nodes have different sensitivities correlated with the degrees of the nodes; and

  2. networks with significant community structure, where the two communities have different, uniform sensitivities.

Finally, we test our generalization to more than two node states on networks of type (a) but with for all nodes. For types (a)-(c) and (e), we use networks with truncated power-law degree distributions. (Evidence for the presence of this type of distribution in gene networks has been seen in Scalefree ().)

The algorithms for constructing the networks of types (a)-(c) are as follows. (i) Establish the in- and out-degrees for each node, which are drawn from a distribution,

(16)

where and (Boolean case) or ( case). The out-degree is initially set to the in-degree. (ii) Randomly swap the out-degrees between pairs of nodes. If maximal correlation between in- and out-degrees is desired, as in (a), this step is skipped so that and is maximal. A completely uncorrelated network has every nodal out-degree swapped exactly once, yielding . The quantity , which approximately determines by Eq. (11), can thus be tuned by the number of nodes that have their out-degrees swapped. (iii) Place links randomly between nodes subject to the constraints of the specified in- and out-degrees assigned at each node by the ‘configuration model’ configmethod (). (iv) If networks with assortativity (disassortativity) are desired, as in (c), perform a given number of link swaps, as in Ott2007 (), that increase (decrease) the assortativity in Eq. (12). In all cases we employ networks with and two initial conditions separated by a Hamming distance of 100. In the supplementary material we discuss finite size effects that can occur for smaller .

We emphasize that, although we determine our networks randomly, in our numerical experiments we do not average over this randomness. Rather, we generate one random network for each experiment and examine the resulting behavior of that specific network.

Iv Results

We test the steady-state predictions of Eq. (6) and the criticality condition of Eq. (9) in Fig. 1, and compare the calculated critical parameters to the mean-field-type approximations of Eqs. (11), (12), and (14) in the supplementary notes. In order to compare Eq. (6) (solid curves in Fig. 1) to experimental measurements of the Hamming distance from numerical evolution of true frozen Boolean dynamical systems (markers in Fig. 1), we calculate the node averaged steady-state fractional Hamming distance,

(17)

In practice, this limit is calculated as the average Hamming distance from to when all delays are the same (), and from to when nonuniform delays are present. These times are well after the steady-state value is reached (see supplementary material). Each experimental data point in Fig. 1 corresponds to a single realization of interconnections averaged over 100 realizations of the time-independent truth table with specified sensitivity as before.

iv.1 In-/Out-degree Correlations and Heterogeneous Time Delay

Figure 1(a) shows the steady-state Hamming distance as a function of the sensitivity for one network of type (a) () and two of type (b) (). The closed markers in the figure represent experiments with uniform delay on all links, while the open markers correspond to experiments where half the links, randomly chosen, have and the remainder have . Importantly, the degree distributions are the same for all three networks, and we attain different values by varying the correlation between the in-degree and the out-degrees. We see from Fig. 1(a) that there is close agreement between the theoretical prediction and the experimental results and our prediction that the presence of delays does not change the stability is confirmed. Additionally, the measured steady-state Hamming distance is essentially zero below the critical value of the sensitivity, (this point is indicated by vertical downward arrows in Fig. 1(a)). We emphasize that the degree distributions (and hence ) are the same for the networks in Fig. 1(a), and thus, if the in-/out-degree correlation were ignored, the observed difference between the stability conditions for these networks would not be predicted.

iv.2 Assortativity/Disassortativity

Figure 1(b) shows results obtained when significant assortativity or disassortativity is present (type (c) networks). In this experiment, as well as all those reported below, the delays are all uniform. The networks under consideration have the same joint degree-distribution with . However, each of the networks have very different assortativities (, defined in Eq. (12)), which yield different largest eigenvalues (). Since the joint degree distributions are the same, Eq. (2) would predict that the three networks have the same stability characteristics. However, since their eigenvalues are very different, we predict that, as observed, the transitions of the three networks occur at different values of . Again the theoretical predictions of are indicated by vertical arrows.

iv.3 Motifs

Random construction of networks, as used in the networks above, is expected to yield networks that are locally tree-like Ott2007 (). However, we note that biological and other types of networks often have motifs (small subgraphs) that occur with higher frequency than in randomly constructed networks Alon2002 (). For gene networks of E. coli and S. cerevisiae, it was found that the number of feedforward loop motifs (see inset to Fig. 1(c)) is significantly enhanced compared to the expected number in a randomly constructed network. In these networks, the number of feedforward loops per node is roughly 0.1. Thus we consider a network of type (a) ( and ) after adding 1000 () and, in an extreme case, 2000 () feedfoward loops. To add a feedforward loop, we randomly choose a node , follow a random output to node , and follow a random output of to node . We then add a link from node to node . We do this a given number of times, avoiding nodes that already participate in an added feedforward loop. In Fig. 1(c), we see that the semi-annealed theory of Eq. (6) (solid curve) again agrees well with our numerical experiments (solid markers). Based on such results, we believe that the locally tree-like network requirement does not invalidate application of our method to real gene networks. We also note that the critical point is essentially unchanged by the addition of loops (adding links only slightly increases the largest eigenvalue), however more feedforward loops tend to increase the steady-state Hamming distance for .

iv.4 Application to S. cerevisiae

As a real biological example, we include in the supplementary notes a graph similar to those in Figs. 1(a)-(c) using a published network for the yeast S. cerevisiae Yeast ().

iv.5 Heterogeneous Correlated Sensitivities

Figure 1(d) demonstrates the effect of a distribution of ’s on the stability of a network with and with correlation between the nodal values of and , i.e., type (d) networks. We consider two situations, one where is maximal, and one where it is minimal. The are drawn from a uniform distribution centered at (the abscissa in the figure), with width . Maximal (minimal) is attained by assigning the largest to the node with the largest (smallest) , the second largest to the node with the second largest (second smallest) , and so on. As can be seen from the figure, there is good agreement between the semi-annealed theory and the numerical experiments, and the two networks become unstable at different values of . (Vertical arrows again indicate the points where .)

iv.6 Community Structure

Figure 1(e) shows our results for a case where there is community structure and community-dependent sensitivity. To construct the networks in Fig. 1(e), consider the case where there are two communities, and we assign a link from node in community to node in community with probability . We impose the additional constraints that and that , and the size of the two communities are the same, . We take to be the same for both communities, and we also assume that communities and have different sensitivities and , respectively. As is increased from zero to , changes from the case of two completely separated communities to one of a single random network. Communities and have equal sizes of 5000 nodes, community has , and community has . In order to vary , we vary and , keeping their sum constant in order to maintain constant . As with the curves in Fig. 1(a)-(c), the transition to chaos is governed by ( = 1 at the vertical arrow), and Eq. (6) (solid curve) accurately predicts the numerically observed (solid circles) steady-state Hamming distance.

iv.7 Non-Boolean Models

Figure 1(f) illustrates an application to a case in which there are more than two possible states at each node. In particular, we consider possible states at each node. (Since the number of possible inputs to the truth table for node in this case is , we take due to memory constraints.) Labeling the possible node states , we take nodes to have uniform expression biases for occurence of state-label 0, , from 0 to 1. The three remaining labels () also have uniform biases for all nodes , . From Eq. (15), , which has a maximum at As can be seen in the figure, the predicted fraction of nodes with differing states (solid curve) also has a maximum there. It is again seen that the measurements (markers) are well-predicted by the theory.

Figure 1: (a) vs. for three networks with different largest eigenvalues (), both with uniform delay on all links (closed markers) and with half the links having increased delay of (open markers). The solid curves correspond to the prediction (defined in Eq. (17)) obtained by simulating Eq. (6). The downward vertical arrows correspond to for each of the three networks. (b) vs. for three networks with different assortativities. (c) vs. for networks with added feedforward motifs, with an illustration of the feedforward motif (inset). (d) vs. for networks with maximum correlation (circles) and minimum correlation (squares) between and , where is drawn from a uniform distribution centered at with width . (e) vs. for networks with community structure, where the two communities have and . (f) vs. for a network where each node can take one of possible states. is the probability that a zero appears in the truth table output; the remaining three symbols appear with equal probability.

V Discussion

In this paper, we have presented theoretical results (Eqs. (6) and (9)) which predict the steady-state Hamming distance between states evolved from two nearby initial conditions and the stability of a given network. These results are derived using the hypothesis that a theory derived in the semi-annealed case approximates the true situation, where by semi-annealed we mean that the network of connections is frozen, but the truth table at each node is randomly reassigned at each timestep. For large networks, this approximation was found to give excellent agreement with the true case of frozen connections and frozen truth tables. Our semi-annealed hypothesis does not rely on gross statistical properties of the network, but instead uses the specific network topology, as characterized by the network adjacency matrix, and the individual node sensitivities to make predictions.

We tested our theoretical predictions with numerical experiments. Previously unaddressed issues that we considered include the effects of assortativity, nonuniform time delay, nonuniform sensitivity, motifs, and community structure. In all cases tested we found good agreement with our theory.

The theory that we have presented and tested above may represent a step forward in facilitating the application of discrete state dynamical network models to biological systems. Given a specific genetic interaction network and an estimate of the node sensitivities, Eq. (9) predicts the stability of that particular network directly from the adjacency matrix. Curated networks already exist in the literature for model single-cellular systems, and new algorithms continue to be developed for inferring interaction networks from a wide range of data sources (microarray experiments, GO annotation, genome sequencing, etc.). We note that such a procedure has the advantage that, because the actual experimentally determined network is employed, topological aspects such as nodal in-/out- degree correlation, assortativity, community structure, etc., do not first have to be determined and then statistically modeled. Thus, by use of our stability criterion (9), there is the potential that future analysis may be able to evaluate a supposed relationship between the stability characteristics of various networks and their functioning. For example, one might test whether cancer gene networks are less stable than those in healthy tissue. This could lead to the strong variations in gene expression observed in cancerous tissue w1 (), even when the underlying gene network is unchanged. We are currently pursuing research along this line.

This work was supported by NSF (Physics) and by ONR (contract N00014-07-1-0734). We thank L. Staudt for discussion. The work of A.P. was partly supported by the NCI intramural program.

Vi Supplementary Information 1: Transient Evolution

Figure S1 shows the time evolution results for networks of type (a) (i.e., for all and uniform ). Once a random network is generated, we simulate the evolution of two close initial conditions and plot the Hamming distance as a function of time in Fig. S1. Specifically, we take an arbitrary initial condition and generate a perturbed initial condition by flipping a fraction of the state bits; in Fig. S1, and corresponding to 10 flipped bits. Figure S1(a) shows the Hamming distance as a function of time step for four cases with different values of sensitivity () and uniform delays (as in Eq. [4]). Each of these four curves are generated using the same network of interconnections (for which ) and the same perturbation in initial conditions averaged over 100 realizations of the nodal truth tables. For the three cases , , the network is predicted to be unstable. We see in Fig. S1(a) that in these cases, the Hamming distance rises and eventually saturates at a constant value. In the fourth case, and we have that , and the network is predicted to be stable, which is demonstrated in the figure. These cases illustrate the strong effect that in-/out-degree correlations can have: for the value in the network of Fig. S1, the prediction from the result for uncorrelated networks, Eq. [1], is stability for all values of (the minimum value of , the right hand side of [1], is 2, which exceeds ). Figure S1(b) shows time traces of the Hamming distance for when non-uniform delays are present. In the curves shown, a fraction of the links are randomly chosen and given delays of with the remaining links having delay . The curves are for the same network as in Fig. S1(a) with and are again the average of 100 different realization of the truth table. (The choice of delayed links is the same for all 100 realizations.) In each case, we see that the network is unstable and the Hamming distance rises to the same steady-state value, albeit at a slower rate for larger . This result thus is consistent with our prediction that whether or not a network is stable and its final saturation value do not depend on heterogeneity of the delays.

Figure S1: (a) Evolution of the Hamming distance between two initial conditions for a typical network of size and for various values of the sensitivity and uniform delay . (b) Evolution of the Hamming distance between two initial conditions for a typical network of size , , and . Results are shown for a network with for all links (solid curve) and with on 0.1 (dashed curve) and 0.5 (dotted curve) of the links.

Vii Supplementary Information 2: Finite-Size Effects

In Fig. S2(a) and (b), we consider the importance of finite-size effects by varying (a parameter which does not appear in the theory) for two different size networks of type (a). Figure S2(a) also compares the results of simulating the frozen case (solid markers) to the semi-annealed case described in the Theory section (open markers) for . As before, in simulating a semi-annealed network, at each time step the nodal truth tables are randomly generated with the same . The networks under consideration in Figs. S2(a)-(b) have , , and (a) and (b) . As the figure demonstrates, larger yields better agreement with the theory, and the semi-annealed case seems indistinguishable from the frozen case. Note also that the results for () and is similar to that for () and , suggesting that the relevant quantity is , the number of flipped states. The inset of Fig. S2(a) shows the histogram of the Hamming distances used in calculating the point , (upward vertical arrow). The different trials used in generating this histogram correspond to different truth table realizations. The distribution shown in the inset consists of a large number of samples with Hamming distance zero and a roughly symmetric part that has a mean near the theoretical prediction. The overestimation of the mean by the theory therefore seems to be driven by the relative number of zero samples compared to the symmetric part.

In order to understand the origin of the zero samples, we note that one way that they can arise is through ‘irrelevant’ nodes (i.e., nodes that do not influence the dynamics of the network) and ‘frozen’ nodes (i.e., nodes whose output is independent of its inputs due to the random assignment of the truth table). Irrelevant nodes can arise by either having no out-going links or by inputting only to other irrelevant or frozen nodes. Flipping the value of an irrelevant node, by definition, does not change the subsequent evolution of the network; if a perturbation between nearby initial conditions consists solely of such flips, that perturbation dies out quickly. Assuming that the fraction of irrelevant nodes is independent of , then the probability that all nodes for which the two initial conditions differ are irrelevant goes to zero as for constant ; in this case, the observations should exactly match the theoretical prediction. This is consistent with the trend indicated by our comparison of the network in Fig. S2(a) with the network in Fig. S2(b).

Figure S2: The steady-state fractional Hamming distance for (a) and (b) as a function of the sensitivity for various values of , both in the frozen case (filled symbols) and the annealed case (open symbols). The largest eigenvalue of this network’s adjacency matrix is . While the theory does not depend on the value of , finite-size effects cause a dependence on the number of flipped bits. The inset to (a) shows a histogram of measured Hamming distances at and (up arrow).

Viii Supplementary Information 3: Comparison of Mean-field Eigenvalue Approximations with Exact

For the systems tested in our numerical experiments, Table S1 shows the critical parameter values at the stability/instability border as obtained from direct calculation of the maximum eigenvalue of the matrix for the relevant specific networks (downward arrows in Fig. 1 (a)-(f)) compared to the corresponding results predicted from the mean-field-type theoretical estimates (Eqs. [11], [13]-[14], and [18]). For the community structure example we use the approximation,

(18)

which applies for the case of two equal communities with symmetric connectivity probabilities () as in Fig. 1(e). The analysis leading to (18) will be published elsewhere.

The largest eigenvalue approximations predict the observed transition to unstable behavior quite well, as seen in the table below. The only exceptions to this agreement are in the case of significant assortativity or disassortativity; however, this is to be expected since the approximate theory is a linear approximation for values of close to one. The values of assortativity and disassortativity used in the paper (1.7 and 0.5) are far from this regime. Nevertheless, even for these cases, the theory correctly predicts the qualitative trend that assortativity (disassortativity) decreases (increases) the critical .

Direct Evaluation Approximate Theory
Critical ’s from Fig. 1(a) 0.22 0.23
(Approx. Theory from Eq. [11]) 0.34 0.34
0.43 0.44
Critical ’s for Fig. 1(b) 0.10 0.13
(Approx. Theory from Eq. [12]) 0.22 0.23
0.33 0.45
Critical ’s for Fig. 1(c) 0.33 0.33
(Approx. Theory from Eq. [11]) 0.34 0.33
0.34 0.34
Critical for Fig. 1(d) 0.19 0.20
(Approx. Theory from Eq. [14]) 0.27 0.28
Critical for Fig. 1(e) 0.21 0.21
(Approx. Theory from Eq. [18])
Critical for Fig. 1(f) 0.80 0.80
(Approx. Theory from Eq. [15])
Table S1: Comparison between criticality conditions evaluated directly from and from the approximate theory.

Ix Supplementary Information 4: Application to the Regulatory Network of S. cerevisiae

Figure S3, similar to Figs. 1(a)-(c), illustrates an application of Eq. [17] to the published network of the yeast S. cerevisiae [35]. The largest eigenvalue of this network is . We have assumed in this plot that each node has the same sensitivity , and again we see that the network undergoes a transition from stable to unstable behavior at . However, in order to draw any conclusions about the criticality of the yeast regulatory network, we need a reliable estimate of the individual ’s, from which we can calculate . While estimating the sensitivities may be possible with existing microarray datasets, this is beyond the scope of this paper.

Figure S3: vs. , calculated from Eq. [17], for the published regulatory network of S. cerevisiae [35]. The network undergoes a transition from stable to unstable behavior at .

References

  • (1) Kauffman, S (1969) Metabolic stability and epigenesis in randomly constructed genetic nets. J Theor Biol 22:437–67.
  • (2) Kauffman, S (1993) The Origins of Order (Oxford University Press, New York).
  • (3) Albert, R, Othmer, HG (2003) The topology of the regulatory interactions predicts the expression pattern of the segment polarity genes in Drosophila melanogaster. J Theor Biol 223:1–18.
  • (4) Coleman, W, Tsongalis, G (2001) Molecular Basis of Human Cancer (Humana Press).
  • (5) Ott, E (2002) Chaos in Dynamical Systems (Cambridge University Press), 2 edition.
  • (6) Gonzalez-Garcia, I, Sole, RV, Costa, J (2002) Metapopulation dynamics and spatial heterogeneity in cancer. Proc Natl Acad Sci USA 99:13085–13089.
  • (7) Bignolda, L (2007) Variation, ‘evolution’, immortality and genetic instabilities in tumour cells. Cancer Lett 253:155–169.
  • (8) Rämö, P, Kesselia, J, Yli-Harjaa, O (2006) Perturbation avalanches and criticality in gene regulatory networks. J Theor Biol 242:164–170.
  • (9) Serra, R, Villani, M, Semeria, A (2004) Genetic network models and statistical properties of gene expression data in knock-out experiments. J Theor Biol 227:149–157.
  • (10) Nykter, M et al. (2003) Gene expression dynamics in the macrophage exhibit criticality. Proc Natl Acad Sci USA 105:1897–1900.
  • (11) Balleza, A et al. (2008) Critical dynamics in genetic regulatory networks: Examples from four kingdoms. PLoS ONE 3(6):e2456.
  • (12) Derrida, B, Pomeau, Y (1986) Random networks of automata: A simple annealed approximation. Europhys Lett 1:45–49.
  • (13) Ricard V. Solé, RV, Luque, B (1995) Phase transitions and antichaos in generalized Kauffman networks. Phys Lett A 196:331–334.
  • (14) Luque, B, Solé, RV (1997) Phase transitions in random networks: Simple analytic determination of critical points. Phys Rev E 55:257–260.
  • (15) Fox, JJ, Hill, CC (2001) From topology to dynamics in biochemical networks. Chaos 11:809–815.
  • (16) Aldana, M, Cluzel, P (2003) A natural class of robust networks. Proc Natl Acad Sci USA 100:8710–8714.
  • (17) Barabasi, AL, Albert, R (1999) Emergence of scaling in random networks. Science 286:509–512.
  • (18) Lee, DS, Rieger, H (2008) Broad edge of chaos in strongly heterogeneous Boolean networks. J Phys A 41:415001.
  • (19) Newman, MEJ (2002) Assortative mixing in networks. Phys Rev Lett 89:208701.
  • (20) Newman, MEJ, Girvan, M (2004) Finding and evaluating community structure in networks. Phys Rev E 69:026113.
  • (21) Cui, Q et al. (2007) A map of human cancer signaling. Mol Sys Biol 3:152.
  • (22) Maslov, S, Sneppen, K (2002) Specificity and stability in topology of protein networks. Science 269:910–913.
  • (23) Eisenberg, E, Levanon, EY (2003) Human housekeeping genes are compact. Trends Genet 19:362–365.
  • (24) Ghil, M, Mullhaupt, A (1985) Boolean delay equations II: Periodic and aperiodic solutions. J Stat Phys 41:127–173.
  • (25) Kappler, K, Edwards, R, Glass, L (2003) Dynamics in high-dimensional model gene networks. Signal Process 83:789 – 798.
  • (26) Luque, B, Ballesteros, FJ (2004) Random walk networks. Physica A 342:207–213.
  • (27) Shmulevich, I, Kauffman, SA (2004) Activities and sensitivities in boolean network models. Phys Rev Lett 93(4):048701.
  • (28) Restrepo, JG, Ott, E, Hunt, BR (2008) Weighted percolation on directed networks. Phys Rev Lett 100:058701.
  • (29) Cohen, R, Erez, K, ben-Avraham D, Havlin, S (2000) Resilience of the Internet to random breakdowns. Phys Rev Lett 85: 4626 - 4630.
  • (30) Milo, Ret al. (2002) Network motifs: Simple building blocks of complex networks. Science 298:824–827.
  • (31) MacCluer, CR (2000) The many proofs and applications of Perron’s theorem. SIAM Rev 42:487–498.
  • (32) Restrepo, JG, Ott, E, Hunt, BR (2007) Approximating the largest eigenvalue of network adjacency matrices. Phys Rev E 76:056119.
  • (33) Farkas, I, Jeong, H, Vicsek, T, Barabási, AL, Oltvai, ZN (2003) The topology of the transcription regulatory network in the yeast, Saccharomyces cerevisiae. Physica A 318:601 – 612.
  • (34) Newman MEJ (2003) The structure and function of complex networks. SIAM Rev 45:167.
  • (35) Luscombe, NM, et al. (2004) Genomic analysis of regulatory network dynamics reveals large topological changes. Nature 431:308-312.
Comments 0
Request Comment
You are adding the first comment!
How to quickly get a good reply:
  • Give credit where it’s due by listing out the positive aspects of a paper before getting into which changes should be made.
  • Be specific in your critique, and provide supporting evidence with appropriate references to substantiate general statements.
  • Your comment should inspire ideas to flow and help the author improves the paper.

The better we are at sharing our knowledge with each other, the faster we move forward.
""
The feedback must be of minimum 40 characters and the title a minimum of 5 characters
   
Add comment
Cancel
Loading ...
49386
This is a comment super asjknd jkasnjk adsnkj
Upvote
Downvote
""
The feedback must be of minumum 40 characters
The feedback must be of minumum 40 characters
Submit
Cancel

You are asking your first question!
How to quickly get a good answer:
  • Keep your question short and to the point
  • Check for grammar or spelling errors.
  • Phrase it like a question
Test
Test description