# Modeling Evolution of Crosstalk in Noisy Signal Transduction Networks

## Abstract

Signal transduction networks can form highly interconnected systems within cells due to network crosstalk, the sharing of input signals between multiple downstream responses. To better understand the evolutionary design principles underlying such networks, we study the evolution of crosstalk and the emergence of specificity for two parallel signaling pathways that arise via gene duplication and are subsequently allowed to diverge. We focus on a sequence based evolutionary algorithm and evolve the network based on two physically motivated fitness functions related to information transmission. Surprisingly, we find that the two fitness functions lead to very different evolutionary outcomes, one with a high degree of crosstalk and the other without.

###### pacs:

87.10.Mn,87.18.Mp,87.23.Kg,87.16.Acrange-phrase = – \newfloatcommandcapbtabboxtable[][\FBwidth]

## I Introduction

Signaling networks have evolved to transduce external and internal information to inform critical cellular decisions such as growth, differentiation, directional motion, metabolic transitions, and apoptosis (1). These networks can form highly interconnected systems within cells due to network crosstalk, the sharing of input signals among multiple canonical pathways. Crosstalk between pathways accounts for many of the complex behaviors exhibited by signaling networks (2); (4); (3); (5); (6); (7). How did such complex interconnected networks evolve and what constraints did the dynamics of evolution place on their architecture? Can we understand the resulting degree of crosstalk in terms of optimization of fitness associated with the accurate transmission of information?

In cells, there exist examples of both high degrees of crosstalk and high degrees of specificity. As an example of crosstalk, studies have shown interactions between the IGF-I and the TGF- pathways, where in the Hep3B human hepatoma cell line, IGF-I and insulin were each shown to block TGF- induced apoptosis, via a PI3-kinase/Akt dependent pathway (8). In another example of crosstalk, cyclic AMP helps regulate cell proliferation by interacting with the mitogen-activated protein (MAP) kinase pathway (9). More examples can be found in (10); (11); (12); (13). On the other hand, two-component signaling systems, which form the dominant signaling modality in bacteria, exhibit a high degree of pathway isolation and therefore a high degree of specificity (14). Examples of specificity in signaling are found in (15); (16); (17); (18); (19); (20). Indeed, undesirable crosstalk underlies many pathological conditions in higher organisms (21); (22); (23).

Understanding the evolutionary drive toward pathway specificity or crosstalk is a fundamental problem in signal transduction. However, modeling the evolution of crosstalk in signaling networks is challenging because evolutionary processes are governed by changes at the genotypic level, whereas selection occurs at the phenotypic level (24) and the mapping between genotype and phenotype is generally poorly understood. Currently, much of the theory related to evolution of signal transduction networks focuses on changes at the phenotypic level (e.g. changing protein interactions directly) (26); (25). In this paper we adapt a sequence-based evolutionary model due to Zulfikar et al. (27) that allows us to map from sequence space (genotype) to rate constant space (phenotype). For the first time, we apply this model to signal transduction in order to better understand the evolution of crosstalk and the emergence of specificity.

New signaling pathways can enter the genome via gene duplication and subsequent divergence (28). Therefore, in this paper, we consider two parallel pathways that arise via gene duplication but then are allowed to diverge. We evolve our network using two biologically motivated fitness functions related to the transmission of information. For the first fitness function, we focus on a system which may have evolved to transmit the total information content along the signaling network. Drawing from Shannon’s work on communication theory (29), a suitable choice of fitness for this scenario is the total mutual information, denoted by . For the second fitness function, we consider a system where inputs transmitted through their cognate signaling pathways lead to distinct responses. A natural choice of fitness function for this scenario is the sum of the mutual informations of individual pathways, denoted by . We find that the two fitness functions lead to very different evolutionary outcomes. In particular, evolution retains a high degree of crosstalk for the case of while leading to high specificity for .

## Ii Evolutionary Model

In our model of a signaling pathway, we assume two layers of proteins that represent an input-output process. The first layer corresponds to a set of proteins (e.g. cell surface receptors or protein kinases) that become activated by an extracellular signal (e.g. a ligand); the activated fraction of these proteins represents the input. These input proteins, in their active form, can activate a second layer of proteins whose activated fraction represents the output. To study information transmission in this system (see Fig. 1), we employ the chemical Langevin equation, which approximately models the stochastic dynamical behavior of a well-stirred mixture of molecular species that chemically interact (30):

(1) |

is the strength of input , is the concentration of activated output protein (aka the output), and is the inactive concentration, with the total concentration of output protein held fixed i.e. . We assume a background deactivation rate of and , which define our units of time and volume. represents the volume of the system, and controls the level of noise. The factors are reaction rate constants. are temporally uncorrelated, statistically independent Gaussian white-noise terms whose coefficients are defined by the square root of the sum of the activation and deactivation rates, see (31). The have zero mean , and are delta-correlated in time .

Crosstalk in signaling networks results from the intrinsic promiscuity of protein-protein interactions (32). Protein-protein interaction strengths are generally determined by amino-acid-residue interactions at specific molecular interfaces. Moreover, it has been estimated that greater than 90% of protein interaction interfaces are planar with the dominant contribution coming from hydrophobic interactions. For simplicity, we therefore assume that input proteins possess an out-face and output proteins possess an in-face which form a pair of interaction interfaces; we associate a binary sequence, , of hydrophobic residues (1s) and hydrophilic residues (0s) to each interface. The interaction strength between a protein (denoted by index ) and its target (denoted by index ) is determined by the interaction energy between the out-face of the input protein and the in-face of the output protein. represents the effective interaction energy between two hydrophobic residues. (All energies are expressed in units of the thermal energy .) The reaction rate is then given by

(2) |

where plays the role of a threshold energy, e.g. accounting for the loss of entropy due to binding. In our calculations we varied between \numrange120, between \numrange0.20.6, and between \numrange110. We set , and we took the length of each sequence representing an interface to be . These interaction parameters were chosen to provide a large range (\numrange0.00620) for the possible rate constants as a function of sequence and to keep the background deactivation rates small compared to the highest activation rates.

For our evolutionary scheme, we assume a population sufficiently small that each new mutation is either fixed or entirely lost (33); (34). We consider only point mutations - namely replacing a randomly chosen hydrophobic residue (1) in the in- or out-face of one protein by a hydrophilic residue (0), or vice versa. In this study, mutations are accepted if and only if they produce a fitness that is greater than or equal to the current fitness. In this work, we studied two fitness functions based on the mutual information between the inputs and outputs of our system, with MI defined as (29):

(3) |

where always represents a probability distribution function. For simplicity, we chose the input probability distribution to be Gaussian. The mutual information (MI) of two random variables is a measure of the mutual dependence between the two variables. The two fitness functions that we studied here are based on the above general definition of mutual information and can be expressed as follows:

(4) |

Qualitatively, represents the fitness for a system which evolves to transmit the total information content via the entire signaling network, whereas represents fitness for a system where inputs transmitted through their cognate signaling pathways lead to distinct responses.

## Iii Phenotypic Fitness Landscapes

To implement the above evolutionary model, we must be able to calculate mutual information. We use the Fokker-Planck (FP) equation (57) corresponding to our Langevin equation (Eq. 1) to calculate the probability distributions appearing in the MI (Eq. 3). We first consider the simpler case of a one-input, one-output system to develop tools to address multiple input-output systems with crosstalk.

### iii.1 Single Input and Output: No Crosstalk

For a one-input, one-output system, as shown schematically in Fig. b, the Langevin equation can be written as a deterministic part and a stochastic part :

(5) |

where and are defined as follows.

(6) |

The resulting FP equation (in the It formulation (58)) is

(7) |

Note that Eq. S8 has the form of a continuity equation for probability

(8) |

where can be viewed as a probability current. The steady-state solution of the FP equation corresponds to a constant value of . Imposing the boundary conditions at and at then implies that everywhere. The solution of the steady-state FP equation for zero-probability-current boundary conditions can be written as (37)

(9) |

where is a normalization constant. Note that while this conditional output probability distribution is peaked for or higher, it does not resemble a Gaussian distribution even at reasonably large values of (Fig. 2). Additionally, it might appear that the RHS of Eq. S36 approaches as ; however setting and Taylor expanding around , we find that the divergent terms cancel (37).

We can determine the output probability by numerically integrating the conditional output probability over the input distribution as follows,

(10) |

We obtain the mutual information as a function of , as shown in Fig. 3. The mutual information is nearly zero both at very small values of because of low activation and at very large values of because of saturated output. The inset in Fig. 3 shows the maximum value of mutual information as a function of system volume ; the maximum mutual information starts flattening out for .

### iii.2 Duplicated Inputs and Outputs: Nonzero Crosstalk

We now extend the one-input, one-output system to two inputs and two outputs, and allow for crosstalk. The Langevin equations for this system are

(11) |

(12) |

Compactly, the multivariate Langevin equation can be written as

(13) |

where the index can take on the values {1,2}, and where the functions and are represented by the first and second terms, respectively, in Eq. 11 and Eq. 12. The resulting FP equation for the joint probability distribution is (59); (39):

(14) |

The steady-state solution that satisfies the zero-probability-current boundary conditions for Eq. 14 is (37)

(15) |

For notational convenience we have introduced modified rates defined as follows:

(16) |

Having obtained the conditional probabilities, we can numerically obtain the two fitness functions using Eqs. 3 and 4. For , if we set (i.e. corresponding to values of these rate constants close to the optimum of MI for a single pathway, as seen in Fig. 3), then we can depict the density plots of fitness versus. crosstalk, as in Fig. 4, and observe that the optima for both fitness functions occur at zero crosstalk (larger volumes yield qualitatively similar landscapes, see (37) for a calculation with ). Both fitness landscapes look similar and both have a fitness maximum at zero crosstalk. However, Fig. 4 provides only a slice through parameter space. How might an evolving system explore the full space? To answer this question we take an evolutionary approach.

## Iv Evolution of Crosstalk

The fitness landscapes in Fig. 4 are obtained by setting direct rate constants (for ) and by sweeping over values of the crosstalk rate constants . This choice of direct rate constants is motivated by the value of the rate constant in Fig. 3, for which produces a maximum. However, this one slice through the four-dimensional space of rate constants cannot capture the full fitness landscape. Moreover, we expect biologically that random mutations will change more than one rate constant. Motivated by these considerations, we therefore implemented an evolutionary algorithm in which we make random mutations to the binary strings that determine the rate constants (Eq. 2), and accept mutations if and only if they produce a fitness that is greater than or equal to the current fitness (Eq. 4). The initial state of the system corresponds to duplicated pathways where all the rate constants are equal (e.g. for all strings initialized to zero and , ). Fig. 5 shows some sample runs of the evolutionary algorithm for a few different choices of initial conditions; each solid curve represents the average fitness for one hundred runs for a specific set of initial strings, while the shaded regions indicate the 25-75 fitness percentiles at that particular number of accepted mutations over all trajectories. Fig. 5 shows results for , however the results for are the same qualitatively. We can see that the final values of the rate constants do not depend critically on our choice of initial strings.

Surprisingly, evolving leaves the optimized network with a high degree of crosstalk, contrary to our expectations based on Fig. 4. E.g. for the interaction parameter , if we start with low values of all , we typically find that all the rate constants increase simultaneously, as shown in Fig. a, implying high crosstalk. Strikingly, for larger , the majority of runs exhibit bifurcations in rate constants, but still leave the optimized network with a high degree of crosstalk (see Fig. b). In a typical bifurcation, and might dominate while and are suppressed, whereas and might dominate in a different run. These bifurcations yield examples of signal “fan-out” (single input, multiple outputs ) and signal “fan-in” (multiple inputs, single output) (40), found in biological systems (41); (42); (43); (44). Fig. c shows a probability distribution of rate constants after rate constants have stopped changing under evolution; the peaks of the histogram occur at similarly high values of the crosstalk and direct rate constants, implying a high degree of crosstalk as an evolutionary outcome for .

On the other hand, evolution under the fitness function leads to low crosstalk and thus isolated pathways. Fig. a shows a typical run of greedy evolution under . Note that in this typical run, the direct rate constant values grow (e.g. , in the evolved network, corresponding to the optimal values in the single-input single-output case, as in Fig. 3), whereas the crosstalk rate constants stay low (e.g. , ). Fig. b shows a histogram exhibiting separation of crosstalk and direct rate constants, with high values of direct rate constants and low values of crosstalk rate constants.

Our evolutionary approach has revealed essential differences between the two fitness functions. How can we understand the difference in evolutionary outcomes given that the maximum fitness depicted in Fig. 4 occurred at zero crosstalk for both fitness functions? Although the two landscapes appeared similar, it is important to recall that the phase space of the fitness landscapes is really four dimensional and Fig. 4 corresponds to a particular two-dimensional slice. We are then faced with the question of how to construct a lower dimensional slice of the fitness landscapes that could help us understand the difference in evolutionary outcomes. The crucial difference between evolutionary outcomes pertained to the typical ratio between direct and crosstalk rate constants; we therefore want to distinguish between the fitness dependence on the direct rate constants and crosstalk rate constants. Thus, we set , corresponding to the direct rate constant, and , corresponding to the crosstalk rate constant, and construct a two-dimensional slice where one axis represents the direct rate constant and the other the crosstalk rate constant. As shown in Fig. 8, the resulting fitness landscapes reveal a striking difference between the two fitness functions. In particular, we note that while is peaked at zero crosstalk (albeit with some spread to finite crosstalk), is optimal over an entire band corresponding to a range of direct and crosstalk rate constants. This observation helps us understand why the two fitness functions lead to very different evolutionary outcomes, and in particular, why leads to low crosstalk while can result in a high degree of crosstalk. Lastly, to understand the bifurcations observed in rate constants for evolution under for larger values of (Fig. b), we construct another two-dimensional slice of the fitness landscape where we set and and plot the resulting in Fig. 9. We note that while the gradient of along the diagonal is positive, it can be smaller than the gradient along either axis so that could increase in the transverse direction away from the diagonal. For larger , the change in the rate constants due a mutation could be larger, which increases the likelihood for the system to take a larger step away from the diagonal and to subsequently move towards either axis, leading to a bifurcation in the magnitudes of the rate constants.

## V Discussion and Conclusion

We have adapted a sequence based protein-protein interaction model to study the evolution of crosstalk in multiple-input, multiple-output signaling networks. Evolution is driven by random mutations in sequence space whereas selection occurs in the space of phenotypes. Interestingly, we have shown that two fitness functions, and , produce drastically different outcomes. represents the fitness for a system evolved to distribute the total information content of inputs throughout the signaling network whereas represents fitness for a system where inputs are transmitted exclusively through their cognate signaling pathways. Using our evolutionary scheme we have shown that retains a high degree of crosstalk whereas leads to insulated pathways with lowered crosstalk. In addition, we have seen how the evolutionary outcomes can be related to the fitness landscapes. In particular, we found that while is optimized for zero crosstalk, is optimal over a range of crosstalk. Our results pertaining to dependence of on crosstalk are unique to biochemical channels where the strength of the noise depends on input; these results are different from Gaussian channels with constant additive noise where crosstalk always leads to reduction in total mutual information (45).

Our work focuses on stochasticity inherent to biochemical reactions (intrinsic noise) rather than variability in cellular states (extrinsic noise) (46). While generally both intrinsic and extrinsic noise degrade information transmitted through signaling networks, experiments show that signaling networks can mitigate, and potentially eliminate, extrinsic-noise-induced information loss (47). Furthermore, the impact of extrinsic noise decreases with increasing network complexity (48), which justifies our focus on intrinsic noise (note however that owing to its simplicity, our framework can easily be generalized to incorporate extrinsic noise (49)). Our results are also robust to parameter choices. We varied our model parameters , , and such that the resulting rate constants spanned three orders of magnitude and observed similar outcomes in our simulations.

Our work shows that, depending on the choice of fitness function, evolution may or may not suppress crosstalk between signaling pathways. In biology we find systems displaying extensive crosstalk and also systems that exhibit high specificity. For example, signaling networks in eukaryotes display extensive crosstalk: during embryonic development of metazoans, complex and delicate interactions exist between the TGF-/BMP, Wnt/Wg, Hedgehog (Hh), Notch, mitogen-activated protein kinase (MAPK), and other pathways (50); (51). On the other hand, two component signaling networks, comprising the majority of prokaryotic signaling, display very little crosstalk. The results in our paper imply that systems for which inputs have to be integrated in order to produce output, such as quorum sensing (52), would be the appropriate fitness. In cases where distinct inputs require distinct responses from the system, we expect to be the suitable quantity for fitness, in which case our results suggest an evolutionary drive to eliminate crosstalk. An example for the latter is the high osmolarity and starvation response in yeast where the pathways respond to the appropriate environmental cues in very distinct and highly precise ways (53); (54),

We expect our model to be broadly useful for exploring principles of protein network evolution. While simple and easy to implement, the model is biologically grounded in sequence-based evolution, and also physically grounded insofar as all proteins potentially interact with all others. While we have assumed completely uncorrelated input distributions for our system, it would be interesting to explore how correlated inputs might affect evolution of crosstalk. We have also focused on two-layer signaling processes, but these can readily be extended to include multilayer cascades (55). Future work will address the effects of adding feedback, a higher number of pathways, and proteins such as histidine kinases that act both as activators and deactivators (56).

This work was supported in part by the National Science Foundation, Grant PHY-1305525, and the National Institutes of Health, Grant R01 GM082938.

Supplementary Material: Modeling Evolution of Crosstalk in Noisy Signal Transduction Networks

Ammar Tareen Ned S. Wingreen Ranjan Mukhopadhyay

## Appendix A Langevin and Fokker-Planck (FP) Equations

Consider a one-input, one-output system (Fig. S1), and let represent the number of activated output proteins, represent the total number of output proteins, and be the volume of the system. We can write a Langevin equation for information transmission along a single signaling pathway in terms of as follows

(S1) |

If we now use the relations and to change the equation for output protein number into an equation for concentration, we obtain the Langevin equation from the main text

(S2) |

where is constant. The Langevin equation can be written as a deterministic part and a stochastic part :

(S3) |

where functions and are as follows,

(S4) |

(S5) |

From the Langevin equation one can arrive at an FP equation that governs the time evolution of conditional probability . However, it is important to note that a Langevin equation with a concentration dependent noise coefficient (function ) may lead to two different forms of the Fokker-Planck equation depending on the choice of time used in a Wiener process. A choice of midpoint time yields the Stratonovich calculus, which results in the Stratonovich interpretation of the Fokker-Planck equation (57); (58):

(S6) |

A choice of time at the beginning of the integration interval yields the Itô form of the Fokker-Planck equation:

(S7) |

Note that in the Stratonovich formulation, Eq. S6, there is an extra term of the form . Whether such a term should be present in the Fokker-Planck equation can be resolved by writing down the master equation governing the physical process under question and using the Kramers-Moyal expansion to arrive at the Fokker-Planck equation. In our case, the resulting Fokker-Planck is of the form of Eq. S7, without the additional term, thus justifying the Itô formulation (59). These considerations generalize easily to multivariate systems, for example, see Chapter 5 in (59).

## Appendix B Solution of the Fokker-Planck Equation

As we have seen, the Fokker-Planck equation can be written in the form:

(S8) |

We are interested in finding the steady-state probability, which means the left hand side of Eq. S8 is zero. Extracting one derivative with respect to , the equation becomes

(S9) |

which means that the expression inside the outer square brackets must equal a constant, and we can reduce the second-order differential to a first-order differential equation:

(S10) |

Here is an arbitrary constant. Before we proceed, note is that the Fokker-Planck equation has the form

(S11) |

which can be re-written as

(S12) |

Denoting the term in the parenthesis as , we obtain the continuity equation representing the conservation of probability

(S13) |

where corresponds to . is the probability current, which must vanish at the boundaries of and therefore will provide the boundary conditions. Eq. S10 can now be written as

(S14) |

where the prime denotes a derivative with respect to . Rewriting,

(S15) |

(S16) |

We divide through by to obtain (not writing the dependence explicitly, for convenience)

(S17) |

we then have,

(S18) |

Now assume existence of an integrating factor with the property that . We multiply Eq. S18 by to obtain

(S19) |

Using the defining property of , we have

(S20) |

Now we rewrite this as

(S21) |

Integrating both sides,

(S22) |

we have

(S23) |

Now

(S24) |

To determine , we go back to the definition of and write

(S25) |

Integrating both sides again and multiplying by , we have

(S26) |

(S27) |

Exponentiating

(S28) |

renaming the constant

(S30) |

Extracting out a factor of , we have

(S31) |

and we denote the ratio as to write

(S32) |

which is the solution of the Fokker-Planck equation for the one-input, one-output system for arbitrary boundary conditions.

### b.1 FP Solution at

We have noted that the probability current is a constant in steady state. In terms of boundary conditions that needs to satisfy, we note that is a function of and can vary in the range [0-1]. Since the system does not have any sources or sinks at either boundary of , we expect to be zero at both boundaries (this is akin to reflective boundary conditions). Because needs to be constant, the only possible solution is that it is 0 everywhere. However, since our state space of activated protein concentration is finite, the probability need not be zero at the boundary, only the probability current needs to be zero.

Continuing from Eq. S10 in the previous section, the Fokker-Planck equation now becomes

(S33) |

Simplifying notation, and following similar algebra to the previous section, we have:

(S34) |

which is readily solved by

(S35) |

For convenience, we can denote as (normalization constant). To determine the value of the normalization constant, we integrate over the entire concentration interval , and set the result to 1. So the output probability distribution, after substituting in the forms of and and after some algebra, can be written as

(S36) |

Obtaining the solution of the multivariate system is now quite straightforward and follows almost exactly the same procedure.

### b.2 FP Solution at

Exactly at the point , the function is independent of , and the conditional probability at that point simplifies to

(S37) |

### b.3 FP solution near

One might worry that the expression for probability in Eq. S36 diverges as . This section shows that by setting and Taylor expanding Eq. S36 around , the terms that appear to diverge actually cancel out leaving a finite-valued function. We thus rewrite Eq. S36 as

(S38) |

For notational convenience, we define terms , and . Note that the expression “” in Eq. S38 does not include any terms that might potentially diverge, so we focus on terms “” and “”. Rearranging and rewriting, we have

(S39) |

which is equivalent to . It might appear that diverges as approaches zero. To check this, we Taylor expand in powers of around , using the expansion for log). We find

(S40) |

We can then see that the divergent terms cancel:

(S41) |

## Appendix C Chemical Rate Equations for Signaling Systems

The chemical reactions governing our single-input, single-output system are written as

(S42) |

where represents the intermediate complex. The rate of change of concentration of the active fraction of output is

(S43) |

where, for example, represents the concentration of . Under the assumption that the intermediate complex concentration is at steady state (quasi-static approximation), we obtain

(S44) |

which implies

(S45) |

Substituting Eq. S45 into Eq. S43, we obtain

(S46) |

where . For convenience, in the main text, we suppress the square brackets and denote the concentrations just as and . Following the supplementary material in (60), we assume an Arrhenius-type form for the rate , where is a constant and , which implies

(S47) |

where is the interaction energy. We can also write this equation in the form

(S48) |

where and are constants, and the energy is expressed in units of . These equations generalize readily to a system with multiple channels. As mentioned in the main text, in our evolutionary scheme, the interaction energy is determined by the interaction between the in-string of output protein (denoted by index j) and the out-string of input protein (denoted by index i). Fig. S2 depicts an image of how the binary sequences (genotype) of 1s and 0s in our system interact to give rise to binding energies that determine the values of the rate constants (phenotype).

## Appendix D Initial Conditions

From the main text, we know that the final values of the rate constants do not depend critically on our choice of initial strings. Fig. S4 and Fig. S4 show greedy evolution under with duplicated initial and completely arbitrary initial strings as initial conditions, respectively. In our simulations for one channel, initial strings were generated randomly. For two channels, one in-string and one out-string were generated randomly, and then both strings were duplicated. For reference, some initial strings used in simulations are listed below:

= (0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0)

= (0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0)

= (0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0)

= (0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0)

= (0,0,1,0,0,1,0,0,1,0,0,0,0,1,0,0,0,1,0,0,0,0,0,0,0)

= (0,0,1,0,0,1,0,0,1,0,0,0,0,1,0,0,0,1,0,0,0,0,0,0,0)

= (0,0,1,0,0,1,0,0,1,0,0,0,0,1,0,0,0,1,0,0,0,0,0,0,0)

= (0,0,1,0,0,1,0,0,1,0,0,0,0,1,0,0,0,1,0,0,0,0,0,0,0)

= (0,0,1,1,1,1,0,1,0,1,1,1,1,1,1,0,1,0,1,1,1,1,1,0,1)

= (0,0,1,1,1,1,0,1,0,1,1,1,1,1,1,0,1,0,1,1,1,1,1,0,1)

= (1,1,1,1,0,1,0,1,0,0,1,1,1,1,1,0,1,1,0,1,1,1,1,0,1)

= (1,1,1,1,0,1,0,1,0,0,1,1,1,1,1,0,1,1,0,1,1,1,1,0,1)

## Appendix E Metropolis-Hastings Monte Carlo

As a test of whether the failure of to suppress crosstalk was an artifact of our greedy evolutionary algorithm, we also employed a Monte-Carlo algorithm as our evolutionary scheme. Fig. S5 shows a sample run where a Metropolis-Hastings algorithm was implemented for evolution under . The algorithm accepts all positive changes in fitness and accepts negative changes in fitness with a Boltzmann-like probability , and otherwise rejects the changes. is a tunable parameter akin to the selection pressure ( is equivalent to temperature in equilibrium statistical mechanics systems). We used a range of between (0-10).

## Appendix F Robustness of Fitness Landscapes to Strength of Noise

In the main text, we used as the value for our system volume for the evolutionary simulations. Here we ensure that the qualitative behavior of the fitness landscapes remains similar for larger values of system volume . Figs. a and b show the fitness landscapes for as a function of crosstalk versus direct rate constants; while the peaks in the landscapes broaden, the qualitative behavior does not change. Finally, Fig. S7 shows the behavior of the landscape related to bifurcations for (see Fig. 9 in main text).

## References

### References

- J. D. Jordan, E. M. Landau, R. Iyengar, Signaling networks: the origins of cellular multitasking, Cell. 103, 193 (2000).
- S. M. Hill, Receptor crosstalk: Communication through cell signaling pathways, Anat. Rec. 253, 42 (1998).
- J. Ptacek, et al., Global analysis of protein phosphorylation in yeast, Nature, 438, 679 (2005).
- J.A. Ubersax, et al., Targets of the cyclin-dependent kinase Cdk1, Nature, 425, 859 (2003).
- M. L. Schmitz, A. Weber, T. Roxlau, M. Gaestel, M. Kracht, Signal integration, crosstalk mechanisms and networks in the function of inflammatory cytokines, Biochimica et Biophysica Acta (BBA) - Molecular Cell Research, 1813, 2165 (2011).
- M. A Rowland, W. Fontana, E. J. Deeds, Crosstalk and Competition in Signaling Networks Biophysical Journal, 103, 2389 (2012).
- S. M. Lyons, A. Prasad Cross-talk and information transfer in mammalian and bacterial signaling, PloS One 7, e34488 (2012).
- R. H. Chen, Y. H. Su, R. L. Chuang, T.Y. Chang Suppression of transforming growth factor-beta-induced apoptosis through a phosphatidylinositol 3-kinase/Akt-dependent pathway, Oncogene 17, 1959 (1998).
- M. Saxena, S. Williams, K. Tasken, T. Mustelin, Crosstalk between cAMP-dependent kinase and MAP kinase through a protein tyrosine phosphatase Nat. Cell Biol., 1, 305 (1999).
- M. A. Schwartz, M. H. Ginsberg, Networks and crosstalk: integrin signalling spreads Nat. Cell Biol. 4, E65 (2002).
- T. Hunter, The age of crosstalk: Phosphorylation, ubiquitination, and beyond, Mol. Cell, 28, 730 (2007).
- Y. Yan, C. L. Wei, W.R. Zhang, H. P. Cheng, J. Liu, Cross-talk between calcium and reactive oxygen species signaling, Acta. Pharmacol. Sin, 27, 821 (2006).
- H. Nishi, E. Demir, A.R. Panchenko, Crosstalk between signaling pathways provided by single and multiple protein phosphorylation sites, J. Mol. Biol., 427, 511 (2015).
- M. A Rowland, E. J. Deeds, Crosstalk and the evolution of specificity in two-component signaling, PNAS, 111, 5550 (2014).
- M. N. McClean, A. Mody, J.R. Broach, S. Ramanathan, Cross-talk and decision making in MAP kinase pathways, Nat. Gen., 39 409 (2007).
- M. Behar, H. G. Dohlman, T. C. Elston, Kinetic insulation as an effective mechanism for achieving pathway specificity in intracellular signaling networks, Proc. Natl. Acad. Sci. USA., 104, 16146 (2007).
- L. Bardwell, Mechanisms of MAPK signalling specificity, Biochem. Soc. Trans., 34, 837 (2006).
- W. Kolch, Coordinating ERK/MAPK signalling through scaffolds and inhibitors, Nat. Rev. Mol. Cell. Biol., 6, 827 (2005).
- L. J. Flatauer, S. F. Zadeh, L. Bardwell, Mitogen-Activated Protein Kinases with Distinct Requirements for Ste5 Scaffolding Influence Signaling Specificity in Saccharomyces cerevisiae, Mol. Cell. Biol., 25 1793 (2005).
- N. Dard, M. Peter, Scaffold proteins in MAP kinase signaling: more than simple passive activating platforms, BioEssays, 28,146 (2006).
- R. Muller Crosstalk of Oncogenic and Prostanoid Signaling Pathways, J. Cancer Res. Clin. Oncol., 130, 429 (2004).
- W. Shi, A. L. Harris, Notch signaling in breast cancer and tumor angiogenesis: cross-talk and therapeutic potentials, J Mammary Gland Biol. Neoplasia, 11, 41 (2006).
- D. Kalaitzidis, T.D. Gilmore, Transcription factor cross-talk: the estrogen receptor and NF-kappaB, Trends Endocrinol. Metab., 16, 46 (2005).
- S. J. Maynard, The Theory of Evolution, (Cambridge University Press, Cambridge, England, 1993).
- O. S. Soyer, S. Bonhoeffer, Evolution of complexity in signaling pathways, Proc. Natl Acad. Sci. USA, 103, 16337 (2006).
- M. Mobashir, B. Schraven, T. Beyer, Simulated Evolution of Signal Transduction Networks, PLoS ONE 7 e50905 (2012).
- M. Z. Ali, N. S. Wingreen, and R. Mukhopadhyay, Hidden long evolutionary memory in a model biochemical network, arXiv:1706.08499 [q-bio.MN].
- V. Prince, F. Pickett, Splitting pairs: the diverging fates of duplicated genes, Nat. Rev. Genet. 3, 827 (2002).
- C.E. Shannon, A Mathematical Theory of Communication, The Bell System Technical Journal, 27, 379 (1948).
- D. T. Gillespie, The chemical Langevin equation, J. Chem. Phys., 113, 297 (2000).
- D. Gonze, A. Ouattara, Stochastic simulations Application to biomolecular networks, (2014) ‘homepages.ulb.ac.be/ dgonze/TEACHING/stochastic.pdf’.
- J.E. Ladbury, S.T. Arold, Noise in cellular signaling pathways: causes and effects, Trends Biochem. Sci. 37, 173 (2012).
- P. A. P. Moran, Random processes in genetics, Proc. of the Cambridge Philosophical Society, 54, 60 (1958).
- M. A. Nowak, Evolutionary Dynamics: Exploring the Equations of Life, Belknap Press, (2006).
- H. Risken, The Fokker-Planck Equation, Springer-Verlag Berlin Heidelberg (1984).
- N.G. Van Kampen, Stochastic Processes in Physics and Chemistry, 3rd Edition, North Holland (2007).
- See Supplemental Material at [] for calculation details and additional information.
- J.L. Garcia-Palacios, Introduction to the theory of stochastic processes and Brownian motion problems, arXiv:cond-mat/0701242.
- Gillespie, D. T. The multivariate Langevin and Fokker-Planck equations, Am. J. Phys., 64, 1246 (1996).
- J. A. Papin, B. O. Palsson, Topological analysis of mass-balanced signaling networks: a framework to obtain network properties including crosstalk, J. Theor. Biol., 227, 283 (2004).
- Y. Z. Chen, J. Qiu, Pleiotropic Signaling Pathways in Rapid, Nongenomic Action of Glucocorticoid, Mol. Cell. Biol. Res. Commun., 2, 145 (1999).
- S. Pagliari, J. Jelinek, G. Grassi, G. Forte, Targeting pleiotropic signaling pathways to control adult cardiac stem cell fate and function, Frontiers in physiology, 5, 219 (2014).
- J. A. Granek, O. Kayikci, P. M. Magwene, Pleiotropic signaling pathways orchestrate yeast development, Current opinion in microbiology, 14, 676 (2011).
- M. C. Gustin, J. Albertyn, M. Alexander, K. Davenport, MAP kinase pathways in the yeast Saccharomyces cerevisiae, Microbiol. Mol. Biol. Rev., 62, 1264 (1998).
- I. E. Telatar, Capacity of Multi-Antenna Gaussian Channels, Tech. Rep. Bell Labs, Lucent Technologies, (1995).
- J.M. Raser , E.K. O’Shea, Noise in Gene Expression: Origins, Consequences, and Control, Science, 309, 2010 (2005).
- J. Selimkhanov, et al. Accurate information transmission through dynamic biochemical signaling networks, Science 46, 1370 (2014).
- L. Cardelli, A. Csikasz-Nagy, N. Dalchau, M. Tribastone, M. Tschaikowski, Noise reduction in complex biological switches. Scientific Reports, 6, 20214 (2016).
- P. S. Swain, M.B. Elowitz, E. D. Siggia, Intrinsic and extrinsic contributions to stochasticity in gene expression, Proc. natl. Acad. Sci. USA, 99, 12795 (2002).
- L. Attisano , E. Labbe, TGFbeta and Wnt pathway cross-talk, Cancer Metastasis Rev., 23, 53 (2004).
- T. Sumi, N. Tsuneyoshi, N. Nakatsuji, H. Suemori, Defining early lineage specification of human embryonic stem cells by the orchestrated balance of canonical Wnt/beta-catenin, Ac-tivin/Nodal and BMP signaling, Development, 135, 2969 (2008).
- T. Long , K.C. Tu , Y. Wang, P. Mehta, N. P. Ong, et al., Quantifying the Integration of Quorum-Sensing Signals with Single-Cell Resolution, PLoS Biology, 7, e1000068 (2009).
- M. A. Schwartz, H.D. Madhani, Principles of MAP kinase signaling specificity in Saccharomyces cerevisiae, Annu. Rev. Genet., 38, 725 (2004).
- Dohlman HG (2002) Annu Rev Physiol 64:129-152 H. G. Dohlman, G proteins and pheromone signaling, Annual Review of Physiology, 64,129 (2002).
- B. D. Gomperts, P. E. R. Tatham, I. M. Kramer, Signal transduction 1st Edition Amsterdam Elsevier Academic Press, (2004).
- P. M. Wolanin, P.A. Thomason, J. B. Stock, Histidine protein kinases: key signal transducers outside the animal kingdom, Genome Biology 3, 3013.1 (2002).
- H. Risken, The Fokker-Planck Equation, Springer-Verlag Berlin Heidelberg (1984).
- N.G. Van Kampen, Stochastic Processes in Physics and Chemistry, 3rd Edition, North Holland (2007).
- J.L. Garcia-Palacios, Introduction to the theory of stochastic processes and Brownian motion problems, arXiv:cond-mat/0701242.
- M. Z. Ali, N. S. Wingreen, and R. Mukhopadhyay, Hidden long evolutionary memory in a model biochemical network, arXiv:1706.08499 [q-bio.MN].