# Time-dependent solutions for a stochastic model of gene expression with molecule production in the form of a compound Poisson process

###### Abstract

We study a stochastic model of gene expression, in which protein production has a form of random bursts whose size distribution is arbitrary, whereas protein decay is a first-order reaction. We find exact analytical expressions for the time evolution of the cumulant-generating function for the most general case when both the burst size probability distribution and the model parameters depend on time in an arbitrary (e.g. oscillatory) manner, and for arbitrary initial conditions. We show that in the case of periodic external activation and constant protein degradation rate, the response of the gene is analogous to the RC low-pass filter, where slow oscillations of the external driving have a greater effect on gene expression than the fast ones. We also demonstrate that the -th cumulant of the protein number distribution depends on the -th moment of the burst size distribution. We use these results to show that different measures of noise (coefficient of variation, Fano factor, fractional change of variance) may vary in time in a different manner. Therefore, any biological hypothesis of evolutionary optimization based on the nonmonotonicity of a chosen measure of noise must justify why it assumes that biological evolution quantifies noise in that particular way. Finally, we show that not only for exponentially distributed burst sizes but also for a wider class of burst size distributions (e.g. Dirac delta and gamma) the control of gene expression level by burst frequency modulation gives rise to proportional scaling of variance of the protein number distribution to its mean, whereas the control by amplitude modulation implies proportionality of protein number variance to the mean squared.

###### pacs:

82.39.Rt, 87.10.Mn, 87.17.Aa## I Introduction

It has been confirmed experimentally that in living cells both mRNA Golding et al. (2005) and protein Ozbudak et al. (2002); Cai et al. (2006); Yu et al. (2006); Taniguchi et al. (2010); Choi et al. (2008) production may take form of stochastic bursts of a random size. The presence of bursts may be a result of processes involving short-lived molecules (e.g. the mRNA in case of protein production), concentration of which may be treated as a fast degree of freedom Friedman et al. (2006); Shahrezaei and Swain (2008). The number of protein molecules that can be produced from a single mRNA molecule before the latter is degraded is a random variable, and its distribution may, in the several experimentally known cases, be well approximated by geometric or exponential distribution Cai et al. (2006); Yu et al. (2006); Taniguchi et al. (2010). For that reason, in most of the existing models of bursty gene expression, the exponential (or geometric in a discrete case) bursts of protein Paulsson and Ehrenberg (2000); Friedman et al. (2006); Shahrezaei and Swain (2008); Aquino et al. (2012); Lin and Galla (2016) or mRNA Aquino et al. (2012) production are considered.

However, in the case of eukaryotic cells, certain models predict nonexponential distributions of burst sizes Schwabe et al. (2012); Elgart et al. (2011); Kuwahara et al. (2015). In particular, in the case of transcriptional bursts the molecular ratchet model predicts peaked distributions, that resemble gamma distribution Schwabe et al. (2012). Therefore, it seems desirable to study the analytically tractable models of bursty gene expression dynamics with a general, nonexponential form of burst size distributions.

Also, for the majority of stochastic models of gene expression proposed to date, even if the time-dependent solutions are considered Iyer-Biswas et al. (2009); Tabaka and Hołyst (2010); Ramos et al. (2011); Feng et al. (2012); Pendar et al. (2013); Kumar et al. (2014), it is usually assumed that model parameters are time-independent. However, taking into account the time variation of the model parameters, in particular the periodic time dependence of the rate of protein production Mugler et al. (2010) gives us an opportunity to model in a simple manner the response of a genetic circuit to oscillatory regulation and to indicate some qualitative properties of solutions for other oscillating parameters.

In this paper, we investigate a simple gene expression model, which is a natural generalization of the analytical framework proposed in Ref. Friedman et al. (2006), and which may serve as a model of both transcription and translation Aquino et al. (2012). Namely, in contrast to Ref. Friedman et al. (2006) we consider the case of an arbitrary (not necessarily exponential) burst size probability distribution and time-dependent model parameters. However, gene autoregulation is neglected. To the best of our knowledge, the time-dependent solutions of the model of Ref. Friedman et al. (2006) have not been known to date even in the absence of gene autoregulation or for the simplest case of time-independent model parameters.

We find the explicit time dependence of the cumulant-generating function for the probability distribution of molecule (protein) concentration. This general result is then applied to describe the oscillatory response of a gene to periodic modulation of the rate of protein production. In particular, we consider a gene driven by a single-frequency, sinusoidal regulation. In such a case, the time dependence of the mean molecule concentration consists of both the transient, exponentially decaying part and of the periodic part, whose amplitude depends on the driving frequency. We also point out that the division of the system’s response into periodic and transient part remains true in a more general case, when the model parameters are periodic functions of time.

We also show a simple relationship that links the -th cumulant of the protein number distribution and the -th moment of the burst size distribution. In particular, this relationship is proportional in the steady state. We use these results to discuss the question of possible evolutionary optimization of cellular processes with respect to noise intensity. Since it has been shown experimentally that distributions of protein numbers have universal scaling properties (variance proportional to mean or variance proportional to mean squared) Salman et al. (2012); Bar-Even et al. (2006), we use our results to gain an insight into possible origins of such scalings in the properties of the burst size distributions.

Stochastic models with bursty dynamics similar to the model considered here are known both in mathematics (so-called Takacs processes Cox and Isham (1986); Takacs et al. (1961)) and in physics (under the name of compound Poisson processes), where such models are used not only to describe stochastic dynamics of transcription or translation, but also to model such diverse phenomena as diffusion with jumps Luczka et al. (1995); Czernik et al. (1997); Łuczka et al. (1997), time dependence of soil moisture Porporato and D’Odorico (2004); Daly and Porporato (2006, 2010); Suweis et al. (2011); Mau et al. (2014), dynamics of snow avalanches Perona et al. (2012), statistics of the solar flares Wheatland (2008, 2009) and oil prices on the stock market Askari and Krichene (2008). And therefore, our results may be relevant to other fields beyond stochastic modeling of gene expression.

## Ii Results

Let us consider a source (gene) that creates objects (protein or mRNA molecules) of a single type, denoted by X, which are subsequently degraded or diluted due to the system size expansion, e.g. cell growth and division,

DNA | (1) |

We focus on the simplest situation, when the molecules interact neither with each other, nor with the source. In consequence, the probability of degradation of a single molecule does not depend on the total number of molecules in the system. This assumption leads to a linear decay process (first-order reaction), which is the simplest, but arguably the most natural choice here. Still, we assume that both the source intensity and the decay parameter may vary with time in an arbitrary manner. Therefore, although we assume that the characteristics of the source are independent on the number of molecules present in the system (feedback effects are neglected), we allow the source (gene) to be externally regulated. If the number of molecules is sufficiently large, the continuous approximation is justified and the molecule concentration may be used instead of the exact copy number of molecules.

In order to obtain the stochastic description of the system, we assume that the molecule production takes the form of bursts of random size. Namely, the number of newly created molecules (or the magnitude of a concentration jump in the present continuous model), , is a stochastic variable drawn from the probability distribution , which may be explicitly time-dependent. It is assumed here, that burst duration is short enough that even large bursts can be treated as instantaneous. The time of appearance of each burst is also a random variable.

The occurrence of stochastic bursts in a given system may be due to the presence of some processes that are much faster than production or degradation of molecules in question; such processes are not explicitly taken into account within the model. For example, translational bursts of proteins are attributed to the existence of short-lived mRNA molecules Shahrezaei and Swain (2008); Lin and Galla (2016). However, it is not our aim here to relate the functional form of the burst size probability distribution to dynamics of fast degrees of freedom. Rather, we treat bursty dynamics as a well-justified approximation leading to reasonable effective description of the system at hand.

The deterministic model describing the kinetics of reactions (1) is given by a simple rate equation (60), see Appendix A. Its stochastic counterpart is the following Langevin-like equation

(2) |

where is the molecule concentration and dot denotes the time derivative. appearing in (2) is now a compound Poisson process, i.e.

(3) |

where is the size of the molecule burst (concentration jump) that takes place at and is the number of concentration jumps in the interval .

Stochastic differential equations similar to (2) have been used to model diffusion in asymmetric periodic potentials Luczka et al. (1995); Czernik et al. (1997); Łuczka et al. (1997), soil moisture dynamics and other phenomena in geophysics Perona et al. (2012); Porporato and D’Odorico (2004); Daly and Porporato (2006, 2010); Suweis et al. (2011); Mau et al. (2014), astrophysics Wheatland (2008, 2009) and economics Askari and Krichene (2008).

Instead of Eq. (2) it is more convenient to study the corresponding master equation^{1}^{1}1According to the terminology of Ref. Gardiner (2009), Eq. (4) is a special case of differential Chapman-Kolmogorov equation., proposed in Ref. Friedman et al. (2006)

(4) | |||||

In the above equation, is a time-dependent probability distribution of molecule concentration in the population of cells. We also have

(5) |

where is the burst size probability distribution, denotes Dirac delta distribution, is the burst size, whereas and are time-dependent model parameters (in Ref. Friedman et al. (2006), only time independent model parameters have been considered).

Note that from Eq. (4) one can obtain equations for the time evolution of moments of , see Appendix B. However, solution of the moment equations is tedious, and it is usually much more convenient to work with the moment generating function.

In order to solve Eq. (4), we apply the Laplace transform: , , i.e., . In result, Eq. (4) is transformed into the following first-order linear partial differential equation

(6) |

which can be solved by the standard method of characteristics van Kampen (2007); Arfken et al. (2011). We obtain

(7) |

where

(8) |

is the Laplace transform of the initial probability distribution ;

(9) |

whereas is defined as

(10) |

It can be easily verified that for (7) we have

(11) |

If , , and are periodic functions of time (including constant function treated as a special case of periodic function), and at least one of these three functions is not a constant function, time evolution of has an oscillatory character. More precisely, it is shown that each cumulant of consists of both the periodic part and the exponentially decaying transient terms, cf. Appendix C.

In most cases, given by (7) cannot be expressed in terms of elementary or standard special functions. Even if for some choice of , , and functions it is feasible to obtain a closed analytical formula for , the analytical evaluation of the inverse Laplace transform and hence the explicit analytical form of is usually out of question (a notable exception, for which the explicit form of can be obtained is analyzed in Section II.3).

However, making use of the relationship between , moment generating function and the cumulant generating function ,

(12) |

(13) |

one may find the exact analytical form of the time evolution of moments and cumulants of van Kampen (2007); Gardiner (2009). The cumulants of are of special interest here; from (7), (8), (9), (10), and (13) one gets

(14) | |||||

In the above equation, denotes -th moment of the burst size probability distribution (5), i.e.,

(15) |

From (14) we see that the time evolution of depends only on its initial value, , on the time dependence of the model parameters , , and on the time evolution of -th moment of , but it does not depend explicitly on any other cumulants of or moments of . Note that by using Eqs. (13) and (14) we can reconstruct (at least in principle) the time evolution of , provided that the time evolution of all moments of as well as the initial distribution are given.

Eq. (14) can also be obtained in an alternative way, which does not require the solution of Eq. (6). Namely, dividing Eq. (6) by we obtain the following equation for given by Eq. (13)

(16) |

If we compute the -th derivative of Eq. (16) with respect to -variable, and subsequently put , we get the time-evolution equation for

(17) |

from which we immediately obtain (14).

The two most important cumulants are the mean molecule concentration and variance . In particular, is given by

(18) |

cf. Eq. (67) in Appendix B. and are of special interest also with the connection with two standard noise measures frequently used in biology: the Fano factor and the coefficient of variation , defined as

(19) |

### ii.1 Periodic gene regulation

Let us now analyze the case of a time-independent, but otherwise arbitrary burst size probability distribution , constant decay rate and molecule production rate (burst frequency) of the form

(20) |

where . In other words, our gene is periodically driven with a single angular frequency,

(21) |

where is an oscillation period; is the initial phase. Making use of (14) and (20), one can easily compute time evolution of -th cumulant of . Assuming for simplicity , we get

(22) | |||||

where

(23) |

given by (22) contains both the transient, exponentially decaying terms and the terms which are periodic functions of time, oscillating with an angular frequency of the driving. What is important, and easily visible when is written in a form (22), the oscillation amplitude depends on both and ,

(24) |

(24) is a monotonically decreasing function of , therefore in the present case no resonant behavior should be expected.

In Fig. 1 we plot the time evolution of the average protein number as a function of dimensionless time variable and for various oscillation frequencies corresponding to (blue), (green) and (orange), as well as for the limiting case of nonoscillatory driving (). We assume that , therefore . Also, we assume here that is an exponential distribution (subscript stands for ’exponential’)

(25) |

Moments of (25) are given by

(26) |

In particular, we have

(27) |

Exponentially (or geometrically) distributed sizes of translational bursts have been observed in E. coli Cai et al. (2006); Yu et al. (2006); Taniguchi et al. (2010); Choi et al. (2008). For that reason, (25) appears to be a natural choice of the burst size distribution in the case of stochastic models of gene expression in which particle concentration is used instead of discrete particle number. Note that any other choice of can only affect values of and in Eq. (22); this results in identical rescaling of each plot along the -axis. As can be inferred from Eq. (22), the amplitude of oscillation is the largest for the largest oscillation period.

Similarly, in Fig. 2 we plot the time evolution of the Fano factor (19), again as a function of dimensionless time variable and for the same model parameters as in Fig 1. By employing the L’Hôpital’s rule, it can be shown that for and (25) we have

(28) |

which is close to value () obtained in Ref. (Thattai and Van Oudenaarden, 2001) for a similar discrete model.

Most of the results of the present Section can be immediately generalized to the case of arbitrary periodic dependence of burst frequency

(29) |

Invoking (14), for given by (29) we obtain

(30) |

where

and

In Appendix C we show that the division of into constant, transient and periodic part as given by (30) remains valid when not only , but also or are periodic functions of time.

Finally, let us note that Eq. (17) with given by (20) or, in general case, by (29) has a simple mechanical interpretation. Namely, it is the equation of motion of a particle moving with velocity in a viscous medium under the influence of both the drag force (, with constant ) and the external periodic force . Perhaps an even more compelling analogy is the RC low-pass filter: Fast oscillations of the external driving of gene expression Mugler et al. (2010) have less effect than slow ones.

### ii.2 Time-independent model parameters

#### Time evolution of

When the model parameters do not depend on time, i.e., , , and , Eq. (7) may be rewritten as

(33) |

where

(34) |

(35) |

is given again by Eq. (8), whereas

(36) |

is a special case of (9). In the steady-state limit, from (33) we obtain

(37) |

The form of stationary distribution (we distinguish stationary and nonstationary probability distribution functions by the number of arguments) depends neither on the values of and parameters alone, nor on the initial condition, but only on the functional form of the burst size pdf and value of the parameter (34).

Using (37), we may rewrite (33) as

(38) |

Invoking the following property of Laplace transform Abramowitz and Stegun (1964)

(39) |

where , by taking the inverse Laplace transform of (38) we can express as the convolution of three terms

(40) |

where

(41) |

(37) and cannot simultaneously satisfy the necessary conditions required for the Laplace transform of an ordinary function, in particular the condition . Clearly, the latter condition should be obeyed by , hence we have . This implies that (41) is not an ordinary function, but a distribution consisting of (apart from some ordinary function) superposition of delta distribution and its derivatives. In particular, if is a polynomial of degree ,

(42) |

we obtain

(43) |

If the explicit form of both and (43) is known, it may be feasible to find the explicit form of by invoking Eq. (40) and the identity

(44) |

Derivative on the r.h.s of Eq. (44) should be understood as a distribution derivative Schwartz and Denise (1965). Namely, if has a discontinuity at , but is at least times differentiable for , the -th distribution derivative of reads

(45) |

where denotes distribution related to treated as an ordinary function (not defined at ), whereas Schwartz and Denise (1965). In Appendix F we apply Eqs. (38)-(45) to obtain solution of Eq. (4) with the exponential probability distribution of burst sizes (25) in an alternative way than the one used in Section II.3.

#### Time evolution of cumulants of

If the model parameters do not depend on time, Eq. (14) takes a remarkably simple form

(46) |

where is given by Eq. (15) with . In the limit, from (46) we obtain

(47) |

which also follows from Eq. (79) of Appendix D (in this Appendix, we further elaborate on the relationship between functional form of the burst size probability distribution and the functional form of the corresponding steady-state distribution of protein concentration, ). Using (47), we may rewrite (46) as

(48) |

The time evolution of as given by (46) or (48) consists of the exponentially decaying contribution coming from the initial probability distribution , as well as the contribution proportional to the stationary distribution (47); the latter is completely determined solely by the values of and .

In the present case, if only the initial distribution is known, the time evolution of cumulants may be immediately recovered from (46) if needed. This allows us to concentrate solely on the stationary limit (). Next, by making use of (13) and (46), we can obtain in the form of a power series in variable.

From Eqs. (46) or (48) we see that the higher the cumulant order is, the faster approaches its stationary value. In particular, variance approaches stationary value faster than the mean protein concentration. For , from (47) we obtain a simple relation,

(49) |

The parameter as defined by Eq. (34) is equal to the burst frequency, , multiplied by the characteristic time scale of the system, . Therefore is proportional to the mean number of bursts (in Ref. Friedman et al. (2006) parameter itself is called the burst frequency) and (49) has a simple interpretation, i.e., the average protein concentration (number) is the average burst size times the mean number of bursts in time interval of the length .

For , Eq. (46) can be rewritten as

(50) |

where and is the variance of . The term proportional to in Eq. (50) is related to the stochasticity of the burst size distribution. However, even for the dispersionless () burst size distribution,

(51) |

we have an irreducible contribution to coming from the term proportional to in Eq. (50). For a fixed , (51) minimizes the variance of , a result which could be intuitively expected.

### ii.3 Example: p(x,t) corresponding to the exponential burst size distribution

In this section we find the time-dependent solution of Eq. (4) for the exponential burst size distribution (25). Apart from gene expression models, exponential distribution (25), as well as closely related two-sided exponential distribution found applications in models of other phenomena Porporato and D’Odorico (2004); Daly and Porporato (2006, 2010); Suweis et al. (2011); Mau et al. (2014). It should be also noted that in most cases only for of the form (25) both Eq. (4) and its generalizations (e.g., jump-diffusion equations Luczka et al. (1995); Czernik et al. (1997); Łuczka et al. (1997)) are analytically tractable.

As shown in Ref. Friedman et al. (2006), for the time-independent model parameters, (25) leads to stationary distribution in the form of gamma distribution,

(52) |

This can be readily verified by making use of Eqs. (35) and (37).

(53) |

In the limit, we obtain , i.e., the cumulants of gamma distribution (52). From (13) and (53) the Taylor series expansion of can be reconstructed, we get

(54) |

hence , which is indeed the inverse Laplace transform of gamma distribution (52).

Interestingly, in the present case both the explicit expression for and even for can be obtained, at least for the initial distribution of the form

(55) |

where is the initial molecule concentration. The Laplace transform of (55) is , and hence from (33) we obtain

(56) |

For simplicity, we put (which is arguably the most natural choice in the case of gene expression models). Moreover, we confine our attention to , as only in this case we were able to find compact analytical expression for the inverse Laplace transform of (56). Still, (56) is valid for arbitrary real . It is also convenient to change the independent variable according to . In such a case, reads

where is given by (52), whereas by we denote (56) for and similarly for and . Each of functions (LABEL:solution_for_p_no_hat_x) for is a superposition of gamma distributions (Dirac delta can be also treated as a limiting case of the gamma distribution) with different integer values of and time-dependent weights. Hence, (LABEL:solution_for_p_no_hat_x) is a natural time-dependent generalization of the gamma distribution (52) with , obtained in Friedman et al. (2006), where only the stationary limit of Eq. (4) has been considered.

Note that for , the dependence of (56) on and the mean burst size is of the form (92), therefore , and in particular (LABEL:solution_for_p_no_hat_x) have the characteristic dependence on variable and parameter as given by (93), cf. Appendix E.

An alternative way of obtaining (LABEL:solution_for_p_no_hat_x), its generalization for and its explicit form for small are discussed in Appendix F.

## Iii Discussion, biological insights

The stochastic description of the simple system studied here shares a common feature with the corresponding deterministic model: The time evolution of the average protein number predicted by the stochastic model is identical with the time evolution of the protein concentration obtained from deterministic equations of kinetics (see Appendix A). On the other hand, the evolution of the -th cumulant of the protein number distribution in time depends solely on the behavior of the -th moment of the burst size distribution in time, but it does not depend on its other moments. In consequence, the time evolution of the average molecule number is identical for all burst size distributions which have the same first moments, if only the remaining model parameters are identical. If additionally the time dependence of the second moments of the burst size distributions is identical, we obtain an identical time dependence of the coefficient of variation and the Fano factor of the protein number distributions, the two important measures of gene expression noise. And therefore, the predictions of stochastic models with bursty molecule production are, to a large extent, universal as they do not depend on other details of the burst statistics. This may explain the success of gene expression models that commonly assume exponentially distributed burst sizes, despite the fact that the experimental evidence for this particular burst size distribution can be found in only a few papers Cai et al. (2006); Yu et al. (2006); Taniguchi et al. (2010); Choi et al. (2008). (Note that a somewhat similar conclusion about an unexpected universality of coarse-grained models was drawn by Pedraza et al. Pedraza and Paulsson (2008) in regards to statistics of waiting times between mRNA bursts.)

It should also be noted that the effective bursty dynamics results from the approximation based on integrating out fast degrees of freedom. In order to check the range of validity of this approximation, the dynamics of the effective model (e.g. with protein but without mRNA, as considered here) should be compared with the dynamics of the full model including both slow (protein copy number) and fast (mRNA copy number) degrees of freedom. However, it is expected that predictions of the latter model are in agreement with the predictions of the former for greater than few mRNA lifetimes Thattai and Van Oudenaarden (2001); Shahrezaei and Swain (2008).

The Eq. (46) shows that the relaxation of the variance is twice faster than that of the mean (Fig. 3 A). This has been shown previously for the model of gene expression where mRNA was explicitly taken into account and all reactions were Poissonian Thattai and Van Oudenaarden (2001). The same has been shown in ref. Bar-Even et al. (2006) (supplementary information therein), without referring to any particular reaction statistics. Eq. (46), on the other hand, links that result with the moments of an arbitrary distribution of protein bursts. Below, we will discuss these results in the context of evolutionary optimization of biological processes with respect to time-dependent noise intensity, and also we will relate the behavior of Eq. (46) to experimentally measurable scaling relations between protein mean and variance. Although our model does not account for extrinsic noise nor feedback in gene regulation, our analysis may shed some light on understanding of the relation between protein number statistics and underlying burst statistics.

### iii.1 Optimization of protein level detection with respect to noise is dependent on the assumed measure of noise

Suppose that a cell population expresses a protein at a certain level in given environmental conditions, and then the conditions abruptly change, which results in a change in the expression level. How does the width of the protein distribution vary over time before it reaches a new steady state? Although the stationary behavior of noise in gene circuits has been widely studied, somewhat less studies have been devoted to transient behavior of noise (see e.g. Thattai and Van Oudenaarden (2001); Tabaka et al. (2008); Palani and Sarkar (2012); Dixon et al. (2016)).

The difference in relaxation time scales of the protein mean and variance may result in a nonmonotonic or, at least, nonlinear dependence of noise on time. It would be tempting to put forward a hypothesis that this feature may be exploited by evolution for optimization of some processes with respect to noise: For example, let the gene expression be reduced from an induced level to a basal level, and suppose that this reduction should trigger some other processes in the cell. For the trigger to be maximally precise (such that all cells can detect the decrease in protein concentration at almost the same time), its threshold should not necessarily be located precisely at the basal expression level, but perhaps somewhere higher, where the noise is minimal.

We will show below, however, that such interpretations are dependent on the function assumed to quantify noise. We do not know what measure of noise does the biological evolution use – that probably depends on the nature of a specific biological process. Coefficient of variation (19) seems to be a relatively natural choice because it measures the ratio of distribution width to its mean, so it is a dimensionless quantity. However, Fano factor (19) is also frequently used in literature, a function that measures the ratio of variance to the mean, i.e. the deviation of the process from Poissonian statistics. On the other hand, in the context of detection of transition between two expression levels, an equally natural choice may be the fractional change of the distribution width between the initial and final (stationary) state. One can easily see that each of these quantities behaves differently.

For visualization of the problem, suppose that the proteins are produced in exponential bursts of a mean size . The number of proteins is therefore gamma-distributed with mean and variance . Let the initial expression level be proteins, and after the abrupt environmental change it tends to . Such a change can be attained by two mechanisms: Decreasing (frequency modulation, FM) or decreasing (amplitude modulation, AM). Experimental evidence suggests that cells are able to adjust both and Padovan-Merhar et al. (2015). For and , a ten-fold decrease in by changing at fixed results in a ten-fold change in variance (i). The same decrease in mean protein concentration by changing at fixed yields a change in variance by the factor of (ii).

In the case (i), the coefficient of variation has a deep minimum in , and the Fano factor has a minimum at (a relatively deep one, compared to the initial and final values). These dependencies are different in the case (ii): Here, the coefficient of variation has a shallow minimum at , and the Fano factor decreases almost monotonically by one order of magnitude, with a minimum that is insignificant compared to the total change of (see Fig. 3 B, C).

The situation is still different if one takes into consideration the fractional change in the protein distribution width between the initial and final state. It immediately follows from Eq. (48) that the fractional change of the -th moment,

(58) |

In particular, the square root of the fractional change of the variance is equal to the fractional change of the mean (Fig. 3). If (or its increasing function) is used as the measure of the distribution width, then its minimal value is at . And therefore, the optimization of the position of a detection threshold to minimize noise would be ambiguous, depending on a function chosen to quantify noise.

The above example shows that any biological hypotheses regarding the evolutionary optimization of some processes with respect to the amount of noise must assume that evolution has a specified method of measurement of that noise. If such optimizations really take place in nature, then it seems that the way how the evolution quantifies noise depends on a particular biological process. To date, it is not clear, however, which measure of noise is important in which process. This problem deserves a deepened experimental analysis.

### iii.2 Frequency modulation and amplitude modulation cause different scalings of protein number variance to mean

Experimental results suggest that cells can control gene expression levels both by adjusting the mean burst frequency (frequency modulation, FM) and the mean burst size (amplitude modulation, AM) Padovan-Merhar et al. (2015). With the Eq. (46) of our model, we can relate these two types of burst control to the scaling of mean and variance of protein distributions.

According to Eqs. (46) and (47), the scaling of the variance of the protein number distribution with the mean depends on three parameters that describe burst statistics: mean burst frequency , mean burst size , and the second moment of the burst size distribution, . For simplicity of notation, in the following discussion we will denote by a constant whose value is universal for a set of different genes or for a single gene in cells cultured in various conditions. If the protein number distributions produced by the studied genes obey the scaling (i), then the moments of the burst size distribution depend on each other so that and the mean burst frequency can be arbitrary. On the other hand, if one observes the scaling (ii), then the three burst parameters depend on each other so that

If additionally the burst size distribution is such that , as in our examples in the main text and in the Appendix E ( for delta burst size distribution, for exponential, and for gamma distribution, with defined in the Eq. 101), then (i) implies that the mean burst size is universal, and the gene expression levels in the studied gene set, or in the set of conditions studied, are modulated by varying the mean burst frequency (FM). If, on the other hand, (ii), then the mean burst frequency is universal and the gene expression level is modulated by mean burst size (AM). This dependence of the variance-to-mean relationship on AM or FM has been known Hornung et al. (2012), but an explicit or implicit assumption was that the burst size distributions are exponential. Here, we show that this property also extends to a class of nonexponential distributions.

Scaling (i) was observed, e.g., in S. cerevisiae Bar-Even et al. (2006), where different promoters controlled transcription of their native proteins fused with GFP, under different environmental conditions. Assuming burst size distributions such that , would it be possible that the mean size of protein burst was the same in all these experiments and only the burst frequency varied? This could perhaps be conceivable, if the protein burst size were globally limited by the availability of translational machinery, or if the mRNA of different GFP fusions had simultaneously similar stability and similar translation rates, such that the average number of proteins produced from one mRNA molecule was the same regardless of the gene. The burst frequencies could differ from gene to gene depending on the on/off switching rates of different promoters.

However, we note that if the parameters of the burst size distribution are independent on time, then this fact imposes a particular form of scaling of the mean and variance of protein number distribution with time (46). In the Ref. Bar-Even et al. (2006), the authors observed that when the variance and mean were normalized with respect to the initial state, , then the normalized variance was proportional to the normalized mean even in the time-dependent case out of the stationary state. Although the authors supposed that their theoretical model explained this scaling, it does not seem to be the case. The equations for nonstationary mean and covariance proposed in Bar-Even et al. (2006) are fully consistent with the moment equations of our model and they imply that

(59) |

The value in the braces changes from to , so cannot be maintained constant in time within our model, if the parameters of the burst distribution are constant in time. What if they are time-dependent? Using Eq. 14, one can easily check that no simple substitution of exponential dependence of or allows the function to lose the dependence on time. The problem with time-dependent scaling suggests that as simple a model as ours may not be suitable for description of the data presented in Bar-Even et al. (2006). This also suggests that more detailed studies are needed on time dependence of protein noise and its relation to the properties of burst statistics.

Scaling (ii) was observed, e.g., in E. coli and S. cerevisiae cultured in different conditions Salman et al. (2012). The GFP gene was inserted under the control of three different promoters in multiple-copy plasmids (5 or 15 copies), or integrated into the genome in a single copy. If the distributions of burst sizes were such that , would it be possible to explain this scaling behavior within our model? Gene expression level should be then modulated by the mean burst size, and the burst frequency should be universal. The quadratic scaling of mean vs. variance (Salman et al. (2012), Fig. 3 therein) can be fitted by a one-parameter parabola . We note that the values of are different for the three different promoters. Within our model, this would mean that each promoter has its own characteristic frequency of bursting. This would sound reasonable if single gene copies were studied. However, in the experiments of Salman et al. (2012), the promoters were present in variable numbers of copies. The burst frequencies of the gene copies should then add up (see Eq. 83) and a single promoter should have a lower burst frequency than its multiple copies, unless there is some mechanism of dosage compensation in cells, which keeps the total burst frequency independent on the copy number of a given promoter. Moreover, the universal scaling of the full protein number distributions in Salman et al. (2012) was defined by a function (where denotes standard deviation), so, for example, gamma distribution produced by exponential bursts does not obey that scaling. Therefore, the validity of our model with time-independent parameters and the AM modulation of gene expression seems unlikely in the case of the data presented in Salman et al. (2012).

Yet, the above considerations based on our simple theory reveal that there are still unexplored problems in the field of stochastic gene expression: Are the distributions of protein burst sizes constant in time? Do they always belong to the wide class of those fulfilling , which includes the exponential distribution, commonly assumed in modeling? Under what biological conditions are the protein number distributions controlled by amplitude modulation of protein bursts or by frequency modulation (AM vs. FM)? Do these mechanisms undergo dosage compensation in the case of gene multiplication? The present discussion may be, therefore, an inspiration for a deepened experimental analysis of time dependence of protein number statistics on the underlying burst size statistics.

## Acknowledgments

The research was partly supported by the Ministry of Science and Higher Education Grant No. 0501/IP1/2013/72 (Iuventus Plus).

Corresponding author, e-mail: jjedrak@ichf.edu.pl

## Appendix A Deterministic rate equation

The deterministic model of the reaction kinetics for the system described by Eq. (1) is given by the following rate equation

(60) |

where is the molecule concentration, is the source intensity, whereas is the decay parameter; dot denotes the time derivative. Note that describes a deterministic birth process, in contrast to the random source intensity of the corresponding stochastic model (2). The solution to the Eq. (60) for can be readily obtained:

(61) |

The existence of the stationary solution to Eq. (60), as well as the possible oscillatory character of molecule concentration depends on the functional form of and . Clearly, in the case of time independent model parameters, i.e., and , the unique, stable stationary point of (60) is given by

(62) |

## Appendix B Time-evolution of the moments of p(x,t)

Multiplying Eq. (4) by and integrating such obtained equation, one gets the time evolution equation for the -th moment of ,

(63) |

In the resulting time evolution equation for , the term derived from the first term on the r.h.s of Eq. (4) is readily integrated by parts, whereas in the term containing we have to change order of integration with respect to and as well as to change the independent variables according to , , cf. Ref. Feng et al. (2012). In result, we get

(64) |

where denotes -th moment of the burst size distribution as given by Eq. (15), i.e.,

(65) |

We assume here that the burst size pdf is properly normalized and that the normalization is conserved during time evolution, i.e. , but we impose no other restrictions on the functional form of . On the other hand, the normalization of , i.e., the condition follows immediately from Eq. (64). For , Eq. (64) reads

(66) |

Eq. (66) is identical to Eq. (60) provided that we put . In such a case, the time evolution of the average molecule number is the same as the time evolution of molecule concentration in the corresponding deterministic model (60); this is a general property of linear deterministic dynamical systems van Kampen (2007); McQuarrie (1967). Therefore, making use of Eq. (61), we may immediately write down the solution of Eq. (66)

(67) |

where is defined by Eq. (9), i.e.,

(68) |

For each , Eqs. (64) with form closed hierarchy of linear differential equations (time evolution equation for does not depend on if ). Therefore, in principle, starting from , the explicit analytical formula for can be found iteratively for arbitrary .

## Appendix C Periodic time dependence of the model parameters

Below, we show that the possibility of division of into two parts, the exponentially decaying and periodic one, as demonstrated on the simple example analyzed in Subsection II.1, is in fact a general feature of the present model when its parameters depend periodically on time. We assume now that not only (29), but also , and appearing in (14) are period