# Coherence of Biochemical Oscillations is Bounded by Driving Force and Network Topology

## Abstract

Biochemical oscillations are prevalent in living organisms. Systems with a small number of constituents cannot sustain coherent oscillations for an indefinite time because of fluctuations in the period of oscillation. We show that the number of coherent oscillations that quantifies the precision of the oscillator is universally bounded by the thermodynamic force that drives the system out of equilibrium and by the topology of the underlying biochemical network of states. Our results are valid for arbitrary Markov processes, which are commonly used to model biochemical reactions. We apply our results to a model for a single KaiC protein and to an activator-inhibitor model that consists of several molecules. From a mathematical perspective, based on strong numerical evidence, we conjecture a universal constraint relating the imaginary and real parts of the first non-trivial eigenvalue of a stochastic matrix.

###### pacs:

87.16.-b, 05.40.-a, 05.70.Ln## I Introduction

Circadian rhythms Goldbeter (1997), the cell cycle Ferrell *et al.* (2011) and gene expression
in somitogenesis (3) constitute examples of biochemical oscillations that are of central
importance for the functioning of living systems. While older observations of biochemical
oscillations were made with glycosis Pye and Chance (1966),
more recent advances include the observation of 24-h oscillations of the phosphorylation level of
the Kai proteins that form the circadian clock of a cyanobacterium Nakajima *et al.* (2005); Dong and Golden (2008).
Synthetically engineered genetic circuits can also display
oscillatory behavior Potvin-Trottier *et al.* (2016). On the theoretical side,
the basic conditions for biochemical oscillations to set in are well understood for
deterministic rate equations that ignore fluctuations Novák and Tyson (2008). Such rate
equations correspond to an effective description of an underlying set of chemical reactions that
is fully described by a stochastic chemical master equation.

In principle, biochemical oscillations can happen in a small system with large fluctuations in the chemical
species that oscillates, leading to variability in the period of oscillations. Hence,
stochastic biochemical oscillations cannot be coherent for an indefinite time. The number
of coherent oscillations, which quantifies the precision
of the biochemical oscillator, is given by the time for which oscillations remain coherent
divided by the period of oscillation (9); Cao *et al.* (2015). In such a context, a relevant question is as follows:
Given a biochemical system with significant fluctuations, what is the number of coherent oscillations that
can be sustained?

In a recent work related to this question, Cao et al. Cao *et al.* (2015) have investigated several stochastic
models that display biochemical oscillations. They demonstrated that this number of coherent of oscillations
increases with a larger rate of entropy production, which quantifies the free energy consumption
of the biochemical system. Their work can be seen as part of the recent effort to understand the
relation between a certain kind of precision and free energy consumption in biological systems,
which include studies on kinetic proofreading Qian (2007), adaptation Lan *et al.* (2012),
cellular sensing Mehta and Schwab (2012); Barato *et al.* (2013); De Palo and Endres (2013); Lang *et al.* (2014); Govern and ten
Wolde (2014a, b); Hartich *et al.* (2015),
information processing Sartori *et al.* (2014); Barato *et al.* (2014); Bo *et al.* (2015); Ito and Sagawa (2015); (24), and cost of precision in Brownian clocks Barato and Seifert (2016).
In particular, we have recently shown a general relation that establishes the minimal energetic cost for
a certain precision associated with a random variable like the output of a chemical reaction.
This thermodynamic uncertainty relation Barato and Seifert (2015a); (27); Gingrich *et al.* (2016) can be used to infer an unknown
enzymatic scheme in single molecule experiments Barato and Seifert (2015b) and yields a bound on the efficiency of
a molecular motor Pietzonka *et al.* (2016).

In this paper, we obtain a universal bound on the number of coherent oscillations
that can be sustained in any biochemical system that can be modelled as a Markov process
with discrete states. This universal bound depends on the thermodynamic forces that drive the
system out of equilibrium and on the topology of the network of states.
Our results are derived from a conjecture about the first non-trivial eigenvalue of a stochastic matrix that we
support with thorough numerical evidence. Specifically, we obtain a bound on the ratio of the imaginary and
real parts of this eigenvalue that quantifies the number of coherent oscillations. We illustrate
our results with a model for a single KaiC protein van Zon *et al.* (2007) and with an activator-inhibitor model with several
molecules Cao *et al.* (2015).

The paper is organized as follows. We consider the simple case of a unicyclic network in Sec. II. Our general bound for arbitrary multicyclic networks is formulated in Sec. III. In Sec. IV we apply our results to the two models. We conclude in Sec. V. In Appendix A we provide evidence for the bound for the case of unicyclic networks. The relation between the number of coherent oscillations and the Fano factor is discussed in Appendix B. Numerical evidence for our conjecture is presented in Appendix C. Appendix D is dedicated to the model for a single KaiC. The relation between number of coherent oscillation and the entropy production in analyzed in Appendix E. Finally, Appendix F is dedicated to the activator-inhibitor model.

## Ii Unicycic network

As a simple model for a biochemical oscillation we start with a single enzyme with the unicyclic reaction scheme

(1) |

where are transition rates. A generic transition from state to can represent, for example, a conformational change, binding of substrate to the enzyme or the release of a product from the enzyme. The thermodynamic force driving this system out of equilibrium is given by the affinity Seifert (2012)

(2) |

where Boltzmann’s constant multiplied by the temperature is set to throughout in this paper. For example, if one is consumed and generated in the cycle in Eq. (1), then the affinity is the chemical potential difference .

The model from Eq. (1) follows the master equation , where is the vector of probabilities to be in a certain state. The stochastic matrix is defined by

(3) |

where is the Kronecker delta, for , and for . Let us assume that the enzyme is phosphorylated only in state . The precision of oscillations in the phosphorylation level of an enzyme that is phosphorylated at time is characterized by the number of coherent oscillations in the correlation function plotted in Fig. 1, which is the probability that the enzyme is in state at time given that the enzyme was in state at time , i.e.,

(4) |

where and the subscript indicates the first component of the vector . For large , this correlation function tends to , which is the stationary distribution for state . This stationary distribution is the right eigenvector of the stochastic matrix that is associated with the eigenvalue .

The first nontrivial eigenvalue of the stochastic matrix , gives the decay time and the period of oscillations in Fig. 1. We characterize the coherence of oscillations by the ratio Qian and Qian (2000)

(5) |

where the number of coherent oscillations (9); Cao *et al.* (2015) is .

For general Markov processes that fulfill detailed balance, which corresponds to for the unicyclic model, and there are no oscillations in correlation functions. Hence, a non-zero driving affinity is a necessary condition for biochemical oscillations. In particular, for the case of uniform rates in Eq. (1) given by and , we obtain and .

For the general unicyclic scheme in Eq. (1) with fixed affinity and number of states , the ratio is maximized for uniform transition rates, which leads to our first main result

(6) |

Thus, the maximal number of coherent oscillations in a unicyclic network is bounded by the thermodynamic force and by the network topology through the number of states . The evidence for this bound is as follows. For we can show analytically that uniform rates correspond to a maximum of , whereas for larger we rely on extensive numerical evidence as shown in Appendix A. Specifically, we have confirmed this conjecture up to with both numerical maximization of and evaluation of at randomly chosen rates. Similar to the ratio , the Fano factor associated with the probability current is extremized for uniform rates Barato and Seifert (2015a, b). However, as discussed in Appendix B, the bound in Eq. (6) and this earlier bound on the Fano factor are different results, i.e., one does not imply the other.

## Iii Multicyclic networks

Biochemical networks are typically more complicated than a single cycle. We now extend the bound from Eq. (6) to general multicyclic networks. As an example, we consider an enzyme that consumes a substrate and generates a product . The enzyme has two binding sites, leading to the network of states shown in Fig. 2, which is a common model in enzyme kinetics Barato and Seifert (2015b). The affinity that drives the system out of equilibrium is the chemical potential difference between substrate and product .

This network of states has four types of cycles, as shown in Fig. 2. There are cycles with three states and affinity , like the cycle ; cycles with four states and affinity , like the cycle ; cycles with five states and affinity , like the cycle ; and one cycle with six states and affinity , which is the outer cycle in Fig. 2 that goes through all states. Among all these cycles, the last one with and leads to the largest value of the function . We have verified numerically that indeed bounds the ratio with numerical maximization and numerical evaluation at randomly chosen rates, as shown in Fig. 2. The bound is saturated if the transition rates for the cycle with six states are uniform and much faster than the rates associated with the three links in the middle that are not part of the six-state cycle. In this way, the multicyclic network corresponds effectively to a unicyclic network with six states. In Appendix C, we perform similar numerical tests for several multicyclic networks that do not share any symmetry, and in all cases the ratio follows a similar bound.

Based on this numerical evidence we conjecture the following universal bound on the ratio . Consider an arbitrary Markov process with a finite number of states on an arbitrary multicyclic network. The cycles are labeled by , with a number of states and affinity , where is the product of forward transition rates divided by backward transition rates over all links in the cycle (see Appendix C). The affinity and number of states of the cycle with the maximal value of , defined in Eq. (6), are denoted by and , respectively, i.e., . The ratio is then bounded by

(7) |

The basic idea behind this bound is that the simple unicyclic network in Eq. (1) is a building block
for a generic multicyclic network: any two point correlation function cannot
have a larger number of coherent oscillations than the bound determined by its “best” cycle. Hence,
the number of coherent oscillations is bounded by the thermodynamic force and the topology of the network
of states, as characterized by . Our bound in Eq. (7) leads to two general necessary conditions
for a large number of coherent oscillations, a large number of states and a large maximal affinity.
For biochemical models with irreversible transitions, e.g., the models
in (9); van Zon *et al.* (2007), the affinity formally diverges, and the bound in Eq. (7) becomes
. The weaker second inequality involving the total number of states follows from
a known result about the eigenvalues of a discrete time stochastic matrix (34); (35); (36). For the case of a
complex network of states where identifying the large number of states in a cycle is not feasible, like in the activator-inhibitor
model below, we can use the second inequality in Eq. (7), based on .
We now proceed to illustrate this second main result in two models.

## Iv Case studies

### iv.1 Model for a single KaiC

First, we consider a model for a single KaiC hexamer along the lines of the model proposed in van Zon *et al.* (2007).
The assumptions entering the model, which is depicted in Fig. 3, are the following.
A phosphate can bind to each one of the six monomers, hence the phosphorylation level of the hexamer varies from , with
no phosphate, to , with all monomers phosphorylated. Each of the six monomers can be either active or inactive. However, either
all monomers are active or all monomers are inactive, since the energetic cost of having two monomers with different conformations
is high enough to avoid such configurations. There are a total of states, denoted by for phosphorylated active monomers and
for phosphorylated inactive monomers. If the hexamer is active, phosphorylation reactions occur and if
the hexamer is inactive only dephosphorylation reactions occur.

The transition rates of this model for a single KaiC protein are given in the caption of Fig. 3. The parameter is the chemical potential difference of ATP hydrolysis. The parameter sets the energy of a state that depends on the hexamer activity and on the phosphorylation level. If the hexamer is active the energy of a state is and if it is inactive the energy of an state is . This parametrization implies that the transition rate from to and the transition rate from to are both larger than the rates for the respective reversed transitions. The parameters and are related the time-scales of changes in the phosphorylation level and conformational changes between active and inactive, respectively.

The phosphorylation level of the KaiC protein oscillates with the number of coherent oscillations given by , as shown in in Appendix D. The cycle with the largest value of the function is the cycle that goes through all states with , which is marked with the red arrows in Fig. 3. Hence, for this model we have , as shown in Fig. 3.

For fixed and we obtain the red dashed curve in Fig. 3 for as a function of . Interestingly, while
the number of coherent oscillations has a maximum, after which it decreases to zero with increasing , the entropy
production from stochastic thermodynamics Seifert (2012) is an increasing function of . Hence, the number of
coherent oscillations can also decrease with an increase of the rate of free energy consumption, which provides a counter example
to the relation between the number of coherent oscillations and energy dissipation inferred in Cao *et al.* (2015). We discuss the relation
between and the entropy production further in Appendix E.

The maximal number of coherent oscillations that can be achieved in this model is strictly speaking less than . If a single molecule does not have a large number of states, several coherent oscillations can only be sustained in a system with many molecules as we discuss next in our second example.

### iv.2 Activator-inhibitor model

We consider the activator-inhibitor model from Cao *et al.* (2015), see Fig. 4. The main components of this model are inhibitors , activators and
enzymes that can be in four different states. The enzyme goes through a phosphorylation cycle over these four states, hydrolyzing one ATP thus liberating a
free energy . The enzyme in its phosphorylated form () activates and , whereas inhibits . Furthermore, the enzyme must
bind an in order to phosphorylate. Hence, activates the production of and , while inhibits . This feedback loop leads to oscillations
in, for example, the number of species . Finally, there is a phosphatase that must bind to the enzyme for the dephosphorylation reaction.
Further details of the model are given in Appendix F.

Two important aspects about the behavior of this model are the following. First, the number of oscillations increases with and saturates for large enough . Second, in order for different enzymes to synchronize their cycles, they must compete for the smaller number of phosphatase , where is the total number of enzymes, as explained in Appendix F. However, if is too small, the number of enzymes that synchronize is also too small to generate oscillations. Hence, there is an optimal value for . These features are shown Fig. 4.

Due to the complex network of states of this model we bound the ratio with the second inequality in Eq. (7). The largest affinity is given by , which corresponds to a cycle where all enzymes go through their cycles in a synchronized way.

As shown in Fig. 4, the values of obtained with numerical simulations are approximately one order of magnitude below the fundamental limit set by our bound of , which gives for . This result is reasonable, as saturating the bound in a multicyclic network requires transition rates such that an optimal cycle dominates, which is not the case for the present model. The realization of this optimal cycle in a stochastic trajectory would require an unlikely sequence of events that all enzymes go through their own cycles in a synchronized way.

For the case of a close to optimal value of that maximizes , the number of enzymes that synchronize is roughly . Hence,
cycles with an affinity should be typical. Guided by our bound, it is then interesting to compare the value
of with the estimate . As shown in Fig. 4, indeed this number can give an approximate value of ,
typically overestimating . The estimate is best for and for . For large the bound grows
linearly, while saturates. Our results then indicate that this estimate works well in the regime where is close to its saturation value and
is close to its optimal value. Even though this heuristic argument is restricted to this model, if competition for some scarce
molecule that lead to synchronization is present in the biochemical system, e.g., in the model for the Kai system from van Zon *et al.* (2007),
a similar reasoning that leads to an estimate of the number of coherent oscillations should be valid.

## V Conclusion

In summary, we have conjectured a new bound on the number of coherent biochemical oscillations for systems with large fluctuations. Our universal result depends only on the thermodynamic forces that drive the system out equilibrium and on the network topology through the cycle with the largest function from Eq. (6). Knowledge of the chemical potential differences and the network of states of a biochemical system thus leads to a bound on the number of coherent oscillations. As illustrative examples, we obtained the largest number of oscillations that can be sustained by a single KaiC hexamer and analyzed the activator-inhibitor model, showing that our bound is also valid in models with a number of molecules that is large enough to make the network of states complicated but small enough to keep fluctuations relevant and, therefore, make a description in terms of deterministic rate equations inappropriate.

It remains to be seen whether and how our bound can be used as a guiding principle to understand how systems like circadian clocks have evolved and to engineer systems with precise oscillations in synthetic biology. Our results apply to autonomous biochemical oscillators. Analyzing the relation between precision and thermodynamics for biochemical oscillators that are coupled to an external periodic signal is an interesting direction for future work. Finally, a rigorous mathematical proof of our conjecture about the first excited eigenvalue of a stochastic matrix is an open problem for the theory of Markov processes.

###### Acknowledgements.

We thank F. Jülicher for helpful discussions.## Appendix A Evidence for unicyclic network

We discuss the evidence for the bound in Eq. (6) for the unicyclic scheme. The mathematical problem is to calculate the first non-trivial eigenvalue of the stochastic matrix

(8) |

where is the Kronecker delta, for , and for . The absolute value of the imaginary (real) part of this eigenvalue is denoted ().

For the case this eigenvalue can be exactly calculated with some algebra, leading to

(9) |

where

(10) |

and

(11) |

If , there are no oscillations in correlations functions. We want to find the transition rates that maximize for a fixed affinity

(12) |

This maximum can be found with the Lagrange function

(13) |

where is a Lagrange multiplier. The derivatives of with respect to the transition rates are given by

(14) |

and

(15) |

Due to symmetry, it is easy to deduce the derivatives with respect to and from the above expressions. If we substitute uniform rates and in the above expressions, we obtain that both derivatives become zero with a Lagrange multiplier

(16) |

Hence, we have proved that is extremized for uniform rates. We can easily evaluate for specific rates and check that uniform rates indeed correspond to a maximum.

For larger , up to , we have calculated this eigenvalue numerically. We have maximized the ratio numerically and observed that it is maximized for uniform rates in all cases, providing convincing evidence for the bound in Eq. (6). As an independent check we have also evaluated numerically for randomly chosen rates. As an example, we show a scatter plot obtained with this method in Fig. 5.

## Appendix B Relation between and the Fano factor

We now explain the difference between the bound in Eq. (6) and a bound on the Fano factor obtained in Barato and Seifert (2015a, b). This Fano factor is given by , where is the average probability current and the diffusion constant associated with the current Barato and Seifert (2015a). This bound on the Fano factor can be written as , where the quantity is also maximized for uniform rates. Furthermore, for uniform rates and , implying and . Nevertheless, for arbitrary transition rates there is no such simple relations, with a prefactor that only depends on , between () and ().

For uniform rates . Since can be zero even out of equilibrium and becomes zero only in equilibrium, we know that can be larger than . Evaluating and at different rates we find that can also be larger than . Hence, the bound on the Fano factor from Barato and Seifert (2015a, b) does not imply our result in Eq. (6). Their main similarity is that both the Fano factor and the ratio are extremized for uniform rates: it is common to find a function of several variables that is extremized at a symmetric point.

## Appendix C Evidence for multicyclic networks

In this appendix we explain the numerical evidence for the bound in Eq. (7) for multicyclic networks. For all cases, we have confirmed our bound with numerical calculation of the first nontrivial eigenvalue of the stochastic matrix. We have confirmed the bound with both numerical maximization of and by evaluating at randomly chosen transition rates.

As a first example of a multicyclic network, we consider the network with four states shown in Fig. 6. The numbers represent states and the links between them represent transition rates that are not zero. A transition rate from state to is denoted . This network has three cycles: two cycles with three states and , and one cycle with four states . The affinity of cycle is

(17) |

and the affinity of cycle is

(18) |

The affinity of , which is not independent of and , can be written as

(19) |

For the results shown in Fig. 6, which confirm our bound for this network, we have set the affinities of the cycles as and , which leads to . As shown in Fig. 6, the cycle with the largest value for the function , where is the number of states of cycle , depends on the value of . For the cycle with largest value for this function is with . For the cycle with largest value for this function is with .

The second multicyclic network has four states and is fully connected, as shown in Fig. 7. For this network we have one cycle four states and four cycles with three states. We fix the affinities of these cycles as

(20) |

(21) |

(22) |

where from now on we define the cycles through their affinities. These three affinities determine the values of the two remaining affinities as

(23) |

(24) |

The cycle leading to the maximal value of is the cycle with four states and affinity . The numerical results illustrating the bound for this network is shown in Fig. 7.

Our third example is the network with six states and three cycles shown in Fig. 8. The affinity of the cycle with six states is fixed as

(25) |

We also fix the affinity

(26) |

These two affinities determine the affinity of the third cycle as

(27) |

The dominant cycle is the one with affinity and six states. The numerical evidence for the bound is shown if Fig. 8

The fourth and last example is the network with five states and five cycles shown in Fig. 9. The affinity of the five states cycle is set to

(28) |

There are two four states cycles with affinities

(29) |

and

(30) |

These three affinities determine the affinity of the two remaining three states cycles,which are

(31) |

and

(32) |

The dominant cycle has affinity and five states. The numerical evidence for the bound is shown in Fig. 9.

For the multicyclic network given in Fig. 2 we have performed a similar analysis. This network has a total of 11 cycles. We identify the states in Fig. 2 as , respectively. We have generated two sets of points. The first set has points and we just accepted results fulfilling . For this set we have chosen the rates , , , and as ; the rates , , , , , , , , , , and as ; with uniformly distributed between and . The four remaining rates , , , and were determined by the constraints set by the affinities. The second set has points and the independent rates were chosen as , with uniformly distributed between and .

## Appendix D Phosphorylation level in the model for a single KaiC

We show that the phosphorylation level of the KaiC protein displays oscillations, with the number of coherent oscillations characterized by the ratio .

For the single KaiC model we analyze in Sec. IV.1, the stochastic matrix reads

(33) |

where , , , , and . The first seven states are related to the inactive form of the protein, with state corresponding to and state corresponding to . The last seven states are related to the active form of the protein, with state corresponding to and state corresponding to .

The phosphorylation level of the protein is a state function. Its average is given by the expression

(34) |

is the probability of configuration at time . Initially the phosphorylation level is zero with the protein in state , i.e., . The probabilities at time can be calculated with the expression

(35) |

where and given in Eq. (33). We have calculated the phosphorylation level of the KaiC protein as a function of time from Eq. (34) and Eq. (35). The result is shown in Fig. 10. Clearly the exponential decay of the amplitude and the period of oscillation are determined by the first eigenvalue of the matrix in Eq. (33) (see caption of Fig. 10).

## Appendix E Relation between and entropy production

In this appendix, we discuss the relation between the ratio and the entropy production from stochastic thermodynamics Seifert (2012), which we denote by .

For a unicyclic network with uniform rates this entropy production is given by

(36) |

For large , we obtain , as in Eq. (6), and, from , we obtain . If is interpreted as the rate of heat dissipated to the environment Seifert (2012), is the dissipated heat per period of oscillation. Hence, for a unicyclic network with large number of states , we find

(37) |

This expression is a particular case of the relation found in Cao *et al.* (2015), which states that after some critical value ,
for which oscillations set in, the inverse of the number of biochemical oscillations decay to some plateau
with . For our particular case both and the plateau are zero.
In Cao *et al.* (2015) this relation was demonstrated to be fulfilled for several different models.

While this relation is true for an unicyclic network with uniform rates that maximize , for arbitrary rates can also decrease with an increase in . The entropy production for the generic unicylic model in Eq. (1) reads

(38) |

where is the stationary probability of state . As an example, we consider the unicyclic model for with , and . In Fig. 11 we show that as a function of has a maximum, where we vary the affinity . Therefore, the number of coherent oscillations can also decrease with an increase in energy dissipation. A similar behavior has been observed in Qian and Qian (2000), with the main difference that instead of varying the authors vary the temperature. The maximum of as a function of temperature was identified as stochastic resonance.

## Appendix F Activator-inhibitor model

In this appendix, we define the activator-inhibitor model from Cao *et al.* (2015).
The model has four different chemical species: the activator , the inhibitor , the
enzyme and the phosphatase . An enzyme can be in four different states, which form the
phosphorylation cycle

(40) |

The concentrations of , and are assumed to be fixed. From the generalized detailed balance relation, the rates in Eq. (40) fulfill

(41) |

where is the free energy liberated in one hydrolysis. The activator catalyzes the phosphorylation of the enzyme and the phosphatase catalyzes the dephosphorylation of .

The enzyme in the phosphorylated state catalyzes the creation of both the activator with rate and the inhibitor with rate . The activator can also be spontaneously created with a rate . The inhibitor catalyzes the degradation of with rate and can be spontaneously degraded with a rate . Hence, we have the following chemical reactions

(42) |

These are equilibrium reactions and a cycle that involves only them must have zero affinity. The only way to get a cycle with nonzero affinity in this model is to use the chemical reactions in Eq. (40). The reversed rates are assumed to be very small so that we can set them to zero. Formally, they must be nonzero for thermodynamic consistency, however, in a numerical simulation we can just set them to zero, instead of using a very small that will lead to the same results.

The total number of enzymes , and that of phosphatases are conserved. The number of activators fulfills and the number of inhibitors fulfills . A state of the system is then determined by the vector . The volume of the system is written as and the concentration of the chemical species , for example, is denoted by . The master equation that defines this model reads