Totally asymmetric exclusion process with site-wise dynamic disorder
We propose an extension of the totally asymmetric simple exclusion process (TASEP) in which particles hopping along a lattice can be blocked by obstacles that dynamically attach/detach from lattice sites. The model can be thought as TASEP with site-wise dynamic disorder. We consider two versions of defect dynamics: (i) defects can bind to any site, irrespective of particle occupation, (ii) defects only bind to sites which are not occupied by particles (particle-obstacle exclusion). In case (i) there is a symmetric, parabolic-like relationship between the current and particle density, as in the standard TASEP. Case (ii) leads to a skewed relationship for slow defect dynamics. We also show that the presence of defects induces particle clustering, despite the translation invariance of the system. For open boundaries the same three phases as for the standard TASEP are observed, albeit the position of phase boundaries is affected by the presence of obstacles. We develop a simple mean-field theory that captures the model’s quantitative behaviour for periodic and open boundary conditions and yields good estimates for the current-density relationship, mean cluster sizes and phase boundaries. Lastly, we discuss an application of the model to the biological process of gene transcription.
Consider an every-day scenario of cars travelling down a road. If the density of cars is low, cars travel smoothly and there is no congestion. However, a high density of cars or the presence of obstacles (e.g. traffic lights) can induce queuing of vehicles which leads to a congested state in which traffic slows down or even comes to a halt. Similar scenarios also occur in the microscopic world of molecular biology. There, “vehicles” can be molecular motors proceeding along intracellular filaments or DNA/mRNA strands, or ions migrating through ion channels.
The basic features of those situations are captured by the totally asymmetric simple exclusion process (TASEP) , the paradigmatic model for stochastic transport in which particles may hinder each other’s movement. In its simplest incarnation, TASEP describes a system of particles hopping unidirectionally between the sites of a one-dimensional open lattice. Only one particle can occupy a given site at a time. This excluded-volume effect leads to particles colliding with each other, causing congestion when the particle density is sufficiently high.
TASEP was originally proposed to model biopolymerization such as the synthesis of RNA on DNA templates  but since then TASEP and related models have been applied to a variety of phenomena: protein production [3, 4], traffic flow [5, 6], the movement of molecular motors [7, 8, 9], transport through ion channels , and pedestrian traffic . From the theory standpoint, TASEP has been extensively studied as an archetype model of jamming [1, 11, 12, 6], helped by the property that it is exactly solvable and that mean-field approach gives the same result as the exact solution . A celebrated property of the TASEP with open boundary conditions, when particles enter the lattice from a reservoir at one end and exit at the other, is the existence of phase transitions, even though it is a one-dimensional system . These phase transitions, between a low density, high density and a maximum current phase, are reminiscent of particle queueing and congestion observed in traffic-like systems as those mentioned above.
In real-world transport, movement is often hindered by obstacles. On a road this may be crossings or traffic lights, while molecular traffic is often impeded by bound proteins or some transient modifications of the “lane” on which traffic occurs. For example, when mRNA is synthesized by RNA polymerase from a DNA template in the process of transcription , the polymerase encounters “roadblocks” that slow down its progress. These roadblocks can be any DNA-bound structural and regulatory proteins that must be removed for the polymerase to proceed, for example histones that form the core of nucleosomes .
Such obstacles – which we shall call defects – have been extensively studied in the context of TASEP, when specific sites or bonds have a hopping rate that differs from others. In this context one can consider single defects [17, 18], or quenched site-wise disorder, the random distribution of spatially varying hopping rates [19, 20, 21, 22, 23, 24]. Although no exact solution of the TASEP with defects exists
Recently, models with dynamic defects have been studied. For example, transcription “roadblocking” has been considered in computational biology literature. Computer simulations of a single dynamic roadblock were able to explain the behaviour of E. coli lac repressor (LacI) . A more complex model involving cooperation between polymerases in removing a roadblock has been applied to explain why transcription is not significantly compromised in the presence of DNA-bound proteins . Non-biological applications include periodically switching traffic lights , obstacles that stochastically move and perform long range hops , or obstacles that bind and unbind stochastically to specific sites [30, 31]. Inherent to these models is that similarly to the static defect case, the translational invariance is broken and thus congestion occurs at defect sites. In fact, for fast defect dynamics, these systems behave very similar to static defect systems.
The situation of random dynamic defects where defects can bind to any site has been less studied. Results exist for the partially asymmetric exclusion process (with hopping in opposite direction allowed) without average bias, in which the locally preferred transport direction varies dynamically  and in the totally asymmetric case for the “bus-route model” , where defects appear randomly, but are removed by particles. In the former work, however, the model is globally symmetric (and not totally asymmetric locally), while in the latter the particle-defect interaction introduces an additional feedback that makes it difficult to identify the plain effect of defects.
Here we propose a simple process, a TASEP with dynamic disorder (ddTASEP), in which defects appear and disappear randomly and uniformly across the lattice, and when present, slow-down or stop particles from moving down the chain of sites. The defects are thought of as obstacles that bind and unbind from an infinite reservoir. In contrast to previous instances of dynamic defects, this system is fully asymmetric, retains translational invariance and, in its basic version, defect dynamics is independent of particle occupation. We will also consider a version in which obstacles and particles are mutually exclusive.
To explore the dynamics of the model, we will first simulate the model on a computer to obtain the current-density relation (CDR) – the relationship between the current of particles and particle density . We will study what effect the dynamic disorder has on the CDR and how it depends on the density of defects and the timescale of defect turnover. For that purpose, we will develop a mean field approach which captures the main features of the CDR and provides a reasonable estimate for the current. We shall see that despite the preserved translational invariance of the system, the distribution of particles exhibits a high degree of inhomogeneity, and we will present a theory for the formation of particle clusters which is able to provide a good estimate for the mean cluster size. These results will be used to predict the effect of dynamic disorder on the phase diagram of the TASEP for open boundary conditions. Finally, we will show that our model can be used to explain some aspects of the global regulation of gene expression.
2.1 Definition of model dynamics
We consider a totally asymmetric simple exclusion process  with dynamic disorder (ddTASEP), in which defects that slow down particles can appear and disappear on any site. Particles and defects reside on sites of a one-dimensional lattice. A particle hops from site to with rate if the arrival site is empty. If the arrival site contains a defect, the particle hops with rate . If , the defect can be thought as representing a physical obstacle blocking the particle. Defects appear and disappear stochastically: A site without a defect acquires a defect with rate , whereas a defect site switches to a non-defect site with rate . Motivated by biological scenarios, we refer to this transition as defect binding/unbinding, respectively.
We consider two variants of the model. In the unconstrained version defects can bind to a site without any restriction. If denotes the absence/presence of a particle at site , is the absence/presence of a defect, and is the hopping rate , we can formally write the model dynamics for bulk sites, , as
We shall further consider two types of boundary conditions (BC): periodic BC (see Section 3), for which particles at site re-enter at site when hopping,
and open BC (see Section 4), for which at sites and particles enter/exit from a reservoir, respectively, with rates and ,
in addition to normal particle hopping, Eq. (1), on site . The defect binding dynamics are the same in the bulk and on boundaries.
In the constrained version, a defect can only bind if the respective site is not occupied by a particle. Equation (3) is then replaced by
The model dynamics are illustrated in Fig. 1. Equations (1-3) define the model via a set of chemical-like reactions. This description is convenient if one wants to study the behaviour of the model numerically. Here we use a Monte Carlo algorithm with random sequential update. A single step of the algorithm consists of choosing a random site, selecting an event (a particle attempts to hop, a defect binds/unbinds) with probability proportional to the rates (1-3), and increasing the time variable by . Although not exact, this algorithm is faster than the exact kinetic Monte Carlo method (Gillespie algorithm ) and the results quickly converge to the exact results for .
A crucial observable in the TASEP and related models is the mean particle current in the steady state. Let us first define the average current at site as the total rate at which particles hop across the bond . In the ddTASEP, a hop occurs with probability whenever site is occupied and site is not occupied, therefore
We note that is the inverse of the mean waiting time per particle, so that the steady state particle current can be also defined as
At steady state, due to the local conservation of particles. In the case of periodic boundary conditions, we are in particular interested in the relationship between current and particle density , i.e., the function , also called the current-density relation (CDR). For the standard TASEP,
which is an inverted parabola, with maximum at . Another quantity of interest is the correlation between particle occupancies at neighbouring sites,
which is an estimate for the deviation of typical mean field approaches from exact results.
3 Periodic Boundary Conditions
We first consider the ddTASEP on a lattice of sites with periodic boundary conditions, according to Eq. (4). In this case the total number of particles is conserved. We are interested in the limit of and fixed density of particles .
3.1 Unconstrained defect dynamics, full-blocking defects ()
We assume that defects block particle hopping entirely, so that the hopping rate in the presence of the defect is , and that binding of a defect is independent of the particle occupation of a site (Eqs. (1 - 3)). Figure 2 shows the space-time plots obtained by computer simulations for different values of the defect binding/unbinding rates and density which in the standard TASEP would lead to smooth (non-congested) flow. Indeed, particles are uniformly distributed over the lattice for high binding/unbinding rates. However, as the rates decrease and defects stay longer on the lattice, particles begin to cluster. This is reflected in the CDR (Fig. 3)
To see if this relationship can be derived analytically, we consider a simple mean-field theory. Since for we have , we have
If we neglect correlations in the random variables and , we obtain . In the steady state, we have , and (the “defect density”)
We note that for we recover the standard TASEP’s exact relationship, Eq. (10). As we can see in Figs. 3(a,b) for large this naive mean field approximation is very accurate. We conclude that for fast defect turnover, sites behave effectively as having a static, effective hopping rate , similarly to what has been conjectured for localised, single dynamic defects .
To see why this approach works so well for , in Fig. 4(a) we plot the correlations and (Eq. 11) as a function of the particle density . The observed correlations are very weak, thus justifying our mean field approximation.
If and , however, this approximation breaks down, see Fig. 3 (c,d) (blue line). Figure 2(c,d) shows that the distribution of particles becomes inhomogeneous for , with pronounced particle clusters and large gaps emerging. It is clear that in this case there will be correlations between particles at neighbouring sites, which is indeed what Fig. 4(b) shows. We also expect a correlation between particle occupation and defect occupation on its right neighbour site, , because a defect causes particles to pile up in front of the defect. Figure 4(d) indeed shows strong correlations. This is in contrast to the lack of correlations for fast defect dynamics (Fig. 4(c)).
To obtain an enhanced mean field theory, which takes into account correlations between and , we make the approximation
The first factor in this expression, , corresponds to the current of particles if there were no particle exclusion. Here, is the free mean waiting time (according to Eq. (9)), in absence of exclusion interaction, and takes into account the correlation between a particle on site and defect occupation on site . With this, the full current can be approximated as
If we can neglect the rebinding of a defect after it has unbound. When a particle encounters a defect (probability ), it waits until the defect unbinds (rate ), and then it hops with rate . When a particle encounters a no-defect site (probability ), it hops with rate . Taking these two processes together, the waiting time of a particle can be approximated by
The steady-state current is thus
Figure 3(c,d) shows that this approximation is substantially more accurate than Eq. (13) in the limit of slow defect dynamics, . The remaining discrepancy between Eq. (17) and simulations is due to the correlation between particle occupancies on neighbouring sites, . We note that this enhanced mean field theory is not valid for large since in that case rebinding of obstacles, before a particle can hop, cannot be neglected, and thus the approximation made in Eq. (16) does not apply.
To understand the observed inhomogeneity of the particle distribution, we measure the mean cluster size while varying the defect binding/unbinding rates . A cluster is defined as a non-interrupted stretch of particles (more than one particle). Figure 5(a) shows the mean cluster size as a function of , for fixed . We observe a non-monotonic dependence of the cluster size on , with the maximum size at . In Fig. 5(b), on the other hand, the mean cluster size is scaled with both and while keeping the defect density fixed ( shown on x-axis). In this case we do not see any significant peak (within error margins) in the mean cluster size.
We can estimate a typical cluster size as follows. A particle cluster is initiated at site when a particle is blocked by a defect site, , while trailing particles between that defect and the following defect towards the left – let it reside at site – pile up. We first consider what happens in two limiting cases when is either very small or very large. For sufficiently small , defect unbinding can be neglected and all particles residing between and pile up to form a cluster. In total, there are on average particles between two neighbouring defects, where is the local particle density, which may differ from the mean density of particles (we shall argue below that ). In that limiting case, the mean final cluster size is . If is large, however, such that the defect at site unbinds before all particles between sites and pile up completely, the growth of clusters is limited by . In that case particles flow into the cluster with free current (since there are no other defects between two neighbouring ones), for a mean time , which leads to a mean final cluster size for . We note that the final cluster size is an over-estimation in this limit, since at the time point of defect unbinding, it exists only for a short time (in contrast to the case of small , when a saturated cluster can exist for a long time). However, the final cluster size yields the correct magnitude of clusters, and allows to compare and interpolate the two limiting cases, as follows.
To determine the cluster dynamics for intermediate time scales we consider the cluster growth dynamics in more detail. Particles between two defects accumulate with rate , where is the time after cluster initiation. The cluster growth can be stopped by the two events, that (i) the initiating defect at site unbinds, with rate , which is related to the time scale , or (ii) all particles between the defects at site and (on average particles) have been exhausted, which happens when the cluster has grown to include all particles between the defects, after a time scale defined by , thus . The stochastic events (i) and (ii) occur independently from each other. The probability that the cluster is still growing by time can then be approximated as . The mean time of cluster growth is thus and hence the mean cluster size is
This indeed interpolates between the limiting cases for small and large , as discussed above.
Finally, we note that the density corresponds to the density of particles flowing out of a dissolving cluster, later initiating a new one. It was argued in Ref.  that dissolving clusters behave locally like a maximum current phase in absence of defects, such that this density corresponds to the maximum current density of the TASEP, .
Inserting and into Eq. (18) we obtain the mean cluster size as a function of and . Figure 5(a) shows the mean cluster size as a function of for fixed while Fig. 5(b) shows for so that is kept fixed. The blue line shows the result from Eq. (18), which confirms that our approximate calculation correctly estimates mean cluster size for both cases. As expected, for large our theory, which considers clusters at their maximum size, over-estimates the simulated value. Crucially, however, the peak in in the case of fixed is accurately reproduced by Eq. (18). This peak is due to the competing limiting cases: For low , as long as is much smaller than , , which increases with . Beyond this point the blocking defect (on average) unbinds before the cluster can grow to its full size (event (i)), and the cluster size is determined by the defect life time, which is . Thus, in this regime the mean cluster size decreases with .
3.2 Constrained defect dynamics and
Now we consider the constrained variant of the model in which obstacles can only bind if a site is not occupied by a particle (Eq. (7)). Figure 6 shows for the same parameters as in Fig. 3. The curve is approximately parabolic for fast defect turnover (), resembling Eq. (10) for the ordinary TASEP. However, as decrease, becomes skewed to the right, with the maximum shifted to . Figure 7 shows the corresponding space-time plots; particles become non-uniformly distributed and form clusters for low , similarly as for the unconstrained binding model.
In order to understand what aspects of particle-defect interactions are responsible for the skewed , we shall consider a range of mean-field approximations of increasing complexity. We start from the naive mean field approach which neglects any correlations. Similar to the unconstrained case, we have , however, since the binding rate depends on the particle occupation, the mean field equilibrium defect density is , i.e. it depends on the particle density
which predicts that the CDR should be skewed around . Nonetheless, as shown in Figure 6, this approximation is not appropriate for either value of the binding parameters . In particular, for large the simulation data shows no significant skewness.
To improve our approximation, we can focus on the dynamics of holes instead of particles. Defects can bind to holes (and only to holes) without constraints, thus this model variant corresponds to hole-wise unconstrained dynamic disorder. Our hole-based mean field approximation follows
which is similar to the mean field approximation in Eq. (14), but here we “pair” with . Crucially, since defect dynamics on holes are unconstrained and thus independent of the holes’ dynamic history, the probability that there is a hole without a defect on site , , is exactly , where is the unconstrained defect density. Thus, the hole-based mean field approximation for the current is
which is identical to the naive mean field theory of the unconstrained case, and which is indeed a symmetric CDR. In fact, we see that this hole-based mean field approximation matches well the CDR for fast defect turnover, Fig. 6(a). This can also be understood intuitively: a particle can only hop if the next site is empty. Due to the fast equilibration of defects, the probability that a particle-free site is occupied by a defect is well approximated by the unconstrained equilibrium value and not by the real, constrained, defect density .
Yet, the hole-based mean field theory is not sufficient to reproduce the CDR for slow defect turnover, Figs. 6(c,d). Again, this is due to significant correlations between the defect site and the occupation of the left neighboring site (see Fig. 8), which were neglected in the hole-based mean field approximation, Eqs. (20) and (21). Nonetheless, we can follow the same, particle-based, approach of the enhanced mean field theory as introduced for unconstrained dynamics (Eq. (14)) and consider the particle current in absence of exclusion interaction to obtain an approximate expression for . The waiting time is determined by the probability to encounter a defect. Now, however, we have to consider the real defect density of the constrained system, . Thus, we can follow the same lines as for the enhanced mean field theory of the unconstrained model, by substituting in Eq. (16) and obtain,
We see that this approximation correctly reproduces the skewness and shift of the maximum of (Fig. 6(c,d), red line) and gives a reasonable estimate for its magnitude.
Space-time plots in Fig. 7(c,d) show that the particle distribution becomes inhomogeneous for slow defects, which results in the formation of particle clusters. We can apply the theory of cluster initiation and growth (Section 3.1) to estimate the mean cluster size. Similarly as for the unconstrained case, cluster growth is determined by two time scales, the life time of an obstacle, and the time until a cluster “saturates” because trailing particles are cut off by a trailing defect, . However, in the constrained case, the defect distribution is not necessarily equilibrated with respect to a cluster that has just been initiated, due to the particle-defect interaction. In order to find the cluster saturation time, we thus follow a different approach, which considers the cluster coagulation and de-coagulation dynamics and which is outlined in detail in Appendix B. There, we obtain the saturation time . Hence, following the same line of arguments that led to Eq. (18) for unconstrained dynamics, we find that the mean cluster size is
3.3 Non-zero defect hopping rates, unconstrained defect binding
We can extend our theory for non-zero slow hopping rates . We shall only consider unconstrained dynamics for simplicity. Our “naive” mean field theory for fast defects trivially generalizes to
and thus the maximum current at reads
Figure 10 shows the maximum current as a function of
To obtain an enhanced mean-field theory, in line with previous approaches in Sections 3.2 and 3.1, we need to consider all possible ways in which a particle can hop to a new, particle-free site. The new site may be without obstacle (case A, probability ), or it may contain an obstacle (case B, probability ). In the first case A, the particle jumps with rate . In the second case B, the particle either waits for the obstacle to unbind (rate ) and then jumps with rate , or it jumps with rate with the obstacle still present at the arrival site. The total jump rate will be the sum of the rates of the latter two processes. In the limit , the time to hop after defect unbinding, is negligible compared to the unbinding time . Then the average waiting time for the jump to occur in absence of a particle on the next site (see section 3.1) is
where we have taken into account the probabilities of both scenarios A,B. This gives the following expression for the current
and the maximum current reads
To obtain the mean cluster size for we need to take into account that a cluster not merely grows by incoming particles, but also shrinks as particles “leak” at the leading edge (defect site) with the slow hopping rate . We thus need to subtract the leakage current from the incoming current of particles. We can assume that for the site immediately after a defect is unoccupied, so that the leakage current at a defect site of a cluster is . Apart from the loss of particles through the leading edge, particles can also be gained through the leakage at the trailing edge (nearest defect side behind the cluster). We have for the leading defect , whereas for the trailing defect at site the occupation probability is . The total net current due to particles leaking in/out of the cluster is thus . The particles lost through leakage during the time that the defect is bound, , need to be subtracted from the cluster size. Note that the “leak” current can occur only if there is a cluster at all, while the expression derived here does not account for this condition. Considering that, by definition, a cluster must have at least two particles, we obtain following expression for the mean cluster size
Note that the time of leakage, may last longer than cluster growth in the absence of leakage.
In Fig. 11 we compare the theoretical curve from Eq. (29) with the results of computer simulations. Our theory matches the data reasonably well for very small , while for larger , when the cluster size approaches the trivial value of two (no clusters) the theory fails because our assumption does not hold anymore.
4 Open boundary conditions
We now consider the unconstrained model
The open boundary conditions can also be modelled by adding boundary reservoirs with fixed boundary densities and on virtual sites and , attached to sites and , respectively. The in- and outflow of particles from the lattice correspond to particles hopping from site to site and from site to site with the same hopping rate as the “regular” hopping rate. It has been shown that the phase diagram of a driven lattice-gas model (such as TASEP) can be obtained by looking at the extrema of the current density relation of the periodic system. This is known as the extremal current principle [34, 35] and it states that
The virtual boundary densities and are in general directly related to the entry/exit rates, and , as shown below. In particular, for the TASEP, and .
Equation (30) shows that the structure of the phase diagram depends crucially on the number of maxima and minima in the current density relation of the corresponding model with periodic BC . In particular, for with a single maximum, as it is the case for our model, the phase diagram must have the same structure as the normal TASEP [34, 35]. Phase boundaries are determined by whether the current depends on the left boundary density (low density phase, LD), the right boundary density (high density phase, HD) or is independent of the boundary conditions (maximum current phase, MC),
where is the density for which is maximised (), which is for the unconstrained ddTASEP. Here, we utilised the symmetry of the CDR, .
In the following we take the continuous limit (valid for large system size ) for which we can approximate the boundary densities as and . These densities can then be determined from the continuity equation for boundary currents via a mean field approximation. In the steady state (), the continuity equations for sites and read:
where we employed a mean field approximation and approximated the current as (note that ). From this it follows that
Thus, by defining the critical entry rates and , as well as the critical exit rate , the phase boundaries, according to the extremal current principle  are
In particular, in the low- and maximum-current phase () the current is given by
We note that, in contrast to the TASEP with quenched site-wise disorder  the maximum current does not depend on the system size . Furthermore, when we express the current as a function of the re-scaled parameter , we get a universal, parameter-free expression for the current in the LD phase,
provided that is large enough to evade the transition to the high-density phase. Equation (36) predicts that if we take the current obtained from simulations for different parameters , divide it by , and plot it as a function of , all curves should ‘collapse’ onto a single, universal curve. To test this prediction we plot the current as function of and in Fig. 12(a) for different values of . According to the extremal current theory, Eq. (34), a transition from a regime in which depends on towards a regime where is independent of should occur at . This is indeed what Fig. 12(a) shows. We thus conclude that the extremal current principle is able to identify the phase transitions correctly in our model.
Another scaling relationship can be found using the scaling parameter , yielding with . In this form, however, the rescaled current still depends explicitly on and . This scaling relationship applies only when the exit rate is small, and is shown in Fig. 12(b) for . Notably, the curve displays a “bump” where it reaches the maximum, i.e. where the transition between LD- and HD-phase is expected. The mean field approach presented here cannot explain this bump. We hypothesize that the bump may be caused by the de-confinement of shocks at the transition point , which introduces strong correlations. Modelling such correlations will, nonetheless, require going beyond the mean-field framework presented in this work.
Figure 12(c) shows the full phase diagram of our model. Phase boundaries obtained from the mean-field theory (represented by lines) estimate well the results of computer simulation (see the figure caption for how the phase boundaries have been determined numerically). Deviations are due to the approximative nature of the mean field approach in Eq. (32) used to obtain .
5 Application to gene transcription
We shall show how our results can be used to explain a curious biological observation. Transcription is a process in which biological cells make mRNA from a DNA template. DNA is a long polymeric molecule made from four monomers (adenine, thymine, guanine, cytosine, abbreviated as A,T,G,C) called nucleotides. In order to produce proteins, DNA must be first transcribed onto another linear polymeric molecule, the mRNA . Transcription is effectuated by a molecular machine – the RNA polymerase – which attaches to a special DNA sequence (transcription start site) and begins to proceed along the DNA, “reading off” the DNA sequence and adding appropriate nucleotides to a newly created mRNA chain. The polymerase detaches from the DNA when it encounters another special sequence (transcription end site).
We can immediately see analogies between transcription and TASEP as they both involve particles moving along a one-dimensional chain. In fact, modelling transcription was the motivation behind the very first TASEP paper . More specifically, transcription initiation, elongation, and polymerase detachment correspond to TASEP particles entering site 1, moving along the chain, and exiting at site , respectively.
The original application of TASEP to transcription did not involve any obstacles. However, we now know that DNA forms a highly dynamic, three-dimensional structure, with many proteins transiently bound to it. Such proteins can be transcription factors whose binding sites occur in many different places on the DNA [37, 38], or histones around which the DNA is wrapped and which are known to impede transcription [16, 39].
We shall now show that our model with dynamic obstacles can explain recent experimental results. It has been shown in Ref.  that the speed with which RNA polymerases move along the DNA and the rate with which mRNA is produced depend on certain genomic features. We are particularly interested in two such features: DNA methylation (fraction of cytosines that have an additional methyl group attached) and CG density (the number of cytosine-guanine dinucleotides per 1000 nucleotides of single-stranged DNA). DNA methylation is known for its regulatory effects on transcription , whereas CG density probably does not directly affect transcription but it correlates with methylation density. In what follows we shall use CG density as a proxy for DNA methylation density since the latter quantity is much more difficult to measure.
Figure 13 (black points) shows the experimentally measured transcription rate versus CG density for a particular cell line from Ref. , see Appendix A for details. To make this plot we binned genes according to their CG density (bin width = 1/1000 nucleotides) and calculated the mean and its standard error in each bin. Clearly, transcription slows down with increasing CG density.
To explain this, we hypothesize that the RNA polymerase is slowed down by obstacles that bind to the DNA. We assume that one site in our model corresponds to 60 nucleotides of the DNA because this is the size of RNA polymerase (one polymerase = one particle in the model). The maximum speed of the polymerase calculated from Ref.  is 55 nucleotides/s. We therefore take () as the obstacle-free hopping rate, and use formula (35) to predict the rate of transcription for each gene. We assume that in (35) is given by Eq. (17) for , that is
The proportionality parameter is used to convert CG density to density of defects . In particular, can be interpreted as the fraction of CG sites occupied by obstacles, divided by the unbinding rate of the obstacle. This parameter, as well as the unknown proportionality factor in gene expression are the only unknown parameters that must be fitted to data (see also Appendix A). The remaining input parameters are the initiation rates and CG densities of individual genes which we take from Refs.  and .
Figure 13 (red line) shows the best-fit transcription rate (averaged over many genes as described above) to experimental data. Note that since genes in different bins may have different initiation rates (known from Ref. ), the theoretical curve appears ‘wiggly’. The best-fit value of is s. While it is not possible to determine the values of and from the ratio alone, we can estimate if we assume a certain density of obstacles. For example, if we take that 50% of CGs are occupied by obstacles (), s and the mean life time of obstacles reads s, which is typical for many DNA-binding proteins [43, 44]. While we cannot unambiguously identify the nature of the obstacles, our calculation shows that transcription slow-down due to dynamic disorder seems to be a plausible biological mechanism.
In this work we study a version of the totally asymmetric exclusion process with dynamic disorder (ddTASEP) in which defects, which slow down the movement of particles or block it completely, appear and disappear randomly on any site. This is motivated by the binding and unbinding of proteins in intracellular transport and DNA transcription, which serve as obstacles to transport, but may also apply to various traffic scenarios in which dynamic obstacles are present (e.g. traffic lights).
We consider two versions of this model, (i) when obstacles appear and disappear independently of particle occupation, (ii) when obstacles can only appear on empty sites.
For periodic boundary conditions we investigate properties of the current-density relation (CDR), i.e., the current as function of the particle density. We perform computer simulations of the model and observe that for unconstrained defect dynamics the symmetric, parabolic form of the CDR of the standard TASEP is preserved, while for constrained defect dynamics, the CDR becomes skewed for slow defect turnover (). We also observe a spatially heterogeneous distribution of particles for slow defect dynamics, in particular the formation of large particle clusters.
To understand the results of computer simulations we develop a range of mean-field approaches of increasing complexity. These enable us to derive analytic estimates for the CDR and mean particle cluster size. These approximations reproduce well the magnitude and features of the CDR, for example the skewness/symmetry for constrained/unconstrained defect dynamics, and of the mean cluster size for varying defect (un-)binding rates.
We also study the model with open boundaries in which particles enter on one end of the lattice and exit on the other. We use an extremal current principle to show that the model exhibits the same phases as the standard TASEP but with altered phase boundaries, which are well-approximated by our mean field theory.
Dynamic defects, in the form of proteins binding to DNA or structural features of chromatin, have been recently recognized as an important determinant of gene transcription. We show that the ddTASEP is able to explain why gene transcription depends on certain genomic features such as CG density and methylation. Our hypothesis is that proteins that bind to these DNA features or chromatin modifications act as obstacles for transcription and block the RNA polymerase – a molecular machine which moves along DNA and produces mRNA.
Besides transcription, we expect that other intra-cellular processes such as transport by motor proteins can be affected by dynamic defects. For example, microtubule-associated proteins which bind to microtubules may obstruct the progress of kinesin and dynein motors. A crucial difference to our model is that motor proteins themselves can (un-)bind from/to transport filaments. This has been modelled by TASEP variants which do not conserve the number of particles, such as the TASEP with Langmuir kinetics . Static defects [45, 46] and dynamically disordered binding rates  have been considered in previous works, yet disorder has not been discussed in terms of obstacles (slow sites). It would be illuminating to see whether the main conclusions for our model (current-density relation, TASEP-like phase diagram, emergence of clusters in the low-density phase) remain true for a ddTASEP with Langmuir kinetics.
Our work closes a substantial gap in the field of driven diffusive systems. While the TASEP with quenched disorder [19, 20, 21, 23, 24], isolated dynamic defects [29, 31, 30, 48], and disorder with particle-induced unbinding  has been studied before, we study for the first time the TASEP with random dynamic disorder. Although we do not progress beyond the mean field theory, we can reproduce many features (magnitude, skewness) of the CDR obtained from computer simulations. It remains an open question whether our model can be solved exactly as in the case of the ordinary TASEP. We think that the persistence of the parabolic shape of the CDR (very much like the CDR of the ordinary TASEP) in the unconstrained version of model may hint towards some hidden symmetries of its steady-state configurations. Finding such symmetries, and exploring connections between this model and zero-range-like processes (cf. Appendix B) will be an interesting future research project.
We thank Luca Ciandrini for help with literature research. B.W. was supported by an RSE Personal Research Fellowship.
We calculated CG density by counting CG pairs for each gene from the human hg19 reference genome data (GRCh37.74), and dividing by the length of gene. We took gene expression levels from Supplementary Table 1, Ref.  (units: RPKM, Reads Per Kilobase of transcript per Million mapped reads). RPKM measure the amount of mRNA from a given gene accumulated in the cell, not the actual transcription rate . However, if we assume that mRNA is degraded with (possibly gene-dependent) constant rate , we expect gene expression level to be proportional to . Assuming further that degradation rates for different genes are uncorrelated, and averaging over many genes (see below) we obtain that RPKM.
To plot gene expression versus CG density, we took pairs (CG density, expression in RPKM) for all genes for which expression had been measured, and binned them according to CG density. Bin 1 contained all genes with CG density between 0 and 1/1000 nucleotides, bin 2 contained genes with CG density between 1/1000 and 2/1000 nucleotides, etc. For each bin we calculated mean expression and its standard error.
To predict gene expression using our model (Eqs. (35) and (37)), we took initiation rates from Additional File 6, Ref.  (units: 1/min). We then calculated theoretical expressions for each gene from the data set for which we knew its initiation rate as
where were two unknown parameters (). We note that here we replaced compared to Eq. (35), since the measured initiation rate corresponds to RNA polymerases that actually ’enter’ the DNA, i.e. in absence of defects at the initiation site. We binned the genes as for the experimental data, and found best-fit that minimized the sum of squared differences between the binned and experimental RPKMs for CG densities between 0 and 25/1000 nucleotides:
where SE denotes standard error of RPKM.
Here we determine the saturation time for clusters growing under constrained defect dynamics, Eq. (7). The saturation time is the time needed for all particles between two defects to accumulate in a continuous queue, in absence of defect unbinding, . The queue length in this limit corresponds to the total number of particles between two defects (as for unconstrained dynamics, see Section 3.1). To approach this problem, we first consider the mapping to a version of the totally asymmetric zero-range process (ZRP)  with site-wise dynamic disorder, which we call ddZRP. The totally asymmetric ZRP is a lattice model in which each site can carry an arbitrary amount of particles, which can hop from site to site with a rate that only depends on the number of particles on the current site , but not on that of any other site . Furthermore, for the disordered case, it may depend on the defect state of each ZRP site , . We map the constraint ddTASEP on the ddZRP by identifying each hole in the ddTASEP (ordered from left to right) as site in the ddZRP (i.e. corresponds to the -th hole counted from the left) and the particles left of this hole, as the particles on site , so that the particle number on each ddZRP site, , corresponds to consecutive stretches of particles, i.e. clusters. Since ddZRP sites correspond to ddTASEP holes, the defect dynamics on ddZRP sites are unconstrained. We note that the ddZRP has a different system size, (the number of holes, which is conserved).
We can now consider cluster dynamics as a coagulation-decoagulation (CD) model, as studied in Ref. . In this view, for simplicity we consider all particles between two defect sites (in the ddTASEP) as a cluster (which is true for most of the time for , when all particles accumulate in a queue).
A cluster moves forward whenever a defect unbinds, with rate . Since after each unbinding event a cluster moves on average ZRP sites, this corresponds to a random walk in the variable ( = elapsed time) with diffusion constant . Two clusters coagulate, forming a single cluster, if any defects between them disappear, i.e. when a cluster moves onto another cluster. As long as all particles in a cluster (all particles between two defects) accumulate on a single site (in the ddZRP), no defects can bind between those particles to separate the cluster, thus it cannot de-coagulate. Only when the cluster moves forward, at rate , the particle of a cluster stretches out over several sites, and defects may bind between particles of a cluster, leading to the de-coagulation of the cluster. The distance to the next defect is on average sites. During this process, particles move with speed between two defects, thus it takes an average time of , where is the cluster size, until the particles have piled up again on a single site. During that time period defects can bind on the sites between the previous and the next defect, thereby separating the cluster. The cluster decoagulation probability is therefore approximated by the defect binding rate , times the number of sites between the initial defect and the next one, , times the time it takes for all cluster particles to reach it, , which is . Thus the de-coagulation rate is . According to Ref. , the equilibrium density of clusters of such a coagulatio/de-coagulation process is , and thus,