# The effect of reactions on the formation and readout of the gradient of Bicoid

###### Abstract

During early development, the establishment of gradients of transcriptional factors determines the patterning of cell fates. The case of Bicoid (Bcd) in Drosophila melanogaster embryos is well documented and studied. There are still controversies as to whether SDD models in which Bcd is Synthesized at one end, then Diffuses and is Degraded can explain the gradient formation within the timescale observed experimentally. The Bcd gradient is observed in embryos that express a Bicoid-eGFP fusion protein (Bcd-GFP) which cannot differentiate if Bcd is freely diffusing or bound to immobile sites. In this work we analyze an SDID model that includes the Interaction of Bcd with binding sites. Using previously determined biophysical parameters we find that this model can explain the gradient formation within the experimentally observed time. Analyzing the differences between the free and bound Bcd distributions we observe that the latter spans over a longer lengthscale. We conclude that deriving the lengthscale from the distribution of Bcd-GFP can lead to an overestimation of the gradient lengthscale and of the degree of cooperativity that explains the distribution of the protein Hunchback whose production is regulated by Bcd.

Keywords: Bicoid, SDD model, reactions, mathematical modelling, fluorescence.

## 1 Introduction

During the early development of embryos, cell differentiation is carried out by the transcriptional regulation of gene expression. The establishment of gradients of morphogens determines the patterns of cell fates. Cells get the information on their relative spatial location by “reading” the local concentration of these morphogens [Wolpert, 1969, Crick, 1970]. Drosophila melanogaster embryos constitute a model system in which this subject has been studied with great detail. In this system, bicoid (bcd), a maternal effect gene whose mRNA is localized at the anterior end of the embryo [St Johnston et al., 1989, Little et al., 2011], plays a key role. The protein, Bcd, one of the principal morphogenes in these embryos, is responsible, in conjunction with other factors, of the anterior-posterior (AP) axial patterning. bcd is translated into Bcd mainly at the anterior pole of the embryo [Little et al., 2011], forming a concentration gradient along the anterior-posterior axis. Bcd is a transcription factor for over 20 target genes involved in the development. In particular, its role in the regulation of Hunchback is fundamental during the early embryogenesis of Drosophila. After fertilization, the cell undergoes several nuclear division cycles (n.c) without cytokinesis. After n.c. 7, nuclei move to the surface forming a syncytial blastoderm and hours after fertilization, just before cytokinesis begins (close to n.c. 14) there are approximately 6000 nuclei on the surface. Recently, live imaging using Bcd-GFP allowed the observation of the spatio-temporal distribution of Bcd during these early stages of the embryo development [Gregor et al., 2007a]. From these observations it was determined that the Bcd concentration gradient is established within the first 10 n.c., i.e., 90 minutes after egg deposition. Then, between n.c 10 and 14 the gradient remains almost unchanged and thereafter begins to decrease. The mechanisms by which the Bcd gradient is established so early are still not completely determined [Grimm et al., 2010].

One of the simplest and most widely used models to explain the formation of the Bcd gradient is the Synthesis, Diffusion, Degradation (SDD) model [Driever and Nussleinvolhard, 1988, Driever and Nusslein-Volhard, 1989, Struhl et al., 1989]. It assumes that Bcd is synthesized at a constant rate, , at the anterior end, then diffuses along the antero-posterior axis () of the embryo with diffusion coefficient, , while it is being uniformly degraded with rate, . Assuming, for simplicity, a cylindrical embryo of transverse area, , and total length, , the dynamic equation of the Bcd concentration, , in the SDD model can then be written as:

(1) |

with boundary conditions:

(2) |

For long enough the solution of (1)–(2) coincides with that of:

(3) |

in the region . The stationary solution of (1)–(2) is:

(4) |

where and

(5) |

is the characteristic length-scale of the gradient [Houchmandzadeh et al., 2002]. Observations of the Bcd distribution have led to the estimate . Within the framework of the SDD model, the time it takes for Bcd to diffuse over this distance is of the order of . Thus, for the gradient to be established within 90 min it is necessary that [Gregor et al., 2007a]. The first quantification of the Bcd diffusion coefficient was obtained using Fluorescence Recovery After Photobleaching (FRAP) during mitosis [Gregor et al., 2007a]. The estimated value was which was an order of magnitude too small to explain the establishment of the Bcd gradient within the SDD model during the experimentally observed times. In [Bergmann et al., 2007] it was argued that after nc 14 the gradient does not reach the steady state solution, so that (4) is not valid to estimate the diffusion rate. Other alternatives to the SDD model have also been proposed that involve an active [Gregor et al., 2007b] or advective [Hecht et al., 2009] transport of the Bcd protein. Another model stated that the stability of the gradient between n.c. 10 and 14 could be explained in terms of an underlying mRNA gradient [Spirov et al., 2009]. The distribution of mRNA was measured in [Little et al., 2011] finding that it is bell-shaped with an 80% concentrated within the 20% of the total embryo’s length that lies closest to the anterior pole. This seems to discard the possibility that the Bcd gradient is a simple reflection of the way its mRNA is distributed along the axis. Regarding diffusion, the Bcd coefficient was measured again more recently using Fluorescence Correlation Spectroscopy (FCS) in the cytoplasm during interphase [Abu-Arish et al., 2010] and in nuclei [Porcher et al., 2010]. The value estimated in the former was [Abu-Arish et al., 2010]. In that work the diffusion coefficient was also estimated using FRAP obtaining . Although the FCS result apparently reconciles the observed time it takes for the gradient to be formed with the SDD model, the question arises as to what is the reason for the discrepancy between and . In [Sigaut et al., 2014] an explanation was provided for this apparent discrepancy.

Based on previous studies on the transport of substances that diffuse and react [Pando et al., 2006] and on the analysis of FCS and FRAP experiments in such a case [Sigaut et al., 2010], it was shown in [Sigaut et al., 2014] that both the FRAP and FCS estimates of the Bcd diffusion coefficient [Gregor et al., 2007a, Porcher et al., 2010] could be correct if the interaction of Bcd with immobile or slowly moving binding sites was taken into account. Namely, if Bcd is assumed to diffuse with free coefficient, , and react with more massive binding sites, , according to:

(6) |

its net transport can be described in terms of “effective” diffusion coefficients. As shown in [Pando et al., 2006] the effective coefficient is different depending on whether one looks at the transport of a single molecule in which case it is:

(7) |

or at the dispersion of a collection of them in which case it is:

(8) |

In (7)–(8), is the free diffusion coefficient of and of , is the dissociation constant of the reaction (6) and and are the free and the total binding site concentrations. FRAP provides an estimate of [Sprague and McNally, 2005] and FCS gives estimates of and [Sigaut et al., 2010]. Interpreting the results of [Gregor et al., 2007a, Porcher et al., 2010] within this framework the work of [Sigaut et al., 2014] showed that the FRAP and the FCS estimates of the Bcd diffusion coefficient were compatible. Associating the estimated coefficients to or , depending on the experiment, the concentrations and the dissociation constant were determined [Sigaut et al., 2010]. In the current paper we use the same framework with the parameters determined in [Sigaut et al., 2010] to analyze how changes with space and time along the embryo when the Interaction with binding sites is considered as well as its Synthesis, Diffusion and Degradation. Thus, we analyze the formation of the Bcd gradient within an SDID model. It is likely that Bcd binds cooperatively to binding sites and not as prescribed by (6). Our SDID model should then be interpreted as a toy model where to investigate how the characteristic length and time scales of are affected when the interaction with binding sites is considered. In spite of its simplicity, its predictions can be contrasted with the observations. Furthermore, it helps pinpoint the main drawbacks of interpreting the experimental observations without considering the effect of the interactions of Bcd. In particular, taking into account the distribution of Bcd-mRNA determined in [Little et al., 2011] we find that the SDID model can account for the formation of the bulk part of the Bcd gradient within the experimentally observed times. Although the formation of the gradient is a nonlinear process that involves several timescales, the analysis we present here shows that the collective effective coefficient, , gives a correct estimate of the order of magnitude of the time it takes for to converge to its corresponding stationary distribution.

An important aspect of the bcd morphogen system is the precise response of one of its main target genes, hunchback (hb). As well as bcd, hb is a maternal effect gene. In the early embryo, the hb mRNA is supplemented with zygotically transcribed mRNA which production is regulated by Bcd. The distribution of the resulting protein, Hb, presents very sharp borders along the AP axis, as an “on/off” pattern. This indicates a high sensitivity of the hb mRNA to the concentration of Bcd. In [Struhl et al., 1989, Driever and Nusslein-Volhard, 1989] 7 Bcd binding sites in the hb promoter were identified. Thus, the idea of a cooperative Bcd binding was proposed to explain the sharp borders of [Hb]. This implies that the concentrations of Hb and Bcd are related by:

(9) |

where the Hill coefficient, , accounts for the cooperativity and
is the Bcd concentration at which [Hb] reaches half of
its maximun value. Using embryos immunostained for DNA, Bcd and Hb,
scatter plots of vs. were obtained
in [Gregor et al., 2007b]. From these plots a Hill coefficient,
, was estimated. In spite of the relatively large value of ,
the scatter plots also showed a remarkable degree of precision between
the distributions of Hb and Bcd ( near the point of
half-maximal activation) [Gregor et al., 2007b] in spite of the
fluctuations that are intrisic to the transcription
process [Little et al., 2013]. The control experiments
of [Gregor et al., 2007a] showed that the fluorescence intensity
obtained in antibody stainings was linearly related to the one
collected from Bcd-GFP. The latter allows the observation of the
spatio-temporal dynamics of in vivo. It is important to
remark that in experiments that use Bcd-GFP, it is not possible to
distinguish if Bcd is free or bound: the fluorescence distribution
profile corresponds to the total . This means that the actual
dynamics of the free Bcd is hidden in the observations. The length and
time scales of free Bcd could differ, in principle, from that of the
Bcd-GFP observed experimentally. In fact, in the present paper we show
that this is the case using the SDID model with realistic parameter
values. Assuming that the length-scale determined from the
observations of Bcd-GFP corresponds to that of free Bcd can lead to
incorrect estimations of the Bcd diffusion coefficient and/or
degradation rate. The difference in the length-scale of the free and
the total , on the other hand, can have implications for the
relationship between and that is derived from the
fluorescence observations. As we show in the present paper, depending
on what the binding sites of the SDID model represent, the Hill
coefficient that characterizes the cooperativity with which Bcd
promotes the production of Hb can be different from the one that is
directly derived from the scatter plot of the observed
fluorescence. This, in turn, calls for a revision on the conclusions
about the precision with which hb reads the distribution.

## 2 Methods

### 2.1 The SDID model

We consider a model in which Bcd is synthesized over a region of the anterior pole embracing 20% of the embryo, diffuses with free coefficient, , interacts with a single type of binding sites according to (6) and is degraded at a constant rate. We consider a cylindrical domain in which all concentrations only vary along the axial coordinate, , (). The assumptions on the Bcd synthesis are based on the observations of [Little et al., 2011] according to which of the mRNA of Bcd is located in a region that starts at the anterior pole and extends up to of the length of the embryo. Although the mRNA concentration in this anterior region is not perfectly uniform, we simplify its description and assume that is synthesized with a rate:

(10) |

where is constant. We assume that the binding sites are uniformly distributed over the whole embryo and diffuse with coefficient, . We suppose that the binding sites belong to molecules that are much more massive than Bcd so that their mobility remains unaltered when they bind Bcd. In certain instances we also assume that they are immobile. We consider different alternative versions of the model that differ in the manner in which we treat Bcd degradation or in whether we include the dynamics of Bcd-GFP maturation or not. Here we present the equations of the version that we call of “partial degradation” because we assume that Bcd is degraded only in its free form. This version does not include the process of GFP maturation either. The equations of the other versions of the SDID model are described in the Appendix. The spatio-temporal dynamics of the concentrations if Bcd is degraded only when free and the process of Bcd-GFP maturation is not included is given by:

(11) |

where and are the total Bcd and binding site concentrations, respectively and is the degradation rate. To simplify the notation we refer to the concentration of free Bcd as . It should not be confused with Bcd, which generically refers to the protein. In this model we assume that all Bcd, free or bound to sites, is fluorescent.

### 2.2 Analytical estimations: the SDID model under the fast reactions approximation

The SDID model in all of its versions is a reaction-diffusion system. The analysis of this kind of systems may be complicated. Hereby, in order to interpret some results and choose a priori some parameters we introduce an approximation where the Bcd transport is described in terms of an effective diffusion coefficient. If we consider that reactions take place on a much faster time scale than diffusion a fast reaction [Strier and Dawson, 2000] or fast buffering approximation [Wagner and Keizer, 1994] can be used. Under this condition the systems given by (11) (or (32)) can be described as:

(12) | |||||

(13) |

where is the effective (collective) diffusion coefficient defined in (8), , and for the partial degradation model (11), and for the total degradation model (32). Given that , for simplicity we will consider . In such a case the term in (12) can be neglected and the first of the equations reduces to:

(14) |

Although this looks like a linear diffusion equation it is not, since , and depend on , i.e., they are position and time dependent. The steady-state solution of (14) coincides with that of (11) for .

### 2.3 Numerical simulations

We solve the system of equations (11), (32) and (35) using the Douglas-Ratchford ADI method [Douglas and Rachford, 1956]. The integration domain is a cylinder of length with no flux boundary conditions at both ends. We list in table 1 the parameter values that we use. For the concentrations, dissociation constant and free diffusion coefficients we use the estimates presented in [Sigaut et al., 2014] which were derived from an analysis of the experiments of [Abu-Arish et al., 2010]. We show in section 3.1 how the rest of the parameters were chosen.

### 2.4 Choice of parameter values

We here describe how we choose the parameter values in the case of the SDID model with partial degradation. To have a good starting point, we first look at the stationary solution of (11) with the goal of comparing semi-quantitatively the observed fluorescence profile with the (stationary) distribution of the total Bcd concentration, i.e., of . To this end we set , a reasonable approximation given that . Assuming that the continuous production of Bcd eventually saturates the binding sites inside the region where Bcd is synthesized (i.e., for ) we estimate that there is a subregion, , where and are approximately uniform with and in equilibrium between themselves:

(15) | |||||

(16) |

and where there is a balance between the rate of Bcd production and degradation. Thus, we expect the stationary solution to satisfy:

(17) | |||||

(18) |

close to the anterior pole. The observed Bcd concentration has been estimated as at the anterior pole [Abu-Arish et al., 2010]. Therefore, we choose , so that at . Outside the region of Bcd synthesis (), the stationary solution also satisfies the equilibrium condition (15)–(16). The stationary distribution of free Bcd in this region then satisfies

(19) |

Thus, it is given by (4) with equal to:

(20) |

decays with a different lengthscale. Namely, defining (for ) and using (4) and (15) we obtain

(21) |

where and are given by (7)–(8) with . Clearly, (21) does not prescribe a single lengthscale, , since and depend on which is not uniform for . However, using some “typical” concentration values along the gradient we find a first estimate of . In particular, considering that corresponds to the observed characteristic lengthscale of the fluorescence gradient, , that and that in the region where FCS experiments are performed [Sigaut et al., 2014], (20)–(21) yield . Then, through the constraint that the total concentration observed at the anterior pole imposes on (18), we derive . Based on these a priori estimates we then explore the parameter space and choose final values that are able to reproduce semi-quantitatively the experimental observations. In particular, we found that allowed to reproduce most properties of the observed gradient. Using this value of , (16) and , we estimated .

## 3 Results and discussion

### 3.1 Reproduction of the gradient.

Here we show the results obtained through numerical simulations of the SDID model in its different versions and describe how the solutions depend on the various parameters. In the simulations we use the values, , , , , and (the latter, at the anterior pole) derived in [Sigaut et al., 2010]. We use [Gregor et al., 2007a] and the rate, , at which Bcd-GFP matures and becomes fluorescent estimated in [Drocco et al., 2011, Liu et al., 2013]. The other parameters were chosen so that the solutions reproduced semi-quantitatively the experimental observations of [Gregor et al., 2007a, Little et al., 2011] as explained in Methods and the Appendix.

We show in figure 1 the concentration of the different species along the AP axis obtained at time, min, using the partial (a) and total degradation models (b) with the parameters of table 1 and the others as described in the caption. For both models we see that the Bcd profile is consistent with the experimental observations. In particular, its concentration decays to of its maximum value at . We also observe that most Bcd molecules are bound at the anterior pole (as estimated in [Sigaut et al., 2014]). As we move away from the anterior pole, begins to decrease and the number of free binding sites increase. Since the mobility of the bound molecules is slower than that of the free ones, the gradient of , and hence of , is more extended than that of . Comparing figures 1 (a) and (b) we observe that the Bcd distribution is slightly different due to the difference in how the Bcd degradation is treated. In particular, near the anterior pole the slope of is more pronounced in the total degradation case. This agrees with a small difference in the lengthscale of the gradient obtained with each model. In the case of the partial degradation model, decays to of its maximum at , while in the case of the total degradation model the same percent is achieved at . figure 1 also shows that in neither model the free or the total Bcd concentrations follow the mRNA distribution given by . This implies that, for our model, the Bcd concentration distribution is not a passive reflection of the mRNA that produces it, as assumed in [Spirov et al., 2009].

Comparing the values of and used in figure 1 with previously reported ones, [Grimm et al., 2010, Gregor et al., 2007a, Drocco et al., 2011], we observe that the one used in the total degradation model is within this previous estimated range, while the one used in the partial degradation model is larger. This discrepancy is reasonable if we take into account that degradation rates were estimated from fluorescence records that did not distinguish between bound and free Bcd. Namely, if only free Bcd is degraded with rate , (i.e., ), the rate at which decreases due to degradation is given by (i.e., ). Degradation rates determined from fluorescence images would then correspond to . For the parameter values of figure 1(a), this fraction it is , which allows to explain in part the discrepancy between the degradation rate used, , and the values reported previously in the literature [Grimm et al., 2010, Gregor et al., 2007a, Drocco et al., 2011].

Regarding the rate of protein synthesis, , the number of mRNA molecules inside the embryo during n.c. 10-13 was estimated as molecules [Little et al., 2011]. Considering that mRNA molecules are mainly synthesized over a region that occupies % of the total length of the embyro, that each molecule can synthesize one protein per second [Milo and Phillips, 2015], that the region where nuclei are located and Bcd diffuses is the -wide outermost “slice” of the embryo and approximating the latter by an ellipsoid of radii , and , the mRNA molecules imply a Bcd synthesis rate. This value is similar to the ones used in the SDID model with partial or total degradation (see figure 1).

So far we have discussed the spatial properties of the gradient. The time it takes for the gradient to be formed is another important aspect that has been debated at large, mainly because the diffusion coefficient estimated in [Gregor et al., 2007a] was too small to account for the gradient formation within the experimentally observed times. Changing the reaction rate, , while keeping constant it is possible to modify the timescale over which the reactions take place. We thus explored the predictions of the SDID model using the parameters of figure 1 varying over the range . We found that the results were mostly insensitive to variations in (data not shown). We observed that the concentration distributions only changed for the most extreme (unrealistic) values of . We thus chose , which we think is reasonable for the type of interactions that Bcd may experience. We show in figure 2(a) the distribution of the total Bcd concentration at various times obtained using the partial degradation SDID model. There we can observe how converges to its asymptotic value with increasing time. In particular, the time it takes for it to be within of the stationary solution is more than min ( h). This time is much larger than the 100 min observed in experiments [Gregor et al., 2007a]. However, for min the distribution does not change significantly, as illustrated in figure 2(b). During this time interval differs by less than with respect to the concentration at min and the largest differences are restricted to a very small spatial region. Moreover, the difference with respect to the stationary distribution for min is never larger than regardless of position. differences are in the border of experimental detectability (particularly, far away from the anterior pole). Our results then suggest, in accordance with the work of Bergmann et al [Bergmann et al., 2007], that the steady-state is not reached in less than min but that yet the gradient may seem stationary. In the case of the SDID model with total degradation we observe a similar evolution of . For min, varies by less than . Although the rates of production and degradation for the partial and total degradation models are different, the spatio-temporal dynamics of the free and total Bcd concentrations are similar. For this reason, from now on we will present results corresponding to the partial degradation model only.

There is still one property of our simulations that is incompatible with the observations: the maximum concentration () is reached almost instantaneously. This does not agree with the results of [Little et al., 2011] where it is observed that the maximum is reached minutes after fertilization. This discrepancy might be due to the finite time it takes for GFP to mature and become fluorescent. To evaluate the effect of maturation we performed numerical simulations of the system given by (35). We show in figure 3(a) the total Bcd concentration normalized by its asymptotic value at as a function of time for different maturation rates, . As a reference we also show the corresponding curve for the SDID model with no maturation (black dashed curve). The parameters used in the simulations are the same as in figure 1(a). As expected, the convergence to the asymptotic value takes longer for the model that incorporates maturation and becomes slower as decreases. For we observe little differences between the models with and without maturation. At min the concentration reaches of the asymptotic value in the model with maturation and in the model without maturation. In the case with the concentration reached at min is of the asymptotic value. Hence, for low values of a significant delay in the convergence is observed in the region close or at the source. The delays obtained, however, never exceeded of the time elapsed at very early times and this gap decreased rapidly as time went by. In regions far from the source, the fraction of mature to total Bcd-GFP molecules is larger because it takes longer for the molecules to reach those regions and in that time they mature and become fluorescent. These results are similar regardless of whether we consider the SDID model with partial or total Bcd degradation. This disparity in the delay to reach steady-state depending on the position along the AP axis affects the fluorescence spatial distribution implying a change in the relationship between the lengthscale of the observed gradient and the parameters of the model [Liu et al., 2013].

To analyze the effect of maturation on the lengthscale we show in figure 3(b) the ratio between total Bcd over fluorescent Bcd, . depends on the relation between and . In figure 3(b) we show the value, , as a function of for different values of (). For all cases is larger near the source and decreases to as the posterior pole is reached. In [Liu et al., 2013] it was determined that close to the source. Of all the considered values of the one that gives the most similar result to this observation is . This value of is perfect agreement with the rate of degradation of Bcd reported in [Grimm et al., 2010, Gregor et al., 2007a, Drocco et al., 2011]. The fact that is a decreasing function of the distance to the source also changes the lengthscale of the gradient with respect to the case in which GFP maturation is not considered. In particular, if we compare the distribution, , for the same parameters as in figure 3 with and without including the process of GFP maturation the characteristic lengthscales differ by a factor of 3.

### 3.2 Interpretation of the experimental observations with a model that includes reactions.

The results of the simulations presented so far show that it is possible to reproduce the spatio-temporal characteristics of the Bicoid gradient semi-quantitatively using the SDID model with “reasonable” biophysical parameter values. We now discuss how the experimentally observed properties are related to the parameters of the model. More specifically, we are interested in determining the relationship between these parameters and the length and timescales of the Bcd gradient and how these relationships change depending on whether the reactions with binding sites are included in the model or not. Thus, we are after a re-interpretation of the observations within the framework of a model that includes reactions. Such a model is nonlinear and the concentrations are not characterized by a single spatial or temporal scale. This becomes evident in the fast reaction approximation, (14), where the transport rate is determined by an effective diffusion coefficient which depends on the concentration, and hence, does not have a single value along the embryo or over time. The presence of “many” coefficients or “multiple” scales is in agreement with the work of [Little et al., 2011] in which, in order to reproduce the experimental observations, a model with diffusion coefficients that changed in time ad hoc was introduced. This reinforces our idea that it is the effective diffusion coefficients, which naturally arise within the context of the SDID model, that determine the characteristic scales of the problem. Here we seek to relate the effective parameters of the model with the observed spatial scale and the convergence time of the gradient in the simplest possible way. To this end we work with the SDID model with partial degradation and . The difference with respect to the SDID model with total degradation is mainly a matter of parameter values. The differences with respect to the model that includes the delay in GFP maturation is discussed later.

#### 3.2.1 Lengthscale: free Bcd gradient vs total Bcd gradient

As discussed in the Introduction, the stationary solution of the SDD model with a source at one end is given by (4)–(5). Within the framework of this model corresponds to the characteristic lengthscale, , of the observed fluorescence distribution. As described in the Methods Section, the stationary solution of the SDID model with partial degradation satisfies (13)–(14). Within the context of this model, the lengthscale of the free Bcd concentration, , is given by (20) (and (33) in the case of the SDID model with total degradation). This lengthscale does not correspond to that of the observed gradient because the fluorescence cannot distinguish between free and bound Bcd. Therefore, the observed lengthscale, , should be related to that of the total Bcd concentration, , given by (21). and can be very different between themselves. Moreover, changes with position and time. We now discuss in what regions (21) provides a good estimate of the characteristic lengthscale of the total Bcd distribution. We show in figure 4(a) the normalized free and total Bcd concentrations as functions of at . The free Bcd distribution decays by 50% for , while the total Bcd concentration does it at . Although the total concentration does not decay exactly exponentially with as in (4), it can be approximated by such an expression over a certain range of values. Free Bcd does follow an exponential decay over the region where there is no source. This can be observed in figure 4(b) where we plot and , normalized by their maximum values, as functions of . The exponential fits (linear on the logarithmic scale of the figure) were done over the regions for and for obtaining and . These values can be compared with those predicted by (20)–(21). In the case of free Bcd, the characteristic lengthscale given by (20) with the simulation parameters is which agrees with the fitted value. In the case of the comparison is more complicated because the lengthscale of (21) depends on and which vary with time and space. If we consider the values, and at time min and over the region where the fitting begins, , we obtain . Inserting those values in (21) we obtain which is very similar to the one estimated from the fitting. If instead we consider the values at , the ratio of effective coefficients is , leading to an estimate of which only differs by a factor of 2 with respect to the fitted value. Thus, (20) and (21) provide good estimates of the characteristic lengthscales of the free and total Bcd concentrations if we use the values of the effective coefficients in the region just contiguous to the source (where concentrations start to decrease). We obtain similar results using (33) and (34) within the framework of the SDID model with total degradation.

#### 3.2.2 Timescale: effective vs. free diffusion coefficients

The SDD model given by (1)–(2), for long enough , has solutions of the form [Bergmann et al., 2007]:

(22) |

where erfc is the complementary error function, erfc(. This equation shows that the approach to the stationary solution (4) occurs as if there was a front that travels at speed:

(23) |

that depends on the diffusion coefficient and the rate of degradation , and allows to define a convergence time at a distance from the source as:

(24) |

Thus, if is known a priori and is chosen so that the theoretical characteristic lengthscale of (5) corresponds to the observed fluorescence lengthscale, , the convergence time can be rewritten as:

(25) |

At , this time is too long () if it is assumed that , the value estimated in [Gregor et al., 2007a, Abu-Arish et al., 2010] using FRAP, and it is too short () if the free diffusion coefficient of Bcd estimated in [Sigaut et al., 2014], , is used instead. The solution (22) also shows that the rate of production, , determines the maximum value of but is not involved in the convergence time.

In the case of the SDID model it is more difficult to define a “propagation speed” because in addition to the characteristic Bcd degradation time there are other timescales related to the reaction. In order to derive a propagation speed in such a case we then work with the reduced equations in the fast reaction approximation ((14) and (13)). Given the formal equivalence between (14) and (1)-(2) we define the speed and the convergence time as in (23) and (24) but using instead of and instead of . We obtain:

(26) |

for the model with partial degradation and

(27) |

for the model with total degradation. As we did for the lengthscale, we now analyze whether the solution obtained numerically for the model with partial degradation moves with the speed described by (26), if it is possible to define a single characteristic value for the speed in the region immediately adjacent to the source and, in that case, which effective diffusion coefficients determine it. As in the case of the SDD model, if the value, , is determined setting with the observed fluorescence lengthscale and given by (21)–(20) with known values of qne , the convergence time can be rewritten as:

(28) |

The same expression is obtained for the model with total degradation using (34) and (33) and setting . It then follows that if the degradation rate, or , is derived from the observed fluorescence lengthscale, the convergence time to the steady state solution will be the same regardless of whether we use the model with total or partial degradation. We now continue the analysis for the model with partial degradation.

In order to represent the advancement of the Bcd front and characterize its timescale we compute for each position, , the time, , at which the free Bcd concentration, , reaches 50% of its asymptotic maximum value, . We plot in figure 5 the position, , the time, , just defined. The slope of this curve corresponds to the propagation speed. As expected in this case the front does not move with a constant speed. It can be observed that the speed is smaller the smaller is. As for the analysis of the lengthscale, here we focus on a region where the speed is approximately constant. Based on the results of figure 5 we fit the front profile with a linear function in two regions: (light dashed line) and (dark dashed line). From the fits we determine the speeds in the region closest to the source and in the region further away. This implies that at a distance of the source of the order of the observed fluorescence lengthscale, , the convergence time is of the order of 77-100, similar to the characteristic time of the gradient formation obtained experimentally (). We must point out that even if we here analyze the convergence of the free Bcd concentration to its steady state solution, the total Bcd (free and bound) reaches its asymptotic distribution on a similar timescale.

We now analyze whether there is a simple expression that can be used to estimate the speed and the convergence time. To this end we compare the speed estimates of figure 5 with those predicted by (26). The latter gives and . These estimates are similar to those derived with the fitting. We then conclude that it is possible to relate the model parameters with the timescale of the Bcd gradient formation in a relatively simple way. (26) also highlights the importance of distinguishing between the collective and single molecule diffusion coefficients. If in (26) we replace by the estimates of the front velocity decrease by approximately one half implying that the timescale would be twice the value derived before.

#### 3.2.3 The distinction between free and total Bcd and the role of Bcd as a transcription factor.

As we have already mentioned, Bcd acts as a transcription factor for the expression of hb. This process has been studied in detail both experimentally and theoretically [Gregor et al., 2007b, TkaÄik et al., 2008, Dubuis et al., 2013]. In particular, the observed (fluorescence) distributions coming from Bcd and Hb in fixed embryos have been used to develop the theory. These distributions were found to be related by a non-linear function with Hill coefficient , i.e., consistent with a high degree of cooperativity of Bcd for the transcription of hb [Gregor et al., 2007b]. These observations, however, cannot distinguish between free and bound Bcd. Here we explore how these observations should be re-interpreted when the interaction of Bcd with binding sites is taken into account. To this end we consider two possible scenarios. In one of them we assume that the sites, , of the scheme (6), correspond to specific sites for the transcription of hb. In the other situation we assume that represents other binding sites either on DNA or in other species (e.g. RNA) and that the binding of Bcd to the promoters of the hb transcription does not alter significantly the concentrations of free or bound Bcd that enter the scheme (6). This last scenario fits nicely within the sliding and hopping model in which trasncription factors bind to non-specific sites and then eventually find the specific sites for transcription on the DNA molecule [von Hippel and Berg, 1989]. In particular, the binding to DNA with different affinities, probably associated to specific and non-specific sites, has recently been quantified in mouse embryos [White et al., 2016].

We first analyze the scenario in which the sites that interact with Bcd are the promoters for the synthesis of the mRNA involved in the production of Hb. For simplicity, we here assume that the concentration of Hb is proportional to the concentration of bound Bcd, . This implies a simplification because the SDID model considered here does not include a cooperative scheme for the interaction of Bcd and the binding sites, S (see (6)). In any case, the purpose of this analysis is to estimate by how much the Hill coefficient that can be inferred from the observations could vary if it is assumed that the observed fluorescence is proportional to the free or the total Bcd concentrations. In this first scenario the concentrations are related by:

(29) |

with the dissociation constant of the reaction scheme (6). We show in figure 6(a) , computed as prescribed by (29), as a function of the free () and total () Bcd concentrations normalized by their maximum values. The figure reflects the fact that most Bcd is bound so that . A similar plot would have been obtained within this scenario if most Bcd were bound regardless of the cooperativity of the interaction. The relationship between and is very different from the observations of [Gregor et al., 2007b] that show a nonlinear relationship between the fluorescence coming from Hb and Bcd. As may be observed in the figure, the relationship between and is very nonlinear. However, the range of values for which [Hb] is most sensitive to changes in does not agree with the observations. These results suggest that Bcd not only interacts with the specific DNA sites that promote the production of Hb.

We now analyze the second scenario in which (6) represents the interaction with non-specific binding sites. For the results obtained in the previous Sections to be applicable within this scenario we need to assume that the fraction of Bcd that is bound to the sites on DNA that are specific for the hb transcription is much lower than the concentration, , of Bcd bound to non-specific sites. This occurs, in particular, if the concentration of non-specific sites is much larger than that of sites that are specific for transcription. In view of the hopping and sliding model of transcription [von Hippel and Berg, 1989, Elf et al., 2007, Hammar et al., 2012], this is a reasonable assumption. We then assume that Hb and (free) Bcd are related by (9) with , an effective dissociation constant between Bcd and the sites on DNA that are specific for the transcription of hb. We show in figure 6 (b) the normalized concentration of Hb as a function of the concentrations of free and total Bcd obtained using (9) with and and the distributions, and , that correspond to the stationary solution of the SDID model with partial degradation and the parameters of table 1. We note that the relationship between and in this case is more similar to the one observed in [Gregor et al., 2007b] than the one displayed in figure 6 (a). This similarity increases with increasing values of the Hill coefficient, . We here consider for simplicity. We also observe in figure 6 (b) that the dependence with free Bcd is very different from the dependence with total Bcd. Given that the observations of Bcd-GFP correspond to the distributions of total, rather than free, Bcd, this suggests that the Hill coefficient, , that may be derived from the fluorescence distributions may not correspond to the real cooperativity between Hb and Bcd if a significant fraction of Bcd is bound to non-specific sites. In order to analyze how the estimated cooperativity coefficient may differ from the actual one if it is derived from the fluorescence distributions under the implicit assumption that the Bcd fluorescence is proportional to the free (not the total) Bcd concentration we proceed as follows. We first compute using (9) with the free Bcd concentration and the same parameters as in figure 6. We assume that this is the relationship that holds in the real system. We then proceed as if we had obtained the distributions of the fluorescence coming from Hb and Bcd in this system and derive an estimated degree of cooperativity, , from a fit of the form:

(30) |

where and are proportional, respectively, to the Hb and Bcd fluorescence distributions. We compare the estimates of when the fluorescence is proportional to (i.e. ) as we think occurs in the real system and when we use instead. To do the fitting we rewrite (30) as:

(31) |

where it becomes clear that is the slope of the relationship. We show in figure 7(a) as a function of for equal to (dashed-dotted line) or (solid line). As expected, the relationship is linear in the first case with a slope, , that coincides with the actual cooperativity coefficient, . In the other case the relationship is linear for small but, as increases, the linearity is lost. In particular, the slope changes dramatically in the region where is most sensitive to changes in , . If the data is fitted using (30) in this region of great sensitivity (see figure 7 (a)) we obtain which is much larger than the actual cooperativity index, . The estimated Hill coefficient, , works pretty well when we try to reproduce the relationship over all the range of values as shown in figure 7 (b). This illustrates that the inability to distinguish between free and total Bcd can lead to an overestimation of the Hill coefficient and of the degree of Bcd cooperativity with which hb is transcribed.

## 4 Conclusions

Understanding the processes that lead to cell differentiation during embryogenesis is a key goal of scientific research [Wolpert, 1969, Crick, 1970]. Advancing in this regard is not only relevant to improve the comprehension of how life and living organisms are shaped but also of the limits that physics imposes on such processes [TkaÄik et al., 2008, Dubuis et al., 2013]. The case of the patterning along the anterior-posterior axis of Drosophila melanogaster embryos is an example that has been studied in great detail both experimentally [Driever and Nusslein-Volhard, 1989, Gregor et al., 2007a, Gregor et al., 2007b, Little et al., 2011, Little et al., 2013] and through modeling [Driever and Nussleinvolhard, 1988, Bergmann et al., 2007]. The gradient of the protein Bicoid (Bcd) which acts as transcription factor for the production of other proteins, is key for this process. The Bcd system, on the other hand, provides a paradigmatic example of the difficulties of quantifying biophysical and biochemical parameters from fluorescence observations. The SDD model [Driever and Nussleinvolhard, 1988, Driever and Nusslein-Volhard, 1989] was proposed to explain the formation of the Bcd gradient in Drosophila melanogaster embryos but it could not account satisfactorily for all its observed characteristics. In particular, the estimates of the Bcd diffusion coefficient derived in [Gregor et al., 2007a] using FRAP were too small to explain the establishment of a stable gradient within the times observed experimentally. The estimated diffusion rate was challenged by new measurements obtained with FCS [Abu-Arish et al., 2010, Porcher et al., 2010]. These apparently contradictory results on Bcd diffusion could be explained within a unified model in [Sigaut et al., 2014] by considering that, in the embryo, Bcd not only diffuses freely but also interacts with binding sites, a process that naturally occurs in this case given that Bcd is a transcription factor. According to this unifying model Bcd diffuses with a free coefficient, , and a large fraction of it is bound to immobile or very slowly moving () sites so that FRAP and FCS experiments provide information on the effective coefficients of (7)–(8) [Sigaut et al., 2010, Sigaut et al., 2014]. In this paper we have studied if this type of SDID model with the diffusion coefficients, concentrations and dissociation constant estimated in [Sigaut et al., 2014] can explain the formation of a Bcd gradient with the space and time properties observed experimentally. In spite of its simplicity, the model provides an ideal platform where to analyze how the characteristic length and time scales of are affected when the interaction with binding sites is considered. In order to quantify some unknown parameters we compared the characteristic lengthscale of the observed Bcd gradient with that predicted by the model. Given that Bcd-GFP is fluorescent regardless of whether it is free or bound we interpreted the observed lengthscale as the one that corresponds to the total (not just the free) Bcd distribution. As discussed in the Methods and Results sections, if Bcd binds to sites the lengthscales of and differ by a factor (see (21)) outside the region of Bcd synthesis. This ratio can be arbitrarily small [Pando et al., 2006] and has been estimated to be 0.1 in the anterior end of the embryo during its early stages [Sigaut et al., 2014]. This means that the quantificaction of biophysical parameters based on the lengthscale of the fluorescence distribution can lead to very different parameter estimates depending on the model with which the observations are interpreted. Since both and are nonlinear functions of the concentrations, their ratio changes with position along the embryo. The simulations of figure 4 estimate it as outside the region where Bcd is synthesized. Although this ratio is close to one and does not imply an order of magnitude difference in the parameter estimates that can be derived from the observed lengthscale, according to the simulations of figure 4, and decay to 50% of their maximum values at very different distances from the anterior end, , and , respectively. This again highlights the implications that a particular choice of model has on the interpretation of the observations. We based our choice of the parameters that were not estimated in [Sigaut et al., 2014] on the fluorescence lengthscale distribution and on the time it takes for the gradient to be established. Although there are some discrepancies between the results of the simulations of figure 4 and the experimental observations (e.g. the lengthscale is 45 in the simulations and 100-150 in the experiments) there are other processes that are not included in figure 4 that can bring these values closer together. In particular, the process of GFP maturation decreases the fluorescence intensity more pronouncedly at the anterior than at the posterior end (see figure 3(b)). Thus, the experimentally observed fluorescence lengthscale can be larger than the actual lengthscale of the total Bcd-GFP concentration that the simulation of figure 4 represents.

The simplicity of the SDD model is very appealing. Within its context, the properties that are observed experimentally are directly related to the parameters of the model. The SDID model is nonlinear and a direct comparison between theory and experiment is more complicated. In spite of this, in this paper we went beyond the numerical simulations and obtained analytical expressions that could describe the simulated results. In this way we could establish that the role that the free diffusion coefficient plays in the SDD model, in the SDID model it is played by the largest of the two effective diffusion coefficients of [Pando et al., 2006] (the collective coefficient of (8)) as illustrated in figure 4–5. Had it been the single molecule coefficient that is estimated with FRAP [Sigaut et al., 2010], the timescale of the gradient formation would have been too large compared to the experimental observations. This is especially important for the action of Bcd as transcription factor and the precision with which its “bulk” concentration can be estimated by the regulatory binding sites on DNA [Gregor et al., 2007b, Ipiña and Dawson, 2016]. Considering the interaction of Bcd with binding sites as done in our SDID model has major consequences for the interpretation of the experiments that seek to quantify the action of Bcd as transcription factor. More specifically, given that the fluorescence does not distinguish between free and bound Bcd, the relationship between the Bcd concentration and that of the proteins, e.g. Hunchback (Hb), whose production it regulates needs to be reassessed. As shown in figure 6 the SDID model predicts that the Hill coefficient that characterizes the cooperativity with which Bcd promotes the production of Hb can be smaller than the one that is directly derived from the scatter plot of the observed fluorescence. This conclusion is derived under the assumption that the fraction of regulatory sites on DNA that Bcd binds to is negligible compared to those that are included in the SDID model which should then correspond to other (non-specific) binding sites. In view of the hopping and sliding model of transcription [von Hippel and Berg, 1989] and the typical fraction of time that transcription factors spend bound to non-specific sites [Elf et al., 2007, Hammar et al., 2012] it is very likely that this assumption be valid in the case of Bcd. Given that most intracellular messengers are likely to be subject to binding/unbinding processes, it is likely that similar problems to those discussed here will be found in other systems as well. Our results are then not only relevant for the particular case of the gradient of Bcd but also have wide implications for the interpretation of fluorescence images in living organisms in general.

## Acknowledgments

This research has been supported by Universidad de Buenos Aires (UBACyT 20020130100480BA), Agencia Nacional de Promoción Científica y Tecnológica (PICT 2013-1301). SPD is a member of Carrera del Investigador Científico (Consejo Nacional de Investigaciones Científicas y Técnicas).

## Appendix

In this Appendix we present the versions of the SDID model that we have implemented and analyzed but have not described in detail in the manuscript.

### The SDID model with total degradation of Bcd.

In this case we consider that Bcd is degraded while being free or bound and the dynamic equations read:

(32) |

where is the degradation rate. The choice of parameter values is done as in the model with partial degradation. We first derive the stationary solution for . In this case the equilibrium condition (15)–(16) does not hold for all . But if reactions occur on a faster time-scale than degradation (as in the “fast reaction approximation” of (14)), it is possible to assume that (15)–(16) hold approximately at every . In such a case, the evolution equation for is given by (14) with and as before. As in the model with partial degradation we estimate the lengthscales of the stationary solution as:

(33) |

for , and,

(34) |

for . Although the ratio between the characteristic lengthscales of and in this case is given by (21) as in the partial degradation model, the lengthscale of the gradient depends on different biophysical parameters. The estimate of the degradation rate, , that may be derived from the characteristic lengthscale of in this case is approximately related to the one obtained in the model with partial degradation by . Since , if we use this value of and the model with total Bcd degradation to determine the source intensity as before we obtain a value, , that is smaller by a factor, , with respect to the one derived using the partial degradation model. Taking into account that in the region where FCS experiments are performed , using (33)–(34) we obtain the a priori estimate . The numerical simulations performed with this value did not give proper concentration distributions for the different species. Hence we used and instead.

### The SDID model with with GFP maturation

Experiments use Bcd-GFP to observe the distribution of Bcd. It takes some time for GFP to mature and become fluorescent [Sniegowski et al., 2005, Iizuka et al., 2011, Little et al., 2011]. Thus, to interpret the observations it may be necessary to include this process. In such a case we need to distinguish between fluorescent (or tagged) and non-fluorescent (or untagged) Bcd ( and , respectively) and include the transformation between one another. The equations then read: