Physics of base-pairing dynamics in DNA
As a key molecule of Life, Deoxyribo-Nucleic Acid (DNA) is the focus of numbers of investigations with the help of biological, chemical and physical techniques. From a physical point of view, both experimental and theoretical works have brought quantitative insights into DNA base-pairing dynamics that we review in this Report, putting emphasis on theoretical developments. We discuss the dynamics at the base-pair scale and its pivotal coupling with the polymer one, with a polymerization index running from a few nucleotides to tens of kilo-bases. This includes opening and closure of short hairpins and oligomers as well as zipping and unwinding of long macromolecules. We review how different physical mechanisms are either used by Nature or utilized in biotechnological processes to separate the two intertwined DNA strands, by insisting on quantitative results. They go from thermally-assisted denaturation bubble nucleation to force- or torque-driven mechanisms. We show that the helical character of the molecule, possibly supercoiled, can play a key role in many denaturation and renaturation processes. We categorize the mechanisms according to the relative timescales associated with base-pairing and chain orientational degrees of freedom such as bending and torsional elastic ones. In some specific situations, these chain orientational degrees of freedom can be integrated out, and the quasi-static approximation is valid. The complex dynamics then reduces to the diffusion in a low-dimensional free-energy landscape. In contrast, some important cases of experimental interest necessarily appeal to far-from-equilibrium statistical mechanics and hydrodynamics.
keywords:DNA, denaturation bubble, polymer dynamics, zipping, unwinding, free-energy barrier
Pacs:87.15.H- 87.15.A- 82.39.Pj 36.20.-r
- 1 Introduction
- 2 Base-pair breathing
- 3 One-dimensional pre-averaged dynamical models
4 Interplay between base-pairing and chain dynamics
- 4.1 Experimental motivation
- 4.2 The nucleation step preceding the zipping of long molecules
- 4.3 Far-from-equilibrium zipping of long molecules
- 4.4 Analogies with field-driven polymer translocation – Different driving force regimes
- 4.5 Base-pairing dynamics close to the melting temperature
- 4.6 Thermal unzipping/unwinding above the melting temperature
- 4.7 Bubble final closure and metastable bubble opening
- 5 Prospects and open questions
- Appendix: Main experimental values
The main reason why Nature has selected a double helical form for double-stranded DNA (dsDNA) in its more common forms is probably its stability, in order to preserve the genetic information of this key-molecule of Life Alberts2002 (), through hydrogen-bonding interactions between Watson-Crick paired bases (guanine (G)–cytosine (C) and adenosine (A)–thymine (T)). The coding sequences for genes are thus “buried” inside the DNA duplex and consequently poorly accessible for any kind of damage (from, e.g., OH and HO ions or mutagens Frank1987a ()).
Although this DNA structure in double helix is robust enough, it is however sufficiently loose to allow the opening of the double helix. DNA hybridization and de-hybridization capabilities are in fact fundamental processes in molecular and cell biology. Spontaneous opening is rare and transient at physiological temperature but it is promoted by specialized enzymes when the genetic code has to be accessible to molecular machineries. This occurs during transcription (DNA translation into messenger RNA), replication (DNA copy), recombination (DNA “cut-and-paste”), repair or any enzyme binding on single strands. For instance, RNA polymerases “read” single-stranded DNA (ssDNA), and the formation of a so-called transcriptional bubble at the transcriptional starting site is required to initiate transcription. At high enough temperature (or low enough ionic strength), thermal energy promotes partial or even complete base-pairs dissociation, a phenomenon called DNA denaturation or melting. This property is fully exploited, e.g., in Polymerase Chain Reaction (PCR) where the DNA is denatured before short sequences are hybridized which are then extended, this occurring during several dozen cycles controlled by the temperature. DNA hybridization kinetics, of evident interest in the biological processes evoked below, is also of importance for fast-developing nanotechnology designs involving DNA hybridization.
Local duplex openings, commonly called denaturation bubbles, can be observed at any temperature. They are rare at room temperature where the fraction of open base-pairs in dsDNA is experimentally found to be – for A-T pairs Gueron1995 (); vonHippel2013 (); FrankKamenetskii2014 (), probably an order of magnitude smaller for G-C pairs Krueger2006 (). Their nucleation probability (and their size) increases with Krueger2006 (); Palmeri2008 () and they proliferate when getting close to the so-called denaturation or melting temperature (precisely defined as the midpoint of the transition, where one half of the base-pairs are denaturated). Due to sequence heterogeneities, with AT-rich segments being less stable than GC-rich ones, the former tend to melt at lower temperatures Nagapriya2010 (). Hence, at physiological ionic strengths, varies between 50 and 90C. Therefore the DNA macromolecule manifests more thermally driven ÒbreathingÓ fluctuations at physiological temperatures in the AT-rich regions vonHippel2013 (). This property can be exploited by Nature by correlating starting sites of molecular machineries where duplex opening must be initiated and the more fragile AT-rich regions. Duplex opening is also at play when non-linear elastic properties of DNA are involved when the molecule is strongly bent Destainville2009 () or negatively supercoiled Adamcik2012 (). This is also of biological relevance, e.g. in nucleosomes and plasmids.
1.1 Motivations for the study of base-pairing dynamics
Since the discovery of DNA three-dimensional structure by Watson and Crick in 1953 Watson1953 (), the DNA double helix internal dynamics have been the subject of many theoretical and experimental investigations not only from a biological or biochemical perspective but also from a physical point of view. This dynamics corresponds either to DNA zipping from the denaturated to the duplex state, or its opening from the duplex to the single stranded state, or diffusion of denaturation bubbles. Understanding the internal dynamics of DNA when in solution is indeed considered as a pivotal first step before going further and tackling the whole problem of dynamics in the nucleosome or chromatin, where proteins bound to DNA (such as histones) and higher levels of compaction make the problem even more complex Alberts2002 (). Many proteins interact with the dsDNA and some of these protein-DNA complexes are correlated to the opening of the double helix. Apart form the RNA polymerase evoked above, one can cite the DNA polymerase which plays a central role in the DNA replication, the enzymes acting in the homologous recombination Alberts2002 () such as RecA in the Escherichia coli bacteria or RAD51 in humans, and also the Single Strand Binding proteins which are essential to maintain the genetic information Meyer1990 (). Active opening of nucleic-acid double strands by helicases has also been proposed to be the result of a competition between passive zipping and helicase processivity on the single strand to which it is bound Betterton2005 (). Zipping dynamics are then directly involved in this interaction between nucleic acids and a wide class of protein at work in many molecular processes. However, we shall see that even in vitro where the DNA is isolated, many questions remain open from an experimental point of view. Designing reliable theoretical models in close collaboration with experimentalists Frank1987b (); FrankKamenetskii2014 () might help the latter to distinguish between different mechanisms when exploring the base-pairing kinetics in the future.
From the theoretical point of view, the richness of this quasi-one-dimensional dynamical systems comes from:
The cooperativity between adjacent base-pairs (related to the so-called ÒstackingÓ interactions between cycles), which favors a sharp crossover (within a few Kelvins) between the closed and open states when destabilizing the nucleic base interactions by raising the temperature above (alternatively, denaturation can be reached by decreasing the ionic strength at fixed , or again by modifying the H value Lazurkin1970 (); Gotoh1983 ()).
The much smaller bending and torsional moduli of a DNA bubble as compared to the double helix DNA (given in Table 2), which effectively couples base pairing (ÒinternalÓ) and chain conformation (ÒexternalÓ) degrees of freedom, increases the entropy in the denaturated state and thus destabilizes the duplex state at high enough temperatures (see e.g. Refs.Yan2004 (); Chakrabarti2005 (); Manghi2009 ()). DNA elastic or mechanical properties are also involved in many biological processes, notably interaction with partner proteins.
The helical character of the molecule in its duplex state that is further stabilized through geometrical entanglement and can furthermore be supercoiled (negatively or positively). This provides an additional strain-assisted way to go from the closed to the denaturated state, notably used in Nature FrankKamenetskii2006 (), for example by actively applying a torque to the molecule through specialized enzymes Lohman1996 (); Alberts2002 () or by negatively supercoiling chromosomes and plasmids222Plasmids are kilobases-long closed double-stranded DNA mini-circles, most commonly found in bacteria. As compared to the larger chromosomes containing all the essential genetic information, plasmids usually are very small and contain “inessential” information. Being closed, they can be passively under-twisted in vivo. in order to facilitate their opening FrankKamenetskii1997 (); Fye1999 (); Jost2011 (). Conservation of the Linking number Lk, a topological quantity related to the dsDNA helicity and defined below, imposes a constraint on the helical twist dynamics. Indeed, twist must be unwound at the molecule ends in order for the single strands to separate. Conversely, renaturation requires rotation of the molecule for accumulation of twist.
The heterogeneity of genetic sequences (encoded by an ÒalphabetÓ of 4 nucleotides or ÒlettersÓ, A, T, G and C, paired in tandems A-T and G-C in the double strand). The ensuing quenched disorder makes in principle thermodynamical and dynamical properties sequence-dependent, because AT rich regions are less stable than GC rich ones333In this respect, the terminology of “random sequence” used by physicists, which is certainly not meaningful from a biological perspective, simply means that the sequence does not contain some specific motifs (e.g., periodic motifs, A-tracts, or palindromes) that would make the molecule have a behavior different from the average one, from a physical point of view.. Sequence is of notable biological importance when DNA/enzyme interactions require DNA opening at relevant sites.
Note that even though both nucleic acids share many similarities, Watson-Crick base-pairing kinetics in DNA molecules are relatively simpler than in RNA because the paired geometry is a double helix, whereas secondary and tertiary structures of RNA can be much more complex Alberts2002 (); Bundschuh2006 (). DNA also possesses secondary structures different from the simple double helix (e.g. triple or quadruple helices and hairpins, cruciforms or loops FrankKamenetskii1997 (); FrankKamenetskii1995 (); McMurray1999 (); Burge2006 ()), but they are not as often associated with biological properties as in RNAs. However, DNA base-pairing dynamics remains highly challenging because of the interplay between internal (base-pairing) and external (chain conformation) degrees of freedom in the duplex state. Zipping of DNA also shares some similitudes with the helix-to-coil transition of -helices Jayaraman2007 (), even though a single polymer is involved in this case.
In addition to their obvious implications in molecular biology, DNA base-pairing mechanisms have recently known a growing interest in nanotechnological applications. This started with PCR in the 1980s, more recently followed by aptamer design Tuerk1990 (); Ellington1990 (), DNA biochips, DNA combing Michalet1997 (), or DNA origami Goodman2005 (); Rothemund2006 (); Zadegan2012 ().
1.2 Report outline and presentation of theoretical approaches
In this Report, we chose to categorize in three major sections the numerous theoretical findings about DNA base-pairing dynamics of the past half-century. This classification, summarized in Table 1, dwells on the respective roles played by internal (base-pairing) and external (chain conformation) degrees of freedom, as follows (Figure 1).
|External (chain conformation) degrees of freedom are slow variables and are considered as frozen; transient bubbles “breathe”, e.g., by thermal activation.||ns||bp||Figure 1a||2|
|Chain orientational degrees of freedom are fast variables and are pre-averaged. Base-pairing dynamics is explored in this pre-averaged free-energy landscape (quasi-static approximation).||s||bp||Figure 1b||3|
|Internal (base-pairing) and external (chain conformation) degrees of freedom must be considered on an equal footing.||s||bp||Figure 1c||4|
Section 2 addresses the local small distortions of the distance between strands of the DNA double helix (Figure 1a). At very short timescales (typically ns), chain degrees of freedom remain frozen (or are very slow variables), as they are assumed to follow much slower dynamics than base-pair opening and closure. When now-and-then opening below the melting temperature , DNA “breathing bubbles” (or “breathers” for breather modes originally precisely defined in the context of non-linear physics Dauxois1993a (); Yakushevich2004 ()) are thus short-lived ( ns) and we shall also call them “transient bubbles”. They are smaller than ten or so base-pairs.
Note that some biophysicists and biologists also generalize the use of the word “breather”, to refer to real openings of the dsDNA due to thermal fluctuations in vitro (as opposed to opening induced by active proteins in vivo) vonHippel2013 (). These long-lived denaturation bubbles are therefore hardly ever observed at physiological temperature. Close to they become more probable (usually close to the melting temperature of the AT sequences C in physiological salt conditions). We shall return to these two different categories of bubbles below.
In the opposing limit where chain orientational degrees of freedom are assumed to be fast variables and can thus be integrated out (or pre-averaged), we present in Section 3 some models in which a quasi-equilibrium free-energy landscape for base-pairing (the slow variables) is constructed. These models have been historically developed to study the zipping/unzipping of long DNAs. The use of such quasi-static models is fully justified in some circumstances such as the opening/closure of small DNA hairpins (few tens of base-pairs) below , a collective phenomenon occurring on the s timescale which requires to cross a high free-energy barrier and is often modeled by a two-state approximation (Figure 1b). The opening induced by applied forces and torques is also evoked in this section.
Section 4 focuses on a wide class of phenomena where internal and external degrees of freedom must be considered on an equal footing because none of them can be considered as frozen or pre-averaged. This notably concerns DNA zipping below (Figure 1c) and its unwinding dynamics when denaturing above , which occur on times ranging from tens of micro-seconds to seconds for long constructs. The DNA lengths studied in this regime are above the persistence length , so that scaling laws can be anticipated. In some cases as the closure below of a “pre-equilibrated” bubble in the middle of a chain, the far-from equilibrium zipping stops in a metastable bubble of bp, followed by a s-long final bubble closure caused by a coupling between the chain bending and torsional degrees of freedom and the base-pairing ones (Figure 1d).
The existence of two distinct categories of bubbles deserves a particular attention. An important difference is to be made between Òtransient bubblesÓ (opening of few base-pairs rapidly followed by renaturation) and Òmetastable bubblesÓ where the chain has time to equilibrate after opening the bubble and before re-closing it. By measuring imino-proton exchange with water by proton NMR in close to physiological conditions (albeit at room temperature or less; see vonHippel2013 (); FrankKamenetskii2014 () for detailed historical reviews), Wärmländer and his collaborators found in 2000 two distinct timescales associated with lifetimes of open AT pairs in short duplexes Warmlander2000 (). The first ones dwell in the nanosecond range, as in earlier works Gueron1987 (); Frank1987a (); Gueron1995 (); vonHippel2013 (). Quite interestingly, these new experiments also indicated the existence of longer open states, with lifetimes in the micro-second range, concerning less than 10% of the opening events. It is tempting to speculate that the shorter lifetimes correspond to transient bubbles, whereas the longer ones should correspond to metastable bubbles. Interestingly also, long closure times ( s) observed by Fluorescent Correlation Spectroscopy by Libchaber and its collaborators Altan2003 () presumably belong to this last category as well (these authors did not mention Wärmländer and collaborators’ work). These different experiments are based upon relatively short constructs (less than 30 bp) with CG clamping extremities and AT cores of variable lengths. The somewhat different lifetimes of metastable bubbles observed with the two different experimental techniques potentially come from the different lengths of the AT cores (8 bp in Wärmländer and collaborators’ longest construct, 18 bp in Libchaber’s group experiments). We shall see below that the optimal metastable bubble length is around 10 bp.
Recently, Phelps et al. Phelps2013 () measured DNA bubble timescales of s using single-molecule Förster resonance energy transfer (FRET) and linear dichroism. They attributed the fact that their bubble lifetimes were times larger than those of Ref. Altan2003 () to the difference between labelling schemes used in the two studies. The proximity between the tagged base-pair and the stationnary ds-ssDNA fork (distance of 14 bp) might also be responsible for this increased lifetime. They also studied bubble lifetime closer to the fork with or without helicase (and GTP), showing that these timescales increase by a factor 3 in presence of the enzyme.
In these different contexts, DNA can be investigated theoretically at a more or less coarse-grained level. In principle, the more realistic descriptions belong to all-atom modeling with either explicit or implicit solvent. However, in the context of molecular and cell biology where complex mechanisms are often at play, one limitation of such approaches is that the key degrees of freedom associated with a given biological process are not always easily identifiable amongst the atomistic details and physical insight concerning the main mechanisms at play can be lacking. But the most critical weakness of all-atom simulations lies in the attainable timescales when considering a molecule with a size of biological interest. Real times are generally limited to a dozen of nanoseconds, they can be pushed up to a fraction of microsecond using huge numerical facilities, but timescales of interest can be in the millisecond range, possibly longer, as we have just seen. Alternative methods have been explored to reach longer timescales. A first step towards an increase of the timescale numerically accessible consists in considering small physico-chemically relevant groups of atoms (such as a water molecule, a sugar, a phosphate, or a nucleo-base) instead of single atoms. Each small group is then considered as a single effective particle, thus leading to coarse-grained models. But in most of the cases, the gain remains insufficient in regard of the needs.
Beyond this, another promising approach consists in going a step further in the coarse-graining process, thus giving up the ambition of accounting for realistic interactions at the atomic level. Effective mesoscopic models are then at stake, where elementary ÒparticlesÓ are larger subparts of the nucleic acid (e.g. nucleotides). The relevant degrees of freedom must be identified, and then the model parameters must be finely tuned in order to correctly account for the microscopic degrees of freedom that are implicitly integrated out in the process. The gain in terms of simulation times and attainable system sizes is then of several orders of magnitude. Biologically relevant questions can be addressed in a wider class of cases, since several hundreds of base-pairs can be tackled on the milli-second timescale. The resultant increase in simple physical insight can also be substantial because each level of coarse-graining reduces the questions of interest to their more fundamental aspects. Analytical approaches from statistical physics relying upon mesoscopic models, eventually coupled to hydrodynamics, elasticity, or electrostatics depending on the biological issue, are also always very useful to understand the fundamental physical mechanisms at play and to span the whole range of parameters. Of course these analytical approaches are only possible for very simple models where the key degrees of freedom have been identified. Among them, continuous models, where two DNA single strands seen as Worm-Like Chains (WLC) DoiEdwards2004 () are inter-wound, can also be useful.
The tools by which the mesoscopic models are solved are those of out-of-equilibrium statistical mechanics and non-linear physics, ranging from numerical methods (Molecular Dynamics, Brownian or Langevin dynamics, Monte Carlo simulations, biased sampling) to analytical ones (master equation, Fokker-Planck and Smoluchowski equations, mean first passage times and Kramers theory, Rouse or Zimm approximations). While they significantly increase the simulation cost or the complexity of the equations, hydrodynamic interactions can be approximately treated, for example by using the Oseen or Rotne-Prager tensors DoiEdwards2004 (); Manghi2006 (), in both numerical and analytical approaches.
In this Report, an important distinction will be made between quasi-static processes and far-from-equilibrium ones Chaikin1995 (). In the first case, most degrees of freedom are considered as fast variables that can be pre-averaged in an effective free energy depending on a few slow variables only. Such an approximation turns out to be inappropriate in some circumstances for DNA base-pairing dynamics, notably as far as chain orientational degrees of freedom are concerned. One then has to switch to far-from-equilibrium statistical mechanics.
The typical experimental values of the main quantities involved in DNA dynamics as discussed in this Report are listed in Table 2 in the Appendix.
1.3 A historical survey of DNA mesoscopic modeling
As motivated just above, following the discovery of the double-helix structure of DNA by Watson, Crick and Franklin in 1953, a succession of mesoscopic models have been proposed in the literature, with the primary objective to describe equilibrium thermodynamical properties of DNA before addressing dynamic ones. Their respective merits will be discussed in this Report, thus we only briefly present here the most popular ones in a chronological order, without any ambition of exhaustivity.
In the 1960’s, the first physical mesoscopic models were inspired by the 1D Ising model of solid-state physics Lazurkin1970 (); Wartell1972 (). They simply said that a base-pair could be either open or closed, thus attributing a boolean-like variable (in fact a classical “spin” ) to each base-pair. In such a model, the DNA is seen as a ladder, the rungs of which are the hydrogen bonds between base-pairs, either broken or unbroken. The helicity is therefore secondary at this level of modeling. The coupling between adjacent base-pairs accounts for their cooperativity, itself coming from the stacking interactions between consecutive cycles along a same strand. A rapid calculation shows that in absence of the cooperativity term, the transition occurs on more than 50 Kelvins, which is in evident contradiction with the experiments. The cooperativity narrows the transition to few Kelvins, as expected, but the transition cannot strictly speaking be a thermodynamic one because one deals with a 1D system with short-range interactions. The main contribution to the double helix stability actually does not come from the hydrogen-bonds alone, but the stacking of nearest-neighbor base-pairs contributes on an equal footing. The “sequence-dependent stacking” energy model was proposed later to realistically take the sequence into account by making the model parameters (on-site free-energy cost of base-pair opening and cooperativities) dependent on the position in the polymer SantaLucia1998 (). These models do not consider the chain entropy and elasticity and they give a semi-quantitative understanding of equilibrium properties and of melting profiles (see the reviews Gotoh1983 (); Wartell1985 (); FrankKamenetskii2014 ()).
In 1970, following the work of Zimm and Bragg in the context of the helix-coil transition in peptides Zimm1959 (); Zimm1960 (), Poland and Scheraga studied an improved version of the Ising model PSbook (). As compared to its predecessors, an additional so-called “loop-entropy” cost was associated with each bubble of length denoted by ( on the order of unity is the so-called loop exponent). Its origin lies in the requirement that a denaturation bubble in the middle of the polymer has its two extremities closed, thus forming a (self-avoiding) closed loop of total length . Thus this model is the first one to take some chain orientational degrees of freedom into account in an effective way. This leads (at equilibrium) to an effective long-range interaction in this originally 1D model, which in principle allows a true thermodynamic transition in the infinite polymer length limit. Several studies have focused on the exact nature of this transition, even though this is a rather academic issue since polymers are finite-sized by nature Kafri2000 (). The phenomenological numerical parameters in Poland and Scheraga’s original algorithm were evaluated by Blake and Delcourt Blake1998 (). This led to the celebrated MELTSIM software Blake1999 (), now routinely used in genetics laboratories in order to predict with a correct accuracy the melting profile of any DNA sequence. In a series of papers starting in 2003, Metzler and his collaborators studied the dynamical properties of this model, thus assuming that chain orientational degrees of freedom leading to can be considered as pre-averaged. This will be discussed below.
In 1979, Benham proposed a mean-field 1D model of denaturation in superhelically stressed DNA Benham1979 (), which was subsequently improved Benham1980 (); Benham1990 (); Fye1999 (). Basically, he analyzed how the supercoiling stress is distributed among the (more flexible) denatured regions and the closed ones, by minimizing the torsional elastic energy while taking account the internal degrees of freedom in an Ising-like fashion. The model uses the topological relation between the linking number for a dsDNA ( bp is the pitch fo the helix and the number of base-pairs) and the twisting number , the twist of the dsDNA ribbon (where is the angle between two adjacent base-pair vectors). It is given by the Fuller-White formula Fuller1971 (); White1969 ()
where the writhing number Wr measures the degree of supercoiling and depends only on the shape of the ribbon axis. Neglecting the writhe, Benham related the total twist in the bubble to the residual twist in the ds region (with two different torsional moduli in ss and dsDNA regions) and the imposed linking difference between the imposed linking number and the dsDNA equilibrium one. It successfully predicted the location of the stress-induced duplex destabilization sites in specific genes Benham1996 (); Leblanc2000 (). The model can also explicitly take the sequence into account by setting two different energies of denaturation for A-T and G-C base-pairs Jost2011 ().
Starting in 1983, Yomosa, Takeno and later Yakushevich and others proposed a 1D model where the important variables are the rotational degrees of freedom of base monomers with respect to a rotational axis, and , with one such axis for each strand Yakushevich1989 (); Yakushevich2004 (). This model was studied with a non-linear physics point of view. Bubble breathers are seen as solitonic excitations such as in chains of pendula coupled with torsional springs. In addition to equilibrium properties, this led to an intensive study of the dynamical properties of the denaturation processes in these systems, which will be detailed in this Report. Sequence effects have also been incorporated in this model.
In 1989, Peyrard and Bishop Peyrard1989 () explored a different approach, subsequently improved in 1993 by the same authors and Dauxois (PBD model) Dauxois1993a (). In this other non-linear 1D model, the distance between the two bases of a given base-pair is now a continuous variable . The potential energy between both bases is modeled by a Morse potential . The denaturation above is simply due to the translational entropy gained by the coordinates, using the analogy with polymer desorption done by de Gennes in 1969 deGennes1969 (). The interest of such a model is that it is fully solvable in its simplest form. Chain orientational degrees of freedom are not taken into account either but sequence effects can be included by fitting simulations of the model to ultraviolet (UV) melting curves.
In the first decade of the 21st century, several coupled models were designed to explicitly take into account the coupling between internal and external degrees of freedom. This coupling arises from the lower bending () and/or torsional () elastic moduli in the denatured form as compared to the duplex one. Typically, Yan2004 (); Manghi2009 () and Kahn1994 (); Bryant2003 (); Lipfert2010 () (see Table 2). Chain orientational degrees of freedom are treated using a discrete WLC model, while internal ones are again modeled in the traditional Ising-like fashion. But the chain elastic parameters explicitly depend on the internal state, thus coupling the chain and the base-pair states. The gained entropy above the denaturation is due to the increase of the number of conformations in the denaturated state.
Storm and Nelson mixed the features of both the WLC and the Bragg-Zimm models Storm2003 (). Their goal was to account for single-molecule experiments where a force is applied to the polymer extremities. Their so-called Ising-Discrete Persistent Chain model takes two different possible conformations of DNA into account, each with its own elastic constants as explained above, and it couples chain orientational degrees of freedom to the applied force. However, the addition of the term corresponding to the external force in the Hamiltonian prevents an exact solution of the model. An approximate variational scheme had to be implemented.
When proposing their coupled model, the primary goal of Yan and Marko Yan2004 () (and others Ranjith2005 () with the same model) was to give a theoretical foundation to DNA cyclization experiments. Their solvable model was a simplified Ising-like one but without cooperativity, coupled to chain orientational degrees of freedom in the following way. A bending energy term was added in order to assign different elastic moduli to the open () and closed () base-pair states. The are unit vectors giving the chain orientation at the level of each base-pair.
Palmeri, Manghi and Destainville Palmeri2007 (); Palmeri2008 (); Manghi2009 () went further by proposing a model were the cooperativity between adjacent base-pairs could be fully taken into account. This model was exactly solved by the transfer matrix technique. Note that a very similar approach had been proposed in 1993 by Palmeri and Leibler in 2D Palmeri1993 () and then by Chakrabarti and Levine Chakrabarti2005 (); Chakrabarti2006 (), but its application to the study of denaturation bubbles in dsDNA had not been explored at that time (only the helix-to-coil transition of proteins was considered, but both problems share strong similarities). When coupling chain orientational degrees of freedom to an applied force, such a model also permits to describe in an approximate but very accurate analytical way the situations where a force is applied to the DNA molecule extremities, and to provide a rationale for the rich variety of observed behaviors Manghi2012 ().
Even though the previous coupled models can in principle be apprehended analytically in order to address their out-of-equilibrium properties, this should require some important, potentially ill-controlled approximations. Consequently, in the 5 past years, several groups have adopted an alternative strategy consisting of designing somewhat more realistic models while giving up the idea of exact analytical treatments, and focusing instead on scaling arguments and/or numerical experiments. Bead-spring models have thus been developed, where one bead represents one nucleo-base. The polymers can be defined on a lattice or alternatively they can diffuse in a continuous medium. The model parameters are tuned in order to reliably account for equilibrium properties (such as the bending and torsional elastic moduli in both double-stranded and single-stranded forms, the melting temperature, or the helix pitch).
In 2011, Carlon and his collaborators proposed a simplified model where the two intertwined polymers are defined on a face-centered-cubic (fcc) lattice Ferrantini2011 (). The two strands are both mutually and self-avoiding, with the exception of monomers with the same index along each strand, which are referred to as complementary monomers. They are superimposed when the duplex is closed. Such superimposed positions of complementary monomers are energetically favored in order to ensure the transition to the duplex form below . Molecules as long as 500 base-pairs can be studied, but this model does not display the double-helix geometry in the closed state, nor the correct persistence lengths. It permitted however to study the scaling of the DNA zipping time in function of its polymerization index .
Two years later, Dasanna, Destainville, Palmeri and Manghi published a more realistic model, displaying correct helix pitch and persistence lengths, as well as a state-dependent elastic twist modulus Dasanna2013 (). In this off-lattice model, the inter-strand potential is a Morse one as in the PBD approach. This model was used to propose a realistic mechanism for the observed s-long closure times of metastable bubbles.
We shall later discuss in greater details the advances permitted by these two approaches.
Beyond that, there exist a large variety of numerical coarse-grained models with more than one bead per nucleotide, developed in the two last decades. The increased degree of refinement automatically allows in principle a better account of experimental data. However, this is at the cost of increased computing cost, without necessarily providing better insight into the physically relevant mechanisms. For example, we still do not know whether, above , open bases still have stacking interactions with adjacent base pairs in the single strands, or they are totally disordered Krueger2006 (); FrankKamenetskii2014 (). These models will be surveyed in the next Section.
2 Base-pair breathing
Since a rich literature has been devoted to the study of effective, mesoscopic models to tackle DNA base-pairing breathing dynamics, this first section primarily intends to review some of the most popular ones while discussing their intrinsic limitations. When deriving mesoscopic models, one has to give realistic values to the parameters appearing in these effective models. In this respect, experiments are of course of primary importance, but all-atom simulations are becoming everyday more efficient to give reliable insight at the base-pair scale. They likely provide very good reference points to more coarse-grained models. So we briefly discuss them in Section 2.1 before tackling base-pairing breathing as apprehended by mesoscopic models from the more detailed ones with several beads per nucleotide (Section 2.2) to the models with one bead per base (Sections 2.3 and 2.4).
2.1 All-atom numerical simulations
As motivated above, we briefly present several types of simulations where the dsDNA base-pairs are modeled at the atomic level. Since, this Report focuses on the base pairing dynamics of relatively long DNA molecules, which involves many atoms and occurs on timescales larger than the nanosecond, we can anticipate that these numerical models with many degrees of freedom remain by nature limited to address such an issue.
With the help of current development of numerics, it is possible to simulate all-atom DNAs of few dozens of bp Beveridge2004 (); Merzel2007 (). Beveridge et al. reviewed in 2004 the studies of DNA elasticity based on molecular dynamics approaches Beveridge2004 (). They focused on the effect of the sequence on the intrinsic curvature and the flexibility of DNA for lengths up to 25 bp, notably the well-known specificity of A-tracts, and showed that their simulations accounted well for the essential features of experimental observations. For instance, this work provided independent support for the Goodsell-Dickerson “non-A-tract model” of DNA intrinsic curvature. One of the major interest in these studies is to follow the effect of mobile ions (which are of course included explicitly) on the elasticity of oligomers. It was also observed that a typical time of 100 ns is needed to equilibrate ions. Other numerical simulations on 11-bp fragments computed for example the elastic constants as a function of the sequence Lankas2000 (), and found them to be in very good agreement with experimental values. The 5-ns all-atom simulations revealed a marked sequence-dependence of the stretching and torsional rigidities of DNA. In contrast, the bending moduli (though over-estimated by the force-field) appeared to depend weakly on the sequence.
Among these all-atom simulations some of them considered the binding/unbinding of a single DNA base-pair or of short oligomers. Hagan et al. studied the characterization of kinetic pathways Hagan2003 () for a single base-pair in a 3 bp oligomer. Focusing on the flipping of a pyrimidine end base of this oligomer, they followed using transition-path sampling and umbrella sampling the four reaction coordinates (two distances and two energies) that provide a meaningful description of the reaction, associated with the inter-base interaction and the intra-strand stacking one. The simulations revealed two qualitatively different kinetic pathways for the unbinding of the base-pair: one in which the flipping base breaks its intramolecular hydrogen bonds before it unstacks and another in which it ruptures both sets of interactions simultaneously. Moreover they showed that these trajectories are not determined by a stretching of the H-bond (but by a base flipping by rotation around the backbone) nor by penetration of a water molecules and formation ot water-DNA H-bonds.
One the contrary, focusing on the hybridization of a 6 A-T oligomer, some metastable structures and potential barriers have been observed in Ref. Qi2011 (), associated with the water molecules and their Hydrogen-bonding capability with base-pairs. Using the solvent accessible surface area as an order parameter, two different metastable configurations were identified where one water molecule forms a “bridge” between two approaching bases (see Figure 2a), impeding their immediate binding. The potential barrier to be overcome to eventually close the base-pair is then measured to be at room temperature. These non-trivial barriers to base-pairing likely play a role in slowing-down the hybridization dynamics (see also the discussion in Section 5.3).
Using umbrella sampling simulations, free-energy pathways for base-pair opening were also computed for 13 bp oligomers with repeated GA sequence Giudice2003 (). This study showed that the larger purine (A or G) base flips favourably into the major groove and that the unstacking of the first base can be produced with small conformational changes in the DNA backbone. However in a second step once a base is removed from the interior of the helix (again by rotation around the backbone), enhanced backbone bending and untwisting appears. Here again, long-lived water bridges were observed.
Recently, the microsecond timescale (up to 44 s) has been explored using intensive Molecular Dynamics (MD) all-atom simulations of 12 to 18 bp oligomers Perez2007 (); Drsata2013 (); Galindo2014 (). Frequent transient breathing events (breakage of few H-bonds) were observed with a duration comprised between 10 ps and 1 ns. No clear double strand opening could be observed on these s long runs apart from the expected terminal base-pair fraying.
To finish with, Harris and coworkers have been able to investigate the effects of supercoiling on 90 bp minicircles Harris2008 ().
Negative supercoiling can lead to partial denaturation, as we shall explore it in Section 3.5, and this is indeed observed in these molecular dynamics simulations. However, they are limited to simulate 4 ns only even though using a supercomputer, and it is difficult to assert that thermodynamical equilibrium has been reached.
2.2 Numerical coarse-grained models with several beads per nucleotide
In order to tackle longer molecules and timescales, several coarse-grained numerical models have been developed in particular to study denaturation properties of short DNA at equilibriium. In these models, each nucleotide is modeled using two Zhang1995 (); Drukker2000 (); Drukker2001 (); Sayar2010 (), three Knotts2007 (); Ouldridge2011 (); Linak2011 (); DeMille2011 (), or even six Dans2010 (); Zeida2012 () beads. All these models involve classical effective potentials between beads, such as harmonic potentials for covalent bonds, and quadratic angle and dihedral potentials for bending and torsion. The effective non-linear potential chosen for mimicking hydrogen-bonding is either a Lennard-Jones Florescu2011 (); Sayar2010 (); Knotts2007 (); Sambriski2009b () or a Morse one Zhang1995 (); Drukker2000 (); Ouldridge2011 () (see Ref. Dasanna2013T () for an overview of these various coarse-grained models).
The 3SPN (for 3 sites per nucleotide) model first developed by the de Pablo group Knotts2007 (); Sambriski2009b (); Sambriski2009 () is sketched in Figure 2b Florescu2011 (). It seeks to incorporate available experimental information into the development of a coarse-grained description where the three sites are mapped onto the full atomistic representation of each DNA base. Its first evolution 3SPN.1 Sambriski2009c () includes the entropic effect of solvation on interacting strands which enables a spontaneous rehybridization of a bubble. The water solvent is treated implicitly and the electrostatic interactions are taken at the Debye-Hückel level. In the last evolution 3SPN.1-I Freeman2011 (), monovalent and divalent ions are included explicitly which allows one to reproduce the experimental melting temperature for oligonucleotides. A similar model developed just after in Oxford, the oxDNA model Ouldridge2011 (); Sulc2012 () also considers the cross-stacking interactions and ensures a physical representation of the ssDNA. Linak et al. Linak2011 () which also develop a model similar to the 3SPN one introduces cross-stacking (Hoogsteen bonds) as well. In these three models, the effective potential shapes are fitted on potentials of mean force obtained by Boltzmann inversion of the relevant probabilities distributions coming from all-atom simulations. Hence, the effective interaction parameters, e.g. elastic moduli, can be inferred from more realistic atomistic molecular simulations. Moreover since the solvent is implicit, Langevin dynamics simulations are performed with time-step on the order of 5 to 30 fs Sambriski2009c ().
In these works, melting profiles are computed by equilibrating the small oligomers on nanoseconds for various temperatures, and then compared to experimental ones, which is assumed to validate the choice of the parameter values.
In all these coarse-grained models, the denaturation and hybridization kinetics is followed on a maximum of a few hundreds of nanoseconds. In Ref. Zhang1995 (), the hydrogen-bond displacement of base-pairs are simulated on a few ns, which corresponds to the base-pair breathing timescale as defined in the Introduction, where chain orientational degrees of freedom are not equilibrated.
Using the 3SPN model, Knotts et al. Knotts2007 () followed the bubble formation at 360 K (i.e. close to the melting temperature where dynamics is faster) and rehybridization after quenching it to 300 K in a 60 bp DNA sequence. They measured opening and closure times of 15 ns and 44 ns respectively, but as explicitly said in Knotts2007 (), these figures should be viewed with caution since friction is not included is this coarse-grained model. Using the evolution 3SPN-1 and replica exchange simulations (hence varying the temperature), only the average fraction of open base-pairs is dynamically followed on 200 ns but no kinetic information is given Sambriski2009c (). Using transition path sampling Sambriski2009 (), the nucleation pathways in the ssDNA-dsDNA renaturation for oligomers are analyzed which show the significant role played by the sequence. For repetitive sequences the reassociation is non-specific with a molecular “slithering” mechanism, whereas for random sequences favors a restrictive pathway involving the formation of key base-pairs.
To study the early stages of a 14 bp DNA hybridization kinetics, Ouldridge et al. Ouldridge2013 () (oxDNA model) had to accelerate dynamics by choosing a higher (by a factor 16) diffusion coefficient than physical DNA and to implement the so-called Forward Flux Sampling technique. They found that strand association proceeds through a complex set of partially hybridized intermediate states in a temperature-dependent way. More formed base-pairs are required in the intermediate states to render duplex closure probable when gets closer to .
Hence, due to current CPU time limitations, most of these numerical coarse-grained models with several beads per nucleotides only follow the melting and renaturation of short oligomers on times smaller than 1 s. The bubble nucleation and life times are therefore not accessible. One exception is the work by Zeida et al. Dans2010 (); Zeida2012 () where a Langevin dynamics with a time step of 20 ps (3 order of magnitude faster than in 3SPN model) was used. Simulations up to 50 s where done and the results are discussed at the end of Section 4.
2.3 Mechanical models
To go a step further in the physical modeling and coarse-graining process, many mechanical mesoscopic models Yakushevich2004 () have been developed where the DNA chain is viewed as a straight chain of masses (points, spheres or disks) coupled by linear or torsional springs, in the line of the old concept of coupled atomistic chains for solids. Among these models one should distinguish between linear physics models (the potentials are quadratic in the position and angular variables which implies linear equation of motions), which describe the DNA dynamics in terms of phonons Barkley1979 (), and nonlinear physics ones. The latter follow the original ideas of Englander and coworkers in 1980 Englander1980 (), which enable one to study larger excursions of molecules, such as base-pair unpairing by using the appealing concepts of nonlinear physics such as solitons or breather modes. However, as pointed out by Yakushevich himself in Ref. Yakushevich2004 (), processes of dissipation are completely neglected in these mechanical models (this is why we call them “mechanical”). This major approximation is discussed in Section 2.4.
Although linear models do not permit to study the base pairing/unpairing dynamics, they are able to yield the typical time scale of these mechanical models. In the continuum approximation ( where is the distance between neighbouring disks or masses), three acoustic branches appear associated with torsional, longitudinal, and transverse modes. The frequencies are generically given by the ratios of the associated elastic modulus along the DNA axis and the mass (or moment of inertia ),
When the double-stranded structure is taken into account, usually with two coupled chains, optical branches appear with a second eigenvalue
where is the coupling elastic constant between masses of the same base-pair. Adjusting these formula to typical values of sound velocities in DNA, , (both for longitudinal and torsional waves) which are measured around m/s Yakushevich2004 (); Yakushevich2002 (), and using equal to a few Angströms, and (where kg is the proton mass) yields equal to a few N/m. Moreover several works have measured by microwave and infrared absorption/transmission experiments equal to cm to 80 cm Kim1987 (); Yakushevich2004 (); Yakushevich2002 () which yields smaller than 1 N/m. This estimation will be useful below when investigating the non-linear model predictions.
Non-linear models have indeed been extensively studied in the context of DNA. The first works dealt with single rod-like models where the potential between successive monomers were chosen to be non-quadratic both for longitudinal Muto1989 (), transversal Christiansen1990 (); Christiansen1993 () or bending modes Ichikawa1981 (). However, the main interest of non-linear models is that large excursions of bases, and therefore base-pair unpairing, can be accessible with a proper modeling of the interactions between the bases of the two single strands. Assuming simple two-body potentials, analytical solutions emerge as solitons by analogy with mechanical solitons in pendula chains as first shown by Englander Englander1980 () and Scott Scott1969 (). Two main models can be distinguished.
One of the most famous double rod-like model that leads to solitonic solutions is due to Yomosa Yomosa1983 (); Yomosa1984 () further developed by Takeno Takeno1983 (), Yakushevich Yakushevich1989 (); Yakushevich2001 (); Yakushevich2002 (); Yakushevich2004 () and others Salerno1991 (); Gaeta1992 (), where the contribution to base-pair opening comes from rotational motions of the bases around the backbone. The important variables are thus the rotational degrees of freedom of base monomers with respect to a rotational axis, chosen as (O), parallel to the duplex axis (see figure 3). In its continuous version, the Hamiltonian reads
where are the angles defining the bases orientations (on the two strands labelled by , see Figure 3). The bases have radius and are assumed to be in contact, is their moment of inertia along , is the distance between base-pairs, and and are the longitudinal and transversal spring constants. The terms in the square brackets correspond to the classical sine-Gordon Hamiltonian for a chain of torsion springs Scott1969 (); Englander1980 (), whereas the last term models the interaction potential between both strands. The linearization of Eq. (4) leads to the same frequencies as Eqs. (2,3) where the mass is replaced by in the symmetric case. The full mechanical problem associated with Eq. (4) is difficult to solve exactly but it has been shown within some approximations that it leads to at least 4 soliton-like solutions where (where is the wave velocity) increases by , corresponding to local breaking of base-pair bonds, i.e., in the present context, to small denaturation bubbles. Using the method of Hereman et al. Hereman1986 (), it can be shown analytically that an approximate solution is
where , , and thus . The coordinate is the center of the soliton moving at constant velocity . Typical values of are between 0.5 to 1 Yakushevich2002 (). Hence denaturation bubbles of typical size travel at a velocity slightly lower than the sound velocity . Using the values of and cited above (respectively 1 and a few N/m) leads to bubble sizes between 2 to 10 bps. More details about these types of models can be found in the book by Yakushevich Yakushevich2004 (). Several extensions have been developed in recent years Gaeta2006 (); Cadoni2007 (); Gaeta2008 (); Tabi2008 (); Vasumathi2009 (); Vanitha2012 (). One drawback of these models is that the large fluctuations of the inter-strand separations are not allowed (the distance between the strand backbones is maintained fixed), which makes difficult a proper description of all denaturation bubbles. In addition, the number of effective parameters can become very large when the model is refined further Yakushevich2001 (), which makes their estimation very difficult in practice.
In the alternative model proposed by Peyrard and Bishop Peyrard1989 () and later extended by Dauxois et al. Dauxois1993a (); Dauxois1993b (); Peyrard2004 () and others Cule1997 (); Campa1998 (); Barbi1999 (), rotational motion of the bases is ignored, and base-pair opening is now modeled by the stretching of hydrogen-bonds, where the essential variable, , is the distance between the two bases of the base-pair of index . The two strands are also supposed to remain parallel. In addition to a non-linear Morse potential that models the interaction between bases of the same pair (a Lennard-Jones potential is also used in Ref. Muto1990 ()), another potential describes the stacking interaction between the adjacent bases and of the same strand. The part of the Hamiltonian which depends on is therefore
where , nm, and the stretching modulus is N/m.
The dynamics of such a system has been extensively studied numerically using for instance molecular dynamics simulations coupled to a Nosé thermostat Dauxois1993a () or Langevin simulations Joyeux2005 (); Kalosakas2006 (); Alexandrov2009b (). They show small regions of a few base pairs where reaches a large amplitude which have been named as DNA breathers. It has been suggested by the authors that these breathers could be the precursors of the denaturation bubbles observed in experiments. Again Eq. (6) leads to so-called non-linear waves (breathers), and using a multiple time expansion of the low amplitude expansion of the continuous dynamical equation, solitonic solutions appear. The order of magnitude of the frequency Peyrard2004 () is the same as in the Yakushevich model, and is given by the usual dispersion relation for planar waves
corresponding to the optical frequencies [Eq. (3)] in the limit . Interestingly, this approach also describes, in the continuous limit, the domain wall between a large denaturated region and a closed one as the linear growth of a kink, the typical size of which is given by bp. Indeed, the dimensionless parameter , with the parameter values given above.
This PBD model has subsequently been improved in two directions. When comparing experimental DNA denaturation curves, the PBD model yields a too smooth transition. They thus added a non-linear coupling (, nm) in the spirit of the Poland-Sheraga model PSbook () in which the stacking interaction (or cooperativity) parameter narrows the transition. The solution of this new potential can only be obtained numerically. Other types of non-linear couplings have been investigated Joyeux2005 (); Buyukdagli2006 (); Buyukdagli2007 (), taking advantage of the experimental knowledge of stacking interactions from thermodynamics calculations. Moreover, Buyukdagli et al. inserted a finite stacking by allowing the bases to move in the plane perpendicular to the strand axis Buyukdagli2006 () hence extending the number of degrees of freedom.
Barbi and collaborators Barbi1999 () extended further the PBD model by taking the helicoidal geometry into account, but still assuming that the DNA is infinitely stiff. The Hamiltonian has thus two variables per base-pair which are coupled through a geometrical constraint. The usual distance between the bases, noted , has an equilibrium value , and the twist angle is such that the difference is forced to be close to its equilibrium value ( bp is the B-DNA pitch) by a new term in the Hamiltonian
where is the equilibrium distance along each strand between bases and ( is the distance along the DNA axis). Non-linear localized breather-like solutions are obtained using the same approximate methods as above, showing a local untwisting (kink) of 0.001 rad. Molecular Dynamics simulations were performed very close to the melting temperature on the 10 ns timescale Barbi2003 (), and showed breathers of a 5 to 10 bp. A central peak was also observed in the dynamical structure factor at ps, attributed to the dynamics of bubble boundaries. Campa Campa2001 () observed numerically that a breather, created by locally unwinding the DNA of rad, travels on long chains made of bp both for homogeneous and heterogeneous DNA sequences. Gaeta and Venier have shown that traveling solitary wave solutions cannot exist for physical values of the parameters and must be associated with a global over-twisting of the helix Gaeta2008 (). Note that another type of interaction has been proposed to take into account the helicity in Refs. Dauxois1991 (); Zdravkovic2011 ().
2.4 Mechanical models are not adapted to all denaturation bubbles dynamics
Both Yakushevich and PBD-like models have first been developed to study DNA dynamics, analytically Peyrard1989 (); Barbi1999 (); Peyrard2004 (); Sulaiman2012 () or numerically Peyrard1992 (); Dauxois1993a (); Barbi2003 (); Alexandrov2009b (); Kalosakas2006 (). Both viscous and stochastic forces induced by the surrounding aqueous solvent were not included in the first versions of the models. It is clear that for a nucleo-base in water, the “coasting time”
which corresponds to the time spent for a mass of typical size to coast in a fluid of viscosity thanks to inertia Purcell1977 (); Lauga2009 (); Manghi2006 () is 0.1 ps (the base plus backbone mass is kg, the maximal backbone-base distance is nm, and Pa.s). Hence for times longer than 0.1 ps, all inertial effects are completely damped DoiEdwards2004 (); Chaikin1995 (). This is an important point, since the solitonic solutions of these mechanical models come from wave-like equations where inertial terms in are central in the theory.
This point has been recently raised by Frank-Kamenetskii and Prakash in their critical review FrankKamenetskii2014 () and admitted by Peyrard and Bishop in their comment Peyrard2014 () and even earlier in Ref. Peyrard2009 ().
Several attempts to introduce a friction force in the mechanical models have been proposed Yakushevich2002 (); Zdravkovic2011 (); Sulaiman2012 (); vanErp2005 (); Das2008 (); Deng2008 (). For instance, in the geometry of the PBD model, this friction force is written as where kg/s is the friction coefficient of a base-pair. By accounting for the interactions forces where is the base-pair potential energy, and the Langevin stochastic force , one obtains the Langevin equation:
where and . The damping coefficient corresponds to the inverse of the coasting time defined above. For a nucleotide base it is thus around 10 ps.
Since the success of mechanical models is based on the emergence of solitons, inertial terms where kept and several values of the damping coefficient where introduced. To simulate properly inertial effects, one needs , the simulation time step, usually on the order of the femtosecond (hence a simulation of typically time steps corresponds to a real time of a few nanoseconds). Commonly to in the numerical resolution of the Langevin equations. Therefore the damping coefficient values were chosen very small so that is large enough and simulation durations in real time are of physical interest. They range from ps Joyeux2005 (); Buyukdagli2006 (); Buyukdagli2007 (); Sanrey2009 () to ps Yakushevich2002 (); Vasumathi2009 (); Hien2007 (); Das2008 (), values which are much smaller than the value of 10 ps estimated above444Note that, using the PBD model, some thermodynamic quantities such as the time-averaged fraction of open base-pairs are underestimated when the damping coefficient is too small, smaller than 0.05 ps Das2008 (); vanErp2009Comment (); Das2009Reply (). This is presumably due to the fact that the system did not reach equilibrium..
These damped mechanical models lead to the same conclusions: viscous damping induces a quick stoppage of the soliton which thus travels less than 10 bp Yakushevich2002 (); Vasumathi2009 () and decreases drastically the soliton velocity Hien2007 (); Sulaiman2012 (), and the “bubble” lifetimes which becomes on the order of a few picoseconds only Alexandrov2006 (); Alexandrov2009 (); Alexandrov2010 (). Deng Deng2008 (), using a stochastic differential PBD equation, found base-pair opening times of about 10–400 ps. They propose to obtain opening times on the order of microseconds by choosing an extremely small damping coefficient of ps corresponding to a non-physical base-pair radius nm.
In order to compensate the slowing down effect due to friction, some studies have been done at temperatures close to . Joyeux and collaborators Joyeux2005 (); Buyukdagli2006 (); Buyukdagli2007 (); Sanrey2009 () focused on the melting transition such that large bp excursions are observed.
Das and Chakraborty Das2008 () simulated numerically the nucleation of denaturation bubbles larger than 4 to 5 bps close to the denaturation transition (only C was explored) such that the opening was possible in time steps.
All these results are coherent with the simple estimate of a soliton (breather) which survives during and travels on a distance where is the soliton velocity (on the order of the sound velocity). If one uses the real values measured experimentally (see Section 2.3) and includes the proper friction coefficient values, one finds that solitons travels on a distance to nm during less than 1 ps. These short-living opening events therefore clearly do not correspond to a complete opening of a bubble but to breathers defined by small distortions of the distance between bases less than 1 nm (indeed the threshold for opening are taken in these works to lie between 0.05 Ares2005 () and 0.3 nm Alexandrov2009c ()), as discussed in the Introduction.
Since in a highly viscous medium such as water, the motion is diffusive for times larger than ps, larger bubble-breather lifetimes of a few ps at room temperature are coherent with the rough picture of a particle diffusing in an harmonic potential with spring constant , the correlation function of which is DoiEdwards2004 ()
where . Inserting the values given above, N/m, kg/s at room temperature, one finds ps and a mean-squared deviation of the intra-base pair distance of nm. Note that the entropy associated with the chain orientational degrees of freedom and the forces exerted by the adjacent base pairs do not modify quantitatively this rough estimate.
In all these models, base-pair breathing in dsDNA appears at a typical timescale of a few ps, in any case ns. It thus raises a crucial question: what is missing in the model to reach the microsecond life time observed in some experiments Warmlander2000 (); Altan2003 (); Phelps2013 ()? As it will be discussed in the next Sections, many chain degrees of freedom are not taken into account in the previous models, which can induce a free-energy barrier between the closed and open states.
To reach accessible opening times with friction included in the model, Duduiala et al. Duduiala2009 () defined a non-linear stochastic differential model to study the breathing of a structural defect, a thymine base being replaced by a difluorotaluene (F) with only one hydrogen-bond for the A-F base-pair. By fitting the parameters using MD simulation data, they deduced the potential of mean-force that should be introduced in the mesoscopic model to reproduce the MD results. The damping coefficient was found to be around 3 ps, and importantly, a barrier from the closed state to the breathing one was found around for under-twisted DNAs.
Peyrard and co-workers proposed in 2008 to introduce an ad hoc barrier of which enforces the times scales to be larger by 2 or 3 orders of magnitude Peyrard2009 (). The time scale is then obtained by the Kramers process escaping controlled by the base-pair diffusion. The origin of this barrier is not made explicit in Ref. Peyrard2009 () and several unknown parameters are introduced. Some of them are determined through the melting curves which turn to be much sharper. Indeed entropic effects sharpen the melting transition as explained previously by Cule and Hwa Cule1997 () who showed that a -dependent stretching coefficient, , introduces an entropic barrier. They obtained opening times of about 7 ns but no systematic study as a function of the barrier height or the friction coefficient (not given) was performed. Moreover, it thus shows that at these timescales the velocity distribution function has relaxed towards a Maxwellian as expected at timescales . Local thermodynamical equilibrium has been reached, confirming the fact that the dynamics is controlled by diffusion.
2.5 Chain dynamics timescales
In the preceding section, it was argued that to be able to catch the base-pairing dynamics at the microsecond scale, as observed in the experiments, some additional degrees of freedom must be included in the modeling. This makes the connection with polymer physics, where it is well known that the polymer entropy controls the typical time scales of polymer dynamics PGGbook (); DoiEdwards2004 (). The Rouse model for the dynamics of a polymer chain of physical monomers ( is the Kuhn length) leads to the Rouse relaxation time of the mode (i.e. corresponding to a chain section of physical monomers)
where is the perpendicular component of the friction coefficient (per unit length) of a rod of length . Note that is also the characteristic relaxation time of bending fluctuations. Indeed, the bending energy of a polymer parametrized by where is the curvilinear index is
The equation of motion of is given by balancing the friction and bending forces, which leads to
in the directions perpendicular to . Hence the relaxation time for a segment of length is , which, for the Kuhn length where nm is again the chemical monomer (base-pair) length, leads to . Putting numbers yields s for dsDNA ( nm) and ns for ssDNA ( nm).
Note that when excluded volume is taken into account the Rouse time (Eq. (12) for the slowest mode ) of a chain of monomers and end-to-end distance where is the Flory exponent, can be generalized using a simple scaling argument
because it is assumed that the total chain friction coefficient is simply in the free-draining regime ( is the physical monomer friction coefficient). When hydrodynamics interactions are included and using the pre-averaging Kirkwood approximation PGGbook (); DoiEdwards2004 () one obtains the Zimm time
where the friction coefficient is the one of a sphere of radius (non-draining regime). It has been shown experimentally (Fluorescence correlation spectroscopy experiments) Petrov2006 () that hydrodynamic interactions control the segmental dynamics of dsDNA. Hence, in principle, the dsDNA relaxation time follows the Zimm scaling Eq. (16).
The observation that the largest measured lifetimes of DNA denaturation bubbles (as described in the Introduction) turn to be on the order of timescales of bending fluctuations led some authors to suggest that bending fluctuations play a major role in DNA bubble dynamics Jeon2006 (); Dasanna2013 (). Indeed, it has been shown previously that the huge difference in bending moduli of ss- and ds-DNA plays a major role in the DNA equilibrium statistical physics Palmeri2007 (); Palmeri2008 (); Manghi2009 (), and it presumably influences the bubble dynamics too. These works will be discussed in Section 4.7.
One attempt in coupling bending fluctuations to base-pairing ones has been done by Jeon et al. Jeon2006 (); Kim2008 () who developed a breathing DNA model by explicitly considering two different bending rigidities for denaturated ss segments and ds ones. Using a ladder model with two interacting single WLC strands through a Morse potential, they studied the interplay between bending motions and bubbles dynamics for DNAs of 300 bp. Starting with a railroad-track configuration, they simulated the DNA on hundreds of ns and examined the bubble lifetime as a function of their size. At room temperature they observed lifetimes on the order of 5 ns for bubbles of 10 bp, which slightly increase close to the melting temperature . These measured times are much smaller than the relaxation time of the chain, which is between the Rouse time of one ssDNA and the one for a dsDNA, i.e. between 1 to hundreds of s (note that 300 bp corresponds to for a dsDNA). Hence the chain is far from being equilibrated and essentially keeps its initial railroad-track conformation during the whole simulation. Furthermore, they cannot have a significant sampling of chain degrees of freedom with such a model. To properly study the interplay between the bending motions and the denaturation bubble dynamics, longer simulations are thus necessary.
To conclude this section devoted to all-atom, coarse-grained Molecular Dynamics and mechanical mesoscopic models, the explored timescale with these models is generally less than s and therefore much smaller than the relaxation time of a dsDNA segment of size equal to the Kuhn length, s. Hence these models implicitly assume that conformational chain degrees of freedom are frozen when base-pairing evolves. The base-pair dynamics which is followed with these models is therefore the base-pair breathing, i.e. the local opening of few base-pairs rapidly followed by their immediate renaturation. These transient bubbles have lifetimes ns, during which chain degrees of freedom cannot relax. We have discussed why the transient bubble dynamics should essentially be governed by a diffusive process in the potential well, close to the dsDNA equilibrium structure.
A whole category of base-pairing processes cannot be addressed reliably in this mesoscopic context. Additional models have thus been developed to tackle DNA dynamics on longer timescales, e.g., the milli-second one as studied experimentally by Altan-Bonnet, Libchaber and Krichevsky (ALK) Altan2003 (). Reviewing them is the goal of the next Sections. In particular, we shall see that far-from-equilibrium processes can then be at play at the chain level, which definitely precludes the use of effective models where chain degrees of freedom are considered as frozen, but underlines the need of models that consider the coupling between chain orientational degrees of freedom and the internal ones related to base-pair dynamics.
3 One-dimensional pre-averaged dynamical models
As discussed in the Introduction, after the discovery of the double-helix structure of DNA, the first physical mesoscopic models to be proposed in the early 1960s were inspired by the 1D Ising model of solid-state physics Lazurkin1970 (); Wartell1972 (). As it is common in statistical physics, when trying to account for experimental observations, one starts from the simplest reasonable model where the maximum number of degrees of freedom are pre-averaged. The simplest models assume that a base-pair is simply either open or closed. The DNA molecule is seen as a ladder, the helicity being secondary at this level of modeling, the rungs of which are the hydrogen bonds between base-pairs, either broken or unbroken. The conformational degrees of freedom of the semi-flexible polymers, whether ds- or ss-DNAs, involved in the physical description of the system are thus considered to be pre-averaged as if they were in equilibrium. This is equivalent to consider the chains in a quasi-static approximation, and is the essence of the zipper model to be presented now, before discussing its more elaborate successors in various biophysical experimental contexts.
3.1 Zipper model
Zipping (sometimes also called zippering) is in fact the second stage of renaturation (or (re-)hybridization) of DNA, which is the formation of duplexed strands from two complementary single strands after annealing from a high-temperature, fully denaturated state. Indeed, renaturation occurs in two stages, first a generally limiting, complex-forming or nucleation step (see Section 4.2), followed by a faster, processive and complete closure of the double helix, usually known as zipping Wetmur1968 (); Cantor1980 (); Sikorav2009 ().
The question of the kinetics of the renaturation of DNA was addressed shortly after the discovery of its double-helix structure in 1953, as detailed in the early review Marmur1963 (). These experimental studies essentially focused on the establishment of the second-order character of the reaction, and possible deviations from it, without any experimental access on the faster intramolecular zippering stage, because second-order kinetics only take into account complementary strand recognition and not the following irreversible zipping stage, once recognition has been ensured. Furthermore, severe complications inherently arose from the insufficiently controlled quality of the DNA molecules due to the extraction processes, most nucleation events leading only to partial Watson-Crick base-pairing of the fragments, and single strand regions eventually remaining unpaired (“dangling tails”) Britten1976 (). It has not been possible to work with large amounts of well-controlled, strictly complementary strands before the discovery of molecular biology techniques (PCR) in the 1980s Sikorav2009 (). It was then possible to measure finely melting curves, as well as association and dissociation rates which both display Arrhenius behaviors (see e.g. Paner1992 (); Morrison1993 (); Yamashita2008 ()).
From a theoretical perspective, first questioning was raised by Flory as soon as 1961 Flory1961 (), together with helix-coil transition of -helices, even though adopting a quite basic approach, essentially amounting to a biased random walk in one dimension (the bias of which depends on temperature in order to match Boltzmann distributions at equilibrium). Indeed, the zipper model (Figure 4), one of the dynamical counterparts of the 1D Ising model, has been developed as an effective model of DNA double helix formation. It supposes for simplicity sake – and assumes this approximation to be correct for sufficiently short constructs – that the partially double-stranded molecule is allowed to have only one unbroken sequence of successive paired bases growing from the initial nucleus because successful nucleation events are rare Porschke1971 (); Porschke1974 ().
Once a successful nucleation event has occurred, processive, base-by-base zipping can proceed. The zipper model fundamentally assumes that the renaturation process can be described by a one-dimensional reaction coordinate (the number of successive closed base-pairs) and thus it implicitly supposes that chain orientational degrees of freedom can be pre-averaged so that the quasi-static motion is diffusive in a one-dimensional free-energy landscape. It can easily be analytically solved as a biased 1D random walk Zimm1960 (); Applequist1963 (); Bicout2004 (). If one denotes by and the closure and opening rates (Figure 4) and by the average free-energy gained when closing one base-pair (including the chain conformational entropy)555 at room temperature (25C K) when averaging over the sequence and at physiological temperature (37C K) where entropic effects get stronger. The loss of entropy when closing a base-pair is at these temperatures SantaLucia1998 ()., then the detailed balance leads to
As expected, if the molecule unzips () and if it zips (). Close to , and Wartell1972 ().
The averaged velocity of the junction is given by . The velocity is also given by where pN is the driving force exerted on the DNA at the fork junction associated to the variation of the DNA free-energy when one base-pair opens or closes. It is thus implicitly assumed that is essentially uniform along the chain included at the sub-base-pair length-scale. The friction coefficient is kg/s by assuming that only one base-pair moves, with nm the base-pair hydrodynamic radius. The characteristic time needed to (un)zip a single base-pair is then for , i.e.
and the average zipping time of a chain of base-pairs is
This model anticipates ns. We shall discuss this order of magnitude below when comparing it to experimental values. The underlying issue is whether the single base-pair closure or opening is controlled by as supposed above, or by a (yet unknown) microscopic barrier. This issue will be discussed in Section 5.3
The diffusion coefficient of the 1D random walk is equal to . A short calculation leads to
For the special case , i.e. in the limit , one recovers from Eq. (20) the Einstein relation , as expected close to equilibrium. The relaxation time is then given by the 1D diffusion law
with ns is the characteristic diffusion time of a base-pair to be discussed in Section 5.3 as well.
As compared to early studies, the zipper model can be refined further, for example by taking some form of cooperativity between base-pairs into account in an empirical way Murugan2002 (). In general, it is assumed for simplicity sake that the binding energies of all base-pairs are identical, as a first approximation, thus ignoring possible sequence effects. Jarayaman and coauthors Jayaraman2007 () (see also Szu1979 ()) went further into the analysis of this one-dimensional effective model by assuming that a base-pair can be closed even though its two neighbors are open, but that this is less probable than the closure of a base-pair a neighbor of which is already closed (the basic assumption of the original zipper model). The authors propose an analytic expression for the time-correlation function between nearest neighbors and next-nearest neighbors through a perturbation approach. However, this calculation is limited to the special case where rate constants for reverse and forward reactions are identical in order to ensure the Hermitian character of a transition matrix. It limits the applicability of this approach to the strict vicinity of the melting temperature.
Applying the zipper model or its further developments notably accounts to ignore staggering, slithering or transitory hairpin formation, that has been shown both experimentally and numerically to play a role by slowing down the hybridization process because of the presence of either complementary sub-sequences in single strands Rief1999 (); Montanari2001 (); Gao2006 (); Schreck2015 () or repetitive, periodic motifs Sambriski2009 (); Sambriski2009b (); Hinckley2014 (). Pushing further the one-dimensional biased random walk point of view in the continuous limit and solving convection-diffusion equations, it can for example be shown Redner_private () that allowing the formation of one or several defects during zipping (such as secondary mismatched partial hybridizations) can significantly delay final closure because the system gets stuck in a dynamical trap: it can be long before the erroneously wound segment unwinds, all the more reason when the temperature is low and the random walk is biased. This effect can in principle be quantified. A reduction of hybridization rates by a factor 10 is given in Ref. Schreck2015 (). Consequently, zipping time predicted by the zipper above can only give the lower bounds of real closure times.
Furthermore, independently of the quality of these different approaches, we shall see in Section 4 that because of time-dependent, non-local frictional forces that depend on the chain conformation, zipping of long molecules is better described by far-from-equilibrium approaches. It cannot be fully understood through this over-simplified zipper model because it relies upon the strong assumption that the chain is close to equilibrium (differently said in a quasi-static regime). For the relevant regimes of driving forces, the polymer is likely far from equilibrium because it cannot respond to the driving force all at once.
3.2 Bubble closure in the Poland-Scheraga landscape
In the zipper model, the chain thermodynamical contribution is completely included in the local parameter . An improvement of this simple model has been proposed by Poland and Scheraga PSbook () where the entropy of the denaturated strands (seen as a ssDNA loop of size ) is taken into account in a non-local term depending on in the free-energy. Given the predictive character of the Poland-Scheraga model for DNA thermodynamics, and especially the DNA denaturation, many works on DNA base-pairing dynamics also explored the diffusive dynamics in the Poland-Scheraga free-energy landscape (see for instance the review by Metzler et al. Metzler2009 ()). In this framework the mean life time of an initial bubble of size has been the focus of several studies which are surveyed here. It is important to note that this model only applies for the zipping of a bubble located in the middle of the DNA. For DNA zipping in the Y-shape, it reduces to the zipper model.
The Poland-Scheraga model is an extension of the zipper model by Zimm and Bragg Zimm1959 (); Zimm1960 () which can be reformulated as an Ising model Wartell1972 (). At each base-pair is associated an Ising variable which can take two values, for a closed base-pair and for an open base-pair. The Ising Hamiltonian is therefore
where corresponds to the free energy to pay to break one base-pair (whatever the state of the neighbouring base-pairs), the energetic cost to create a domain wall between two adjacent segments in different states and the free energy difference between two adjacent base-pairs closed or open666In the limit or with periodic boundary conditions, and play the same role in the model and can be embedded in a single parameter, .. They are related to the melting temperature (for a homopolymer) by . Note that and the previously defined are simply related, . The cooperativity parameter controls the width of the denaturation transition. Poland and Scheraga propose to add a loop entropy factor to this parameter,
where the denaturation bubble of size is viewed as a closed flexible ssDNA polymer loop of size . This is therefore an entropic term which favors the closed state. The empirical “stiffness” parameter , first introduced by Fixman and Freire Fixman1977 (), takes implicitly the rigidity into account at small values777A reasonable value of would be , the number of base-pairs in a ssDNA Kuhn length. However, when comparing the Poland-Scheraga model to experimental denaturation profiles, the fitting value for may be as large as 100, which has been correlated to the fraction of stacked residues in the loop Blake1987 (); Palmeri2008 (). and the loop exponent (where is the dimension and the Flory exponent). Hence, in 3D, for a phantom chain and for a self-avoiding chain. If excluded volume interactions are taken into account between loops, Kafri2000 (). The free energy of a bubble of size reads
Hanke and Metzler simplified Eq. (24) by neglecting and studied the Smoluchowski equation (in the approximation of the continuous limit) for the probability density to have a bubble of size at time Hanke2003 (); Banik2005 (); Chaudhury2009 ()
where as in the zipper model, is the diffusion coefficient, being the rate constants to close/open a base-pair, taken as adjustable parameters in the model. The drift term is .
Although appealing, the Smoluchowski equation Eq. (25) in a Poland-Scheraga energy landscape suffers from two major assumptions: (1) As observed by Metzler et al. Metzler2009 (), it is again assumed that is the slow variable of the system compared to the chain degrees of freedom. Especially, the loop is assumed to be in a local thermal equilibrium at any given time during its evolution Bar2007 (). We have previously discussed in Section 2 that this assumption is questionable. (2) The single strand rigidity is completely ignored, in particular for small bubbles , it is known that the loop entropy factor in Eq. (23) is wrong since it has been calculated in the limit PGGbook ().
Thanks to a mapping to the quantum Coulomb problem (the potential of which is in and Eq. (25) is equivalent to an imaginary time Schrödinger equation), bubble lifetime distributions as well as auto-correlation functions have been computed analytically Fogedby2007a (); Fogedby2007b (); Bar2007 (); Bar2009 (). The dynamics depends on the sign of :
For () the loop entropy factor plays a negligible role as soon as is large enough. In particular, the mean bubble lifetime, for an initial bubble size , is
where is the same characteristic time as in the zipper model (see Eq. (18)) for not too large, and is the Bessel function of order . The mean bubble lifetime scales as as for the zipper model, the factor which depends on the value tends to 1 for large bubbles, .
At the melting temperature , since , the dynamics is uniquely controlled by the loop exponent . The mean bubble lifetime is , and scales as as a signature of any free diffusion in 1D.
For , the picture of one single bubble is no more valid since the dynamics evolves towards the chain denaturation.
Sequence effects have been studied following essentially the same approach but in the discrete form and with Ising parameters depending on the sequence. Hence the problem has been solved numerically for the sequence of the ALK experiment Altan2003 () or various biological sequences Ambjornsson2006 (); Ambjornsson2007a (); Ambjornsson2007b (); Talukder2011 (). The coalescence of two DNA bubbles initially located at weak (AT rich) domains and separated by a more stable (GC rich) barrier region in a designed construct of dsDNA has also been studied in Ref. Pedersen2009 (). In these works, the time averaged fluorescence autocorrelation function has been fitted, and being the adjustable parameters. Probability distribution functions connected to the autocorrelation function have different forms for small and large bubbles Bandyopadhyay2011 (). For small bubbles the dynamics depends essentially on the exponent , whereas for large ones the key parameter is .
Changes of temperature have also been studied using this model, either under conditions of rapid heating from a subcritical to the critical temperature Murthy2011 () where a bubble of size grows in a time , or with a temperature ramp which induces some hysteresis effects Allahverdyan2009 ().
Other Langevin dynamics, based on the Poland-Scheraga free-energy landscape, were studied. Kunz et al. Kunz2007 () describes the dynamics in terms of the order parameter , again controlled by . If , decreases exponentially with typical time in for and for close to (and a logarithmic dependence in the case ).
To sum up, the important underlying assumption in describing the closure of DNA bubbles using the free energy of the Poland-Scheraga model of Eq. (24), where all the degrees of freedom except the number of open base-pairs have been integrated out, is that these degrees of freedom equilibrate much faster than . However we have seen that some conformational degrees of freedom of the DNA chain such as bending, but also torsion and entropic stretching, have long relaxation times. This is the reason why the simple scaling laws found within this approach, at and at are the same as the ones of the zipper model and do not display anomalous exponents, as measured in numerical experiments. We shall return to this issue in Section 4 with more elaborate arguments.
3.3 Mechanical unzipping below the melting temperature by applied force
We now address force-induced unzipping well below the melting temperature, which is of experimental relevance. The fundamental motivation is to give solid foundations to more complex in vivo situations where active, force-induced unwinding is performed by helicases or RNA polymerases Alberts2002 (), which exert a force on the order of piconewtons. In the single-molecule experiments, in general close to the physiological conditions, two dsDNA linkers are attached to the ssDNA ends, and are used to pull ssDNA ends apart (Figure 5a) Essevaz1997 (); Bockelmann1997 (); Bockelmann1998 (); Bockelmann2002 (); Bockelmann2004 (); Danilowicz2004 (). The Y-shape assumption is exact in this case because temperature-activated bubbles are extremely rare in dsDNA at physiological temperature. Note that this unzipping process where the applied force is transverse to the molecule should not be confused with DNA structural changes induced by a force applied longitudinally, as studied for example in Ref. Manghi2012 () (and references therein).
In practice, the velocity of the linker extremities is held constant while the applied force is monitored from the measurement with a sub-nanometer precision of the relative positions of the solid supports to which the linkers are attached. The dsDNA begins to open when the force is larger than to pN (at low velocity), the lower value corresponding to the pure AT-limit and the higher one to the pure GC-limit Bockelmann1997 (). Rief et al. measured by AFM the forces at equilibrium pN (resp. pN) for pure AT (resp. pure GC) constructs Rief1999 (). More generally, the unzipping force vs displacement plots (Figure 5b) present reproducible features which are governed by the sequence being opened. The molecule to be opened is several kbp long, e.g. a -phage DNA of kbp. Since twist must be evacuated at one of the molecule extremities, a part of the molecule will rotate and one expects that the ensuing rotational friction torque should contribute to oppose the torque associated with the applied force needed to unzip DNA.
In the low velocity regime, m/s, i.e. typically kbp/s, one observes that the force intensity does not depend significantly on the velocity (Figure 5b). At higher velocities, a significant increase of unzipping force is observed Thomen2002 (), indicating that the rotational friction torque on double-stranded DNA leads to an additional contribution to the opening force, this effect increasing with the length of the molecule. This torque is estimated to be on a -phage DNA at m/s.
The whole process can be reversed (), thus leading to DNA rehybridization, also sometimes called re-annealing. The signal presents hysteresis, an effect that increases with velocity. At velocities kbp/s, the force measured during zipping decreases significantly, indicating that the “velocity of re-annealing of the molecule is less than the velocity imposed” by the experimental device Thomen2002 (). This provides an estimate of the zipping velocity of long duplexes, bp/s which will be useful in Section 4.
As far as reversibility is concerned, the process is quasi-static at low velocities typically slower than 1 kbp/s for a 50-kbp dsDNA, and the zipper model is fully relevant provided that an additional force is applied at the fork level, the external force transmitted by the linkers. The ssDNA polymers being held under a tension of pN, their fluctuations are significantly suppressed. Chain fluctuations that were integrated out in the zipper model in a disputable way become in fact unessential. Note that the entropy gain when going from the ds to the ss form is consequently lower, which increases the value of that should in principle be used. At higher velocities , the equilibrium approach ceases to apply and out-of-equilibrium effects cannot be ignored anymore Cocco2002b (): (a) In the unzipping regime, the rotational viscous drag on the dsDNA becomes important, as discussed above. (b) In the zipping regime, the rehybridization is slower than the imposed velocity and resembles far-from-equilibrium zipping in absence of force, where the viscous torque delays the rewinding of the single strands. This situation will be discussed in detail in Section 4.3 below.
From a theoretical point of view, Cocco, Monasson and Marko proposed a model where the different contributions to the free energy are carefully taken into account, notably the elastic energies of the different involved DNAs displayed in Figure 5a Cocco2002b (); Cocco2001 (). The precise relation between the total extension and the number of open base-pairs can be inferred at equilibrium, when . By carefully comparing equilibration times and (un)zipping timescales, they demonstrated that dsDNA and ssDNA stretching degrees of freedom are in equilibrium on experimental timescale for the -phage DNA discussed above.
The relaxational dynamics of the fork position is governed by the equation
In this equation, the fraction in the left-hand-side is the total force acting on the fork base-pair: is the variation of the elastic energy stored in each ssDNA when closing a base-pair, at fixed tension and fixed single-strand extension . It accounts for the extensibility of ssDNAs, described, e.g. as freely-jointed chains PGGbook (). Its form is not central at this stage but it satisfies at zero torque and at equilibrium, where is the unzipping force at zero torque and at equilibrium. Furthermore is again the friction coefficient of a base-pair (here for sake of simplicity, the base-pair hydrodynamic radius and the dsDNA monomer length are assumed to be equal). The angle rad is the equilibrium twist angle between adjacent base-pairs ( bp is the B-DNA pitch) and relates the number of denaturated base-pairs and the total rotation angle of the dsDNA through (see Figure 5a). Thus . Finally,
is the rotational friction torque exerted on the dsDNA assumed to be linear, in the Rouse regime DoiEdwards2004 (), where nm is the dsDNA radius. Combining Eqs. (27,28) leads to a first-order ordinary differential equation on the fork position :
where the typical time is linear with :
The so-obtained evolution equation also applies to rezipping, at least at low velocities where ssDNAs remain under tension. At fixed as in experiments888Note that the definition of differs by a factor 2 between the experiments and this theoretical work because here the total molecule extension is denoted by , as illustrated in Figure 5a. The conversion rule is ., . Changing the variable from to in Eq. (29), expending the force and the fork position at order 1 in , and plugging them in the equation, one eventually gets Cocco2002b ()
Here grows slowly with the displacement , and is always larger than 20 m/s in the present context. One recovers that the force is essentially independent of while m/s, and that the quasi-static approach is then fully justified.
Above this threshold, we can already anticipate the discussion of Section 4 by examining how the rotational viscous drag prevents instantaneous equilibration. The force required to unzip the double helix then grows significantly (if then ), while it decreases during rezipping (if then ). This explains the hysteresis observed experimentally. The order of magnitudes are in agreement with experimental values even though this agreement can be refined further by appealing to more elaborate far-from-equilibrium arguments in order to take into account the relatively slow twist propagation inside the rotating dsDNA Cocco2002b (). Writhing might also complicate the issue of twist propagation through Eq. (1).
More recently, the hysteresis has also been investigated on a more fundamental basis by Kapri, with the help of either Monte Carlo simulations on a simplified ladder-like 2D lattice model Kapri2012 (); Kapri2014 (), and by Kumar and collaborators using 3D Brownian Dynamics Kumar2013 (); Mishra2013 (), with a number of base-pairs up to 512. In the case where the system is periodically driven with pulsation (not to be confused with above) and force amplitude , original scaling laws are found. Let us define the area of the hysteresis loop as on a cycle, where is the separation between ssDNAs extremities at a given force , averaged over realizations. This dynamical order parameter measures the intensity of the hysteresis and vanishes in the quasi-static regime . Then a clear scaling law is observed in simulations, where , and Kapri2014 (); Mishra2013 ().
The experimental situation discussed above, where rezipping is too fast to keep the single-strands under tension because of rotational drag, can also be tackled in the framework developed by Cocco, Monasson and Marko Cocco2002b () by setting , i.e. in Eq. (29). The so-obtained differential equation can be integrated exactly. If one chooses , that is to say if one starts from a completely unzipped configuration, then . As an example, the closure times in the -phage DNA case appear to be on the order of 1 s, which eventually sets the lower zipping velocity limit where the assumption is correct, m/s. The effect gets stronger when zipping progresses, as it is also visible in Figure 5b (lower blue and green curves).
Sequence effects which have a clear experimental signature, especially at low velocity Essevaz1997 (); Thomen2002 (), can also be included in the theory. Bockelmann, Essevaz-Roulet, and Heslot Bockelmann1997 (); Bockelmann1998 (), and later Cocco, Monasson and Marko Cocco2002b () have extended the previous ideas to the case where is a function of the base-pair being unzipped/zipped. More sophisticated approaches have also been proposed by Lubensky and Nelson Lubensky2002 (). The results of these calculations are in good agreement with experimental data on the -phage sequence. As in the experiments, the more stable GC-rich parts of the sequence generate a “stick-slip”-like motion Bockelmann1997 (); Bockelmann1998 (), giving a sawtooth-like pattern in the extension-force plots Lubensky2002 (). However using this approach Baldazzi2006 () for the fast sequencing DNA at the base-pair level, even though feasible in principle, does not seem realistic at this stage. Only sequence features appearing at a 10 bp scale have been observed so far Bockelmann2002 (). Theoretical considerations led to converging conclusions Thompson2002 ().
3.4 Unfolding and folding of hairpins: Two-state model approximation
When the nucleic acid length is reduced to a few dozens of base-pairs with a short ssDNA loop at one of its ends to avoid a full separation in two strands, the DNA construct is usually called an “hairpin”. The same experimental protocol as above can then be applied to pull on the single strands with a given force . For an applied force suitably chosen around 15 pN, the open and closed configurations become roughly equiprobable, as illustrated in Figure 6.
Dynamics of folding/unfolding of DNA and RNA hairpins, with or without external force, can be studied by measuring fluorescence correlation spectroscopy Altan2003 (); Bonnet1998 (); Jung2006 (), absorbance vs. time after laser temperature jumps Ansari2005 (), or using mechanical force spectroscopy Hayashi2012 (); Liphardt2001 (); Hanne2007 (); Neupane2012 (). Very precise equilibrium energy landscapes can be inferred from experiments ( or measures the hairpin extension) Woodside2006 () which indeed display two marked potential wells. The ensuing two-state model has been the object of several experimental and theoretical investigations, together with the associated unfolding and folding dynamics. The central assumption of this two-state approximation is that the short sequence closure and opening occur in a cooperative manner.
Kramers theory is particularly adapted to this situation (see for instance Ref. Hanggi1990 ()). Adopting a quasi-static viewpoint supposed to be valid here Hummer2012 (), it states that in the overdamped regime of interest here, the closure time is given by
where and are the spring constants associated to the free-energy in the metastable basin (located on the right of the black curve in Fig. 7) and at the transition state ( is the friction coefficient of the hairpin). For the reverse case, i.e. the hairpin opening, Eq. (32) also gives the opening time once is replaced by and is replaced by the spring constant in the equilibrium well999In this section, is the total “reaction” free-energy, i.e. for the whole hairpin, not for a single base-pair..
To our knowledge, the first study of the hairpin kinetics at zero external force is due to Bonnet et al. Bonnet1998 () (see also the review Gurunathan2008 ()). The measured opening (or unfolding) times were around s for a duplex stem of bp at room temperature. The closure (or folding) times were shown to depend strongly on the hairpin loop size (in bp units; see Figure 6c) as . Indeed as increases, the loop becomes more and more flexible, thus decreasing the probability that the two strands encounter back. According to the sequence, the mechanism might also be a three-state one with an intermediate state where only the distal CG base-pairs are paired Jung2006 (). The relaxation time between the open and intermediate states was also measured to be around s in this case.
Ansari and Kuznetsov Ansari2005 () focused on the dependence of the hairpin dynamics upon changes in solvent viscosity. They found an Arrhenius behaviour following Eq. (32) for the opening () and closure times () which scale almost linearly with the solvent viscosity. These long relaxation times, , are measured between s (at water viscosity) to 1 ms (with 70% of glycerol). The Arrhenius behaviour is supported by Monte-Carlo kinetic simulations SalesPardo2005 () with an activation barrier for an hairpin stem of bp. Brownian dynamics simulations also show the Arrhenius behavior for short hairpins Kenward2009 ().
Starting with the pioneering work by Liphardt et al. in 2001 Liphardt2001 (), the hairpin folding/unfolding dynamics was later explored by applying a mechanical force of several pN that helps the unfolding. Within the two-state model, the unfolding dwell times depend on the applied force, :
where the opening time at zero force is related to free-energy barrier following Eq. (32), and measures the characteristic distance between the open and the closed states (see Figure 7 and Ref. Woodside2006 ()). The major interest in force spectroscopy is to dramatically increase the dwell times in the open state and thus to get enhanced sampling of the energetic landscape Woodside2006 ().
Liphardt et al. made force-extension measurements for three different hairpin constructs by laser tweezers Liphardt2001 (). Using the two-state model they measured a barrier of for an hairpin of bp with nm. The barrier was related to the experimentally measurable work as the area under the rip observed in the force-extension curves Manosas2005 () (see Figure 6). Their experiment was carefully studied in Ref. Manosas2006 () (see also Ref. Cocco2003 ()) by applying Kramers theory and taking into account the entropy associated with the single strands. The one-dimensional experimental effective barrier was reconstructed as a function of the number of open base-pairs for various force values. The extrapolation at zero force yields s, and a barrier linear in , showing the collective character of the opening, with a value for the full opening (). These different experiments indicate a barrier height roughly proportional to the number of base-pairs in the duplex stem, to 3.
Other experiments were done on different DNA hairpins with a duplex length of bp Hanne2007 (). The zero force opening time was measured to be around 7 s, much larger than the value presented above. Moreover they found a much smaller value of nm as defined in Eq. (33). These huge differences were assumed to be due to the different experimental conditions (H and salt concentration, sequence) Hanne2007 (). As experimentally shown by Bonnet et al. the loop size also has a big influence on the folding rates Bonnet1998 (); Woodside2006b ().
Finally, a step further was done by Neupane and co-workers Neupane2012 (), who were able to measure experimentally the so-called transition time between the two states of hairpins with to 30 bp in the duplex. This transition time is generally shorter than the opening and closure times, because it does not include the contribution of the waiting time spent in the potential wells. It was found to vary linearly with the duplex length, s for several unfolding rates. They were measured by reconstructing the force-dependent one-dimensional energy landscape from the observed distributions, thus again assuming the validity of the quasi-static approximation. An inherent complication arises from the presence of the optical trap and DNA handles, which have been corrected. The transition time is given by the formula Hummer2004 (); Roland2015 ()
where is the Euler constant and is defined in Eq. (32). Moreover they experimentally determined the diffusion coefficient m/s.
To explain the two-state shape of the free-energy landscapes, they were shown to be modeled with a good accuracy by a simple zipper-like model taking into account both the sequence-dependent (from nearest-neighbor free energy parameters) and the elastic energy stored in the stretched and curved polymers Woodside2006 (); Cocco2003 (); Woodside2006b (). The release of the last base-pair reduces the free-energy by increasing the entropy of the ssDNA segment of length forming the initial loop. This is the origin of the potential well in the fully open hairpin state (Figure 7).
However some experimental works questioned the two-state approximation for modeling the hairpin kinetics at zero force Jung2006 (); Ma2007 (); Jung2008 () because of the existence of intermediate states (possibly mis-folded) between the fully zipped and the fully open ones (see also Ouldridge2013 ()). By finely analyzing FCS experimental data on an hairpin of size bp to investigate its terminal fluctuations, Chen and coworkers established that the zipper model was equally unable to account for the short-time ( ms) dynamics and to consistently fit the experimental FCS Chen2008 (). They had to appeal to stretched exponentials , (instead of simple ones, , in the original zipper model, as in ordinary chemical reaction kinetics) to model elementary processes, and the so-obtained fits were then very satisfying. The ensuing average zipping rates were found to be on the order of a fraction to a few tens of bp/s (with significant error bars), in correct agreement with earlier estimates based on more rudimentary techniques Craig1971 (); Porschke1974 (). The occurrence of stretched exponentials was attributed to “static disorder”, without proposing at this stage any microscopic mechanism that could be responsible for it.
Altan-Bonnet, Libchaber, and Krichevsky have also done FCS experiments at zero force Altan2003 () on DNA hairpins to follow the opening/closure dynamics of a AT sequence of 18 bp clamped by GC-rich regions at its extremities (see Figure 6c). In contrast to the previous hairpin constructs, the internal AT bubble can open by thermal activation and then close again without the molecule extremities opening. In these experiments, the quencher and fluorophore were located in the middle of the AT sequence and not at the hairpin extremity, thus measuring the lifetime of these partially open state. They showed that bubble closure times follow an Arrhenius law, which indicates the presence of an energy barrier impeding closure of (for 18 AT bp) and the relaxation time , dominated by the closure time, was to s. One of the first idea to model this hairpin kinetics was naturally to use the zipper model Altan2003 (); Bicout2004 (). It was essentially used to fit the auto-correlation function with the bubble closure and opening times as fitting parameters. For instance, in an analytical work Bicout and Kats Bicout2004 () deduced a bubble lifetime . However was an adjustable parameter only (fitted using as in Altan2003 ()). Other works assumed a quadratic confining potential (modeling the GC base-pairs) to fit the auto-correlation function Chatterjee2007 (); Chakrabarti2011 (), but again without estimating the relaxation time. We shall see below in Section 4.7 that the two-state approximation is not fully adapted to obtain the bubble closure lifetime. The opening of smaller bubbles limited to the AT-core has to be considered as a specific third state. In this case, we shall see that the chain dynamics is then coupled to the base-pair kinetics in a very specific way, which explains why neither the zipper model nor the two-state approximation are fully adapted. It is worth mentioning that the two-state approximation for hairpins already couples base-pair and chain orientational degrees of freedom at play in the open-state free-energy well.
3.5 Bubble dynamics in torsionally constrained DNA
It has been known for a while that a torsionally constrained DNA, either by applying an external torque on it or by closure of the molecule into a superhelical ring (such as in plasmids) or by the formation of loops through regular attachments to a network of proteins in the nucleosome, can be locally denaturated through the nucleation of one or several denaturation bubbles under certain conditions Adamcik2012 (); FrankKamenetskii1997 (); Dean1971 (); Benham1996 (); Strick1998 (); Takahashi2015 (). Still with the goal of deciphering basic biological mechanisms on physical grounds, the statistical physics of this phenomenon in equilibrium has been extensively studied by Benham and others Fye1999 (); Benham1979 (); Benham1980 (); Benham1990 (); Jeon2010 (), including sequence effects and their connection with biologically relevant sites Jost2011 ().
Less is known about the underlying dynamics of the nucleation of bubbles in superhelical DNAs. Lankas et al. observed kinking (but not bubble nucleation) of duration between 10 and 50 ns on MD simulations of DNA minicircles of 94 bp Lankas2006 (). Note that different force fields led to the different conclusion that a bubble can also be nucleated Mitchell2011 (). Very recently, Oberstrass et al. used a high-resolution approach for measuring the torsional response of short DNA sequences (20 to 100 bp) Oberstrass2013 (). In particular, they observe the reversible hopping between two states of AAT-repeats, related to DNA breathing. The opening and closed mean dwell times (between 1 and 10 s for 5 to 8 negative rotations) were measured as a function of the applied negative torque, an increase in negative twist favoring the open state.
Hwa et al. Hwa2003 () studied the dynamics of twist induced denaturation bubbles using the Poland-Scheraga model in the single bubble approximation. According to the applied negative torque value , they focus on the transition between a delocalized small breather which freely diffuses along the chain for small torques, , and a localized bubble due to the sequence heterogeneity for larger torques, (for the DNA is fully denaturated). The critical torque, , was estimated to be around 8 pN.nm . An analogy can be drawn between glassy dynamics and the bubble dynamics in the localized state. In particular the escape time is shown to be sub-diffusive with ( in the delocalized state), with an average bubble length which grows logarithmically with time, and an aging phenomenon is observed in the MD simulations.
Benham and co-workers interested in the DNA transcription dynamics, studied the dynamics of DNA minicircles Mielke2005 () (of length ) with a dynamically imposed superhelicity, to model the torque generated by the RNA polymerase. The system being out-of-equilibrium, they used the Brownian dynamics simulations developed in Ref. Mielke2004 (). They focused on the sequence-dependent location, essentially in the AT sequences, in agreement with the equilibrium Benham model and found that initial duplex opening occurs around 100 s after the imposed torque (for a imposed angular velocity of about s, i.e. at least 2 orders of magnitude greater than that of transcription, to speed up the simulation). Jost and collaborators also explored the connection between sequence and the probability of bubble opening, while insisting on the difference between metastable unwinding of a large bubble and small scale breathing Jost2011 (), as already mentioned in the Introduction.
A full comprehension of the bubble dynamics under applied torque is still lacking. One major difficulty arises from the Fuller-White theorem, Eq. (1). Indeed, increasing the torque or the superhelicity on a plasmid for instance imposes a 3D writhe (supercoiling) to the circle even before a bubble nucleation occurs. Supercoiling being a non-local effect, it is challenging to include it in an effective Hamiltonian. Hence, a complete theoretical model should consider the total bending and torsional energies of the whole plasmid, which is a formidable task.
To conclude this section, we have seen that pre-averaged models are able to describe correctly and in a simplified way number of experimentally relevant situations (force-induced zipping/unzipping at low velocity, folding/unfolding of hairpins) but that their use is more questionable in other circumstances because chain orientational degrees of freedom have relaxation times on the same order of magnitude as or longer than base-pairing. The next section is devoted to the survey of these far-from-equilibrium phenomena where base-pairing and chain orientational degrees of freedom must be tackled on an equal footing.
4 Interplay between base-pairing and chain dynamics
In this section, we explore the coupling between base-pairing and chain orientational degrees of freedom, with a special emphasis on far-from-equlibrium chain dynamics, as motivated at several places in the previous sections. We explore the different regimes of interest, below, close to, and above the melting temperature with an emphasis on the scaling laws in the limit of long molecules. We are especially interested in rehybridization below . It is the result of several distinct stages: a nucleation step where diffusing complementary single strands encounter, followed by a zipping step and, in some circumstances, by a last metastable state before complete closure can occur. We first review some experimental results that motivate the theoretical investigations reviewed in this section.
4.1 Experimental motivation
One of the best candidates for far-from-equilibrium dynamics in the context of DNA base-pairing are related to single strands rehybridization at , as illustrated in Figure 1c, of pivotal importance, e.g., in PCR. Processing zipping velocities can conveniently be measured on short molecules of few base-pairs. Early measurements estimated them to be around 1 to 10 bp/s Sikorav2009 (); Porschke1974 (); Craig1971 () at physiological salt concentration and room temperature, by use of a temperature-jump apparatus and dynamical measurements of UV absorbance. The authors of Ref. Chen2008 () were led to similar conclusions using FCS experiments on hairpins. All-atom simulations predicted somewhat slower velocities of about 0.2 bp/s on very short 6-bp constructs, in similar conditions Qi2011 (). However, for long molecules, it has been understood that closure times grew faster than the number of base-pairs . For example, as already indicated in Section 3.3, it can be concluded from Thomen et al.’s more recent experimental data on single molecules Thomen2002 () that the zipping velocity is bp/s for a -phage DNA of 48.5 kbp at comparable temperature. Therefore it is tempting to assume that zipping times follow a simple scaling law