# Short DNA persistence length in a mesoscopic helical model

## Abstract

The flexibility of short DNA chains is investigated via computation of the average correlation function between dimers which defines the persistence length. Path integration techniques have been applied to confine the phase space available to base pair fluctuations and derive the partition function. The apparent persistence lengths of a set of short chains have been computed as a function of the twist conformation both in the over-twisted and the untwisted regimes, whereby the equilibrium twist is selected by free energy minimization. The obtained values are significantly lower than those generally attributed to kilo-base long DNA. This points to an intrinsic helix flexibility at short length scales, arising from large fluctuational effects and local bending, in line with recent experimental indications. The interplay between helical untwisting and persistence length has been discussed for a heterogeneous fragment by weighing the effects of the sequence specificities through the non-linear stacking potential.

###### pacs:

87.14.gk, 87.15.A-, 05.10.-aThe DNA double helical structure is stable enough to preserve genetic information encoded in the Watson-Crick paired bases and also loose enough to allow for those transient base pair (bp) openings zocchi03 () which make the code accessible to enzymes during the processes of replication, transcription and repair. At physiological temperatures DNA molecules fluctuate between a variety of random coil conformations in which even distant segments along the helical axis can be brought in close proximity biton18 (). This points to an inherent flexibility of the DNA chain which has been widely probed over the last twenty five years busta92 (). While these experiments demonstrate that stretch and twist elasticity are intertwined bohr11 (), they also call for models in which bp fluctuations and stacking interactions are considered as dependent on the specific twist conformation of the molecule. Modeling of the helix and its conformational states can be carried out at different levels of resolution ranging from all-atom simulations to continuous worm-like chain (WLC) models which simply treat DNA as a homogeneous and inextensible rod oroz16 (), not accounting for the interplay between twist and bp fluctuations. This may explain the shortcomings of the WLC model emerged in the analysis of the cyclization properties vafa (); io16b () at those short length scales in which the details of the bp interactions matter. In this regard, mechanical models such as the Dauxois-Peyrard-Bishop (DPB) model pey93b () provide a convenient description of the dsDNA in which the complementary strands are represented by two parallel chains of beads coupled via a intra-chain anharmonic potential. However, the standard mesoscopic modeling has the general drawback that the twist and bending conformational degrees of freedom are frozen (or absent) when the base pairs (bps) vibrate. To overcome this limitation we have proposed a model which includes the angular variables in the intra-chain stacking interactions and developed a method to determine the equilibrium twist conformation of short homogeneous oligomers. The efficacy of the method has been recently tested by evaluating the DNA elastic response in the presence of a stretching perturbation io18 () while previous studies had examined the helix unwinding and formation of denaturation bubbles in circular DNA as a function both of temperature and of the circle size io13 (); io14a ().

Here we focus on a key indicator of the polymer flexibility, namely its persistence length (), which measures the orientational correlation between distant segments of the chain and, for a discrete model, it can be calculated as a sum over the average scalar products of the bond vectors associated to those segments soder97 (). While this microscopic approach proves useful to deal with the end effects associated to short oligomers, such correlation distance depends on the chain length and contains electrostatic contributions arising from the fact that distant monomers along the molecule stack may be brought close to each other because of bending fluctuations. Then, our microscopic correlation distance provides a measure of the apparent conceptually distinct from the intrinsic which instead defines a local property of the polymer, independent of the chain length. On the other hand, the apparent is also the one which can be compared with the experiments, as it is generally extracted from measurements of global properties of the helical molecule e.g., the end-to-end distance obtained by fluorescence resonance energy transfer (FRET) archer08 (). It is also noticed that the methods used to extract ’s data rely on global equations which, strictly speaking, have been derived in the framework of WLC continuous models for kilo-base long polymers and whose application to short length scales may be questionable maiti15 (). For these reasons, we pursue here a research line alternative to the WLC approach and present the theoretical background to calculate the ’s at short length scales on the base of a mesoscopic Hamiltonian containing the forces which stabilize the molecule.

## I Model

We consider a model for the helix with bps as depicted in Fig. 1 in which the stretching vibrations between the mates of the bp are defined by, , where are the positions of the complementary mates, respectively given by: and . represent the fluctuations of the two bases in the pair and is the inter-strands separation in the absence of radial fluctuations i.e., the bare helix diameter. Suppressing the radial fluctuations, all , the red dots in Fig. 1 would map onto the the ’s lying along the molecule mid-axis at a constant rise distance . The latter is the bond length in the freely jointed chain model. Hereafter, the bare helix parameters are set to the average values usually assumed for kilo-base long DNA, i.e., Å and Å.

As shown in the right panel of Fig. 1, adjacent bps along the chain are twisted by and bent by . Thus, the sequence specific distance between and , which can be straightforwardly obtained by geometrical arguments, depends both on the radial and angular fluctuations. Hence, the computation of average distances and average scalar products involves integrations over the whole ensemble of radial and angular fluctuations consistent with the model potential.

The helical molecule is described by a potential given by the sum of i) a one particle term which models the hydrogen bonds between complementary pair mates and ii) a two particles term which models the stacking forces essentially due to the overlap of electrons on adjacent bases along the stack. It is the stacking which depends on the twisting and bending variables. Analytically the potential reads:

(1) |

The one particle term is the sum of a Morse potential and a solvent potential . represents an effective pair interaction energy between nucleotides measured from the zero level which corresponds to the absence of radial fluctuations, i.e., . While may also become smaller than , it is important to emphasize that the range of the radial fluctuations is set by the potential parameters i.e., the depth and width . In fact, the code includes those bp vibrations such that which amounts to exclude those displacements such that . The latter are discouraged by the electrostatic repulsion and would yield a negligible contribution to the partition function. On the other hand, for , becomes flat yielding a vanishing force between the bp mates as expected when the bases get far apart and pair dissociation sets in. While the Morse potential is generally suitable to describe the equilibrium properties of DNA, it has to be corrected to account for the DNA dynamics. In fact, when a base flips out of the stack, there is an entropic gain associated to the new available degrees of freedom and the open bases may also form hydrogen bonds with the solvent. Thus, in order to re-close, the base has to overcome an entropic barrier, not described by . The latter is introduced in our model by the term which enhances by the factor the energy threshold for bp breaking above the Morse dissociation energy and provides the barrier, whose width is determined by , which accounts for those effects of strand recombination occurring in solution.

The two particles term contains the elastic constant and the non-linear parameters (discussed below). is an extension of the non-linear stacking potential first introduced in the DPB ladder model to account for the first order-like denaturation transition associated to the bp opening and strand separation. In ladder models however, whenever two adjacent bases slide far away from each other so that the overlap between their electrons is lost, the stacking energy becomes infinitely large. Instead, our potential realistically accounts for the finiteness of the stacking interaction due to the stiffness of the sugar-phosphate backbone. This is done through the angular variables between adjacent bases, in particular through the twist angle which yields a restoring force in the stacking also in the presence of large amplitude fluctuations io12 ().

## Ii Method

The model potential in Eq. (1) is treated by the finite temperature path integral techniques widely described in some previous works io11 (); io14b (). The idea underlying our method is that of mapping the bp displacements onto the time axis, , so that the separation between the pair mates is a time dependent trajectory varying in the range whose upper limit is , being the temperature and the Boltzmann constant. As the partition function is an integral over closed trajectories, running along the -axis, the bp trajectories can be expanded in Fourier series whose coefficients generate the large ensemble of radial bp fluctuations contributing with their statistical weight to the averages of the helical parameters.

Technically, integrating over the Fourier coefficients, an increasing number of trajectories is added to the partition function until the latter converges. As the model also depends on bending and twisting degrees of freedom, the numerical convergence must be obtained also integrating over the angular variables with their specific cutoffs. This ultimately corresponds to the state of thermodynamic equilibrium which is achieved by summing over about configurations for each dimer in the chain. The method has essentially the following advantages: i) it introduces the dependence in the formalism; ii) it sets the cutoffs on the bp amplitudes by defining an integration measure which normalizes the kinetic action (see below). Accordingly the phase space available to the ’s is consistently confined without operating arbitrary truncations in order to remove the divergence of the partition function as found in Hamiltonian investigations of DNA thermal denaturation zhang97 (); munoz10 (); iii) it directly relates the macroscopic helix parameters, e.g. average diameter and rise distance, to the fluctuational effects treated at the level of the base pair. Then, the partition function for the chain of bps of reduced mass , is:

(2) |

and the free energy of the system is: . Importantly, the integration measure over the bp Fourier coefficients has to normalize the kinetic action. This condition sets the free energy zero and holds for any . Hence, the free energy is independent of as expected for a classical system. Furthermore, the normalization condition yields the cutoff on the radial fluctuations as explicitly shown in ref.io11a (). While the cutoff on the maximum fluctuation amplitudes is the same for all bps, note that, for a given molecule configuration, one may have a value which significantly differs from a neighbor . This is in general the case and this accounts for the fact that a base can flip out of the stack thus causing local helical unwinding.

The cutoffs over the bending and twist angle integrations are set to and , respectively. Precisely, each bending angle between adjacent bps (see Fig. 1) is computed with respect to the average value for the preceding angle along the stack, i.e. and the integration in Eq. (2) is performed over the bending fluctuations taken in the range . This allows for the formation of kinks which locally bend the molecule axis, reduce the bending energy and increase the molecule flexibility.

Likewise, we use a recursive procedure which defines the twist angle of the bp with respect to the average computed for the preceding bp along the axis. For the twist variable, one has to consider the increment associated to the molecule helical conformation with , the number of bps per helix turn, being a tunable parameter of our computational method. Accordingly, the twist angle is written as, and the twist fluctuation variable in Eq. (2) is integrated in the range . The latter integration has to be performed for any value of the helical repeat taken in a physically meaningful range, . In fact, given the absence of measurements for short chains, our method admits that short DNA may have an helical repeat which differs from the experimental value, , usually considered for kilo-base B-DNA under physiological condition wang79 (). Accordingly we have taken a set of input values for centered around , i.e., with being the partition mesh. Further, by integrating over the ensemble defined by Eqs. (2), we have derived a set of average twist conformations expressed by:

(3) |

where is the average twist for the last bp in the chain. The ’s denote the possible twist conformations for the short chain and, by minimizing the free energy over the computed ’s, the program derives the equilibrium average helical repeat (). This can be done under specific conditions defined e.g. by the presence of external forces, temperature and salt concentration in solution cher11 (). It is understood that the ’s also contain the effects of the radial and bending fluctuations according to the integration recipe given in Eq. (2). Certainly the accuracy of our calculation grows with the density of -values taken in the -range around . The following calculations are carried out in the window with a partition step .

## Iii Results

First we apply the method to a set of homogeneous short sequences to highlight the effects of the chain length on the equilibrium properties. The potential parameters are taken in accordance with our previous works and are consistent with available information regarding thermodynamic and elastic DNA properties i.e., , , , , , , . In particular these values permit to reproduce the range of experimental free energies, obtained by averaging the contributions of all dimers, in duplex DNA santa (); io16b ().

The trend of the results displayed for the homogeneous chains would not change by assuming different sets of parameter values also used in DNA models weber09 (); singh11 (); albu14 (). Fig. 2 shows the free energy per bp as a function of the ’s for three chain lengths. The triangles mark the free energy minima yielding the equilibrium ’s for the respective chains. A significant shift towards higher ’s is found as one proceeds to consider longer chains. However, the ’s increment as a function of is not linear suggesting that the equilibrium helical repeat should saturate at , for chains in the range of hundreds bps (albeit not studied here). While the ’s are the most probable twist conformations selected by free energy minimization, our plots indicate that, because of thermal fluctuations, the chains may assume also ’s conformations which are close to on the energy scale io15 ().

### iii.1 Helical parameters

The statistical averages in Eq. (2) are performed to calculate the average distance between neighbor bps along the stack, and the average bp fluctuation, , with respect to the bare helix diameter.

Both quantities may depend on the twist conformation associated to a specific value. Then, we calculate and for the equilibrium twist and further monitor their changes both in the untwisted ( larger than ) and in the over-twisted ( smaller than ) conformations. The results are shown in Fig. 3. First, we notice that there is an appreciable dependence of the helical parameters on the twist conformation, with and being non-monotonous versus . For very short chains, the minimum elongation and the maximum helix diameter (dashed lines) may not coincide with the equilibrium twist conformations (triangles) determined in Fig. 2, an effect attributed to the large fluctuational effects generally more pronounced in shorter chains. However, as the molecule length grows (), the gap between the location of dashed line and triangle narrows thus signaling that the conformation emerges as the one in which the average helix elongation is minimum and the average diameter is maximum. This condition should correspond to the most stable helix conformation driven by a balance between strong bonds along the molecule stack and breathing fluctuations which confer flexibility to the whole chain.

### iii.2 Persistence Length

The orientational correlation between distant portions of the helical molecule may be lost due to disordering thermal fluctuations or to interactions with the surrounding solvent. Such effects vary both with the sequence specificities and type of solvent maiti18 () thus concurring to determine the flexibility properties as measured by the save12 (). For semi-flexible polymers the latter is defined as the characteristic decay length of the correlation between two bond vectors, and , that is:

.

For long molecules, the finite size effects are negligible as, for most sites, the average correlation between and sites tends to vanish before approaching the chain end. Hence, the flexibility of most of the chain is well described by the exponentially decaying correlation function. The same concept of exponential decay (for the directional cosine of the bending angle between distant vectors) defines in the continuous WLC model gole12 () which assumes an infinitely large number of bond lengths while the contour length is kept fixed. Although the WLC model is designed for long polymers, whose is expected to be lower than the contour length, the WLC relations have been applied to extract the apparent also in short chains, from analysis of the end-to-end distance and measurements of the DNA size, e.g. via the radius of gyration or the end-to-end distance archer08 ().

Thus, molecular dynamics computations of the bond vectors correlation and Monte Carlo calculations of the WLC mean square end-to-end distance tan15 () deliver values smoothly increasing from Å to Å for chain lengths growing in the range whereas FRET data for a set of sequences with yield Å through the analysis of the end-to-end distribution function archer08 (). Further, for a chain with beads treated by the WLC model with stretching flexibility tan17 (), the value Å has been fitted to the end-to-end distribution function of a coiled configuration, whose most probable radial distance is less than one half of its mean contour length , i.e. .

While all these cited ’s are generally smaller than the standard value (Å) of kilo-base long DNA, the broad scattering of available estimates may suggest that the WLC picture itself breaks down when it comes to analyze the flexibility of short DNA chains.

For these reasons, we adopt here a microscopic definition of the apparent as an average over all the local site dependent which, in turn, are obtained by averaging the bond vectors correlations over the ensemble in Eq. (2). Then, in our method, reads:

(4) |

Computing Eq. (4), we obtain the results shown in Fig. 4 for a set of homogeneous fragments with and with the same potential parameters as in Fig. 2. For each chain, the twist conformation is that selected by free energy minimization. It is found that: i) grows linearly with the chain length, ii) the obtained values are rather small pointing to a sizeable chain flexibility driven by entropic effects. This conclusion is corroborated by the most probable end-to-end distances () calculated for the same set and displayed in Fig. 4 (right -axis) io18a (). In fact markedly deviates from linearity as the chain length grows. Further, for each chain, is lower than one half of the mean contour length () indicating that coiled configurations have a large statistical weight also at short length scales.

While our estimates of the ’s in short chains appear consistent with some values reported by other studies, e.g. tan17 (), it should be noticed that the latter works are based on a WLC model (without twist) in which appears as an adjustable parameter whereas, in our method, is directly obtained via computation of the correlation function built in the framework of a realistic Hamiltonian model for the twisted molecule.

### iii.3 Heterogeneity

Clearly the ’s may depend on the chosen potential parameters and, in general, such values varies with the sequence specificities. To highlight this effect we consider the bps heterogeneous sequence with and bps:

(5) |

Only the purine bases along one strand are given in Eq. (5) as our model neglects differences in the stacking contributions arising e.g., from the dimers AA/TT, AT/TA and TA/AT with the slash separating strands having opposite orientation (analogously for the GC bps).

On the other hand, the specific contributions due to the dimers AA/TT, AG/TC and GG/CC are distinguished in the stacking potential through the non-linear parameters and (with the commas separating adjacent bps along the stack) which control amplitude and range of the bp fluctuation.

In fact, looking at the potential defined in Eq. (1), as long as the condition holds for all bps, the molecule is stable and the effective stacking coupling is . Whenever, because of a fluctuation, the hydrogen bond of a specific bp is disrupted then the base moves out of the stack, the local coupling drops to and also the adjacent base tends to loose its bond with the complementary mate thus extending cooperatively the fluctuational bubble along the helix. This event yields an energetic gain which is proportional to . Accordingly we attribute a larger anharmonic weight to AA/TT dimers (which are more prone to bending deformations than GG/CC dimers) taking larger values for the parameters. Conversely we assume as the inverse lengths measure the amplitude of the bp fluctuation required to soften the stacking energy coupling.

With these premises, we calculate by Eq. (4) the for the sequence in Eq. (5) against the twist conformation as displayed in Fig. 5. The homogeneous case (potential parameters are those of the previous Figures) is plotted for comparison while the curve labeled by Het (a) is obtained by varying (with respect to the homogeneous chain) only the Morse parameters for the AT bps. This causes a minor effect on , however ascribable to the fact that the averages in Eq. (4) are carried out over the ensemble in Eq. (2), weighed by the Boltzmann factor which also contains the Morse potential.

Instead, the bending flexibility is controlled by the stacking potential depending on the angular degrees of freedom. Once the chain heterogeneity is weighed in the model via the parameters and , we obtain a more pronounced effect on as shown by the curves Het (b),(c),(d) computed respectively for the input values in Table 1. Such values are varied arbitrarily as there are no available data on short sequences to constrain the anharmonic parameters. Interestingly however, the equilibrium helical repeat progressively shifts upwards (as marked by the arrows in the plot) by increasing the weight of the non-linear parameters while decreases. Thus the helical untwisting should be meant as an indicator of an increased molecule flexibility at the short length scales considered in this work.

(b) | 3 | 1 | 2 | 2 | 1 | 3 |
---|---|---|---|---|---|---|

(c) | 5 | 1 | 4 | 2 | 3 | 3 |

(d) | 10 | 1 | 8 | 2 | 6 | 3 |

## Iv Conclusions

I have focused on the persistence length of short fragments proposing a computational method which essentially treats the bp fluctuations as temperature dependent paths. Large amplitude fluctuations and pair breaking effects are included in the calculation which accounts for the formation of bubbles and flexible hinges along the chain. Furthermore, the Hamiltonian contains both the twist and the bending angle between adjacent bps along the stack hence, the path integration is carried out over all radial and angular fluctuations which concur to shape the helical molecule in solution. This feature is a significant advancement with respect to previous coarse-grained investigations which either neglect bp fluctuations or take the double stranded helix as a ladder. The computational method yields thermodynamic quantities and helical parameters as a function of the specific twist conformation. Accordingly, I have calculated the correlation function which defines the microscopic for a set of short homogeneous fragments and obtained values which are markedly smaller than the standard for kilo-base long DNA. This suggests that, at short length scales, DNA maintains a remarkable flexibility also witnessed by its end-to-end distance which appears smaller than its average contour length. These results are in line with a body of experiments supporting the view that the intrinsic flexibility of short chains may be related to local bending of the bonds between adjacent bps. At the current stage however, there are not sufficient experimental information on the of short DNA to compare with the calculated values. I have also applied the method to a heterogeneous oligomer by tuning the potential parameters which weigh the anharmonic effects in the stacking potential. The persistence length, obtained as a function of the twist conformation, shows that an increased molecule flexibility is related to an appreciable untwisting of the helix. I finally observe that the path integral technique is feasible for analysis of DNA properties in crowded environments as the stability of the helical configuration may depend on the specific confinement of the phase space available to the bps operated by the cell walls or by crowders which influence the free volume in the system.

### References

- A. Montrichok, G. Gruner, G. Zocchi, Europhys. Lett. 62, 452-458 (2003).
- Y.Y. Biton, J. Chem. Theory Comput. 14, 2063-2075 (2018).
- S. Smith, L. Finzi, C. Bustamante, Science 258, 1122-1126 (1992).
- K. Olsen, J. Bohr, AIP Advances 1, 012108 (2011).
- P.D. Dans, J. Walther, H. Gómez, M. Orozco, Curr. Opin. Struct. Biol. 37, 29-45 (2016).
- R. Vafabakhsh, T. Ha, Science 337, 1097-1101 (2012).
- M. Zoli, J. Chem. Phys. 144, 214104 (2016).
- T. Dauxois, M. Peyrard, A.R. Bishop, Phys. Rev. E 47, R44-47 (1993).
- M. Zoli, Physica A 492, 903-915 (2018).
- M. Zoli, J. Chem. Phys. 138, 205103 (2013).
- M. Zoli, Soft Matter 10, 4304-4311 (2014).
- M. Ullner, B. Jönsson, C. Peterson, O. Sommelius, B. Söderberg, J. Chem. Phys. 107, 1279 (1997).
- C. Yuan, H. Chen, X.W. Lou, L.A. Archer, Phys. Rev. Lett. 100, 018102 (2008).
- A. Garai, S. Saurabh, Y. Lansac, P.K. Maiti, J. Phys. Chem. B, 119, 11146-11156, (2015).
- M. Zoli, J. Phys.: Condens. Matter 24, 195103 (2012).
- M. Zoli, J. Chem. Phys. 135, 115101 (2011).
- M. Zoli, J. Theor. Biol. 354, 95-104 (2014).
- Y.L. Zhang, W.M. Zheng, J.X. Liu, Y.Z. Chen, Phys. Rev. E 56, 7100-7115 (1997).
- J. M. Romero-Enrique, F. de los Santos, M. A. Muñoz, Europhys. Lett. 89, 40011 (2010).
- M. Zoli, Eur. Phys. J. E 34, 68 (2011).
- J.C. Wang, Proc. Natl. Acad. Sci. USA 76, 200-203 (1979).
- A. G. Cherstvy, J. Phys. Chem. B 115, 4286-4294 (2011).
- J. SantaLucia, Proc. Natl. Acad. Sci. USA 95, 1460-1465 (1998).
- G. Weber, N. Haslam, J.W. Essex, C. Neylon J. Phys.: Condens. Matter 21, 034106 (2009).
- S. Srivastava, N. Singh, J. Chem. Phys. 134, 115102 (2011).
- D.X. Macedo, I. Guedes, E.L. Albuquerque, Physica A 404, 234-241 (2014).
- M. Zoli, Europhys. Lett. 110, 18001 (2015).
- A. Garai, D. Ghoshdastidar, S. Senapati, P.K. Maiti J. Chem. Phys. 149, 045104 (2018).
- A. Savelyev, Phys. Chem. Chem. Phys. 14, 2250-2254 (2012).
- A. Noy, R. Golestanian, Phys. Rev. Lett. 109, 228101 (2012).
- Y.Y. Wu, L. Bao, X. Zhang, Z.J. Tan, J. Chem. Phys. 142, 125103 ( 2015).
- X. Zhang, L. Bao, Y.Y. Wu, X.L. Zhu, Z.J. Tan, J. Chem. Phys. 147, 054901 (2017).
- M. Zoli, J. Chem. Phys. 148, 214902 (2018).