On the behavior of random RNA secondary structures near the glass transition

On the behavior of random RNA secondary structures near the glass transition

William D. Baez Department of Physics, The Ohio State University, Columbus, OH 43210    Kay Jörg Wiese CNRS-Laboratoire de Physique Théorique de l’Ecole Normale Supérieure, 24 rue Lhomond, 75005 Paris, France.    Ralf Bundschuh Department of Physics, The Ohio State University, Columbus, OH 43210, USA.
July 27, 2019
Abstract

RNA forms elaborate secondary structures through intramolecular base pairing. These structures perform critical biological functions within each cell. Due to the availability of a polynomial algorithm to calculate the partition function over these structures, they are also a suitable system for the statistical physics of disordered systems. In this model, below the denaturation temperature, random RNA secondary structures exist in one of two phases: a strongly disordered, low-temperature glass phase, and a weakly disordered, high-temperature molten phase. The probability of two bases to pair decays with their distance with an exponent 3/2 in the molten phase, and about 4/3 in the glass phase. Inspired by previous results from a renormalized field theory of the glass transition separating the two phases, we numerically study this transition. We introduce distinct order parameters for each phase, that both vanish at the critical point. We finally explore the driving mechanism behind this transition.

I Introduction

Heteropolymer folding is a fundamental biophysical process all living systems rely on. It is of medical relevance, since, e.g., misfolded proteins and nucleic acids are strongly implicated in the development of several neurological disorders, such as Alzheimer’s, Parkinson’s, and Lou Gehrig’s disease Bowman et al. (2011); Wu et al. (2013); Taylor et al. (2002); Conlon and Manley (2017); Hartl (2017); Ishiguro et al. (2017). While general questions of heteropolymer folding can be addressed both in proteins and in ribonucleic acids (RNA), RNA is the simpler of the two systems as it is composed of only four monomers (as opposed to 20 for proteins), the nucleotides adenine (A), cytosine (C), guanine (G), and uracil (U). These nucleotides have a tendency to form Watson-Crick (A-U, C-G) base pairs. To form these pairs, the RNA strand folds back onto itself, which leads to the creation of RNA secondary structures. From a statistical physics standpoint, heteropolymer folding presents a challenging task for the physics of disordered systems. In particular, RNA secondary structures are one of the few disordered systems for which one can calculate the partition function in polynomial time McCaskill (1990).

Previous studies show that RNA secondary structures exist within one of two well-identified phases. Above a critical temperature , the system is in a phase where sequence disorder does not play a significant role. This simplifying assumption allows one to model any random base sequence or a sequence with random pairing energies, , as a homopolymer. The defining trait for this phase is that the partition function of long RNA molecules is dominated by an exponentially large number of secondary structures with energies differing only at the order of . This phase is denoted the molten phase.

Figure 1: Diagrammatic and abstract representations of an RNA secondary structure. (a) The backbone of the molecule is represented by the solid gray line while the solid black lines stand for the hydrogen-bonded base pairs where each base is depicted as a circle. The shape of the backbone is such that stems of stacked base pairs and the loops connecting or terminating them can be clearly seen. Stems form double-helical structures similar to that of DNA. (b) The same structure as in (a) but with the backbone of the molecule stretched out turning the solid black lines representing each base pair into arches. The base pairs represented by the red dashed lines in (a) and (b) create a pseudoknot visible in (b) as a crossing of the arches. We exclude such pairings in our definition of a secondary structure. Pseudoknots, such as these, do not contribute much to the total energy and are often deemed part of an RNA molecule’s tertiary structure.

Beginning in 1968, de Gennes de Gennes (1968), while studying the folded homopolymer, laid the theoretical foundation for our understanding of the molten phase by showing that the probability of two bases to pair scales as , where is the distance between any two bases labeled i and j=i+ for 1 i j N, and that . In later studies Bundschuh and Hwa (2002a, b) it was demonstrated analytically that sequence disorder, when introduced perturbatively, is irrelevant for small disorder or high temperatures. It was also confirmed numerically that the pairing probability of a four-letter representation of the RNA heteropolymer model matches the scaling behavior of de Gennes’ earlier prediction at high enough temperature Bundschuh and Hwa (2002a, b).

At low temperatures sequence disorder can no longer be ignored. This (or “frozen”) phase is characterized by the existence of a small number of low-energy secondary structures. In the glass phase the scaling exponent of the pairing probabilities was found numerically to be  M. Müller et al. (2002). Properties of the glass phase have been extensively studied M. Müller et al. (2002); Bundschuh and Hwa (2002a, b); Liu and Bundschuh (2004); Lässig and Wiese (2006); David and Wiese (2007, 2009); Hui and Tang (2006); Pagnani et al. (2000); Marinari et al. (2002); Krzakala et al. (2002). While the existence of the glass phase is well established and the glass phase itself has been characterized extensively, there is yet no clear signature and characterization of the expected phase transition between the molten and the glass phase.

Through the use of a renormalized field theory  Lässig and Wiese (2006), Lässig and Wiese (LW) showed analytically that the freezing transition between the molten and the glass phase is of at least second order. They proved that their model is renormalizable at 1-loop order. They further argued that the value of the exponent at the transition is the same as in the glass phase. The field theory was later refined by David and Wiese David and Wiese (2007, 2009), showing that it is renormalizable to all orders in perturbation theory. At 2-loop order David and Wiese (2007, 2009) it then predicts that in the molten phase and at the transition and in the glass phase.

However, questions concerning the location of this critical point, the existence and behavior of a suitable order parameter, and the transition’s driving mechanism remained open. In the present study, through the use of numerical simulations and analytical tools inspired by the field theory, we study two order parameters which both vanish at the transition. One of them is non-zero in the molten phase, the other in the frozen phase. This allows us to precisely locate the critical point. Measuring pairing probability distributions, we explore the mechanism driving this transition.

This paper is organized as follows: In Sec. II we define the model. Sec. III details our efforts to locate the critical temperature by establishing two order parameters. Sec. IV probes the transition mechanism. In Sec. V we discuss our results.

Ii RNA Secondary-Structure Model

ii.1 RNA Secondary Structures

Each RNA molecule is described by its sequence of bases , its primary structure. Given a molecule, an RNA secondary structure, such as the one shown in Fig. 1, can then be defined by a list of ordered pairs with representing the base pairs formed by the molecule. Following previous studies in the field of RNA secondary structure reviewed in Higgs (2000), we require each base to pair with no more than one other base and we only consider structures that exclude pseudo-knots. The latter is accomplished by requiring that any two base pairs and with satisfy either or (see Fig. 1). Configurations violating this rule are termed pseudoknots. While pseudoknots are biologically relevant Chen and Greider (2005); Staple and Butcher (2005), the no pseudo-knot constraint is necessary to make both analytical and numerical calculations feasible and is justified due to their relatively infrequent occurrence in real folded RNA Tinoco and Bustamante (1999). We consider any pseudo-knots and base triples as parts of the tertiary structure of the molecule, i.e. its 3-dimensional conformation, given the secondary structure.

ii.2 Energy Model

In order to fully describe the statistical physics of RNA secondary structures we have to complement the definition of valid secondary structures by an energy function that assigns an energy to every structure. While very sophisticated energy models are available Lorenz et al. (2011), which allow detailed quantitative descriptions of the folding of actual RNA molecules, in this paper we follow previous studies Higgs (1996); Bundschuh and Hwa (2002a) that focus on the universal aspects of RNA secondary structure formation and adopt a simplified RNA energy model. Specifically, we consider contributions from each base pair, and associate with the pair of bases labeled and an interaction energy, . The total energy of a structure is defined as

(1)

In principle, the interaction energies depend on the identities of the bases and rendering the random variables with discrete values and a complicated correlation structure if the RNA sequences are chosen randomly. However, since we are here interested in universal properties of phase transitions, we further simplify our model Bundschuh and Hwa (2002a) by choosing these interaction energies as independent Gaussian random variables taken from the distribution

(2)

with mean energy zero and variance . Since is the only energy scale in this model, we set in our simulations, but report all results in units scaled by .

This kind of uncorrelated Gaussian-disorder model is expected to have less severe finite-size effects M. Müller et al. (2002) than other choices of disorder models.

ii.3 Partition Function

Once an energy has been assigned to each secondary structure , the partition function is defined as

(3)

where is the set of allowed secondary structures of bases, and = 1/. In the absence of pseudo-knots, the partition function can be studied by considering substrands of the total sequence from base i to base j. The restricted partition function for these substrands obeys the recursive equation McCaskill (1990)

(4)

in which by convention . In this recursion, the right-hand side involves only restricted partition functions for shorter substrands than the one on the left-hand side. Thus, the total partition function, , can be calculated by progressing from the shortest substrands to longer ones in time.

The analogous recursion equation Nussinov and Jacobson (1980)

(5)

holds for the energy of the lowest-energy structure on the substrand of the total sequence from base to base , where by convention. This equation can be used in conjunction with a backtracking mechanism Nussinov and Jacobson (1980) to calculate the (zero-temperature) ground-state secondary structure for any disorder configuration in time.

ii.4 Observables

From the partition function of RNA secondary structures, we can calculate a variety of physical observables that characterize the structural ensemble. Primarily, we study the pairing probability, , for a given base pair . This probability can be obtained as

(6)

where can be calculated from Eq. (4) and is the partition function of the sequence . This last quantity can be obtained when the recursion (4) is applied to a duplicated sequence and calculated as .

We are specifically interested in the dependence of moments of the ensemble averaged base-pairing probability on the distance between the two bases. Within the molten phase, and for large , this quantity has a power-law dependence

(7)

where the brackets stand for the ensemble average over the random base-pairing energies and is a critical exponent. The parameter allows probing different moments of the distribution of the base pairing probabilities . Note that the form (7) has a practical advantage over the asymptotic behavior , as it allows us to use larger in extrapolations.

The logarithms of each pairing probability,

(8)

have been interpreted as the “pinching free energy” Bundschuh and Hwa (2002a), which is the free-energy difference between a pinch of the monomers labeled i and j, and the unperturbed, or unpinched, state. In Sec. III, as in previous studies Bundschuh and Hwa (2002a), we will show how we can use the free energy of the largest possible pinch,

(9)

to obtain an estimate of the critical temperature. Here monomers and are treated as representatives of all splits of a molecule of bases into two equal pieces of length .

ii.5 Review of main results of the Lässig-Wiese field theory

Our numerical analysis of the glass transition is inspired by a field-theoretical calculation by Lässig and Wiese Lässig and Wiese (2006), and its refinements in David and Wiese (2007, 2009). This theory starts from the Gaussian-disorder model defined in section II.2. It introduces replicas (copies) of the RNA molecule and shows that the disorder-averaged partition function of the replicated system can be written in terms of two relevant operators. The first is the contact field that is if the bases labeled and in replica are paired and otherwise. The second is the overlap field between two replicas and defined as , i.e., is if the bases labeled and are paired in both replicas and and otherwise. The scaling dimensions of these operators, i.e., the exponents with which their thermal and disorder-averaged expectation values depend on the distance between the two bases, are called and , respectively.

(a)
 
(b)

Figure 2: Secondary structures of a random RNA molecule at distant times as presented in Lässig and Wiese (2006). The pairing overlap is defined by the common base pairings between the left and right configuration (corresponding bases are shown in black). (a) Above , the molecule contains conserved subfolds on scales up to the correlation length (indicated by shading) and is molten on large scales. (b) Below , the molecule is “locked” into its minimum-energy structure on all scales, up to rare fluctuations (unshaded).

Lässig and Wiese treat the disorder in the model perturbatively, prove renormalizability of the theory at first order (later extended to renormalizability to all orders by David and Wiese David and Wiese (2007, 2009)), calculate the function, find a critical point that they identify with the glass transition, and calculate the critical exponents and (the scaling dimensions of and ) at this transition. Importantly, they find expressions for the two critical exponents as a function of the scaling dimension of the disorder and the number of replicas. In the relevant limit of zero replicas, the expression for the exponent (the scaling dimension of at the transition) becomes smaller than the expression for the exponent (the scaling dimension of at the transition) as the scaling dimension of the disorder approaches its physical value. Since , it is impossible that the probability for two bases to be paired in multiple replicas decays slower than the probability for the same two bases to be paired in a single replica. Thus, Lässig and Wiese conclude that rather than crossing each other, the two exponents must become identical, i.e., that . They interpret this physically as the different replicas “locking” with each other, so that at the transition (and below) there is no difference between looking at the pairing behavior of one or multiple replicas. This results in the picture of small-scale locked regions in a background of an overall molten structure above the transition temperature, and an overall locked structure with small-scale molten regions below the transition temperature as indicated in Fig. 2. The length scale of these small-scale regions diverges as with as the phase transition is approached from either side. The actual values of the exponents from the 2-loop calculation by David and Wiese David and Wiese (2007, 2009) are

(10)

and thus

(11)

They are very close to the numbers of and provided by Lässig and Wiese in their original one-loop calculation Lässig and Wiese (2006).

To connect the results of the Lässig-Wiese theory to the observables described in section II.4, we have to realize that the pairing probability is the thermal average of the contact field . Since is the scaling dimension of the thermal and ensemble averaged contact field , it describes the scaling of the disorder average of the pairing probability , i.e., the quantity described by in Eq. (7). Thus,

(12)

Similarly, the thermal average of the overlap field equals the square of the pairing probability , since the bases labeled and have to independently pair in the two replicas. Since is the scaling dimension of the thermal and ensemble averaged overlap field , it describes the scaling of the disorder average of the square of the pairing probability , i.e., the quantity described by in Eq. (7). Thus,

(13)

ii.6 Numerical Approach

To investigate the RNA secondary-structure glass transition, we use RNA molecules of lengths within the range of . For each length, many independent realizations of the random base-pairing energies are chosen, and the partition function of all allowable secondary structures is calculated for each realization of the random variables using Eq. (4). Then, the observables described above are calculated for each realization of the random variables and averaged over those realizations (as well as over the starting base in case of the base-pairing probability). In the aim to keep the numerical effort manageable we averaged over 20,000 samples for , over 10,000 samples for , over 5,000 samples for , and over 1,000 samples for . We varied the temperature in a range of to capture the behavior deep within each phase and close to the phase transition, which we will find to be located at . Note that the smaller number of samples for large systems is partially compensated by a factor of in the statistics; e.g. the number of starting points for base pairs at a given distance grows as , and we average over these starting points. As a result, our statistics are almost independent of size at the chosen numbers of samples.

Iii Scaling of the contact and overlap observables

To understand the model’s behavior at the transition, we must know its precise location. Previous studies Bundschuh and Hwa (2002a); Pagnani et al. (2000); JankeWieseUnpublished () using different disorder models found that for the system sizes achievable in numerical simulations, thermodynamic signatures of the phase transition are quite weak, making an exact localization of the critical temperature challenging. In this section we will identify two order parameters inspired by the Lässig-Wiese field theory to obtain the precise location of the critical temperature .

iii.1 Pairing probability exponents alone do not specify transition temperature

Figure 3: Average pairing probability versus substrand length. The upper figure shows how the pairing probability scales as Eq. (7) in the glass phase at the lowest temperature considered, . As mentioned in the text, within this phase the critical exponent, , is insensitive to the value of . The lower figure shows how the pairing probability scales as Eq. (7) in the molten phase at the highest temperature considered, , but demonstrates how the value of directly depends on the value of .

In principle, the critical exponents alone should allow us to locate the phase-transition temperature . In the molten phase is independent of disorder and thus , i.e.,

(14)

where we identify exponents in the molten phase with a superscript . In the glass phase and at the transition, the Lässig-Wiese theory predicts (see Eqs. (10), (12), and (13))

(15)

independent of , where we denote exponents at the phase transition with a superscript asterisk and exponents in the glass phase with a superscript .

To verify this, we calculated the disorder-averaged base-pairing probabilities numerically for different temperatures at and fitted them to Eq. (7) to obtain . Indeed, Fig. 3 shows that the expected power laws hold, for values of , both at low temperature () and at high temperature ().

On the other hand, Fig. 4 shows the behavior of and across the transition, which we show later is at . One can see that the apparent values of the exponents change gradually from their value in the glass phase to their value in the molten phase as the system undergoes the phase transition. The steepness of this change only increases slightly as the system size is increased from to . Thus, these exponents cannot be used to precisely locate the critical temperature or to obtain the values of the exponents at the transition.

Figure 4: The exponents and as a function of temperature. As expected, becomes independent of at a value of at very low temperatures and approaches at high temperatures. However, the changes in apparent are so gradual and the steepness increases only so slightly as the system size is quadrupled that this quantity can neither be used to determine the critical temperature nor the exponents at the transition.

iii.2 Coarse Estimate of Phase Transition Temperature from Pinch Free Energies

To obtain a first estimate of the transition temperature, we follow Bundschuh and Hwa (2002a) and consider the disorder-averaged pinch free energy . As seen in the inset of Fig. 5, this quantity has a logarithmic dependence on the sequence length at both low and high temperatures Bundschuh and Hwa (2002a). We thus fit it to the linear form

(16)

for sequence lengths in the range 500 2000. The resulting prefactor is shown in the main part of Fig. 5. As expected de Gennes (1968), in the molten phase resulting in a linear dependence of the prefactor on temperature. However, for lower temperatures, the prefactor is non-monotonic in temperature and thus lends itself as an estimator of the critical temperature. The intersection of linear fits on either side of the minimum yields .

Figure 5: Pinch free energy prefactor versus temperature. The inset shows the disorder-averaged pinch free energy versus in both the glass phase () and the molten phase (). This allows us to extract the prefactors of these logarithmic behaviors, which are shown as a function of temperature in the main graph. The temperature at which the prefactor changes slope is an estimate of the phase transition temperature .

iii.3 Order Parameters for the Transition, and a More Precise Estimation of the Phase-Transition Temperature

Using this value of as a guide, we limited the range over which we performed our simulations to for sequence lengths up to . Since the scaling of the base-pairing probability by itself does not demonstrate a clear indication of when a finite system of length approaches , we defined two additional observables that do allow us to extract when the system approaches from either side. The first is the ratio /, which we would expect to be constant for large in the glass phase due to and revert back to a power law similar to Eq. (7) with an exponent of in the molten phase. Conversely, the second new observable, / we would expect to be constant in the high-temperature regime for large and then decay with the power law of Eq. (7) and an exponent of approximately at low temperatures. Figs. 6 and 7 show these observables for several temperatures, both above, below, and close to the critical temperature at . As indicated with dashed yellow lines, we fit these curves to the forms

(17)
(18)

with offsets and , exponents and , and global prefactors and .

Figure 6: High-temperature pairing ratio versus substrand length. The curves show the behavior of this new observable at several temperatures in the vicinity of and for temperatures deep within the glass phase () and molten phase (). The dashed yellow lines are fits to Eq. (17).

We identify the offsets and as the order parameters of the system. They scale as , where is a characteristic length scale, diverging at the transition as . This implies that and should depend on temperature as

(19a)
(19b)

To determine the two exponents, we vary them over a range of for each system size and fit a linear regression to versus , and versus for a small temperature range immediately below and above , respectively. We then choose for each system size the values of and that maximize the coefficient of determination of the linear fits. Lastly, we treat each system size as an independent realization (upon inspection the dependence of the optimal and on system size does not seem to have a systematic contribution) and determine the average and standard error of the mean of the two exponents. On the left side of the critical point (), we find

(20)

while on its right side ()

(21)
Figure 7: Low-temperature pairing ratio versus substrand length. Shown is the behavior of this new observable at several temperatures in the vicinity of and for temperatures deep within the glass phase () and molten phase (). The dashed yellow lines are fits to Eq. (18).

Fig. 8 shows and with the such determined values for and as functions of temperature. One clearly sees that each quantity approaches zero roughly linearly on its respective side of ( for and for ). Both cross zero at and become dependent on the system size on the opposite side of , where they remain small and appear to tend toward zero with increasing system size . We thus conclude that and are non-zero on their respective side of the phase transition and vanish on the opposite side, rendering them proper order parameters for the transition. We further conclude that the phase transition takes place at .

Figure 8: Appropriate powers of the order parameters and versus temperature. (a) As defined by Eq. (18) approaches from below it drops below zero. (b) Similarly, as defined by Eq. (17) approaches from above it becomes negative. Note that in order to take the appropriate powers of the order parameters, the power is taken of their absolute value and the sign is restored afterwards. On the side of the transition, where the order parameters and are negative, they tend to zero with increasing system size . This allows precise determination of the phase transition temperature .

In order to verify the consistency of the observed critical behavior, we note that the field theoretical predictions given in Eqs. (11) and (14)-(15) read

(22)

and

(23)

These values are in good agreement with our numerically determined values in Eqs. (20) and (21).

Iv Transition Mechanism

With the critical temperature in hand, we aim to understand the mechanism of the phase transition itself. Deep in the molten phase, base-pairing energies are irrelevant and as a consequence any base can pair with any other base. Thus, the thermally averaged base-pairing probability is the same for any pair of bases with the same distance and decays with the distance as the power law introduced in Eq. (7). This probability is independent of the disorder realization. In contrast, at zero temperature the RNA molecule folds into the ground-state structure, which is typically non-degenerate within the Gaussian-disorder model. Thus, for a given realization of the disorder the thermally averaged pairing probability of a given base pair is either zero (if the base pair is not in the ground-state structure) or one (if the base pair is in the ground-state structure). Which base pairs have a zero probability and which have a probability of one changes with the disorder realization. Thus, the scaling behavior (7) at zero temperature arises only after ensemble averaging.

As discussed in Sec. II.5, and illustrated in Fig. 2, the picture of the phase transition proposed in Lässig and Wiese (2006) is that as the phase transition is approached from below, the fraction of base pairs “locked” into a ground-state structure decreases by the appearance of molten regions. At the transition, the situation inverts and the “locked” base pairs become localized regions of decreasing size within an overall molten structure on the high-temperature side of the phase transition. These ideas are reflected in the order parameters introduced in Eqs. (17) and (18). Let us stress that while the expression “locked” used in Ref. Lässig and Wiese (2006) suggests that bases are paired with probability , this condition can be relaxed to mean that they are paired with probability , or even . This suffices to render the expectations (17) and (18) non-trivial. We will understand “locked” in this sense from now on.

iv.1 Base-pairing probability distribution

In order to directly probe the transition mechanism, we numerically calculate the entire distribution of base-pairing probabilities . We note that all moments of the disorder-averaged base-pairing probabilities can be reconstructed from these distributions as

(24)

We obtain these distributions by explicitly calculating the base-pairing probabilities using Eq. (6) and then tabulating their frequencies averaged over all and many realizations of the disorder. Moreover, due to the large range of base-pairing probabilities, instead of taking histograms of the themselves, we sample the distribution , where . The two distributions are connected to each other by and thus moments of the disorder-averaged base-pairing probability can be obtained from as

(25)

To reduce the statistical noise, we further use

(26)

where the range of integration is motivated by the fact that according to Eq. (7), the pairing probability is a function of , and in the chosen range the latter does not vary by more than .

Fig. 9 shows that in both temperature regimes the base-pairing probability distribution has a Gaussian-like shape. Fig. 9(a) is the distribution of at the highest temperature considered, , for a system of size . The distribution is centered around and we interpret it as a broadened version of what would be a -peak around the value of from Eq. (7) at infinite temperature or . Fig. 9(b) shows the same distribution at the lowest temperature numerically accessible to us, . Here we see a broad Gaussian-like distribution again. However, it is by far wider than the high-temperature distribution and is located at , i.e., at much smaller base-pairing probabilities than its high-temperature counterpart. We interpret this Gaussian-like feature as the contributions from all the base pairs that are not part of the ground-state structure and thus would have exactly zero probability at zero temperature, while acquiring a finite, albeit very small (, ), pairing probability at low temperatures. Importantly, as the inset of Fig. 9(b) shows, close to the origin ( or ) there exists another peak. We interpret this prominent peak at as the contributions from those base pairs that are “locked” in the ground-state structure and thus define the glass phase of this model. As the inset in Fig. 9(a) shows, no corresponding peak at occurs deep in the molten phase.

Figure 9: Distributions of the logarithm of base-pairing probabilities for system size averaged over the final third of all ’s. (a) Deep in the molten phase () the distribution of base-pairing probabilities has a distinct Gaussian-like form, which would go to a delta peak as (). The inset shows that no weight of the distribution is present close to (). (b) Deep in the glass phase () the base-pairing distribution still has a Gaussian-like form, albeit broader than the high-temperature distribution. As the inset shows there exists now a peak close to ().

As discussed above, when calculating the ensemble averaged pairing probability the distribution has to be integrated against . Thus, in order to appreciate where the weight contributing to the ensemble-averaged pairing probability is coming from in the two phases, Fig. 10 shows . Deep in the molten phase () multiplying with simply shifts the peak of the Gaussian-like distribution to somewhat smaller (larger ). In contrast, deep in the glass phase () the Gaussian peak at large observed in the distribution alone becomes invisible and only the feature at remains and is unaffected in amplitude by multiplication with . Thus, as suspected, in the glass phase only the small- range contributes significantly to the ensemble-averaged pairing probability.

Figure 10: Distributions of the logarithm of base-pairing probabilities for system size averaged over the final third of all ’s multiplied with for the same ranges as shown in Fig. 9, there without the factor of . (a) Deep in the molten phase () the distribution of base-pairing probabilities has a distinct Gaussian-like form, which would go to a delta peak as (). (b) Deep in the glass phase () the base-pairing distribution consists of only a peak at (). Note the difference in scales between figures 9(a) and 10(a) while the scales in figures 9(b) and 10(b) are identical.
Figure 11: Separation of the glass-phase ground and non-ground-state distributions. (a) The total base-pairing distribution within the glass phase (). (b) The ground-state base-pairing distribution (note the difference in scale). This peak represents the frequency of those pairs “locked” in the ground-state structure and are the contributing weight seen in the inset of Fig. 9(b). (c) The non-ground-state base-pairing distribution. As mentioned in the text, the broad Gaussian-like peak seen in the low-temperature regime is a result of non-ground-state pairs that have a small, non-zero pairing probability at finite temperatures. These probabilities would be zero at .
Figure 12: Separation of the molten-phase ground and non-ground-state distributions. (a) The total base-pairing distribution within the molten phase (). (b) The ground-state base-pairing distribution (note the difference in scale). The lack of weight close to is due to a significantly reduced pairing probability of ground-state base pairs at high temperatures leading to the absence of any feature in the inset of Fig. 9(a). (c) The non-ground-state pairs dominate the total pairing probability distribution.

iv.2 Locked base pairs at finite temperature stem from ground-state base pairs at zero temperature

In order to confirm that the peak at in the low-temperature behavior is indeed generated from the base pairs that form the ground-state structure at , we explicitly separate the distribution into two sub-distributions. We use the approach outlined at the end of Sec. II.3 to calculate the ground-state structure for every disorder configuration and then tabulate the base-pairing probabilities across all for base pairs participating in the ground-state structure separately from those which do not. This allows us to split the entire base-pairing probability into a ground-state contribution and a non-ground-state contribution . Figs. 11 and 12 show these ground-state and non-ground-state distributions at the lowest and highest temperatures considered. In Fig. 11(b) one can see that the ground-state pairs directly contribute to the distinct peak close to , which we interpret as those base pairs that are “locked” in the ground-state structures. This further confirms that it is the non-ground-state pairs that make up the bulk of the total distribution seen in Fig. 9(b), each of which would otherwise be zero at . Fig. 12 then explains the absence of any weight in the inset of Fig. 9(a). The lack of ground-state pair contributions in the region indicates that the system is no longer in the glass phase at this temperature. Since the transition is driven by the behavior of the ground-state base pairs alone, we focus on from now on.

Figure 13: Fraction of bins at with zero counts as a function of temperature for . We split the range (corresponding to ) into equal bins of width . A bin has zero counts when not a single ground-state base pair with in disorder configurations has a thermally averaged base pairing probability in the interval of associated with the bin. The fraction of bins with zero counts dramatically increases from near zero to near one at the phase transition temperature. The horizontal solid line indicates where 10% of the bins have zero counts, the vertical line the location of .

iv.3 Locked base pairs disappear at the phase transition

Our next goal is to quantify how the weight of “locked” ground-state base pairs changes through the transition. Since a ground-state base pair remaining “locked” at a finite temperature does not mean that the pairing probability of this base pair is one, but rather that it is sizable in some way, we introduce a pragmatic way to characterize the disappearance of “locked” base pairs. Specifically, we divide the range (corresponding to ) into an equal number of bins and ask which fraction of these bins receive at least one ground-state base pair with when we sample independent disorder configurations (for ). This quantity is shown in Fig. 13. While its detailed behavior is definitely dependent on the choice of the upper cutoff on , the width of the bins, and the number of samples, it is quite clear from the very strong temperature dependence shown in Fig. 13 that essentially all log-probabilities are present for , while they are absent for . This switch should only moderately depend on our particular choices. It is consistent with the disappearance of locked base pairs at the phase transition.

iv.4 Behavior of the base-pairing probability distribution at

Figure 14: Average ground-state base-pairing probabilities versus for system size . (a) At , and all temperatures below, the region close to is the dominant peak. (b) As the temperature is increased, the height of these peaks all drop to roughly the same height as the neighboring peaks. (c) Above , the probability density at is actually lower than at moderate .

As we have seen in Fig. 11, the base-pairing probability distribution is highly peaked at deep in the glass phase, consistent with a picture in which a base pair that is part of the ground-state structure at zero temperature still has probability essentially at . Since we have seen in Sec. IV.3 that the entire base-pairing probability at () disappears at the transition to the molten phase, it is of interest to look at the shape of the distribution as the transition is approached. Fig. 14 shows these distributions at various temperatures below, but close to, the glass-transition temperature. The most striking feature is that they switch from a behavior where the bulk of the distribution is at () at low temperatures to a behavior where base-pairing probabilities with small are still prevalent but probabilities at () itself are suppressed. An intermediate case of a distribution that is essentially independent of over the entire range appears at .

Figure 15: Double logarithmic plot of the average ground base-pairing distributions versus for system size . At temperatures below the glass transition, the ground-state pairing distributions at moderate pairing probabilities are approximate power laws (black dashed fit lines, bins with zero counts ignored in the fits) whose exponents turn from negative to positive as the temperature increases.
Figure 16: versus temperature. When plotted against temperature, the slopes of the fit lines from Fig. 15 cross zero at a point between and , which could be interpreted as the secondary freezing event mentioned in Lässig and Wiese (2006).

When plotting the same data on a double logarithmic scale (Fig. 15) we find that over the range of the ground-state base-pairing distribution is well described by a power law

(27)

that allows us to capture its behavior in terms of a single temperature-dependent exponent . Figure 15 shows that the slope of these fits (the black dotted lines) turns from negative to positive with increasing temperature. The temperature dependence of the exponent is shown in Fig. 16. For it approaches consistent with a true -peak of the distribution at base-pairing probability one. The exponent crosses zero for between and and tends to one at the glass-transition temperature, where it becomes impossible to determine it, since too many bins in receive zero counts (see above). We conclude that while the transition, at which high-probability base pairs disappear is at as shown in Sec. III, already at a lower temperature of the system transitions from an ensemble where the bulk of the pairing probability distribution is at to an ensemble where the probability distribution has a power law with positive exponent at , i.e., where the probability density goes to zero for pairing probabilities of . We call this transition, close in spirit to the one proposed in Lässig and Wiese (2006), secondary freezing, stressing that we have not found any signature of a thermodynamic phase transition.

V Conclusions

In this work, we numerically studied the glass-molten phase transition of RNA secondary structures using a Gaussian disorder model. With the guidance of previous studies of this system using renormalization-group theory Lässig and Wiese (2006); David and Wiese (2007, 2009) we determined the precise location of the transition by the introduction of two order parameters, one for the glass phase, and one for the molten phase. In addition to precisely locating the critical temperature of this system, we confirmed the numerical values of the critical exponents at the phase transition predicted by the field theory.

We provide an explanation as to how this model transitions between its molten and glass states by studying the behavior of the base-pairing probability distributions. In particular, we show that the total base-pairing probability distribution can be broken into two sub-distributions composed of base pairs that are “locked” in the ground-state structures below the glass transition temperature, , and non-ground-state base pairs, , that never develop a sizable pairing probability. Interestingly, we also identify a potential secondary freezing event Lässig and Wiese (2006) within the low temperature phase. At temperatures below this secondary freezing event, the most prevalent base pairing probability of ground-state base pairs remains one even at finite temperatures indicating that they are truly locked. Above this secondary freezing event but below the glass transition temperature, base pairs with significant pairing probabilities () remain prevalent but the density of base pairs with pairing probability one itself vanishes, indicating that locking is maintained only in the weaker sense that locked base pairs are common to a finite fraction of the thermodynamic ensemble rather than to “all” structures. It would be interesting to study the implications of this secondary freezing event on the kinetics of RNA structures.

Acknowledgments

This material is based upon work supported by the National Science Foundation under Grants No. DMR-1410172 and DMR-1719316. K.W. thanks W. Janke for discussions and early simulations addressing the Lässig-Wiese phase transition scenario, and for pointing out the importance of finite-size effects. We acknowledge useful discussions with F. David and M. Lässig.

References

Comments 0
Request Comment
You are adding the first comment!
How to quickly get a good reply:
  • Give credit where it’s due by listing out the positive aspects of a paper before getting into which changes should be made.
  • Be specific in your critique, and provide supporting evidence with appropriate references to substantiate general statements.
  • Your comment should inspire ideas to flow and help the author improves the paper.

The better we are at sharing our knowledge with each other, the faster we move forward.
""
The feedback must be of minimum 40 characters and the title a minimum of 5 characters
   
Add comment
Cancel
Loading ...
267882
This is a comment super asjknd jkasnjk adsnkj
Upvote
Downvote
""
The feedback must be of minumum 40 characters
The feedback must be of minumum 40 characters
Submit
Cancel

You are asking your first question!
How to quickly get a good answer:
  • Keep your question short and to the point
  • Check for grammar or spelling errors.
  • Phrase it like a question
Test
Test description