Parallel-Tempering Monte-Carlo Simulation with Feedback-Optimized Algorithm Applied to a Coil-to-Globule Transition of a Lattice Homopolymer
We present a study of the parallel tempering (replica exchange) Monte Carlo method, with special focus on the feedback-optimized parallel tempering algorithm, used for generating an optimal set of simulation temperatures. This method is applied to a lattice simulation of a homopolymer chain undergoing a coil-to-globule transition upon cooling. We select the optimal number of replicas for different chain lengths, N = 25, 50 and 75, using replica’s round-trip time in temperature space, in order to determine energy, specific heat, and squared end-to-end distance of the hopolymer chain for the selected temperatures. We also evaluate relative merits of this optimization method.
Monte Carlo (MC) simulations are widely used in polymer modeling Binder and Paul (2008) in order to extract the relevant thermodynamic and structural properties. The MC method is based on generating a random set of points in configuration space (where is a large number) which resemble the distribution of such points in thermal equilibrium. The practical problem is to limit , also referred to as the number of MC steps (MCS), to a manageable size, which would result in realistic times of simulation (we define MCS step as a one attempt to move each particle in a simulated system). For example, the standard Metropolis Algorithm (MA) Binder and Paul (2008) provides a straightforward way to generate such points, but the number of MCS to reproduce the equilibrium distribution is often prohibitively large. While the MA works quite well for high temperatures (typically above a phase transitions), it tends to generate traps in the local free energy minima for low temperatures (typically below a phase transition). Those traps often result in unreliable estimates of the sampled properties. To solve this problem, many modifications of the MA are proposed Wang and Landau (2001); Cerny (1985); Kirkpatrick et al. (1983). One of those is parallel tempering (PT) method Swendsen and Wang (1986); Earl and Deem (2005), in which by parallel simulation of many replicas, in the relevant temperature range, the energy barriers can be overcome. In general, one simulates copies of a system for a set of temperatures, , where . The minimum and the maximum temperatures, and , are fixed and the intermediate temperatures are selected to provide an optimal walk of replicas in the temperature space. The adjacent replicas are exchanged, at a given frequency in MCS, in random order with the following probability:
where , is the Boltzmann constant, and is potential energy of replica at . If the system is trapped at lower temperature, it can be heated up in higher temperature and overcome the energy barrier. By using such scheme, one can get better statistics in less MCS, but one is required to simulate replicas in parallel. However, to run such simulation one has to set some parameters: number of replicas (temperatures) , temperature set, , and exchange interval, . It is known that quality of the PT method strongly depends on those parameters, and if they are not chosen correctly, simulation can give the same results as without the PT. It is not clear how many replicas are optimal, and it is hard to establish the appropriate criteria for this choice. One approach is to consider some thermodynamic quantity (for example the specific heat, ) and to test it how much CPU-time is needed to obtain correct result, or run many simulations with the same number of MCS (or total MCS of all replicas), and to test measurement accuracy for this quantity. In another approach, one can find the optimum set that gives the shortest round-trip times of replicas in temperature space.
Feedback-optimized PT (FOPT) algorithm, which we use in this work, is designed to find temperature sets that maximize the current of replicas from the lowest temperature to the highest one (details in reference Katzgraber et al. (2006)). In this algorithm we mark each replica with a flag (when it reaches ) or (when it reaches ). Initially the replicas do not have any flag, but they acquire one of them if any of the above conditions is met. During the simulation, we generate histograms and . Before exchanging replicas, we check all replicas and increment histograms in such way that is increased if replica with has an flag, and is increased if it has a flag. If replica does not have any flag then neithter of histograms is increased. From this data we create the histogram:
The FOPT uses histogram to optimize temperature set in order to minimize round-trip times.
In this work, we use round-trip time to find optimal parameters set. We test systems of different sizes, , and with different number of replicas, , to find the optimum parameter set. We do not vary the exchange interval, , which is set to MCS. In the future, we intend to test the effect on the PT simulation results.
It is worth to notice, that there are also other methods for generating optimal temperature sets Gront and Kolinski (2007); Sabo et al. (2008); Predescu et al. (2005). One of them, is proposed by Sabe et al. Sabo et al. (2008), and is based mainly on the work of Kofke Kofke (2002) and Kone and Kofke Kone and Kofke (2005), in which the optimal temperature sets are found by using the condition of constant entropy increase for the adjacent replicas. As far as we know, round-trip time of replicas with FOPT was not used as a criterion for finding optimal number of replicas in PT simulation.
The simulation is performed on the face centered cubic (FCC) lattice with coordination number and the bond length where is the FCC lattice constant. Chain bond are not allowed to be stretched or broken, and the usual periodic conditions are imposed. Lattice sites, which do not have the chain segments, are considered to contain the implicit solvent. The interaction is limited to the nearest neighbors (), and the interaction parameter, is related to the Flory parameter by the following equation:
This parameter, , serves also as an energy unit to define the reduced energy per segment, and the reduced temperature as
where, as before, is the number of chain segments.
In this simulation we use a single homopolymer chain with lengths , and segments on the FCC lattice. Elementary moves consists of rotation of a segment inside a chain, moving first and last segment around neighbouring segment and slithering-snake move. Detailed description of this model can be found in references Wołoszczuk et al. (2008); Lewandowski et al. (2008).
Iii Results and discussion
First, the system is equilibrated in athermal limit (), then we perform MCS to equilibrate system in the specified temperature range, and another MCS is used to sample data. During the simulation we collect data for histogram and calculate the number of round trips that each replica performs. To get better ’s we assume that the minimum number of round-trips for each replica should be . If this requirement is not met, we extend the simulation “time” by another MCS. For each polymer length we perform many simulations with different number of replicas. For we use and ; for we use and ; for we use and . All systems are simulated in temperatures range to .
In each simulation we perform 6 rounds of the FOPT, starting with linear set chosen for the first round. We choose 6 iteration arbitrarily. As one can see in Figure 3 temperature set in round 4 and 5 is very similar and if we run more rounds, we will get similar sets fluctuating around ideal one. In the last round we run MCS to collect data to find optimal number of replicas. First MCS are used to equilibrate system and another MCS to collect the data (for longer chains and greater number of replicas we doubled or even tripled the number of MCS to obtain better statistics).
Before we show the results for optimal number of replicas, we look into the simulation process from the 1st round to the 5th round of the FOPT for chain length and and (the second is used as a reference for simulation with ). In Figure 1 we present three traces of selected replicas in the space from the 1st round. As one can see each of the presented replicas reaches and many times, but more often it reaches higher temperatures.
To use the full potential of the PT method, the set should be optimized in such way that each replica spends about the same amount of time at each temperature. We use the FOPT algorithm for that goal. Figure 2 presents histogram , Figure 3 presents generated set for each simulation round and Figure 4 presents exchange probability between adjacent replicas in the 1st, the 2nd and the 5th round. In the 1st round we use linear set, in next rounds temperatures are generated from the histogram obtained in the previous round. The 2nd round yields a reasonably optimized set, and in the next rounds, the set gets closer to the optimal solution. Black line in Figure 2 indicates ideal behavior. In the first round we can see that the -replica is trapped and that its exchange probability with the -replica is about . Exchange probabilities for other replicas are much higher and they grow with increasing . In the 2nd round the bottleneck at is removed and there is a significant improvement in the histogram, but we can notice another bottleneck at with the exchange probability of about . This bottleneck is removed in the following rounds. As seen from exchange probability for the 5th round, the optimal solution has varying exchange probabilities. In general, they should be higher for lower temperatures and for with larger specific heat.
iii.2 The long chain properties
Now we focus on selected thermodynamic and structural properties of the polymer chain of length with replicas. It is well known that a single homopolymer chain undergoes coil-to-globule transition upon cooling (in general, upon decreasing the solvent quality). In Figure 5 we present reduced energy per segment, in Figure 6 the specific heat, and in Figure 7 the squared end-to-end distance for 1st round (dashed), for the 5th round (dotted), and for the 6th round after MCS (solid) that is treated as a reference. From to the chain is in the swollen coiled state, then at about it undergoes coil-to-globule transition. For lower it is in a globular state, but at about there is another ordering effect, probably crystallization on the FCC network. It corresponds to forming an optimal structure within used lattice. As seen in Figure 8, globule below this transition is highly ordered. As seen in Figure 5, the energy for each round is almost the same, but the specific heat is not. Since in the 1st round we use linear set, we miss the low temperature ordering effect at about . Because the -replica is most likely trapped in a free energy minimum, probing specific heat for gives a wrong result. But with optimized set for the 5th round, we observe a considerable improvemanet of the specific heat for the same number of MCS.
As we can see, the PT method without optimization also gives reasonable results. It would be, therefore, reasonable to compare simulation results from simulation with and without the PT. Such comparisons can be found, for example, in reference Sikorski (2002). It is worth to notice that the FOPT not only speeds up the simulation process (giving better sampled data approximation) but also it sets temperatures in such way that the relevant temperature ranges are sampled more accurately (by arranging temperatures in areas where the specific heat is higher).
iii.3 Optimal number of replicas
To find optimal number of replicas for each system size, we perform simulations with different number of replicas and in each case we measure the round-trip time (to get from to , and then back from to ). Figure 9 presents average round-trip time for polymers length N = 25 (solid line), 50 (dashed line) and 75 (dotted line). For each N, the smallest possible number of replicas is chosen to run the PT so that the round-trip is actually performed. For smaller M’s, the replicas cannot perform any round-trips in the temperature space. However if we had a better set in the 1st round, it might run correctly. As we can see in Figure 9 when we increase the chain length, the replicas need more MCS to perform one round-trip. For , and the smallest round-trip time is , and MCS, respectively. Thus it seems to increase in a non-linear way. Also from Figure 9, we can see that with increasing number of replicas, the round-trip time is also increased. This is because a single replica has to take a longer path as is increased. For example, for it is only steps in one direction, and another in the other to complete a single round-trip, whereas for it is 98 steps in both directions. From this plot it is not clear what is the optimal . Therefore we report the round-trip time divided by (Figure 10) in order to get time needed to perform a single step in space. Figure 10 showes a clear minimum of the MCS time at a given M, for each N. While for and , it takes about MCS to perform a single step in space, when we increase we reach a minimum for with MCS. For higher we observe an increase of time needed to perform a single step. For the minimum number of replicas is , and in this case it takes MCS per single step. Again, with increasing , the number of MCS decreases, and for it reaches a minimum with MCS. For , we observe a similar behavior with minimum for at MCS. By dividing the minimum MCS by chain length, , we obtain about (Figure 11).
It is known that number of replicas grows like where is a size of a system Kofke (2002); Earl and Deem (2005). In principle, if one can find the optimal number of replicas for specified , and then one can estimate the optimal for any . However, in this simulations it is not confirmed. If we take with optimal , then optimal for and should be and , respectively. However, it is known that with increasing polymer chain length, coil-to-globule transition moves to higher temperatures, so it has different specific heat which affects the optimal . Another explanation could be that our choice for definition of optimal is not the same as in the other studies Rathore et al. (2005). On the other hand, we have only chain lengths so it is quite difficult to verify this scaling with only points. In our simulations we get an approximate “scaling”:
Average exchange probability between adjacent replicas for chain lengths , and is , and , respectively. This is quite large compared to results obtained by Kone and Kofke Kone and Kofke (2005) and by Rathore et al. Rathore et al. (2005) for systems with constant specific heat. They found that optimal exchange probability is about .
We perform Monte Carlo simulations of homopolymer chains of lengths , and in temperatures range to and find the the optimal number of replicas, , based on the shortest time needed to perform a single step in the temperature space. We also find that for each chain length, and 75, the optimal is and , respectively. The optimal is found to be approximately proportional to , relation.
However, if one changes temperature range, then a set of the FOPT simulations should be run again to find optimal for this system. For example, in this work we obtain that the optimal for , but if we set and , we would get different optimal . Finding the optimal for different ranges is not a subject of this work, but it may be crucial. Another significant parameter which is not changed during the simulation is the replica exchange interval, .
The FOPT is a promissing method for simulating polymer systems. This algorithm provides an iterative way to find optimal number of replicas, but it is quite CPU-time consuming (one has to perform numerous simulations to prepare an optimized set in which the targeted system is sampled). It can also be unstable, and sometimes the generated temperatures can be so inappropriate that the free energy bottlenecks are formed, which are difficult to overcome by the replicas’ walk in the space.
Acknowledgements.We gratefully acknowledge the computational grant from the Poznan Supercomputing and Networking Center (PCSS) and grant N202 287338 from Polish Ministry of Science and Higher Education.
- Binder and Paul (2008) K. Binder and W. Paul, Macromolecules 41, 4537 (2008).
- Wang and Landau (2001) F. Wang and D. P. Landau, Phys. Rev. Lett. 86, 2050 (2001).
- Cerny (1985) V. Cerny, Journal of Optimization Theory and Applications 45, 41 (1985).
- Kirkpatrick et al. (1983) S. Kirkpatrick, C. D. Gelatt Jr., and M. P. Vecchi, Science 220, 671 (1983).
- Swendsen and Wang (1986) R. H. Swendsen and J. S. Wang, Phys. Rev. Let. 57, 2607 (1986).
- Earl and Deem (2005) D. J. Earl and M. W. Deem, Phys. Chem. Chem. Phys. 7, 3910 (2005).
- Katzgraber et al. (2006) H. G. Katzgraber, S. Trebst, D. A. Huse, and M. Troyer, J. Stat. Mech. p. P03018 (2006).
- Gront and Kolinski (2007) D. Gront and A. Kolinski, J. Phys. Condens. Matter 19 (2007).
- Sabo et al. (2008) D. Sabo, M. Meuwly, D. L. Freeman, and J. D. Doll, J. Chem. Phys. 128 (2008).
- Predescu et al. (2005) C. Predescu, M. Predescu, and C. V. Ciobanu, J. Phys. Chem. B 109, 4189 (2005).
- Kofke (2002) D. A. Kofke, J. Chem. Phys. 117, 6911 (2002), 120, 206101 (2004).
- Kone and Kofke (2005) A. Kone and D. A. Kofke, J. Chem. Phys. 122, 1 (2005).
- Wołoszczuk et al. (2008) S. Wołoszczuk, M. Banaszak, P. Knychała, K. Lewandowski, and M. Radosz, J. Non-Cryst. Solids 354, 4138 (2008).
- Lewandowski et al. (2008) K. Lewandowski, P. Knychała, and M. Banaszak, Phys. Stat. Sol. B 245, 2524 (2008).
- Sikorski (2002) A. Sikorski, Macromolecules 35, 7132 (2002).
- Rathore et al. (2005) N. Rathore, M. Chopra, and J. J. de Pablo, J. Chem. Phys. 122, 024111 (2005).
FIG. 1: Traces of three replicas in the temperature space for chain length, , and the number of replicas, , in the 1st round.
FIG. 2: The histogram for each round of the FOPT simulation for and : the 1st round (dashed line with crosses), the 2nd round (short dashed line with stars), the 3rd round (dotted line with squares), the 4th round (dash-dotted line with filled squares), the 5th round (dash-double-dotted line with triangles) and a reference (solid line).
FIG. 3: The temperature set for each round of the FOPT simulation for and . In the first round we start with linear temperatures, and in next rounds temperatures are generated with the FOPT algorithm.
FIG. 4: Exchange probability for three FOPT rounds for and . Boxes filled with crossed lines are for the 1st round, boxes filled with black are for the 2nd round and boxes filled with lines are for the 5th round.
FIG. 5: Reduced energy per segment as a function of temperature for chain length with number of replicas : the 1st FOPT round (dashed line), the 2nd round (dotted line) and a reference (solid line). For the reference we use and MCS.
FIG. 6: Specific heat, , as a function of temperature, , for chain length, , with the number of replicas, : the 1st FOPT round (dashed line), the 2nd round (dotted line) and a reference (solid line). For the reference we use and MCS.
FIG. 7: Squared end-to-end distance as a function of temperature, , for chain length with number of replicas : the 1st FOPT round (dashed line), the 2nd round (dotted line) and a reference (solid line). For the reference we use and MCS.
FIG. 8: Snapshots of structures of polymer chain of length for temperatures: (swollen state), (coil-to-globule transition), (globular state) and (globular state after another ordering transition in ).
FIG. 9: Round-trip time in MCS as a function of number of replicas for chain lengths: (solid line), (dashed line) and (dotted line).
FIG. 10 Average time in MCS needed to perform a single step in temperature space as a function of number of replicas, , for chain lengths: (solid line), (dashed line) and (dotted line).
FIG. 11 Minimum average time needed to perform a single step in temperature space divided by chain length as a function of .