Skip to main content
ACS AuthorChoice logoLink to ACS AuthorChoice
. 2015 Feb 6;11(3):1077–1085. doi: 10.1021/ct5009087

Enhanced Conformational Sampling Using Replica Exchange with Collective-Variable Tempering

Alejandro Gil-Ley 1, Giovanni Bussi 1,*
PMCID: PMC4364913  PMID: 25838811

Abstract

graphic file with name ct-2014-009087_0008.jpg

The computational study of conformational transitions in RNA and proteins with atomistic molecular dynamics often requires suitable enhanced sampling techniques. We here introduce a novel method where concurrent metadynamics are integrated in a Hamiltonian replica-exchange scheme. The ladder of replicas is built with different strengths of the bias potential exploiting the tunability of well-tempered metadynamics. Using this method, free-energy barriers of individual collective variables are significantly reduced compared with simple force-field scaling. The introduced methodology is flexible and allows adaptive bias potentials to be self-consistently constructed for a large number of simple collective variables, such as distances and dihedral angles. The method is tested on alanine dipeptide and applied to the difficult problem of conformational sampling in a tetranucleotide.

Introduction

Biomolecular dynamics involves a wide range of time scales ranging from bond fluctuations to slow large-scale motions.13 Molecular dynamics (MD) with accurate force fields can in principle be used as a virtual microscope to investigate these motions at atomistic resolution.4 However, its applicability to problems such as folding or conformational transitions in proteins and RNA is limited by the fact that only short time scales (∼μs) are directly accessible by straightforward simulation. In spite of the development of ad hoc hardware,5 many relevant processes are still out of reach for accurate atomistic modeling. Several different techniques have been developed in the last decades to address this issue. These techniques can be roughly classified in two groups.6 In the first group, inspired by annealing,7 ergodicity is achieved by increasing the temperature8,9 or by artificially modifying the Hamiltonian.1013 In methods of this group a ladder of replicas with different degree of ergodicity is often employed. Swaps of coordinates between neighboring replicas are periodically attempted and accepted or rejected with a Metropolis criterion. These methods are usually referred to as replica exchange molecular dynamics (REMD). Whereas temperature is certainly the most adopted control parameter, temperature REMD (T-REMD) is computationally demanding for solvated systems, because replica spacing and interchange probability depends on the system size.14 Moreover, T-REMD is ineffective on entropic barriers.15,16 Scaling of portions of the Hamiltonian (H-REMD) is a common alternative and could have a better convergence behavior for large systems. T-REMD and H-REMD can also be combined, by integrating both schemes on each replica17 or in a multidimensional framework.18 The second group of enhanced sampling techniques includes methods based on importance sampling, where suitable collective variables (CVs) are a priori selected and biased. This class has its root in umbrella sampling method19 and includes local elevation,20 conformational flooding,21 adaptive biasing force (ABF),22 and metadynamics.23,24 These methods are very efficient but require large a priori information. With the notable exception of bias-exchange metadynamics,25 approaches based on importance sampling have been traditionally applied to a small number of CVs at a time, due to the difficulties in building a history dependent potential in a high-dimensional CV space. For many systems it is difficult to find a small number of effective CVs that describe the slow degrees of freedom, and one often has to resort to expensive methods of the first class. For instance, conformational transitions in unstructured oligonucleotides have been studied with different REMD schemes.18,26,27 In these studies, the generation of a converged conformational ensemble was proven a cumbersome task, even when a high number of replicas and tens of μs of simulated time were employed.

Methods bridging between these two classes can be designed by biasing a large number of local CVs (e.g., dihedral angles), so as to avoid the complication of designing ad hoc CVs. For example, Straatsmaa and McCammon28 introduced a technique where bias potentials acting on dihedrals was used in a simulated annealing protocol. In that work a bias was fitted to the potential of mean force of backbone dihedrals and then used to quickly optimize the structure of a polypeptide. Zacharias and collaborators29,30 used a similar technique to build bias potentials for dihedrals to be employed in H-REMD and successfully applied it to study conformational changes in proteins. However, these potentials were fitted on a model system and cannot account for the specific identity of each residue and for the cross-talk between correlated dihedrals. For nucleic acids, where the complex backbone does not allow a straightforward application of this technique, penalty potentials centered on the stable rotamers were manually selected with a procedure that seems difficult to generalize.3133

In this paper we propose to use concurrent well-tempered metadynamics24 (WT-MetaD) to build bias potentials acting on a large number of local CVs. We then show how to integrate this approach in a H-REMD scheme, exploiting the replica ladder to obtain unbiased conformations. In WT-MetaD the compensation of the underlying free-energy landscape is modulated by the so-called bias factor γ. We here change this parameter across the replica ladder, adjusting the ergodicity of each replica. The final bias could be also used as a static potential so as to completely eliminate any nonequilibrium effect. Since the effect of the bias is that of keeping the chosen CVs at an effectively higher temperature, we refer to the introduced method as replica exchange with collective-variable tempering (RECT). The method is first tested on alanine dipeptide in water and then applied to the conformational sampling of a RNA tetranucleotide where it outperforms dihedral-scaling REMD and plain MD. The chosen tetranucleotide is a very challenging system that has been recently studied with different variants of REMD.18,26,27

Methods

In this section we show how to use WT-MetaD as an effective method to build bias potentials that allow barriers to be easily crossed. One of the input parameters of well-tempered metadynamics is a boosting temperature ΔT = (γ–1)T, where γ is the bias factor and T is the temperature of the system. In the rest of the paper we will equivalently use either γ or ΔT so as to simplify the notation. This parameter can be used to smoothly interpolate between unbiased sampling (γ = 1, ΔT = 0) and flat histogram (γ → ∞, ΔT → ∞). One can thus introduce a set of replicas using different values of ΔT, ranging from 0 to a value large enough to allow all the relevant barriers to be crossed. Metadynamics relies on the accumulation of a history dependent potential and cannot be applied straightforwardly to a large number of CVs. In the next subsection we show that this issue can be circumvented by performing many, low-dimensional, concurrent metadynamics simulations. We then show how to combine many simulations of this kind in a multiple-replica scheme.

Concurrent Well-tempered Metadynamics

In well-tempered metadynamics a history dependent potential V(s,t) acting on the collective variable s is introduced and evolved according to the following equation of motion

graphic file with name ct-2014-009087_m001.jpg

Here kB is the Boltzmann constant, T is the temperature, τB is the characteristic time for the bias evolution, ΔT is a boosting temperature, and K is a kernel function which is usually defined as a Gaussian. For simplicity we consider the case of a single CV. The variance of the Gaussian provides the binning in CV space and is usually chosen based on CV fluctuations or adjusted on the fly.34 By assuming that the bias is growing uniformly with time one can show rigorously24,35 that in the long time limit the bias potential tends to

graphic file with name ct-2014-009087_m002.jpg 1

so that the following probability distribution is sampled

graphic file with name ct-2014-009087_m003.jpg

The role of ΔT is that of setting the effective temperature for the CV. The explored conformations are thus taken from an ensemble where that CV only is kept at an artificially high temperature, similar to other methods,3638 but has the nice feature that it is obtained with a bias that is quasi-static in the long lime limit. The bias is usually grown by adding a Gaussian every NG steps. As a consequence, to obtain an initial growing rate equal to (kBΔT)/(τB), the initial Gaussian height should be chosen equal to ((kBΔT)/(τB))NGΔt where Δt is the MD time step.

We here propose to introduce a separate history-dependent potential on each CV

graphic file with name ct-2014-009087_m004.jpg

where α = 1,...,NCV is the index of the CV, and NCV is the number of CVs. The growth of each of these bias potentials will depend only on the marginal probability for each CV

graphic file with name ct-2014-009087_m005.jpg

In the long time limit, this potential will tend to flatten the marginal probabilities for every single CV. Since CVs are in general nonorthogonal between each other, one should consider the fact that whenever a bias is added on a CV also the distribution of the other CVs is affected. In the following we will discuss this issue considering two CVs only, but the argument is straightforwardly generalized to a larger number of CVs.

Two Independent Variables. If two CVs are independent, the joint probability is just the product of the two marginal probabilities, i.e. P(sα,sβ) = Pα(sα)Pβ(sβ). Adding a bias potential on a CV will not affect the distribution of the other. As a consequence, in the long time limit the two bias potentials will converge independently to the predicted fraction of the free energy as in eq 1. The final bias potential will be completely equivalent to that obtained from a two-dimensional well-tempered metadynamics but will only need the accumulation of two one-dimensional histograms, thus requiring a fraction of the time to converge. A simple example on a model potential is shown in Figure S1.

Two Identical Variables. We also consider the case of two identical CVs, sα = sβ. This can be obtained for instance by biasing twice the same torsional angle. Here the potentials Vα and Vβ will grow identically, and the total bias potential acting on sα will be Vtot = 2Vα. The total potential will grow as

graphic file with name ct-2014-009087_m006.jpg

Thus, the net effect will be exactly equivalent to that of choosing a doubled ΔT parameter. In other words, the ΔT parameter acts in an additive way on the selected CVs. A similar effect can be expected if two CVs are linearly correlated.

In realistic applications one can expect the behavior to be somewhere in the middle between these two limiting cases. The most important consideration here is that the bias potentials will tend to flatten all the marginal probabilities, but there will be no guarantee that the joint probability is flattened. Results for a simple functional form can be seen in Figure S2. In the same figure it is possible to appreciate the importance of using a self-consistent procedure when CVs are correlated. In ref (39) two metadynamics were applied on top of each other, namely on the potential energy and on selected CVs, in a non-self-consistent way. This was possible because the correlation between the potential energy and the selected CVs is small. The need for a self-consistent solution was also pointed out in a recent paper40 where a generalization of the ABF method22 was introduced. In that work independent one-dimensional adaptive forces were applied at the same time to different CVs so as to enhance the sampling of a high multidimensional space.

In short, the novelty of the introduced procedure is that many low-dimensional metadynamics potentials are grown instead of a single multidimensional one. This allows the bias to converge very quickly to a flattening potential, with the degree of flatness controlled by the parameter ΔT. The flattening is expected to enhance conformational transitions which are otherwise hindered by free-energy barriers on the biased CVs. When variables are correlated the exact relationship between bias and free energy (eq 1) could be lost.

Hamiltonian Replica Exchange

The procedure introduced above produces conformations in an ensemble which is in general difficult to predict. However, since the bias potential is known, one can in principle reweight results so as to extract conformations in the canonical ensemble. In the case of static bias potentials acting on the CVs, this can be done by weighting each frame as e(∑αVα(sα))/(kBT). This can provide in principle correct results even if the joint probability is not flattened. It must be noticed that such a reweighting can provide statistically meaningful results only for small fluctuations of the total biasing potentials, on the order of kBT.40 However, in a typical setup one would be interested in biasing all the torsional angles of a molecule. Even if each of them contributes with a few kBT, the total fluctuation of the bias would grow with the system size. For similar reasons, also the ABF-based scheme introduced in ref (40) is limited to a relatively low number of CVs.

A more robust and scalable procedure can be designed by introducing a ladder of replicas with increasing values of ΔT, ranging from 0 to a value large enough to enhance the relevant conformational transitions. The first replica (γ = 1, ΔT = 0) can be used to accumulate unbiased statistics. Replicas other than the first one feel multiple biasing potentials on all the CVs. From time to time an exchange of coordinates between neighboring replicas is proposed and accepted with probability chosen so as to enforce detailed balance with respect to the current biasing potential:

graphic file with name ct-2014-009087_m007.jpg
graphic file with name ct-2014-009087_m008.jpg

Here the suffix i = 1,...,Nrep indicates the replica index, Nrep being the number of replicas. The exchanges allow the bias potential of every single replica to grow as close as possible to equilibrium taking advantage of the enhanced ergodicity of the more biased replicas. We notice that to reach a quasi-static distribution it is necessary that all the bias potentials converge for all the replicas. Since the time scale for convergence is related to the parameter τB,24 it is convenient to use the same τB for all the replicas or, equivalently, to choose the initial deposition rate as proportional to ΔT. The number of replicas required to span a given range in the ΔT parameter is proportional to (NCV)1/2, thus allowing for a very large number of CVs. We underline that, if a very large number of CVs has to be calculated at every step on every replica, the overall performances could be degraded. This issue could be tackled using e.g. multiple time-step schemes.41,42 In the examples presented here this performance issue was not observed.

We notice that in principle one could use the bias potentials built with this protocol to perform a replica-exchange umbrella sampling simulation. In this manner the final production run would be performed with an equilibrium replica-exchange simulation. However, we observe that well-tempered metadynamics is designed so that the speed at which the bias grows decreases with time and the potential becomes quasi-static. In the practical cases we investigated, this second stage was not necessary.

Model Systems

Alanine Dipeptide

Alanine dipeptide (dALA) was modeled with the Amber99SB-ILDN43,44 force field and solvated in a truncated octahedron box containing 599 TIP3P45 water molecules. The LINCS46,47 algorithm was used to constrain all bonds, and equations of motion were integrated with a time step of 2 fs. For each replica the system temperature was kept at 300 K by the stochastic velocity rescaling thermostat.48 For all nonbonded interactions the direct space cutoff was set to 0.8 nm, and the electrostatic long-range interactions were treated using the default particle-mesh Ewald49 settings. All the simulations were run using GROMACS 4.6.550 patched with the PLUMED 2.0 plugin.51 We underline that the possibility of running concurrent metadynamics within the same replica is a novelty introduced in PLUMED 2.0.

The RECT simulation was performed with 6 replicas. The backbones dihedral angles (Ψ and Φ) and the gyration radius (Rg) were selected as CVs. The γ factors were chosen from 1 to 15 following a geometric distribution. We recall that a geometric replica distribution is optimal for constant specific-heat systems. In RECT, this would be true if the exploration of each of the biased CVs were limited to a quasi-parabolic minimum in the free-energy landscape. Whereas this is clearly not true in real cases (e.g., double-well landscapes) we found that a geometric schedule was leading to a reasonable acceptance in the cases investigated here. The possibility of optimizing the replica ladder is left as a subject for further investigation. For the dihedral angles the Gaussian width was set to 0.35 rad and for the Rg to 0.007 nm. The Gaussians were deposited every 500 steps. The initial Gaussian height was adjusted to the ΔT of each replica, according to the relation h = (kBΔT)/(τB)NGΔt, in order to maintain the same τB = 12 ps across the entire replica ladder. The CVs were monitored every 100 steps, and exchanges were attempted with the same frequency. The simulation was run for 20 ns per replica.

A H-REMD simulation where the force-field dihedral terms were scaled (Hdih-REMD) was also performed, as implemented in an in-house version of the GROMACS code.52 The same initial structures, number of replicas, and simulated time as in RECT were used. The scaling factor λ for each replica was selected using the relation λ = 1/γ to allow for a fair comparison of RECT and H-REMD. Finally a conventional MD simulation in the NVT ensemble was run for 120 ns using the same settings.

Tetranucleotide

The second system considered was an RNA oligonucleotide, sequence GACC. The initial coordinates were taken from a ribosome crystal structure (PDB: 3G6E), residue 2623 to 2626. Simulations were performed using the Amber99-bsc0χOL3 force field.43,53,54 The system was solvated in a box containing 2502 TIP3P45 water molecules, and the system charge was neutralized by adding 3 Na+ counterions, consistently with previous simulations.26,55 A RECT simulation was performed using 16 replicas simulated for 300 ns each. The γ ladder was chosen in the range from 1 to 4 following again a geometric distribution. The initial structures for the H-REMD were taken from a 500 ps MD at 600 K, to avoid correlations of the bias during the initial deposition stage of the WT-MetaD. Other details of the simulation protocol were chosen as for the previous system. As depicted in Figure 1, for each residue the dihedrals of the nucleic acid backbone (α, β, ϵ, γ, ς), together with the pseudodihedrals angles of the ribose ring (θ1 and θ2) and the glycosidic torsion angle (χ) were chosen as CVs. To help the free rotation of the nucleotide heterocyclic base around the glycosidic bond, the minimum distance between the center of mass of each base with the other three bases was also biased. For the WT-MetaD we used the same parameters as in the previous system. Gaussian width for the minimum distance between bases was chosen equal to 0.05 nm.

Figure 1.

Figure 1

Schematic representation of the collective variables used for the tetranucleotide simulation. For each nucleotide, the labeled dihedral angles and the minimum distance between each nucleobase center of mass and the other three nucleobases were biased.

For this system a Hdih-REMD, a T-REMD, and a plain MD simulation were performed in addition to the RECT. In the case of Hdih-REMD we used 24 replicas with scaling factors λ ranging from 1 to 0.25, so as to cover the same range of the γ values chosen for the RECT. In the T-REMD 24 replicas were used to cover a temperature range between 300 and 400 K with a geometric distribution. For both methods, T-REMD and Hdih-REMD, the simulation length was 200 ns per replica. Exchanges were attempted every 120 steps. The conventional MD simulation was run for 4.8 μs. All the simulations (RECT, Hdih-REMD, T-REMD, and conventional MD) correspond to the same total simulated time.

Analysis

Dihedral Entropy

As the bias compensates the underlying free energy the probability distribution of the biased CVs is partially flattened. The main CVs used in our method are dihedrals angles. To quantify the effect of the Hamiltonian modifications on the angle distributions one-dimensional entropies (S1d) were estimated. The calculation procedure was equivalent to the one used in ref (56) to evaluate the configurational entropy associated with soft degrees of freedom in proteins. We employed wrapped Gaussian kernels to estimate the histogram profile of each dihedral. Histograms were calculated with PLUMED 2.0. For all the distributions the bandwidth for the kernel density estimation was set to 0.017 rad. We underline that using this definition we only evaluate the flatness of the individual one-dimensional distributions, and cross-correlation between CVs is ignored.

RNA Conformations

RNA conformations were classified according to the combination of the χ angle rotameric state of each nucleotide. Torsion orientations in the range of −0.26 to 2.01 rad were considered as syn, while the remaining ones were classified as anti. The limiting values were chosen according to the position of the barriers in the χ free-energy profiles of all the residues. The result of this clustering procedure gave 24= 16 different states that are kinetically well separated by the high torsional barriers. We observe that the population of these states does not depend only on the torsional potential associated with the χ dihedrals but include contributions from base–base stacking, hydrogen bonds, solvation of bases, etc.

Results

In this section we first show the validation of our methodology on a standard model system, dALA in water. Then we present results for the more challenging case of the conformational sampling of a tetranucleotide. For all the applications we benchmark against plain MD and a H-REMD where the dihedral potentials are scaled. For the tetranucleotide also results obtained with T-REMD are shown. All the comparisons are made using the same total simulated time.

Alanine Dipeptide

The goal of the introduced method is to enhance conformational sampling in the unbiased replica. The possibility to explore different metastable conformations in this replica relies on the fact that probability distributions in the biased replicas are flattened and that conformations can travel across the replica ladder. These conditions can be verified by monitoring the exchange rate and the flatness of the distributions.

The acceptance rate is in the range 65–72% for RECT and in the range 43–53% for Hdih-REMD, indicating that the former method requires less replicas. This is likely due to the fact that the total number of scaled dihedrals in Hdih-REMD is larger than the number of biased CVs in RECT. For both REMD methods we also verified that all the trajectories in the generalized ensemble sampled the same conformational ensemble (see Figure S3).

A quantitative measure of the flatness of the distribution in the biased replicas can be obtained from the dihedral entropy, shown in Figure 2 as a function of scaling factors (γ and λ for RECT and Hdih-REMD, respectively). The limiting value corresponding to a flat distribution is also indicated. Entropy grows faster as a function of the scaling factor when using RECT, indicating that free-energy barriers on the dALA isomerization transition are more effectively compensated by the bias potentials. With Hdih-REMD entropy of Ψ angle saturates and apparently the distribution cannot be further flattened by decreasing λ. In the case of the Φ angle, the dihedral entropy does not grow monotonically when λ is decreased. This behavior indicates that the relevant free-energy barriers are not only originating from the dihedral force-field terms. The conformational transitions involve indeed also changes in water coordination, reorganization of hydrogen bonds, nonbonded interactions, etc. On the contrary, RECT achieves an almost flat distribution for both dihedral angles at the highest value of the γ factor. Backbone dihedral distributions for all the replicas are shown in Figure S4. The conformations sampled on each replica are shown projected on the Φ,Ψ free-energy landscape in Figure S5, where it can be appreciated that all the relevant basins (α, β, and αR) are explored and connected by points close to the minimum-action pathways (see refs (40 and 57)).

Figure 2.

Figure 2

Entropy for Ψ(top) and Φ(bottom) dihedral angles in alanine dipeptide. Entropies are shown as a function of 1/λ and γ for Hdih-REMD and RECT, respectively. As the entropy increases the dihedral distributions become more flat. The maximum entropy value corresponding to a flat distribution is represented with a straight line.

To assess the efficiency and the accuracy of the introduced enhanced sampling technique the free-energy difference ΔF between the states ϕ∈[−π,0] and ϕ∈[0, (π)/(2)] was calculated from the distribution of the unbiased replica. Results are shown as a function of time in Figure 3, for the two REMD schemes and for the reference conventional MD. Both H-REMD methods converge to the right value with a similar behavior, whereas plain MD needs several tens of ns for the first transition to be observed. The similarity in the convergence of RECT and Hdih-REMD indicates that for this system the moderate flattening of the distribution induced by Hdih-REMD is sufficient to achieve ergodicity on this time scale. In order to better evaluate differences between the performance of RECT and Hdih-REMD we applied these methodologies to a more complex system. Results are shown in the next section.

Figure 3.

Figure 3

Estimate of the free-energy difference between the two metastable minima in alanine dipeptide. Data are shown for both replica-exchange methods (Hdih-REMD and RECT) and for conventional MD as a function of the total simulated time.

Tetranucleotide

Also in this case we monitor the average exchange ratio (76–83% for Hdih-REMD, 25–32% for T-REMD, and 60–80% for RECT). In Figure S7 the variation of the exchange ratios in time is shown for the exchanges between the first 2 and the last 2 replicas of each method. We also checked the consistency of trajectories along the replica ladder. As it can be appreciated in Figure S6, for Hdih-REMD and RECT the empirical distribution of RMSD is very similar for all the trajectories in the generalized ensemble, indicating that, for each method, all of them sampled the same conformational space. On the contrary, in the case of T-REMD, agreement among the distributions of RMSD is very poor. During this simulation trajectories across the temperature space remain trapped on different metastable conformations. The same behavior was obtained in ref (26) where several T-REMD simulations were performed on the same system, with the same number of replicas and a similar temperature range. In that work divergence among the obtained generalized ensembles was observed even for a simulated time as long as 2 μs per replica. For Hdih-REMD and RECT round-trip times are shown on Figure S8. The average round-trip time is ≈0.5 ns for Hdih-REMD, ≈1.8 ns for T-REMD, and ≈1.2 ns for RECT.

In Figure 4 we show the sum of the entropies for the 32 dihedrals used as CVs. In this respect, RECT is clearly more effective than Hdih-REMD in flattening the dihedral distributions, consistently with what was observed for dALA. Notably, the entropic increment observed in RECT is close to the one observed in T-REMD when using an equivalent temperature. This confirms that RECT has an effect comparable to that of raising the temperature of the biased CVs by a factor γ.

Figure 4.

Figure 4

Total entropy of backbone, puckering, and glycosidic dihedral angles in the tetranucleotide for both replica-exchange methods. Entropies are shown as a function of 1/λ and γ, for Hdih-REMD and RECT, respectively. For T-REMD, temperature is chosen as T = γ 300 K. As the entropy increases the dihedral distributions become more flat. The maximum entropy value corresponding to a flat distribution is represented with a straight line. Entropies obtained for the unbiased replicas in the three methods are consistent within their error bars (error not shown).

The significance of this entropic values could be appreciated on the time series and related histograms for all the dihedral angles shown in Figures S9–S12 for the most and least ergodic replica of Hdih-REMD and RECT. It is clear that for RECT, at the most ergodic replica, all the accessible torsional range is sampled. On the contrary, in the highest replica of Hdih-REMD the distributions of some torsions are not flattened.

The transition around the glycosidic bond, from anti to syn, is among the slowest relaxation times in RNA dynamics.58 To evaluate the convergence of the unbiased replica we analyzed the population of the anti rotamer for each nucleotide χ angle. Populations are shown in Figure 5 as a function of the total simulated time. For all the nucleotides the anti conformations are preferred. The guanosine is the nucleotide with the highest syn proportion, and the cytidines are the ones with the smallest (<2%), as correspond to their rotameric preferences.59 Values from both H-REMD approaches seem well consistent, except for the population of the first nucleotide. From the time behavior of these populations, it is clear that for all the REMD approaches the guanosine proportion of anti is the most difficult to converge. Here RECT can reach values close to a longer reservoir-REMD simulation,26 while both Hdih-REMD and T-REMD show results closer to those obtained from conventional MD, with a higher occupation of the anti conformer.

Figure 5.

Figure 5

Estimated glycosidic angle anti population for each nucleotide as a function of the total simulated time. Data are shown for Hdih-REMD, T-REMD, and RECT unbiased replicas and for conventional MD. Reference values taken from ref (26) are shown as dashed lines.

We observe that our method is enforcing the exploration of both anti and syn conformations in the biased replicas for each nucleotide independently. This however does not guarantee that all 16 combinations of anti and syn conformations are explored. Figure 6 shows the free energy of the RNA structures grouped by the combination of the χ angle anti(a)/syn(s) rotamers. All 16 combinations, except for ssss and asss, are sampled in the unbiased replica from RECT. On the contrary, the unbiased replica from T-REMD and Hdih-REMD explores respectively 13 and 8 of the states and plain MD only 5 of them. The most populated cluster corresponds to an all-anti conformation, followed by the saaa. Then, the three clusters asaa, ssaa, and sasa appear with similar population.

Figure 6.

Figure 6

Estimated free energies for the tetranucleotide conformations clustered according to the χ angle anti/syn rotameric combinations (circles). Free energies are computed as −kBTlogPi, where Pi is the normalized population of each cluster on the indicated replica. Gray boxes represent relative populations higher than 1%. Confidence intervals are shown as bars and span the range [−kBTlog(Pi+ΔPi),–kBTlog(PiΔPi)], where ΔPi is the standard deviation of the average Pi as obtained from four blocks. Clusters which are observed in only one of the four blocks have an infinite upper bound.

In the same figure the free-energy values for the ergodic replica show that all the 16 combinations are populated in RECT within a range of 6 kBT. In the case of Hdih-REMD the most ergodic replica visits only 9 combinations with a population that is very close to that of the unbiased replica. The most ergodic replica in T-REMD explores 14 clusters, but their populations have large statistical errors. We highlight the fact that results from T-REMD could be affected by the lack of convergence of trajectories across the temperature space (see Figure S6). This could lead to an underestimation of the errors as evaluated from block analysis.

Discussion

The introduced method allows for building bias potentials for a Hamiltonian replica-exchange scheme using concurrent well-tempered metadynamics on several CVs. Replicas are simulated using a ladder of well-tempered bias factors γ. When CVs are correlated, the self-consistency among the bias potentials is crucial to achieve flat sampling in each individual CV, as illustrated in Figure S2. In this case the exact relationship between bias and free energy is lost. We also remark that here flattening is not complete but modulated by the value of γ. This is useful since it avoids sampling very high energy states (e.g., with steric clashes) that would have a very low chance of being accepted in the unbiased replica. The method compares favorably with both conventional MD and Hdih-REMD. The method slightly outperforms T-REMD, where the entire system is heated, indicating that for these small systems there is not a substantial advantage in schemes where part of the system is biased. However, RECT can be straightforwardly generalized to large systems since the acceptance only depends on the size of the biased portion.

Results from both dALA and tetranucleotide simulations show that the bias potentials constructed with concurrent WT-MetaD are able to gradually scale the free-energy barriers. We notice that only barriers in the one-dimensional free-energy profiles are compensated, which means that some regions in the multidimensional space of all the CVs might not be explored. In principle this could hide some important minima that would never be observed. We did not observe this problem in the applications presented here.

The second application on which we tested RECT, namely conformational sampling of a tetranucleotide, is particularly challenging. The conformational space of these small RNA molecules is not constrained by Watson–Crick pairings, and ergodic sampling is out of reach of conventional MD simulations.26,55 So far, converged ensembles have been obtained only through highly expensive multidimensional REMD simulations, corresponding to a total simulated time of several tens of μs.18,27 One of the reasons for this difficult convergence is the long relaxation time for the anti to syn transitions, which could be additionally hindered by an incorrect force-field description of base–base stacking and base–solvent interactions.58

Figure 6 illustrates the ability of RECT to accelerate conformational transitions among the χ angle anti/syn rotamers. Although the conformational space of the more biased replicas is highly expanded, the convergence in the unbiased replica is not affected. On the contrary the method facilitates the sampling of glycosidic rotamer conformations that otherwise would not be explored by MD simulations of the same overall length. We finally remark that our procedure can be combined with weighted histogram60 so as to include the statistics of the biased replicas.

Comparison with Related State-of-the-Art Methods

RECT is based on the idea of building a replica ladder where a large set of selected CVs is progressively heated. CVs are heated by flattening their distribution with concurrent well-tempered metadynamics. We first discuss the possibility of using methods other than well-tempered metadynamics to build the replica ladder. Possible alternatives here include ABF22 or a recently proposed variational approach.61 These methods could be used in a RECT scheme provided they are suitably extended so as to sample a partially flattened distribution. We also observe that other methods aimed at keeping selected CVs at a given temperature have been proposed based on coupling thermostats to CVs directly.3638 These techniques have been mostly used in the past with an exploration purpose relying on additional calculations so as to provide free energies (see, e.g., ref (62)), but it is not clear if they can be integrated in a RECT scheme.

In the following we discuss the comparison of RECT with related methods that are not directly based on CV tempering.

Comparison with H-REMD of Curuksu and Zacharias. Our method is closely related to the one introduced in ref (31). There, a bias potential aimed at disfavoring the most probable rotamers is manually constructed and applied on several replicas using a scaling factor. This bias disfavors the major minima but does not ensure a proper compensation of the free-energy barriers, as their positions and magnitudes are not a priori known. The main advantage of RECT is that several low-dimensional bias potentials are built with a self-consistent procedure so that the technique can be straightforwardly applied to a large number of degrees of freedom.

Comparison with Bias-Exchange Metadynamics. In bias-exchange metadynamics every replica performs an independent metadynamics simulation so that one CV at a time is feeling the flattening potential. Thus, it is typically used with a relatively small number of ad hoc designed CVs capable of describing the relevant conformational transitions. On the other hand, RECT is designed to be used with a very large number of dummy CVs with little a priori information and to bias them concurrently to exploit their cooperation in enhancing conformational sampling. For this reason, the two approaches are complementary and could even be combined in a multidimensional replica exchange suitable for a massively parallel environment.

Comparison with Solute Tempering and Related Methods. In replica exchange with solute tempering the solute Hamiltonian is scaled so as to obtain an effect equivalent to a rise in the simulation temperature.11,13 Any set of atoms can be identified as solute, giving the opportunity to enhance sampling in a region localized in space.52,63 This requires modifying charges of the enhanced region, with long-range effects and sometimes affecting fundamental properties such as hydrophobicity. In our method, the bias potentials act on precisely selected degrees of freedom minimally perturbing their coupling with the rest of the system. Moreover, the bias is adaptively built so as to compensate the free energy and not the potential energy, so that with properly chosen CVs it could be used to compensate entropic barriers.

Comparison with Hyperdynamics and Accelerated MD. In these methods the potential energy of the system is modified so as to decrease the probability to sample minima on the potential energy.27,64,65 On the contrary, RECT employs a bias which is related with the free energy so as to achieve a flatter histogram on the selected CVs.

We finally remark that RECT, although formally based on the a priori choice of a set of CVs, typically requires the same amount of information as methods not based on CVs. Indeed, as we have shown, the method can be easily applied to a very large number of CVs, virtually including by construction all the slow degrees of freedom of the system. Additionally, when a few relevant CVs can be identified based on chemical intuition, RECT can be straightforwardly combined with standard metadynamics similarly to parallel tempering66 or solute tempering.67

Conclusion

Replica exchange with collective-variable tempering (RECT) has been here proposed as a novel and flexible enhanced-sampling method. RECT takes advantage of the adaptive nature of well-tempered metadynamics to build bias potentials that compensate free-energy barriers. The flattening of the barriers is modulated by the well-tempered factor γ, and the chosen collective variables (CVs) are effectively kept at a higher temperature. The biasing potentials are built combining concurrent low-dimensional metadynamics protocols so as to be usable on a very large number of CVs. Multiple replicas are then used so as to smoothly interpolate between a highly biased, ergodic simulation and an unbiased one (γ = 1). The number of required replicas scales with the square root of the number of chosen CVs for a fixed range of γ factors. This allows a very large number of CVs to be biased, so that virtually all the relevant transitions can be accelerated. The CVs used here were mostly dihedral angles, which exhibit relevant barriers in many biomolecular conformational transitions, but the method can be used with any CV. The application of this technique to the dALA in water shows that the CV probability distributions are effectively flattened by the action of the bias potentials and unbiased statistics is correctly recovered. In the case of the tetranucleotide conformational sampling is greatly enhanced since RECT effectively overcomes the high free-energy barriers of the χ angle transitions that hindered the conformational sampling at room temperature. RECT is available in PLUMED, and a sample input file is provided in Figure S13. RECT is a promising tool to enhance the exploration of the conformational space in highly flexible biomolecular systems such as RNA, proteins, or RNA/protein complexes.

Acknowledgments

Alessandro Barducci, Massimiliano Bonomi, Alessandro Laio, Jim Pfaendtner, and Omar Valsson are acknowledged for carefully reading the manuscript and providing several useful suggestions. Michele Parrinello and Pasquale Pisani are acknowledged for useful discussions. The research leading to these results has received funding from the European Research Council under the European Union’s Seventh Framework Programme (FP/2007-2013)/ERC Grant Agreement n. 306662, S-RNA-S.

Supporting Information Available

Examples of biasing concurrently different CVs on model potentials with independent (Figure S1) and correlated variables (Figure S2). RMSD distributions of dALA from trajectories across the replica ladder (Figure S3). Histograms of dALA Ψand Φdihedral angles for all replicas (Figure S4). The projection of each replica trajectory on the dihedral free-energy landscape (Figure S5). RMSD distributions of RNA tetranucleotide from trajectories across the replica ladder (Figure S6). Average exchange ratios vs time (Figure S7). Round-trip times for the tetranucleotide simulations (Figure S8). Time series and histograms of the torsional angles for the REMD lower and higher replicas (Figures S9–S12). Example of a PLUMED 2.1 input file for a RECT simulation (Figure S13). This material is available free of charge via the Internet at http://pubs.acs.org.

The authors declare no competing financial interest.

Supplementary Material

ct5009087_si_001.pdf (11.5MB, pdf)

References

  1. Henzler-Wildman K. A.; Lei M.; Thai V.; Kerns S. J.; Karplus M.; Kern D. Nature 2007, 450, 913–916. [DOI] [PubMed] [Google Scholar]
  2. Boehr D. D.; Nussinov R.; Wright P. E. Nat. Chem. Biol. 2009, 5, 789–796. [DOI] [PMC free article] [PubMed] [Google Scholar]
  3. Mustoe A.; Brooks C. L. III; Al-Hashimi H. Annu. Rev. Biochem. 2014, 83, 441–466. [DOI] [PMC free article] [PubMed] [Google Scholar]
  4. Dror R. O.; Dirks R. M.; Grossman J.; Xu H.; Shaw D. E. Annu. Rev. Biophys 2012, 41, 429–452. [DOI] [PubMed] [Google Scholar]
  5. Dror R. O.; Young C.; Shaw D. E.. In Encyclopedia of Parallel Computing; Padua D., Ed.; Springer: 2011; pp 60–71. [Google Scholar]
  6. Abrams C.; Bussi G. Entropy 2014, 16, 163–199. [Google Scholar]
  7. Kirkpatrick S.; Gelatt C. D. Jr.; Vecchi M. P. Science 1983, 220, 671–680. [DOI] [PubMed] [Google Scholar]
  8. Marinari E.; Parisi G. Europhys. Lett. 1992, 19, 451. [Google Scholar]
  9. Sugita Y.; Okamoto Y. Chem. Phys. Lett. 1999, 314, 141–151. [Google Scholar]
  10. Fukunishi H.; Watanabe O.; Takada S. J. Chem. Phys. 2002, 116, 9058. [Google Scholar]
  11. Liu P.; Kim B.; Friesner R. A.; Berne B. Proc. Natl. Acad. Sci. U. S. A. 2005, 102, 13749–13754. [DOI] [PMC free article] [PubMed] [Google Scholar]
  12. Liu P.; Huang X.; Zhou R.; Berne B. J. Phys. Chem. B 2006, 110, 19018–19022. [DOI] [PMC free article] [PubMed] [Google Scholar]
  13. Wang L.; Friesner R. A.; Berne B. J. Phys. Chem. B 2011, 115, 9431–9438. [DOI] [PMC free article] [PubMed] [Google Scholar]
  14. Hukushima K.; Nemoto K. J. Phys. Soc. Jpn. 1996, 65, 1604–1608. [Google Scholar]
  15. Nymeyer H. J. Chem. Theory Comput. 2008, 4, 626–636. [DOI] [PubMed] [Google Scholar]
  16. Denschlag R.; Lingenheil M.; Tavan P. Chem. Phys. Lett. 2008, 458, 244–248. [Google Scholar]
  17. Laghaei R.; Mousseau N.; Wei G. J. Phys. Chem. B 2010, 114, 7071–7077. [DOI] [PubMed] [Google Scholar]
  18. Bergonzo C.; Henriksen N. M.; Roe D. R.; Swails J. M.; Roitberg A. E.; Cheatham T. E. III J. Chem. Theory Comput. 2013, 10, 492–499. [DOI] [PMC free article] [PubMed] [Google Scholar]
  19. Torrie G. M.; Valleau J. P. J. Comput. Phys. 1977, 23, 187–199. [Google Scholar]
  20. Huber T.; Torda A. E.; van Gunsteren W. F. J. Comput.-Aided Mol. Des. 1994, 8, 695–708. [DOI] [PubMed] [Google Scholar]
  21. Grubmüller H. Phys. Rev. E: Stat., Nonlinear, Soft Matter Phys. 1995, 52, 2893. [Google Scholar]
  22. Darve E.; Pohorille A. J. Chem. Phys. 2001, 115, 9169–9183. [Google Scholar]
  23. Laio A.; Parrinello M. Proc. Natl. Acad. Sci. U. S. A. 2002, 99, 12562–12566. [DOI] [PMC free article] [PubMed] [Google Scholar]
  24. Barducci A.; Bussi G.; Parrinello M. Phys. Rev. Lett. 2008, 100, 020603. [DOI] [PubMed] [Google Scholar]
  25. Piana S.; Laio A. J. Phys. Chem. B 2007, 111, 4553–4559. [DOI] [PubMed] [Google Scholar]
  26. Henriksen N. M.; Roe D. R.; Cheatham T. E. III J. Phys. Chem. B 2013, 117, 4014–4027. [DOI] [PMC free article] [PubMed] [Google Scholar]
  27. Roe D. R.; Bergonzo C.; Cheatham T. E. III J. Phys. Chem. B 2014, 118, 3543–3552. [DOI] [PMC free article] [PubMed] [Google Scholar]
  28. Straatsma T.; McCammon J. J. Chem. Phys. 1994, 101, 5032–5039. [Google Scholar]
  29. Kannan S.; Zacharias M. Proteins: Struct., Funct., Bioinf. 2007, 66, 697–706. [DOI] [PubMed] [Google Scholar]
  30. Kannan S.; Zacharias M. Proteins: Struct., Funct., Bioinf. 2009, 76, 448–460. [DOI] [PubMed] [Google Scholar]
  31. Curuksu J.; Zacharias M. J. Chem. Phys. 2009, 130, 104110. [DOI] [PubMed] [Google Scholar]
  32. Kara M.; Zacharias M. Biophys. J. 2013, 104, 1089–1097. [DOI] [PMC free article] [PubMed] [Google Scholar]
  33. Mishra S. K.; Kara M.; Zacharias M.; Koča J. Glycobiology 2014, 24, 70–84. [DOI] [PubMed] [Google Scholar]
  34. Branduardi D.; Bussi G.; Parrinello M. J. Chem. Theory Comput. 2012, 8, 2247–2254. [DOI] [PubMed] [Google Scholar]
  35. Dama J. F.; Parrinello M.; Voth G. A. Phys. Rev. Lett. 2014, 112, 240602. [DOI] [PubMed] [Google Scholar]
  36. Rosso L.; Tuckerman M. E. Mol. Simul. 2002, 28, 91–112. [Google Scholar]
  37. VandeVondele J.; Rothlisberger U. J. Phys. Chem. B 2002, 106, 203–208. [Google Scholar]
  38. Maragliano L.; Vanden-Eijnden E. Chem. Phys. Lett. 2006, 426, 168–175. [Google Scholar]
  39. Deighan M.; Bonomi M.; Pfaendtner J. J. Chem. Theory Comput. 2012, 8, 2189–2192. [DOI] [PubMed] [Google Scholar]
  40. Chipot C.; Lelièvre T. SIAM J. Appl. Math. 2011, 71, 1673–1695. [Google Scholar]
  41. Tuckerman M.; Berne B. J.; Martyna G. J. J. Chem. Phys. 1992, 97, 1990–2001. [Google Scholar]
  42. Ferrarotti M.; Bottaro S.; Pérez-Villa A.; Bussi G. J. Chem. Theory Comput. 2015, 11, 139–146. [DOI] [PubMed] [Google Scholar]
  43. Hornak V.; Abel R.; Okur A.; Strockbine B.; Roitberg A.; Simmerling C. Proteins: Struct., Funct., Bioinf. 2006, 65, 712–725. [DOI] [PMC free article] [PubMed] [Google Scholar]
  44. Lindorff-Larsen K.; Piana S.; Palmo K.; Maragakis P.; Klepeis J.; Dror R.; Shaw D. Proteins: Struct., Funct., Bioinf. 2010, 78, 1950–1958. [DOI] [PMC free article] [PubMed] [Google Scholar]
  45. Jorgensen W. L. J. Am. Chem. Soc. 1981, 103, 335–340. [Google Scholar]
  46. Hess B.; Bekker H.; Berendsen H. J.; Fraaije J. G. J. Comput. Chem. 1997, 18, 1463–1472. [Google Scholar]
  47. Hess B. J. Chem. Theory Comput. 2008, 4, 116–122. [DOI] [PubMed] [Google Scholar]
  48. Bussi G.; Donadio D.; Parrinello M. J. Chem. Phys. 2007, 126, 014101. [DOI] [PubMed] [Google Scholar]
  49. Darden T.; York D.; Pedersen L. J. Chem. Phys. 1993, 98, 10089–10092. [Google Scholar]
  50. Hess B.; Kutzner C.; van der Spoel D.; Lindahl E. J. Chem. Theory Comput. 2008, 4, 435–447. [DOI] [PubMed] [Google Scholar]
  51. Tribello G. A.; Bonomi M.; Branduardi D.; Camilloni C.; Bussi G. Comput. Phys. Commun. 2014, 185, 604–613. [Google Scholar]
  52. Bussi G. Mol. Phys. 2014, 112, 379–384. [Google Scholar]
  53. Pérez A.; Marchán I.; Svozil D.; Sponer J.; Cheatham T. E. III; Laughton C. A.; Orozco M. Biophys. J. 2007, 92, 3817–3829. [DOI] [PMC free article] [PubMed] [Google Scholar]
  54. Zgarbová M.; Otyepka M.; Šponer J.; Mládek A.; Banáš P.; Cheatham T. E. III; Jurecka P. J. Chem. Theory Comput. 2011, 7, 2886–2902. [DOI] [PMC free article] [PubMed] [Google Scholar]
  55. Yildirim I.; Stern H. A.; Tubbs J. D.; Kennedy S. D.; Turner D. H. J. Phys. Chem. B 2011, 115, 9261–9270. [DOI] [PMC free article] [PubMed] [Google Scholar]
  56. Li D.-W.; Brüschweiler R. Phys. Rev. Lett. 2009, 102, 118108. [DOI] [PubMed] [Google Scholar]
  57. Apostolakis J.; Ferrara P.; Caflisch A. J. Chem. Phys. 1999, 110, 2099–2108. [Google Scholar]
  58. Chen A. A.; García A. E. Proc. Natl. Acad. Sci. U. S. A. 2013, 110, 16820–16825. [DOI] [PMC free article] [PubMed] [Google Scholar]
  59. Neidle S.Principles of nucleic acid structure; Academic Press: 2010. [Google Scholar]
  60. Chodera J. D.; Swope W. C.; Pitera J. W.; Seok C.; Dill K. A. J. Chem. Theory Comput. 2007, 3, 26–41. [DOI] [PubMed] [Google Scholar]
  61. Valsson O.; Parrinello M. Phys. Rev. Lett. 2014, 113, 090601. [DOI] [PubMed] [Google Scholar]
  62. Maragliano L.; Vanden-Eijnden E. J. Chem. Phys. 2008, 128, 184110. [DOI] [PubMed] [Google Scholar]
  63. Affentranger R.; Tavernelli I.; Di Iorio E. E. J. Chem. Theory Comput. 2006, 2, 217–228. [DOI] [PubMed] [Google Scholar]
  64. Voter A. F. Phys. Rev. Lett. 1997, 78, 3908. [Google Scholar]
  65. Hamelberg D.; Mongan J.; McCammon J. A. J. Chem. Phys. 2004, 120, 11919–11929. [DOI] [PubMed] [Google Scholar]
  66. Bussi G.; Gervasio F. L.; Laio A.; Parrinello M. J. Am. Chem. Soc. 2006, 128, 13435–13441. [DOI] [PubMed] [Google Scholar]
  67. Camilloni C.; Provasi D.; Tiana G.; Broglia R. A. Proteins: Struct., Funct., Bioinf. 2008, 71, 1647–1654. [DOI] [PubMed] [Google Scholar]

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

ct5009087_si_001.pdf (11.5MB, pdf)

Articles from Journal of Chemical Theory and Computation are provided here courtesy of American Chemical Society

RESOURCES