continuous electrostatics solvation energies calculation method for biomolecules simulations.
Abstract
We report a development of a new fast surfacebased method for numerical calculations of solvation energy of biomolecules with a large number of charged groups. The procedure scales linearly with the system size both in time and memory requirements, is only a few percents wrong for any molecular configurations of arbitrary sizes, gives explicit value for the reaction field potential at any point, provides both the solvation energy and its derivatives suitable for Molecular Dynamics simulations. The method works well both for large and small molecules and thus gives stable energy differences for quantities such as solvation energies of molecular complex formation.
Solvent interactions play an essential role in Nature in determining electrostatic potential energies of molecular conformations, charge states of proteins, dissociation constants of small molecules, and binding properties of proteinligand complexes. A solvation energy calculation for a moleculesized object has always been and still is a challenging problem. The most accurate approach is, apparently, a large scale MD simulation Rapaport (2004a, b) of the body of interest immersed in a tank of water molecules in a realistic force field or even within quantum mechanical settings. Though such an approach may in principle provide ultimately accurate predictions, the calculations are time consuming and pose a number of specific problems stemming, e.g. from long relaxation times of water clusters. One possible way to bridge the “simulation gap” is to employ different types of continuous solvation models. Fortunately, water is characterized by a very large value of dielectric constant and therefore the reaction field of water molecules is collective in nature. Although realistic properties of molecular interactions depend both on shortscale water molecules alignment and on the longrange dipoledipole interactions at the same time Fedichev and Menshikov (2008, 2006), purely electrostatic models, such as PoissonBoltzmann equation solvers Baker et al. (2001); Schäfer et al. (2000), turned out to be very successful in various applications.
Even within the realm of continuous electrostatic models there are numerous approaches in use to calculate the polar part of the solvation energies. Popular techniques span from finite element methods (FEM, Baker et al. (2001); Schäfer et al. (2000); Bordner and Huber (2003); Boschitsch et al. (2002); Boschitsch and Fenley (2004); Horvath et al. (1996); Vorobjev and Scheraga (1997); Zhou (1993)) to various types of Generalized Born (GB) approximations Schaefer and Karplus (1996); Still et al. (1990); Tsui and Case (2001); Simonson (2001); Lee et al. (2002); Hassan and Mehler (2002); Rashin (1990); Feig et al. (2004a); Onufriev et al. (2002). A numerical FEM solution to the PoissonBoltzmann equation (PBE) is a formally fast (the calculation time and memory scale with being the number of particles in the system) and is a rigorous attempt to solve the electrostatics problem. On the other hand FEM involve a good numerical overhead and in practice GB approximations are faster, in spite of the fact that it normally takes operations to calculate GB energy. Unfortunately GB approximations are very rough and that is why GB calculations work well only for small and medium sized molecules, whereas FEM methods can, although at expense of a numerical complexity, be applied to very large systems. The particular boundary between the applicability of the two methods depends on the balance of speed the amount of details and accuracy required in a specific application.
In this Letter we push forward our recently established connection Fedichev et al. (2009) between the Generalized Born (GB) models Still et al. (1990); Tsui and Case (2001); Lee et al. (2002); Bashford and Case (2000); Onufriev et al. (2002); Lee et al. (2003); Mongan et al. (2007) and the boundary integral formulation of the electrostatics problem Schäfer et al. (2000). We show that the GB solvation energy can in fact be calculated in linear time and memory for arbitrary system of charges. Using postCoulomb Field Approximation (postCFA) for the Born radii calculations we report a development of a new fast surfacecharges density based method (Surface Charges GB, SCGB) for numerical solution of the PoissonBoltzmann equations and use it for the solvation energy calculations for biomolecules with a large number of charged groups. The procedure turns out only a few percents wrong for realistic molecular configurations, gives explicit value for the reaction field potential at any point within the system and provides both the solvation energy and its derivatives suitable for Molecular Dynamics (MD) simulations. The method works well both for large and small molecules and thus gives stable energy differences for quantities such as solvation energies of a molecular complex formation.
Modern implicit water methods trade accuracy and physical sophistication for speed and usually are based on assumptions Lee et al. (2002); Mongan et al. (2007) traceable back to the original approach of Born Born (1920). Consider a molecule modeled as a system of charges confined within a water cavity as shown on Fig.1. The shape of the cavity can be either obtained by displacing the water out of all the atomic volumes and then collecting the atomic volumes into the molecular volume Lee et al. (2002, 2003); Tjong and Zhou (2007). An alternative can be the molecular volume separated from the water bulk by a sufficiently smooth interface surface containing all the atoms and having no unphysical waterfilled caverns inside Lee and Richards (1971); Richards (1977). Neither of approaches is ideal, though the surface based methods often produce better molecular volumes. The polar contribution to the solvation energy (the solvation energy) of the molecule is given by
(1) 
where is the so called reaction field potential produced by the water polarization charges as explained in e.g. Schäfer et al. (2000). Here the Latin indices enumerate the charges, is the total number of charges, and are the charges and the positions of the ions.
The actual calculation of the reaction field potential depends on further assumptions and may be performed in a number of ways. Since the dielectric constant in water is large (), the electric potential on the molecular surface vanishes to a very good accuracy: (the so called “ideal conductor” approximation). Therefore the polarization charges are confined to the interface and the reaction field can be approximated as
(2) 
Here is the surface density of the polarization charges and is the molecular surface element. The total electric potential at a given point is where
is the Coulomb potential generated by the molecular charges. The surface charge density satisfies the integral equation
(3) 
If the molecular surface is properly discreticized then both the polarization charge density and the solvation energy can be obtained iteratively in operations with the help of either FFT or fast multipole methods for fast matrixvector products and proper preconditioners Greengard and Rokhlin (1987); Sagui and Darden (1999); Beckers et al. (1998); De Leeuw et al. (1980); Essmann et al. (1995). In practice the number of iterations required for full convergence is far from a few and the whole calculation is nevertheless fairly computationally demanding. Another problem arises from the fact that applications such as MD simulations or minimizations require derivatives with respect to the atoms coordinates. Naturally, finding a derivative of an iteratively obtained solution is not an easy task. That is why a substantial effort was put in finding reasonable approximate solutions to Eq. (3) as described in the recent publications Bardhan (2008); Bardhan et al. (2009) and the refs. therein.
Historically Generalized Born (GB) methods provide an apparently different way of the solvation energy calculations Schaefer and Karplus (1996); Still et al. (1990); Tsui and Case (2001); Simonson (2001); Lee et al. (2002); Hassan and Mehler (2002); Rashin (1990); Feig et al. (2004a); Onufriev et al. (2002). In our recent work Fedichev et al. (2009) we established the link between the surface electrostatics and GB models. It turns out that GB models can be also used to provide the reaction field potential approximation within the molecule and to calculate the polarization charge density. To do that we reintroduce GB models following our presentation in Fedichev et al. (2009) using the simplest Kirkwoodlike form of the reaction field potential Kirkwood (1934); Grycuk (2003)
(4) 
where
and is a properly chosen function. Specific expressions for the function are different in different models and are expressed either in terms of either volume Schaefer and Karplus (1996); Still et al. (1990); Tsui and Case (2001); Simonson (2001); Lee et al. (2002); Hassan and Mehler (2002); Rashin (1990); Feig et al. (2004a); Onufriev et al. (2002) or surface integrals Feig et al. (2004b); Lee et al. (2003, 2002); Tjong and Zhou (2007). The values of the function at the positions of the charges are called the respective Born radii, of the ions. The surface and the volume integral formulation dichotomy of GB models has a long history and the models defined with a help of properly chosen molecular surfaces (see e.g. Lee and Richards (1971); Richards (1977)) have a good number of practical advantages Romanov et al. (2004). Normally the polar part of the solvation energy is obtained by plugging the expression from the Eq. (4) for the reaction field potential into the Eq. (1) Kirkwood (1934); Grycuk (2003); Fedichev et al. (2009):
(5) 
where , is the dielectric constant of the molecular interior. The expression implies double summation over the molecular charges and requires operations to compute.
One of the simplest way to calculate the Born radii comes from the so called Coulomb field approximation (CFA) Schaefer and Karplus (1996); Gilson and Zhou (2007); Schlick et al. (1999); Bardhan (2008); Bardhan et al. (2009): the electric displacement vector in a nonuniform medium is taken as that in the vacuum. The CFA is wrong for the ions next to the protein boundary Lee et al. (2002, 2003); Mongan et al. (2007); Grycuk (2003), which is a problem indeed, since most of the charges within typical biomolecules are located next to the molecular surfaces. There are a few ways to go beyond the CFA and obtain a more accurate approximation. The first class of the models was introduces in a number of works Lee et al. (2002); Mongan et al. (2007); Lee et al. (2003); Grycuk (2003); Fedichev et al. (2009); Romanov et al. (2004); Ghosh et al. (1998); Sigalov et al. (2005) in either of the equivalent volume or surface integral formulations
(6) 
where , , and is a (variational) parameter. The integration over the water bulk in the middle of (6) is transformed to the equivalent boundary integral form in a standard way with the help of the Gauss theorem Ghosh et al. (1998). Another important model is given by
(7) 
where is the properly chosen constant. The special choice of in the model (6) and the two models described by Eq. (7) with and are exact for an arbitrary system of charges within a spherical “molecule” Fedichev et al. (2009). Though the specific choice of the Born radii method is not important for the following considerations, we naturally prefer these “inherently accurate” models and call them SCGB (Eq. (6) with ) and SCGB(3) or SCGB(4) (Eq. (7) with ).
To obtain a faster method we suggest to use the model potential (4) to calculate the polarization charge density on the molecular surface from the electrostatic boundary condition in a standard way
(8) 
Next to the molecular surface () the functions in each of our models vanishes, , where is the distance from a given point to the surface Fedichev et al. (2009). Therefore
(9) 
Once the surface charge density is known, we can use Eq. (2) to obtain the solvation energy in the same way as if is a solution of the surface electrostatics. In what follows we call the method Surface Charges Generalized Born (SCGB).
Let us summarize the solvation energy calculation algorithm in a few lines:

given a set of molecular charges located at the positions and a useful discretization of the surface, representing the moleculewater interface, we calculate first the set of Born radii with the help of the surface integration according to either of Eq. (6) and Eq. (7) with properly chosen values of or .

as soon as the Born radii are ready, we calculate the surface charge density at every point on the molecular surface according to Eq.(9).
Although the apparent computational complexity of the outlined procedures is (or better say in ), where is the number of points in the discrete representation of the molecular surfaces, the discrete summation involves only the coordinates differences and thus the calculation can be performed in operations with the help of either FFT or fast multipole methods.
SCGB approximation is by no means exact, , and hence there can be (and in fact there are) superficial polarization charges in the water bulk and within the molecule. Let us perform a few simple model calculation to see how accurate the suggested SCGB procedure can be. Consider first a charge placed somewhere at the position within a sphere of a radius . Then, a simple calculation yields
(10) 
and the model expression for the electrostatic potential coincide with the exact result e.g. from Landau et al. (1961)
(11) 
with . In the same way the surface charge density calculated from this expression for the potential according to Eq.(8) coincides with that given by Eq. (9):
Since is an additive quantity, SCGB approximations gives the exact result for for an arbitrary charge distribution within a sphere. An interesting case corresponds to a sphere with , that is a very large molecule occupying a halfspace.
SCGB approach can not, of course, be exact for an arbitrary molecule geometry. Consider another practically important example: a plain layerlike “molecule” (or membrane) of the thickness surrounded by the continuous water on both sides with a charge placed inside the layer at the distance from one of the water interface planes. The exact result for the solvation energy of the system is Stratton (1941); Jackson and Fox (1999)
(12) 
where . The results for the layerlike molecule in all the three SCGB approaches are:
We compared them with the exact result of Eq. (12) on Figure 2. All the quantities approach the exact value on the molecule boundaries () and differ from the exact solution in the middle of the layer.
Another challenging case is the calculation for a single charge placed within a wedge made of the two perpendicular infinite walls (the “” and “” planes). The SCGB results are
where is the azimuthal angle between the position of a charge and the plane, is the distance separating the charge from the wedge. The results should be compared with the exact solvation energy
The error can be analyzed by observing the ratio , which is the largest at and
The measures of the error are reasonable though of course not perfect. To build more confidence we have also performed the calculations for a charge placed on the axis of an infinite cylinder of the radius and in the center of a cube of the size (see Table 1) with roughly the same results.
Exact  SCGB/Ex  SCGB(3)/EX  SCGB(4)/EX  

cylinder  
cube 
All the calculations presented in this Section so far may be fair but concern only a few oversimplified examples produced for model systems with idealized geometries. To judge on the actual performance of the method we turn to a practically interesting realistic system: solvation energy calculations for neuraminidase protein (pdb accession code ). The molecule is composed of amino acids and, after all the hydrogen atoms added, has atoms. The results of the calculations are represented on Fig. 3. The horizontal axis represents the Born radii taken from “exact” solvation energy using the “definition”
The quantity was found “exactly” by solving FEM version of Eq. (3). The vertical axis shows the Born radii obtained by our SCGB(3) model with the help of (7) with . The results of the calculations agree pretty well in the small Born radii region and diverge in the protein center (large Born radii region). This behavior is well expected, since it is exactly the center of a large molecule which is the region where the divergence between the SCGB and exact solvation energy is the most (compare e.g. with Fig. 2).
The singlecharge calculations represented on Fig. 3 is an interesting but not an ultimate test of the model. What counts in realistic approximations is of course the solvation energy of a large molecule with a complicated shape of the molecular surface and a sophisticated atomic charge distribution. We applied all the four SCGB models to a set of proteins from the Quantum Pharmaceuticals Binding library and presented the results (the data on the vertical axis) in correlation with the solvation energy obtained with a surface FEM method (the data on the horizontal axis) on Figure 4. The blue dots show the performance of SCGB with the Born radii obtained with the standard CFA formulas. The CFAbased method fails pretty miserably, whereas all the other SCGB are in good agreement with the “exact” FEM calculations. Although all three postCFA SCGB methods are nearly all as good as each other, the green dots representing the model give a somewhat better approximation.
The reason behind the distinction of the model may stem from the “inherent” absolute neutrality of the system of the protein and the water polarization charges in the model. Indeed, Eq. (9) for the surface charges density combined with Eq. (7) with give ensure overall neutrality of the system
(13) 
for arbitrary surface geometry and the charges distributions. This does not tell of course that the other two SCGB models are much worse, the abilities of the methods to recover correct solvation energies are approximately the same.
Few concluding remarks should be placed here. Obviously SCGB is not able to provide exact solutions to the electrostatics problem. In fact in many of the practical applications this may well not be an issue: genuine water environment is neither continuous or describable in terms of simple electrostatics. SCGB is clearly computationally superior to classic GB implementations both in speed and accuracy since the calculations can be done in instead of . In fact, the real comparison should be made to iterative surface electrostatic solvers, which can also be made fast. The advantage comes from the fact that SCGB solution can be obtained in a number of steps roughly equal to the number of operations required for a single iteration of surface based FEM electrostatics solver. Another advantage of SCGB stems from availability of numerical derivatives for any surface implementation with surface areas and normals.
The authors are indebted to Quantum Pharmaceuticals for support. The solvation energy contribution introduced this report is implemented in a number of Quantum Pharmaceuticals models and employed in Quantum’s drug discovery applications. PCT application is filed.
References
 Rapaport (2004a) D. Rapaport, The art of molecular dynamics simulation (Cambridge University Press, 2004a).
 Rapaport (2004b) D. Rapaport, The Art of Molecular Dynamics Simulation (Cambridge University Press, 2004b).
 Fedichev and Menshikov (2008) P. Fedichev and L. Menshikov, eprint arXiv: 0808.0991 (2008).
 Fedichev and Menshikov (2006) P. Fedichev and L. Menshikov, Arxiv preprint condmat/0601129 (2006).
 Baker et al. (2001) N. Baker, D. Sept, S. Joseph, M. Holst, and J. McCammon, Proceedings of the National Academy of Sciences p. 181342398 (2001).
 Schäfer et al. (2000) A. Schäfer, A. Klamt, D. Sattel, J. Lohrenz, and F. Eckert, Physical Chemistry Chemical Physics 2, 2187 (2000).
 Bordner and Huber (2003) A. Bordner and G. Huber, Journal of computational chemistry 24 (2003).
 Boschitsch et al. (2002) A. Boschitsch, M. Fenley, and H. Zhous, J. Phys. Chem. B 106, 2741 (2002).
 Boschitsch and Fenley (2004) A. Boschitsch and M. Fenley, Journal of Computational Chemistry 25, 935 (2004).
 Horvath et al. (1996) D. Horvath, D. Van Belle, G. Lippens, and S. Wodak, The Journal of Chemical Physics 104, 6679 (1996).
 Vorobjev and Scheraga (1997) Y. Vorobjev and H. Scheraga, Journal of computational chemistry 18 (1997).
 Zhou (1993) H. Zhou, Biophysical Journal 65, 955 (1993).
 Schaefer and Karplus (1996) M. Schaefer and M. Karplus, J. Phys. Chem 100, 1578 (1996).
 Still et al. (1990) W. Still, A. Tempczyk, R. Hawley, and T. Hendrickson, J. Am. Chem. Soc 112, 6127 (1990).
 Tsui and Case (2001) V. Tsui and D. Case, Biopolymers (Nucl. Acid. Sci.) 56, 275 (2001).
 Simonson (2001) T. Simonson, Current Opinion in Structural Biology 11, 243 (2001).
 Lee et al. (2002) M. Lee, F. Salsbury Jr, and C. Brooks III, The Journal of Chemical Physics 116, 10606 (2002).
 Hassan and Mehler (2002) S. Hassan and E. Mehler, Proteins Structure Function and Genetics 47, 45 (2002).
 Rashin (1990) A. Rashin, J. Phys. Chem 94, 1725 (1990).
 Feig et al. (2004a) M. Feig, A. Onufriev, M. Lee, W. Im, D. Case, and C. Brooks III, Journal of Computational Chemistry 25, 265 (2004a).
 Onufriev et al. (2002) A. Onufriev, D. Case, and D. Bashford, Journal of Computational Chemistry 23, 1297 (2002).
 Fedichev et al. (2009) P. Fedichev, G. Getmantsev, and L. Menshikov, Arxiv preprint arXiv:0908.0634 (2009).
 Bashford and Case (2000) D. Bashford and D. Case, Annual Review of Physical Chemistry 51, 129 (2000).
 Lee et al. (2003) M. Lee, M. Feig, F. Salsbury, and C. Brooks, Journal of computational chemistry 24, 1348 (2003).
 Mongan et al. (2007) J. Mongan, W. SvrcekSeiler, and A. Onufriev, The Journal of Chemical Physics 127, 185101 (2007).
 Born (1920) M. Born, Z. Phys 1, 45 (1920).
 Tjong and Zhou (2007) H. Tjong and H. Zhou, Journal of Physical Chemistry BCondensed Phase 111, 3055 (2007).
 Lee and Richards (1971) B. Lee and F. Richards, Journal of Molecular Biology 55, 379 (1971).
 Richards (1977) F. Richards, Annual Review of Biophysics and Bioengineering 6, 151 (1977).
 Greengard and Rokhlin (1987) L. Greengard and V. Rokhlin, Journal of Computational Physics 73, 325 (1987).
 Sagui and Darden (1999) C. Sagui and T. Darden, Annual review of biophysics and biomolecular structure 28, 155 (1999).
 Beckers et al. (1998) J. Beckers, C. Lowe, and S. De Leeuw, Molecular Simulation 20, 369 (1998).
 De Leeuw et al. (1980) S. De Leeuw, J. Perram, and E. Smith, Proceedings of The Royal Society of London. Series A, Mathematical and Physical Sciences pp. 27–56 (1980).
 Essmann et al. (1995) U. Essmann, L. Perera, M. Berkowitz, T. Darden, H. Lee, and L. Pedersen, Journal of Chemical Physics 103, 8577 (1995).
 Bardhan (2008) J. Bardhan, The Journal of Chemical Physics 129, 144105 (2008).
 Bardhan et al. (2009) J. Bardhan, M. Knepley, and M. Anitescu, The Journal of Chemical Physics 130, 104108 (2009).
 Kirkwood (1934) J. Kirkwood, J. Chem. Phys 2, 351 (1934).
 Grycuk (2003) T. Grycuk, The Journal of Chemical Physics 119, 4817 (2003).
 Feig et al. (2004b) M. Feig, A. Onufriev, M. Lee, W. Im, D. Case, and C. Brooks, Journal of Computational Chemistry 25, 265 (2004b).
 Romanov et al. (2004) A. Romanov, S. Jabin, Y. Martynov, A. Sulimov, F. Grigoriev, and V. Sulimov, J. Phys. Chem. A 108, 9323 (2004).
 Gilson and Zhou (2007) M. Gilson and H. Zhou, Annu. Rev. Biophys. Biomol. Struct 36, 21 (2007).
 Schlick et al. (1999) T. Schlick, R. Skeel, A. Brunger, L. Kalé, J. Board, J. Hermans, and K. Schulten, Journal of Computational Physics 151, 9 (1999).
 Ghosh et al. (1998) A. Ghosh, C. Rapp, and R. Friesner, J. Phys. Chem. B 102, 10983 (1998).
 Sigalov et al. (2005) G. Sigalov, P. Scheffel, and A. Onufriev, The Journal of chemical physics 122, 094511 (2005).
 Landau et al. (1961) L. Landau, E. Lifshitz, and A. King, American Journal of Physics 29, 647 (1961).
 Stratton (1941) J. Stratton, Electromagnetic theory (McGrawHill New York, 1941).
 Jackson and Fox (1999) J. Jackson and R. Fox, American Journal of Physics 67, 841 (1999).