Test peptide (3 glutamic acids) - Tegeler 2008 diploma thesis Goe

The third test system extends the second system by an additional glutamic acid. Three glutamic acids are embedded between two alanine forming a simple

56 5.3 Test peptide (3 glutamic acids)

Figure 5.6: The total fraction of deprotonated glutamic acids in the AEAEA peptide, averaged over 3 independent runs of 5ns each.

peptide as illustrated in Figure 5.10. Again a series of titration curve simu-lations is performed to analyze the influence of pH on the overall fraction of deprotonated acid.

Figure 5.11 depicts the titration curves of the test peptide for each of the simulations performed. The curves are again differently shaped compared to the runs simulating one or two glutamic acids. All three curves show a pK_a shift at around pH = 4.25 and around pH = 4.75. The plateaus forming can be seen best in the averaged titration curve shown in Figure 5.12.

It can be assumed that the plateaus are again results of a pK_a shift originated by the deprotonation of the surrounding amino acids as seen before in the peptide with two glutamic acids. If a single amino acids would be simulated around pH = 4.25 the probability to be found in the deprotonated state is 0.5. In the three glutamic acid peptide, each glutamic acid is under the influ-ence of neighboring charges and if the first glutamic acid deprotonates around pH = 4.0 to pH = 4.25 the local environment stabilizes the protonated state for the remaining glutamic acids. Therefore the total fraction of deprotonated glutamic acid does not change with increasing pH until the pH is high enough

0 0.05 0.1 0.15 0.2 0.25 0.3

0 0.2 0.4 0.6 0.8 1

time in a given λ area (in %)

Figure 5.7: Distribution of λ for the first glutamic acid in a constant pH sim-ulation at pH 4.25. The end states around λ = 0 and λ = 1 are equally populated.

58 5.3 Test peptide (3 glutamic acids)

0 0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4 0.45

0 0.2 0.4 0.6 0.8 1

time in a given λ area (in %)

Figure 5.8: Distribtuion of λ for the second glutamic acid at constant pH of 4.25. The ratio between protonated (λ = 0) and deprotonated acid (λ = 1) is suggesting an influence of the first glutamic acid. Instead of equally populating the end states the deprotonated state is favored.

Time (ns)

λλ

Figure 5.9: Trajectory of theλ-particle for the two glutamic acid. The first glu-tamic acid (red curve) is interacting with the second (blue curve) as the second can only deprotonates while the first is protonated, e.g. at 0.3ns, 1.2ns,1.7ns, 3.5ns and 4.4ns.

60 5.3 Test peptide (3 glutamic acids)

Figure 5.10: Test peptide with three directly attached glutamic acids sur-rounded by alanines

0 0.5 1 1.5 2 2.5 3

1 2 3 4 5 6 7 8 9

fraction of deprotonated acid

Trajectory 1 Trajectory 2 Trajectory 3 Theoretical

Figure 5.11: Titration curves of three runs of the three glutamic acid peptide.

Figure 5.12: Combined titration curve over 3 runs of a three glutamic acid peptide.

that the protonation of a second glutamic acid becomes favorable. This condi-tion is found to be around pH = 4.75.

Figure 5.13 supports this argument by three λ-trajectories showing the λ-development over time for each glutamic acid of one run (number 2) at pH = 4.25. As can be seen the protonation states of Glu2 (blue curve) and Glu3 (brown curve) are heavily correlated. IfGlu2 deprotonatesGlu3 is protonates again (e.g. at 0.9nsand 1.7ns). However,Glu1 is most of the simulation time in the protonated state which indicates the stabilization of this state atpH = 4.25 by the remaining two glutamic acids and their averaged charge of ≈ −1e.

Figure 5.14 depicts the titration curve for thisGlu1 and underlines an untypical ratio of ≈ 0.2 protonated and deprotonated acid around pH = 4.25 where a ratio of ≈ 0.5 is expected on the basis of a single glutamic acid. Additionally the curve for higher pH seems to be slightly shifted by half a pK_a unit.

The combined titration curve (Figure 5.12) shows a third stabilized section around apH = 5.75. This could be related to a stabilization of glutamic acid as before. However, this interesting observation was not further investigated yet.

Alternatively the effect could be due to statistical error, which are significantly

62 5.3 Test peptide (3 glutamic acids)

higher in the peptide simulations as compared to isolated gluatmic acids as shown in Figures 5.5 and 4.7.

Time (ps)

λλλ

Figure 5.13: The λ-trajectories for the 3 glutamic acids embedded in the pep-tide in run 2 at a constant pH =pK_a.

64 5.3 Test peptide (3 glutamic acids)

0 0.2 0.4 0.6 0.8 1

1 2 3 4 5 6 7 8 9

fraction of deprotonated acid

Glu 1 Theoretical

Figure 5.14: Titration curve forGlu1 in run 2. The low probability for depro-tonation around pH = 4.25 indicates an influence by the remaining glutamic acids.

Summary

In this thesis an extended λ-dynamics approach allowing constant pH simu-lations was presented. While in classical molecular dynamics simusimu-lations the protonation states of all titratable groups have to be defined in advance, the protonation states of such groups are included as an additional coordinate that is dynamically updated. This dynamical updating is achieved by coupling two Hamiltonians describing the protonated and deprotonated state of the whole system via a coupling parameter λ. By assinging a mass and velocity to λ a virtual particle is introduced on which MD can be performed. The forces governing the dynamics of the λ-particle are the gradient of the free energy landscape with respect to λ. Constant pH simulations are then achieved by additionally accounting for the pH dependent hydration free energy of the vanishing proton to correct the free energy landscape.

Correction to the free energy landscape were required due to the force field offset error: The force field representation of bonds is not correctly modeling the quantum mechanical effects of bond formation as minimal energies stored in the bonds, but simply uses a harmonic approximation with zero minimal energy. Usually this does not impose any problems because free energy cal-culations are performed using a thermodynamical cycle, so only relative free energy differences are computed between branches that have the same force field error. Our approach allows the calculation of a single branch in the ther-modynamical cycle by λ-dynamics. The force field offset errors are corrected using precalculated reference data provided as a parameter to the simulation.

After compensating for the force field effects the pH dependence was included by adding to the free energy landscape a function κ which includes the influ-ences of the hydration free energy of the proton.

The method was implemented into the GROMACS MD simulation software and supports multiple titratable groups. Currently only glutamic acid was parameterized but other amino acids such as lysin or aspartic acid can be sim-ulated as well by providing the corresponding parameters. Yet, the simulation of histidine is still under development due to the complexity of multiple titrat-able sites sharing the same atoms. A solution to this problem could be the definition of multidimensional λ particles.

A first test scenario for the constant pH simulations was the simulation of an isolated glutamic acid in a water box at different pH. By performing 12 runs at different pH values of 5nseach, a titration curve was derived by computing the average extent of protonation for each simulation. The sigmoidal behavior of the theoretical titration curve was qualitatively reproduced. Additionally it was shown, that best results were achieved by performing a series of simulations for each pH value with different starting velocity distributions.

As a next step, titration curves of systems including multiple titratable sites were calculated. The titration curve of a peptide with two glutamic acid showed the formation of a plateau where the relativepK_a of the second glutamic acid increased due to the charge imposed by the deprotonated first glutamic acid.

The same effect could be observed in a peptide containing three glutamic acids directly attached to each other. Two plateaus were formed and a de-tailed analysis of the individual protonation curves showed the influence of the surrounding glutamic acids. Typically, two glutamic acids interact with each other by having in average one deprotonated and the other protonated while the third acid stays most of the time in the protonated state due to the charge imposed by the other two.

Current limitations are twofold. On the one hand residues with more than one titratable site are more difficult to model, as the sites can not be treated independently. The case of Histidine is therefore not yet solved but the use of multidimensional λ particles seems to be a promising approach. On the other hand the technical implementation in the GROMACS software could be greatly improved. The implementation was done introducing as little code changes as possible but the performance is not optimal. Especially the neighbor searching which has to be performed at each step for each group can be optimized to be calculated only every tenth step which in itself’s fastens the simulation by one order of magnitude. Additionally all forces are calculated anew for each λ depending group, but it would suffice to only calculate the δG/δλ at each step which would also lead to a speed gain of around an order of magnitude,

if multiple titratable groups are simulated.

In comparison to other methods allowing constant pH simulations, the pre-sented approach has a series of advantages. After parameterizing the different titratable groups once, no additional effort is required to perform a constant pH simulation and a “black box” usage is possible. The results achievable with the method are encouraging and the additional computational costs are theo-retically minimal. After an optimization the computational costs are below the operations required to simulate a few additional water molecules. While hybrid MC/MD simulations require additional free energy calculations or other com-plex analysis to determine the most probable protonation state, the method presented here is only introducing one more degree of freedom to the system for each titratable group. Compared to the constant pH method presented by Lee et al. [LJI04] the capability of treating explicit solvent has to be highlighted.

The future development will be the analysis and implementation of mul-tidimensional λ to allow the treatment of e.g. histidine. Additionally the parametrization of all amino-acids in different force fields is required to dis-tribute the software and to allow a wide application. With further technical optimizations the additional computational costs can be minimized allowing constant pH simulation to become a standard procedure when performing MD simulations.

[And80] H. C. Andersen. Molecular dynamics simulations at constant pressure and/or temperature. J. Chem. Phys., 72:2384–2393, 1980.

[Ant08] J. M. Antosiewicz. Protonation free energy levels in complex molecular systems. Biopolymers, 89(4)(9999):262–269, 2008.

[BH01] U. B¨orjesson and P. H. H¨unenberger. Explicit-solvent molecu-lar dynamics simulation at constant ph: Methodology and ap-plication to small amines. The Journal of Chemical Physics, 114(22):9706–9719, 2001.

[BKvG02] R. B¨urgi, P. A. Kollman, and W. F. van Gunsteren. Simulat-ing proteins at constant pH: An approach combinSimulat-ing molecu-lar dynamics and Monte Carlo simulation. Proteins: Structure, Function, and Genetics, 47(4):469–480, 2002.

[BMP97] A. M. Baptista, P. J. Martel, and S. B. Petersen. Simulation of protein conformational freedom as a function of pH: constant-pH molecular dynamics using implicit titration. Proteins: Structure, Function, and Genetics, 27(4):523–544, 1997.

[BPvG⁺84] H. J. C. Berendsen, J. P. M. Postma, W. F. van Gunsteren, A. Dinola, and J. R. Haak. Molecular dynamics with coupling to an external bath. The Journal of Chemical Physics, 81(8):3684–

3690, 1984.

[BTS02] A. M. Baptista, V. H. Teixeira, and C. M. Soares. Constant-pH molecular dynamics using stochastic titration. The Journal of Chemical Physics, 117(9):4184–4200, 2002.

70 BIBLIOGRAPHY

[BvdSvD95] H. J. C. Berendsen, D. van der Spoel, and R. van Drunen. Gro-macs: A message-passing parallel molecular dynamics implemen-tation.Computer Physics Communications, 91(1-3):43–56, 1995.

[CP88] D. Chandler and J. K. Percus. Introduction to Modern Statisti-cal Mechanics. Physics Today, 41:114, 1988.

[DA04] M. Dlugosz and M. Antosiewicz. Constant-pH molecular dynam-ics simulations: a test case of succinic acid. Chemical Physics, 302(1-3):161–170, July 2004.

[DCK04] T.E. Cheatham III C.L. Simmerling J. Wang R.E. Duke R. Luo K.M. Merz B. Wang D.A. Pearlman M. Crowley S. Brozell V.

Tsui H. Gohlke J. Mongan V. Hornak G. Cui P. Beroza C.

Schafmeister J.W. Caldwell W.S. Ross D.A. Case, T.A. Dar-den and P.A. Kollman. AMBER 8. University of California, San Francisco, 2004.

[FS96] D. Frenkel and B. Smit. Understanding Molecular Simulation:

From Algorithms to Applications. Academic Press Limited, 1996.

[FTS⁺] M. J. Frisch, G. W. Trucks, H. B. Schlegel, G. E. Scuseria, M. A.

Robb, J. R. Cheeseman, J. A. Montgomery, Jr., T. Vreven, K. N.

Kudin, J. C. Burant, J. M. Millam, S. S. Iyengar, J. Tomasi, V. Barone, B. Mennucci, M. Cossi, G. Scalmani, N. Rega, G. A. Petersson, H. Nakatsuji, M. Hada, M. Ehara, K. Toyota, R. Fukuda, J. Hasegawa, M. Ishida, T. Nakajima, Y. Honda, O. Kitao, H. Nakai, M. Klene, X. Li, J. E. Knox, H. P. Hratchian, J. B. Cross, V. Bakken, C. Adamo, J. Jaramillo, R. Gom-perts, R. E. Stratmann, O. Yazyev, A. J. Austin, R. Cammi, C. Pomelli, J. W. Ochterski, P. Y. Ayala, K. Morokuma, G. A.

Voth, P. Salvador, J. J. Dannenberg, V. G. Zakrzewski, S. Dap-prich, A. D. Daniels, M. C. Strain, O. Farkas, D. K. Malick, A. D.

Rabuck, K. Raghavachari, J. B. Foresman, J. V. Ortiz, Q. Cui, A. G. Baboul, S. Clifford, J. Cioslowski, B. B. Stefanov, G. Liu, A. Liashenko, P. Piskorz, I. Komaromi, R. L. Martin, D. J.

Fox, T. Keith, M. A. Al-Laham, C. Y. Peng, A. Nanayakkara, M. Challacombe, P. M. W. Gill, B. Johnson, W. Chen, M. W.

Wong, C. Gonzalez, and J. A. Pople. Gaussian 03, Revision C.02.

Gaussian, Inc., Wallingford, CT, 2004.

[HGE74] R.W. Hockney, S.P. Goel, and J.W. Eastwood. Quiet High-Resolution Computer Models of a Plasma. Journal of Com-putational Physics, 14:148–158, 1974.

[HKvdSL08] B. Hess, C. Kutzner, D. van der Spoel, and E. Lindahl. Gromacs 4: Algorithms for highly efficient, load-balanced, and scalable molecular simulation. J. Chem. Theory Comput., 2008.

[HMV78] J. P. Hansen, I. R. McDonald, and Pieter B. Visscher. Theory of Simple Liquids. American Journal of Physics, 46(8):871–872, 1978.

[KB05] J. Khandogin and III Brooks, C. L. Constant pH Molecular Dynamics with Proton Tautomerism. Biophys. J., 89(1):141–

157, 2005.

[KI96] X. Kong and C. L. Brooks III. Lambda-dynamics: A new ap-proach to free energy calculations. The Journal of Chemical Physics, 105(6):2414–2423, 1996.

[LC03] G. Li and Q. Cui. PKa Calculations with QM/MM Free Energy Perturbations. Journal of Physical Chemistry B, 107(51):14521–

14528, 2003.

[LHvdS01] E. Lindahl, B. Hess, and D. van der Spoel. Gromacs 3.0: a pack-age for molecular simulation and trajectory analysis. Journal of Molecular Modeling, 7(8):306–317, 2001.

[LJI04] M. S. Lee, F. R. Salsbury Jr., and C. L. Brooks III. Constant-pH molecular dynamics using continuous titration coordinates. Pro-teins: Structure, Function, and Bioinformatics, 56(4):738–752, 2004.

[MC05] J. Mongan and D. A Case. Biomolecular simulations at constant pH. Current Opinion in Structural Biology, 15(2):157–163, 2005.

[MCM04] J. Mongan, D. A. Case, and J. A. McCammon. Constant pH molecular dynamics in generalized Born implicit solvent. Journal of Computational Chemistry, 25(16):2038–2048, 2004.

[MGGM⁺85] J. B. Matthew, F. R. N. Gurd, B. E. Garcia-Moreno, M. A.

Flanagan, Keith L. March, and S. J. Shire. pH-Dependent

Pro-72 BIBLIOGRAPHY

cesses in Proteins. Critical Reviews in Biochemistry and Molec-ular Biology, 18(2):91–197, 1985.

[MP94] J. E. Mertz and B. M. Pettitt. Molecular dynamics at a constant pH. Int. J. Supercomput. Appl. High Perform. Eng., 8(1):47–53, 1994.

[MRR⁺53] N. Metropolis, A. W. Rosenbluth, M. N. Rosenbluth, A. H.

Teller, and E. Teller. Equation of state calculations by fast com-puting machines. The Journal of Chemical Physics, 21(6):1087–

1092, 1953.

[Per78] M. F. Perutz. Electrostatic effects in proteins. Science, 201(4362):1187–1191, 1978.

[RPMP03] D. Reith, M. P¨utz, and F. M¨uller-Plathe. Deriving Effective Mesoscale Potentials from Atomistic Simulations. Journal of Computational Chemistry, 24:1624–1636, 2003.

[RSC05] D. Riccardi, P. Schaefer, and Q. Cui. PKa Calculations in Solu-tion and Proteins with QM/MM Free Energy PerturbaSolu-tion Sim-ulations: A Quantitative Test of QM/MM Protocols. Journal of Physical Chemistry B, 109(37):17715–17733, 2005.

[SCW97] Y.Y. Sham, Z.T. Chu, and A. Warshel. Consistent Calculations of pKa’s of Ionizable Residues in Proteins: Semi-microscopic and Microscopic Approaches. Journal of Physical Chemistry B, 101(22):4458–4472, 1997.

[Sim02] T. Simonson. Gaussian fluctuations and linear response in an electron transfer protein. Proceedings of the National Academy of Sciences, 99(10):6544–6549, 2002.

[SLH01] D. v. d. Spoel, E. Lindahl, and B. Hess. GROMACS User Man-ual. University of Groningen, 3.3 edition, 2001.

[SLH⁺05] D. v. d. Spoel, E. Lindahl, B. Hess, G. Groenhof, A. E. Mark, and H. J. C. Berendsen. Gromacs: Fast, flexible, and free. Journal of Computational Chemistry, 26(16):1701–1718, 2005.

[Sop96] A. K. Soper. Empirical potential Monte Carlo simulation of fluid structure. Journal of Chemical Physics, 202:295–306, 1996.

[TTB⁺98] G. J. Tawa, I. A. Topol, S. K. Burt, R. A. Caldwell, and A. A.

Rashin. Calculation of the aqueous solvation free energy of the proton. The Journal of Chemical Physics, 109(12):4852–4863, 1998.

[TV77] G. M. Torrie and J. P. Valleau. Nonphysical sampling distribu-tions in Monte Carlo free-energy estimation: Umbrella sampling.

Journal of Computational Physics, 23(2):187–199, 1977.

[War79] A. Warshel. Calculations of chemical processes in solutions.

Journal of Physical Chemistry, 83(12):1640–1652, 1979.

[WD08] E. Weinan and L. Dong. The Andersen thermostat in molecular dynamics. Communications on Pure and Applied Mathematics, 61(1):96–136, 2008.

[WSK85] A. Warshel, F Sussman, and G. King. Free energy changes in sol-vated proteins: microscopic calculations using a reversible charg-ing process. Biophysics, 25:8368–8372, 1985.

[Zwa54] R. W. J. Zwanzig. High-Temperature Equation of State by a Per-turbation Method. I. Nonpolar Gases. The Journal of Chemical Physics, 22(8):1420–1426, 1954.

74 BIBLIOGRAPHY

Appendix

Periodic boundary conditions

All atoms simulated in a MD simulation are contained in a box which is a cubic box in the simplest case. To overcome the boundary problem of treating atoms leaving the box or forces acting behind the box, periodic boundary conditions are introduced. Hereby the box is in all dimensions surrounded copies of itself as illustrated in Figure 7.1 and atoms leaving the box on one side will appear immediately on the other side.

In some simulations this is very helpful, e.g. in simulating a membrane which has then no arbitrary borders, but in others, for example protein or small system simulations, this can lead to unwanted artifacts by the molecule seeing itself although this was not intended. If an isolated system should be analyzed, choosing a sufficient box size is important to overcome these often not easy to find errors. Additionally the geometrical shape can be optimized to achieve high distances to the image of the protein itself while not being required to

Figure 7.1: Periodic boundary conditions in two dimensions for a triclinic box (Figure from [SLH01])

Figure 7.2: The rhombic dodecahedron and a truncated octahedron (Figure from [SLH01])

simulate too many water molecules.

The simple cubic box has for a spherical molecule with a maximal interaction distance d₁ superfluously water-filled corners. For such simulation setups the rhombic dodecahedron as illustrated in Figure 7.2 is optimal: It uses with the samed₁ only 71% of the size and saves therefore about 29% of CPU-time. For the final simulation the image distance d should be d > d1.

Thermo- and barostats

To be able to simulate at constant temperature or pressure (e.g. to simulate an NPT ensemble) these thermodynamical properties need to be maintained by a thermo- or barostat.

Berendsen thermostat

TheBerendsen-Thermostat[BPvG⁺84] couples the system to an infinite exter-nal heat bath with temperature T_ref. The heat flow into or out of the system is implemented by scaling the velocities with a factorλ given by:

λ= The parameterτT is a temperature coupling parameter that translates toτ = 2c_vτ_T/N_dfk for the exponential temperature deviation with constant τ:

dt = Tref −T

τ (7.2)

cv is the total heat capacity of the system, k is the Boltzmann’s constant and Ndf the number of degrees of freedom.

Andersen thermostat

The Andersen Thermostat [And80] (for a detailed analysis on extended prop-erties see [WD08]) couples a system to a reference heat bath by selecting new velocities from a Maxwellian velocity distribution. The updating is governed by a coupling parameter τ determining the collision frequency. In every cou-pling step the velocity is updated if a random number ran from an uniform distribution holds ran≤τ·dt.

This behavior mimics collisions of the system’s atoms with atoms from the heat bath and reproduces an exact Maxwell-Boltzmann distribution independent of the value of the collision frequencyτ (see [FS96]). By doing so, the momentum is not conserved and simulating dynamics is difficult due to the fact, that consequently the mean square displacement as a function of time changes for different collision frequencies.

Berendsen barostat

Analogously to the thermostat the pressure is scaled in theBerendsen Barostat [BPvG⁺84] accordingly to

dt = Pref −P

τ_P (7.3)

which reflects the coupling of system pressure to a ’pressure bath’ realized by scaling the coordinates and box vectors every step.

Free energy calculations

Slow-growth

The standard way in GROMACS to perform free energy calculation is to use slow-growth methods. The system is hereby perturbed slowly from state A to state B. Slowness is important to ensure that every state is thermodynami-cally equilibrated. If that requirement is fulfilled, the process is reversible and a simulation from B to A yields the same results except for the sign. There-fore analysis of the hysteresis is the building block for most slow-growth error

estimates. If the simulation fulfills these requirements integrating over δG/δλ overδλ yields ∆G.

While in Slow-Growth one simulation is performed changing theλin such small steps that equilibrium is assumed, the thermodynamic integration of multiple conformation consists of a set of single simulations. Hereby the interval [0 : 1]

is divided into N parts (e.g. 20) and for each λ a separate simulation of for example shorter length is performed. A linear regression over the averaged dGdλvalues allows the final integration to derive ∆G. Usually the first part of the simulation is dropped for equilibration purposes whereby the equilibration

Im Dokument Tegeler 2008 diploma thesis Goe (Seite 55-79)