• Keine Ergebnisse gefunden

Classical Atomistic Force Fields for Single- and Double-Stranded DNA

N/A
N/A
Protected

Academic year: 2021

Aktie "Classical Atomistic Force Fields for Single- and Double-Stranded DNA"

Copied!
11
0
0

Wird geladen.... (Jetzt Volltext ansehen)

Volltext

(1)

Hauptseminar SS16

Classical Atomistic Force Fields for Single- and Double-Stranded DNA

Sven Erik Ilse May 31, 2016

Institute for Computational Physics - University of Stuttgart

Tutor: Ewa Anna Oprzeska-Zingrebe

(2)

1 Introduction

Classical atomistic force fields are essential for all-atomistic simulations of DNA and RNA. This handout, and the corresponding Hauptseminar talk, will explain the details of such force fields. We will discuss the general setup and the development of DNA force fields, how parameters are derived and tested. Therefore we will mainly focus on AMBER-based (AssistedModelBuilding with Energy Refinement) force fields and present some of the successively introduced modifications and corrections since 1984. Finally, limitations of DNA simulations with atomistic force fields and possible future developments are broached.

2 Motivation

The reason why we want to study DNA is obvious. DNA contains the genetic information for all known living organisms, hence understanding DNA is crucial for a deeper insight into many biological and biochemical pro- cesses. Many of those processes (like DNA packing, transcription, translation, or replication) are understood on the surface but not in all detail. As an example Fig. 1 schematically illustrates the process of DNA tran- scription, translation, and the subsequent folding of the amino acid chain. But now one would like to know how the structural conformation of DNA and RNA segments influence such processes and how again the structural conformation is influenced by the respective DNA sequence.

Figure 1: Schematic illustration of the successive processes of DNA transcription, translation and, protein folding[1].

The question is with which techniques can we gain such a detailed insight. Fig. 2 gives an overview of the spatial and temporal resolution of various experimental techniques and all-atom molecular dynamics simulations.

Electron microscopy and X-ray crystallography for example are used to study the structure of nucleic acids because they have a high spatial resolution but do not give any information about the dynamics. Whereas techniques with less spatial resolution like NMR (nuclear magnetic resonance) or FRET (fluorescence resonance energy transfer) can provide dynamic information. But only simulations have both, a high spatial and a high temporal resolution; one can follow the motion of single atoms on a subpicosecond timescale.

Due to this almost unlimited spatial and temporal resolution simulations can deliver unique information which are not accessible in experiments. This information can help to interpret experimental data. For example why some DNA sequences are and some are not populated in experiments, or whether chromophores attached to DNA pieces in FRET experiments influence the dynamics. Additionally we can assess free energies (e.g. solvation- or conformation-free energy) indirectly from molecular dynamics simulation and we hope to understand DNA and RNA folding better or even obtain folding paths. As a complete DNA is too big (a human genome consists of over 3 billion base pairs) and biological processes take too long to simulate them, one has to isolate short parts of the nucleic acid which are important for single steps in processes. Such structures are for example: RNA hairpin loops, tetraloop segments, RNA riboswitches, or parts of DNA A-, B- or Z- helices.

But to start with all this kind of simulations we need a reliable and carefully tested force field for nucleic acids.

(3)

Figure 2: Spatial and temporal resolution of various biophysical techniques and below the timescale of some molecular and physiological processes[8].

3 Structure of RNA and DNA

Before we can start to build a force field for nucleic acids we need to know of which constituents DNA and RNA are build and how they are structured.

DNA stands for deoxyribonucleic acid and RNA for ribonucleic acid and both are build up of nucleotides as building blocks (see Fig. 3) with a single phosphate group. A nucleotide is build up of a phosphate group, a sugar (particularly a pentose monosaccharide), and one of four nucleobases which are connected via a glycosidic bond to the backbone. For DNA the sugar is a deoxyribose and the four possible bases are adenine, thymine, guanine, and cytosine. For RNA the sugar is a ribose and the nucleobases are the same except thymine is replaced by uracil. Many nucleotides together form a DNA (or RNA) strand with the sugar and phosphate group as backbone. In a double strand the nucleobases form Watson-Crick base pairs (namely adenine-thymine and guanine-cytosine) via hydrogen bonds (see Fig. 3). The sequence of the nucleobases encode the genetic information.

Figure 3: (left) structure of a nucleotide, (right) Watson-Crick base pairs (adenine-thymine and guanine- cytosine) [2][3].

(4)

4 Force Field

A force field is a set of parameters which are used to calculate the potential energy of the simulated atoms.

For an atomistic force field one needs parameters for every type of atom. The parameters are usually derived from experimental data or quantum mechanical calculations. The potential energy function can be divided into bonded and non-bonded interaction energies, and these can be split up again:

Etotal=Ebonds+Eangle+Edihedral

| {z }

bonded interactions

+ EvdW+Ecoulomb

| {z }

non-bonded interactions

(1)

Figure 4: Illustration of the five different interaction energies in the potential energy function [4].

With such a potential energy function we can calculate the force on each atom (via F~ =−∇Etotal) and with that the position and velocity for each time step.

The most common force fields for simulations on nucleic acids are Parm (witch belong to the AMBER package) and CHARMM force fields. In the remainder of this handout we will focus on the AMBER based force fields which have the following energy function:

Etotal= X

bonds

Kr(rreq)2+ X

angles

Kθθeq)2+ X

dihedrals

Vn

2 [1 + cos(nφγ)] +X

i<j

Aij

R12ij Bij

R6ij+X

i<j

qiqj

Rij

. (2) The energy for bonds and bond angles is modeled with a harmonic spring model with force constantsKr and Kθand equilibrium distances/anglesreqandθeq. The third sum in equation (2) models the energy for twisting a bond with the force constant Vn, the multiplicity n, a phase shift γ, and the torsion angle φ. Hard-core repulsion and van der Waals' are modeled via a Lennard-Jones potential (with parameters Aij and Bij) and the electric potential energy ofqi in the potential ofqj is calculated with the last term in Eq. (2). For a force field we need a full set of parameters for every possible interaction between atoms in a simulated system.

4.1 Parameter Fitting

The parameter set of a force field is usually empirical and contains for example values for the atomic mass, partial charges for atoms (depending on the context it is in), spring constants and equilibrium bond-lengths and -angles. For the derivation of the parameters many different techniques are used together.

To obtain the parameters for equilibrium bond lengths and bond angles X-ray crystallography data or the results of NMR experiments are used. For the force constants of these bonds infra red and microwave spectroscopy data of molecules which contain such a bond are consulted. Additionally high level quantum mechanic calculations are performed and the force field parameters are fitted to the experimentally and theoretically obtained data.

The parameters in the Lennard-Jones potential which resembles hard-core repulsion and van der Waals' in- teraction are obtain similar. Data from experimental densities and heats of vaporization as well as data from high level QM calculations are used to adjust the parameters. But usually those parameters are only for the interaction between equal atom types and therefore ad hoc assumptions are made for combinations of different atom types.

The electrostatic interactions are critical for nucleic acids since the electrostatic interactions provide a balance between hydrogen bonds in base pairs and base stacking. Therefore the parameterization of the charges is impor- tant for the structure and the stability of DNA and RNA segments. For simplicity a fixed charge model is used, which means that depending on the context of a specific atom it has a fixed point charge. The partial charges do not change for different conformations or in an external field. The fixed charge model neglects polarizability and charge transfer effects. But some of the neglected contributions are effectively (but unphysically) accounted to by fitting all the force field parameters in such a way that experimental results can be reconstructed. To parameterize the electrostatic interactions there are two commonly applied approaches. For the first approach

(5)

electrostatic potentials are quantum mechanically calculated and the atomic charges in the force field are fitted so that they reproduce this potential. More specifically restrained electrostatic potentials are used to prevent artificial growth in the charges of buried atoms. This approach is used to parameterize the partial charges in AMBER force fields. The second common approach is to empirically adjust the charges of individual atoms to reproduce a set of previously calculated quantum mechanical interaction energies. This was done for CHARMM force fields. One has to mention that non-bonded interactions (electrostatics, vdW, and hard-core repulsion) are assumed to be included in the bond-length and bond-angle interaction term. That is why non-bonded inter- action energies are not calculated for directly bonded atoms or atoms which are bonded to a mutual neighbor.

Figure 5: Illustration of the six DNA-backbone di- hedral angles α, β, γ, δ, , and ζ and the torsion around the glycosidic bondχ [5].

The adjustment of the dihedral parameters is the most difficult and arbitrary task in the force field construc- tion. But they have a huge impact on simulations and can make up for some of the neglected parts in the force field which come due to the simple energy functions and approximations. This is why all recent modifications and corrections for the current DNA force fields solely dealt with the reparameterization and refinement of the dihe- dral part. Parameters are obtained by empirically adjust- ing them to reproduce experimental data and high level quantum mechanical calculations as good as possible. In experiments as well as in QM calculations small model systems are studied and the derived parameters are then used also for bigger systems. Figure 5 illustrates the six important dihedral angles of the sugar-phosphate back- bone and the torsion angle around the glycosidic bond.

The parameters for the backbone angles determine the flexibility of the DNA backbone which can be limited due to the fixed charge model of the electrostatic interaction.

Because there is such a welter of parameters to find one can see how complex it is to find a set of well-matched

parameters. Modifying one parameter can essentially affect all the others and it is hard to say whether refining one will result at the end in a better force field. Therefore corrections and modifications for force fields need careful and validation. In consequence of the complexity of finding a set of functioning parameters it can be said that it is a combination of science and art. And there will never be just one right solution but we will always have several force field with different advantages and disadvantages.

Once a set of parameters is found one can start to test the reliability of the force field and try to obtain an insight into the physical limitations. This can be done by testing the stability of known structures during simulations, comparing simulated structures to experimental structures (from NMR experiments or X-ray crys- tallogrphy), or comparing results to QM calculations. Figure 6a compares the relative energies of different RNA backbone families derived with QM benchmark calculations, modern DFT methods, and with force field simulation respectively. The DFT calculations reproduce the benchmark results closely, whereas the force field derived energies deviate a lot more and exaggerate the energy differences between backbone families. The fixed charge model is presumably the reason for those inaccuracies. Figure 6b shows a tetraloop and compares the theoretical and experimental derived structure. One can see that simulation and X-ray structure agree which means the force field is good parameterized for this case.

(6)

(a) (b)

Figure 6: (a) Relative energies of different RNA backbone families (canonical A-RNA conformations is chosen as reference point). The energies from high level QM benchmark calculations are compared to DFT results (left) and results obtained with a AMBER-based force field (right). (b) Comparison between theoretical (left) and experimental (right) structures of a tetraloop. The simulation result shows an overlay of 50 snapshots. [13]

5 Development and Progress

The development of force fields is closely connected to increasing computing-power which is accompanied by rising simulation time scales. Force fields which proved stable previously exhibit inaccuracies and errors on longer time scales. Therefore force field need to be developed, refined, and improved continuously. But as mentioned previously every modification needs to be validated carefully since it could worsen the force field. In the following we will go through a short history of the AMBER-based DNA force fields and then go into detail of some of the more recent developments.

5.1 Short History of AMBER based DNA Force Fields

1984: Weiner et al. present a first force field for nucleic acids and proteins [15].

1995: Cornell et al. present the successor to the Weiner et al. force field (AMBER ff94 or parm94)[7].

1998-2000: Publication of slightly modified versions (ff98 or parm98 and ff99 or parm99) of the Cornell et al. ff94. Those were widely used until 2007.

2007: Orozco group refines the α and γ torsional term to improve the description of α/γ conformers (parmbsc0) [11].

2010-2015: Various modifications and corrections for different dihedral terms are introduced: χOL3 for RNA [17, 6] and χOL4[10], ζOL1 [16], andβOL1 [18] for DNA.

2016: Orozco group presents a new general-purpose force field (parmbsc1) [9].

5.2 Parmbsc0

The parm94-99 force field was parameterized when simulations where on the 1 ns time scale, thus it was surprisingly enough that they delivered reasonable results even in simulations on the 10 ns time scale. But as simulation times became longer irreversible transitions of the α and γ torsion angles were found. These transitions led to severe distortions of DNA structure in 50 ns simulations. Figure 7b shows the crystal structure of Dickersons’s dodecamer (a DNA B-helix segment first analyzed by Richard E. Dickerson) as well as the structure at the end of a 200 ns simulation with parm99 and parmbsc0, respectively. The simulation with parm99 reveals structural degradation of the DNA helix, whereas with parmbsc0 the structure stays stable. To reparameterize the α and γ dihedral terms high level quantum mechanical calculations were performed on a model system (Fig. 7a). The model system consisted of a phosphate group and a sugar; the two constituents of the DNA backbone. The corresponding force field parameters were then adjusted to reproduce the results of the QM calculations.

(7)

(a)

(b)

Figure 7: (a) Model system which is used for quantum mechanical calculations to reparameterize the αandγ dihedral terms. (b) Structure of Dickerson’s dodecamer. Crystal structure (left), structure obtained by averaging the last 5 ns of the trajectories with parm99 (middle) and parmbsc0 (right) [11].

5.3 χOL3 - Correction

Figure 8: The X-ray structure of a RNA tetraloop (left) and the degraded ladder-like struc- ture when thechiOL3-correction is not used (right) [14].

Figure 8 shows the crystal structure of a RNA tetraloop with helical shape and the degraded ladder-like structure at the end of simulation without the χOL3-correction.

This complete degradation of the helical into a ladder- like structure was traced back to be an artifact in the force field, more precisely a unphysical shift of theχangle during the simulation. Therefore theχdihedral term got reparameterized via QM calculations on model ribonucle- osides (a ribose bonded to a nucleobase). Figure 9 shows a comparison of the torsion profile with and without the correction. The slight difference in the slope of the com- plete torsion profile between 190 and 280 prevents the degradation of RNA structures in simulations. The net χdihedral term in Fig. 9 illustrates that before the repa- rameterization theχ dihedral term was simply modeled by a cosine-like term. Afterwards the energy function for this torsion angle has a far more complex profile.

Figure 9: Comparison of the torsion profile with and without the χOL3-correction. Complete torsion profile including all force field terms (left) and the netχdihedral term (right) [14].

(8)

5.4 Parmbsc1

As one can see there were various modifications and corrections proposed in past decade which addressed individual problems in the force field. Those corrections are principally useful and extensively tested but one always has to chose which corrections are useful for a specific system and which do not need to be taken into account. Therefore there was a need for a new general-purpose AMBER force-filed for DNA simulations and the Orozco group presented 2016 the parmbsc1 force field to solve these needs. The parameters of this force field were fitted to high level quantum mechanical data and it was tested for almost 100 different systems so far (total simulation time of 140µs). Figure 10 and 11 illustrate the good agreement of the parmbsc1 and chidihedral energy profiles with quantum mechanical data compared to the previous parmbsc0 force field.

Figure 10: Potential of mean force profiles of dihedral with parmbsc0 (left) and parmbsc1 (middle) and quantum mechanical derived contour profile (right) [9].

Figure 11: Completeχdihedral profiles for all 4 DNA nucleo bases. Comparison between parmbsc0, parmbsc1, and quantum mechanical derived profiles [9].

6 Limitations

When performing simulations on DNA and RNA with classical atomistic force fields one always has to keep in mind the basic limitations and included approximations in mind. Obtained data need to be analyzed carefully to preempt mis- and over-interpretation of results. The most important limitation of molecular dynamics

(9)

simulations on nucleic acids is the simulation time scale which is short compared to actual biochemical processes.

Base-pair opening (or base-pair breathing) for example happens on a time scale ofµs - ms and folding of nucleic acid structures can take time between µs and even days. In comparison to that the current simulation time scales are around 1 µs. Due to the short simulation times sampling is only done in a limited area around the starting structure and therefore ergodicity is usually not reached. This means that simulations depend sensitively on the accuracy of the starting geometry (form X-ray or NMR experiments). Within the limited simulation time thermal barriers due to structural inaccuracies at the beginning might not be overcome. A second important limitation is that the potential energy function of a force field is not absolute accurate due to the incorporated simplifications and approximations. As simulations times become longer more force filed artifacts due to insufficient precise parameterization will show up. This requires a consecutive adjustment and improvement of the force field parameters.

But over this limitations one should not forget about the greatest advantage of atomistic simulations which is the unsurpassed level of spatial and temporal resolution. With the limitations in mind one can perform various tasks. Starting from more conservative simulations like slightly modifying experimentally derived structures (for example with nucleobase substitution) and going on to more ambitious tasks like exploring folding pathways or the prediction of new structures.

The current force fields are accurately tested for most kind of double-stranded DNA structures or RNA structures with Watson-Crick base paring. But biological interesting are also single-stranded structures which comprise non-Watson-Crick base pairs (e.g. G-quadruplexes). For those structures current force field deliver promising results but further validation is needed.

7 Outlook & Conclusion

As computing-power increases constantly and concomitant with this simulations times become longer there will always be the need for force field modification. All recent modifications concerned the reparameterization of the dihedral terms which delivered good results. But nevertheless it is improbable that constant tuning of existing force fields is unlimited, as they include major simplifications and approximations. On of the most significant approximations is the fixed charge model which neglects polarization effects. Savelyev et al. dealt with this issue and presented an all-atom polarizable force field based on the classical drude oscillator model [12]. This force field is still under development and needs further testing.

Molecular dynamics simulations with atomistic force field are a powerful technique to obtain new information (due to single atom and subpicosecond resolution) which are not accessible with experiments. But therefore the force field limitations need to be kept in mind to prevent mis-/overinterpretation of simulation results. Future force field might need to be more complex to include polarization or charge transfer effects. A first force field which considers polarization effects explicitly was proposed by Savelyev et al. 2014 [12].

(10)

References

[1] http://biosocialmethods.isr.umich.edu/wp-content/uploads/2014/09/

central-dogma-enhanced.png. [Online; accessed 23-May-2016].

[2] https://en.wikipedia.org/wiki/Nucleotide. [Online; accessed 23-May-2016].

[3] https://en.wikipedia.org/wiki/Basepair. [Online; accessed 23-May-2016].

[4] http://www.verachem.com/products/vm2/. [Online; accessed 23-May-2016].

[5] https://ilyasyildirim.files.wordpress.com/2014/01/alltorsions.jpg. [Online; accessed 23-May- 2016].

[6] Pavel Ban´s, Daniel Hollas, Marie Zgarbov´a, Petr Jureˇcka, Modesto Orozco, Thomas E Cheatham III, Jiˇr´ı ˇSponer, and Michal Otyepka. Performance of molecular mechanics force fields for rna simulations: stability of uucg and gnra hairpins. Journal of Chemical Theory and Computation, 6(12):3836–3849, 2010.

[7] Wendy D Cornell, Piotr Cieplak, Christopher I Bayly, Ian R Gould, Kenneth M Merz, David M Ferguson, David C Spellmeyer, Thomas Fox, James W Caldwell, and Peter A Kollman. A second generation force field for the simulation of proteins, nucleic acids, and organic molecules.Journal of the American Chemical Society, 117(19):5179–5197, 1995.

[8] Ron O Dror, Robert M Dirks, JP Grossman, Huafeng Xu, and David E Shaw. Biomolecular simulation: a computational microscope for molecular biology. Annual review of biophysics, 41:429–452, 2012.

[9] Ivan Ivani, Pablo D Dans, Agnes Noy, Alberto P´erez, Ignacio Faustino, Adam Hospital, J¨urgen Walther, Pau Andrio, Ramon Go˜ni, Alexandra Balaceanu, et al. Parmbsc1: a refined force field for dna simulations.

Nature methods, 13(1):55–58, 2016.

[10] Miroslav Krepl, Marie Zgarbov´a, Petr Stadlbauer, Michal Otyepka, Pavel Ban´s, Jaroslav Koˇca, Thomas E Cheatham, Petr Jureˇcka, and Jiˇr´ı ˇSponer. Reference simulations of noncanonical nucleic acids with different χvariants of the amber force field: Quadruplex dna, quadruplex rna, and z-dna.Journal of chemical theory and computation, 8(7):2506–2520, 2012.

[11] Alberto P´erez, Iv´an March´an, Daniel Svozil, Jiˇr´ı ˇSponer, Thomas E Cheatham, Charles A Laughton, and Modesto Orozco. Refinement of the amber force field for nucleic acids: improving the description ofα/γ conformers. Biophysical journal, 92(11):3817–3829, 2007.

[12] Alexey Savelyev and Alexander D MacKerell. All-atom polarizable force field for dna based on the classical drude oscillator model. Journal of computational chemistry, 35(16):1219–1239, 2014.

[13] Jiˇr´ı ˇSponer, Pavel Ban´s, Petr Jureˇcka, Marie Zgarbov´a, Petra K¨uhrov´a, Marek Havrila, Miroslav Krepl, Petr Stadlbauer, and Michal Otyepka. Molecular dynamics simulations of nucleic acids. from tetranu- cleotides to the ribosome. The journal of physical chemistry letters, 5(10):1771–1782, 2014.

[14] Jiˇr´ı ˇSponer, Xiaohui Cang, and Thomas E Cheatham. Molecular dynamics simulations of g-dna and perspectives on the simulation of nucleic acid structures. Methods, 57(1):25–39, 2012.

[15] Scott J Weiner, Peter A Kollman, David A Case, U Chandra Singh, Caterina Ghio, Guliano Alagona, Salvatore Profeta, and Paul Weiner. A new force field for molecular mechanical simulation of nucleic acids and proteins. Journal of the American Chemical Society, 106(3):765–784, 1984.

[16] Marie Zgarbov´a, F Javier Luque, Jiˇr´ı ˇSponer, Thomas E Cheatham III, Michal Otyepka, and Petr Jureˇcka.

Toward improved description of dna backbone: revisiting epsilon and zeta torsion force field parameters.

Journal of chemical theory and computation, 9(5):2339–2354, 2013.

[17] Marie Zgarbov´a, Michal Otyepka, Jiˇr´ı ˇSponer, Arnoˇst Ml´adek, Pavel Ban´s, Thomas E Cheatham III, and Petr Jureˇcka. Refinement of the cornell et al. nucleic acids force field based on reference quantum chemical calculations of glycosidic torsion profiles. Journal of chemical theory and computation, 7(9):2886–2902, 2011.

(11)

[18] Marie Zgarbov´a, Jiˇr´ı ˇSponer, Michal Otyepka, Thomas E Cheatham III, Rodrigo Galindo-Murillo, and Petr Jureˇcka. Refinement of the sugar–phosphate backbone torsion beta for amber force fields improves the description of z-and b-dna. Journal of chemical theory and computation, 11(12):5723–5736, 2015.

Abbildung

Figure 1: Schematic illustration of the successive processes of DNA transcription, translation and, protein folding[1].
Figure 2: Spatial and temporal resolution of various biophysical techniques and below the timescale of some molecular and physiological processes[8].
Figure 4: Illustration of the five different interaction energies in the potential energy function [4].
Figure 5: Illustration of the six DNA-backbone di- di-hedral angles α, β, γ, δ, , and ζ and the torsion around the glycosidic bond χ [5].
+3

Referenzen

ÄHNLICHE DOKUMENTE

the crystal structure, the formation of a straight, stiff TrmBL2 filament with a contour length that is 2% shorter than for dsDNA and the minimal unzipping of 10 basepairs leaves

Understandably, defense leaders thought that im- position of cuts through sequestration was the absence of a strategy, and Secretary of Defense Chuck Hagel directed DOD leaders

This thesis describes the construction of a combined TIRFM-AFM instrument, its possibilities, and limitations, e.g. time correlated mechanical manipulation and uorescence

Figure Annex F-1: Force mapping of average cutting force F C of block 1 (layers 3 -6).. Figure Annex F-3: Force mapping of average cutting force F C of block 2

Halpern reported in an example that in an equilibrium of two reactant states, the thermodynamically unfavored reactant can be converted to the predominant product precursor as it

Having obtained and characterized more realistic model tips in chapter three, we used the tip structures presented in section 3.3.4 in atomistic simulations which revealed

Interaction Potentials and Parametrizations: All simulations were carried out using the CHARMM program, [14] with provisions for evaluating electrostatic interactions involving

In simulations using a water model in which the ε H value was set to −0.10 kcal/mol, we obtained good agreement with the experimentally estimated R g for