
3.2 Force Field

3.2.4 Implementation

This chapter describes how the force calculation is implemented in the compute shader, following a single shader invocation. The atom processed by that invocation is henceforth referred to as A_thread. The compute shader first includes all buffers mentioned in Chapters 3.1 and 3.2.2. Then all constant values concerning A_thread, i.e. mass, type and bonds, plus its position and velocity, are stored locally on the GPU because they are accessed frequently. Subsequently, the force-buffer entry of A_thread, i.e. the force exerted on this atom in the last time step, is stored so that it can be reused for the time step integration (Chapter 3.3). To make memory access safe without a floating-point atomic add function, each invocation of the compute shader first stores its calculated forces in a local array, and the results of all threads are summed up after all of them have completed their force calculations. This local force array has as many elements as there are atoms in the molecule under consideration. Each entry is a three-dimensional force vector that is initialized with zeros and is filled with the force exerted on that atom in the current run of the compute shader. This procedure is described in more detail later in this chapter. To implement the simplified force field described in Chapters 3.2.2 and 3.2.3, the non-bonded and the bonded energies and forces are each calculated in a separate loop; both loops are described below. In the following, I refer to the atom that A_thread is currently compared to as A_c_vdW or A_c_bond respectively, with c standing for "compare".

Several remarks have to be made about the first loop. As mentioned above, in this force field the non-bonded interactions should only be calculated between atoms separated by at least three bonds, and non-bonded interactions between atoms separated by exactly three bonds should be scaled down. Since the bonds have an equilibrium length of roughly 1 Å to 1.5 Å according to the force field parameters, atoms separated by one or two bonds lie at most about 3 Å apart. I therefore first considered simply calculating the non-bonded interactions only for distances above 4 Å. In the end, however, I decided to make both options available in the program through a customizable boolean variable that is set in the simulation parameters of the compute shader, and to show both results in Chapter 4.

The loop that calculates the non-bonded energies and forces, i.e. the van der Waals energies and forces, iterates over all atoms that have a higher index than A_thread. This is permissible because the force exerted on A_c_vdW by A_thread is the negative of the force exerted on A_thread by A_c_vdW (see Chapter 3.2.3). It is therefore sufficient to visit each pair of atoms in only one direction and to calculate directly the forces that both atoms exert on each other. At the beginning of the loop it is checked whether the calculation should be done only for distances over 4 Å or for all pairs of atoms separated by at least three bonds. In the first case, it is simply checked whether the distance between A_thread and A_c_vdW is smaller than or equal to 4 Å and whether A_c_vdW is directly bound to A_thread. If either of those two conditions is true, A_c_vdW is ignored. If not, the van der Waals parameters for both atoms are read from the buffer, and the energy between those two atoms and the forces that A_thread and A_c_vdW exert on each other are calculated using Equations (26) and (30). Otherwise, if the calculation should be done for atoms separated by at least three bonds, it is examined whether A_c_vdW is a direct bond of A_thread, a bond of one of the bonds of A_thread, or a bond of one of those, meaning that both atoms are separated by only one, two or three bonds respectively. If one of the first two is the case, the energy and force calculation is skipped. If the atoms are separated by exactly three bonds, the energy and force calculation is performed with a scale factor of 0.5. If the atoms are separated by more than three bonds, the energy and force calculation is performed normally, with a scale factor of 1. Since each thread calculates the force for two atoms, more than one thread might access the same buffer entry if the forces were written directly into the buffer at this point. For that reason the forces are first stored in the local force array mentioned above.
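The non-bonded loop can be sketched as follows. This is only a minimal CPU-side illustration in Python, not the shader code itself. Equations (26) and (30) are not reproduced in this section, so a standard 12-6 Lennard-Jones force with a single hypothetical parameter pair `epsilon`, `sigma` is assumed here, and the energy calculation is omitted. All function names and the neighbour-list layout `bonds` are assumptions made for the example.

```python
import math

def bond_separation(i, j, bonds, max_depth=3):
    """Number of bonds (1..max_depth) between atoms i and j, or None if farther apart."""
    frontier, seen = {i}, {i}
    for depth in range(1, max_depth + 1):
        # Expand one bond further, ignoring atoms that were already reached.
        frontier = {n for a in frontier for n in bonds[a]} - seen
        if j in frontier:
            return depth
        seen |= frontier
    return None

def vdw_scale(i, j, pos, bonds, use_cutoff):
    """Scale factor for the pair (i, j); 0.0 means the pair is skipped."""
    if use_cutoff:
        # Option 1: skip pairs within 4 Angstrom and directly bonded pairs.
        too_close = math.dist(pos[i], pos[j]) <= 4.0
        return 0.0 if (too_close or j in bonds[i]) else 1.0
    # Option 2: decide by the number of bonds separating the atoms.
    sep = bond_separation(i, j, bonds)
    if sep in (1, 2):
        return 0.0
    return 0.5 if sep == 3 else 1.0

def lj_force_on_i(ri, rj, epsilon, sigma):
    """Assumed 12-6 Lennard-Jones force on atom i caused by atom j."""
    d = [a - b for a, b in zip(ri, rj)]
    r = math.sqrt(sum(c * c for c in d))
    sr6 = (sigma / r) ** 6
    magnitude = 24.0 * epsilon * (2.0 * sr6 * sr6 - sr6) / r
    return [magnitude * c / r for c in d]

def nonbonded_pass(thread, pos, bonds, epsilon, sigma, use_cutoff):
    """One invocation: visit only higher indices and fill the local force array."""
    n = len(pos)
    local = [[0.0, 0.0, 0.0] for _ in range(n)]
    for c in range(thread + 1, n):
        s = vdw_scale(thread, c, pos, bonds, use_cutoff)
        if s == 0.0:
            continue
        f = lj_force_on_i(pos[thread], pos[c], epsilon, sigma)
        for k in range(3):
            local[thread][k] += s * f[k]  # force on A_thread
            local[c][k] -= s * f[k]       # equal and opposite force on A_c_vdW
    return local
```

Because every pair contributes two opposite entries, the entries of the local array written by one invocation sum to zero, which is a convenient sanity check.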

The loop that calculates the bonded energies and forces, i.e. those of the bonds and angles, iterates over all atoms A_c_bond1 that are bound to A_thread. First the bond length, i.e. the distance between the centers of the two atoms, and the normalized direction from A_thread to A_c_bond1 are calculated. If A_c_bond1 has a higher index than A_thread, the bond force parameters are read from the corresponding buffer, and subsequently the energy of the bond as well as the forces that the bond exerts on A_thread and on A_c_bond1 are calculated. This is done using Equations (23) and (28), with the distance and the normalized direction as input.
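The bond step described above can be illustrated with a short Python sketch. Equations (23) and (28) are not reproduced in this section, so the sketch assumes a harmonic bond energy of the form E = k_b (r - r0)^2. The actual equations may differ, for instance by a factor of 1/2. The names `bond_geometry`, `harmonic_bond`, `k_b` and `r0` are hypothetical.

```python
import math

def bond_geometry(r_thread, r_other):
    """Bond length and normalized direction from A_thread to its bond partner."""
    d = [b - a for a, b in zip(r_thread, r_other)]
    length = math.sqrt(sum(c * c for c in d))
    return length, [c / length for c in d]

def harmonic_bond(length, direction, k_b, r0):
    """Energy and forces of an assumed harmonic bond E = k_b * (r - r0)^2."""
    energy = k_b * (length - r0) ** 2
    # A stretched bond (length > r0) pulls A_thread towards its partner.
    magnitude = 2.0 * k_b * (length - r0)
    f_thread = [magnitude * c for c in direction]
    f_other = [-c for c in f_thread]  # equal and opposite force on the partner
    return energy, f_thread, f_other
```

For two atoms 2 Å apart on the x-axis with `k_b = 100` and `r0 = 1.5`, the sketch yields an energy of 25 and forces of magnitude 100 that pull the atoms towards each other.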

As in the loop of the previous paragraph, the forces are then stored in the local force array. Afterwards another loop starts, which iterates over all atoms A_c_bond2 bound to A_thread that have not yet been processed by the outer loop. This means that each combination of two bonds of A_thread that forms an angle with A_thread as its vertex is visited in only one direction. In this loop the length of this second bond and the normalized direction from A_thread to A_c_bond2 are calculated, and the angle force parameters are read from the corresponding buffer. Then the bond lengths and normalized directions calculated for A_c_bond1 and A_c_bond2, together with the parameters, are used as input for the calculation of the bond angle energy and force, which is performed using Equations (24) and (29).
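The way each angle is visited only once can be sketched in Python as a nested loop over the bond list of A_thread. The sketch also shows how the angle follows from the two normalized directions. The angle energy and force of Equations (24) and (29) are not reproduced in this section and are therefore left out. The function names are hypothetical.

```python
import math

def angle_pairs(neighbors):
    """Yield each unordered pair of bond partners of A_thread exactly once."""
    for a in range(len(neighbors)):
        # The inner loop only covers partners the outer loop has not reached yet.
        for b in range(a + 1, len(neighbors)):
            yield neighbors[a], neighbors[b]

def angle_between(u1, u2):
    """Angle in radians between two normalized bond directions."""
    cos_theta = sum(x * y for x, y in zip(u1, u2))
    # Clamp against rounding errors before calling acos.
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.acos(cos_theta)
```

An atom with three bond partners produces three angles, one for each pair of partners.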

After all energies and forces have been calculated, the potential, kinetic and total energy of the system are first calculated as described in Chapter 2.1.2. Afterwards, the calculated forces are written to the force buffer. To this end, the force buffer entries are set to zero and a memory barrier is inserted to ensure that all threads are at the same position in the program. To ensure that each thread accesses a different force buffer entry at any given time, I created two loops that are executed consecutively. The first iterates from the global index of A_thread, gl_WorkGroupID.x, down to index zero, and the second iterates from one index above the thread ID, i.e. gl_WorkGroupID.x + 1, up to index N−1. Using an access index that depends on the index of A_thread makes the buffer access unique for each thread. A memory barrier has to be placed between those two loops, because the number of iterations differs between threads and it has to be ensured that the second loop is not entered until all threads have completed the first loop. Inside the loops, the force buffer is read at the index given by the iteration variable to obtain the current value, which is then written to a temporary variable F_temp. Afterwards, the value in the local force array at the same index is added to F_temp. The resulting value is then written back to the force buffer. Memory barriers are inserted after all read and write operations on the buffer.
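The staggered access pattern of the two loops can be checked with a small Python simulation. It assumes that all threads advance in lock-step, one loop iteration at a time, which is what the barriers in the text are meant to enforce. Forces are reduced to a single scalar component for brevity, and the function names are hypothetical.

```python
def access_schedule(thread, n):
    """Indices visited by one thread: first loop downward, second loop upward."""
    first = list(range(thread, -1, -1))   # thread, thread - 1, ..., 0
    second = list(range(thread + 1, n))   # thread + 1, ..., n - 1
    return first, second

def accumulate(local_forces):
    """Simulate all threads in lock-step and sum their local arrays into one buffer."""
    n = len(local_forces)
    buffer = [0.0] * n
    for phase in (0, 1):  # the barrier between the loops separates the two phases
        schedules = [access_schedule(t, n)[phase] for t in range(n)]
        for step in range(max(len(s) for s in schedules)):
            touched = [(t, s[step]) for t, s in enumerate(schedules) if step < len(s)]
            indices = [i for _, i in touched]
            # No two threads may touch the same buffer entry in the same step.
            assert len(indices) == len(set(indices))
            for t, i in touched:
                f_temp = buffer[i]             # read the current buffer value
                f_temp += local_forces[t][i]   # add this thread's contribution
                buffer[i] = f_temp             # write the sum back
    return buffer
```

In step k of the first loop, thread t touches index t − k, and in step k of the second loop it touches index t + 1 + k. Both mappings give distinct indices for distinct threads, so the internal assertion never fires, and the final buffer holds the sum of all local arrays.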
