• Keine Ergebnisse gefunden

Simulation of shear flow: RNEMDS

3. Simulation methods and implementation 49

3.1.1. Simulation of shear flow: RNEMDS

The central question of this work discusses homopolymer melts and lamellar diblock copolymer configurations in shear flow. The introduction of shear flow into a computer simulation is nontrivial. In general, there are two different options for generating a shear flow: stress-controlled and strain-controlled shear flow. In a strain-controlled shear flow the collective motion, thus the strain, of the material is controlled. For MD simulations this can be achieved by either deforming the simulation box away from a triclinic box or with the Lees-Edwards boundary condition (LEBC) [121]. The first option is already part of HOOMD in the form of the box deformation updates but comes with its own challenges i.e. over-wrapped shear boxes. In this case, the response to the strain-controlled situation is the stress.

The alternative approach is the stress-controlled shear flow. In this case, the shear forces acting on a system are controlled and the strain is measured as a response. The advantage of a stress-controlled simulation is that the scheme does not dictate the strain of the system. While for most situations a response with a linear velocity profile, as dictated by strain-controlled simulations, is accurate, there are exceptions, especially, in strong shear flow. The phenomenon of shear-banding [122] is such a deviation. A stress-controlled simulation allows the system to exhibit these responses as well. If a linear velocity profile is obtained, it is a result of the material properties. For this type of simulation Müller-Plathe [104] published an algorithm: reverse nonequilibrium molecular dynamics simulation (RNEMDS). The total simulation box is divided into Nslabs thin slabs along the direction of the shear gradient,e.g.along the Z-axis (Figure 3.1). Two of these slabs are chosen with the maximum distance to each other in periodic boundary conditions,e.g. idmin = 0 andidmax =Nslabs/2. Exchanging momentum between the two chosen slabs induces stress into the system. Because of the periodic boundary conditions, the stress spreads symmetrically into both halves of the simulation box. The direction of the stress is chosen perpendicular to the shear gradiente.g.parallel to the X-axis.

The first step of the RNEMDS is to identify the particleiin the max slabidmax with the maximum momentum components in that direction maxi∈idmaxpix and particle j with the minimum momentum components in the min slab minj∈idminpjx. An update

3.1. MD and DPD implementation HOOMD

according to the RNEMDS scheme swaps the two momentum components of the particles pix ↔pjx. An update per time interval of ∆t introduces an external momentum flux into the simulation

jpx = pix−pjx

2At = ∆p(∆t)

2At . (3.1)

The factor two in the denominator appears because the flux is split across both halves of the simulation box within periodic boundary conditions. Adenotes the area perpen-dicular to the shear gradient,i.e.the XY-area.

The momentum fluxjpx controls the stress in the simulation box. As a result, liquid particles start to flow. Hence, the strain is the response. A simple flow response to the stress is the linear velocity profile with a constant slope ∂v∂zx. In this case, the response can be used to determine the viscosityη of the system

η∂vx

∂z =jpx. (3.2)

More complicated flow patterns are possible, especially in inhomogeneous systems, since the algorithm only dictates stress inside the two dedicated slabs. An example of such a pattern is discussed for the stability analysis of diblock copolymer lamellae in the parallel configuration in section 4.3.3a. The down-side of this approach is that the system is nonphysically disturbed inside the swapping slabs as the swapping is an artificial process.

Consequently, the region in the vicinity of these slabs must not be used for the analysis of the system. In particular, it is common to observe a deviation from the linear flow profile in this region, compare with Figure 4.33. Additionally, the simulation box of an RNEMDS is twice the size compared with a simulation with LEBC. Because of the symmetric propagation of the stress in both halves of the simulation box, it is noncubic and doubles the dimension of the shear-gradient, seeFigure 3.1.

Recently, Statt et al. [123] find that the RNEMDS can establish vortices instead of linear velocity profiles if the box is significantly larger in the flow direction than

the gradient direction. They attribute these characteristics to the periodic boundary conditions and the onset of turbulent instabilities in the flow. All studies of this work use simulation boxes where the gradient direction is significantly larger than the flow direction. Hence, it is not expected to find these vortices, and they have not been observed.

Implementation The original scheme of the RNEMDS swaps min/max momenta between the slabs at a fixed rate. As a result, the momentum flux jpx fluctuates because the momentum difference between the two particles: ∆p is not the same in every swap. This is not a huge drawback as the fluctuations are small with a sufficient number of particles in each slab.1 In addition, the fluctuations can be averaged during post-processing. Nonetheless, it is not intuitive to set the swapping rate and analyze the stress as an output during post-processing than rather control the momentum flux directly. As a solution to the fluctuating nature ofjpx, I choose to control the integrated momentum flux

J(t) = t

0 dtjpx(t) =

k

∆pk 2A =

i

pix,kpjy,k

2A . (3.3)

At the start of a simulation I define an integrated target momentum flux Jt(t) as an input function. Whenever the summed momentum in the simulation deviates from the input |Jt(t)−i ∆p2Ai|> ϵ, the swapping scheme is activated to reduce the deviation. If the difference between input and actual momentum flux changes sign, the job of the two slabs is exchanged – reversing the momentum flux. Anϵ >0 is required here, otherwise the algorithm swaps momenta back and forth between the slabs as an exact match betweenJt(t) andi ∆p2Ai is unlikely with discrete momentum swaps. With the correct choice ofϵ, I am able to achieve simulations, where the actual stress in the simulation closely follows the input function without much overhead.

Unlike the original scheme, the input of the integrated momentum flux allows the definition of more complex shear scenarios. Oscillatory shear, with a sinusoidal input, for example can be realized with Jt(t) ∝ sin(ωt). Especially for comparison with experimental situations, it is important to be able to simulate more complicated shear flow situations. The oscillatory shear experiments discussed insection 4.3.3cand Ref. [31, 32] are an example application of such a situation. Figure 3.2compares the input function Jt(t) with the actual momentum flux in one of the simulations. Overall the deviations are small: less than one percent.

The integration of this algorithm in HOOMD is realized as an Updater class as it changes the state of the simulation by swapping momenta. Unlike other Updater classes, this class requires a call every time step because the class decides on its own if a swap is required at a given time step. Two different implementation layers have to be considered for the algorithm: the update coordinated between different domains i.e. MPI ranks and the update inside of each domain. Inside a single domain, the

1The statistics of the maximum/minimum value is discussed in the theory of extreme value statistics [124–126]. I can assume the momenta to be independently, identically, and Gaussian distributed in a slab. For this situation the variance of the maximum scales as log(m) for many particles,m, in the slab [126].

3.1. MD and DPD implementation HOOMD

Fig. 3.2 Comparison of the stress input J(t) with obtained momen-tum flux

ipi/(2A).

The actual momentum flux follows the desired input closely. The de-viations are below one percent of the

ampli-tude ofJ(t). 0.00 0.25 0.50 0.75 1.00 1.25 1.50 1.75 2.00 t/T

400 200 0 200 400

J(t)[/3]

sinus input output i2Api

parallelization layer is implemented with CUDA. The core of the problem is a reduction similar to argmin and argmax. NVIDIA offers with “Thrust” [127] a flexible CUDA library for reduction problems. I decided to implement the algorithm with “Thrust” as it is already a HOOMD dependency and offers a version-stable implementation. All future generations of CUDA and NVIDIA GPUs will be most likely supported by “Thrust”.

Nonetheless, the reduction problem here is not straightforward. As a solution, a custom functor acts as a binary operator and handles the comparison of two particle indices.

It (i) ensures that the particle position is located inside the slab of interest and (ii) compares the momentum of the particle in correct Cartesian coordinates. As a result, each domain obtains the particle index and momentum with the maximum (minimum) momentum component inside the swapping slab.

The outer layer of the parallelism connects all domains of the simulation. Only domains which contain a section of one of the swapping slabs actually perform the previously described search. All domains participating in search of the maximum (minimum) momentum component locate the global maximum (minimum) via MPI. Finally, the two domains hosting the particle with maximum and minimum momentum component swap the information and exchange the momenta directly in the GPU memory.

This implementation of the RNEMDS scheme shows good performance. For most simulations no increase in the total execution time is noticeable. Only extreme shear rates which require multiple momentum swaps per time step show a drop in performance, but these extreme situations correspond, in most cases, to unphysically large stresses.

The implementation has been integrated into the main version of HOOMD and is available to all users since version 2.1.0. It is as flexible as possible, offering the user a variety of choices: the number of slabsNslabs, the slab id of the maximum and minimum momentum, the direction of the shear gradient, and the direction of the shear flow.

For a regular shear flow, the latter two have to be different from each other, but the implementation is only warning the user if they are identical, in case there is a reasonable application for this scenario. The only restriction is that the implementation is only valid for triclinic boxes since a modification of the algorithm to nontricilic boxes is