A. Usage of hdWE - Structure Formation of Biomolecules studied with Advanced Molecular Dynamic

In this section, we briefly present the individual steps required to set up and perform a WE simulation with the hdWE program and the Amber molecular dynamics software []. hdWE uses a global configuration file (hdWE.conf) to define the WE simulation parameters. Its syntax is based on the Python module ConfigParser. The following example shows a configuration file to simulate the binding process of a ligand to a protein with WE, using the distance between protein and ligand as binning coordinate.



[hdWE]
boundaries = 10.0 12.0 14.0 16.0 18.0 20.0
sample-region = -99999 99999
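Since the configuration syntax follows ConfigParser, such a file can be read with a few lines of Python. The following is only a minimal sketch using Python 3's configparser (the successor of the ConfigParser module), not hdWE's own parsing code. Only the section name, boundaries and sample-region are taken from the example above; the values of the other keys are invented placeholders. The helper bin_index assumes a single binning coordinate and bins of the form lower < d ≤ upper.

```python
import bisect
import configparser

# Hypothetical configuration text. Only `[hdWE]`, `boundaries` and
# `sample-region` come from the example above; the values of the other
# keys are invented placeholders for illustration.
CONF_TEXT = """
[hdWE]
workdir = /tmp/pro-lig
segments-per-bin = 20
max-iterations = 1000
boundaries = 10.0 12.0 14.0 16.0 18.0 20.0
sample-region = -99999 99999
"""


def read_boundaries(conf_text):
    """Return the bin boundaries of a single binning coordinate as floats."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    return [float(value) for value in parser["hdWE"]["boundaries"].split()]


def bin_index(distance, boundaries):
    """Map a distance to a bin index, assuming bins lower < d <= upper.

    The index equals the number of boundaries strictly below `distance`.
    """
    return bisect.bisect_left(boundaries, distance)


boundaries = read_boundaries(CONF_TEXT)
print(len(boundaries) + 1)          # number of bins: 7
print(bin_index(11.3, boundaries))  # 1, i.e. 10 < d <= 12
```

Six boundaries yield seven bins, because the first and last bin extend beyond the outermost boundaries.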

The workflow initially requires the setup of topology and coordinate files of the protein and ligand system with the Amber package. Instructions on the setup protocol can be found in the Amber manual. The resulting topology (pro-lig.top), coordinate (pro-lig.top), and Amber run file (pro-lig.in) have to be specified in the WE configuration file. The workdir parameter points to the directory where hdWE searches for files and writes the output of the WE simulation.

In order to calculate the binning coordinate(s) from the MD trajectories, hdWE relies on the Amber tool cpptraj. The syntax for cpptraj input files is documented in the Amber manual, and the binning coordinate can be any type of coordinate that cpptraj can compute. To specify one or multiple binning coordinates, a cpptraj command with the corresponding Amber masks has to be written to the pro-lig.mask input file, one line per coordinate.

distance d :1-367 :LIG

In this example, the protein is defined by residues 1-367, while the ligand molecule is identified by the residue name LIG. The out directive for the definition of the cpptraj output file will automatically be added by hdWE. The bin boundaries are then defined with the boundaries parameter. In our example we have 7 bins covering distances b0 = {d | 0 < d ≤ 10}, b1 = {d | 10 < d ≤ 12}, ..., b6 = {d | 20 < d < ∞}. The first and last bins are open-ended, so that the bins cover the whole phase space up to the region specified in sample-region. In our case we just specified a range that is numerically large compared to the system size. The units of the distance are taken from cpptraj, which uses Å for distances. If multiple binning coordinates are used, additional coordinate boundaries are specified in a comma-separated list format.

The target number of trajectories per bin is given in segments-per-bin and is required by the resampling routine to define how often trajectories are to be split or merged. max-iterations gives the number of WE iterations, and the additional flags keep-trajectory-files and keep-coords-frequency specify whether the trajectory output files are stored and how often the coordinate files resulting from each WE iteration are stored.

The Amber binary pmemd.cuda should be found via the $PATH variable of the shell; alternatively, a full path can be specified. hdWE starts multiple pmemd programs in parallel via an MPI pipeline, so the user can fine-tune the parallelization via the mpirun command, which may also run on multiple nodes via the --host or --hostfile flag of mpirun. Parallelization is switched on with the parallelization-mode parameter, and the available GPUs are defined via the cuda_visible_devices parameter, which is translated internally to the CUDA_VISIBLE_DEVICES shell variable. After the configuration file has been set up, the simulation is ready to be started with hdWE. When hdWE is called with the flag -h from a terminal, it outputs an overview of the available command line options.

usage: hdWE.py [-h] [-c FILE] [-d] [-a | -o] [-n]

  _         ___          ________
 | |       | \ \        / /  ____|
 | |__   __| |\ \  /\  / /| |__
 | '_ \ / _` | \ \/  \/ / |  __|
 | | | | (_| |  \  /\  /  | |____
 |_| |_|\__,_|   \/  \/   |______|

A hyperdimensional weighted ensemble simulation implementation.

optional arguments:
  -h, --help            show this help message and exit
  -c FILE, --conf FILE  The hdWE configuration file
  -d, --debug           Turn debugging on
  -a, --append          continue previous iterations (use parameters from conf file when --read is given)
  -o, --overwrite       overwrite previous simulation data
  -n, --new-conf        Read new boundaries from config file when --append is used
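The notation [-a | -o] in the usage line indicates that --append and --overwrite cannot be combined. A command line interface of this shape could be declared with Python's argparse module as sketched below. The option names and help texts are copied from the help output above; whether hdWE builds its parser in exactly this way is an assumption, and the function name build_parser is hypothetical.

```python
import argparse


def build_parser():
    """Sketch of a parser that mirrors the usage message shown above."""
    parser = argparse.ArgumentParser(
        prog="hdWE.py",
        description="A hyperdimensional weighted ensemble simulation implementation.",
    )
    parser.add_argument("-c", "--conf", metavar="FILE",
                        help="The hdWE configuration file")
    parser.add_argument("-d", "--debug", action="store_true",
                        help="Turn debugging on")
    # -a and -o exclude each other, which argparse renders as [-a | -o].
    group = parser.add_mutually_exclusive_group()
    group.add_argument("-a", "--append", action="store_true",
                       help="continue previous iterations")
    group.add_argument("-o", "--overwrite", action="store_true",
                       help="overwrite previous simulation data")
    parser.add_argument("-n", "--new-conf", action="store_true", dest="new_conf",
                        help="Read new boundaries from config file when --append is used")
    return parser


args = build_parser().parse_args(["-c", "hdWE.conf", "-a", "-n"])
print(args.conf, args.append, args.overwrite, args.new_conf)
# prints: hdWE.conf True False True
```

With such a declaration, passing -a and -o together makes argparse abort with an error message instead of starting a run.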

In our example of a protein–ligand complex formation, the command to initiate the simulation would be

hdWE.py -c hdWE.conf

The program creates two output directories, pro-lig-log and pro-lig-run, containing log files of every iteration in the first and the trajectory/coordinate files in the second folder. The iteration log filenames are the zero-padded iteration indices with the filename extension .iter (e.g. 00000000.iter, 00000001.iter). The naming scheme for the coordinate and trajectory output files is again assembled from the zero-padded iteration, bin, and trajectory indices (e.g. the file 00000034_00002_00013.rst7 identifies segment 13 of bin 2 in iteration 34).

hdWE provides a set of tools to analyze the results of a WE simulation. In the given example, the transition rates between arbitrary states can be calculated with the ana_TraceFlux program.

ana_TraceFlux -l pro-lig-log --state-A 0 10 --state-B 18 99999 -B 1000 -bs 500

The program requires the log file directory of the WE simulation and the definition of state intervals along the binning coordinate. The -B flag causes the initial 1000 WE iterations to be skipped as equilibration for the analysis. ana_TraceFlux performs error analysis according to the statistical bootstrapping method and therefore splits the data into blocks of size 500 (-bs 500) [].

The ana_BinPMF tool calculates the PMF along the binning coordinate.

ana_BinPMF.py -l pro-lig-log -o pmf.dat
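For orientation, a PMF along a binned coordinate is commonly obtained from the bin probabilities p_i via F_i = -k_B T ln(p_i). The sketch below applies this standard relation to invented probabilities for the seven bins of our example. The function name, the temperature of 300 K, and the choice of kcal/mol as energy unit are assumptions made for illustration.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def pmf_from_probabilities(probabilities, temperature=300.0):
    """Return F_i = -k_B T ln(p_i), shifted so that the lowest value is zero.

    Bins with zero probability get an infinite free energy. At least one
    bin is assumed to have a nonzero probability.
    """
    kt = K_B * temperature
    free_energies = [-kt * math.log(p) if p > 0 else math.inf
                     for p in probabilities]
    lowest = min(free_energies)
    return [f - lowest for f in free_energies]


# Invented bin probabilities for the 7 bins of the example (they sum to 1).
probabilities = [0.40, 0.25, 0.15, 0.10, 0.05, 0.03, 0.02]
for index, value in enumerate(pmf_from_probabilities(probabilities)):
    print(index, round(value, 3))
```

This only illustrates the relation between bin probabilities and the PMF; it is not necessarily the procedure implemented in ana_BinPMF.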

ana_BinPMF again requires the path of the log file directory and an optional output filename. The PMF data is stored as an ASCII file which can be conveniently visualized with standard plotting software.
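For scripted post-processing of the run directory, the zero-padded naming scheme of the coordinate files described earlier in this section can be generated and decoded with a few lines of Python. This is a minimal sketch: the field widths of 8, 5 and 5 digits are inferred solely from the example filename 00000034_00002_00013.rst7, and the function names are hypothetical.

```python
import re

# Field widths (8, 5, 5) are inferred from the example filename
# 00000034_00002_00013.rst7 and are therefore an assumption.
SEGMENT_NAME = re.compile(r"^(\d{8})_(\d{5})_(\d{5})\.rst7$")


def segment_filename(iteration, bin_index, segment):
    """Assemble a coordinate filename from iteration, bin and segment index."""
    return "{:08d}_{:05d}_{:05d}.rst7".format(iteration, bin_index, segment)


def parse_segment_filename(name):
    """Return (iteration, bin, segment) decoded from a coordinate filename."""
    match = SEGMENT_NAME.match(name)
    if match is None:
        raise ValueError("unexpected filename: {!r}".format(name))
    return tuple(int(field) for field in match.groups())


print(segment_filename(34, 2, 13))
# prints: 00000034_00002_00013.rst7
print(parse_segment_filename("00000034_00002_00013.rst7"))
# prints: (34, 2, 13)
```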




In document: Structure Formation of Biomolecules studied with Advanced Molecular Dynamics Simulations (pages 128-131)