
In this section, we briefly present the individual steps required to set up and perform a WE simulation with the hdWE program and the Amber molecular dynamics software []. hdWE uses a global configuration file (hdWE.conf) to define the WE simulation parameters. Its syntax is based on the Python module ConfigParser. The following example shows a configuration file to simulate the binding process of a ligand to a protein with WE, using the distance between protein and ligand as binning coordinate.



[hdWE]
boundaries = 10.0 12.0 14.0 16.0 18.0 20.0
sample-region = -99999 99999
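As a rough illustration of how the entries shown above can be read and what the boundaries imply, the following Python sketch parses the fragment with the standard configparser module and maps a distance onto a bin index. It is not hdWE's own code: the function names read_boundaries and bin_index are hypothetical, one option per line is assumed, and each bin is assumed to include its upper boundary, so that the six boundaries define seven bins.

```python
import configparser
from bisect import bisect_left

# The two options from the example, written one per line (an assumption
# about the layout of the real file).
CONF_TEXT = """
[hdWE]
boundaries = 10.0 12.0 14.0 16.0 18.0 20.0
sample-region = -99999 99999
"""


def read_boundaries(conf_text):
    """Return the bin boundaries of the [hdWE] section as a list of floats."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    return [float(x) for x in parser.get("hdWE", "boundaries").split()]


def bin_index(distance, boundaries):
    """Return the index of the bin (lower, upper] that contains `distance`.

    bisect_left counts the boundaries that are strictly smaller than
    `distance`, which is exactly the bin index when every bin includes
    its upper boundary. `boundaries` must be sorted in ascending order.
    """
    return bisect_left(boundaries, distance)


if __name__ == "__main__":
    boundaries = read_boundaries(CONF_TEXT)
    for d in (5.0, 10.0, 10.5, 20.0, 25.0):
        print(f"d = {d:5.1f} A -> bin {bin_index(d, boundaries)}")
```

With n boundaries this yields n + 1 bins, the first and last of which are open-ended.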

The workflow initially requires the setup of topology and coordinate files of the protein and ligand system with the AMBER package. Instructions on the setup protocol can be found in the AMBER manual. The resulting topology (pro-lig.top), coordinate (pro-lig.top), and Amber run file (pro-lig.in) have to be specified in the WE configuration file. The workdir parameter points to the directory where hdWE searches for files and writes the output of the WE simulation.

To calculate the binning coordinate(s) from the MD trajectories, hdWE relies on the Amber tool cpptraj. The syntax for cpptraj input files is documented in the Amber manual, and the binning coordinate can be any type of coordinate that cpptraj can compute. To specify one or multiple binning coordinates, the corresponding cpptraj command with its Amber masks has to be written to the pro-lig.mask input file, one line per coordinate.

distance d :1-367 :LIG

In this example, the protein is defined by residues 1–367, while the ligand molecule is identified by the residue name LIG. The out directive for the definition of the cpptraj output file is added automatically by hdWE. The bin boundaries are then defined with the boundaries parameter. In our example we have 7 bins covering the distances b0 = {d | 0 < d ≤ 10}, b1 = {d | 10 < d ≤ 12}, ..., b6 = {d | 20 < d < ∞}. The outermost bins are open-ended, so that the bins cover the whole phase space up to the region specified in sampling-region. In our case we just specified a range that is numerically large compared to the system size. The units of the distance are taken from cpptraj, which uses Å for distances. If multiple binning coordinates are used, the boundaries of the additional coordinates are specified in a comma-separated list.

The target number of trajectories per bin is given in segments-per-bin and is required by the resampling routine to decide how often trajectories are to be split or merged. max-iterations gives the number of WE iterations, and the additional flags keep-trajectory-files and keep-coords-frequency specify whether the trajectory output files are kept and how often the coordinate files resulting from each WE iteration are stored. The Amber binary pmemd.cuda should be in the $PATH variable of the shell; alternatively, a full path can be specified. hdWE starts multiple pmemd programs in parallel via an mpi pipeline, so the user can fine-tune the parallelization via the mpirun command, which may also run on multiple nodes via the --host or --hostfile flag of mpirun. Parallelization is switched on with the parallelization-mode parameter, and the available GPUs are defined via the cuda_visible_devices parameter, which is translated internally to the CUDA_VISIBLE_DEVICES shell variable. After the configuration file has been set up, the simulation is ready to be started with hdWE. When hdWE is called with the flag -h from a terminal, it outputs an overview of the available command line options.

usage: hdWE.py [-h] [-c FILE] [-d] [-a | -o] [-n]

 _         ___          ________
| |       | \ \        / /  ____|
| |__   __| |\ \  /\  / /| |__
| '_ \ / _` | \ \/  \/ / |  __|
| | | | (_| |  \  /\  /  | |____
|_| |_|\__,_|   \/  \/   |______|

A hyperdimensional weighted ensemble simulation implementation.

optional arguments:
  -h, --help            show this help message and exit
  -c FILE, --conf FILE  The hdWE configuration file
  -d, --debug           Turn debugging on
  -a, --append          continue previous iterations (use parameters from conf file when --read is given)
  -o, --overwrite       overwrite previous simulation data
  -n, --new-conf        Read new boundaries from config file when --append is used

In our example of a protein–ligand complex formation, the command to initiate the simulation would be

hdWE.py -c hdWE.conf

The program creates two output directories, pro-lig-log and pro-lig-run, containing the log files of every iteration in the first and the trajectory/coordinate files in the second folder. The iteration log filenames are the zero-padded iteration indices with the filename extension .iter (e.g. 00000000.iter, 00000001.iter). The naming scheme for the coordinate and trajectory output files is again assembled from the zero-padded iteration, bin, and trajectory indices (e.g. the file 00000034_00002_00013.rst7 is segment 13 of bin 2 in iteration 34).

hdWE provides a set of tools to analyze the results of a WE simulation. In the given example, the transition rates between arbitrary states can be calculated with the ana_TraceFlux program.

ana_TraceFlux -l pro-lig-log --state-A 0 10 --state-B 18 99999 -B 1000 -bs 500

The program requires the log file directory of the WE simulation and the definition of the state intervals along the binning coordinate. The -B flag causes the initial 1000 WE iterations to be skipped as equilibration for the analysis. ana_TraceFlux performs error analysis according to the statistical bootstrapping method and therefore splits the data into blocks of the size given by -bs (here 500) []. The ana_BinPMF tool calculates the PMF along the binning coordinate.

ana_BinPMF.py -l pro-lig-log -o pmf.dat

It again requires the path of the log file directory and an optional output filename. The PMF data is stored as an ASCII file, which can be conveniently visualized with standard plotting software.
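For post-processing scripts it can be convenient to build and decode the output filenames described earlier in this section. The following Python sketch does this under stated assumptions: the field widths (8, 5, and 5 digits) are inferred from the single example 00000034_00002_00013.rst7, the name is assumed to consist only of the three zero-padded indices and an extension, and all function names are hypothetical rather than part of hdWE.

```python
import re

# Field widths inferred from the example in the text; treat them as an
# assumption, not as a documented hdWE format.
SEGMENT_PATTERN = re.compile(r"^(\d{8})_(\d{5})_(\d{5})\.(\w+)$")


def iteration_log_filename(iteration):
    """Build the log filename of an iteration, e.g. 00000001.iter."""
    return f"{iteration:08d}.iter"


def segment_filename(iteration, bin_index, segment, extension="rst7"):
    """Build a coordinate/trajectory filename from its three indices."""
    return f"{iteration:08d}_{bin_index:05d}_{segment:05d}.{extension}"


def parse_segment_filename(name):
    """Return (iteration, bin_index, segment) decoded from a filename."""
    match = SEGMENT_PATTERN.match(name)
    if match is None:
        raise ValueError(f"unexpected segment filename: {name!r}")
    iteration, bin_index, segment = (int(g) for g in match.groups()[:3])
    return iteration, bin_index, segment


if __name__ == "__main__":
    name = segment_filename(34, 2, 13)
    print(name, "->", parse_segment_filename(name))
```

Parsing with a strict pattern makes files that do not follow the assumed scheme fail loudly instead of being silently misread.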



Distance based RMSD potential in