Your predictions must be uploaded on the D3R SAMPL6 web-page by January 19, 2018.
Templates for the submission of the prediction data are provided at `host_guest/SAMPLing/absolute_submission_example.txt` and `host_guest/SAMPLing/relative_submission_example.txt`.
See the section "Uploading your predictions" for additional information.
The purpose of the SAMPLing challenge component is to evaluate and compare the performance of different sampling methodologies in the context of free energy calculations of biomolecular systems. Participants are invited to compute the free energy of binding of a few host-guest systems taken from the main SAMPL6 challenge. We will be running extremely long calculations with the provided input files in an attempt to obtain "gold standard" results, and then assess how well and how quickly different methods converge to these results.
Force field parameters, and ideally the treatment of long-range interactions, should be identical for all participants to allow a more objective comparison of the sampling methods. For this purpose, equilibrated system files that include topologies and initial configurations are provided in `host_guest/SAMPLing/` in various formats (e.g., Amber, GROMACS, OpenMM, PDB). Five different initial configurations are given for each host-guest system. See the section "Files description" for more details about the input files and the setup protocol.
The specific instructions are slightly different for absolute and relative free energy methods.
For absolute free energy methods, the challenge consists of computing the standard free energy of binding of three host-guest systems:
- CB8-G3 (quinine),
- OA-G3 (5-hexenoic acid), and
- OA-G6 (4-methylpentanoic acid).
A total of 15 free energy calculations have to be performed, starting from the 5 different initial configurations provided for each host-guest system.
For relative free energy methods, the challenge consists of computing the relative binding free energy of the transformation OA-G3 (5-hexenoic acid) to OA-G6 (4-methylpentanoic acid). The specific transformation is described by the atom map provided in JSON format with the input files (see the section "Files description"). A total of 5 free energy calculations have to be performed, starting from the 5 initial configurations provided for OA-G3.
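For reference, the relative binding free energy can be written in terms of the usual thermodynamic cycle. The symbols below are notation introduced here for illustration, and the sign convention (G6 minus G3) is an assumption based on the direction of the transformation stated above. Check it against the submission template.

```latex
\Delta\Delta G_{\mathrm{bind}}
  = \Delta G_{\mathrm{bind}}(\text{OA-G6}) - \Delta G_{\mathrm{bind}}(\text{OA-G3})
  = \Delta G_{\mathrm{complex}}(\mathrm{G3}\to\mathrm{G6})
  - \Delta G_{\mathrm{solvent}}(\mathrm{G3}\to\mathrm{G6})
```

Here the two terms on the right are the free energies of transforming G3 into G6 in the host-guest complex and in solvent, respectively.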
D3R is currently outfitting the SAMPL6 page with the ability to accept your uploaded predictions. As soon as this is ready, you may upload your predictions. If you want to upload more than one set of predictions, generated by different methods, each set must be uploaded as a separate file. Please use the template provided, as the predictions will be parsed and analyzed with automated scripts.
The names of the prediction files for absolute and relative free energy calculation submissions must begin with the prefix `absolute` or `relative`, respectively, followed by an arbitrary name and an integer identifying the set of predictions. For example, to submit three sets of predictions generated by two absolute free energy methods and one relative free energy method, you should upload three files called `absolute-myname-1.txt`, `absolute-myname-2.txt`, and `relative-myname-1.txt`.
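The naming rule can be checked before uploading with a small script. The sketch below assumes the pattern `<absolute|relative>-<name>-<integer>.txt`, which is inferred from the example file names and is not an official specification. The function name `parse_submission_name` is hypothetical.

```python
import re

# Assumed pattern, inferred from the example names: prefix, arbitrary name,
# and an integer set identifier, separated by hyphens, with a .txt extension.
_SUBMISSION_NAME = re.compile(r"^(absolute|relative)-(.+)-(\d+)\.txt$")


def parse_submission_name(filename):
    """Return (method, name, set_id), or raise ValueError if the name does not fit."""
    match = _SUBMISSION_NAME.match(filename)
    if match is None:
        raise ValueError(f"unexpected submission file name: {filename!r}")
    method, name, set_id = match.groups()
    return method, name, int(set_id)
```

For example, `parse_submission_name("absolute-myname-2.txt")` yields the method, the free-form name, and the set number as an integer.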
The file will be machine parsed, so correct formatting is essential. Files with the wrong format will not be accepted.
Commented examples with detailed instructions are located at `host_guest/SAMPLing/absolute_submission_example.txt` and `host_guest/SAMPLing/relative_submission_example.txt`.
Lines beginning with a hash sign (`#`) may be included as comments. These and blank lines will be ignored.
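The comment rule can be mirrored locally when checking a file before upload. This is a minimal sketch, not the parser used by the organizers. The helper name `iter_data_lines` is hypothetical, and stripping leading whitespace before testing for `#` is an assumption.

```python
def iter_data_lines(text):
    """Yield the lines of a submission file that are neither blank nor comments."""
    for line in text.splitlines():
        stripped = line.strip()
        # Skip blank lines and lines whose first non-blank character is '#'.
        if not stripped or stripped.startswith("#"):
            continue
        yield stripped
```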
For each calculation (15/5 for absolute/relative free energy methods), you will have to submit the following information:
- Binding free energy estimates and (optionally) free energy uncertainties using 1%, 2%, 3%, ..., 100% of the sequential data (n.b. not bootstrapped).
- Total number of energy evaluations, total wall clock time, and (optionally) total CPU time.
- Software listing, a description of the hardware used to perform the calculations, and, if the implementation is distributed over multiple processes/CPUs/GPUs, a description of the parallelization strategy.
- A detailed description of the methodology, including the thermodynamic cycle and the number of states (or windows) employed.
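To illustrate the first item, the 100 estimates can be produced by re-running the analysis on growing prefixes of the data in the order it was generated. The sketch below only computes the prefix lengths. Rounding up to the next whole sample is an assumption, and the function name is hypothetical.

```python
def sequential_prefix_lengths(n_samples, n_points=100):
    """Number of leading samples to use for 1%, 2%, ..., 100% of the data.

    Uses integer ceiling division so that every prefix contains at least
    one sample and the last prefix contains all of them.
    """
    return [(n_samples * k + n_points - 1) // n_points
            for k in range(1, n_points + 1)]
```

Each free energy estimate would then be computed from `data[:n]` for every `n` in the returned list, without resampling or reordering the data.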
The reference absolute free energy calculations will be performed using YANK and the following methods/parameters:
- Hamiltonian replica exchange and Langevin dynamics (BAOAB splitting) with the temperature set to 298.15 K.
- A Monte Carlo barostat set at 1 atm.
- OpenMM's implementation of PME for long-range electrostatic interactions with a cutoff of 10 Å, an Ewald tolerance of 1e-4, and 5th-order B-splines (see the OpenMM documentation for more implementation details). A counterion is alchemically decoupled together with the ligand to preserve the neutral net charge of the system.
- VdW interactions with the same 10 Å cutoff and a switching distance of 9 Å.
Further details will be provided in the near future.
Equilibrated systems are provided for OA-G3 (5-hexenoic acid), OA-G6 (4-methylpentanoic acid) and CB8-G3 (quinine), and they are located at `host_guest/SAMPLing/`. Five different initial configurations are given for each system. The files are available in AMBER (`prmtop`/`rst7`), GROMACS (`top`/`gro`), OpenMM (`xml`), LAMMPS (`lmp`/`input`), DESMOND (`cms`), CHARMM (`inp`/`crd`/`prf`/`prm`/`rtf`), and PDB formats. Each sub-folder `HOST-GUEST-X/`, where `X` is a digit labeling one of the 5 initial configurations, contains one folder per program. Each of these folders holds the solvated system files for both the host-guest complex (e.g. `complex.prmtop`, `complex.gro`) and the guest alone (e.g. `solvent.prmtop`, `solvent.gro`).
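The folder layout described above can be enumerated programmatically, for instance to loop over all 15 absolute calculations. This is a sketch under assumptions: the text only says that `X` is a digit, so the starting index (`first_index=0`) is a guess to verify against the repository, and the function name is hypothetical.

```python
from pathlib import PurePosixPath

# The three host-guest systems of the SAMPLing challenge.
SYSTEMS = ("CB8-G3", "OA-G3", "OA-G6")


def configuration_dirs(root="host_guest/SAMPLing", n_configurations=5, first_index=0):
    """List the HOST-GUEST-X sub-folders for every system and configuration.

    first_index is an assumption: check whether the folders are numbered
    from 0 or from 1 in the repository.
    """
    return [
        PurePosixPath(root) / f"{system}-{index}"
        for system in SYSTEMS
        for index in range(first_index, first_index + n_configurations)
    ]
```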
The `host_guest/SAMPLing/` folder also includes an atom map in JSON format that has to be used for relative free energy calculations. The ligand atoms of OA-G3 that match the ligand atoms in OA-G6 are given for the systems in complex and in solvent. The file has the following format:

    "complex":
        "unique_atoms_G3": [184, 185, 187, 192, 193, 194, 195, 196]
        "unique_atoms_G6": [185, 186, 189, 192, 193, 194, 195, 196, 197, 202]
        "atom_map_G3_to_G6":
            "197": 198
            "198": 199
            "199": 201
            ...
    "solvent":
        ...

where `unique_atoms_G3` is a list of atom indices that do not match any G6 atom (and `unique_atoms_G6`, correspondingly, lists the G6 atoms that do not match any G3 atom), and `atom_map_G3_to_G6` maps atoms of G3 to those of G6 by atom index. All indices are 0-based. This map can be used with any of the 5 replicates of `OA-G3-X` and `OA-G6-X`.
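As an illustration of how the map might be consumed, the sketch below parses a truncated sample that contains only the entries shown above. The real file has more entries, and it is assumed here to be standard JSON with nested objects. JSON object keys are always strings, so the G3 indices are converted to integers. The function name `load_atom_map` is hypothetical.

```python
import json

# Truncated sample limited to the values quoted in this document; the braces
# and nesting are an assumption about the actual file.
SAMPLE = """
{
  "complex": {
    "unique_atoms_G3": [184, 185, 187, 192, 193, 194, 195, 196],
    "unique_atoms_G6": [185, 186, 189, 192, 193, 194, 195, 196, 197, 202],
    "atom_map_G3_to_G6": {"197": 198, "198": 199, "199": 201}
  }
}
"""


def load_atom_map(json_text, phase):
    """Return the G3 -> G6 atom map for 'complex' or 'solvent' with integer keys."""
    data = json.loads(json_text)[phase]
    return {int(g3): int(g6) for g3, g6 in data["atom_map_G3_to_G6"].items()}


atom_map = load_atom_map(SAMPLE, "complex")
```

With the sample above, `atom_map[197]` gives the 0-based index of the matching G6 atom.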
All the host-guest system files in the `SAMPLing/` directory were prepared using the protocol below.
- We used the most likely protonation states as predicted by Epik `4.0013` from the Schrödinger toolkit at experimental pH. These are identical to those given in the `mol2` files in `host_guest/OctaAcidsAndGuests/` and `host_guest/CB8AndGuests/`.
- 5 docked complexes were generated with OpenEye `2017.6.1`.
- Hosts and guests were both parametrized with GAFF v1.8 and antechamber. AM1-BCC charges were generated using OpenEye's QUACPAC toolkit through `openmoltools 0.8.1`.
- The systems were solvated in a 12 Å buffer of TIP3P water molecules using tleap. ParmEd `2.7.3` was used to remove some of the water molecules from the OA complexes so that they all have the same number of waters.
- The systems' net charge was neutralized with Na+ and Cl- ions. More Na+ and Cl- ions were added to reach an ionic strength of 60 mM for OA/TEMOA systems and 150 mM for CB8. Note that this ionic strength is different from the one used for the experimental measurements.
- The system was minimized with the L-BFGS optimization algorithm and equilibrated by running 1 ns of Langevin dynamics (BAOAB splitting, 1 fs time step) at 298.15 K with a Monte Carlo barostat set at 1 atm using `OpenMM 7.1.1`. PME was used for long-range electrostatic interactions with a cutoff of 10 Å. VdW interactions used the same 10 Å cutoff and a switching distance of 9 Å.
- After the equilibration, the `System` was serialized into the OpenMM `xml` format. The `rst7` file was generated during the equilibration using the `RestartReporter` in the `parmed.openmm` module (ParmEd `2.7.3`). The AMBER `prmtop` and `rst7` files were then converted to PDB format by MDTraj `1.9.1`.
- Other conversions are described in `SAMPLing/conversion.md`, which also reports validations of the energies computed by the different programs after conversion, for each of the five configurations.
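To give a feel for the ionic strengths quoted in the protocol, the number of Na+/Cl- pairs for a given box volume can be estimated from the concentration. This is only a back-of-the-envelope sketch, not the procedure used to prepare the files. The 100 nm³ volume in the usage note is a made-up example value, and for a 1:1 salt the ionic strength is taken to equal the salt concentration.

```python
AVOGADRO = 6.02214076e23   # particles per mole
LITERS_PER_NM3 = 1e-24     # 1 nm^3 expressed in liters


def n_ion_pairs(ionic_strength_molar, volume_nm3):
    """Estimate the number of monovalent ion pairs for a target ionic strength.

    ionic_strength_molar: target ionic strength in mol/L (e.g. 0.060 or 0.150).
    volume_nm3: solvent volume in cubic nanometers (hypothetical input).
    """
    moles = ionic_strength_molar * volume_nm3 * LITERS_PER_NM3
    return round(moles * AVOGADRO)
```

For a hypothetical 100 nm³ box, `n_ion_pairs(0.060, 100.0)` and `n_ion_pairs(0.150, 100.0)` give the approximate pair counts at the two ionic strengths, in addition to any neutralizing counterions.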