-
Notifications
You must be signed in to change notification settings - Fork 26
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Scale factors lost converting from AMBER to GROMACS #337
Comments
Hi there, I'll be able to take a better look at this tomorrow. Just to check, here are the import BioSimSpace as BSS
# Load the original system.
system = BSS.IO.readMolecules(["MOL.rst7", "MOL.prm7"])
BSS.IO.saveMolecules("MOL", system, ["gro87", "grotop"])
system[0]._sire_object.property("forcefield")
MM ForceField{ amber::ff,
combining_rules = arithmetic,
1-4 scaling = 0.833333, 0.5,
nonbonded = coulomb, lj,
bond = harmonic, angle = harmonic,
dihedral = cosine }
# Convert to GROMACS format.
BSS.IO.saveMolecules("MOL", system, ["gro87", "grotop"])
# Load system back in.
system2 = BSS.IO.readMolecules(["MOL.grotop", "MOL.gro87"])
system2[0]._sire_object.property("forcefield")
MM ForceField{ amber::ff,
combining_rules = arithmetic,
1-4 scaling = 0.833333, 0.5,
nonbonded = coulomb, lj,
bond = harmonic, angle = harmonic,
dihedral = cosine } As you can see, the two forcefield objects are identical. It doesn't look like information is being lost during conversion. Are the scale factors inferred from the initial AMBER files incorrect, i.e. perhaps it is always setting the same value for anything loaded from AMBER format, irrespective of whether this actually was a GROMACS system converted to AMBER. Apologies if I'm misunderstanding things. Just wanted to be clear before debugging further. |
Yes, I see the issue. As I mentioned, the scale factors are being set incorrectly when going to AMBER format. (Not yet sure if it's on read or write.) Here's a minimal example using some example files from the SireUnitTests repository: import BioSimSpace as BSS
# Load a GROMACS system with OPLS parameters.
s0 = BSS.IO.readMolecules(["cage_quin1.gro", "cage_quin1.top"], {"GROMACS_PATH" : "./cage_quin1"})
s0[0]._sire_object.property("forcefield")
MM ForceField{ unknown,
combining_rules = geometric,
1-4 scaling = 0.5, 0.5,
nonbonded = coulomb, lj,
bond = harmonic, angle = harmonic,
dihedral = gromacs }
# Convert to AMBER format and read back in.
BSS.IO.saveMolecules("test", s, ["prm7", "rst7"])
s1 = BSS.IO.readMolecules(["test.rst7", "test.prm7"])
s1[0]._sire_object.property("forcefield")
MM ForceField{ amber::ff,
combining_rules = arithmetic,
1-4 scaling = 0.833333, 0.5,
nonbonded = coulomb, lj,
bond = harmonic, angle = harmonic,
dihedral = cosine } (This assumes you are in the I'll transfer the issue to Sire and see if I can come up with a fix. |
If I analyse the # Just extract the value of the first NB14 parameter, since they are all the same.
list(s1[0]._sire_object.property("parameters").nb14s().values())[0]
AmberNB14( cscl = 0.5, ljscl = 0.5 ) However, if we then convert to GROMACS format, read in, convert back to AMBER format, read in... # Convert to GROMACS and read back.
BSS.IO.saveMolecules("test", s, ["grotop", "gro87"])
s2 = BSS.IO.readMolecules(["test.gro87", "test.grotop"])
# Convert to AMBER and read back.
BSS.IO.saveMolecules("test2", s2, ["prm7", "rst7"])
s3 = BSS.IO.readMolecules(["test2.prm7", "test2.rst7"])
# Now print the NB14 values.
list(s3[0]._sire_object.property("parameters").nb14s().values())[0]
AmberNB14( cscl = 0.833333, ljscl = 0.5 ) Here is the appropriate section from
It looks like the issue is occurring on the second conversions, which is likely due to the inconsistency in the |
The following code in amberprm.cpp sets the if (map["forcefield"].hasValue())
{
ffield = map["forcefield"].value().asA<MMDetail>();
if (not ffield.isAmberStyle())
throw SireError::incompatible_error( QObject::tr(
"This AmberPrm reader can only parse Amber parm files that hold molecules "
"that are parameterised using an Amber-style forcefield. It cannot read "
"molecules using the forcefield\n%1").arg(ffield.toString()), CODELOC );
}
else
{
//now guess the forcefield based on what we know about the potential
ffield = MMDetail::guessFrom("arithmetic", "coulomb", "lj",
1.0/1.2, 0.5, "harmonic", "harmonic",
"cosine");
} It doesn't try to infer the scale factors from any of the I think I need some input from @chryswoods to decide how best to proceed. He's busy in meetings for the next few days so I'll revisit this on Friday, or early next week. |
The issue is that the amber topology (prm) format does not specify the combining rules or the default 1-4 scale factors. The file format supports having different dihedrals having different 1-4 scaling factors, and the Amber default scale factors are used for dihedrals that don't specify their 1-4 terms, e.g. Sire/corelib/src/libs/SireIO/amberprm.cpp Line 4045 in 369a778
Because the combining rules and a "default" 1-4 scaling factor are not in the topology file, it is not possible for Sire to do anything other than guess at what those values should be for the "forcefield" object, hence why it defaults to guessing the Amber values and arithmetic combining rules (when OPLS should be geometric combining rules). In theory the code could scan all of the AmberNB14 objects created after it has read the topology file to find a consensus view of what the "default" 1-4 scaling factors should be. As different dihedrals "could" use different 1-4 terms (although not in any existing forcefield) it would be easiest just to use the 1-4 values from the AmberNB14 that is used most in a System (the defaults apply to the whole System, not just one molecule in the System). That way the guess could be informed by the majority-used AmberNB14 nb14 values. This wouldn't solve the combining rules issue, as the combining rules cannot be written to the topology file, and are thus not available to be read back in. The only solution here is for the user to specify the combining rules via a user-mapping object on read (e.g. as they can now specify the forcefield manually via the map["forcefield"] parameter). I should note that I don't believe that the Amber programs support running simulations using geometric combining rules, as I think they only support arithmetic rules. As such, they won't be using the "correct" OPLS forcefield. Last time I checked was a few years back, so things may have changed. |
Could this be solved by allowing the user to provide a keyword argument to BSS.IO.readMolecules() that specifies non-default scaling factors to be used ? |
Thanks, @chryswoods, that makes sense. I had written code to do what you suggested regarding scanning the |
@jmichel80 Yes, that could be a good option. |
I don't like the idea of printing warnings to the screen, but in this case I think it would be justified. Technically, we should not write a System to AmberPrm if it says it uses geometric combining rules, as these are not supported by the file format. I suggest that we add a warning to AmberPrm that writes a warning to the screen saying something like A very dangerous solution would be for us to add our own %FLAG% to the AmberPrm format that specifies the combining rules to be used. AmberPrm allows for customs flags, e.g. unrecognised flags are ignored, so you can add your own if needed (https://ambermd.org/FileFormats.php#topo.cntrl). It would be safe(ish) to call the flag "%FLAG BSS_COMBINING_RULES" and use "arithmetic" or "geometric" as strings. It does "cross the rubicon" of us modifying file formats, and, ultimately, the best solution would be for the Amber developers to add this flag themselves. |
@jmichel80 - yes, that is effectively what Sire supports via the "map" keyword. This could be exposed in BioSimSpace and made easier to use (e.g. specifying "combining_rules" and "default_14_params" directly). However, the more we add, the more options would then need to be exported by the workflow components that read these files. So we would need to be careful. Ideally, we should use the information that exists in the file, and this does give the 14-scaling parameters, so we should use those. It may be necessary in the case of combining rules though... |
Thanks, those are good suggestions. I totally agree about not liking warning messages, which is why I have pretty much suppressed everything from third-party libraries, or caught the warning and expressed it in a more informative way. |
The |
I suggest posting an example for @jthorton benefit and we will make sure this is well documented when we tidy up later this year. Best to avoid turning BSS into a tool for producing subtly flawed inputs that will run nonetheless... |
I've added the suggested warning to the BioSimSpace writer. I think everything would work if we updated if (map["forcefield"].hasValue())
{
ffield = map["forcefield"].value().asA<MMDetail>();
if (not ffield.isAmberStyle())
throw SireError::incompatible_error( QObject::tr(
"This AmberPrm reader can only parse Amber parm files that hold molecules "
"that are parameterised using an Amber-style forcefield. It cannot read "
"molecules using the forcefield\n%1").arg(ffield.toString()), CODELOC );
}
else
{
//now guess the forcefield based on what we know about the potential
ffield = MMDetail::guessFrom("arithmetic", "coulomb", "lj",
1.0/1.2, 0.5, "harmonic", "harmonic",
"cosine");
} Which would fail if we passed through the Currently ['amber::gaff',
'amber::ff14',
'amber::ff99',
'amber::ff12',
'amber::ff03',
'amber::ff'] On loading an OPLS system we get: MM ForceField{ unknown,
combining_rules = geometric,
1-4 scaling = 0.5, 0.5,
nonbonded = coulomb, lj,
bond = harmonic, angle = harmonic,
dihedral = gromacs } However, the following code in mmdetail.cpp looks like it would do what we want: /** Function used to guess the forcefield from the passed set of conditions.
This returns a null MMDetail object if we can't guess */
MMDetail MMDetail::guessFrom(QString combrules, QString elecstyle,
QString vdwstyle, double elec14, double vdw14,
QString bondstyle, QString angstyle, QString dihstyle)
{
//start with the internals
if ( _isHarmonic(bondstyle) and _isHarmonic(angstyle) and _isCosine(dihstyle) )
{
if ( _isCoulomb(elecstyle) and _isLJ(vdwstyle) )
{
//looking like an amber or opls style forcefield...
if ( _isArithmetic(combrules) )
{
if (elec14 == (1.0/1.2) and vdw14 == 0.5)
{
//this is a standard amber::ff forcefield
return MMDetail("amber::ff", combrules, elec14, vdw14,
elecstyle, vdwstyle, bondstyle, angstyle, dihstyle);
}
else
{
//this is a weird amber::ff forcefield with strange 1-4 terms...
return MMDetail( QString("amber::ff[%1,%2]").arg(elec14).arg(vdw14),
combrules, elec14, vdw14, elecstyle, vdwstyle,
bondstyle, angstyle, dihstyle );
}
}
else if ( _isGeometric(combrules) )
{
if (elec14 == 0.5 and vdw14 == 0.5)
{
//this is a standard opls::ff forcefield
return MMDetail("opls::ff", combrules, elec14, vdw14,
elecstyle, vdwstyle, bondstyle, angstyle, dihstyle);
}
else
{
//this is a weird opls::ff forcefield with strange 1-4 terms...
return MMDetail( QString("opls::ff[%1,%2]").arg(elec14).arg(vdw14),
combrules, elec14, vdw14, elecstyle, vdwstyle,
bondstyle, angstyle, dihstyle );
}
}
}
}
return MMDetail("unknown", combrules, elec14, vdw14, elecstyle, vdwstyle,
bondstyle, angstyle, dihstyle);
} If looks like we are getting |
Okay, I've solved this by flagging Ryckaert-Bellemans dihedrals as a cosine dihedral in gromacsparams.cpp (after which the original force field is detected as With these changes in place you can do the following: # This assumes that you are in SireUnitTests/unittests/io
import BioSimSpace as BSS
# Load an OPLS GROMACS system.
s0 = BSS.IO.readMolecules(["cage_quin1.gro", "cage_quin1.top"],
{"GROMACS_PATH" : "./cage_quin1"})
# Store the force field property for the only molecule in the system.
ff0 = s0[0]._sire_object.property("forcefield")
# Write to AMBER format. We don't need to pass the force field
# in the map since we can write OPLS without issue, although we
# are now warned that this process is potentially irreversible.
BSS.IO.saveMolecules("test", s0, ["prm7", "rst7"])
# UserWarning: AMBER topology files do not support force fields that
# use geometric combining rules, as this cannot be specified in the file.
# When this file is re-read, then arithmetic combining rules will be assumed."
# Read the files back in and specify the format of the original force field.
s1 = BSS.IO.readMolecules(["test.prm7", "test.rst7"], {"forcefield" : ff0})
# Store the new force field.
ff1 = s1[0]._sire_object.property("forcefield")
# Assert the force fields are the same.
ff0 == ff1
True
# Now write back to GROMACS formats to make sure that the scale factors and
# combining rules are preserved.
BSS.IO.saveMolecules("test", s1, ["grotop", "gro87"]) Check the
Great, everything has worked as expected! If needed, I could add a thin wrapper around |
Expected Behavior
When converting an AMBER system with non-default scale factors to GROMACS these are reset back to the amber defaults when I think they should be kept. The output grotop file has this default section but I think based on the input below which uses opls style scaling both fudge factors should be 0.5.
Steps to Reproduce
Context
Please provide any relevant information about your setup. This is important in case the issue is not reproducible except for under certain conditions.
input.zip
The text was updated successfully, but these errors were encountered: