-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
My plan of representing the restraint #1
Conversation
Thanks for this. In terms of your modifications to _restraint.py, I agree that using the production protocol makes sense. I think the tests look good for your suggested representation. However, to avoid user error, it might be good to provide a setter method for the restraints, e.g.:
Where this could use default units (but allow for other units to be passed in). This might prevent users, for example, using the anchor points inconsistently in the Boresch DOF (e.g. using an atom in the bond which is not present in the angles). It might also be good to add a test to check for this. Thanks |
@fjclark You are right, I will adopt this format.
So my thought is that we would have
So we would have the check for restraint in BSS.FreeEnergy.Relative and will not set the restraint in the BSS.FreeEnergy.Restraint |
@fjclark So currently, I will be working on implementing the test for this data structure and the downstream writing for Gromacs in this PR. |
@xiki-tempula this sounds good. However, what do you think of my suggestion to have both BSS.FreeEnergy.RestraintSearch (what is currently BSS.FreeEnergy.Restraint), and BSS.FreeEnergy.Restraint (to hold an engine-agnostic representation of the restraints with potentials as algebraic expressions) (see michellab#321)? If you agree that this is a sensible approach, I think a sensible work plan from my side would be:
What do you think? Thanks |
@fjclark I think
Is quite sensible.
I think this is also quite sensible. I will work on this in this PR where I will implement the checker and the toString method for altering the Gromacs topology. For now I think we can have the data structure as #1 (comment). I'm an industry worker so the deadline for implementing a ABFE prototype is the end of this month and so (to hold an engine-agnostic representation of the restraints with potentials as algebraic expressions) might be a bit too much for the moment. The deadline of getting a ABFE prototype by the end of this month is a deadline for me so don't worry if you cannot get BSS.FreeEnergy.RestraintSearch done, I could swap in the MDRestrintGenerator for now. |
@xiki-tempula OK - I will work exclusively on BSS.FreeEnergy.RestraintSearch now, based on the data structure you highlighted. I appreciate your need to work quickly and I'm sorry that I have not had as much time as I would have liked to work on this. I will do my best with BSS.FreeEnergy.RestraintSearch, but I have some other work to complete this week, and am away/ at a conference from the 28th - 8th, so it is likely I will not complete this before the end of the month - sorry about that. |
@fjclark No worry, we had anticipated this and the origin plan is that I will get this whole thing done and then you will be join us so your help is very appreciated. |
@fjclark I have finished the data structure of the Restraint, do you mind have a look? |
|
||
''' | ||
def __init__(self, system, restraint_dict, type='Boresch'): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be better to avoid passing in the system with the restraint, if we are planning to do this:
restraint , best_frame = restraint_search.analyse() # where restraint is of the type BSS.FreeEnergy.Restraint
protocol = BSS.Protocol.FreeEnergy(restraint=restraint, ...)
ABFE = BSS.FreeEnergy.Absolute(system, protocol, work_dir="output") # Handle writing restraints here based on the engine
Because storing the system in the restraint class would mean effectively passing in the system twice to BSS.FreeEnergy.Absolute . Maybe it would be more robust to take the system as an argument to toString, so in BSS.FreeEnergy.Absolute(system, protocol, work_dir="output") (or relative), only one system is considered, and there can never be any issues with differing systems. BSS.FreeEnergy.Absolute could then call Restraint.toString(system,...) internally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because storing the system in the restraint class would mean effectively passing in the system twice to BSS.FreeEnergy.Absolute.
This is a tricky issue, I do agree that passing the system twice is not optimal but I think passing system
to toString
is not intuitive.
I realised that I forget to implement this but I will also need to check if the ligand atoms are in the decoupled molecule, which is a property of the system.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree it would be optimal to check if the ligand atoms are in the decoupled molecule in Restraint, and I agree that passing system
to toString
is not the clearest.
However, I find would find it most intuitive if Restraint was as separate as possible from the system. For example, maybe the same restraints would be used for different systems (maybe starting from different conformations without bothering to run another RestraintSearch):
ABFE1 = BSS.FreeEnergy.Absolute(system_conformation1, protocol, work_dir="output")
ABFE2 = BSS.FreeEnergy.Absolute(system_conformation2, protocol, work_dir="output")
In this case, it might cause some confusion if ABFE2 has effectively been passed two different systems.
If I understand correctly: I was thinking that BSS.Free.Energy.Absolute (or relative) might be the best place to check if the ligand atoms are in the decoupled molecule, because we will have to check here anyway (in case the user has accidentally passed in a different system with a ligand marked for decoupling) and might as well not duplicate the check. However, I can see benefits of also checking in Restraint (e.g. if a user wants to run the subsequent calculations outside of BSS), so happy to stick with the current approach if you prefer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have give it some thought but I cannot think of an easy way out.
So in the later stage, the BSS.FreeEnergy.Restraint object will be part of the BSS._SireWrappers.System. This should take care of the topology writer issue as well. However, this could be a large task and might take some time.
My vision of the current version of the BSS.FreeEnergy.Restraint object is that it will be a patch to the System to hold the Restraint.
With regard to the issue of maybe starting from different conformations without bothering to run another RestraintSearch
One solution could be that give BSS.FreeEnergy.Restraint a method update_system(), to update the underlaying system without changing the restraint.
f'yet. Only boresch restraint is supported.') | ||
|
||
def toString(self, engine='Gromacs'): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Possibly pass the system in here (see previous comment).
import pytest | ||
|
||
import BioSimSpace.Sandpit.Exscientia as BSS | ||
from BioSimSpace.Sandpit.Exscientia.Align._decouple import decouple |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should it be _decouple or decouple? My understanding was that the user would have to directly use decouple, so would from BioSimSpace.Sandpit.Exscientia.Align.decouple import decouple
be better?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you are right, I guess I copied and pasted this command from somewhere so hadn't changed it.
self._system = system.copy() | ||
|
||
if type.lower() == 'boresch': |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be better to use rest_type instead of type to be consistent with RestraintSearch, and to stay away from type()?
@fjclark Thanks for the review and comments, I have made the changes. Do you mind have a look, please? Thank you.
|
@xiki-tempula Thanks for the changes. I only have a few minor comments: |
else: | ||
raise NotImplementedError(f'Restraint type {type} not implemented ' | ||
f'yet. Only boresch restraint is supported.') | ||
|
||
self._content = restraint_dict |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would self._restraint_dict be clearer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes
atom = self._content['anchor_points'][key] | ||
if not atom._sire_object.molecule().number() == decoupled_mol._sire_object.number(): | ||
raise ValueError( | ||
f'The ligand atom {key} is not from decoupled moleucle.') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo - "moleucle"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch
|
||
# Store a copy of solvated system. | ||
self._system = system.copy() | ||
|
||
def toString(self, engine='Gromacs'): | ||
'''The method for convert the restraint to a format that could be used |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I realise you're in a rush, but if you have time later, maybe make the docstrings more complete in general (e.g. include parameters)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I will add them now
(decoupled_mol,) = system.getDecoupledMolecules() | ||
for key in ['l1', 'l2', 'l3']: | ||
atom = self._content['anchor_points'][key] | ||
if not atom._sire_object.molecule().number() == decoupled_mol._sire_object.number(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you need to loop over l1, 2, and 3, if we expect them all to have the same molecule number, and they are all compared to the same number (decoupled_mol._sire_object.number())?
Would it be more robust and specific to do something like:
for key in ['l1', 'l2', 'l3']:
atom = self._content['anchor_points'][key]
if not atom in decoupled_mol:
...
Could also check the protein anchor atoms generally with if atom in system
, if I am understanding this correctly: michellab#331.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@fjclark This is a new feature
Do you mind merge the michellab/BioSimSpace/devel into this devel so I could test it? Thanks.
@fjclark Many thanks for the help. |
@xiki-tempula , great, and thank you. |
* update * Update _restraint.py
@fjclark When you are back, just give me a shout and I can point you towards the lates version. |
@fjclark So I think my version is stable now. Please see the branch https://github.com/Exscientia/BioSimSpace/tree/feat_gen_rest I have incorporated your changes so I would recommend that you could reset to my branch and work from that to avoid conflict hell. So I setup my environment as
The complete API to run the restraint generation is (assume that the ligand and protein is the BSS.molecule which has been parameterised). import BioSimSpace.Sandpit.Exscientia as BSS
from BioSimSpace.Sandpit.Exscientia.Align import decouple
# Decouple the ligand
decoupled_ligand = decouple(ligand)
new_system = (protein + decoupled_ligand).toSystem()
# Solvation
solvated = BSS.Solvent.tip3p(molecule=new_system, box=3 * [6.5 * BSS.Units.Length.nanometer])
# Energy Minimisation
protocol = BSS.Protocol.Minimisation()
process = BSS.Process.Gromacs(solvated, protocol, work_dir='em')
process.start()
em = process.getSystem(block=True)
# Equilibration
em.repartitionHydrogenMass()
protocol = BSS.Protocol.Equilibration()
process = BSS.Process.Gromacs(em, protocol, work_dir='equil')
process.setArg("-update", "gpu")
process.start()
equil = process.getSystem(block=True)
# Production run to get the restraint
protocol = BSS.Protocol.Production(timestep=2*BSS.Units.Time.femtosecond, runtime=15*BSS.Units.Time.nanosecond)
restraint_search = BSS.FreeEnergy.RestraintSearch(equil, protocol=protocol, engine='GROMACS', gpu_support=True, ignore_warnings=True, work_dir='restraint')
restraint_search.start()
restraint = restraint_search.analyse(method='MDRestraintsGenerator', block=True)
# Run the protein leg of the ABFE
protocol = BSS.Protocol.FreeEnergyMinimisation(
lam=pd.Series(data={'bonded': 0.0, 'coul': 0.0, 'vdw': 0.0}),
lam_vals=pd.DataFrame(
data={'bonded': [0.0, 0.01, 0.025, 0.05, 0.075, 0.1, 0.2, 0.35, 0.5, 0.75, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0],
'coul': [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.16, 0.33, 0.5, 0.67, 0.83, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0],
'vdw': [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9, 0.95, 1.0]}
))
protein_em = BSS.FreeEnergy.Relative(restraint.system, protocol, engine='GROMACS', restraint=restraint, work_dir='protein_em')
# Set up the ligand leg
# Solvation
solvated = BSS.Solvent.tip3p(molecule=decoupled_ligand, box=3 * [3.06 * BSS.Units.Length.nanometer])
protocol = BSS.Protocol.FreeEnergyMinimisation(
lam=pd.Series(data={'coul': 0.0, 'vdw': 0.0}),
lam_vals=pd.DataFrame(
data={'coul': [0.0, 0.125, 0.25, 0.375, 0.5, 0.625, 0.75, 0.875, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0],
'vdw': [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9, 0.95, 1.0]}
))
ABFE_em = BSS.FreeEnergy.Relative(solvated, protocol, engine='GROMACS', work_dir='ligand_em') |
@xiki-tempula brilliant. Thanks very much for the information. I will reset to your branch and open a pull request to feat_gen_restraint when I continue work on this. To update you on my work plan: my main focus for this month is analysing/ running some more simulations and producing a report for a deadline at the end of the month, so I am unlikely to do much work on BSS. However, from early July, I plan to work on implementing ABFE for SOMD. Thanks |
@fjclark Hi, I have polished up the ABFE generation and the pipeline is in https://github.com/Exscientia/BioSimSpace devel branch now. When you start the development. Do you mind reset to the devel branch? Thanks. |
@xiki-tempula thanks for the update - I will do. Thanks. |
Please don't merge this PR yet.
After the discussion of michellab#321, I have tried to show my plan of representing the restraint https://github.com/fjclark/BioSimSpace/pull/1/files#diff-7ea50c3160a47ea78b09fa99cd48c243de7b49074fc5e62313221400c834ac21R40-R56
And the definition for each field
https://github.com/fjclark/BioSimSpace/pull/1/files#diff-7ea50c3160a47ea78b09fa99cd48c243de7b49074fc5e62313221400c834ac21R59-R98
@fjclark Do you mind have a look and tell me your thought? Thanks.