adding scripts from 2017 mof subset paper NO_JIRA #60
Merged
2 commits
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,45 +1,60 @@ | ||
## Contents | ||
|
||
This folder contains scripts submitted by users or CCDC scientists for anyone to use freely. | ||
# Contents | ||
|
||
### Hydrogen bond propensity | ||
- Writes a `.docx` report of a hydrogen bond propensity calculation for any given `.mol2` file or refcode. | ||
This folder contains scripts submitted by users or CCDC scientists for anyone to use freely. | ||
|
||
### Multi-component hydrogen bond propensity | ||
- Performs a multi-component HBP calculation for a given library of co-formers. | ||
## Concat Mol2 | ||
|
||
### Packing similarity dendrogram | ||
- Construct a dendrogram for an input set of structures based on packing-similarity analysis. | ||
- Concatenates mol2 files present in working directory to a single `.mol2` file. | ||
|
||
### GOLD-multi | ||
- Use the CSD Docking API and the multiprocessing module to parallelize GOLD docking. | ||
## Create CASTEP Input | ||
|
||
- Creates input files (`.cell` and `.param`) for a given compound through Mercury. | ||
|
||
## Create GAUSSIAN Input | ||
|
||
- Create GAUSSIAN input file (`.gjf`) for a given CSD refcode or `.mol2` file. | ||
|
||
## Find Binding Conformation | ||
|
||
### Find Binding Conformation | ||
- Generates idealized conformers for ligands and evaluates their RMSD to the conformation in the PDB. | ||
|
||
### Concat Mol2 | ||
- Concatenates mol2 files present in working directory to a single `.mol2` file. | ||
## GOLD-multi | ||
|
||
### Create CASTEP Input | ||
- Creates input files (`.cell` and `.param`) for a given compound through Mercury. | ||
- Use the CSD Docking API and the multiprocessing module to parallelize GOLD docking. | ||
|
||
### Create GAUSSIAN Input | ||
- Create GAUSSIAN input file (`.gjf`) for a given CSD refcode or `.mol2` file. | ||
## Hydrogen bond propensity | ||
|
||
- Writes a `.docx` report of a hydrogen bond propensity calculation for any given `.mol2` file or refcode. | ||
|
||
## MOF subset 2017 Chem Mater publication | ||
|
||
- Two scripts that were supplementary information in the publication "Development of a Cambridge Structural Database Subset: | ||
A Collection of Metal–Organic Frameworks for Past, Present, and Future" DOI: <https://doi.org/10.1021/acs.chemmater.7b00441> | ||
|
||
## Multi-component hydrogen bond propensity | ||
|
||
- Performs a multi-component HBP calculation for a given library of co-formers. | ||
|
||
## Packing similarity dendrogram | ||
|
||
- Construct a dendrogram for an input set of structures based on packing-similarity analysis. | ||
|
||
## Particle Rugosity | ||
|
||
### Particle Rugosity | ||
- Calculates the simulated BFDH particle rugosity weighted by facet area. | ||
|
||
## Tips | ||
A section for top tips in using the repository and GitHub. | ||
### Searching tips: | ||
## Tips | ||
|
||
A section for top tips in using the repository and GitHub. | ||
|
||
### Searching tips | ||
|
||
The search bar in GitHub allows you to search for keywords mentioned in any file throughout the repository (in the main branch). | ||
|
||
It is also possible to filter which file type you are interested in. | ||
|
||
For example: | ||
"hydrogen bond" | ||
For example: | ||
"hydrogen bond" | ||
|
||
<img src="../assets/search.gif" width="500px"> | ||
[markdownlint] reported by reviewdog 🐶: MD033/no-inline-html Inline HTML [Element: img]
119 changes: 119 additions & 0 deletions
...pts/mof_solvent_removal_2017_chem_mater_publication/Command_prompt_MOF_solvent_removal.py
@@ -0,0 +1,119 @@
#
# This script can be used for any purpose without limitation subject to the
# conditions at http://www.ccdc.cam.ac.uk/Community/Pages/Licences/v2.aspx
#
# This permission notice and the following statement of attribution must be
# included in all copies or substantial portions of this script.
#
# 2016-12-15: created by S. B. Wiggin, the Cambridge Crystallographic Data Centre
# 2024-07-02: minor update to include using ccdc utilities to find the solvent file

"""
Script to identify and remove bound solvent molecules from a MOF structure.

Solvents are identified using a defined list.
Output in CIF format includes only framework component with all monodentate solvent removed.
"""
#######################################################################

import os
import glob
import argparse

from ccdc import io
from ccdc import utilities

#######################################################################

arg_handler = argparse.ArgumentParser(description=__doc__)
arg_handler.add_argument(
    'input_file',
    help='CSD .gcd file from which to read MOF structures'
)
arg_handler.add_argument(
    '-o', '--output-directory',
    help='Directory into which to write stripped structures'
)
arg_handler.add_argument(
    '-m', '--monodentate', default=False, action='store_true',
    help='Whether or not to strip all unidentate (or monodentate) ligands from the structure'
)
arg_handler.add_argument(
    '-s', '--solvent-file',
    help='Location of solvent file'
)

args = arg_handler.parse_args()
if not args.output_directory:
    args.output_directory = os.path.dirname(os.path.abspath(args.input_file))

# Define the solvent smiles patterns
if not args.solvent_file:
    args.solvent_file = utilities.Resources().get_ccdc_solvents_dir()

if os.path.isdir(args.solvent_file):
    solvent_smiles = [
        io.MoleculeReader(f)[0].smiles
        for f in glob.glob(os.path.join(args.solvent_file, '*.mol2'))
    ]
else:
    solvent_smiles = [m.smiles for m in io.MoleculeReader(args.solvent_file)]


#######################################################################


def is_multidentate(c, mol):
    """
    Check for components bonded to metals more than once.
    If monodentate is not specified in the arguments, skip this test.
    """
    if not args.monodentate:
        return True
    got_one = False
    for a in c.atoms:
        orig_a = mol.atom(a.label)
        if any(x.is_metal for b in orig_a.bonds for x in b.atoms):
            if got_one:
                return True
            got_one = True
    return False


def is_solvent(c):
    """Check if this component is a solvent."""
    return c.smiles == 'O' or c.smiles in solvent_smiles


def has_metal(c):
    """Check if this component has any metals."""
    return any(a.is_metal for a in c.atoms)


# Iterate over entries
try:
    for entry in io.EntryReader(args.input_file):
        if entry.has_3d_structure:
            # Ensure labels are unique
            mol = entry.molecule
            mol.normalise_labels()
            # Use a copy
            clone = mol.copy()
            # Remove all bonds containing a metal atom
            clone.remove_bonds(b for b in clone.bonds if any(a.is_metal for a in b.atoms))
            # Work out which components to remove
            to_remove = [
                c
                for c in clone.components
                if not has_metal(c) and (not is_multidentate(c, mol) or is_solvent(c))
            ]
            # Remove the atoms of selected components
            mol.remove_atoms(
                mol.atom(a.label) for c in to_remove for a in c.atoms
            )
            # Write the CIF
            entry.crystal.molecule = mol
            with io.CrystalWriter('%s/%s_stripped.cif' % (args.output_directory, entry.identifier)) as writer:
                writer.write(entry.crystal)
except RuntimeError:
    print('File format not recognised')
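The selection rule used in the loop above (cut every metal bond, split what remains into components, then drop metal-free components that are either listed solvents or, with `-m`, attached to metals through fewer than two atoms) can be illustrated without the CSD Python API. The following is only a minimal sketch on a toy graph: the `METALS` set, the atom labels, the `find_components` and `components_to_remove` helpers, and the use of an element tuple in place of a SMILES string are all assumptions made for illustration and are not part of the script.

```python
from collections import defaultdict

METALS = {"Zn", "Cu"}  # hypothetical stand-in for the API's Atom.is_metal


def find_components(atoms, bonds):
    """Return the connected components (sets of atom labels) of a bond graph."""
    neighbours = defaultdict(set)
    for a, b in bonds:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen, components = set(), []
    for start in atoms:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            label = stack.pop()
            if label in comp:
                continue
            comp.add(label)
            stack.extend(neighbours[label] - comp)
        seen |= comp
        components.append(comp)
    return components


def components_to_remove(atoms, bonds, solvent_formulas, monodentate=False):
    """Mirror the script's rule: metal-free and (solvent or not multidentate)."""
    def is_metal(label):
        return atoms[label] in METALS

    # Cut every bond that involves a metal, as the script does on the clone
    organic_bonds = [(a, b) for a, b in bonds if not (is_metal(a) or is_metal(b))]
    removed = []
    for comp in find_components(atoms, organic_bonds):
        if any(is_metal(label) for label in comp):
            continue
        # Atoms of this fragment that were bonded to a metal in the original graph
        donors = {
            label
            for a, b in bonds
            for label, other in ((a, b), (b, a))
            if label in comp and is_metal(other)
        }
        multidentate = (not monodentate) or len(donors) >= 2
        # Toy substitute for the SMILES comparison in the real script
        formula = tuple(sorted(atoms[label] for label in comp))
        if not multidentate or formula in solvent_formulas:
            removed.append(comp)
    return removed


# Toy structure: two Zn centres, one bound water, one bridging carboxylate
# fragment and one terminal chloride.
atoms = {"Zn1": "Zn", "Zn2": "Zn", "O1": "O", "H1": "H", "H2": "H",
         "C1": "C", "O2": "O", "O3": "O", "Cl1": "Cl"}
bonds = [("Zn1", "O1"), ("O1", "H1"), ("O1", "H2"), ("Zn1", "O2"),
         ("Zn2", "O3"), ("C1", "O2"), ("C1", "O3"), ("Zn2", "Cl1")]
solvents = {("H", "H", "O")}

print([sorted(c) for c in components_to_remove(atoms, bonds, solvents)])
print([sorted(c) for c in components_to_remove(atoms, bonds, solvents, monodentate=True)])
```

In this toy case only the water fragment is selected by default, while the monodentate mode also selects the terminal chloride and leaves the bridging carboxylate in place.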
98 changes: 98 additions & 0 deletions
scripts/mof_solvent_removal_2017_chem_mater_publication/Mercury_MOF_solvent_removal.py
@@ -0,0 +1,98 @@
#
# This script can be used for any purpose without limitation subject to the
# conditions at http://www.ccdc.cam.ac.uk/Community/Pages/Licences/v2.aspx
#
# This permission notice and the following statement of attribution must be
# included in all copies or substantial portions of this script.
#
# 2016-12-15: created by S. B. Wiggin, the Cambridge Crystallographic Data Centre
# 2024-07-02: minor update to include using ccdc utilities to find the solvent file

"""
Script to identify and remove bound solvent molecules from a MOF structure.

Solvents are identified using a defined list.
Output in CIF format includes only framework component with all monodentate solvent removed.
"""
#######################################################################

import os
import glob

from ccdc import io
from ccdc import utilities
from mercury_interface import MercuryInterface

#######################################################################

helper = MercuryInterface()
solvent_smiles = []

# Define the solvent smiles patterns
solvent_file = utilities.Resources().get_ccdc_solvents_dir()

if os.path.isdir(solvent_file):
    solvent_smiles = [
        io.MoleculeReader(f)[0].smiles
        for f in glob.glob(os.path.join(solvent_file, '*.mol2'))
    ]

else:
    html_file = helper.output_html_file
    f = open(html_file, "w")
    f.write('<br>')
    f.write('Sorry, unable to locate solvent files in the CCDC directory')
    f.write('<br>')
    f.close()
    # a user-defined solvent directory could be added here instead

#######################################################################


def is_solvent(c):
    """Check if this component is a solvent."""
    return c.smiles == 'O' or c.smiles in solvent_smiles


def has_metal(c):
    """Check if this component has any metals."""
    return any(a.is_metal for a in c.atoms)


entry = helper.current_entry
if entry.has_3d_structure:
    # Ensure labels are unique
    mol = entry.molecule
    mol.normalise_labels()
    # Use a copy
    clone = mol.copy()
    # Remove all bonds containing a metal atom
    clone.remove_bonds(b for b in clone.bonds if any(a.is_metal for a in b.atoms))
    # Work out which components to remove
    to_remove = [
        c
        for c in clone.components
        if not has_metal(c) and is_solvent(c)
    ]
    # Remove the atoms of selected components
    mol.remove_atoms(
        mol.atom(a.label) for c in to_remove for a in c.atoms
    )
    # Write the CIF
    entry.crystal.molecule = mol
    with (io.CrystalWriter('%s/%s_stripped.cif' % (helper.options['working_directory_path'], entry.identifier)) as
          writer):
        writer.write(entry.crystal)
    html_file = helper.output_html_file
    f = open(html_file, "w")
    f.write('<br>')
    f.write('Cif file containing MOF framework without monodentate solvent written to your output directory')
    f.write('<br>')
    f.close()
else:
    html_file = helper.output_html_file
    f = open(html_file, "w")
    f.write('<br>')
    f.write('Sorry, this script will only work for CSD entries containing atomic coordinates')
    f.write('<br>')
    f.close()
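The Mercury script repeats the same open/write/close sequence for each message it reports. One possible tidy-up, shown here only as a sketch, is a small helper built on a `with` block so the file is closed even if a write fails. The name `write_html_message` is hypothetical and does not appear in the PR; in the script the path would come from `helper.output_html_file`.

```python
import os
import tempfile


def write_html_message(html_path, message):
    """Write a single message, wrapped in <br> tags, to the given HTML file."""
    with open(html_path, "w") as handle:
        handle.write('<br>')
        handle.write(message)
        handle.write('<br>')


if __name__ == "__main__":
    # Temporary file used purely for demonstration
    path = os.path.join(tempfile.mkdtemp(), "output.html")
    write_html_message(path, "first message")
    write_html_message(path, "second message")
    with open(path) as handle:
        print(handle.read())
```

Because the file is opened in write mode each time, a later message replaces an earlier one. The same holds for the script as written: if the solvent directory is not found, that warning is overwritten by whichever message is written afterwards.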
56 changes: 56 additions & 0 deletions
scripts/mof_solvent_removal_2017_chem_mater_publication/ReadMe.md
@@ -0,0 +1,56 @@
# MOF solvent removal

## Summary

Scripts included in the supporting information of the article "Development of a Cambridge Structural Database Subset:
A Collection of Metal–Organic Frameworks for Past, Present, and Future", Peyman Z. Moghadam, Aurelia Li,
Seth B. Wiggin, Andi Tao, Andrew G. P. Maloney, Peter A. Wood, Suzanna C. Ward, and David Fairen-Jimenez
*Chem. Mater.* **2017**, 29, 7, 2618–2625, DOI: <https://doi.org/10.1021/acs.chemmater.7b00441>

The two scripts are essentially equivalent: one is designed to be run through the Mercury CSD Python API menu to
remove solvent from the single structure shown in the visualiser; the other runs from the command line
and takes a list of CSD entries (a `.gcd` file) to run through the solvent removal process in bulk.

## Requirements

Tested with CSD Python API 3.9.18

## Licensing Requirements

CSD-Core

## Instructions on running

For the script `Mercury_MOF_solvent_removal.py`:

- In Mercury, pick **CSD Python API** in the top-level menu, then **Options…** in the resulting pull-down menu.
- The Mercury Scripting Configuration control window will be displayed; from the *Additional Mercury Script Locations*
section, use the **Add Location** button to navigate to the folder containing the script.
- It will then be possible to run the script directly from the CSD Python API menu, with the script running on the structure
shown in the visualiser.

For the script `Command_prompt_MOF_solvent_removal.py`:

```cmd
python Command_prompt_MOF_solvent_removal.py <search_results>.gcd
```

```cmd
positional arguments:
  input_file            CSD .gcd file from which to read MOF structures

optional arguments:
  -h, --help            show this help message and exit
  -o OUTPUT_DIRECTORY, --output-directory OUTPUT_DIRECTORY
                        Directory into which to write stripped structures
  -m, --monodentate
                        Whether or not to strip all unidentate (or monodentate) ligands from the structure
  -s SOLVENT_FILE, --solvent-file SOLVENT_FILE
                        Location of solvent file
```
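To see how these options combine without a CSD installation, the sketch below rebuilds an equivalent `argparse` parser in isolation and parses a sample argument list. The `build_parser` and `resolve_output_directory` helpers, and the file names `hits.gcd` and `stripped`, are hypothetical and used only for illustration. The `os.path.abspath` call is an assumption of this sketch, added so that a bare file name falls back to the current working directory.

```python
import argparse
import os


def build_parser():
    """Recreate the command-line options listed in the help text above."""
    parser = argparse.ArgumentParser(description='MOF solvent removal (sketch)')
    parser.add_argument('input_file',
                        help='CSD .gcd file from which to read MOF structures')
    parser.add_argument('-o', '--output-directory',
                        help='Directory into which to write stripped structures')
    parser.add_argument('-m', '--monodentate', default=False, action='store_true',
                        help='Strip all monodentate ligands from the structure')
    parser.add_argument('-s', '--solvent-file',
                        help='Location of solvent file')
    return parser


def resolve_output_directory(args):
    """Use -o if given, otherwise the folder that holds the input file."""
    if args.output_directory:
        return args.output_directory
    return os.path.dirname(os.path.abspath(args.input_file))


if __name__ == '__main__':
    # Equivalent to: python Command_prompt_MOF_solvent_removal.py hits.gcd -m -o stripped
    args = build_parser().parse_args(['hits.gcd', '-m', '-o', 'stripped'])
    print(args.monodentate, resolve_output_directory(args))
```

Leaving out `-m` keeps non-solvent ligands in place, and leaving out `-s` makes the script fall back to the solvent directory shipped with the CSD software.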

## Author

*S. B. Wiggin* (2016)

> For feedback or to report any issues please contact [[email protected]](mailto:[email protected])