Commit 59eb5c9

Merge pull request #71 from ccdc-opensource/CRingroseCCDC-patch-1
Added conformer generator example workflow
2 parents f041769 + 7f166f8 commit 59eb5c9

3 files changed

+330
-0
lines changed

scripts/conformer_demo/AZD9291.mol2

+169
@@ -0,0 +1,169 @@
@<TRIPOS>MOLECULE
A:6JWL
72 75 1 0 1
SMALL
NO_CHARGES
****
Generated from the CSD

@<TRIPOS>ATOM
1 C13 -47.0936 -3.8091 -13.4034 C.3 1 YY31101 0.0000
2 N2 -47.0086 -2.7311 -14.3695 N.3 1 YY31101 0.0000
3 C12 -45.9115 -3.0021 -15.2755 C.3 1 YY31101 0.0000
4 C11 -46.7746 -1.4560 -13.7005 C.3 1 YY31101 0.0000
5 C10 -47.4726 -0.3070 -14.4355 C.3 1 YY31101 0.0000
6 N1 -48.9046 -0.5000 -14.3475 N.pl3 1 YY31101 0.0000
7 C28 -49.4387 -1.1750 -13.1824 C.3 1 YY31101 0.0000
8 C3 -50.4137 1.1840 -15.2065 C.ar 1 YY31101 0.0000
9 C4 -51.2787 1.6321 -16.1655 C.ar 1 YY31101 0.0000
10 O1 -51.9027 2.8521 -16.0285 O.3 1 YY31101 0.0000
11 C14 -51.6957 3.4241 -14.7895 C.3 1 YY31101 0.0000
12 C2 -49.8087 -0.0340 -15.3855 C.ar 1 YY31101 0.0000
13 C1 -50.0827 -0.7970 -16.5066 C.ar 1 YY31101 0.0000
14 N -49.4027 -2.0741 -16.6096 N.am 1 YY31101 0.0000
15 C7 -49.6557 -3.0691 -17.6256 C.2 1 YY31101 0.0000
16 O -50.4457 -2.8851 -18.4616 O.2 1 YY31101 0.0000
17 C8 -48.8456 -4.3581 -17.5956 C.3 1 YY31101 0.0000
18 C9 -49.6517 -5.6372 -17.7846 C.3 1 YY31101 0.0000
19 C6 -50.9617 -0.3360 -17.4786 C.ar 1 YY31101 0.0000
20 C5 -51.5537 0.9000 -17.2826 C.ar 1 YY31101 0.0000
21 N3 -52.4998 1.5241 -18.1876 N.pl3 1 YY31101 0.0000
22 C15 -52.4578 1.3800 -19.6177 C.ar 1 YY31101 0.0000
23 N5 -51.5097 0.6790 -20.1647 N.ar 1 YY31101 0.0000
24 N4 -53.3928 1.9621 -20.3097 N.ar 1 YY31101 0.0000
25 C16 -53.4318 1.8691 -21.6127 C.ar 1 YY31101 0.0000
26 C17 -52.4518 1.1400 -22.2417 C.ar 1 YY31101 0.0000
27 C18 -51.4837 0.5540 -21.4637 C.ar 1 YY31101 0.0000
28 C19 -50.3887 -0.2730 -22.1187 C.2 1 YY31101 0.0000
29 C26 -49.1556 -0.6420 -21.5447 C.ar 1 YY31101 0.0000
30 C25 -48.5396 -0.3950 -20.3087 C.ar 1 YY31101 0.0000
31 C24 -47.2806 -0.9140 -20.0517 C.ar 1 YY31101 0.0000
32 C23 -46.6356 -1.6751 -21.0187 C.ar 1 YY31101 0.0000
33 C22 -47.2446 -1.9131 -22.2327 C.ar 1 YY31101 0.0000
34 C21 -48.5256 -1.3830 -22.4878 C.ar 1 YY31101 0.0000
35 N6 -49.3076 -1.4780 -23.5728 N.pl3 1 YY31101 0.0000
36 C27 -48.9656 -2.1931 -24.7758 C.3 1 YY31101 0.0000
37 C20 -50.4267 -0.8280 -23.3788 C.2 1 YY31101 0.0000
38 H1282 -51.2463 -0.7410 -24.0905 H 1 YY31101 0.0000
39 H1286 -48.6942 -2.2848 -15.9139 H 1 YY31101 0.0000
40 H1287 -48.3421 -4.4169 -16.6318 H 1 YY31101 0.0000
41 H1288 -48.1031 -4.3077 -18.3906 H 1 YY31101 0.0000
42 H1291 -52.4429 1.0302 -23.3252 H 1 YY31101 0.0000
43 H1294 -45.7035 -1.2610 -13.6727 H 1 YY31101 0.0000
44 H1295 -47.1597 -1.5128 -12.6834 H 1 YY31101 0.0000
45 H1306 -47.2682 -4.7498 -13.9236 H 1 YY31101 0.0000
46 H1307 -47.9158 -3.6164 -12.7159 H 1 YY31101 0.0000
47 H1308 -46.1605 -3.8708 -12.8453 H 1 YY31101 0.0000
48 H1309 -48.6222 -1.4507 -12.5166 H 1 YY31101 0.0000
49 H1310 -49.9709 -2.0722 -13.4949 H 1 YY31101 0.0000
50 H1311 -50.1243 -0.5097 -12.6599 H 1 YY31101 0.0000
51 H1316 -46.0900 -3.9453 -15.7897 H 1 YY31101 0.0000
52 H1317 -44.9823 -3.0664 -14.7114 H 1 YY31101 0.0000
53 H1318 -45.8375 -2.1984 -16.0066 H 1 YY31101 0.0000
54 H1319 -50.2092 1.7814 -14.3192 H 1 YY31101 0.0000
55 H1321 -50.1543 -5.6101 -18.7503 H 1 YY31101 0.0000
56 H1322 -48.9834 -6.4962 -17.7471 H 1 YY31101 0.0000
57 H1323 -50.3934 -5.7194 -16.9915 H 1 YY31101 0.0000
58 H1324 -52.0840 2.7638 -14.0154 H 1 YY31101 0.0000
59 H1325 -52.2111 4.3823 -14.7413 H 1 YY31101 0.0000
60 H1326 -50.6288 3.5776 -14.6342 H 1 YY31101 0.0000
61 H1327 -53.2358 2.0970 -17.7874 H 1 YY31101 0.0000
62 H1331 -49.0489 0.2024 -19.5539 H 1 YY31101 0.0000
63 H1332 -45.6467 -2.0845 -20.8174 H 1 YY31101 0.0000
64 H1335 -46.7380 -2.5080 -22.9913 H 1 YY31101 0.0000
65 H1336 -48.8006 -3.2432 -24.5395 H 1 YY31101 0.0000
66 H1337 -49.7799 -2.1062 -25.4937 H 1 YY31101 0.0000
67 H1338 -48.0579 -1.7703 -25.2037 H 1 YY31101 0.0000
68 H1342 -46.7979 -0.7258 -19.0938 H 1 YY31101 0.0000
69 H1343 -47.1673 -0.3019 -15.4808 H 1 YY31101 0.0000
70 H1344 -47.2018 0.6413 -13.9736 H 1 YY31101 0.0000
71 H1345 -51.1782 -0.9281 -18.3665 H 1 YY31101 0.0000
72 H1347 -54.2200 2.3564 -22.1846 H 1 YY31101 0.0000
@<TRIPOS>BOND
1 1 2 1
2 2 3 1
3 2 4 1
4 4 5 1
5 5 6 1
6 6 7 1
7 8 9 ar
8 9 10 1
9 10 11 1
10 6 12 1
11 8 12 ar
12 12 13 ar
13 13 14 1
14 14 15 am
15 15 16 2
16 15 17 1
17 17 18 1
18 13 19 ar
19 9 20 ar
20 19 20 ar
21 20 21 1
22 21 22 1
23 22 23 ar
24 22 24 ar
25 24 25 ar
26 25 26 ar
27 23 27 ar
28 26 27 ar
29 27 28 1
30 28 29 1
31 29 30 ar
32 30 31 ar
33 31 32 ar
34 32 33 ar
35 29 34 ar
36 33 34 ar
37 34 35 1
38 35 36 1
39 28 37 2
40 35 37 1
41 37 38 1
42 14 39 1
43 17 40 1
44 17 41 1
45 26 42 1
46 4 43 1
47 4 44 1
48 1 45 1
49 1 46 1
50 1 47 1
51 7 48 1
52 7 49 1
53 7 50 1
54 3 51 1
55 3 52 1
56 3 53 1
57 8 54 1
58 18 55 1
59 18 56 1
60 18 57 1
61 11 58 1
62 11 59 1
63 11 60 1
64 21 61 1
65 30 62 1
66 32 63 1
67 33 64 1
68 36 65 1
69 36 66 1
70 36 67 1
71 31 68 1
72 5 69 1
73 5 70 1
74 19 71 1
75 25 72 1
@<TRIPOS>SUBSTRUCTURE
1 YY31101 1 GROUP 0 A YY3 0
@<TRIPOS>SET
CCDC_LIGAND STATIC ATOMS
72 1 2 3 4 5 6 7 8 9 \
10 11 12 13 14 15 16 17 18 19 \
20 21 22 23 24 25 26 27 28 29 \
30 31 32 33 34 35 36 37 38 39 \
40 41 42 43 44 45 46 47 48 49 \
50 51 52 53 54 55 56 57 58 59 \
60 61 62 63 64 65 66 67 68 69 \
70 71 72
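
In the Tripos mol2 layout, the counts line of the `@<TRIPOS>MOLECULE` record (here `72 75 1 0 1`) declares the number of atoms, bonds, substructures, features and sets, so this file should contain 72 `ATOM` lines and 75 `BOND` lines. A minimal pure-Python sketch of that consistency check follows. It is not part of the commit and does not use the CCDC API; `read_sections` and `check_counts` are hypothetical helpers, and the embedded water fragment is made up for illustration only.

```python
# A tiny, made-up mol2 fragment (water), used only to illustrate the layout.
MOL2_TEXT = """\
@<TRIPOS>MOLECULE
water
3 2 1 0 0
SMALL
NO_CHARGES

@<TRIPOS>ATOM
1 O1 0.0000 0.0000 0.0000 O.3 1 HOH1 0.0000
2 H1 0.9572 0.0000 0.0000 H 1 HOH1 0.0000
3 H2 -0.2400 0.9266 0.0000 H 1 HOH1 0.0000
@<TRIPOS>BOND
1 1 2 1
2 1 3 1
@<TRIPOS>SUBSTRUCTURE
1 HOH1 1 GROUP 0 A HOH 0
"""


def read_sections(text):
    """Group the non-blank lines of a mol2 file under their record type."""
    sections = {}
    current = None
    for line in text.splitlines():
        if line.startswith('@<TRIPOS>'):
            current = line[len('@<TRIPOS>'):].strip()
            sections[current] = []
        elif current is not None and line.strip():
            sections[current].append(line)
    return sections


def check_counts(text):
    """Return (declared, found) pairs for atoms and bonds."""
    sections = read_sections(text)
    # Assumes a non-blank molecule name, so the counts line is the second
    # non-blank line of the MOLECULE record.
    counts = sections['MOLECULE'][1].split()
    n_atoms, n_bonds = int(counts[0]), int(counts[1])
    return {
        'atoms': (n_atoms, len(sections.get('ATOM', []))),
        'bonds': (n_bonds, len(sections.get('BOND', []))),
    }


print(check_counts(MOL2_TEXT))
```

Running the same check on `AZD9291.mol2` (read with `open(...).read()`) would be one quick way to confirm that a hand-edited or re-exported file is still internally consistent before passing it to the conformer generator.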
+135
@@ -0,0 +1,135 @@
#! /usr/bin/env python
########################################################################################################################
#
# This script can be used for any purpose without limitation subject to the
# conditions at http://www.ccdc.cam.ac.uk/Community/Pages/Licences/v2.aspx
#
# This permission notice and the following statement of attribution must be
# included in all copies or substantial portions of this script.
#
# 2024-11-22: created by the Cambridge Crystallographic Data Centre
#
########################################################################################################################

from ccdc import conformer, descriptors, io, molecule
from ccdc.search import SubstructureSearch, SMARTSSubstructure


def read(molecule_file: str) -> molecule.Molecule:
    """Read the first molecule from a file."""
    print(f'Reading file: {molecule_file} ... ', end='')
    mol_reader = io.MoleculeReader(molecule_file)
    mol = mol_reader[0]
    print('done.')

    return mol


def generate_conformers(mol: molecule.Molecule, max_conformers: int = 50) -> conformer.ConformerHitList:
    """
    Generate conformers for a molecule.

    :param mol: The molecule (ccdc.molecule.Molecule) to generate conformers for.
    :param max_conformers: The maximum number of conformers to generate.

    :returns: ccdc.conformer.ConformerHitList
    """

    # Set up the ConformerGenerator
    confgen = conformer.ConformerGenerator()
    confgen.settings.max_conformers = max_conformers
    # confgen.settings.superimpose_conformers_onto_reference = True

    # Print the progress message before the (potentially slow) generation step,
    # then generate conformers and assign identifiers to them before returning
    print(f'Generating conformers, maximum of {max_conformers} ... ', end='', flush=True)
    conformers = confgen.generate(mol)

    for i, conf in enumerate(conformers):
        conf.molecule.identifier = f'{conf.molecule.identifier}_{i + 1:04}'
    print(f'done, generated {len(conformers)} conformers.')

    return conformers


def analyse(conformers: conformer.ConformerHitList) -> molecule.Molecule:
    """
    Perform some basic analysis of the conformers generated.

    :param conformers: Conformers generated from ConfGen.
    :return: The molecule of the most probable conformer.
    """
    print(f'Sampling limit reached? {"Yes." if conformers.sampling_limit_reached else "No."}')

    print(f'How many rotamers had no observations? {conformers.n_rotamers_with_no_observations}.')

    most_probable_conformer = conformers[0]

    print(f'Normalised score of most probable conformer: {round(most_probable_conformer.normalised_score, 5)}.')
    print(f'Most probable conformer RMSD wrt input: {round(most_probable_conformer.rmsd(), 3)}; '
          f'wrt minimised: {round(most_probable_conformer.rmsd(wrt="minimised"), 3)}.')

    # Print the scores as a comma-separated list terminated by a full stop
    top_ten = conformers[:10]
    scores = ', '.join(f'{conf.normalised_score:.3f}' for conf in top_ten)
    print(f'Scores of top 10 conformers: {scores}.')

    return most_probable_conformer.molecule


def overlay(conformers, query: str, output_filename: str) -> None:
    """
    Overlay conformers based on a SMARTS substructure pattern.

    :param conformers: Conformers generated from ConfGen.
    :param query: SMARTS pattern which the conformers will overlay on top of.
                  Should be consistent across all conformers, e.g. benzene ring.
    :param output_filename: The name of the file to write the superimposed conformers to.
    """
    print('Overlaying conformers ... ', end='')
    conformers_mols = [c.molecule for c in conformers]
    ss_search = SubstructureSearch()
    substructure = SMARTSSubstructure(query)
    ss_search.add_substructure(substructure)
    hits = ss_search.search(conformers_mols, max_hits_per_structure=1)
    if len(hits) == 0:
        # Without a match there is no reference to overlay onto
        print(f'no match found for query {query}; nothing written.')
        return
    # The first hit is the reference that every other hit is overlaid onto
    ref_ats = hits[0].match_atoms()
    print('done.')

    print('Writing file superimposed ... ', end='')
    with io.MoleculeWriter(output_filename) as writer:
        for hit in hits:
            hit_ats = hit.match_atoms()
            # Pair each reference atom with the corresponding matched atom of this hit
            atoms = list(zip(ref_ats, hit_ats))
            ov = descriptors.MolecularDescriptors.Overlay(hits[0].molecule, hit.molecule, atoms)
            superimposed_hit = ov.molecule
            writer.write(superimposed_hit)
    print('done.')


def write_conformers_to_file(conformers: conformer.ConformerHitList, filename: str) -> None:
    """
    Write conformers to a file without any additional overlaying.

    :param conformers: Conformers generated from ConfGen.
    :param filename: The name of the output file.
    """

    with io.MoleculeWriter(filename) as writer:
        for conf in conformers:
            writer.write(conf.molecule)


if __name__ == '__main__':

    input_filename = 'AZD9291.mol2'
    # Read example molecule
    mol = read(input_filename)

    # Generate conformers
    confs = generate_conformers(mol, 20)

    # Provide summary of analysis
    analyse(confs)

    # Overlay structures based on common substructure
    query = 'c1cncnc1'
    output_filename = f'superimposed_{input_filename}'
    overlay(confs, query, output_filename)
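
The `overlay` function pairs the atoms matched in the first conformer with the atoms matched in each other conformer and hands those pairs to `MolecularDescriptors.Overlay`. To illustrate why the pairing matters, the sketch below computes a plain root-mean-square deviation over paired coordinates in pure Python. It is an illustration under stated assumptions, not the CCDC implementation: it performs no superposition, it works on plain `(x, y, z)` tuples rather than atoms, and `paired_rmsd` is a hypothetical helper.

```python
import math


def paired_rmsd(ref_coords, other_coords):
    """RMSD between two equally long lists of (x, y, z) tuples.

    The i-th entry of each list is assumed to describe the same atom,
    which is what a substructure match provides.
    """
    if len(ref_coords) != len(other_coords):
        raise ValueError('coordinate lists must have the same length')
    if not ref_coords:
        raise ValueError('at least one coordinate pair is required')

    total = 0.0
    for (x1, y1, z1), (x2, y2, z2) in zip(ref_coords, other_coords):
        total += (x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
    return math.sqrt(total / len(ref_coords))


# Three reference positions and a copy translated by 2 along z
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
shifted = [(x, y, z + 2.0) for x, y, z in reference]

print(paired_rmsd(reference, reference))
print(paired_rmsd(reference, shifted))
```

An overlay step would first find the rigid transformation that minimises this quantity over the matched atoms; only the deviation itself is shown here.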

scripts/conformer_demo/description.md

+26
@@ -0,0 +1,26 @@
# Conformer Demo

This short script generates conformers for a single molecule and runs some rudimentary analysis on them.
It also overlays the conformers on a common substructure so that the results can be viewed in Hermes.

### Example output showing what the user can expect to see:

```
Reading file: AZD9291.mol2 ... done.
Generating conformers, maximum of 20 ... done, generated 20 conformers.
Sampling limit reached? No.
How many rotamers had no observations? 0.
Normalised score of most probable conformer: 0.0.
Most probable conformer RMSD wrt input: 3.276; wrt minimised: 3.202.
Scores of top 10 conformers: 0.000, 0.000, 0.000, 0.027, 0.027, 0.027, 0.027, 0.027, 0.029, 0.029.
Overlaying conformers ... done.
Writing file superimposed ... done.
```

CCDC Python API Licence required, minimum version: 3.0.15

There is an accompanying mol2 file with this script, but users may use any small molecule provided in a file format readable by our API (e.g. mol, mol2, sdf, etc.).

Author: Chris Ringrose - 22/11/24

For feedback or to report any issues please contact [email protected]
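
The script hard-codes `AZD9291.mol2` as `input_filename`, so running it on another molecule means editing that line. One possible alternative, offered as a sketch and not part of this commit, is a small command-line wrapper. The helpers `parse_args` and `superimposed_name` and the option names are hypothetical. The sketch also sidesteps a detail of `f'superimposed_{input_filename}'`: when the input includes a directory, that expression puts the prefix in front of the directory rather than the file name.

```python
import argparse
from pathlib import PurePosixPath


def superimposed_name(input_filename: str) -> str:
    """Prefix the file name (not the directory) with 'superimposed_'."""
    # PurePosixPath keeps this example platform-independent;
    # pathlib.Path would be the usual choice in a real script.
    path = PurePosixPath(input_filename)
    return str(path.with_name(f'superimposed_{path.name}'))


def parse_args(argv=None):
    """Parse hypothetical command-line options for the demo script."""
    parser = argparse.ArgumentParser(description='Generate and overlay conformers.')
    parser.add_argument('input_file', nargs='?', default='AZD9291.mol2',
                        help='molecule file readable by the CCDC Python API')
    parser.add_argument('--max-conformers', type=int, default=20,
                        help='maximum number of conformers to generate')
    parser.add_argument('--query', default='c1cncnc1',
                        help='SMARTS pattern to overlay the conformers on')
    return parser.parse_args(argv)


args = parse_args(['data/ligand.sdf', '--max-conformers', '50'])
print(args.input_file, args.max_conformers, args.query)
print(superimposed_name(args.input_file))
```

The defaults reproduce the values used in the script's `__main__` block, so calling the wrapper without arguments would behave like the committed version. The SMARTS query must still match the chosen molecule for the overlay step to have anything to align.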
