Chem: MMP: Updated documentation #3254

Open
wants to merge 4 commits into
base: master

MariaDolotova
Collaborator

No description provided.

@@ -61,7 +61,7 @@ with chemical data:
* [Substructure search](#substructure-search--filtering).
* [Chemical space analysis](#chemical-space).
* Structure analysis using [R-groups decomposition](#r-groups-analysis), [scaffold tree](#scaffold-tree-analysis), [elemental analysis](#elemental-analysis).
- * SAR: [activity cliffs](#structure-relationship-analysis), [matched molecular pairs](#matched-molecular-pairs).
+ * SAR: [activity cliffs](#structure-relationship-analysis), [matched molecular pairs](./matched-molecular-pairs.md).
Contributor

Please use the Markdown-style links in wiki. Links like this ^ create issues in Docusaurus

Contributor

You can't just delete a section from the Level 1 page (Cheminformatics) and move it to a separate document. You need to keep a section with the short overview and GIF, and link to a separate document instead. Please update.

Collaborator Author

Moved MMP article back to chem.md file

@@ -0,0 +1,151 @@
# Matched molecular pairs
Contributor

We don't use Level 1 headers in Docusaurus. Instead, we add the "title" front matter, which Docusaurus uses to automatically generate the Title. Please see the guide at https://datagrok.ai/help/develop/help-pages/markdown#table-of-contents-headers

Contributor

You also need to add front matter for keywords and description.

Collaborator Author

Fixed - using H3 headers as for other articles within chem.md.
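
For reference, the front matter requested in this thread might look like the sketch below if the article were kept as a standalone page. `title`, `description`, and `keywords` are standard Docusaurus front matter fields; the values shown are placeholders and not taken from the PR. The exact conventions are in the guide linked above.

```md
---
title: "Matched molecular pairs"
description: Placeholder one-sentence summary of the page.
keywords:
  - matched molecular pairs
  - MMP
---
```

With the title supplied this way, the page body would start at a Level 2 header rather than `# Matched molecular pairs`.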

well-defined molecular modifications and quantify their impact on key properties such
as potency, solubility, permeability, and/or ADMET characteristics.

MMP analysis is particularly valuable in lead optimization, where systematic exploration of chemical space can guide the design of more effective drug candidates.
Contributor

Simplify. Paragraphs 1 and 2 should be combined into one. Cut unnecessary words like "particularly": "particularly valuable in" --> "used in ..."

Collaborator

Agreed, this should be shorter with more substance. For instance, this is how ChatGPT suggests simplifying the first two sentences:

The Matched Molecular Pairs (MMP) tool helps analyze small structural changes in chemical datasets and their effects on properties like potency, solubility, permeability, and ADMET.

By studying how molecular fragments influence activity, the tool helps users make data-driven decisions when selecting modifications to improve lead compounds.
Contributor

This paragraph repeats the ideas in the first two. Cut/combine

Traditionally, lead optimization relies on trial and error, guided by medicinal chemistry intuition.
Contributor

This is not an explanatory article. We don't need to provide background. There's no value for the users.

* **The upper table (Fragments)** shows all fragment substitutions found in the dataset. It includes the frequency of each substitution and the
corresponding change in the analyzed activity or property. There are two modes to explore fragments dataset:
- *All* shows all found fragment pairs at once
- *Current* shows fragment pairs fount for current molecule in the initial dataset.
Contributor

Fix spelling/grammar ("fount")

Collaborator Author

fixed

Information message on the left top corner of the table shows how many rows of total are filtered.
![Fragments modes](img/mmp_fragments_modes.gif)

Click any row in the table to show all molecule pairs from the initial dataset having corresponding substitution.
Contributor

Which table (top/bottom/grid?)

Contributor

How does it interact with the grid? We should describe all interactions here (grid <> top table <> bottom table).

Collaborator Author

Added some more details. There are two tables on the Substitutions tab, and we describe each table and its available interactions separately.

* **The lower table (Molecule pairs)** shows all pairs of molecules associated with the
substitution from the upper table. It provides details about the analyzed
activity or property for each pair of molecules.
Click any row in the **Fragments** table to filter molecule pairs with current substitution. If *Current* mode is selected on **Fragments** table then pair containing current molecule from initial dataset will be on top.
Contributor

Please run this through Grammarly: there are many grammar, spelling, and punctuation mistakes in this how-to section.

Collaborator Author

done
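
To make the interaction between the two tables discussed above concrete, here is a minimal Python sketch of how a Fragments table and a Molecule pairs table could be derived from a list of matched pairs. It is not the Datagrok implementation, which this PR does not show. The input records, the fragment labels, the function names, and the use of the mean as the "change in activity" are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical input: one record per matched molecular pair, already
# fragmented by some upstream step (not shown). All values are illustrative.
# (molecule_from, molecule_to, fragment_from, fragment_to, activity_from, activity_to)
pairs = [
    ("mol1", "mol2", "[*]Cl", "[*]F", 5.0, 6.0),
    ("mol3", "mol4", "[*]Cl", "[*]F", 4.0, 6.0),
    ("mol1", "mol5", "[*]Cl", "[*]OC", 5.0, 4.5),
]


def fragments_table(pairs, current=None):
    """Group pairs by substitution; report frequency and mean activity change.

    With current=None this mimics the "All" mode; passing a molecule id
    keeps only substitutions found for that molecule ("Current" mode).
    """
    deltas = defaultdict(list)
    for mol_from, mol_to, frag_from, frag_to, act_from, act_to in pairs:
        if current is not None and current not in (mol_from, mol_to):
            continue
        deltas[(frag_from, frag_to)].append(act_to - act_from)
    return {
        sub: {"count": len(d), "mean_delta": mean(d)}
        for sub, d in deltas.items()
    }


def molecule_pairs(pairs, substitution, current=None):
    """Pairs for one substitution; pairs containing `current` sort first."""
    selected = [p for p in pairs if (p[2], p[3]) == substitution]
    # False sorts before True, so pairs with the current molecule come first;
    # sorted() is stable, so the remaining order is preserved.
    return sorted(selected, key=lambda p: current not in (p[0], p[1]))


table = fragments_table(pairs)
```

Clicking a row in the upper table corresponds to calling `molecule_pairs` with that row's substitution, and selecting a molecule in the grid corresponds to passing its id as `current`.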

specified activity or property, with the arrow pointing toward the molecule with
the higher value.

* **Molecule pairs** table. The same as on **Substitutions** tab. Show or hide the table using **Show Pairs** checkbox above the scatter plot.
Contributor

Our convention is to use "scatterplot", not "scatter plot". In any case, whichever way you choose, it should be consistent. You do it both ways (line 116 and line 122)

Collaborator Author

fixed

properties.
* Whether new molecule already exists in the initial dataset or newly generated

In the **Context panel** there is scatter plot showing observed vs predicted values for each activity for molecules from initial dataset.
Contributor

Why is it here? It looks like the Context Panel applies only to the last tab?

Collaborator Author

Actually, the Context Panel is also available from the other tabs. For the Generation tab, though, its content is specific: it contains a "predicted vs. observed" scatterplot.

4 participants