
Support / require reading unspecified SMARTS bonds as single-or-aromatic on query atoms #3

@tylerperyea


From Daylight's SMARTS page:
https://www.daylight.com/dayhtml/doc/theory/theory.smarts.html

4.2 Bond Primitives

Various bond symbols are available to match connections between atoms. A missing bond symbol is interpreted as "single or aromatic".

In practice, most tools do not strictly honor this Daylight convention, and that is mostly okay. Under a strict reading of the Daylight documentation, a SMARTS query of c1ccccc1 (benzene) would be interpreted as having a single-or-aromatic bond between each pair of adjacent atoms, with each atom itself required to have at least one aromatic bond somewhere. This is impractical to specify in a molfile.

Typically when a tool produces a SMARTS/SMILES pattern with aromatic atoms (e.g. c) but non-specific bonds between those aromatic atoms (e.g. cc), the common interpretation is that the unspecified bond is aromatic (e.g. c:c). Similarly, when a tool produces a SMARTS/SMILES pattern with aliphatic atoms (e.g. C), but a non-specific bond (e.g. CC), the common interpretation is an implied single bond (e.g. C-C). These conventions are widely used even if they present some problems.

The compromise solution requires a modification to Daylight's statement:

A missing bond symbol BETWEEN ATOMS WHERE AT LEAST ONE ATOM HAS A QUERY FEATURE is interpreted as "single or aromatic".

That is, it's fine for explicit non-query atoms to imply the bonds between them. But if at least one atom is a query atom and the SMARTS pattern does not specify a bond type, the bond should be interpreted as single-or-aromatic. For example:

| Ambiguous SMARTS | Equivalent to |
| --- | --- |
| cc | c:c |
| CC | C-C |
| C[#6] | C-,:[#6] |
| C[*] | C-,:[*] |
| [#6,#7][#6] | [#6,#7]-,:[#6] |

Here a "query atom" is any atom specified as an atom list (including a list of 1 element) or an atom wildcard.
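
The proposed rule can be sketched as a small rewrite that makes implied bonds explicit. This is only an illustration, not code from any existing toolkit: the function names (`tokenize`, `implied_bond`, `make_explicit`) are hypothetical, it handles only linear patterns (no branches or ring closures), and it assumes that every bracketed atom and the bare `*` count as query atoms. The issue does not say how to treat a mixed aromatic/aliphatic pair such as `cC`, so the sketch assumes a single bond there.

```python
import re

# One token is a bracket atom, an organic-subset atom, the wildcard,
# or a run of bond-expression characters (e.g. "-,:").
TOKEN = re.compile(r"\[[^\]]*\]|Cl|Br|[BCNOPSFI]|[bcnops]|\*|[-=#:~@/\\!,;&]+")


def tokenize(smarts):
    """Split a linear SMARTS string into atom and bond tokens."""
    pos, tokens = 0, []
    while pos < len(smarts):
        match = TOKEN.match(smarts, pos)
        if match is None:
            # Branches, ring closures, etc. are outside this sketch.
            raise ValueError(f"unsupported SMARTS syntax at position {pos}")
        tokens.append(match.group())
        pos = match.end()
    return tokens


def is_atom(token):
    return token[0] == "[" or token[0].isalpha() or token == "*"


def is_query(atom):
    # Assumption: any bracketed atom or wildcard is a query atom.
    return atom.startswith("[") or atom == "*"


def implied_bond(left, right):
    """Bond implied by a missing bond symbol between two adjacent atoms."""
    if is_query(left) or is_query(right):
        return "-,:"  # single or aromatic
    if left.islower() and right.islower():
        return ":"    # both aromatic: aromatic bond
    return "-"        # assumption: all other pairs get a single bond


def make_explicit(smarts):
    """Insert an explicit bond wherever two atoms are directly adjacent."""
    out = []
    for token in tokenize(smarts):
        if out and is_atom(token) and is_atom(out[-1]):
            out.append(implied_bond(out[-1], token))
        out.append(token)
    return "".join(out)


if __name__ == "__main__":
    for pattern in ["cc", "CC", "C[#6]", "C[*]", "[#6,#7][#6]"]:
        print(pattern, "->", make_explicit(pattern))
```

Bonds that are already specified (for example `C=C`) pass through unchanged, since the rule only applies where the bond symbol is missing.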
