Commit 32d56c5 by lohedges, Nov 15, 2024 (parent 73cc281): Trim trailing
whitespace and put issues section at end. [ci skip]

Showing 1 changed file (README.md) with 59 additions and 59 deletions.
log level is set to `ERROR`.

The xyz coordinates of the QM (ML) and MM regions can be logged by providing the
`--qm-xyz-frequency` command-line argument or by setting the `EMLE_QM_XYZ_FREQUENCY`
environment variable (default is 0, indicating no logging). This generates a
`qm.xyz` file (can be changed by `--qm-xyz-file` argument or the `EMLE_QM_XYZ_FILE`
environment variable) as an XYZ trajectory for the QM region, and a `pc.xyz` file
(controlled by `--pc-xyz-file` argument or the `EMLE_PC_XYZ_FILE` environment
variable) with the following format:
```
<number of point charges in frame1>
```

and `MACEEMLE` models allow the computation of in vacuo and embedding energies
in one go, using the [ANI2x](https://github.com/aiqm/torchani) and [MACE](https://github.com/ACEsuit/mace) models respectively. Creating additional models is straightforward. For details of how to use the `torch` models,
see the tutorial documentation [here](https://github.com/OpenBioSim/sire/blob/feature_emle/doc/source/tutorial/part08/02_emle.rst#creating-an-emle-torch-module).

## Error analysis
`emle-engine` provides a CLI tool, `emle-analyze`, that facilitates analysis of
the performance of EMLE-based simulations. It requires a set of single-point
reference calculations for a trajectory generated with `emle-server` (currently
only [ORCA](https://orcaforum.kofo.mpg.de) is supported). It also requires an MBIS
decomposition of the in vacuo electronic density of the QM region with
[horton](https://theochem.github.io/horton/2.1.1/index.html). Usage:
```
emle-analyze --qm-xyz qm.xyz \
--alpha
result.mat
```
`qm.xyz` and `pc.xyz` are the QM and MM XYZ trajectories written out by
`emle-server` (see above in the "Logging" section).
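A trajectory like `qm.xyz` follows the standard multi-frame XYZ convention, so
it is straightforward to post-process. The helper below is a minimal sketch of
such a reader (an illustration only; it is not part of `emle-engine`):

```python
# Minimal sketch of a multi-frame XYZ reader for files such as qm.xyz.
# Assumes the standard layout: an atom-count line, a comment line, then one
# "symbol x y z" line per atom, repeated for every frame.
def read_xyz_frames(path):
    frames = []
    with open(path) as f:
        while True:
            line = f.readline()
            if not line.strip():  # EOF (or a trailing blank line)
                break
            n_atoms = int(line)
            comment = f.readline().strip()
            atoms = []
            for _ in range(n_atoms):
                symbol, x, y, z = f.readline().split()
                atoms.append((symbol, float(x), float(y), float(z)))
            frames.append((comment, atoms))
    return frames
```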

`model.mat` is the EMLE model used.

`orca.tar` is a tarball containing single-point ORCA calculations and
corresponding horton outputs. All files should be named `index.*`, where `index`
is a numeric value identifying the snapshot (indices need not be consecutive)
and the extensions are:
- `.vac.orca`: ORCA output for the gas phase calculation. When the `--alpha`
argument is provided, it must also include the molecular dipolar polarizability
(`%elprop Polar`)
- `.h5`: horton output for gas phase calculation
- `.pc.orca`: ORCA output for calculation with point charges
- `.pc`: charges and positions of the point charges (the ones used for `.pc.orca`
calculation)
- `.vpot`: output of `orca_vpot`, electrostatic potential of gas phase system at
the positions of the point charges
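The naming convention can be sketched in code: split each member name at the
first dot to recover the snapshot index and the (possibly multi-part)
extension. The grouping helper below is illustrative only and is not part of
`emle-analyze`:

```python
import tarfile
from collections import defaultdict
from pathlib import Path

def group_snapshots(tar_path):
    """Map snapshot index -> {extension: member name} for an orca.tar-style
    tarball whose files are named index.extension (e.g. 0.vac.orca, 0.h5)."""
    snapshots = defaultdict(dict)
    with tarfile.open(tar_path) as tar:
        for member in tar.getmembers():
            if not member.isfile():
                continue
            # Split at the first dot so multi-part extensions survive intact.
            index, _, extension = Path(member.name).name.partition(".")
            if index.isdigit() and extension:
                snapshots[int(index)][extension] = member.name
    return dict(snapshots)
```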

The optional `--backend` argument also allows the energies to be extracted with
the in vacuo backend. Currently, only the `deepmd` and `ani2x` backends are
supported by `emle-analyze`. When the `deepmd` backend is used, the DeepMD model
must be provided with `--deepmd-model`.


convention as the one for the `emle-analyze` script, with the difference that
only gas phase calculations are required and dipolar polarizabilities must be
present. Simple usage:
```
emle-train --orca-tarball orca.tar model.mat
```
The resulting `model.mat` file can be used directly as the `--emle-model`
argument for `emle-server`. A full list of arguments and their default values can be
printed with `emle-train -h`:
```
usage: emle-train [-h] --orca-tarball name.tar [--train-mask] [--sigma] [--ivm-thr] [--epochs]
options:
--lr-thole Learning rate for Thole model params (a_Thole, k_Z) (default: 0.05)
--lr-sqrtk Learning rate for polarizability scaling factors (sqrtk_ref) (default: 0.05)
--print-every How often to print training progress (default: 10)
--computer-n-species
Number of species supported by AEV computer (default: None)
--computer-zid-map Map between EMLE and AEV computer zid values (default: None)
--plot-data name.mat Data for plotting (default: None)
```

training set (provided as `--orca-tarball`) that is used for training. Note that
the values written to `--plot-data` are for the full training set, which allows
prediction plots to be made for the train/test sets.
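As the `.mat` extension suggests, the `--plot-data` file should be readable as
a MATLAB-format file, e.g. with `scipy.io.loadmat`. A minimal inspection sketch
(the exact variable names stored depend on the `emle-train` version and are not
assumed here):

```python
from scipy.io import loadmat

def summarise_plot_data(path):
    """Return {variable: shape} for a MATLAB-format .mat file, skipping the
    __header__/__version__/__globals__ bookkeeping entries."""
    return {
        key: value.shape
        for key, value in loadmat(path).items()
        if not key.startswith("__")
    }
```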

The `--computer-n-species` and `--computer-zid-map` arguments are only needed
when using a common AEV computer for both the gas phase backend and the EMLE model.

## Issues

The [DeePMD-kit](https://docs.deepmodeling.com/projects/deepmd/en/master/index.html) conda package pulls in a version of MPI which may cause
problems if using [ORCA](https://orcaforum.kofo.mpg.de/index.php) as the in vacuo backend, particularly when running
on HPC resources that might enforce a specific MPI setup. (ORCA will
internally call `mpirun` to parallelise work.) Since we don't need any of
the MPI functionality from `DeePMD-kit`, the problematic packages can be
safely removed from the environment with:

```
conda remove --force mpi mpich
```

Alternatively, if performance isn't an issue, simply set the number of
threads to 1 in the `sander` input file, e.g.:

```
&orc
method='XTB2',
num_threads=1
/
```

When running on an HPC resource it can often take a while for `emle-server`
to start. As such, the client will retry the connection a specified number of
times before raising an exception, sleeping 2 seconds between retries. By
default, the client will try to connect 100 times. If this is unsuitable for
your setup, then the number of attempts can be configured using the
`EMLE_RETRIES` environment variable.
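The reconnect behaviour can be pictured with a small sketch (illustrative only;
this is not the client's actual implementation, and the host and port are
placeholders):

```python
import socket
import time

# Illustrative sketch of retry-with-sleep: try to connect, sleep 2 seconds
# between failed attempts, and give up after `retries` attempts (mirroring
# the EMLE_RETRIES default of 100).
def connect_with_retries(host, port, retries=100, delay=2.0):
    last_error = None
    for attempt in range(retries):
        try:
            return socket.create_connection((host, port), timeout=5.0)
        except OSError as error:
            last_error = error
            if attempt < retries - 1:
                time.sleep(delay)
    raise ConnectionError(
        f"could not reach {host}:{port} after {retries} attempts"
    ) from last_error
```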

When performing interpolation it is currently not possible to use AMBER force
fields with CMAP terms due to a memory deallocation bug in `pysander`.
