20 changes: 8 additions & 12 deletions docs/contents/notebooks/deep_learning_barrier_heights.ipynb
@@ -14,7 +14,7 @@
"\n",
"----\n",
"\n",
"This two-part tutorial showcases how ORCA integrates into downstream deep learning workflows by serving as a data source training and evaluation.\n",
"This two-part tutorial showcases how ORCA integrates into downstream deep learning workflows by serving as a data source for training and evaluation.\n",
"1. First, we will show how to calculate the barrier height of a chemical reaction using ORCA with the ORCA Python inferface (OPI). \n",
"2. Second, we use the ChemTorch framework to train and evaluate a graph neural network (GNN) on a curated subset of the popular [RGD1 dataset](https://www.nature.com/articles/s41597-023-02043-z) which contains precomputed barrier heights.\n",
"\n",
@@ -985,16 +985,10 @@
"git clone -b tutorial/opi_orca https://github.com/heid-lab/chemtorch.git && \\\n",
"cd chemtorch && \\\n",
"conda deactivate && \\\n",
"conda create -n chemtorch python=3.10 && \\\n",
"conda env create -f env/environment.yml && \\\n",
"conda activate chemtorch && \\\n",
"pip install rdkit numpy==1.26.4 scikit-learn pandas && \\\n",
"pip install torch && \\\n",
"pip install hydra-core && \\\n",
"pip install torch_geometric && \\\n",
"pip install torch_scatter torch_sparse torch_cluster torch_spline_conv -f https://data.pyg.org/whl/torch-2.5.0+cpu.html && \\\n",
"pip install wandb && \\\n",
"pip install ipykernel && \\\n",
"pip install -e .\n",
"uv sync && \\\n",
"uv pip install torch_scatter torch_sparse torch_cluster torch_spline_conv torch_geometric --no-build-isolation\n",
"```"
]
},
@@ -1021,14 +1015,16 @@
"To use an existing one, run the following command from the `chemtorch` project root:\n",
"\n",
"```bash\n",
"python chemtorch_cli.py +experiment=graph data_pipeline=rgd1 data_pipeline.data_source.data_path=\"../QM_data_precomputed.csv\"\n",
"python chemtorch_cli.py +experiment=graph data_module.data_pipeline=rgd1 data_module.data_pipeline.data_source.data_path=\"../QM_data_precomputed.csv\"\n",
"```\n",
"\n",
"This tells ChemTorch to use the default graph learning configuration with the RGD1 data pipeline but use our own custom dataset specified via `data_path`.\n",
"Under the hood, this setup will convert each reaction SMILES to a condensed graph of reaction (CGR), train a DMPNN, track metrics of interest and save the best performing model parameters for later.\n",
"to the CLI as well as [Weights & Biases](https://wandb.ai/site/models/) which is a graphical user interface that can be accessed through the browser.\n",
"\n",
"If you would like to make your own configuration file instead, an example is already included in your ChemTorch installation, and can be found in `conf/experiment/opi_tutorial/training.yaml`, so no need to create or change a file. Note that the important lines are setting the `data_pipeline` to `rgd1`, and `data_pipeline/data_source/data_path` to `\"../QM_data_precomputed.csv\"`, just as above. To launch the training process with the config file, run \n",
"If you would like to make your own configuration file instead, an example is already included in your ChemTorch installation, and can be found in `conf/experiment/opi_tutorial/training.yaml`, so no need to create or change a file.\n",
"Note that the important lines are setting the `data_module.data_pipeline` to `rgd1`, and `data_module.data_pipeline/data_source/data_path` to `\"../QM_data_precomputed.csv\"`, just as above.\n",
"To launch the training process with the config file, run \n",
"```bash \n",
"python chemtorch_cli.py +experiment=opi_tutorial/training\n",
"```\n",