GPU-accelerated UFF (Universal Force Field) conformer optimization using PyTorch. It is an alternative to RDKit's UFFOptimizeMoleculeConfs that returns results in the same format.
- Exact energy match with RDKit's UFF implementation (verified to machine precision across 50+ diverse molecules including edge cases)
- Fully batched L-BFGS optimization with GPU-accelerated energy/gradient computation
- Supports NVIDIA GPUs (CUDA)
- Supports all UFF energy terms: bond stretch, angle bend, torsion, van der Waals, inversion
- Handles ring-corrected angles (3/4-membered rings), Group 6 torsion special cases, and SP3D2 hybridization
pip install git+https://github.com/Novel-Therapeutics/gpuff.git

Requirements: Python >= 3.10, PyTorch >= 2.0, RDKit
from rdkit import Chem
from rdkit.Chem import rdDistGeom
from gpuff import UFFOptimizeMoleculeConfs
mol = Chem.AddHs(Chem.MolFromSmiles("CC(=O)Oc1ccccc1C(=O)O")) # aspirin
rdDistGeom.EmbedMultipleConfs(mol, numConfs=100)
# GPU optimization — same return format as RDKit
results = UFFOptimizeMoleculeConfs(mol, max_iters=200) # auto-detects GPU
energies = [e for converged, e in results]

Coordinates are modified in-place, matching RDKit's behavior. The function returns list[tuple[int, float]], one (converged, energy) pair per conformer, in the same format as RDKit (where a flag of 0 means the optimization converged).
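As a small illustration of working with this return value, the sketch below summarizes a results list. The helper name `summarize` and the sample data are made up for illustration, and the code assumes gpuff follows RDKit's convention that a flag of 0 marks a converged conformer.

```python
def summarize(results):
    """Summarize a list of (flag, energy) tuples.

    Assumption: flag == 0 means the conformer converged (RDKit's convention).
    Indices refer to positions in the results list.
    """
    if not results:
        raise ValueError("results is empty")
    converged_ids = [i for i, (flag, _) in enumerate(results) if flag == 0]
    best = min(range(len(results)), key=lambda i: results[i][1])
    return {
        "n_converged": len(converged_ids),
        "converged_ids": converged_ids,
        "lowest_energy_index": best,
        "lowest_energy": results[best][1],
    }


# Hypothetical output for four conformers (energies are illustrative only).
results = [(0, 31.2), (1, 35.9), (0, 29.8), (0, 30.4)]
summary = summarize(results)
```

Here `summary["lowest_energy_index"]` points at the third entry, which can then be used to pick the corresponding conformer from the molecule.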
Note: The keyword arguments differ from RDKit (max_iters instead of maxIters, device instead of numThreads). Code that passes kwargs by name will need updating; positional (mol) calls work as-is.
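For code that already passes RDKit-style keyword arguments, one possible approach is a small translation shim. The sketch below is not part of gpuff: `translate_kwargs` is a hypothetical helper, it only knows the two names mentioned above, and dropping `numThreads` (rather than mapping it to a `device` value) is an assumption.

```python
# Hypothetical mapping from RDKit keyword names to the gpuff names noted above.
_RENAMED = {"maxIters": "max_iters"}
# numThreads has no direct counterpart; device selection takes its place,
# so this sketch simply drops it (an assumption, not gpuff behavior).
_DROPPED = {"numThreads"}


def translate_kwargs(rdkit_kwargs):
    """Return a new dict with RDKit-style keyword names rewritten for gpuff.

    Names not listed above are passed through unchanged; whether gpuff
    accepts them is outside the scope of this sketch.
    """
    translated = {}
    for name, value in rdkit_kwargs.items():
        if name in _DROPPED:
            continue
        translated[_RENAMED.get(name, name)] = value
    return translated


gpuff_kwargs = translate_kwargs({"maxIters": 500, "numThreads": 0})
```

The result could then be forwarded as `UFFOptimizeMoleculeConfs(mol, **gpuff_kwargs)`.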
The device is auto-detected in priority order: CUDA > CPU. A warning is emitted when no GPU is found. Override with device="cuda" or device="cpu".
All computation uses float64 to keep energies in agreement with RDKit (<0.01% relative error).
The default max_iters=200 matches RDKit's default but may not be sufficient for large flexible molecules. See docs/convergence.md for details on how iteration count affects results and guidance on choosing an appropriate value.
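One way to handle conformers that have not converged within the default budget is to call the optimizer again, since coordinates are updated in place and a second call continues from the previous geometry. The sketch below is an illustration only: `optimize_until_converged` and the stand-in optimizer are hypothetical, the flag convention (0 means converged) is assumed from RDKit, and each round restarts the optimizer state, so the outcome may differ from a single longer run.

```python
def optimize_until_converged(optimize, mol, iters_per_round=200, max_rounds=5):
    """Call optimize(mol, max_iters=...) until every flag is 0 or rounds run out.

    `optimize` is any callable with a gpuff-style signature that updates the
    conformers in place and returns a list of (flag, energy) tuples.
    Returns (results_of_last_round, rounds_used).
    """
    results = []
    rounds_used = 0
    for rounds_used in range(1, max_rounds + 1):
        results = optimize(mol, max_iters=iters_per_round)
        if all(flag == 0 for flag, _ in results):
            break
    return results, rounds_used


# Stand-in optimizer for demonstration: reports convergence on the third call.
calls = []

def fake_optimize(mol, max_iters):
    calls.append(max_iters)
    flag = 0 if len(calls) >= 3 else 1
    return [(flag, 10.0 - len(calls))]


final_results, rounds = optimize_until_converged(fake_optimize, object(), iters_per_round=100)
```

With the real function, `optimize` would be `UFFOptimizeMoleculeConfs` and `mol` an embedded RDKit molecule.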
Measured on a Tesla T4 GPU vs RDKit UFFOptimizeMoleculeConfs on CPU (single-threaded, 200 iterations):
| Molecule | Atoms | Confs | RDKit CPU | gpuff GPU | Speedup |
|---|---|---|---|---|---|
| aspirin | 21 | 500 | 2.0s | 6.1s | 0.3x |
| ibuprofen | 33 | 500 | 5.7s | 5.3s | 1.1x |
| celecoxib | 40 | 500 | 10.6s | 5.3s | 2.0x |
| osimertinib | 70 | 500 | 38.5s | 7.0s | 5.5x |
| cholesterol | 74 | 500 | 43.4s | 7.6s | 5.7x |
GPU speedup comes from batched energy/gradient computation and batched line search across all conformers. Speedup grows with molecule size; the crossover point is ~30 atoms for 500 conformers.
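The speedup column is simply the CPU time divided by the GPU time. The short script below recomputes it from the timings in the table and lists which molecules ran faster on the GPU; the variable names are illustrative.

```python
# Timings in seconds copied from the benchmark table: (atoms, RDKit CPU, gpuff GPU).
timings = {
    "aspirin": (21, 2.0, 6.1),
    "ibuprofen": (33, 5.7, 5.3),
    "celecoxib": (40, 10.6, 5.3),
    "osimertinib": (70, 38.5, 7.0),
    "cholesterol": (74, 43.4, 7.6),
}

# Speedup = CPU time / GPU time, rounded to one decimal as in the table.
speedups = {name: round(cpu / gpu, 1) for name, (_, cpu, gpu) in timings.items()}

# Molecules for which the GPU run was faster than the CPU run.
faster_on_gpu = [name for name, (_, cpu, gpu) in timings.items() if cpu > gpu]

# Smallest atom count in the table at which the GPU run wins.
smallest_winning_size = min(timings[name][0] for name in faster_on_gpu)
```

The smallest molecule that benefits in this table has 33 atoms, consistent with the stated crossover near 30 atoms for 500 conformers.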
- Parameter extraction: UFF force field parameters are extracted from the RDKit molecule once on CPU using RDKit's GetUFF*Params API, with corrections for ring-based angle overrides
- Batched energy: All five UFF energy terms are computed in PyTorch using Chebyshev polynomial expansions (no acos/atan2), operating on all conformers simultaneously
- Batched L-BFGS: All conformers are optimized simultaneously with batched two-loop recursion and batched strong Wolfe line search, with gradients computed via autograd
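To illustrate the idea behind the Chebyshev expansions: terms that depend on cos(nφ) can be evaluated directly from cos(φ) through the recurrence T₀(x) = 1, T₁(x) = x, Tₖ₊₁(x) = 2x·Tₖ(x) − Tₖ₋₁(x), because Tₙ(cos φ) = cos(nφ). This avoids recovering the angle itself with acos or atan2. The plain-Python sketch below shows the recurrence for a single angle; it is not gpuff's code, and the function name `cos_n_phi` is made up for this example.

```python
import math


def cos_n_phi(cos_phi, n):
    """Return cos(n * phi) given only cos(phi), using the Chebyshev recurrence."""
    if n < 0:
        raise ValueError("n must be non-negative")
    t_prev, t_curr = 1.0, cos_phi  # T_0(x) and T_1(x)
    if n == 0:
        return t_prev
    for _ in range(n - 1):
        # T_{k+1}(x) = 2 x T_k(x) - T_{k-1}(x)
        t_prev, t_curr = t_curr, 2.0 * cos_phi * t_curr - t_prev
    return t_curr


# Compare against the direct trigonometric evaluation for one sample angle.
phi = math.radians(50.0)
x = math.cos(phi)
values = {n: cos_n_phi(x, n) for n in (1, 2, 3, 6)}
errors = {n: abs(values[n] - math.cos(n * phi)) for n in values}
```

In a batched setting the same recurrence would be applied elementwise to a tensor of cosines, typically obtained from normalized dot products of bond or plane-normal vectors, which keeps the computation polynomial and straightforward to differentiate.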
pip install pytest
pytest tests/ -v

The test suite covers 50+ molecules including drug-like compounds, UFF-only molecules (boron, selenium), 3-membered rings, strained systems, phosphorus inversions, and long flexible chains. Energy tests verify exact match with RDKit (machine precision). GPU tests run automatically when CUDA is available.
License: MIT