Novel-Therapeutics/gpuff

gpuff

GPU-accelerated UFF (Universal Force Field) conformer optimization using PyTorch. It serves as an alternative to RDKit's UFFOptimizeMoleculeConfs and returns results in the same format.

Features

  • Exact energy match with RDKit's UFF implementation (verified to machine precision across 50+ diverse molecules including edge cases)
  • Fully batched L-BFGS optimization with GPU-accelerated energy/gradient computation
  • Supports NVIDIA GPUs (CUDA)
  • Supports all UFF energy terms: bond stretch, angle bend, torsion, van der Waals, inversion
  • Handles ring-corrected angles (3/4-membered rings), Group 6 torsion special cases, and SP3D2 hybridization

Installation

pip install git+https://github.com/Novel-Therapeutics/gpuff.git

Requirements: Python >= 3.10, PyTorch >= 2.0, RDKit

Usage

from rdkit import Chem
from rdkit.Chem import rdDistGeom
from gpuff import UFFOptimizeMoleculeConfs

mol = Chem.AddHs(Chem.MolFromSmiles("CC(=O)Oc1ccccc1C(=O)O"))  # aspirin
rdDistGeom.EmbedMultipleConfs(mol, numConfs=100)

# GPU optimization — same return format as RDKit
results = UFFOptimizeMoleculeConfs(mol, max_iters=200)  # auto-detects GPU
energies = [e for converged, e in results]

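Coordinates are modified in place, matching RDKit's behavior. The function returns list[tuple[int, float]], one (converged, energy) pair per conformer, in the same format as RDKit. In RDKit's convention the first element is 0 when the conformer converged and 1 when it needs more iterations.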
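As a minimal sketch of consuming the returned list, the helper below picks the lowest-energy conformer and counts the converged ones. `summarize_results` is a hypothetical name, not part of gpuff. The sample tuples are placeholders rather than real output, and the code assumes RDKit's convention that a flag of 0 means the conformer converged.

```python
def summarize_results(results):
    """Return (index of lowest-energy conformer, number of converged conformers).

    Assumption: RDKit's convention, where flag == 0 means the conformer converged.
    """
    if not results:
        raise ValueError("results is empty")
    energies = [energy for _flag, energy in results]
    # Index of the smallest energy; ties resolve to the first occurrence.
    best_index = min(range(len(energies)), key=energies.__getitem__)
    n_converged = sum(1 for flag, _energy in results if flag == 0)
    return best_index, n_converged


# Placeholder values for illustration only, not real gpuff output.
example = [(0, 41.2), (1, 57.9), (0, 39.8)]
best, converged = summarize_results(example)
print(f"lowest-energy conformer: {best}, converged: {converged}/{len(example)}")
```

The index returned here lines up with the conformer order in the molecule, since the list has one entry per conformer.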

Note: The keyword arguments differ from RDKit (max_iters instead of maxIters, device instead of numThreads). Code that passes kwargs by name will need updating; positional (mol) calls work as-is.
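For code that already passes RDKit-style keyword arguments, one option is a small translation step before the call. The sketch below is an assumption-laden illustration: `translate_kwargs` is a hypothetical helper, only the maxIters to max_iters rename comes from the note above, and numThreads is simply dropped because device takes its place rather than mapping to it one-to-one.

```python
# Hypothetical mapping from RDKit keyword names to gpuff keyword names.
# Only maxIters -> max_iters is stated in this README.
RDKIT_TO_GPUFF = {"maxIters": "max_iters"}

# numThreads has no gpuff counterpart (device replaces it), so it is dropped.
DROPPED = {"numThreads"}


def translate_kwargs(rdkit_kwargs):
    """Rename RDKit-style kwargs to gpuff-style kwargs.

    Names not listed above are passed through unchanged; whether gpuff
    accepts them is not covered here.
    """
    translated = {}
    for name, value in rdkit_kwargs.items():
        if name in DROPPED:
            continue
        translated[RDKIT_TO_GPUFF.get(name, name)] = value
    return translated


print(translate_kwargs({"maxIters": 500, "numThreads": 4}))
```

The result can then be unpacked into the call, for example `UFFOptimizeMoleculeConfs(mol, **translate_kwargs(old_kwargs))`.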

Device selection

The device is auto-detected in priority order: CUDA > CPU. A warning is emitted when no GPU is found. Override with device="cuda" or device="cpu".
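The priority order can be pictured with a short sketch. This is not gpuff's actual code: `pick_device` is a hypothetical function, and the import guard is only there so the snippet also runs where PyTorch is not installed.

```python
import warnings


def pick_device(requested=None):
    """Return the requested device, or auto-detect in the order CUDA > CPU."""
    if requested is not None:
        return requested  # explicit override, e.g. "cuda" or "cpu"
    try:
        import torch

        if torch.cuda.is_available():
            return "cuda"
    except ImportError:
        # PyTorch missing: treated like "no GPU" for this sketch only.
        pass
    warnings.warn("No CUDA GPU found; falling back to CPU.")
    return "cpu"


print(pick_device("cpu"))
```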

All computation uses float64 so that energies agree with RDKit (relative error below 0.01%).

Iteration count

The default max_iters=200 matches RDKit's default but may not be sufficient for large flexible molecules. See docs/convergence.md for details on how iteration count affects results and guidance on choosing an appropriate value.

Benchmark

Measured on a Tesla T4 GPU vs RDKit UFFOptimizeMoleculeConfs on CPU (single-threaded, 200 iterations):

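| Molecule    | Atoms | Confs | RDKit CPU | gpuff GPU | Speedup |
|-------------|-------|-------|-----------|-----------|---------|
| aspirin     | 21    | 500   | 2.0s      | 6.1s      | 0.3x    |
| ibuprofen   | 33    | 500   | 5.7s      | 5.3s      | 1.1x    |
| celecoxib   | 40    | 500   | 10.6s     | 5.3s      | 2.0x    |
| osimertinib | 70    | 500   | 38.5s     | 7.0s      | 5.5x    |
| cholesterol | 74    | 500   | 43.4s     | 7.6s      | 5.7x    |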

GPU speedup comes from batched energy/gradient computation and batched line search across all conformers. Speedup grows with molecule size; the crossover point is ~30 atoms for 500 conformers.
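The speedup column is the CPU time divided by the GPU time. The snippet below recomputes it from the timings in the table and finds the smallest molecule where the GPU is at least as fast, which is where the ~30 atom crossover comes from. The variable names are illustrative.

```python
# (atoms, RDKit CPU seconds, gpuff GPU seconds), copied from the benchmark table.
timings = {
    "aspirin": (21, 2.0, 6.1),
    "ibuprofen": (33, 5.7, 5.3),
    "celecoxib": (40, 10.6, 5.3),
    "osimertinib": (70, 38.5, 7.0),
    "cholesterol": (74, 43.4, 7.6),
}

# Speedup = CPU time / GPU time, rounded to one decimal as in the table.
speedups = {name: round(cpu / gpu, 1) for name, (_atoms, cpu, gpu) in timings.items()}

# Smallest benchmarked molecule for which the GPU is at least as fast as the CPU.
crossover_atoms = min(
    atoms for _name, (atoms, cpu, gpu) in timings.items() if cpu / gpu >= 1.0
)

for name, value in speedups.items():
    print(f"{name:12s} {value:.1f}x")
print("first molecule at or past break-even has", crossover_atoms, "atoms")
```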

How it works

  1. Parameter extraction — UFF force field parameters are extracted from the RDKit molecule once on CPU using RDKit's GetUFF*Params API, with corrections for ring-based angle overrides
  2. Batched energy — All five UFF energy terms are computed in PyTorch using Chebyshev polynomial expansions (no acos/atan2), operating on all conformers simultaneously
  3. Batched L-BFGS — All conformers are optimized simultaneously with batched two-loop recursion and batched strong Wolfe line search, with gradients computed via autograd
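To illustrate the Chebyshev idea in step 2, here is a scalar, pure-Python sketch for the torsion term. It is not gpuff's batched PyTorch code, and all function and parameter names are hypothetical. It uses the standard UFF torsion form E = (V/2)[1 - cos(n φ0) cos(n φ)] and obtains cos(n φ) from cos(φ) through the recurrence T_{k+1}(x) = 2x T_k(x) - T_{k-1}(x), so no inverse trigonometric function is needed.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cos_dihedral(p1, p2, p3, p4):
    """cos(phi) for the dihedral p1-p2-p3-p4 from the two plane normals."""
    n1 = _cross(_sub(p2, p1), _sub(p3, p2))
    n2 = _cross(_sub(p3, p2), _sub(p4, p3))
    return _dot(n1, n2) / math.sqrt(_dot(n1, n1) * _dot(n2, n2))


def chebyshev_cos_n(cos_phi, n):
    """Return cos(n*phi) given only cos(phi), i.e. T_n(cos_phi)."""
    if n == 0:
        return 1.0
    t_prev, t_curr = 1.0, cos_phi  # T_0 and T_1
    for _ in range(n - 1):
        t_prev, t_curr = t_curr, 2.0 * cos_phi * t_curr - t_prev
    return t_curr


def uff_torsion_energy(cos_phi, v_barrier, n, cos_n_phi0):
    """E = V/2 * (1 - cos(n*phi0) * cos(n*phi)); parameter names are illustrative."""
    return 0.5 * v_barrier * (1.0 - cos_n_phi0 * chebyshev_cos_n(cos_phi, n))


# A planar trans arrangement (phi = 180 degrees) with a threefold barrier.
trans = cos_dihedral((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, -1, 0))
print(uff_torsion_energy(trans, v_barrier=2.0, n=3, cos_n_phi0=-1.0))
```

Because every operation is a polynomial in cos(φ), the same expression can be applied elementwise to a tensor of dihedral cosines, which is what makes it a natural fit for batched evaluation and autograd.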

Tests

pip install pytest
pytest tests/ -v

The test suite covers 50+ molecules including drug-like compounds, UFF-only molecules (boron, selenium), 3-membered rings, strained systems, phosphorus inversions, and long flexible chains. Energy tests verify exact match with RDKit (machine precision). GPU tests run automatically when CUDA is available.

License

MIT
