Mixed Pbc Support #320

curtischong · 2025-10-28T22:25:55Z

Summary

Updates torchsim to use the latest version of Vesin which supports mixed PBC neighborlist calculations.

Checklist

Before a pull request can be merged, the following items must be checked:

Doc strings have been added in the Google docstring format.
Run ruff on your code.
Tests have been added for any new functionality or bug fixes.

We highly recommended installing the prek hooks running in CI locally to speedup the development process. Simply run pip install prek && prek install to install the hooks which will check your code before each commit.

torch_sim/io.py

curtischong · 2025-11-01T18:16:09Z

torch_sim/models/soft_sphere.py

                cell=cell,
                pbc=self.pbc,
                cutoff=self.cutoff,
-                sorti=False,


yep. this is a spelling mistake

curtischong · 2025-11-01T18:16:59Z

examples/scripts/7_Others/7.3_Batched_neighbor_list.py


-# Fix: Ensure pbc has the correct shape [n_systems, 3]
-pbc_tensor = torch.tensor([[pbc] * 3] * len(atoms_list), dtype=torch.bool)
+pbc_tensor = torch.tensor(pbc).repeat(state.n_systems, 1)


this repeat is needed since torch_nl_linked_cell accepts (num_systems, 3) for the pbc. it doens't just accept a tensor of shape (3,)

curtischong · 2025-11-01T20:58:29Z

tests/test_trajectory.py

    assert trajectory.get_array("atomic_numbers").shape == (1, 10)
    assert trajectory.get_array("cell").shape == (1, 3, 3)
-    assert trajectory.get_array("pbc").shape == (1,)
+    assert trajectory.get_array("pbc").shape == (1, 3)


I'm not 100% sure this is the correct test. should it be shape (1,3) or just shape (3,)

I think pbc should be system_attributes so I would argue for (1, 3)

yeah I saw that we're also probably going to change trajectory reporter soon as well to make it less confusing

I don't think performance-wise making pbc a system attribute will do anything. Since I can see the case for torchsim taking in arbitrary states and manipulating them I am also leaning towards making pbc a system attribute

okay. yeah this is completely wrong. In this PR I'll keep pbc a global attribute and make it (3,). that should be the last one. Man I was quite tired when I first wrote this PR.

Agree let's keep as (3,) for now

curtischong · 2025-11-01T21:12:54Z

torch_sim/models/metatomic.py

                    positions=system_positions,
                    types=system_atomic_numbers,
                    cell=system_cell,
-                    pbc=system_pbc,


the simstate.pbc is a torch.Tensor by default now! so we don't need this

torch_sim/neighbors.py

torch_sim/transforms.py

curtischong · 2025-11-04T15:09:47Z

there is an annoying bug with torch.jit that I'm trying to solve in a test (our nl function takes too long bc we are timing the compilation and the invocation of the function I believe) but the main API for this is pretty much done. So you can review if you want

wrap only the pbc axis for each atom support init simstate with list of bools better test that uses itertools update vesin version fix pbc check in integrators fix more pytests more fixing more fixes VesinNeighborListTorch is slow fix metatensor for pbc more fixes wip fix pbc trajectory trajectory runs bump metatomic version or vesin will complain fix errors wip fix more trajectory issues make trajectory pass fix pbc in diff_sim fix ase atoms to state conversion for pbc fix neighbors.py assert the pymatgen pbc is valid proper conversion between pbc for atoms, and pymatgen do not pass in pbc to phononpy rm warning and add doc to github make consistent with prev implementation fix io tests lint minor simplification simplify test more simplification changes fix some tests satisfy prek but ruff check errors :/ we'll fix this later more cleanup rename renamove more diffs more changes wip fix rm init in md add type checking and fix pbc type in dataclass def cleanup state loosen test for nl

thomasloux

Some modifications are needed in ts.transforms. Otherwise looks correct to me.
I would also suggest using this PR to port pbc to system_attributes

torch_sim/transforms.py

torch_sim/neighbors.py

torch_sim/io.py

thomasloux · 2025-11-06T09:38:51Z

tests/test_trajectory.py

    assert trajectory.get_array("atomic_numbers").shape == (1, 10)
    assert trajectory.get_array("cell").shape == (1, 3, 3)
-    assert trajectory.get_array("pbc").shape == (1,)
+    assert trajectory.get_array("pbc").shape == (1, 3)


I think pbc should be system_attributes so I would argue for (1, 3)

curtischong · 2025-11-07T02:29:00Z

@orionarcher or @abhijeetgangan do you recall why pbc is a global attribute? I'm actually fine with either option (it being a global/system attribute) because I think when ppl are doing simulations, they would choose to fix periodicity for all of the systems in their batch.

I do see the case for it being a system attribute. Just wanted your opinion!

abhijeetgangan · 2025-11-07T02:34:53Z

@curtischong In principle you could have different settings for each system in the batch, even mixed PBC for that matter. Only cases it would break would be where the simulations is meant for PBC systems only like phonons or elastic constants etc. We could certainly add error handling case by case but I think this could be future addition and we can keep and issue open for it.

IMO this PR is already a big feature compared to the latter as I see it having more applications than people running different PBC systems in a batch.

thomasloux · 2025-11-07T10:05:18Z

IMO this PR is already a big feature compared to the latter as I see it having more applications than people running different PBC systems in a batch.

I agree, we should do that separately. Thanks for this PR @curtischong

curtischong · 2025-11-07T15:53:08Z

@thomasloux I have to rewrite how trajectory stores pbc tensors later tonight (it doesn't understand global attributes that are not constants). but other than that, we should be good.

curtischong · 2025-11-08T01:36:31Z

torch_sim/trajectory.py

The tests pass, but I had a hard time following this logic. This file is overdue for an overhaul.

I'd like to see trajectory aligned with the H5MD standard at some point, thats the next overhaul I have planned

orionarcher

Broadly looking good to me, a few comments

orionarcher · 2025-11-09T17:38:25Z

torch_sim/state.py

+        @property
+        def pbc(self) -> torch.Tensor:
+            """A getter for pbc that tells type checkers it's always defined."""
+            return self.pbc


I have mixed feelings about adding nonfunctional code just to satisfy type checkers. Why aren't they picking up on the attribute?

This is because we define the attribute as torch.Tensor | bool so when the dataclass is initialized, users know they can pass those types in. This attribute is converted to a torch.Tensor inside the post_init. This typehint here is to let users know that when they access this attribute it's always a torch.Tensor.

I tried many different things to make this type accurate. One idea I had was to make pbc just a torch.Tensor, and then manually define a custom init for the dataclass. Unfortunately, child classes will have to manually call this init - which is just really messy and bloats up all of the code for children childclasses. I believe that this is the best way to use the nice "auto init" functions that dataclasses provides while having accurate types

I actually still think, we could do the custom init only for SimState, and assume, as it is the case at the moment, that any Subclass of SimState (MDState...) will be created from a init_fn and another SimState, so that pbc will indeed be a torch.Tensor.

Nope. Child dataclasses do not call the inits of parent dataclasses (because they don't know which params to pass in - annoying I know). See #335.

This issue is reported here https://bugs.python.org/issue43835

I should add a comment in the code pointing to that issue

I agree, but my point is that it's not a problem. You can assume it will be a torch.Tensor because it will be initiated from a SimState

If I take a NVT Langevin, the init function is

def nvt_langevin_init( state: SimState | StateDict, model: ModelInterface, *, kT: float | torch.Tensor, seed: int | None = None, **_kwargs: Any, ) -> MDState: """Docs...""" if not isinstance(state, SimState): state = SimState(**state) model_output = model(state) momenta = getattr( state, "momenta", calculate_momenta(state.positions, state.masses, state.system_idx, kT, seed), ) return MDState( positions=state.positions, momenta=momenta, energy=model_output["energy"], forces=model_output["forces"], masses=state.masses, cell=state.cell, pbc=state.pbc, # So pbc is taken from a SimState, so pbc *is* torch.tensor system_idx=state.system_idx, atomic_numbers=state.atomic_numbers, constraints=state.constraints, )

My code design assumes that no user should create directly a MDState, which can be called into question. But it has the advantage of being a simpler code

ahhh I was confusing nvt_langevin_init with the init of MDState. Can you put an issue up? so we can discuss it further (bc it'll prob be in a diff PR). Personally I'm fine with either pattern (using the overrides like I was doing in this PR vs the nvt_langevin_init function)

torch_sim/state.py

orionarcher · 2025-11-09T17:41:22Z

torch_sim/trajectory.py

I'd like to see trajectory aligned with the H5MD standard at some point, thats the next overhaul I have planned

orionarcher · 2025-11-09T17:45:05Z

torch_sim/trajectory.py

-            return self.get_array(prop, start=start, stop=stop)[0]
+            return self.get_array(prop, start=start, stop=stop)

-        arrays["cell"] = np.expand_dims(return_prop(self, "cell", frame), axis=0)
-        arrays["atomic_numbers"] = return_prop(self, "atomic_numbers", frame)
-        arrays["masses"] = return_prop(self, "masses", frame)
+        arrays["cell"] = np.expand_dims(return_prop(self, "cell", frame), axis=0)[0]
+        arrays["atomic_numbers"] = return_prop(self, "atomic_numbers", frame)[0]
+        arrays["masses"] = return_prop(self, "masses", frame)[0]


why move the index here?

I updated the return_prop function (line 831) to not auto return the [0] index. This was done because return_prop(self, "pbc", frame) is now a global attribute that is a tensor (so we don't want to just return the first index, but the entire tensor).

So for all of the attributes, the functionality is the exact same - it's actually just a change to make PBC to work. (so we can return a tensor for global attribuets - not just constants - which was what it was before)

are you sure about that? I think it's just because get_array(name, n_frame, n_frame+1) normally return (1, size). You modify it to squeeze for pbc, so indeed you need to modify here. But I think you shouldn't squeeze pbc

orionarcher · 2025-11-09T17:47:54Z

IMO this PR is already a big feature compared to the latter as I see it having more applications than people running different PBC systems in a batch.

Given that we may soon have batched neighbor list calculations it's unclear to me if we want to support mixed-PBC. It's nice in principle but I would bet it's very rarely needed in practice and may complicate neighbor list calculations.

curtischong · 2025-11-10T00:48:17Z

I think catalytic applications would benefit from mixed PBC (without having to manually create a vacuum).

Mixed PBC does complicate the writing of kernels. Here is an example of what Vesin had to do when implementing mixed PBC https://github.com/Luthaf/vesin/pull/76/files#diff-38d490d3ab0a91479a981cd188ef933c68d7cbb24bc90ab3d66595f7da4e0da6R133
I do think that this PR does seem larger than it actually is because I think mic displacement vectors were changed to match ASE's.

Here was my implementation of mixed PBC in vesin (albeit with the wrong mic displacement vectors) and it's quite small
https://github.com/Luthaf/vesin/pull/75/files#diff-38d490d3ab0a91479a981cd188ef933c68d7cbb24bc90ab3d66595f7da4e0da6R74

We should ask the nvidia ppl to see how hard it might be to add mixed PBC (can they use the same algorithms?). It might be a really small change like the diff in my PR

torch_sim/trajectory.py

thomasloux · 2025-11-10T15:11:11Z

torch_sim/trajectory.py

-            return self.get_array(prop, start=start, stop=stop)[0]
+            return self.get_array(prop, start=start, stop=stop)

-        arrays["cell"] = np.expand_dims(return_prop(self, "cell", frame), axis=0)
-        arrays["atomic_numbers"] = return_prop(self, "atomic_numbers", frame)
-        arrays["masses"] = return_prop(self, "masses", frame)
+        arrays["cell"] = np.expand_dims(return_prop(self, "cell", frame), axis=0)[0]
+        arrays["atomic_numbers"] = return_prop(self, "atomic_numbers", frame)[0]
+        arrays["masses"] = return_prop(self, "masses", frame)[0]


are you sure about that? I think it's just because get_array(name, n_frame, n_frame+1) normally return (1, size). You modify it to squeeze for pbc, so indeed you need to modify here. But I think you shouldn't squeeze pbc

curtischong · 2025-11-11T04:19:30Z

torch_sim/trajectory.py


        self.flush()

+    def write_global_array(self, name: str, array: np.ndarray | torch.Tensor) -> None:


I needed to write this function just for pbc. because otherwise the trajectory format will add an extra dimension (bc write_arrays thinks there is a system_idx dimension)

curtischong · 2025-11-11T04:22:11Z

torch_sim/trajectory.py

        arrays["positions"] = self.get_array("positions", start=frame, stop=frame + 1)[0]

        def return_prop(self: Self, prop: str, frame: int) -> np.ndarray:
+            if prop == "pbc":


yes I hate this extra if statement as well. this trajectory file format also doesn't egneralize well to global attributes from child simstates. we need to settle on a good simstate definition for attributes soon

curtischong changed the title ~~Pbc~~ Mixed Pbc Oct 28, 2025

curtischong changed the title ~~Mixed Pbc~~ Mixed Pbc Support Oct 28, 2025

curtischong force-pushed the pbc branch from 33b6c2f to 4fb63c6 Compare November 1, 2025 02:19

curtischong commented Nov 1, 2025

View reviewed changes

torch_sim/io.py Show resolved Hide resolved

curtischong commented Nov 1, 2025

View reviewed changes

curtischong force-pushed the pbc branch 2 times, most recently from 94065f9 to 88d8036 Compare November 1, 2025 20:55

curtischong commented Nov 1, 2025

View reviewed changes

torch_sim/neighbors.py Show resolved Hide resolved

curtischong commented Nov 1, 2025

View reviewed changes

torch_sim/transforms.py Outdated Show resolved Hide resolved

curtischong force-pushed the pbc branch 3 times, most recently from 5aba8d7 to fcf1d1a Compare November 3, 2025 00:23

curtischong added the breaking Breaking changes label Nov 3, 2025

curtischong force-pushed the pbc branch from fcf1d1a to f0e7ac5 Compare November 3, 2025 14:40

curtischong added the feature Entirely new features, not improvements to existing ones label Nov 3, 2025

curtischong force-pushed the pbc branch from f0e7ac5 to b6208e0 Compare November 4, 2025 03:22

curtischong force-pushed the pbc branch 2 times, most recently from 19029e5 to 4f77996 Compare November 5, 2025 00:19

curtischong marked this pull request as ready for review November 5, 2025 00:19

curtischong requested review from orionarcher and thomasloux November 5, 2025 00:19

curtischong force-pushed the pbc branch from 4f77996 to 5097cef Compare November 5, 2025 00:19

thomasloux requested changes Nov 6, 2025

View reviewed changes

curtischong commented Nov 8, 2025

View reviewed changes

curtischong force-pushed the pbc branch 2 times, most recently from 4dd71ad to 7866f65 Compare November 8, 2025 01:37

Thomas' review

2850f50

curtischong force-pushed the pbc branch from 7866f65 to 2850f50 Compare November 8, 2025 01:38

curtischong requested a review from thomasloux November 8, 2025 01:39

orionarcher reviewed Nov 9, 2025

View reviewed changes

orion's review

300315a

curtischong requested a review from orionarcher November 10, 2025 00:49

thomasloux reviewed Nov 10, 2025

View reviewed changes

curtischong force-pushed the pbc branch 2 times, most recently from 4375c2e to 3a11f02 Compare November 11, 2025 04:18

curtischong commented Nov 11, 2025

View reviewed changes

solve pbc trajectory issues

6b2912f

curtischong force-pushed the pbc branch from 3a11f02 to 6b2912f Compare November 11, 2025 04:21

curtischong commented Nov 11, 2025

View reviewed changes


		self.flush()

		def write_global_array(self, name: str, array: np.ndarray \| torch.Tensor) -> None:

Mixed Pbc Support #320

Are you sure you want to change the base?

Mixed Pbc Support #320

Uh oh!

Conversation

curtischong commented Oct 28, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Checklist

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

curtischong Nov 7, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

curtischong commented Nov 4, 2025

Uh oh!

thomasloux left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

curtischong commented Nov 7, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

abhijeetgangan commented Nov 7, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

thomasloux commented Nov 7, 2025

Uh oh!

curtischong commented Nov 7, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

orionarcher left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

curtischong commented Oct 28, 2025 •

edited

Loading

curtischong Nov 7, 2025 •

edited

Loading

curtischong commented Nov 7, 2025 •

edited

Loading

abhijeetgangan commented Nov 7, 2025 •

edited

Loading

curtischong commented Nov 7, 2025 •

edited

Loading

thomasloux Nov 10, 2025 •

edited

Loading

curtischong commented Nov 10, 2025 •

edited

Loading

thomasloux Nov 10, 2025 •

edited

Loading

curtischong Nov 11, 2025 •

edited

Loading