feat: add optional boundary datastore support by sadamov · Pull Request #635 · mllam/neural-lam

sadamov · 2026-05-11T14:01:20Z

Describe your changes

Add optional boundary datastore support to WeatherDataset and WeatherDataModule, enabling LAM models to ingest boundary forcing from a separate domain (e.g. ERA5 for a COSMO/DANRA interior).

NeuralLAMConfig accepts an optional datastore_boundary field
load_config_and_datastore returns a 3-tuple (config, datastore, datastore_boundary)
WeatherDataset.__getitem__ returns a 5-tuple (init_states, target_states, forcing, boundary, target_times) where the boundary tensor is empty (last dim 0) when no boundary datastore is configured
New CLI args --num_past_boundary_steps / --num_future_boundary_steps control the boundary forcing window size
ForecasterModule.common_step unpacks the boundary tensor but does not yet wire it into the forward pass (model-side integration is planned as a separate PR)
Boundary datastore can be any registered datastore type (mdp, npyfilesmeps, etc.)
MDPDatastore and NpyFilesDatastoreMEPS both handle boundary-only configs (forcing + static, no state variables) gracefully
ERA5 boundary test configs added at tests/datastore_examples/mdp/era5_1000hPa_danra_100m_winds/ (WeatherBench2 64x32 equiangular grid as boundary for DANRA interior), with init_datastore_boundary_example() fixture in conftest.py

This is PR A in the boundary datastore plan outlined in #108. PR B (model-side boundary handling) and PR C (#636, boundary plotting) will follow.

Issue Link

refs #108

Type of change

🐛 Bug fix (non-breaking change that fixes an issue)
✨ New feature (non-breaking change that adds functionality)
💥 Breaking change (fix or feature that would cause existing functionality to not work as expected)
📖 Documentation (Addition or improvements to documentation)

Note on breaking change: WeatherDataset.__getitem__ now returns a 5-tuple instead of 4-tuple, and load_config_and_datastore returns a 3-tuple instead of 2-tuple. All callers in the repo have been updated.

Checklist before requesting a review

My branch is up-to-date with the target branch - if not update your fork with the changes from the target branch (use pull with --rebase option if possible).
I have performed a self-review of my code
For any new/modified functions/classes I have added docstrings that clearly describe its purpose, expected inputs and returned values
I have placed in-line comments to clarify the intent of any hard-to-understand passages of my code
I have updated the README to cover introduced code changes
I have added tests that prove my fix is effective or that my feature works
I have given the PR a name that clearly describes the change, written in imperative form (context).
I have requested a reviewer and an assignee (assignee is responsible for merging). This applies only if you have write access to the repo, otherwise feel free to tag a maintainer to add a reviewer and assignee.

Checklist for reviewers

Each PR comes with its own improvements and flaws. The reviewer should check the following:

the code is readable
the code is well tested
the code is documented (including return types and parameters)
the code is easy to maintain

Author checklist after completed review

I have added a line to the CHANGELOG describing this change, in a section
reflecting type of change (add section where missing):
- added: when you have added new functionality
- changed: when default behaviour of the code has been changed
- fixes: when your contribution fixes a bug
- maintenance: when your contribution is relates to repo maintenance, e.g. CI/CD or documentation

Checklist for assignee

PR is up to date with the base branch
the tests pass
(if the PR is not just maintenance/bugfix) the PR is assigned to the next milestone. If it is not, propose it for a future milestone.
author has added an entry to the changelog (and designated the change as added, changed, fixed or maintenance)
Once the PR is ready to be merged, squash commits and merge the PR.

refs #138

Add support for loading boundary forcing from a separate datastore, enabling LAM models to ingest boundary conditions from a different domain (e.g. ERA5 boundaries for a COSMO/DANRA interior). - NeuralLAMConfig accepts optional `datastore_boundary` field - load_config_and_datastore returns 3-tuple (config, datastore, datastore_boundary) - WeatherDataset loads, windows, and standardizes boundary forcing - __getitem__ returns 5-tuple (init_states, target_states, forcing, boundary, target_times) - New CLI args --num_past_boundary_steps / --num_future_boundary_steps - ForecasterModule.common_step unpacks boundary (not yet wired to forward) - 4 new boundary-specific tests, all 157 tests pass refs mllam#108 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Pass datastore_boundary through train_model.py into ForecasterModule. During --eval, plot_examples loads raw boundary forcing and overlays it underneath prediction/target panels via vis.plot_prediction. Add four boundary plotting tests using BoundaryDummyDatastore from PR mllam#635. Update README to document boundary plotting during evaluation. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add MDP-based ERA5 boundary example at tests/datastore_examples/mdp/era5_1000hPa_danra_100m_winds/ with config.yaml, era5.datastore.yaml (WeatherBench2 64x32 equiangular), and danra.datastore.yaml (DANRA 100m winds interior). Add DATASTORES_BOUNDARY_EXAMPLES dict and init_datastore_boundary_example() to conftest.py for use in boundary integration tests. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

MDPDatastore.__init__ crashed with KeyError when loading a datastore that has only forcing+static (no state), e.g. ERA5 boundary data. Fix is_ensemble check to guard against missing state, and grid_shape_state to fall back to forcing/static categories. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Make _get_analysis_times fall back to forcing file patterns when no state files exist, guard get_dataarray("state") against empty var_names, and prevent empty feature list from matching state loading path. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Guard against missing state/static feature keys in the zarr, not just forcing. Boundary-only datastores (e.g. ERA5) may lack state_feature entirely. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Return len(grid_index) directly instead of computing from grid_shape_state, which is more robust for boundary-only datastores. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…eat/boundary-datastore

- Shrink the ERA5 boundary test dataset to 2022-03-30..2022-04-12 (was 1990-2022), add per-input lat/lon coord_ranges and enable mllam's convex-hull domain_cropping with include_interior_points=true. Stats computation drops from minutes to seconds and the cached zarr stays under 1 MB. - Stack [longitude, latitude] directly into grid_index in the era5 dim_mapping (per mllam's example.era5_cropped.yaml) so the original coord names survive for the convex-hull crop -- removes the need for any rename-preserve workaround in neural-lam. - Generalise the MDPDatastore units loop over self.spatial_coordinates with sensible defaults for x/y (m) and longitude/latitude/lon/lat (degrees_*) so ERA5-style geographic datastores work. - Register a pytest `slow` marker and a `--run-slow` CLI flag so the ERA5 boundary integration test (added in this PR) is skipped by default and can be run on demand. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Pass datastore_boundary through train_model.py into ForecasterModule. During --eval, plot_examples loads raw boundary forcing and overlays it underneath prediction/target panels via vis.plot_prediction. Add four boundary plotting tests using BoundaryDummyDatastore from PR mllam#635. Update README to document boundary plotting during evaluation. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Drop state metadata after init and raise KeyError on `state` lookups so plotting/model code that accidentally queries state on a boundary fails loudly. Real ERA5-style boundary datastores expose only forcing fields, and the existing boundary tests (test_datasets.py) only ever access forcing on the boundary, so making the dummy state-less brings it closer to real boundary semantics without changing test behaviour.

Pass datastore_boundary through train_model.py into ForecasterModule. During --eval, plot_examples loads raw boundary forcing and overlays it underneath prediction/target panels via vis.plot_prediction. Add four boundary plotting tests using BoundaryDummyDatastore from PR mllam#635. Update README to document boundary plotting during evaluation. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Asserts plot_prediction works against a boundary datastore with no state category (forcing-only), exercising the get_xy("forcing") / get_lat_lon( "forcing") path in vis.plot_on_axis. Pairs with the BoundaryDummyDatastore state-less change in mllam#635.

Resolve conflicts from mllam#239 (normalize on GPU): - weather_dataset.py: drop the CPU standardization path (state/forcing/boundary stats setup, _compute_std_safe, in-__getitem__ scaling, and the standardize= plumbing in WeatherDataModule); keep the boundary-datastore feature and the 5-tuple sample (init, target, forcing, boundary, times). - models/module.py: on_after_batch_transfer now unpacks/returns the 5-tuple, standardizing state+forcing on-device and passing boundary through unchanged (boundary is not yet consumed by the forecaster on this branch). - tests/test_datasets.py: drop the dataset-standardization tests mllam#239 removed (incl. boundary standardization, now a GPU concern); keep the structural boundary tests without the removed standardize= kwarg. - tests/test_gpu_normalization.py: feed/expect the 5-tuple batch. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

joeloskarsson

It's so great that you started the work on this 😄 I had a look through everything except the tests now. Had one major comment about the interior-boundary data alignment in the dataset, otherwise mostly small things.

…boundary-datastore

…pe_state Previously a datastore with no state, forcing or static would silently reach `grid_shape_state` and fail with a confusing fallback error. Now the constructor raises immediately if neither state nor forcing is present, and the fallback in `grid_shape_state` collapses to a simple `state if present else forcing` pick. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Boundary features are unpacked from the batch but the forecaster forward pass does not consume them yet; the model-side wiring lands in a follow-up PR (mllam#108). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…is/forecast modes Replace integer-idx boundary slicing with time-based nearest-neighbor (pad) lookup so the boundary datastore can have a different step length than the interior, and either side may be in analysis or forecast mode. - Add `get_time_step`, `check_time_overlap`, `crop_time_if_needed` helpers in `neural_lam.utils` (ported from the research branch in joeloskarsson/neural-lam-dev, with an extra guard against silent argmax-on-all-false cropping). - Refactor `WeatherDataset`: precompute the within-sample state step and any forecast lead-time step in __init__; run `crop_time_if_needed` + `check_time_overlap` against the boundary so the first/last samples never fall outside boundary coverage; replace `_slice_forcing_time` with `_window_forcing_in_time` for time-aligned windowing of cross-datastore boundary; preserve the original integer- idx fast path as `_window_same_forecast_by_idx` for same-datastore forecast forcing (npyfilesmeps has non-unique analysis_time so the pandas pad-lookup cannot be used there). - Window alignment matches the existing forcing convention (target time, i.e. `state_times[init_steps + i]`). - Split `test_boundary_dataset_length_unchanged` into a no-crop and a cropping case to document the new behaviour. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Port a slimmed-down version of the alignment tests from joeloskarsson/neural-lam-dev to exercise the new time-based boundary windowing in WeatherDataset. - Extend `SinglePointDummyDatastore` with forecast-mode support. - Add a boundary-only variant whose state lookup raises KeyError, to catch any path that accidentally asks the boundary for state. - `test_time_slicing_boundary_analysis`: parametrised over past/future window sizes, asserts exact boundary window values around each target state time. - `test_boundary_step_length_mismatch_supported`: 1h interior with a 6h boundary, verifies the pad-matched lookup. - `test_forecast_interior_with_analysis_boundary` and `test_analysis_interior_with_forecast_boundary`: the two mixed analysis/forecast combinations. - `test_check_time_overlap_insufficient_raises`: surface the cropping failure path with a clear error. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Asserts plot_prediction works against a boundary datastore with no state category (forcing-only), exercising the get_xy("forcing") / get_lat_lon( "forcing") path in vis.plot_on_axis. Pairs with the BoundaryDummyDatastore state-less change in mllam#635.

observingClouds

LGTM. I just wonder if we need to raise a deprecation notice for the new config format, so that users now how to change it.

Loading a config that still uses the old top-level `datastore:` key now raises a clear InvalidConfigError pointing to the named `datastores:` mapping. Add a regression test and note the behaviour in the CHANGELOG. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

sadamov · 2026-06-17T02:28:39Z

@joeloskarsson this is ready for another look when you have time. Summary of what changed since your review:

reworked to the named datastores: mapping we landed on in [RFC/Design]: Support for multiple datastores #652 (state/forcing meaning kept, just named), and added plotting.boundary_datastore to name the boundary overlay source - that's your option 3 from [RFC/Design]: Support for multiple datastores #652
migration guard in load_config_and_datastore raising InvalidConfigError on the legacy top-level datastore: key, so users get pointed at the new format (@observingClouds )
the interior-boundary time alignment is ported from the research branch (analysis/forecast interior crossed with analysis/forecast boundary, all four combinations), with coverage in test_time_slicing.py among other new tests
@leifdenby documented the state/forcing/static input/output meaning in the README Data section as you asked in [RFC/Design]: Support for multiple datastores #652 - have a look.

Add an input/output table for the state/forcing/static categories in the Data section, as requested in mllam#652, so the datastore role-by-category explanation has an explicit definition to reference. Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

joeloskarsson

Great work with all of this! All my previous comments are fixed, but as you know the time slicing changes meant that there was a lot of new code added. So let me now be annoying and come with more suggestions from looking over also these ;) I still think these are mostly small things, but a couple that maybe need some discussions. Partly this is because I can't remember why we did certain decisions earlier, so don't know if those decisions should be carried forward as we now merge this work to main.

The parenthetical described a `diagnostic` (output-only) category that does not exist in the codebase; tracked separately instead. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Make explicit that exactly one interior and at most one boundary datastore are supported, matching the constraint enforced in load_config_and_datastore. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Co-authored-by: Joel Oskarsson <joel.oskarsson@outlook.com>

Give the two dataarrays in check_time_overlap and crop_time_if_needed role-based names matching the da_ prefix convention used elsewhere: da_requested is the driving timeline, da_available must cover its windows. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

crop_time_if_needed previously returned a forecast da_requested unchanged, deferring alignment errors. Drop whole out-of-coverage analysis_time launches instead (logged like the analysis-mode crop), addressing Joel's review point that forecast cropping is no less safe than analysis cropping. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Refer to da_requested/da_available in the error and log messages of check_time_overlap and crop_time_if_needed instead of "boundary forcing"/"interior", matching the agnostic signatures. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

crop_time_if_needed and check_time_overlap bounded a forecast boundary's reach by a single analysis step, over-cropping the interior for boundaries with few launches but long forecasts. Use the boundary's max elapsed_forecast_duration as its reach past the last launch in required_max, keeping the two functions consistent. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Replace the O(T) boolean-mask scan in crop_time_if_needed with two np.searchsorted calls on the already-sorted time axis, dropping the mask allocation for an O(log T) lookup. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

_time_step_state and _forecast_step_forcing were set in __init__ but never read; only _forecast_step_boundary is used. Drop both. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

crop_time_if_needed now raises whenever the valid window is empty, so a successful crop always leaves da_state within boundary coverage. Drop the redundant post-crop check_time_overlap call (and its import) from WeatherDataset.__init__, and rename the now-misnamed coverage test. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Convert the lone center_time <= target_time assert in _window_forcing_in_time to a ValueError so the windowing path raises uniformly. The remaining da_state assertions are type-narrowing guards. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

sadamov · 2026-06-25T09:32:48Z

@joeloskarsson thanks for the review, there were definitely some hickups in the code, still. I left a few unresolved where you can give your input. All others I agreed with and they are implemented with individual commits.

joeloskarsson · 2026-06-26T08:15:13Z

 from .base import BaseRegularGridDatastore, CartesianGridShape


 class MDPDatastore(BaseRegularGridDatastore):


Something struck me when I was thinking about this change: Is it maybe a problem that MDPDatastore is a BaseRegularGridDatastore now? Since if we want to use this type of datastore for the boundary data, this will generally not be on a regular grid? Seems like something that might cause silence bugs if one calls the Regular grid methods on the boundary datastore.

Not sure how much the MDP datastore really relies on the regular grid construction? This might not be such a tricky thing to change, but need to look deeper into it.

joeloskarsson

I have gone over all changes, and resolved as suitable. The only things I still think need some consideration are the few open comment threads, in particular my latest comment about the MDP datastore always being the regular grid kind.

sadamov and others added 2 commits May 11, 2026 16:00

Update CHANGELOG PR link to mllam#635

3dc0045

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

sadamov self-assigned this May 11, 2026

sadamov added the enhancement New feature or request label May 11, 2026

sadamov added this to the v0.8.0 milestone May 11, 2026

sadamov requested review from joeloskarsson and leifdenby May 11, 2026 14:13

sadamov mentioned this pull request May 11, 2026

Add boundary data overlay to evaluation plots #636

Open

21 tasks

sadamov and others added 9 commits May 11, 2026 17:17

docs: mention MDP boundary-only datastore support in CHANGELOG

affb441

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

fix: MDP get_vars_names/get_vars_long_names for any missing category

e5d8061

Guard against missing state/static feature keys in the zarr, not just forcing. Boundary-only datastores (e.g. ERA5) may lack state_feature entirely. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

fix: add num_grid_points override to MDPDatastore

8d8bb77

Return len(grid_index) directly instead of computing from grid_shape_state, which is more robust for boundary-only datastores. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Merge branch 'main' into feat/boundary-datastore

2b12e76

Merge remote-tracking branch 'sadamov/feat/boundary-datastore' into f…

b3e4ef6

…eat/boundary-datastore

sadamov and others added 2 commits May 18, 2026 09:22

Merge branch 'main' into feat/boundary-datastore

a839263

joeloskarsson requested changes May 22, 2026

View reviewed changes

Comment thread neural_lam/datastore/mdp.py Outdated

Comment thread neural_lam/models/module.py Outdated

Comment thread neural_lam/models/module.py

Comment thread neural_lam/weather_dataset.py Outdated

Comment thread neural_lam/weather_dataset.py Outdated

Comment thread tests/test_datasets.py

sadamov and others added 5 commits May 28, 2026 05:03

Merge branch 'main' of https://github.com/mllam/neural-lam into feat/…

41ee6f6

…boundary-datastore

observingClouds reviewed Jun 16, 2026

View reviewed changes

observingClouds approved these changes Jun 16, 2026

View reviewed changes

sadamov and others added 2 commits June 16, 2026 20:59

linting formatting and detailed review

22310b6

sadamov removed the discussion label Jun 17, 2026

sadamov linked an issue Jun 17, 2026 that may be closed by this pull request

[RFC/Design]: Support for multiple datastores #652

Open

joeloskarsson requested changes Jun 23, 2026

View reviewed changes

sadamov and others added 2 commits June 25, 2026 07:14

Merge branch 'main' into feat/boundary-datastore

1a4ff17

Drop unimplemented diagnostic category note from README

059033d

The parenthetical described a `diagnostic` (output-only) category that does not exist in the codebase; tracked separately instead. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

sadamov mentioned this pull request Jun 25, 2026

Add a diagnostic data category (output-only) #684

Open

sadamov and others added 10 commits June 25, 2026 08:19

Document single-boundary-datastore constraint in README

8b08abd

Make explicit that exactly one interior and at most one boundary datastore are supported, matching the constraint enforced in load_config_and_datastore. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Update neural_lam/train_model.py

682f792

Co-authored-by: Joel Oskarsson <joel.oskarsson@outlook.com>

Use binary search for the crop cut points

441ce61

Replace the O(T) boolean-mask scan in crop_time_if_needed with two np.searchsorted calls on the already-sorted time axis, dropping the mask allocation for an O(log T) lookup. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Remove dead time-step attributes from WeatherDataset

a75a1ad

_time_step_state and _forecast_step_forcing were set in __init__ but never read; only _forecast_step_boundary is used. Drop both. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

sadamov requested a review from joeloskarsson June 25, 2026 09:33

added helper for empty dataarrays

c096dc4

joeloskarsson reviewed Jun 26, 2026

View reviewed changes

joeloskarsson mentioned this pull request Jun 27, 2026

Rename d_mesh_static dimension to follow canonical naming convention #687

Open

uttam12331 mentioned this pull request Jun 28, 2026

refactor: rename d_mesh_static dimension to num_mesh_static_vars #695

Open

21 tasks

joeloskarsson requested changes Jul 3, 2026

View reviewed changes

		from .base import BaseRegularGridDatastore, CartesianGridShape


		class MDPDatastore(BaseRegularGridDatastore):

Uh oh!

Conversation

sadamov commented May 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Describe your changes

Issue Link

Type of change

Checklist before requesting a review

Checklist for reviewers

Author checklist after completed review

Checklist for assignee

Uh oh!

joeloskarsson left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

observingClouds left a comment

Choose a reason for hiding this comment

Uh oh!

sadamov commented Jun 17, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

joeloskarsson left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

sadamov commented Jun 25, 2026

Uh oh!

joeloskarsson Jun 26, 2026

Choose a reason for hiding this comment

Uh oh!

joeloskarsson left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

sadamov commented May 11, 2026 •

edited

Loading

sadamov commented Jun 17, 2026 •

edited

Loading