Skip to content

feat: add optional boundary datastore support#635

Open
sadamov wants to merge 43 commits into
mllam:mainfrom
sadamov:feat/boundary-datastore
Open

feat: add optional boundary datastore support#635
sadamov wants to merge 43 commits into
mllam:mainfrom
sadamov:feat/boundary-datastore

Conversation

@sadamov

@sadamov sadamov commented May 11, 2026

Copy link
Copy Markdown
Collaborator

Describe your changes

Add optional boundary datastore support to WeatherDataset and WeatherDataModule, enabling LAM models to ingest boundary forcing from a separate domain (e.g. ERA5 for a COSMO/DANRA interior).

  • NeuralLAMConfig accepts an optional datastore_boundary field
  • load_config_and_datastore returns a 3-tuple (config, datastore, datastore_boundary)
  • WeatherDataset.__getitem__ returns a 5-tuple (init_states, target_states, forcing, boundary, target_times) where the boundary tensor is empty (last dim 0) when no boundary datastore is configured
  • New CLI args --num_past_boundary_steps / --num_future_boundary_steps control the boundary forcing window size
  • ForecasterModule.common_step unpacks the boundary tensor but does not yet wire it into the forward pass (model-side integration is planned as a separate PR)
  • Boundary datastore can be any registered datastore type (mdp, npyfilesmeps, etc.)
  • MDPDatastore and NpyFilesDatastoreMEPS both handle boundary-only configs (forcing + static, no state variables) gracefully
  • ERA5 boundary test configs added at tests/datastore_examples/mdp/era5_1000hPa_danra_100m_winds/ (WeatherBench2 64x32 equiangular grid as boundary for DANRA interior), with init_datastore_boundary_example() fixture in conftest.py

This is PR A in the boundary datastore plan outlined in #108. PR B (model-side boundary handling) and PR C (#636, boundary plotting) will follow.

Issue Link

refs #108

Type of change

  • 🐛 Bug fix (non-breaking change that fixes an issue)
  • ✨ New feature (non-breaking change that adds functionality)
  • 💥 Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • 📖 Documentation (Addition or improvements to documentation)

Note on breaking change: WeatherDataset.__getitem__ now returns a 5-tuple instead of 4-tuple, and load_config_and_datastore returns a 3-tuple instead of 2-tuple. All callers in the repo have been updated.

Checklist before requesting a review

  • My branch is up-to-date with the target branch - if not update your fork with the changes from the target branch (use pull with --rebase option if possible).
  • I have performed a self-review of my code
  • For any new/modified functions/classes I have added docstrings that clearly describe its purpose, expected inputs and returned values
  • I have placed in-line comments to clarify the intent of any hard-to-understand passages of my code
  • I have updated the README to cover introduced code changes
  • I have added tests that prove my fix is effective or that my feature works
  • I have given the PR a name that clearly describes the change, written in imperative form (context).
  • I have requested a reviewer and an assignee (assignee is responsible for merging). This applies only if you have write access to the repo, otherwise feel free to tag a maintainer to add a reviewer and assignee.

Checklist for reviewers

Each PR comes with its own improvements and flaws. The reviewer should check the following:

  • the code is readable
  • the code is well tested
  • the code is documented (including return types and parameters)
  • the code is easy to maintain

Author checklist after completed review

  • I have added a line to the CHANGELOG describing this change, in a section
    reflecting type of change (add section where missing):
    • added: when you have added new functionality
    • changed: when default behaviour of the code has been changed
    • fixes: when your contribution fixes a bug
    • maintenance: when your contribution is relates to repo maintenance, e.g. CI/CD or documentation

Checklist for assignee

  • PR is up to date with the base branch
  • the tests pass
  • (if the PR is not just maintenance/bugfix) the PR is assigned to the next milestone. If it is not, propose it for a future milestone.
  • author has added an entry to the changelog (and designated the change as added, changed, fixed or maintenance)
  • Once the PR is ready to be merged, squash commits and merge the PR.

refs #138

sadamov and others added 2 commits May 11, 2026 16:00
Add support for loading boundary forcing from a separate datastore,
enabling LAM models to ingest boundary conditions from a different
domain (e.g. ERA5 boundaries for a COSMO/DANRA interior).

- NeuralLAMConfig accepts optional `datastore_boundary` field
- load_config_and_datastore returns 3-tuple (config, datastore, datastore_boundary)
- WeatherDataset loads, windows, and standardizes boundary forcing
- __getitem__ returns 5-tuple (init_states, target_states, forcing, boundary, target_times)
- New CLI args --num_past_boundary_steps / --num_future_boundary_steps
- ForecasterModule.common_step unpacks boundary (not yet wired to forward)
- 4 new boundary-specific tests, all 157 tests pass

refs mllam#108

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@sadamov sadamov self-assigned this May 11, 2026
@sadamov sadamov added the enhancement New feature or request label May 11, 2026
@sadamov sadamov added this to the v0.8.0 milestone May 11, 2026
sadamov added a commit to sadamov/neural-lam that referenced this pull request May 11, 2026
Pass datastore_boundary through train_model.py into ForecasterModule.
During --eval, plot_examples loads raw boundary forcing and overlays it
underneath prediction/target panels via vis.plot_prediction. Add four
boundary plotting tests using BoundaryDummyDatastore from PR mllam#635.
Update README to document boundary plotting during evaluation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
sadamov and others added 9 commits May 11, 2026 17:17
Add MDP-based ERA5 boundary example at
tests/datastore_examples/mdp/era5_1000hPa_danra_100m_winds/ with
config.yaml, era5.datastore.yaml (WeatherBench2 64x32 equiangular),
and danra.datastore.yaml (DANRA 100m winds interior).

Add DATASTORES_BOUNDARY_EXAMPLES dict and
init_datastore_boundary_example() to conftest.py for use in boundary
integration tests.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
MDPDatastore.__init__ crashed with KeyError when loading a datastore
that has only forcing+static (no state), e.g. ERA5 boundary data.
Fix is_ensemble check to guard against missing state, and
grid_shape_state to fall back to forcing/static categories.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Make _get_analysis_times fall back to forcing file patterns when no
state files exist, guard get_dataarray("state") against empty var_names,
and prevent empty feature list from matching state loading path.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Guard against missing state/static feature keys in the zarr, not just
forcing. Boundary-only datastores (e.g. ERA5) may lack state_feature
entirely.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Return len(grid_index) directly instead of computing from
grid_shape_state, which is more robust for boundary-only datastores.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Shrink the ERA5 boundary test dataset to 2022-03-30..2022-04-12 (was
  1990-2022), add per-input lat/lon coord_ranges and enable mllam's
  convex-hull domain_cropping with include_interior_points=true. Stats
  computation drops from minutes to seconds and the cached zarr stays
  under 1 MB.
- Stack [longitude, latitude] directly into grid_index in the era5
  dim_mapping (per mllam's example.era5_cropped.yaml) so the original
  coord names survive for the convex-hull crop -- removes the need for
  any rename-preserve workaround in neural-lam.
- Generalise the MDPDatastore units loop over self.spatial_coordinates
  with sensible defaults for x/y (m) and longitude/latitude/lon/lat
  (degrees_*) so ERA5-style geographic datastores work.
- Register a pytest `slow` marker and a `--run-slow` CLI flag so the
  ERA5 boundary integration test (added in this PR) is skipped by
  default and can be run on demand.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
sadamov added a commit to sadamov/neural-lam that referenced this pull request May 14, 2026
Pass datastore_boundary through train_model.py into ForecasterModule.
During --eval, plot_examples loads raw boundary forcing and overlays it
underneath prediction/target panels via vis.plot_prediction. Add four
boundary plotting tests using BoundaryDummyDatastore from PR mllam#635.
Update README to document boundary plotting during evaluation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Drop state metadata after init and raise KeyError on `state` lookups so
plotting/model code that accidentally queries state on a boundary fails
loudly. Real ERA5-style boundary datastores expose only forcing fields,
and the existing boundary tests (test_datasets.py) only ever access
forcing on the boundary, so making the dummy state-less brings it
closer to real boundary semantics without changing test behaviour.
sadamov added a commit to sadamov/neural-lam that referenced this pull request May 15, 2026
Pass datastore_boundary through train_model.py into ForecasterModule.
During --eval, plot_examples loads raw boundary forcing and overlays it
underneath prediction/target panels via vis.plot_prediction. Add four
boundary plotting tests using BoundaryDummyDatastore from PR mllam#635.
Update README to document boundary plotting during evaluation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
sadamov added a commit to sadamov/neural-lam that referenced this pull request May 15, 2026
Asserts plot_prediction works against a boundary datastore with no state
category (forcing-only), exercising the get_xy("forcing") / get_lat_lon(
"forcing") path in vis.plot_on_axis. Pairs with the BoundaryDummyDatastore
state-less change in mllam#635.
sadamov and others added 2 commits May 18, 2026 09:22
Resolve conflicts from mllam#239 (normalize on GPU):
- weather_dataset.py: drop the CPU standardization path (state/forcing/boundary
  stats setup, _compute_std_safe, in-__getitem__ scaling, and the
  standardize= plumbing in WeatherDataModule); keep the boundary-datastore
  feature and the 5-tuple sample (init, target, forcing, boundary, times).
- models/module.py: on_after_batch_transfer now unpacks/returns the 5-tuple,
  standardizing state+forcing on-device and passing boundary through unchanged
  (boundary is not yet consumed by the forecaster on this branch).
- tests/test_datasets.py: drop the dataset-standardization tests mllam#239 removed
  (incl. boundary standardization, now a GPU concern); keep the structural
  boundary tests without the removed standardize= kwarg.
- tests/test_gpu_normalization.py: feed/expect the 5-tuple batch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@joeloskarsson joeloskarsson left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's so great that you started the work on this 😄 I had a look through everything except the tests now. Had one major comment about the interior-boundary data alignment in the dataset, otherwise mostly small things.

Comment thread neural_lam/datastore/mdp.py Outdated
Comment thread neural_lam/models/module.py Outdated
Comment thread neural_lam/models/module.py
Comment thread neural_lam/weather_dataset.py Outdated
Comment thread neural_lam/weather_dataset.py Outdated
Comment thread tests/test_datasets.py
sadamov and others added 5 commits May 28, 2026 05:03
…pe_state

Previously a datastore with no state, forcing or static would silently
reach `grid_shape_state` and fail with a confusing fallback error. Now
the constructor raises immediately if neither state nor forcing is
present, and the fallback in `grid_shape_state` collapses to a simple
`state if present else forcing` pick.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Boundary features are unpacked from the batch but the forecaster
forward pass does not consume them yet; the model-side wiring lands
in a follow-up PR (mllam#108).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…is/forecast modes

Replace integer-idx boundary slicing with time-based nearest-neighbor
(pad) lookup so the boundary datastore can have a different step length
than the interior, and either side may be in analysis or forecast mode.

- Add `get_time_step`, `check_time_overlap`, `crop_time_if_needed`
  helpers in `neural_lam.utils` (ported from the research branch in
  joeloskarsson/neural-lam-dev, with an extra guard against silent
  argmax-on-all-false cropping).
- Refactor `WeatherDataset`: precompute the within-sample state step
  and any forecast lead-time step in __init__; run
  `crop_time_if_needed` + `check_time_overlap` against the boundary so
  the first/last samples never fall outside boundary coverage; replace
  `_slice_forcing_time` with `_window_forcing_in_time` for time-aligned
  windowing of cross-datastore boundary; preserve the original integer-
  idx fast path as `_window_same_forecast_by_idx` for same-datastore
  forecast forcing (npyfilesmeps has non-unique analysis_time so the
  pandas pad-lookup cannot be used there).
- Window alignment matches the existing forcing convention (target
  time, i.e. `state_times[init_steps + i]`).
- Split `test_boundary_dataset_length_unchanged` into a no-crop and a
  cropping case to document the new behaviour.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Port a slimmed-down version of the alignment tests from
joeloskarsson/neural-lam-dev to exercise the new time-based boundary
windowing in WeatherDataset.

- Extend `SinglePointDummyDatastore` with forecast-mode support.
- Add a boundary-only variant whose state lookup raises KeyError, to
  catch any path that accidentally asks the boundary for state.
- `test_time_slicing_boundary_analysis`: parametrised over past/future
  window sizes, asserts exact boundary window values around each
  target state time.
- `test_boundary_step_length_mismatch_supported`: 1h interior with a
  6h boundary, verifies the pad-matched lookup.
- `test_forecast_interior_with_analysis_boundary` and
  `test_analysis_interior_with_forecast_boundary`: the two mixed
  analysis/forecast combinations.
- `test_check_time_overlap_insufficient_raises`: surface the cropping
  failure path with a clear error.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
sadamov added a commit to sadamov/neural-lam that referenced this pull request Jun 16, 2026
Asserts plot_prediction works against a boundary datastore with no state
category (forcing-only), exercising the get_xy("forcing") / get_lat_lon(
"forcing") path in vis.plot_on_axis. Pairs with the BoundaryDummyDatastore
state-less change in mllam#635.

@observingClouds observingClouds left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I just wonder if we need to raise a deprecation notice for the new config format, so that users now how to change it.

sadamov and others added 2 commits June 16, 2026 20:59
Loading a config that still uses the old top-level `datastore:` key now
raises a clear InvalidConfigError pointing to the named `datastores:`
mapping. Add a regression test and note the behaviour in the CHANGELOG.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@sadamov

sadamov commented Jun 17, 2026

Copy link
Copy Markdown
Collaborator Author

@joeloskarsson this is ready for another look when you have time. Summary of what changed since your review:

Add an input/output table for the state/forcing/static categories in the
Data section, as requested in mllam#652, so the datastore role-by-category
explanation has an explicit definition to reference.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@sadamov sadamov linked an issue Jun 17, 2026 that may be closed by this pull request

@joeloskarsson joeloskarsson left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work with all of this! All my previous comments are fixed, but as you know the time slicing changes meant that there was a lot of new code added. So let me now be annoying and come with more suggestions from looking over also these ;) I still think these are mostly small things, but a couple that maybe need some discussions. Partly this is because I can't remember why we did certain decisions earlier, so don't know if those decisions should be carried forward as we now merge this work to main.

Comment thread README.md
Comment thread README.md Outdated
Comment thread README.md Outdated
Comment thread neural_lam/models/module.py Outdated
Comment thread neural_lam/weather_dataset.py Outdated
Comment thread neural_lam/utils.py Outdated
Comment thread neural_lam/weather_dataset.py Outdated
Comment thread neural_lam/weather_dataset.py
Comment thread neural_lam/weather_dataset.py Outdated
Comment thread neural_lam/weather_dataset.py Outdated
sadamov and others added 2 commits June 25, 2026 07:14
The parenthetical described a `diagnostic` (output-only) category that
does not exist in the codebase; tracked separately instead.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
sadamov and others added 10 commits June 25, 2026 08:19
Make explicit that exactly one interior and at most one boundary
datastore are supported, matching the constraint enforced in
load_config_and_datastore.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: Joel Oskarsson <joel.oskarsson@outlook.com>
Give the two dataarrays in check_time_overlap and crop_time_if_needed
role-based names matching the da_ prefix convention used elsewhere:
da_requested is the driving timeline, da_available must cover its
windows.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
crop_time_if_needed previously returned a forecast da_requested
unchanged, deferring alignment errors. Drop whole out-of-coverage
analysis_time launches instead (logged like the analysis-mode crop),
addressing Joel's review point that forecast cropping is no less safe
than analysis cropping.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Refer to da_requested/da_available in the error and log messages of
check_time_overlap and crop_time_if_needed instead of "boundary
forcing"/"interior", matching the agnostic signatures.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
crop_time_if_needed and check_time_overlap bounded a forecast
boundary's reach by a single analysis step, over-cropping the interior
for boundaries with few launches but long forecasts. Use the boundary's
max elapsed_forecast_duration as its reach past the last launch in
required_max, keeping the two functions consistent.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Replace the O(T) boolean-mask scan in crop_time_if_needed with two
np.searchsorted calls on the already-sorted time axis, dropping the
mask allocation for an O(log T) lookup.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
_time_step_state and _forecast_step_forcing were set in __init__ but
never read; only _forecast_step_boundary is used. Drop both.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
crop_time_if_needed now raises whenever the valid window is empty, so a
successful crop always leaves da_state within boundary coverage. Drop
the redundant post-crop check_time_overlap call (and its import) from
WeatherDataset.__init__, and rename the now-misnamed coverage test.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Convert the lone center_time <= target_time assert in
_window_forcing_in_time to a ValueError so the windowing path raises
uniformly. The remaining da_state assertions are type-narrowing guards.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@sadamov

sadamov commented Jun 25, 2026

Copy link
Copy Markdown
Collaborator Author

@joeloskarsson thanks for the review, there were definitely some hickups in the code, still. I left a few unresolved where you can give your input. All others I agreed with and they are implemented with individual commits.

@sadamov sadamov requested a review from joeloskarsson June 25, 2026 09:33
from .base import BaseRegularGridDatastore, CartesianGridShape


class MDPDatastore(BaseRegularGridDatastore):

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something struck me when I was thinking about this change: Is it maybe a problem that MDPDatastore is a BaseRegularGridDatastore now? Since if we want to use this type of datastore for the boundary data, this will generally not be on a regular grid? Seems like something that might cause silence bugs if one calls the Regular grid methods on the boundary datastore.

Not sure how much the MDP datastore really relies on the regular grid construction? This might not be such a tricky thing to change, but need to look deeper into it.

@joeloskarsson joeloskarsson left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have gone over all changes, and resolved as suitable. The only things I still think need some consideration are the few open comment threads, in particular my latest comment about the MDP datastore always being the regular grid kind.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[RFC/Design]: Support for multiple datastores

4 participants