feat: add optional boundary datastore support#635
Conversation
Add support for loading boundary forcing from a separate datastore, enabling LAM models to ingest boundary conditions from a different domain (e.g. ERA5 boundaries for a COSMO/DANRA interior). - NeuralLAMConfig accepts optional `datastore_boundary` field - load_config_and_datastore returns 3-tuple (config, datastore, datastore_boundary) - WeatherDataset loads, windows, and standardizes boundary forcing - __getitem__ returns 5-tuple (init_states, target_states, forcing, boundary, target_times) - New CLI args --num_past_boundary_steps / --num_future_boundary_steps - ForecasterModule.common_step unpacks boundary (not yet wired to forward) - 4 new boundary-specific tests, all 157 tests pass refs mllam#108 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Pass datastore_boundary through train_model.py into ForecasterModule. During --eval, plot_examples loads raw boundary forcing and overlays it underneath prediction/target panels via vis.plot_prediction. Add four boundary plotting tests using BoundaryDummyDatastore from PR mllam#635. Update README to document boundary plotting during evaluation. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add MDP-based ERA5 boundary example at tests/datastore_examples/mdp/era5_1000hPa_danra_100m_winds/ with config.yaml, era5.datastore.yaml (WeatherBench2 64x32 equiangular), and danra.datastore.yaml (DANRA 100m winds interior). Add DATASTORES_BOUNDARY_EXAMPLES dict and init_datastore_boundary_example() to conftest.py for use in boundary integration tests. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
MDPDatastore.__init__ crashed with KeyError when loading a datastore that has only forcing+static (no state), e.g. ERA5 boundary data. Fix is_ensemble check to guard against missing state, and grid_shape_state to fall back to forcing/static categories. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Make _get_analysis_times fall back to forcing file patterns when no
state files exist, guard get_dataarray("state") against empty var_names,
and prevent empty feature list from matching state loading path.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Guard against missing state/static feature keys in the zarr, not just forcing. Boundary-only datastores (e.g. ERA5) may lack state_feature entirely. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Return len(grid_index) directly instead of computing from grid_shape_state, which is more robust for boundary-only datastores. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…eat/boundary-datastore
- Shrink the ERA5 boundary test dataset to 2022-03-30..2022-04-12 (was 1990-2022), add per-input lat/lon coord_ranges and enable mllam's convex-hull domain_cropping with include_interior_points=true. Stats computation drops from minutes to seconds and the cached zarr stays under 1 MB. - Stack [longitude, latitude] directly into grid_index in the era5 dim_mapping (per mllam's example.era5_cropped.yaml) so the original coord names survive for the convex-hull crop -- removes the need for any rename-preserve workaround in neural-lam. - Generalise the MDPDatastore units loop over self.spatial_coordinates with sensible defaults for x/y (m) and longitude/latitude/lon/lat (degrees_*) so ERA5-style geographic datastores work. - Register a pytest `slow` marker and a `--run-slow` CLI flag so the ERA5 boundary integration test (added in this PR) is skipped by default and can be run on demand. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Pass datastore_boundary through train_model.py into ForecasterModule. During --eval, plot_examples loads raw boundary forcing and overlays it underneath prediction/target panels via vis.plot_prediction. Add four boundary plotting tests using BoundaryDummyDatastore from PR mllam#635. Update README to document boundary plotting during evaluation. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Drop state metadata after init and raise KeyError on `state` lookups so plotting/model code that accidentally queries state on a boundary fails loudly. Real ERA5-style boundary datastores expose only forcing fields, and the existing boundary tests (test_datasets.py) only ever access forcing on the boundary, so making the dummy state-less brings it closer to real boundary semantics without changing test behaviour.
Pass datastore_boundary through train_model.py into ForecasterModule. During --eval, plot_examples loads raw boundary forcing and overlays it underneath prediction/target panels via vis.plot_prediction. Add four boundary plotting tests using BoundaryDummyDatastore from PR mllam#635. Update README to document boundary plotting during evaluation. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Asserts plot_prediction works against a boundary datastore with no state
category (forcing-only), exercising the get_xy("forcing") / get_lat_lon(
"forcing") path in vis.plot_on_axis. Pairs with the BoundaryDummyDatastore
state-less change in mllam#635.
Resolve conflicts from mllam#239 (normalize on GPU): - weather_dataset.py: drop the CPU standardization path (state/forcing/boundary stats setup, _compute_std_safe, in-__getitem__ scaling, and the standardize= plumbing in WeatherDataModule); keep the boundary-datastore feature and the 5-tuple sample (init, target, forcing, boundary, times). - models/module.py: on_after_batch_transfer now unpacks/returns the 5-tuple, standardizing state+forcing on-device and passing boundary through unchanged (boundary is not yet consumed by the forecaster on this branch). - tests/test_datasets.py: drop the dataset-standardization tests mllam#239 removed (incl. boundary standardization, now a GPU concern); keep the structural boundary tests without the removed standardize= kwarg. - tests/test_gpu_normalization.py: feed/expect the 5-tuple batch. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
joeloskarsson
left a comment
There was a problem hiding this comment.
It's so great that you started the work on this 😄 I had a look through everything except the tests now. Had one major comment about the interior-boundary data alignment in the dataset, otherwise mostly small things.
…boundary-datastore
…pe_state Previously a datastore with no state, forcing or static would silently reach `grid_shape_state` and fail with a confusing fallback error. Now the constructor raises immediately if neither state nor forcing is present, and the fallback in `grid_shape_state` collapses to a simple `state if present else forcing` pick. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Boundary features are unpacked from the batch but the forecaster forward pass does not consume them yet; the model-side wiring lands in a follow-up PR (mllam#108). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…is/forecast modes Replace integer-idx boundary slicing with time-based nearest-neighbor (pad) lookup so the boundary datastore can have a different step length than the interior, and either side may be in analysis or forecast mode. - Add `get_time_step`, `check_time_overlap`, `crop_time_if_needed` helpers in `neural_lam.utils` (ported from the research branch in joeloskarsson/neural-lam-dev, with an extra guard against silent argmax-on-all-false cropping). - Refactor `WeatherDataset`: precompute the within-sample state step and any forecast lead-time step in __init__; run `crop_time_if_needed` + `check_time_overlap` against the boundary so the first/last samples never fall outside boundary coverage; replace `_slice_forcing_time` with `_window_forcing_in_time` for time-aligned windowing of cross-datastore boundary; preserve the original integer- idx fast path as `_window_same_forecast_by_idx` for same-datastore forecast forcing (npyfilesmeps has non-unique analysis_time so the pandas pad-lookup cannot be used there). - Window alignment matches the existing forcing convention (target time, i.e. `state_times[init_steps + i]`). - Split `test_boundary_dataset_length_unchanged` into a no-crop and a cropping case to document the new behaviour. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Port a slimmed-down version of the alignment tests from joeloskarsson/neural-lam-dev to exercise the new time-based boundary windowing in WeatherDataset. - Extend `SinglePointDummyDatastore` with forecast-mode support. - Add a boundary-only variant whose state lookup raises KeyError, to catch any path that accidentally asks the boundary for state. - `test_time_slicing_boundary_analysis`: parametrised over past/future window sizes, asserts exact boundary window values around each target state time. - `test_boundary_step_length_mismatch_supported`: 1h interior with a 6h boundary, verifies the pad-matched lookup. - `test_forecast_interior_with_analysis_boundary` and `test_analysis_interior_with_forecast_boundary`: the two mixed analysis/forecast combinations. - `test_check_time_overlap_insufficient_raises`: surface the cropping failure path with a clear error. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Asserts plot_prediction works against a boundary datastore with no state
category (forcing-only), exercising the get_xy("forcing") / get_lat_lon(
"forcing") path in vis.plot_on_axis. Pairs with the BoundaryDummyDatastore
state-less change in mllam#635.
observingClouds
left a comment
There was a problem hiding this comment.
LGTM. I just wonder if we need to raise a deprecation notice for the new config format, so that users now how to change it.
Loading a config that still uses the old top-level `datastore:` key now raises a clear InvalidConfigError pointing to the named `datastores:` mapping. Add a regression test and note the behaviour in the CHANGELOG. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
@joeloskarsson this is ready for another look when you have time. Summary of what changed since your review:
|
Add an input/output table for the state/forcing/static categories in the Data section, as requested in mllam#652, so the datastore role-by-category explanation has an explicit definition to reference. Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
joeloskarsson
left a comment
There was a problem hiding this comment.
Great work with all of this! All my previous comments are fixed, but as you know the time slicing changes meant that there was a lot of new code added. So let me now be annoying and come with more suggestions from looking over also these ;) I still think these are mostly small things, but a couple that maybe need some discussions. Partly this is because I can't remember why we did certain decisions earlier, so don't know if those decisions should be carried forward as we now merge this work to main.
The parenthetical described a `diagnostic` (output-only) category that does not exist in the codebase; tracked separately instead. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Make explicit that exactly one interior and at most one boundary datastore are supported, matching the constraint enforced in load_config_and_datastore. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: Joel Oskarsson <joel.oskarsson@outlook.com>
Give the two dataarrays in check_time_overlap and crop_time_if_needed role-based names matching the da_ prefix convention used elsewhere: da_requested is the driving timeline, da_available must cover its windows. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
crop_time_if_needed previously returned a forecast da_requested unchanged, deferring alignment errors. Drop whole out-of-coverage analysis_time launches instead (logged like the analysis-mode crop), addressing Joel's review point that forecast cropping is no less safe than analysis cropping. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Refer to da_requested/da_available in the error and log messages of check_time_overlap and crop_time_if_needed instead of "boundary forcing"/"interior", matching the agnostic signatures. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
crop_time_if_needed and check_time_overlap bounded a forecast boundary's reach by a single analysis step, over-cropping the interior for boundaries with few launches but long forecasts. Use the boundary's max elapsed_forecast_duration as its reach past the last launch in required_max, keeping the two functions consistent. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Replace the O(T) boolean-mask scan in crop_time_if_needed with two np.searchsorted calls on the already-sorted time axis, dropping the mask allocation for an O(log T) lookup. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
_time_step_state and _forecast_step_forcing were set in __init__ but never read; only _forecast_step_boundary is used. Drop both. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
crop_time_if_needed now raises whenever the valid window is empty, so a successful crop always leaves da_state within boundary coverage. Drop the redundant post-crop check_time_overlap call (and its import) from WeatherDataset.__init__, and rename the now-misnamed coverage test. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Convert the lone center_time <= target_time assert in _window_forcing_in_time to a ValueError so the windowing path raises uniformly. The remaining da_state assertions are type-narrowing guards. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
@joeloskarsson thanks for the review, there were definitely some hickups in the code, still. I left a few unresolved where you can give your input. All others I agreed with and they are implemented with individual commits. |
| from .base import BaseRegularGridDatastore, CartesianGridShape | ||
|
|
||
|
|
||
| class MDPDatastore(BaseRegularGridDatastore): |
There was a problem hiding this comment.
Something struck me when I was thinking about this change: Is it maybe a problem that MDPDatastore is a BaseRegularGridDatastore now? Since if we want to use this type of datastore for the boundary data, this will generally not be on a regular grid? Seems like something that might cause silence bugs if one calls the Regular grid methods on the boundary datastore.
Not sure how much the MDP datastore really relies on the regular grid construction? This might not be such a tricky thing to change, but need to look deeper into it.
joeloskarsson
left a comment
There was a problem hiding this comment.
I have gone over all changes, and resolved as suitable. The only things I still think need some consideration are the few open comment threads, in particular my latest comment about the MDP datastore always being the regular grid kind.
Describe your changes
Add optional boundary datastore support to
WeatherDatasetandWeatherDataModule, enabling LAM models to ingest boundary forcing from a separate domain (e.g. ERA5 for a COSMO/DANRA interior).NeuralLAMConfigaccepts an optionaldatastore_boundaryfieldload_config_and_datastorereturns a 3-tuple(config, datastore, datastore_boundary)WeatherDataset.__getitem__returns a 5-tuple(init_states, target_states, forcing, boundary, target_times)where the boundary tensor is empty (last dim 0) when no boundary datastore is configured--num_past_boundary_steps/--num_future_boundary_stepscontrol the boundary forcing window sizeForecasterModule.common_stepunpacks the boundary tensor but does not yet wire it into the forward pass (model-side integration is planned as a separate PR)MDPDatastoreandNpyFilesDatastoreMEPSboth handle boundary-only configs (forcing + static, no state variables) gracefullytests/datastore_examples/mdp/era5_1000hPa_danra_100m_winds/(WeatherBench2 64x32 equiangular grid as boundary for DANRA interior), withinit_datastore_boundary_example()fixture inconftest.pyThis is PR A in the boundary datastore plan outlined in #108. PR B (model-side boundary handling) and PR C (#636, boundary plotting) will follow.
Issue Link
refs #108
Type of change
Note on breaking change:
WeatherDataset.__getitem__now returns a 5-tuple instead of 4-tuple, andload_config_and_datastorereturns a 3-tuple instead of 2-tuple. All callers in the repo have been updated.Checklist before requesting a review
pullwith--rebaseoption if possible).Checklist for reviewers
Each PR comes with its own improvements and flaws. The reviewer should check the following:
Author checklist after completed review
reflecting type of change (add section where missing):
Checklist for assignee
refs #138