Fix how forcing data is read in #56

Open
mnlevy1981 wants to merge 4 commits into marbl-ecosys:main from mnlevy1981:fix_highres_forcing

Conversation

@mnlevy1981
Collaborator

Using xr.open_mfdataset() in data_wrangling.py caused our forcing dataset to be chunked in time. This didn't play nicely with xr.map_blocks(): when the forcing was read from multiple netCDF files (such as the 0.1 degree POP time series files), the wrong forcing data was available inside the mapped function. Using xr.open_dataset() on each file and then merging the datasets does not introduce chunking in the time dimension, so xr.map_blocks() receives the entire forcing dataset.

Note that this increases the memory footprint, especially in the Run Multiple Years (highres) notebook. I've had trouble getting enough resources on Casper to run two years at a time.
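
For illustration, here is a minimal sketch of the merge step, not the actual code in data_wrangling.py. The helper names `merge_forcing` and `make_year` and the variable name `SST` are hypothetical, and two small in-memory datasets stand in for the results of calling xr.open_dataset() on each forcing file. The point is only that merging separately opened datasets yields one dataset with no chunking along time.

```python
import numpy as np
import xarray as xr


def merge_forcing(datasets):
    """Merge per-file forcing datasets into one dataset (hypothetical helper).

    In practice `datasets` would be something like
    [xr.open_dataset(path) for path in forcing_paths].
    """
    # An outer join takes the union of the time coordinates; "no_conflicts"
    # fills each file's gaps with the values supplied by the other files.
    return xr.merge(datasets, join="outer", compat="no_conflicts")


def make_year(first_step):
    """Build a tiny stand-in for one forcing file (3 time steps)."""
    time = np.arange(first_step, first_step + 3)
    return xr.Dataset({"SST": ("time", time.astype(float))}, coords={"time": time})


forcing = merge_forcing([make_year(0), make_year(3)])

# No dask chunks are involved, so `chunks` is an empty mapping.
print(dict(forcing.chunks))
print(forcing.sizes["time"])
```

Because nothing here is dask-backed, the merged dataset is a single in-memory object, which is also why the memory footprint grows with the number of files merged.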

@rmshkv -- do you want to play with this branch and see if you can get two years per run with the 0.1 degree forcing? Or should we bring it in as-is and then figure out how to update the notebook later?

mnlevy1981 requested a review from rmshkv on August 31, 2023 19:58
To decrease memory usage, create temporary forcing stream files that include
only the years the current run might use (based on start_year and nyears):
start_year - 1, start_year, ..., start_year + nyears - 1 (end_year), and
start_year + nyears.
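
As a rough sketch of the year selection described above (the function name `forcing_years` is hypothetical and not taken from the repository), the set of years kept in the temporary forcing files runs from one year before the start through one year after the end of the run:

```python
def forcing_years(start_year: int, nyears: int) -> list[int]:
    """Return the years a run might read forcing from.

    Covers start_year - 1 through start_year + nyears, i.e. the run itself
    (start_year .. end_year) plus one extra year on each side.
    """
    if nyears < 1:
        raise ValueError("nyears must be at least 1")
    # range() excludes its stop value, hence the + 1.
    return list(range(start_year - 1, start_year + nyears + 1))


print(forcing_years(2000, 2))
```

For a two-year run starting in 2000 this gives 1999 through 2002, so a run of nyears years needs nyears + 2 years of forcing.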
Also update Run Multiple Years (highres).ipynb to use only one year of forcing
at a time (by creating temporary forcing stream files).